Hey all. I'm looking for a program that can scan a webpage's source code for a specific footprint, then use a link on that page to jump to a new URL and repeat the process.
So for example, say we have a series of web articles. It would go something like: 1. scan Article 1's source code for the term, 2. find the 'Next Article' link's URL, 3. scan Article 2's source code for the term, and so on.
So basically I'm looking for something similar to ScrapeBox's page scanner, but where you only need to load one URL, set a target footprint and a target 'next' link, and it keeps jumping and returning results until it runs out of 'next' links. The kind of thing that would help me find all instances of the phrase "consumer interest" across a series of 10,000 sequential/linked URLs without having to load each one manually. Roughly the loop I have in mind is sketched below.
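To make the idea concrete, here's a minimal Python sketch of that loop using requests and BeautifulSoup. The start URL, the search term, and the 'Next Article' link text are placeholders you'd swap for the real site's values; it's just meant to show the scan-then-follow pattern, not a finished tool.

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

START_URL = "https://example.com/articles/1"  # placeholder: first article in the series
SEARCH_TERM = "consumer interest"             # footprint to look for in the raw source
NEXT_LINK_TEXT = "Next Article"               # placeholder: text of the 'next' link on the real site

url = START_URL
visited = set()  # guard against loops if a 'next' link ever points backwards

while url and url not in visited:
    visited.add(url)
    html = requests.get(url, timeout=10).text

    # Step 1: scan the page source for the footprint
    if SEARCH_TERM in html:
        print(f"Found '{SEARCH_TERM}' on {url}")

    # Step 2: find the 'Next Article' link and jump to it; stop when there isn't one
    soup = BeautifulSoup(html, "html.parser")
    next_link = soup.find("a", string=NEXT_LINK_TEXT)
    url = urljoin(url, next_link["href"]) if next_link else None
```

In principle that would walk a chain of 10,000 linked articles unattended; you'd probably want to write hits to a file and add a polite delay between requests for a run that long.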
Any ideas?