I own a site that is an authority within its industry: 100,000 visitors per day, with around 4,500 domains linking in, including many large news sites like The Daily Mail and Vice.
The site is very large, over 25 million pages. Most of these are statistical pages with no prose; only about 1.25% of pages have paragraph content.
I have found that Google is only prepared to index 12 million of the pages, so I have worked out which are the most important and excluded the rest from crawling with robots.txt. This appears to have boosted traffic by prioritising indexing of the content most likely to generate clicks.
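For context, the exclusions are just path-level rules in robots.txt along these lines (the paths here are placeholders for illustration, not my actual structure):

```
# Block the long-tail statistical pages Google wasn't indexing anyway,
# and leave everything else crawlable.
User-agent: *
Disallow: /stats/raw/
Disallow: /archive/
```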
Bing, on the other hand, is only prepared to index 1.5 million pages. If it indexed as much as Google, I'd probably be looking at in excess of 10,000 extra visitors per day.
I have noticed that some sites in a similar position (legitimate content but no paragraph text) produce automatically generated sentences from their statistical content. For example, a directory of IP addresses could produce:
192.168.0.1 is a United States IP address managed by NAIPC and assigned to The Internet Corporation of Dallas, Texas as part of a block of 256 IPs. The IP is not on any blacklists. Six IPs in the range are on IP block lists. The IP is considered safe.
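The mechanics of this seem straightforward enough; here is a minimal sketch in Python of template-based generation over the structured fields (the field names and the describe_ip helper are mine for illustration, not from any real dataset):

```python
# Minimal sketch of template-based text generation from structured records.
# Field names and the describe_ip helper are illustrative assumptions.

def describe_ip(record: dict) -> str:
    """Render one human-readable paragraph from a row of IP statistics."""
    sentences = [
        f"{record['ip']} is a {record['country']} IP address managed by "
        f"{record['registry']} and assigned to {record['org']} as part of "
        f"a block of {record['block_size']} IPs."
    ]
    # Vary the wording based on the data so pages don't all read identically.
    if record["blacklist_count"] == 0:
        sentences.append("The IP is not on any blacklists and is considered safe.")
    else:
        sentences.append(
            f"The IP appears on {record['blacklist_count']} blacklists."
        )
    if record["range_blacklisted"] > 0:
        sentences.append(
            f"{record['range_blacklisted']} IPs in the range are on IP block lists."
        )
    return " ".join(sentences)


if __name__ == "__main__":
    example = {
        "ip": "192.168.0.1",
        "country": "United States",
        "registry": "NAIPC",
        "org": "The Internet Corporation of Dallas, Texas",
        "block_size": 256,
        "blacklist_count": 0,
        "range_blacklisted": 6,
    }
    print(describe_ip(example))
```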
Has anyone got any experience in producing such auto-generated text? And what was the outcome?