

[Update] Currently working on rewriting the Reddit Ingest Script to Support Parallel Requests
Author Summary
Stuck_In_the_Matrix is in update
Post Body

For many years after Pushshift started, the comment volume on Reddit was low enough that fetching new data sequentially was sufficient to keep up with the amount of new data from Reddit.

Unfortunately, spam has become a huge issue on Reddit, and at times millions of spam comments are generated in short bursts that a sequential fetcher cannot keep up with.
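To make the problem concrete, here is a rough back-of-the-envelope sketch of how a burst turns into lag. The function names and every number in it are illustrative assumptions, not measurements from Pushshift or Reddit.

```python
def backlog_after_burst(arrival_per_sec, fetch_per_sec, burst_seconds):
    """Items still unfetched when the burst ends (never negative)."""
    return max(0.0, (arrival_per_sec - fetch_per_sec) * burst_seconds)


def catch_up_seconds(backlog, fetch_per_sec, normal_arrival_per_sec):
    """Time needed to drain the backlog once arrivals return to normal."""
    spare = fetch_per_sec - normal_arrival_per_sec
    if spare <= 0:
        return float("inf")  # the fetcher never catches up
    return backlog / spare


# Hypothetical figures: a 10-minute burst at 2,000 comments/s,
# a sequential fetcher that tops out at 100 comments/s,
# and a normal arrival rate of 60 comments/s.
backlog = backlog_after_burst(2000, 100, 600)
lag_hours = catch_up_seconds(backlog, 100, 60) / 3600
```

With these made-up figures a ten-minute burst leaves a backlog of over a million comments and takes hours to drain, which is the kind of lag described in this post. Raising the fetch rate is the only lever that shrinks both numbers.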

The new ingest script will have the following new features:

  • Ability to use multiple accounts and combine their rate limits so that Pushshift can stay near real-time despite spam bursts.

  • Ability to keep the ingest near real-time. The goal is to fetch new material within 5 seconds of its creation on Reddit. Massive spam bursts may occasionally cause the ingest to lag by a bit more than 5 seconds, but once the new script is in production, the Pushshift API should never fall minutes or hours behind.

  • More timely monthly dumps. Monthly dumps should be available within two weeks of the end of each month, with the goal of having them out within seven days.
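The first feature above (pooling the rate limits of several accounts and fetching in parallel) could look roughly like the sketch below. This is not the actual ingest script: `Account`, `AccountPool`, `fetch_batch`, and `ingest` are hypothetical names, the fixed-window budget is an assumed simplification of Reddit's rate limiting, and `fetch_batch` is a placeholder that makes no network call.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class Account:
    """One API credential with its own fixed-window request budget (assumed model)."""

    def __init__(self, name, max_requests, window_seconds, clock=time.monotonic):
        self.name = name
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._window_start = clock()
        self._used = 0

    def try_acquire(self):
        """Reserve one request from this account; False if the budget is spent."""
        with self._lock:
            now = self._clock()
            if now - self._window_start >= self.window_seconds:
                self._window_start = now  # a new window begins, budget resets
                self._used = 0
            if self._used < self.max_requests:
                self._used += 1
                return True
            return False


class AccountPool:
    """Hands out accounts round-robin, skipping any that are exhausted."""

    def __init__(self, accounts):
        self._accounts = list(accounts)
        self._lock = threading.Lock()
        self._next = 0

    def acquire(self):
        """Return an account with budget left, or None if all are exhausted."""
        with self._lock:
            n = len(self._accounts)
            for offset in range(n):
                account = self._accounts[(self._next + offset) % n]
                if account.try_acquire():
                    self._next = (self._next + offset + 1) % n
                    return account
            return None


def fetch_batch(account, id_batch):
    """Placeholder for a real API request made with `account`'s credentials."""
    return [(item_id, account.name) for item_id in id_batch]


def ingest(pool, batches, max_workers=4):
    """Fetch batches in parallel; return (results, batches that must wait)."""
    results, deferred = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = []
        for batch in batches:
            account = pool.acquire()
            if account is None:
                deferred.append(batch)  # retry after a window resets
            else:
                futures.append(executor.submit(fetch_batch, account, batch))
        for future in futures:
            results.extend(future.result())
    return results, deferred


# Three hypothetical accounts, each allowed 2 requests per 60-second window.
pool = AccountPool([Account("a", 2, 60), Account("b", 2, 60), Account("c", 2, 60)])
batches = [[i * 2, i * 2 + 1] for i in range(8)]
results, deferred = ingest(pool, batches)
```

In this toy run the pool serves six of the eight batches in one window, three times what a single account could, and the remaining two are deferred instead of exceeding any account's limit.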

I know a lot of services depend on the API being near real-time, and it is frustrating for me to see the API fall hours behind because of large spam bursts on Reddit. This upgrade will alleviate many of the current issues with the API falling behind.

The plan is to have the new ingest in production by February.


Posted
4 years ago