[Beta] New API will be coming online for beta testing soon -- Elasticsearch search is currently available
Author: Stuck_In_the_Matrix

I am tweaking the mapping files to add a lot more functionality to the new beta testing server. The server itself is actually a Raspberry Pi 4B (since I'm fresh out of servers for the time being). Amazingly, it is extremely fast for basic searches!

The endpoint is:

http://pi4.pushshift.io

Both comments and submissions are currently being ingested on these endpoints.

Comments: http://pi4.pushshift.io/rc/_search

Submissions: http://pi4.pushshift.io/rs/_search

I am testing a lot of new functionality, but anyone is welcome to test code against it. I've increased the rate limit to 600 requests a minute (10 a second). Right now, the API is based on Elasticsearch queries. It isn't too difficult to use, and I will provide more detailed examples in the future.
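
As a rough illustration of what an Elasticsearch-style request against the comments endpoint could look like, here is a minimal sketch. The endpoint URL is the one given above; the field name "body" for the comment text and the helper names build_search_body and search_comments are assumptions for illustration, not something the beta server is confirmed to expose in exactly this form.

```python
import json
import urllib.request

# Comment search endpoint from the post above.
COMMENTS_URL = "http://pi4.pushshift.io/rc/_search"


def build_search_body(term, size=10):
    """Build an Elasticsearch query body that matches `term` in the comment text.

    Assumption: the comment text is stored in a field named "body".
    """
    return {
        "query": {"match": {"body": term}},
        "size": size,
    }


def search_comments(term, size=10, url=COMMENTS_URL, timeout=10):
    """POST the query to the endpoint and return the list of hits (network call)."""
    payload = json.dumps(build_search_body(term, size)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.load(response)
    # Standard Elasticsearch responses nest the matching documents here.
    return result["hits"]["hits"]
```

Calling search_comments("raspberry pi") would send the request. Since the endpoint is a beta that may be offline, it is worth wrapping the call in error handling.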

New features include emoji search and improved foreign-language search. There are also some new fields available, including:

author_created_utc: The time when the author's account was created.

author_delta: The time difference between when the comment was made and when the author's account was made.

nest_level: The position of the comment in the tree structure. A top-level comment has a nest level of 1. A reply to a top-level comment has a nest level of 2, and so on. If nest_level is null, it could not be computed for that comment (this generally happens when the top-level comment is unavailable for whatever reason).

reply_delay: This is the elapsed time in seconds between when the parent object was created and when the comment itself was created. A comment with a nest level of 1 has the reply_delay calculated by subtracting the time the submission was made from the time the comment itself was created. If the comment has a nest_level greater than one, then the reply_delay is the difference between the comment's creation time and that of its parent.
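
To make the nest_level and reply_delay definitions concrete, here is a small sketch that computes both for an in-memory thread. The data layout (dicts with "id", "parent_id" and "created_utc") and the function name annotate_thread are hypothetical, and this is not the server's actual ingest code. Returning a null reply_delay when the parent is missing is also an assumption; the post only describes the null case for nest_level.

```python
def annotate_thread(submission, comments):
    """Return {comment_id: {"nest_level": ..., "reply_delay": ...}}.

    `submission` and each comment are dicts with "id" and "created_utc";
    comments also carry "parent_id" (the submission id or a comment id).
    """
    by_id = {c["id"]: c for c in comments}
    levels = {}

    def nest_level(comment_id):
        # Memoised walk up the tree toward the submission.
        if comment_id in levels:
            return levels[comment_id]
        parent_id = by_id[comment_id]["parent_id"]
        if parent_id == submission["id"]:
            level = 1  # top-level comment
        elif parent_id in by_id:
            parent_level = nest_level(parent_id)
            level = None if parent_level is None else parent_level + 1
        else:
            level = None  # an ancestor is unavailable
        levels[comment_id] = level
        return level

    def reply_delay(comment):
        parent_id = comment["parent_id"]
        if parent_id == submission["id"]:
            parent_time = submission["created_utc"]
        elif parent_id in by_id:
            parent_time = by_id[parent_id]["created_utc"]
        else:
            return None  # assumption: unknown parent means unknown delay
        return comment["created_utc"] - parent_time

    return {
        c["id"]: {"nest_level": nest_level(c["id"]), "reply_delay": reply_delay(c)}
        for c in comments
    }
```

For a submission created at time 1000, a top-level comment at 1060 gets nest_level 1 and reply_delay 60, and a reply to it at 1090 gets nest_level 2 and reply_delay 30.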

The nest_level, reply_delay, author_created_utc and author_delta are very interesting metrics because they can be used to detect bots / spam accounts.
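
One way these fields might be combined in a query is sketched below: top-level comments from very new accounts that replied very quickly. It uses a standard Elasticsearch bool query with range and term filters. The function name and the thresholds are illustrative, and treating author_delta as seconds is an assumption (the post only states the unit for reply_delay).

```python
def build_bot_candidate_query(max_author_delta, max_reply_delay, size=25):
    """Build a query body for fast top-level replies from young accounts.

    Assumption: author_delta is in seconds, like reply_delay.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    # Account was created shortly before the comment.
                    {"range": {"author_delta": {"lte": max_author_delta}}},
                    # Comment arrived shortly after its parent.
                    {"range": {"reply_delay": {"lte": max_reply_delay}}},
                    # Restrict to top-level comments.
                    {"term": {"nest_level": 1}},
                ]
            }
        },
        "size": size,
    }


# Example: accounts under a day old replying within 10 seconds.
query = build_bot_candidate_query(max_author_delta=86400, max_reply_delay=10)
```

A query like this only surfaces candidates. Plenty of legitimate users reply quickly from new accounts, so the results would still need review.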

Another mapping feature I am testing is partial author / subreddit matching. For instance, if the author's account name is throwaway4839583, you can search for "throwaway" and match any comment whose author has "throwaway" anywhere within the full author name (case-insensitive). I am testing the same feature with subreddit names.
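
The intended matching behaviour can be sketched in plain Python, together with a guess at what the request body might look like. The helper names are hypothetical, and whether a plain match query on the author field is what triggers partial matching depends on the mapping, which the post does not spell out.

```python
def author_contains(fragment, author):
    """Case-insensitive substring test, mirroring the behaviour described above."""
    return fragment.lower() in author.lower()


def build_author_fragment_query(fragment, size=10):
    """Hypothetical query body for a partial author match.

    Assumption: the "author" field is mapped so that a match query
    on a fragment finds authors containing that fragment.
    """
    return {
        "query": {"match": {"author": fragment}},
        "size": size,
    }
```

With these definitions, author_contains("throwaway", "Throwaway4839583") is true and author_contains("matrix", "throwaway4839583") is false.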

The code driving this new beta endpoint is in active development so any suggestions, feature requests, bug reports, etc. are much appreciated.

Feel free to make as many requests as you want to test things out, up to the rate limit. I had to set some kind of limit to prevent abuse of the endpoint.
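
For anyone scripting against the endpoint, a simple client-side throttle keeps requests under the 600-per-minute limit. This is only a sketch: the Throttle class is hypothetical, and it spaces requests evenly (one every 0.1 seconds) because the post does not say how the server measures its window.

```python
import time


class Throttle:
    """Space out calls so that at most `max_per_minute` happen per minute."""

    def __init__(self, max_per_minute=600, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may proceed

    def wait(self):
        """Block until the next request is allowed, then reserve the slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
```

Create one Throttle and call its wait() method immediately before each request.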

The site itself may go down at any time for maintenance so do keep that in mind. Also, I would strongly recommend not having any production system depend on the availability of this endpoint. That said, I will try to keep it up as often as possible.

Posted
5 years ago