I found a method to track and ingest every tweet made by every verified Twitter account, including original tweets, retweets, and quote tweets. I still need to check the Twitter TOS to see exactly which data I can make available for academic use (their terms are usually more relaxed for academic use).
The data is fed into Elasticsearch (version 7.x) in near real time, generally with a 0-20 second delay that depends mainly on how quickly Twitter makes the tweets available via its API.
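As a rough illustration of the ingest step, here is a minimal sketch of how a batch of tweets could be turned into a request body for the Elasticsearch `_bulk` endpoint, along with a helper that measures the ingest delay. It assumes Twitter v1.1-style tweet objects (`id_str`, `created_at`); the index name `tweets` and both function names are hypothetical and not taken from the actual pipeline.

```python
import json
from datetime import datetime, timezone

# Timestamp format of `created_at` in Twitter v1.1 tweet objects,
# e.g. "Wed Oct 10 20:19:24 +0000 2018".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def build_bulk_body(tweets, index_name="tweets"):
    """Return an NDJSON payload for the Elasticsearch _bulk endpoint.

    `index_name` is a hypothetical index, not the one used in the project.
    """
    lines = []
    for tweet in tweets:
        # Using the tweet ID as the document ID means that re-sending the
        # same tweet overwrites the document instead of duplicating it.
        action = {"index": {"_index": index_name, "_id": tweet["id_str"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(tweet))
    # The bulk API expects every line, including the last, to end with "\n".
    return "".join(line + "\n" for line in lines)


def ingest_delay_seconds(tweet, now=None):
    """Seconds elapsed between the tweet's creation time and `now`."""
    now = now or datetime.now(timezone.utc)
    created = datetime.strptime(tweet["created_at"], TWITTER_TIME_FORMAT)
    return (now - created).total_seconds()
```

The resulting string would be POSTed to `/_bulk` with the `Content-Type: application/x-ndjson` header. Batching many tweets per request is the usual way to keep indexing overhead low at tens of thousands of documents an hour.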
The goal of this project is to track all politicians and news sources for disinformation research. If there is any interest in this data, please let me know. In preliminary testing, there doesn't appear to be any missing data from the accounts I've checked, including extremely active ones. The ingest also captures the data for any tweet that is retweeted or quoted by a verified account.
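To make the three tweet types concrete, here is a small sketch of how they could be told apart, and how the retweeted or quoted tweets could be pulled out for separate indexing. It assumes Twitter v1.1-style payloads, where a retweet carries a `retweeted_status` object and a quote tweet carries a `quoted_status` object. The post doesn't say which API or payload format is used, so treat the field handling as an assumption; the function names are hypothetical.

```python
def classify_tweet(tweet):
    """Label a v1.1-style tweet object as 'retweet', 'quote', or 'original'."""
    # Check for a retweet first: a retweet of a quote tweet can carry
    # both fields, and it should still count as a retweet.
    if "retweeted_status" in tweet:
        return "retweet"
    if "quoted_status" in tweet:
        return "quote"
    return "original"


def embedded_tweets(tweet):
    """Collect the tweets nested inside `tweet`, de-duplicated by ID.

    These are the tweets that the account retweeted or quoted; they may
    come from accounts that are not verified themselves.
    """
    found = {}
    stack = [tweet]
    while stack:
        current = stack.pop()
        for key in ("retweeted_status", "quoted_status"):
            inner = current.get(key)
            if inner and inner["id_str"] not in found:
                found[inner["id_str"]] = inner
                stack.append(inner)
    return list(found.values())
```

De-duplicating by `id_str` matters because the same quoted tweet can appear both at the top level and inside the retweeted tweet.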
Every ingested tweet has its metadata (retweet_count, favorite_count, etc.) refreshed at the 1-hour, 4-hour, and 24-hour marks. Ingest averages around 50k-100k tweets an hour. Newly verified accounts are added automatically as soon as Twitter verifies them, so no tweets from those accounts are missed.
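One way the refresh schedule could be organized is a priority queue keyed by due time. The sketch below is only an illustration, not the actual implementation. It assumes the 1, 4, and 24 hour marks are measured from ingest time, which the post doesn't state, and the `RefreshQueue` class is hypothetical.

```python
import heapq
from datetime import datetime, timedelta, timezone

# Assumption: refresh offsets are measured from the time of ingest.
REFRESH_OFFSETS = (timedelta(hours=1), timedelta(hours=4), timedelta(hours=24))


class RefreshQueue:
    """Min-heap of (due_time, tweet_id) pairs for metadata refreshes."""

    def __init__(self):
        self._heap = []

    def schedule(self, tweet_id, ingested_at):
        # Queue all three refreshes as soon as the tweet is ingested.
        for offset in REFRESH_OFFSETS:
            heapq.heappush(self._heap, (ingested_at + offset, tweet_id))

    def pop_due(self, now):
        """Remove and return the IDs whose refresh time has arrived."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, tweet_id = heapq.heappop(self._heap)
            due.append(tweet_id)
        return due
```

A worker would call `pop_due(datetime.now(timezone.utc))` periodically, re-fetch those tweets in batches, and update the counts in Elasticsearch. At 50k-100k tweets an hour, that comes to roughly 150k-300k refresh lookups an hour on top of the ingest itself.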
Post Details
- Posted: 5 years ago
- External URL: reddit.com/r/pushshift/c...