

[Status] This thread is for Recovery status
Author Summary
Stuck_In_the_Matrix is in Status
Post Body

[2019-07-14 23:30 ET] I am now loading in all submissions from 2010 and earlier. I expect those submissions to be recovered later tonight. I'll update this submission with occasional status updates on the recovery process. There is some good news in this: the updated data will have more complete JSON for very old submissions!

Since I have to reload the data, I may as well make one improvement. I'll be adding the author id and author created_utc to submissions so that submissions can be filtered based on when the author's account was created.
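To illustrate the kind of filtering this makes possible, here is a minimal client-side sketch. It assumes each submission record carries an `author_created_utc` field holding epoch seconds; the function name, the sample records, and the handling of records without the field are all hypothetical, not the API's actual behavior.

```python
from datetime import datetime, timezone


def filter_by_author_age(submissions, created_after=None, created_before=None):
    """Keep submissions whose author account was created inside a time window.

    Bounds are timezone-aware datetimes; either may be None (open-ended).
    Records without an author_created_utc value are skipped (assumption).
    """
    lo = created_after.timestamp() if created_after else float("-inf")
    hi = created_before.timestamp() if created_before else float("inf")
    kept = []
    for sub in submissions:
        created = sub.get("author_created_utc")
        if created is None:
            continue  # e.g. deleted author, or record not yet reloaded
        if lo <= created < hi:
            kept.append(sub)
    return kept


# Hypothetical sample records, not real data.
sample = [
    {"id": "a1", "author": "old_account", "author_created_utc": 1262304000},  # 2010-01-01 UTC
    {"id": "b2", "author": "new_account", "author_created_utc": 1546300800},  # 2019-01-01 UTC
    {"id": "c3", "author": "[deleted]"},  # no account creation time available
]

cutoff = datetime(2015, 1, 1, tzinfo=timezone.utc)
veterans = filter_by_author_age(sample, created_before=cutoff)
print([s["id"] for s in veterans])
```

Once the field is indexed, the same filter could presumably be applied server-side instead of after download.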

[2019-07-15 21:35 ET] Submissions for the years 2005 thru 2012 (inclusive) have been successfully reloaded. I am merging the segments now and will have the new index up so that it is available via the API in the next hour. Submissions for 2013 and 2014 are processing now and should be completed by tomorrow. I made some optimizations when rebuilding the indexes so hopefully the API is more responsive once the reload of data is complete.

[2019-07-15 22:30 ET] Submissions for years 2005 thru 2012 (inclusive) are now merged and available via the API.

[2019-07-16 13:00 ET] Submissions for the year 2013 have been re-indexed and are now available. Submissions for 2014 should be completed in a few hours. Submissions for 2015 and 2016 should be completed by tomorrow afternoon.

[2019-07-16 18:55 ET] Submissions for the year 2014 have been re-indexed and are now available. Submissions for 2015 should be comp

[2019-07-17 01:50 ET] Submissions for the year 2015 have been re-indexed.

[2019-07-17 02:15 ET] Submissions for 2016 are being re-indexed. I anticipate 2016 to be complete in around 4-5 hours. Although 2016 was not affected, I am still loading in the author_created_utc information. The only gap left for submissions is 2017-01-01 through 2017-08-01 (inclusive). I have started that concurrently with the 2016 re-index and expect that to be completed in around 8-10 hours. While looking at the data, I noticed that the submissions for 2018 need to have their score updated along with gildings. Since I already have the data from the monthly dumps, I will also process 2018 to update the data and also include the author_created_utc information. I expect submissions to be completed by late Friday / early Saturday. At that point, I will move to comments. There is also a lot of score data that needs to be updated for comments. I anticipate all comments to be restored / updated by the following weekend.
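The score / gilding refresh described above could be sketched roughly as follows. This is only an illustration under assumptions: records are modeled as plain dicts keyed by submission `id`, the field names `score`, `gildings`, and `author_created_utc` are taken as given, and `refresh_from_dump` is a hypothetical helper, not the actual reprocessing code.

```python
def refresh_from_dump(indexed, dump_records,
                      fields=("score", "gildings", "author_created_utc")):
    """Return a copy of `indexed` with selected fields overwritten from a dump.

    indexed:      dict mapping submission id -> record dict (current index state)
    dump_records: iterable of record dicts from a monthly dump
    fields:       which fields the dump is allowed to overwrite or add
    """
    refreshed = {sid: dict(rec) for sid, rec in indexed.items()}
    for rec in dump_records:
        target = refreshed.get(rec["id"])
        if target is None:
            continue  # not in the index; restoring it is a separate step
        for field in fields:
            if field in rec:
                target[field] = rec[field]
    return refreshed


# Hypothetical records, not real data.
index_state = {
    "x1": {"id": "x1", "title": "example", "score": 1},
}
monthly_dump = [
    {"id": "x1", "title": "ignored", "score": 250,
     "gildings": {"gid_1": 2}, "author_created_utc": 1400000000},
    {"id": "zz", "score": 5},  # not present in the index, skipped here
]

updated = refresh_from_dump(index_state, monthly_dump)
print(updated["x1"])
```

Restricting the overwrite to a fixed set of fields keeps everything else in the indexed record untouched.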


(Submissions for 2016 were not affected but I am going to reprocess them to add the author_created_utc fields and also to optimize the indices which will decrease search latency. I will also reprocess 2017 and 2018 as there are many submissions that need their score / gilding data updated.)

(I noticed I have enough capacity to add replicas for submissions. Once all submissions have been restored, I will add a node just for submission replicas. This will add redundancy and more search capacity for all submissions. It should drastically improve uptime, so that 100% of submissions stay available all the time -- even if a node goes offline for whatever reason. It will also help load balance searches by distributing submission searches across the replicas, which should lower search latency and make searches and aggregations faster.)
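As a toy model of why replicas help with both uptime and load, the sketch below rotates searches across the copies of a shard and skips any copy whose node is offline. The class and node names are hypothetical, and this is not how the actual search backend routes requests; it only illustrates the idea.

```python
from itertools import count


class ShardCopies:
    """Toy model: one shard stored on several nodes (a primary plus replicas)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.online = set(self.nodes)
        self._counter = count()

    def mark_offline(self, node):
        self.online.discard(node)

    def mark_online(self, node):
        if node in self.nodes:
            self.online.add(node)

    def pick(self):
        """Round-robin over the copies, skipping nodes that are offline."""
        for _ in range(len(self.nodes)):
            node = self.nodes[next(self._counter) % len(self.nodes)]
            if node in self.online:
                return node
        raise RuntimeError("no copy of this shard is online")


copies = ShardCopies(["node-a", "node-b"])  # hypothetical node names
print([copies.pick() for _ in range(4)])    # searches alternate between copies

copies.mark_offline("node-a")
print([copies.pick() for _ in range(2)])    # all searches fall back to node-b
```

With a single copy, taking that node offline makes `pick` fail; with a replica, searches keep being served from the surviving copy.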

(Also note that this entire process will need to be repeated at some future date when the API is moved to v2.0. However, the process will be a bit more orderly than this one.) 😬

Posted
5 years ago