Coming soon - Get a detailed view of why an account is flagged as spam!
view details

This post has been de-listed

It is no longer included in search results and normal feeds (front page, hot posts, subreddit posts, etc). It remains visible only via the author's post history.

15
Optimize Selects On MySQL Table with 1.3 Billion Rows
Post Body

Hey All,

I'm the dev of Repost Sleuth here on Reddit.

I'm looking for input on the most performant way to select ~100 million rows from a table with 1.3 billion rows in a MySQL database.

Example table looks like this

post_type created_at hash
String(20) DATETIME String(64)

There's a composite index containing post_type and created_at

I periodically need to select a year worth of rows as fast as possible.

A query similar to:

SELECT id, hash, post_id FROM post WHERE post_type='image' AND (created_at BETWEEN '2022-1-1' AND '2022-7-15');

Explain for query:

https://preview.redd.it/u71h5hu53ub91.png?width=690&format=png&auto=webp&s=c64bd5dc5513088a3c2766b2e7c62fbcdbf01b88

The test database above explain was run on has 580 million rows.

For the past several years I've maintained a table for each year. This allowed me to select the whole table. I was able to pull all records fast enough for my needs.

I'm experimenting with keeping all data in a single table VS split by years to simplify things. However, with the table and query described above, it takes a painfully long time to pull the data I need.

I'm hoping someone here with more experience can offer some suggestions to optimize this. I'm not sure if there's ways to do the indexes better or possibly core MySQL tweaks

I'd appreciate any input.

Author
Account Strength
100%
Account Age
11 years
Verified Email
Yes
Verified Flair
No
Total Karma
16,028
Link Karma
7,214
Comment Karma
7,964
Profile updated: 2 days ago
Posts updated: 9 months ago

Subreddit

Post Details

We try to extract some basic information from the post title. This is not always successful or accurate, please use your best judgement and compare these values to the post title and body for confirmation.
Posted
2 years ago