Imagine I have to implement a time series data store where an entry looks like this:
{id - 64 bit auto incrementing long, time - 64 bit long, value - 64-512 bit binary, crc - 32 bit, version - 64 bit}
Primary key is {time, id}
The size of the above entry would be between 36B and 92B. My table size would be at most 10GB. One host can have 100s of tables, as this is a multi-tenant system.
So I will have at most ~ 10GB/36B ~ 300M entries per table.
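To sanity-check those sizes, here is a minimal sketch that packs one entry with Python's `struct` module. The field order, little-endian byte order, the CRC covering only the value, and the helper names `encode_entry` and `max_entries` are all assumptions for illustration, not part of the design above. A real on-disk format would also need a length prefix or fixed-size slots to decode the variable-length value; that is left out so the numbers match the 36B - 92B range.

```python
import struct
import zlib

# Fixed-width fields from the post: id, time, crc, version.
# "<" selects little-endian with no padding (byte order is an assumption).
HEADER = struct.Struct("<qqIq")  # 8 + 8 + 4 + 8 = 28 bytes

MIN_VALUE_BYTES = 64 // 8   # 64-bit value
MAX_VALUE_BYTES = 512 // 8  # 512-bit value


def encode_entry(entry_id, time, value, version):
    """Pack one entry; the CRC covering only `value` is an assumption."""
    if not MIN_VALUE_BYTES <= len(value) <= MAX_VALUE_BYTES:
        raise ValueError("value must be 8 to 64 bytes")
    crc = zlib.crc32(value)
    return HEADER.pack(entry_id, time, crc, version) + value


def max_entries(table_bytes, entry_bytes):
    """Upper bound on entries that fit in a table of the given size."""
    return table_bytes // entry_bytes
```

Taking 10GB as 10 * 10**9 bytes, `max_entries(10 * 10**9, 36)` comes out a little under 278M, which is where the rounded ~300M figure comes from.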
I have the following requirements:
- Optimize for ingestion, especially at the tip (current time), which keeps moving forward.
- Do deduplication based on {id time version} to reject lower versions synchronously. Again, time here would mostly be the tip.
- Have support for fast snapshots of storage for backups.
- Support deletion based on a predicate, which would be like:
Note that duplicates would be rare, and hence I believe I would benefit from keeping only an index on {id time} in memory rather than the entire data records.
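A rough sketch of what that synchronous check could look like, assuming the in-memory index also remembers the highest version seen per key. The class name `DedupIndex`, the method `accept`, and the choice to reject an equal version as well as a lower one are assumptions of mine. Since duplicates are rare, an alternative is to keep only the offset in memory and read the version from disk on the occasional key hit.

```python
class DedupIndex:
    """In-memory map from (time, id) to the highest version accepted so far."""

    def __init__(self):
        self._latest = {}

    def accept(self, time, entry_id, version):
        """Return True if the write should be applied, False if rejected.

        Assumption: a version equal to the stored one counts as a duplicate.
        """
        key = (time, entry_id)
        current = self._latest.get(key)
        if current is not None and version <= current:
            return False  # lower or equal version: reject synchronously
        self._latest[key] = version
        return True
```

The common path (a key that has never been seen) is a single hash lookup plus an insert, so the check adds little to tip ingestion.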
I am evaluating the following:
- Hash/Range based index - I am thinking of a bitcask-like storage where I can keep the index in memory. Since an index entry would take {16 bytes for key, 8 bytes for offset} = 24B, I would need 24B * 300M ~ 7GB of memory for the index alone for 1 table, which is a lot. Hence I am thinking of a slightly different design, where I divide my store into N partitions internally on time (say 10) and keep in memory only the bucket(s) that are actively ingesting. Since my most common case is tip ingestion, only 1 bucket would be in memory, so my index size goes down by a factor of 10. This, however, adds some complexity to the design. Also, I believe implementing requirement 4 (deletion by predicate) is tricky if there is no time predicate in the query and I have to open all buckets. I guess one way to get around this is to track tombstones separately.
- LSM based engine - This should be obvious; however, it does make sizing the memtable a bit tricky. Since the memtable now stores the whole entry, I can keep fewer entries in memory.
- BTree based engine - Thinking of something like SQLite with the primary key as {time id} (and not {id time}). However, I don't think it would shine on writes. It does, however, offer the ability to run complex queries (if needed in the future).
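To make the first option concrete, here is a minimal sketch of the time-partitioned index and its memory arithmetic. Everything named here (`PartitionedIndex`, `bucket_of`, `put`, `index_bytes`, `max_resident`) is hypothetical, and reloading an evicted bucket's index from disk is stubbed out with an empty map. The 24B per entry is the packed estimate from above; a Python dict uses far more per entry, so this only illustrates the routing and eviction, not the real footprint.

```python
from collections import OrderedDict

INDEX_ENTRY_BYTES = 24  # 16-byte key + 8-byte offset, as estimated above


def index_bytes(entries, partitions=1):
    """Packed index size when only one of `partitions` buckets is resident."""
    return entries * INDEX_ENTRY_BYTES // partitions


class PartitionedIndex:
    """Per-bucket (time, id) -> offset maps with only the hottest buckets resident."""

    def __init__(self, bucket_width, max_resident=1):
        self.bucket_width = bucket_width
        self.max_resident = max_resident
        self.resident = OrderedDict()  # bucket number -> {(time, id): offset}
        self.evicted = []              # bucket numbers dropped from memory, in order

    def bucket_of(self, time):
        return time // self.bucket_width

    def put(self, time, entry_id, offset):
        bucket = self.bucket_of(time)
        index = self.resident.get(bucket)
        if index is None:
            # Hypothetical: a real store would rebuild this bucket's index from
            # its data or hint file here; the sketch starts with an empty map.
            index = self.resident[bucket] = {}
        self.resident.move_to_end(bucket)  # mark as most recently used
        index[(time, entry_id)] = offset
        while len(self.resident) > self.max_resident:
            old, _ = self.resident.popitem(last=False)  # drop least recently used
            self.evicted.append(old)
```

With the numbers above, `index_bytes(300_000_000)` is 7.2GB and `index_bytes(300_000_000, 10)` is 0.72GB. A late write to an old bucket forces that bucket's index back into memory, which is the complexity cost mentioned in the first option.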
Does anyone want to guide me here?
Edit: Title wrongly says "hash", ignore it
Posted 10 months ago