Data structure advice needed
Hello!

I'm thinking about optimizing my current application. The current schema looks roughly like this.

Two main collections: Project and Album.

Project currently has 200,000 documents. Each one looks like this, and its reviews array gets new items daily.

{
    "_id" : ObjectId("6064c95822796579f33bea4f"),
    "name" : "justSomeName",
    "reviews" : [
        {
            "album" : ObjectId("5f34ee8bf0857e55ed5ebf97"),
            "notes" : "Crap",
            "rating" : 1
        },
        {
            "album" : ObjectId("5f34ee8bf0857e55ed5ec035"),
            "notes" : "great!",
            "rating" : 5
        }
        // .... possibly 1000 more items in this array
    ]
}

Album has approximately 1000 documents (and will stay at that number), all of them more or less static:

{
    "_id" : ObjectId("5f34ee8bf0857e55ed5ebd9c"),
    "name" : "More Songs About Buildings And Food",
    "release_date" : "1978",
    "spotify" : "spotify:album:01RJdKvXyz515O37itqMIJ",
    "genres" : [
        "post-punk",
        "new-wave"
    ]
}

The most common query is to get a project with all its reviewed items, so at first I felt this schema was a good approach: it keeps the schema tightly coupled to how the frontend consumes it.

There's no need to fetch a single review item.

However, I now need a way to get all reviews for a specific album, so I guess the best approach would be to move reviews into their own collection, Review, making it easier to query reviews by project and/or by album.
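
To make the idea concrete, here is a rough sketch of what splitting the embedded reviews into standalone Review documents could look like. It is only an illustration: flattenReviews and the projectId field are names made up for this example, and plain strings stand in for ObjectId values so it runs without a database.

```javascript
// Hypothetical helper: turn one Project document with embedded reviews
// into flat Review documents. `projectId` is an assumed field name.
function flattenReviews(project) {
  return (project.reviews || []).map((review) => ({
    projectId: project._id,
    album: review.album,
    notes: review.notes,
    rating: review.rating,
  }));
}

// Plain strings stand in for ObjectId values so the sketch runs without a driver.
const project = {
  _id: "6064c95822796579f33bea4f",
  name: "justSomeName",
  reviews: [
    { album: "5f34ee8bf0857e55ed5ebf97", notes: "Crap", rating: 1 },
    { album: "5f34ee8bf0857e55ed5ec035", notes: "great!", rating: 5 },
  ],
};

const reviewDocs = flattenReviews(project);

// With flat documents, both lookups become simple equality filters.
const byAlbum = reviewDocs.filter((r) => r.album === "5f34ee8bf0857e55ed5ebf97");
const byProject = reviewDocs.filter((r) => r.projectId === project._id);
```

Each flat document carries both ids, so the same collection can serve the "by project" and the "by album" lookups.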

Now to the questions:

  1. Am I right in thinking this is the best solution to my problem? Or is there another, better way of solving this?
  2. Which approach would be best: a) let the project hold a reference array to all of its reviews, or b) let each review have a projectId?
  3. Moving all reviews to a new collection would mean a couple of million documents in that collection. Would that cause slow queries? For example, this would be a common way of consuming the data: Reviews.find({projectId})
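
For questions 2 and 3, this is roughly how I picture option b) being queried. It is a sketch under assumptions, not tested against my real data: reviews is assumed to be a collection handle from the official MongoDB Node.js driver, the function names are invented for this example, and the projectId and album fields are the hypothetical Review fields. An index on the filtered field should let each lookup avoid scanning the whole collection.

```javascript
// `reviews` is assumed to be a collection handle from the official MongoDB
// Node.js driver (for example db.collection("reviews")).
// Function and field names here are hypothetical.
async function ensureReviewIndexes(reviews) {
  // Supports Reviews.find({ projectId }).
  await reviews.createIndex({ projectId: 1 });
  // Supports "all reviews for a specific album".
  await reviews.createIndex({ album: 1 });
}

// Option b): each review stores its projectId, so this is one equality filter.
async function reviewsForProject(reviews, projectId) {
  return reviews.find({ projectId }).toArray();
}

// The new requirement: every review written for one album.
async function reviewsForAlbum(reviews, albumId) {
  return reviews.find({ album: albumId }).toArray();
}
```

With option a) the project would also need its reference array updated on every new review, which is the daily write I was hoping to keep cheap.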

Posted
3 years ago