MongoDB: Handling Relations

Elvis Rozario
6 min read, Jul 18, 2019


MongoDB Relations

Relations are the crux of any database, and NoSQL databases handle them in a completely different way compared to SQL databases. There is one very important difference to keep in mind while building on a document database like MongoDB: the data is stored as JSON-like documents rather than as rows in tables. Once you’re familiar with that, handling relations will be a lot easier.

This is a beginner-level post, but it should be useful for anyone interested in understanding how relations are handled in NoSQL databases. I’m assuming you have a basic understanding of what a database is and have maybe used at least one DB (SQL or NoSQL).

If you’re coming from an SQL background, just note that a collection in MongoDB is similar to a table in SQL, and a document is similar to a row in that table.

There are three basic kinds of relationships, described below:

One to One Relation

A one to one relation, as the name suggests, requires one entity to have an exclusive relationship with another entity and vice versa. Let’s consider a simple example to understand this relationship better…

Consider the relationship between a user and their account. One user can have only one account associated with them, and one account can have only one user associated with it.

One to one relationships can be handled in two ways…

The first and easiest way is to have just one collection, the ‘user’ collection, and store the account of that particular user as an object in the user document itself.

The second way is to create another collection named ‘account’ and store a reference key (ideally the ID of the account) in the user document.
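To make the two options concrete, here’s a minimal sketch that uses plain Python dicts to stand in for MongoDB documents and collections. The field names (name, account, account_id, balance) and the helper function are made up for illustration; they’re not part of any real schema or driver API.

```python
# Plain dicts stand in for MongoDB documents; all field names are hypothetical.

# Option 1: the account is embedded as an object in the user document.
user_embedded = {
    "_id": "u1",
    "name": "Alice",
    "account": {"number": "ACC-001", "balance": 250},
}

# Option 2: the account lives in its own collection, and the user
# document only stores the account's ID as a reference.
users = {"u1": {"_id": "u1", "name": "Alice", "account_id": "a1"}}
accounts = {"a1": {"_id": "a1", "number": "ACC-001", "balance": 250}}


def get_account_for_user(user_id):
    """Follow the reference: one read for the user, one for the account."""
    user = users[user_id]
    return accounts[user["account_id"]]


print(user_embedded["account"]["balance"])    # embedded: a single read
print(get_account_for_user("u1")["balance"])  # referenced: two reads
```

With the embedded version you get everything in one read, while the referenced version costs an extra read but keeps the account data out of the user document.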

You might wonder why in the world you would ever need to do this.

This approach is usually used when one of the following three scenarios occurs…

  1. The main document is too large (MongoDB documents have a size limit of 16 MB).
  2. Some sensitive information needs to be stored (you might not want to return account information on every user GET request).
  3. There’s an explicit need to get the account data without the user data (when ‘account’ is requested you don’t want to send ‘user’ information with it and/or when a ‘user’ is requested you don’t want to send ‘account’ information with it, even though both of them are connected).

One to many Relation

In a one to many relation, one entity can be related to multiple other entities, but each of those entities is related back to only that one entity. Let’s consider a simple example to understand this relationship better…

Consider a user who has multiple accounts, where each account can have only a single user associated with it (think of these accounts as bank accounts, it’ll help you understand the example better). In this case, again, there are two ways to handle it.

The first is to store an array of accounts in the user document itself. This will let you GET all the accounts associated with a user in a single call. MongoDB also has update operators ($push and $pull) to add data to and remove data from an array in a document, which makes it quite easy to add or remove accounts from the user if need be.
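Here’s a rough sketch of the embedded array approach, again with a plain Python dict standing in for the user document. The two helper functions are hypothetical and only imitate what MongoDB’s $push and $pull update operators do on the server; they aren’t driver calls.

```python
# A user document with an embedded array of accounts (hypothetical fields).
user = {
    "_id": "u1",
    "name": "Alice",
    "accounts": [{"number": "ACC-001", "type": "savings"}],
}


def push_account(user_doc, account):
    """Append an account to the array, similar in spirit to $push."""
    user_doc["accounts"].append(account)


def pull_account(user_doc, number):
    """Remove every account with a matching number, similar in spirit to $pull."""
    user_doc["accounts"] = [
        a for a in user_doc["accounts"] if a["number"] != number
    ]


push_account(user, {"number": "ACC-002", "type": "checking"})
pull_account(user, "ACC-001")
print([a["number"] for a in user["accounts"]])  # ['ACC-002']
```

Since all the accounts sit inside the one user document, reading the user gives you the full list without a second query.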

The second way is to create another collection named ‘account’ and store reference keys (ideally the IDs of the accounts) as an array in the ‘user’ document. The reasons to do this are the same as in the case of one to one relations.

One issue with this approach is that when a new account needs to be created for a particular user, we need to create the new account and also update the existing user document with the ID of this new account (so it basically requires 2 database calls). Alternatively, you can store the user ID in the account document instead. That way, you’ll only need one call to create a new account, but which one is better depends on the system you’re planning to build.
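The difference in write calls between the two directions can be sketched like this. It’s a toy model with dicts as collections, and the names (account_ids, user_id, create_account_a, create_account_b) are assumptions made up for the example.

```python
# Design A: the user document holds an array of account IDs.
users_a = {"u1": {"_id": "u1", "name": "Alice", "account_ids": []}}
accounts_a = {}


def create_account_a(user_id, account_id):
    """Create an account and register its ID on the user: two writes."""
    accounts_a[account_id] = {"_id": account_id, "balance": 0}  # write 1
    users_a[user_id]["account_ids"].append(account_id)          # write 2
    return 2  # number of write calls this design needs


# Design B: each account document holds the ID of its user.
users_b = {"u1": {"_id": "u1", "name": "Alice"}}
accounts_b = {}


def create_account_b(user_id, account_id):
    """Create an account that points back at its user: one write."""
    accounts_b[account_id] = {
        "_id": account_id,
        "user_id": user_id,
        "balance": 0,
    }
    return 1  # the user document is never touched


def accounts_for_user_b(user_id):
    """Find a user's accounts by filtering on the stored user_id."""
    return [a for a in accounts_b.values() if a["user_id"] == user_id]


print(create_account_a("u1", "a1"))  # 2
print(create_account_b("u1", "a1"))  # 1
print(len(accounts_for_user_b("u1")))  # 1
```

In a real MongoDB setup, Design B would typically rely on an index on user_id so that fetching a user’s accounts stays fast.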

Before building the schema, it’s important to work out which kinds of calls will be used most in your system and plan your schema accordingly.

For example, let’s assume this is a bank application. You know that most of the calls you’ll make would be getting a single user (while logging in, maybe) and another call to get the accounts associated with that user (when they go to the accounts tab, maybe), so the above schema seems a pretty good one for this use case. In fact, storing user_id in the accounts collection would be an even better approach in this case.

Now consider another scenario. This time it’s a public forum where users can create posts, and these posts can be viewed by the public. In this case, it’s better to store user_id in the posts collection instead of storing post_ids in the users collection. Your selling point is the list of posts that users can view, so the calls you mostly make would be to get the posts list with the user data associated with each post (maybe on the homepage itself, like Facebook’s timeline). This way, when a post is created, you wouldn’t need to update two collections.
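As a small sketch of the forum case, the snippet below keeps user_id on each post and attaches the author while building the feed. Everything here is a hypothetical stand-in in plain Python; on an actual MongoDB server, the $lookup aggregation stage can do this kind of join for you.

```python
# Hypothetical forum data: each post stores the ID of its author.
users = {
    "u1": {"_id": "u1", "name": "Alice"},
    "u2": {"_id": "u2", "name": "Bob"},
}
posts = [
    {"_id": "p1", "user_id": "u1", "title": "Hello"},
    {"_id": "p2", "user_id": "u2", "title": "Schema tips"},
]


def create_post(post_id, user_id, title):
    """Creating a post is a single write; the users collection is untouched."""
    posts.append({"_id": post_id, "user_id": user_id, "title": title})


def build_feed():
    """Return the posts list with the author's name attached to each post."""
    return [
        {"title": p["title"], "author": users[p["user_id"]]["name"]}
        for p in posts
    ]


create_post("p3", "u1", "Second post")
for item in build_feed():
    print(item["title"], "by", item["author"])
```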

Another scenario would be that you need both, that is, post IDs in the users’ data as well as user IDs in the posts’ data. This will make creating new posts a bit slower (since you need to add IDs to the users collection as well), but getting data in both cases would be fast.

Many to many Relation

A many to many relation doesn’t require any entity to have exclusive relations. Both entities can have multiple relations. Let’s consider a simple example to understand this relationship better…

Consider the relationship between users and products in an eCommerce environment. There is a list of users and there is a list of products. Any user can buy any product, meaning a user can buy multiple products and a product can be bought by multiple users. In this case, there is just one ideal way to handle it.

There’ll be two collections, one for users and the other for products. Whenever a user buys a product, add the ID of the product as a reference in the user’s document, and since the user can buy multiple products, these IDs need to be stored as an array.

When a product needs to be updated, only that product in the products collection needs to be updated, and every user who has bought the product will automatically get the updated product, since the user documents only hold a reference to it.
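A quick sketch of the many to many setup, with dicts standing in for the two collections. The product_ids field and both helper functions are hypothetical names chosen for the example.

```python
# Hypothetical collections: users reference products by ID in an array.
products = {
    "p1": {"_id": "p1", "name": "Keyboard", "price": 40},
    "p2": {"_id": "p2", "name": "Mouse", "price": 15},
}
users = {
    "u1": {"_id": "u1", "name": "Alice", "product_ids": ["p1", "p2"]},
    "u2": {"_id": "u2", "name": "Bob", "product_ids": ["p1"]},
}


def products_for_user(user_id):
    """Resolve the user's array of product IDs into product documents."""
    return [products[pid] for pid in users[user_id]["product_ids"]]


def users_for_product(product_id):
    """Find the names of all users whose array contains the product ID."""
    return [
        u["name"] for u in users.values() if product_id in u["product_ids"]
    ]


# Update the product once; every user sees the new price via the reference.
products["p1"]["price"] = 45
print([p["price"] for p in products_for_user("u1")])  # [45, 15]
print(users_for_product("p1"))  # ['Alice', 'Bob']
```

Notice that the price change touched only one document, and no user document had to be rewritten.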

Of course, when a user buys a product, a copy of it should ideally go to another collection (maybe a bought_items collection) so that it doesn’t change when the product in the products collection gets updated (since ideally, you shouldn’t make changes to already bought products). These things really depend on the architecture of the application you’re building. That being said, I might start writing articles on database architecture planning to cover these topics soon (I hope). Until then, see ya! (I know I’m not good at ending posts 😛)


Elvis Rozario

Senior Backend Engineer at FreightBro. I like to write about NodeJS, MongoDB, and AWS. LinkedIn — https://www.linkedin.com/in/elvis-rozario-55185570