Mongoose subdocuments vs nested schema

If you have schemas that are re-used in various parts of your model, then it might be useful to define individual schemas for the child docs so you don't have to duplicate yourself.

You should use embedded documents if that are static documents or that are not more than a few hundred because of performance impact. I have gone through about that issue for a while ago. Newly, Asya Kamsky who works as a solutions architect for MongoDB had written an article about "using subdocuments".

I hope that helps to who is looking for solutions or the best practice.

Original post on http://askasya.com/post/largeembeddedarrays . You can reach her stackoverflow profile on https://stackoverflow.com/users/431012/asya-kamsky

First of all, we have to consider why we would want to do such a thing. Normally, I would advise people to embed things that they always want to get back when they are fetching this document. The flip side of this is that you don't want to embed things in the document that you don't want to get back with it.

If you embed activity I perform into the document, it'll work great at first because all of my activity is right there and with a single read you can get back everything you might want to show me: "you recently clicked on this and here are your last two comments" but what happens after six months go by and I don't care about things I did a long time ago and you don't want to show them to me unless I specifically go to look for some old activity?

First, you'll end up returning bigger and bigger document and caring about smaller and smaller portion of it. But you can use projection to only return some of the array, the real pain is that the document on disk will get bigger and it will still all be read even if you're only going to return part of it to the end user, but since my activity is not going to stop as long as I'm active, the document will continue growing and growing.

The most obvious problem with this is eventually you'll hit the 16MB document limit, but that's not at all what you should be concerned about. A document that continuously grows will incur higher and higher cost every time it has to get relocated on disk, and even if you take steps to mitigate the effects of fragmentation, your writes will overall be unnecessarily long, impacting overall performance of your entire application.

There is one more thing that you can do that will completely kill your application's performance and that's to index this ever-increasing array. What that means is that every single time the document with this array is relocated, the number of index entries that need to be updated is directly proportional to the number of indexed values in that document, and the bigger the array, the larger that number will be.

I don't want this to scare you from using arrays when they are a good fit for the data model - they are a powerful feature of the document database data model, but like all powerful tools, it needs to be used in the right circumstances and it should be used with care.

According to the docs, it's exactly the same. However, using a Schema would add an _id field as well (as long as you don't have that disabled), and presumably uses some more resources for tracking subdocs.

Alternate declaration syntax

New in v3 If you don't need access to the sub-document schema instance, you may also declare sub-docs by simply passing an object literal [...]

Mongoose subdocuments vs nested schema

Tags:

Javascript

Mongodb

Node.Js

Mongoose

Related

Recent Posts