MongoDB - What Scales Better, An Array Property, A Nested Object Property or Putting That Data In A Separate Model?

Here's a data model for an application with a REST API that will hopefully scale to be quite large (if I'm lucky). I've read about difficulties that can arise with Arrays in MongoDB, so I'm hesitant to use them. Here are the specific requirements for the application:

  • The Arrays will contain objects
  • The Array Objects will be modified frequently
  • There will be a lot of queries performed on the objects in the arrays
  • The Arrays will never exceed a maximum of 30 objects
  • I would like the REST API to send out data with Array properties, given they are simpler to work with. Of course, how we save that in the database can be different...

Given these specific requirements, which of the below models should scale better using MongoDB?

An Array for Order.products

var OrderSchema = new Schema({
    customer_name: String,
    products: {
        type: Array, 
        default: []
    }
});

A Nested Object Order.products

// Converting the array items to nested objects before saving, and performing the reverse before sending them back to the User.  
// For example, Order.products = { one: {}, two: {}, three: {} }

var OrderSchema = new Schema({
    customer_name: String,
    products: {
         type: Schema.Types.Mixed,
         default: {}
    }
});

Or Putting Those Items In A Separate Data Model...

var OrderProductSchema = new Schema({
    order_id: { type: Schema.ObjectId },
    product_title: String,
    product_price: Number
});

The answer is: it depends.

For example, it depends on your access patterns: will you often query single products, or will you only want to see complete orders?

Or on data consistency: do you care if the product info in some orders is not always up to date?

You said you want to query the array items often. That would be a point in favor of the separate product collection.

You also said there will never be more than 30 products per order and that they will be sent out as arrays. That would be easily manageable with an array of subdocuments.

So basically 1 and 3 would work best. 3 would be less work when retrieving a single product, but it means more work when looking at orders and building the products array you want to send out.
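To make the extra work for option 3 concrete, here is a minimal sketch of rebuilding the products array for each order from separately stored product documents. The helper name `attachProducts` is hypothetical, and `orders` and `orderProducts` are assumed to be the plain results of two separate queries; the field names follow the schemas in the question.

```javascript
// Hypothetical helper: attach separately stored product documents to their orders.
// `orders` and `orderProducts` stand in for the results of two queries,
// one on the orders collection and one on the order-products collection.
function attachProducts(orders, orderProducts) {
  // Group the product documents by the order they belong to.
  const byOrder = new Map();
  for (const p of orderProducts) {
    const key = String(p.order_id);
    if (!byOrder.has(key)) byOrder.set(key, []);
    byOrder.get(key).push({
      product_title: p.product_title,
      product_price: p.product_price,
    });
  }
  // Build the response shape: each order with a `products` array.
  return orders.map((o) => ({
    _id: o._id,
    customer_name: o.customer_name,
    products: byOrder.get(String(o._id)) || [],
  }));
}
```

With embedded products (option 1) this step disappears, because the order document already contains the array.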

With 1, on the other hand, it would be easier to send out complete orders with a products array, but it would be more work to keep the products consistent and to retrieve single products.
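As a rough illustration of what touching a single embedded product looks like with option 1, the sketch below builds the filter and update documents for changing one product's price. The helper names are hypothetical, and it assumes each embedded product has its own `_id` and a `product_price` field, which the untyped array in the question does not guarantee.

```javascript
// Hypothetical helpers that build the filter and update documents for
// changing one product inside an embedded `products` array.
// Assumption: each embedded product carries its own `_id`.
function productFilter(orderId, productId) {
  // Dot notation matches orders whose products array has an element with this _id.
  return { _id: orderId, 'products._id': productId };
}

function setProductPrice(newPrice) {
  // The positional `$` operator targets the first array element matched by the filter.
  return { $set: { 'products.$.product_price': newPrice } };
}

const filter = productFilter('order1', 'prod7');
const update = setProductPrice(19.99);
// With a Mongoose model this could be passed as: Order.updateOne(filter, update)
```

An index on `products._id` would likely be needed if such lookups are frequent and do not also filter by the order's `_id`.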

For questions like these, I always find the schema design posts on the MongoDB blog most helpful: part1, part2 and part3


Edit:

With the info that each product is unique and there is no need to keep them consistent, I would go with 1: embedding them in an array.

The reasoning:

So with that new info: option 1