I have a bunch of documents, this is a smaller prototype (the original documents have more fields, but those are not important for this question):
{"_id": {"$oid" : "53fedbcdd1d73a2502de6ae0"}, "commonID": "12345", "name": "Bus tour", "option_name" : "Morning tour"}
This document represents a tour option. Multiple tour options are grouped together by the commonID. Now I want a list of unique tours, so not every option. Only one option per tour.
For instance: A certain tour in Amsterdam is called "Bus tour". There are multiple options like: "Morning tour", "Afternoon Tour", "Evening Tour". I only want one document with the commonID "12345". But this for all the tours in the collection.
I read into mapReduce and Aggregation Pipelines but I couldn't find out how to approach this situation. Maybe you can suggest some options or examples I can read into.
If you want to use the aggregation framework, your aggregation pipeline should look something like that :
collection.aggregate( [ { $group: { _id: "$commonID", name : "$name", description : "$description", options: { $addToSet : {option: "$option_name", discrition :"$option_discription" }, ... } }])
Depends how you want to group your data
Take the below dataset for example:
db.test.insert({"commonID": "12345", "name": "Bus tour", "option_name" : "Morning tour"});
db.test.insert({"commonID": "12345", "name": "Bus tour", "option_name" : "Afternoon tour"});
db.test.insert({"commonID": "12345", "name": "Bus tour", "option_name" : "Evening tour"});
db.test.insert({"commonID": "12345", "name": "Car tour", "option_name" : "Evening tour"});
For your requirement, you can achieve the below result:
{
"_id" : "12345",
"name" : [ "Bus tour", "Bus tour", "Bus tour", "Car tour" ],
"options" : [ "Morning tour", "Afternoon tour", "Evening tour", "Evening tour" ]
}
For every "commonID" you can easily get an associative mapping with two arrays in output, you can consider "name" and "options" together as a key value pair, with name[0] -> key and options[0] -> value. So, for id -> "12345", there exists four names each with an option. You can remove duplicate key,value mappings in the client API.
The query to acheive this is:
db.test.aggregate([{$group:{"_id":"$commonID","name":{$push:"$name"},"options"{$push:"$option_name"}}}])
updated w.r.t to your comment:
"a document per commonID with the list of all options if possible!"