I have some documents of the form below in my conversations collection which hold a chat between multiple people. threadId holds id of one of the users in conversation between 2.
Conversation between 51c0c35b7daabc3410000001 and 51c0c35b7daabc3410000002
{
"from" : ObjectId("51c0c35b7daabc3410000001"),
"to" : ObjectId("51c0c35b7daabc3410000002"),
"threadId" : ObjectId("51c0c35b7daabc3410000002"),
"message" : "Hi there!"
}
{
"from" : ObjectId("51c0c35b7daabc3410000002"),
"to" : ObjectId("51c0c35b7daabc3410000001"),
"threadId" : ObjectId("51c0c35b7daabc3410000002"),
"message" : "How are you?"
}
{
"from" : ObjectId("51c0c35b7daabc3410000001"),
"to" : ObjectId("51c0c35b7daabc3410000002"),
"threadId" : ObjectId("51c0c35b7daabc3410000002"),
"message" : "I am good. How are you?"
}
Conversation between 51c0c35b7daabc3410000001 and 51c0c69989a44a6812000003
{
"from" : ObjectId("51c0c35b7daabc3410000001"),
"to" : ObjectId("51c0c69989a44a6812000003"),
"threadId" : ObjectId("51c0c69989a44a6812000003"),
"message" : "Hello!"
}
{
"from" : ObjectId("51c0c69989a44a6812000003"),
"to" : ObjectId("51c0c35b7daabc3410000001"),
"threadId" : ObjectId("51c0c69989a44a6812000003"),
"message" : "Hey.."
}
I want to group based on threadId and the output should be of the form
{
"from" : ObjectId("51c0c35b7daabc3410000001"),
"to" : ObjectId("51c0c35b7daabc3410000002"),
"threadId" : ObjectId("51c0c35b7daabc3410000002"),
"message" : "I am good. How are you?"
}
{
"from" : ObjectId("51c0c69989a44a6812000003"),
"to" : ObjectId("51c0c35b7daabc3410000001"),
"threadId" : ObjectId("51c0c69989a44a6812000003"),
"message" : "Hey.."
}
So I want to find all the documents with a distinct threadId (or group based only on threadId). The output should also include all the other fields in the document.
mapReduce, aggregate/group, distinct or a combination of these ? How do I go about it ?
EDIT
To put it in a SQL statement, I want something as simple as
SELECT * FROM conversations GROUP BY threadId;
I ended up doing this
// Aggregate pipeline
Conversation.aggregate(
{ $match: {
listing: new ObjectId(listingId)
}
},
{ $group: {
_id: '$threadId',
message: { $last: "$message" },
to: { $last: "$to" },
from: { $last: "$from" }
}
},
{ $project : {
_id : 0,
threadId : "$_id",
message: "$message",
to: "$to",
from: "$from"
}
},
function(err, threads) {
console.log(err);
console.log(threads);
}
);
It seems to work fine. Let me know if there is any simpler way or if this snippet can be optimized. Hope this helps someone.