MongoDB - Using the aggregation framework to compare array element overlap


I have a collection with documents structured like below:

{
  carrier: "abc",
  flightnumber: 123,
  dates: [
    ISODate("2015-01-01T00:00:00Z"),
    ISODate("2015-01-02T00:00:00Z"),
    ISODate("2015-01-03T00:00:00Z")
  ]
}

I would like to search the collection to see if there are any documents with the same carrier and flightnumber that also have dates in the dates array that overlap. For example:

{
  carrier: "abc",
  flightnumber: 123,
  dates: [
    ISODate("2015-01-01T00:00:00Z"),
    ISODate("2015-01-02T00:00:00Z"),
    ISODate("2015-01-03T00:00:00Z")
  ]
},
{
  carrier: "abc",
  flightnumber: 123,
  dates: [
    ISODate("2015-01-03T00:00:00Z"),
    ISODate("2015-01-04T00:00:00Z"),
    ISODate("2015-01-05T00:00:00Z")
  ]
}

If the above records were present in the collection, I would like to return them because they both have carrier: abc, flightnumber: 123 and they also have the date ISODate("2015-01-03T00:00:00Z") in the dates array. If this date were not present in the second document, then neither should be returned.

Typically I would do this by grouping and counting like below:

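db.flights.aggregate([
  // Group documents that share the same carrier and flight number
  { $group: {
    _id: { carrier: "$carrier", flightnumber: "$flightnumber" },
    uniqueids: { $addToSet: "$_id" },
    count: { $sum: 1 }
  }},
  // Keep only groups with more than one document
  { $match: { count: { $gt: 1 } } }
])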

But I'm not sure how to modify this to look for the array overlap. Can anyone suggest how to achieve this?

You $unwind the array if you want to look at the contents as "grouped" within them:

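db.flights.aggregate([
  // One document per array element
  { "$unwind": "$dates" },
  // Group on carrier, flight number and the individual date
  { "$group": {
    "_id": { "carrier": "$carrier", "flightnumber": "$flightnumber", "date": "$dates" },
    "count": { "$sum": 1 },
    "_ids": { "$addToSet": "$_id" }
  }},
  // A count above 1 means the date is shared, i.e. an overlap
  { "$match": { "count": { "$gt": 1 } } },
  // Presentation only: reduce to the distinct _id values involved
  { "$unwind": "$_ids" },
  { "$group": { "_id": "$_ids" } }
])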

That does in fact tell you which documents the "overlap" resides in, because the "same dates", along with the other grouping key values you are concerned about, have a "count" greater than one. That indicates the overlap.
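To make the counting logic concrete, here is a minimal plain-JavaScript sketch that imitates the $unwind, $group and $match stages in memory. It is only an illustration, not how MongoDB executes the pipeline: the function name findOverlappingIds, the numeric _id values and the third sample document are made up, and dates are simplified to strings.

```javascript
// In-memory imitation of $unwind -> $group -> $match; no MongoDB needed.
// Sample data is hypothetical: _id values and the "xyz" document are invented.
const flights = [
  { _id: 1, carrier: "abc", flightnumber: 123,
    dates: ["2015-01-01", "2015-01-02", "2015-01-03"] },
  { _id: 2, carrier: "abc", flightnumber: 123,
    dates: ["2015-01-03", "2015-01-04", "2015-01-05"] },
  { _id: 3, carrier: "xyz", flightnumber: 456,
    dates: ["2015-01-03"] }
];

function findOverlappingIds(docs) {
  // "$unwind" + "$group": one bucket per (carrier, flightnumber, date).
  const groups = new Map();
  for (const doc of docs) {
    for (const date of doc.dates) {
      const key = JSON.stringify([doc.carrier, doc.flightnumber, date]);
      if (!groups.has(key)) groups.set(key, { count: 0, ids: new Set() });
      const group = groups.get(key);
      group.count += 1;       // like { "$sum": 1 }
      group.ids.add(doc._id); // like { "$addToSet": "$_id" }
    }
  }
  // "$match" on count > 1, then collapse to the distinct _id values.
  const overlapping = new Set();
  for (const { count, ids } of groups.values()) {
    if (count > 1) ids.forEach((id) => overlapping.add(id));
  }
  return [...overlapping].sort((a, b) => a - b);
}

console.log(findOverlappingIds(flights)); // [ 1, 2 ]
```

Document 3 shares the date but not the carrier and flight number, so it lands in a different bucket and is not reported.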

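Anything after the $match is really just for "presentation", as there is no point reporting the same _id value for multiple overlaps if you just want to see which documents overlap. In fact, if you want to see them together, it is probably best to leave the "grouped set" alone.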

Now you could add a $lookup to that if retrieving the actual documents is important to you:

db.flights.aggregate([
  { "$unwind": "$dates" },
  { "$group": {
    "_id": { "carrier": "$carrier", "flightnumber": "$flightnumber", "date": "$dates" },
    "count": { "$sum": 1 },
    "_ids": { "$addToSet": "$_id" }
  }},
  { "$match": { "count": { "$gt": 1 } } },
  { "$unwind": "$_ids" },
  { "$group": { "_id": "$_ids" } },
  // Join back to the collection to fetch the full documents
  { "$lookup": {
    "from": "flights",
    "localField": "_id",
    "foreignField": "_id",
    "as": "_ids"
  }},
  { "$unwind": "$_ids" },
  // Promote the looked-up document to the top level
  { "$replaceRoot": {
    "newRoot": "$_ids"
  }}
])

The final $replaceRoot (or a $project instead) makes it return the whole document. Or you could have just done $addToSet with $$ROOT, if document size is not a problem.

But the overall point is covered in the first three pipeline stages, or mostly in just the first. If you want to work with arrays "across documents", then the primary operator is still $unwind.
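As a rough sketch of the $addToSet with $$ROOT variant mentioned above (not tested against a live server here, and assuming the same flights collection): because $$ROOT is evaluated after $unwind, each collected document carries a single date in dates rather than the original array. The variable name pipeline and the field name docs are my own choices.

```javascript
// Sketch only: collect whole documents in the group instead of just _id values.
// "pipeline" and the "docs" field name are illustrative, not from the original answer.
const pipeline = [
  { $unwind: "$dates" },
  { $group: {
    _id: { carrier: "$carrier", flightnumber: "$flightnumber", date: "$dates" },
    count: { $sum: 1 },
    // $$ROOT is the unwound document here, so "dates" holds one date, not the array.
    docs: { $addToSet: "$$ROOT" }
  }},
  { $match: { count: { $gt: 1 } } }
];

// In the mongo shell this would be run as:
// db.flights.aggregate(pipeline)
```

This avoids the $lookup round trip, at the cost of holding every matching document inside the group result.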


Alternately, for a more "reporting" like format:

db.flights.aggregate([
  // Keep a copy of the full document before the array is unwound
  { "$addFields": { "copy": "$$ROOT" } },
  { "$unwind": "$dates" },
  { "$group": {
    "_id": {
      "carrier": "$carrier",
      "flightnumber": "$flightnumber",
      "dates": "$dates"
    },
    "count": { "$sum": 1 },
    "_docs": { "$addToSet": "$copy" }
  }},
  { "$match": { "count": { "$gt": 1 } } },
  // Regroup per flight, listing each overlapping date with its documents
  { "$group": {
    "_id": {
      "carrier": "$_id.carrier",
      "flightnumber": "$_id.flightnumber"
    },
    "overlaps": {
      "$push": {
        "date": "$_id.dates",
        "_docs": "$_docs"
      }
    }
  }}
])

This reports the overlapped dates within each group and tells you which documents contain the overlap:

{
    "_id" : {
        "carrier" : "abc",
        "flightnumber" : 123.0
    },
    "overlaps" : [
        {
            "date" : ISODate("2015-01-03T00:00:00.000Z"),
            "_docs" : [
                {
                    "_id" : ObjectId("5977f9187dcd6a5f6a9b4b97"),
                    "carrier" : "abc",
                    "flightnumber" : 123.0,
                    "dates" : [
                        ISODate("2015-01-03T00:00:00.000Z"),
                        ISODate("2015-01-04T00:00:00.000Z"),
                        ISODate("2015-01-05T00:00:00.000Z")
                    ]
                },
                {
                    "_id" : ObjectId("5977f9187dcd6a5f6a9b4b96"),
                    "carrier" : "abc",
                    "flightnumber" : 123.0,
                    "dates" : [
                        ISODate("2015-01-01T00:00:00.000Z"),
                        ISODate("2015-01-02T00:00:00.000Z"),
                        ISODate("2015-01-03T00:00:00.000Z")
                    ]
                }
            ]
        }
    ]
}
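The same report shape can be imitated in plain JavaScript, which may help when reading the two $group stages. This is a sketch under assumptions: reportOverlaps is a hypothetical helper, the numeric _id values are made up, and dates are plain strings instead of ISODate values.

```javascript
// In-memory imitation of the "reporting" pipeline; no MongoDB needed.
// Sample data is hypothetical and uses string dates for simplicity.
const flights = [
  { _id: 1, carrier: "abc", flightnumber: 123,
    dates: ["2015-01-01", "2015-01-02", "2015-01-03"] },
  { _id: 2, carrier: "abc", flightnumber: 123,
    dates: ["2015-01-03", "2015-01-04", "2015-01-05"] }
];

function reportOverlaps(docs) {
  // First $group: bucket full documents by (carrier, flightnumber, date).
  const byDate = new Map();
  for (const doc of docs) {
    for (const date of doc.dates) {
      const key = JSON.stringify([doc.carrier, doc.flightnumber, date]);
      if (!byDate.has(key)) {
        byDate.set(key, {
          carrier: doc.carrier, flightnumber: doc.flightnumber,
          date, count: 0, docs: new Map()
        });
      }
      const bucket = byDate.get(key);
      bucket.count += 1;              // like { "$sum": 1 }
      bucket.docs.set(doc._id, doc);  // like $addToSet of the full copy
    }
  }
  // $match on count > 1, then second $group per (carrier, flightnumber).
  const report = new Map();
  for (const bucket of byDate.values()) {
    if (bucket.count <= 1) continue;
    const key = JSON.stringify([bucket.carrier, bucket.flightnumber]);
    if (!report.has(key)) {
      report.set(key, {
        _id: { carrier: bucket.carrier, flightnumber: bucket.flightnumber },
        overlaps: []
      });
    }
    report.get(key).overlaps.push({
      date: bucket.date,
      _docs: [...bucket.docs.values()]
    });
  }
  return [...report.values()];
}

console.log(JSON.stringify(reportOverlaps(flights), null, 2));
```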
