Check if every element in an array matches a condition

I have a collection of documents:

 {
   date: Date,
   users: [
     { user: 1, group: 1 },
     { user: 5, group: 2 }
   ]
 }

 {
   date: Date,
   users: [
     { user: 1, group: 1 },
     { user: 3, group: 2 }
   ]
 }

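I would like to query this collection to find all documents where every user id in the users array is in another array, [1, 5, 7]. In this example, only the first document matches.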

The best solution I've been able to find is:

 $where: function() {
   var ids = [1, 5, 7];
   return this.users.every(function(u) {
     return ids.indexOf(u.user) !== -1;
   });
 }

Unfortunately, this seems to hurt performance, as stated in the $where docs:

$where evaluates JavaScript and cannot take advantage of indexes.

How can I improve this query?

The query you want is this:

 db.collection.find({"users":{"$not":{"$elemMatch":{"user":{$nin:[1,5,7]}}}}})

This says: find me all documents that have no elements outside of the list 1, 5, 7.
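The double negation can be easier to follow in plain JavaScript. The sketch below is not MongoDB code: `matchesEveryUser` and `buildQuery` are hypothetical helper names, used only to illustrate that "no element is outside the list" is the same test as "every element is inside the list", and how the query document could be assembled for an arbitrary list of ids.

```javascript
// Hypothetical helper: client-side mirror of the $not / $elemMatch / $nin logic.
function matchesEveryUser(doc, allowed) {
  // "$elemMatch" + "$nin": is there at least one element whose user is NOT allowed?
  var hasOutsider = doc.users.some(function (u) {
    return allowed.indexOf(u.user) === -1;
  });
  // "$not": keep the document only when no such element exists.
  return !hasOutsider;
}

// Hypothetical helper: builds the same query document for any list of ids.
function buildQuery(allowed) {
  return { users: { $not: { $elemMatch: { user: { $nin: allowed } } } } };
}

var docs = [
  { users: [{ user: 1, group: 1 }, { user: 5, group: 2 }] },
  { users: [{ user: 1, group: 1 }, { user: 3, group: 2 }] }
];

// Only the first sample document passes the check.
var matched = docs.filter(function (d) {
  return matchesEveryUser(d, [1, 5, 7]);
});

// In the shell the query itself would be run as
// db.collection.find(buildQuery([1, 5, 7]))
```

One thing to keep in mind: a document whose `users` array is empty, or that has no `users` field at all, also satisfies the `$not` form, since it contains no offending element.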

I don't know about better, but there are a few different ways to approach this, depending on the version of MongoDB you have available.

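I'm not too sure whether this is your intention, but the query as shown will match the first document example because, as your logic is implemented, you are matching documents whose array elements must all be contained within the sample array.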

So if you actually wanted the document to contain all of those elements, then the $all operator would be the obvious choice:

 db.collection.find({ "users.user": { "$all": [ 1, 5, 7 ] } })

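But working on the presumption that your logic is actually intended, you can at least, as suggested, "filter" those results by combining with the $in operator, so that fewer documents are subject to your $where condition in evaluated JavaScript: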

 db.collection.find({
   "users.user": { "$in": [ 1, 5, 7 ] },
   "$where": function() {
     var ids = [1, 5, 7];
     return this.users.every(function(u) {
       return ids.indexOf(u.user) !== -1;
     });
   }
 })

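And you get to use an index, though the number of entries actually scanned will be multiplied by the number of elements in the arrays of the matched documents. That is still better than running without the additional filter.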
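To see why the extra `$in` clause narrows the work without changing the result for non-empty arrays, here is a plain JavaScript sketch of the two steps. The function names are made up for illustration. On the server, the first step is the part an index on `users.user` can answer, and only the survivors reach the JavaScript test.

```javascript
var ids = [1, 5, 7];

// Step 1 (what "$in" on "users.user" checks): at least one element is in the list.
function passesInFilter(doc) {
  return doc.users.some(function (u) { return ids.indexOf(u.user) !== -1; });
}

// Step 2 (the original $where body): every element is in the list.
function passesWhere(doc) {
  return doc.users.every(function (u) { return ids.indexOf(u.user) !== -1; });
}

var sample = [
  { users: [{ user: 1 }, { user: 5 }] },  // passes both steps
  { users: [{ user: 1 }, { user: 3 }] },  // passes step 1, rejected by step 2
  { users: [{ user: 2 }, { user: 4 }] }   // rejected by step 1, never reaches step 2
];

var candidates = sample.filter(passesInFilter);
var results = candidates.filter(passesWhere);
```

Note that the prefilter does drop documents with an empty `users` array, which the bare `$where` version would have accepted, so the two forms differ on that edge case.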

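Or you might even consider a logical expansion of the condition using the $or operator, possibly combined with $all and the $size operator, depending on your actual array conditions: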

 db.collection.find({
   "$or": [
     { "users.user": { "$all": [ 1, 5, 7 ] } },
     { "users.user": { "$all": [ 1, 5 ] } },
     { "users.user": { "$all": [ 1, 7 ] } },
     { "users": { "$size": 1 }, "users.user": 1 },
     { "users": { "$size": 1 }, "users.user": 5 },
     { "users": { "$size": 1 }, "users.user": 7 }
   ]
 })

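So this generates all of the possible permutations of your matching condition, but again performance will likely vary depending on your installed version.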

NOTE: This is actually a complete fail in this case, as it does something entirely different and in fact results in a logical $in.
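A small counterexample shows where the `$or` form goes wrong. The sketch emulates `$all` in plain JavaScript (the helpers `containsAll` and `everyElementAllowed` are hypothetical) and applies one of the `$or` branches to a document that carries an extra user outside the list.

```javascript
// Hypothetical stand-in for { "users.user": { "$all": wanted } }:
// true when every wanted id appears somewhere in the array.
function containsAll(doc, wanted) {
  var present = doc.users.map(function (u) { return u.user; });
  return wanted.every(function (id) { return present.indexOf(id) !== -1; });
}

// The condition actually asked for: every array element is in the list.
function everyElementAllowed(doc, allowed) {
  return doc.users.every(function (u) { return allowed.indexOf(u.user) !== -1; });
}

var doc = { users: [{ user: 1 }, { user: 5 }, { user: 3 }] };

// The "$all": [1, 5] branch accepts the document...
var viaOrBranch = containsAll(doc, [1, 5]);
// ...even though user 3 is not in the allowed list.
var intended = everyElementAllowed(doc, [1, 5, 7]);
```

`$all` only requires the listed values to be present. It says nothing about what else the array holds, which is why this form cannot express "every element is in the list".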

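The alternatives use the aggregation framework. Your mileage may vary as to which is most efficient, depending on the number of documents in your collection. One approach with MongoDB 2.6 and upwards: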

 db.problem.aggregate([
   // Match documents that "could" meet the conditions
   { "$match": { "users.user": { "$in": [ 1, 5, 7 ] } }},

   // Keep your original document and a copy of the array
   { "$project": {
     "_id": { "_id": "$_id", "date": "$date", "users": "$users" },
     "users": 1
   }},

   // Unwind the array copy
   { "$unwind": "$users" },

   // Just keeping the "user" element value
   { "$group": {
     "_id": "$_id",
     "users": { "$push": "$users.user" }
   }},

   // Compare to see if all elements are a member of the desired match
   { "$project": {
     "match": { "$setEquals": [
       { "$setIntersection": [ "$users", [ 1, 5, 7 ] ] },
       "$users"
     ]}
   }},

   // Filter out any documents that did not match
   { "$match": { "match": true } },

   // Return the original document form
   { "$project": {
     "_id": "$_id._id",
     "date": "$_id.date",
     "users": "$_id.users"
   }}
 ])

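So that approach uses some newly introduced set operators to compare the contents, though of course you need to restructure the array in order to make the comparison.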

As pointed out, there is a direct operator for this, $setIsSubset, which does the equivalent of the combined operators above in a single operator:

 db.collection.aggregate([
   { "$match": { "users.user": { "$in": [ 1, 5, 7 ] } }},
   { "$project": {
     "_id": { "_id": "$_id", "date": "$date", "users": "$users" },
     "users": 1
   }},
   { "$unwind": "$users" },
   { "$group": {
     "_id": "$_id",
     "users": { "$push": "$users.user" }
   }},
   { "$project": {
     "match": { "$setIsSubset": [ "$users", [ 1, 5, 7 ] ] }
   }},
   { "$match": { "match": true } },
   { "$project": {
     "_id": "$_id._id",
     "date": "$_id.date",
     "users": "$_id.users"
   }}
 ])
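The equivalence between the two pipelines comes down to a set identity: A is a subset of B exactly when the intersection of A and B equals A. A plain JavaScript sketch of that identity follows, with hypothetical helper functions standing in for `$setIntersection`, `$setEquals` and `$setIsSubset`. They are illustrations of the set logic, not the server's implementation.

```javascript
// Remove duplicates so that arrays behave like sets.
function unique(arr) {
  return arr.filter(function (v, i) { return arr.indexOf(v) === i; });
}

function setIntersection(a, b) {
  return unique(a).filter(function (v) { return b.indexOf(v) !== -1; });
}

function setEquals(a, b) {
  var ua = unique(a), ub = unique(b);
  return ua.length === ub.length &&
    ua.every(function (v) { return ub.indexOf(v) !== -1; });
}

function setIsSubset(a, b) {
  return a.every(function (v) { return b.indexOf(v) !== -1; });
}

var allowed = [1, 5, 7];
var cases = [[1, 5], [1, 3], [7, 7, 1], [5]];

// For every sample array, both formulations give the same answer.
var agree = cases.every(function (users) {
  return setEquals(setIntersection(users, allowed), users) ===
         setIsSubset(users, allowed);
});
```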

Or with a different approach, while still taking advantage of the $size operator from MongoDB 2.6:

 db.collection.aggregate([
   // Match documents that "could" meet the conditions
   { "$match": { "users.user": { "$in": [ 1, 5, 7 ] } }},

   // Keep your original document and a copy of the array
   // and a note of its current size
   { "$project": {
     "_id": { "_id": "$_id", "date": "$date", "users": "$users" },
     "users": 1,
     "size": { "$size": "$users" }
   }},

   // Unwind the array copy
   { "$unwind": "$users" },

   // Filter array contents that do not match
   { "$match": { "users.user": { "$in": [ 1, 5, 7 ] } }},

   // Count the array elements that did match
   { "$group": {
     "_id": "$_id",
     "size": { "$first": "$size" },
     "count": { "$sum": 1 }
   }},

   // Compare the original size to the matched count
   { "$project": {
     "match": { "$eq": [ "$size", "$count" ] }
   }},

   // Filter out documents that were not the same
   { "$match": { "match": true } },

   // Return the original document form
   { "$project": {
     "_id": "$_id._id",
     "date": "$_id.date",
     "users": "$_id.users"
   }}
 ])
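The counting variant rests on a simple observation: if filtering the array by the list removes nothing, the original length and the matched count are equal. In plain JavaScript, with a hypothetical helper name and comments mapping each step to the pipeline stage it imitates:

```javascript
function sizeEqualsMatchedCount(users, allowed) {
  // "$size" on the original array
  var size = users.length;
  // "$unwind" + "$match" + { "$sum": 1 }: count the elements that are in the list
  var count = users.filter(function (u) {
    return allowed.indexOf(u.user) !== -1;
  }).length;
  // "$eq": [ "$size", "$count" ]
  return size === count;
}

var first = sizeEqualsMatchedCount([{ user: 1 }, { user: 5 }], [1, 5, 7]);
var second = sizeEqualsMatchedCount([{ user: 1 }, { user: 3 }], [1, 5, 7]);
```

Unlike the set-based comparison, this works on raw element counts, so duplicate user ids in the array need no special handling.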

Which of course can still be done, though a little more long-windedly, in versions prior to 2.6:

 db.collection.aggregate([
   // Match documents that "could" meet the conditions
   { "$match": { "users.user": { "$in": [ 1, 5, 7 ] } }},

   // Keep your original document and a copy of the array
   { "$project": {
     "_id": { "_id": "$_id", "date": "$date", "users": "$users" },
     "users": 1
   }},

   // Unwind the array copy
   { "$unwind": "$users" },

   // Group it back to get its original size
   { "$group": {
     "_id": "$_id",
     "users": { "$push": "$users" },
     "size": { "$sum": 1 }
   }},

   // Unwind the array copy again
   { "$unwind": "$users" },

   // Filter array contents that do not match
   { "$match": { "users.user": { "$in": [ 1, 5, 7 ] } }},

   // Count the array elements that did match
   { "$group": {
     "_id": "$_id",
     "size": { "$first": "$size" },
     "count": { "$sum": 1 }
   }},

   // Compare the original size to the matched count
   { "$project": {
     "match": { "$eq": [ "$size", "$count" ] }
   }},

   // Filter out documents that were not the same
   { "$match": { "match": true } },

   // Return the original document form
   { "$project": {
     "_id": "$_id._id",
     "date": "$_id.date",
     "users": "$_id.users"
   }}
 ])

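That generally rounds out the different ways. Try them out and see what works best for you. In all likelihood, the simple combination of $in with your existing form is going to be the best one. But in all cases, make sure you have an index that can be selected: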

 db.collection.ensureIndex({ "users.user": 1 })

That will give you the best performance as long as you are accessing it in some way, as all the examples here do.

Verdict

I was intrigued by this, so I ultimately contrived a test case to see which approach had the best performance. First, some test data generation:

 var batch = [];
 for ( var n = 1; n <= 10000; n++ ) {
   // Random array length between 1 and 10
   var elements = Math.floor(Math.random()*10)+1;

   var obj = { date: new Date(), users: [] };
   for ( var x = 0; x < elements; x++ ) {
     // Random user and group values between 1 and 10
     var user = Math.floor(Math.random()*10)+1,
         group = Math.floor(Math.random()*10)+1;

     obj.users.push({ user: user, group: group });
   }

   batch.push( obj );

   // Insert in batches of 500 documents
   if ( n % 500 == 0 ) {
     db.problem.insert( batch );
     batch = [];
   }
 }

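With 10000 documents in the collection, each holding an array of random length 1..10 with random values of 1..10, I came to a match count of 430 documents (reduced from the 7749 matched by $in) with the following results (avg):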

JavaScript with $in clause: 420ms
Aggregate with $size: 395ms
Aggregate with group array count: 650ms
Aggregate with two set operators: 275ms
Aggregate with $setIsSubset: [timing missing]

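Note that over the samples taken, all but the last two had a peak variance of approximately 100ms faster, and the last two both exhibited a 220ms response. The largest variations were in the JavaScript query, which also exhibited results 100ms slower.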

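But the point here is that this is relative to the hardware, which on my laptop under a VM is not particularly great, but it gives an idea.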
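The answer does not say how the averages were collected. One possible way to gather comparable numbers on your own hardware is a small timing helper such as the sketch below. `averageMs` is a hypothetical name, and the workload shown is only a stand-in for a real query.

```javascript
// Hypothetical helper: runs `fn` several times and returns
// the mean wall-clock time in milliseconds.
function averageMs(fn, runs) {
  var total = 0;
  for (var i = 0; i < runs; i++) {
    var start = Date.now();
    fn();
    total += Date.now() - start;
  }
  return total / runs;
}

// Stand-in workload. In the mongo shell this could instead be something like
//   function () { db.problem.find(query).itcount(); }
// so that the cursor is fully exhausted on each run.
var avg = averageMs(function () {
  var s = 0;
  for (var i = 0; i < 100000; i++) { s += i; }
  return s;
}, 5);
```

Taking several runs matters here because, as noted above, individual samples varied by around 100ms.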

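So the aggregate, and specifically the MongoDB 2.6.1 version with set operators, clearly wins on performance, with an additional slight gain coming from $setIsSubset as a single operator.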

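This is particularly interesting given that (as indicated by the 2.4 compatible method) the largest cost in this process is the $unwind statement (over 100ms avg). So with the $in selection having a mean of around 32ms, the rest of the pipeline stages execute in less than 100ms on average. That gives a relative idea of aggregation versus JavaScript performance.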
