NumPy: comparing elements in two arrays

Has anyone ever run into this problem? Say you have two arrays like the following:

    a = array([1,2,3,4,5,6])
    b = array([1,4,5])

Is there a way to find which elements of a are present in b? For example,

    c = a == b  # Wishful example here
    print(c)
    array([1,4,5])
    # Or even better
    array([True, False, False, True, True, False])

I'm trying to avoid loops, since they would take ages with millions of elements. Any ideas?

Cheers

Actually, there is an even simpler solution than any of these:

    import numpy as np

    a = np.array([1,2,3,4,5,6])
    b = np.array([1,4,5])
    # True for every element of a that also occurs in b
    c = np.in1d(a,b)

The resulting c is:

    array([ True, False, False,  True,  True, False], dtype=bool)
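For reference, here is a minimal sketch of the same membership test on newer NumPy. It assumes NumPy 1.13 or later, where np.isin is available (np.in1d is the older spelling and is deprecated in recent releases). The mask can also be used to pull out the matching values, which covers the first form asked for in the question.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
b = np.array([1, 4, 5])

# Boolean mask: True where the element of a also occurs in b
mask = np.isin(a, b)

# Boolean indexing keeps only the matching values of a
common = a[mask]

print(mask)
print(common)
```

Unlike the set functions, this keeps the order and any duplicates of a, because it only filters a.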

Use np.intersect1d.

    #!/usr/bin/env python
    import numpy as np

    a = np.array([1,2,3,4,5,6])
    b = np.array([1,4,5])
    c = np.intersect1d(a,b)
    print(c)
    # [1 4 5]

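Note that np.intersect1d gives the wrong answer if a or b has non-unique elements. In that case, use np.intersect1d_nu. (This describes the NumPy release current when this was written; later releases handle non-unique input in np.intersect1d itself.)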

There are also np.setdiff1d, setxor1d, setmember1d and union1d. See the Numpy Example List With Doc.
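As a rough illustration of what those set routines return, here is a small sketch. The extra value 7 in b is an assumption added so that the results differ from one another. setmember1d is left out because it is not available in current NumPy releases.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
b = np.array([1, 4, 5, 7])  # 7 is added here so the results differ

only_in_a = np.setdiff1d(a, b)   # values in a that are not in b
in_one_only = np.setxor1d(a, b)  # values in exactly one of the two arrays
in_either = np.union1d(a, b)     # sorted unique values from both arrays
in_both = np.intersect1d(a, b)   # sorted unique values common to both

print(only_in_a, in_one_only, in_either, in_both)
```

All four return sorted, unique values rather than a mask over a, so they answer "which values" and not "which positions".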

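Thanks for your reply, kaiser.se. It's not quite what I was looking for, but with a suggestion from a friend and what you said I came up with the following.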

    import numpy as np

    a = np.array([1,4,5]).astype(np.float32)
    b = np.arange(10).astype(np.float32)
    # Assigning matching values from a in b as np.nan
    b[b.searchsorted(a)] = np.nan
    # Now generating Boolean arrays
    match = np.isnan(b)
    nonmatch = match == False

It's a tedious process, but it beats writing loops or using weave with loops.

Cheers

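Numpy has a function numpy.setmember1d() that works on sorted and uniqued arrays and returns exactly the boolean array you want. If the input arrays don't meet those criteria, you need to convert them to the set format and then invert the transformation on the result.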

    import numpy as np

    a = np.array([6,1,2,3,4,5,6])
    b = np.array([1,4,5])
    # convert to the uniqued form
    # (np.unique1d and np.setmember1d are names from older NumPy releases)
    a_set, a_inv = np.unique1d(a, return_inverse=True)
    b_set = np.unique1d(b)
    # calculate matching elements
    matches = np.setmember1d(a_set, b_set)
    # invert the transformation
    result = matches[a_inv]
    print(result)
    # [False  True False False  True  True False]
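The same unique, match, invert round trip can be sketched with functions available in current NumPy. This is an adaptation, not the original answer's code: np.unique stands in for unique1d and np.isin stands in for setmember1d, on the assumption that NumPy 1.13 or later is installed.

```python
import numpy as np

a = np.array([6, 1, 2, 3, 4, 5, 6])
b = np.array([1, 4, 5])

# Convert to the uniqued form, remembering how to map back to a
a_set, a_inv = np.unique(a, return_inverse=True)
b_set = np.unique(b)

# Membership test on the uniqued arrays (np.isin stands in for setmember1d)
matches = np.isin(a_set, b_set, assume_unique=True)

# Invert the transformation to get one flag per element of a
result = matches[a_inv]
print(result)
```

Since np.isin accepts unsorted, non-unique input directly, np.isin(a, b) alone should give the same mask; the round trip is shown only to mirror the steps described here.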

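Edit: Unfortunately, the setmember1d method in numpy is really inefficient. The searchsorted-and-assign method you proposed works faster, but if you can assign directly, you might as well assign directly to the result and avoid a lot of unnecessary copying. Your method will also fail if b contains anything not in a. The following corrects those errors: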

    # Assumes a is sorted and has no repeated values
    result = np.zeros(a.shape, dtype=bool)
    idxs = a.searchsorted(b)
    in_range = idxs < a.shape[0]  # Filter out out of range values
    idxs, b_kept = idxs[in_range], b[in_range]
    idxs = idxs[a[idxs] == b_kept]  # Filter out where there isn't an actual match
    result[idxs] = True
    print(result)

My benchmarks show this at 91 µs, versus 6.6 ms for your approach and 109 ms for numpy's setmember1d, on a 1M-element a and a 100-element b.
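To repeat a comparison of this kind on your own machine, a timing harness along the following lines could be used. It is only a sketch: via_searchsorted is a hypothetical helper wrapping the searchsorted approach, np.isin stands in for setmember1d (which current NumPy no longer provides), and the timings depend on hardware and NumPy version, so none are quoted here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a = np.arange(1_000_000)                    # sorted, unique, 1M elements
b = rng.choice(a, size=100, replace=False)  # 100 values drawn from a

def via_searchsorted(a, b):
    """Mark the elements of a sorted, unique array a that occur in b."""
    result = np.zeros(a.shape, dtype=bool)
    idxs = a.searchsorted(b)
    in_range = idxs < a.shape[0]              # drop out-of-range positions
    idxs, kept = idxs[in_range], b[in_range]
    result[idxs[a[idxs] == kept]] = True      # keep only real matches
    return result

# Average wall-clock time per call over a few runs
t_search = timeit.timeit(lambda: via_searchsorted(a, b), number=5) / 5
t_isin = timeit.timeit(lambda: np.isin(a, b), number=5) / 5
print(f"searchsorted: {t_search * 1e6:.0f} us, np.isin: {t_isin * 1e6:.0f} us")
```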

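ebresset, your answer won't work unless a is a subset of b (and a and b are sorted). Otherwise searchsorted will return false indices. I had to do something similar, and combined it with your code: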

    import numpy

    # Assume a and b are sorted
    idxs = numpy.mod(b.searchsorted(a),len(b))
    idxs = idxs[b[idxs]==a]
    b[idxs] = numpy.nan
    match = numpy.isnan(b)
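A variant of the same idea can return the mask over a directly, without overwriting b with NaN and without requiring float arrays. The sketch below is one way to do it under stated assumptions: isin_via_searchsorted is a hypothetical helper name, b is sorted inside the function, and out-of-range positions are clamped instead of wrapped with mod.

```python
import numpy as np

def isin_via_searchsorted(a, b):
    """Return a boolean mask over a: True where the value occurs in b."""
    a = np.asarray(a)
    b_sorted = np.sort(np.asarray(b))
    if b_sorted.size == 0:
        return np.zeros(a.shape, dtype=bool)
    # Position where each element of a would be inserted into b_sorted
    idx = np.searchsorted(b_sorted, a)
    # Values larger than everything in b get index len(b); clamp them
    idx = np.minimum(idx, b_sorted.size - 1)
    # A true match only where the value found at that position equals a
    return b_sorted[idx] == a

a = np.array([6, 1, 2, 3, 4, 5, 6, 9])
b = np.array([5, 1, 4])
mask = isin_via_searchsorted(a, b)
print(mask)
```

Because the lookup runs from a into b, repeated or unsorted values in a are handled, and values of a that are missing from b simply come out as False.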

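Your example implies set-like behavior, caring more about existence in the array than about having the right element at the right place. Numpy does this differently with its mathematical arrays and matrices: it will only tell you about items at the exact same position. Can you make that work for you?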

    >>> import numpy
    >>> a = numpy.array([1,2,3])
    >>> b = numpy.array([1,3,3])
    >>> a == b
    array([ True, False,  True], dtype=bool)
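If positional == is the tool at hand, broadcasting can turn it into a membership test by comparing every element of a with every element of b. The sketch below assumes b is small, because the intermediate table has len(a) * len(b) entries and becomes expensive for large inputs.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
b = np.array([1, 4, 5])

# a[:, None] has shape (6, 1); comparing with b (shape (3,)) broadcasts
# to a (6, 3) table where entry [i, j] is a[i] == b[j]
table = a[:, None] == b

# An element of a is "in b" if any column in its row matches
mask = table.any(axis=1)
print(mask)
```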
