python相当于filter（）获取两个输出列表（即列表分区）

比方说，我有一个列表和一个过滤function。使用类似的东西

>>> filter(lambda x: x > 10, [1,4,12,7,42]) [12, 42]

我可以得到符合标准的元素。有没有一个我可以使用的函数会输出两个列表，其中一个元素匹配，其余元素之一？我可以调用filter()函数两次，但这有点难看:)

编辑：元素的顺序应该保存，我可能有多个相同的元素。

尝试这个：

 def partition(pred, iterable): trues = [] falses = [] for item in iterable: if pred(item): trues.append(item) else: falses.append(item) return trues, falses

用法：

 >>> trues, falses = partition(lambda x: x > 10, [1,4,12,7,42]) >>> trues [12, 42] >>> falses [1, 4, 7]

在itertools食谱中还有一个实现build议：

 from itertools import filterfalse, tee def partition(pred, iterable): 'Use a predicate to partition entries into false entries and true entries' # partition(is_odd, range(10)) --> 0 2 4 6 8 and 1 3 5 7 9 t1, t2 = tee(iterable) return filterfalse(pred, t1), filter(pred, t2)

配方来自Python 3.x文档。在Python 2.x filterfalse被称为ifilterfalse 。

 >>> def partition(l, p): ... return reduce(lambda x, y: (x[0]+[y], x[1]) if p(y) else (x[0], x[1]+[y]), l, ([], [])) ... >>> partition([1, 2, 3, 4, 5], lambda x: x < 3) ([1, 2], [3, 4, 5])

和上面的代码稍微丑陋，但更快的版本：

 def partition(l, p): return reduce(lambda x, y: x[0].append(y) or x if p(y) else x[1].append(y) or x, l, ([], []))

这是第二次编辑，但我认为这很重要：

  def partition(l, p): return reduce(lambda x, y: x[not p(y)].append(y) or x, l, ([], []))

第二个和第三个和迭代的一样快，但代码less。

我认为groupby在这里可能更有意义：

http://docs.python.org/library/itertools.html#itertools.groupby

例如，将列表分成奇数和偶数（或可以是任意数量的组）：

 >>> l=range(6) >>> key=lambda x: x % 2 == 0 >>> from itertools import groupby >>> {k:list(g) for k,g in groupby(sorted(l,key=key),key=key)} {False: [1, 3, 5], True: [0, 2, 4]}

如果你的列表中没有重复的元素，你可以使用set：

 >>> a = [1,4,12,7,42] >>> b = filter(lambda x: x > 10, [1,4,12,7,42]) >>> no_b = set(a) - set(b) set([1, 4, 7])

或者你可以通过一个列表可以理解：

 >>> no_b = [i for i in a if i not in b]

注意：这不是一个函数，只是知道第一个fitler（）的结果，你可以推导出那些没有太多过滤标准的元素。

 from itertools import ifilterfalse def filter2(predicate, iterable): return filter(predicate, iterable), list(ifilterfalse(predicate, iterable))

我刚刚有这个要求。我不喜欢itertools配方，因为它涉及两个单独的数据传递。这是我的实现：

 def filter_twoway(test, data): "Like filter(), but returns the passes AND the fails as two separate lists" collected = {True: [], False: []} for datum in data: collected[test(datum)].append(datum) return (collected[True], collected[False])

每个人似乎都认为他们的解决scheme是最好的，所以我决定用timeit来testing他们。我用“def is_odd（x）：return x＆1”作为我的谓词函数，而“xrange（1000）”作为迭代器。这是我的Python版本：

 Python 2.7.3 (v2.7.3:70274d53c1dd, Apr 9 2012, 20:52:43) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin

这里是我的testing结果：

 Mark Byers 1000 loops, best of 3: 325 usec per loop cldy 1000 loops, best of 3: 1.96 msec per loop Dan S 1000 loops, best of 3: 412 usec per loop TTimo 1000 loops, best of 3: 503 usec per loop

这些都是可比的对方。现在，让我们尝试使用Python文档中给出的示例。

 import itertools def partition(pred, iterable, # Optimized by replacing global lookups with local variables # defined as default values. filter=itertools.ifilter, filterfalse=itertools.ifilterfalse, tee=itertools.tee): 'Use a predicate to partition entries into false entries and true entries' # partition(is_odd, range(10)) --> 0 2 4 6 8 and 1 3 5 7 9 t1, t2 = tee(iterable) return filterfalse(pred, t1), filter(pred, t2)

这似乎要快一点。

 100000 loops, best of 3: 2.58 usec per loop

itertools示例代码跳动至less100倍！道德是，不要重复发明轮子。

TL; DR

被马克·拜尔斯（ Mark Byers）接受的，得票数最多的答案 [1]

 def partition(pred, iterable): trues = [] falses = [] for item in iterable: if pred(item): trues.append(item) else: falses.append(item) return trues, falses

是最简单和最快的。

标准化不同的方法

已经提出的不同方法可以大致分为三类，

直接通过lis.append列表操作，返回一个2元组列表，
lis.append由函数方法调用，返回一个2元组的列表，
使用itertools罚款文档中给出的规范的配方，返回一个2元组，简而言之，发电机。

这里遵循三种技术的香草实现，首先是function方法，然后itertools和最终两个不同的直接列表操作的实现，使用False的替代是零， True是一个技巧。

请注意，这是Python3 – 因此reduce来自functools – 和OP请求一个元组(positives, negatives)但我的实现都返回(negatives, positives) …

 $ ipython Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:51:32) Type 'copyright', 'credits' or 'license' for more information IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help. In [1]: import functools ...: ...: def partition_fu(p, l, r=functools.reduce): ...: return r(lambda x, y: x[p(y)].append(y) or x, l, ([], [])) ...: In [2]: import itertools ...: ...: def partition_it(pred, iterable, ...: filterfalse=itertools.filterfalse, ...: tee=itertools.tee): ...: t1, t2 = tee(iterable) ...: return filterfalse(pred, t1), filter(pred, t2) ...: In [3]: def partition_li(p, l): ...: a, b = [], [] ...: for n in l: ...: if p(n): ...: b.append(n) ...: else: ...: a.append(n) ...: return a, b ...: In [4]: def partition_li_alt(p, l): ...: x = [], [] ...: for n in l: x[p(n)].append(n) ...: return x ...:

我们需要一个谓词来适用于我们的列表和清单（再次，松散地说）来运行。

 In [5]: p = lambda n:n%2 In [6]: five, ten = range(50000), range(100000)

为了克服在testingitertools方法的问题，这是由joeln在2013年10月31日在6:17报告

废话。您已经计算了在filterfalse和filter构build生成器所需的时间，但是您尚未迭代input或调用pred一次！ itertools配方的优点在于它没有实现任何列表，或者在input中超出必要的位置。它调用pred是Byers等的两倍。

我已经想到了一个void实例，它实例化了由不同分区函数返回的两个迭代元素中的所有元素。

首先，我们使用两个固定列表来了解重载的含义（使用非常方便的IPython的magic %timeit ）

 In [7]: %timeit for e, o in zip(five, five): pass 4.21 ms ± 39.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

接下来，我们依次使用不同的实现

 In [8]: %timeit for e, o in zip(*partition_fu(p, ten)): pass 53.9 ms ± 112 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) In [9]: %timeit for e, o in zip(*partition_it(p, ten)): pass 44.5 ms ± 3.84 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) In [10]: %timeit for e, o in zip(*partition_li(p, ten)): pass 36.3 ms ± 101 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) In [11]: %timeit for e, o in zip(*partition_li_alt(p, ten)): pass 37.3 ms ± 109 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) In [12]:

注释

最简单的方法也是最快的。

使用x[p(n)]技巧是没有用的，因为在每一步你都要索引一个数据结构，给你一个小小的惩罚 – 不过要知道你是否想要说服下降的文化的幸存者pythonizing。

函数式方法在操作上等同于可选的append实现，速度较慢了50％，这可能是因为我们对每个列表元素都有一个额外的（w / r到谓词评估）函数调用。

itertools方法具有（习惯的）优点，即没有可能的大型列表实例化，并且input列表没有完全处理，如果你打破了消费者循环，但是当我们使用它时，由于需要应用tee两端的谓语

在旁边

我已经爱上了Marii公开的object.mutate() or object成语，他们的回答显示了这个问题的function性方法 – 恐怕迟早我会滥用它。

脚注

[1]接受和最多投票今天，2017年9月14日 – 但当然，我对这个答案我的最高希望！

已经有很多好的答案了。我喜欢用这个：

 def partition( pred, iterable ): def _dispatch( ret, v ): if ( pred( v ) ): ret[0].append( v ) else: ret[1].append( v ) return ret return reduce( _dispatch, iterable, ( [], [] ) ) if ( __name__ == '__main__' ): import random seq = range( 20 ) random.shuffle( seq ) print( seq ) print( partition( lambda v : v > 10, seq ) )

你可以看看django.utils.functional.partition解决scheme：

 def partition(predicate, values): """ Splits the values into two sets, based on the return value of the function (True/False). eg: >>> partition(lambda x: x > 3, range(5)) [0, 1, 2, 3], [4] """ results = ([], []) for item in values: results[predicate(item)].append(item) return results

在我看来，这是这里介绍的最优雅的解决scheme。

这部分没有logging，只有源代码可以在https://docs.djangoproject.com/en/dev/_modules/django/utils/functional/上find

python相当于filter（）获取两个输出列表（即列表分区）

TL; DR

标准化不同的方法

注释

在旁边

脚注

如何将一个向量分成规则的，连续的序列组？