How do you split a list into evenly sized chunks?

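I have a list of arbitrary length, and I need to split it into equal-sized chunks and operate on each one. There are some obvious ways to do this, such as keeping a counter and two lists: when the second list fills up, append it to the first and empty it for the next round of data. But that is potentially very expensive.

I was wondering whether anyone has a good solution for lists of any length, e.g. using generators.

I was looking for something useful in itertools, but I couldn't find anything obviously useful. I might have missed it, though.

Related question: What is the most "pythonic" way to iterate over a list in chunks?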

Here's a generator that yields the chunks you want:

    def chunks(l, n):
        """Yield successive n-sized chunks from l."""
        for i in range(0, len(l), n):
            yield l[i:i + n]

    import pprint
    # list(...) around range() so that Python 3 prints lists rather than range objects
    pprint.pprint(list(chunks(list(range(10, 75)), 10)))
    [[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
     [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
     [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
     [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
     [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
     [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
     [70, 71, 72, 73, 74]]

If you're using Python 2, you should use xrange() instead of range():

    def chunks(l, n):
        """Yield successive n-sized chunks from l."""
        for i in xrange(0, len(l), n):
            yield l[i:i + n]

You can also simply use a list comprehension instead of writing a function. Python 3:

    [l[i:i + n] for i in range(0, len(l), n)]

Python 2 version:

    [l[i:i + n] for i in xrange(0, len(l), n)]

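If you want something super simple:

    def chunks(l, n):
        n = max(1, n)  # guard against a zero or negative chunk size
        # Python 2; use range() instead of xrange() in Python 3
        return (l[i:i+n] for i in xrange(0, len(l), n))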

Directly from the (old) Python documentation (recipes for itertools):

    from itertools import izip, chain, repeat  # Python 2

    def grouper(n, iterable, padvalue=None):
        "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
        return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)

The current version, as suggested by J.F. Sebastian:

    # from itertools import izip_longest as zip_longest  # for Python 2.x
    from itertools import zip_longest  # for Python 3.x
    # from six.moves import zip_longest  # for both (uses the six compat library)

    def grouper(n, iterable, padvalue=None):
        "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
        return zip_longest(*[iter(iterable)]*n, fillvalue=padvalue)

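I guess Guido's time machine works, worked, will work, will have worked, was working again.

These solutions work because [iter(iterable)]*n (or the equivalent in the earlier version) creates one iterator, repeated n times in the list. izip_longest then effectively performs a round-robin over "each" iterator; because they are all the same iterator, each such call advances it, so every zip round-robin produces one tuple of n items.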
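To make the shared-iterator point concrete, here is a minimal sketch. The variable names are illustrative and not taken from the answer. It contrasts one iterator referenced n times with n independent iterators over the same data.

```python
from itertools import zip_longest

data = "abcdefg"
n = 3

# One iterator object referenced n times: all references share one position.
shared = [iter(data)] * n
grouped = list(zip_longest(*shared, fillvalue="x"))

# n independent iterators, for contrast: each starts from the beginning.
independent = [iter(data) for _ in range(n)]
repeated = list(zip_longest(*independent, fillvalue="x"))

print(grouped)   # consecutive items grouped n at a time, last group padded
print(repeated)  # every item repeated n times, no chunking at all
```

Only the first form chunks the input, because every pull by zip_longest advances the single underlying iterator.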

Here is a generator that works on arbitrary iterables:

    import itertools

    def split_seq(iterable, size):
        it = iter(iterable)
        item = list(itertools.islice(it, size))
        while item:
            yield item
            item = list(itertools.islice(it, size))

Example:

    >>> import pprint
    >>> pprint.pprint(list(split_seq(xrange(75), 10)))
    [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
     [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
     [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
     [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
     [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
     [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
     [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
     [70, 71, 72, 73, 74]]

I know this is kind of old, but I don't know why nobody mentioned numpy.array_split:

    import numpy as np

    lst = range(50)
    # note: the second argument is the number of sections, not the chunk size
    In [26]: np.array_split(lst, 5)
    Out[26]:
    [array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
     array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),
     array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29]),
     array([30, 31, 32, 33, 34, 35, 36, 37, 38, 39]),
     array([40, 41, 42, 43, 44, 45, 46, 47, 48, 49])]

    def chunk(input, size):
        # Python 2 only: map(None, ...) zips the iterators and pads the last chunk with None
        return map(None, *([iter(input)] * size))

I'm surprised nobody has thought of using iter's two-argument form:

    from itertools import islice

    def chunk(it, size):
        it = iter(it)
        return iter(lambda: tuple(islice(it, size)), ())

Demo:

    >>> list(chunk(range(14), 3))
    [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
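For readers unfamiliar with the two-argument form: iter(callable, sentinel) keeps calling the callable and stops as soon as it returns a value equal to the sentinel. The sketch below repeats the function so that it runs on its own, and steps through it by hand; the names `pieces`, `first` and `rest` are illustrative only.

```python
from itertools import islice

def chunk(it, size):
    it = iter(it)
    # The lambda returns the next `size` items as a tuple. Once `it` is
    # exhausted it returns (), which equals the sentinel and ends iteration.
    return iter(lambda: tuple(islice(it, size)), ())

pieces = chunk(range(7), 3)
first = next(pieces)  # only the first three items have been pulled so far
rest = list(pieces)   # the remaining chunks, including the short last one

print(first)
print(rest)
```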

This works with any iterable and produces output lazily. It returns tuples rather than iterators, but I think it has a certain elegance nonetheless. It also doesn't pad; if you want padding, a simple variation on the above will suffice:

    from itertools import islice, chain, repeat

    def chunk_pad(it, size, padval=None):
        it = chain(iter(it), repeat(padval))
        return iter(lambda: tuple(islice(it, size)), (padval,) * size)

Demo:

    >>> list(chunk_pad(range(14), 3))
    [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
    >>> list(chunk_pad(range(14), 3, 'a'))
    [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

Like the izip_longest-based solutions, the above always pads. As far as I know, there's no one- or two-line itertools recipe for a function that optionally pads. By combining the two approaches above, this one comes pretty close:

    from itertools import islice, chain, repeat

    _no_padding = object()

    def chunk(it, size, padval=_no_padding):
        if padval is _no_padding:  # identity check against the private sentinel
            it = iter(it)
            sentinel = ()
        else:
            it = chain(iter(it), repeat(padval))
            sentinel = (padval,) * size
        return iter(lambda: tuple(islice(it, size)), sentinel)

Demo:

    >>> list(chunk(range(14), 3))
    [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
    >>> list(chunk(range(14), 3, None))
    [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
    >>> list(chunk(range(14), 3, 'a'))
    [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

I believe this is the shortest chunker proposed that offers optional padding.

Simple yet elegant:

    l = range(1, 1000)
    print [l[x:x+10] for x in xrange(0, len(l), 10)]  # Python 2 syntax

Or if you prefer:

    chunks = lambda l, n: [l[x: x+n] for x in xrange(0, len(l), n)]
    chunks(l, 10)

I saw the most awesome Python-ish answer in a duplicate of this question:

    from itertools import zip_longest

    a = range(1, 16)
    i = iter(a)
    r = list(zip_longest(i, i, i))
    >>> print(r)
    [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12), (13, 14, 15)]

You can create n-tuples for any n. If a = range(1, 15), then the result will be:

    [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12), (13, 14, None)]

If the list divides evenly, you can replace zip_longest with zip; otherwise the triplet (13, 14, None) would be lost. Python 3 is used above. For Python 2, use izip_longest.
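The difference between the two functions is easy to check directly. This is a small sketch with illustrative variable names, using a 14-item range that does not divide evenly by 3.

```python
from itertools import zip_longest

a = range(1, 15)  # 14 items, not a multiple of 3

i = iter(a)
with_zip = list(zip(i, i, i))  # stops when the shared iterator runs dry mid-tuple

j = iter(a)
with_zip_longest = list(zip_longest(j, j, j))  # pads the final tuple with None

print(with_zip)
print(with_zip_longest)
```

With zip, the trailing items 13 and 14 are consumed from the iterator but never returned.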

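Critique of other answers here:

None of these answers produce evenly sized chunks; they all leave a runt chunk at the end, so they're not completely balanced. If you were using these functions to distribute work, you've built in the prospect of one worker finishing well before the others, so it would sit around doing nothing while the others kept working hard.

For example, the current top answer ends with:

    [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
    [70, 71, 72, 73, 74]]

I just hate that runt at the end!

Others, such as list(grouper(3, xrange(7))) and chunk(xrange(7), 3), both return [(0, 1, 2), (3, 4, 5), (6, None, None)]. The None values are just padding, and rather inelegant in my opinion. They are NOT evenly chunking the iterables.

Why can't we divide these better?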

My Solution(s)

Here's a balanced solution, adapted from a function I've used in production (in Python 3, replace xrange with range):

    def baskets_from(items, maxbaskets=25):
        baskets = [[] for _ in xrange(maxbaskets)]  # in Python 3 use range
        for i, item in enumerate(items):
            baskets[i % maxbaskets].append(item)
        # list(...) so that Python 3 also returns a list; empty baskets are dropped
        return list(filter(None, baskets))

And I created a generator that does the same if you put it into a list:

    def iter_baskets_from(items, maxbaskets=3):
        '''generates evenly balanced baskets from indexable iterable'''
        item_count = len(items)
        baskets = min(item_count, maxbaskets)
        for x_i in xrange(baskets):
            yield [items[y_i] for y_i in xrange(x_i, item_count, baskets)]

And finally, since I see that all of the functions above return elements in contiguous order (as they were given):

    def iter_baskets_contiguous(items, maxbaskets=3, item_count=None):
        '''
        generates balanced baskets from iterable, contiguous contents
        provide item_count if providing a iterator that doesn't support len()
        '''
        item_count = item_count or len(items)
        baskets = min(item_count, maxbaskets)
        items = iter(items)
        floor = item_count // baskets
        ceiling = floor + 1
        stepdown = item_count % baskets
        for x_i in xrange(baskets):
            length = ceiling if x_i < stepdown else floor
            # next(items) works in both Python 2.6+ and Python 3
            yield [next(items) for _ in xrange(length)]

Output

To test them out:

    print(baskets_from(xrange(6), 8))
    print(list(iter_baskets_from(xrange(6), 8)))
    print(list(iter_baskets_contiguous(xrange(6), 8)))
    print(baskets_from(xrange(22), 8))
    print(list(iter_baskets_from(xrange(22), 8)))
    print(list(iter_baskets_contiguous(xrange(22), 8)))
    print(baskets_from('ABCDEFG', 3))
    print(list(iter_baskets_from('ABCDEFG', 3)))
    print(list(iter_baskets_contiguous('ABCDEFG', 3)))
    print(baskets_from(xrange(26), 5))
    print(list(iter_baskets_from(xrange(26), 5)))
    print(list(iter_baskets_contiguous(xrange(26), 5)))

Which prints out:

    [[0], [1], [2], [3], [4], [5]]
    [[0], [1], [2], [3], [4], [5]]
    [[0], [1], [2], [3], [4], [5]]
    [[0, 8, 16], [1, 9, 17], [2, 10, 18], [3, 11, 19], [4, 12, 20], [5, 13, 21], [6, 14], [7, 15]]
    [[0, 8, 16], [1, 9, 17], [2, 10, 18], [3, 11, 19], [4, 12, 20], [5, 13, 21], [6, 14], [7, 15]]
    [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11], [12, 13, 14], [15, 16, 17], [18, 19], [20, 21]]
    [['A', 'D', 'G'], ['B', 'E'], ['C', 'F']]
    [['A', 'D', 'G'], ['B', 'E'], ['C', 'F']]
    [['A', 'B', 'C'], ['D', 'E'], ['F', 'G']]
    [[0, 5, 10, 15, 20, 25], [1, 6, 11, 16, 21], [2, 7, 12, 17, 22], [3, 8, 13, 18, 23], [4, 9, 14, 19, 24]]
    [[0, 5, 10, 15, 20, 25], [1, 6, 11, 16, 21], [2, 7, 12, 17, 22], [3, 8, 13, 18, 23], [4, 9, 14, 19, 24]]
    [[0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20], [21, 22, 23, 24, 25]]

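Notice that the contiguous generator provides chunks in the same length pattern as the other two, but the items are all in order, and they are as evenly divided as a list of discrete elements can be.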
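The balancing arithmetic behind that length pattern can be isolated with divmod. The following is a sketch for Python 3 under my own naming (`balanced_sizes` and `balanced_chunks` are hypothetical helpers, not part of the answer above), and it assumes the input is a sliceable sequence.

```python
def balanced_sizes(item_count, maxbaskets):
    """Return chunk lengths that differ by at most one (hypothetical helper)."""
    baskets = min(item_count, maxbaskets)
    if baskets <= 0:
        return []
    floor, extra = divmod(item_count, baskets)
    # The first `extra` baskets each take one additional item.
    return [floor + 1 if i < extra else floor for i in range(baskets)]


def balanced_chunks(items, maxbaskets):
    """Slice a sequence into contiguous chunks of balanced length."""
    chunks, start = [], 0
    for size in balanced_sizes(len(items), maxbaskets):
        chunks.append(items[start:start + size])
        start += size
    return chunks


print(balanced_sizes(22, 8))
print(balanced_chunks(list("ABCDEFG"), 3))
```

For 22 items and 8 baskets this gives six chunks of 3 followed by two of 2, the same pattern as the printed output above.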

more-itertools has a chunked iterator.

It also has a lot more things, including all the recipes in the itertools documentation.

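If you had a chunk size of 3, for example, you could do:

    zip(*[iterable[i::3] for i in range(3)])

Source: http://code.activestate.com/recipes/303060-group-a-list-into-sequential-n-tuples/

I would use this when my chunk size is a fixed number I could type in, e.g. '3', and would never change.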
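One caveat with the strided-slice recipe: zip stops at the shortest slice, so trailing items that do not fill a whole chunk are dropped, and the input has to support slicing. A quick sketch with illustrative names:

```python
iterable = list(range(8))  # 8 items: two full chunks of 3, plus 2 left over

# Three strided slices ([0, 3, 6], [1, 4, 7], [2, 5]) zipped column-wise.
triples = list(zip(*[iterable[i::3] for i in range(3)]))

print(triples)  # the leftover items 6 and 7 do not appear
```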

If you know the list size:

    def SplitList(lst, chunk_size):
        return [lst[offs:offs+chunk_size] for offs in range(0, len(lst), chunk_size)]

If you don't (an iterator):

    def IterChunks(sequence, chunk_size):
        res = []
        for item in sequence:
            res.append(item)
            if len(res) >= chunk_size:
                yield res
                res = []
        if res:
            yield res  # yield the last, incomplete, portion

In the latter case, it can be rewritten in a more beautiful way if you can be sure that the sequence always contains a whole number of chunks of the given size (i.e. there is no incomplete last chunk).
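The answer does not show the rewritten form. One possible reading, offered as an assumption rather than the author's code, is the shared-iterator zip idiom; `iter_whole_chunks` is a hypothetical name.

```python
def iter_whole_chunks(sequence, chunk_size):
    """Yield lists of exactly chunk_size items.

    Assumes the input length is a multiple of chunk_size; any leftover
    items would be silently dropped by zip.
    """
    it = iter(sequence)
    # chunk_size references to one iterator: zip pulls consecutive items.
    for group in zip(*[it] * chunk_size):
        yield list(group)


print(list(iter_whole_chunks(iter(range(6)), 3)))
```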

A generator expression:

    def chunks(seq, n):
        return (seq[i:i+n] for i in xrange(0, len(seq), n))

For example:

    print list(chunks(range(1, 1000), 10))

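I like the Python doc's version proposed by tzot and J.F. Sebastian a lot, but it has two shortcomings: it is not very explicit, and I usually don't want a fill value in the last chunk.

I'm using this one a lot in my code:

    from itertools import islice

    def chunks(n, iterable):
        iterable = iter(iterable)
        while True:
            chunk = tuple(islice(iterable, n))
            if not chunk:
                # explicit return: a StopIteration leaking out of a generator
                # is a RuntimeError on Python 3.7+
                return
            yield chunk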

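UPDATE: A lazy chunks version:

    from itertools import chain, islice

    def chunks(n, iterable):
        iterable = iter(iterable)
        while True:
            try:
                first = next(iterable)
            except StopIteration:
                return  # input exhausted; end the generator cleanly
            yield chain([first], islice(iterable, n - 1))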
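A lazy version like this has a catch that is worth seeing in action: every chunk draws from the same underlying iterator, so each chunk has to be consumed before the next one is requested. The sketch below uses its own self-contained variant (`lazy_chunks` is an illustrative name) to show both cases.

```python
from itertools import chain, islice

def lazy_chunks(n, iterable):
    """Yield each chunk as an iterator instead of a tuple."""
    iterator = iter(iterable)
    for first in iterator:
        # `first` was already taken from the shared iterator; put it back in front.
        yield chain([first], islice(iterator, n - 1))

# Consuming each chunk before asking for the next gives the expected grouping.
consumed_in_order = [list(c) for c in lazy_chunks(3, range(8))]

# Collecting the chunk iterators first breaks the grouping: every chunk has
# received only its first item, and nothing is left for the lazy remainders.
all_at_once = [list(c) for c in list(lazy_chunks(3, range(8)))]

print(consumed_in_order)
print(all_at_once)
```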

The toolz library has the partition function for this:

    from toolz import partition

    list(partition(2, [1, 2, 3, 4]))
    [(1, 2), (3, 4)]

At this point, I think we need a recursive generator, just in case...

In Python 2:

    def chunks(li, n):
        if li == []:
            return
        yield li[:n]
        for e in chunks(li[n:], n):
            yield e

In Python 3:

    def chunks(li, n):
        if li == []:
            return
        yield li[:n]
        yield from chunks(li[n:], n)

Also, in case of a massive alien invasion, a decorated recursive generator might come in handy:

    def dec(gen):
        def new_gen(li, n):
            for e in gen(li, n):
                if e == []:
                    return
                yield e
        return new_gen

    @dec
    def chunks(li, n):
        yield li[:n]
        for e in chunks(li[n:], n):
            yield e

    def split_seq(seq, num_pieces):
        start = 0
        for i in xrange(num_pieces):  # Python 2; use range() in Python 3
            stop = start + len(seq[i::num_pieces])
            yield seq[start:stop]
            start = stop

Usage:

    seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    for piece in split_seq(seq, 3):
        print piece

    [AA[i:i+SS] for i in range(len(AA))[::SS]]

Where AA is the array and SS is the chunk size. For example:

    >>> AA = range(10, 21); SS = 3
    >>> [AA[i:i+SS] for i in range(len(AA))[::SS]]
    [[10, 11, 12], [13, 14, 15], [16, 17, 18], [19, 20]]
    # or [range(10, 13), range(13, 16), range(16, 19), range(19, 21)] in py3

You may also use the get_chunks function of the utilspie library:

    >>> from utilspie import iterutils
    >>> a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> list(iterutils.get_chunks(a, 5))
    [[1, 2, 3, 4, 5], [6, 7, 8, 9]]

You can install utilspie via pip:

    sudo pip install utilspie

Disclaimer: I am the creator of the utilspie library.

Heh, a one-line version:

    In [48]: chunk = lambda ulist, step: map(lambda i: ulist[i:i+step], xrange(0, len(ulist), step))

    In [49]: chunk(range(1,100), 10)
    Out[49]:
    [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
     [21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
     [31, 32, 33, 34, 35, 36, 37, 38, 39, 40],
     [41, 42, 43, 44, 45, 46, 47, 48, 49, 50],
     [51, 52, 53, 54, 55, 56, 57, 58, 59, 60],
     [61, 62, 63, 64, 65, 66, 67, 68, 69, 70],
     [71, 72, 73, 74, 75, 76, 77, 78, 79, 80],
     [81, 82, 83, 84, 85, 86, 87, 88, 89, 90],
     [91, 92, 93, 94, 95, 96, 97, 98, 99]]

Consider using matplotlib.cbook's pieces function.

For example:

    import numpy as np
    import matplotlib.cbook as cbook

    segments = cbook.pieces(np.arange(20), 3)
    for s in segments:
        print s  # Python 2 print statement

Another, more explicit version:

    def chunkList(initialList, chunkSize):
        """
        This function chunks a list into sub lists
        that have a length equals to chunkSize.

        Example:
        lst = [3, 4, 9, 7, 1, 1, 2, 3]
        print(chunkList(lst, 3))
        returns
        [[3, 4, 9], [7, 1, 1], [2, 3]]
        """
        finalList = []
        for i in range(0, len(initialList), chunkSize):
            finalList.append(initialList[i:i+chunkSize])
        return finalList

Code:

    def split_list(the_list, chunk_size):
        result_list = []
        while the_list:
            result_list.append(the_list[:chunk_size])
            the_list = the_list[chunk_size:]
        return result_list

    a_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

    print split_list(a_list, 3)

Result:

    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

I realise this question is old (I stumbled over it on Google), but surely something like the following is far simpler and clearer than any of the complex suggestions, and it only uses slicing:

    def chunker(iterable, chunksize):
        for i, c in enumerate(iterable[::chunksize]):
            yield iterable[i*chunksize:(i+1)*chunksize]

    >>> for chunk in chunker(range(0,100), 10):
    ...     print list(chunk)
    ...
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
    [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
    ... etc ...

See this reference:

    >>> orange = range(1, 1001)
    >>> otuples = list(zip(*[iter(orange)]*10))
    >>> print(otuples)
    [(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), ... (991, 992, 993, 994, 995, 996, 997, 998, 999, 1000)]
    >>> olist = [list(i) for i in otuples]
    >>> print(olist)
    [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], ..., [991, 992, 993, 994, 995, 996, 997, 998, 999, 1000]]
    >>>

Python 3:

    a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    CHUNK = 4
    # range() and floor division: xrange() and int-returning "/" are Python 2 only
    [a[i*CHUNK:(i+1)*CHUNK] for i in range((len(a) + CHUNK - 1) // CHUNK)]

Without calling len(), which is good for large lists:

    def splitter(l, n):
        i = 0
        chunk = l[:n]
        while chunk:
            yield chunk
            i += n
            chunk = l[i:i+n]

And this is for iterables:

    from itertools import islice

    def isplitter(l, n):
        l = iter(l)
        chunk = list(islice(l, n))
        while chunk:
            yield chunk
            chunk = list(islice(l, n))

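The functional flavour of the above:

    from itertools import islice, repeat, takewhile

    def isplitter2(l, n):
        return takewhile(bool,
                         (tuple(islice(start, n)) for start in repeat(iter(l))))

Or:

    from itertools import imap, islice, repeat  # imap exists in Python 2 only

    def chunks_gen_sentinel(n, seq):
        continuous_slices = imap(islice, repeat(iter(seq)), repeat(0), repeat(n))
        return iter(imap(tuple, continuous_slices).next, ())

Or:

    from itertools import imap, islice, repeat, takewhile  # imap exists in Python 2 only

    def chunks_gen_filter(n, seq):
        continuous_slices = imap(islice, repeat(iter(seq)), repeat(0), repeat(n))
        return takewhile(bool, imap(tuple, continuous_slices))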

    >>> f = lambda x, n, acc=[]: f(x[n:], n, acc+[(x[:n])]) if x else acc
    >>> f("Hallo Welt", 3)
    ['Hal', 'lo ', 'Wel', 't']
    >>>

If you are into brackets: I picked up a book on Erlang :)

    def chunks(iterable, n):
        """assumes n is an integer>0"""
        iterable = iter(iterable)
        while True:
            result = []
            for i in range(n):
                try:
                    a = next(iterable)
                except StopIteration:
                    break
                else:
                    result.append(a)
            if result:
                yield result
            else:
                break

    g1 = (i*i for i in range(10))
    g2 = chunks(g1, 3)
    print g2
    '<generator object chunks at 0x0337B9B8>'
    print list(g2)
    '[[0, 1, 4], [9, 16, 25], [36, 49, 64], [81]]'
