将列表分成大致相等长度的N个部分

将列表分成大致相等的部分最好的办法是什么? 例如,如果列表中有7个元素并将其分成2个部分,我们希望在一个部分中获得3个元素,另一个元素应该包含4个元素。

我正在寻找像even_split(L, n)这样的将L分成n部分的东西。

 def chunks(L, n): """ Yield successive n-sized chunks from L. """ for i in xrange(0, len(L), n): yield L[i:i+n] 

上面的代码给出了3个块,而不是3个块。 我可以简单地转置(遍历这个,并采取每一列的第一个元素,调用第一部分,然后采取第二,把它放在第二部分等),但是破坏项目的顺序。

这是一个可以工作的:

 def chunkIt(seq, num): avg = len(seq) / float(num) out = [] last = 0.0 while last < len(seq): out.append(seq[int(last):int(last + avg)]) last += avg return out 

testing:

 >>> chunkIt(range(10), 3) [[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]] >>> chunkIt(range(11), 3) [[0, 1, 2], [3, 4, 5, 6], [7, 8, 9, 10]] >>> chunkIt(range(12), 3) [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]] 

只要你不想像连续的块一样愚蠢的东西:

 >>> def chunkify(lst,n): ... return [lst[i::n] for i in xrange(n)] ... >>> chunkify(range(13), 3) [[0, 3, 6, 9, 12], [1, 4, 7, 10], [2, 5, 8, 11]] 

你可以简单地把它写成列表生成器:

 def split(a, n): k, m = divmod(len(a), n) return (a[i * k + min(i, m):(i + 1) * k + min(i + 1, m)] for i in xrange(n)) 

例:

 >>> list(split(range(11), 3)) [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10]] 

这是numpy.array_split *的存在理由

 >>> L [0, 1, 2, 3, 4, 5, 6, 7] >>> print(*np.array_split(L, 3)) [0 1 2] [3 4 5] [6 7] >>> print(*np.array_split(range(10), 4)) [0 1 2] [3 4 5] [6 7] [8 9] 

*在6号房间抵达零比雷埃夫斯

更改代码以生成n块而不是n块:

 def chunks(l, n): """ Yield n successive chunks from l. """ newn = int(len(l) / n) for i in xrange(0, n-1): yield l[i*newn:i*newn+newn] yield l[n*newn-newn:] l = range(56) three_chunks = chunks (l, 3) print three_chunks.next() print three_chunks.next() print three_chunks.next() 

这使:

 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17] [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35] [36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55] 

这将分配额外的元素到最后的组,这是不完美的,但在你的“大致N个相等的部分”的规范:-)因此,我的意思是56元素会更好(19,19,18),而这给(18,18,20)。

您可以使用以下代码获得更平衡的输出:

 #!/usr/bin/python def chunks(l, n): """ Yield n successive chunks from l. """ newn = int(1.0 * len(l) / n + 0.5) for i in xrange(0, n-1): yield l[i*newn:i*newn+newn] yield l[n*newn-newn:] l = range(56) three_chunks = chunks (l, 3) print three_chunks.next() print three_chunks.next() print three_chunks.next() 

其输出:

 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18] [19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37] [38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55] 

这是一个增加None以使列表长度相同的列表

 >>> from itertools import izip_longest >>> def chunks(l, n): """ Yield n successive chunks from l. Pads extra spaces with None """ return list(zip(*izip_longest(*[iter(l)]*n))) >>> l=range(54) >>> chunks(l,3) [(0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51), (1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52), (2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35, 38, 41, 44, 47, 50, 53)] >>> chunks(l,4) [(0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52), (1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53), (2, 6, 10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50, None), (3, 7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47, 51, None)] >>> chunks(l,5) [(0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50), (1, 6, 11, 16, 21, 26, 31, 36, 41, 46, 51), (2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52), (3, 8, 13, 18, 23, 28, 33, 38, 43, 48, 53), (4, 9, 14, 19, 24, 29, 34, 39, 44, 49, None)] 

看看numpy.split :

 >>> a = numpy.array([1,2,3,4]) >>> numpy.split(a, 2) [array([1, 2]), array([3, 4])] 

如果将n元素大致分为k块,则可以使n % k块比其他块大,以分配额外的元素。

下面的代码会给你块的长度:

 [(n // k) + (1 if i < (n % k) else 0) for i in range(k)] 

示例: n=11, k=3导致[4, 4, 3]

然后,您可以轻松计算块的开始指标:

 [i * (n // k) + min(i, n % k) for i in range(k)] 

例如: n=11, k=3导致[0, 4, 8]

使用第i+1块作为边界,我们得到列表l第n块是len

 l[i * (n // k) + min(i, n % k):(i+1) * (n // k) + min(i+1, n % k)] 

作为最后一步,使用列表理解从所有的块中创build一个列表:

 [l[i * (n // k) + min(i, n % k):(i+1) * (n // k) + min(i+1, n % k)] for i in range(k)] 

例如: n=11, k=3, l=range(n)导致[range(0, 4), range(4, 8), range(8, 11)]

使用numpy.linspace方法实现。

只需指定要将数组分成几部分即可。分区大小几乎相等。

例如:

 import numpy as np a=np.arange(10) print "Input array:",a parts=3 i=np.linspace(np.min(a),np.max(a)+1,parts+1) i=np.array(i,dtype='uint16') # Indices should be floats split_arr=[] for ind in range(i.size-1): split_arr.append(a[i[ind]:i[ind+1]] print "Array split in to %d parts : "%(parts),split_arr 

给:

 Input array: [0 1 2 3 4 5 6 7 8 9] Array split in to 3 parts : [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8, 9])] 

这是我的解决scheme:

 def chunks(l, amount): if amount < 1: raise ValueError('amount must be positive integer') chunk_len = len(l) // amount leap_parts = len(l) % amount remainder = amount // 2 # make it symmetrical i = 0 while i < len(l): remainder += leap_parts end_index = i + chunk_len if remainder >= amount: remainder -= amount end_index += 1 yield l[i:end_index] i = end_index 

产生

  >>> list(chunks([1, 2, 3, 4, 5, 6, 7], 3)) [[1, 2], [3, 4, 5], [6, 7]] 

这是一个可以处理任何正数(整数)块的生成器。 如果块的数量大于input列表长度,则某些块将是空的。 该algorithm在短块和长块之间交替而不是分离它们。

我还包括一些testingragged_chunks函数的代码。

 ''' Split a list into "ragged" chunks The size of each chunk is either the floor or ceiling of len(seq) / chunks chunks can be > len(seq), in which case there will be empty chunks Written by PM 2Ring 2017.03.30 ''' def ragged_chunks(seq, chunks): size = len(seq) start = 0 for i in range(1, chunks + 1): stop = i * size // chunks yield seq[start:stop] start = stop # test def test_ragged_chunks(maxsize): for size in range(0, maxsize): seq = list(range(size)) for chunks in range(1, size + 1): minwidth = size // chunks #ceiling division maxwidth = -(-size // chunks) a = list(ragged_chunks(seq, chunks)) sizes = [len(u) for u in a] deltas = all(minwidth <= u <= maxwidth for u in sizes) assert all((sum(a, []) == seq, sum(sizes) == size, deltas)) return True if test_ragged_chunks(100): print('ok') 

我们可以通过将乘法输出到range调用中来使其稍微高效一些,但是我认为以前的版本更具可读性(和DRYer)。

 def ragged_chunks(seq, chunks): size = len(seq) start = 0 for i in range(size, size * chunks + 1, size): stop = i // chunks yield seq[start:stop] start = stop 

这是另一个变体,将“剩余”元素均匀地分散在所有的块中,一次一个,直到没有剩下的元素。 在这个实现中,更大的块在进程开始时发生。

 def chunks(l, k): """ Yield k successive chunks from l.""" if k < 1: yield [] raise StopIteration n = len(l) avg = n/k remainders = n % k start, end = 0, avg while start < n: if remainders > 0: end = end + 1 remainders = remainders - 1 yield l[start:end] start, end = end, end+avg 

例如,从14个元素列表中生成4个块:

 >>> list(chunks(range(14), 4)) [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10], [11, 12, 13]] >>> map(len, list(chunks(range(14), 4))) [4, 4, 3, 3] 

与工作答案一样,但是考虑到尺寸小于块的数量的列表。

 def chunkify(lst,n): [ lst[i::n] for i in xrange(n if n < len(lst) else len(lst)) ] 

如果n(块的数量)是7并且lst(要划分的列表)是[1,2,3],则块是[[0],[1],[2]]而不是[[0],[1 ],[2],[],[],[],[]]

你也可以使用:

 split=lambda x,n: x if not x else [x[:n]]+[split([] if not -(len(x)-n) else x[-(len(x)-n):],n)][0] split([1,2,3,4,5,6,7,8,9],2) [[1, 2], [3, 4], [5, 6], [7, 8], [9]] 

另一种方式是这样的,这里的想法是使用石斑鱼,但摆脱None 。 在这种情况下,我们将在列表的第一部分包含所有由“small_parts”组成的元素,在列表的后面部分包含“large_parts”。 “较大的部分”的长度是len(small_parts)+1。我们需要把x看作两个不同的子部分。

 from itertools import izip_longest import numpy as np def grouper(n, iterable, fillvalue=None): # This is grouper from itertools "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx" args = [iter(iterable)] * n return izip_longest(fillvalue=fillvalue, *args) def another_chunk(x,num): extra_ele = len(x)%num #gives number of parts that will have an extra element small_part = int(np.floor(len(x)/num)) #gives number of elements in a small part new_x = list(grouper(small_part,x[:small_part*(num-extra_ele)])) new_x.extend(list(grouper(small_part+1,x[small_part*(num-extra_ele):]))) return new_x 

我设置它的方式返回元组列表:

 >>> x = range(14) >>> another_chunk(x,3) [(0, 1, 2, 3), (4, 5, 6, 7, 8), (9, 10, 11, 12, 13)] >>> another_chunk(x,4) [(0, 1, 2), (3, 4, 5), (6, 7, 8, 9), (10, 11, 12, 13)] >>> another_chunk(x,5) [(0, 1), (2, 3, 4), (5, 6, 7), (8, 9, 10), (11, 12, 13)] >>> 

使用列表理解:

 def divide_list_to_chunks(list_, n): return [list_[start::n] for start in range(n)] 

四舍五入的空间,并使用它作为索引比amit12690build议更容易的解决scheme。

 function chunks=chunkit(array,num) index = round(linspace(0,size(array,2),num+1)); chunks = cell(1,num); for x = 1:num chunks{x} = array(:,index(x)+1:index(x+1)); end end