Identify groups of consecutive numbers in a list

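I would like to identify groups of consecutive numbers in a list, so that:

    myfunc([2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20])

returns:

    [(2,5), (12,17), 20]

What is the best way to do this (particularly if there is something built into Python)?

Edit: Note that I originally forgot to mention that individual numbers should be returned as individual numbers, not ranges.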

EDIT 2: To answer the OP's new requirement

    # Python 2 syntax (tuple-unpacking lambda, xrange)
    ranges = []
    for key, group in groupby(enumerate(data), lambda (index, item): index - item):
        group = map(itemgetter(1), group)
        if len(group) > 1:
            ranges.append(xrange(group[0], group[-1]))
        else:
            ranges.append(group[0])

Output:

 [xrange(2, 5), xrange(12, 17), 20] 

You can replace xrange with range or any other custom class.
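As a minimal sketch of that substitution, assuming Python 3 (where xrange no longer exists and tuple-unpacking lambdas are not allowed), the loop could be written with range. The function name consecutive_ranges is hypothetical. The + 1 on the end bound is an assumption made here so that each range includes its last number, unlike the xrange call above, which stops one short.

```python
from itertools import groupby
from operator import itemgetter


def consecutive_ranges(data):
    """Group a sorted list of ints into ranges; lone numbers stay as ints."""
    ranges = []
    # index - value is constant within a run of consecutive numbers
    for _, group in groupby(enumerate(data), lambda pair: pair[0] - pair[1]):
        group = list(map(itemgetter(1), group))
        if len(group) > 1:
            # + 1 (assumption) so the range includes the last number of the run
            ranges.append(range(group[0], group[-1] + 1))
        else:
            ranges.append(group[0])
    return ranges


print(consecutive_ranges([2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]))
# [range(2, 6), range(12, 18), 20]
```

Swapping range(...) for a tuple such as (group[0], group[-1]) would give the exact shape asked for in the question.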


The Python docs have a very neat recipe for this:

    # Python 2 syntax (tuple-unpacking lambda, print statement)
    from operator import itemgetter
    from itertools import groupby

    data = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17]
    for k, g in groupby(enumerate(data), lambda (i,x):i-x):
        print map(itemgetter(1), g)

Output:

    [2, 3, 4, 5]
    [12, 13, 14, 15, 16, 17]

If you want to get the exact same output, you can do this:

    ranges = []
    for k, g in groupby(enumerate(data), lambda (i,x):i-x):
        group = map(itemgetter(1), g)
        ranges.append((group[0], group[-1]))

Output:

 [(2, 5), (12, 17)] 

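EDIT: The example is already explained in the documentation, but maybe I should explain it more:

The key to the solution is taking the difference with a range (the element indices), so that consecutive numbers all end up in the same group.

If the data was [2, 3, 4, 5, 12, 13, 14, 15, 16, 17], then groupby(enumerate(data), lambda (i,x):i-x) is equivalent to the following:

    groupby(
        [(0, 2), (1, 3), (2, 4), (3, 5), (4, 12),
         (5, 13), (6, 14), (7, 15), (8, 16), (9, 17)],
        lambda (i,x):i-x
    )

The lambda function subtracts the element value from the element index. So when you apply the lambda to each item, you get the following keys for groupby: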

 [-2, -2, -2, -2, -8, -8, -8, -8, -8, -8] 

groupby groups elements that share the same key value, so the first 4 elements are grouped together, and so on.

I hope this makes it more readable.
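To make the key trick concrete, here is a small sketch, assuming Python 3, that prints the keys listed above and then the groups they produce. The variable names keys and groups are only illustrative.

```python
from itertools import groupby

data = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17]

# One key per element: index minus value
keys = [i - x for i, x in enumerate(data)]
print(keys)
# [-2, -2, -2, -2, -8, -8, -8, -8, -8, -8]

# groupby starts a new group each time the key changes
groups = [
    [x for _, x in pairs]
    for _, pairs in groupby(enumerate(data), key=lambda pair: pair[0] - pair[1])
]
print(groups)
# [[2, 3, 4, 5], [12, 13, 14, 15, 16, 17]]
```

Note that groupby only merges adjacent items with equal keys, which is why the input has to be sorted for this to work.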

Here is the "naive" solution, which I find at least somewhat readable.

    x = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 22, 25, 26, 28, 51, 52, 57]

    def group(L):
        first = last = L[0]
        for n in L[1:]:
            if n - 1 == last:  # Part of the group, bump the end
                last = n
            else:  # Not part of the group, yield current group and start a new one
                yield first, last
                first = last = n
        yield first, last  # Yield the last group

    >>> print(list(group(x)))
    [(2, 5), (12, 17), (22, 22), (25, 26), (28, 28), (51, 52), (57, 57)]

Assuming your list is sorted:

    >>> from itertools import groupby
    >>> def ranges(lst):
    ...     pos = (j - i for i, j in enumerate(lst))
    ...     t = 0
    ...     for i, els in groupby(pos):
    ...         l = len(list(els))
    ...         el = lst[t]
    ...         t += l
    ...         yield range(el, el+l)
    ...
    >>> lst = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17]
    >>> list(ranges(lst))
    [range(2, 6), range(12, 18)]

Here is something that should work, without any imports needed:

    def myfunc(lst):
        ret = []
        a = b = lst[0]                            # a and b are range's bounds
        for el in lst[1:]:
            if el == b+1:
                b = el                            # range grows
            else:                                 # range ended
                ret.append(a if a==b else (a,b))  # is a single or a range?
                a = b = el                        # let's start again with a single
        ret.append(a if a==b else (a,b))          # corner case for last single/range
        return ret

Please note that the code using groupby does not work as given in Python 3, so use this instead.

    for k, g in groupby(enumerate(data), lambda x:x[0]-x[1]):
        group = list(map(itemgetter(1), g))
        ranges.append((group[0], group[-1]))

This doesn't use a standard function, it just iterates over the input, but it should work:

    def myfunc(l):
        r = []
        p = q = None
        for x in l + [-1]:
            if x - 1 == q:
                q += 1
            else:
                if p:
                    if q > p:
                        r.append('%s-%s' % (p, q))
                    else:
                        r.append(str(p))
                p = q = x
        return '(%s)' % ', '.join(r)

Note that it requires the input to contain only positive numbers in ascending order. You should validate the input, but that code is omitted for clarity.
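For what it's worth, the omitted validation might look something like the sketch below. The name validate_input is hypothetical, and requiring strictly ascending integers (no duplicates) is an assumption on my part; the answer itself only says "positive numbers in ascending order".

```python
def validate_input(numbers):
    """Raise ValueError unless numbers are positive ints in strictly ascending order."""
    previous = 0  # forces the first number to be greater than zero
    for n in numbers:
        # bool is a subclass of int, so reject it explicitly
        if not isinstance(n, int) or isinstance(n, bool):
            raise ValueError("expected an integer, got %r" % (n,))
        if n <= previous:
            raise ValueError(
                "numbers must be positive and strictly ascending: %r" % (n,)
            )
        previous = n
    return numbers
```

It returns the list unchanged when it is valid, so it could be called at the top of myfunc before the loop starts.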

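Here is the answer I came up with. I was writing the code for other people to understand, so I am fairly verbose with variable names and comments.

First, a quick helper function: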

    def getpreviousitem(mylist,myitem):
        '''Given a list and an item, return previous item in list'''
        for position, item in enumerate(mylist):
            if item == myitem:
                # First item has no previous item
                if position == 0:
                    return None
                # Return previous item
                return mylist[position-1]

And then the actual code:

    def getranges(cpulist):
        '''Given a sorted list of numbers, return a list of ranges'''
        rangelist = []
        inrange = False
        for item in cpulist:
            previousitem = getpreviousitem(cpulist,item)
            if previousitem == item - 1:
                # We're in a range
                if inrange == True:
                    # It's an existing range - change the end to the current item
                    newrange[1] = item
                else:
                    # We've found a new range.
                    newrange = [item-1,item]
                # Update to show we are now in a range
                inrange = True
            else:
                # We were in a range but now it just ended
                if inrange == True:
                    # Save the old range
                    rangelist.append(newrange)
                # Update to show we're no longer in a range
                inrange = False
        # Add the final range found to our list
        if inrange == True:
            rangelist.append(newrange)
        return rangelist

Example run:

 getranges([2, 3, 4, 5, 12, 13, 14, 15, 16, 17]) 

Returns:

    [[2, 5], [12, 17]]

Another option, using NumPy:

    import numpy as np

    myarray = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
    sequences = np.split(myarray, np.array(np.where(np.diff(myarray) > 1)[0]) + 1)
    l = []
    for s in sequences:
        if len(s) > 1:
            l.append((np.min(s), np.max(s)))
        else:
            l.append(s[0])
    print(l)

Output:

 [(2, 5), (12, 17), 20]