How to look ahead one element (peek) in a Python generator?

I can't figure out how to look ahead one element in a Python generator. As soon as I look, it's gone.

Here is what I mean:

    gen = iter([1,2,3])
    next_value = gen.next()  # okay, I looked forward and see that next_value = 1
    # but now:
    list(gen)  # is [2, 3] -- the first value is gone!

Here is a more realistic example:

    gen = element_generator()
    if gen.next_value() == 'STOP':
        quit_application()
    else:
        process(gen.next())

Can anyone help me write a generator that lets you look one element ahead?

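The Python generator API is one-way: you can't push back elements you have already read. But you can create a new iterator with the itertools module and prepend the element:

    import itertools

    gen = iter([1,2,3])
    peek = gen.next()
    print list(itertools.chain([peek], gen))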
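For readers on Python 3, here is a minimal sketch of the same prepend trick applied to the "STOP" scenario from the question. `element_generator` is a hypothetical stand-in for the asker's generator, and `next(gen)` replaces the Python 2 `gen.next()`:

```python
import itertools


def element_generator():
    # Hypothetical stand-in for the question's generator.
    yield "START"
    yield "data"
    yield "STOP"


gen = element_generator()
first = next(gen)  # Python 3 spelling of gen.next()

# Rebuild an iterator that yields the peeked element first, then the rest.
gen = itertools.chain([first], gen)

if first == "STOP":
    remaining = []
else:
    remaining = list(gen)  # includes the element we peeked at
```

Note that `gen` is rebound to a new iterator here; any other reference to the old generator would still see it with the first element consumed.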

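For the sake of completeness, the more-itertools package (which should probably be part of any Python programmer's toolbox) includes a peekable wrapper that implements this behavior. As the code example in the documentation shows:

    >>> p = peekable(xrange(2))
    >>> p.peek()
    0
    >>> p.next()
    0
    >>> p.peek()
    1
    >>> p.next()
    1

The package is compatible with both Python 2 and 3, even though the documentation shows Python 2 syntax.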

OK, two years too late, but I came across this question and did not find any of the answers to my satisfaction. I came up with this meta generator:

    class Peekorator(object):

        def __init__(self, generator):
            self.empty = False
            self.peek = None
            self.generator = generator
            try:
                self.peek = self.generator.next()
            except StopIteration:
                self.empty = True

        def __iter__(self):
            return self

        def next(self):
            """
            Return the self.peek element, or raise StopIteration
            if empty
            """
            if self.empty:
                raise StopIteration()
            to_return = self.peek
            try:
                self.peek = self.generator.next()
            except StopIteration:
                self.peek = None
                self.empty = True
            return to_return

    def simple_iterator():
        for x in range(10):
            yield x*3

    pkr = Peekorator(simple_iterator())
    for i in pkr:
        print i, pkr.peek, pkr.empty

This results in:

    0 3 False
    3 6 False
    6 9 False
    9 12 False
    ...
    24 27 False
    27 None True

That is, at any moment during the iteration you have access to the next item. Once the last item has been returned, peek is None and empty is True.
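A Python 3 port might look like the sketch below. It assumes the same design (prefetch one item in the constructor) and only renames `next` to `__next__`; the `_advance` helper is my own addition to avoid repeating the try/except:

```python
class Peekorator:
    """Python 3 sketch of the wrapper above: always holds one prefetched item."""

    def __init__(self, iterable):
        self.generator = iter(iterable)
        self.peek = None
        self.empty = False
        self._advance()

    def _advance(self):
        # Load the next item into self.peek, or mark the wrapper as empty.
        try:
            self.peek = next(self.generator)
        except StopIteration:
            self.peek = None
            self.empty = True

    def __iter__(self):
        return self

    def __next__(self):
        if self.empty:
            raise StopIteration
        to_return = self.peek
        self._advance()
        return to_return


pkr = Peekorator(x * 3 for x in range(4))
rows = [(item, pkr.peek, pkr.empty) for item in pkr]
# rows == [(0, 3, False), (3, 6, False), (6, 9, False), (9, None, True)]
```

One caveat of this design: a source that legitimately yields None cannot be told apart from exhaustion by looking at `peek` alone, so check `empty` as well.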

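You can use itertools.tee to produce a lightweight copy of the generator. Peeking ahead at one copy will then not affect the second copy:

    import itertools

    copy1, copy2 = itertools.tee(original_generator)
    if copy1.next() == "STOP":
        stop_application()
    process_items(copy2)

The "copy2" generator is unaffected by whatever you do to "copy1". Note that you should not use "original_generator" after calling "tee" on it; that will break things.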
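The same idea in Python 3 syntax, as a small self-contained sketch. `original_generator` here is a hypothetical data source, and the names `peeker` and `items` are illustrative:

```python
import itertools


def original_generator():
    # Hypothetical data source for illustration.
    yield from ["a", "b", "STOP"]


peeker, items = itertools.tee(original_generator())

first = next(peeker)     # advances only the peeker copy
processed = list(items)  # items still starts at the first element
```

Keep in mind that tee has to buffer every element one copy has consumed and the other has not, so peeking far ahead on one copy costs memory in proportion to the gap.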

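FWIW, this is the wrong way to solve this problem. Any algorithm that requires you to look one item ahead in a generator can instead be written to use the current generator item and the previous item. Then you don't have to mangle your use of generators, and your code will be much simpler. See my other answer to this question.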

Just for fun, I created an implementation of a lookahead class based on Aaron's suggestion:

    import itertools

    class lookahead_chain(object):
        def __init__(self, it):
            self._it = iter(it)

        def __iter__(self):
            return self

        def next(self):
            return next(self._it)

        def peek(self, default=None, _chain=itertools.chain):
            it = self._it
            try:
                v = self._it.next()
                # Put the peeked value back in front of the iterator.
                self._it = _chain((v,), it)
                return v
            except StopIteration:
                return default

    lookahead = lookahead_chain

With this, the following will work:

    >>> t = lookahead(xrange(8))
    >>> list(itertools.islice(t, 3))
    [0, 1, 2]
    >>> t.peek()
    3
    >>> list(itertools.islice(t, 3))
    [3, 4, 5]

With this implementation it is a bad idea to call peek many times in a row, because every call wraps the iterator in another chain object.

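While looking at the CPython source code I found a better way that is both shorter and more efficient:

    class lookahead_tee(object):
        def __init__(self, it):
            self._it, = itertools.tee(it, 1)

        def __iter__(self):
            return self._it

        def peek(self, default=None):
            try:
                # Advance a copy of the tee iterator, not the iterator itself.
                return self._it.__copy__().next()
            except StopIteration:
                return default

    lookahead = lookahead_tee

Usage is the same as above, but here you pay no penalty for calling peek many times in a row. With a few more lines you can also look ahead more than one item in the iterator (up to available RAM).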
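As a rough illustration of looking more than one item ahead, here is a Python 3 sketch. Note that it swaps in a different technique: instead of copying a tee iterator, it keeps peeked items in a `collections.deque`. The class name `Lookahead` and the `peek(n, default)` signature are my own assumptions, not part of the answer above:

```python
import collections
import itertools


class Lookahead:
    """Iterator wrapper with peek(n); peeked items wait in a deque buffer."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = collections.deque()

    def __iter__(self):
        return self

    def __next__(self):
        # Serve buffered (already peeked) items before touching the source.
        if self._buffer:
            return self._buffer.popleft()
        return next(self._it)

    def peek(self, n=0, default=None):
        # Pull items from the source until the buffer holds index n.
        while len(self._buffer) <= n:
            try:
                self._buffer.append(next(self._it))
            except StopIteration:
                return default
        return self._buffer[n]


t = Lookahead(range(8))
head = list(itertools.islice(t, 3))         # [0, 1, 2]
ahead = (t.peek(), t.peek(2), t.peek(10))   # (3, 5, None)
tail = list(t)                              # [3, 4, 5, 6, 7]
```

Repeated peeks are cheap here as well, since items are buffered once and memory use grows only with how far ahead you look.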

    >>> gen = iter(range(10))
    >>> peek = next(gen)
    >>> peek
    0
    >>> gen = (value for g in ([peek], gen) for value in g)
    >>> list(gen)
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

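Instead of using items (i, i+1), where "i" is the current item and i+1 is the "peek ahead" version, you should be using (i-1, i), where "i-1" is the previous item from the generator.

Tweaking your algorithm this way will produce something identical to what you have now, minus the needless extra complexity of trying to "peek ahead".

Peeking ahead is a mistake, and you should not be doing it.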

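This will work: it buffers one item and calls a function with each item and the next item in the sequence.

Your requirements are murky about what happens at the end of the sequence. What does "look ahead" mean when you are at the last item?

    def process_with_lookahead( iterable, aFunction ):
        iterable = iter(iterable)  # also accept lists and other non-iterators
        prev = iterable.next()
        for item in iterable:
            aFunction( prev, item )
            prev = item
        # prev is the last item; using it also covers one-element input,
        # where the loop body never runs and item would be undefined
        aFunction( prev, None )

    def someLookaheadFunction( item, next_item ):
        print item, next_item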

A simple solution is to use a function like this:

    import itertools

    def peek(it):
        first = next(it)
        return first, itertools.chain([first], it)

Then you can do:

    >>> it = iter(range(10))
    >>> x, it = peek(it)
    >>> x
    0
    >>> next(it)
    0
    >>> next(it)
    1

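Although itertools.chain() is the natural tool for the job here, beware of loops like this:

    for elem in gen:
        ...
        peek = next(gen)
        gen = itertools.chain([peek], gen)

...because this will consume a linearly growing amount of memory and eventually grind to a halt. (This code essentially seems to create a linked list, one node per chain() call.) I know this not because I inspected the library, but because it caused a major slowdown in my program; getting rid of the gen = itertools.chain([peek], gen) line sped it up again. (Python 3.3)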

A Python 3 snippet for @jonathan-hartley's answer:

    def peek(iterator, eoi=None):
        iterator = iter(iterator)

        try:
            prev = next(iterator)
        except StopIteration:
            return  # empty input: yield nothing

        for elm in iterator:
            yield prev, elm
            prev = elm

        yield prev, eoi

    for curr, nxt in peek(range(10)):
        print((curr, nxt))

    # (0, 1)
    # (1, 2)
    # (2, 3)
    # (3, 4)
    # (4, 5)
    # (5, 6)
    # (6, 7)
    # (7, 8)
    # (8, 9)
    # (9, None)

It would be straightforward to create a class that does this in __iter__, yields just the prev item, and stores elm in some attribute.
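A minimal sketch of such a class, under my own naming assumptions (`PeekIter` and the `nxt` attribute are hypothetical): `__iter__` is a generator method that yields only the current item and stores the following one on the instance.

```python
class PeekIter:
    """Yields each item while exposing the following one as self.nxt."""

    def __init__(self, iterable, eoi=None):
        self._iterable = iterable
        self.eoi = eoi   # value reported as "next" after the last item
        self.nxt = eoi

    def __iter__(self):
        iterator = iter(self._iterable)
        try:
            prev = next(iterator)
        except StopIteration:
            return  # empty input: nothing to yield
        for elm in iterator:
            self.nxt = elm   # the item that follows prev
            yield prev
            prev = elm
        self.nxt = self.eoi
        yield prev


it = PeekIter("abc")
seen = [(curr, it.nxt) for curr in it]
# seen == [('a', 'b'), ('b', 'c'), ('c', None)]
```

Because the lookahead lives on the instance, iterating the same `PeekIter` from two loops at once would make them overwrite each other's `nxt`; the tuple-yielding function avoids that.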

In case anybody is interested, and please correct me if I am wrong, I believe it is pretty easy to add push-back functionality to any iterator.

    class Back_pushable_iterator:
        """Class whose constructor takes an iterator as its only parameter, and
        returns an iterator that behaves in the same way, with added push back
        functionality.

        The idea is to be able to push back elements that need to be retrieved
        once more with the iterator semantics. This is particularly useful to
        implement LL(k) parsers that need k tokens of lookahead. Lookahead or
        push back is really a matter of perspective. The pushing back strategy
        allows a clean parser implementation based on recursive parser
        functions.

        The invoker of this class takes care of storing the elements that
        should be pushed back. A consequence of this is that any elements can
        be "pushed back", even elements that have never been retrieved from the
        iterator. The elements that are pushed back are then retrieved through
        the iterator interface in a LIFO-manner (as should logically be
        expected).

        This class works for any iterator but is especially meaningful for a
        generator iterator, which offers no obvious push back ability.

        In the LL(k) case mentioned above, the tokenizer can be implemented by
        a standard generator function (clean and simple), that is completed by
        this class for the needs of the actual parser.
        """
        def __init__(self, iterator):
            self.iterator = iterator
            self.pushed_back = []

        def __iter__(self):
            return self

        def __next__(self):
            if self.pushed_back:
                return self.pushed_back.pop()
            else:
                return next(self.iterator)

        def push_back(self, element):
            self.pushed_back.append(element)

    def main():
        it = Back_pushable_iterator(x for x in range(10))

        x = next(it)  # 0
        print(x)
        it.push_back(x)
        x = next(it)  # 0
        print(x)
        x = next(it)  # 1
        print(x)
        x = next(it)  # 2
        y = next(it)  # 3
        print(x)
        print(y)
        it.push_back(y)
        it.push_back(x)
        x = next(it)  # 2
        y = next(it)  # 3
        print(x)
        print(y)

        for x in it:
            print(x)  # 4-9

    if __name__ == "__main__":
        main()