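Hidden features of Python

What are the lesser-known but useful features of the Python programming language?

  • Try to limit answers to Python core.
  • One feature per answer.
  • Give an example and a short description of the feature, not just a link to documentation.
  • Label the feature using a title as the first line.

Quick links to answers: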

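  • Argument unpacking
  • Braces
  • Chaining comparison operators
  • Decorators
  • Default argument gotchas / dangers of mutable default arguments
  • Dictionary default .get value
  • Docstring tests
  • Ellipsis slicing syntax
  • Enumeration
  • For/else
  • Function as iter() argument
  • Generator expressions
  • import this
  • In-place value swapping
  • List stepping
  • __missing__ items
  • Multi-line regex
  • Named string formatting
  • Nested list/generator comprehensions
  • New types at runtime
  • .pth files
  • ROT13 encoding
  • Regex debugging
  • Sending to generators
  • Tab completion in the interactive interpreter
  • Ternary expression
  • try/except/else
  • Unpacking + print() function
  • with statement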

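Chaining comparison operators:

    >>> x = 5
    >>> 1 < x < 10
    True
    >>> 10 < x < 20
    False
    >>> x < 10 < x*10 < 100
    True
    >>> 10 > x <= 9
    True
    >>> 5 == x > 4
    True

If you think it is evaluating 1 < x, which comes out as True, and then comparing True < 10, which is also True, then no, that is really not what happens (see the last example). It really translates into 1 < x and x < 10, and x < 10 and 10 < x * 10 and x*10 < 100, but with less typing and with each term evaluated only once.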
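To make the "each term is only evaluated once" point concrete, here is a small sketch in Python 3 syntax. The helper middle() and the calls list are made up for illustration; they just count how often the middle operand is computed.

```python
calls = []

def middle():
    """Hypothetical helper that records each time it is evaluated."""
    calls.append("middle")
    return 5

# Chained form: middle() appears once and is evaluated once.
chained = 1 < middle() < 10
chained_calls = len(calls)

# Hand-expanded form: middle() is written twice, so it runs twice.
calls.clear()
expanded = 1 < middle() and middle() < 10
expanded_calls = len(calls)

print(chained, chained_calls)
print(expanded, expanded_calls)
```

Both forms give the same truth value here; the difference only matters when the middle term is expensive or has side effects.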

Get the Python regex parse tree to debug your regex.

Regular expressions are a great feature of Python, but debugging them can be a pain, and it is all too easy to get a regex wrong.

Fortunately, Python can print the regex parse tree if you pass the undocumented, experimental, hidden flag re.DEBUG (actually, 128) to re.compile.

    >>> re.compile("^\[font(?:=(?P<size>[-+][0-9]{1,2}))?\](.*?)[/font]",
    ...     re.DEBUG)
    at at_beginning
    literal 91
    literal 102
    literal 111
    literal 110
    literal 116
    max_repeat 0 1
      subpattern None
        literal 61
        subpattern 1
          in
            literal 45
            literal 43
          max_repeat 1 2
            in
              range (48, 57)
    literal 93
    subpattern 2
      min_repeat 0 65535
        any None
    in
      literal 47
      literal 102
      literal 111
      literal 110
      literal 116

Once you understand the syntax, you can spot your errors. Here we can see that I forgot to escape the [] in [/font], so it was parsed as a character class ("in") rather than as literal text.
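For anyone wondering what that mistake does in practice, here is a rough sketch (Python 3, raw strings) comparing the pattern above with an escaped version. The sample text and the variable names are my own, not part of the original answer.

```python
import re

text = "[font=+2]hello[/font]"  # made-up sample input

# Pattern from the answer: the closing tag is an unescaped character class,
# so it matches any single one of the characters / f o n t.
buggy = re.compile(r"^\[font(?:=(?P<size>[-+][0-9]{1,2}))?\](.*?)[/font]")

# Escaped version: matches the literal text "[/font]".
fixed = re.compile(r"^\[font(?:=(?P<size>[-+][0-9]{1,2}))?\](.*?)\[/font\]")

buggy_body = buggy.match(text).group(2)
fixed_match = fixed.match(text)

print(repr(buggy_body))            # 'hell': the lazy group stops at the first "o"
print(repr(fixed_match.group(2)))  # 'hello'
print(fixed_match.group("size"))   # +2
```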

Of course, you can combine it with any other flags you want, for example to get commented regexes:

    >>> re.compile("""
    ...  ^              # start of a line
    ...  \[font         # the font tag
    ...  (?:=(?P<size>  # optional [font=+size]
    ...  [-+][0-9]{1,2} # size specification
    ...  ))?
    ...  \]             # end of tag
    ...  (.*?)          # text between the tags
    ...  \[/font\]      # end of the tag
    ...  """, re.DEBUG|re.VERBOSE|re.DOTALL)

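enumerate

Wrap an iterable with enumerate and it will yield each item along with its index.

For example:

    >>> a = ['a', 'b', 'c', 'd', 'e']
    >>> for index, item in enumerate(a): print index, item
    ...
    0 a
    1 b
    2 c
    3 d
    4 e
    >>>

References:

  • Python tutorial: looping techniques
  • Python docs: built-in functions: enumerate
  • PEP 279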

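Creating generator objects

If you write

    x=(n for n in foo if bar(n))

you get a generator object and assign it to x. That means you can now do

    for n in x:

The advantage is that you do not need intermediate storage, which you would need if you wrote

    x = [n for n in foo if bar(n)]

In some cases this can lead to a significant speed-up.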
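A quick way to see the "no intermediate storage" point is to watch when the source values actually get produced. This is only a sketch in Python 3 syntax; noisy_range and the two log lists are invented for the demonstration.

```python
def noisy_range(limit, log):
    """Hypothetical source that records every value it produces."""
    for n in range(limit):
        log.append(n)
        yield n

# List comprehension: the whole source is consumed immediately.
list_log = []
squares_list = [n * n for n in noisy_range(5, list_log)]

# Generator expression: nothing is produced until a value is requested.
gen_log = []
squares_gen = (n * n for n in noisy_range(5, gen_log))
produced_before = list(gen_log)  # still empty
first = next(squares_gen)        # pulls exactly one value
produced_after = list(gen_log)

print(list_log, produced_before, first, produced_after)
```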

You can chain several for clauses (and append if filters) in a generator expression, basically replicating nested for loops:

    >>> n = ((a,b) for a in range(0,2) for b in range(4,6))
    >>> for i in n:
    ...     print i
    (0, 4)
    (0, 5)
    (1, 4)
    (1, 5)

iter() can take a callable argument

For instance:

    def seek_next_line(f):
        for c in iter(lambda: f.read(1),'\n'):
            pass

The iter(callable, until_value) function repeatedly calls callable and yields its result until until_value is returned.
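Another place this form tends to be handy is reading fixed-size chunks. A minimal sketch in Python 3, where an in-memory io.BytesIO stands in for a real file (the data and chunk size are arbitrary):

```python
import io
from functools import partial

stream = io.BytesIO(b"abcdefghij")  # stand-in for a real file or socket

# Call stream.read(4) repeatedly until it returns the sentinel b"".
chunks = list(iter(partial(stream.read, 4), b""))

print(chunks)
```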

Be careful with mutable default arguments

    >>> def foo(x=[]):
    ...     x.append(1)
    ...     print x
    ...
    >>> foo()
    [1]
    >>> foo()
    [1, 1]
    >>> foo()
    [1, 1, 1]

Instead, you should use a sentinel value denoting "not given" and replace it with the mutable you would like as the default:

    >>> def foo(x=None):
    ...     if x is None:
    ...         x = []
    ...     x.append(1)
    ...     print x
    >>> foo()
    [1]
    >>> foo()
    [1]
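If it helps to see why this happens: the default object is created once, when the def statement runs, and is stored on the function. A small sketch in Python 3 syntax (the functions return the list instead of printing it, and bar is my name for the sentinel version):

```python
def foo(x=[]):
    x.append(1)
    return x

foo()
foo()
# The very same list object lives on the function between calls.
shared_default = foo.__defaults__[0]

def bar(x=None):
    if x is None:
        x = []  # a fresh list on every call
    x.append(1)
    return x

print(shared_default)
print(bar(), bar())
```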

Sending values into generator functions. For example, given this function:

    def mygen():
        """Yield 5 until something else is passed back via send()"""
        a = 5
        while True:
            f = (yield a) #yield a and possibly get f in return
            if f is not None:
                a = f  #store the new value

You can:

    >>> g = mygen()
    >>> g.next()
    5
    >>> g.next()
    5
    >>> g.send(7)  #we send this back to the generator
    7
    >>> g.next() #now it will yield 7 until we send something else
    7
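The session above is Python 2. In Python 3 the g.next() method is gone and you call the built-in next(g) instead; send() works the same way. A sketch of the same session as a script:

```python
def mygen():
    """Yield 5 until something else is passed back via send()."""
    a = 5
    while True:
        f = yield a       # yield a and possibly receive a value
        if f is not None:
            a = f         # store the new value

g = mygen()
first = next(g)    # runs to the first yield
second = next(g)   # nothing was sent, so a is unchanged
sent = g.send(7)   # resumes the generator with f == 7
third = next(g)    # keeps yielding the stored value

print(first, second, sent, third)
```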

If you don't like using whitespace to denote scopes, you can use the C-style {} by issuing:

    from __future__ import braces

The step argument in slice operators. For example:

    >>> a = [1,2,3,4,5]
    >>> a[::2]  # iterate over the whole list in 2-increments
    [1, 3, 5]

The special case x[::-1] is a useful idiom for 'x reversed'.

    >>> a[::-1]
    [5, 4, 3, 2, 1]

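Decorators

Decorators allow you to wrap a function or method in another function that can add functionality, modify arguments or results, and so on. You write a decorator one line above the function definition, beginning with an "at" sign (@).

The example shows a print_args decorator that prints the decorated function's arguments before calling it:

    >>> def print_args(function):
    ...     def wrapper(*args, **kwargs):
    ...         print 'Arguments:', args, kwargs
    ...         return function(*args, **kwargs)
    ...     return wrapper
    ...
    >>> @print_args
    ... def write(text):
    ...     print text
    ...
    >>> write('foo')
    Arguments: ('foo',) {}
    foo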
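One thing the plain wrapper loses is the decorated function's name and docstring. A possible variation, sketched in Python 3 syntax, uses functools.wraps from the standard library to copy them over; making write() return a value instead of printing is my change for the sake of the example.

```python
import functools

def print_args(function):
    @functools.wraps(function)  # copy __name__, __doc__, etc. onto wrapper
    def wrapper(*args, **kwargs):
        print("Arguments:", args, kwargs)
        return function(*args, **kwargs)
    return wrapper

@print_args
def write(text):
    """Return the text in upper case (hypothetical example function)."""
    return text.upper()

result = write("foo")
print(result, write.__name__)
```

Without the functools.wraps line, write.__name__ would report "wrapper".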

The for...else syntax (see http://docs.python.org/ref/for.html)

    for i in foo:
        if i == 0:
            break
    else:
        print("i was never 0")

The "else" block is normally executed at the end of the for loop, unless break is called.

The above code could be emulated as follows:

    found = False
    for i in foo:
        if i == 0:
            found = True
            break
    if not found:
        print("i was never 0")
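A typical use is a search loop where the else branch handles "nothing found". Here is a small sketch; first_factor is a made-up helper, not something from the standard library.

```python
def first_factor(n):
    """Return the smallest factor of n in 2..sqrt(n), or None if there is none."""
    for candidate in range(2, int(n ** 0.5) + 1):
        if n % candidate == 0:
            found = candidate
            break
    else:
        # Reached only when the loop finished without hitting break.
        found = None
    return found

print(first_factor(15), first_factor(13))
```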

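From 2.5 onwards, dicts have a special method __missing__ that is invoked for missing items:

    >>> class MyDict(dict):
    ...     def __missing__(self, key):
    ...         self[key] = rv = []
    ...         return rv
    ...
    >>> m = MyDict()
    >>> m["foo"].append(1)
    >>> m["foo"].append(2)
    >>> dict(m)
    {'foo': [1, 2]}

There is also a dict subclass in collections called defaultdict that does pretty much the same, but calls a function without arguments for items that do not exist:

    >>> from collections import defaultdict
    >>> m = defaultdict(list)
    >>> m["foo"].append(1)
    >>> m["foo"].append(2)
    >>> dict(m)
    {'foo': [1, 2]}

I recommend converting such dicts to regular dicts before passing them to functions that don't expect such subclasses. A lot of code uses d[a_key] and catches KeyError to check whether an item exists, which would add a new item to the dict.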
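To illustrate that last warning, here is a sketch of the kind of code that gets confused. The has_key helper is hypothetical; it stands for any function that tests membership by catching KeyError.

```python
from collections import defaultdict

def has_key(mapping, key):
    """Hypothetical helper that tests membership by catching KeyError."""
    try:
        mapping[key]
    except KeyError:
        return False
    return True

plain = {}
auto = defaultdict(list)

plain_result = has_key(plain, "foo")     # the lookup raises KeyError
auto_result = has_key(auto, "foo")       # the lookup silently creates "foo"
converted_result = has_key(dict(defaultdict(list)), "foo")

print(plain_result, auto_result, dict(auto), converted_result)
```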

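In-place value swapping

    >>> a = 10
    >>> b = 5
    >>> a, b
    (10, 5)
    >>> a, b = b, a
    >>> a, b
    (5, 10)

The right-hand side of the assignment is an expression that creates a new tuple. The left-hand side of the assignment immediately unpacks that (unreferenced) tuple to the names a and b.

After the assignment, the new tuple is unreferenced and marked for garbage collection, and the values bound to a and b have been swapped.

As noted in the Python tutorial section on data structures,

Note that multiple assignment is really just a combination of tuple packing and sequence unpacking.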

Readable regular expressions

In Python you can split a regular expression over multiple lines, name your matches and insert comments.

Example verbose syntax (from Dive into Python):

    >>> pattern = """
    ... ^                   # beginning of string
    ... M{0,4}              # thousands - 0 to 4 M's
    ... (CM|CD|D?C{0,3})    # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
    ...                     #            or 500-800 (D, followed by 0 to 3 C's)
    ... (XC|XL|L?X{0,3})    # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 X's),
    ...                     #        or 50-80 (L, followed by 0 to 3 X's)
    ... (IX|IV|V?I{0,3})    # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
    ...                     #        or 5-8 (V, followed by 0 to 3 I's)
    ... $                   # end of string
    ... """
    >>> re.search(pattern, 'M', re.VERBOSE)

Example naming matches (from Regular Expression HOWTO):

    >>> p = re.compile(r'(?P<word>\b\w+\b)')
    >>> m = p.search( '(((( Lots of punctuation )))' )
    >>> m.group('word')
    'Lots'

You can also write a regex verbosely without using re.VERBOSE, thanks to string literal concatenation.

    >>> pattern = (
    ...     "^"                 # beginning of string
    ...     "M{0,4}"            # thousands - 0 to 4 M's
    ...     "(CM|CD|D?C{0,3})"  # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
    ...                         #            or 500-800 (D, followed by 0 to 3 C's)
    ...     "(XC|XL|L?X{0,3})"  # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 X's),
    ...                         #        or 50-80 (L, followed by 0 to 3 X's)
    ...     "(IX|IV|V?I{0,3})"  # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
    ...                         #        or 5-8 (V, followed by 0 to 3 I's)
    ...     "$"                 # end of string
    ... )
    >>> print pattern
    ^M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$

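Function argument unpacking

You can unpack a list or a dictionary as function arguments using * and **.

For example:

    def draw_point(x, y):
        # do some magic
        pass

    point_foo = (3, 4)
    point_bar = {'y': 3, 'x': 2}

    draw_point(*point_foo)
    draw_point(**point_bar)

A very useful shortcut, since lists, tuples and dicts are widely used as containers.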
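To see where each value ends up, here is the same example with a body filled in. The return string is just something I made up so the result is visible; the original leaves the body as "magic".

```python
def draw_point(x, y):
    """Hypothetical stand-in that reports the coordinates it received."""
    return "point at ({}, {})".format(x, y)

point_foo = (3, 4)
point_bar = {"y": 3, "x": 2}

from_tuple = draw_point(*point_foo)   # positional: x=3, y=4
from_dict = draw_point(**point_bar)   # by keyword: x=2, y=3

print(from_tuple)
print(from_dict)
```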

ROT13 is a valid encoding for source code, when you use the right coding declaration at the top of the code file:

    #!/usr/bin/env python
    # -*- coding: rot13 -*-

    cevag "Uryyb fgnpxbiresybj!".rapbqr("rot13")

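Creating new types in a fully dynamic manner

    >>> NewType = type("NewType", (object,), {"x": "hello"})
    >>> n = NewType()
    >>> n.x
    'hello'

which is exactly the same as

    >>> class NewType(object):
    ...     x = "hello"
    ...
    >>> n = NewType()
    >>> n.x
    'hello'

Probably not the most useful thing, but nice to know.

Edit: Fixed the name of the new type; it should be NewType to be exactly the same thing as with the class statement.

Edit: Adjusted the title to more accurately describe the feature.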
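The dictionary passed to type() can hold functions too, and they become methods. A short sketch; greet is an invented example function.

```python
def greet(self):
    """Hypothetical method body, defined as a plain function first."""
    return self.x + ", world"

# name, tuple of base classes, namespace dictionary
NewType = type("NewType", (object,), {"x": "hello", "greet": greet})

n = NewType()
greeting = n.greet()
print(greeting, NewType.__name__)
```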

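Context managers and the "with" statement

Introduced in PEP 343, a context manager is an object that acts as a run-time context for a suite of statements.

Since the feature makes use of new keywords, it was introduced gradually: it is available in Python 2.5 via the __future__ directive. Python 2.6 and above (including Python 3) have it available by default.

I have used the "with" statement a lot because I think it's a very useful construct. Here is a quick demo:

    from __future__ import with_statement

    with open('foo.txt', 'w') as f:
        f.write('hello!')

What's happening here behind the scenes is that the "with" statement calls the special __enter__ and __exit__ methods on the file object. Exception details are also passed to __exit__ if any exception was raised from the with statement body, allowing for exception handling to happen there.

What this does for you in this particular case is guarantee that the file is closed when execution falls out of the scope of the with suite, regardless of whether that occurs normally or whether an exception was thrown. It is basically a way of abstracting away common exception-handling code.

Other common use cases include locking with threads and database transactions.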
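If you want to watch the protocol in action, any class with __enter__ and __exit__ will do. A minimal sketch in Python 3; the Recorder class is invented for the demonstration and only logs what happens.

```python
class Recorder:
    """Hypothetical context manager that logs entry, exit and any exception."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # exc_type is None when the body finished normally.
        self.events.append(("exit", exc_type))
        return False  # a false value lets any exception propagate

rec = Recorder()
with rec:
    rec.events.append("body")

failing = Recorder()
try:
    with failing:
        raise ValueError("boom")
except ValueError:
    failing.events.append("propagated")

print(rec.events)
print(failing.events)
```

Returning a true value from __exit__ would suppress the exception instead.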

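Dictionaries have a get() method

Dictionaries have a get() method. If you do d['key'] and key isn't there, you get an exception. If you do d.get('key'), you get back None if 'key' isn't there. You can add a second argument to get that value back instead of None, e.g. d.get('key', 0).

It's great for things like adding up numbers:

    sum[value] = sum.get(value, 0) + 1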
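Spelled out as a complete loop, the counting idiom might look like this. The word list is made up, and I have called the dict counts rather than sum so it does not shadow the built-in.

```python
words = ["spam", "eggs", "spam", "ham", "spam"]  # made-up sample data

counts = {}
for word in words:
    # A missing key reads as 0, so no KeyError handling is needed.
    counts[word] = counts.get(word, 0) + 1

print(counts)
```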

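Descriptors

They're the magic behind a whole bunch of core Python features.

When you use dotted access to look up a member (e.g. x.y), Python first looks for the member in the instance dictionary. If it's not found, it looks for it in the class dictionary. If it finds it in the class dictionary, and the object implements the descriptor protocol, instead of just returning it, Python executes it. A descriptor is any class that implements the __get__, __set__, or __delete__ methods.

Here's how you'd implement your own (read-only) version of property using descriptors:

    class Property(object):
        def __init__(self, fget):
            self.fget = fget

        def __get__(self, obj, type):
            if obj is None:
                return self
            return self.fget(obj)

and you'd use it just like the built-in property():

    class MyClass(object):
        @Property
        def foo(self):
            return "Foo!"

Descriptors are used in Python to implement properties, bound methods, static methods, class methods and slots, amongst other things. Understanding them makes it easy to see why a lot of things that previously looked like Python 'quirks' are the way they are.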

Raymond Hettinger has an excellent tutorial that does a much better job of describing them than I do.
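For completeness, here is a sketch of the Property class in use (Python 3 syntax; the objtype parameter name is my choice). It also shows one consequence of the lookup order described above: because this class defines only __get__, an entry placed in the instance dictionary takes precedence over it.

```python
class Property:
    def __init__(self, fget):
        self.fget = fget

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self          # accessed on the class itself
        return self.fget(obj)    # accessed on an instance

class MyClass:
    @Property
    def foo(self):
        return "Foo!"

instance = MyClass()
value = instance.foo            # goes through Property.__get__
class_access = MyClass.foo      # the descriptor object itself

# No __set__ is defined, so an instance attribute of the same name wins.
instance.__dict__["foo"] = "shadowed"
shadowed = instance.foo

print(value, shadowed)
```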

Conditional Assignment

 x = 3 if (y == 1) else 2 

It does exactly what it sounds like: "assign 3 to x if y is 1, otherwise assign 2 to x". Note that the parens are not necessary, but I like them for readability. You can also chain it if you have something more complicated:

 x = 3 if (y == 1) else 2 if (y == -1) else 1 

Though at a certain point, it goes a little too far.

Note that you can use if … else in any expression. For example:

 (func1 if y == 1 else func2)(arg1, arg2) 

Here func1 will be called if y is 1, and func2 otherwise. In both cases the corresponding function will be called with arguments arg1 and arg2.

Analogously, the following is also valid:

 x = (class1 if y == 1 else class2)(arg1, arg2) 

where class1 and class2 are two classes.

Doctest: documentation and unit-testing at the same time.

Example extracted from the Python documentation:

    def factorial(n):
        """Return the factorial of n, an exact integer >= 0.

        If the result is small enough to fit in an int, return an int.
        Else return a long.

        >>> [factorial(n) for n in range(6)]
        [1, 1, 2, 6, 24, 120]
        >>> factorial(-1)
        Traceback (most recent call last):
            ...
        ValueError: n must be >= 0

        Factorials of floats are OK, but the float must be an exact integer:
        """

        import math
        if not n >= 0:
            raise ValueError("n must be >= 0")
        if math.floor(n) != n:
            raise ValueError("n must be exact integer")
        if n+1 == n:  # catch a value like 1e300
            raise OverflowError("n too large")
        result = 1
        factor = 2
        while factor <= n:
            result *= factor
            factor += 1
        return result

    def _test():
        import doctest
        doctest.testmod()

    if __name__ == "__main__":
        _test()

Named formatting

%-formatting takes a dictionary (and also applies %i/%s etc. validation).

    >>> print "The %(foo)s is %(bar)i." % {'foo': 'answer', 'bar':42}
    The answer is 42.

    >>> foo, bar = 'question', 123

    >>> print "The %(foo)s is %(bar)i." % locals()
    The question is 123.

And since locals() is also a dictionary, you can simply pass that as a dict and have %-substitutions from your local variables. I think this is frowned upon, but it simplifies things.

New Style Formatting

 >>> print("The {foo} is {bar}".format(foo='answer', bar=42)) 

To add more Python modules (especially third-party ones), most people seem to use the PYTHONPATH environment variable, or they add symlinks or directories to their site-packages directory. Another way is to use *.pth files. Here's the official Python documentation's explanation:

"The most convenient way [to modify python's search path] is to add a path configuration file to a directory that's already on Python's path, usually to the …/site-packages/ directory. Path configuration files have an extension of .pth, and each line must contain a single path that will be appended to sys.path. (Because the new paths are appended to sys.path, modules in the added directories will not override standard modules. This means you can't use this mechanism for installing fixed versions of standard modules.)"

Exception else clause:

    try:
        put_4000000000_volts_through_it(parrot)
    except Voom:
        print "'E's pining!"
    else:
        print "This parrot is no more!"
    finally:
        end_sketch()

The use of the else clause is better than adding additional code to the try clause because it avoids accidentally catching an exception that wasn't raised by the code being protected by the try … except statement.
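A slightly less silly sketch of the same point, in Python 3 syntax. load_age is a hypothetical helper: only the int() call is protected, so the ValueError raised deliberately in the else block is not swallowed by the handler.

```python
def load_age(text):
    """Hypothetical parser: returns None for non-numeric input."""
    try:
        age = int(text)
    except ValueError:
        return None  # the text was not an integer
    else:
        if age < 0:
            # Raised outside the try body, so the handler above
            # does not mistake it for a parsing failure.
            raise ValueError("age must be non-negative")
        return age

print(load_age("42"), load_age("abc"))
```

Had the range check been inside the try body, load_age("-1") would quietly return None instead of raising.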

See http://docs.python.org/tut/node10.html

Re-raising exceptions:

    # Python 2 syntax
    try:
        some_operation()
    except SomeError, e:
        if is_fatal(e):
            raise
        handle_nonfatal(e)

    # Python 3 syntax
    try:
        some_operation()
    except SomeError as e:
        if is_fatal(e):
            raise
        handle_nonfatal(e)

The 'raise' statement with no arguments inside an error handler tells Python to re-raise the exception with the original traceback intact, allowing you to say "oh, sorry, sorry, I didn't mean to catch that, sorry, sorry."

If you wish to print, store or fiddle with the original traceback, you can get it with sys.exc_info(), and printing it like Python would is done with the 'traceback' module.
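Roughly, capturing the traceback text before re-raising could look like this (Python 3 syntax). some_operation here is a dummy that always fails, standing in for the placeholder used above.

```python
import sys
import traceback

def some_operation():
    """Dummy operation that always fails, for demonstration only."""
    raise KeyError("missing")

captured = None
try:
    try:
        some_operation()
    except KeyError:
        exc_type, exc_value, exc_tb = sys.exc_info()
        # Format the traceback the way the interpreter would print it.
        captured = "".join(
            traceback.format_exception(exc_type, exc_value, exc_tb))
        raise  # re-raise the same exception, traceback intact
except KeyError as error:
    reraised = error

print(captured)
```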

Main messages 🙂

 import this # btw look at this module's source :) 

Deciphered:

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Interactive Interpreter Tab Completion

    try:
        import readline
    except ImportError:
        print "Unable to load readline module."
    else:
        import rlcompleter
        readline.parse_and_bind("tab: complete")

    >>> class myclass:
    ...    def function(self):
    ...       print "my function"
    ...
    >>> class_instance = myclass()
    >>> class_instance.<TAB>
    class_instance.__class__   class_instance.__module__
    class_instance.__doc__     class_instance.function
    >>> class_instance.f<TAB>unction()

You will also have to set a PYTHONSTARTUP environment variable.

Nested list comprehensions and generator expressions:

    [(i,j) for i in range(3) for j in range(i) ]
    ((i,j) for i in range(4) for j in range(i) )

These can replace huge chunks of nested-loop code.

Operator overloading for the set builtin:

    >>> a = set([1,2,3,4])
    >>> b = set([3,4,5,6])
    >>> a | b # Union
    {1, 2, 3, 4, 5, 6}
    >>> a & b # Intersection
    {3, 4}
    >>> a < b # Subset
    False
    >>> a - b # Difference
    {1, 2}
    >>> a ^ b # Symmetric Difference
    {1, 2, 5, 6}

More detail from the standard library reference: Set Types