Why is statistics.mean() so slow?

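I compared the performance of the mean function from the statistics module with the simple sum(l)/len(l) approach and found that, for some reason, mean is very slow. I used timeit with the two code snippets below to compare them. Does anyone know what causes the huge difference in execution speed? I am using Python 3.5.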

    from timeit import repeat
    print(min(repeat('mean(l)',
                     '''from random import randint; from statistics import mean; \
                     l=[randint(0, 10000) for i in range(10000)]''',
                     repeat=20, number=10)))

The code above executes in about 0.043 seconds on my machine.

    from timeit import repeat
    print(min(repeat('sum(l)/len(l)',
                     '''from random import randint; from statistics import mean; \
                     l=[randint(0, 10000) for i in range(10000)]''',
                     repeat=20, number=10)))

The code above executes in about 0.000565 seconds on my machine.

Python's statistics module is built for accuracy, not for speed.

The specification for this module says as much:

The built-in sum can lose accuracy when dealing with floats of wildly differing magnitude. Consequently, the above naive mean fails this "torture test"

    assert mean([1e30, 1, 3, -1e30]) == 1

returning 0 instead of 1, a purely computational error of 100%.

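Using math.fsum inside mean will make it more accurate with float data, but it also has the side effect of converting any arguments to float even when that is unnecessary. For example, we should expect the mean of a list of Fractions to be a Fraction, not a float.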
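The behaviour described in that passage is easy to reproduce. The following is a minimal sketch using only the standard library; the helper names `naive_mean` and `fsum_mean` are made up for illustration and are not part of any module.

```python
import math
from fractions import Fraction
from statistics import mean


def naive_mean(data):
    """Plain sum/len: fast, but every float addition is rounded."""
    return sum(data) / len(data)


def fsum_mean(data):
    """math.fsum avoids the intermediate rounding but always yields a float."""
    return math.fsum(data) / len(data)


torture = [1e30, 1, 3, -1e30]
print(naive_mean(torture))  # 0.0, the small terms are absorbed by 1e30
print(fsum_mean(torture))   # 1.0
print(mean(torture))        # 1.0

fracs = [Fraction(1, 4), Fraction(3, 4)]
print(repr(fsum_mean(fracs)))  # 0.5, a float
print(repr(mean(fracs)))       # Fraction(1, 2), the input type is kept
```

The last two lines show the side effect the specification mentions: the fsum-based version silently turns Fractions into floats, while statistics.mean preserves them.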

Conversely, if we look at the implementation of _sum() in this module, the first lines of its docstring seem to confirm this:

    def _sum(data, start=0):
        """_sum(data [, start]) -> (type, sum, count)

        Return a high-precision sum of the given numeric data as a fraction,
        together with the type to be converted to and the count of items.

        [...]
        """

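So yes, the statistics implementation of sum, instead of being a simple one-liner call to Python's built-in sum() function, takes about 20 lines by itself, with a nested for loop in its body.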

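This happens because statistics._sum chooses to guarantee maximum precision for every kind of number it might encounter (even when they differ widely from one another), rather than simply emphasizing speed.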

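So it seems normal that the built-in sum turns out to be about a hundred times faster. The price is much lower precision if you happen to call it with exotic numbers.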
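The idea behind this exact summation can be illustrated with a much smaller sketch. This is not the module's code: `exact_mean` is a hypothetical helper that only handles finite ints, floats and Fractions, skips the type coercion that the real function performs, and always returns a float.

```python
from fractions import Fraction


def exact_mean(data):
    """Average numbers exactly as rationals, rounding only once at the end."""
    partials = {}  # denominator -> running integer sum of numerators
    count = 0
    for x in data:
        f = Fraction(x)  # exact conversion, no rounding for ints and floats
        partials[f.denominator] = partials.get(f.denominator, 0) + f.numerator
        count += 1
    if count == 0:
        raise ValueError("exact_mean requires at least one data point")
    total = sum(Fraction(n, d) for d, n in partials.items())
    return float(total / count)


print(exact_mean([1e30, 1, 3, -1e30]))            # 1.0
print(sum([1e30, 1, 3, -1e30]) / 4)               # 0.0
print(exact_mean([1e50, 1, -1e50] * 1000) * 3000)  # close to 1000
```

Every element goes through a Fraction conversion and a dictionary update in Python bytecode, which is where the time goes compared with a single call into the C implementation of sum.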

Other options

If you need to prioritize speed in your algorithms, you should look at NumPy instead, whose algorithms are implemented in C.

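NumPy's mean is not as precise as the one in statistics by a long shot, but since 2013 it has implemented a routine based on pairwise summation, which is better than a naive sum/len (more information in the link).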
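For readers unfamiliar with pairwise summation, here is a pure-Python sketch of the technique. It is not NumPy's implementation (that one is written in C and organised differently); `pairwise_sum`, `pairwise_mean` and the `block_size` parameter are names chosen for this illustration.

```python
def pairwise_sum(values, block_size=8):
    """Sum floats by recursively splitting the sequence in half."""
    n = len(values)
    if n <= block_size:
        # Small blocks are summed with a plain loop.
        total = 0.0
        for v in values:
            total += v
        return total
    mid = n // 2
    # Adding two partial sums of similar size limits how fast rounding
    # error accumulates compared with one long left-to-right loop.
    return (pairwise_sum(values[:mid], block_size)
            + pairwise_sum(values[mid:], block_size))


def pairwise_mean(values):
    return pairwise_sum(values) / len(values)


print(pairwise_mean([0.1] * 100000))
print(pairwise_sum([1e30, 1, 3, -1e30]))  # 0.0, short inputs use the plain loop
```

Pairwise summation slows the growth of rounding error on long inputs, but it does not track exact partial sums, so it cannot rescue a case where a huge value absorbs small ones.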

However...

    import numpy as np
    import statistics

    np_mean = np.mean([1e30, 1, 3, -1e30])
    statistics_mean = statistics.mean([1e30, 1, 3, -1e30])

    print('NumPy mean: {}'.format(np_mean))
    print('Statistics mean: {}'.format(statistics_mean))

    > NumPy mean: 0.0
    > Statistics mean: 1.0

If you care about speed, use numpy/scipy/pandas instead:

    In [119]: from random import randint; from statistics import mean; import numpy as np;

    In [122]: l=[randint(0, 10000) for i in range(10**6)]

    In [123]: mean(l)
    Out[123]: 5001.992355

    In [124]: %timeit mean(l)
    1 loop, best of 3: 2.01 s per loop

    In [125]: a = np.array(l)

    In [126]: np.mean(a)
    Out[126]: 5001.9923550000003

    In [127]: %timeit np.mean(a)
    100 loops, best of 3: 2.87 ms per loop

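Conclusion: it will be orders of magnitude faster (about 700 times faster in my example), but possibly not as precise, since numpy does not use the Kahan summation algorithm.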
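For reference, Kahan (compensated) summation can be sketched in a few lines of Python. `kahan_sum` is an illustrative name, and this is only a textbook version of the algorithm, not code taken from any of the libraries discussed here.

```python
def kahan_sum(values):
    """Compensated summation: carry the rounding error of each addition."""
    total = 0.0
    compensation = 0.0  # estimate of the low-order bits lost so far
    for v in values:
        y = v - compensation            # re-inject the previously lost part
        t = total + y                   # big + small: low bits of y may be lost
        compensation = (t - total) - y  # recover what was just lost
        total = t
    return total


data = [0.1] * 10
print(sum(data))        # plain left-to-right float addition
print(kahan_sum(data))  # compensated result
print(kahan_sum([1e30, 1, 3, -1e30]))  # 0.0
```

Compensated summation helps most on long runs of values of similar size. As the last line shows, it still returns 0.0 for the torture test above, because the single compensation term cannot hold the small values once it is itself added to 1e30.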

I asked myself the same question a while back, but once I noticed the _sum function that mean calls, on line 317 of the source, I understood why:

    def _sum(data, start=0):
        """_sum(data [, start]) -> (type, sum, count)

        Return a high-precision sum of the given numeric data as a fraction,
        together with the type to be converted to and the count of items.

        If optional argument ``start`` is given, it is added to the total.
        If ``data`` is empty, ``start`` (defaulting to 0) is returned.

        Examples
        --------

        >>> _sum([3, 2.25, 4.5, -0.5, 1.0], 0.75)
        (<class 'float'>, Fraction(11, 1), 5)

        Some sources of round-off error will be avoided:

        >>> _sum([1e50, 1, -1e50] * 1000)  # Built-in sum returns zero.
        (<class 'float'>, Fraction(1000, 1), 3000)

        Fractions and Decimals are also supported:

        >>> from fractions import Fraction as F
        >>> _sum([F(2, 3), F(7, 5), F(1, 4), F(5, 6)])
        (<class 'fractions.Fraction'>, Fraction(63, 20), 4)

        >>> from decimal import Decimal as D
        >>> data = [D("0.1375"), D("0.2108"), D("0.3061"), D("0.0419")]
        >>> _sum(data)
        (<class 'decimal.Decimal'>, Fraction(6963, 10000), 4)

        Mixed types are currently treated as an error, except that int is
        allowed.
        """
        count = 0
        n, d = _exact_ratio(start)
        partials = {d: n}
        partials_get = partials.get
        T = _coerce(int, type(start))
        for typ, values in groupby(data, type):
            T = _coerce(T, typ)  # or raise TypeError
            for n,d in map(_exact_ratio, values):
                count += 1
                partials[d] = partials_get(d, 0) + n
        if None in partials:
            # The sum will be a NAN or INF. We can ignore all the finite
            # partials, and just look at this special one.
            total = partials[None]
            assert not _isfinite(total)
        else:
            # Sum all the partial sums using builtin sum.
            # FIXME is this faster if we sum them in order of the denominator?
            total = sum(Fraction(n, d) for d, n in sorted(partials.items()))
        return (T, total, count)

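A multitude of operations happen here compared with just calling the built-in sum because, as the docstring says, mean calculates a high-precision sum.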

You can see that using mean versus sum can give you different output:

    In [7]: l = [.1, .12312, 2.112, .12131]

    In [8]: sum(l) / len(l)
    Out[8]: 0.6141074999999999

    In [9]: mean(l)
    Out[9]: 0.6141075
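One way to see where that difference comes from is to redo the calculation with exact rationals and round only once at the end. This is a sketch that reuses the list from the session above; the variable names are illustrative, and the comparison assumes mean divides the exact total before converting back to float, which is what the _sum docstring describes.

```python
from fractions import Fraction
from statistics import mean

l = [.1, .12312, 2.112, .12131]

# Exact rational total of the stored float values, no rounding so far.
exact_total = sum(Fraction(x) for x in l)

# A single rounding step when converting the exact quotient back to float.
exact_result = float(exact_total / len(l))

naive_result = sum(l) / len(l)  # rounds after each addition and the division

print(exact_result == mean(l))
print(naive_result, exact_result)
```

The naive version can drift by a unit in the last place because each intermediate addition is rounded, while the rational version carries no intermediate error.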

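Both len() and sum() are Python built-in functions (with limited functionality). They are written in C and, more importantly, are optimized to work fast with certain types or objects (such as list).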

You can look at the implementation of the built-in functions here:

https://hg.python.org/sandbox/python2.7/file/tip/Python/bltinmodule.c

statistics.mean() is a high-level function written in Python. Take a look at how it is implemented:

https://hg.python.org/sandbox/python2.7/file/tip/Lib/statistics.py

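You can see that the latter internally uses another function called _sum(), which performs some additional checks compared with the built-in functions.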

According to this post: Calculating arithmetic mean (average) in Python

It should be "due to a particularly precise implementation of the sum operator in statistics".

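The mean function is coded with an internal _sum function, which is supposed to be more precise than ordinary addition but is much slower (the code can be found here: https://hg.python.org/cpython/file/3.5/Lib/statistics.py).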

This is specified in the PEP: https://www.python.org/dev/peps/pep-0450/. For this module, accuracy is considered more important than speed.
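On Python 3.8 and later (newer than the 3.5 used in the question), the statistics module also offers fmean, which converts its input to float and is documented as faster than mean. The sketch below compares the two; the timings are left unstated because they depend on the machine and the Python version.

```python
from random import randint
from statistics import fmean, mean  # fmean requires Python 3.8+
from timeit import timeit

data = [randint(0, 10000) for _ in range(10000)]

# fmean always returns a float, even for integer input.
print(repr(fmean([1, 3])))         # 2.0
print(fmean([1e30, 1, 3, -1e30]))  # 1.0

# Rough timing comparison over 10 calls each; results vary by machine.
t_mean = timeit(lambda: mean(data), number=10)
t_fmean = timeit(lambda: fmean(data), number=10)
t_naive = timeit(lambda: sum(data) / len(data), number=10)
print('mean: {:.6f}s  fmean: {:.6f}s  sum/len: {:.6f}s'.format(
    t_mean, t_fmean, t_naive))
```

The trade-off is the one discussed above for math.fsum: Fractions and Decimals come back as floats, so fmean is a fit only when float results are acceptable.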
