统计：Python中的组合

我需要在Python中计算组合函数（nCr），但是在math ， numpy或stat库中找不到函数。类似于这个types的函数：

 comb = calculate_combinations(n, r)

我需要可能的组合的数量，而不是实际的组合，所以itertools.combinations不关心我。

最后，我想避免使用阶乘，因为我将计算组合的数字可能变得太大，因子将是可怕的。

这似乎是一个非常容易回答的问题，但是我正在淹没关于生成所有实际组合的问题，这不是我想要的。 🙂

非常感谢

请参阅scipy.misc.comb 。当exact为假时，它使用gammaln函数来获得良好的精度，而不需要太多的时间。在确切的情况下，它返回一个任意精度的整数，这可能需要很长时间来计算。

为什么不自己写呢？这是一个单线或这样的：

 from operator import mul # or mul=lambda x,y:x*y from fractions import Fraction def nCk(n,k): return int( reduce(mul, (Fraction(ni, i+1) for i in range(k)), 1) )

testing – 印刷帕斯卡的三angular形：

 >>> for n in range(17): ... print ' '.join('%5d'%nCk(n,k) for k in range(n+1)).center(100) ... 1 1 1 1 2 1 1 3 3 1 1 4 6 4 1 1 5 10 10 5 1 1 6 15 20 15 6 1 1 7 21 35 35 21 7 1 1 8 28 56 70 56 28 8 1 1 9 36 84 126 126 84 36 9 1 1 10 45 120 210 252 210 120 45 10 1 1 11 55 165 330 462 462 330 165 55 11 1 1 12 66 220 495 792 924 792 495 220 66 12 1 1 13 78 286 715 1287 1716 1716 1287 715 286 78 13 1 1 14 91 364 1001 2002 3003 3432 3003 2002 1001 364 91 14 1 1 15 105 455 1365 3003 5005 6435 6435 5005 3003 1365 455 105 15 1 1 16 120 560 1820 4368 8008 11440 12870 11440 8008 4368 1820 560 120 16 1 >>>

PS。编辑来replaceint(round(reduce(mul, (float(ni)/(i+1) for i in range(k)), 1))) with int(reduce(mul, (Fraction(ni, i+1) for i in range(k)), 1))所以它不会犯大N / K

谷歌代码快速search（它使用公式@马克拜尔斯的答案）：

 def choose(n, k): """ A fast way to calculate binomial coefficients by Andrew Dalke (contrib). """ if 0 <= k <= n: ntok = 1 ktok = 1 for t in xrange(1, min(k, n - k) + 1): ntok *= n ktok *= t n -= 1 return ntok // ktok else: return 0

如果你需要一个确切的答案， scipy.misc.comb()比scipy.misc.comb()要快10倍（在所有0 <=（n，k）<1e3对上testing）。

 def comb(N,k): # from scipy.comb(), but MODIFIED! if (k > N) or (N < 0) or (k < 0): return 0L N,k = map(long,(N,k)) top = N val = 1L while (top > (Nk)): val *= top top -= 1 n = 1L while (n < k+1L): val /= n n += 1 return val

如果你想得到确切的结果和速度，请尝试gmpy – gmpy.comb应该按照你的要求做，而且速度相当快（当然，作为gmpy的原作者，我有偏见;-)。

如果你想要一个确切的结果，使用sympy.binomial 。这似乎是最快的方法，手下来。

 x = 1000000 y = 234050 %timeit scipy.misc.comb(x, y, exact=True) 1 loops, best of 3: 1min 27s per loop %timeit gmpy.comb(x, y) 1 loops, best of 3: 1.97 s per loop %timeit int(sympy.binomial(x, y)) 100000 loops, best of 3: 5.06 µs per loop

在很多情况下，math定义的直接翻译是相当充分的（记住Python会自动使用大数字算术）：

 from math import factorial def calculate_combinations(n, r): return factorial(n) // factorial(r) // factorial(nr)

对于我testing的一些投入（例如n = 1000 r = 500），这比另一个（当前最高的投票答案）build议的一个class轮reduce速度快了10倍以上。另一方面，它是由@JF Sebastian提供的snippit。

这是另一个select。这个最初是用C ++编写的，所以它可以被反向移植到C ++来获得有限精度的整数（例如__int64）。其优点是（1）只涉及整数运算，（2）通过连续的乘除运算来避免整数值的膨胀。我用Nas Banov的Pascal三angulartesting了结果，得到了正确答案：

 def choose(n,r): """Computes n! / (r! (nr)!) exactly. Returns a python long int.""" assert n >= 0 assert 0 <= r <= n c = 1L denom = 1 for (num,denom) in zip(xrange(n,nr,-1), xrange(1,r+1,1)): c = (c * num) // denom return c

理由：为了最小化乘法和除法的次数，我们将expression式改写为

  n! n(n-1)...(n-r+1) --------- = ---------------- r!(nr)! r!

为尽可能避免乘法溢出，我们将按照以下严格顺序从左到右进行评估：

 n / 1 * (n-1) / 2 * (n-2) / 3 * ... * (n-r+1) / r

我们可以certificate按这个顺序操作的整数运算是精确的（即没有舍入误差）。

使用dynamic规划，时间复杂度是Θ（n * m）和空间复杂度Θ（m）：

 def binomial(n, k): """ (int, int) -> int | c(n-1, k-1) + c(n-1, k), if 0 < k < n c(n,k) = | 1 , if n = k | 1 , if k = 0 Precondition: n > k >>> binomial(9, 2) 36 """ c = [0] * (n + 1) c[0] = 1 for i in range(1, n + 1): c[i] = 1 j = i - 1 while j > 0: c[j] += c[j - 1] j -= 1 return c[k]

当n大于20时，直接公式产生大整数。

所以，还有一个回应：

 from math import factorial binomial = lambda n,r: reduce(long.__mul__, range(nr, n+1), 1L) // factorial(r)

简短，快捷，高效。

sympy很简单。

 import sympy comb = sympy.binomial(n, r)

仅使用与Python分发的标准库 ：

 import itertools def nCk(n, k): return len(list(itertools.combinations(range(n), k)))

这可能是一样快，你可以在纯Python中做相当大的input：

 def choose(n, k): if k == n: return 1 if k > n: return 0 d, q = max(k, nk), min(k, nk) num = 1 for n in xrange(d+1, n+1): num *= n denom = 1 for d in xrange(1, q+1): denom *= d return num / denom

如果你的程序有一个n的上界（比如说n <= N ）并且需要重复计算nCr（最好是N次），使用lru_cache可以给你一个巨大的性能提升：

 from functools import lru_cache @lru_cache(maxsize=None) def nCr(n, r): return 1 if r == 0 or r == n else nCr(n - 1, r - 1) + nCr(n - 1, r)

构buildcaching（隐式完成）需要O(N^2)时间。任何后续调用nCr将返回O(1) 。

统计：Python中的组合

在python中创build一个列表的所有可能的组合

快速排列 – >数字 – >置换映射algorithm

我怎样才能打印出给定的电话号码可以代表的所有可能的字母组合？

PHPalgorithm从一个集合中生成一个特定大小的所有组合

生成一个matrix，其中包含从n个向量中获取的元素的所有组合

列表的所有组合

比inheritance更喜欢构图？

Path.Combine和点符号

Java Swing中是否存在“Group Box”等价物？

foldl与foldr行为与无限列表