随机数似乎不是随机的

我正在尝试生成6个字符或更less的随机base32数字。 这应该给大约10亿个不同的组合。

我创build了一个程序来生成这些“随机”数字。 不过,平均每4万代产生一个副本。

为什么这些“随机”的数字在十亿多种不同的组合中经常出现重复?

这是我的代码:

static void Main(string[] args) { int seed = Environment.TickCount; Random r = new Random(seed); Dictionary<int, int> resultDictionary = new Dictionary<int, int>(); for (int x = 1; x <= 1000; x++) { Dictionary<string, int> dictionary = new Dictionary<string, int>(); try { while (true) { int rand = r.Next(0, 1073741823); CrockfordBase32Encoding encoding = new CrockfordBase32Encoding(); string encodedRand = encoding.Encode((ulong)rand, false); dictionary.Add(encodedRand, rand); } } catch (Exception) { } Console.WriteLine(string.Format("{0} - {1}", x, dictionary.Count)); resultDictionary.Add(x, dictionary.Count); x++; } Console.WriteLine(); Console.WriteLine("Average Number Before Duplicate: " + resultDictionary.Average(x => x.Value)); Console.ReadLine(); } 

这与生日问题类似。 给定一组n个人,两个人共享同一个生日1的概率是多less? 这比你想象的要高。

在你的情况下,随机select0到1,073,741,823次之间的数字有什么可能会给你一个重复的结果?

上面链接的一个近似值是1-exp(-(n*n)/(2*d)) 。 如果n=40,000 ,相当于select一个副本的概率约为52.5%,那么平均看到4万个副本之后看起来是重复的。


1假定生日是普遍分布的,事实上并非如此,而是“足够接近”,使得math更容易

这被称为生日问题 ,只是基本的概率论。

在1到K范围内的N个随机数重复的概率是:

在这里输入图像说明
在这里输入图像说明

计算获得至less一个副本的机会从1减去值。

在你的情况下,它的评估

 P(40000, 1073741823) = 1 - p(40000, 1073741823) 

通过使用Wolfram Alpha进行计算的结果是

 0.5252888122305790 

这意味着它可能会有50%以上的可能性。 当你产生更多的数字,你会越来越频繁地得到重复。

以下是对N的更多评价:

  N Result 40000 0.5253 100000 0.9905 200000 0.9999 

包含在框架中的随机数发生器是伪随机的,没有任何随机数分布的保证。 如果您关心分布模式,请考虑以下文章: http : //www.codeproject.com/Articles/15102/NET-random-number-generators-and-distributions

不过,我的统计教授(不是一个)曾经说过:“有一个小谎,一个大谎,有统计”。

首先是完整的代码,所以人们不必在互联网上寻找类实现来testing:

 using System; using System.Collections.Generic; using System.Linq; using System.Threading; namespace ConsoleApplication2 { class Program { static void Main(string[] args) { var r = RandomProvider.GetThreadRandom(); Dictionary<int, int> resultDictionary = new Dictionary<int, int>(); for (int x = 1; x <= 1000; x++) { Dictionary<string, int> dictionary = new Dictionary<string, int>(); try { while (true) { int rand = r.Next(0, 1073741823); CrockfordBase32Encoding encoding = new CrockfordBase32Encoding(); string encodedRand = encoding.Encode((ulong)rand, false); dictionary.Add(encodedRand, rand); } } catch (Exception) { } Console.WriteLine("{0} - {1}", x, dictionary.Count); resultDictionary.Add(x, dictionary.Count); x++; } Console.WriteLine(); Console.WriteLine("Average Number Before Duplicate: " + resultDictionary.Average(x => x.Value)); Console.WriteLine("Minimum Number Before Duplicate: " + resultDictionary.Min(x => x.Value)); Console.WriteLine("Maximum Number Before Duplicate: " + resultDictionary.Max(x => x.Value)); Console.WriteLine(" Median Number Before Duplicate: " + resultDictionary.Select(x=>x.Value).Median()); Console.ReadLine(); } } public static class Extensions { public static double Median<T>(this IEnumerable<T> list) { List<double> orderedList = list.Select(s=>Convert.ToDouble(s)) .OrderBy(numbers => numbers) .ToList(); int listSize = orderedList.Count; double result; if (listSize % 2 == 0) // even { int midIndex = listSize / 2; result = ((orderedList.ElementAt(midIndex - 1) + orderedList.ElementAt(midIndex)) / 2); } else // odd { double element = (double)listSize / 2; element = Math.Round(element, MidpointRounding.AwayFromZero); result = orderedList.ElementAt((int)(element - 1)); } return result; } } public static class RandomProvider { private static int seed = Environment.TickCount; private static ThreadLocal<Random> randomWrapper = new ThreadLocal<Random>(() => new Random(Interlocked.Increment(ref seed)) ); public static Random GetThreadRandom() { return randomWrapper.Value; } } public class CrockfordBase32Encoding { const int Base = 32; const int CheckDigitBase = 37; static readonly IDictionary<int, char> valueEncodings; static readonly IDictionary<int, char> checkDigitEncodings; static readonly IDictionary<char, int> valueDecodings; static readonly IDictionary<char, int> checkDigitDecodings; static CrockfordBase32Encoding() { var symbols = new SymbolDefinitions(); valueEncodings = symbols.ValueEncodings; checkDigitEncodings = symbols.CheckDigitEncodings; valueDecodings = symbols.ValueDecodings; checkDigitDecodings = symbols.CheckDigitDecodings; } public string Encode(ulong input, bool includeCheckDigit) { var chunks = SplitInto5BitChunks(input); var characters = chunks.Select(chunk => valueEncodings[chunk]); if (includeCheckDigit) { var checkValue = (int)(input % CheckDigitBase); characters = characters.Concat(new[] { checkDigitEncodings[checkValue] }); } return new string(characters.ToArray()); } internal static IEnumerable<byte> SplitInto5BitChunks(ulong input) { const int bitsPerChunk = 5; const int shift = (sizeof(ulong) * 8) - bitsPerChunk; var chunks = new List<byte>(); do { var lastChunk = input << shift >> shift; chunks.Insert(0, (byte)lastChunk); input = input >> bitsPerChunk; } while (input > 0); return chunks; } public ulong? Decode(string encodedString, bool treatLastCharacterAsCheckDigit) { if (encodedString == null) throw new ArgumentNullException("encodedString"); if (encodedString.Length == 0) return null; IEnumerable<char> charactersInReverse = encodedString.Reverse().ToArray(); int? expectedCheckValue = null; if (treatLastCharacterAsCheckDigit) { var checkDigit = charactersInReverse.First(); if (!checkDigitDecodings.ContainsKey(checkDigit)) return null; expectedCheckValue = checkDigitDecodings[checkDigit]; charactersInReverse = charactersInReverse.Skip(1); } ulong number = 0; ulong currentBase = 1; foreach (var character in charactersInReverse) { if (!valueDecodings.ContainsKey(character)) return null; var value = valueDecodings[character]; number += (ulong)value * currentBase; currentBase *= Base; } if (expectedCheckValue.HasValue && (int)(number % CheckDigitBase) != expectedCheckValue) return null; return number; } } internal class SymbolDefinitions : List<SymbolDefinition> { readonly List<SymbolDefinition> extraCheckDigits = new List<SymbolDefinition>(); public SymbolDefinitions() { AddRange(new[] { new SymbolDefinition { Value = 0, EncodeSymbol = '0', DecodeSymbols = new[] { '0', 'O', 'o' } }, new SymbolDefinition { Value = 1, EncodeSymbol = '1', DecodeSymbols = new[] { '1', 'I', 'i', 'L', 'l' } }, new SymbolDefinition { Value = 2, EncodeSymbol = '2', DecodeSymbols = new[] { '2' } }, new SymbolDefinition { Value = 3, EncodeSymbol = '3', DecodeSymbols = new[] { '3' } }, new SymbolDefinition { Value = 4, EncodeSymbol = '4', DecodeSymbols = new[] { '4' } }, new SymbolDefinition { Value = 5, EncodeSymbol = '5', DecodeSymbols = new[] { '5' } }, new SymbolDefinition { Value = 6, EncodeSymbol = '6', DecodeSymbols = new[] { '6' } }, new SymbolDefinition { Value = 7, EncodeSymbol = '7', DecodeSymbols = new[] { '7' } }, new SymbolDefinition { Value = 8, EncodeSymbol = '8', DecodeSymbols = new[] { '8' } }, new SymbolDefinition { Value = 9, EncodeSymbol = '9', DecodeSymbols = new[] { '9' } }, new SymbolDefinition { Value = 10, EncodeSymbol = 'A', DecodeSymbols = new[] { 'A', 'a' } }, new SymbolDefinition { Value = 11, EncodeSymbol = 'B', DecodeSymbols = new[] { 'B', 'b' } }, new SymbolDefinition { Value = 12, EncodeSymbol = 'C', DecodeSymbols = new[] { 'C', 'c' } }, new SymbolDefinition { Value = 13, EncodeSymbol = 'D', DecodeSymbols = new[] { 'D', 'd' } }, new SymbolDefinition { Value = 14, EncodeSymbol = 'E', DecodeSymbols = new[] { 'E', 'e' } }, new SymbolDefinition { Value = 15, EncodeSymbol = 'F', DecodeSymbols = new[] { 'F', 'f' } }, new SymbolDefinition { Value = 16, EncodeSymbol = 'G', DecodeSymbols = new[] { 'G', 'g' } }, new SymbolDefinition { Value = 17, EncodeSymbol = 'H', DecodeSymbols = new[] { 'H', 'h' } }, new SymbolDefinition { Value = 18, EncodeSymbol = 'J', DecodeSymbols = new[] { 'J', 'j' } }, new SymbolDefinition { Value = 19, EncodeSymbol = 'K', DecodeSymbols = new[] { 'K', 'k' } }, new SymbolDefinition { Value = 20, EncodeSymbol = 'M', DecodeSymbols = new[] { 'M', 'm' } }, new SymbolDefinition { Value = 21, EncodeSymbol = 'N', DecodeSymbols = new[] { 'N', 'n' } }, new SymbolDefinition { Value = 22, EncodeSymbol = 'P', DecodeSymbols = new[] { 'P', 'p' } }, new SymbolDefinition { Value = 23, EncodeSymbol = 'Q', DecodeSymbols = new[] { 'Q', 'q' } }, new SymbolDefinition { Value = 24, EncodeSymbol = 'R', DecodeSymbols = new[] { 'R', 'r' } }, new SymbolDefinition { Value = 25, EncodeSymbol = 'S', DecodeSymbols = new[] { 'S', 's' } }, new SymbolDefinition { Value = 26, EncodeSymbol = 'T', DecodeSymbols = new[] { 'T', 't' } }, new SymbolDefinition { Value = 27, EncodeSymbol = 'V', DecodeSymbols = new[] { 'V', 'v' } }, new SymbolDefinition { Value = 28, EncodeSymbol = 'W', DecodeSymbols = new[] { 'W', 'w' } }, new SymbolDefinition { Value = 29, EncodeSymbol = 'X', DecodeSymbols = new[] { 'X', 'x' } }, new SymbolDefinition { Value = 30, EncodeSymbol = 'Y', DecodeSymbols = new[] { 'Y', 'y' } }, new SymbolDefinition { Value = 31, EncodeSymbol = 'Z', DecodeSymbols = new[] { 'Z', 'z' } }, }); extraCheckDigits.AddRange(new[] { new SymbolDefinition { Value = 32, EncodeSymbol = '*', DecodeSymbols = new[] { '*' } }, new SymbolDefinition { Value = 33, EncodeSymbol = '~', DecodeSymbols = new[] { '~' } }, new SymbolDefinition { Value = 34, EncodeSymbol = '$', DecodeSymbols = new[] { '$' } }, new SymbolDefinition { Value = 35, EncodeSymbol = '=', DecodeSymbols = new[] { '=' } }, new SymbolDefinition { Value = 36, EncodeSymbol = 'U', DecodeSymbols = new[] { 'U', 'u' } }, }); } public IDictionary<int, char> ValueEncodings { get { return this.ToDictionary(s => s.Value, s => s.EncodeSymbol); } } public IDictionary<int, char> CheckDigitEncodings { get { return this .Union(extraCheckDigits) .ToDictionary(s => s.Value, s => s.EncodeSymbol); } } public IDictionary<char, int> ValueDecodings { get { return this .SelectMany(s => s.DecodeSymbols.Select(d => new { s.Value, DecodeSymbol = d })) .ToDictionary(s => s.DecodeSymbol, s => s.Value); } } public IDictionary<char, int> CheckDigitDecodings { get { return this .Union(extraCheckDigits) .SelectMany(s => s.DecodeSymbols.Select(d => new { s.Value, DecodeSymbol = d })) .ToDictionary(s => s.DecodeSymbol, s => s.Value); } } } internal class SymbolDefinition { public int Value { get; set; } public IEnumerable<char> DecodeSymbols { get; set; } public char EncodeSymbol { get; set; } } } 

我添加了一些额外的输出行:

 Average Number Before Duplicate: 41043.954 Minimum Number Before Duplicate: 2498 Maximum Number Before Duplicate: 127683 Median Number Before Duplicate: 37860 

没有那么有意思,而平均约为40k,看最小和最大两个数量级。

随机性不保证均匀分布。 连续两次掷骰子,两次掷出的数字都是随机的。 之前已经完成了一次两次或两次以上的彩票大奖。

如果你需要每个线程更独特的分布,我已经包含了Jon Skeet 最出色的书 (是的,我是一个迷人的)的RandomProvider样本。

UPDATE

平行执行的一个小改写,因为折磨硅基生命forms是很有趣的:

  static void Main(string[] args) { ConcurrentDictionary<int, int> resultDictionary = new ConcurrentDictionary<int, int>(); Parallel.For(0, 1000, x => { var r = RandomProvider.GetThreadRandom(); ConcurrentDictionary<string, int> dictionary = new ConcurrentDictionary<string, int>(); while (true) { int rand = r.Next(0, 1073741823); CrockfordBase32Encoding encoding = new CrockfordBase32Encoding(); string encodedRand = encoding.Encode((ulong) rand, false); if (!dictionary.TryAdd(encodedRand, rand)) break; } Console.WriteLine("{0} - {1}", x, dictionary.Count); resultDictionary.TryAdd(x, dictionary.Count); }); Console.WriteLine(); Console.WriteLine("Average Number Before Duplicate: " + resultDictionary.Average(x => x.Value)); Console.WriteLine("Minimum Number Before Duplicate: " + resultDictionary.Min(x => x.Value)); Console.WriteLine("Maximum Number Before Duplicate: " + resultDictionary.Max(x => x.Value)); Console.WriteLine(" Median Number Before Duplicate: " + resultDictionary.Select(x=>x.Value).Median()); Console.ReadLine(); } 

结果是:

 Average Number Before Duplicate: 41826.375 Minimum Number Before Duplicate: 1655 Maximum Number Before Duplicate: 134671 Median Number Before Duplicate: 39119 

更新2

所以CodeProject文章的作者已经将他的工作作为一个NuGet包发布:

 Install-Package Troschuetz.Random 

我使用了相同的示例代码来testing不同的生成器:

StandardGenerator

 Average Number Before Duplicate: 40434.148 Minimum Number Before Duplicate: 978 Maximum Number Before Duplicate: 136248 Median Number Before Duplicate: 38845 

ALFGenerator

 Average Number Before Duplicate: 40395.845 Minimum Number Before Duplicate: 828 Maximum Number Before Duplicate: 125705 Median Number Before Duplicate: 38042 

MT19937Generator

 Average Number Before Duplicate: 40478.174 Minimum Number Before Duplicate: 2723 Maximum Number Before Duplicate: 121367 Median Number Before Duplicate: 38279 

XorShift128Generator

 Average Number Before Duplicate: 41463.732 Minimum Number Before Duplicate: 878 Maximum Number Before Duplicate: 111206 Median Number Before Duplicate: 39013.5 

所以你有它。 享受这是值得的亚..