Algorithm to determine if an array contains n … n+m?

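I saw this question on Reddit. No positive solutions were presented there, and I thought it would be a perfect question to ask here. It came up in a thread about interview questions:

Write a method that takes an int array of size m and returns (True/False) if the array consists of the numbers n … n+m-1, all numbers in that range and only numbers in that range. The array is not guaranteed to be sorted. (For instance, {2,3,4} would return true. {1,3,1} would return false, {1,2,4} would return false.)

The problem I had with this one is that my interviewer kept asking me to optimize (faster O(n), less memory, etc.), to the point where he claimed you could do it in one pass of the array using a constant amount of memory. I never figured that one out.

Along with your solutions, please indicate whether they assume that the array contains unique items. Also indicate whether your solution assumes the sequence starts at 1. (I've modified the question slightly to allow cases where it goes 2, 3, 4 …)

Edit: I am now of the opinion that there does not exist a linear-time, constant-space algorithm that handles duplicates. Can anyone verify this?

The duplicate problem boils down to testing whether the array contains duplicates in O(n) time and O(1) space. If this can be done, you can simply test first and, if there are no duplicates, run the algorithms posted. So can you test for dupes in O(n) time and O(1) space?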

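Under the assumption that numbers less than 1 are not allowed and there are no duplicates, there is a simple summation identity for this: the sum of the numbers from 1 to m in increments of 1 is (m * (m + 1)) / 2. You can then sum the array and compare against this identity.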
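A minimal Python sketch of that sum check, under the same assumptions as the answer (no duplicates, no values below 1). The function name `matches_sum_identity` is hypothetical and not from the thread.

```python
def matches_sum_identity(values):
    """Compare the array sum with m * (m + 1) / 2 for the range 1..m.

    Assumption (from the answer): no duplicates and no values below 1.
    Without that guarantee, a matching sum proves nothing.
    """
    m = len(values)
    return sum(values) == m * (m + 1) // 2


print(matches_sum_identity([3, 1, 2]))  # array sum 6, identity gives 6
print(matches_sum_identity([1, 2, 4]))  # array sum 7, identity gives 6
```

Note that the running sum needs more bits as m grows, which is the objection raised elsewhere in the thread about calling this O(1) space.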

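You can find out whether there is a dupe under the above guarantees, plus the guarantee that no number is above m or less than n (which can be checked in O(N)).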

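The idea in pseudocode:
0) Start at N = 0.
1) Take the N-th element in the list.
2) If it is not in the place it would occupy if the list were sorted, check where it should be.
3) If the place where it should be already holds the same number, you have a dupe: RETURN TRUE.
4) Otherwise, swap the numbers (to put the first number in the right place).
5) Is the number you just swapped in now in the right place?
6) If not, go back to step two.
7) Otherwise, start again at step one with N = N + 1. If this would be past the end of the list, you have no dupes.

And yes, that runs in O(N), although it may look like O(N ^ 2).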
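The steps above can be sketched in Python roughly as follows. This is an illustration, not the poster's code: the name `has_duplicate` and the explicit offset parameter `n` are assumptions, and every value is assumed to have been range-checked beforehand.

```python
def has_duplicate(values, n):
    """Return True if `values` contains a duplicate.

    Assumes every value lies in n .. n+len(values)-1 (check that first)
    and that the list may be modified in place.
    """
    i = 0
    while i < len(values):
        target = values[i] - n      # index this value would have if sorted
        if target == i:
            i += 1                  # already in place, move to the next slot
        elif values[target] == values[i]:
            return True             # the slot already holds the same number
        else:
            # put values[i] where it belongs and re-examine the new values[i]
            values[i], values[target] = values[target], values[i]
    return False
```

Each swap puts at least one value into its final slot, so there are at most len(values) swaps in total, which is why the nested-looking loop stays linear.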

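Note to everyone (stuff collected from comments):

This solution works under the assumption that you can modify the array, and then uses an in-place radix sort (which achieves O(N) speed).

Other mathy solutions have been put forth, but I'm not sure any of them have been proved. There are a bunch of sums that might be useful, but most of them run into a blowup in the number of bits required to represent the sum, which violates the constant extra space guarantee. I also don't know whether any of them can produce a distinct number for a given set of numbers. I think a sum of squares might work; there is a known formula to compute it (see Wolfram's).

New insight (well, more musings that don't help solve it but are interesting, and I'm going to bed):

So, it has been mentioned to maybe use sum + sum of squares. No one knew whether this worked, and I realized that it only becomes an issue when (x + y) = (n + m), such as the fact that 2 + 2 = 1 + 3. Squares also have this issue thanks to Pythagorean triples (so 3^2 + 4^2 + 25^2 == 5^2 + 7^2 + 24^2, and the sum of squares doesn't work). If we use Fermat's last theorem, we know this can't happen for n^3. But we also don't know whether there is no x + y + z = n for this (unless we do and I don't know it). So there is no guarantee this, too, doesn't break, and if we continue down this path we quickly run out of bits.

In my glee, however, I forgot to note that you can break the sum of squares, but in doing so you create a normal sum that isn't valid. I don't think you can do both, but, as has been noted, we don't have a proof either way.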


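I must say, finding counterexamples is sometimes a lot easier than proving things! Consider the following sequences, all of which have a sum of 28 and a sum of squares of 140:

    [1, 2, 3, 4, 5, 6, 7]
    [1, 1, 4, 5, 5, 6, 6]
    [2, 2, 3, 3, 4, 7, 7]

I could not find any such examples of length 6 or less. If you want an example that has the proper min and max values too, try this one of length 8:

    [1, 3, 3, 4, 4, 5, 8, 8]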
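These counterexamples are easy to check mechanically. A short Python sketch (the helper name `sums` is made up for this illustration):

```python
def sums(seq):
    """Return (sum, sum of squares) of a sequence of integers."""
    return sum(seq), sum(x * x for x in seq)


reference = list(range(1, 8))                      # the valid sequence 1..7
for candidate in ([1, 1, 4, 5, 5, 6, 6], [2, 2, 3, 3, 4, 7, 7]):
    # Same sum and same sum of squares as 1..7, yet not a permutation of it.
    print(candidate, sums(candidate) == sums(reference))

# The length-8 example also matches 1..8, including min and max.
print(sums([1, 3, 3, 4, 4, 5, 8, 8]) == sums(list(range(1, 9))))
```

So a test based on the sum and the sum of squares alone accepts arrays that contain duplicates.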

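A simpler approach (modifying hazzen's idea):

An integer array of length m contains all the numbers from n to n+m-1 exactly once if and only if

  • every array element is between n and n+m-1
  • there are no duplicates

(Reason: there are only m values in the given integer range, so if the array contains m unique values in this range, it must contain every one of them once.)

If you are allowed to modify the array, you can check both conditions in one pass through the array with a modified version of hazzen's algorithm idea (there is no need to do any summation):

  • For all array indexes i from 0 to m-1, do
    1. If array[i] < n or array[i] >= n+m => RETURN FALSE ("value out of range found")
    2. Calculate j = array[i] - n (this is the 0-based position of array[i] in a sorted array with values from n to n+m-1)
    3. While j is not equal to i
      1. If array[i] is equal to array[j] => RETURN FALSE ("duplicate found")
      2. Swap array[i] with array[j]
      3. Repeat the range check of step 1 on the new array[i], then recalculate j = array[i] - n
  • RETURN TRUE

I'm not sure whether modifying the original array counts against the maximum allowed additional space of O(1), but if it doesn't, this should be the solution the original poster wanted.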
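One way to write that procedure in Python is sketched below. The name `is_range_in_place` is hypothetical, and the sketch repeats the range check on every value swapped into position i so that the index j can never fall outside the array.

```python
def is_range_in_place(array, n):
    """Return True if `array` holds exactly n .. n+m-1, where m = len(array).

    Modifies `array` in place (it ends up sorted when the answer is True).
    """
    m = len(array)
    for i in range(m):
        while True:
            if array[i] < n or array[i] >= n + m:
                return False        # value out of range found
            j = array[i] - n        # 0-based position in the sorted array
            if j == i:
                break               # array[i] is where it belongs
            if array[i] == array[j]:
                return False        # duplicate found
            array[i], array[j] = array[j], array[i]
    return True


print(is_range_in_place([2, 3, 4], 2))
print(is_range_in_place([1, 3, 1], 1))
print(is_range_in_place([1, 2, 4], 1))
```

Every swap places one value in its final slot, so the total work is O(m) even though the loops are nested.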

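By working with a[i] % a.length instead of a[i], you reduce the problem to determining that you have the numbers 0 to a.length - 1.

We take this observation for granted and try to check whether the array contains [0,m).

Find the first node that's not in its correct position, e.g.

    0 1 2 3 7 5 6 8 4 ;  the original dataset (after the renaming we discussed)
            ^
            `---this is position 4 and the 7 shouldn't be here

Swap that number into where it should be, i.e. swap the 7 with the 8:

    0 1 2 3 8 5 6 7 4 ;
            |     `--------- 7 is in the right place.
            `--------------- this is now the 'current' position

Now we repeat this. Looking again at our current position, we ask:

"Is this the correct number for here?"

  • If not, we swap it into its correct place.
  • If it is in the right place, we move right and do this again.

Following this rule again, we get:

    0 1 2 3 4 5 6 7 8 ;  4 and 8 were just swapped

This gradually builds up the list correctly from left to right, and each number is moved at most once, hence this is O(n).

If there are dupes, we'll notice as soon as there is an attempt to swap a number backwards in the list.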

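Why do the other solutions use a summation of every value? I think this is risky, because when you add O(n) items together into one number, you're technically using more than O(1) space.

A simpler method:

Step 1, figure out whether there are duplicates. I'm not sure if this is possible in O(1) space. Anyway, return false if there are duplicates.

Step 2, iterate through the list, keeping track of the lowest and highest items.

Step 3, does (highest - lowest) equal m - 1? If so, return true.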
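A rough Python sketch of these three steps. The duplicate test here uses a `set`, which costs O(m) extra space, so it only illustrates the logic and does not answer the open O(1)-space question; the name `is_range_simple` is hypothetical. For m distinct values covering n … n+m-1, highest minus lowest is m - 1, which is the comparison used below.

```python
def is_range_simple(values):
    """Steps 1-3: no duplicates, then compare the spread with the length."""
    m = len(values)
    if m == 0:
        return True                     # assumption: an empty array is trivially fine
    if len(set(values)) != m:           # step 1: duplicates (O(m) extra space here)
        return False
    lowest, highest = min(values), max(values)   # step 2
    return highest - lowest == m - 1              # step 3
```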

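Any one-pass algorithm requires Omega(n) bits of storage.

Suppose to the contrary that there exists a one-pass algorithm that uses o(n) bits. Because it makes only one pass, it must summarize the first n/2 values in o(n) space. Since there are C(n, n/2) = 2^Theta(n) possible sets of n/2 values drawn from S = {1, …, n}, there exist two distinct sets A and B of n/2 values such that the state of memory is the same after both. If A' = S \ A is the "correct" set of values to complement A, then the algorithm cannot possibly answer correctly for both of the inputs

A A' - yes

B A' - no

since it cannot distinguish the first case from the second.

QED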

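Vote me down if I'm wrong, but I think we can determine whether there are duplicates using variance. Because we know the mean beforehand (n + (m-1)/2 or something like that), we can sum up the numbers and the squared differences from the mean, and check that the sum matches the equation (mn + m(m-1)/2) and that the variance is (0 + 1 + 4 + … + (m-1)^2) / m. If the variance doesn't match, it's likely we have a duplicate.

EDIT: the variance is supposed to be (0 + 1 + 4 + … + [(m-1)/2]^2) * 2 / m, because half of the elements are less than the mean and the other half are greater than the mean.

If there is a duplicate, a term in the above equation will differ from the correct sequence, even if another duplicate completely cancels out the change in the mean. So the function returns true only if both the sum and the variance match the desired values, which we can compute beforehand.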

Here's a working solution in O(n).

This uses the pseudocode suggested by Hazzen plus some of my own ideas. It works for negative numbers as well and doesn't require any sum-of-the-squares stuff.

    function testArray($nums, $n, $m) {
        // check the sum. PHP offers this array_sum() method, but it's
        // trivial to write your own. O(n) here.
        if (array_sum($nums) != ($m * ($m + 2 * $n - 1) / 2)) {
            return false;    // checksum failed.
        }
        for ($i = 0; $i < $m; ++$i) {
            // check if the number is in the proper range
            if ($nums[$i] < $n || $nums[$i] >= $n + $m) {
                return false;  // value out of range.
            }

            while (($shouldBe = $nums[$i] - $n) != $i) {
                if ($nums[$shouldBe] == $nums[$i]) {
                    return false;    // duplicate
                }
                $temp = $nums[$i];
                $nums[$i] = $nums[$shouldBe];
                $nums[$shouldBe] = $temp;
            }
        }
        return true;    // huzzah!
    }

    var_dump(testArray(array(1, 2, 3, 4, 5), 1, 5));  // true
    var_dump(testArray(array(5, 4, 3, 2, 1), 1, 5));  // true
    var_dump(testArray(array(6, 4, 3, 2, 0), 1, 5));  // false - out of range
    var_dump(testArray(array(5, 5, 3, 2, 1), 1, 5));  // false - checksum fail
    var_dump(testArray(array(5, 4, 3, 2, 5), 1, 5));  // false - dupe
    var_dump(testArray(array(-2, -1, 0, 1, 2), -2, 5)); // true

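A while back I heard about a very clever sorting algorithm from someone who worked for the phone company. They had to sort a massive number of phone numbers. After going through a bunch of different sort strategies, they finally hit on a very elegant solution: they just created a bit array and treated the offset into the bit array as the phone number. They then swept through their database in a single pass, changing the bit for each number to 1. After that, they swept through the bit array once, spitting out the phone numbers for entries that had the bit set high.

Along those lines, I believe that you can use the data in the array itself as a meta data structure to look for duplicates. Worst case, you could have a separate array, but I'm pretty sure you can use the input array if you don't mind a bit of swapping.

I'm going to leave out the n parameter for the time being, because that just confuses things; adding in an index offset is pretty easy to do.

Consider:

    for i = 0 to m
      if (a[a[i]]==a[i]) return false; // we have a duplicate
      while (a[a[i]] > a[i]) swapArrayIndexes(a[i], i)
      sum = sum + a[i]
    next

    if sum = (n+m-1)*m return true else return false

This isn't O(n), probably closer to O(n log n), but it does provide for constant space and may provide a different vector of attack for the problem.

If we want O(n), then using an array of bytes and some bit operations will provide the duplication check with one extra bit per element, i.e. n/8 bytes, or n/32 words assuming 32-bit ints.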
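A sketch of such a bit-array duplicate check in Python, using one bit per possible value packed into a `bytearray`. The function name `has_duplicate_bitset` and the offset parameter `n` are assumptions for illustration, and every value is assumed to have been range-checked already.

```python
def has_duplicate_bitset(values, n):
    """Return True if a value occurs twice.

    Assumes all values lie in n .. n+m-1 with m = len(values);
    uses m bits of extra memory (not constant space).
    """
    m = len(values)
    seen = bytearray((m + 7) // 8)      # one bit per possible value
    for v in values:
        byte, bit = divmod(v - n, 8)    # which byte, and which bit inside it
        mask = 1 << bit
        if seen[byte] & mask:
            return True                 # bit already set: value seen before
        seen[byte] |= mask
    return False
```

This runs in O(n) time, but the memory still grows linearly with the input, only with a small constant factor.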

EDIT: The above algorithm could be improved further by adding the sum check inside the loop and checking for:

    if sum > (n+m-1)*m return false

That way it will fail fast.

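Assuming you know only the length of the array and you are allowed to modify the array, it can be done in O(1) space and O(n) time.

The process has two straightforward steps:

  1. "Modulo sort" the array. [5,3,2,4] => [4,5,2,3] (O(2n))
  2. Check that each value's neighbor is one higher than itself (modulo) (O(n))

All told, you need at most 3 passes through the array.

The modulo sort is the "tricky" part, but the objective is simple. Take each value in the array and store it at its own address (modulo length). This requires one pass through the array, looping over each location and "evicting" its value by swapping it to its correct location and moving in the value at its destination. If you ever move in a value which is congruent to the value you just evicted, you have a duplicate and can exit early. Worst case, it's O(2n).

The check is a single pass through the array, examining each value with its next-highest neighbor. Always O(n).

The combined algorithm is O(n) + O(2n) = O(3n) = O(n).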
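The "modulo sort" step on its own can be sketched in Python like this. It covers only the first of the two steps; the name `modulo_sort` is hypothetical, and the duplicate test is phrased as "the destination slot already holds a value congruent to ours", which is an adaptation of the eviction wording above.

```python
def modulo_sort(values):
    """Move each value to index value % len(values), in place.

    Returns False as soon as two values compete for the same slot
    (they are congruent modulo the length), True otherwise.
    """
    length = len(values)
    for i in range(length):
        while values[i] % length != i:
            target = values[i] % length
            if values[target] % length == target:
                return False    # slot already holds a value congruent to ours
            values[i], values[target] = values[target], values[i]
    return True


data = [5, 3, 2, 4]
print(modulo_sort(data), data)   # data becomes [4, 5, 2, 3]
```

Python's `%` always returns a non-negative result for a positive modulus, so this sketch also tolerates negative values; languages where `%` can be negative would need an adjustment.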

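Pseudocode from my solution:

    foreach(values[])
      while(values[i] not congruent to i)
        to-be-evicted = values[i]
        evict(values[i])   // swap to its 'proper' location
        if(values[i]%length == to-be-evicted%length)
          return false;  // a 'duplicate' arrived when we evicted that number
      end while
    end foreach
    foreach(values[])
      if((values[i]+1)%length != values[i+1]%length)
        return false
    end foreach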

I've included the Java proof of concept below. It's not pretty, but it passes all the unit tests I made for it. I call these a "StraightArray" because they correspond to the poker hand of a straight (a contiguous sequence, ignoring suit).

    public class StraightArray {
        static int evict(int[] a, int i) {
            int t = a[i];
            a[i] = a[t%a.length];
            a[t%a.length] = t;
            return t;
        }
        static boolean isStraight(int[] values) {
            for(int i = 0; i < values.length; i++) {
                while(values[i]%values.length != i) {
                    int evicted = evict(values, i);
                    if(evicted%values.length == values[i]%values.length) {
                        return false;
                    }
                }
            }
            for(int i = 0; i < values.length-1; i++) {
                int n = (values[i]%values.length)+1;
                int m = values[(i+1)]%values.length;
                if(n != m) {
                    return false;
                }
            }
            return true;
        }
    }

An implementation of Hazzen's algorithm in C:

    #include<stdio.h>

    #define swapxor(a,i,j) a[i]^=a[j];a[j]^=a[i];a[i]^=a[j];

    int check_ntom(int a[], int n, int m) {
        int i = 0, j = 0;
        for(i = 0; i < m; i++) {
            if(a[i] < n || a[i] >= n+m) return 0;   //invalid entry
            j = a[i] - n;
            while(j != i) {
                if(a[i]==a[j]) return -1;           //bucket already occupied. Dupe.
                swapxor(a, i, j);                   //faster bitwise swap
                if(a[i] < n || a[i] >= n+m) return 0;   //[NEW] invalid entry
                j = a[i] - n;
            }
        }
        return 200;                                 //OK
    }

    int main() {
        int n=5, m=5;
        int a[] = {6, 5, 7, 9, 8};
        int r = check_ntom(a, n, m);
        printf("%d", r);
        return 0;
    }

Edit: change made to the code to eliminate illegal memory access.

    boolean determineContinuousArray(int *arr, int len)
    {
        // Suppose the array is like below:
        //int arr[10] = {7,11,14,9,8,100,12,5,13,6};
        //int len = sizeof(arr)/sizeof(int);

        int n = arr[0];

        int *result = new int[len];
        for(int i=0; i< len; i++)
            result[i] = -1;
        for (int i=0; i < len; i++)
        {
            int cur = arr[i];
            int hold ;
            if ( arr[i] < n){
                n = arr[i];
            }
            while(true){
                if ( cur - n >= len){
                    cout << "array index out of range: meaning this is not a valid array" << endl;
                    return false;
                }
                else if ( result[cur - n] != cur){
                    hold = result[cur - n];
                    result[cur - n] = cur;
                    if (hold == -1) break;
                    cur = hold;
                }else{
                    cout << "found duplicate number " << cur << endl;
                    return false;
                }
            }
        }
        cout << "this is a valid array" << endl;
        for(int j=0 ; j< len; j++)
            cout << result[j] << "," ;
        cout << endl;
        return true;
    }

    def test(a, n, m):
        seen = [False] * m
        for x in a:
            if x < n or x >= n+m:
                return False
            if seen[x-n]:
                return False
            seen[x-n] = True
        return False not in seen

    print(test([2, 3, 1], 1, 3))
    print(test([1, 3, 1], 1, 3))
    print(test([1, 2, 4], 1, 3))

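Note that this makes only one pass through the first array, not counting the linear search involved in not in. :)

I could also have used a Python set, but I opted for the straightforward solution where the performance characteristics of set need not be considered.

Update: Smashery pointed out that I had misparsed "constant amount of memory" and that this solution doesn't actually solve the problem.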

If you want to know the sum of the numbers [n ... n + m - 1], just use this equation.

    var sum = m * (m + 2 * n - 1) / 2;

That works for any number, positive or negative, even if n is a decimal.
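For reference, the formula follows from writing each term as n plus an offset k; the derivation below is a worked check rather than part of the original answer.

```latex
\sum_{k=0}^{m-1} (n + k)
  = mn + \sum_{k=0}^{m-1} k
  = mn + \frac{m(m-1)}{2}
  = \frac{m\,(m + 2n - 1)}{2}
% Example: n = 1,  m = 5:  5 (5 + 2 - 1) / 2 = 15 = 1 + 2 + 3 + 4 + 5
% Example: n = -2, m = 5:  5 (5 - 4 - 1) / 2 = 0  = -2 - 1 + 0 + 1 + 2
```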

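Why do the other solutions use a summation of every value? I think this is risky, because when you add O(n) items together into one number, you're technically using more than O(1) space.

O(1) indicates constant space, which does not change with n. It does not matter whether it is 1 or 2 variables, as long as it is a constant number. Why are you saying it is more than O(1) space? If you are calculating the sum of n numbers by accumulating it in a temporary variable, you would be using exactly 1 variable anyway.

I'm commenting in an answer because the system does not allow me to write comments yet.

Update (in reply to comments): in this answer I meant O(1) space wherever "space" or "time" was omitted. The quoted text is part of an earlier answer to which this is a reply.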

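Given this:

Write a method that takes an int array of size m …

I suppose it is fair to conclude that there is an upper limit for m, equal to the value of the largest int (2^32 being typical). In other words, even though m is not specified as an int, the fact that the array can't have duplicates implies that there can't be more values than you can form out of 32 bits, which in turn implies that m is limited to an int as well.

If such a conclusion is acceptable, then I propose to use a fixed space of (2^33 + 2) * 4 bytes = 34,359,738,376 bytes = 34.4GB to handle all possible cases. (Not counting the space required by the input array and its loop.)

Of course, for optimization, I would first take m into account and allocate only the amount actually needed, (2m+2) * 4 bytes.

If this is acceptable for the O(1) space constraint, for the stated problem, then let me proceed to an algorithmic proposal… :)

Assumptions: an array of m ints, positive or negative, none greater than what 4 bytes can hold. Duplicates are handled. The first value can be any valid int. Restrict m as above.

First, create an int array of length 2m-1, ary, and provide three int variables: left, diff, and right. Notice that makes 2m+2 …

Second, take the first value from the input array and copy it to position m-1 in the new array. Initialize the three variables.

  • set ary[m-1] = nthVal // n = 0
  • set left = diff = right = 0

Third, loop through the remaining values in the input array and do the following for each iteration:

  • set diff = nthVal - ary[m-1]
  • if (diff > m-1 + right || diff < 1-m + left) return false // out of bounds
  • if (ary[m-1 + diff] != null) return false // duplicate
  • set ary[m-1 + diff] = nthVal
  • if (diff > left) left = diff // constrains the bound further to the right
  • if (diff < right) right = diff // constrains the bound further to the left

I decided to put this into code, and it worked.

Here is a working sample using C#: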

    using System;
    using System.Linq;

    public class Program
    {
        static bool puzzle(int[] inAry)
        {
            var m = inAry.Count();
            var outAry = new int?[2 * m - 1];
            int diff = 0;
            int left = 0;
            int right = 0;
            outAry[m - 1] = inAry[0];
            for (var i = 1; i < m; i += 1)
            {
                diff = inAry[i] - inAry[0];
                if (diff > m - 1 + right || diff < 1 - m + left) return false;
                if (outAry[m - 1 + diff] != null) return false;
                outAry[m - 1 + diff] = inAry[i];
                if (diff > left) left = diff;
                if (diff < right) right = diff;
            }
            return true;
        }

        static void Main(string[] args)
        {
            var inAry = new int[3]{ 2, 3, 4 };
            Console.WriteLine(puzzle(inAry));
            inAry = new int[13] { -3, 5, -1, -2, 9, 8, 2, 3, 0, 6, 4, 7, 1 };
            Console.WriteLine(puzzle(inAry));
            inAry = new int[3] { 21, 31, 41 };
            Console.WriteLine(puzzle(inAry));
            Console.ReadLine();
        }
    }

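Note: this comment is based on the original text of the question (it has been corrected since).

If the question is posed exactly as written above (and it is not just a typo), and for an array of size n the function should return (True/False) if the array consists of the numbers 1 … n+1,

… then the answer will always be false, because an array with all the numbers 1 … n+1 would be of size n+1 and not n. Hence the question can be answered in O(1). :)

Counter-example for the XOR algorithm.

(I can't post it as a comment.)

@popopome

For a = {0, 2, 7, 5,} it returns true (meaning that a is a permutation of the range [0, 4)), but it must return false in this case (a is obviously not a permutation of [0, 4)).

Another counter-example: {0, 0, 1, 3, 5, 6, 6}. All values are in the range, but there are duplicates.

I could have implemented popopome's idea (or the tests) incorrectly, so here is the code: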

    bool isperm_popopome(int m; int a[m], int m, int n)
    {
      /** O(m) in time (single pass), O(1) in space,
          no restrictions on n,
          no overflow,
          a[] may be readonly
      */
      int even_xor = 0;
      int odd_xor  = 0;

      for (int i = 0; i < m; ++i)
        {
          if (a[i] % 2 == 0) // is even
            even_xor ^= a[i];
          else
            odd_xor ^= a[i];

          const int b = i + n;
          if (b % 2 == 0)    // is even
            even_xor ^= b;
          else
            odd_xor ^= b;
        }

      return (even_xor == 0) && (odd_xor == 0);
    }

A C version of b3's pseudocode

(to avoid misinterpretation of the pseudocode)

Counter-example: {1, 1, 2, 4, 6, 7, 7}.

    #include <iso646.h>   /* and */
    #include <limits.h>   /* INT_MAX, INT_MIN */
    #include <stdbool.h>

    int pow_minus_one(int power)
    {
      return (power % 2 == 0) ? 1 : -1;
    }

    int ceil_half(int n)
    {
      return n / 2 + (n % 2);
    }

    bool isperm_b3_3(int m; int a[m], int m, int n)
    {
      /**
         O(m) in time (single pass), O(1) in space,
         doesn't use n
         possible overflow in sum
         a[] may be readonly
       */
      int altsum = 0;
      int mina = INT_MAX;
      int maxa = INT_MIN;

      for (int i = 0; i < m; ++i)
        {
          const int v = a[i] - n + 1; // [n, n+m-1] -> [1, m] to deal with n=0
          if (mina > v)
            mina = v;
          if (maxa < v)
            maxa = v;

          altsum += pow_minus_one(v) * v;
        }
      return ((maxa-mina == m-1)
              and ((pow_minus_one(mina + m-1) * ceil_half(mina + m-1)
                    - pow_minus_one(mina-1) * ceil_half(mina-1)) == altsum));
    }

In Python:

    def ispermutation(iterable, m, n):
        """Whether iterable and the range [n, n+m) have the same elements.

           pre-condition: there are no duplicates in the iterable
        """
        for i, elem in enumerate(iterable):
            if not n <= elem < n+m:
                return False

        return i == m-1

    print(ispermutation([1, 42], 2, 1)    == False)
    print(ispermutation(range(10), 10, 0) == True)
    print(ispermutation((2, 1, 3), 3, 1)  == True)
    print(ispermutation((2, 1, 3), 3, 0)  == False)
    print(ispermutation((2, 1, 3), 4, 1)  == False)
    print(ispermutation((2, 1, 3), 2, 1)  == False)

It is O(m) in time and O(1) in space. It does not take duplicates into account.

Alternate solution:

    def ispermutation(iterable, m, n):
        """Same as above.

        pre-condition: assert(len(list(iterable)) == m)
        """
        return all(n <= elem < n+m for elem in iterable)

My current best option

    def uniqueSet( array )
      check_index = 0;
      check_value = 0;
      min = array[0];
      array.each_with_index{ |value,index|
        check_index = check_index ^ ( 1 << index );
        check_value = check_value ^ ( 1 << value );
        min = value if value < min
      }
      check_index = check_index << min;
      return check_index == check_value;
    end

O(n) and Space O(1)

I wrote a script to brute-force combinations that could make it fail, and it didn't find any. If you have an array which contravenes this function, do tell. :)


@J.F. Sebastian

It's not a true hashing algorithm. Technically, it's a highly efficient packed boolean array of "seen" values.

    ci = 0, cv = 0
    [5,4,3]{
      i = 0
      v = 5
      1 << 0 == 000001
      1 << 5 == 100000
      0 ^ 000001  = 000001
      0 ^ 100000  = 100000

      i = 1
      v = 4
      1 << 1 == 000010
      1 << 4 == 010000
      000001 ^ 000010  = 000011
      100000 ^ 010000  = 110000

      i = 2
      v = 3
      1 << 2 == 000100
      1 << 3 == 001000
      000011 ^ 000100  = 000111
      110000 ^ 001000  = 111000
    }
    min = 3
    000111 << 3 == 111000
    111000 === 111000

This is intended mostly to "fake" the most common problem case, one that uses duplicates. In this system, XOR penalises you for using the same value twice and assumes you instead did it 0 times.

The caveats here being, of course:

  1. Both the input array length and the maximum array value are limited by the maximum value of $x for which ( 1 << $x > 0 ).
  2. Ultimate effectiveness depends on how your underlying system implements the abilities to:

    1. shift 1 bit n places left.
    2. xor 2 registers. ( where 'registers' may, depending on implementation, span several registers )

    Edit: noted, the above statements seem confusing. Assume a perfect machine, where an "integer" is a register with infinite precision that can still perform a ^ b in O(1) time.

But failing these assumptions, one has to start asking the algorithmic complexity of simple math.

  • How complex is 1 == 1? Surely that should be O(1) every time, right?
  • What about 2^32 == 2^32? O(1)?
  • 2^33 == 2^33? Now you've got a question of register size and the underlying implementation.
  • Fortunately XOR and == can be done in parallel, so if one assumes infinite precision and a machine designed to cope with infinite precision, it is safe to assume XOR and == take constant time regardless of their value (because it's infinite width, it will have infinite 0 padding). Obviously this doesn't exist. But also, changing 000000 to 000100 is not increasing memory usage.
  • Yet on some machines, ( 1 << 32 ) << 1 will consume more memory, but how much is uncertain.

A C version of Kent Fredric's Ruby solution

(to facilitate testing)

Counter-example (for C version): {8, 33, 27, 30, 9, 2, 35, 7, 26, 32, 2, 23, 0, 13, 1, 6, 31, 3, 28, 4, 5, 18, 12, 2, 9, 14, 17, 21, 19, 22, 15, 20, 24, 11, 10, 16, 25}. Here n=0, m=35. This sequence misses 34 and has two 2 .

It is an O(m) in time and O(1) in space solution.

Out-of-range values are easily detected in O(n) in time and O(1) in space, therefore tests are concentrated on in-range (means all values are in the valid range [n, n+m) ) sequences. Otherwise {1, 34} is a counter example (for C version, sizeof(int)==4, standard binary representation of numbers).

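The main difference between the C and Ruby versions: the << operator effectively rotates values in C because sizeof(int) is finite (shifting by at least the width of int is undefined behavior in C; typical x86 hardware uses the shift count modulo 32), but in Ruby numbers grow to accommodate the result, e.g.,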

Ruby: 1 << 100 # -> 1267650600228229401496703205376

C: int n = 100; 1 << n // -> 16

In Ruby: check_index ^= 1 << i; is equivalent to check_index.setbit(i) . The same effect could be implemented in C++: vector<bool> v(m); v[i] = true;

    bool isperm_fredric(int m; int a[m], int m, int n)
    {
      /**
         O(m) in time (single pass), O(1) in space,
         no restriction on n,
         ?overflow?
         a[] may be readonly
       */
      int check_index = 0;
      int check_value = 0;

      int min = a[0];
      for (int i = 0; i < m; ++i) {

        check_index ^= 1 << i;
        check_value ^= 1 << (a[i] - n); //

        if (a[i] < min)
          min = a[i];
      }
      check_index <<= min - n; // min and n may differ e.g.,
                               //  {1, 1}: min=1, but n may be 0.
      return check_index == check_value;
    }

Values of the above function were tested against the following code:

    bool *seen_isperm_trusted  = NULL;

    bool isperm_trusted(int m; int a[m], int m, int n)
    {
      /** O(m) in time, O(m) in space */

      for (int i = 0; i < m; ++i) // could be memset(s_i_t, 0, m*sizeof(*s_i_t));
        seen_isperm_trusted[i] = false;

      for (int i = 0; i < m; ++i) {

        if (a[i] < n or a[i] >= n + m)
          return false; // out of range

        if (seen_isperm_trusted[a[i]-n])
          return false; // duplicates
        else
          seen_isperm_trusted[a[i]-n] = true;
      }

      return true; // a[] is a permutation of the range: [n, n+m)
    }

Input arrays are generated with:

    void backtrack(int m; int a[m], int m, int nitems)
    {
      /** generate all permutations with repetition for the range [0, m) */
      if (nitems == m) {
        (void)test_array(a, nitems, 0); // {0, 0}, {0, 1}, {1, 0}, {1, 1}
      }
      else for (int i = 0; i < m; ++i) {
          a[nitems] = i;
          backtrack(a, m, nitems + 1);
        }
    }

The answer from "nickf" does not work if the array is unsorted: var_dump(testArray(array(5, 3, 1, 2, 4), 1, 5)); // gives "duplicates" !!!!

Also your formula to compute sum([n…n+m-1]) looks incorrect…. the correct formula is (m(m+1)/2 – n(n-1)/2)

An array contains N numbers, and you want to determine whether two of the numbers sum to a given number K. For instance, if the input is 8, 4, 1, 6 and K is 10, the answer is yes (4 and 6). A number may be used twice. Do the following. a. Give an O(N^2) algorithm to solve this problem. b. Give an O(N log N) algorithm to solve this problem. (Hint: Sort the items first. After doing so, you can solve the problem in linear time.) c. Code both solutions and compare the running times of your algorithms.

Product of m consecutive numbers is divisible by m! [ m factorial ]


so in one pass you can compute the product of the m numbers, also compute m! and see if the product modulo m ! is zero at the end of the pass

I might be missing something but this is what comes to my mind …

Something like this in Python:

    my_list1 = [9,5,8,7,6]
    my_list2 = [3,5,4,7]

    def consecutive(my_list):
        count = 0
        prod = fact = 1
        for num in my_list:
            prod *= num
            count += 1
            fact *= count
        if not prod % fact:
            return 1
        else:
            return 0

    print(consecutive(my_list1))
    print(consecutive(my_list2))

Running it:

    HotPotato ~$ python m_consecutive.py
    1
    0

I propose the following:

Choose a finite set of prime numbers P_1,P_2,…,P_K, and compute the occurrences of the elements in the input sequence (minus the minimum) modulo each P_i. The pattern of a valid sequence is known.

For example for a sequence of 17 elements, modulo 2 we must have the profile: [9 8], modulo 3: [6 6 5], modulo 5: [4 4 3 3 3], etc.

Combining the test using several bases we obtain a more and more precise probabilistic test. Since the entries are bounded by the integer size, there exists a finite base providing an exact test. This is similar to probabilistic pseudo primality tests.

    S_i is an int array of size P_i, initially filled with 0, i=1..K
    M is the length of the input sequence
    Mn = INT_MAX
    Mx = INT_MIN

    for x in the input sequence:
      for i in 1..K: S_i[x % P_i]++  // count occurrences mod Pi
      Mn = min(Mn,x)  // update min
      Mx = max(Mx,x)  // and max

    if Mx-Mn != M-1: return False  // Check bounds

    for i in 1..K:  // Check profile mod P_i
      Q = M / P_i
      R = M % P_i
      Check S_i[(Mn+j) % P_i] is Q+1 for j=0..R-1 and Q for j=R..P_i-1
      if this test fails, return False

    return True
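A possible Python rendering of this residue-profile test is sketched below. The names `expected_profile` and `passes_residue_test` and the default prime set (2, 3, 5) are assumptions chosen for illustration, and residues are counted on the offset from the minimum, as the prose describes. With a fixed small set of primes the test is necessary but not sufficient.

```python
def expected_profile(m, p):
    """Counts of the residues 0..p-1 among the offsets 0..m-1."""
    q, r = divmod(m, p)
    return [q + 1 if j < r else q for j in range(p)]


def passes_residue_test(values, primes=(2, 3, 5)):
    """Necessary (not sufficient) test that values are min .. min+m-1."""
    m = len(values)
    lo, hi = min(values), max(values)
    if hi - lo != m - 1:
        return False                    # bounds check
    for p in primes:
        counts = [0] * p
        for x in values:
            counts[(x - lo) % p] += 1   # residue of the offset from the minimum
        if counts != expected_profile(m, p):
            return False                # profile mod p does not match
    return True


print(expected_profile(17, 2), expected_profile(17, 3), expected_profile(17, 5))
```

For 17 elements this reproduces the profiles quoted above: [9, 8] modulo 2, [6, 6, 5] modulo 3 and [4, 4, 3, 3, 3] modulo 5.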

Any contiguous array [ n, n+1, …, n+m-1 ] can be mapped onto a 'base' interval [ 0, 1, …, m-1 ] using the modulo operator. For each i in the array, there is exactly one i%m in the base interval and vice versa.

Any contiguous array also has a 'span' (maximum - minimum + 1) equal to its size m.

Using these facts, you can create an "encountered" boolean array of same size containing all falses initially, and while visiting the input array, put their related "encountered" elements to true.

This algorithm is O(n) in space, O(n) in time, and checks for duplicates.

    def contiguous( values )
      #initialization
      encountered = Array.new( values.size, false )
      min, max = nil, nil
      visited = 0

      values.each do |v|

        index = v % encountered.size

        if( encountered[ index ] )
          return "duplicates";
        end

        encountered[ index ] = true
        min = v if min == nil or v < min
        max = v if max == nil or v > max
        visited += 1
      end

      if ( max - min + 1 != values.size ) or visited != values.size
        return "hole"
      else
        return "contiguous"
      end

    end

    tests = [
      [ false, [ 2,4,5,6 ] ],
      [ false, [ 10,11,13,14 ] ] ,
      [ true , [ 20,21,22,23 ] ] ,
      [ true , [ 19,20,21,22,23 ] ] ,
      [ true , [ 20,21,22,23,24 ] ] ,
      [ false, [ 20,21,22,23,24+5 ] ] ,
      [ false, [ 2,2,3,4,5 ] ]
    ]

    tests.each do |t|
      result = contiguous( t[1] )
      if( t[0] != ( result == "contiguous" ) )
        puts "Failed Test : " + t[1].to_s + " returned " + result
      end
    end

I like Greg Hewgill's idea of Radix sorting. To find duplicates, you can sort in O(N) time given the constraints on the values in this array.

For an in-place O(1) space O(N) time that restores the original ordering of the list, you don't have to do an actual swap on that number; you can just mark it with a flag:

    //Java: assumes all numbers in arr > 0
    boolean checkArrayConsecutiveRange(int[] arr) {

        // find min/max
        int min = arr[0]; int max = arr[0];
        for (int i=1; i<arr.length; i++) {
            min = (arr[i] < min ? arr[i] : min);
            max = (arr[i] > max ? arr[i] : max);
        }
        // a valid array of arr.length values spans exactly arr.length - 1
        if (max-min != arr.length - 1) return false;

        // flag and check
        boolean ret = true;
        for (int i=0; i<arr.length; i++) {
            int targetI = Math.abs(arr[i])-min;
            if (arr[targetI] < 0) {
                ret = false;
                break;
            }
            arr[targetI] = -arr[targetI];
        }
        // restore the original values
        for (int i=0; i<arr.length; i++) {
            arr[i] = Math.abs(arr[i]);
        }

        return ret;
    }

Storing the flags inside the given array is kind of cheating, and doesn't play well with parallelization. I'm still trying to think of a way to do it without touching the array in O(N) time and O(log N) space. Checking against the sum and against the sum of least squares (arr[i] – arr.length/2.0)^2 feels like it might work. The one defining characteristic we know about a 0…m array with no duplicates is that it's uniformly distributed; we should just check that.

Now if only I could prove it.

I'd like to note that the solution above involving the factorial takes O(N) space to store the factorial itself. N! > 2^N, which takes more than N bits to store.

Oops! I got caught up in a duplicate question and did not see the already identical solutions here. And I thought I'd finally done something original! Here is a historical archive of when I was slightly more pleased:


Well, I have no certainty if this algorithm satisfies all conditions. In fact, I haven't even validated that it works beyond a couple test cases I have tried. Even if my algorithm does have problems, hopefully my approach sparks some solutions.

This algorithm, to my knowledge, works in constant memory and scans the array three times. Perhaps an added bonus is that it works for the full range of integers, if that wasn't part of the original problem.

I am not much of a pseudo-code person, and I really think the code might simply make more sense than words. Here is an implementation I wrote in PHP. Take heed of the comments.

    function is_permutation($ints) {

        /* Gather some meta-data. These scans can
           be done simultaneously */
        $lowest = min($ints);
        $length = count($ints);

        $max_index = $length - 1;

        $sort_run_count = 0;

        /* I do not have any proof that running this sort twice
           will always completely sort the array (of course only
           intentionally happening if the array is a permutation) */

        while ($sort_run_count < 2) {

            for ($i = 0; $i < $length; ++$i) {

                $dest_index = $ints[$i] - $lowest;

                if ($i == $dest_index) {
                    continue;
                }

                if ($dest_index > $max_index) {
                    return false;
                }

                if ($ints[$i] == $ints[$dest_index]) {
                    return false;
                }

                $temp = $ints[$dest_index];
                $ints[$dest_index] = $ints[$i];
                $ints[$i] = $temp;

            }

            ++$sort_run_count;

        }

        return true;

    }

So there is an algorithm that takes O(n^2) that does not require modifying the input array and takes constant space.

First, assume that you know n and m . This is a linear operation, so it does not add any additional complexity. Next, assume there exists one element equal to n and one element equal to n+m-1 and all the rest are in [n, n+m) . Given that, we can reduce the problem to having an array with elements in [0, m) .

Now, since we know that the elements are bounded by the size of the array, we can treat each element as a node with a single link to another element; in other words, the array describes a directed graph. In this directed graph, if there are no duplicate elements, every node belongs to a cycle, that is, a node is reachable from itself in m or less steps. If there is a duplicate element, then there exists one node that is not reachable from itself at all.

So, to detect this, you walk the entire array from start to finish and determine if each element returns to itself in <=m steps. If any element is not reachable in <=m steps, then you have a duplicate and can return false. Otherwise, when you finish visiting all elements, you can return true:

    for (int start_index= 0; start_index<m; ++start_index)
    {
        int steps= 1;
        int current_element_index= arr[start_index];
        while (steps<m+1 && current_element_index!=start_index)
        {
            current_element_index= arr[current_element_index];
            ++steps;
        }

        if (steps>m)
        {
            return false;
        }
    }

    return true;

You can optimize this by storing additional information:

  1. Record sum of the length of the cycle from each element, unless the cycle visits an element before that element, call it sum_of_steps .
  2. For every element, only step m-sum_of_steps nodes out. If you don't return to the starting element and you don't visit an element before the starting element, you have found a loop containing duplicate elements and can return false .

This is still O(n^2), eg {1, 2, 3, 0, 5, 6, 7, 4} , but it's a little bit faster.

ciphwn has it right. It is all to do with statistics. What the question is asking, in statistical terms, is whether or not the sequence of numbers forms a discrete uniform distribution. A discrete uniform distribution is one where all values of a finite set of possible values are equally probable. Fortunately there are some useful formulas to determine whether a discrete set is uniform. Firstly, the mean of the set (a..b) is (a+b)/2 and the variance is (n^2 - 1)/12, where n = b - a + 1 is the number of values. Next, determine the variance of the given set:

 variance = sum [i=1..n] (f(i)-mean).(f(i)-mean)/n 

and then compare with the expected variance. This will require two passes over the data, once to determine the mean and again to calculate the variance.
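A sketch of that comparison in Python, using exact fractions to avoid rounding. The name `looks_uniform` is hypothetical, and a min/max span check is added here as an assumption so that a and b are well defined. Treat the result as a necessary condition only: the length-8 sequence quoted earlier in the thread, [1, 3, 3, 4, 4, 5, 8, 8], has the expected mean and variance yet contains duplicates.

```python
from fractions import Fraction


def looks_uniform(values):
    """Compare mean and variance with those of the range min..max."""
    n = len(values)
    a, b = min(values), max(values)
    if b - a != n - 1:
        return False                            # span does not match the length
    mean = Fraction(sum(values), n)             # first pass: the mean
    variance = sum((x - mean) ** 2 for x in values) / n   # second pass
    return mean == Fraction(a + b, 2) and variance == Fraction(n * n - 1, 12)
```

Matching the first two moments therefore narrows the candidates but cannot replace an explicit duplicate check.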

References:

  • uniform discrete distribution
  • variance

Here is a solution in O(N) time and O(1) extra space for finding duplicates :-

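    public static boolean check_range(int arr[], int n, int m) {
        // shift every value into 0..m-1; anything outside is out of range
        for (int i = 0; i < m; i++) {
            arr[i] = arr[i] - n;
            if (arr[i] < 0 || arr[i] >= m)
                return(false);
        }
        System.out.println("In range");

        int j = 0;
        while (j < m) {
            System.out.println(j);
            if (arr[j] < m) {
                if (arr[arr[j]] < m) {
                    // target slot is free: move the value there and mark it with +m
                    int t = arr[arr[j]];
                    arr[arr[j]] = arr[j] + m;
                    arr[j] = t;
                    if (j == arr[j]) {
                        arr[j] = arr[j] + m;
                        j++;
                    }
                } else
                    return(false);   // target slot already occupied: duplicate
            } else
                j++;
        }
        return(true);
    }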

Explanation:-

  1. Bring each number into the range (0, m-1) with arr[i] = arr[i] - n; if it is out of range, return false.
  2. For each j, check whether arr[arr[j]] is unoccupied, that is, whether it holds a value less than m.
  3. If so, swap(arr[j], arr[arr[j]]) and add m to the moved value to signal that its slot is occupied.
  4. If arr[j] == j, simply add m and increment j.
  5. If arr[arr[j]] >= m, the slot is occupied, so the current value is a duplicate: return false.
  6. If arr[j] >= m, skip it.