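How do I find all duplicates in a List<string>?

I have a List<string> in which some words are duplicated. I need to find all of the duplicated words.

Is there a trick to getting them all?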

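In .NET Framework 3.5 and later you can use Enumerable.GroupBy, which returns an enumerable of groups, each group being an enumerable of the items that share a key. Filter out any group with a Count of <= 1, then select the keys of the remaining groups to get back to a single enumerable: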

    var duplicateKeys = list.GroupBy(x => x)
                            .Where(group => group.Count() > 1)
                            .Select(group => group.Key);

If you are using LINQ, you can use the following query:

    var duplicateItems = from x in list
                         group x by x into grouped
                         where grouped.Count() > 1
                         select grouped.Key;

Or, if you prefer it without the syntactic sugar:

    var duplicateItems = list.GroupBy(x => x)
                             .Where(x => x.Count() > 1)
                             .Select(x => x.Key);

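This groups all elements that are equal, then filters down to only those groups that contain more than one element. Finally it selects just the key from each of those groups, since you don't need the count.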

If you would rather not use LINQ, you can use this extension method:

    public void SomeMethod()
    {
        var duplicateItems = list.GetDuplicates();
        …
    }

    // Extension methods must be declared in a static class.
    public static IEnumerable<T> GetDuplicates<T>(this IEnumerable<T> source)
    {
        HashSet<T> itemsSeen = new HashSet<T>();
        HashSet<T> itemsYielded = new HashSet<T>();

        foreach (T item in source)
        {
            // HashSet<T>.Add returns false if the item is already present.
            if (!itemsSeen.Add(item))
            {
                if (itemsYielded.Add(item))
                {
                    yield return item;
                }
            }
        }
    }

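This keeps track of the items it has seen and the items it has yielded. If an item has not been seen before, it is added to the set of seen items and nothing is yielded. If it has been seen before and has not yet been yielded, it is yielded; otherwise it is ignored.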
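
One consequence of this design is the order of the results: each duplicate is yielded the moment its second occurrence is read, so the output follows second occurrences rather than first ones. The sketch below repeats the extension method so that the block compiles on its own; the class name `DuplicateExtensions` and the sample data in the note that follows are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container class; extension methods need a static class.
public static class DuplicateExtensions
{
    // Same two-set, single-pass technique as the extension method above.
    public static IEnumerable<T> GetDuplicates<T>(this IEnumerable<T> source)
    {
        var itemsSeen = new HashSet<T>();
        var itemsYielded = new HashSet<T>();

        foreach (T item in source)
        {
            // Add returns false when the item is already in the set,
            // which means this is at least its second occurrence.
            if (!itemsSeen.Add(item) && itemsYielded.Add(item))
            {
                yield return item;
            }
        }
    }
}
```

For example, calling `GetDuplicates()` on a list containing "a", "b", "b", "a", "a" yields "b" first and then "a", because "b" is the first item to repeat. The GroupBy versions, by contrast, return keys in order of first occurrence.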

Without LINQ:

    string[] ss = { "1", "1", "1" };
    var myList = new List<string>();
    var duplicates = new List<string>();

    foreach (var s in ss)
    {
        if (!myList.Contains(s))
            myList.Add(s);                  // first occurrence
        else if (!duplicates.Contains(s))
            duplicates.Add(s);              // record each duplicate only once
    }

    // show list without duplicates
    foreach (var s in myList)
        Console.WriteLine(s);

    // show duplicates list
    foreach (var s in duplicates)
        Console.WriteLine(s);

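With LINQ, of course. The code below gives you a dictionary whose keys are the item strings and whose values are the number of times each item occurs in your source list.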

    var item2ItemCount = list.GroupBy(item => item)
                             .ToDictionary(x => x.Key, x => x.Count());
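
Since that dictionary holds a count for every item, including those that occur only once, finding the duplicates takes one more filtering step. A minimal sketch, assuming a hypothetical helper named `CountDuplicates` in an illustrative class `DuplicateCounter`:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateCounter
{
    // Hypothetical helper: maps each repeated item to the number of times it occurs.
    public static Dictionary<string, int> CountDuplicates(IEnumerable<string> list)
    {
        // Count every distinct item first, as in the one-liner above.
        var item2ItemCount = list
            .GroupBy(item => item)
            .ToDictionary(x => x.Key, x => x.Count());

        // Then keep only the entries that occur more than once.
        return item2ItemCount
            .Where(pair => pair.Value > 1)
            .ToDictionary(pair => pair.Key, pair => pair.Value);
    }
}
```

With the input "cat", "dog", "cat", "goat", "cat", the returned dictionary has a single entry mapping "cat" to 3.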

If you are looking for a more generic method:

    public static List<U> FindDuplicates<T, U>(this List<T> list, Func<T, U> keySelector)
    {
        return list.GroupBy(keySelector)
                   .Where(group => group.Count() > 1)
                   .Select(group => group.Key)
                   .ToList();
    }

EDIT: Here is an example:

    public class Person
    {
        public string Name { get; set; }
        public int Age { get; set; }
    }

    List<Person> list = new List<Person>()
    {
        new Person() { Name = "John", Age = 22 },
        new Person() { Name = "John", Age = 30 },
        new Person() { Name = "Jack", Age = 30 }
    };

    var duplicateNames = list.FindDuplicates(p => p.Name);
    var duplicateAges = list.FindDuplicates(p => p.Age);

    foreach (var dupName in duplicateNames)
    {
        Console.WriteLine(dupName); // Will print out John
    }

    foreach (var dupAge in duplicateAges)
    {
        Console.WriteLine(dupAge); // Will print out 30
    }

I am assuming that each string in your list contains several words; let me know if that is incorrect.

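    // Requires: using System.IO; using System.Linq;
    List<string> list = File.ReadAllLines("foobar.txt").ToList();

    // Split every line into words on whitespace and punctuation.
    var words = from line in list
                from word in line.Split(new[] { ' ', ';', ',', '.', ':', '(', ')' }, StringSplitOptions.RemoveEmptyEntries)
                select word;

    // Group case-insensitively and keep the words that occur more than once.
    var duplicateWords = from w in words
                         group w by w.ToLower() into g
                         where g.Count() > 1
                         select new { Word = g.Key, Count = g.Count() };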
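
The query above lowercases each word so that "Dog" and "dog" land in the same group. Another option, sketched here under the assumption that ordinal, culture-independent matching is acceptable, is to pass `StringComparer.OrdinalIgnoreCase` to `GroupBy` instead of transforming the key. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaseInsensitiveDuplicates
{
    // Hypothetical helper: finds words that repeat when letter case is ignored.
    public static List<string> Find(IEnumerable<string> words)
    {
        return words
            // The comparer decides which words count as the same key.
            .GroupBy(w => w, StringComparer.OrdinalIgnoreCase)
            .Where(g => g.Count() > 1)
            // The key is not lowercased; in practice it is the spelling
            // of the first occurrence in the group.
            .Select(g => g.Key)
            .ToList();
    }
}
```

For the words "cat", "Dog", "parrot", "dog" this returns a single result for the dog/Dog pair, with the original casing kept rather than a lowercased copy.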

For what it's worth, here is my way:

    List<string> list = new List<string>(new string[] { "cat", "Dog", "parrot", "dog", "parrot", "goat", "parrot", "horse", "goat" });
    Dictionary<string, int> wordCount = new Dictionary<string, int>();

    // count them all:
    list.ForEach(word =>
    {
        string key = word.ToLower();
        if (!wordCount.ContainsKey(key))
            wordCount.Add(key, 0);
        wordCount[key]++;
    });

    // remove words appearing only once:
    wordCount.Keys.ToList().FindAll(word => wordCount[word] == 1).ForEach(key => wordCount.Remove(key));

    Console.WriteLine(string.Format("Found {0} duplicates in the list:", wordCount.Count));
    wordCount.Keys.ToList().ForEach(key => Console.WriteLine(string.Format("{0} appears {1} times", key, wordCount[key])));

    // Find the character(s) that occur most often in txtInput
    // and show each one followed by its count in lblrepeated.
    lblrepeated.Text = "";
    string value = txtInput.Text;
    char[] arr = value.ToCharArray();
    char[] crr = new char[1];   // most frequent characters found so far
    int count1 = 0;             // highest occurrence count found so far

    for (int i = 0; i < arr.Length; i++)
    {
        // count how many times arr[i] occurs in the whole input
        int count = 0;
        char letter = arr[i];
        for (int j = 0; j < arr.Length; j++)
        {
            if (letter == arr[j])
                count++;
        }

        if (count1 < count)
        {
            // new maximum: start the result over with this character
            crr = new char[] { letter };
            count1 = count;
        }
        else if (count1 == count && Array.IndexOf(crr, letter) < 0)
        {
            // tie with the current maximum: append if not already recorded
            Array.Resize(ref crr, crr.Length + 1);
            crr[crr.Length - 1] = letter;
        }
    }

    for (int k = 0; k < crr.Length; k++)
        lblrepeated.Text = lblrepeated.Text + crr[k] + count1.ToString();