Wednesday, November 12, 2008

LINQ: Some string handling

Based on an example by Eric White, here's some simple string handling (running, as always, inside LINQPad).


// Per Eric White. The following query splits a string into words, keeps only
// words longer than one character, drops those that contain the letter "y",
// converts the rest to lower case, counts the occurrences of each word, and
// returns the ten most frequent words.

string text = "Debugging LINQ queries can be problematic. One of the reasons is that quite often, you write a large query " +
"as a single expression, and you can’t set a breakpoint mid-expression. Is LINQ really that problematic? " +
"Queries can be fun as is debugging!";

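var uniqueWords = text
    .Split(' ', '.', ',', '?', '!')                // split on spaces and punctuation
    .Where(w => w.Length > 1)                      // drops empty strings and one-letter words
    .Where(w => !w.Contains("y"))                  // case-sensitive: matches lowercase "y" only
    .Select(w => w.ToLower())
    .GroupBy(w => w)                               // one group per distinct word
    .OrderByDescending(g => g.Count())             // most frequent first; ties keep first-seen order
    .Select(g => new { Word = g.Key, Count = g.Count() })
    .Take(10);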

uniqueWords.Dump();


Returns:

Word         Count
is           3
debugging    2
linq         2
queries      2
can          2
be           2
problematic  2
that         2
as           2
one          1
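
Dump() is a LINQPad extension, so the query will not compile unchanged in an ordinary project. Below is a minimal sketch of the same pipeline as a console program. The WordCounter class and its TopWords method are hypothetical names introduced here for illustration, not part of Eric White's example, and the KeyValuePair return type is just one convenient replacement for the anonymous type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WordCounter
{
    // Hypothetical helper (not from the original post): returns the `take`
    // most frequent words as (word, count) pairs, using the same filters
    // as the LINQPad query.
    public static List<KeyValuePair<string, int>> TopWords(string text, int take)
    {
        return text
            // RemoveEmptyEntries replaces the explicit check for "" entries.
            .Split(new[] { ' ', '.', ',', '?', '!' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(w => w.Length > 1)
            .Where(w => !w.Contains("y"))
            .Select(w => w.ToLower())
            .GroupBy(w => w)
            .OrderByDescending(g => g.Count())
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .Take(take)
            .ToList();
    }
}

static class Program
{
    static void Main()
    {
        string text = "Debugging LINQ queries can be problematic. One of the reasons is that quite often, you write a large query " +
            "as a single expression, and you can’t set a breakpoint mid-expression. Is LINQ really that problematic? " +
            "Queries can be fun as is debugging!";

        // Print each word left-aligned in a 12-character column, then its count.
        foreach (var pair in WordCounter.TopWords(text, 10))
            Console.WriteLine("{0,-12} {1}", pair.Key, pair.Value);
    }
}
```

Calling ToList() forces the query to run once inside the method, so callers get a materialized result rather than a deferred query that is re-evaluated on each enumeration.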

Enjoy!
