An Extensive Examination of LINQ: The Standard Query Operators
By Scott Mitchell
Query operators are methods that work with a sequence of data and perform some task based on the data. They are created as extension methods on the IEnumerable<T> interface, which is the interface implemented by classes that hold enumerable data. For example, arrays and the classes in the System.Collections.Generic namespace all implement IEnumerable<T>. In The Ins and Outs of Query Operators we looked at how to create your own query operator that, once created, can be applied to any enumerable object.
While it is possible to create your own query operators, the good news is that the .NET Framework already ships with a bevy of useful query operators. These query operators are referred to as the standard query operators and are one of the primary pieces of LINQ. The standard query operators include functionality for aggregating sequences of data, concatenating two sequences, converting sequences from one type to another, and retrieving a particular element from the enumeration. There are also standard query operators for generating new sequences, grouping and joining sequences, ordering the elements in sequences, filtering the data in a sequence, and partitioning the sequence.
Altogether, there are more than 40 standard query operators. This article explores some of the more germane ones, giving examples of each operator in use and examining its underlying source code. There are also several demos included in the download available at the end of the article. Read on to learn more!
Standard Query Operator Overview and Classifications
The standard query operators are a set of query operators that ship with the .NET Framework. Specifically, the standard query operators are defined in the Enumerable class, which is found in the System.Linq namespace. The standard query operators are extension methods on the IEnumerable<T> interface.
Each standard query operator is classified as performing a particular type of operation. In previous installments we looked at the Count standard query operator and talked about the Sum standard query operator. These two operators are examples of aggregate operators, as they take a sequence of data - a list of integers, let's say - and aggregate the data, returning some scalar value (the total number of integers or the sum of said integers in the case of Count and Sum, respectively).
The standard query operators can be classified according to the following types of operations performed:
- Aggregation operators
- Concatenation operators
- Element operators
- Equality operators
- Generation operators
- Grouping operators
- Joining operators
- Ordering operators
- Partitioning operators
- Projection operators
- Quantifier operators
- Restriction operators
- Set operators
Summing, Averaging, Counting, and Finding Maximum and Minimum Elements
The .NET Framework includes a number of aggregate standard query operators. These operators examine a sequence of data and compute a scalar value. For instance, the Count operator, which we've seen in previous installments, returns the total number of elements in the sequence. Other aggregate operators include Average, Max, Min, and Sum. A simple example follows, which shows many of these operators applied to the Fibonacci class that we created in the preceding installment.
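The original listing is not reproduced here, so what follows is only a minimal sketch of the aggregate operators in use. It assumes a local iterator function named Fibonacci as a hypothetical stand-in for the Fibonacci class from the preceding installment; the operators themselves (Count, Sum, Average, Min, Max) are the standard ones from System.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Fibonacci class: the first 10 Fibonacci numbers.
IEnumerable<int> fib = Fibonacci(10);

var count = fib.Count();      // number of elements
var sum = fib.Sum();          // total of all elements
var average = fib.Average();  // arithmetic mean, returned as a double
var min = fib.Min();          // smallest element
var max = fib.Max();          // largest element

Console.WriteLine($"Count: {count}, Sum: {sum}, Average: {average}, Min: {min}, Max: {max}");

// Yields the first n Fibonacci numbers: 1, 1, 2, 3, 5, ...
static IEnumerable<int> Fibonacci(int n)
{
    int a = 1, b = 1;
    for (int i = 0; i < n; i++)
    {
        yield return a;
        (a, b) = (b, a + b);
    }
}
```

Each call enumerates the sequence from the start, which is fine for a small in-memory sequence like this one.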
Keep in mind that the aggregate methods used above, such as Max, are not part of the Fibonacci class. Rather, they are extension methods on the IEnumerable<T> interface, which the Fibonacci class implements.
Furthermore, notice how I used implicit variable typing when reading back the values from these operators (var count = fib.Count() and Dim count = fib.Count(), for example). I could have used explicit typing - int count = fib.Count() and Dim count As Integer = fib.Count - but it's good to get used to implicit typing, as this pattern is commonly used with more intricate LINQ queries.
The source code for the aggregation operators is pretty straightforward. For example, the Enumerable class defines two overloads of the Count operator. The first works on an object that implements IEnumerable<T>, and returns an integer value. Its abbreviated code follows. (Note: I've simplified the method declaration to make it more readable. I used Reflector to view the source code in the .NET Framework.)
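The listing itself is not reproduced here. As a rough sketch of the idea only (not the framework's actual source), a Count-style operator could be written as below. The name CountSketch is hypothetical, and the method is written as a plain generic function so that it runs as top-level statements; the framework's operator is an extension method, declared in a static class with the this modifier on its source parameter.

```csharp
using System;
using System.Collections.Generic;

// Simplified sketch of a Count-style operator (an assumption, not the framework source).
// The framework declares the real operator as an extension method:
//   public static int Count<TSource>(this IEnumerable<TSource> source)
static int CountSketch<TSource>(IEnumerable<TSource> source)
{
    if (source == null) throw new ArgumentNullException(nameof(source));

    int count = 0;
    foreach (TSource item in source)
    {
        count++;  // one tally per element enumerated
    }
    return count;
}

int total = CountSketch(new[] { 1, 1, 2, 3, 5 });
Console.WriteLine(total);  // prints 5
```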
In the code above, the IEnumerable<T> object named source that appears to be passed into the method is actually the object the extension method is being applied to. The Count method simply enumerates the elements in source, tallies how many iterations it performs, and returns this value. That's it!
The second Count overload accepts a function as input, which you can use to filter which elements get counted. For example, to instruct the Count operator to only count odd numbers you could do something like: var count = fib.Count(n => n % 2 == 1) or Dim count = fib.Count(Function(n) n Mod 2 = 1).
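A short runnable sketch of the filtering overload, assuming an in-memory array of the first 10 Fibonacci numbers in place of the Fibonacci class:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed sample data: the first 10 Fibonacci numbers.
IEnumerable<int> fib = new[] { 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 };

var allCount = fib.Count();                 // counts every element
var oddCount = fib.Count(n => n % 2 == 1);  // counts only elements matching the condition

Console.WriteLine($"{oddCount} of {allCount} are odd");  // prints "7 of 10 are odd"
```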
The aggregate operators are examples of greedy query operators. As we discussed in The Ins and Outs of Query Operators, LINQ operators are either lazy or greedy. A lazy query operator is one that is not evaluated until the elements of the sequence are enumerated. The sequence can be enumerated either by a foreach loop or by the application of a greedy query operator. Point being, when a greedy query operator is applied to a sequence, the value computed by the greedy operator is generated immediately. The source code snippet above shows how the Count method immediately enumerates its source. This is why it is considered a greedy operator.
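One way to observe the difference is to count how often a condition is actually evaluated. The sketch below is illustrative only; the evaluations counter and the sample array are assumptions added for the demonstration.

```csharp
using System;
using System.Linq;

int evaluations = 0;
int[] numbers = { 1, 1, 2, 3, 5, 8 };

// Where is lazy: defining the query does not run the condition.
var odds = numbers.Where(n => { evaluations++; return n % 2 == 1; });
int afterDefinition = evaluations;

// Count is greedy: it enumerates the query right away,
// so the condition runs once per source element.
int oddCount = odds.Count();
int afterCount = evaluations;

Console.WriteLine($"After definition: {afterDefinition}, after Count: {afterCount}, odd numbers: {oddCount}");
// prints "After definition: 0, after Count: 6, odd numbers: 4"
```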
The Count method can work with an enumerable object of any type. Other operators limit the types they can be applied to. For example, the Average operator can only be applied to numeric sequences. This restriction is imposed by having a variety of overloads defined in the Enumerable class for the Average operator. Rather than having a single method that applies to objects of IEnumerable<T>, there are overloads for specific numeric types:
- Average(this IEnumerable<int> source)
- Average(this IEnumerable<decimal> source)
- Average(this IEnumerable<double> source)
- And so on...
There is also a generic overload of the Average operator, but in order to use it you must supply a method that returns the numerical value for the element that will be used in the average calculation. This overload is useful if you have a collection of objects that contain a numeric value you want to average. For example, imagine that we have a list of Employee objects, where each Employee instance has a Salary property. The following pseudo code would compute the average salary:
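The average-salary calculation could be sketched roughly as follows. The employee data is made up, and a named tuple with Name and Salary members stands in for the Employee class; the list is hard-coded here in place of whatever process would normally load it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: a named tuple stands in for an Employee class with a Salary property.
var employees = new List<(string Name, decimal Salary)>
{
    ("Alice", 50000m),
    ("Bob", 60000m),
    ("Carol", 70000m),
};

// The selector tells Average which numeric value to use for each element.
var averageSalary = employees.Average(e => e.Salary);

Console.WriteLine(averageSalary);  // prints 60000
```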
The above code assumes that there's some process that returns a populated list of Employee objects. The average salary is then computed. Because the Employee object itself cannot be averaged (as it's not a numeric type) we need to pass a method into the Average operator that provides the value to average for each Employee object, in this case the value of each Salary property. The net result is that we compute the average salary of all employees in the list.
The .NET Framework includes a handful of operators for sequence conversion. The ToList and ToArray operators convert an enumerable object of type IEnumerable<T> into a List<T> or an array of type T, respectively. These two methods are most often used to force a lazy query operator to evaluate. In the previous installment we talked about how a lazy query operator is not evaluated until the source elements are enumerated. To force immediate execution of the query operators you can use ToList or ToArray.
Consider the example from the previous installment. In the code below we have a Fibonacci object, fib, that is initialized to having 10 elements. A query, oddFibs, is defined that works with the odd numbers. However, before the query is enumerated the Grow method is called, which doubles the number of elements in fib. Consequently, when oddFibs is enumerated in the foreach loop, the output contains the odd numbers of the first 20 Fibonacci numbers, and not the first 10.
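The behavior described above can be reproduced with a small sketch. It makes two assumptions: a List<int> stands in for the Fibonacci class, and a local Grow function stands in for that class's Grow method by appending Fibonacci numbers until the list has doubled in size.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the Fibonacci class: a list holding the first 10 Fibonacci numbers.
var fib = new List<int> { 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 };

// Lazy query: nothing is evaluated at this point.
IEnumerable<int> oddFibs = fib.Where(n => n % 2 == 1);

// The source doubles in size before the query is enumerated.
Grow(fib);

// Enumeration happens here, so the query sees all 20 numbers.
foreach (var n in oddFibs)
{
    Console.WriteLine(n);
}

// Hypothetical stand-in for the Grow method: doubles the number of elements.
static void Grow(List<int> list)
{
    int target = list.Count * 2;
    while (list.Count < target)
    {
        list.Add(list[list.Count - 1] + list[list.Count - 2]);
    }
}
```

The loop prints 14 values, the odd numbers among the first 20 Fibonacci numbers, even though the query was defined when the list held only 10.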
To force the oddFibs query to evaluate immediately (rather than waiting for it to be enumerated) you could use the ToList operator. With ToList applied, the foreach loop would output the odd numbers in the first 10 Fibonacci numbers, because the ToList call converted the query into a list of integers, namely a list of the odd integers in fib, which contains only 10 Fibonacci numbers at that time.
Keep in mind that oddFibs is a different type in each example. In the first example, oddFibs is of type IEnumerable<int>. In the second example, oddFibs is of type List<int>, because the ToList operator converts the IEnumerable<int> sequence returned by the Where operator into a List<int>.
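A sketch of the ToList variant, under the same assumptions as before (a List<int> standing in for the Fibonacci class and a hypothetical local Grow function that doubles its size):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the Fibonacci class: a list holding the first 10 Fibonacci numbers.
var fib = new List<int> { 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 };

// ToList forces the Where query to run now and copies the results into a List<int>.
List<int> oddFibs = fib.Where(n => n % 2 == 1).ToList();

// Growing the source afterwards no longer affects oddFibs.
Grow(fib);

foreach (var n in oddFibs)
{
    Console.WriteLine(n);  // prints 1, 1, 3, 5, 13, 21, 55 (one per line)
}

// Hypothetical stand-in for the Grow method: doubles the number of elements.
static void Grow(List<int> list)
{
    int target = list.Count * 2;
    while (list.Count < target)
    {
        list.Add(list[list.Count - 1] + list[list.Count - 2]);
    }
}
```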
The element standard query operators retrieve a particular element from a sequence. The simplest operators in this class are First and Last, which return the starting and ending elements in the sequence, respectively. The following code snippet uses these two operators to retrieve the smallest and largest values in the Fibonacci collection. (Note that First and Last do not necessarily return the smallest and largest valued elements in a sequence; they do so for the Fibonacci sequence because the Fibonacci numbers are monotonically increasing.)
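A minimal sketch of First and Last, assuming an array of the first 10 Fibonacci numbers in place of the Fibonacci collection:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed sample data: the first 10 Fibonacci numbers.
IEnumerable<int> fib = new[] { 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 };

var smallest = fib.First();  // first element; the smallest only because the sequence never decreases
var largest = fib.Last();    // last element; the largest for the same reason

Console.WriteLine($"Smallest: {smallest}, Largest: {largest}");  // prints "Smallest: 1, Largest: 55"
```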
Use the ElementAt operator to get the element at a particular location in the enumeration, where the enumeration is indexed starting at zero. The following snippet verifies that the sum of the third and fourth Fibonacci numbers equals the fifth.
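The check could look something like this, again assuming an array of Fibonacci numbers as sample data. Because indexing starts at zero, the third, fourth, and fifth numbers sit at indexes 2, 3, and 4.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed sample data: the first 10 Fibonacci numbers.
IEnumerable<int> fib = new[] { 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 };

var third = fib.ElementAt(2);   // 2
var fourth = fib.ElementAt(3);  // 3
var fifth = fib.ElementAt(4);   // 5

bool holds = third + fourth == fifth;
Console.WriteLine(holds);  // prints True
```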
The .NET Framework includes query operators for ordering enumerations. The OrderBy operator orders an enumeration in ascending order; OrderByDescending orders an enumeration in descending order. When ordering an enumeration you must provide a method as an input parameter to the operator that specifies the field by which the elements in the sequence are to be ordered. For example, if you have a list of Employee objects and you want to order them by salary in ascending order, you could use code like the following:
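As a sketch, with made-up data and a named tuple standing in for the Employee class:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: a named tuple stands in for an Employee class with a Salary property.
var employees = new List<(string Name, decimal Salary)>
{
    ("Alice", 70000m),
    ("Bob", 50000m),
    ("Carol", 60000m),
};

// The key selector names the value to sort on; OrderBy sorts ascending.
var bySalary = employees.OrderBy(e => e.Salary);

foreach (var e in bySalary)
{
    Console.WriteLine($"{e.Name}: {e.Salary}");  // Bob, then Carol, then Alice
}
```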
The method passed into the OrderBy operator indicates that each Employee object should be ordered by the Salary property.
If you are ordering a sequence of primitive types that do not have any properties (such as a list of integers or an array of strings) you still need to pass in a method indicating the value to order on, but the format would look like x => x or Function(x) x. For example, to order such a sequence in descending order you'd do:
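A brief sketch using an arbitrary array of integers as assumed sample data:

```csharp
using System;
using System.Linq;

int[] numbers = { 3, 1, 55, 8 };

// x => x orders each element by its own value.
var descending = numbers.OrderByDescending(x => x);

Console.WriteLine(string.Join(", ", descending));  // prints "55, 8, 3, 1"
```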
The ordering operators include an overload where you can pass in a comparer (an object implementing IComparer<T>) that, given two values of the field being ordered on, specifies how the two relate: whether they are equal and, if not, which one comes before the other. If provided, this comparer is used by the ordering operators. You must provide such a comparer if the field you are ordering by does not have a built-in comparer. (Types like integers, strings, and dates already have comparers defined in the .NET Framework.)
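To illustrate the comparer overload, the sketch below orders the same strings with two of the comparers that ship with the framework (StringComparer.OrdinalIgnoreCase and StringComparer.Ordinal). The sample names are made up; a custom comparer class would be passed in the same position.

```csharp
using System;
using System.Linq;

string[] names = { "bob", "Carol", "alice" };

// Case-insensitive comparison: letter case is ignored when ordering.
var ignoreCase = names.OrderBy(n => n, StringComparer.OrdinalIgnoreCase);

// Ordinal comparison: uppercase letters sort before lowercase ones.
var ordinal = names.OrderBy(n => n, StringComparer.Ordinal);

Console.WriteLine(string.Join(", ", ignoreCase));  // prints "alice, bob, Carol"
Console.WriteLine(string.Join(", ", ordinal));     // prints "Carol, alice, bob"
```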
Previous installments looked at the Where operator, which enables a developer to specify a condition and filter out all elements from a sequence that do not meet that condition. We'll look at the Where operator momentarily, but before we do let's first focus on the partitioning operators. The partitioning operators divide the sequence into two partitions, a "left partition" and a "right partition." The two simplest partitioning operators are Skip and Take, which skip over the first n elements or take the first n elements. The following code snippet shows how to use Skip to skip over the first three Fibonacci numbers.
The fibWithFirstThreeRemoved enumeration (currently) contains the 4th, 5th, 6th, 7th, 8th, 9th, and 10th Fibonacci numbers.
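A sketch of Skip, with Take shown alongside for comparison. It assumes an array of the first 10 Fibonacci numbers in place of the Fibonacci class; the firstThree variable is added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed sample data: the first 10 Fibonacci numbers.
IEnumerable<int> fib = new[] { 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 };

var fibWithFirstThreeRemoved = fib.Skip(3);  // the "right partition": everything after the first three
var firstThree = fib.Take(3);                // the "left partition": only the first three

Console.WriteLine(string.Join(", ", fibWithFirstThreeRemoved));  // prints "3, 5, 8, 13, 21, 34, 55"
Console.WriteLine(string.Join(", ", firstThree));                // prints "1, 1, 2"
```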
The SkipWhile and TakeWhile operators partition the sequence based on a condition: they skip or take elements for as long as the condition holds. We could replace the above Skip call with the SkipWhile operator like so:
Keep in mind that the n in the lambda expression is the current Fibonacci number being evaluated and does not have any bearing on the index of the element in the sequence.
The first four Fibonacci numbers are 1, 1, 2, and 3. The SkipWhile operator evaluates each element from the beginning and skips over it if the method evaluates to True. Therefore, it skips over the first three elements (1, 1, and 2) but not the fourth (3), because the first three are less than or equal to 2 but the fourth is not. Once an element fails the condition, that element and every element after it are returned.
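A sketch of the SkipWhile call described above. The condition n <= 2 is inferred from the explanation rather than taken from the original listing, and the array again stands in for the Fibonacci class; TakeWhile is included to show the complementary partition.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed sample data: the first 10 Fibonacci numbers.
IEnumerable<int> fib = new[] { 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 };

// Skips 1, 1, and 2; stops skipping at 3 and returns everything from there on.
var fibWithFirstThreeRemoved = fib.SkipWhile(n => n <= 2);

// Takes 1, 1, and 2; stops at 3 and returns nothing further.
var smallFibs = fib.TakeWhile(n => n <= 2);

Console.WriteLine(string.Join(", ", fibWithFirstThreeRemoved));  // prints "3, 5, 8, 13, 21, 34, 55"
Console.WriteLine(string.Join(", ", smallFibs));                 // prints "1, 1, 2"
```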
Restriction (Filtering) Operators
The standard query operators include a single restriction (or filtering) operator: Where. The Where operator accepts a method as its input that specifies the condition for inclusion. When enumerated, the operator applies the condition to each element in its source; if the condition holds, the element is included in the result set, otherwise it is filtered out.
The following snippet starts by getting the list of files in the current folder. It then uses the Where operator along with the Sum and Average operators to glean information about the amount of space taken up by the files and by certain types of files. (For more information on how to programmatically work with the file system from an ASP.NET page, be sure to consult the System.IO namespace FAQs.)
The above code starts by retrieving information about all of the files in the folder that the currently executing ASP.NET page resides in. It then uses the Count and Sum operators to get the number of files and the total file size. Note that the Sum method includes a selector method. The elements of the fInfo sequence are FileInfo objects. One of the properties of the FileInfo object is Length, which returns the size of the file in bytes. Therefore, we call the Sum operator and supply a method that returns the field to sum, namely Length.
Next, the Where operator is used to get only those files that have the extension ".aspx". The Count and Sum operators are applied to this query to get the count and total file size of the .aspx pages in the folder.
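The steps described above can be sketched as a console program. This is not the article's ASP.NET listing: to keep the example self-contained, it creates a scratch folder under the system temp directory and fills it with three made-up files, where the original reads the folder of the executing page. The fInfo name follows the text; the other variable names are assumptions.

```csharp
using System;
using System.IO;
using System.Linq;

// Assumption: a scratch folder with sample files stands in for the page's folder.
string folder = Path.Combine(Path.GetTempPath(), "linq-demo-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(folder);
File.WriteAllText(Path.Combine(folder, "Default.aspx"), new string('a', 100));
File.WriteAllText(Path.Combine(folder, "About.aspx"), new string('b', 50));
File.WriteAllText(Path.Combine(folder, "Web.config"), new string('c', 25));

// Information about every file in the folder.
FileInfo[] fInfo = new DirectoryInfo(folder).GetFiles();

var fileCount = fInfo.Count();
var totalSize = fInfo.Sum(f => f.Length);  // the selector picks the value to sum

// Restrict the sequence to .aspx files, then aggregate the filtered query.
var aspxFiles = fInfo.Where(f => f.Extension.Equals(".aspx", StringComparison.OrdinalIgnoreCase));
var aspxCount = aspxFiles.Count();
var aspxSize = aspxFiles.Sum(f => f.Length);
var aspxAverage = aspxFiles.Average(f => f.Length);

Console.WriteLine($"All files: {fileCount} files, {totalSize} bytes");
Console.WriteLine($".aspx files: {aspxCount} files, {aspxSize} bytes, {aspxAverage} bytes on average");

// Clean up the scratch folder.
Directory.Delete(folder, recursive: true);
```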
LINQ includes a host of standard query operators, which are built-in operators that perform some calculation or modification to a sequence. The standard query operators can be broken down into various types, such as aggregation, conversion, element, grouping, joining, projection, and restriction types, among others. This article looked at a variety of standard query operators and showed them in action. The download available at the end of this article includes a handful of demos.
The standard query operator examples in this article (and in the download) use the extension method syntax, such as: SequenceObject.Operator, or fib.Count(). An Introduction to LINQ noted that LINQ has a unique query syntax that allows you to use query operators in a SQL-like syntax. The next installment will explore LINQ's query syntax, which is what enables developers to write SQL-like queries in C# and Visual Basic syntax.
Enumerable Class (technical docs)