An Extensive Examination of LINQ: Using the Query SyntaxBy Scott Mitchell
|A Multipart Series on LINQ|
This article is one in a series of articles on LINQ, which was introduced with .NET version 3.5.
LINQ's standard query operators provide dozens of built-in techniques for programmatically enumerating, aggregating, selecting, and filtering data. These standard query operators are implemented as extension functions on the
IEnumerable<T>interface and therefore available to any object that implements this interface, which includes arrays, the collection classes in the
System.Collectionsnamespace, and objects that are composed of a set of enumerable items (such as a
DataTable). The examples we've explored in this tutorial series thus far have primarily used these standard query operators in their extension method form, namely as methods applied to
IEnumerable<T>collections. For example, the following snippet uses this style of syntax to query a collection of
Employeeobjects, returning a list of the employees'
Salaryfor those employees that make more than $100,000 per year, sorted by salary in descending order.
While the above syntax certainly works, it is not the most human-friendly syntax. For starters, it's quite verbose. It's also unlike the data querying syntax most developers are familiar with - SQL. To address these shortcomings, Microsoft added a number of language enhancements to C# 3.0 and Visual 9, which are the versions that shipped with the .NET Framework 3.5 and Visual Studio 2008. In addition to extension methods, implicitly typed variables, object initializers, lambda expressions, and anonymous types, Microsoft also added a new query syntax that allows developers to write LINQ queries in a SQL-like syntax. The following code is semantically equivalent to the example above, but uses the query syntax (or query expressions) instead of the extension method syntax:
This query syntax is another syntactic sugar language enhancement. When compiled, the above code is translated into calls to the corresponding standard query operators.
This article explores both C# and Visual Basic's query syntax. We'll discuss the mapping between the query syntax and LINQ's standard query operators, as well as see a number of query syntax examples. Read on to learn more!
Getting Started With The
The query syntax introduced to C# 3.0 and Visual Basic 9 allows for more readable code when querying an enumerable object using LINQ's standard query operators; it is designed to be more terse than the extension method syntax and to more closely model the familiar SQL syntax. Much like with SQL, every query expression includes a
fromclause, but unlike SQL, query expressions start with a
fromclause (whereas SQL queries start with a
fromclause dictates what enumeration the query is operating against and takes the form:
from variableName in enumerableCollection. You can optionally define the type of variableName, although this isn't necessary thanks to support for implicitly-typed variables. All query expressions must start with a
Another common clause in query expressions is the
select clause. The
select clause indicates the type of object returned by the query expression,
and can return either the same type of object that is being enumerated or can perform a projection and return a new, anonymous type based on the input type. (The query syntax
code sample in the Introduction used the
select clause to return a new, anonymous type, namely an object with two properties,
Let's look at a very simple query expression. The following expression works with the
Fibonacci class, which we've used extensively in previous installments in
this article series. The
Fibonacci class returns the first N Fibonacci numbers and implements
IEnumerable<T>. Here is the most minimal query expression, which is, admittedly, quite boring and doesn't do anything interesting:
Let's break down the two parts to this query syntax. In general, the
from clause is structured like
from variableName in enumerableCollection.
In English, this means, "Query the enumeration enumerableCollection, and for each element in the enumeration store it in a variable named variableName. Consequently,
from clause in the above example -
from fn in fib - can be expressed in prose as, "Let me enumerate the elements in
the current number being enumerated
I find it helpful to think of query expressions as an assembly line. At one end you have a basket of elements, which is the input. The basket is defined by
the enumerableCollection object in the
from clause. The assembly line operates by taking elements from the input basket one at a time and doing some
processing on the element to determine if it's part of the output or not. If the element is part of the output, it is put in the output basket at the end of the assembly
line. In a nutshell, the query expression defines an assembly line whose input is specified by the
from clause and whose output is (typically) an enumeration
whose elements are defined by the
select clause. When working with complex types on the assembly line (like
Employee objects) the
select clause can create and store an alternate type of object in the output basket, perhaps one that has only a subset of the properties. Or the
clause can put the same type of object that is churning through the assembly line in the output basket.
The query expression above is blindingly simple because it's an assembly line with just an input basket and an output basket. The assembly line enumerates each item in the input (the first 10 Fibonacci numbers) and then puts each number in the output basket. There's nothing interesting happening in between these two baskets.
What's important to keep in mind is that merely defining the query expression and assigning it to a variable does not actually spin up the assembly line operation.
Rather, such a statement spells out the rules of the assembly line. The assembly line isn't put into motion until the query is enumerated or until a greedy query operator is
encountered. (We talked about query operators and deferred execution and lazy and greedy operators in The Ins
and Outs of Query Operators.) In the above code sample,
queryResults doesn't "contain" anything. Rather, it's just a blueprint for how to work with the
fib enumeration. In order to work with this assembly line we need to use a
foreach statement, like so:
foreach loops spin up the assembly line defined by
queryResults, walking through each element in the
fib object one at a time.
Filtering With The
One of the most commonly used standard query operators is the
Whereoperator, which filters out elements based on a functional that returns a Boolean value. Previous installments looked at using the
Whereclause to filter the Fibonacci numbers so as only to return the even Fibonacci numbers, for example. The Boolean function used by the
Whereoperator to determine whether the element in the enumeration should be filtered out or not is commonly specified via a lambda expression, which takes the form
variables => bodyin C# and the form
Function(variables) bodyin VB. (Consult Lambda Expressions and Anonymous Types for more information on lambda expressions.) The query syntax includes a
whereclause that maps to the
Whereoperator, but is more terse as the
whereclause in the query syntax only requires the lambda expression
variables =>portion or
Function(variables)portion can be omitted.
The following snippet shows a query expression that uses the
where clause to return only odd Fibonacci numbers. Our assembly line now includes a component that
filters out even numbers.
Note that the
where clause references the variable from the
from clause and does not need the "head" of the lambda expression, but instead defines
just the body. This syntax is more readable than fully-qualified lambda expressions and more closely resembles the
WHERE clause syntax used in SQL.
where clause can also include a call to other functions. For instance, we could add an
IsPrime method to the ASP.NET page's code-behind class
that accepts an integer value as input and returns a Boolean value as output that indicates whether the input was prime or not. (A prime
number is a natural number that is only divisible by 1 and itself; 2, 3, 5, 7, 11, 13, 17, 19, 23, and 29 are the first ten prime numbers.) The following example
is in C# only for the sake of brevity. The download available at the end of the article includes a Visual Basic version, as well.
primeFibs query, when enumerated, outputs the following numbers:
2, 3, 5, 13, 89, 233, 1597
Ordering the Query Results
In addition to the
selectclauses, the query syntax supports an
orderbyclause. (In Visual Basic, the clause includes a space, as in
Order By). The
orderbyclause sorts the results by the specified value. For our Fibonacci example, which enumerates over a simple type (integers), using the orderby clause is as simple as specifying the variableName from the
fromclause. Doing so orders the contents in the output in ascending order. To sort the results in descending order add the
The following query expression, when enumerated by a
foreach loop, would output (indicating that the majority of the first 20 Fibonacci numbers are odd):
2584, 610, 144, 34, 8, 2
Working With Complex Types
The preceding demos illustrated using query expressions with an enumeration of a simple type (the
Fibonacciobject is composed of integers). It is possible to use query expressions to enumerate collections of complex types, such as a list of
Employeeobjects. The following example creates a collection of four Employee objects (using the object and collection initialization features added to C# 3.0) and then creates a query expression that returns the
Salaryof those employees that make more than $60,000, ordering the results from the highest paid employee to the lowest paid employee (who make more than $60,000). This is then bound to a GridView control (
gvBigBucks). When the GridView's
DataBindmethod is invoked, the GridView enumerates the query and displays the name and salary for Jisun and Alice, the two employees returned by the query. (The VB version of this query is available in the download accompanying this article.)
The takeaway here is that the complex types (the
Employee object) can appear in the query syntax. For example, in the
where clause we filter based
Salary property of the Employee object by using the syntax
variableName.Salary. Ditto for in the
orderby clause. What's
more, we can use projection in the
select clause to have the query return a new, anonymous type instead of the same input type. In the example above, the query
returns a new type that contains two properties named
Salary of type
Everything's A Standard Query Operator
It's important to keep in mind that the query syntax added to C# 3.0 and Visual Basic 9 is nothing more than syntactic sugar. The compiler translates the query syntax into calls to equivalent standard query operators. The
whereclause, for example, is mapped to the
Wherestandard query operator; the
selectclause is mapped to the
Selectstandard query operator; and so on. In short, any query you can write using the query syntax can be written using the extension method syntax. The benefit of the query syntax is that it is more readable and less verbose than the extension method syntax, but functionally the two syn taxes are the same.
Differences Between C# and Visual Basic Query Syntax
At a high-level, the query syntax used by C# and Visual Basic is the same. Both provide a more human-friendly way to write LINQ queries that get translated into equivalent standard query operator calls at compile-time. However, the syntax between the two differs a bit. For instance, the C#
orderbyclause lacks a space, whereas Visual Basics
Order Byclause includes one. Similarly, C#'s clauses are case-sensitive, whereas Visual Basics are not. Visual Basic is also less picky about the order that the query syntax clauses appear. C# query expressions must start with a
fromclause and end with a
group byclause. While Visual Basic query expressions must start with a
Fromclause they need not end with a
Group Byclause. Instead, they can end with a
Whereclause or an
Order Byclause or other clauses, as the VB code examples in this article have demonstrated.
Moreover, VB's query syntax includes more clauses than C#. With Visual Basic you can include aggregate clauses like
Min, as well as partitioning clauses like
Take While. There are no equivalent
query syntax clauses for these standard query operators in C#.
Other Query Syntax Clauses
In addition to
orderby, both C# and Visual Basic support additional query syntax clauses, including
group by. These query syntax clauses will be presented in a future installment.
|A Multipart Series on LINQ|
This article is one in a series of articles on LINQ, which was introduced with .NET version 3.5.