In this article and in the following ones I want to compare Linq and classical loops. During code reviews you will sometimes hear the suggestion to use language-specific features because they simplify the source code. Linq is such a feature. On the other hand, you may ask yourself whether such a feature always creates better code. To compare two implementations of the same use case, you have to look at several criteria: complexity, maintainability, robustness (risk of errors), comprehensibility and so on. One of the most important factors, perhaps the most important one, is how easy the code is to understand. This depends on the knowledge of the developers and on the best practices and clean code principles of the team or the company. Therefore you cannot say in general that one solution or the other is best practice for all companies.
In the following I want to compare Linq queries and classical loops for a simple query. I will show the pros and cons of each solution and give my personal opinion on which one I prefer. You may use my input to make your own decision.
The query will be implemented for two different use cases. In the first use case we have a known and well-defined data handling component. It is responsible for data validation and ensures that we never receive uninitialized data. So we don’t have to check for null references or data ranges.
In the second use case we don’t know the data handling components. We want to offer a query which can be used by everyone. The query itself must be robust enough to deal with null references and other data issues.
Use case one: clean data
Let’s start with the first use case. We have a simple data class for a person. Out of a list of persons we want to find a specific one by name. To keep it simple we just look for the first person with that name. If the list does not contain the person we are looking for, we return a default person object.
The data class is defined as follows:
    public class Person
    {
        public string Name { get; set; }
        public uint Age { get; set; }

        public static readonly Person Default = new Person() { Name = "new person", Age = 0 };
    }
Our demo console application creates a list of data and calls the data query method, first with an existing person and then with a non-existing one.
    List<Person> persons = new List<Person>();
    persons.Add(new Person() { Name = "John Doe", Age = 35 });
    persons.Add(new Person() { Name = "Jane Doe", Age = 41 });

    //---------
    Person person;

    //search existing person
    person = FindPerson(persons, "Jane Doe");
    Console.WriteLine("Name: " + person.Name);

    //search not existing person
    person = FindPerson(persons, "???");
    Console.WriteLine("Name: " + person.Name);

    Console.ReadKey();
The data query method shall be implemented twice: by using a loop and by using Linq. We start with the classical loop:
    private static Person FindPerson(List<Person> persons, string name)
    {
        foreach (Person person in persons)
        {
            if (string.Equals(person.Name, name, StringComparison.OrdinalIgnoreCase))
            {
                return person;
            }
        }

        return Person.Default;
    }
And we implement the same function by using Linq.
    private static Person FindPerson(List<Person> persons, string name)
    {
        var result = from person in persons
                     where string.Equals(person.Name, name, StringComparison.OrdinalIgnoreCase)
                     select person;

        return result
            .DefaultIfEmpty<Person>(Person.Default)
            .First<Person>();
    }
Code Review for use case one
Let us compare the two implementations.
The loop is very clean, simple and easy to understand. With one look at the code I understand its functionality. The code does not contain any comments, and I don’t miss them because the code is self-explanatory.
The Linq implementation is simple too. The Linq query is a plain “from-where-select” query which is easy to understand. But when I read the code I stumble over the “DefaultIfEmpty” function call. I have to think about this line for a moment before I see that it covers the case in which no matching value is found in our data list.
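To make the role of “DefaultIfEmpty” more concrete, here is a minimal sketch of its behaviour on an empty and on a non-empty sequence. The class DefaultIfEmptyDemo and the helper FirstOrFallback are made up for this illustration and are not part of the demo application.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DefaultIfEmptyDemo
{
    // Hypothetical helper: returns the first element of the sequence,
    // or the given fallback when the sequence is empty.
    public static string FirstOrFallback(IEnumerable<string> items, string fallback)
    {
        return items
            .DefaultIfEmpty(fallback) // an empty sequence becomes a sequence holding only the fallback
            .First();                 // so First() always has an element to return
    }

    public static void Main()
    {
        var empty = new List<string>();
        var names = new List<string> { "Jane Doe", "John Doe" };

        Console.WriteLine(FirstOrFallback(empty, "new person")); // prints "new person"
        Console.WriteLine(FirstOrFallback(names, "new person")); // prints "Jane Doe"
    }
}
```

Without the “DefaultIfEmpty” call, First() would throw an InvalidOperationException on the empty list.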
As a result of the code review I actually prefer the first implementation using the loop because it is clear and easy to understand. Reading the Linq query made me stop and think about the “DefaultIfEmpty” part.
If the developer adds a short comment explaining why the “DefaultIfEmpty” call is needed, the Linq query is easy to understand too, and I have no preference between the two implementations.
Use case two: dirty data
The second use case adds an important requirement: the query must be robust. The data may, for example, contain null values. As in the first use case, the method shall return a default person if the one we are looking for is not found. Null values or an uninitialized list shall not throw an error. In these cases the default person shall be returned as well.
In our test console application we create some dirty data. We also add tests which call the function with this data and even with a null parameter.
    List<Person> persons = new List<Person>();
    persons.Add(new Person() { Name = "John Doe", Age = 35 });
    persons.Add(null);
    persons.Add(new Person() { Name = null, Age = 38 });
    persons.Add(new Person() { Name = "Jane Doe", Age = 41 });

    //---------
    Person person;

    //search existing person
    person = FindPerson(persons, "Jane Doe");
    Console.WriteLine("Name: " + person.Name);

    //search not existing person
    person = FindPerson(persons, "???");
    Console.WriteLine("Name: " + person.Name);

    //search in a list which is not yet initialized
    person = FindPerson(null, "???");
    Console.WriteLine("Name: " + person.Name);

    Console.ReadKey();
The loop-based query must be adapted to handle all these special cases. The following source code shows a corresponding implementation. Both the list and its elements are checked for null values.
    private static Person FindPerson(List<Person> persons, string name)
    {
        if (persons == null)
        {
            return Person.Default;
        }

        foreach (Person person in persons)
        {
            if (person == null)
            {
                continue;
            }

            if (string.Equals(person.Name, name, StringComparison.OrdinalIgnoreCase))
            {
                return person;
            }
        }

        return Person.Default;
    }
The implementation of the Linq query must be adapted too. A check of the whole list as well as of each single element is added.
    private static Person FindPerson(List<Person> persons, string name)
    {
        if (persons == null)
        {
            return Person.Default;
        }

        var result = from person in persons
                     where person != null
                     where string.Equals(person.Name, name, StringComparison.OrdinalIgnoreCase)
                     select person;

        return result
            .DefaultIfEmpty<Person>(Person.Default)
            .First<Person>();
    }
Code Review for use case two
I think we have the same situation as in the first case. The adapted methods are quite easy to understand. They are clean and self-explanatory. The “DefaultIfEmpty” part may need a comment, but if the team uses Linq regularly such a comment may be unnecessary. Therefore both implementations are equivalent from my point of view.
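For comparison, the same robust query can also be sketched in Linq method syntax. This is only one possible variant under my own assumptions, not the implementation reviewed above: the class PersonQueries and the method name FindPersonOrDefault are hypothetical, and the sketch relies on the null-conditional operator (“?.”) and the null-coalescing operator (“??”) instead of “DefaultIfEmpty”.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public uint Age { get; set; }

    public static readonly Person Default = new Person { Name = "new person", Age = 0 };
}

public static class PersonQueries
{
    // Hypothetical alternative to FindPerson, written in method syntax.
    public static Person FindPersonOrDefault(IEnumerable<Person> persons, string name)
    {
        // "?." skips the query when the list itself is null,
        // "??" falls back to the default person when nothing was found.
        return persons?.FirstOrDefault(p =>
                   p != null &&
                   string.Equals(p.Name, name, StringComparison.OrdinalIgnoreCase))
               ?? Person.Default;
    }

    public static void Main()
    {
        var persons = new List<Person>
        {
            new Person { Name = "John Doe", Age = 35 },
            null,
            new Person { Name = null, Age = 38 },
            new Person { Name = "Jane Doe", Age = 41 }
        };

        Console.WriteLine(FindPersonOrDefault(persons, "jane doe").Name); // prints "Jane Doe"
        Console.WriteLine(FindPersonOrDefault(persons, "???").Name);      // prints "new person"
        Console.WriteLine(FindPersonOrDefault(null, "???").Name);         // prints "new person"
    }
}
```

Whether this compact form is easier to read than the loop depends, again, on how familiar the team is with Linq and with these operators.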