Introduction to Regular Expressions

Regular expressions are a powerful tool for checking a string against an expected pattern. They can also be used to change a string according to a pattern. Yet regular expressions look like magic to many software developers, especially in C#. In other languages such as Ruby they are very common, but I almost never see them in C# applications. In this article I therefore want to give a short introduction to regular expressions and show that they are not magic at all.

Regular expressions may be used to compare a string with a pattern, to count the matching parts of a string, or to manipulate strings. This article focuses on pattern matching, but you can easily adapt what you learn to the other two tasks as well.
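As a rough illustration of the other two tasks, counting and manipulating, here is a minimal sketch. The class name RegexTasks, its two helper methods and the sample texts are made up for this example; Regex.Matches and Regex.Replace are the standard .NET methods.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class RegexTasks
{
    // Counts the non-overlapping parts of the input that match the pattern.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    // Replaces every part of the input that matches the pattern.
    public static string ReplaceMatches(string input, string pattern, string replacement)
    {
        return Regex.Replace(input, pattern, replacement);
    }

    public static void Main()
    {
        // Three runs of digits are found: "12", "7" and "2024".
        Console.WriteLine(CountMatches("12 apples, 7 pears, 2024", "[0-9]+"));   // 3

        // Every run of digits is replaced by '#'.
        Console.WriteLine(ReplaceMatches("12 apples, 7 pears", "[0-9]+", "#"));  // # apples, # pears
    }
}
```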

Executing a Regular Expression Pattern Matching Test

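The following source code shows a template you can use for pattern matching. The IsMatch method returns true if the value matches the pattern.

using System.Text.RegularExpressions;

string myValue = "...";   // the text to test
Regex myRegex;

myRegex = new Regex(@"...");   // the pattern to test against
bool isMatch = myRegex.IsMatch(myValue);   // true if myValue matches the pattern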
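To make the template concrete, here is a small sketch of a complete program. The class name PatternCheck and the method name Matches are assumptions chosen for this example; the pattern and test strings are taken from the examples table in this article.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the template, for illustration only.
public static class PatternCheck
{
    // Returns true if the value matches the pattern.
    public static bool Matches(string value, string pattern)
    {
        Regex regex = new Regex(pattern);
        return regex.IsMatch(value);
    }

    public static void Main()
    {
        // Pattern: the whole text must consist of exactly six characters.
        Console.WriteLine(Matches("abcdef", @"^.{6,6}$"));   // True
        Console.WriteLine(Matches("abcdefg", @"^.{6,6}$"));  // False
    }
}
```

If the same pattern is tested against many values, it is probably worth creating the Regex object once and reusing it instead of building it on every call.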

Structure of a Regular Expression

The following table lists the most important elements of a regular expression. See also the table with examples that follows it; the examples will help you understand what the elements mean.

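Element     Description
^           Start of the text
$           End of the text
.           Any single character (except a line break)
\s          Whitespace character
[a-zA-Z]    Letter (a to z, uppercase or lowercase)
[0-9]       Digit
[…]         One of the listed characters
[^…]        None of the listed characters
(…)         Group
…|…         Or (either alternative)
*           Zero or more times
?           Zero or one time (optional)
+           One or more times
{n,}        At least n times
{n,m}       At least n and at most m times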
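A few of these elements can be tried out directly. The following sketch assumes a made-up class ElementDemo with a small helper method Check; the sample texts are invented for this illustration. It also shows why the anchors ^ and $ matter: without them, IsMatch succeeds as soon as the pattern is found anywhere in the text.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
public static class ElementDemo
{
    // Thin wrapper around the static Regex.IsMatch method.
    public static bool Check(string text, string pattern)
    {
        return Regex.IsMatch(text, pattern);
    }

    public static void Main()
    {
        // Without anchors the pattern may match anywhere in the text.
        Console.WriteLine(Check("order 42", "[0-9]+"));    // True

        // With ^ and $ the whole text has to consist of digits.
        Console.WriteLine(Check("order 42", "^[0-9]+$"));  // False
        Console.WriteLine(Check("42", "^[0-9]+$"));        // True

        // \s stands for a whitespace character.
        Console.WriteLine(Check("a b", @"^a\sb$"));        // True

        // {2,4} allows two to four repetitions; five are too many.
        Console.WriteLine(Check("aaaaa", "^a{2,4}$"));     // False
    }
}
```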


The following table shows some examples of regular expressions. For each example, one string that matches and one string that does not match are shown.

Description                                                 Regular expression                       Matches     Does not match
Length must be exactly six characters                       ^.{6,6}$                                 abcdef      abcdefg
Minimum length of four characters                           ^.{4,}$                                  abcdef      abc
Length of one to six characters                             ^.{1,6}$                                 abcdef      abcdefg
Content must be …                                           ^abcdef$                                 abcdef      abcdefg
Starts with …                                               ^abc                                     abcdef      abdefg
Ends with …                                                 ef$                                      abcdef      abcdefg
Starts with … and ends with …                               ^ab.*ef$                                 abcdef      abcdefg
Starts with … or …                                          ^(xyz|abc)                               abcdef      abdefg
First character is not … and second character is not …      ^[^a][^b]                                bdef        abcefg
Does not start with … or …                                  ^[^ab]                                   cdef        abcefg
Starts with … and has length …                              ^ab.{4,4}$                               abcdef      abWdefg
Starts with …, ends with … and has a defined length         ^abc.{3,3}def$                           abcxyzdef   abcxydef
Starts with …, ends with … and has a minimum length         ^abc.{1,}def$                            abcxyzdef   abcdef
Only letters ‘a’ to ‘z’ (uppercase or lowercase)            ^[a-zA-Z]{1,}$                           abcdef      ab12ef
Only digits ‘0’ to ‘9’                                      ^[0-9]{1,}$                              12345       12ef
Pattern ‘xxxyyyxxx’ (x = letter, y = digit)                 ^[a-zA-Z]{3,3}[0-9]{3,3}[a-zA-Z]{3,3}$   abc123def   abc1s3def
Starts with zero or more ‘a’, followed by ‘bc’              ^a*bc                                    aaaabc      xbc
Has length … after an optional start (here ‘X’)             ^(X)?.{8,8}$                             abcdefgh
Starts with one or more ‘a’                                 ^a+                                      abac        xyz
Starts with two to four ‘a’, followed by ‘bc’               ^a{2,4}bc$                               aaabc       aaaaaaaabc
Starts with two or more ‘a’, followed by ‘bc’               ^a{2,}bc$                                aaaaaaaabc  abc
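Rows of such a table can be checked with a few lines of code. The sketch below is one possible way to do it; the class name TableCheck and the method RowHolds are invented for this example, and the patterns and strings are copied from five of the rows above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical checker for table rows, for illustration only.
public static class TableCheck
{
    // True if 'good' matches the pattern and 'bad' does not.
    public static bool RowHolds(string pattern, string good, string bad)
    {
        return Regex.IsMatch(good, pattern) && !Regex.IsMatch(bad, pattern);
    }

    public static void Main()
    {
        // Each entry: pattern, matching example, non-matching example.
        var rows = new (string Pattern, string Good, string Bad)[]
        {
            (@"^.{4,}$", "abcdef", "abc"),
            (@"^(xyz|abc)", "abcdef", "abdefg"),
            (@"^abc.{3,3}def$", "abcxyzdef", "abcxydef"),
            (@"^[a-zA-Z]{3,3}[0-9]{3,3}[a-zA-Z]{3,3}$", "abc123def", "abc1s3def"),
            (@"^a{2,4}bc$", "aaabc", "aaaaaaaabc"),
        };

        foreach (var row in rows)
        {
            string result = RowHolds(row.Pattern, row.Good, row.Bad) ? "as expected" : "unexpected";
            Console.WriteLine(row.Pattern + ": " + result);
        }
    }
}
```

A small check like this is also a convenient way to try out your own patterns before using them in an application.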


Regular expressions are not easy to use: their syntax is very technical, and if you do not use them every day you will forget the details. Still, if you need to implement difficult string comparisons, I recommend using regular expressions. The power of the pattern matching features outweighs the reduced readability.

