Regular expressions are a powerful tool for checking a string against an expected pattern. They can also be used to change a string according to a pattern. To many software developers, however, regular expressions look like magic, especially in C#. In other languages such as Ruby they are very common, but I hardly ever see them in C# applications. I therefore want to use this article to give a short introduction to regular expressions and to show that they are not magic at all.
Regular expressions may be used to compare a string with a pattern, to count the matching parts of a string, or to manipulate strings. In this article I want to show you the pattern matching feature, but you can easily adapt what you learn to the other two tasks as well.
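To make the three uses concrete, here is a minimal sketch in C#. The input string and the patterns are made up for illustration, and the code is written as top-level statements, which assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample input, chosen only for illustration.
string input = "a1b22c333";

// Compare: does the input contain at least one digit?
bool hasDigit = Regex.IsMatch(input, "[0-9]");

// Count: how many runs of digits does the input contain?
int digitRuns = Regex.Matches(input, "[0-9]+").Count;

// Manipulate: replace every run of digits with '#'.
string masked = Regex.Replace(input, "[0-9]+", "#");

Console.WriteLine(hasDigit);   // True
Console.WriteLine(digitRuns);  // 3
Console.WriteLine(masked);     // a#b#c#
```

All three calls are static methods of the Regex class, so the same pattern syntax carries over from one task to the next.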
Execute a Regular Expression pattern matching test
The following source code shows a possible template you can use for pattern matching.
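// Requires: using System.Text.RegularExpressions;
string myValue = "...";             // the text to check
Regex myRegex = new Regex(@"...");  // the pattern
bool isMatch = myRegex.IsMatch(myValue);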
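Filled in with concrete values, the template could look like the following sketch. The five-digit pattern, the sample inputs and the helper name IsFiveDigits are assumptions made for this example, and the code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical pattern: the whole text must consist of exactly five digits.
Regex fiveDigitsRegex = new Regex(@"^[0-9]{5}$");

// Hypothetical helper that wraps the IsMatch call from the template.
bool IsFiveDigits(string value) => fiveDigitsRegex.IsMatch(value);

Console.WriteLine(IsFiveDigits("12345"));  // True
Console.WriteLine(IsFiveDigits("1234a"));  // False
```

Creating the Regex object once and reusing it, as done here, avoids parsing the pattern again for every call.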
Structure of a Regular Expression
The following table contains the most important elements to create a regular expression. Please see also the table in the next section with some examples. These examples will help you to understand the meaning of the elements.
Element | Description |
^ | Start of the text |
$ | End of the text |
. | Any single character (except a line break) |
\s | Whitespace |
[a-zA-Z] | Letter |
[0-9] | Digit |
[…] | One of the listed characters |
[^…] | None of the listed characters |
…|… | Or |
* | Zero or more times |
? | Zero or one time (optional) |
+ | One or more times |
{n,} | At least n times |
{n,m} | At least n times and at most m times |
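As a quick way to try out the elements from the table, the sketch below pairs each of a few small patterns with an input that it should match. The patterns and inputs are invented for illustration; the code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical samples: each pattern uses elements from the table above
// and is paired with an input that it is expected to match.
(string Pattern, string Input)[] samples =
{
    (@"^ab", "abc"),             // ^ : start of the text
    (@"bc$", "abc"),             // $ : end of the text
    (@"a\sb", "a b"),            // \s : whitespace
    (@"^[a-zA-Z]+$", "Hello"),   // letters, at least one
    (@"^[0-9]{2,3}$", "123"),    // digits, two to three times
    (@"^[^0-9]*$", "abc"),       // none of the digits, any count
    (@"^(cat|dog)$", "dog"),     // or
    (@"^colou?r$", "color"),     // ? : the 'u' is optional
};

foreach (var (pattern, input) in samples)
{
    bool isMatch = Regex.IsMatch(input, pattern);
    Console.WriteLine($"{pattern,-16} {input,-6} {isMatch}");
}
```

Every line should end with True. Changing one input at a time is an easy way to see which element causes a match to fail.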
Examples
The following table shows some examples of regular expressions. For each example, a matching and a non-matching string are shown.
Description | Regular expression | Example which matches | Example which does not match |
Length must be exactly six characters | ^.{6,6}$ | abcdef | abcdefg |
Minimum length of four characters | ^.{4,}$ | abcdef | abc |
Maximum length of six characters (and minimum length of one) | ^.{1,6}$ | abcdef | abcdefg |
Content must be … | ^abcdef$ | abcdef | abcdefg |
Starts with … | ^abc | abcdef | abdefg |
Ends with … | ef$ | abcdef | abcdefg |
Starts with … and ends with … | ^ab.*ef$ | abcdef | abcdefg |
Starts with … or … | ^(xyz|abc) | abcdef | abdefg |
First character is NOT … and second character is NOT … | ^[^a][^b] | bdef | abcefg |
Starts NOT with … or … | ^[^ab] | cdef | abcefg |
Starts with … and has length … | ^ab.{4,4}$ | abcdef | abWdefg |
Starts with … and ends with … and has defined length … | ^abc.{3,3}def$ | abcxyzdef | abcxydef |
Starts with … and ends with … and has minimum length … | ^abc.{1,}def$ | abcxyzdef | abcdef |
Only contains letters ‘a’ to ‘z’ (uppercase or lowercase) | ^[a-zA-Z]{1,}$ | abcdef | ab12ef |
Only contains numbers ‘0’ to ‘9’ | ^[0-9]{1,}$ | 12345 | 12ef |
Pattern matching ‘xxxyyyxxx’ where x are letters and y are numbers | ^[a-zA-Z]{3,3}[0-9]{3,3}[a-zA-Z]{3,3}$ | abc123def | abc1s3def |
Starts with any number of … (including none), followed by … | ^a*bc | aaaabc | xbc |
Has length of … and may start with an optional part (in this case ‘X’) | ^(X)?.{8,8}$ | abcdefgh, Xabcdefgh | Abcdefghi, Xabcdefghi |
Starts with one or more … (in this case: must start with at least one ‘a’ but may also start with several ‘a’ letters) | ^a+ | abac | xyz |
Starts with … repeated … to … times (in this case: starts with two to four ‘a’ letters) | ^a{2,4}bc$ | aaabc | aaaaaaaabc |
Starts with … repeated at least … times (in this case: starts with two or more ‘a’ letters) | ^a{2,}bc$ | aaaaaaaabc | abc |
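The rows of the table can be checked mechanically. The following sketch runs a handful of them through Regex.IsMatch and reports whether the matching example matches and the non-matching example does not. The helper name RowHolds and the choice of rows are assumptions made for this example; the code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// A few rows copied from the table: pattern, matching example, non-matching example.
(string Pattern, string Match, string NoMatch)[] rows =
{
    (@"^.{6,6}$", "abcdef", "abcdefg"),
    (@"^.{4,}$", "abcdef", "abc"),
    (@"^ab.*ef$", "abcdef", "abcdefg"),
    (@"^(xyz|abc)", "abcdef", "abdefg"),
    (@"^abc.{3,3}def$", "abcxyzdef", "abcxydef"),
    (@"^[a-zA-Z]{3,3}[0-9]{3,3}[a-zA-Z]{3,3}$", "abc123def", "abc1s3def"),
    (@"^a{2,4}bc$", "aaabc", "aaaaaaaabc"),
};

// Hypothetical helper: true if the row behaves as the table claims.
bool RowHolds(string pattern, string match, string noMatch) =>
    Regex.IsMatch(match, pattern) && !Regex.IsMatch(noMatch, pattern);

foreach (var (pattern, match, noMatch) in rows)
{
    string verdict = RowHolds(pattern, match, noMatch) ? "ok" : "UNEXPECTED";
    Console.WriteLine($"{verdict}: {pattern}");
}
```

A small table-driven check like this is also a convenient place to add your own patterns while you experiment.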
Summary
Regular expressions are not easy to use: their syntax is very technical, and if you don’t use them every day you will forget its details. But if you need to implement difficult string comparisons, I recommend using regular expressions. The power of the pattern matching features outweighs the disadvantage of reduced readability.
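One possible way to soften the readability problem in C# is the RegexOptions.IgnorePatternWhitespace option, which lets you spread a pattern over several lines and annotate it with # comments. The sketch below rewrites the ‘xxxyyyxxx’ pattern from the examples in this style; the variable name is an assumption, and the code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern as ^[a-zA-Z]{3,3}[0-9]{3,3}[a-zA-Z]{3,3}$ from the examples.
// With IgnorePatternWhitespace, unescaped whitespace in the pattern is ignored
// and everything after '#' up to the end of the line is a comment.
Regex codeRegex = new Regex(@"
    ^                # start of the text
    [a-zA-Z]{3,3}    # exactly three letters
    [0-9]{3,3}       # exactly three digits
    [a-zA-Z]{3,3}    # exactly three letters
    $                # end of the text
    ", RegexOptions.IgnorePatternWhitespace);

Console.WriteLine(codeRegex.IsMatch("abc123def"));  // True
Console.WriteLine(codeRegex.IsMatch("abc1s3def"));  // False
```

The commented form is longer, but each part of the pattern can be read on its own.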