Regex

A regular expression (shortened to regex or regexp) is a sequence of characters that specifies a match pattern in text. Such patterns are usually used by string-searching algorithms for “find” or “find and replace” operations on strings, or for input validation.

Anchors

^

Matches the beginning of the string, or of each line in multiline mode. Example: ^word

$

Matches the end of the string, or of each line in multiline mode. Example: \.txt$
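A minimal sketch of both anchors using Python's re module. The sample text is made up for illustration, and the "line" behaviour assumes the re.MULTILINE flag, which is how Python enables it; other engines may use a different switch.

```python
import re

# Hypothetical three-line sample text.
text = "word one\nanother word\nword.txt"

# Without re.MULTILINE, ^ anchors only at the start of the whole string.
starts = re.findall(r"^word", text)

# With re.MULTILINE, ^ also anchors at the start of every line.
starts_multiline = re.findall(r"^word", text, flags=re.MULTILINE)

# $ anchors at the end of the string, so only the final ".txt" qualifies.
ends_txt = re.findall(r"\.txt$", text)

print(starts)            # ['word']
print(starts_multiline)  # ['word', 'word']
print(ends_txt)          # ['.txt']
```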

Flags

  • i: Makes the expression case-insensitive
  • g: Global; finds every match instead of stopping after the first
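How flags are written depends on the engine. As a rough sketch in Python: re.IGNORECASE corresponds to i, and since Python has no g flag, re.findall is used here as the closest equivalent. The sample text is an assumption for illustration.

```python
import re

text = "Regex, regex, REGEX"

# re.search stops at the first match, like an expression without g.
first = re.search(r"regex", text, flags=re.IGNORECASE)

# re.findall returns every match, which is what g does in other engines.
all_matches = re.findall(r"regex", text, flags=re.IGNORECASE)

# Without the case-insensitive flag only the lowercase spelling matches.
case_sensitive = re.findall(r"regex", text)

print(first.group())   # Regex
print(all_matches)     # ['Regex', 'regex', 'REGEX']
print(case_sensitive)  # ['regex']
```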

Groups & References

()

Groups an expression. Example: (ha)+

\1

Refers back to the text matched by a captured group. \1 references the first group, \2 the second, and so on. Example: (ha)\s\1

(?:)

Groups an expression without capturing it, so it cannot be referenced. Example: (?:ha)+
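The three group forms above can be tried out with a short Python sketch. The input strings are invented for illustration.

```python
import re

# (ha)+ : the group repeats; group(1) keeps only the last repetition.
repeated = re.fullmatch(r"(ha)+", "hahaha")

# (ha)\s\1 : \1 must match the same text that group 1 captured.
backref = re.search(r"(ha)\s\1", "he said ha ha")
no_backref = re.search(r"(ha)\s\1", "ha he")

# (?:ha)+ : groups for repetition but captures nothing.
non_capturing = re.fullmatch(r"(?:ha)+", "haha")

print(repeated.group(0), repeated.group(1))  # hahaha ha
print(backref.group(0))                      # ha ha
print(no_backref)                            # None
print(non_capturing.groups())                # ()
```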

Character Classes

[abc]

Matches any character in the set. Example: b[eo]r

[^abc]

Matches any character not in the set. Example: b[^eo]r

[a-z]

Matches any single character in the range between two characters, inclusive. Example: [e-i]

.

Matches any character except line breaks.

\w

Matches any alphanumeric character, including the underscore.

\W

Matches any character that is neither alphanumeric nor an underscore.

\d

Matches any digit.

\D

Matches any non-digit character.

\s

Matches any whitespace character.

\S

Matches any non-whitespace character.
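A small Python sketch that exercises the sets, ranges, and shorthand classes listed above. The sample strings are assumptions chosen to make each class visible.

```python
import re

words = "bar ber bir bor bur"
in_set = re.findall(r"b[eo]r", words)       # middle letter must be e or o
not_in_set = re.findall(r"b[^eo]r", words)  # middle letter must not be e or o
in_range = re.findall(r"[e-i]", "defghij")  # e through i, inclusive

sample = "id_42 ok!"
word_chars = re.findall(r"\w", sample)  # letters, digits, and underscore
non_word = re.findall(r"\W", sample)    # everything \w rejects
digits = re.findall(r"\d", sample)
spaces = re.findall(r"\s", sample)

print(in_set)               # ['ber', 'bor']
print(not_in_set)           # ['bar', 'bir', 'bur']
print(in_range)             # ['e', 'f', 'g', 'h', 'i']
print("".join(word_chars))  # id_42ok
print(non_word)             # [' ', '!']
print(digits)               # ['4', '2']
print(spaces)               # [' ']
```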

Lookarounds

(?=)

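Positive Lookahead. Matches only if the expression is followed by the given pattern, which is not included in the match. Example: \d(?=after)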

(?!)

Negative Lookahead. Matches only if the expression is not followed by the given pattern. Example: \d(?!after)

(?<=)

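Positive Lookbehind. Matches only if the expression is preceded by the given pattern, which is not included in the match. Example: (?<=behind)\d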

(?<!)

Negative Lookbehind. Matches only if the expression is not preceded by the given pattern. Example: (?<!behind)\d
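The four lookarounds can be compared side by side on one string. This is a sketch in Python with an invented sample; note that only the digit ends up in each match, never the surrounding word.

```python
import re

# Hypothetical sample with one digit for each situation.
text = "1after 2before behind3 ahead4"

pos_ahead = re.findall(r"\d(?=after)", text)     # digit followed by "after"
neg_ahead = re.findall(r"\d(?!after)", text)     # digit not followed by "after"
pos_behind = re.findall(r"(?<=behind)\d", text)  # digit preceded by "behind"
neg_behind = re.findall(r"(?<!behind)\d", text)  # digit not preceded by "behind"

print(pos_ahead)   # ['1']
print(neg_ahead)   # ['2', '3', '4']
print(pos_behind)  # ['3']
print(neg_behind)  # ['1', '2', '4']
```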

Quantifiers And Alternation

+

Matches one or more occurrences of the preceding expression. Example: be+r

*

Matches zero or more occurrences of the preceding expression. Example: be*r

{}

Matches the preceding expression a specified number of times:

  • Match exactly: {4}
  • Match minimum: {4,}
  • Match between: {4,9}

Example: be{1,2}r

?

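Makes the preceding expression optional. Placed after another quantifier, it makes that quantifier lazy, so it matches as few characters as possible. Example: colou?r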

|

Works like OR: the match succeeds if any one of the alternatives matches. Example: (c|r)at
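A Python sketch of the quantifiers and alternation above, including the greedy versus lazy difference that ? introduces after another quantifier. The sample strings are assumptions; the alternation uses a non-capturing group only because re.findall would otherwise return the group contents instead of the whole match.

```python
import re

words = "br ber beer beeer"
one_or_more = re.findall(r"be+r", words)   # at least one e
zero_or_more = re.findall(r"be*r", words)  # any number of e, including none
bounded = re.findall(r"be{1,2}r", words)   # one or two e

optional = re.findall(r"colou?r", "color colour")        # u may be absent
alternation = re.findall(r"(?:c|r)at", "cat rat bat")    # c or r, then "at"

html = "<b>bold</b>"
greedy = re.search(r"<.+>", html).group()  # takes as much as possible
lazy = re.search(r"<.+?>", html).group()   # takes as little as possible

print(one_or_more)   # ['ber', 'beer', 'beeer']
print(zero_or_more)  # ['br', 'ber', 'beer', 'beeer']
print(bounded)       # ['ber', 'beer']
print(optional)      # ['color', 'colour']
print(alternation)   # ['cat', 'rat']
print(greedy)        # <b>bold</b>
print(lazy)          # <b>
```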

Common Regular Expressions

  • IPv4-Address: \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
  • MAC-Address: (?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2}
  • Hex Color Codes: ^#?([a-fA-F0-9]{6})$
  • Mail Address: ^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$
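The patterns above can be compiled and tried as follows. This is only a sketch: the constant names and test inputs are made up, and the mail pattern is written with plain ASCII hyphens in its 0-9 ranges. It also shows two limits worth knowing: the IPv4 pattern does not check that each number is at most 255, and the trailing * on the mail pattern lets an empty string pass.

```python
import re

# Hypothetical names for the four patterns listed above.
IPV4 = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")
MAC = re.compile(r"(?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2}")
HEX_COLOR = re.compile(r"^#?([a-fA-F0-9]{6})$")
MAIL = re.compile(r"^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$")

ipv4_ok = IPV4.fullmatch("192.168.0.1") is not None
# The pattern checks digit counts only, so out-of-range octets still match.
ipv4_loose = IPV4.fullmatch("999.999.999.999") is not None

mac_ok = MAC.fullmatch("00:1A:2b:3C:4d:5E") is not None

# Group 1 holds the six hex digits without the optional leading #.
color = HEX_COLOR.match("#1a2B3c").group(1)

mail_ok = MAIL.match("user.name@example.com") is not None
# The group is quantified with *, so zero repetitions (empty input) match too.
mail_empty = MAIL.match("") is not None

print(ipv4_ok, ipv4_loose)  # True True
print(mac_ok)               # True
print(color)                # 1a2B3c
print(mail_ok, mail_empty)  # True True
```

For stricter validation it may be preferable to combine such patterns with a dedicated parser, for example Python's standard ipaddress module for IP addresses.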