In this post, we will go through the various symbols used to build a regular expression and what each of them means. So, let’s see them one by one –
Subexpression | Matches |
---|---|
\^ | Start of a string/line |
$ | End of a string/line |
\b | Word boundary |
\B | Not a word boundary |
\A | Beginning of entire string |
\z | End of entire string |
\Z | End of entire string (except allowable final line terminator) |
. | Any one character (except line terminator) |
[…] | Any one character from those listed |
[\^…] | Any one character not from those listed |
Please see this post to understand how the above symbols work.
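
To make the anchors and character classes above concrete, here is a minimal sketch. It assumes Java's `java.util.regex` engine (the post does not name one), and the class and helper names (`AnchorDemo`, `found`, `count`) are made up for illustration. Note that every backslash is doubled inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorDemo {
    // Returns true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Counts the non-overlapping matches of the regex in the input.
    static int count(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "cat scatter\n";
        System.out.println(found("^cat", text));       // true: "cat" starts the string
        System.out.println(count("\\bcat\\b", text));  // 1: only the standalone word
        System.out.println(count("\\Bcat", text));     // 1: the "cat" inside "scatter"
        System.out.println(found("scatter\\Z", text)); // true: \Z tolerates the final "\n"
        System.out.println(found("scatter\\z", text)); // false: \z is the very end
        System.out.println(count("[ct]", "cat"));      // 2: 'c' and 't'
        System.out.println(count("[^ct]", "cat"));     // 1: 'a'
        System.out.println(count(".", "a\nb"));        // 2: '.' skips the line terminator
    }
}
```
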
Let’s now move to Normal (greedy), Reluctant (non-greedy), and Possessive (very greedy) quantifiers.
Normal (greedy) Quantifiers –
Subexpression | Matches |
---|---|
{m,n} | Matches from m to n repetitions |
{m,} | Matches m or more repetitions |
{m} | Matches exactly m repetitions |
{,n} | Matches from 0 to n repetitions (Short for {0,n} ) |
\* | Matches 0 or more repetitions (Short for {0,} ) |
+ | Matches 1 or more repetitions (Short for {1,} ) |
? | Matches exactly 0 or 1 repetition (Short for {0,1} ) |
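
The greedy quantifiers in the table can be tried out with a small sketch like the one below. It assumes Java's `java.util.regex`; `GreedyDemo` and `first` are hypothetical names. The sketch writes `{0,2}` in full rather than using the `{,n}` shorthand, because as far as I know `java.util.regex` rejects `{,n}` with a `PatternSyntaxException`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Returns the first match of the regex in the input, or null if there is none.
    static String first(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "aaaaa";
        // Greedy quantifiers take as many repetitions as the bounds allow.
        System.out.println(first("a{2,4}", input)); // aaaa
        System.out.println(first("a{2,}", input));  // aaaaa
        System.out.println(first("a{3}", input));   // aaa
        System.out.println(first("a{0,2}", input)); // aa
        System.out.println(first("a*", input));     // aaaaa
        System.out.println(first("a+", input));     // aaaaa
        System.out.println(first("a?", input));     // a
    }
}
```
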
Reluctant (non-greedy) Quantifiers –
Subexpression | Matches |
---|---|
{m,n}? | Reluctant quantifier for “from m to n repetitions” |
{m,}? | Reluctant quantifier for “m or more repetitions” |
{m}? | Reluctant quantifier for exactly m repetitions (matches the same text as {m} ) |
\*? | Reluctant quantifier: 0 or more |
+? | Reluctant quantifier: 1 or more |
?? | Reluctant quantifier: 0 or 1 times |
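
The difference between greedy and reluctant matching is easiest to see side by side. The following is a rough sketch, again assuming Java's `java.util.regex`; `ReluctantDemo` and `first` are illustrative names only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantDemo {
    // Returns the first match of the regex in the input, or null if there is none.
    static String first(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<b>one</b><b>two</b>";
        // Greedy: .* runs to the last "</b>" in the input.
        System.out.println(first("<b>.*</b>", html));  // <b>one</b><b>two</b>
        // Reluctant: .*? stops at the first "</b>" it can reach.
        System.out.println(first("<b>.*?</b>", html)); // <b>one</b>

        String input = "aaaaa";
        // Reluctant quantifiers take as few repetitions as the bounds allow.
        System.out.println(first("a{2,4}?", input)); // aa
        System.out.println(first("a{2,}?", input));  // aa
        System.out.println(first("a+?", input));     // a
        System.out.println(first("a??", input).isEmpty()); // true: zero repetitions
    }
}
```
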
Possessive (very greedy) quantifiers –
Subexpression | Matches |
---|---|
{m,n}+ | Possessive quantifier for “from m to n repetitions” |
{m,}+ | Possessive quantifier for “m or more repetitions” |
{m}+ | Possessive quantifier for exactly m repetitions |
\*+ | Possessive quantifier: 0 or more |
++ | Possessive quantifier: 1 or more |
?+ | Possessive quantifier: 0 or 1 times |
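
A possessive quantifier behaves like a greedy one that never gives back what it has consumed, so a pattern that relies on backtracking stops matching. Here is a small sketch of that effect, assuming Java's `java.util.regex`; `PossessiveDemo` and `matches` are hypothetical names.

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    // Returns true if the regex matches the entire input.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Greedy a* first takes all three 'a's, then backtracks one for the final 'a'.
        System.out.println(matches("a*a", "aaa"));       // true
        // Possessive a*+ takes all three 'a's and refuses to give one back.
        System.out.println(matches("a*+a", "aaa"));      // false
        System.out.println(matches("a?a", "a"));         // true
        System.out.println(matches("a?+a", "a"));        // false
        System.out.println(matches("a{1,3}+a", "aaa"));  // false
        // No backtracking is needed here, so the possessive form still matches.
        System.out.println(matches("a++b", "aaab"));     // true
    }
}
```
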
Now let’s see the various escapes and shorthands in regular expressions.
Escapes and Shorthands –
Subexpression | Matches |
---|---|
\ | Escape (quote) character: turns most metacharacters off; turns some subsequent alphabetic characters into metacharacters |
\Q | Escape (quote) all characters up to \E |
\E | Ends quoting begun with \Q |
\t | A Tab character |
\r | Return (carriage return) character |
\n | A Newline character |
\f | A Form feed |
\w (small w) | Character in a word (use \w+ for a word ) |
\W (capital W) | A nonword character |
\d (small d) | A Numeric digit (use \d+ for an integer ) |
\D (capital D) | A nondigit character |
\s (small s) | A Whitespace character |
\S (capital S) | A nonwhitespace character |
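
The escapes and shorthands above could be exercised roughly as follows. This is a sketch under the assumption that the engine is Java's `java.util.regex`; `EscapeDemo`, `matches`, and `all` are names invented for the example, while `Pattern.quote` is the standard library helper that wraps a string in `\Q…\E`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeDemo {
    // Returns true if the regex matches the entire input.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Collects every non-overlapping match of the regex in the input.
    static List<String> all(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Unescaped '+' is a quantifier, so "1+1" does not match the text "1+1".
        System.out.println(matches("1+1", "1+1"));                // false
        System.out.println(matches("1\\+1", "1+1"));              // true
        // \Q...\E quotes everything in between.
        System.out.println(matches("\\Q1+1=2\\E", "1+1=2"));      // true
        System.out.println(matches(Pattern.quote("1+1"), "1+1")); // true

        String text = "Order 66,\tshipped!";
        System.out.println(all("\\w+", text));    // [Order, 66, shipped]
        System.out.println(all("\\d+", text));    // [66]
        System.out.println(all("\\D+", "12ab34")); // [ab]
        System.out.println(all("\\W", "a-b c"));  // [-,  ]
        System.out.println(all("\\S+", "a\tb  c\nd")); // [a, b, c, d]
        System.out.println(all("\\s+", "a\tb  c\nd").size()); // 3
    }
}
```
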