Regular expressions (regex) are a language for describing text patterns, used for searching, validating and replacing text. They're incredibly powerful, but their cryptic syntax is intimidating. This cheatsheet gathers the essentials with examples, so you can stop Googling "how do I make an email regex" every time.
Basic metacharacters
| Symbol | Means | Example |
|---|---|---|
| `.` | Any character (except newline) | `a.c` → "abc", "axc" |
| `\` | Escapes a special character | `\.` → a literal dot |
| `\|` | OR (alternation) | `cat\|dog` |
| `^` | Start of the string (or of each line with the `m` flag) | `^Hello` |
| `$` | End of the string (or of each line with the `m` flag) | `end$` |
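
The metacharacters above can be tried out in a few lines. This is a minimal sketch using Python's `re` module; the article does not name a language, so Python and the sample strings are assumptions chosen for illustration.

```python
import re

# "." matches any single character except a newline.
dot_matches = re.findall(r"a.c", "abc axc a\nc")
print(dot_matches)  # ['abc', 'axc']

# "\." matches only a literal dot.
literal_dots = re.findall(r"\.", "v1.2.3")
print(literal_dots)  # ['.', '.']

# "|" matches either the left or the right alternative.
pets = re.findall(r"cat|dog", "a cat and a dog")
print(pets)  # ['cat', 'dog']

# "^" and "$" tie the pattern to the start and end of the string.
starts_with_hello = re.search(r"^Hello", "Hello world") is not None
ends_with_end = re.search(r"end$", "the end") is not None
print(starts_with_hello, ends_with_end)  # True True
```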
Character classes
| Pattern | Matches |
|---|---|
| `[abc]` | An a, b or c |
| `[^abc]` | Any character that is NOT a, b or c |
| `[a-z]` | Any lowercase letter from a to z |
| `[A-Z]` | Any uppercase letter from A to Z |
| `[0-9]` | Any digit |
| `[a-zA-Z0-9]` | Any letter or digit |
Predefined classes (shortcuts)
| Shortcut | Equals | Means |
|---|---|---|
| `\d` | `[0-9]` | A digit |
| `\D` | `[^0-9]` | NOT a digit |
| `\w` | `[a-zA-Z0-9_]` | Word character |
| `\W` | `[^a-zA-Z0-9_]` | NOT a word character |
| `\s` | space, tab, newline | Whitespace |
| `\S` | `[^\s]` | NOT whitespace |
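
To see the explicit classes and their shortcuts side by side, here is a small sketch in Python's `re` module (Python is an assumption, as is the sample string). The `re.ASCII` flag is used so that `\d`, `\w` and `\s` behave like the ASCII ranges listed in the table; without it, Python's text patterns also accept Unicode digits and letters.

```python
import re

sample = "Order_42 shipped: 3 boxes!"

# Explicit character classes.
lowercase = re.findall(r"[a-z]+", sample)
not_abc = re.findall(r"[^abc]", "aXbYc")
print(lowercase)  # ['rder', 'shipped', 'boxes']
print(not_abc)    # ['X', 'Y']

# Shortcuts, restricted to ASCII so they equal the ranges in the table.
digits = re.findall(r"\d+", sample, flags=re.ASCII)
words = re.findall(r"\w+", sample, flags=re.ASCII)
non_word = re.findall(r"\W", sample, flags=re.ASCII)
print(digits)    # ['42', '3']
print(words)     # ['Order_42', 'shipped', '3', 'boxes']
print(non_word)  # [' ', ':', ' ', ' ', '!']
```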
Quantifiers (how many times)
| Symbol | Means | Example |
|---|---|---|
| `*` | 0 or more | `a*` |
| `+` | 1 or more | `a+` |
| `?` | 0 or 1 (optional) | `colou?r` → "color" or "colour" |
| `{n}` | Exactly n | `\d{4}` → 4 digits |
| `{n,}` | n or more | `\d{2,}` |
| `{n,m}` | Between n and m | `\d{2,4}` |
By default, quantifiers are greedy: they match as much as possible. Add `?` after a quantifier to make it lazy, so it matches as little as possible: `.*?`.
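
The difference between greedy and lazy matching is easiest to see on a concrete string. The following is a rough sketch in Python's `re` module (an assumed language; the inputs are made up for illustration).

```python
import re

html = "<a><b>"

# Greedy: ".*" runs to the last ">" it can reach.
greedy = re.findall(r"<.*>", html)
print(greedy)  # ['<a><b>']

# Lazy: ".*?" stops at the first ">" that lets the match succeed.
lazy = re.findall(r"<.*?>", html)
print(lazy)  # ['<a>', '<b>']

# "?" makes the preceding character optional.
optional_u = re.findall(r"colou?r", "color colour")
print(optional_u)  # ['color', 'colour']

# {n,m} is greedy too: it takes up to 4 digits, then starts a new match.
counted = re.findall(r"\d{2,4}", "7 42 2024 123456")
print(counted)  # ['42', '2024', '1234', '56']
```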
Anchors and boundaries
| Pattern | Means |
|---|---|
| `^` | Start of string/line |
| `$` | End of string/line |
| `\b` | Word boundary |
| `\B` | NOT a word boundary |
`\bcat\b` finds "cat" as a whole word, not inside "category".
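
A quick way to check the whole-word behavior is to compare match positions with and without the boundaries. This sketch assumes Python's `re` module and an invented sample sentence.

```python
import re

text = "cat category concat cat."

# Without boundaries, "cat" is found inside longer words too.
any_cat_count = len(re.findall(r"cat", text))
print(any_cat_count)  # 4

# With \b on both sides, only the standalone words match.
whole_word_starts = [m.start() for m in re.finditer(r"\bcat\b", text)]
print(whole_word_starts)  # [0, 20]

# \B requires that there is NO boundary: "cat" must continue a word.
inside_word_starts = [m.start() for m in re.finditer(r"\Bcat", text)]
print(inside_word_starts)  # [16]
```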
Groups and capture
| Pattern | Means |
|---|---|
| `(abc)` | Capture group |
| `(?:abc)` | NON-capturing group |
| `(?<name>abc)` | Named group |
| `\1` | Reference to group 1 |
Groups are used to extract parts of a match (for example, `(\d{4})-(\d{2})` captures the year and the month separately) and to apply a quantifier to several characters at once: `(ab)+`.
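
Here is a short sketch of capturing, named groups, non-capturing groups and backreferences, assuming Python's `re` module. One caveat that matters in this language: Python spells named groups `(?P<name>...)` rather than the `(?<name>...)` form shown in the table. The sample strings are illustrative.

```python
import re

# Numbered capture groups: year and month are extracted separately.
match = re.search(r"(\d{4})-(\d{2})", "Released 2024-03")
year, month = match.groups()
print(year, month)  # 2024 03

# Named groups (Python syntax: (?P<name>...)).
named = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "Released 2024-03")
print(named.group("year"), named.group("month"))  # 2024 03

# A non-capturing group only groups: here the quantifier applies to "ab".
repeated = re.fullmatch(r"(?:ab)+", "ababab") is not None
print(repeated)  # True

# \1 must match the same text that group 1 captured (doubled words).
doubled_words = re.findall(r"\b(\w+) \1\b", "this is is a test test")
print(doubled_words)  # ['is', 'test']
```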
Lookarounds (look around without consuming)
| Pattern | Means |
|---|---|
| `(?=...)` | Positive lookahead (followed by) |
| `(?!...)` | Negative lookahead (NOT followed by) |
| `(?<=...)` | Positive lookbehind (preceded by) |
| `(?<!...)` | Negative lookbehind (NOT preceded by) |
Example: `\d+(?= €)` matches the number only if it is followed by " €", without including the symbol in the match.
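
The four lookarounds can be sketched as follows, assuming Python's `re` module. The price strings and the q/u example are invented for illustration.

```python
import re

prices = "10 € and 25 $ and 300 €"

# Positive lookahead: digits followed by " €"; the symbol is not consumed.
euros = re.findall(r"\d+(?= €)", prices)
print(euros)  # ['10', '300']

# Negative lookahead: a "q" that is NOT followed by "u".
q_without_u = [m.start() for m in re.finditer(r"q(?!u)", "qatar quit iraq")]
print(q_without_u)  # [0, 14]

# Positive lookbehind: digits preceded by "$".
after_dollar = re.findall(r"(?<=\$)\d+", "$5 and €7 and 9")
print(after_dollar)  # ['5']

# Negative lookbehind: whole numbers NOT preceded by "$".
not_after_dollar = re.findall(r"(?<!\$)\b\d+", "$5 and €7 and 9")
print(not_after_dollar)  # ['7', '9']
```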
Flags (modifiers)
| Flag | Effect |
|---|---|
| `g` | Global (all matches, not just the first) |
| `i` | Case-insensitive |
| `m` | Multiline (`^` and `$` per line) |
| `s` | `.` also matches newlines |
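
Flag letters differ between languages, so the sketch below shows one possible mapping, assuming Python's `re` module. Python has no `g` flag: `re.search` returns the first match, while `re.findall` and `re.finditer` return all of them. The other three flags correspond to `re.IGNORECASE`, `re.MULTILINE` and `re.DOTALL`.

```python
import re

text = "Cat\ncat\nCAT"

# "g" equivalent: search gives the first match, findall gives all of them.
first = re.search(r"cat", text).group()
print(first)  # cat

# "i": ignore case.
all_cases = re.findall(r"cat", text, flags=re.IGNORECASE)
print(all_cases)  # ['Cat', 'cat', 'CAT']

# "m": ^ and $ apply to every line instead of the whole string.
without_m = re.findall(r"^cat$", text, flags=re.IGNORECASE)
with_m = re.findall(r"^cat$", text, flags=re.IGNORECASE | re.MULTILINE)
print(without_m)  # []
print(with_m)     # ['Cat', 'cat', 'CAT']

# "s": let "." match a newline as well.
dot_default = re.search(r"a.b", "a\nb") is not None
dot_all = re.search(r"a.b", "a\nb", flags=re.DOTALL) is not None
print(dot_default, dot_all)  # False True
```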
Ready-to-use patterns
- Email (simple): `^[\w.+-]+@[\w-]+\.[\w.-]+$`
- Phone (intl): `^\+?\d{1,3}[\s-]?\d{6,12}$`
- URL: `^https?:\/\/[^\s]+$`
- ZIP code (US): `^\d{5}(-\d{4})?$`
- Numbers only: `^\d+$`
- Date YYYY-MM-DD: `^\d{4}-\d{2}-\d{2}$`
- Hex color: `^#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$`
- Extra spaces: `\s{2,}`
Note: validating email addresses perfectly with a regex is nearly impossible because the standard is very complex. For most cases, the simple pattern above is enough.
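
One way to put these patterns to work is a small lookup table of compiled regexes. This is a sketch under assumptions: it uses Python's `re` module, and the names `PATTERNS` and `is_valid` are hypothetical helpers, not part of any library. It uses `fullmatch` because, in Python, `$` alone also accepts a trailing newline.

```python
import re

# Hypothetical lookup table built from some of the patterns listed above.
PATTERNS = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$"),
    "zip_us": re.compile(r"^\d{5}(-\d{4})?$"),
    "date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "hex_color": re.compile(r"^#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$"),
}


def is_valid(kind: str, value: str) -> bool:
    """Return True if the whole value matches the pattern named by kind."""
    return PATTERNS[kind].fullmatch(value) is not None


print(is_valid("email", "ana@example.com"))  # True
print(is_valid("email", "ana@example"))      # False
print(is_valid("zip_us", "12345-6789"))      # True
print(is_valid("hex_color", "#ffff"))        # False

# The date pattern checks the shape only, not whether the date exists.
print(is_valid("date", "2024-02-30"))        # True

# "Extra spaces" is a search-and-replace pattern rather than a validator.
collapsed = re.sub(r"\s{2,}", " ", "a   b \n c")
print(collapsed)  # a b c
```

Because the date pattern accepts impossible dates such as February 30, a real date parser is a better second step when the value has to be a valid calendar date.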
Tip: always test your regex
A regex can look correct and fail on edge cases. The best way to refine it is to test it with real text and see what it captures. You can do it instantly with the regex tester on this site, which highlights matches as you type the pattern.
Common mistakes
- Forgetting to escape special characters: to find a literal dot use `\.`, not `.`.
- Unexpected greed: `<.*>` on `<a><b>` captures everything at once; use `<.*?>` to stop at the first `>`.
- Not using the `g` flag when you want all matches.
- Anchoring wrong: without `^` and `$`, a pattern can match a part when you wanted the whole string.
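
The escaping and anchoring mistakes are easy to reproduce. The sketch below assumes Python's `re` module; the file name and the digit string are made-up inputs.

```python
import re

# Mistake: an unescaped dot matches ANY character, so "file_txt" slips through.
unescaped_hit = re.search(r"file.txt", "file_txt") is not None
escaped_hit = re.search(r"file\.txt", "file_txt") is not None
print(unescaped_hit, escaped_hit)  # True False

# re.escape builds the escaped pattern when the text is not known in advance.
safe_pattern = re.escape("file.txt")
print(safe_pattern)  # file\.txt

# Mistake: without ^ and $, five digits anywhere in the string are enough.
unanchored_hit = re.search(r"\d{5}", "abc123456") is not None
anchored_hit = re.search(r"^\d{5}$", "abc123456") is not None
print(unanchored_hit, anchored_hit)  # True False
```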
Conclusion
Regex looks like hieroglyphics until you realize it is made of combinable pieces: classes, quantifiers, anchors and groups. With this cheatsheet you have the essential vocabulary and ready-to-use patterns. Save it, and when you build a new pattern, always test it with real text before trusting it.