Regular Expression Matching
Master this topic with zero to advance depth.
Regular Expression Matching
Given an input string s and a pattern p, implement regular expression matching with support for . and * where:
.Matches any single character.*Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
Visual Representation
s = "aa", p = "a*"
- 'a*' can match "", "a", "aa", "aaa", etc.
- Since it matches "aa", result is True.
s = "ab", p = ".*"
- '.*' matches anything. Result is True.Examples
index 0 matches 'a', but index 1 does not. Pattern is exhausted.
Level I: Brute Force (Recursion)
Intuition
At each step, we check if the first character of the string matches the first character of the pattern. Special handling is needed for '*' which can match zero or more of the preceding element.
Thought Process
- If
pis empty, returntrueifsis also empty. - Check if the first characters match (
s[0] == p[0]orp[0] == '.'). - If
p[1] == '*':- Either skip '*' and preceding char (zero match):
isMatch(s, p[2:]). - Or if first chars match, consume one char of
sand keep pattern:first_match && isMatch(s[1:], p).
- Either skip '*' and preceding char (zero match):
- Else: If first chars match, recurse on
isMatch(s[1:], p[1:]).
Detailed Dry Run
s = "aa", p = "a*"
- p[1] is '*'. Try skip: isMatch("aa", "") -> False.
- Try consume: 'a'=='a', so isMatch("a", "a*").
- Again '', try consume: 'a'=='a', so isMatch("", "a").
- Again '*', try skip: isMatch("", "") -> True.
Level II: Memoization (Top-Down 2D)
Intuition
The recursive regex matching recalculates many overlapping (i, j) states. Adding a 2D memo array indexed by (si, pi) eliminates repeated work.
Visual
s="aa", p="a*"
solve(0,0): p[1]='*' -> zero match: solve(0,2); one+ match: solve(1,0)
solve(1,0): cached after first call!Detailed Dry Run
s="aa", p="a*". memo[0][0]: p[1]='*', try zero: memo[0][2]=T (empty matches empty). Return True.
⚠️ Common Pitfalls & Tips
The '*' operator makes regex DP tricky. Always first try the 'zero match' path (pi+2), then only try 'one+ match' if firstMatch is true to avoid infinite loops.
Level III: Dynamic Programming (Bottom-Up 2D)
Intuition
We use a 2D table dp[i][j] to represent if s[0...i-1] matches p[0...j-1].
Thought Process
dp[0][0] = true(empty string matches empty pattern).- Pattern with '*':
- Case 1: '*' matches zero characters:
dp[i][j] = dp[i][j-2]. - Case 2: '*' matches one or more:
dp[i][j] = dp[i-1][j]ifs[i-1]matchesp[j-2].
- Case 1: '*' matches zero characters:
- Pattern without '*':
- If
s[i-1]matchesp[j-1](either same char or '.'),dp[i][j] = dp[i-1][j-1].
- If
Detailed Dry Run
s = "aa", p = "a*"
| i\j | "" | 'a' | '*' |
|---|---|---|---|
| "" | T | F | T (zero 'a') |
| 'a' | F | T | T (one 'a') |
| 'a' | F | F | T (two 'a') |
⚠️ Common Pitfalls & Tips
Handling the empty string base case for patterns like a*b*c* is tricky. Ensure you initialize dp[0][j] correctly for '*' characters.
Found an issue or have a suggestion?
Help us improve! Report bugs or suggest new features on our Telegram group.