LeetCode 3455 - Shortest Matching Substring
The problem asks us to find the length of the shortest substring in a given string s that matches a pattern string p, where the pattern contains exactly two wildcard '' characters.
Difficulty: 🔴 Hard
Topics: Two Pointers, String, Binary Search, String Matching
Solution
Problem Understanding
The problem asks us to find the length of the shortest substring in a given string s that matches a pattern string p, where the pattern contains exactly two wildcard '*' characters. Each '*' matches any sequence of zero or more characters, which means it can represent nothing, a single character, or a long substring.
The input consists of two strings: s of length up to 100,000 and p of length up to 100,000. The output is a single integer representing the minimum length of a substring of s that satisfies the pattern p, or -1 if no such substring exists. Notably, the empty substring is considered valid if it satisfies the pattern (for example, if the pattern is "**").
Key insights from the constraints:
sandpcan be large, so any algorithm that checks all substrings ofsis too slow.palways contains exactly two'*', which allows us to split it into three fixed parts: a prefix before the first'*', a middle part between the'*', and a suffix after the second'*'.- We must handle the fact that
'*'can match zero characters, meaning the fixed parts can touch or even overlap.
Important edge cases include:
- Patterns consisting solely of wildcards (
"**"), which should match the empty substring. - Cases where the fixed parts of the pattern are longer than
sor cannot be aligned in any order. - Cases where the fixed parts appear multiple times and the algorithm must select the shortest substring.
Approaches
Brute Force
The naive approach would consider every substring of s and check if it matches the pattern p by simulating the wildcard matching. This is correct because it exhaustively explores all possibilities, but it is far too slow: there are O(n²) substrings in s, and checking each substring could take O(m) time, leading to a total complexity of O(n² * m), which is unacceptable for n, m up to 10^5.
Optimal Approach
The key observation is that the pattern has exactly two wildcards, which splits it into three fixed segments. Let these segments be left, middle, and right. To find the shortest matching substring in s, we can:
- Precompute the earliest positions in
swhereleftandmiddleappear in order. - Precompute the latest positions in
swheremiddleandrightappear in reverse order. - For each possible alignment of these segments, calculate the substring length covering them.
- Track the minimum length.
This approach is efficient because it only requires linear scans of s to find potential matches of each segment, avoiding the need to examine every substring explicitly.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n² * m) | O(1) | Check all substrings and match against the pattern |
| Optimal | O(n * m) | O(n) | Use two pointers and precompute forward/backward match positions |
Algorithm Walkthrough
- Split the pattern: Identify the positions of the two
'*'characters and splitpintoleft,middle,right. - Forward matching: Scan
sfrom left to right. For each position, check ifleftmatches starting there. If it does, scan forward to see wheremiddlematches. Record the earliest positions where these segments can align. - Backward matching: Scan
sfrom right to left. For each position, check ifrightmatches ending there. If it does, scan backward to see wheremiddlematches. Record the latest positions where these segments can align. - Combine results: For every valid forward and backward alignment of
left-middle-right, calculate the substring length from the start ofleftto the end ofright. - Return minimum: Keep track of the minimum length found. If no valid alignment exists, return -1. If the pattern is
"**", return 0.
Why it works: This algorithm guarantees that every possible alignment of left, middle, and right in s is considered without explicitly iterating over all substrings. By storing earliest and latest positions, we ensure we find the shortest substring that satisfies the pattern.
Python Solution
class Solution:
def shortestMatchingSubstring(self, s: str, p: str) -> int:
if p == "**":
return 0
stars = [i for i, c in enumerate(p) if c == '*']
left, middle, right = p[:stars[0]], p[stars[0]+1:stars[1]], p[stars[1]+1:]
n = len(s)
# Forward positions
left_pos = []
i = 0
while i <= n - len(left):
if s[i:i+len(left)] == left:
left_pos.append(i)
i += 1
# Backward positions
right_pos = []
i = n - len(right)
while i >= 0:
if s[i:i+len(right)] == right:
right_pos.append(i)
i -= 1
right_pos.reverse()
min_len = float('inf')
for start in left_pos:
mid_start = start + len(left)
# Find earliest right position that comes after start + len(left) + len(middle)
for end in right_pos:
if end >= mid_start:
total_len = end + len(right) - start
min_len = min(min_len, total_len)
break
return min_len if min_len != float('inf') else -1
Explanation: We first check for the trivial "**" case. Then, we split the pattern into three segments. Forward and backward scans find possible start and end positions for left and right. For each forward start of left, we find the earliest right that can form a valid substring covering middle. The minimum length is tracked throughout.
Go Solution
func shortestMatchingSubstring(s string, p string) int {
if p == "**" {
return 0
}
stars := []int{}
for i, c := range p {
if c == '*' {
stars = append(stars, i)
}
}
left, middle, right := p[:stars[0]], p[stars[0]+1:stars[1]], p[stars[1]+1:]
n := len(s)
leftPos := []int{}
for i := 0; i <= n-len(left); i++ {
if s[i:i+len(left)] == left {
leftPos = append(leftPos, i)
}
}
rightPos := []int{}
for i := n - len(right); i >= 0; i-- {
if s[i:i+len(right)] == right {
rightPos = append(rightPos, i)
}
}
for i, j := 0, len(rightPos)-1; i < j; i, j = i+1, j-1 {
rightPos[i], rightPos[j] = rightPos[j], rightPos[i]
}
minLen := n + 1
for _, start := range leftPos {
midStart := start + len(left)
for _, end := range rightPos {
if end >= midStart {
totalLen := end + len(right) - start
if totalLen < minLen {
minLen = totalLen
}
break
}
}
}
if minLen == n+1 {
return -1
}
return minLen
}
Go differences: Slices are used instead of lists, and reversing rightPos is done manually. Otherwise, the logic mirrors Python. Edge cases and substring comparisons are handled identically.
Worked Examples
Example 1: s = "abaacbaecebce", p = "ba*c*ce"
left = "ba",middle = "c",right = "ce"- Forward scan finds
leftat positions 1 and 5. - Backward scan finds
rightat positions 9 and 11. - Alignment: start=5, end=12 → substring
"baecebce"→ length 8.
Example 2: s = "baccbaadbc", p = "cc*baa*adb"
left = "cc",middle = "baa",right = "adb"- Forward scan finds
leftat position 1. - Backward scan finds
rightnot existing → return -1.
Example 3: s = "a", p = "**"
- Trivial wildcard → return 0.
Example 4: s = "madlogic", p = "*adlogi*"
left = "",middle = "adlogi",right = ""- Forward scan finds
middleat index 1 → substring"adlogi"→ length 6.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n * m) | Forward and backward scans check |