LeetCode Solutions Blog

Problem Description

Given a string s and a pattern p that contains exactly two '*' wildcards (each matching any sequence of zero or more characters), return the length of the shortest contiguous substring of s that matches p. The empty substring is considered valid. If no substring of s can match p, return -1.

For example, if s = "abaacbaecebce" and p = "bacce", one matching substring is "baecebce" (of length 8).

Key Insights

The pattern p always contains exactly two '*' characters, so p can be split into three parts: A, B, and C (which could be empty).
A valid match is any substring T of s that can be partitioned as: T = (any prefix) + A + (any string) + B + (any string) + C + (any suffix). In terms of matching, T must contain A, then later B, then later C (with possibly zero characters in between).
To obtain the shortest matching substring, you want to pick occurrences of A, B, and C that are “as close together as possible.”
A good approach is to first find all occurrences (starting indices) of A, B, and C in s. Then, for each occurrence of A you can binary–search for the next occurrence of B that appears after the end of A, and similarly find an occurrence of C after the end of B.
The challenge is handling edge cases when any of the segments (A, B, or C) is empty (recall that p might begin or end with '*' or, in an extreme, be just "**", where the answer would be 0).
Efficient substring search (using for example KMP) and binary search on sorted lists of occurrence indices keeps the solution within acceptable complexity bounds.

Space and Time Complexity

Time Complexity: O(n + m + occ*log(occ)), where n = s.length, m = p.length, and occ is the number of occurrences found for a given segment (in worst-case O(n)). Space Complexity: O(n) to store the occurrence indices.

Solution

We start by splitting p into three segments: A, B, and C, where p = A + '' + B + '' + C. In s, for each segment we use a KMP-based search (or any efficient substring search algorithm) to compute a sorted list of start indices where that segment occurs. (If the segment is empty, we conceptually allow it to “match” at every position; however, for the purpose of minimizing substring length we only need to consider boundary conditions.)

For each occurrence index a for A (if A is nonempty; if empty, any index may serve as a starting point), we determine the end index of the A match (a + |A|). Then, using binary search on all occurrences of B, we try to find an occurrence b such that b ≥ a + |A|. For the found b, the match for B extends to b + |B|. Then, again by binary search on all occurrences of C, we find c such that c ≥ b + |B|. The candidate matching substring spans from a to c + |C| - 1 and its length is (c + |C| - a).

We iterate over all possible occurrences of A in s and take the minimum such substring length. If we cannot form a valid chain, we return -1. Special handling is done when one or more segments are empty. For example, if p equals "**", the answer is 0 as the empty substring is valid.

This solution uses:

KMP (Knuth-Morris-Pratt) algorithm for efficient pattern searching.
Binary search to quickly find the next occurrence index that meets our boundary requirements.

A similar strategy is applied for each of the target languages.

Code Solutions

Below are code solutions in Python, JavaScript, C++, and Java with line-by-line comments.

Shortest Matching Substring

Problem Description

Key Insights

Space and Time Complexity

Solution

Code Solutions