Problem Description
Given two encoded strings s1 and s2 where each string is formed by taking an original string of lowercase letters, splitting it into a sequence of non-empty substrings, and then optionally replacing some of these substrings by their length (as a numeric string), determine whether there exists at least one original string that could have generated both s1 and s2. The encoded strings consist only of lowercase English letters and digits 1–9. The challenge is to “match” the two encoded strings by accounting for the flexibility of numeric replacements.
Key Insights
- The encoded strings mix letters and numeric strings (which represent arbitrary-length substrings) making direct matching non-trivial.
- We can use a recursive (backtracking) approach with memoization (dynamic programming) to simulate the decoding process.
- Maintain two indices (one for each encoded string) and a running “difference” (diff) that tracks the imbalance in the number of literal characters processed (i.e. how many extra letters one decode path has over the other).
- When encountering digits, explore all valid number interpretations (remembering that consecutive digits can form numbers up to 3 digits long).
- Letters must match exactly, but if one side has a pending “skip” (due to a previously encountered number) then the recursion adjusts the diff accordingly.
- The main trick lies in interpreting digits as flexible “gaps” in the original string reconstruction and using DFS to try all valid splits.
Space and Time Complexity
Time Complexity: O(n1 * n2 * maxDiff) in the worst-case scenario, where n1 and n2 are the lengths of s1 and s2 and maxDiff is bounded by the maximum possible digit sum interpreted. Space Complexity: O(n1 * n2 * maxDiff) for the memoization state.
Solution
We use a DFS recursive approach with memoization. The function recursively processes characters in s1 and s2 with pointers i and j, and a “diff” that indicates how many extra decoded characters (letters) one string currently has compared to the other. When a digit is encountered on one side, we try all valid number interpretations (since digits might represent different substring lengths). If a letter is encountered, it must match the corresponding decoded letter from the other string; otherwise, we adjust the diff accordingly. The recursion ends successfully when both indices reach the end of their respective strings and the diff is 0. This solution uses recursion combined with memoization to avoid redundant computations.