Problem Description
Given two strings – word1 and word2 – we need to find an array of indices (of length equal to word2’s length) that is “valid” in the following sense: • The indices are in increasing order. • If you take the characters from word1 at those indices (in order) you get a string that is “almost equal” to word2, meaning that by changing at most one character in that extracted string you can obtain word2. If there are several valid sequences, you must return the lexicographically smallest (i.e. the array which, when compared element‐by‐element, is smallest). If no valid sequence exists, return an empty array.
Key Insights
- Recognize that an “almost equal” string is either exactly equal to word2 or differs in exactly one position.
- First check if an exact subsequence of word2 exists in word1. (An exact match is valid because zero changes is at most one.)
- If no exact match is found, then try to “force” one substitution at some position in word2.
- For each candidate substitution position t in word2, pick the lexicographically smallest set of indices: use greedy two–pointers (backed by precomputed “next occurrence” arrays) to choose the earliest available index for every non–substituted letter; for the substitution position, choose any index (without requiring a character match) provided that the remainder of word2 (which must match exactly) can still be “found.”
- Precomputing for every index in word1 and for every letter (a–z) the next occurrence helps quickly decide whether a complete exact subsequence exists from a given point, and it makes the greedy selection efficient.
- The challenge is to decide when and where to “use up” the allowed substitution so that the final indices array is lexicographically smallest.
Space and Time Complexity
Time Complexity: O(n · 26 + m · (log n or constant))
• Precomputing the next occurrence table takes O(26 · n) time.
• The greedy selection (and checking feasibility for each potential substitution) runs in O(m) worst-case if implemented carefully.
Space Complexity: O(n · 26) for the next–occurrence table, plus O(m) for helper arrays.
Solution
The idea is to “simulate” the subsequence selection from word1 for word2 by using a next–occurrence table:
- Build an array nextPos such that for every index i in word1 (from 0 to n) and for every letter from a to z, nextPos[i][c] gives the next index (>= i) where letter c appears. This allows us to quickly “jump” to the next matching index.
- Attempt a greedy exact–match subsequence. For each position j in word2, starting from position 0 in word1, choose the smallest index i (>= current) where word1[i] equals word2[j] and ensure that the remainder of word2 can still be matched. If this succeeds for all j then return the indices.
- If an exact match is impossible then consider using one substitution. For each possible substitution position t (0-indexed in word2): a. For j from 0 to m–1, if j is not t, choose the smallest index where word1’s letter equals word2[j] (using the nextPos table) while ensuring that the remaining part can be completed exactly. b. For j == t, you are free to choose any index (starting at the current pointer) – choose the smallest possible candidate that still leaves a valid exact sequence for word2[t+1:].
- Compare all valid candidates and choose the lexicographically smallest indices array.
- Return the chosen indices array or an empty array if no candidate is valid.
Key details include handling edge cases (for instance, word2 being longer than any possible subsequence in word1) and ensuring that the feasibility check (for “completing” the match after a chosen index) is done in O(1) time by using precomputed data.
Code Solutions
Below are code solutions with detailed inline comments in Python, JavaScript, C++ and Java.