Problem Description
Given an array of words, construct the shortest possible reference string s where each word appears as a suffix ending at the '#' character. Each word is encoded as the substring in s starting from a specific index up to (but not including) the next '#' character.
Key Insights
- If one word is a suffix of another, only the longer word needs to be explicitly stored in the encoding.
- By reversing the words, the suffix problem becomes a prefix problem, making it easier to detect and ignore redundant words.
- Using data structures such as a Trie or a set can help efficiently eliminate words that are suffixes of other words.
Space and Time Complexity
Time Complexity: O(n * k^2), where n is the number of words and k is the maximum word length. This arises from checking every possible suffix in each word. Space Complexity: O(n * k), used for storing the words and potential Trie nodes or set entries.
Solution
We start by removing duplicate words using a set. For each word, we remove all its suffixes from the set. This ensures that only words which are not suffixes of any other word remain. The final encoded length is simply the sum of each remaining word's length plus one for the '#' separator.
We can use:
- A set to store the words.
- A nested loop to remove every suffix of each word that appears in the set.
- A final pass to sum word length plus the separator for the constructed reference string.