Given an array of unique strings, return the smallest string that contains each string as a substring. No string in the given array is a substring of another. In case of multiple valid answers, any one is accepted.
Key Insights
Precompute the maximum overlap for every pair of strings such that merging them minimizes additional length.
Use dynamic programming with bitmasking to try every possible permutation of words efficiently.
Reconstruct the final superstring using backtracking from the DP table.
The problem is analogous to solving a variant of the traveling salesman problem where the cost is the negative overlap.
Space and Time Complexity
Time Complexity: O(n^2 * 2^n), where n is the number of words.
Space Complexity: O(n * 2^n) for storing DP states and backtracking information.
Solution
We solve the problem using dynamic programming with bitmasking. First, precompute the overlap between every two strings (i and j) which is the maximum length of the suffix of i that is a prefix of j. Then, use a DP state dp[mask][i] that represents the maximum saved overlap when the set of strings given by mask has been used and ends with the i-th string. The state transitions by considering appending a new word j to the current combination, updating the overlap saved. Once the DP table is computed, reconstruct the optimal order by backtracking and finally combine the words by merging them according to the computed overlaps.
Code Solutions
# Python implementation for Shortest Superstringfrom typing import List
defshortestSuperstring(words: List[str])->str: n =len(words)# Precompute the overlaps between words: overlap[i][j] is the maximum suffix of words[i] that is a prefix of words[j] overlap =[[0]* n for _ inrange(n)]for i inrange(n):for j inrange(n):if i == j:continue# Compute maximum overlap between words[i] and words[j] max_len =min(len(words[i]),len(words[j]))for k inrange(max_len,0,-1):if words[i].endswith(words[j][:k]): overlap[i][j]= k
break# dp[mask][i] (mask: bitmask of visited words, i: ending word) stores maximum overlap sum dp =[[0]* n for _ inrange(1<< n)] parent =[[-1]* n for _ inrange(1<< n)]# Iterate over all bitmask statesfor mask inrange(1,1<< n):for j inrange(n):if(mask &(1<< j))==0:continue prev_mask = mask ^(1<< j)if prev_mask ==0:continuefor i inrange(n):if prev_mask &(1<< i): value = dp[prev_mask][i]+ overlap[i][j]if value > dp[mask][j]: dp[mask][j]= value
parent[mask][j]= i
# Find the end word that gives maximal overlap sum max_overlap =-1 last =-1 full_mask =(1<< n)-1for i inrange(n):if dp[full_mask][i]> max_overlap: max_overlap = dp[full_mask][i] last = i
# Reconstruct the path for maximum overlap path =[] mask = full_mask
while last !=-1: path.append(last) temp = parent[mask][last] mask ^=(1<< last) last = temp
path.reverse()# Build the superstring using the path and computed overlaps superstring = words[path[0]]for idx inrange(1,len(path)): i, j = path[idx-1], path[idx]# Append the non-overlapping part of words[j] superstring += words[j][overlap[i][j]:]# Append remaining words not in path (if any) - though by problem statement all words appear in path used =set(path)for i inrange(n):if i notin used: superstring += words[i]return superstring
# Example usage:if __name__ =="__main__": test_words =["catg","ctaagt","gcta","ttca","atgcatc"]print(shortestSuperstring(test_words))