LeetCode Solutions Blog

Substring with Concatenation of All Words

Number: 30

Difficulty: Hard

Paid? No

Problem Description

Given a string s and an array of strings words (where all words have the same length), find all starting indices in s of substrings that are a concatenation of each word in words exactly once and without any intervening characters.

Key Insights

Use the fact that all words are of the same length to slide through s in fixed-size segments.
Create a frequency map for the words in the array to know the required counts.
Use a sliding window approach with different starting offsets (0 to wordLength-1) to maintain alignment.
Expand the window by checking word-sized chunks and adjust (slide) the window when the frequency of a word exceeds its allowed count.
When the window contains exactly all words, record the starting index.

Space and Time Complexity

Time Complexity: O(n * m) where n is the length of s and m is the number of words (each word check takes constant time due to fixed word length splitting). Space Complexity: O(m) for the frequency map used for the words.

Solution

The solution uses a sliding window exhibiting multiple starting offsets to ensure each word chunk is properly aligned. For each offset, the window slides in increments of wordLength. A hash map tracks the current counts of words found. When a word's count exceeds its allowed frequency (from the input words array), the window is shrunk from the left until the condition is met. If the window size matches total words, the starting index is recorded. This approach efficiently checks each possible concatenation substring.

Code Solutions

# Python solution with detailed comments
class Solution:
    def findSubstring(self, s: str, words: [str]) -> [int]:
        if not s or not words:
            return []
        word_length = len(words[0])
        total_words = len(words)
        window_size = word_length * total_words
        if len(s) < window_size:
            return []
        
        # Build the frequency map of words
        word_freq = {}
        for word in words:
            word_freq[word] = word_freq.get(word, 0) + 1
        
        results = []
        # Check each offset to account for alignment
        for i in range(word_length):
            left = i
            current_count = {}
            count = 0  # current number of words matched
            # Slide the window in increments of word_length
            for right in range(i, len(s) - word_length + 1, word_length):
                curr_word = s[right:right+word_length]
                if curr_word in word_freq:
                    current_count[curr_word] = current_count.get(curr_word, 0) + 1
                    count += 1
                    # Shrink window if frequency is exceeded
                    while current_count[curr_word] > word_freq[curr_word]:
                        left_word = s[left:left+word_length]
                        current_count[left_word] -= 1
                        left += word_length
                        count -= 1
                    # If window matches all words, record the index
                    if count == total_words:
                        results.append(left)
                else:
                    # Reset the window if the word is not in the words list
                    current_count.clear()
                    count = 0
                    left = right + word_length
        return results