We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Number of Matching Subsequences

Number: 808

Difficulty: Medium

Paid? No

Companies: Uber, Google, Salesforce


Problem Description

Given a string s and an array of strings words, determine the number of words in words that are subsequences of s. A subsequence is formed by deleting zero or more characters from s without changing the order of the remaining characters.


Key Insights

  • Instead of checking each word individually against s, leverage the fact that s is fixed and iterate through it to "advance" pointers for each word.
  • Use a mapping (or bucket) from characters to lists of iterators/pointers representing the next character needed by each word.
  • As we iterate over s, update the waiting lists and count when a word is fully matched.
  • This approach avoids redundant scanning and significantly reduces time complexity.

Space and Time Complexity

Time Complexity: O(n + m) where n is the length of s and m is the total number of characters in all words
Space Complexity: O(m) for storing the waiting pointers for each character in all words


Solution

We initialize a dictionary (hash map) that maps each letter (a-z) to a list of word iterators or pointers. Each word in words is associated with its starting character. Then, iterate through each character in s. For the current character, retrieve and clear its waiting list and advance each word’s pointer. If a word is completely matched (i.e., all its characters have been processed), increment the result counter. Otherwise, push the word pointer into the bucket corresponding to its next required character. This technique efficiently checks for subsequences in a single pass over s with minimal repeated scanning.


Code Solutions

# Python solution with line-by-line comments
from collections import defaultdict, deque

def numMatchingSubseq(s, words):
    # Dictionary mapping from character to a deque of iterators over words
    waiting = defaultdict(deque)
    
    # Initialize waiting lists with an iterator for each word starting with its first character
    for word in words:
        # Append the word itself as the sequence, waiting for its first character (or next character pointer)
        waiting[word[0]].append(word)

    count = 0  # To count the number of words that are subsequences
    # Iterate over each character in s
    for char in s:
        # Access the list of words waiting for the current character
        current_bucket = waiting[char]
        # Clear the bucket as we are going to process all the current waiting words
        waiting[char] = deque()
        
        # Process each word that was waiting for the current character
        while current_bucket:
            word = current_bucket.popleft()
            # If the word length is 1, then current character was the final needed character
            if len(word) == 1:
                count += 1   # Word is fully matched
            else:
                # Move to the next character in the word and add it to the appropriate bucket
                waiting[word[1]].append(word[1:])
                
    return count

# Example usage:
s = "abcde"
words = ["a", "bb", "acd", "ace"]
print(numMatchingSubseq(s, words))  # Expected output: 3
← Back to All Questions