We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Find All Good Strings

Number: 1513

Difficulty: Hard

Paid? No

Companies: Google, Dunzo


Problem Description

Given two strings s1 and s2 of length n, and another string evil, count the number of "good" strings of length n that are lexicographically between s1 and s2 (inclusive) and do not contain evil as a substring. Return the answer modulo 10^9+7.


Key Insights

  • The problem requires counting strings with multiple constraints (lexicographical bounds and forbidden substring).
  • A dynamic programming (DP) solution is natural, where states track the current position, whether the prefix is still bound by s1 and/or s2, and how much of evil has been matched so far.
  • The forbidden substring constraint can be efficiently handled using a string matching automaton built by the KMP algorithm. This automaton allows quick transition from one state (number of matched characters) to another.
  • The DP state is defined as DP[pos][tight_lower][tight_upper][match_state] where:
    • pos: current index in the string.
    • tight_lower: whether the current prefix equals the prefix of s1 (and thus lower bound is active).
    • tight_upper: whether the current prefix equals the prefix of s2 (upper bound active).
    • match_state: the length of the prefix of evil that has been matched so far.
  • Recurrence is done by iterating through all valid characters (taking into account lexicographical restrictions) and updating the match_state using the precomputed transitions from the automaton.
  • The answer is the sum of all valid complete strings modulo 10^9+7.

Space and Time Complexity

Time Complexity: O(n * m * 2 * 2 * 26) where n is the string length and m is the length of evil (since m <= 50, this is acceptable). Space Complexity: O(n * m * 2 * 2) for the DP memoization table.


Solution

We solve the problem using a digit DP (or state DP) approach combined with a KMP-style automaton to keep track of whether the forbidden substring "evil" is being formed. The automaton precomputes, for each state (which indicates how many characters of evil have been matched) and for each possible next character, the new state after reading that character. Then, the DP function recurses over the positions of the string with flags indicating whether we have matched the lower bound s1 or the upper bound s2. If at any point the match_state reaches the length of evil, that branch is immediately pruned (as the evil substring appears). We carefully count the number of valid completions. The use of memoization ensures that each state is computed only once.


Code Solutions

Below are the code solutions with line-by-line comments in Python, JavaScript, C++, and Java.

MOD = 10**9 + 7

def build_kmp(evil):
    m = len(evil)
    # Build the KMP (failure) table for evil string.
    lps = [0] * m
    j = 0  # length of previous longest prefix suffix
    for i in range(1, m):
        while j > 0 and evil[i] != evil[j]:
            j = lps[j-1]
        if evil[i] == evil[j]:
            j += 1
            lps[i] = j
    # Build transition table for each state and next character.
    trans = [[0]*26 for _ in range(m)]
    for state in range(m):
        for c in range(26):
            ch = chr(ord('a') + c)
            j = state
            while j > 0 and evil[j] != ch:
                j = lps[j-1]
            if evil[j] == ch:
                j += 1
            trans[state][c] = j
    return trans

def findGoodStrings(n, s1, s2, evil):
    m = len(evil)
    trans = build_kmp(evil)
    
    from functools import lru_cache
    @lru_cache(maxsize=None)
    def dp(pos, tightLow, tightHigh, matched):
        # If we have formed the evil substring, return 0.
        if matched == m:
            return 0
        # All characters processed, valid string found.
        if pos == n:
            return 1
        
        # Determine range of characters at current position.
        low_limit = ord(s1[pos]) if tightLow else ord('a')
        high_limit = ord(s2[pos]) if tightHigh else ord('z')
        total = 0
        for char_code in range(low_limit, high_limit+1):
            # compute the new match state for evil string
            c_index = char_code - ord('a')
            new_matched = trans[matched][c_index]
            # next state for bounds.
            new_tightLow = tightLow and (char_code == ord(s1[pos]))
            new_tightHigh = tightHigh and (char_code == ord(s2[pos]))
            total = (total + dp(pos+1, new_tightLow, new_tightHigh, new_matched)) % MOD
        return total

    return dp(0, True, True, 0)

# Example usage:
print(findGoodStrings(2, "aa", "da", "b"))
← Back to All Questions