We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Find the Original Typed String II

Number: 3618

Difficulty: Hard

Paid? No

Companies: N/A


Problem Description

Alice tries to type a specific string but sometimes presses a key for too long so that a character appears multiple times. Given the final output string (word) and a positive integer k, determine the total number of possible original strings Alice might have intended such that the intended string’s length is at least k. Each contiguous block of the same letter in the final output comes from one intended key press that might have produced extra repeated letters. The answer is to be returned modulo 10^9 + 7.


Key Insights

  • The final output can be split into groups where each group consists of consecutive identical letters.
  • For each group of length n, the intended press could have produced any count from 1 to n.
  • The total number of possible original strings (without length restriction) is the product of the possibilities for each group.
  • Since the intended string must have a length of at least k, we subtract those combinations where the sum of chosen counts (across groups) is less than k.
  • When the number of groups (m) is at least k, the minimum possible length is m which is ≥ k, and therefore every possible combination is valid.
  • A dynamic programming (DP) approach using prefix sums efficiently computes, for small m (m < k), the number of ways to achieve a total intended length below k.

Space and Time Complexity

Time Complexity: O(m * k) where m is the number of groups; note that if m ≥ k, no DP is needed. Space Complexity: O(k) for the DP array.


Solution

The solution first parses the given word into groups of consecutive identical characters. For each group with count = n, there are n possibilities (the intended character could have been typed 1 to n times). The overall number for all groups without restriction is the product of these counts (modulo 10^9+7).

However, not every combination is allowed: the intended string length must be at least k. If m (the number of groups) is at least k, then every combination automatically meets the required length. Otherwise, we use DP to count the number of combinations that yield a total length less than k. The DP state dp[s] represents the number of ways to get an intended length sum exactly equal to s from the processed groups. When processing a new group, we update dp using prefix sums to quickly compute the sum over the range of allowable counts (from 1 to the group’s count). Finally, we subtract the invalid combinations (which yield length < k) from the total combinations.

The solution is implemented in multiple languages as shown below.


Code Solutions

MOD = 10**9 + 7

def countOriginalStrings(word, k):
    n = len(word)
    
    # Group the word into counts of consecutive identical characters.
    groups = []
    count = 1
    for i in range(1, n):
        if word[i] == word[i-1]:
            count += 1
        else:
            groups.append(count)
            count = 1
    groups.append(count)
    
    m = len(groups)
    total = 1
    # Total ways without considering the minimum length condition.
    for cnt in groups:
        total = (total * cnt) % MOD
    
    # If the number of groups is at least k, every combination is valid.
    if m >= k:
        return total

    # dp[s] = number of ways to achieve a total intended length s.
    dp = [0] * (k)
    dp[0] = 1
    
    for cnt in groups:
        new_dp = [0] * (k)
        # Build a prefix sum of the current dp to allow O(1) range queries.
        prefix_dp = [0] * (k + 1)
        for s in range(k):
            prefix_dp[s+1] = (prefix_dp[s] + dp[s]) % MOD
        # For each possible new total length 'new_sum'
        for new_sum in range(k):
            # Intended count from this group can be any x in [1, cnt] such that new_sum - x >=0.
            low_bound = max(0, new_sum - cnt)
            high_bound = new_sum  # inclusive: dp[new_sum - x] for x from 1 to cnt gives indexes [new_sum-cnt, new_sum-1]
            new_dp[new_sum] = (prefix_dp[high_bound+1] - prefix_dp[low_bound]) % MOD
        dp = new_dp

    # Sum dp[s] for all s that do not meet the required length (i.e. s < k)
    # The minimum possible sum after m groups is m, so we sum dp[s] for s from m to k-1.
    invalid = sum(dp[m:k]) % MOD
    result = (total - invalid) % MOD
    return result

# Example test cases:
print(countOriginalStrings("aabbccdd", 7))  # Expected output: 5
print(countOriginalStrings("aabbccdd", 8))  # Expected output: 1
print(countOriginalStrings("aaabbb", 3))    # Expected output: 8
← Back to All Questions