We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Extra Characters in a String

Number: 2755

Difficulty: Medium

Paid? No

Companies: Amazon, Meta, Bloomberg, Uber, Microsoft


Problem Description

Given a string s and a dictionary of words, break s into one or more non-overlapping substrings such that each substring is present in the dictionary. Some characters in s may not be part of any dictionary word; these are considered extra. The goal is to minimize the number of extra characters left after optimally breaking the string.


Key Insights

  • Use dynamic programming to compute the minimum extra characters for each starting position in the string.
  • Store dp[i] as the minimum number of extra characters for the substring starting at index i.
  • Preprocess the dictionary into a hash set for O(1) lookup of words.
  • For each index i, initially consider s[i] as an extra character and then try to match every possible dictionary word starting at i.
  • Update the dp state by considering valid dictionary matches which help to skip multiple extra characters.

Space and Time Complexity

Time Complexity: O(n^2 * k), where n is the length of s and k is the average length of words (for substring extraction and comparison).
Space Complexity: O(n) for the dp array and O(d) for the dictionary hash set, where d is the total number of dictionary words.


Solution

We solve the problem using dynamic programming. We define dp[i] as the minimum number of extra characters when considering the substring from index i to the end. Starting from the end, for each index i, we assume that the character s[i] is extra, thus setting dp[i] to 1 + dp[i+1]. Then, we iterate over subsequent indices j and check if the substring s[i:j+1] is in the dictionary. If it is, we update dp[i] to be the minimum of its current value and dp[j+1], since forming a valid word here means we do not count these characters as extra. The final answer is found at dp[0].


Code Solutions

# Python solution for the problem "Extra Characters in a String"

def minExtraChar(s, dictionary):
    n = len(s)
    # Convert dictionary to a set for fast lookup
    word_set = set(dictionary)
    
    # dp[i] represents the minimal extra characters for substring s[i:]
    dp = [0] * (n + 1)
    dp[n] = 0  # Base case: no extra characters when reaching the end
    
    # Process each index from the end to the beginning
    for i in range(n - 1, -1, -1):
        # Initially, treat s[i] as an extra character
        dp[i] = 1 + dp[i + 1]
        # Check every possible substring starting at index i
        for j in range(i, n):
            if s[i:j+1] in word_set:
                dp[i] = min(dp[i], dp[j+1])
    return dp[0]

# Example usage
print(minExtraChar("leetscode", ["leet", "code", "leetcode"]))  # Expected output: 1
← Back to All Questions