We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Longest Substring with At Least K Repeating Characters

Number: 395

Difficulty: Medium

Paid? No

Companies: Google, TikTok, Amazon, Yandex, Microsoft, Apple, Baidu


Problem Description

Given a string and an integer k, the task is to find the length of the longest substring in which every character appears at least k times. If no such substring exists, return 0.


Key Insights

  • The valid substring must have every character's frequency greater than or equal to k.
  • A sliding window approach can be employed by controlling the number of unique characters within the window.
  • Iterating over all possible unique character counts (from 1 to 26) allows us to systematically explore all valid substrings.
  • Alternatively, a divide and conquer approach splits the string on characters that fail the k-occurrence criteria.

Space and Time Complexity

Time Complexity: O(n * 26) = O(n), since we iterate over the string for each unique character count. Space Complexity: O(26) = O(1), for the frequency counter array.


Solution

We use a sliding window approach that iterates over the possible counts of unique characters (from 1 to 26). For each unique count:

  1. Expand the window while updating the frequency counts.
  2. Maintain the number of unique characters and the number meeting the k threshold.
  3. Contract the window when the number of unique characters exceeds the current target.
  4. If the number of unique characters equals the number meeting the k threshold, update the maximum length.

This approach efficiently explores all possible windows that can form a valid substring.


Code Solutions

# Python solution with line-by-line comments

def longestSubstring(s: str, k: int) -> int:
    max_len = 0
    n = len(s)
    # Check for each potential number of unique characters in the substring
    for target_unique in range(1, 27):  # There are at most 26 unique characters
        count = [0] * 26  # Frequency counter for current window
        start, end = 0, 0  # Sliding window pointers
        unique_count = 0   # Current number of unique characters in window
        count_at_least_k = 0  # Number of characters meeting the k threshold
        
        while end < n:
            # Add character at end pointer
            if count[ord(s[end]) - ord('a')] == 0:
                unique_count += 1  # New unique character encountered
            count[ord(s[end]) - ord('a')] += 1
            if count[ord(s[end]) - ord('a')] == k:
                count_at_least_k += 1  # Character frequency reached k
            end += 1
            
            # Shrink window from left if unique characters exceed target
            while unique_count > target_unique:
                if count[ord(s[start]) - ord('a')] == k:
                    count_at_least_k -= 1
                count[ord(s[start]) - ord('a')] -= 1
                if count[ord(s[start]) - ord('a')] == 0:
                    unique_count -= 1
                start += 1
            
            # If window is valid update max_len
            if unique_count == target_unique and unique_count == count_at_least_k:
                max_len = max(max_len, end - start)
    return max_len

# Example test cases
if __name__ == "__main__":
    print(longestSubstring("aaabb", 3))  # Expected output: 3
    print(longestSubstring("ababbc", 2))  # Expected output: 5
← Back to All Questions