Given a string and an integer k, the task is to find the length of the longest substring in which every character appears at least k times. If no such substring exists, return 0.
Key Insights
The valid substring must have every character's frequency greater than or equal to k.
A sliding window approach can be employed by controlling the number of unique characters within the window.
Iterating over all possible unique character counts (from 1 to 26) allows us to systematically explore all valid substrings.
Alternatively, a divide and conquer approach splits the string on characters that fail the k-occurrence criteria.
Space and Time Complexity
Time Complexity: O(n * 26) = O(n), since we iterate over the string for each unique character count.
Space Complexity: O(26) = O(1), for the frequency counter array.
Solution
We use a sliding window approach that iterates over the possible counts of unique characters (from 1 to 26). For each unique count:
Expand the window while updating the frequency counts.
Maintain the number of unique characters and the number meeting the k threshold.
Contract the window when the number of unique characters exceeds the current target.
If the number of unique characters equals the number meeting the k threshold, update the maximum length.
This approach efficiently explores all possible windows that can form a valid substring.
Code Solutions
# Python solution with line-by-line commentsdeflongestSubstring(s:str, k:int)->int: max_len =0 n =len(s)# Check for each potential number of unique characters in the substringfor target_unique inrange(1,27):# There are at most 26 unique characters count =[0]*26# Frequency counter for current window start, end =0,0# Sliding window pointers unique_count =0# Current number of unique characters in window count_at_least_k =0# Number of characters meeting the k thresholdwhile end < n:# Add character at end pointerif count[ord(s[end])-ord('a')]==0: unique_count +=1# New unique character encountered count[ord(s[end])-ord('a')]+=1if count[ord(s[end])-ord('a')]== k: count_at_least_k +=1# Character frequency reached k end +=1# Shrink window from left if unique characters exceed targetwhile unique_count > target_unique:if count[ord(s[start])-ord('a')]== k: count_at_least_k -=1 count[ord(s[start])-ord('a')]-=1if count[ord(s[start])-ord('a')]==0: unique_count -=1 start +=1# If window is valid update max_lenif unique_count == target_unique and unique_count == count_at_least_k: max_len =max(max_len, end - start)return max_len
# Example test casesif __name__ =="__main__":print(longestSubstring("aaabb",3))# Expected output: 3print(longestSubstring("ababbc",2))# Expected output: 5