Problem Description
Given a string s consisting of lowercase English letters, the variance of a substring is defined as the largest difference between the number of occurrences of any 2 characters present in that substring. The goal is to find the maximum variance among all possible substrings of s. Note that the two characters may be the same or different, but the substring must contain at least one instance of both characters to consider a non-zero variance.
Key Insights
- The problem can be broken down by considering every ordered pair of distinct characters (major and minor).
- For a fixed pair, the task reduces to finding a substring where (count(major) - count(minor)) is maximized, while ensuring that at least one minor exists.
- A modified version of Kadane’s algorithm can be used. Here, we traverse the string, incrementing the difference when encountering the major character and decrementing when encountering the minor character.
- To account for cases where the minor character might not be encountered enough or the running difference becomes negative, we reset the counters appropriately based on future potential.
- Looping over all pairs (with 26 letters, this results in 26*25 combinations) leads to an overall linear-time approach for each pair – resulting in an acceptable complexity given n ≤ 10⁴.
Space and Time Complexity
Time Complexity: O(2625n) ≈ O(n) (constant factor iteration over all letter pairs)
Space Complexity: O(1)
Solution
For each ordered pair of distinct characters:
- Initialize a current difference (diff) between the count of the major and minor characters.
- Maintain a flag to check if at least one minor character has been encountered during the current scan.
- Optionally, count the remaining occurrence of the minor character in the entire string to decide reset conditions.
- Traverse the string:
- Increase diff when seeing the major character.
- Decrease diff when seeing the minor character.
- If at any iteration the diff becomes negative and there is a possibility to encounter the minor again (i.e. remaining minor count > 0), reset the running diff.
- Update the maximum variance when the minor has been seen at least once.
- Repeat for all valid ordered pairs and return the maximum variance found.
This approach leverages the idea behind Kadane’s algorithm with modifications to handle the requirement that both characters must appear at least once in the substring to consider its variance.