We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Substring With Largest Variance

Number: 2360

Difficulty: Hard

Paid? No

Companies: Amazon


Problem Description

Given a string s consisting of lowercase English letters, the variance of a substring is defined as the largest difference between the number of occurrences of any 2 characters present in that substring. The goal is to find the maximum variance among all possible substrings of s. Note that the two characters may be the same or different, but the substring must contain at least one instance of both characters to consider a non-zero variance.


Key Insights

  • The problem can be broken down by considering every ordered pair of distinct characters (major and minor).
  • For a fixed pair, the task reduces to finding a substring where (count(major) - count(minor)) is maximized, while ensuring that at least one minor exists.
  • A modified version of Kadane’s algorithm can be used. Here, we traverse the string, incrementing the difference when encountering the major character and decrementing when encountering the minor character.
  • To account for cases where the minor character might not be encountered enough or the running difference becomes negative, we reset the counters appropriately based on future potential.
  • Looping over all pairs (with 26 letters, this results in 26*25 combinations) leads to an overall linear-time approach for each pair – resulting in an acceptable complexity given n ≤ 10⁴.

Space and Time Complexity

Time Complexity: O(2625n) ≈ O(n) (constant factor iteration over all letter pairs)
Space Complexity: O(1)


Solution

For each ordered pair of distinct characters:

  1. Initialize a current difference (diff) between the count of the major and minor characters.
  2. Maintain a flag to check if at least one minor character has been encountered during the current scan.
  3. Optionally, count the remaining occurrence of the minor character in the entire string to decide reset conditions.
  4. Traverse the string:
    • Increase diff when seeing the major character.
    • Decrease diff when seeing the minor character.
    • If at any iteration the diff becomes negative and there is a possibility to encounter the minor again (i.e. remaining minor count > 0), reset the running diff.
    • Update the maximum variance when the minor has been seen at least once.
  5. Repeat for all valid ordered pairs and return the maximum variance found.

This approach leverages the idea behind Kadane’s algorithm with modifications to handle the requirement that both characters must appear at least once in the substring to consider its variance.


Code Solutions

# Python solution for Substring With Largest Variance

class Solution:
    def largestVariance(self, s: str) -> int:
        # Get a set of unique characters in the string
        unique_chars = set(s)
        max_variance = 0

        # Iterate over each ordered pair (major, minor)
        for major in unique_chars:
            for minor in unique_chars:
                # Skip if both characters are the same
                if major == minor:
                    continue

                # Initialize variables for the current pair
                diff = 0                   # Current difference between counts
                minor_seen = False         # Flag to see if at least one minor has been encountered
                # Count total occurrences of minor in the string (for potential resets)
                minor_remaining = s.count(minor)

                for char in s:
                    if char == major:
                        diff += 1
                    if char == minor:
                        diff -= 1
                        minor_remaining -= 1
                        minor_seen = True

                    # Only update result if at least one minor has been seen
                    if minor_seen:
                        max_variance = max(max_variance, diff)
                    
                    # Reset condition: if diff drops below 0 and there are still minors available to potentially contribute
                    if diff < 0 and minor_remaining > 0:
                        diff = 0
                        minor_seen = False

        return max_variance

# Example usage:
# sol = Solution()
# print(sol.largestVariance("aababbb"))
← Back to All Questions