We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Minimum Operations to Make Character Frequencies Equal

Number: 3638

Difficulty: Hard

Paid? No

Companies: N/A


Problem Description

Given a string s we want to “fix” it using a set of allowed operations so that every character that remains appears the same number of times. The allowed operations are:

  1. Delete any character.
  2. Insert any character.
  3. Change a character to its next letter in the alphabet (note: you cannot change 'z' to 'a').

A string t is called “good” if every character in it occurs the same number of times. Return the minimum number of operations required to transform s into a good string.

For example, in s = "aaabc" one optimal sequence is to change one occurrence of 'a' to 'b' and then insert one extra 'c' so that frequencies become {a:2, b:2, c:2}.


Key Insights

  • The final good string contains some set of characters (possibly a contiguous block in the alphabet due to the “increment‐only” transformation) each with exactly k copies.
  • Characters that are not used in the final string must be deleted.
  • For characters that are used, you can “adjust” their frequency by: • Deleting excess occurrences. • Inserting missing occurrences. • “Pushing” surplus from a lower letter into the next letter (at cost = 1 per shift) instead of doing a deletion plus insertion (which would cost 2).
  • Because the allowed change operation only increments a character by one step, an optimal strategy is to choose a contiguous segment of letters (for example, from letter L to letter R) to appear in the final string. Then simulate from L to R and “push” any surplus forward.
  • For each candidate contiguous segment and target frequency k (for every used character), the cost is the sum of: • Deleting every character outside the segment. • For each letter in the segment, a “greedy” simulation that uses its raw frequency plus any surplus carried (via conversion) from the previous letter. If a deficit is found, you pay via insertion; if there is a surplus you “shift” it upward at cost 1 per character.
  • Finally, the answer is the minimum total cost over all segments (including the “empty” string option, which is equivalent to deleting all characters) and all candidate k’s.

Space and Time Complexity

Time Complexity: O(26^2 * M) where M is the maximum candidate target frequency k for a chosen segment (in worst-case M can be O(n)); overall worst-case O(n) with a small constant. Space Complexity: O(1) extra space besides the frequency counter (using an array of size 26).


Solution

The idea is to first count the frequency of each letter. Then, for every contiguous segment [L,R] of the alphabet (these are the candidate letters to keep in the final string), we compute the cost to “adjust” the counts so that every letter in the segment ends up with exactly k occurrences. In this simulation: – All characters outside the segment are deleted (each costing 1). – For letters within [L, R] we process from L upward. For the current letter, we consider the “available” occurrences (its own frequency plus any surplus carried from the previous letter after adjustments). If that total is less than k, we must insert (costing 1 per insertion) to reach k. If it is more than k, we “push” the extra occurrences to the next letter (each extra occurrence costing 1 for the transformation to the next letter). Because a change operation (i.e. shifting an occurrence from a letter to the next) costs exactly 1—which is always better than doing a deletion plus insertion (cost 2) when shifting between adjacent letters—this simulation yields the minimal cost for that segment when trying to “equalize” them to frequency k. We try all possible segments and candidate k (from 0 up to the total available in the segment) and take the overall minimum cost. (Note: k = 0 means emptying the segment; overall that option is equivalent to deleting every character.) This approach uses a greedy/dynamic‐programming style “chain” over letters in the segment.


Code Solutions

# Python solution with detailed comments
class Solution:
    def minOperations(self, s: str) -> int:
        # Count frequency for each lowercase letter 'a' to 'z'
        freq = [0] * 26
        for ch in s:
            freq[ord(ch) - ord('a')] += 1

        n = len(s)
        best = n  # worst-case: delete everything
        
        # Consider the option of deleting all characters (i.e. empty string)
        best = min(best, n)
        
        # Try every contiguous segment [L, R] as the candidate final letters.
        for L in range(26):
            for R in range(L, 26):
                # Total deletion cost for letters outside the segment [L, R]
                deletion_cost = sum(freq[:L]) + sum(freq[R+1:])
                
                # total available inside the segment (this defines an upper bound for k)
                available_total = sum(freq[L:R+1])
                # Try candidate target frequency k for each letter in the segment
                # k can also be larger than the raw frequency (requiring insertions)
                # but no need to try k > available_total because that would cost too high.
                for k in range(0, available_total + 1):
                    cost_for_segment = 0
                    carried = 0
                    possible = True
                    # Process letters in the segment in order
                    for i in range(L, R+1):
                        # available occurrences at letter i:
                        current = freq[i] + carried
                        if current < k:
                            # deficit; we need to insert missing occurrences at cost 1 each
                            cost_for_segment += (k - current)
                            # after insertion we have exactly k occurrences, and no surplus to pass on
                            carried = 0
                        else:
                            # surplus: convert the extra occurrences to the next letter 
                            surplus = current - k
                            cost_for_segment += surplus  # cost = 1 per conversion
                            carried = surplus
                    # Total cost is cost in segment plus deletion outside the segment.
                    total_cost = deletion_cost + cost_for_segment
                    best = min(best, total_cost)
        return best

# Example usage:
# sol = Solution()
# print(sol.minOperations("aaabc"))
← Back to All Questions