Problem Description
Given a string s we want to “fix” it using a set of allowed operations so that every character that remains appears the same number of times. The allowed operations are:
- Delete any character.
- Insert any character.
- Change a character to its next letter in the alphabet (note: you cannot change 'z' to 'a').
A string t is called “good” if every character in it occurs the same number of times. Return the minimum number of operations required to transform s into a good string.
For example, in s = "aaabc" one optimal sequence is to change one occurrence of 'a' to 'b' and then insert one extra 'c' so that frequencies become {a:2, b:2, c:2}.
Key Insights
- The final good string contains some set of characters (possibly a contiguous block in the alphabet due to the “increment‐only” transformation) each with exactly k copies.
- Characters that are not used in the final string must be deleted.
- For characters that are used, you can “adjust” their frequency by: • Deleting excess occurrences. • Inserting missing occurrences. • “Pushing” surplus from a lower letter into the next letter (at cost = 1 per shift) instead of doing a deletion plus insertion (which would cost 2).
- Because the allowed change operation only increments a character by one step, an optimal strategy is to choose a contiguous segment of letters (for example, from letter L to letter R) to appear in the final string. Then simulate from L to R and “push” any surplus forward.
- For each candidate contiguous segment and target frequency k (for every used character), the cost is the sum of: • Deleting every character outside the segment. • For each letter in the segment, a “greedy” simulation that uses its raw frequency plus any surplus carried (via conversion) from the previous letter. If a deficit is found, you pay via insertion; if there is a surplus you “shift” it upward at cost 1 per character.
- Finally, the answer is the minimum total cost over all segments (including the “empty” string option, which is equivalent to deleting all characters) and all candidate k’s.
Space and Time Complexity
Time Complexity: O(26^2 * M) where M is the maximum candidate target frequency k for a chosen segment (in worst-case M can be O(n)); overall worst-case O(n) with a small constant. Space Complexity: O(1) extra space besides the frequency counter (using an array of size 26).
Solution
The idea is to first count the frequency of each letter. Then, for every contiguous segment [L,R] of the alphabet (these are the candidate letters to keep in the final string), we compute the cost to “adjust” the counts so that every letter in the segment ends up with exactly k occurrences. In this simulation: – All characters outside the segment are deleted (each costing 1). – For letters within [L, R] we process from L upward. For the current letter, we consider the “available” occurrences (its own frequency plus any surplus carried from the previous letter after adjustments). If that total is less than k, we must insert (costing 1 per insertion) to reach k. If it is more than k, we “push” the extra occurrences to the next letter (each extra occurrence costing 1 for the transformation to the next letter). Because a change operation (i.e. shifting an occurrence from a letter to the next) costs exactly 1—which is always better than doing a deletion plus insertion (cost 2) when shifting between adjacent letters—this simulation yields the minimal cost for that segment when trying to “equalize” them to frequency k. We try all possible segments and candidate k (from 0 up to the total available in the segment) and take the overall minimum cost. (Note: k = 0 means emptying the segment; overall that option is equivalent to deleting every character.) This approach uses a greedy/dynamic‐programming style “chain” over letters in the segment.