Problem Description
Given a string word and an integer k, we say that word is k‑special if for any two characters present in word, the difference in their frequency (after optionally deleting some characters) is at most k. In other words, if you let freq(x) denote the frequency of character x in the final string, then for every pair of characters present in that string the condition |freq(x) - freq(y)| ≤ k holds. The task is to find the minimum number of deletions required to make the string k‑special.
Key Insights
- There are only 26 lowercase letters, so we only have up to 26 frequency counts.
- For each letter, you can choose to keep some occurrences (from 0 up to its original count); if a letter is dropped completely, it does not contribute to the k-special condition.
- To guarantee the k‑special property for the remaining letters, their kept frequencies must all lie within a contiguous interval [L, R] such that R − L ≤ k.
- For a fixed interval [L, L+k], if a letter’s frequency f is below L then it cannot “raise” its frequency (we are only allowed to delete) so that letter must be dropped; if f is between L and L+k, keep f; and if f is greater than L+k, keep only L+k.
- Iterate over possible values of L (from 1 to the maximum frequency among letters) and compute the maximum total kept occurrences. The minimum deletions equal (total characters – maximum kept).
Space and Time Complexity
Time Complexity: O(max_f * 26) where max_f is the maximum frequency among the 26 letters (max_f ≤ 10^5), which is acceptable. Space Complexity: O(26) = O(1) for storing frequency counts.
Solution
We begin by computing the frequency of each character in the given word. For any candidate lower bound L of our allowed kept frequency interval [L, L+k], we determine for each letter:
- If its frequency is less than L, we cannot include it (must delete all occurrences).
- Otherwise, we can keep min(f, L+k) occurrences. Summing these kept counts across all letters gives the total number of characters that remain if we force the kept frequencies to lie in [L, L+k]. We try all possible L (from 1 to maximum frequency) and choose the one that leads to the maximum retained count. The answer is then the original length of the string minus this maximum.
Key points:
- Only letters with frequency at least L can be partially kept.
- For letters with frequency greater than L+k, we are forced to delete the extra occurrences.
- By iterating L over the valid range (1..max_f), we effectively search through all potential choices for the minimum kept frequency among the chosen letters.