Problem Description
Given a string s (composed only of the characters '0' to '4') and an integer k, find a substring subs of s with length at least k that contains at least one character a with an odd frequency and one character b (possibly different from a) with an even frequency. Among all such substrings, return the maximum possible value of (frequency of a) - (frequency of b).
Note that subs may contain more than the two characters a and b used in the difference.
Key Insights
- Only digits '0' to '4' appear so there are only 5 possibilities; when selecting a pair (a, b) for the difference, we have at most 20 candidate pairs.
- For a fixed pair (a, b), we can use prefix arrays to compute, for any substring (i+1, j) (using prefix indices i and j), the frequency difference diff = (count_a – count_b) quickly.
- The parity condition (odd frequency for a and even frequency for b) can be checked via the mod2 value of prefix counts. In particular, if pa is the prefix count mod2 for character a and pb for character b, then the substring from i+1 to j is valid when: (pa[j] ⊕ pa[i]) == 1 and (pb[j] ⊕ pb[i]) == 0.
- To enforce the length condition (substring length ≥ k), when processing a prefix ending at j we only consider candidate prefix indices i that are at most j-k.
- For a fixed j we need to quickly obtain the minimum prefix difference among all eligible i that satisfy the parity constraints. We precompute “best” arrays for each possible parity combination to allow O(1) queries.
Space and Time Complexity
Time Complexity: O(20 * n) = O(n), where n is the length of s (since we process at most 20 character pairs and for each we do O(n) work)
Space Complexity: O(n) per pair for prefix and best arrays, but constant extra space overall because digits and parity states are limited
Solution
We iterate over every possible ordered pair of distinct characters (a, b) from '0' to '4'. For each pair, we compute prefix arrays for a and b:
• diff[i]: the difference (count_a - count_b) up to index i
• pa[i] and pb[i]: parity (mod2) of the counts of a and b respectively
For a substring corresponding to a chosen prefix interval (i, j] (with i as the end of the “left” part and j for the current index), the conditions become:
- (pa[j] XOR pa[i]) == 1, so pa[i] must equal 1 - pa[j]
- (pb[j] XOR pb[i]) == 0, so pb[i] must equal pb[j] We then define a mask for each prefix index based on its parities (mask = pa*2 + pb). For the current j, the “required” candidate mask is: req_mask = ( (1 - pa[j]) * 2 + pb[j] ) We precompute arrays best[m][i] that keep the minimum diff value among all indices ≤ i that have mask m. Finally, for each j (starting at index k to enforce the minimum substring length), we compute: candidate = best[req_mask][j-k] candidate difference = diff[j] - candidate and update the global answer. After processing all pairs (a, b) the maximum difference found is the final answer.