LeetCode Solutions Blog

Find X-Sum of All K-Long Subarrays II

Number: 3592

Difficulty: Hard

Paid? No

Companies: N/A

Problem Description

Given an array of integers nums, and two integers k and x, we need to compute for every contiguous subarray of length k its “x-sum”. The x-sum of an array is found by:

Counting the occurrences of every element in the subarray.
Keeping only the occurrences of the top x most frequent elements (if two elements have the same frequency, the one with the larger value is considered “more frequent”).
Summing all the numbers that were kept. If there are fewer than x distinct numbers, then the x‑sum equals the sum of the entire subarray. Return an array answer where answer[i] is the x‑sum of the subarray nums[i..i+k‑1].

Key Insights

Use a sliding window of length k to process each subarray.
Maintain a frequency map (hash table) to track the counts of numbers in the current window.
To quickly obtain the top x elements by frequency (with tie-breaker on value), use a balanced data structure (or two sorted structures / heaps) that can be updated as the window slides.
Split the distinct elements into two groups: one (topSet) holding the currently selected top x numbers, and a second (bottomSet) with the remaining numbers.
Keep an incremental total (topSum) for the x‑sum so that after each sliding window update the answer is produced quickly.
Be careful when updating the frequency: when an element’s count changes (either by adding a new element in or removing an old element), remove its old entry and reinsert it using the new frequency into the appropriate set; then rebalance if necessary.

Space and Time Complexity

Time Complexity: O(n log m) per window update where m is the number of distinct elements, since each update involves log-time insertions/deletions and rebalancing. Space Complexity: O(m) for storing frequencies and balanced sets (with m being the number of distinct elements in a window, worst-case m = k).

Solution

The idea is to use a sliding window while maintaining two balanced collections:

A frequency dictionary mapping every number in the current window to its frequency.
Two sorted sets (or multisets) where each entry is a pair (frequency, value). The ordering is natural (first by frequency then by value), which by default in many languages will treat lower frequencies (or, for equal frequency, lower value) as “smaller.” We ensure that in our “topSet” we always keep the x elements with the highest (frequency, value) order. If the total number of distinct elements is less than x, then topSet holds all the distinct elements.
We maintain topSum – the sum over all (frequency * value) for entries in topSet – as our result for the current window. For each sliding-window update, update the frequency map for the incoming and outgoing element, update the sorted sets accordingly and rebalance them to ensure that topSet always contains the best x keys. Finally, append topSum for the current window.

Below are code implementations in Python, JavaScript, C++ and Java with line-by-line comments.

Code Solutions

# Python solution using two sorted lists with bisect for demonstration.
# Note: In practice, one might use the "sortedcontainers" module for efficiency.

import bisect

def find_xsum(nums, k, x):
    n = len(nums)
    
    # Data structures:
    # freq: dictionary mapping number -> frequency in window.
    freq = {}
    # topSet and bottomSet: sorted lists of tuples (frequency, value).
    # Sorted in ascending order; topSet will eventually contain the x largest tuples.
    topSet = []     # will maintain exactly min(x, total distinct) items
    bottomSet = []  # remaining items
    topSum = 0      # sum over topSet computed as value * frequency for each tuple.
    
    # Helper function to remove an entry from a sorted list.
    def remove_entry(sorted_list, entry):
        idx = bisect.bisect_left(sorted_list, entry)
        # Assumes entry exists.
        if idx < len(sorted_list) and sorted_list[idx] == entry:
            sorted_list.pop(idx)
    
    # Helper function to add an entry into a sorted list.
    def add_entry(sorted_list, entry):
        bisect.insort_left(sorted_list, entry)
    
    # Helper function to rebalance the two sets so that topSet has size = min(x, total distinct)
    def rebalance():
        nonlocal topSum
        total_distinct = len(topSet) + len(bottomSet)
        target_size = min(x, total_distinct)
        # If topSet has fewer than needed, move the largest from bottomSet to topSet.
        while len(topSet) < target_size and bottomSet:
            # largest element from bottomSet is at the end since list is ascending.
            candidate = bottomSet.pop()
            add_entry(topSet, candidate)
            topSum += candidate[0] * candidate[1]
        # If topSet is too large, move smallest element from topSet to bottomSet.
        while len(topSet) > target_size:
            candidate = topSet.pop(0)
            topSum -= candidate[0] * candidate[1]
            add_entry(bottomSet, candidate)
        # Now ensure ordering: if bottomSet’s maximum is greater than topSet’s minimum, swap them.
        while topSet and bottomSet and bottomSet[-1] > topSet[0]:
            low = topSet.pop(0)
            high = bottomSet.pop()
            topSum -= low[0] * low[1]
            topSum += high[0] * high[1]
            add_entry(topSet, high)
            add_entry(bottomSet, low)
    
    # Helper function to update an element with new frequency info.
    # First, remove the old tuple (if exists) from whichever set it is in,
    # then insert the new tuple into the proper set.
    def update(num, old_freq, new_freq):
        nonlocal topSum
        entry_old = (old_freq, num)
        entry_new = (new_freq, num)
        # Try to remove entry_old from topSet and bottomSet.
        removed = False
        # Check topSet.
        idx = bisect.bisect_left(topSet, entry_old)
        if idx < len(topSet) and topSet[idx] == entry_old:
            topSet.pop(idx)
            topSum -= old_freq * num
            removed = True
        else:
            idx = bisect.bisect_left(bottomSet, entry_old)
            if idx < len(bottomSet) and bottomSet[idx] == entry_old:
                bottomSet.pop(idx)
                removed = True
        # Now if new frequency is > 0, insert entry_new into the appropriate set.
        if new_freq > 0:
            total_distinct = len(topSet) + len(bottomSet)
            target_size = min(x, total_distinct+1)  # +1 because we are adding a new entry
            # If topSet is not full, add to topSet.
            if len(topSet) < target_size:
                add_entry(topSet, entry_new)
                topSum += new_freq * num
            else:
                # Check the smallest in topSet.
                if topSet and entry_new > topSet[0]:
                    # The new candidate deserves to be in topSet.
                    candidate = topSet.pop(0)
                    topSum -= candidate[0] * candidate[1]
                    add_entry(bottomSet, candidate)
                    add_entry(topSet, entry_new)
                    topSum += new_freq * num
                else:
                    add_entry(bottomSet, entry_new)
        # Rebalance after update.
        rebalance()
    
    res = []
    # Build the first window.
    for i in range(k):
        num = nums[i]
        old = freq.get(num, 0)
        new = old + 1
        freq[num] = new
        update(num, old, new)
    # Append the x-sum for the first window.
    res.append(topSum)
    
    # Slide the window.
    for i in range(k, n):
        # Element to remove.
        out_num = nums[i - k]
        out_old = freq.get(out_num, 0)
        out_new = out_old - 1
        freq[out_num] = out_new
        update(out_num, out_old, out_new)
        
        # Element to add.
        in_num = nums[i]
        in_old = freq.get(in_num, 0)
        in_new = in_old + 1
        freq[in_num] = in_new
        update(in_num, in_old, in_new)
        
        res.append(topSum)
        
    return res

# Example usage:
print(find_xsum([1,1,2,2,3,4,2,3], 6, 2))  # Expected output: [6,10,12]