We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Online Majority Element In Subarray

Number: 1262

Difficulty: Hard

Paid? No

Companies: Nutanix


Problem Description

Design a data structure MajorityChecker that efficiently finds the majority element for a given subarray. The majority element of a subarray is an element that occurs at least threshold times in the subarray. If no such element exists, return -1.


Key Insights

  • Preprocess the array by storing the indices of each element to allow fast frequency counting using binary search.
  • Given the constraint 2*threshold > subarray length, there can be at most one valid candidate which simplifies verification.
  • During each query, randomly sample a few indices from the subarray to retrieve candidate elements and then verify their frequencies.
  • Binary search on the candidate's index list gives a fast count of its occurrences in the queried range.

Space and Time Complexity

Time Complexity: O(20 * log(n)) per query in worst-case due to the constant number of samples (20) and binary searches for each candidate. Space Complexity: O(n), where n is the size of the array, used for storing indices for each number.


Solution

We preprocess the input array by constructing a map from element values to their list of indices. For each query on subarray [left, right] with a given threshold:

  1. Use random sampling (e.g., 20 samples) to pick candidate elements from the subarray.
  2. For each candidate, use binary search on its list of indices to determine how many times it appears in the subarray.
  3. If the candidate's frequency meets or exceeds the threshold, return that candidate.
  4. If no candidate passes the threshold, return -1. This approach exploits the statistical likelihood, enhanced by the constraints, that a valid majority candidate will be sampled.

Code Solutions

import bisect
import random

class MajorityChecker:
    # Initialize the data structure and precompute indices for each element.
    def __init__(self, arr):
        self.arr = arr
        self.index_map = {}
        for i, num in enumerate(arr):
            if num not in self.index_map:
                self.index_map[num] = []
            self.index_map[num].append(i)
    
    # Query the majority element in the subarray arr[left...right] with at least threshold occurrences.
    def query(self, left, right, threshold):
        for _ in range(20):  # Sample 20 times to increase the chance of finding the majority element.
            rand_index = random.randint(left, right)
            candidate = self.arr[rand_index]
            positions = self.index_map[candidate]
            # Count candidate occurrences using binary search.
            left_occurrence = bisect.bisect_left(positions, left)
            right_occurrence = bisect.bisect_right(positions, right)
            if right_occurrence - left_occurrence >= threshold:
                return candidate
        return -1

# Example usage:
# majorityChecker = MajorityChecker([1, 1, 2, 2, 1, 1])
# print(majorityChecker.query(0, 5, 4))  # Output: 1
# print(majorityChecker.query(0, 3, 3))  # Output: -1
# print(majorityChecker.query(2, 3, 2))  # Output: 2
← Back to All Questions