We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Design HashSet

Number: 816

Difficulty: Easy

Paid? No

Companies: Amazon, Google, Meta, Apple, Wix


Problem Description

Design a HashSet without using any built-in hash table libraries. You need to implement a class MyHashSet that supports the following operations:

  • add(key): Inserts the value key into the HashSet.
  • contains(key): Returns whether the value key exists in the HashSet or not.
  • remove(key): Removes the key from the HashSet, if present.

Key Insights

  • We can simulate a hash table by using an array of buckets.
  • Use a hash function (e.g., key mod bucket_size) to map each key to an index.
  • Each bucket can be implemented as a list (or linked list) to handle collisions via chaining.
  • Since key values are within [0, 10^6] and the number of operations is limited (10^4 calls), a reasonable bucket size (e.g., 1000) provides a good balance.

Space and Time Complexity

Time Complexity:

  • add(key): O(N/buckets) on average, which is approximately O(1)
  • remove(key): O(N/buckets) on average, approximately O(1)
  • contains(key): O(N/buckets) on average, approximately O(1)

Space Complexity:

  • O(buckets + number of keys), where buckets is a fixed-size array and keys are stored in lists.

Solution

We implement a HashSet using an array of buckets where each bucket is a list to handle collisions. A simple hash function (modulo operation) determines the bucket index for each key. When adding, checking, or removing a key, we compute its bucket index and then perform the operation in the corresponding list. This approach avoids built-in hash table libraries and handles potential collisions with chaining.

Code Solutions

# Python implementation of MyHashSet
class MyHashSet:
    def __init__(self):
        # Define the number of buckets and initialize each bucket as an empty list
        self.bucket_size = 1000
        self.buckets = [[] for _ in range(self.bucket_size)]
    
    def _hash(self, key):
        # Simple hash function to compute bucket index
        return key % self.bucket_size
    
    def add(self, key):
        # Compute bucket index from the key
        bucket_index = self._hash(key)
        # Check if key already exists in the bucket; if not, append it
        if key not in self.buckets[bucket_index]:
            self.buckets[bucket_index].append(key)
    
    def remove(self, key):
        # Compute bucket index for the key
        bucket_index = self._hash(key)
        # If key exists in the bucket, remove it
        if key in self.buckets[bucket_index]:
            self.buckets[bucket_index].remove(key)
    
    def contains(self, key):
        # Compute the bucket index and check if key is present
        bucket_index = self._hash(key)
        return key in self.buckets[bucket_index]

# Example usage:
# myHashSet = MyHashSet()
# myHashSet.add(1)
# print(myHashSet.contains(1))  # Should return True
← Back to All Questions