
Most Common Word

Number: 837

Difficulty: Easy

Paid? No

Companies: Datadog, Amazon, Meta, Microsoft, Google


Problem Description

Given a paragraph of text and a list of banned words, return the most frequent word that is not in the banned list. The search is case-insensitive, punctuation should be ignored, and the result must be in lowercase.


Key Insights

  • Normalize the paragraph by converting all characters to lowercase.
  • Replace all punctuation with spaces to correctly extract words.
  • Use a hash set for the banned words to enable fast lookups.
  • Count the frequency of each non-banned word using a hash map.
  • Determine the word with the highest frequency from the count map.
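
The first two insights can be sketched in isolation to show what the word list looks like before any counting happens. This is a minimal illustration, not the page's own code: the helper name `tokenize` is hypothetical, and it assumes words consist only of ASCII letters, so every other character is treated as a separator.

```python
import re

def tokenize(paragraph):
    # Hypothetical helper: lowercase first, then turn every run of
    # non-letter characters into a single space, so "ball," and "BALL"
    # both become "ball". Assumes words contain only ASCII letters.
    normalized = re.sub(r"[^a-z]+", " ", paragraph.lower())
    return normalized.split()

words = tokenize("Bob hit a ball, the hit BALL flew far after it was hit.")
print(words)
```

Lowercasing before stripping punctuation keeps the character class simple, since only `a-z` has to be preserved.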

Space and Time Complexity

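Time Complexity: O(N + M), where N is the length of the paragraph (each character and word is processed once) and M is the total length of the banned words (to build the hash set).

Space Complexity: O(N + M) for the word list, the count map, and the banned set.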


Solution

The solution first normalizes the text by converting it to lowercase and replacing punctuation with spaces. It then splits the paragraph into words. A hash set is used to store banned words for quick lookups. As words are processed, those not in the banned set are stored in a hash map with their corresponding counts. Finally, the word with the maximum count is determined and returned.


Code Solutions

import re

def mostCommonWord(paragraph, banned):
    # Lowercase the paragraph, then replace the punctuation marks ! ? ' , ; . with spaces.
    words = re.sub(r"[!?',;.]", " ", paragraph.lower()).split()
    
    # Convert banned list to a set for fast lookup.
    banned_set = set(banned)
    
    # Count frequency of each non-banned word.
    count = {}
    for word in words:
        if word not in banned_set:
            count[word] = count.get(word, 0) + 1
    
    # Return the word with the highest frequency.
    return max(count, key=count.get)

# Example usage:
paragraph = "Bob hit a ball, the hit BALL flew far after it was hit."
banned = ["hit"]
print(mostCommonWord(paragraph, banned))  # prints "ball"
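
As an alternative sketch of the same steps, the counting can be delegated to `collections.Counter` and the tokenization to `re.findall`. This swaps in a different technique from the solution above: the function name `most_common_word_counter` is hypothetical, words are assumed to contain only ASCII letters, and returning an empty string when every word is banned is an added assumption rather than part of the problem statement.

```python
import re
from collections import Counter

def most_common_word_counter(paragraph, banned):
    # Lowercase the banned words too, so the lookup is case-insensitive.
    banned_set = {word.lower() for word in banned}

    # Extract runs of letters directly instead of replacing punctuation.
    # Assumes words consist only of ASCII letters.
    words = re.findall(r"[a-z]+", paragraph.lower())

    # Count only the words that are not banned.
    counts = Counter(word for word in words if word not in banned_set)

    # Assumption: return "" when no non-banned word exists.
    if not counts:
        return ""
    return counts.most_common(1)[0][0]

print(most_common_word_counter(
    "Bob hit a ball, the hit BALL flew far after it was hit.", ["hit"]))
```

The explicit empty check avoids the `ValueError` that `max` raises on an empty mapping, which matters only if the input is not guaranteed to contain a non-banned word.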