We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Count Pairs Of Similar Strings

Number: 2594

Difficulty: Easy

Paid? No

Companies: Adobe, TikTok, IBM, Google, Oracle


Problem Description

Given an array of strings, count the number of pairs (i, j) (with 0 ≤ i < j < words.length) such that the two strings are similar, meaning they consist of exactly the same set of characters.


Key Insights

  • Two strings are similar if their unique character sets are identical.
  • Convert each word to a representation that captures its unique characters, for example using a set or bitmask.
  • Use a hash table (or dictionary) to count the frequency of each unique character set representation.
  • The number of pairs for a particular set that occurs "freq" times is given by freq*(freq-1)/2.

Space and Time Complexity

Time Complexity: O(n * m) where n is the number of words and m is the average length of a word. Space Complexity: O(n) for storing the frequency of unique character sets.


Solution

The solution involves processing each word to extract its set of unique characters. We can sort these characters or use a bitmask to create a canonical representation for each word. Once we have the representation, we store the count of each unique representation in a hash table. Finally, for each frequency count greater than 1, we compute the number of similar pairs (using the combination formula freq*(freq-1)/2) and sum them to obtain the result. The approach leverages a hash table for efficient counting and, if using bit manipulation, an integer bitmask for a space-efficient representation of the character set.


Code Solutions

# Python solution with line-by-line comments

def similarPairs(words):
    # Dictionary to count frequency of each character set representation
    freq = {}
    
    # Process each word in the list
    for word in words:
        # Create a canonical key by taking the sorted unique characters of the word
        key = ''.join(sorted(set(word)))
        # Update the frequency count for this key
        freq[key] = freq.get(key, 0) + 1
    
    # Initialize the result count
    count = 0
    # For each unique character set, if frequency is greater than 1, count the number of pairs
    for key in freq:
        if freq[key] > 1:
            count += freq[key] * (freq[key] - 1) // 2  # Combination formula: n choose 2
    
    return count

# Example usage:
words = ["aba", "aabb", "abcd", "bac", "aabc"]
print(similarPairs(words))  # Output: 2
← Back to All Questions