We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Similar String Groups

Number: 869

Difficulty: Hard

Paid? No

Companies: Apple, Google


Problem Description

Given a list of strings where each string is an anagram of every other string, two strings are defined to be similar if they are either identical or can become identical by swapping two letters (in distinct positions) in one of the strings. These swaps can be applied transitively, meaning that if string A is similar to B, and B is similar to C, then A, B, and C belong to the same group. The goal is to determine how many such groups exist in the list.


Key Insights

  • The similarity condition is equivalent to saying that two strings differ in either zero or exactly two positions (and in the exactly-two case, the characters must be reversible).
  • The problem can be modeled as a graph where each string is a node and an edge connects two nodes if their corresponding strings are similar.
  • The answer is then the number of connected components in this graph.
  • Both DFS (or BFS) and Union-Find are effective approaches for finding connected components in such a graph.
  • Since every string is an anagram of the others, they all have the same set of characters, which simplifies some edge-case considerations.

Space and Time Complexity

Time Complexity: O(N^2 * M) where N is the number of strings and M is the length of each string, since for each pair of strings, we compare up to M characters. Space Complexity: O(N) to store the data structures (e.g., union-find array or visited set in DFS/BFS).


Solution

We can solve this problem by modeling the strings as nodes in a graph. Two nodes (strings) are connected if they are similar. We can then either perform a DFS/BFS starting from each unvisited node to count connected components or use the Union-Find (Disjoint Set Union) data structure.

For the Union-Find approach:

  • Initialize a parent array where each string is its own parent.
  • Compare every pair of strings to check if they are similar (by checking if the number of differing positions is exactly 0 or 2 and verifying the swap condition when needed).
  • Union the components whenever a similar pair is found.
  • Finally, count the distinct roots in the union-find data structure.

This method efficiently groups the strings and ultimately allows us to count the number of groups.


Code Solutions

# Define the function to determine if two strings are similar.
def are_similar(s, t):
    # List to store positions where s and t differ
    diff = []
    # Iterate through both strings at once
    for i in range(len(s)):
        if s[i] != t[i]:
            diff.append(i)
            # If differences are more than 2, they can't be similar.
            if len(diff) > 2:
                return False
    # If there are no differences, they are identical and hence similar.
    if len(diff) == 0:
        return True
    # If there are exactly two differences, check if swapping fixes them.
    if len(diff) == 2:
        i, j = diff
        return s[i] == t[j] and s[j] == t[i]
    return False

# Union-Find implementation
class UnionFind:
    def __init__(self, n):
        # Initially each element is its own parent.
        self.parent = [i for i in range(n)]

    def find(self, i):
        # Path compression for efficiency.
        if self.parent[i] != i:
            self.parent[i] = self.find(self.parent[i])
        return self.parent[i]

    def union(self, i, j):
        root_i = self.find(i)
        root_j = self.find(j)
        if root_i != root_j:
            # Merge the two components.
            self.parent[root_j] = root_i

def numSimilarGroups(strs):
    n = len(strs)
    uf = UnionFind(n)
    # Compare each pair of strings.
    for i in range(n):
        for j in range(i + 1, n):
            if are_similar(strs[i], strs[j]):
                uf.union(i, j)
    # Count the number of unique parents (i.e., groups).
    groups = set()
    for i in range(n):
        groups.add(uf.find(i))
    return len(groups)

# Example usage:
# strs = ["tars", "rats", "arts", "star"]
# print(numSimilarGroups(strs))  # Output: 2
← Back to All Questions