LeetCode Solutions Blog

Minimize Hamming Distance After Swap Operations

Number: 1840

Difficulty: Medium

Paid? No

Companies: Google

Problem Description

Given two arrays, source and target, of equal length, and a list of allowed swap operations on source, determine the minimum possible Hamming distance between source and target. The Hamming distance is the count of positions where the two arrays differ. You are allowed to perform allowed swap operations (each as many times as needed and in any order) on source in order to minimize the differences with target.

Key Insights

Allowed swaps define a connectivity between indices, meaning indices in the same connected component can have their values rearranged arbitrarily.
Use Union-Find (or DFS) to group indices that are connected via allowed swaps.
For each connected component, count frequency of numbers in source versus target and compute mismatches by comparing these counts.
The minimum Hamming distance is obtained by subtracting the common matched counts in each group from the total size of the group.

Space and Time Complexity

Time Complexity: O(n + m) where n is the length of the arrays and m is the number of allowed swaps (plus additional cost for dictionary operations). Space Complexity: O(n) for union-find data structures and frequency maps.

Solution

The problem can be visualized as grouping indices that can interact with each other via allowed swaps. This is efficiently implemented using the Union-Find data structure. Once the groups are formed, for each group, we build frequency maps for the values in source and target. The number of mismatches within that group is computed as the total indices in the group minus the sum of the minimum frequency matches for each number present in both mappings. Finally, summing the mismatches over all groups gives the minimum Hamming distance after performing all allowed swaps.

Code Solutions

# Union-Find implementation for Python solution
class UnionFind:
    def __init__(self, n):
        # parent array initialization: each index is its own parent initially
        self.parent = list(range(n))

    def find(self, x):
        # Find the root for x with path compression
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        # Union two nodes by connecting their roots
        rootX = self.find(x)
        rootY = self.find(y)
        if rootX != rootY:
            self.parent[rootY] = rootX

class Solution:
    def minimumHammingDistance(self, source, target, allowedSwaps):
        n = len(source)
        uf = UnionFind(n)
        
        # union indices based on allowedSwaps
        for i, j in allowedSwaps:
            uf.union(i, j)
        
        # Map from root to list of indices in that connected component
        groups = {}
        for i in range(n):
            root = uf.find(i)
            if root not in groups:
                groups[root] = []
            groups[root].append(i)
        
        result = 0
        # Process each group
        for indices in groups.values():
            freqMap = {}
            
            # build frequency counts for source in current group
            for idx in indices:
                freqMap[source[idx]] = freqMap.get(source[idx], 0) + 1
            
            # match target frequencies; reduce frequency from freqMap if match is found
            for idx in indices:
                if target[idx] in freqMap and freqMap[target[idx]] > 0:
                    freqMap[target[idx]] -= 1
                else:
                    result += 1  # mismatch if target value cannot be matched
            
        return result

# Example usage:
# sol = Solution()
# print(sol.minimumHammingDistance([1,2,3,4], [2,1,4,5], [[0,1],[2,3]]))