LeetCode Solutions Blog

Find Duplicate Subtrees

Number: 652

Difficulty: Medium

Paid? No

Problem Description

Given the root of a binary tree, return all duplicate subtrees. Two subtrees are considered duplicates if they have the same structure and identical node values. For each group of duplicate subtrees, return only one of their root nodes.

Key Insights

Use a postorder traversal (DFS) so that each subtree is serialized after processing its children.
Represent each subtree uniquely (e.g., as a string serialization) to detect duplicates.
Use a hash table (dictionary) to count the occurrence of each serialized subtree.
When a serialized subtree's count becomes two, add its root node to the result list (to ensure each duplicate subtree is reported only once).

Space and Time Complexity

Time Complexity: O(n), where n is the number of nodes in the tree. (In the worst-case scenario, serialization of each subtree might lead to higher complexity, but typically each node is processed once.) Space Complexity: O(n), for storing the serialization of each subtree and the hash table.

Solution

The solution adopts a DFS (postorder traversal) approach where each subtree is serialized into a string based on its node value and the serializations of its left and right children. This unique serialization acts as a key in a hash table that records how many times a particular subtree has been seen. When a subtree's serialization appears for the second time, we record its root in the result array. This method ensures that we only return one instance for each group of duplicate subtrees.

Code Solutions

# Definition for a binary tree node.
# class TreeNode:
#     def __init__(self, val=0, left=None, right=None):
#         self.val = val
#         self.left = left
#         self.right = right

class Solution:
    def findDuplicateSubtrees(self, root: TreeNode) -> List[TreeNode]:
        # Dictionary to store the frequency of each subtree serialization.
        subtree_count = {}
        # List to store the root nodes of duplicate subtrees.
        duplicates = []
        
        # Define a helper function for postorder traversal and serialization.
        def serialize(node):
            # If the current node is None, use a special marker.
            if not node:
                return "#"
            # Recursively serialize the left and right subtree.
            left_serial = serialize(node.left)
            right_serial = serialize(node.right)
            # Create a unique serialization for the subtree rooted at the current node.
            serial = str(node.val) + "," + left_serial + "," + right_serial
            # Update the count of this serialization in our dictionary.
            subtree_count[serial] = subtree_count.get(serial, 0) + 1
            # When a duplicate is found for the first time, add the node to the duplicates list.
            if subtree_count[serial] == 2:
                duplicates.append(node)
            # Return the serialization to the parent caller.
            return serial
        
        serialize(root)
        return duplicates