Problem Description
Given a tree with n nodes, represented by a parents array, and an array nums that assigns each node a unique genetic value in the range [1, 10^5], find for each node the smallest positive integer that is missing from the subtree rooted at that node. A subtree includes the node itself and all of its descendants.
For example, if a subtree’s genetic values are [1,2,3,4], then the smallest missing value is 5. If a subtree does not contain 1, then the answer is 1.
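To make the definition concrete, here is a minimal brute-force sketch that computes the answer directly from the definition. It is meant only as a reference for checking small inputs, not as the efficient solution. The function name is hypothetical, and the sketch assumes that node 0 is the root with parents[0] = -1 and that parents[i] is the parent of node i, which the description above does not state explicitly.

```python
from typing import List


def smallest_missing_brute_force(parents: List[int], nums: List[int]) -> List[int]:
    """Reference check only: quadratic in the worst case.

    Assumption: node 0 is the root and parents[0] == -1.
    """
    n = len(parents)
    children = [[] for _ in range(n)]
    for node in range(1, n):  # skip the root, which has no parent
        children[parents[node]].append(node)

    answer = []
    for root in range(n):
        # Collect every genetic value in the subtree rooted at `root`.
        seen = set()
        stack = [root]
        while stack:
            node = stack.pop()
            seen.add(nums[node])
            stack.extend(children[node])

        # Scan upward from 1 until a value is absent from the subtree.
        missing = 1
        while missing in seen:
            missing += 1
        answer.append(missing)
    return answer


if __name__ == "__main__":
    # The whole tree holds [1, 2, 3, 4], so the root's answer is 5;
    # no other subtree contains 1, so their answers are 1.
    print(smallest_missing_brute_force([-1, 0, 0, 2], [1, 2, 3, 4]))
```

Because it re-walks each subtree from scratch, this version is far too slow for large trees, but it is useful for validating a faster implementation on small cases.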
Key Insights
- Only nodes on the path from the node containing genetic value 1 (if it exists) up to the root can have answers different from 1; all others will have answer = 1.
- A DFS starting from the node with genetic value 1 can be used to “collect” genetic values in subtrees.
- Propagate the union of seen genetic values upward along the parent chain, and update a pointer for the smallest missing integer.
- Use an auxiliary visited/seen array (of booleans) for genetic values to efficiently check which numbers have been encountered.
- Each node is processed at most once in the DFS, ensuring linear time complexity.
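The first insight can be illustrated with a short sketch that lists the only nodes whose answer can differ from 1: the node holding genetic value 1 and its ancestors. The helper name is hypothetical, and the code assumes the root is marked by a parent of -1.

```python
from typing import List


def path_from_value_one(parents: List[int], nums: List[int]) -> List[int]:
    """Return the nodes from the one holding value 1 up to the root.

    Assumption: the root's entry in `parents` is -1.
    Every node not on this path lacks value 1 in its subtree, so its answer is 1.
    """
    if 1 not in nums:
        return []  # no subtree contains 1, so every answer is 1

    node = nums.index(1)
    path = []
    while node != -1:
        path.append(node)
        node = parents[node]  # step to the parent; -1 ends the walk
    return path


if __name__ == "__main__":
    # Value 1 sits at node 4, whose ancestors are node 3 and the root 0.
    print(path_from_value_one([-1, 0, 1, 0, 3, 3], [5, 4, 6, 2, 1, 3]))
```

Only the nodes returned here need any further work; the answers for all other nodes can be filled in with 1 up front.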
Space and Time Complexity
Time Complexity: O(n + m), where n is the number of nodes and m is the maximum genetic value (bounded by 10^5), so the algorithm is linear overall.
Space Complexity: O(n + m) due to the tree structure (children list), recursion stack, and the visited array for genetic values.
Solution
We start by building an adjacency list from the parents array so that each node’s children are easy to access. The key observation is that if genetic value 1 is not present in the tree, then every subtree is missing 1 and every answer is 1. Otherwise, we locate the node that contains 1. Then, for each node from that node up to the root (following the parent chain), we run a DFS that marks all genetic values found in that node’s subtree. The DFS skips nodes that an earlier DFS already visited: the subtree of each node on the path contains the subtree of the node below it, so those values are already marked. After processing the subtree of the current node, we advance a running variable (curMissing) past every value marked in the visited array, so that it holds the smallest missing genetic value. Because the set of marked values only grows, curMissing never needs to move backward. This value is then saved as the answer for the current node, and we move upward.
The approach is efficient because, even though we run a DFS from each node on the path, each node in the tree is processed at most once across all DFS calls.
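The steps described above can be sketched in Python as follows. This is one possible rendering under stated assumptions rather than a definitive implementation: it assumes node 0 is the root with parents[0] = -1, it uses an explicit stack instead of recursion to avoid Python’s recursion limit on deep trees, and the names smallest_missing_value_subtree, seen_value, and visited_node are illustrative (cur_missing corresponds to curMissing in the text).

```python
from typing import List


def smallest_missing_value_subtree(parents: List[int], nums: List[int]) -> List[int]:
    n = len(parents)
    answer = [1] * n  # any subtree without value 1 is missing 1
    if 1 not in nums:
        return answer

    # Adjacency list of children (assumption: node 0 is the root, parent -1).
    children = [[] for _ in range(n)]
    for node in range(1, n):
        children[parents[node]].append(node)

    # seen_value[v] is True once genetic value v has been collected.
    # The extra slot guarantees the scan below stops at max(nums) + 1.
    seen_value = [False] * (max(nums) + 2)
    visited_node = [False] * n

    cur_missing = 1
    node = nums.index(1)
    while node != -1:
        # Iterative DFS over the part of this subtree not yet visited.
        stack = [node]
        while stack:
            current = stack.pop()
            if visited_node[current]:
                continue  # already collected by a DFS lower on the path
            visited_node[current] = True
            seen_value[nums[current]] = True
            stack.extend(children[current])

        # The pointer only moves forward because seen values only accumulate.
        while seen_value[cur_missing]:
            cur_missing += 1
        answer[node] = cur_missing
        node = parents[node]  # move up the parent chain
    return answer


if __name__ == "__main__":
    print(smallest_missing_value_subtree([-1, 0, 1, 0, 3, 3], [5, 4, 6, 2, 1, 3]))
```

The visited_node array is what keeps the total work linear: each node is expanded once, and cur_missing advances at most max(nums) times over the whole run.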
Code Solutions
Below are code solutions in Python, JavaScript, C++, and Java with line-by-line comments.