We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Longest Common Prefix of K Strings After Removal

Number: 3784

Difficulty: Hard

Paid? No

Companies: N/A


Problem Description

Given an array of strings words and an integer k, for every index i we remove words[i] from the array. Then among the remaining words we want to choose any k (from distinct indices) so that the k chosen words share the longest possible common prefix. For every removal (i.e. for every index) return the length of that longest common prefix. If after removal there are fewer than k words available, we return 0.

For example, if words = ["jump","run","run","jump","run"] and k = 2 then:

  • Removing "jump" at index 0, the remaining list has three "run” and one "jump”. Choosing any two "run" gives a common prefix “run” (length 3).
  • Removing "run" at index 1, the best we can do is choose the two occurrences of "jump" (common prefix “jump” of length 4). The answer array is [3,4,4,3,4].

Key Insights

  • A global data structure (specifically, a Trie) can be used to compute for each prefix (of any word) the total count of words that share that prefix.
  • The “global” best answer (without any removal) is determined by the deepest level (largest prefix length) for which there exists at least one node (prefix) with count ≥ k.
  • Removing a word only “affects” the nodes along the path corresponding to that removed word. For any node whose prefix is not shared with the removed word the count remains unchanged.
  • In order to answer each removal efficiently, we pre‐compute for every depth d the maximum frequency among all trie nodes at that depth (globalBest[d]), how many nodes achieve that maximum (globalBestFreq[d]) and the second‐highest frequency (globalSecond[d]). Then for a removal, if the removed word passes through the best node at depth d and it was unique, its effective count will drop (i.e. count–1) so we need to compare that with the backup candidate (the second best).
  • Note that for a removal word whose length is L, any prefix longer than L is “unaffected” (since the removed word does not contribute there) and the global count remains valid.

Space and Time Complexity

Time Complexity: O(S) where S is the sum of the lengths of all words (for both building and traversing the trie, plus processing each removal). Space Complexity: O(S) due to storing the Trie and some additional arrays keyed by prefix depth.


Solution

We first build a Trie with every word inserted. Each Trie node stores: • A mapping from character to child node. • A count – the number of words that pass through the node.

Then we perform a traversal (BFS or DFS) of the Trie to “group” nodes by their depth. For each depth d we record: • globalBest[d]: the maximum count of any node at depth d. • globalBestFreq[d]: the number of nodes that achieve that maximum. • globalSecond[d]: the largest count that is strictly less than globalBest[d] (or 0 if none).

The “global answer” (if no removal occurred) is the maximum d (i.e. prefix length) for which globalBest[d] ≥ k.

For each removal (i.e. for each word removed), we “simulate” the effect along the path corresponding to that word. For a prefix of length d that is part of the removed word, the effective count becomes:   if the removed word’s prefix node was the unique best at depth d then:     candidate = max( removed_node_count – 1, globalSecond[d] )   otherwise:     candidate = globalBest[d] For any depth d greater than the length of the removed word the candidate remains globalBest[d] (since the removed word does not contribute there).

Thus the answer for removal i is the maximum d (1 ≤ d ≤ maxDepth) for which the candidate count is still at least k. (If the total number of words after removal is less than k then the answer is 0.)

A key “gotcha” is that even if the global best at a given depth comes from the removed word’s contribution, if another node exists (with the same global count) we can “borrow” that value and the candidate is not affected.

We now show complete annotated code solutions in Python, JavaScript, C++ and Java.


Code Solutions

# Python solution

# Define a TrieNode class with children dict and count
class TrieNode:
    def __init__(self):
        self.children = {}
        self.count = 0

# Insert a word into the trie
def insert(root, word):
    node = root
    for ch in word:
        if ch not in node.children:
            node.children[ch] = TrieNode()
        node = node.children[ch]
        node.count += 1

# Build the global trie
root = TrieNode()
words = []  # placeholder: input list of words
# Example input for testing:
# words = ["jump", "run", "run", "jump", "run"]
# k = 2
# In actual use, words and k would be read

# Assume words and k are provided
# For demonstration, using the sample:
words = ["jump", "run", "run", "jump", "run"]
k = 2

for word in words:
    # For each word, update counts along its path in trie.
    insert(root, word)

# Traverse the trie to populate global arrays by depth.
# globalBest[d] = best count at depth d
# globalBestFreq[d] = frequency of that best count
# globalSecond[d] = second best count at depth d
from collections import deque
globalBest = {}    # key: depth, value: best count
globalBestFreq = {} # key: depth, value: frequency of that best
globalSecond = {}  # key: depth, value: second best count

q = deque()
# start with children of root at depth 1.
for child in root.children.values():
    q.append((child, 1))

while q:
    node, depth = q.popleft()
    cnt = node.count
    # update global arrays for this depth
    if depth not in globalBest:
        globalBest[depth] = cnt
        globalBestFreq[depth] = 1
        globalSecond[depth] = 0
    else:
        if cnt > globalBest[depth]:
            globalSecond[depth] = globalBest[depth]
            globalBest[depth] = cnt
            globalBestFreq[depth] = 1
        elif cnt == globalBest[depth]:
            globalBestFreq[depth] += 1
        elif cnt > globalSecond[depth]:
            globalSecond[depth] = cnt
    # add children nodes into the queue with increased depth
    for child in node.children.values():
        q.append((child, depth+1))

# Compute maximum depth (global maximum key in globalBest)
maxDepth = max(globalBest.keys()) if globalBest else 0

# The "global answer" if no removal occurs
globalAnswer = 0
for d in range(1, maxDepth+1):
    if globalBest.get(d, 0) >= k:
        globalAnswer = d

# For each word removal, compute the answer.
n = len(words)
answer = [0]*n

# Pre-calculate removal prefix counts for each word by traversing the trie.
def get_prefix_counts(root, word):
    counts = []
    node = root
    for ch in word:
        node = node.children[ch]
        counts.append(node.count)
    return counts

# Process each removal
for i, word in enumerate(words):
    # If after removal (n-1 words) there are fewer than k words, answer is 0.
    if n - 1 < k:
        answer[i] = 0
        continue
    L = len(word)
    # If the removed word is short then it does not affect nodes at depths > L.
    # So if globalAnswer > L, that candidate remains unaffected.
    if globalAnswer > L:
        answer[i] = globalAnswer
    else:
        # For depths not affected by the removed word (d > L), candidate = globalBest[d].
        # But since globalAnswer <= L, only d in [1, L] need be checked.
        prefixCounts = get_prefix_counts(root, word)
        best = 0
        for d in range(1, L+1):
            count_at_d = prefixCounts[d-1]
            # Adjust effective count if the removed word contributes here.
            if d in globalBest:
                # Check if the removal word is the unique contributor of the global best.
                if count_at_d == globalBest[d] and globalBestFreq[d] == 1:
                    effective = max(count_at_d - 1, globalSecond[d])
                else:
                    effective = globalBest[d]
            else:
                effective = 0
            if effective >= k:
                best = d  # update best possible length
        answer[i] = best

print("Answer:", answer)
← Back to All Questions