Problem Description
Given a list of words and a list of puzzles (each puzzle being a 7-character string with unique letters), count for each puzzle how many words are valid. A word is valid for a given puzzle if:
- The word contains the first letter of the puzzle.
- Every letter in the word must appear in the puzzle. Return an array where the ith element is the count of valid words for puzzles[i].
Key Insights
- Convert each word into a bitmask representing its letters; ignore words with more than 7 unique letters since they cannot be contained in any puzzle.
- Use a hashmap (or dictionary) to record how many words correspond to each bitmask.
- For each puzzle, generate all subsets (using bit manipulation) of the puzzle’s letters excluding the mandatory first letter.
- For every subset, include the first letter and check if this new bitmask exists in the precomputed word map.
- Bit manipulation greatly reduces the subset enumeration complexity since each puzzle only contains 7 letters leading to 2^6 possible subsets (keeping the first letter fixed).
Space and Time Complexity
Time Complexity:
- Preprocessing words: O(N * L) where N is the number of words and L is the average word length.
- For each puzzle: O(2^M) where M = 6 (letters excluding the first letter) so it is constant (64 subsets per puzzle). Overall, this results in an acceptable time cost. Space Complexity:
- O(2^L) for the hashmap in the worst case, which is manageable given the constraints.
- Additional space for subset generation is O(1) per puzzle (only integer bit masks are used).
Solution
We use bit manipulation to represent each word and puzzle as a bitmask. First, convert each word into a bitmask and count the frequency in a hashmap. For each puzzle, the first letter is mandatory, so we keep it fixed and try all possible combinations of remaining letters by enumerating the subsets of the remaining six letters. For each subset generated, we combine it with the fixed first letter bit. If the resultant bitmask exists in the word frequency map, add the frequency count to the answer for that puzzle.
The trick here is:
- Representing strings as bit masks (each bit corresponds to an alphabet letter).
- Efficiently generating all subsets of the letters in a puzzle (excluding the first letter) using bit manipulation.
- Using a hashmap to quickly count valid words matching a bitmask.