Problem Description
Given an m x n binary matrix, a non‐empty subset of rows is defined to be “good” if for every column the sum of bits in the chosen rows is at most floor(k/2), where k is the number of rows chosen. In other words, when you add up each column over the chosen rows, each total must be less than or equal to half the number of chosen rows (floored if k is odd). Return any list (sorted in ascending order) of row indices forming a good subset. If no solution exists, return an empty array.
Key Insights
- Since n (the number of columns) is small (at most 5), each row can be represented as a bitmask (an integer between 0 and 2ⁿ−1).
- A “good subset” must obey a column–wise inequality: for a subset of size k, every column’s ones count must be ≤ floor(k/2).
- A one–row subset is only valid if the row is all zeros.
- For a subset of 2 rows, floor(2/2)=1 so the condition is equivalent to having no column where both rows have a one; in other words, their bitmasks do not have overlapping bits.
- For nonzero rows a solution (if k>1) cannot include duplicate copies of the same nonzero row (because adding the same nonzero row twice would double its contribution in at least one column, making it impossible to satisfy the “at most half” condition).
- Thus, once we pre–compute a dictionary mapping each bitmask to a list of row indices (its occurrences), we can first check if any row is all–zero (mask == 0) and immediately return it.
- Otherwise, we only need to try “small” subsets (size 2, 3, …) chosen from distinct nonzero types. For each candidate combination we can compute the number of ones contributed to each column and check if it “fits” under the required bound (r/2 if r is even; (r–1)/2 if r is odd).
Space and Time Complexity
Time Complexity: O(Σ (number of candidate combinations))
In the worst case there are at most 2⁵=32 distinct masks; when trying candidate subsets of sizes 2,3,… the total number of combinations is bounded by a combinatorial number on 32 (which is acceptable given the small constant).
Space Complexity: O(m) for storing indices in the dictionary plus O(1) additional space.
Solution
We first convert each row into a bitmask. (Each row is a binary representation based on its columns.) If we find a row whose mask is zero then it trivially forms a valid “good subset” by itself. Otherwise, because for any nonzero row the one–element subset would violate the condition (since floor(1/2)=0 but the row contributes one in at least one column), we only consider subsets of size ≥2.
Since repeating a nonzero row twice would always make that column be counted twice (and 2 > floor(2/2)=1), any valid candidate subset can include at most one copy for each nonzero row type. Thus we build a list of distinct nonzero types (each associated with one sample row index) and then iterate over combinations (of size r starting from 2) of these candidates. For each candidate combination we check, for every column j, whether the total number of ones (i.e. the count of rows that have the j–th bit set) is ≤ floor(r/2) (if r is even, allowed = r/2; if odd, allowed = (r–1)//2). Once a valid combination is found, we return the sorted list of row indices corresponding to that combination.
A similar strategy would work in other languages; the main point is representing rows as bitmasks, using a dictionary to map mask values to row indices, and then iterating over combinations (from the distinct nonzero rows) until one satisfying the condition is found.