Problem Description
Given an array of n strings (all of the same length), you may choose to delete any set of columns (i.e., positions) from all strings. Your goal is to ensure that after deleting these columns, the array of strings is in lexicographic order. Return the minimum number of columns you must delete such that the remaining columns yield an array that is sorted in non-decreasing lexicographic order.
Key Insights
- The deletion affects all strings simultaneously – the same column is removed from every string.
- We only care about lexicographic order between adjacent rows.
- A greedy approach can be utilized: process columns from left to right and decide whether a column can be kept or must be deleted.
- Keep track of which pairs of consecutive strings have already been “decided” (i.e., where previous columns have established the order) so that you are only comparing columns for pairs that remain tied.
- When processing a column, if any comparison between tied pairs violates the non-decreasing order, the entire column must be deleted.
Space and Time Complexity
Time Complexity: O(m * n) where m is the number of columns and n is the number of strings. Space Complexity: O(n) for maintaining information about already confirmed order between adjacent string pairs.
Solution
We use a greedy algorithm that iterates over each column from left to right. For each column, we only consider adjacent string pairs which are not yet confirmed to be in correct order. If for any such pair the current column causes a lexicographical violation (i.e., the character in the top string is greater than the one in the bottom string), we decide to delete that column. Otherwise, if the column does not cause any violation, we update our record for each adjacent pair: if the characters indicate a strict order (i.e., the top character is less than the bottom), we mark that pair as sorted. This marking ensures that for future columns, we need not compare that pair, as their order is already determined by previous columns. The cumulative count of columns that are deleted is the answer.