We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Shortest and Lexicographically Smallest Beautiful String

Number: 3150

Difficulty: Medium

Paid? No

Companies: IBM, Amazon, Wells Fargo


Problem Description

Given a binary string s and a positive integer k, a substring is considered beautiful if it contains exactly k ones. Among all beautiful substrings, let len be the length of the shortest one. The task is to return the lexicographically smallest beautiful substring of s with length equal to len. If there is no beautiful substring, return an empty string.


Key Insights

  • Since s.length is at most 100, an O(n²) solution by checking all substrings is acceptable.
  • Use a prefix sum or a sliding window approach to quickly count the number of ones in any substring.
  • First, determine the minimum length among all substrings that have exactly k ones.
  • Then, among all substrings of that minimum length, select the lexicographically smallest. In binary strings, note that '0' is lexicographically smaller than '1'.

Space and Time Complexity

Time Complexity: O(n²) – where n is the length of s; every possible substring is examined. Space Complexity: O(n) – for the prefix sum array used to quickly compute the number of ones in any substring (or O(1) if counting on the fly).


Solution

The solution involves two major steps. First, iterate over all possible substrings using two nested loops and use a prefix sum array (or an incremental counter) to determine the count of ones in constant time for each substring. Update the minimum valid length when a substring with exactly k ones is found.

After determining the length of the shortest beautiful substring, perform another scan (or store candidates during the first pass) to determine which substring of that length is lexicographically smallest. The lexicographical comparison is straightforward given that all candidate substrings are of the same length, meaning we compare character by character until a difference is found, with '0' being less than '1'.

Key data structures and approaches:

  • Use an array (or inline counter in the nested loop) to count ones.
  • Compare substrings using built-in string comparison provided by most languages.

Important gotchas:

  • Make sure to only consider substrings that contain exactly k ones.
  • If no substring satisfies the condition, return an empty string.

Code Solutions

# Python solution with detailed comments

def shortest_beautiful_substring(s: str, k: int) -> str:
    n = len(s)
    
    # Build a prefix sum array to count ones efficiently.
    # prefix[i] stores number of ones in s[:i]
    prefix = [0] * (n + 1)
    for i in range(n):
        prefix[i + 1] = prefix[i] + (1 if s[i] == '1' else 0)
    
    # Function to count ones in the substring s[i:j+1] using prefix sums.
    def count_ones(i, j):
        return prefix[j + 1] - prefix[i]
    
    min_length = float('inf')
    candidates = []
    
    # Iterate over all possible substrings.
    for i in range(n):
        for j in range(i, n):
            # Count the number of ones in this substring.
            ones = count_ones(i, j)
            if ones == k:
                current_length = j - i + 1
                if current_length < min_length:
                    min_length = current_length
                    candidates = [s[i:j+1]]
                elif current_length == min_length:
                    candidates.append(s[i:j+1])
                # Once we find a valid substring in this interval starting from i,
                # no need to extend further as length only increases.
                break

    if min_length == float('inf'):
        return ""
    
    # Among candidates with the shortest length, return the lexicographically smallest.
    return min(candidates)

# Example usage:
print(shortest_beautiful_substring("100011001", 3))  # Output: "11001"
print(shortest_beautiful_substring("1011", 2))       # Output: "11"
print(shortest_beautiful_substring("000", 1))        # Output: ""
← Back to All Questions