We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Count K-Subsequences of a String With Maximum Beauty

Number: 3057

Difficulty: Hard

Paid? No

Companies: Google


Problem Description

Given a string s and an integer k, we need to count the number of k-subsequences of s (subsequences of length k with all distinct characters) that achieve the maximum possible beauty. The beauty of a k-subsequence is defined as the sum of the total counts (frequency) in s of each character in the subsequence. Two subsequences are considered different if they are chosen from different sets of indices, even if they form the same string.


Key Insights

  • The maximum beauty is achieved by picking k distinct characters with the highest frequencies in s.
  • Count the frequency of each character in s. Since s only contains lowercase English letters, there are at most 26 entries.
  • Sort the frequencies in descending order.
  • Let kth_value be the frequency at index k-1 of this sorted list.
  • Let R be the number of characters among the top k that have frequency equal to kth_value.
  • Let total be the total count of characters in s that have frequency equal to kth_value.
  • The answer is then given by the binomial coefficient C(total, R) modulo 10^9 + 7.
  • If there are fewer than k distinct characters in s, the answer is 0 since no valid subsequence exists.

Space and Time Complexity

Time Complexity: O(n) for counting frequencies + O(1) for processing up to 26 characters = O(n) Space Complexity: O(1), since we only store frequency counts for a fixed set of letters


Solution

We first count the frequency of every character in the input string s. Next, we extract these frequencies and sort them in descending order. If we do not have at least k distinct characters, we return 0. Otherwise, the maximum beauty is achieved by selecting the k characters with the highest frequencies.

However, if there is a tie (i.e., more than one character with the same frequency as the kth character), we need to count the number of ways to choose the appropriate number of characters from those available to form the subsequence. Specifically, if R out of the top k have frequency equal to kth_value and there are total characters in s with this frequency, then the answer is C(total, R) modulo 10^9 + 7. Given that the possible number for combinations is small (at most 26), a simple iterative calculation suffices.


Code Solutions

# Python solution with detailed comments
MOD = 10**9 + 7

def nCr(n, r):
    # Simple function for computing combination C(n, r) iteratively 
    if r > n: 
        return 0
    res = 1
    # r = min(r, n - r) to optimize the loop
    r = min(r, n - r)
    for i in range(r):
        res = res * (n - i) % MOD
        # Compute modular multiplicative inverse for division modulo MOD
        res = res * pow(i + 1, MOD - 2, MOD) % MOD
    return res

def countKSubsequences(s, k):
    # Count frequencies of each character
    freq = {}
    for ch in s:
        freq[ch] = freq.get(ch, 0) + 1

    # If fewer distinct characters than k, no valid subsequence exists
    if len(freq) < k:
        return 0

    # Get a list of frequencies and sort in descending order
    freq_list = sorted(freq.values(), reverse=True)

    # kth frequency value (0-indexed)
    kth_value = freq_list[k - 1]

    # Count how many of the top k characters have frequency equal to kth_value
    count_needed = sum(1 for i in freq_list[:k] if i == kth_value)

    # Count total characters in s with frequency equal to kth_value
    total_count = sum(1 for f in freq_list if f == kth_value)

    # Return the combinations (nCr) mod MOD
    return nCr(total_count, count_needed)

# Example usage:
s1 = "bcca"
k1 = 2
print(countKSubsequences(s1, k1))  # Output: 4

s2 = "abbcd"
k2 = 4
print(countKSubsequences(s2, k2))  # Output: 2
← Back to All Questions