We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Better Compression of String

Number: 3474

Difficulty: Medium

Paid? Yes

Companies: Goldman Sachs


Problem Description

Given a compressed string where each letter is immediately followed by its frequency (for example, "a3b1a1c2" represents the string "aaabacc"), return a new compressed string that meets the following conditions:

  1. Each character appears only once.
  2. The characters are in alphabetical order. Note that if a character appears more than once in the input, sum its frequencies for the output.

Key Insights

  • Parse the input string to separate characters and their frequencies.
  • Use a hash table (or dictionary/map) to accumulate the total frequency for each character.
  • Sort the unique characters in alphabetical order.
  • Construct the final string by concatenating each letter with its accumulated frequency.

Space and Time Complexity

Time Complexity: O(n) where n is the length of the compressed string, since we traverse the string and perform a sort on unique characters (with a small constant maximum of 26 letters). Space Complexity: O(1) because the extra space (hash table/dictionary) is bounded by the number of lowercase English letters.


Solution

The solution involves scanning the compressed string and using a hash table to accumulate the sum of frequencies for each character. Since every letter in the input is directly followed by a number (which could be more than one digit), pay careful attention when parsing the digits. After accumulating the counts, we sort the keys (letters) alphabetically and then rebuild the compressed description by concatenating each letter with its total frequency.


Code Solutions

# Python solution

def betterCompression(compressed):
    # Dictionary to store cumulative frequencies of each character
    freq_map = {}
    i = 0
    n = len(compressed)
    
    # Process the string
    while i < n:
        # current character
        char = compressed[i]
        i += 1
        num_start = i
        
        # Read the full number (can be more than one digit)
        while i < n and compressed[i].isdigit():
            i += 1
        # Convert substring to integer frequency
        count = int(compressed[num_start:i])
        
        # Sum the frequency into the dictionary
        if char in freq_map:
            freq_map[char] += count
        else:
            freq_map[char] = count
    
    # Build result with sorted keys
    result = []
    for ch in sorted(freq_map.keys()):
        result.append(ch + str(freq_map[ch]))
    
    return ''.join(result)

# Example usage:
print(betterCompression("a3c9b2c1"))  # Output: a3b2c10
← Back to All Questions