We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Minimum Cost Good Caption

Number: 3701

Difficulty: Hard

Paid? No

Companies: N/A


Problem Description

Given a string caption of length n, you want to transform it into a “good” caption.
A “good” caption is defined as one in which every letter appears only in contiguous groups, and each such group has at least 3 occurrences.
You may change any character at index i, but only by “shifting” it one step in the alphabet (either to the immediately previous letter – if it isn’t 'a' – or to the immediately next letter – if it isn’t 'z'). Repeating operations allows you to obtain any target letter at the cost equal to the absolute difference in their alphabetical order.
Your goal is to pick the operations so that the total cost (the sum over all indices of the number of single‐step operations performed) is minimized. If there are several optimal “good” captions (i.e. with minimum cost), you must return the lexicographically smallest one. If it is impossible to obtain a good caption using any number of operations, return the empty string.


Key Insights

  • The cost to change a letter c to some target letter L is |c - L| since only adjacent moves are allowed.
  • The final caption must be piecewise constant with each constant (or “block”) having length at least 3.
  • A natural strategy is to decide the final letter for every position while “growing” valid blocks. For the very first two positions of any block the decision is forced: you must use the same letter chosen for the block, otherwise you’d end the block too early.
  • Once you have a block of length ≥ 3 (a “complete” block), you have two choices at the next index: either extend the current block with the same letter or start a new block by choosing a different letter (starting the new block with count = 1).
  • To combine cost minimization and lexicographical tie‐breaking, dynamic programming with a state that combines the current index, the letter chosen at that position, and the “run length” (with a special flag for counts of 3 or more) is used.
  • Pre‐computing the “cost to change” each letter is trivial (simply the absolute difference between character codes) so each transition is O(1). The challenge is to design the DP so that only valid transitions (i.e. forced extension for incomplete groups and both “continue” or “break” options when the block is already valid) are allowed.

Space and Time Complexity

Time Complexity: O(n × 26 × 25)
 - n positions, at most 26 possible current letters and, when a block is complete, up to 25 choices to start a new block.
Space Complexity: O(n × 26 × 3)
 - We maintain DP state for each index, for each letter and for run lengths (1, 2, and “complete” which we denote by 3).


Solution

We solve the problem by processing the string one character at a time and maintaining a DP state that records the “best” way (in terms of total cost and lexicographical order) to form a valid prefix that ends at that position with a given letter and run count.
The state is defined as dp[i][letter][r] where:
 • i: number of characters processed (i from 1 to n; note we use 1-indexing for dp where dp[i] is after having set characters 0…i–1).
 • letter: the letter chosen for the current block at position i–1.
 • r: the length of the current block. For simplicity we record exact counts for 1 or 2; once the block reaches length 3, we mark that state as “complete” (represented as 3) even if the actual run is larger than 3.
For the first character, you may choose any letter L (cost = |caption[0] - L|) and the block count becomes 1.
For each subsequent character at index i, if the current state is incomplete (r < 3), you are forced to continue with the same letter (and add cost = |caption[i] - L| and increment the run count) until r reaches 3. Once a block is “complete” (r = 3), you have two options:  1. Continue the block with the same letter L (state remains “complete”, r remains 3).  2. End the current block and start a new block with any letter K (K must be different from L), setting the new block’s count to 1 (cost = |caption[i] - K|).
When computing transitions, if two ways yield the same cost, the lexicographically smaller partial caption is chosen.
Finally, a solution is acceptable only if the final state's block is “complete” (r = 3). We then reconstruct the answer using predecessor pointers stored during the DP transitions.

The algorithm uses a tabulation that runs in O(n) time with constant work per state transition and uses auxiliary tables for predecessor (to reassemble the solution) and cost/lex order information.


Code Solutions

Below are code solutions in Python, JavaScript, C++, and Java. Each solution uses detailed, line‐by‐line comments.

# Python solution for Minimum Cost Good Caption
# We use dynamic programming with state (index, current_letter, run_count),
# where run_count is 1 or 2 (incomplete) or 3 (complete; representing any run length >= 3).

import sys
import string

def minimumCostGoodCaption(caption: str) -> str:
    n = len(caption)
    # if n < 3, it is impossible to have a valid caption.
    if n < 3:
        return ""
    
    # We use a dictionary dp[i] that maps (L, run) -> (cost, prev_index, prev_state, chosen_letter)
    # where state is a tuple key (i, letter, run) and prev_state is the state from previous index.
    # For lex ordering we store the built solution (but we avoid storing full string in dp,
    # instead we keep a pointer to predecessor and reconstruct at the end.)
    dp = [{} for _ in range(n)]
    
    # Initialize for i = 0; dp[0][L, 1] = (cost, None, None)
    for L in string.ascii_lowercase:
        # cost to change caption[0] to L is abs(ord(caption[0]) - ord(L))
        cost = abs(ord(caption[0]) - ord(L))
        # key: (letter, run_count). run 1 means we started a new block
        state = (L, 1)
        dp[0][state] = (cost, None, None)  # (total cost, previous state pointer, decision char)
    
    # Process positions 1 to n-1
    for i in range(1, n):
        dp[i] = {}
        curr_char = caption[i]
        # iterate over all states from dp[i-1]
        for (prevLetter, run) in dp[i-1]:
            prevCost, prevPointer, _ = dp[i-1][(prevLetter, run)]
            # If run is incomplete (<3) then we must continue with the same letter
            if run < 3:
                # forced transition: same letter prevLetter
                cost_here = abs(ord(curr_char) - ord(prevLetter))
                newCost = prevCost + cost_here
                newRun = run + 1
                # if newRun becomes 3, mark as complete (i.e., 3)
                if newRun >= 3:
                    newRun = 3
                newState = (prevLetter, newRun)
                candidate = (newCost, (i-1, prevLetter, run), prevLetter)
                # update dp[i][newState] with candidate if better cost or lexicographically smaller tie.
                if newState not in dp[i]:
                    dp[i][newState] = candidate
                else:
                    oldCost, oldPtr, oldLetter = dp[i][newState]
                    if newCost < oldCost or (newCost == oldCost and prevLetter < oldLetter):
                        dp[i][newState] = candidate
            else:
                # run is complete so we have two options:
                # Option 1: continue with same letter (stay in same block)
                cost_here = abs(ord(curr_char) - ord(prevLetter))
                newCost = prevCost + cost_here
                newState = (prevLetter, 3)  # remain complete
                candidate = (newCost, (i-1, prevLetter, run), prevLetter)
                if newState not in dp[i]:
                    dp[i][newState] = candidate
                else:
                    oldCost, oldPtr, oldLetter = dp[i][newState]
                    if newCost < oldCost or (newCost == oldCost and prevLetter < oldLetter):
                        dp[i][newState] = candidate
                # Option 2: break the block and start a new block with any other letter
                for newLetter in string.ascii_lowercase:
                    if newLetter == prevLetter: 
                        continue  # skipping same letter because that would be Option 1
                    cost_here = abs(ord(curr_char) - ord(newLetter))
                    newCost = prevCost + cost_here
                    newState = (newLetter, 1)
                    candidate = (newCost, (i-1, prevLetter, run), newLetter)
                    if newState not in dp[i]:
                        dp[i][newState] = candidate
                    else:
                        oldCost, oldPtr, oldLetter = dp[i][newState]
                        if newCost < oldCost or (newCost == oldCost and newLetter < oldLetter):
                            dp[i][newState] = candidate

    # Now, among dp[n-1] we only accept states where the block is complete (run == 3)
    answer_cost = sys.maxsize
    final_state = None
    for (letter, run) in dp[n-1]:
        if run == 3:
            cost, pointer, chosen = dp[n-1][(letter, run)]
            if cost < answer_cost:
                answer_cost = cost
                final_state = (n-1, letter, run)
            elif cost == answer_cost:
                # tie-break: lexicographically compare full reconstruction
                candidate_str = reconstruct(dp, n-1, (letter, run))
                current_str = reconstruct(dp, final_state[0], (final_state[1], final_state[2]))
                if candidate_str < current_str:
                    final_state = (n-1, letter, run)
    if final_state is None:
        return ""
    return reconstruct(dp, final_state[0], (final_state[1], final_state[2]))

def reconstruct(dp, i, state):
    # Reconstructs the solution string from dp table.
    res = []
    curr_state = (i, state[0], state[1])
    while curr_state is not None:
        i_index, letter, run = curr_state
        res.append(letter)
        prev_info = dp[i_index][(letter, run)][1] if i_index>=0 and (letter, run) in dp[i_index] else None
        if prev_info is None:
            break
        i_prev, prevLetter, prevRun = prev_info
        curr_state = (i_prev, prevLetter, prevRun)
    return "".join(reversed(res))

# Example usage:
#print(minimumCostGoodCaption("cdcd"))
#print(minimumCostGoodCaption("aca"))
#print(minimumCostGoodCaption("bc"))
← Back to All Questions