We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Probability of a Two Boxes Having The Same Number of Distinct Balls

Number: 1577

Difficulty: Hard

Paid? No

Companies: N/A


Problem Description

Given 2n balls of k distinct colors where the number of balls of each color is given in an array balls, the balls are shuffled uniformly at random and split into two boxes containing n balls each. The task is to calculate the probability that the two boxes have the same number of distinct colors.


Key Insights

  • Use Depth-First Search (DFS) or backtracking to explore all possible distributions of balls for each color into the two boxes.
  • For every color, choose a split (how many of that color go to box1 vs. box2) using combinatorial counts (i.e., combinations).
  • Maintain running counts: total ball count in box1 and distinct color counts in both boxes.
  • Only consider splits where box1 gets exactly n balls; box2 will automatically have n balls.
  • Use combinatorial multiplication to account for the number of ways a particular split can occur.
  • Divide the sum of valid (favorable) distributions by the total possible distributions that yield exactly n balls in box1.

Space and Time Complexity

Time Complexity: O(∏ (balls[i] + 1)) which is exponential in the number of colors; however, balls.length ≤ 8 makes this tractable. Space Complexity: O(k) due to recursion depth, where k is the number of colors.


Solution

We perform a DFS where for each color we explore all possible ways to assign some of its balls to box1 and the rest to box2. For each distribution:

  • Update the count of balls in box1.
  • Update the distinct colors in box1 and box2 if a color appears at least once.
  • Multiply by the combinatorial factor representing the number of ways to choose that many balls from the available ones. Once we assign all colors and box1 has exactly n balls, we check if the number of distinct colors in both boxes are equal. We aggregate the number of ways that result in this condition (favorable count) and also the total number of ways to equally split the balls. Finally, the answer is given by dividing the favorable count by the total count.

Key data structures and techniques:

  • Recursion with DFS/backtracking.
  • Combinatorial calculations using a helper combination function.
  • Maintaining running product of ways to pick balls for current distribution.

Code Solutions

import math

def getProbability(balls):
    totalColors = len(balls)
    totalBalls = sum(balls)
    half = totalBalls // 2
    favorable = 0
    totalWays = 0
    
    # Precompute combination values using math.comb (Python 3.8+)
    def comb(n, k):
        return math.comb(n, k)

    # dfs function will iterate over colors.
    # Parameters:
    # idx: current color index
    # count_box1: number of balls in box1 so far
    # distinct_box1: number of colors in box1 so far
    # distinct_box2: number of colors in box2 so far
    # ways: number of ways to achieve the current state
    def dfs(idx, count_box1, distinct_box1, distinct_box2, ways):
        nonlocal favorable, totalWays
        # When processed all colors, check if valid split
        if idx == totalColors:
            if count_box1 == half:
                totalWays += ways
                if distinct_box1 == distinct_box2:
                    favorable += ways
            return
        
        # Number of balls of the current color
        ball_count = balls[idx]
        # Try every possible distribution of current color from 0 to ball_count in box1
        for i in range(ball_count + 1):
            # i balls go to box1, (ball_count - i) go to box2
            new_count_box1 = count_box1 + i
            # If new_count_box1 exceeds half, no need to continue (prune branch)
            if new_count_box1 > half:
                continue
            # If a color is used at least once in the box, count it as distinct
            new_distinct_box1 = distinct_box1 + (1 if i > 0 else 0)
            new_distinct_box2 = distinct_box2 + (1 if ball_count - i > 0 else 0)
            # Multiply the current number of ways by the number of ways to choose i balls from ball_count
            curr_ways = ways * comb(ball_count, i)
            dfs(idx + 1, new_count_box1, new_distinct_box1, new_distinct_box2, curr_ways)
    
    dfs(0, 0, 0, 0, 1)
    return favorable / totalWays

# Example usage:
print(getProbability([1,1]))   # Expected output: 1.00000
print(getProbability([2,1,1])) # Expected output: 0.66667
print(getProbability([1,2,1,2])) # Expected output: 0.60000
← Back to All Questions