We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Minimum Genetic Mutation

Number: 433

Difficulty: Medium

Paid? No

Companies: Google, Microsoft, Amazon, Apple, X


Problem Description

Given a starting gene string, an ending gene string, and a bank of valid gene sequences, determine the fewest single-character mutations required to change the start gene into the end gene. Each mutation must result in a gene that exists in the bank. If no valid mutation sequence exists, return -1.


Key Insights

  • Use Breadth-First Search (BFS) as it finds the shortest path (minimum mutations) in an unweighted graph.
  • Mutations are generated by changing each character of the gene string one by one to one of the valid characters: A, C, G, or T.
  • Use a set for the bank for O(1) membership checks.
  • Avoid revisiting the same gene by marking mutations as visited (or removing them from the bank).

Space and Time Complexity

Time Complexity: O(N * L * 4) where N is the number of genes in the bank and L is the length of the gene (here L=8), due to checking each position with 4 possible mutations. Space Complexity: O(N) for the queue and visited set, with additional O(N) for the bank set.


Solution

We solve the problem using a Breadth-First Search (BFS) algorithm. The idea is to start from the initial gene and generate all possible one-character mutations. Each valid mutation (i.e., one that exists in the bank) is added to the BFS queue. For each level of BFS, we increment the mutation count. When the end gene is reached, we return the current mutation count.

We store the bank in a set to ensure constant time membership checks and avoid revisiting states. This solution guarantees finding the minimum number of mutations needed if a valid sequence exists, otherwise, it returns -1.


Code Solutions

# Python implementation for Minimum Genetic Mutation
from collections import deque

def minMutation(startGene, endGene, bank):
    # Convert bank list to set for quick lookup
    bankSet = set(bank)
    
    # Early exit if endGene is not in the bank
    if endGene not in bankSet:
        return -1

    # Queue holds tuples (current_gene, mutations_count)
    queue = deque([(startGene, 0)])
    # Valid characters for mutation
    validGenes = ['A', 'C', 'G', 'T']

    while queue:
        current, mutations = queue.popleft()
        # If the current gene equals the endGene, we found the mutation path
        if current == endGene:
            return mutations
        
        # Try all possible single-character mutations
        for i in range(len(current)):
            for char in validGenes:
                if char == current[i]:
                    continue
                # Create a new gene string mutation
                mutated = current[:i] + char + current[i+1:]
                # If this mutation is valid and in the bank
                if mutated in bankSet:
                    # Remove to avoid revisiting and add to the queue
                    bankSet.remove(mutated)
                    queue.append((mutated, mutations + 1))
    
    # No valid mutation sequence was found
    return -1

# Example usage:
# print(minMutation("AACCGGTT", "AAACGGTA", ["AACCGGTA","AACCGCTA","AAACGGTA"]))
← Back to All Questions