Find Substring With Given Hash Value

Number: 2275

Difficulty: Hard

Paid? No

Companies: N/A


Problem Description

Given a string s and integers power, modulo, k, and hashValue, return the first (leftmost) substring of s of length k whose hash equals hashValue. The hash of a string t of length k is

  hash(t, p, m) = (val(t[0]) * p^0 + val(t[1]) * p^1 + ... + val(t[k-1]) * p^(k-1)) mod m

where p = power, m = modulo, and val(c) is the 1-based position of the character c in the alphabet ('a' = 1, 'b' = 2, ..., 'z' = 26). It is guaranteed that an answer exists.
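As a concrete illustration of the formula, the hash of a single candidate substring can be computed directly. This is a minimal sketch; the function name substring_hash is an assumption made for this example and does not come from the problem statement.

```python
def substring_hash(sub: str, power: int, modulo: int) -> int:
    """Evaluate the hash formula term by term (hypothetical helper name)."""
    total = 0
    weight = 1  # power^j mod modulo for the current position j
    for ch in sub:
        val = ord(ch) - ord('a') + 1  # 'a' -> 1, ..., 'z' -> 26
        total = (total + val * weight) % modulo
        weight = (weight * power) % modulo
    return total


# "ee" with power = 7, modulo = 20: (5 * 1 + 5 * 7) mod 20 = 40 mod 20 = 0
print(substring_hash("ee", 7, 20))
```

Each call costs O(k), so evaluating every window this way costs O(n * k) in total.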


Key Insights

  • The hash is computed using powers of the given base (power) and modulo arithmetic.
  • Direct recomputation for each possible substring would be inefficient; a rolling hash approach is ideal.
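  • The formula gives the leftmost character the weight power^0, so sliding the window to the right would require dividing the hash by power, and modular division is not always possible (power may have no inverse modulo modulo). Sliding the window from right to left needs only a multiplication by power, so the string is traversed in reverse.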
  • Careful handling of modulo arithmetic is necessary when subtracting contributions from characters that fall out of the current window.
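The reason for traversing in reverse can be made concrete with a small check. Sliding the window to the right would require dividing the hash by power modulo modulo, which is only possible when power and modulo are coprime. The helper name below is hypothetical, and the sample values are chosen only for illustration.

```python
from math import gcd


def can_divide_by_power(power: int, modulo: int) -> bool:
    """Return True if power has a multiplicative inverse modulo `modulo`."""
    return gcd(power, modulo) == 1


print(can_divide_by_power(7, 20))  # True: 7 * 3 = 21, which is 1 mod 20
print(can_divide_by_power(2, 20))  # False: 2 and 20 share the factor 2

# Subtraction under a modulus: adding modulo back keeps the result non-negative
# even in languages where % can return a negative value.
modulo = 20
print((8 - 16 + modulo) % modulo)  # 12
```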

Space and Time Complexity

Time Complexity: O(n), where n is the length of the string, since we process each character once.

Space Complexity: O(1), as we only maintain a few running variables.


Solution

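We iterate over the string from right to left and maintain a rolling hash of the current window. The formula gives the leftmost character of a window the weight power^0, so when the window moves one position to the left, every character already in it shifts one position to the right and its weight is multiplied by power. The new leftmost character enters with weight power^0, and the character that falls off the right end now carries weight power^k and must be subtracted.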
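Written out, with H_i denoting the hash of the window that starts at index i, p = power, and m = modulo (the symbol H_i is notation introduced here, not taken from the problem statement), the update for i + k < n follows directly from the hash formula:

```latex
H_i = \Bigl(\sum_{j=0}^{k-1} \mathrm{val}(s[i+j])\, p^{j}\Bigr) \bmod m

p \cdot H_{i+1} \equiv \sum_{j=1}^{k} \mathrm{val}(s[i+j])\, p^{j} \pmod{m}

H_i = \bigl(p \cdot H_{i+1} + \mathrm{val}(s[i]) - \mathrm{val}(s[i+k])\, p^{k}\bigr) \bmod m
```

The second line has the terms j = 1, ..., k; adding the j = 0 term and removing the j = k term gives the third line.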

Algorithm steps:

  1. Precompute power^k mod modulo; it is the weight of the character that leaves the window.
  2. Initialize the running hash to 0.
  3. Traverse the string from the end towards the beginning. At each index i:
    • Multiply the running hash by power and add the value of s[i], all modulo modulo. Every character already in the window moves up one power, and s[i] enters with weight power^0.
    • If the window now holds more than k characters (i.e., i + k is less than the string's length), subtract the contribution of s[i + k], which is its value multiplied by the precomputed power^k, keeping the result non-negative.
    • If the window holds exactly k characters (i + k is at most the string's length) and the running hash equals hashValue, record i as the starting index. Later iterations visit smaller indices, so the last recorded index is the leftmost match.
  4. After the loop, return the substring of length k starting at the recorded index.
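The effect of a right-to-left rolling update can be traced on a small input. The sketch below multiplies the running hash by power, adds the entering character, and subtracts the departing character scaled by power^k, then reports the hash of every full window. The function name window_hashes is an assumption made for this illustration.

```python
def window_hashes(s: str, power: int, modulo: int, k: int) -> dict:
    """Map each window start index to its hash, using a reverse rolling update."""
    n = len(s)
    power_k = pow(power, k, modulo)  # weight of the character leaving the window
    current = 0
    hashes = {}
    for i in range(n - 1, -1, -1):
        val = ord(s[i]) - ord('a') + 1
        current = (current * power + val) % modulo
        if i + k < n:  # window grew to k + 1 characters: drop s[i + k]
            leaving = ord(s[i + k]) - ord('a') + 1
            # Python's % returns a non-negative result for a positive modulus
            current = (current - leaving * power_k) % modulo
        if i + k <= n:  # window holds exactly k characters
            hashes[i] = current
    return hashes


for start, h in sorted(window_hashes("leetcode", 7, 20, 2).items()):
    print(start, "leetcode"[start:start + 2], h)
```

With hashValue = 0, the smallest start index whose hash matches is 1, which corresponds to the substring "ee".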

Data Structures:

  • Use basic integer variables to keep track of the running hash, the precomputed power^k, and the recorded index.
  • String slicing is used to extract the resulting substring.

The key idea is the reverse traversal: each window's hash is obtained from the previous one with one multiplication, one addition, and one subtraction modulo modulo, with no per-window exponentiation and no modular division.


Code Solutions

# Python solution with line-by-line comments

class Solution:
    def findSubstring(self, s: str, power: int, modulo: int, k: int, hashValue: int) -> str:
        n = len(s)
        current_hash = 0        # hash of the window that starts at the current index
        result_start = 0        # store the starting index of the target substring

        # Precompute (power^k) mod modulo for removing the trailing character effect
        power_k = pow(power, k, modulo)

        # Process the string in reverse order
        for i in range(n - 1, -1, -1):
            # Convert character to its value ('a' -> 1, ..., 'z' -> 26)
            char_val = ord(s[i]) - ord('a') + 1
            # Shift every character already in the window up by one power,
            # then add the new leftmost character with weight power^0
            current_hash = (current_hash * power + char_val) % modulo
            
            # Once more than k characters have been processed, remove the character that left the window
            if i + k < n:
                # value of character leaving the sliding window (multiplied with power^k)
                remove_val = (ord(s[i + k]) - ord('a') + 1) * power_k % modulo
                current_hash = (current_hash - remove_val + modulo) % modulo  # ensure non-negative mod
                
            # If current window is exactly length k and hash equals hashValue, record starting position.
            if i + k <= n and current_hash == hashValue:
                result_start = i  # update result_start to current index
        
        # Return the substring of length k from the resulting index
        return s[result_start:result_start + k]
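
For testing, a brute-force reference that evaluates the formula for every window is a useful cross-check, at O(n * k) cost. This is a sketch for verification only; the function name find_substring_brute is hypothetical, and the inputs are small cases whose hashes are easy to verify by hand.

```python
def find_substring_brute(s: str, power: int, modulo: int, k: int, hash_value: int) -> str:
    """Return the leftmost length-k substring whose hash equals hash_value."""
    for start in range(len(s) - k + 1):
        window = s[start:start + k]
        h = sum((ord(ch) - ord('a') + 1) * pow(power, j, modulo)
                for j, ch in enumerate(window)) % modulo
        if h == hash_value:
            return window
    return ""  # not reached when an answer is guaranteed to exist


print(find_substring_brute("leetcode", 7, 20, 2, 0))    # ee
print(find_substring_brute("fbxzaad", 31, 100, 3, 32))  # fbx
```

Comparing a rolling-hash solution against this reference on short random strings is a quick way to catch mistakes in the window update.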