We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Longest Repeating Substring

Number: 1060

Difficulty: Medium

Paid? Yes

Companies: Coupang, Meta, Google, Amazon, TikTok


Problem Description

Given a string s, find the length of its longest repeating substring. A repeating substring is one that appears in the string at least twice (the occurrences may overlap). If no such substring exists, return 0.


Key Insights

  • Use binary search to efficiently check for the existence of a repeating substring of a given length.
  • Apply a rolling hash technique to compute hash values for substrings in O(1) time after initial processing.
  • Use a hash set to detect duplicate hash values (indicating a repeating substring).
  • Adjust binary search bounds based on whether a repeating substring of the current length exists.
  • Alternative approaches include dynamic programming and suffix arrays, but the binary search with rolling hash is efficient and conceptually clean.

Space and Time Complexity

Time Complexity: O(n log n) on average, where n is the length of the string. (Each binary search step takes O(n) time for computing rolling hashes.) Space Complexity: O(n) for storing hash values in the hash set.


Solution

The solution uses binary search to decide on the candidate length L of the repeating substring. For each candidate length L, we use a rolling hash to compute hash values for all substrings of length L and store them in a hash set. If any hash is repeated, it confirms the presence of a repeating substring of length L, and we try a longer length. Otherwise, we decrease the candidate length. The rolling hash is computed using a base (26, for lowercase letters) and a modulus (2^32 to limit integer overflow) which allows efficient hash recomputation for sliding windows. This approach efficiently narrows down the maximum repeating substring length.


Code Solutions

def longestRepeatingSubstring(s):
    n = len(s)
    # Convert characters to integers for rolling hash computation.
    nums = [ord(c) - ord('a') for c in s]
    
    # Base value for rolling hash.
    a = 26
    # Modulus value to avoid overflow.
    mod = 2**32

    # Helper function to check if any substring of given length L repeats.
    def search(L):
        h = 0
        for i in range(L):
            h = (h * a + nums[i]) % mod
        seen = {h}
        # Precompute a^L % mod for removing the first digit in the rolling hash.
        aL = pow(a, L, mod)
        for start in range(1, n - L + 1):
            h = (h * a - nums[start - 1] * aL + nums[start + L - 1]) % mod
            if h in seen:
                return True
            seen.add(h)
        return False

    left, right = 1, n
    ans = 0
    while left <= right:
        mid = left + (right - left) // 2
        if search(mid):
            ans = mid
            left = mid + 1
        else:
            right = mid - 1
    return ans

# Example usage:
s = "aabcaabdaab"
print(longestRepeatingSubstring(s))  # Expected output: 3
← Back to All Questions