We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Design SQL

Number: 2555

Difficulty: Medium

Paid? Yes

Companies: OpenAI, Amazon


Problem Description

Design a class to simulate a simple SQL system that supports creating multiple tables with fixed number of columns. For each table, you need to implement operations to insert rows with auto-increment ids, remove rows (without affecting future id assignments), select a specific cell, and export the entire table in CSV format. Each row must include its auto-assigned id as its first field when exported.


Key Insights

  • Use a hash map (dictionary) to map table names to their corresponding table data.
  • Each table stores the expected number of columns, a next available id (starting from 1), and a mapping from row id to the actual row values.
  • Ensure row insertion only happens if the provided row matches the expected number of columns.
  • For removal, simply delete the row entry; do not adjust the auto-increment counter.
  • When exporting, sort rows by their ids and join the id and cell values using commas.
  • For sparse tables with many deletions, using a dictionary for rows prevents excessive space usage from preallocated arrays.

Space and Time Complexity

Time Complexity:

  • Insert: O(1) per operation (with constant time row length check)
  • Remove: O(1)
  • Select: O(1)
  • Export: O(m log m) if sorting m rows, although if rows are kept in insertion order or tracked, it can be made O(m)

Space Complexity:

  • O(t + m) where t is the number of tables and m is the total number of rows stored across tables (additional overhead for deleted rows is minimal since they are removed from storage).

Solution

The solution uses a dictionary (or hash table) to store table metadata and rows. Each table is represented as an object that tracks:

  • The expected number of columns for each row.
  • The next row id to be assigned.
  • A dictionary mapping row ids to the row data (a list/array of values).

For insertion, we check if the table exists and the length of the input row matches the expected number of columns. We assign the next id and store the row. For removal, we simply delete the row from the dictionary if it exists. For selection, we ensure that the table, row, and column exist before returning the value; otherwise, we return "". For export, we iterate through the stored rows (sorting by row id if needed) and for each row, create a comma-separated string starting with the row id followed by the cell values.

For sparse tables with many deletions, using a dictionary to store rows is more memory efficient compared to an array since it only stores existing rows, and operations remain efficient.


Code Solutions

# Python code solution with line-by-line comments

class SQL:
    def __init__(self, names, columns):
        # Initialize a dictionary to hold each table's metadata and rows
        self.tables = {}
        for name, col in zip(names, columns):
            # Each table stores the expected number of columns, the next available row id, and a dictionary for rows.
            self.tables[name] = {'cols': col, 'next_id': 1, 'rows': {}}

    def ins(self, name, row):
        # Insert a row into the table with name 'name'
        if name not in self.tables:
            return False  # Invalid table name
        table = self.tables[name]
        # Check if row length matches expected number of columns
        if len(row) != table['cols']:
            return False
        # Get the current next available id and insert the row
        row_id = table['next_id']
        table['rows'][row_id] = row
        # Increment the next id irrespective of any removals
        table['next_id'] += 1
        return True

    def rmv(self, name, rowId):
        # Remove a row from the table, if it exists
        if name not in self.tables:
            return  # Invalid table name, do nothing
        table = self.tables[name]
        if rowId in table['rows']:
            del table['rows'][rowId]

    def sel(self, name, rowId, columnId):
        # Retrieve a specific cell value from a table
        if name not in self.tables:
            return "<null>"
        table = self.tables[name]
        # Check if the row exists and if columnId is within the valid range (1-indexed)
        if rowId not in table['rows'] or columnId < 1 or columnId > table['cols']:
            return "<null>"
        # Return the appropriate column value (list is 0-indexed)
        return table['rows'][rowId][columnId - 1]

    def exp(self, name):
        # Export all rows of a table in CSV format
        if name not in self.tables:
            return []
        table = self.tables[name]
        exported = []
        # Sorting rowIds to ensure correct order in export
        for row_id in sorted(table['rows'].keys()):
            # Prepend row id to the row values and join with commas
            row_data = [str(row_id)] + table['rows'][row_id]
            exported.append(",".join(row_data))
        return exported

# Sample usage:
# sql = SQL(["one", "two", "three"], [2, 3, 1])
# print(sql.ins("two", ["first", "second", "third"]))  # True
# print(sql.sel("two", 1, 3))  # "third"
# print(sql.ins("two", ["fourth", "fifth", "sixth"]))  # True
# print(sql.exp("two"))  # ["1,first,second,third", "2,fourth,fifth,sixth"]
# sql.rmv("two", 1)
# print(sql.sel("two", 2, 2))  # "fifth"
# print(sql.exp("two"))  # ["2,fourth,fifth,sixth"]
← Back to All Questions