Reshape Data: Pivot

Number: 3072

Difficulty: Easy

Paid? No

Companies: Amazon


Problem Description

Given a table that records temperatures by city and month, pivot (reshape) the data so that each row represents a unique month and each city becomes its own column containing that month’s temperature.


Key Insights

  • We need to transform (or pivot) the table from a long format to a wide format.
  • The "month" column will be the row identifier.
  • Each unique "city" will have its own column.
  • The cell at the intersection of a month row and a city column will contain the temperature value.
  • The operation is similar to a pivot operation available in many data manipulation libraries.

Space and Time Complexity

Time Complexity: O(n), where n is the number of rows in the input table.

Space Complexity: O(n), for storing the pivoted result.


Solution

The solution traverses each record in the given data and builds a mapping keyed by month. For each record, the temperature is stored in a sub-map keyed by the city name. This creates a structure where each month maps to a dictionary of city-temperature pairs. Finally, the mapping is converted into the desired tabular structure (list of rows or DataFrame). This approach efficiently re-organizes the data in one pass.
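The mapping described above can be sketched in plain Python, without pandas. This is a minimal illustration, not the reference solution: the function name `pivot_records` and the input shape (an iterable of `(city, month, temperature)` tuples) are assumptions made for the example. Sorting the months and cities is also an added choice, so that the output is deterministic; it costs slightly more than the single O(n) pass.

```python
from collections import defaultdict

def pivot_records(records):
    """Pivot (city, month, temperature) tuples into a header plus wide rows.

    Hypothetical helper for illustration; the names and input shape are assumptions.
    """
    by_month = defaultdict(dict)  # month -> {city: temperature}
    cities = set()

    # Single pass: file each temperature under its month, keyed by city.
    for city, month, temperature in records:
        by_month[month][city] = temperature
        cities.add(city)

    # Fix a column order, then emit one row per month.
    city_columns = sorted(cities)
    header = ["month"] + city_columns
    rows = [
        # .get() leaves None where a city has no reading for that month.
        [month] + [by_month[month].get(city) for city in city_columns]
        for month in sorted(by_month)
    ]
    return header, rows
```

Called with the January and February readings for Jacksonville and ElPaso, this returns the header `["month", "ElPaso", "Jacksonville"]` and one row per month. If the same city and month appear twice, the later record silently overwrites the earlier one.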


Code Solutions

Below is a Python implementation using pandas, with explanatory comments.

import pandas as pd

def reshape_data(weather):
    # Pivot the DataFrame so that each row is a month and each column is a city.
    # The values in each cell are the temperatures.
    pivot_table = weather.pivot(index='month', columns='city', values='temperature').reset_index()
    # pivot() already returns rows sorted by the index, so months appear in
    # alphabetical order (April, February, ...), not calendar order.
    return pivot_table

# Example usage:
data = {
    'city': ['Jacksonville', 'Jacksonville', 'Jacksonville', 'Jacksonville', 'Jacksonville',
             'ElPaso', 'ElPaso', 'ElPaso', 'ElPaso', 'ElPaso'],
    'month': ['January', 'February', 'March', 'April', 'May',
              'January', 'February', 'March', 'April', 'May'],
    'temperature': [13, 23, 38, 5, 34, 20, 6, 26, 2, 43]
}

weather = pd.DataFrame(data)
result = reshape_data(weather)
print(result)