Reshape Data: Concatenate

Number: 3064

Difficulty: Easy

Paid? No

Companies: N/A


Problem Description

Given two DataFrames, df1 and df2, both containing the same columns (student_id, name, age), the task is to vertically concatenate (stack) the rows from df2 underneath the rows from df1. The combined DataFrame should include all rows from both source DataFrames.


Key Insights

  • Both DataFrames have identical schemas, which makes vertical stacking straightforward.
  • The problem is analogous to performing a union operation without dropping duplicates.
  • Data-frame libraries usually provide a built-in function for this; in pandas it is pd.concat.
  • The operation involves simply appending the rows of one DataFrame to another.
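
The "union without dropping duplicates" point can be illustrated with a minimal sketch. The sample rows below are hypothetical and chosen so that one row appears in both DataFrames; drop_duplicates is shown only for contrast and is not part of the required solution.

```python
import pandas as pd

# Hypothetical sample data: the row (2, "Ava", 6) appears in both frames.
df1 = pd.DataFrame({"student_id": [1, 2], "name": ["Mason", "Ava"], "age": [8, 6]})
df2 = pd.DataFrame({"student_id": [2, 3], "name": ["Ava", "Taylor"], "age": [6, 15]})

# Vertical stacking keeps every row, including the repeated one.
stacked = pd.concat([df1, df2], axis=0)

# For contrast only: a set-style union would remove the repeated row.
deduplicated = stacked.drop_duplicates()

print(len(stacked))       # 4 rows: nothing is dropped
print(len(deduplicated))  # 3 rows: the duplicate is removed
```

Note that stacked keeps the original index labels of each input (0, 1, 0, 1 here), which is why resetting the index is often a useful follow-up step.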

Space and Time Complexity

Time Complexity: O(n + m), where n is the number of rows in df1 and m is the number of rows in df2.
Space Complexity: O(n + m), as a new DataFrame is created containing all rows.


Solution

We can solve this problem with the concatenation function built into pandas. pd.concat takes a list of DataFrames and, with axis=0 (the default), stacks their rows in the order given, so passing [df1, df2] places the rows of df2 underneath the rows of df1. Similar approaches apply in other languages by combining arrays, vectors, or lists. Two details are worth knowing. First, pd.concat aligns columns by name rather than by position, so the column names in both DataFrames must match; a column present in only one input would be filled with missing values in the other's rows. Second, each input keeps its own index labels, so the result contains repeated labels unless the index is reset.
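
As a rough sketch of this approach, the concatenation can be wrapped in a small function. The name concatenate_tables and the sample rows are assumptions made for illustration, not taken from the problem statement. The second DataFrame deliberately lists its columns in a different order to show that pd.concat matches them by name.

```python
import pandas as pd


def concatenate_tables(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    """Stack the rows of df2 underneath df1 (hypothetical helper name)."""
    # ignore_index=True renumbers the rows 0..n+m-1 instead of keeping
    # the original index labels of each input.
    return pd.concat([df1, df2], axis=0, ignore_index=True)


# Hypothetical sample data; df2 uses a different column order on purpose.
df1 = pd.DataFrame({"student_id": [1], "name": ["Mason"], "age": [8]})
df2 = pd.DataFrame({"age": [7], "name": ["Leo"], "student_id": [5]})

result = concatenate_tables(df1, df2)
print(result)
```

The result follows the column order of df1, and the row from df2 lands in the matching columns even though its columns were declared in a different order.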


Code Solutions

import pandas as pd                                          # Import pandas library

# Create the first DataFrame (df1)
data1 = {
    'student_id': [1, 2, 3, 4],
    'name': ['Mason', 'Ava', 'Taylor', 'Georgia'],
    'age': [8, 6, 15, 17]
}
df1 = pd.DataFrame(data1)                                    # Construct df1 with sample data

# Create the second DataFrame (df2)
data2 = {
    'student_id': [5, 6],
    'name': ['Leo', 'Alex'],
    'age': [7, 7]
}
df2 = pd.DataFrame(data2)                                    # Construct df2 with sample data

# Concatenate DataFrames vertically using pd.concat.
# ignore_index=True renumbers the rows 0..5 instead of keeping each
# input's original index labels (equivalent to reset_index(drop=True)).
result = pd.concat([df1, df2], axis=0, ignore_index=True)   # Combine the two DataFrames row-wise

# Print the resulting concatenated DataFrame
print(result)