
Get the Size of a DataFrame

Number: 3076

Difficulty: Easy

Paid? No

Companies: Google


Problem Description

Given a DataFrame named players, determine the number of rows and columns it contains and return the result as an array in the form [number of rows, number of columns].


Key Insights

  • The DataFrame’s shape is an inherent property that represents its dimensions.
  • In Python (using pandas), the shape attribute returns a tuple (rows, columns).
  • In other languages that do not have a built-in DataFrame, the same concept can be emulated by treating the structure as a collection of rows, where each row is a collection of columns.
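
As a small illustration of the second point, the sketch below builds a hypothetical three-row, two-column frame (the data is invented for the example) and checks that shape is a plain tuple whose entries match the lengths of the frame and of its column index.

```python
import pandas as pd

# Hypothetical frame used only for illustration: 3 rows, 2 columns.
df = pd.DataFrame({"player_id": [1, 2, 3], "name": ["A", "B", "C"]})

# shape is a plain tuple of the form (rows, columns).
rows, cols = df.shape
shape_is_tuple = isinstance(df.shape, tuple)

# The same numbers are available as the length of the frame (rows)
# and the length of its column index (columns).
same_as_len = rows == len(df) and cols == len(df.columns)

print(df.shape)
```

Unpacking the tuple, as done here, is the same step the solution further down relies on.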

Space and Time Complexity

Time Complexity: O(1). Reading the stored dimensions is a constant-time operation.

Space Complexity: O(1). Only a fixed amount of additional space is used.


Solution

The solution relies on the fact that a pandas DataFrame exposes its dimensions directly. The players DataFrame has an attribute called shape that returns a tuple: (number of rows, number of columns). The solution unpacks that tuple and returns the two values in the required [number of rows, number of columns] form.

For languages without native DataFrame support, a common approach is to represent the table as a list or vector of rows, with each row being a list or array of column values. The number of rows is then the length of the outer list, and the number of columns is the length of any row, if one exists.
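
The list-of-rows idea can be sketched in plain Python without pandas. The function name get_table_size and the sample rows are hypothetical, and reporting 0 columns for an empty table is an assumption of this sketch rather than something the problem specifies.

```python
from typing import List, Sequence


def get_table_size(rows: Sequence[Sequence[object]]) -> List[int]:
    """Return [number of rows, number of columns] for a list-of-rows table."""
    num_rows = len(rows)
    # Assumption: with no rows there is nothing to measure, so report 0 columns.
    num_cols = len(rows[0]) if num_rows > 0 else 0
    return [num_rows, num_cols]


# Hypothetical sample: two rows with three column values each.
table = [
    [846, "Mason", 21],
    [749, "Riley", 30],
]
print(get_table_size(table))  # [2, 3]
print(get_table_size([]))     # [0, 0]
```

Unlike a real DataFrame, this representation cannot describe a table that has columns but no rows, which is why the empty case needs an explicit convention.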


Code Solutions

import pandas as pd

# Function to return the number of rows and columns of the DataFrame players
def getDataFrameSize(players):
    # players.shape returns a tuple (number of rows, number of columns)
    num_rows, num_cols = players.shape
    # Return the dimensions as a list [number of rows, number of columns]
    return [num_rows, num_cols]

# Example usage:
if __name__ == "__main__":
    # Sample data representing the DataFrame
    data = {
        "player_id": [846, 749, 155, 583, 388, 883, 355, 247, 761, 642],
        "name": ["Mason", "Riley", "Bob", "Isabella", "Zachary", "Ava", "Violet", "Thomas", "Jack", "Charlie"],
        "age": [21, 30, 28, 32, 24, 23, 18, 27, 33, 36],
        "position": ["Forward", "Winger", "Striker", "Goalkeeper", "Midfielder", "Defender", "Striker", "Striker", "Midfielder", "Center-back"],
        "team": ["RealMadrid", "Barcelona", "ManchesterUnited", "Liverpool", "BayernMunich", "Chelsea", "Juventus", "ParisSaint-Germain", "ManchesterCity", "Arsenal"]
    }
    # Create a DataFrame named players from the data
    players = pd.DataFrame(data)
    # Print the size of the DataFrame: [number of rows, number of columns]
    print(getDataFrameSize(players))  # Expected Output: [10, 5]