Trouble specifying column names when making dataframe from aggregate function


When I make a dataframe with:

freq = pd.DataFrame(combined.groupby(['Latitude', 'Longitude','from_station_name']).agg('count')['trip_id'])

It works just fine, but when I attempt:

freq = pd.DataFrame(combined.groupby(['Latitude', 'Longitude','from_station_name']).agg('count')['trip_id'], columns = ['lat','long','station','trips'])

I just see the headers when I look at the dataframe. I can make the dataframe and then use:

freq.columns = ['lat','long','station','trips']

But I was wondering how to do this in one step. I've tried specifying "data =" for the aggregate function, double-enclosing the brackets for the column names, and removing the brackets altogether. Any advice is appreciated.


Solution

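  • You don't need to pass the groupby result into a new DataFrame constructor (as @Vaishali already mentioned). The columns argument of the constructor selects columns by label rather than renaming them, which is most likely why you only see the headers: none of the requested labels exist in the data.

    After the groupby, the group keys sit in the index and the count is a Series, so move the keys back into columns with reset_index (naming the count column at the same time) and then rename them:

    combined.groupby(['Latitude', 'Longitude', 'from_station_name']).trip_id.count().reset_index(name='trips').rename(columns={'Latitude': 'lat', 'Longitude': 'long', 'from_station_name': 'station'})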
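To make the one-step version concrete, here is a minimal sketch on a tiny made-up frame. The contents of `combined` below are hypothetical stand-ins (the real data is not shown in the question); only the column names come from the post. The second variant uses named aggregation, which assumes pandas 0.25 or newer.

```python
import pandas as pd

# Hypothetical stand-in for `combined`; the values are invented for illustration.
combined = pd.DataFrame({
    "trip_id": [101, 102, 103, 104],
    "Latitude": [41.88, 41.88, 41.88, 41.90],
    "Longitude": [-87.62, -87.62, -87.62, -87.63],
    "from_station_name": ["A", "A", "A", "B"],
})

keys = ["Latitude", "Longitude", "from_station_name"]
new_names = {"Latitude": "lat", "Longitude": "long", "from_station_name": "station"}

# Variant 1: count trip_id per group, turn the index levels back into columns
# (naming the count column "trips"), then rename the key columns.
freq = (
    combined.groupby(keys)["trip_id"]
    .count()
    .reset_index(name="trips")
    .rename(columns=new_names)
)

# Variant 2: named aggregation with as_index=False keeps the keys as columns.
freq_alt = (
    combined.groupby(keys, as_index=False)
    .agg(trips=("trip_id", "count"))
    .rename(columns=new_names)
)

print(freq)
```

Both variants should give a frame with the columns lat, long, station and trips, with one row per station. Which one reads better is mostly a matter of taste; the named-aggregation form scales more naturally if you later add further aggregates.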