r, python-3.x, group-by, summarize

Python equivalent to dplyr's summarize


Is there a summarize function in Python like the one in R?

I was going over the Apriori frequent itemset algorithm and was looking for a good dataset. I found one here.

I can more or less read and understand R, but I do not know whether a summarize function exists in Python.

In R, the notebook has:

order_baskets <- ordr_pr %>% 
  inner_join(prods, by="product_id") %>% 
  group_by(order_id) %>%
  summarise(basket = as.vector(list(product_name)))

In Python I would just write:

(pd.merge(ordr_pr, prods, how='inner', on='product_id')
   .groupby('order_id'))
  # summarize(basket = as.vector(list(product_name)))

After the merge I am lost. I am not even sure whether groupby does the same thing in Python as it does in R.


Solution

  • You are looking for the aggregate function, or its alias agg. For example:

    pd.merge(ordr_pr, prods, how='inner', on='product_id').groupby('order_id').agg({'product_name': list})
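To see the whole pipeline run end to end, here is a minimal sketch on made-up data. The two small DataFrames are hypothetical stand-ins for the real ordr_pr and prods tables, which are not shown in the question. It also assumes pandas 0.25 or later, since it uses named aggregation so that the result column is called basket, as in the dplyr version.

```python
import pandas as pd

# Hypothetical toy data standing in for the real order and product tables.
ordr_pr = pd.DataFrame({
    "order_id": [1, 1, 2, 2, 2],
    "product_id": [10, 11, 10, 12, 13],
})
prods = pd.DataFrame({
    "product_id": [10, 11, 12, 13],
    "product_name": ["Banana", "Milk", "Bread", "Eggs"],
})

order_baskets = (
    ordr_pr.merge(prods, how="inner", on="product_id")  # inner_join(prods, by="product_id")
    .groupby("order_id", as_index=False)                # group_by(order_id)
    .agg(basket=("product_name", list))                 # summarise(basket = list(product_name))
)

# One row per order; each basket cell holds a Python list of product names.
baskets = order_baskets.set_index("order_id")["basket"].to_dict()
print(order_baskets)
```

As in dplyr, groupby on its own only defines the groups; nothing is collapsed until agg is applied. Passing as_index=False keeps order_id as a regular column rather than the index, which is closer to the tibble that summarise returns.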