I have 2 indexes say: year and month. And I’m using pivot to display the count of products.
Now, in my df, let’s say there are no records of month july and august in 2020. But the pivot will show these months and values 0. I don’t want the pivot to show these irrelevant rows (which are not present in the df) as they make my final pivot very long. How to reduce this?
Here is my example df:
df = pd.DataFrame({'Product Type': ['Fruits', 'Fruits', 'Vegetable', 'Vegetable', 'Vegetable', 'Vegetable', 'Fruits', 'Fruits', 'Vegetables', 'Cars', 'Cars', 'Cars', 'Bikes', 'Bikes'],
'Product': ['Apple', 'Banana', 'Apple', 'Apple', 'Brocoli', 'Carrot', 'Apple', 'Banana', 'Brocoli', 'BMW M3', 'BMW M3', 'BMW M3', 'Hayabusa', 'Hayabusa'],
'Amount': [4938, 3285, 4947, 1516, 2212, 3778, 1110, 4436, 1049, 494, 2818, 3737, 954, 4074],
})
And my code so far:
import pandas as pd
import numpy as np
df = pd.read_csv('try.csv')
bins = [0,1000,2000,5000,float(np.inf)]
labels = ['0-1000','1000-2000','2000-5000','5000+']
df['bins'] = pd.cut(df['Amount'],bins=bins, labels=labels, right=True)
pivot = df.pivot_table(index=['Product Type','Product'],columns='bins', aggfunc='size')
pivot.dropna(inplace=True)
pivot
Expected ouput:
Amount 0-1000 1000-2000 2000-5000 5000+
Product Type Product
Bikes Hayabusa 1 0 1 0
Cars BMW M3 1 0 2 0
Fruits Apple 0 1 1 0
Banana 0 0 2 0
Vegetable Apple 0 1 1 0
Brocoli 0 0 1 0
Carrot 0 0 1 0
Vegetables Brocoli 0 1 0 0
In the df, Bikes only contains 'hayabusa', which i want in my pivot's Bike category. How should I do this?
bins = [0, 1000, 2000, 5000, np.inf]
labels = ['0-1000', '1000-2000', '2000-5000', '5000+']
out = pd.crosstab([df['Product Type'], df['Product']],
pd.cut(df['Amount'], bins=bins, labels=labels)
).reindex(columns=labels, fill_value=0)
Output:
Amount 0-1000 1000-2000 2000-5000 5000+
Product Type Product
Bikes Hayabusa 1 0 1 0
Cars BMW M3 1 0 2 0
Fruits Apple 0 1 1 0
Banana 0 0 2 0
Vegetable Apple 0 1 1 0
Brocoli 0 0 1 0
Carrot 0 0 1 0
Vegetables Brocoli 0 1 0 0
Used input:
df = pd.DataFrame({'Product Type': ['Fruits', 'Fruits', 'Vegetable', 'Vegetable', 'Vegetable', 'Vegetable', 'Fruits', 'Fruits', 'Vegetables', 'Cars', 'Cars', 'Cars', 'Bikes', 'Bikes'],
'Product': ['Apple', 'Banana', 'Apple', 'Apple', 'Brocoli', 'Carrot', 'Apple', 'Banana', 'Brocoli', 'BMW M3', 'BMW M3', 'BMW M3', 'Hayabusa', 'Hayabusa'],
'Amount': [4938, 3285, 4947, 1516, 2212, 3778, 1110, 4436, 1049, 494, 2818, 3737, 954, 4074],
})