I have a sas proc transpose i'm trying to replicate in pandas.
Here is an example:
ID = ['ID1', 'ID1', 'ID1', 'ID1', 'ID1']
obs_week = [201701,201701,201701,201701,201701]
weeks_id = [1,2,3,4,5]
spend = [100,200,300,400,500]
df = pd.DataFrame(zip(ID, obs_week, weeks_id, spend ), columns = ['id', 'obs_week', 'weeks_id', 'spend'])
df
This gives a table like this:
id obs_week weeks_id spend
0 ID1 201701 1 100
1 ID1 201701 2 200
2 ID1 201701 3 300
3 ID1 201701 4 400
4 ID1 201701 5 500
I am trying to transpose this so that ID1 and obs_week become unique and then the weeks_id becomes the new columns with a prefix.
The sas code looks like this:
proc transpose data=spend out=spend_hh (drop = _label_ _name_) prefix=spend_;
by id obs_week;
id weeks_id;
var spend;
run;
I have managed to get close using df.pivot_table
df.pivot_table(index=['id','obs_week'], columns='weeks_id', aggfunc=sum, fill_value=0)
Giving a table like this
spend
weeks_id 1 2 3 4 5
id obs_week
ID1 201701 100 200 300 400 500
My issue is that i would like to rename 1 2 3 4 5 to be spend_1, spend_2, etc
I also want to do this for multiple different variables within the file but i assume i can just limit the selection to just the fields i want
My answer should look like this:
id obs_week spend_1 spend_2 spend_3 spend_4 spend_5
0 ID1 201701 100 200 300 400 500
Is this just collapsing the headers somehow?
I also want id and obs_week to not be part of the index.
You need list comprehension for create columns names first and then reset_index
for column with index and rename_axis
for remove weeks_id
text:
df = df.pivot_table(index=['id','obs_week'], columns='weeks_id', aggfunc=sum, fill_value=0)
df.columns = ['{}_{}'.format(x[0], x[1]) for x in df.columns]
df = df.reset_index().rename_axis(None, axis=1)
print (df)
id obs_week spend_1 spend_2 spend_3 spend_4 spend_5
0 ID1 201701 100 200 300 400 500
Or:
df.columns = ['_'.join((x[0], str(x[1]))) for x in df.columns]
df = df.reset_index().rename_axis(None, axis=1)
print (df)
id obs_week spend_1 spend_2 spend_3 spend_4 spend_5
0 ID1 201701 100 200 300 400 500