exdf = pd.DataFrame({'Employee name': ['Alex','Mike'],
'2014.1': [5, 2], '2014.2': [3, 4], '2014.3': [3, 6], '2014.4': [4, 3], '2015.1': [7, 5], '2015.2': [5, 4]})
exdf
Employee name 2014.1 2014.2 2014.3 2014.4 2015.1 2015.2
0 Alex 5 3 3 4 7 5
1 Mike 2 4 6 3 5 4
Suppose the above dataframe has several such rows and columns with output from each employee for each quarter. I want to create a new dataframe with columns:
newdf=pd.Dataframe(columns=['Employee name','Year','Quarter','Output'])
So, new dataframe will have nxm rows where n and m are rows and columns in original dataframe. What I have tried is filling every row and column entry using nested for loop.
But I'm sure there is a more efficient method.
for i in range(df.shape[0]):
for j in range(df.shape[1]):
newdf.iloc[?]=exdf.iloc[?]
Use DataFrame.melt
with Series.str.split
, last change order of columns:
df = exdf.melt('Employee name', var_name='Year', value_name='Output')
df[['Year', 'Quarter']] = df['Year'].str.split('.', expand=True)
df = df[['Employee name','Year','Quarter','Output']]
print (df)
Employee name Year Quarter Output
0 Alex 2014 1 5
1 Mike 2014 1 2
2 Alex 2014 2 3
3 Mike 2014 2 4
4 Alex 2014 3 3
5 Mike 2014 3 6
6 Alex 2014 4 4
7 Mike 2014 4 3
8 Alex 2015 1 7
9 Mike 2015 1 5
10 Alex 2015 2 5
11 Mike 2015 2 4