I have two dataframes as such:
#Load the required libraries
import pandas as pd
import numpy as np
#Create dataset-1
A_1 = {'id': [1, 1, 1, 1,
2, 2,
3, 3, 3, 3, 3, 3,
4, 4, 4, ],
'cycle': [1,2, 3, 4,
1,2,
1,2, 3, 4, 5,6,
1,2, 3, ],
'Salary': [7,9,10,11,
4, 5,
8,9,10,11,12,13,
7, 7,9,],
'Days': [123, 128, 141, 52,
96, 120,
15,123, 128, 66, 120, 141,
123, 128, 66, ],
}
#Convert to dataframe-1
A_1 = pd.DataFrame(A_1)
print("\n A_1 = \n", A_1)
#Create dataset-2
A_2 = {'id': [1, 1, 1, 1,
2, 2,
3, 3, 3,
4, 4, 4, ],
'cycle': [1,2, 3, 4,
1,2,
1,2, 3,
1,2, 3, ],
'Salary': [8,9,10,12,
4, 5,
8,9,14,
4, 8,9,],
'Days': [98, 128, 141, 52,
1, 5,
15,99, 7,
6, 128, 66, ],
}
#Convert to dataframe-2
A_2 = pd.DataFrame(A_2)
print("\n A_2 = \n", A_2)
The above two dataframes apear as such:
I have concatenated the above dataframes as such:
#Concatenate A_1 and A_2
A_concat = pd.concat([A_1, A_2], axis=0)
print("\n A_concat = \n", A_concat)
The concatenated dataframe appears as such:
Now, I need to Recall A_1 and A_2 from A_concat (Refer above figure). I tried something as such:
#Recall A_1
A_1_recall = A_concat[1]
print("\n A_1_recall = \n", A_1_recall)
#Recall A__2
A_2_recall = A_concat[2]
print("\n A_2_recall = \n", A_2_recall)
However, I don't get the desired output.
I expect that A_1_recall should give me the elements as that in A_1 and A_2_recall should give me the elements as that in A_2, as such:
Can somebody please let me know how do I achieve this task in Python?
You can provide a set of keys
when doing pd.concat()
:
A_concat = pd.concat([A_1, A_2], axis=0, keys=[1,2])
A_concat
now looks like this:
id cycle Salary Days
1 0 1 1 7 123
1 1 2 9 128
2 1 3 10 141
3 1 4 11 52
4 2 1 4 96
5 2 2 5 120
6 3 1 8 15
7 3 2 9 123
8 3 3 10 128
9 3 4 11 66
10 3 5 12 120
11 3 6 13 141
12 4 1 7 123
13 4 2 7 128
14 4 3 9 66
2 0 1 1 8 98
1 1 2 9 128
2 1 3 10 141
3 1 4 12 52
4 2 1 4 1
5 2 2 5 5
6 3 1 8 15
7 3 2 9 99
8 3 3 14 7
9 4 1 4 6
10 4 2 8 128
11 4 3 9 66
Now your A_1_recall
and A_2_recall
can be initiated like so:
A_1_recall = A_concat.loc[1,:]
A_2_recall = A_concat.loc[2,:]
A_1_recall
:
id cycle Salary Days
0 1 1 7 123
1 1 2 9 128
2 1 3 10 141
3 1 4 11 52
4 2 1 4 96
5 2 2 5 120
6 3 1 8 15
7 3 2 9 123
8 3 3 10 128
9 3 4 11 66
10 3 5 12 120
11 3 6 13 141
12 4 1 7 123
13 4 2 7 128
14 4 3 9 66
A_2_recall
:
id cycle Salary Days
0 1 1 8 98
1 1 2 9 128
2 1 3 10 141
3 1 4 12 52
4 2 1 4 1
5 2 2 5 5
6 3 1 8 15
7 3 2 9 99
8 3 3 14 7
9 4 1 4 6
10 4 2 8 128
11 4 3 9 66