I have a file consisting of sales for different items. I have a column of model predictions named outputs
and which model produced those predictions named, model
. I need to take the current model's (that's in production) predictions in a column named 'FCAST_QTYand make them part of the outputs column with the string "FCAST_QTY" being put into the
model column. So essentially, melting that column (
FCAST_QTY') into the columns output
and model
so the current production model is in the same columns and the multiple models that are in development. This will make it easier to compare/contrast. I'm not sure how to do this. Example data below.
import pandas as pd
from pandas import Timestamp
sales_dict = {'MO_YR': {0: Timestamp('2021-01-01 00:00:00'),
1: Timestamp('2021-02-01 00:00:00'),
2: Timestamp('2021-03-01 00:00:00'),
3: Timestamp('2021-04-01 00:00:00'),
4: Timestamp('2021-05-01 00:00:00'),
5: Timestamp('2021-06-01 00:00:00'),
6: Timestamp('2021-07-01 00:00:00'),
7: Timestamp('2021-08-01 00:00:00'),
8: Timestamp('2021-09-01 00:00:00'),
9: Timestamp('2021-10-01 00:00:00'),
10: Timestamp('2021-11-01 00:00:00'),
11: Timestamp('2021-12-01 00:00:00'),
12: Timestamp('2021-01-01 00:00:00'),
13: Timestamp('2021-02-01 00:00:00'),
14: Timestamp('2021-03-01 00:00:00')},
'ITEM_BASE': {0: '289461K',
1: '289461K',
2: '289461K',
3: '289461K',
4: '289461K',
5: '289461K',
6: '289461K',
7: '289461K',
8: '289461K',
9: '289461K',
10: '289461K',
11: '289461K',
12: '400520J',
13: '400520J',
14: '400520J'},
'eaches': {0: 2592,
1: 3844,
2: 759,
3: 825,
4: 663,
5: 2025,
6: 471,
7: 1160,
8: 5987,
9: 679,
10: 469,
11: 907,
12: 64,
13: 48,
14: 63},
'FCAST_QTY': {0: 2800.0,
1: 5200.0,
2: 550.0,
3: 475.0,
4: 575.0,
5: 475.0,
6: 650.0,
7: 550.0,
8: 7900.0,
9: 1187.0,
10: 1187.0,
11: 1900.0,
12: 51.0,
13: 55.0,
14: 59.0},
'log_eaches': {0: 7.860185057472165,
1: 8.254268770090183,
2: 6.63200177739563,
3: 6.715383386334682,
4: 6.496774990185863,
5: 7.61332497954064,
6: 6.154858094016418,
7: 7.05617528410041,
8: 8.697345730925353,
9: 6.520621127558696,
10: 6.15060276844628,
11: 6.810142450115136,
12: 4.158883083359672,
13: 3.871201010907891,
14: 4.143134726391533},
'output': {0: 8.646015798513993,
1: 8.378197900630752,
2: 7.045235414873291,
3: 5.117058321275769,
4: 9.082928370640056,
5: 5.225648643174155,
6: 7.446383013291042,
7: 6.307484284901181,
8: 7.752673979530179,
9: 9.02189934080111,
10: 4.677594714421006,
11: 7.218749101888444,
12: 4.04018241973268,
13: 3.940978322900716,
14: 3.962359464699719},
'model': {0: 'LR_output',
1: 'LR_output',
2: 'LR_output',
3: 'LR_output',
4: 'LR_output',
5: 'LR_output',
6: 'LR_output',
7: 'LR_output',
8: 'LR_output',
9: 'LR_output',
10: 'LR_output',
11: 'LR_output',
12: 'AR(12)',
13: 'AR(12)',
14: 'AR(12)'}}
df = pd.DataFrame.from_dict(sales_dict)
Expected Output Added:
expected_dict = {'MO_YR': {0: Timestamp('2021-01-01 00:00:00'),
1: Timestamp('2021-02-01 00:00:00'),
2: Timestamp('2021-03-01 00:00:00'),
3: Timestamp('2021-04-01 00:00:00'),
4: Timestamp('2021-05-01 00:00:00'),
5: Timestamp('2021-06-01 00:00:00'),
6: Timestamp('2021-07-01 00:00:00'),
7: Timestamp('2021-08-01 00:00:00'),
8: Timestamp('2021-09-01 00:00:00'),
9: Timestamp('2021-10-01 00:00:00'),
10: Timestamp('2021-11-01 00:00:00'),
11: Timestamp('2021-12-01 00:00:00'),
12: Timestamp('2021-01-01 00:00:00'),
13: Timestamp('2021-02-01 00:00:00'),
14: Timestamp('2021-03-01 00:00:00'),
15: Timestamp('2021-01-01 00:00:00'),
16: Timestamp('2021-02-01 00:00:00'),
17: Timestamp('2021-03-01 00:00:00'),
18: Timestamp('2021-04-01 00:00:00'),
19: Timestamp('2021-05-01 00:00:00'),
20: Timestamp('2021-06-01 00:00:00'),
21: Timestamp('2021-07-01 00:00:00'),
22: Timestamp('2021-08-01 00:00:00'),
23: Timestamp('2021-09-01 00:00:00'),
24: Timestamp('2021-10-01 00:00:00'),
25: Timestamp('2021-11-01 00:00:00'),
26: Timestamp('2021-12-01 00:00:00'),
27: Timestamp('2021-01-01 00:00:00'),
28: Timestamp('2021-02-01 00:00:00'),
29: Timestamp('2021-03-01 00:00:00')},
'ITEM_BASE': {0: '289461K',
1: '289461K',
2: '289461K',
3: '289461K',
4: '289461K',
5: '289461K',
6: '289461K',
7: '289461K',
8: '289461K',
9: '289461K',
10: '289461K',
11: '289461K',
12: '400520J',
13: '400520J',
14: '400520J',
15: '289461K',
16: '289461K',
17: '289461K',
18: '289461K',
19: '289461K',
20: '289461K',
21: '289461K',
22: '289461K',
23: '289461K',
24: '289461K',
25: '289461K',
26: '289461K',
27: '400520J',
28: '400520J',
29: '400520J'},
'eaches': {0: 2592,
1: 3844,
2: 759,
3: 825,
4: 663,
5: 2025,
6: 471,
7: 1160,
8: 5987,
9: 679,
10: 469,
11: 907,
12: 64,
13: 48,
14: 63,
15: 2592,
16: 3844,
17: 759,
18: 825,
19: 663,
20: 2025,
21: 471,
22: 1160,
23: 5987,
24: 679,
25: 469,
26: 907,
27: 64,
28: 48,
29: 63},
'log_eaches': {0: 7.860185057472165,
1: 8.254268770090183,
2: 6.63200177739563,
3: 6.715383386334682,
4: 6.496774990185863,
5: 7.61332497954064,
6: 6.154858094016418,
7: 7.05617528410041,
8: 8.697345730925353,
9: 6.520621127558696,
10: 6.15060276844628,
11: 6.810142450115136,
12: 4.158883083359672,
13: 3.871201010907891,
14: 4.143134726391533,
15: 7.860185057472165,
16: 8.254268770090183,
17: 6.63200177739563,
18: 6.715383386334682,
19: 6.496774990185863,
20: 7.61332497954064,
21: 6.154858094016418,
22: 7.05617528410041,
23: 8.697345730925353,
24: 6.520621127558696,
25: 6.15060276844628,
26: 6.810142450115136,
27: 4.158883083359672,
28: 3.871201010907891,
29: 4.143134726391533,},
'output': {0: 8.646015798513993,
1: 8.378197900630752,
2: 7.045235414873291,
3: 5.117058321275769,
4: 9.082928370640056,
5: 5.225648643174155,
6: 7.446383013291042,
7: 6.307484284901181,
8: 7.752673979530179,
9: 9.02189934080111,
10: 4.677594714421006,
11: 7.218749101888444,
12: 4.04018241973268,
13: 3.940978322900716,
14: 3.962359464699719,
15: 2800.0,
16: 5200.0,
17: 550.0,
18: 475.0,
19: 575.0,
20: 475.0,
21: 650.0,
22: 550.0,
23: 7900.0,
24: 1187.0,
25: 1187.0,
26: 1900.0,
27: 51.0,
28: 55.0,
29: 59.0},
'model': {0: 'LR_output',
1: 'LR_output',
2: 'LR_output',
3: 'LR_output',
4: 'LR_output',
5: 'LR_output',
6: 'LR_output',
7: 'LR_output',
8: 'LR_output',
9: 'LR_output',
10: 'LR_output',
11: 'LR_output',
12: 'AR(12)',
13: 'AR(12)',
14: 'AR(12)',
15:'FCAST_QTY',
16:'FCAST_QTY',
17:'FCAST_QTY',
18:'FCAST_QTY',
19:'FCAST_QTY',
20:'FCAST_QTY',
21:'FCAST_QTY',
22:'FCAST_QTY',
23:'FCAST_QTY',
24:'FCAST_QTY',
25:'FCAST_QTY',
26:'FCAST_QTY',
27:'FCAST_QTY',
28:'FCAST_QTY',
29:'FCAST_QTY'}}
df = pd.DataFrame.from_dict(expected_dict)
Create a new dataframe with your logic and append it to the original dataframe:
fcast_qty = (df
.drop(columns = ['output', 'model'])
.rename(columns={"FCAST_QTY":"output"})
.assign(model="FCAST_QTY")
)
pd.concat([df.drop(columns='FCAST_QTY'), fcast_qty], ignore_index = True)
MO_YR ITEM_BASE eaches log_eaches output model
0 2021-01-01 289461K 2592 7.860185 8.646016 LR_output
1 2021-02-01 289461K 3844 8.254269 8.378198 LR_output
2 2021-03-01 289461K 759 6.632002 7.045235 LR_output
3 2021-04-01 289461K 825 6.715383 5.117058 LR_output
4 2021-05-01 289461K 663 6.496775 9.082928 LR_output
5 2021-06-01 289461K 2025 7.613325 5.225649 LR_output
6 2021-07-01 289461K 471 6.154858 7.446383 LR_output
7 2021-08-01 289461K 1160 7.056175 6.307484 LR_output
8 2021-09-01 289461K 5987 8.697346 7.752674 LR_output
9 2021-10-01 289461K 679 6.520621 9.021899 LR_output
10 2021-11-01 289461K 469 6.150603 4.677595 LR_output
11 2021-12-01 289461K 907 6.810142 7.218749 LR_output
12 2021-01-01 400520J 64 4.158883 4.040182 AR(12)
13 2021-02-01 400520J 48 3.871201 3.940978 AR(12)
14 2021-03-01 400520J 63 4.143135 3.962359 AR(12)
15 2021-01-01 289461K 2592 7.860185 2800.000000 FCAST_QTY
16 2021-02-01 289461K 3844 8.254269 5200.000000 FCAST_QTY
17 2021-03-01 289461K 759 6.632002 550.000000 FCAST_QTY
18 2021-04-01 289461K 825 6.715383 475.000000 FCAST_QTY
19 2021-05-01 289461K 663 6.496775 575.000000 FCAST_QTY
20 2021-06-01 289461K 2025 7.613325 475.000000 FCAST_QTY
21 2021-07-01 289461K 471 6.154858 650.000000 FCAST_QTY
22 2021-08-01 289461K 1160 7.056175 550.000000 FCAST_QTY
23 2021-09-01 289461K 5987 8.697346 7900.000000 FCAST_QTY
24 2021-10-01 289461K 679 6.520621 1187.000000 FCAST_QTY
25 2021-11-01 289461K 469 6.150603 1187.000000 FCAST_QTY
26 2021-12-01 289461K 907 6.810142 1900.000000 FCAST_QTY
27 2021-01-01 400520J 64 4.158883 51.000000 FCAST_QTY
28 2021-02-01 400520J 48 3.871201 55.000000 FCAST_QTY
29 2021-03-01 400520J 63 4.143135 59.000000 FCAST_QTY