Search code examples
pandaspandas-melt

How to Melt a column into another melted column within Pandas?


I have a file consisting of sales for different items. I have a column of model predictions named outputs and which model produced those predictions named, model. I need to take the current model's (that's in production) predictions in a column named 'FCAST_QTYand make them part of the outputs column with the string "FCAST_QTY" being put into themodel column. So essentially, melting that column (FCAST_QTY') into the columns output and model so the current production model is in the same columns and the multiple models that are in development. This will make it easier to compare/contrast. I'm not sure how to do this. Example data below.

import pandas as pd
from pandas import Timestamp
sales_dict = {'MO_YR': {0: Timestamp('2021-01-01 00:00:00'),
  1: Timestamp('2021-02-01 00:00:00'),
  2: Timestamp('2021-03-01 00:00:00'),
  3: Timestamp('2021-04-01 00:00:00'),
  4: Timestamp('2021-05-01 00:00:00'),
  5: Timestamp('2021-06-01 00:00:00'),
  6: Timestamp('2021-07-01 00:00:00'),
  7: Timestamp('2021-08-01 00:00:00'),
  8: Timestamp('2021-09-01 00:00:00'),
  9: Timestamp('2021-10-01 00:00:00'),
  10: Timestamp('2021-11-01 00:00:00'),
  11: Timestamp('2021-12-01 00:00:00'),
  12: Timestamp('2021-01-01 00:00:00'),
  13: Timestamp('2021-02-01 00:00:00'),
  14: Timestamp('2021-03-01 00:00:00')},
 'ITEM_BASE': {0: '289461K',
  1: '289461K',
  2: '289461K',
  3: '289461K',
  4: '289461K',
  5: '289461K',
  6: '289461K',
  7: '289461K',
  8: '289461K',
  9: '289461K',
  10: '289461K',
  11: '289461K',
  12: '400520J',
  13: '400520J',
  14: '400520J'},
'eaches': {0: 2592,
  1: 3844,
  2: 759,
  3: 825,
  4: 663,
  5: 2025,
  6: 471,
  7: 1160,
  8: 5987,
  9: 679,
  10: 469,
  11: 907,
  12: 64,
  13: 48,
  14: 63},
'FCAST_QTY': {0: 2800.0,
  1: 5200.0,
  2: 550.0,
  3: 475.0,
  4: 575.0,
  5: 475.0,
  6: 650.0,
  7: 550.0,
  8: 7900.0,
  9: 1187.0,
  10: 1187.0,
  11: 1900.0,
  12: 51.0,
  13: 55.0,
  14: 59.0},
'log_eaches': {0: 7.860185057472165,
  1: 8.254268770090183,
  2: 6.63200177739563,
  3: 6.715383386334682,
  4: 6.496774990185863,
  5: 7.61332497954064,
  6: 6.154858094016418,
  7: 7.05617528410041,
  8: 8.697345730925353,
  9: 6.520621127558696,
  10: 6.15060276844628,
  11: 6.810142450115136,
  12: 4.158883083359672,
  13: 3.871201010907891,
  14: 4.143134726391533},
 'output': {0: 8.646015798513993,
  1: 8.378197900630752,
  2: 7.045235414873291,
  3: 5.117058321275769,
  4: 9.082928370640056,
  5: 5.225648643174155,
  6: 7.446383013291042,
  7: 6.307484284901181,
  8: 7.752673979530179,
  9: 9.02189934080111,
  10: 4.677594714421006,
  11: 7.218749101888444,
  12: 4.04018241973268,
  13: 3.940978322900716,
  14: 3.962359464699719},
 'model': {0: 'LR_output',
  1: 'LR_output',
  2: 'LR_output',
  3: 'LR_output',
  4: 'LR_output',
  5: 'LR_output',
  6: 'LR_output',
  7: 'LR_output',
  8: 'LR_output',
  9: 'LR_output',
  10: 'LR_output',
  11: 'LR_output',
  12: 'AR(12)',
  13: 'AR(12)',
  14: 'AR(12)'}}

df = pd.DataFrame.from_dict(sales_dict)

Expected Output Added:

expected_dict = {'MO_YR': {0: Timestamp('2021-01-01 00:00:00'),
  1: Timestamp('2021-02-01 00:00:00'),
  2: Timestamp('2021-03-01 00:00:00'),
  3: Timestamp('2021-04-01 00:00:00'),
  4: Timestamp('2021-05-01 00:00:00'),
  5: Timestamp('2021-06-01 00:00:00'),
  6: Timestamp('2021-07-01 00:00:00'),
  7: Timestamp('2021-08-01 00:00:00'),
  8: Timestamp('2021-09-01 00:00:00'),
  9: Timestamp('2021-10-01 00:00:00'),
  10: Timestamp('2021-11-01 00:00:00'),
  11: Timestamp('2021-12-01 00:00:00'),
  12: Timestamp('2021-01-01 00:00:00'),
  13: Timestamp('2021-02-01 00:00:00'),
  14: Timestamp('2021-03-01 00:00:00'),
15: Timestamp('2021-01-01 00:00:00'),
  16: Timestamp('2021-02-01 00:00:00'),
  17: Timestamp('2021-03-01 00:00:00'),
  18: Timestamp('2021-04-01 00:00:00'),
  19: Timestamp('2021-05-01 00:00:00'),
  20: Timestamp('2021-06-01 00:00:00'),
  21: Timestamp('2021-07-01 00:00:00'),
  22: Timestamp('2021-08-01 00:00:00'),
  23: Timestamp('2021-09-01 00:00:00'),
  24: Timestamp('2021-10-01 00:00:00'),
  25: Timestamp('2021-11-01 00:00:00'),
  26: Timestamp('2021-12-01 00:00:00'),
  27: Timestamp('2021-01-01 00:00:00'),
  28: Timestamp('2021-02-01 00:00:00'),
  29: Timestamp('2021-03-01 00:00:00')},
 'ITEM_BASE': {0: '289461K',
  1: '289461K',
  2: '289461K',
  3: '289461K',
  4: '289461K',
  5: '289461K',
  6: '289461K',
  7: '289461K',
  8: '289461K',
  9: '289461K',
  10: '289461K',
  11: '289461K',
  12: '400520J',
  13: '400520J',
  14: '400520J',
15: '289461K',
  16: '289461K',
  17: '289461K',
  18: '289461K',
  19: '289461K',
  20: '289461K',
  21: '289461K',
  22: '289461K',
  23: '289461K',
  24: '289461K',
  25: '289461K',
  26: '289461K',
  27: '400520J',
  28: '400520J',
  29: '400520J'},
'eaches': {0: 2592,
  1: 3844,
  2: 759,
  3: 825,
  4: 663,
  5: 2025,
  6: 471,
  7: 1160,
  8: 5987,
  9: 679,
  10: 469,
  11: 907,
  12: 64,
  13: 48,
  14: 63,
15: 2592,
  16: 3844,
  17: 759,
  18: 825,
  19: 663,
  20: 2025,
  21: 471,
  22: 1160,
  23: 5987,
  24: 679,
  25: 469,
  26: 907,
  27: 64,
  28: 48,
  29: 63},
'log_eaches': {0: 7.860185057472165,
  1: 8.254268770090183,
  2: 6.63200177739563,
  3: 6.715383386334682,
  4: 6.496774990185863,
  5: 7.61332497954064,
  6: 6.154858094016418,
  7: 7.05617528410041,
  8: 8.697345730925353,
  9: 6.520621127558696,
  10: 6.15060276844628,
  11: 6.810142450115136,
  12: 4.158883083359672,
  13: 3.871201010907891,
  14: 4.143134726391533,
  15: 7.860185057472165,
  16: 8.254268770090183,
  17: 6.63200177739563,
  18: 6.715383386334682,
  19: 6.496774990185863,
  20: 7.61332497954064,
  21: 6.154858094016418,
  22: 7.05617528410041,
  23: 8.697345730925353,
  24: 6.520621127558696,
  25: 6.15060276844628,
  26: 6.810142450115136,
  27: 4.158883083359672,
  28: 3.871201010907891,
  29: 4.143134726391533,},
 'output': {0: 8.646015798513993,
  1: 8.378197900630752,
  2: 7.045235414873291,
  3: 5.117058321275769,
  4: 9.082928370640056,
  5: 5.225648643174155,
  6: 7.446383013291042,
  7: 6.307484284901181,
  8: 7.752673979530179,
  9: 9.02189934080111,
  10: 4.677594714421006,
  11: 7.218749101888444,
  12: 4.04018241973268,
  13: 3.940978322900716,
  14: 3.962359464699719,
  15: 2800.0,
  16: 5200.0,
  17: 550.0,
  18: 475.0,
  19: 575.0,
  20: 475.0,
  21: 650.0,
  22: 550.0,
  23: 7900.0,
  24: 1187.0,
  25: 1187.0,
  26: 1900.0,
  27: 51.0,
  28: 55.0,
  29: 59.0},
 'model': {0: 'LR_output',
  1: 'LR_output',
  2: 'LR_output',
  3: 'LR_output',
  4: 'LR_output',
  5: 'LR_output',
  6: 'LR_output',
  7: 'LR_output',
  8: 'LR_output',
  9: 'LR_output',
  10: 'LR_output',
  11: 'LR_output',
  12: 'AR(12)',
  13: 'AR(12)',
  14: 'AR(12)',
15:'FCAST_QTY',
16:'FCAST_QTY',
17:'FCAST_QTY',
18:'FCAST_QTY',
19:'FCAST_QTY',
20:'FCAST_QTY',
21:'FCAST_QTY',
22:'FCAST_QTY',
23:'FCAST_QTY',
24:'FCAST_QTY',
25:'FCAST_QTY',
26:'FCAST_QTY',
27:'FCAST_QTY',
28:'FCAST_QTY',
29:'FCAST_QTY'}}
 df = pd.DataFrame.from_dict(expected_dict)

Solution

  • Create a new dataframe with your logic and append it to the original dataframe:

    fcast_qty = (df
                 .drop(columns = ['output', 'model'])
                 .rename(columns={"FCAST_QTY":"output"})
                 .assign(model="FCAST_QTY")
                )
    
    pd.concat([df.drop(columns='FCAST_QTY'), fcast_qty], ignore_index = True)
            MO_YR ITEM_BASE  eaches  log_eaches       output      model
    0  2021-01-01   289461K    2592    7.860185     8.646016  LR_output
    1  2021-02-01   289461K    3844    8.254269     8.378198  LR_output
    2  2021-03-01   289461K     759    6.632002     7.045235  LR_output
    3  2021-04-01   289461K     825    6.715383     5.117058  LR_output
    4  2021-05-01   289461K     663    6.496775     9.082928  LR_output
    5  2021-06-01   289461K    2025    7.613325     5.225649  LR_output
    6  2021-07-01   289461K     471    6.154858     7.446383  LR_output
    7  2021-08-01   289461K    1160    7.056175     6.307484  LR_output
    8  2021-09-01   289461K    5987    8.697346     7.752674  LR_output
    9  2021-10-01   289461K     679    6.520621     9.021899  LR_output
    10 2021-11-01   289461K     469    6.150603     4.677595  LR_output
    11 2021-12-01   289461K     907    6.810142     7.218749  LR_output
    12 2021-01-01   400520J      64    4.158883     4.040182     AR(12)
    13 2021-02-01   400520J      48    3.871201     3.940978     AR(12)
    14 2021-03-01   400520J      63    4.143135     3.962359     AR(12)
    15 2021-01-01   289461K    2592    7.860185  2800.000000  FCAST_QTY
    16 2021-02-01   289461K    3844    8.254269  5200.000000  FCAST_QTY
    17 2021-03-01   289461K     759    6.632002   550.000000  FCAST_QTY
    18 2021-04-01   289461K     825    6.715383   475.000000  FCAST_QTY
    19 2021-05-01   289461K     663    6.496775   575.000000  FCAST_QTY
    20 2021-06-01   289461K    2025    7.613325   475.000000  FCAST_QTY
    21 2021-07-01   289461K     471    6.154858   650.000000  FCAST_QTY
    22 2021-08-01   289461K    1160    7.056175   550.000000  FCAST_QTY
    23 2021-09-01   289461K    5987    8.697346  7900.000000  FCAST_QTY
    24 2021-10-01   289461K     679    6.520621  1187.000000  FCAST_QTY
    25 2021-11-01   289461K     469    6.150603  1187.000000  FCAST_QTY
    26 2021-12-01   289461K     907    6.810142  1900.000000  FCAST_QTY
    27 2021-01-01   400520J      64    4.158883    51.000000  FCAST_QTY
    28 2021-02-01   400520J      48    3.871201    55.000000  FCAST_QTY
    29 2021-03-01   400520J      63    4.143135    59.000000  FCAST_QTY