python pandas string-literals data-wrangling

Creating a new column based on the key of a dictionary?


I am trying to create a new column in a DataFrame inside a for loop over dictionary items. The column value is an f-string that includes the dictionary key, but the assignment raises "ValueError: cannot set a frame with no defined index and a scalar".

Dictionary definition for expense categories (category name mapped to row positions)

d = {'Travel & Entertainment': [1,2,3,4,5,6,7,8,9,10,11], 'Office supplies & Expenses': [13,14,15,16,17],
     'Professional Fees': [19,20,21,22,23], 'Fees & Assessments': [25,26,27], 'IT Expenses': [29],
     'Bad Debt Expense': [31], 'Miscellaneous expenses': [33,34,35,36,37], 'Marketing Expenses': [40,41,42],
     'Payroll & Related Expenses': [45,46,47,48,49,50,51,52,53,54,55,56], 'Total Utilities': [59,60],
     'Total Equipment Maint, & Rental Expense': [63,64,65,66,67,68], 'Total Mill Expense': [70,71,72,73,74,75,76,77],
     'Total Taxes': [80,81], 'Total Insurance Expense': [83,84,85], 'Incentive Compensation': [88],
     'Strategic Initiative': [89]}

Creating a new dataframe based on a master dataframe

mcon = VA.loc[:,['Expense', 'Mgrl', 'Exp Category', 'Parent Category']]
mcon.loc[:,'Variance Type'] = ['Unfavorable' if x < 0 else 'favorable' for x in mcon['Mgrl']]
mcon.loc[:,'Business Unit'] = 'Managerial Consolidation'
mcon = mcon[['Business Unit', 'Exp Category','Parent Category', 'Expense', 'Mgrl', 'Variance Type']]
mcon.rename(columns={'Mgrl':'Variance'}, inplace=True)

Creating a new dataframe that will be written to excel eventually

a1 = pd.DataFrame() 
for key, value in d.items():
    umconm = mcon.iloc[value].query('Variance < 0').nsmallest(5, 'Variance')
    fmconm = mcon.iloc[value].query('Variance > 0').nlargest(5, 'Variance')
    if umconm.empty == False or fmconm.empty == False:
        a1 = pd.concat([a1,umconm,fmconm], ignore_index = True)
    else:
        continue
a1.to_csv('example.csv', index = False)

Output looks like this

[image: screenshot of the output]

I am trying to add a new column that says "Higher/Lower than budgeted {key}", where key is the expense category, using the code below

for key, value in d.items():
    umconm = mcon.iloc[value].query('Variance < 0').nsmallest(5, 'Variance')
    umconm.loc[:,'Explanation'] = f'Lower than budgeted {key}'
    fmconm = mcon.iloc[value].query('Variance > 0').nlargest(5, 'Variance')
    fmconm.loc[:,'Explanation'] = f'Higher than budgeted {key}'
    if umconm.empty == False or fmconm.empty == False:
        a1 = pd.concat([a1,umconm,fmconm], ignore_index = True)
    else:
        continue

but assigning the f-string this way raises the error "ValueError: cannot set a frame with no defined index and a scalar"

I would really appreciate any help to either correct this or find a different solution for adding this field to my dataframe. Thanks in advance!


Solution

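  • This error occurs because the line

    umconm = mcon.iloc[value].query('Variance < 0').nsmallest(5, 'Variance')

    sometimes produces an empty DataFrame: when none of the rows for a category has a negative variance, the result has no rows and therefore no index entries. Assigning a scalar to a new column through .loc on such a frame raises the ValueError. Assign the column directly instead of going through .loc:

    umconm['Explanation'] = f'Lower than budgeted {key}'

    The same applies to the fmconm line.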
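
For illustration, here is a minimal sketch of the full loop with the direct column assignment applied to both frames. The contents of `mcon` and the shortened `d` are made-up stand-ins (the real data is not shown in the question), and `pieces` is a hypothetical helper list used to collect the results before a single `pd.concat`.

```python
import pandas as pd

# Hypothetical stand-in for `mcon`; the real data is not shown in the question.
mcon = pd.DataFrame({
    "Expense": ["Airfare", "Hotels", "Meals", "Paper", "Toner"],
    "Variance": [120.0, 45.5, 10.0, -30.0, 15.0],
})

# Hypothetical, shortened version of `d`: category name -> row positions.
d = {"Travel & Entertainment": [0, 1, 2], "Office supplies & Expenses": [3, 4]}

pieces = []
for key, value in d.items():
    subset = mcon.iloc[value]
    # Either selection can be empty (e.g. no negative variances for Travel).
    umconm = subset.query("Variance < 0").nsmallest(5, "Variance")
    fmconm = subset.query("Variance > 0").nlargest(5, "Variance")

    # Direct column assignment accepts a scalar even when the frame has no rows.
    umconm["Explanation"] = f"Lower than budgeted {key}"
    fmconm["Explanation"] = f"Higher than budgeted {key}"

    # Keep only the non-empty frames for the final concatenation.
    pieces.extend(df for df in (umconm, fmconm) if not df.empty)

# pd.concat needs at least one frame; this sample data always yields some.
a1 = pd.concat(pieces, ignore_index=True)
print(a1)
```

Depending on the pandas version, the direct assignment may emit a SettingWithCopyWarning. Chaining `.assign(Explanation=...)` onto the `nsmallest`/`nlargest` call is one alternative that returns a new frame and should behave the same on empty selections.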