Convert a pandas Series (with strings) to a Python list


It's probably a silly thing, but I can't seem to correctly convert a pandas Series, originally read from an Excel sheet, to a list.

dfCI is created by importing data from an Excel sheet and looks like this:

tab      var             val
MsrData  sortfield       DetailID
MsrData  strow           4
MsrData  inputneeded     "MeasDescriptionTest", "SiteLocTest", "SavingsCalcsProvided","BiMonthlyTest"     

# get list of cols for which input is needed
cols = dfCI[((dfCI['var'] == 'inputneeded') & (dfCI['tab'] == 'MsrData'))]['val'].values.tolist()
print(cols)

>> ['"MeasDescriptionTest", "SiteLocTest", "SavingsCalcsProvided", "BiMonthlyTest"']

# replace null text with text
invalid = 'Input Needed'
for col in cols:
   dfMSR[col] = np.where((dfMSR[col].isnull()), invalid, dfMSR[col])

However, the second set of (single) quotes that appears when I convert cols from a Series to a list makes all the columns a single value, so that

col = '"MeasDescriptionTest", "SiteLocTest", "SavingsCalcsProvided", "BiMonthlyTest"'

The desired output for cols is

cols = ["MeasDescriptionTest", "SiteLocTest", "SavingsCalcsProvided", "BiMonthlyTest"]

What am I doing wrong?


Solution

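  • The cell holds one string, so the list has one element; the outer single quotes are just how Python displays that string. Once you've got col (that is, cols[0]), you can convert it to your expected output by removing the double quotes and splitting on commas: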

    In [1109]: col = '"MeasDescriptionTest", "SiteLocTest", "SavingsCalcsProvided", "BiMonthlyTest"'
    
    In [1114]: cols = [i.strip() for i in col.replace('"', '').split(',')]
    
    In [1115]: cols
    Out[1115]: ['MeasDescriptionTest', 'SiteLocTest', 'SavingsCalcsProvided', 'BiMonthlyTest']
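
  • As a possible alternative, here is a minimal sketch that parses the cell with the standard-library csv module, which would also cope with a column name that itself contains a comma. The helper name parse_quoted_list is hypothetical, and the sketch assumes the cell is always a comma-separated run of double-quoted names, as in the sample data.

```python
import csv


def parse_quoted_list(cell):
    """Split one string like '"A", "B","C"' into ['A', 'B', 'C'].

    Hypothetical helper: assumes the cell is a comma-separated run of
    double-quoted names, as in the 'inputneeded' row of the question.
    """
    # skipinitialspace lets the reader accept a space after each comma;
    # strip() drops trailing whitespace left over from the spreadsheet.
    row = next(csv.reader([cell.strip()], skipinitialspace=True))
    return [field.strip() for field in row]


# The single string stored in the 'val' cell (cols[0] in the question).
cell = '"MeasDescriptionTest", "SiteLocTest", "SavingsCalcsProvided","BiMonthlyTest"     '
cols = parse_quoted_list(cell)
print(cols)
```

    The resulting cols can then be used in the for loop from the question, since each element is now a separate column name.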