For loops w/ Assign Function Not Working Properly

I'm working with a dataframe that has several columns containing distinct elements that I would like to parse into lists. These items are separated by double pipes (||).

I can currently write single lines of code using the assign function that can separate elements within a single column, but I am having trouble writing a function that could take a list of columns and perform the same function across all of them.

For example, imagine my dataframe is called DATA:

import pandas as pd

d1 = {'Column_1': ["SF||NYG","SF||NYG","SF||NYG",
            "SF||NYG","SF||NYG","SF||NYG","SF||NYG"],
      'Column_2': ["SF||NYG","SF||NYG","SF||NYG",
            "SF||NYG","SF||NYG","SF||NYG","SF||NYG"]}

DATA = pd.DataFrame(d1)
print (DATA)

  Column_1 Column_2
0  SF||NYG  SF||NYG
1  SF||NYG  SF||NYG
2  SF||NYG  SF||NYG
3  SF||NYG  SF||NYG
4  SF||NYG  SF||NYG
5  SF||NYG  SF||NYG
6  SF||NYG  SF||NYG

If I want to separate out items in Column_1 into a list, I can write code that achieves my objective like this:

DATA = DATA.assign(Column_1=DATA["Column_1"].str.split('\|\|'))

print(DATA)

    Column_1 Column_2
0  [SF, NYG]  SF||NYG
1  [SF, NYG]  SF||NYG
2  [SF, NYG]  SF||NYG
3  [SF, NYG]  SF||NYG
4  [SF, NYG]  SF||NYG
5  [SF, NYG]  SF||NYG
6  [SF, NYG]  SF||NYG

However, if I attempt to write an abstracted function of this:

def sep_list(df, col_list, sep):
    for col in col_list:
        df = df.assign(col=df[col].str.split(sep))

The function will run, but the data within columns won't be separated out as I expect. Essentially, the data will look exactly the same as when it went into the function. I feel like it has something to do with the "col=df[col]" part, as the first col in the working line of code above is not surrounded by quotes, and it seems the code from the function would run the first col as being in quotes (e.g., "Column_1"=DATA["Column_1"] instead of Column_1=DATA["Column_1"].

I'm not sure how to re-write my function to do the equivalent of the single code line on a list of columns that are passed into the function. I've tried creating a separate variable within the function that would strip quotes from the col name, but this did not seem to help.

Any help would be greatly appreciated.

Solution

You can use a lambda function and apply function to do this:

DATA.apply(lambda x: x.str.split('\|\|'))
Out[33]: 
    Column_1   Column_2
0  [SF, NYG]  [SF, NYG]
1  [SF, NYG]  [SF, NYG]
2  [SF, NYG]  [SF, NYG]
3  [SF, NYG]  [SF, NYG]
4  [SF, NYG]  [SF, NYG]
5  [SF, NYG]  [SF, NYG]
6  [SF, NYG]  [SF, NYG]