Tags: python, pandas, dataframe, tolist

Iterate over a dataframe with more than one element per cell to build lists


I have a dataframe that looks roughly like this. It has two columns, and some cells hold more than one value. For instance, the first row has "tom-15", "to-13" in the first column and 10, 3.3 in the second. NA marks an empty value.

import pandas as pd

data = [['"tom-15", "to-13"', '10, 3.3'], ['"nick-12"', '15.5'], ['NA', '14']]
df = pd.DataFrame(data, columns=['Att1', 'Att2'])

I need to pass these columns as parameters to other parts of my code, but since some cells have two or more entries, I want to create a list in each case. I tried this:

for index, row in df.iterrows():
    l1 = row["Att1"].tolist()
    l2 = row["Att2"].tolist()

It raises the error 'str' object has no attribute 'tolist'. How do I create a list in each case? Basically, I want two lists on each iteration: l1 with the contents of Att1 and l2 with the contents of Att2. The contents should change on each iteration.

I want the first pair of lists to be l1=["tom-15", "to-13"] and l2=[10, 3.3]. The last pair should therefore be l1=[] and l2=[14].


Solution

  • df = pd.DataFrame(data, columns=['Att1', 'Att2'])
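    # Replace the "NA" placeholder with None so it counts as missing
    df = df.replace("NA", None)
    # Split each cell on ", " and strip the quotes; missing cells are skipped
    # (on pandas 2.1+, DataFrame.map is the non-deprecated name for applymap)
    df = df.applymap(lambda x: [i.strip('"') for i in x.split(", ")], na_action="ignore")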
    
    for index, row in df.iterrows():
        # Missing cells are not lists, so fall back to an empty list
        l1 = row["Att1"] if isinstance(row["Att1"], list) else []
        l2 = row["Att2"] if isinstance(row["Att2"], list) else []
        print("index =", index)
        print("l1 =", l1)
        print("l2 =", l2)
        print()
    

    Output:

    index = 0
    l1 = ['tom-15', 'to-13']
    l2 = ['10', '3.3']
    
    index = 1
    l1 = ['nick-12']
    l2 = ['15.5']
    
    index = 2
    l1 = []
    l2 = ['14']
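
Note that the output above keeps the Att2 values as strings, while the question asks for numbers such as l2=[10, 3.3]. One possible way to get numbers is sketched below. It assumes every Att2 entry parses as a float, and `to_list` and its `convert` parameter are hypothetical names introduced only for this example. It also sidesteps `replace` and `applymap` by handling "NA" inside the helper.

```python
import pandas as pd

data = [['"tom-15", "to-13"', '10, 3.3'], ['"nick-12"', '15.5'], ['NA', '14']]
df = pd.DataFrame(data, columns=['Att1', 'Att2'])

def to_list(cell, convert=None):
    """Split a comma-separated cell into a list; 'NA' or a non-string gives []."""
    if not isinstance(cell, str) or cell.strip() == "NA":
        return []
    # Drop surrounding whitespace and double quotes from each piece
    items = [part.strip().strip('"') for part in cell.split(",")]
    # Optionally convert each piece, e.g. with float
    return [convert(item) for item in items] if convert else items

results = []
for index, row in df.iterrows():
    l1 = to_list(row["Att1"])
    l2 = to_list(row["Att2"], convert=float)
    results.append((l1, l2))
    print("index =", index, "l1 =", l1, "l2 =", l2)
```

Because the conversion uses float, whole numbers come back as 10.0 and 14.0 rather than 10 and 14. If that distinction matters, a different converter can be passed in.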