
How to create a new column using a loop with a condition


This is my DataFrame, and I want to create a new column using a loop with a condition.

import pandas as pd
student_card = pd.DataFrame({'ID':[20190103, 20190222, 20190531],
                             'name':['Kim', 'Yang', 'Park'],
                             'class':['H', 'W', 'S']})


student_card['new'] = pd.Series() #1.create new column
for i, v in student_card['name'].items(): #2.set index and values
    if "Yang" in v: #3.if there's "Yang" in value
        student_card['new'].append(v) #4. append the value of name column in new column

So I tried this method and got stuck with the following error:

TypeError: cannot concatenate object of type '<class 'str'>'; only Series and DataFrame objs are valid

Which is not true, by the way (the type of this column is Series).


Solution

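  • append concatenates another Series onto the one it is called on; it does not add a single value. In your code v is a string and i is the index label of that string, which is why the error says only Series and DataFrame objects are valid. You can check this yourself with print(type(v)). append also returns a new Series instead of modifying the column in place, so the result would be lost even if you passed it a Series. The documentation is here: https://pandas.pydata.org/docs/reference/api/pandas.Series.append.html (note that Series.append was deprecated in pandas 1.4 and removed in pandas 2.0, where pd.concat replaces it).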

    What you are looking for is to set a value at an existing index label in a column (a Series, as it's called in pandas). Something like this:

    df.loc[index, column] = value
    

    So in your code, this should do the trick:

    import pandas as pd
    student_card = pd.DataFrame({'ID':[20190103, 20190222, 20190531],
                                 'name':['Kim', 'Yang', 'Park'],
                                 'class':['H', 'W', 'S']})
    
    
    student_card['new'] = pd.Series(dtype='object') #1. create new, empty column
    for i, v in student_card['name'].items(): #2. iterate over (index, value) pairs
        if "Yang" in v: #3. if there's "Yang" in the value
            student_card.loc[i, 'new'] = v #4. set the name at row i of the new column
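
If the loop itself is not a requirement, the same result can probably be reached without iterating. The sketch below is one possible alternative, not part of the original answer: it builds a boolean mask with str.contains and uses Series.where to keep the name only where the mask is True. The variable name mask is just for illustration, and regex=False is assumed because "Yang" is meant as a plain substring.

```python
import pandas as pd

student_card = pd.DataFrame({'ID': [20190103, 20190222, 20190531],
                             'name': ['Kim', 'Yang', 'Park'],
                             'class': ['H', 'W', 'S']})

# Boolean mask: True for rows whose name contains the substring "Yang".
mask = student_card['name'].str.contains('Yang', regex=False)

# Keep the name where the mask is True; all other rows become missing values.
student_card['new'] = student_card['name'].where(mask)

print(student_card)
```

Rows that do not match end up as missing values in the new column, which mirrors what the loop version leaves behind for the rows it never assigns.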