How to split a DataFrame into two sub-DataFrames based on whether a column value is in a list?


I have a DataFrame like this:

Data = pd.read_csv('info_com.csv')
df1 = pd.DataFrame(Data)

This DataFrame has a column 'Codes' that holds every product code, and the remaining columns hold the information for those products. I have a list of the codes of interest, but what I actually need is two DataFrames: one with the rows for the "good codes" and one with the rows for the bad codes.

I tried some kind of sort using a loop:

   ComDF1 = df1['Codes']
   goodCodes = ['B0CD3589HR' ... 'B0CD726Q8T']


  for i in range(len(ComDF1)):
     if ComDF1[i] in goodCodes:
       df1.iloc[i].append(dfWP)
     else:
       df1.iloc[i].append(dfNWP)

  print(dfWP, dfNWP)

I expected two DataFrames with all the info for every product, split by whether the column value is present in the list or absent from it.


Solution

  • I created some example data. Rather than looping over the rows, build a boolean mask of the rows you want with isin. Indexing with the mask returns those rows, and indexing with its negation (the tilde ~ operator) returns all the others.

    import pandas as pd
    
    example_data = [{'Codes': 'B0CD3589HR', 'other_data': '8675309'}, {'Codes': 'B0CD726Q8T', 'other_data': 'spam'},
                    {'Codes': 'sdbadcodesdk', 'other_data': 'eggs'}, {'Codes': 'klbadzoot4d', 'other_data': 'a herring'},
                    {'Codes': 'B0CD3589HR', 'other_data': 'a shrubbery'}, {'Codes': 'B0CD726Q8T', 'other_data': 'ni'}]
    
    df1 = pd.DataFrame.from_records(example_data)
    
    goodCodes = ['B0CD3589HR', 'B0CD726Q8T']
    
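    # Boolean Series: True where the row's code is in goodCodes
    good_code_rows_mask = df1['Codes'].isin(goodCodes)
    
    # Rows matching the mask, then the rows that do not (~ negates the mask)
    good_codes_df = df1.loc[good_code_rows_mask]
    not_good_codes_df = df1.loc[~good_code_rows_mask]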
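
As a side note on the loop in the question: `df1.iloc[i].append(...)` would not have filled `dfWP` or `dfNWP` even on older pandas, because `append` returned a new object rather than modifying anything in place, and `append` was removed in pandas 2.0. The mask approach can be wrapped in a small helper if the split is needed in several places. The sketch below is only one way to do it; the function name `split_by_codes` and the sample rows are made up for illustration and are not part of the original data.

```python
import pandas as pd


def split_by_codes(df, column, codes):
    """Hypothetical helper: return (rows whose value is in codes, all other rows)."""
    mask = df[column].isin(codes)
    # .copy() so later edits to either piece do not touch the original frame
    return df.loc[mask].copy(), df.loc[~mask].copy()


# Made-up sample rows standing in for the contents of the CSV
example = pd.DataFrame({
    'Codes': ['B0CD3589HR', 'sdbadcodesdk', 'B0CD726Q8T'],
    'other_data': ['spam', 'eggs', 'ni'],
})

good_df, bad_df = split_by_codes(example, 'Codes', ['B0CD3589HR', 'B0CD726Q8T'])

# The original row labels are kept; reset them if a fresh 0..n-1 index is wanted
good_df = good_df.reset_index(drop=True)
bad_df = bad_df.reset_index(drop=True)

print(good_df)
print(bad_df)
```

Every row lands in exactly one of the two results, so the lengths of `good_df` and `bad_df` always add up to the length of the input.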