Search code examples
pythonpandaspython-3.7pandas-loc

what is the pandas ix or iloc multiple condition syntax


Processing the government Office of Foreign Assets Controls (OFAC) list

https://www.treasury.gov/ofac/downloads/sdn.csv

The 2nd column (the number starting from 0) suggests if this row is an individual, business (-0-), airplane, or vessels information

If I want to keep the both the individual column equal to 'individual' and '-0-', what is the right syntax? The following codes only work for the individual column equal to 'individual'

name_orig = pd.read_csv('http://www.treasury.gov/ofac/downloads/sdn.csv', sep=',', header=None)

name_orig.rename(columns={0: 'id', 1: 'names', 2: 'individual', 11: 'sdn_info'}, inplace=True)

names = name_orig.ix[name_orig.individual == 'individual', ['id', 'names', 'individual', 'sdn_info']]

This doesn't seem to work

names = name_orig.ix[name_orig.individual == 'individual' | name_orig.individual == '-0-' , ['id', 'names', 'individual', 'sdn_info']]

Solution

  • Apparently if you run

    name_orig.individual.unique()
    

    Output:

    array(['-0- ', 'individual', 'vessel', 'aircraft', nan], dtype=object)
    

    The -0- has an extra space. I think this will work:

    names = name_orig.ix[((name_orig.individual == 'individual') | (name_orig.individual == '-0- ')), ['id', 'names', 'individual', 'sdn_info']]