Tags: python, rapids, cudf

How to drop columns with NA using cudf?


Pandas:

data = data.dropna(axis='columns')

I am trying to do the same thing with a cuDF DataFrame, but the API doesn't seem to offer this functionality.

My current workaround is to convert to a pandas DataFrame, run the command above, then convert back to a cuDF DataFrame. Is there a better solution?
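
For reference, the round trip described above might look like the minimal sketch below. The helper name `roundtrip_dropna_columns` is hypothetical; it assumes `gdf` is a cuDF DataFrame and relies on `to_pandas()` and `cudf.from_pandas()`. Each call costs a copy from device to host memory and back.

```python
def roundtrip_dropna_columns(gdf):
    """Drop null-containing columns from a cuDF DataFrame via pandas.

    Hypothetical helper illustrating the workaround; not part of cuDF.
    """
    import cudf  # imported lazily: needs a CUDA-capable GPU

    pdf = gdf.to_pandas()             # copy device -> host
    pdf = pdf.dropna(axis='columns')  # pandas drops every column with an NA
    return cudf.from_pandas(pdf)      # copy host -> device
```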


Solution

  • cuDF now supports column-based dropna, so the following works:

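    import cudf

    # Columns 'a' and 'b' each contain one null; 'c' has none.
    df = cudf.DataFrame({'a': [0, 1, None], 'b': [None, 0, 2], 'c': [1, 2, 3]})
    print(df)
          a     b  c
    0     0  null  1
    1     1     0  2
    2  null     2  3

    # dropna returns a new DataFrame; assign the result to keep it.
    df.dropna(axis='columns')
        c
    0   1
    1   2
    2   3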
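
If the installed cuDF version predates column-wise dropna, one possible fallback is to pick the null-free columns by hand. The sketch below is an assumption-laden illustration, not cuDF's own implementation: `drop_null_columns` is a hypothetical helper, and it only uses `isnull().any()` and list-based column selection, which pandas and cuDF both provide. It falls back to pandas when cuDF cannot be imported, so it can be tried without a GPU.

```python
try:
    import cudf as xdf  # GPU DataFrame library (needs a CUDA-capable GPU)
except Exception:
    # Assumption for this sketch: use pandas as a CPU stand-in
    # when cuDF is missing or cannot initialize.
    import pandas as xdf


def drop_null_columns(df):
    """Return a new frame keeping only the columns that contain no nulls."""
    keep = [name for name in df.columns
            if not bool(df[name].isnull().any())]
    return df[keep]


df = xdf.DataFrame({'a': [0, 1, None], 'b': [None, 0, 2], 'c': [1, 2, 3]})
result = drop_null_columns(df)
print(list(result.columns))
```

Only column 'c' survives, matching the dropna(axis='columns') result above. The original frame is left unchanged, since the helper returns a selection rather than modifying `df`.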