python, python-polars

How to drop a row in Polars (Python)


How can I add a row-count column to a data frame and drop rows by index? I want to add a new column that numbers the rows in the data frame, and then use that index to drop rows.

for i in range(len(df)):
    if (df['col1'][i] == df['col2'][i]) and (df['col4'][i] == df['col3'][i]):
        pass
    elif (df['col1'][i] == df['col3'][i]) and (df['col4'][i] == df['col2'][i]): 
        df['col1'][i] = df['col2'][i]
        df['col4'][i] = df['col3'][i]
    else:
        df = df.drop(i)

Solution

  • Polars discourages in-place mutation and favors pure data handling: you create a new DataFrame instead of modifying an existing one.

    So it helps to think in terms of the data you want to keep rather than the row you want to remove.

    The example below keeps every row except the second one. with_row_index() adds an index column holding each row's position, which the filter then uses. Of the two approaches, slicing will be the faster one, because it copies no data.

    import polars as pl
    
    df = pl.DataFrame({
        "a": [1, 2, 3],
        "b": [True, False, None]
    }).with_row_index()
    
    print(df)
    
    # filter on condition
    df_a = df.filter(pl.col("index") != 1)
    
    # stack two slices
    df_b = df[:1].vstack(df[2:])
    
    # or via explicit slice syntax (omitting the length keeps all remaining rows)
    # df_b = df.slice(0, 1).vstack(df.slice(2))
    
    assert df_a.equals(df_b)
    
    print(df_a)
    

    Outputs:

    shape: (3, 3)
    ┌───────┬─────┬───────┐
    │ index ┆ a   ┆ b     │
    │ ---   ┆ --- ┆ ---   │
    │ u32   ┆ i64 ┆ bool  │
    ╞═══════╪═════╪═══════╡
    │ 0     ┆ 1   ┆ true  │
    │ 1     ┆ 2   ┆ false │
    │ 2     ┆ 3   ┆ null  │
    └───────┴─────┴───────┘
    
    shape: (2, 3)
    ┌───────┬─────┬──────┐
    │ index ┆ a   ┆ b    │
    │ ---   ┆ --- ┆ ---  │
    │ u32   ┆ i64 ┆ bool │
    ╞═══════╪═════╪══════╡
    │ 0     ┆ 1   ┆ true │
    │ 2     ┆ 3   ┆ null │
    └───────┴─────┴──────┘