
Assign slice of variable length to column with method chaining


I would like to assign to a column a variable-length slice of another column, but it does not work as I expect, and I do not understand why:

import numpy as np
import pandas as pd

m = np.array([[1, 'AAAAA'],
               [2, 'BBBB'],
               [3, 'CCC']])

df = (pd.DataFrame(m, columns = ['id', 's1'])
        .assign(
                s2 = lambda x: x['s1'].str.slice(start=0, stop=x['s1'].str.len()-1))
        )

print(df)

which prints

  id     s1  s2
0  1  AAAAA NaN
1  2   BBBB NaN
2  3    CCC NaN

However, I would expect the following:

  id     s1   s2
0  1  AAAAA AAAA
1  2   BBBB  BBB
2  3    CCC   CC

Any idea what happens here?


Solution

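  • Use str[:-1] to take every character except the last from each value in the column. str.slice is documented as taking integer start and stop values, so passing a Series as stop does not slice row by row, which is why every value comes back as NaN: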

    df = (pd.DataFrame(m, columns = ['id', 's1'])
            .assign(
                    s2 = lambda x: x['s1'].str[:-1])
            )
    
    print(df)
      id     s1    s2
    0  1  AAAAA  AAAA
    1  2   BBBB   BBB
    2  3    CCC    CC
    

    Your approach, with a different stop position per row, only works if you use apply to process each row separately, for example:

    df = (pd.DataFrame(m, columns = ['id', 's1'])
            .assign(
                    s2 = lambda x: x.apply(lambda y: y['s1'][0:len(y['s1'])-1], axis=1))
            )
    
    print(df)
      id     s1    s2
    0  1  AAAAA  AAAA
    1  2   BBBB   BBB
    2  3    CCC    CC
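
As a rough sketch of the two cases discussed above, the example below first passes a scalar stop to str.slice (which should match str[:-1]) and then handles a genuinely different stop per row by pairing each string with its own position in a list comprehension. The stop and s3 column names are hypothetical and only there for illustration, and the DataFrame is built from a dict rather than the NumPy array in the question.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "s1": ["AAAAA", "BBBB", "CCC"]})

df = df.assign(
    # Scalar stop: one integer applied to every row, same idea as .str[:-1].
    s2=lambda x: x["s1"].str.slice(stop=-1),
    # Hypothetical per-row stop positions (here simply length minus one).
    stop=lambda x: x["s1"].str.len() - 1,
    # Per-row slicing: pair each string with its own stop position.
    s3=lambda x: [s[:n] for s, n in zip(x["s1"], x["stop"])],
)

print(df)
```

The list comprehension does the same row-by-row work as apply with axis=1, but only touches the two columns it needs, so it may be a reasonable alternative when the stop positions really do vary in ways a single negative index cannot express.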