python concatenation multiple-columns string-length

Python - Column concatenation based on length of data inside the column


I need help concatenating two columns, where the result depends on the length of the data in the first column.

Column3= df["column1"] + "_" + df["column2"]

import pandas as pd

data = {'column1': ['af28912368', None, '234671', 'asr61239'], 'column2': [701, None, 761, 312]}

df = pd.DataFrame(data)

df:

column1      column2
af28912368   701
NaN          NaN
234671       761
asr61239     312

The rules for column3 are:

  • If the length of the data in column1 is greater than 8, insert the last 8 characters of column1.
  • If the length of the data in column1 is 8 or fewer (but more than 0), insert the value of df['column1'] followed by (8 - len(df['column1'])) blank spaces.
  • If column1 is NaN, column3 can remain NaN.

Expected result, as shown in column3:

column1      column2   column3
aa289123sf   701       289123sf_701
NaN          723       NaN
234671       761       234671  _761
asr61239     312       asr61239_312

I tried this:

df["column3"] = df["column1"].str[-8:] + '_' + df["column2"].astype(str)

But it does not work when the values in df["column1"] have different lengths. Please help with this one.


Solution

  • Your first-row column1 value keeps changing between examples, so I'm assuming that is a typo and not part of the exercise.

    This script worked for me:

        # File: test-so.py
        import pandas as pd
        from pprint import pprint

        data = {'column1': ['af28912368', None, '234671', 'asr61239'], 'column2': [701, None, 761, 312]}
        df = pd.DataFrame(data)

        def funk(col1, col2):
            try:
                tmp = len(col1)
                if tmp > 8:
                    # Longer than 8 characters: keep only the last 8.
                    return col1[-8:] + '_' + str(int(col2))
                elif tmp > 0:
                    # 1 to 8 characters: pad on the right with spaces to width 8.
                    return col1.ljust(8) + '_' + str(int(col2))
                else:
                    return None
            except (TypeError, ValueError):
                # A missing value (None/NaN) in either column leaves column3 empty.
                return None

        def main():
            df['column3'] = df.apply(lambda row: funk(row['column1'], row['column2']), axis=1)
            pprint(df)

        if __name__ == '__main__':
            main()
    

    From the command line:

    python test-so.py
    

    yields:

    [image: printed DataFrame with the added column3]
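
The same rules can probably also be expressed without `apply`, using pandas string methods. The following is only a sketch under a few assumptions: column2 holds whole numbers (so converting through `int` loses nothing), missing values are stored as None/NaN rather than the text 'Nan', and the names `left` and `right` are illustrative, not from the question.

```python
import pandas as pd

data = {'column1': ['af28912368', None, '234671', 'asr61239'],
        'column2': [701, None, 761, 312]}
df = pd.DataFrame(data)

# Take the last 8 characters, then right-pad with spaces to width 8.
# Strings of 8 characters or fewer pass through the slice unchanged.
# Missing values stay missing through both string operations.
left = df['column1'].str[-8:].str.ljust(8)

# column2 is float because of the missing value; format it as an integer
# string ("701", not "701.0") and skip missing entries.
right = df['column2'].map(lambda v: str(int(v)), na_action='ignore')

# Rows where either side is missing come out as NaN.
df['column3'] = left + '_' + right

print(df)
```

With the sample data, the third row becomes `234671  _761` (two padding spaces) and the second row stays NaN. The original attempt was close: the slice `.str[-8:]` already handles long values, and what it lacked was the padding step and the integer formatting of column2.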