How to add a specific number of characters to the end of a string in Pandas?


I am using the pandas library in Python and I am trying to pad the strings in a text column so that they all have the same length. I want to do this by appending a specific character (normally white space; in this example I will use "_") as many times as needed to reach the length of the longest string in that column.

For example:

Col1_Before

A
B
A1R
B2
AABB4

Col1_After

A____
B____
A1R__
B2___
AABB4

Here is what I have so far (using the table above as the example). It is the next part, the part that actually does the padding, that I am stuck on.

df['Col1_Max'] = df.Col1.map(len).max()
df['Col1_Len'] = df.Col1.map(len)
df['Difference_Len'] = df['Col1_Max'] - df['Col1_Len']

I may not have explained myself well, as I am still learning. If this is confusing, let me know and I will clarify.


Solution

  • Without creating extra columns:

    In [63]: data
    Out[63]: 
        Col1
    0      A
    1      B
    2    A1R
    3     B2
    4  AABB4
    
    In [64]: max_length = data.Col1.map(len).max()
    
    In [65]: data.Col1 = data.Col1.apply(lambda x: x + '_'*(max_length - len(x)))
    
    In [66]: data
    Out[66]: 
        Col1
    0  A____
    1  B____
    2  A1R__
    3  B2___
    4  AABB4
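
  • A possible alternative using pandas' vectorized string methods instead of a lambda. This is only a sketch: it rebuilds the example frame from the question under the same names (data, Col1, max_length), and it swaps the manual '_' * n concatenation for Series.str.ljust, which right-pads each value up to a given width:

```python
import pandas as pd

# Example frame rebuilt from the question (assumed to match the asker's data)
data = pd.DataFrame({"Col1": ["A", "B", "A1R", "B2", "AABB4"]})

# Length of the longest string in the column, as a plain Python int
max_length = int(data["Col1"].str.len().max())

# Right-pad every value with "_" until it is max_length characters long
data["Col1"] = data["Col1"].str.ljust(max_length, fillchar="_")

print(data["Col1"].tolist())
# ['A____', 'B____', 'A1R__', 'B2___', 'AABB4']
```

    For padding with white space, fillchar can be left at its default of ' '. One difference worth noting: the .str methods pass missing values (NaN) through unchanged, whereas calling len(x) inside a lambda raises a TypeError on them.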