Search code examples
pythonpandasdatabasedataframerow

Python - Changing names of rows if they are repeated


I wanna change the column "Cuenta" of my dataframe from the left one to the right table, like I show you in the next picture:

entimaer image description here

As you can see the change depends of the number of repetitions of the value "Cuenta" over "Entidad".

df = pd.DataFrame({
"Entidad":["A","A","A","A","A","A","A","B","B","B","B","B"],
"Cuenta": ["Surco","Lima","Miraflores","Lima","SMP","Surco","Lima","Surco","Lima","Miraflores","Lima","SMP"],
"Valor": [12,14,11,7,5,4,22,11,34,21,17,25],
})

Thank your again for your help.


Solution

  • First create a new "suffix" column in your pandas dataframe:

    df['suffix']=1

    Then create a new column consisting of "Entidad" and "Cuenta":

    df['Entidad_Cuenta'] = df['Entidad']+'_'+df['Cuenta']

    You can then groupby your dataframe by "Entidad_Cuenta" and compute the cumulative sum of the "suffix", in order to count the number of identical "Cuenta" values for each "Entidad"; you may then append this suffix to "Cuenta":

    df['Cuenta'] = df['Cuenta'] + df.groupby('Entidad_Cuenta').cumsum()['suffix'].astype(str)
    

    df['Cuenta'] returns

    0          Surco1
    1           Lima1
    2     Miraflores1
    3           Lima2
    4            SMP1
    5          Surco2
    6           Lima3
    7          Surco1
    8           Lima1
    9     Miraflores1
    10          Lima2
    11           SMP1
    

    I will leave it to you to figure out how to drop "suffix" and "Entidad_Cuenta" from your output dataframe.