python pandas camelcasing snakecasing

Convert pandas column names from snake case to camel case


I have a pandas DataFrame whose column names are in upper-case snake case. I want to convert them to camel case, with the first word starting with a lowercase letter. The following code is not working for me. Please let me know how to fix it.

import pandas as pd

# Sample DataFrame with column names
data = {'RID': [1, 2, 3],
        'RUN_DATE': ['2023-01-01', '2023-01-02', '2023-01-03'],
        'PRED_VOLUME_NEXT_360': [100, 150, 200]}

df = pd.DataFrame(data)

# Convert column names to lowercase
df.columns = df.columns.str.lower()

# Convert column names to camel case with lowercase starting letter
df.columns = [col.replace('_', ' ').title().replace(' ', '').replace(col[0], col[0].lower(), 1) for col in df.columns]

# Print the DataFrame with updated column names
print(df)

I want the column names RID, RUN_DATE, PRED_VOLUME_NEXT_360 to be converted to rid, runDate, predVolumeNext360, but the code gives Rid, RunDate and PredVolumeNext360.


Solution

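  • You could lowercase the names, then use a regex to replace each _x with X (an underscore followed by a character becomes that character in upper case):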

    df.columns = (df.columns.str.lower()
                    .str.replace('_(.)', lambda x: x.group(1).upper(),
                                 regex=True)
                 )
    

    Or with a custom function:

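    def to_camel(s):
        # Split on underscores, e.g. 'RUN_DATE' -> ['run', 'date']
        parts = s.lower().split('_')
        # Capitalize every word except the first
        parts[1:] = [p.capitalize() for p in parts[1:]]
        return ''.join(parts)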
    
    df = df.rename(columns=to_camel)
    

    Output:

       rid     runDate  predVolumeNext360
    0    1  2023-01-01                100
    1    2  2023-01-02                150
    2    3  2023-01-03                200
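
As for why the original attempt prints Rid, RunDate and PredVolumeNext360: after str.lower() the first character of each name is already lowercase, so .replace(col[0], col[0].lower(), 1) swaps a letter for itself and the capital produced by .title() stays. Below is a minimal sketch of a fix that keeps the question's title-case approach, assuming the names contain only letters, digits and underscores. The helper name title_to_camel is made up for illustration.

```python
import pandas as pd

def title_to_camel(col):
    # Hypothetical helper: same title-case idea as in the question,
    # but lowercase the first character of the *result*, not of the input.
    pascal = col.lower().replace('_', ' ').title().replace(' ', '')
    return pascal[:1].lower() + pascal[1:]

df = pd.DataFrame({'RID': [1, 2, 3],
                   'RUN_DATE': ['2023-01-01', '2023-01-02', '2023-01-03'],
                   'PRED_VOLUME_NEXT_360': [100, 150, 200]})

# rename accepts a function and applies it to every column label
df = df.rename(columns=title_to_camel)
print(list(df.columns))
```

One difference worth knowing: .title() also capitalizes a letter that directly follows a digit, so a name such as X_3D would come out as x3D here, whereas the split-and-capitalize function gives x3d.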