python pandas nonetype keyerror

Unable to fetch data by column name from a pandas DataFrame after assigning a column name


I have created a pandas DataFrame from an API endpoint. I used the following to convert the response to a df:

df = pd.json_normalize(js.loads(js.dumps(key_value)))

This creates the df as required, but the column names are the reference IDs. I extracted the column of interest with:

new_df = df.iloc[:,4]

Since I will need to merge this df with another df at a later stage, I assigned it a column name using:

new_df.columns = ["EMP_ID"]

However, when I use

print(new_df['EMP_ID'])

I get None.

I figured out that pandas does not recognise the column name. Is there a different way to assign a column name in this scenario?

Note: Merging this df with another raises a KeyError because the column name "EMP_ID" is not recognised.


Solution

  • It seems like you're trying to name the column in new_df "EMP_ID". Selecting a single column position with df.iloc[:, 4] returns a pandas Series, not a DataFrame, and a Series has no columns to rename. Convert it to a DataFrame first and name the column at the same time. If new_df were already a DataFrame, you could rename the column directly.

    new_df = df.iloc[:, 4]
    

    Convert the Series to a DataFrame:

    new_df = new_df.to_frame(name="EMP_ID")
    

    OR

    If new_df is already a DataFrame, add a new empty column "EMP_ID" with None as a placeholder (on a Series, this assignment adds an index label, not a column):

    new_df["EMP_ID"] = None
    

    Now new_df has a new empty column named "EMP_ID".

    Later, you can fill the "EMP_ID" column, for example with values from another DataFrame or a list:

    new_df["EMP_ID"] = df2["EMP_ID"]
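
As a rough end-to-end sketch of the Series-to-DataFrame fix, the following uses made-up data in place of the API response. The column names ref_0 to ref_4, the frame other, and its NAME column are hypothetical and only there to make the example runnable. It also shows an alternative that is not in the question: selecting with a list of positions, df.iloc[:, [4]], which keeps a DataFrame so that assigning to .columns works.

```python
import pandas as pd

# Hypothetical stand-in for the API response; in the question the
# column names are reference IDs.
df = pd.DataFrame({
    "ref_0": [1, 2, 3],
    "ref_1": ["a", "b", "c"],
    "ref_2": [0.1, 0.2, 0.3],
    "ref_3": [True, False, True],
    "ref_4": [101, 102, 103],
})

# A single integer column position returns a Series, which has no columns.
as_series = df.iloc[:, 4]

# Option 1: convert the Series to a one-column DataFrame and name the column.
new_df = as_series.to_frame(name="EMP_ID")

# Option 2: select with a list of positions to keep a DataFrame,
# then assign the column names.
alt_df = df.iloc[:, [4]].copy()
alt_df.columns = ["EMP_ID"]

# Hypothetical second frame to merge with.
other = pd.DataFrame({"EMP_ID": [101, 103], "NAME": ["Ann", "Bob"]})
merged = new_df.merge(other, on="EMP_ID", how="inner")
print(merged)
```

Either option gives a frame whose "EMP_ID" column can be used as the merge key, so the later merge should no longer raise a KeyError for that name.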