python pandas keyerror

KeyError: Not Found in Axis during pd.DataFrame.drop()


I am playing with the Accidental Drug Related Deaths 2012-2018 dataset. I am using the following drugs array:

array(['Heroin', 'Cocaine', 'Fentanyl', 'FentanylAnalogue', 'Oxycodone',
       'Oxymorphone', 'Ethanol', 'Hydrocodone', 'Benzodiazepine',
       'Methadone', 'Amphet', 'Tramad', 'Morphine_NotHeroin',
       'Hydromorphone', 'Other', 'OpiateNOS', 'AnyOpioid'], dtype=object)

The dataset comes with these pre-one-hot-encoded. I'm trying to convert them to a single column then remove the original features:

deaths['DrugDeath'] = deaths[drugs].idxmax(1)
deaths = deaths.drop(drugs)

When I do this, I get the following error:

KeyError: "['Heroin' 'Cocaine' 'Fentanyl' 'FentanylAnalogue' 'Oxycodone'\n 'Oxymorphone' 
'Ethanol' 'Hydrocodone' 'Benzodiazepine' 'Methadone'\n 'Amphet' 'Tramad' 'Morphine_NotHeroin'
 'Hydromorphone' 'Other'\n 'OpiateNOS' 'AnyOpioid'] not found in axis"

What confuses me about this is not only the error, but also the fact that this was not the format that I input. I put in an array. Now, as per the pd.DataFrame.drop() documentation, I need a "single label or list-like." Unsure of what "list-like" means or whether an array qualifies, I decided to try deaths = deaths.drop(drugs.tolist()), but to no avail.

What is this error, and what does it mean? Why is it responding to a structure of my input that I didn't use? How do I correctly implement this?


Solution

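  • It's hard to be certain without seeing your full code and what your deaths df looks like, but drop() looks for labels in the row index by default (axis=0), so column names passed that way come back as "not found in axis". Try telling it to drop columns instead: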

    deaths = deaths.drop(columns=list(drugs))
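
    To make the difference concrete, here is a minimal sketch on a tiny made-up frame. The 'Age' column and the three-row data are assumptions for illustration only, not the real dataset. It reproduces the KeyError with the default axis and shows that both the array and a plain list count as "list-like" once drop() is pointed at the columns. The odd formatting in the error message most likely comes from pandas converting the labels to a NumPy array internally and printing that, though the exact wording varies between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature stand-in for the deaths data (assumed columns and values).
drugs = np.array(['Heroin', 'Cocaine', 'Fentanyl'], dtype=object)
deaths = pd.DataFrame({
    'Age': [34, 51, 27],
    'Heroin': [1, 0, 0],
    'Cocaine': [0, 1, 0],
    'Fentanyl': [0, 0, 1],
})

# Collapse the one-hot columns into a single label column.
deaths['DrugDeath'] = deaths[drugs].idxmax(axis=1)

# Default axis=0: pandas searches the row index for these labels and fails.
try:
    deaths.drop(drugs)
    raised = False
except KeyError as err:
    raised = True
    print(err)

# Pointing drop() at the columns works with the array or with a list.
result = deaths.drop(columns=drugs)
same = deaths.drop(drugs.tolist(), axis=1)
print(result.columns.tolist())  # ['Age', 'DrugDeath']
```

    So the array itself was never the problem; drugs.tolist() fails for the same reason as the array when the axis is left at its default.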