python pandas chaining

SettingWithCopyWarning using .loc


I have a dataframe that holds data about different plants on different dates and days. [image of the dataframe]

I have created a new dataframe which contains only the plants I want, using:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

df_plants = pd.read_csv('Data_plants_26_11_2019.csv')
df_Nit=pd.read_csv('chemometrics.csv')

#create new columns which contain only the hour and only the date, using lambda
df_plants['Hour']=df_plants['time'].apply(lambda time: time.split(' ')[1])
df_plants['date']=df_plants['time'].apply(lambda time: time.split(' ')[0])

#select only my plants
options=['J01B','J01C','J02C','J02D','J03B','J03C','J04C','J08C','J08D','J09A','J09C','J10A','J12C','J12D','J13A','J14A','J15A','J18A']
filter_plants=df_plants[df_plants['plant'].isin(options)]

After creating this, I tried to compute some indices (using columns that are not shown in the image), but I started getting a warning that I don't get when I compute them on all the plants:

filter_plants['NDVI']=(filter_plants['801.03']- filter_plants['680.75'])/(filter_plants['801.03']+filter_plants['680.75'])

C:\ProgramData\Anaconda2\lib\site-packages\ipykernel_launcher.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.

I have read about this warning here https://www.dataquest.io/blog/settingwithcopywarning/ and I thought it was because I hadn't created filter_plants with .loc, so I tried adding it:

#select only plants whose nitrogen content was checked
options=['J01B','J01C','J02C','J02D','J03B','J03C','J04C','J08C','J08D','J09A','J09C','J10A','J12C','J12D','J13A','J14A','J15A','J18A']
filter_plants=df_plants.loc[df_plants['plant'].isin(options)]

But it didn't help; the result was the same. I also tried using .loc in the computation of the indices, but I still get the same warning.

What is the problem? How can I fix it so that I don't get this warning in the next steps I run on this dataframe?

My end goal is, of course, to get rid of the warning.


Solution

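  • My usual way of dealing with this warning is df.copy(). Basically, the issue here is that pandas treats filter_plants as something derived from df_plants rather than as its own dataframe (if I understand correctly), so when you assign a new column it can't tell whether you meant to change df_plants as well, and it warns. Using .loc in the filter doesn't change that; an explicit .copy() does.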

    If you simply change your line to:

    filter_plants=df_plants[df_plants['plant'].isin(options)].copy()
    

    That should fix it.
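A minimal sketch of the fix above, using a small made-up dataframe in place of the CSV (the values and the 'X99Z' plant are hypothetical; only the column names come from the question). It records any warnings raised while the NDVI column is assigned, and compares the warning by class name so it doesn't rely on where a particular pandas version defines it:

```python
import warnings

import pandas as pd

# Hypothetical stand-in for df_plants; the real data comes from the CSV.
df_plants = pd.DataFrame({
    "plant": ["J01B", "J01C", "X99Z", "J02C"],
    "801.03": [0.50, 0.60, 0.40, 0.80],
    "680.75": [0.10, 0.20, 0.30, 0.20],
})
options = ["J01B", "J01C", "J02C"]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # .copy() makes filter_plants an independent dataframe
    filter_plants = df_plants[df_plants["plant"].isin(options)].copy()
    nir = filter_plants["801.03"]
    red = filter_plants["680.75"]
    filter_plants["NDVI"] = (nir - red) / (nir + red)

# Keep only warnings whose class is named SettingWithCopyWarning.
swc = [w for w in caught if w.category.__name__ == "SettingWithCopyWarning"]
print(len(swc))
print("NDVI" in df_plants.columns)
```

The new column lands only on filter_plants; df_plants is left untouched, which is usually what you want after filtering.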