Search code examples
pythonpandaspython-datetimepandas-loc

Logical statement for datetime64[ns]


I am encountering the following error when I use df.loc. I have a parameter s which is datetime64[ns], I have dataframe with two of the column MDT1 and MTD2, which is also datetime64[ns] type. When I use the following statement using df.loc

s= df_1x2['date'].astype (str)+'-'+ df_1x2['Time'].astype(str)
s = pd.to_datetime(s)
df_1x2.loc[(s>df_1x2['MDT1'])&(s<df_1x2['MDT2']),'Morning']= s

The last line of the code shows me the following error in Jupyter Notebook

ValueError: Can only compare identically-labeled Series objects

However, when I run second time, it works as expected. It was just that the first run in Jupyter Notebook got me the error.

Below is the link to download the sample data file:

Click here to download the sample data file

Below is the code:

s = pd.to_datetime(df['Date']).astype(str)+' '+df['Time.1'].astype(str)+':00'
df['Matchdate'] = pd.to_datetime (s)

d =np.reciprocal(df_1x2[['Home','Draw','Away']].astype(float))

df_1x2['Margin'] = d.sum(axis = 1)-1
df_1x2['Open_DT'] = df_1x2['date'].astype (str)+'-'+ df_1x2['Time'].astype(str)
df_1x2['Open_DT'] = pd.to_datetime(df_1x2['Open_DT'])
df_1x2['Open_DT1'] = df_1x2.groupby(['Match','Odds_Type'])['Open_DT'].transform(min)
s = df_1x2['Open_DT'] == df_1x2['Open_DT1']
df_1x2.loc[s,'Open'] ='Opening'


df_1x2['Current_DT'] = df_1x2['date'].astype (str)+'-'+ df_1x2['Time'].astype(str)
df_1x2['Current_DT'] = pd.to_datetime(df_1x2['Current_DT'])
df_1x2['Current_DT1'] = df_1x2.groupby(['Match','Odds_Type'])['Current_DT'].transform(max)
sc = df_1x2['Current_DT'] == df_1x2['Current_DT1']
df_1x2.loc[sc,'Current'] ='Current'
df_1x2.loc[df_1x2.Open == 'Opening','Current']='Opening'

df_1x2 ['morning Date'] = pd.Timestamp.today()
Time1 = '08:00:00'
Time2 = '11:00:00'
s = df_1x2['date'].astype (str)+'-'+ df_1x2['Time'].astype(str)
s = pd.to_datetime(s)
df_1x2.reset_index(drop = True, inplace = True)

df_1x2 ['morning Date'] = pd.to_datetime(df_1x2 ['morning Date']).dt.date
df_1x2['MDT1'] = pd.to_datetime(df_1x2['morning Date'].astype(str)+' '+Time1)


df_1x2['MDT2'] = pd.to_datetime(df_1x2['morning Date'].astype(str)+' '+Time2)

df_1x2 .reset_index(drop= True, inplace = True)



df_1x2.loc[(s>df_1x2['MDT1'])&(s<df_1x2['MDT2']),'Morning']= s

How can I rectify it?


Solution

  • You are applying an entire series s to a filtered dataframe column. I think it is easier to just use mask or np.where instead of .loc:

    df_1x2['Morning'] = df_1x2['Morning'].mask((s > df_1x2['MDT1']) & (s < df_1x2['MDT2']), s)
    

    or np.where:

    df_1x2['Morning'] = np.where((s > df_1x2['MDT1']) & (s < df_1x2['MDT2']), s, df_1x2['Morning'])
    

    If the column doesn't exist yet, then use:

    df_1x2['Morning'] = np.where((s > df_1x2['MDT1']) & (s < df_1x2['MDT2']), s, np.datetime64('NaT'))
    

    .loc would be okay for this use case if you were assigning a single value like 10 rather than a series that has a longer length than your filtered dataframe.


    The issue is that you reset_index() after defining s, which means it is not comparable to your dataframe:

    s = pd.to_datetime(df['Date']).astype(str)+' '+df['Time.1'].astype(str)+':00'
    df['Matchdate'] = pd.to_datetime (s)
    
    d =np.reciprocal(df_1x2[['Home','Draw','Away']].astype(float))
    
    df_1x2['Margin'] = d.sum(axis = 1)-1
    df_1x2['Open_DT'] = df_1x2['date'].astype (str)+'-'+ df_1x2['Time'].astype(str)
    df_1x2['Open_DT'] = pd.to_datetime(df_1x2['Open_DT'])
    df_1x2['Open_DT1'] = df_1x2.groupby(['Match','Odds_Type'])['Open_DT'].transform(min)
    s = df_1x2['Open_DT'] == df_1x2['Open_DT1']
    df_1x2.loc[s,'Open'] ='Opening'
    
    
    df_1x2['Current_DT'] = df_1x2['date'].astype (str)+'-'+ df_1x2['Time'].astype(str)
    df_1x2['Current_DT'] = pd.to_datetime(df_1x2['Current_DT'])
    df_1x2['Current_DT1'] = df_1x2.groupby(['Match','Odds_Type'])['Current_DT'].transform(max)
    sc = df_1x2['Current_DT'] == df_1x2['Current_DT1']
    df_1x2.loc[sc,'Current'] ='Current'
    df_1x2.loc[df_1x2.Open == 'Opening','Current']='Opening'
    
    df_1x2 ['morning Date'] = pd.Timestamp.today()
    Time1 = '08:00:00'
    Time2 = '11:00:00'
    
    ########### I moved s from here...........
    
    df_1x2.reset_index(drop = True, inplace = True)
    
    df_1x2 ['morning Date'] = pd.to_datetime(df_1x2 ['morning Date']).dt.date
    df_1x2['MDT1'] = pd.to_datetime(df_1x2['morning Date'].astype(str)+' '+Time1)
    
    
    df_1x2['MDT2'] = pd.to_datetime(df_1x2['morning Date'].astype(str)+' '+Time2)
    
    df_1x2 .reset_index(drop= True, inplace = True)
    
    ########### .........to here. 
    ##### Resetting the  index after defining s, means you can no longer compare directly to your dataframe as s has a different index than your dataframe now.
    
    s = df_1x2['date'].astype (str)+'-'+ df_1x2['Time'].astype(str)
    s = pd.to_datetime(s)
    
    
    df_1x2.loc[(s>df_1x2['MDT1'])&(s<df_1x2['MDT2']),'Morning']= s