python excel algorithm pandas data-analysis

Ambiguous truth value with boolean logic


I am trying to use some boolean logic in a function on a dataframe, but get an error:

In [4]:

from pandas import DataFrame
data={'level':[20,19,20,21,25,29,30,31,30,29,31]}
frame=DataFrame(data)
frame
Out[4]:
level
0   20
1   19
2   20
3   21
4   25
5   29
6   30
7   31
8   30
9   29
10  31

In [35]:

def calculate(x):
    baseline=max(frame['level'],frame['level'].shift(1))#doesnt work
    #baseline=x['level']+4#works
    difftobase=x['level']-baseline
    return baseline, difftobase
frame['baseline'], frame['difftobase'] = zip(*frame.apply(calculate, axis=1))#works

However, this throws an error at the following line:

baseline=max(frame['level'],frame['level'].shift(1))#doesnt work


ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', u'occurred at index 0')

I read "How to look back at previous rows from within Pandas dataframe function call?" and http://pandas.pydata.org/pandas-docs/stable/gotchas.html but can't figure out how to apply them to my problem.


Solution

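  • The problem is the use of the built-in max. It compares its two arguments with a single > test and needs one True/False answer, but comparing two Series yields a whole Series of booleans, hence the "truth value of a Series is ambiguous" error. np.maximum compares element by element and works. (np.ma.max, despite the similar name, reduces a single array along an axis, so it is not a drop-in replacement.) Replacing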

    baseline=max(frame['level'],frame['level'].shift(1))#doesnt work
    

    with

    baseline=np.maximum(frame['level'],frame['level'].shift(1))
    

    does the trick. I removed the other part to make it easier to read:

    In [23]:
    import numpy as np

    def calculate_rowise(x):
        # x is ignored: the whole 'level' column is compared with its shifted copy
        baseline=np.maximum(frame['level'],frame['level'].shift(1))#works
        return baseline
    frame.apply(calculate_rowise)
    
    Out[23]:
    level
    0   NaN
    1   20
    2   20
    3   21
    4   25
    5   29
    6   30
    7   31
    8   31
    9   30
    10  31
    

    PS: the original problem hides another issue that shows up when the shift portion of the function is taken out. The return shape doesn't match, but that's a separate problem; I'm just mentioning it here for full disclosure.
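
As a sketch of the same fix without apply: since np.maximum already works on whole columns, both new columns from the question can be computed directly. The snippet below rebuilds the question's frame, confirms that the built-in max raises the ValueError, and then derives baseline and difftobase. The names previous and builtin_max_failed are introduced here for illustration and are not from the original post.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({'level': [20, 19, 20, 21, 25, 29, 30, 31, 30, 29, 31]})
previous = frame['level'].shift(1)  # previous row's level; NaN for the first row

# The built-in max() needs a single True/False from comparing its arguments,
# which a multi-element Series refuses to provide.
try:
    max(frame['level'], previous)
    builtin_max_failed = False
except ValueError:
    builtin_max_failed = True

# np.maximum compares element by element and propagates NaN,
# so the first row (which has no previous value) stays NaN.
frame['baseline'] = np.maximum(frame['level'], previous)
frame['difftobase'] = frame['level'] - frame['baseline']
print(frame)
```

The first row has no previous value, so its baseline is NaN. If it should instead fall back to its own level, np.fmax, which ignores a NaN when the other operand is a number, could be swapped in for np.maximum.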