Tags: python, pandas, dataframe, pandas-loc

IndexingError with DataFrame.loc when combining two dataframes


I'm trying to combine these two dataframes (data2 and trades) in one plot so that it looks like this: https://i.sstatic.net/pR8bW.png

data2:

            Close
2015-08-28  113.290001
2015-08-31  112.760002
2015-09-01  107.720001
2015-09-02  112.339996
2015-09-03  110.370003
2015-09-04  109.269997
2015-09-08  112.309998
2015-09-09  110.150002
2015-09-10  112.570000
2015-09-11  114.209999

trades:

               Trades
2015-08-28     3.0
2015-08-31     3.0
2015-09-01     3.0
2015-09-02     3.0
2015-09-03     2.0

code:

import matplotlib.pyplot as plt

fig = plt.figure()

ax1 = fig.add_subplot(111, ylabel='Portfolio value in $')

data2["Close"].plot(ax=ax1, lw=2.)

ax1.plot(data2.loc[trades.Trades == 2.0].index, data2.total[trades.Trades == 2.0],
         '^', markersize=10, color='m')
ax1.plot(data2.loc[trades.Trades == 3.0].index, 
         data2.total[trades.Trades == 3.0],
         'v', markersize=10, color='k')

plt.show()

But this gives the following error:

---------------------------------------------------------------------------
IndexingError                             Traceback (most recent call last)
<ipython-input-38-9cde686354a8> in <module>()
      7 data2["Close"].plot(ax=ax1, lw=2.)
      8 
----> 9 ax1.plot(data2.loc[trades.Trades == 2.0].index, data2.total[trades.Trades == 2.0],
     10          '^', markersize=10, color='m')
     11 ax1.plot(data2.loc[trades.Trades == 3.0].index, 

3 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in check_bool_indexer(index, key)
   2316         if mask.any():
   2317             raise IndexingError(
-> 2318                 "Unalignable boolean Series provided as "
   2319                 "indexer (index of the boolean Series and of "
   2320                 "the indexed object do not match)."

IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

Solution

  • The indexes of the two dataframes are different: data2 contains dates that trades does not, so a boolean Series built from trades cannot be aligned with data2. I've taken the approach of defining masks on data2 that are based on the values in trades, and it works.

    Additionally, your sample code refers to total, which does not exist in data2. I've updated it to use Close.

    import pandas as pd
    import io
    import matplotlib.pyplot as plt
    
    data2 = pd.read_csv(io.StringIO("""            Close
    2015-08-28  113.290001
    2015-08-31  112.760002
    2015-09-01  107.720001
    2015-09-02  112.339996
    2015-09-03  110.370003
    2015-09-04  109.269997
    2015-09-08  112.309998
    2015-09-09  110.150002
    2015-09-10  112.570000
    2015-09-11  114.209999"""), sep=r"\s+")
    
    trades = pd.read_csv(io.StringIO("""               Trades
    2015-08-28     3.0
    2015-08-31     3.0
    2015-09-01     3.0
    2015-09-02     3.0
    2015-09-03     2.0"""), sep=r"\s+")
    
    # convert the index of both frames from strings to datetimes
    data2 = data2.reset_index().assign(index=lambda x: pd.to_datetime(x["index"])).set_index("index")
    trades = trades.reset_index().assign(index=lambda x: pd.to_datetime(x["index"])).set_index("index")
    
    fig = plt.figure()
    ax1 = fig.add_subplot(111, ylabel='Portfolio value in $')
    
    data2["Close"].plot(ax=ax1, lw=2.)
    
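    # pick the dates in trades where the condition holds, then flag those dates in data2
    # (taking .index of the boolean Series itself would return every date in trades)
    mask2 = data2.index.isin(trades.index[trades.Trades == 2.0])
    mask3 = data2.index.isin(trades.index[trades.Trades == 3.0])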
    
    ax1.plot(data2.loc[mask2].index, data2.Close[mask2],
             '^', markersize=10, color='m')
    ax1.plot(data2.loc[mask3].index, 
             data2.Close[mask3],
             'v', markersize=10, color='k')
    
    
    plt.show()
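
As a side note, the error itself can be reproduced without any plotting, and an alternative to isin is to reindex the boolean Series onto the index of data2. The following is only a minimal sketch, assuming the first six rows of the sample data and a plain DatetimeIndex on both frames; the variable raised is just a helper for illustration.

```python
import pandas as pd

# First six dates of the sample data; trades covers only the first five.
dates = pd.to_datetime(["2015-08-28", "2015-08-31", "2015-09-01",
                        "2015-09-02", "2015-09-03", "2015-09-04"])
data2 = pd.DataFrame(
    {"Close": [113.290001, 112.760002, 107.720001,
               112.339996, 110.370003, 109.269997]},
    index=dates,
)
trades = pd.DataFrame({"Trades": [3.0, 3.0, 3.0, 3.0, 2.0]}, index=dates[:5])

# The boolean Series has no entry for 2015-09-04, so pandas cannot
# align it with data2 and refuses to use it as an indexer.
try:
    data2.loc[trades.Trades == 2.0]
    raised = False
except Exception:  # IndexingError in the traceback above
    raised = True

# Extend each boolean Series to data2's index, treating missing dates as False.
mask2 = (trades.Trades == 2.0).reindex(data2.index, fill_value=False)
mask3 = (trades.Trades == 3.0).reindex(data2.index, fill_value=False)

print(raised)
print(data2.loc[mask2])
print(data2.loc[mask3])
```

With masks built this way, data2.loc[mask2].index and data2.Close[mask2] can be passed to ax1.plot exactly as in the code above.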
    

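
Another option that may be worth considering is to join the two frames on their index first, so every comparison runs against a single frame and alignment is no longer an issue. This is a sketch under the same assumptions (six rows of the sample data, DatetimeIndex on both frames); the names merged, rows_2 and rows_3 are made up for illustration.

```python
import pandas as pd

dates = pd.to_datetime(["2015-08-28", "2015-08-31", "2015-09-01",
                        "2015-09-02", "2015-09-03", "2015-09-04"])
data2 = pd.DataFrame(
    {"Close": [113.290001, 112.760002, 107.720001,
               112.339996, 110.370003, 109.269997]},
    index=dates,
)
trades = pd.DataFrame({"Trades": [3.0, 3.0, 3.0, 3.0, 2.0]}, index=dates[:5])

# Left join on the index: every row of data2 is kept, and Trades is NaN
# on dates that do not appear in trades.
merged = data2.join(trades)

# NaN never compares equal to 2.0 or 3.0, so unmatched dates drop out.
rows_2 = merged[merged["Trades"] == 2.0]
rows_3 = merged[merged["Trades"] == 3.0]

print(merged)
print(rows_2)
print(rows_3)
```

The markers could then be drawn with ax1.plot(rows_2.index, rows_2["Close"], '^', ...) and ax1.plot(rows_3.index, rows_3["Close"], 'v', ...).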