I've imported stock market data using pandas_datareader:
from pandas_datareader import data, wb
import pandas as pd
import datetime
start = datetime.datetime(2006, 1,1)
end = datetime.datetime(2016, 1,1)
boaml = data.DataReader('BAC', 'morningstar', start, end)
citi = data.DataReader('C', 'morningstar', start, end)
The data looks neat, as shown by the output of citi.head():
                   Close   High    Low   Open   Volume
Symbol Date
C      2006-01-02  485.3  487.1  482.2  483.5        0
       2006-01-03  492.9  493.8  481.1  490.0  1536700
       2006-01-04  483.8  491.0  483.5  488.6  1852790
       2006-01-05  486.2  487.8  484.0  484.4  1015470
       2006-01-06  486.2  489.0  482.0  488.8  1358930
Now, when I try to concatenate them using pd.concat(), I get NaN in the upper-right and lower-left corners of the matrix:
bank_stocks = pd.concat([boaml, citi], axis=1, join='outer')
Look at bank_stocks.head():
                   Close   High    Low   Open      Volume  Close  High  Low  Open  Volume
Symbol Date
BAC    2006-01-02  46.15  46.36  45.91  46.02         0.0    NaN   NaN  NaN   NaN     NaN
       2006-01-03  47.08  47.18  46.15  46.92  16197900.0    NaN   NaN  NaN   NaN     NaN
       2006-01-04  46.58  47.24  46.45  47.00  17427400.0    NaN   NaN  NaN   NaN     NaN
       2006-01-05  46.64  46.83  46.32  46.58  14668900.0    NaN   NaN  NaN   NaN     NaN
       2006-01-06  46.57  46.91  46.35  46.80  11965700.0    NaN   NaN  NaN   NaN     NaN
And bank_stocks.tail():
                   Close  High  Low  Open  Volume  Close   High    Low   Open      Volume
Symbol Date
C      2015-12-28    NaN   NaN  NaN   NaN     NaN  52.38  52.57  51.96  52.57   8760674.0
       2015-12-29    NaN   NaN  NaN   NaN     NaN  52.98  53.22  52.74  52.76  10153634.0
       2015-12-30    NaN   NaN  NaN   NaN     NaN  52.30  52.94  52.25  52.84   8763137.0
       2015-12-31    NaN   NaN  NaN   NaN     NaN  51.75  52.39  51.75  52.07  11275231.0
       2016-01-01    NaN   NaN  NaN   NaN     NaN  51.75  51.75  51.75  51.75         0.0
(Apologies in advance if the output isn't clear; I hope the code makes the problem easy to reproduce.)
I understand that the problem lies with Symbol; however, I have tried MultiIndexing and it did not work.
Any idea how I can obtain a matrix that concatenates the stock data for both boaml and citi under the same Date, without showing NaN?
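Level 0 of your MultiIndex, 'Symbol', is causing the issue: with axis=1, pd.concat aligns rows on the full (Symbol, Date) index, and no row of boaml shares a Symbol with any row of citi. Try removing that level and then concatenating: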
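# Drop level 0 ('Symbol') so each frame is indexed by Date alone
citi.index = citi.index.droplevel()
boaml.index = boaml.index.droplevel()
# Suffix the column names so the two banks stay distinguishable
pd.concat([citi.add_suffix('_citi'), boaml.add_suffix('_boaml')], axis=1)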
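To check this without downloading anything, here is a minimal sketch that rebuilds the situation on two tiny frames. The helper make_frame and the two-row frames are hypothetical stand-ins for the DataReader results (the Close values are taken from the output in the question), and DataFrame.droplevel is used as a non-mutating alternative to reassigning the index, assuming pandas 0.24 or later.

```python
import pandas as pd

# Hypothetical miniature frames that mimic the (Symbol, Date) MultiIndex
# from the question; only two dates and one column are kept for brevity.
dates = pd.to_datetime(["2006-01-03", "2006-01-04"])


def make_frame(symbol, closes):
    """Build a one-column frame indexed by (Symbol, Date)."""
    index = pd.MultiIndex.from_product(
        [[symbol], dates], names=["Symbol", "Date"]
    )
    return pd.DataFrame({"Close": closes}, index=index)


boaml = make_frame("BAC", [47.08, 46.58])
citi = make_frame("C", [492.9, 483.8])

# With axis=1, concat aligns on the full row index. ('BAC', date) never
# equals ('C', date), so the outer join keeps all four rows and pads
# the non-matching cells with NaN.
misaligned = pd.concat([boaml, citi], axis=1)

# Dropping the 'Symbol' level leaves Date as the only key, so rows line up.
# The suffixes keep track of which bank each column belongs to.
aligned = pd.concat(
    [
        boaml.droplevel("Symbol").add_suffix("_boaml"),
        citi.droplevel("Symbol").add_suffix("_citi"),
    ],
    axis=1,
)

print(misaligned)
print(aligned)
```

Unlike the in-place index assignment above, this leaves boaml and citi untouched, so the snippet can be rerun safely.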