First, let me say that ggplot for Python is the beginning of something great, and kudos to the developers for putting in the work. Currently I'm having two major issues with the same plot. If I plot 8 stocks or fewer, the image looks good except that the legend runs off the figure area (Problem 1). If I plot more than 8 stocks, the plot shows some erratic lines that are clearly not representative of the data. Additionally, the legend does not resize and instead leaves off the additional stock tickers (Problem 2). Any help is appreciated. Thanks!
Decent Plot Code:
import datetime
from ggplot import *
import pandas.io.data as web
import pandas as pd
import numpy as np
start = datetime.datetime(2014,1,1)
end = datetime.datetime(2014, 3,19)
stocks = ['APO','AVG','FI','ANIK','CELG','PACW','CBOE','BIIB']
stockData = {}
for ticker in stocks:
    stockData[ticker] = web.get_data_yahoo(ticker, start, end)
price = pd.DataFrame({tic: data['Adj Close'] for tic, data in stockData.iteritems()})
returns = price.pct_change()
returns = returns.apply(np.cumsum)
rt = returns.index
returns['Date'] = rt
# plotting the cum performance for each security
ret = pd.melt(returns, id_vars='Date')
plot = ggplot(aes(x='Date', y='value', color='variable'),data=ret) +geom_line()
# plotting the equity curve of the theoretical portfolio
zt = returns
del zt['Date']
zt = zt.apply(np.sum, axis=1)
z = pd.DataFrame(zt, index=zt.index)
z['Date'] = rt
z.columns = ['equity curve', 'Date']
ret2 = pd.melt(z, id_vars='Date')
plot2 = ggplot(aes(x='Date', y='value'),data=ret2) +geom_line()
print plot
print plot2
BAD Plot Code:
import datetime
from ggplot import *
import pandas.io.data as web
import pandas as pd
import numpy as np
start = datetime.datetime(2014,1,1)
end = datetime.datetime(2014, 3,19)
stocks = ['APO','AVG','FI','ANIK','CELG','PACW','CBOE','BIIB','ISIS', 'SDRL'] # <-- notice two additional tickers
stockData = {}
for ticker in stocks:
    stockData[ticker] = web.get_data_yahoo(ticker, start, end)
price = pd.DataFrame({tic: data['Adj Close'] for tic, data in stockData.iteritems()})
returns = price.pct_change()
returns = returns.apply(np.cumsum)
rt = returns.index
returns['Date'] = rt
# plotting the cum performance for each security
ret = pd.melt(returns, id_vars='Date')
plot = ggplot(aes(x='Date', y='value', color='variable'),data=ret) +geom_line()
# plotting the equity curve of the theoretical portfolio
zt = returns
del zt['Date']
zt = zt.apply(np.sum, axis=1)
z = pd.DataFrame(zt, index=zt.index)
z['Date'] = rt
z.columns = ['equity curve', 'Date']
ret2 = pd.melt(z, id_vars='Date')
plot2 = ggplot(aes(x='Date', y='value'),data=ret2) +geom_line()
print plot
print plot2
Problem 2 occurs because ggplot runs out of colors. You can fix it by adding more colors; just add the following to the beginning of your code:
import ggplot as gg
gg.colors.COLORS.extend(["#ff0000", "#00ff00", "#0000ff"])
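If the number of tickers varies, three hard-coded extras may not be enough. Here is a minimal sketch, assuming evenly spaced hues are acceptable for the lines. The helper name distinct_hex_colors is hypothetical (it is not part of ggplot) and uses only the standard-library colorsys module; the resulting list could be passed to the same gg.colors.COLORS.extend(...) call shown above.

```python
import colorsys


def distinct_hex_colors(n, saturation=0.75, value=0.85):
    """Return n hex color strings with hues spaced evenly around the color wheel."""
    colors = []
    for i in range(n):
        hue = float(i) / n
        r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
        # Convert each 0..1 channel to a two-digit hex component.
        colors.append("#%02x%02x%02x" % (
            int(round(r * 255)), int(round(g * 255)), int(round(b * 255))))
    return colors


extra = distinct_hex_colors(10)

# Assumption: gg.colors.COLORS is the mutable palette list used in the answer.
# import ggplot as gg
# gg.colors.COLORS.extend(extra)
```

Generating the colors from the number of tickers (for example, distinct_hex_colors(len(stocks))) avoids having to revisit the palette each time the stock list grows.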
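For Problem 1, it seems we need to reposition the legend after creating the figure. Note the added .dropna(), which drops the NaN rows (pct_change() leaves the first row of each ticker as NaN):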
ret = pd.melt(returns, id_vars='Date').dropna()
plot = ggplot(aes(x='Date', y='value', color='variable'), data=ret) +geom_line()
fig = plot.draw()
ax = fig.axes[0]
offbox = ax.artists[0]
offbox.set_bbox_to_anchor((1, 0.5), ax.transAxes)
Here is the result:
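As a point of comparison, the snippet above appears to depend on how that ggplot version stores its legend (as ax.artists[0]). Below is a rough sketch of the same idea in plain matplotlib, using synthetic lines and placeholder labels rather than the stock data. It shrinks the axes and anchors the legend just outside their right edge; the figure size and the 0.8 margin are arbitrary assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 4))

# Placeholder labels and data, not real tickers or returns.
labels = ["T%d" % i for i in range(10)]
for i, label in enumerate(labels):
    ax.plot([0, 1, 2], [0, i, 2 * i], label=label)

# Shrink the axes so there is room for the legend on the right of the figure.
fig.subplots_adjust(right=0.8)

# Anchor the legend's left-center point at the right edge of the axes
# (coordinates are in axes units, so x=1.0 is the right edge).
legend = ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5), title="variable")

fig.canvas.draw()  # force a render; use fig.savefig(...) to write an image
```

With the axes narrowed first, the legend stays inside the figure area instead of running off it.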