I'm trying to construct a relative frequency plot of the logarithmic returns of some assets (shares, specifically), with a log10 scale on the y-axis set via plt.yscale('log'). However, the graph I produced in Python, shown below, has vertical lines that clearly tend to infinity at certain points:
This obviously shouldn't happen. Instead, it should look like this graph:
This is clearly similar to mine, except that it doesn't include the vertical lines on those points. My code is as follows:
plt.figure(1)
plt.figure(figsize=(9,7))
hist1, bins1 = np.histogram(returns_assetA_daily_mat, bins=20)
hist2, bins2 = np.histogram(returns_assetA_weekly_mat, bins=20)
hist3, bins3 = np.histogram(returns_assetA_monthly_mat, bins=20)
hist1 = hist1/len(returns_assetA_daily_mat)
hist2 = hist2/len(returns_assetA_weekly_mat)
hist3 = hist3/len(returns_assetA_monthly_mat)
bins1 = 0.5 * (bins1[1:] + bins1[:-1])
bins2 = 0.5 * (bins2[1:] + bins2[:-1])
bins3 = 0.5 * (bins3[1:] + bins3[:-1])
plt.plot(bins1, hist1, bins2, hist2, bins3, hist3)
plt.yscale('log')
plt.xlabel('Log-Returns')
plt.ylabel('Relative Frequency')
plt.title('Original for Asset A')
plt.show()
It's useful to know that returns_assetA_daily_mat, returns_assetA_weekly_mat and returns_assetA_monthly_mat are simply row arrays holding the daily, weekly and monthly logarithmic returns of the assets. They contain negative values, positive values and zeros. Since I'm using a log10 scale on the y-axis, could the negative values or zeros be the cause, given that a logarithm tends to minus infinity as its argument tends to zero? Or does the issue lie in the structure of my code?
If there is no way around this, can I isolate the points where the vertical lines tend to minus infinity, so that they appear as isolated points instead? I'm a Python newbie, currently learning it as part of my master's degree in Computational Finance, so any help would be highly appreciated. Thanks a lot in advance!
In Matplotlib, one way to "remove" points visually from a plot is to set the affected points to numpy.nan. The line is then broken at each numpy.nan, leaving a gap between the points before and after it, which is what I believe you are after.
Therefore, for your arrays, find any values that are negative or 0 and set them to numpy.nan before plotting. Because you are computing histograms, the values can never be negative, so in practice you are only checking for empty bins, i.e. bins whose value is 0. On a log scale those zeros sit at minus infinity, which is what draws the vertical lines.
One thing you need to make sure of is that your arrays have a floating-point type, because numpy.nan only exists for floating-point arrays.
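To illustrate the point about types, here is a minimal sketch with a made-up array of bin counts (the values and variable names are only for illustration). It shows that an integer array rejects NaN, while a float copy accepts it. In Python 3, dividing the counts by the sample size already gives a float array, so the cast in the code further down is mostly a safeguard.

```python
import numpy as np

# Made-up bin counts; np.histogram returns integer counts like these
counts = np.array([3, 0, 5])

# Integer arrays have no representation for NaN, so this assignment fails
try:
    counts[1] = np.nan
    int_assignment_failed = False
except ValueError:
    int_assignment_failed = True

# A floating-point copy can hold NaN
masked = counts.astype(float)
masked[masked <= 0] = np.nan
```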
If you also want a legend, plot each array separately on the same figure by calling matplotlib.pyplot.plot three times, then add the legend and show the plot:
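import numpy as np
import matplotlib.pyplot as plt
# Create a single figure; calling plt.figure twice would open a second, empty one
plt.figure(1, figsize=(9, 7))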
hist1, bins1 = np.histogram(returns_assetA_daily_mat, bins=20)
hist2, bins2 = np.histogram(returns_assetA_weekly_mat, bins=20)
hist3, bins3 = np.histogram(returns_assetA_monthly_mat, bins=20)
hist1 = hist1/len(returns_assetA_daily_mat)
hist2 = hist2/len(returns_assetA_weekly_mat)
hist3 = hist3/len(returns_assetA_monthly_mat)
bins1 = 0.5 * (bins1[1:] + bins1[:-1])
bins2 = 0.5 * (bins2[1:] + bins2[:-1])
bins3 = 0.5 * (bins3[1:] + bins3[:-1])
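# New code: make float copies, since NaN needs a floating-point dtype
hist1_new = hist1.astype(float)
hist2_new = hist2.astype(float)
hist3_new = hist3.astype(float)
# Replace empty bins with NaN so that they leave a gap in the line
hist1_new[hist1 <= 0] = np.nan
hist2_new[hist2 <= 0] = np.nan
hist3_new[hist3 <= 0] = np.nan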
# New - Plot the three graphs separately for making the legend
# Also plot the NaN versions
plt.plot(bins1, hist1_new, label='Daily')
plt.plot(bins2, hist2_new, label='Weekly')
plt.plot(bins3, hist3_new, label='Monthly')
plt.yscale('log')
plt.xlabel('Log-Returns')
plt.ylabel('Relative Frequency')
plt.title('Original for Asset A')
plt.legend() # Added for the legend
plt.show()
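Since the same steps are repeated for each of the three series, they could also be folded into a small helper. This is only a sketch: relative_frequency is a hypothetical name, not something from NumPy or Matplotlib, and the sample data below is made up. It uses .size rather than len, on the assumption that the inputs might be stored as 1×N row arrays, for which len would return 1.

```python
import numpy as np

def relative_frequency(samples, bins=20):
    """Return bin centres and relative frequencies, with empty bins set to NaN."""
    # Flatten so that 1-D arrays and 1xN row arrays are treated alike
    samples = np.asarray(samples, dtype=float).ravel()
    counts, edges = np.histogram(samples, bins=bins)
    centres = 0.5 * (edges[1:] + edges[:-1])
    # True division yields a float array, so it can hold NaN
    freq = counts / samples.size
    # Empty bins would sit at minus infinity on a log axis; hide them instead
    freq[counts == 0] = np.nan
    return centres, freq

# Made-up data: three values in the first bin, one in the last, none in between
centres, freq = relative_frequency([0, 0, 0, 10], bins=5)
```

Each series could then be drawn with something like plt.plot(*relative_frequency(returns_assetA_daily_mat), label='Daily') before setting the log scale.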
I don't have your data, but I can show you a toy example of this working:
# Import relevant packages
import numpy as np
import matplotlib.pyplot as plt
# Create array from 0 to 8 for the horizontal axis
x = np.arange(9)
# Create test array with some zero, positive and negative values
y = np.array([1, 2, 3, 0, -1, -2, 1, 2, -1])
# Create a figure with two graphs in one row
plt.subplot(1, 2, 1)
# Graph the data normally
plt.plot(x, y)
# Visually remove those points that are zero or negative
y2 = y.astype(float)
y2[y2 <= 0] = np.nan
# Plot these points now
plt.subplot(1, 2, 2)
plt.plot(x, y2)
# Adjust the x and y limits (see further discussion below)
plt.xlim(0, 8)
plt.ylim(-1, 3)
# Show the figure
plt.show()
Note that for the second plot I force the x and y limits to match those of the first plot, because visually removing the points would otherwise cause the axes to adjust automatically. We get: