I wrote a program to approximate an integral using a Riemann sum and graph it using matplotlib in Python. For functions with equal areas above and below the x-axis, the resulting area should be zero, but my program outputs a very small number instead.
The following code graphs the odd function f(x) = x^3 from -1 to 1, so the area should be zero. My code instead approximates it as 1.68065561477562e-15.
What is causing this? Is it a rounding error in delta_x, x, or y? I know I could just round the value to zero, but I'm wondering if there is another problem or way to solve this.
I have tried using the decimal.Decimal class for delta_x, but I just got an even smaller number.
The Python code:
import matplotlib.pyplot as plt
import numpy as np
# Approximates and graphs integral using Riemann Sum
# example function: f(x) = x^3
def f_x(x):
    return x**3
# integration range from a to b with n rectangles
a, b, n = -1, 1, 1000
# calculate delta x, list of x-values, list of y-values, and approximate area under curve
delta_x = (b - a) / n
x = np.arange(a, b+delta_x, delta_x)
y = [f_x(i) for i in x]
area = sum(y) * delta_x
# graph using matplotlib
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, y)
ax.bar(x, y, delta_x, alpha=.5)
plt.title('a={}, b={}, n={}'.format(a, b, n))
plt.xlabel('A = {}'.format(area))
plt.show()
You need to be aware that what you are computing is not a Riemann sum in the original sense. You are dividing the interval into n bins, but then summing over n+1 values (here n = 1000 but len(x) == 1001). So the result may be close to what you expect, but it is certainly not a good way to get there.
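To make the off-by-one visible, here is a minimal sketch with a hypothetical non-symmetric example, g(x) = x^2 on [0, 1] (not from the question), where the extra sample cannot cancel by symmetry. The name g_x and the grid construction a + delta_x * k are my own choices for illustration:

```python
import numpy as np

def g_x(x):
    # hypothetical example function; its exact integral on [0, 1] is 1/3
    return x**2

a, b, n = 0, 1, 1000
delta_x = (b - a) / n

# n + 1 sample points: a, a + delta_x, ..., b
x_all = a + delta_x * np.arange(n + 1)

area_left = g_x(x_all[:-1]).sum() * delta_x  # n rectangles (left sum)
area_all = g_x(x_all).sum() * delta_x        # n + 1 rectangles

print(len(x_all))            # one more point than there are bins
print(area_left, area_all)
print(area_all - area_left)  # the surplus rectangle, about g_x(b) * delta_x
```

The surplus is one full rectangle of height f(b). For x^3 on [-1, 1] it happens to be balanced by the rectangle at a, which is why the original code still lands near zero.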
Using the Riemann sum, you would divide your interval into n bins and then sum over the values of those n bins. You have the choice of computing the left Riemann sum, the right Riemann sum, or the midpoint sum.
import numpy as np
def f_x(x):
    return x**3
# integration range from a to b with n rectangles
a, b, n = -1, 1, 1000
delta_x = (b - a) / float(n)
x_left = np.arange(a, b, delta_x)
x_right = np.arange(a+delta_x, b+delta_x, delta_x)
x_mid = np.arange(a+delta_x/2., b+delta_x/2., delta_x)
print(len(x_left), len(x_right), len(x_mid))  # 1000 1000 1000
area_left = f_x(x_left).sum() * delta_x
area_right = f_x(x_right).sum() * delta_x
area_mid = f_x(x_mid).sum() * delta_x
print(area_left)   # -0.002
print(area_right)  # 0.002
print(area_mid)    # 1.81898940355e-15
While the midpoint sum already gives a good result, for symmetric functions it would be a good idea to choose n even and take the average of the left and right sums:
print(0.5*(area_right+area_left))  # 1.76204537072e-15
This is equally close to 0.
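As a side note, the left and right sums can share one grid of n+1 nodes, which makes it obvious that they differ only in which end point is dropped. The following is a minimal sketch under my own assumptions: the nodes are built as a + delta_x * k, which is not how the code above constructs them, and the name nodes is made up for this example.

```python
import numpy as np

def f_x(x):
    return x**3

a, b, n = -1, 1, 1000
delta_x = (b - a) / n

# n + 1 nodes bounding the n bins
nodes = a + delta_x * np.arange(n + 1)
y = f_x(nodes)

area_left = y[:-1].sum() * delta_x   # drop the last node
area_right = y[1:].sum() * delta_x   # drop the first node
area_avg = 0.5 * (area_left + area_right)

print(area_left, area_right, area_avg)
```

The average of the two sums weights both end points by one half, so it is the same quantity as the trapezoidal rule on this grid.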
Now it is worth noting that numpy.arange introduces some errors of its own when given a floating-point step. A better choice would be numpy.linspace:
x_left = np.linspace(a, b-delta_x, n)
x_right = np.linspace(a+delta_x, b, n)
x_mid = np.linspace(a+delta_x/2., b-delta_x/2., n)
yielding
print(area_left)   # -0.002
print(area_right)  # 0.002
print(area_mid)    # 8.52651282912e-17
print(0.5*(area_right+area_left))  # 5.68121938382e-17
5.68121938382e-17 is pretty close to 0. The reason it is not exactly 0 is indeed floating-point inaccuracy.
The famous example of that would be
0.1 + 0.2 - 0.3
which results in about 5.5e-17 instead of 0. This shows that such a simple operation introduces an error of the same order (1e-17) as the Riemann sum.
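If an exact zero is really wanted for an odd function, one option is to remove both sources of rounding: make the grid exactly symmetric by mirroring the positive half, and add the terms with math.fsum, which returns the correctly rounded sum. This is only a sketch under my own assumptions, not part of the code above. In particular, f is written as x * x * x so that f(-x) == -f(x) holds bit for bit, and the names half and x_mid_sym are made up for this example.

```python
import math
import numpy as np

def f_x(x):
    # x * x * x instead of x**3: IEEE multiplication is sign-symmetric,
    # so f_x(-x) is exactly -f_x(x)
    return x * x * x

a, b, n = -1, 1, 1000   # n must be even for the mirroring below
delta_x = (b - a) / n

# midpoints of the right half only, mirrored to get the left half
half = np.linspace(delta_x / 2, b - delta_x / 2, n // 2)
x_mid_sym = np.concatenate((-half[::-1], half))

y = f_x(x_mid_sym)

area_naive = y.sum() * delta_x     # ordinary floating-point summation
area_fsum = math.fsum(y) * delta_x # correctly rounded summation

print(0.1 + 0.2 - 0.3)  # tiny but nonzero
print(area_naive)
print(area_fsum)
```

Because every term has an exact negative counterpart, the true sum is 0 and math.fsum reports it as such. This trick only works because of the symmetry; for a general integrand a residual of the order of 1e-16 is normal and nothing to worry about.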