Alternative algorithm for integral calculation inaccurate

I thought of an algorithm for integral calculation that should be more accurate that the regular rectangle approach. My algorithm can be best described with a graphic (I am using f(x) = sin(x) as example):

First the x and y positions of the points P1, P2, P3, P4 are calculated (red dots).
The area of the green four sided figure is a part of the result. This area is calculated by dividing it into two triangles (blue line).
The area of each triangle is calculated using the Heron’s formula.

I think it is obvious that this should lead to much better results than the rectangle approach.

In code this looks like this:

double integral(double f(double x), double min, double max) {
    Point p1, p2, p3, p4;
    double area = 0.0;
    double s1 = 0.0;
    double s2 = 0.0;


    for(double x = min; x < max; x += stepSize) {
        p1.x = x;
        p1.y = 0.0;

        p2.x = x;
        p2.y = f(x);

        p3.x = x + stepSize;
        p3.y = f(x + stepSize);

        p4.x = x + stepSize;
        p4.y = 0.0;


        s1 = 0.5 * (distance(p1, p2) + distance(p2, p4) + distance(p1, p4));
        s2 = 0.5 * (distance(p2, p3) + distance(p3, p4) + distance(p2, p4));

        area += sqrt(s1 * (s1 - distance(p1, p2)) * (s1 - distance(p2, p4)) * (s1 - distance(p1, p4)));
        area += sqrt(s2 * (s2 - distance(p2, p3)) * (s2 - distance(p3, p4)) * (s2 - distance(p2, p4)));
    }

    return area;
}

The distance function is just returning the distance between two points in 2D space. The Point struct is just holding a x and y coordinate. stepSize is a constant that I set to 0.001

My function is giving a result that is correct, but I wanted to know how much more precise it is compared to the rectangle approach.

On the internet I found this code that is calculating a integral using rectangles:

double integral2(double(*f)(double x), double a, double b, int n) {
    double step = (b - a) / n;  // width of each small rectangle
    double area = 0.0;  // signed area
    for (int i = 0; i < n; i ++) {
        area += f(a + (i + 0.5) * step) * step; // sum up each small rectangle
    }
    return area;
}

I both tested them using the standard math.h sin function from 0.0 to half π. This area should be 1.

My algorithm has given me the result 1.000204 for a step-size of 0.001.

The rectangle algorithm hast given me the result 1.000010 with a calculated step-size of 0.015708.

How can such a difference in accuracy and step-size be explained? Did I implement my algorithm wrong?

Update

Using the calculated step-size of the second method, I get the result 0.999983 which is much closer to one than the result with a step-size of 0.001.

Now how can that work??

Solution

Your last trapezoid may be too wide: x+stepSize may be above max if max-min isn't a multiple of stepSize. That's why in the rectangular summation code you included, rather than stepSize, they use n (the number of rectangles).

You compute the trapezoid in a complicated way. Note that its area is stepSize * (P2.y + P3.y)/2. This adds computation cost, but I guess is not the cause of the numerical error in your test integral.

Except for these issues, your method is otherwise equivalent to the trapezoid rule. https://en.wikipedia.org/wiki/Trapezoidal_rule

Here is Python code that approximates the integral in three different ways, using 100 rectangles. The three ways are trap_heron (your method, using Heron's rule), trap (trapezoid method), and rect (rectangular summation). Your question is C++, but the results should be the same.

import math

N = 100

def dist(a, b):
    dx = a[0] - b[0]
    dy = a[1] - b[1]
    return math.sqrt(dx*dx + dy*dy)

def trap_heron(f, min, max):
    area = 0.0
    for i in range(N):
        x0 = min + (max-min) * i/N
        x1 = min + (max-min) * (i+1)/N
        y0 = f(x0)
        y1 = f(x1)

        p1 = (x0, 0.0)
        p2 = (x0, y0)
        p3 = (x1, y1)
        p4 = (x1, 0.0)

        s1 = 0.5 * (dist(p1, p2) + dist(p2, p4) + dist(p1, p4))
        s2 = 0.5 * (dist(p2, p3) + dist(p3, p4) + dist(p2, p4))

        area += math.sqrt(s1 * (s1 - dist(p1, p2)) * (s1 - dist(p2, p4)) * (s1 - dist(p1, p4)))
        area += math.sqrt(s2 * (s2 - dist(p2, p3)) * (s2 - dist(p3, p4)) * (s2 - dist(p2, p4)))
    return area

def trap(f, min, max):
    area = 0.0
    for i in range(N):
        x0 = min + (max-min) * i/N
        x1 = min + (max-min) * (i+1)/N
        y0 = f(x0)
        y1 = f(x1)
        area += (x1-x0) * (y0+y1)/2

    return area


def rect(f, min, max):
    area = 0.0
    for i in range(N):
        y = f(min + (max-min)*(i+0.5)/N)
        area += (max-min)/N * y
    return area

print(trap(math.sin, 0, math.pi/2))
print(trap_heron(math.sin, 0, math.pi/2))
print(rect(math.sin, 0, math.pi/2))

The output is:

0.9999794382396076
0.9999794382396054
1.0000102809119051

Note that trap and trap_heron produce very nearly the same result.

In your comments, you have a result of 1.015686. The error is very close to stepSize * sin(pi/2), so I guess you've summed up one too many trapezoids.