Search code examples
javascriptmathgeometrytrigonometry

How to create a pareto distribution prediction function?


pareto distribution prediction

I am trying to create a pareto distribution prediction function:


const paretoDistributionPrediction = (numberOfPeople, personAtIndex) => {

    let prediction = '?'

    if (numberOfPeople === 2) {
        prediction = personAtIndex === 1 ? 80 : 20;
    }
    
    return prediction + "%";
}

paretoDistributionPrediction(2, 1)

The perfect shape that gives the results I want is a cycloid rotated 45deg to fit without its cups in a square.

The right side of the square is always divided in equal parts by numberOfPeople. From the interaction with the cycloid curve, the lines perpendicularly divide the bottom side of the square. Then I want to know how much is in percentage a division from the bottom side of the square for personAtIndex (index of an item in numberOfPeople).

After the right side of the square is divided in equal parts by horizontal lines, how to find out the size of the corresponded division on the bottom side of the square after the incident lines have perpendicular intersected with the cycloid curve?

In the first drawing you can see how I reached the conclusion that a cycloid is the shape that will give an 80 20 ratio output.

Bellow images are expected outputs for paretoDistributionPrediction=f function: f(2,1)=80%, f(2,2)=20%, f(3,1)=68%,f(3,2)=23%,f(3,3)=9%,f(4,1)=58%,f(4,2)=22%,f(4,4)=7%.

population = 2 population = 3 population = 4 population = 5


Solution

  • A solution

    A cycloid that conforms to the description of the question is given by the parametric coordinates:

    const r = 0.3107954365513382, dr = 0.22339354437585907;
    const sin45deg = Math.sqrt(2)/2;
    const x = t => r * sin45deg * (1 - t + Math.sin(t) - Math.cos(t) + Math.PI) + dr,
        y = t => - r * sin45deg * (1 + t - Math.sin(t) - Math.cos(t) - Math.PI) - dr,
    

    I'll describe later how I derived those equations and computed the parameters.

    Now, regardless of the actual curve that is used, if it defined parametrically, the mathematics involved for determining the parameters of the distributions involve getting the value of y that corresponds to a certain x on the curve - it means solve the equation y(t) = y to get the parameter t and then use that parameter to compute x(t).

    For the case of the cycloid, as well as most other curves, the solution of the equation y(t) = y cannot be derived in closed form, so it has to be computed numerically. The numerical method I have chosen is Newton's method; it requires the derivative of y(t), y'(t), which I implemented in the function yd(t).

    With these, the solution to the problem in question is this:

    const r = 0.3107954365513382, dr = 0.22339354437585907;
    const sin45deg = Math.sqrt(2)/2;
    const x = t => r * sin45deg * (1 - t + Math.sin(t) - Math.cos(t) + Math.PI) + dr,
        y = t => r * sin45deg * (1 + t - Math.sin(t) - Math.cos(t) - Math.PI) + dr,
        yd = t => r * sin45deg * (1 - Math.cos(t) + Math.sin(t));
    
    const getXForY = y0 => {
        let t = Math.PI, yt = y(t) - y0;
        for(let i = 0; i < 10; i++){
            const tNext = t - yt / yd(t);
            yt = y(tNext) - y0;
            const dt = tNext - t;
            t = tNext;
            if(Math.abs(yt) < 1e-20 || Math.abs(dt) < 1e-15){
                break;
            }
        }
        return x(t);
    }
    
    const paretoDistributionPredictionFor = (numberOfPeople, personAtIndex) => {
        if(personAtIndex < 1 || personAtIndex > numberOfPeople){
            return null;
        }
        const fracInStart = (personAtIndex - 1)/numberOfPeople,
            fracInEnd = personAtIndex/numberOfPeople;
        const fracOutStart = 1 - getXForY(1 - fracInStart),
            fracOutEnd = 1 - getXForY(1 - fracInEnd),
            fracOutDiff = -fracOutEnd + fracOutStart;
    
        return (fracOutDiff * 100).toFixed(2).replace(/\.0+$/, '')+'%';
    }
    
    console.log('f(2, 1) = ' + paretoDistributionPredictionFor(2, 1));
    console.log('f(2, 1) = ' + paretoDistributionPredictionFor(2, 2));
    
    console.log(Array(30).fill('-').join(''));
    
    console.log('f(3, 1) = ' + paretoDistributionPredictionFor(3, 1));
    console.log('f(3, 2) = ' + paretoDistributionPredictionFor(3, 2));
    console.log('f(3, 3) = ' + paretoDistributionPredictionFor(3, 3));
    
    console.log(Array(30).fill('-').join(''));
    
    console.log('f(4, 1) = ' + paretoDistributionPredictionFor(4, 1));
    console.log('f(4, 2) = ' + paretoDistributionPredictionFor(4, 2));
    console.log('f(4, 3) = ' + paretoDistributionPredictionFor(4, 3));
    console.log('f(4, 4) = ' + paretoDistributionPredictionFor(4, 4));

    also as jsFiddle


    The cycloid

    The standard equation of the cycloid is

    x = t => r * (t - Math.sin(t));
    y = t => r * (1 - Math.cos(t)); 
    

    These equations give a cycloid that passes through the point (0, 0), (PI*r, 2*r) - the maximum, (2*PI*r, 0).

    To make it symmetrical with respect to the origin (more precisely with respect to the y axis), we have to subtract PI * r from x

    x = t => r * (t - Math.sin(t) - Math.PI);
    y = t => r * (1 - Math.cos(t)); 
    

    This curve passes through (-PI*r, 0), (0, 2*r) - the maximum, and (PI*r, 0).

    The next step is to rotate the cycloid by -3*PI/4, that is 145 degrees clockwise. This gives the equations:

    const sin45deg = Math.sqrt(2)/2;
    const x = t => r * sin45deg * (1 - t + Math.sin(t) - Math.cos(t) + Math.PI),
        y = t => - r * sin45deg * (1 + t - Math.sin(t) - Math.cos(t) - Math.PI),
    

    These would allow one to compute r such that the cycloid passes through the points (1, 0) and (0, -1); due to symmetry, only one condition has to be verified. This gives r = 0.43077140606812125 (see below on how I computed r and dr, for an outline of the method to derive r).

    The problem with this curve is that it doesn't pass through the point (0.8, -0.5), which would make it verify the 80:20 ratio of Pareto fame. The x value that corresponds to y = -0.5 (50%) is x = 0.7094, which would make it 71:29. This can be seen in the snippet below (get the cursor near the cycloid to get other ratio values).

    const r = 0.43077140606812125;
    const sin45deg = Math.sqrt(2)/2;
    const x = t => r * sin45deg * (1 - t + Math.sin(t) - Math.cos(t) + Math.PI),
        y = t => r * sin45deg * (1 + t - Math.sin(t) - Math.cos(t) - Math.PI),
        yd = t => r * sin45deg * (1 - Math.cos(t) + Math.sin(t));
    
    const getXForY = y0 => {
        let t = Math.PI, yt = y(t) - y0;
        for(let i = 0; i < 10; i++){
            const tNext = t - yt / yd(t);
            yt = y(tNext) - y0;
            const dt = tNext - t;
            t = tNext;
            if(Math.abs(yt) < 1e-20 || Math.abs(dt) < 1e-15){
                break;
            }
        }
        return  x(t);
    }
    
    const linspace = function(x1, x2, n){
        return Array.from({length: n}, (_, i) => x1 + (x2 - x1) * i / (n - 1));
    }
    const N = 100;
    
    const cycloidPoints = (() => {
        const ta = linspace(0, 2 * Math.PI, N);
        return ta.map(t => ({x: x(t), y: -y(t)}));
    })();
    const cycloidXAxis = [cycloidPoints[0], cycloidPoints[N-1]];
    const squarePoints = [{x:0, y: 0}, {x: 1, y: 0}, {x: 1, y: -1}, {x: 0, y: -1}, {x: 0, y: 0}];
    
    const pareto = getXForY(0.5);
    const paretoPoints = [{x: 1, y: -0.5}, {x: pareto, y: -0.5}, {x: pareto, y: -1}];
    
    const data = {
        datasets: [
            {
                label: 'Cycloid',
                data: cycloidPoints,
                borderColor: 'rgba(255, 26, 104, 1)',
            },
            {
                label: 'Cycloid x-axis',
                data: cycloidXAxis,
                borderColor: 'rgba(70, 140, 40, 1)',
                borderDash: [5, 4],
                borderWidth: 2
            },
            {
                label: 'square',
                data: squarePoints,
                borderColor: 'rgba(0, 0, 0, 1)',
            },
            {
                data: paretoPoints,
                borderColor: 'rgba(40, 70, 140, 1)',
            }]
    }
    const config = {
        type: 'line',
        data,
        options: {
            responsive: true,
            maintainAspectRatio: false,
            pointRadius: 0,
            borderWidth: 1,
            tension: 0,
            scales: {
                x: {
                    type: 'linear',
                }
            },
            interaction: {
                intersect: false,
                mode: "nearest"
            },
            plugins: {
                tooltip: {
                    filter({datasetIndex, parsed: {x, y}}){
                        return datasetIndex === 0 && x >= 0 && x <= 1 && -y >= 0 && -y <= 1;
                    },
    
                    callbacks: {
                        title(){
                            return '';
                        },
                        label(context){
                            if(context.datasetIndex !== 0){
                                return null;
                            }
                            const {x, y} = context.parsed;
                            if(x < 0 || x > 1 || -y < 0 || -y > 1){
                                return ''
                            }
                            return `${(-y*100).toFixed(0)}% => ${(x*100).toFixed(0)}%`;
                        }
                    }
                },
                legend: {
                    labels: {
                        usePointStyle: true,
                        pointStyle: 'line',
                        filter(a){
                            return a.text;
                        }
                    }
                },
                annotation: {
                    annotations: {
                        label1: {
                            type: "label",
                            xValue: pareto,
                            yValue: -0.5,
                            position: 'end',
                            borderWidth: 1,
                            borderDash: [4, 6],
                            content: `50% => ${(pareto * 100).toFixed(0)}%`
                        }
                    }
                }
            }
        }
    };
    
    const chart = new Chart(
        document.getElementById('chart'),
        config
    );
    
    // set approx equal scales
    const containerEl = document.getElementById('container');
    const xRange = chart.scales.x.max - chart.scales.x.min,
        yRange = chart.scales.y.max - chart.scales.y.min,
        xg = chart.chartArea.width,
        yg = chart.chartArea.height,
        dx = containerEl.offsetWidth - xg,
        dy = containerEl.offsetHeight - yg,
        scaleX = xg/xRange,
        scaleY = yg/yRange;
    
    const reScaleX = scaleX / Math.min(scaleX, scaleY),
        reScaleY = scaleY / Math.min(scaleX, scaleY);
    
    
    containerEl.style.width = (containerEl.offsetWidth / reScaleX + dx * (1 - 1/reScaleX))+"px";
    containerEl.style.height = (containerEl.offsetHeight / reScaleY  + dy * (1 - 1/reScaleY))+"px";
    <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/chartjs-plugin-annotation.min.js"></script>
    
    <div style="height:500px" id="container">
        <canvas id="chart"></canvas>
    </div>

    also as jsFiddle

    To address this issue, I added a translation of the cycloid, after the rotation. To keep symmetry, the translation has to be along the x = -y line; this would add a dr to x, while, y = y - dr. With this new parameter, the equations of the cycloid are:

    x = t => r * sin45deg * (1 - t + Math.sin(t) - Math.cos(t) + Math.PI) + dr;
    y = t => - r * sin45deg * (1 + t - Math.sin(t) - Math.cos(t) - Math.PI) - dr;
    

    Now we have two equations to solve for the two unknowns, r and dr: the first equation comes again from the condition that the cycloid passes through the point x=1, y=0, while the second is to verify the 80:20 ration, that is that the cycloid passes through the point x=0.8, y=-0.5.

    With these, we get the values of the parameters r = 0.3107954365513382, dr = 0.22339354437585907 (see below how). Here's the cycloid in this case:

    const r = 0.3107954365513382, dr = 0.22339354437585907;
    const sin45deg = Math.sqrt(2)/2;
    const x = t => r * sin45deg * (1 - t + Math.sin(t) - Math.cos(t) + Math.PI) + dr,
        y = t => r * sin45deg * (1 + t - Math.sin(t) - Math.cos(t) - Math.PI) + dr,
        yd = t => r * sin45deg * (1 - Math.cos(t) + Math.sin(t));
    
    const getXForY = y0 => {
        let t = Math.PI, yt = y(t) - y0;
        for(let i = 0; i < 10; i++){
            const tNext = t - yt / yd(t);
            yt = y(tNext) - y0;
            const dt = tNext - t;
            t = tNext;
            if(Math.abs(yt) < 1e-20 || Math.abs(dt) < 1e-15){
                break;
            }
        }
        return  x(t);
    }
    
    const linspace = function(x1, x2, n){
        return Array.from({length: n}, (_, i) => x1 + (x2 - x1) * i / (n - 1));
    }
    const N = 100;
    
    const cycloidPoints = (() => {
        const ta = linspace(0, 2 * Math.PI, N);
        return ta.map(t => ({x: x(t), y: -y(t)}));
    })();
    const cycloidXAxis = [cycloidPoints[0], cycloidPoints[N-1]];
    const squarePoints = [{x:0, y: 0}, {x: 1, y: 0}, {x: 1, y: -1}, {x: 0, y: -1}, {x: 0, y: 0}];
    
    const pareto = getXForY(0.5);
    const paretoPoints = [{x: 1, y: -0.5}, {x: pareto, y: -0.5}, {x: pareto, y: -1}];
    
    
    const data = {
        datasets: [
            {
                label: 'Cycloid',
                data: cycloidPoints,
                borderColor: 'rgba(255, 26, 104, 1)',
            },
            {
                label: 'Cycloid x-axis',
                data: cycloidXAxis,
                borderColor: 'rgba(70, 140, 40, 1)',
                borderDash: [5, 4],
                borderWidth: 2
            },
            {
                label: 'square',
                data: squarePoints,
                borderColor: 'rgba(0, 0, 0, 1)',
            },
            {
                data: paretoPoints,
                borderColor: 'rgba(40, 70, 140, 1)',
            }]
    }
    const config = {
        type: 'line',
        data,
        options: {
            responsive: true,
            maintainAspectRatio: false,
            pointRadius: 0,
            borderWidth: 1,
            tension: 0,
            scales: {
                x: {
                    type: 'linear',
                }
            },
            interaction: {
                intersect: false,
                mode: "nearest"
            },
            plugins: {
                tooltip: {
                    filter({datasetIndex, parsed: {x, y}}){
                        return datasetIndex === 0 && x >= 0 && x <= 1 && -y >= 0 && -y <= 1;
                    },
    
                    callbacks: {
                        title(){
                            return '';
                        },
                        label(context){
                            if(context.datasetIndex !== 0){
                                return null;
                            }
                            const {x, y} = context.parsed;
                            if(x < 0 || x > 1 || -y < 0 || -y > 1){
                                return ''
                            }
                            return `${(-y*100).toFixed(0)}% => ${(x*100).toFixed(0)}%`;
                        }
                    }
                },
                legend: {
                    labels: {
                        usePointStyle: true,
                        pointStyle: 'line',
                        filter(a){
                            return a.text;
                        }
                    }
                },
                annotation: {
                    annotations: {
                        label1: {
                            type: "label",
                            xValue: pareto,
                            yValue: -0.5,
                            position: 'end',
                            borderWidth: 1,
                            borderDash: [4, 6],
                            content: `50% => ${(pareto * 100).toFixed(0)}%`
                        }
                    }
                }
            }
        }
    };
    
    const chart = new Chart(
        document.getElementById('chart'),
        config
    );
    
    // set approx equal scales
    const containerEl = document.getElementById('container');
    const xRange = chart.scales.x.max - chart.scales.x.min,
        yRange = chart.scales.y.max - chart.scales.y.min,
        xg = chart.chartArea.width,
        yg = chart.chartArea.height,
        dx = containerEl.offsetWidth - xg,
        dy = containerEl.offsetHeight - yg,
        scaleX = xg/xRange,
        scaleY = yg/yRange;
    
    const reScaleX = scaleX / Math.min(scaleX, scaleY),
        reScaleY = scaleY / Math.min(scaleX, scaleY);
    
    
    containerEl.style.width = (containerEl.offsetWidth / reScaleX + dx * (1 - 1/reScaleX))+"px";
    containerEl.style.height = (containerEl.offsetHeight / reScaleY  + dy * (1 - 1/reScaleY))+"px";
    <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/chartjs-plugin-annotation.min.js"></script>
    
    <div style="height:500px" id="container">
        <canvas id="chart"></canvas>
    </div>

    also as jsFiddle


    Computing r and dr

    This is of little consequence for the rest of the discussion, but I'll give the code and description for the sake of completeness.

    I computed the values of r and dr, from the equations x(t) = 0, y(t) = 1 and x(t) = 0.5 and y(t) = -0.8 as described in the previous section. For any pair of values for r and dr we can compute four values of t that the four equations above: x(t1) = 0, y(t2) = 1, x(t3) = 0.5, y(t4) = -0.8. Computing the four ts is done as described in the first section, by using Newton's method.

    Now, the first two equations of the four above should come together, that is for the same t: t1 = t2; the same goes for the last two, we intend to find the r and dr that give t3 = t4. This means we have our two equations as:

    t1(r, dr) - t2(r, dr) = 0
    t3(r, dr) - t4(r, dr) = 0
    

    This system is also solved using Newton's method, with the difference that the differentials are computed numerically.

    For the method to be convergent, one has to start with good initial values; fortunately, there is the cycloid graphic above, where we could play with the values of parameters with visual feedback, such that we obtained good initial values.

    Here's the code:

    const sin45deg = Math.sqrt(2)/2;
    const xr = (r, dr) => (t => r * sin45deg * (1 - t + Math.sin(t) - Math.cos(t) + Math.PI) + dr),
        xdr = (r, dr) => (t => r * sin45deg * (-1 + Math.cos(t) + Math.sin(t))),
        yr = (r, dr) => (t => r * sin45deg * (1 + t - Math.sin(t) - Math.cos(t) - Math.PI) + dr),
        ydr = (r, dr) => (t => r * sin45deg * (1 - Math.cos(t) + Math.sin(t)));
    
    const getTForX = (r, dr, x0, nMax = 100) => {
        const x = xr(r, dr), xd = xdr(r, dr);
        let t = Math.PI, xt = x(t) - x0;
        const a_dt = [], a_xt = [];
        for(let i = 0; i < nMax; i++){
            const tNext = t - xt / xd(t);
            xt = x(tNext) - x0;
            const dt = tNext - t;
            t = tNext;
            a_dt.push(dt);
            a_xt.push(xt);
            if(Math.abs(xt) < 1e-20 || Math.abs(dt) < 1e-12){
                break;
            }
            else if(i === nMax-1){
                console.error({r, dr, x0}, {a_xt, a_dt});
                throw 'not converging on x; choose different initial values';
            }
        }
        return t;
    }
    
    const getTForY = (r, dr, y0, nMax = 100) => {
        const y = yr(r, dr), yd = ydr(r, dr);
        let t = Math.PI, yt = y(t) - y0;
        const a_dt = [], a_yt = [];
        for(let i = 0; i < nMax; i++){
            const tNext = t - yt / yd(t);
            yt = y(tNext) - y0;
            const dt = tNext - t;
            a_dt.push(dt);
            a_yt.push(yt);
            t = tNext;
            if(Math.abs(yt) < 1e-20 || Math.abs(dt) < 1e-12){
                break;
            }
            else if(i === nMax-1){
                console.error({r, dr, y0}, {a_yt, a_dt});
                throw 'not converging on x; choose different initial values';
            }
        }
        return t;
    }
    
    const x0 = 1, y0 = 0;
    const x1 = 0.8, y1 = 0.5;
    
    const eq0 = (r, dr) => /*Math.abs*/(getTForX(r, dr, x0) - getTForY(r, dr, y0));
    const eq1 = (r, dr) => /*Math.abs*/(getTForX(r, dr, x1) - getTForY(r, dr, y1));
    
    const eps_diff = 1e-3;
    const J0 = (r, dr) => {
        const v0 = eq0(r, dr);
        return [(eq0(r+eps_diff, dr)-v0)/eps_diff, (eq0(r, dr+eps_diff)-v0)/eps_diff];
    }
    const J1 = (r1, dr1) => {
        const v1 = eq1(r1, dr1);
        return [(eq1(r1+eps_diff, dr1)-v1)/eps_diff, (eq1(r1, dr1+eps_diff)-v1)/eps_diff];
    }
    
    const solveMain = (r0, dr0, nMax = 20) => {
        let r = r0, dr = dr0;
        let df0 = eq0(r, dr),
            df1 = eq1(r, dr);
        for(let i = 0; i < nMax; i++){
            const [J00, J01] = J0(r, dr),
                [J10, J11] = J1(r, dr);
    
            const det = J00 * J11 - J01 * J10;
            const delta_r = -(df0 * J11 - df1 * J01) / det,
                delta_dr = -(-J10 * df0 + J00 * df1) / det;
    
            r = r + delta_r;
            dr = dr + delta_dr;
            df0 = eq0(r, dr);
            df1 = eq1(r, dr);
    
            if(Math.max(Math.abs(df0), Math.abs(df1)) < 1e-20 ||
                Math.max(Math.abs(delta_r), Math.abs(delta_dr)) < 1e-12){
                break;
            }
            else if(i === nMax-1){
                throw 'not converging globally; choose different initial values';
            }
        }
        return ({r, dr});
    }
    
    // initial values can be obtained by playing with these params on the graphic
    console.log(solveMain(0.3, 0.25));

    also as jsFiddle

    Note In the code in the first section, I used -y instead of y, to simplify computation. Graphically it would mean flipping the chart along x axis, or moving the action from quadrant 4 to quadrant 1.