Bjontegaard calculation using only one pair of PSNR and BitRate

I want to calculate the BD-Rate for two different video encoding settings using the Python script below.

With 4 RD points per curve the script works fine (R1 and PSNR1 are the reference RD points of Video1, while R2 and PSNR2 are the new test points of Video2 with different encoder settings), i.e.:

from bjontegaard_metric import *    
R1 = np.array([686.76, 309.58, 157.11, 85.95])
PSNR1 = np.array([40.28, 37.18, 34.24, 31.42])
R2 = np.array([893.34, 407.8, 204.93, 112.75])
PSNR2 = np.array([40.39, 37.21, 34.17, 31.24])

print('BD-PSNR: ', BD_PSNR(R1, PSNR1, R2, PSNR2))
print('BD-RATE: ', BD_RATE(R1, PSNR1, R2, PSNR2))

But with just 1 RD point per curve, i.e.:

from bjontegaard_metric import *
R1 = np.array([686.76])
PSNR1 = np.array([40.28])
R2 = np.array([893.34])
PSNR2 = np.array([40.39])

print('BD-PSNR: ', BD_PSNR(R1, PSNR1, R2, PSNR2))
print('BD-RATE: ', BD_RATE(R1, PSNR1, R2, PSNR2))

I get a warning: RankWarning: Polyfit may be poorly conditioned. Each video encoder run returns just one PSNR/bitrate pair, so I want to compare two such pairs (reference video and modified video). Is there any way to fix this warning? Are the results I get using only 1 RD point reliable?

This is the script (bjontegaard_metric.py):

import numpy as np
import scipy.interpolate

def BD_PSNR(R1, PSNR1, R2, PSNR2, piecewise=0):
    lR1 = np.log(R1)
    lR2 = np.log(R2)

    PSNR1 = np.array(PSNR1)
    PSNR2 = np.array(PSNR2)

    p1 = np.polyfit(lR1, PSNR1, 3)
    p2 = np.polyfit(lR2, PSNR2, 3)

    # integration interval
    min_int = max(min(lR1), min(lR2))
    max_int = min(max(lR1), max(lR2))

    # find integral
    if piecewise == 0:
        p_int1 = np.polyint(p1)
        p_int2 = np.polyint(p2)

        int1 = np.polyval(p_int1, max_int) - np.polyval(p_int1, min_int)
        int2 = np.polyval(p_int2, max_int) - np.polyval(p_int2, min_int)
    else:
        # See https://chromium.googlesource.com/webm/contributor-guide/+/master/scripts/visual_metrics.py
        lin = np.linspace(min_int, max_int, num=100, retstep=True)
        interval = lin[1]
        samples = lin[0]
        v1 = scipy.interpolate.pchip_interpolate(np.sort(lR1), PSNR1[np.argsort(lR1)], samples)
        v2 = scipy.interpolate.pchip_interpolate(np.sort(lR2), PSNR2[np.argsort(lR2)], samples)
        # Calculate the integral using the trapezoid method on the samples.
        int1 = np.trapz(v1, dx=interval)
        int2 = np.trapz(v2, dx=interval)

    # find avg diff
    avg_diff = (int2-int1)/(max_int-min_int)

    return avg_diff


def BD_RATE(R1, PSNR1, R2, PSNR2, piecewise=0):
    lR1 = np.log(R1)
    lR2 = np.log(R2)

    # rate method
    p1 = np.polyfit(PSNR1, lR1, 3)
    p2 = np.polyfit(PSNR2, lR2, 3)

    # integration interval
    min_int = max(min(PSNR1), min(PSNR2))
    max_int = min(max(PSNR1), max(PSNR2))

    # find integral
    if piecewise == 0:
        p_int1 = np.polyint(p1)
        p_int2 = np.polyint(p2)

        int1 = np.polyval(p_int1, max_int) - np.polyval(p_int1, min_int)
        int2 = np.polyval(p_int2, max_int) - np.polyval(p_int2, min_int)
    else:
        lin = np.linspace(min_int, max_int, num=100, retstep=True)
        interval = lin[1]
        samples = lin[0]
        v1 = scipy.interpolate.pchip_interpolate(np.sort(PSNR1), lR1[np.argsort(PSNR1)], samples)
        v2 = scipy.interpolate.pchip_interpolate(np.sort(PSNR2), lR2[np.argsort(PSNR2)], samples)
        # Calculate the integral using the trapezoid method on the samples.
        int1 = np.trapz(v1, dx=interval)
        int2 = np.trapz(v2, dx=interval)

    # find avg diff
    avg_exp_diff = (int2-int1)/(max_int-min_int)
    avg_diff = (np.exp(avg_exp_diff)-1)*100
    return avg_diff

Solution

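According to the IETF draft at https://tools.ietf.org/id/draft-ietf-netvc-testing-06.html#rfc.section.4.2 (step 2 in the list quoted below), at least four points must be computed, and these points should be the same quantizers when comparing two versions of the same codec. A cubic fit has four coefficients, so a single point cannot determine it, which is what the RankWarning is telling you. Results computed from fewer than four points are therefore not reliable.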
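
If you want the script to fail loudly instead of returning a number you cannot trust, a small guard in front of BD_PSNR / BD_RATE is one option. This is only a sketch: check_rd_points and MIN_RD_POINTS are hypothetical names of mine, not part of bjontegaard_metric.py, and the threshold of four is taken from the draft quoted in this answer.

```python
import numpy as np

MIN_RD_POINTS = 4  # assumption: minimum per curve, taken from the IETF draft


def check_rd_points(rates, psnrs, min_points=MIN_RD_POINTS):
    """Return (rates, psnrs) as float arrays, or raise ValueError if unusable."""
    rates = np.asarray(rates, dtype=float)
    psnrs = np.asarray(psnrs, dtype=float)
    if rates.ndim != 1 or rates.shape != psnrs.shape:
        raise ValueError("rates and psnrs must be 1-D arrays of equal length")
    if rates.size < min_points:
        raise ValueError(
            "need at least %d RD points per curve, got %d" % (min_points, rates.size)
        )
    if np.any(rates <= 0):
        raise ValueError("bitrates must be positive to take their logarithm")
    return rates, psnrs
```

Calling check_rd_points(R1, PSNR1) and check_rd_points(R2, PSNR2) before the BD functions would reject the single-point case from the question with a ValueError rather than a RankWarning.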

1. Rate/distortion points are calculated for the reference and test codec.
2. At least four points must be computed. These points should be the same quantizers when comparing two versions of the same codec.
3. Additional points outside of the range should be discarded.
4. The rates are converted into log-rates.
5. A piecewise cubic hermite interpolating polynomial is fit to the points for each codec to produce functions of log-rate in terms of distortion.

6. Metric score ranges are computed:
   - If comparing two versions of the same codec, the overlap is the intersection of the two curves, bound by the chosen quantizer points.
   - If comparing dissimilar codecs, a third anchor codec’s metric scores at fixed quantizers are used directly as the bounds.
7. The log-rate is numerically integrated over the metric range for each curve, using at least 1000 samples and trapezoidal integration.
8. The resulting integrated log-rates are converted back into linear rate, and then the percent difference is calculated from the reference to the test codec.
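
The quoted steps can be sketched roughly as follows. This is my own reading of the procedure, not code from the draft: bd_rate_pchip and _log_rate_curve are hypothetical names, and it only covers the case of two versions of the same codec (overlap of the two metric ranges), not the anchor-codec case. It also assumes the metric values within each curve are distinct, since PCHIP needs strictly increasing x values.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator


def _log_rate_curve(rates, metric):
    """Fit a PCHIP giving log-rate as a function of the quality metric."""
    rates = np.asarray(rates, dtype=float)
    metric = np.asarray(metric, dtype=float)
    if rates.size < 4 or rates.size != metric.size:
        raise ValueError("each curve needs at least four matching RD points")
    order = np.argsort(metric)  # sort so the x values are increasing
    return PchipInterpolator(metric[order], np.log(rates[order]))


def bd_rate_pchip(rate_ref, metric_ref, rate_test, metric_test, samples=1000):
    """Average bitrate change in percent of the test curve vs. the reference."""
    curve_ref = _log_rate_curve(rate_ref, metric_ref)
    curve_test = _log_rate_curve(rate_test, metric_test)

    # Metric range shared by both curves.
    lo = max(np.min(metric_ref), np.min(metric_test))
    hi = min(np.max(metric_ref), np.max(metric_test))
    if hi <= lo:
        raise ValueError("the two curves do not overlap in quality")

    x = np.linspace(lo, hi, samples)
    diff = curve_test(x) - curve_ref(x)
    # Trapezoidal rule written out by hand, so it does not depend on
    # whether NumPy calls it trapz or trapezoid.
    integral = np.sum((diff[1:] + diff[:-1]) * 0.5 * np.diff(x))
    avg_log_diff = integral / (hi - lo)
    # Back from log-rate to a percent difference in linear rate.
    return (np.exp(avg_log_diff) - 1.0) * 100.0
```

With the four-point data from the question, bd_rate_pchip(R1, PSNR1, R2, PSNR2) returns a positive value, meaning the second configuration needs more bitrate for the same PSNR. With one point per curve it raises a ValueError instead of producing a figure.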