Search code examples
z3covariancetheorem-provingdafnyinduction

Proving a covariance inequality in Dafny, use contradiction?


I am trying to prove the following property in Dafny: covariance(x,y) <= variance(x) * variance(y), where x and y are real.

First of all, I would like to know if I am interpreting covariance and variance correctly. Thus, I make the following question: is it the same proving covariance(x,y) <= variance(x) * variance(y) and proving covariance(x,y) <= covariance(x,x) * covariance(y,y)? I make this question because the variance usually appears to the power of 2 (i.e., sigma^2(x)), which confuses me.

However, I understand from https://en.wikipedia.org/wiki/Covariance that this equivalence holds. Concretely, the following is stated: The variance is a special case of the covariance in which the two variables are identical.

I also take one of the definitions of covariance from that link. Concretely: cov(x,y)=(1/n)*summation(i from 1 to n){(x_i-mean(x))*(y_i-mean(y))}.

Thus, assuming that I must prove covariance(x,y) <= covariance(x,x) * covariance(y,y) and that covariance is defined like above, I test the following lemma in Dafny:

lemma covariance_equality (x: seq<real>, y:seq<real>)
    requires |x| > 1 && |y| > 1; //my conditions
    requires |x| == |y|; //my conditions
    requires forall elemX :: elemX in x ==> 0.0 < elemX; //my conditions
    requires forall elemY :: elemY in y ==> 0.0 < elemY; //my conditions
    ensures covariance_fun (x,y) <= covariance_fun(x,x) * covariance_fun (y,y);
    {}

Which cannot be proven automatically. The point is that I do not know how to prove it manually. I made this (not useful) approach of a contraction proof, but I guess this is not the way:

lemma covariance_equality (x: seq<real>, y:seq<real>)
    requires |x| > 1 && |y| > 1; //my conditions
    requires |x| == |y|; //my conditions
    requires forall elemX :: elemX in x ==> 0.0 < elemX; //my conditions
    requires forall elemY :: elemY in y ==> 0.0 < elemY; //my conditions
    ensures covariance_fun (x,y) <= covariance_fun(x,x) * covariance_fun (y,y);
    {

    if !(covariance_fun(x,y) < covariance_fun(x,c) * covariance_fun (y,y)) {

      //...
      assert !(forall elemX' :: elemX' in x ==> 0.0<elemX') && (forall elemY' :: elemY' in y ==> 0.0<elemY'); //If we assume this, the proof is verified
      assert false;
    } 

}

My main question is: how can I prove this lemma?

So that you have all the information, I offer the functions by which covariance calculation is made up. First, the covariance function is as follows:

function method covariance_fun(x:seq<real>, y:seq<real>):real
    requires |x| > 1 && |y| > 1;
    requires |x| == |y|;
    decreases |x|,|y|;
{
    summation(x,mean_fun(x),y,mean_fun(y)) / ((|x|-1) as real)
}

As you can see, it performs summation and then divides by the length. Also, note that the summation function receives mean of x and mean of y. Summation is as follows:

function method summation(x:seq<real>,x_mean:real,y:seq<real>,y_mean:real):real
  requires |x| == |y|;
  decreases |x|,|y|;
{
if x == [] then 0.0 
else ((x[0] - x_mean)*(y[0] - y_mean)) 
    + summation(x[1..],x_mean,y[1..],y_mean)
}

As you can see, it recursively performs (x-mean(x))*(y-mean(y)) of each position.

As for helper functions, mean_fun calculates the mean and is as follows:

function method mean_fun(s:seq<real>):real
  requires |s| >= 1;
  decreases |s|;
{
    sum_seq(s) / (|s| as real)
}

Note that is uses sum_seq function which, given a sequence, calculates the sum of all its positions. It is defined recursively:

function method sum_seq(s:seq<real>):real
  decreases s;
{
if s == [] then 0.0 else s[0] + sum_seq(s[1..])
}

Now, with all these ingredientes, can anyone tell me (1) if I already did something wrong and/or (2) how to prove the lemma?

PS: I have been looking and this lemma seems to be the Cauchy–Schwarz inequality, but still do not know how to solve it.

Elaborating on the answer

I will (1) define a convariance function cov in terms of product and then (2) use this definition in a new lemma that should hold.

Also, note that I add the restriction that 'x' and 'y' are non-empty, but you can modify this

//calculates the mean of a sequence
function method mean_fun(s:seq<real>):real
  requires |s| >= 1;
  decreases |s|;
{
    sum_seq(s) / (|s| as real)
}

//from x it constructs a=[x[0]-x_mean, x[1]-x_mean...]
function construct_list (x:seq<real>, m:real) : (a:seq<real>)
  requires |x| >= 1
  ensures |x|==|a|
  {
    if |x| == 1 then [x[0]-m]
    else [x[0]-m] + construct_list(x[1..],m)
  }

//covariance is calculated in terms of product
function cov(x: seq<real>, y: seq<real>) : real
  requires |x| == |y|
  requires |x| >= 1
  ensures cov(x,y) == product(construct_list(x, mean_fun(x)),construct_list(y, mean_fun(y))) / (|x| as real) //if we do '(|x| as real)-1', then we can have a division by zero
  //i.e., ensures cov(x,y) == product(a,b) / (|x| as real)
{
  //if |a| == 1 then 0.0 //we do base case in '|a|==1' instead of '|a|==1', because we have precondition '|a| >= 1'
  //else 
  var x_mean := mean_fun(x);
  var y_mean := mean_fun(y);

  var a := construct_list(x, x_mean);
  var b := construct_list(y, y_mean);

  product(a,b) / (|a| as real)
}

//Now, test that Cauchy holds for Cov
lemma cauchyInequality_usingCov (x: seq<real>, y: seq<real>)
   requires |x| == |y|
   requires |x| >= 1
   requires forall i :: 0 <= i < |x| ==> x[i] >= 0.0 && y[i] >= 0.0
   ensures square(cov(x,y)) <= cov(x,x)*cov(y,y)
{
   CauchyInequality(x,y);
}
\\IT DOES NOT HOLD!

Proving HelperLemma (not provable?)

Trying to prove HelperLemma I find that Dafny is able to prove that LHS of a*a*x + b*b*y >= 2.0*a*b*z is positive.

Then, I started to prove different cases (I know this is not the best procedure):

(1) If one of a,b or z is negative (while the rest positive) or when the three are negative; then RHS is negative And when RHS is negative, LHS is greater or equal (since LHS is positive).

(2) If a or b or z are 0, then RHS is 0, thus LHS is greater or equal.

(3) If a>=1, b>=1 and z>=1, I tried to prove this using sqrt_ineq(). BUT! We can find a COUNTER EXAMPLE. With a=1, b=1, z=1, RHS is 2; now, choose x to be 0 and y to be 0 (i.e., LHS=0). Since 0<2, this is a counter example.

Am I right? If not, how can I prove this lemma?


lemma HelperLemma(a: real, x: real, b: real, y: real, z: real)
  requires x >= 0.0 && y >= 0.0 //&& z >= 0.0
  requires square(z) <= x * y
  ensures a*a*x + b*b*y >= 2.0*a*b*z
  {
    assert square(z) == z*z;
    assert a*a*x + b*b*y >= 0.0; //a*a*x + b*b*y is always positive

    if ((a<0.0 && b>=0.0 && z>=0.0) || (a>=0.0 && b<0.0 && z>=0.0) || (a>=0.0 && b>=0.0 && z<0.0) || (a<0.0 && b<0.0 && z<0.0)) {
      assert 2.0*a*b*z <=0.0; //2.0*a*b*z <= 0.0 is negative or 0 when product is negative
      assert a*a*x + b*b*y >= 2.0*a*b*z; // When 2.0*a*b*z <= 0.0 is negative or 0, a*a*x + b*b*y is necessarily greater
    }

    if (a==0.0 || b==0.0 || z==0.0){
      assert a*a*x + b*b*y >= 2.0*a*b*z;
    }

    else if (a>=1.0 && b>=1.0 && z>= 1.0){
      assert a*a >= a;
      assert b*b >= b;

      assert a*a+b*b >= a*b;

      assert square(z) <= x * y;
      //sqrt_ineq(square(z),x*y);
      //assert square(z) == z*z;
      //sqrt_square(z);
      //assert sqrt(z) <= sqrt(x*y);
      //assert (x==0.0 || y == 0.0) ==> (x*y==0.0) ==> (square(z) <= 0.0) ==> z<=0.0;
      assert a*a*x + b*b*y >= 2.0*a*b*z;
      //INDEED: COUNTER EXAMPLE: a=1, b=1, z=1. Thus: RHS is 2; now, choose x to be 0 and y to be 0 (i.e., LHS=0). Since 0<2, this is a counter example.
    }
  }


Solution

  • Edited after clarification/fixing of post condition.

    Here is attempt to do it in Dafny using calculation and induction. Few comments on verification

    Since task at hand is verification, I assumed sqrt function and some of its properties. HelperLemma is provable provided we assume some properties of sqrt function. There are two assumption in verification that is easy to prove.

    function square(r: real) : real
     ensures square(r) >=  0.0
    {
      r * r
    }
    
    function sqrt(r: real) : real
    
    function {:fuel 2} product(a: seq<real>, b: seq<real>) : real
      requires |a| == |b|
    {
      if |a| == 0 then 0.0
      else a[0] * b[0] + product(a[1..], b[1..])
    }
    
    lemma sqrt_ineq(a: real, b: real)
      requires a >= 0.0 && b >= 0.0
      requires a <= b
      ensures sqrt(a) <= sqrt(b)
    
    lemma sqrt_square(a: real)
      ensures sqrt(square(a)) == a
    
    lemma CauchyBaseInequality(a1: real, a2: real, b1: real, b2: real)
      ensures square(a1*b1 + a2*b2) <= (a1*a1 + a2*a2) * (b1*b1 + b2*b2)
    {
      calc <= {
        square(a1*b1 + a2*b2) - (a1*a1 + a2*a2) * (b1*b1 + b2*b2);
        a1*b1*a1*b1 + a2*b2*a2*b2 + 2.0*a1*b1*a2*b2 - a1*a1*b1*b1 - a1*a1*b2*b2 - a2*a2*b1*b1 - a2*a2*b2*b2;
        2.0*a1*b1*a2*b2 - a1*a1*b2*b2 - a2*a2*b1*b1;
        - square(a1*b2 - a2*b1);
        0.0;
      }
    }
    
    lemma HelperLemma(a: real, x: real, b: real, y: real, z: real)
      requires x >= 0.0 && y >= 0.0
      requires square(z) <= x * y
      ensures a*a*x + b*b*y >= 2.0*a*b*z
    
    lemma CauchyInequality(a: seq<real>, b: seq<real>)
      requires |a| == |b|
      ensures square(product(a, b)) <= product(a, a) * product(b, b)
    {
      if |a| <= 2 {
         if |a| == 2 {
           CauchyBaseInequality(a[0], a[1], b[0], b[1]);
         }
      }
      else {
        var x, xs := a[0], a[1..];
        var y, ys := b[0], b[1..];
        calc >= {
          product(a, a) * product(b, b) - square((product(a, b)));
          (x*x + product(xs, xs))*(y*y + product(ys, ys)) - square(x*y + product(xs, ys));
          x*x*y*y + y*y*product(xs, xs) + x*x*product(ys, ys) + product(xs, xs)*product(ys, ys)
            - x*x*y*y - square(product(xs, ys)) - 2.0*x*y*product(xs, ys);
          y*y*product(xs, xs) + x*x*product(ys, ys) + product(xs, xs) * product(ys, ys)
            - square(product(xs, ys)) - 2.0*x*y*product(xs, ys);
          { CauchyInequality(xs, ys); }
          y*y*product(xs, xs) + x*x*product(ys, ys) - 2.0*x*y*product(xs, ys);
          {
            CauchyInequality(xs, ys);
            assume product(xs, xs) >= 0.0;
            assume product(ys, ys) >= 0.0;
            HelperLemma(y, product(xs, xs), x, product(ys, ys), product(xs, ys));
          }
          0.0;
        }
      }
    }
    
    lemma cauchyInequality_usingCov (x: seq<real>, y: seq<real>)
       requires |x| == |y|
       requires |x| >= 1
       requires forall i :: 0 <= i < |x| ==> x[i] >= 0.0 && y[i] >= 0.0
       ensures square(cov(x,y)) <= cov(x,x)*cov(y,y)
    {
       var x_mean := mean_fun(x);
       var y_mean := mean_fun(y);
        
       var a := construct_list(x, x_mean);
       var b := construct_list(y, y_mean);
       assert cov(x, y) == product(a, b) / (|x| as real);
       assert cov(x, x) == product(a, a) / (|x| as real);
       assert cov(y, y) == product(b, b) / (|x| as real);
       CauchyInequality(a, b);
        
     }
    

    Verification of Helper Lemma

    function square(r: real): real
      ensures square(r) >= 0.0
    {
       r * r
    }
    
    function abs(a: real) : real
      ensures abs(a) >= 0.0
    {
      if a < 0.0 then -a else a
    }
    
    function sqrt(r: real): real
    requires r >= 0.0
    ensures square(sqrt(r)) == r  && sqrt(r) >= 0.0
      
    lemma sqrtIneq(a: real, b: real)
      requires a >= 0.0 && b >= 0.0
      requires a <= b
      ensures sqrt(a) <= sqrt(b)
    
    lemma sqrtLemma(a: real)
       ensures sqrt(square(a)) == abs(a)
    
    lemma sqrtMultiply(a: real, b: real)
      requires a >= 0.0 && b >= 0.0
      ensures sqrt(a*b) == sqrt(a) * sqrt(b);
    
    lemma differenceLemma(a: real, b: real)
       requires a >= abs(b) 
       ensures a - b >= 0.0 && a + b >= 0.0
    {
    }
    
    lemma HelperLemma(a: real, x: real, b: real, y: real, z: real)
      requires x >= 0.0 && y >= 0.0
      requires square(z) <= x * y
      ensures a*a*x + b*b*y >= 2.0*a*b*z
    {
       
       var sx := sqrt(x);
       var sy := sqrt(y);
    
       assert sx * sx == x;
       assert sy * sy == y;
       if (a >= 0.0 && b >= 0.0) || (a < 0.0 && b < 0.0) {
        calc >= {
          a*a*x + b*b*y - 2.0*a*b*z;
          a*a*sx*sx + b*b*sy*sy - 2.0*a*b*sx*sy + 2.0*a*b*sx*sy - 2.0*a*b*z;
          square(a*sx - b*sy) + 2.0*a*b*sx*sy - 2.0*a*b*z;
          2.0*a*b*sx*sy - 2.0*a*b*z;
          2.0*a*b*(sx*sy - z);
          {
            sqrtIneq(square(z), x * y);
            assert sqrt(x * y) >= sqrt(square(z));
            sqrtMultiply(x, y);
            assert sx * sy >= sqrt(square(z));
            sqrtLemma(z);
            assert sx * sy >= abs(z);
            differenceLemma(sx*sy, z);
            assert sx*sy >= z;
          }
          0.0;
        }
       }
       else {
          calc >= {
             a*a*x + b*b*y - 2.0*a*b*z;
             a*a*sx*sx + b*b*sy*sy + 2.0*a*b*sx*sy - 2.0*a*b*sx*sy - 2.0*a*b*z;
             square(a*sx + b*sy) - 2.0*a*b*sx*sy - 2.0*a*b*z;
             - 2.0*a*b*sx*sy - 2.0*a*b*z;
             - 2.0*a*b*(sx*sy + z);
             {
               sqrtIneq(square(z), x * y);
               assert sqrt(x * y) >= sqrt(square(z));
               sqrtMultiply(x, y);
               assert sx * sy >= sqrt(square(z));
               sqrtLemma(z);
               assert sx * sy >= abs(z);
               differenceLemma(sx*sy, z);
               assert sx*sy >= -z;
             }
             0.0;
           }
       }
    }