android floating-point double jit

Double precision value computation errors on MediaTek processors


I've found that one of my applications published on the market produces weird results on some phones. Upon investigation, it turned out that the problem lies in a function that computes the distance between two GeoPoints: sometimes it returns a completely wrong value. The issue reproduces only on devices with the MediaTek MT6589 SoC (aka MTK6589), and AFAIK all such devices have Android 4.2 installed.

Update: I was also able to reproduce the bug on a Lenovo S6000 tablet with a MediaTek MT8125/8389 chip, and on a Fly IQ444 Quattro with an MT6589 running Android 4.1.

I created a test project that helps to reproduce the bug. It runs the computation repeatedly for 1,000 or 100,000 iterations. To rule out threading issues, the computation is performed on the UI thread (with small pauses to keep the UI responsive). In the test project I used just one part of the original distance formula:

private double calcX() {
    double t = 1.0;
    double X = 0.5 + t / 16384;
    return X;
}

As you can check for yourself on web2.0calc.com, the value of X should be 0.50006103515625 (1/16384 is 2^-14, so a double represents this result exactly).
However, on devices with the MT6589 chip a wrong value is often computed instead: 2.0.

The project is available on Google Code (an APK is also available). The source of the test class is shown below:

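import android.app.Activity;
import android.os.Bundle;
import android.os.Handler;
import android.text.Editable;
import android.view.View;
import android.widget.EditText;

public class MtkTestActivity extends Activity {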

  static final double A = 0.5;
  static final double B = 1;
  static final double D = 16384;

  static final double COMPUTED_CONST = A + B / D;

  /*
   * Main calculation where bug occurs
   */
  public double calcX() {
    double t = B;
    double X = A + t / D;
    return X;
  }

  class TestRunnable implements Runnable {

    static final double EP = 0.00000000001;

    static final double EXPECTED_LOW = COMPUTED_CONST - EP;

    static final double EXPECTED_HIGH = COMPUTED_CONST + EP;

    public void run() {
      for (int i = 0; i < SMALL_ITERATION; i++) {
        // Named "result" so it does not shadow the class constant A.
        double result = calcX();

        if (result < EXPECTED_LOW || result > EXPECTED_HIGH) {
          mFailedInCycle = true;
          mFails++;
          mEdit.getText().append("FAILED on " + mIteration + " iteration with: " + result + '\n');
        }
        }
        mIteration++;
      }

      if (mIteration % 5000 == 0) {
        if (mFailedInCycle) {
          mFailedInCycle = false;
        } else {
          mEdit.getText().append("passed " + mIteration + " iterations\n");
        }
      }

      if (mIteration < mIterationsCount) {
        mHandler.postDelayed(new TestRunnable(), DELAY);
      } else {
        mEdit.getText().append("\nFinished test with " + mFails + " fails");
      }
    }

  }

  public void onTestClick(View v) {
    startTest(IT_10K);
  }

  public void onTestClick100(View v) {
    startTest(IT_100K);
  }

  private void startTest(int iterationsCount) {
    Editable text = mEdit.getText();
    text.clear();
    text.append("\nStarting " + iterationsCount + " iterations test...");
    text.append("\n\nExpected result " + COMPUTED_CONST + "\n\n");
    mIteration = 0;
    mFails = 0;
    mFailedInCycle = false;
    mIterationsCount = iterationsCount;
    mHandler.postDelayed(new TestRunnable(), 100);
  }

  @Override
  protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_main);
    mHandler = new Handler(getMainLooper());
    mEdit = (EditText) findViewById(R.id.edtText1);
  }

  // Note: despite its name, this constant is 1,000 iterations, not 10,000.
  private static final int IT_10K = 1000;

  private static final int IT_100K = 100000;

  private static final int SMALL_ITERATION = 50;

  private static final int DELAY = 10;

  private int mIteration;

  private int mFails;

  private boolean mFailedInCycle;

  private Handler mHandler;

  private int mIterationsCount;

  private EditText mEdit;

}

To work around the issue, it is enough to change every double to float in the calcX() method.
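The float variant described above might look like the following sketch. The class name FloatWorkaround and the main method are assumptions added only to make the snippet runnable on its own; they are not part of the original project. Since 0.5 + 2^-14 needs far fewer than the 24 significand bits a float provides, the result is still exact after widening back to double.

```java
public class FloatWorkaround {

    // Same formula as calcX(), but with float locals and float literals,
    // so no double-precision constants appear in the method body.
    static double calcX() {
        float t = 1.0f;
        float x = 0.5f + t / 16384f;
        return x; // widened to double on return; the value is unchanged
    }

    public static void main(String[] args) {
        System.out.println(calcX());
    }
}
```

This only sidesteps the problem for this particular formula. A full distance computation would lose precision with float, so it is a diagnostic aid rather than a general fix.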

Further investigation
Turning off the JIT (by adding android:vmSafeMode="true" to the app manifest) makes the bug disappear as well.

Has anyone seen this bug before? Maybe it is a known issue?

P.S.: If anyone is able to reproduce this bug on a device with a different chip, or can test it on any MediaTek chip with Android >= 4.3, I would highly appreciate it.


Solution

  • This was a JIT bug that was present in the Jelly Bean source from late 2012 through early 2013. In short, if two or more double-precision constants that differed in their high 32 bits but were identical in their low 32 bits were used in the same basic block, the JIT would treat them as the same constant and incorrectly optimize one of them away.

    I introduced the defect in: https://android-review.googlesource.com/#/c/47280/

    and fixed it in: https://android-review.googlesource.com/#/c/57602/

    The defect should not appear in any recent Android builds.
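To see why the constants in calcX() fit the condition described in the answer, the sketch below splits each one into its high and low 32-bit words. The class and helper names (ConstantBits, highWord, lowWord) are made up for illustration; only Double.doubleToLongBits is a real library call. It runs on any JVM and does not reproduce the JIT bug itself.

```java
public class ConstantBits {

    // Upper 32 bits of the IEEE 754 bit pattern (sign, exponent, top of significand).
    static int highWord(double d) {
        return (int) (Double.doubleToLongBits(d) >>> 32);
    }

    // Lower 32 bits of the IEEE 754 bit pattern (rest of the significand).
    static int lowWord(double d) {
        return (int) Double.doubleToLongBits(d);
    }

    public static void main(String[] args) {
        // The three double constants used in calcX().
        double[] constants = {0.5, 1.0, 16384.0};
        for (double c : constants) {
            System.out.printf("%-8s high=0x%08X low=0x%08X%n",
                    c, highWord(c), lowWord(c));
        }
    }
}
```

All three constants are powers of two, so their low words are all zero while their high words differ (0x3FE00000, 0x3FF00000 and 0x40D00000). As a guess that the source does not confirm: if the JIT ended up treating all three as 1.0, the expression would evaluate to 1.0 + 1.0 / 1.0 = 2.0, which matches the wrong value reported in the question.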