Search code examples
javaperformancebigdecimalfactorial

BigDecimal toPlainString takes very long to generate String


I made a program in Java to calculate extreme factorials like that of 1 milion. What it does essentially is to start a loop from 1 to n and every iteration, multiply a BigDecimal with the value of the counter variable in the loop. Once the loop is done, it invokes BigDecimal#toPlainString() which returns the number produced as a String. The invokation of this method however takes very very long to execute. For example, in the code bellow:

public static void main(String[] args) {
        BigDecimal number = new BigDecimal(1);
        long startTime = System.currentTimeMillis();
        for (int i = 1; i < 500000; i++) {
            number = number.multiply(new BigDecimal(i));
        }
        System.out.println("Generating took: " + (System.currentTimeMillis() - startTime) + "ms. Creating String.");
        startTime = System.currentTimeMillis();
        String result = number.toPlainString();
        System.out.println("String generation took: " + (System.currentTimeMillis() - startTime) + "ms");
        FileUtils.writeStringToFile(new File("Path/To/File"), result);
    }

the output to console was:

Generating took: 181324ms. Creating String.
String generation took: 710498ms

which demonstrates how long the method toPlainString() took.

I understand that we are dealing with huge numbers (arround a milion digits in the example above) but i wanted to know if there is any way to speed up this method and how should i go about doing that?

Thanks!

EDIT #1: The only reason why the milisecond time calculations where added in the post is to bring 'long' into prespective, and possibly demonstrate the behaviour of the code in case the problem is not reproducable by all readers. What i am trying to do is determine why it took so long in my case, and most importantly how to speed up the process of converting to String.


Solution

  • The reason why BigDecimal#PlainString takes very long to generate the string with Java 7 is: It was implemented very inefficiently in Java 7. Fortunately, it is much faster in Java 8.

    Here, it may be important to note that in this particular case, it is not really the string creation in BigDecimal, but that in BigInteger. The value that is computed in the given example is a large factorial, and thus, effectively an integral value. The internal scale field of the BigDecimal will be 0 then, and having a look at the toPlainString method shows that in this case, the string value of the internal intVal field will be returned:

    public String toPlainString() {
        if(scale==0) {
            if(intCompact!=INFLATED) {
                return Long.toString(intCompact);
            } else {
                return intVal.toString();
            }
        }
        ...
    }
    

    This intVal field is a BigInteger, and this is the actual culprit here.

    The following program is not intended as a proper "microbenchmark", but only supposed to give an estimate of the performance: It creates several factorials, and generates the string representations of these:

    import java.math.BigDecimal;
    
    public class BigDecimalToPlainStringPerformance
    {
        public static void main(String[] args)
        {
            for (int n = 10000; n <= 50000; n += 5000)
            {
                BigDecimal number = factorial(n);
                long before = System.nanoTime();
                String result = number.toPlainString();
                long after = System.nanoTime();
    
                double ms = (after - before) / 1e6;
                System.out.println(n + "! took " + ms +
                    " ms, length " + result.length());
            }
        }
    
        private static BigDecimal factorial(int n)
        {
            BigDecimal number = new BigDecimal(1);
            for (int i = 1; i < n; i++)
            {
                number = number.multiply(new BigDecimal(i));
            }
            return number;
        }
    
    }
    

    With Java 7 (u07), on my (old) PC, the output is along the lines of

    10000! took 514.98249 ms, length 35656
    15000! took 1232.86507 ms, length 56126
    20000! took 2364.799995 ms, length 77333
    25000! took 3877.565724 ms, length 99090
    30000! took 5814.925361 ms, length 121283
    35000! took 8231.13608 ms, length 143841
    40000! took 11088.823021 ms, length 166709
    45000! took 14344.778177 ms, length 189850
    50000! took 18155.089823 ms, length 213232
    

    Fortunately, this performance problem has been fixed in Java 8. With Java 8 (u45), the output is

    10000! took 77.20227 ms, length 35656
    15000! took 113.811951 ms, length 56126
    20000! took 188.293764 ms, length 77333
    25000! took 261.328745 ms, length 99090
    30000! took 355.001264 ms, length 121283
    35000! took 481.912925 ms, length 143841
    40000! took 610.812827 ms, length 166709
    45000! took 698.80725 ms, length 189850
    50000! took 840.87391 ms, length 213232
    

    showing that the performance has been improved significantly.

    From quickly skimming over the commit logs in the OpenJDK, there is one commit that is probably the most relevant here:

    Accelerate conversion to string by means of Schoenhage recursive base conversion

    (I did not verify this, but it seems the only one which dedicatedly aimed at improving the toString performance)