I made a program in Java to calculate extreme factorials like that of 1 milion. What it does essentially is to start a loop from 1 to n
and every iteration, multiply a BigDecimal
with the value of the counter variable in the loop. Once the loop is done, it invokes BigDecimal#toPlainString()
which returns the number produced as a String. The invokation of this method however takes very very long to execute. For example, in the code bellow:
public static void main(String[] args) {
BigDecimal number = new BigDecimal(1);
long startTime = System.currentTimeMillis();
for (int i = 1; i < 500000; i++) {
number = number.multiply(new BigDecimal(i));
}
System.out.println("Generating took: " + (System.currentTimeMillis() - startTime) + "ms. Creating String.");
startTime = System.currentTimeMillis();
String result = number.toPlainString();
System.out.println("String generation took: " + (System.currentTimeMillis() - startTime) + "ms");
FileUtils.writeStringToFile(new File("Path/To/File"), result);
}
the output to console was:
Generating took: 181324ms. Creating String.
String generation took: 710498ms
which demonstrates how long the method toPlainString()
took.
I understand that we are dealing with huge numbers (arround a milion digits in the example above) but i wanted to know if there is any way to speed up this method and how should i go about doing that?
Thanks!
EDIT #1: The only reason why the milisecond time calculations where added in the post is to bring 'long' into prespective, and possibly demonstrate the behaviour of the code in case the problem is not reproducable by all readers. What i am trying to do is determine why it took so long in my case, and most importantly how to speed up the process of converting to String.
The reason why BigDecimal#PlainString
takes very long to generate the string with Java 7 is: It was implemented very inefficiently in Java 7. Fortunately, it is much faster in Java 8.
Here, it may be important to note that in this particular case, it is not really the string creation in BigDecimal
, but that in BigInteger
. The value that is computed in the given example is a large factorial, and thus, effectively an integral value. The internal scale
field of the BigDecimal
will be 0
then, and having a look at the toPlainString
method shows that in this case, the string value of the internal intVal
field will be returned:
public String toPlainString() {
if(scale==0) {
if(intCompact!=INFLATED) {
return Long.toString(intCompact);
} else {
return intVal.toString();
}
}
...
}
This intVal
field is a BigInteger
, and this is the actual culprit here.
The following program is not intended as a proper "microbenchmark", but only supposed to give an estimate of the performance: It creates several factorials, and generates the string representations of these:
import java.math.BigDecimal;
public class BigDecimalToPlainStringPerformance
{
public static void main(String[] args)
{
for (int n = 10000; n <= 50000; n += 5000)
{
BigDecimal number = factorial(n);
long before = System.nanoTime();
String result = number.toPlainString();
long after = System.nanoTime();
double ms = (after - before) / 1e6;
System.out.println(n + "! took " + ms +
" ms, length " + result.length());
}
}
private static BigDecimal factorial(int n)
{
BigDecimal number = new BigDecimal(1);
for (int i = 1; i < n; i++)
{
number = number.multiply(new BigDecimal(i));
}
return number;
}
}
With Java 7 (u07), on my (old) PC, the output is along the lines of
10000! took 514.98249 ms, length 35656
15000! took 1232.86507 ms, length 56126
20000! took 2364.799995 ms, length 77333
25000! took 3877.565724 ms, length 99090
30000! took 5814.925361 ms, length 121283
35000! took 8231.13608 ms, length 143841
40000! took 11088.823021 ms, length 166709
45000! took 14344.778177 ms, length 189850
50000! took 18155.089823 ms, length 213232
Fortunately, this performance problem has been fixed in Java 8. With Java 8 (u45), the output is
10000! took 77.20227 ms, length 35656
15000! took 113.811951 ms, length 56126
20000! took 188.293764 ms, length 77333
25000! took 261.328745 ms, length 99090
30000! took 355.001264 ms, length 121283
35000! took 481.912925 ms, length 143841
40000! took 610.812827 ms, length 166709
45000! took 698.80725 ms, length 189850
50000! took 840.87391 ms, length 213232
showing that the performance has been improved significantly.
From quickly skimming over the commit logs in the OpenJDK, there is one commit that is probably the most relevant here:
Accelerate conversion to string by means of Schoenhage recursive base conversion
(I did not verify this, but it seems the only one which dedicatedly aimed at improving the toString
performance)