Search code examples
jvmgoogle-compute-enginejitscrypt

Why does the 1st call to scrypt() use just 1% CPU and take half an hour in GCE?


[ Summary & answer: Apparently the problem is that it takes long to seed the random number generator. See my answer below. ]

In Google Compute Engine (GCE), the very first request my Java Virtual Machine app makes to the scrypt password hashing function takes long — because the code has not yet been Just-In-Time compiled I suppose. So I'm warming up scrypt, by making a dummy scrypt("pswd", 2,1,1) call at server startup. However, what happens, is that the CPU rises to 300%+, stays there for 10-20 seconds, then falls back to 1%, although the request to scrypt() hasn't finished yet. Now, the CPU stays at 1%, for many minutes (up to half an hour, with 2 GCE vCPU), until eventually scrypt() is done.

Why this weird behavior?

Why won't scrypt() continue running at 300% CPU until it's done? It's not running out of memory. Look at the Docker stats a bit below.

After the 1st scrypt() request, subsequent requests finish "immediately". For example, this: SCryptUtil.scrypt("pswd", 65536, 8, 1) takes < 0.2 seconds, although it does a lot more work than: SCryptUtil.scrypt("pswd", 2, 1, 1) which (as mentioned) is my very first scrypt() call and usually takes a few minutes, with 4 GCE vCPU — and often around half an hour, with 2 GCE vCPU.

I'm using a GCE instance with 4 vCPU, 3.6 GB RAM. Docker 1.11.1. OpenJDK 1.8.0_77. In an Alpine Linux 3.3 Docker container, Ubuntu 16.04 Docker host. Cannot reproduce this on my laptop; on my laptop, scrypt is always fast, doesn't need any warm up.

docker stats, after 5-10 seconds: (now edp_play_1, line 2, uses 300+% CPU)

CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O               BLOCK I/O           PIDS
edp_nginx_1         0.02%               55.92 MB / 104.9 MB   53.33%              6.191 kB / 2.897 kB   0 B / 0 B           6
edp_play_1          315.12%             914.7 MB / 2.831 GB   32.31%              43.4 kB / 66.09 kB    0 B / 2.58 MB       67
edp_postgres_1      0.33%               29.84 MB / 314.6 MB   9.49%               529.1 kB / 307.9 kB   0 B / 327.7 kB      17
edp_redis_1         0.08%               6.513 MB / 52.43 MB   12.42%              4.984 kB / 1.289 kB   0 B / 0 B           3

docker stats after half a minute: (now edp_play_1 uses only 0.97% CPU — and stays like this, for up to half an hour, until done)

CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O               BLOCK I/O           PIDS
edp_nginx_1         0.02%               55.92 MB / 104.9 MB   53.33%              6.341 kB / 3.047 kB   0 B / 0 B           6
edp_play_1          0.97%               1.011 GB / 2.831 GB   35.71%              130.2 kB / 215.2 kB   0 B / 5.546 MB      66
edp_postgres_1      0.28%               29.84 MB / 314.6 MB   9.49%               678.2 kB / 394.7 kB   0 B / 458.8 kB      17
edp_redis_1         0.06%               6.513 MB / 52.43 MB   12.42%              4.984 kB / 1.289 kB   0 B / 0 B           3

If you want to test in Scala & sbt, this is what happens for me in GCE:

scala> import com.lambdaworks.crypto.SCryptUtil
import com.lambdaworks.crypto.SCryptUtil

scala> def time[R](block: => R): R = { val t0 = System.nanoTime() ; val result = block ; val t1 = System.nanoTime() ; println("Elapsed time: " + (t1 - t0) + "ns") ; result ; }
time: [R](block: => R)R

scala> time { SCryptUtil.scrypt("dummy password 1", 2, 1, 1) }
Elapsed time: 313823012366ns   <-- 5 minutes
res0: String = $s0$10101$2g6nrD0f5gDOTuP44f0mKg==$kqEe4TWSFXwtwGy3YgmIcqAhDvjMS89acST7cwPf/n4=

scala> time { SCryptUtil.scrypt("dummy password 1", 2, 1, 1) }
Elapsed time: 178461ns
res1: String = $s0$10101$C0iGNvfP+ywAxDS0ARoqVw==$k60w5Jpdt28PHGKT0ypByPocCyJISrq+T1XwmPlHR5w=

scala> time { SCryptUtil.scrypt("dummy password 1", 65536, 8, 1) }
Elapsed time: 130900544ns   <-- 0.1 seconds
res2: String = $s0$100801$UMTfIuBRY6lO1asECmVNYg==$b8i7GABgeczVHKVssJ8c2M7Z011u0TMBtVF4VSRohKI=

scala> 313823012366L / 1e9
res3: Double = 313.823012366

scala> 130900544L / 1e9
res4: Double = 0.130900544

Note: This is not related to Docker. I just tested outside Docker, with openjdk 8 installed directly on the GCE instance, and the result is the same: scrypt(..) takes like 3 minutes the first time, but the CPU is 90-100% idle. Thereafter, requests to scrypt complete immediately.


Solution

  • The problem is that it takes long to seed the random number generator. Scrypt does this:

    public static String scrypt(String passwd, int N, int r, int p) {
        try {
            byte[] salt = new byte[16];
            SecureRandom.getInstance("SHA1PRNG").nextBytes(salt);   <--- look
    
            byte[] derived = SCrypt.scrypt(passwd.getBytes("UTF-8"), salt, N, r, p, 32);
    

    (here)

    The call to nextBytes(salt) causes the SecureRandom object to seed itself, and this takes up to half an hour, on my Google Compute Engine instance.

    This is not related to Java or Docker, instead, look here: (directly on the host machine, not inside any Docker container)

    # < /dev/random stdbuf -o0 strings --bytes 1 | stdbuf -o0 tr -d '\n\t '
    

    This reads random chars from /dev/random, and I've had it running for a few minutes now, but only 3 chars have been output so far, after a few minutes. So it's super slow.

    Using the less random, but faster, /dev/urandom instead, then this:

    # < /dev/urandom stdbuf -o0 strings --bytes 1 | stdbuf -o0 tr -d '\n\t '
    

    prints 99999 chars immediately.

    (I found the above < /dev/random ... commands here: https://unix.stackexchange.com/a/114883/128585 )

    On my laptop, though, the /dev/random/ version immediately prints 30-40 chars. Then it blocks, and prints one or a few chars every 10 seconds or so. Perhaps it gets randomness from me when I use the mouse or keyboard or network.


    Update

    What I did: I'm now using /dev/urandom instead — as far as I've read on the Internet, that's totally fine.

    And I also started using a hardware random number generator; apparently the GCE instances have these.

    apt install rng-tools  # start using any hardware rand num gen, on Ubuntu