According to their paper (OSDI'08), symbolic execution tool KLEE needs about 1 hour to generate test cases for a single COREUTIL application. I am wondering if it can run in parallel?(say, multi-threaded on multicore or even GPU)?
Most static analysis tools are built using languages which are essentially single-threaded (C, C++, Java).
While parallelism support is available in such languages, as a general rule the static analysis tool builders do not take advantage of it. Part of the problem is that the parallelism in such tools is highly irregular, and does not come in easily-found large chunks which keeps the overhead down, a necessity for ensuring efficient parallelism. This makes it hard for the tools designers to find/designate such chunks, and they are already fighting the difficult problem of making their tool work.
KLEE itself I don't think is parallel for this reason. There is work on dividing the analysis supporting test generation into big chunks to hand to parallel running copies of KLEE ("Cloud9"); this is one way to divide the work up but these are really big computational grains, distributed across a cloud of independent processors. One of the key issues is "load balancing" (giving the right amount of work to each copy so that the parallelelism is well used). See http://dslab.epfl.ch/pubs/cloud9.pdf
(See my bio for a line of tools that has explicitly parallel foundations with fine-grain computations based on SMT).