java scala unit-testing jenkins noclassdeffounderror

Sporadic java.lang.NoClassDefFoundError in Scala


We have a weird problem. We are using an automated test tool whose DSL is implemented in Scala. The system we test with this tool is written in Java, and the interface between the two components is RMI. The interface part of the test tool is also written in Java (the rest is Scala). We have full control of the source code of all these components.

We already have on the order of a thousand test cases. We execute them automatically once every night, using Jenkins on a Linux server. The problem is that we sporadically get a java.lang.NoClassDefFoundError. This typically happens when Scala code tries to access a Java artifact.

If we execute the same test case manually, or check the result of the next nightly run, the problem has typically gone away on its own, but sometimes it occurs again in a completely different place. In some runs no such problem appears at all. The biggest problem is that the error is not reproducible; furthermore, since it happens during an automated run, we have hardly any information about the exact circumstances, just the test case and the log.

Has anybody encountered such a problem? Do you have any idea how to proceed? Any hint or piece of information would be helpful, not only a solution to the problem. Thank you!


Solution

  • I found the cause of the error (99% sure). We had the following two Jenkins jobs:

    1. Job1: Performs a full clean build of the tested system (written in Java), then a full clean build of the DSL, and finally executes the test cases. This is a long-running job (~5 hours).
    2. Job2: Performs a full clean build of the tested system, and then executes something else on it. The DSL is not involved. This is a shorter job (~1 hour).

    We have a single Maven repository shared by all the jobs. Furthermore, some parts of the tested system belong to the interface between the two components.

    Judging by the timestamps, the following happened:

    1. Job1 performed the full build of both components and started a test suite containing several test cases, whose execution takes about half an hour.
    2. The garbage collector might have swept out the components not yet used.
    3. Job2 started its build and also rebuilt the interface parts, including the ones swept out by the garbage collector in Job1.
    4. Job1 reached a test case that uses an interface component that had already been swept out.
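
    The timestamp comparison behind this timeline could be automated with something like the sketch below. It is not part of our setup: `ClassOrigin`, `locate` and `modifiedSince` are hypothetical names, and it assumes the interface classes are loaded from jar files on disk. It reports where a class came from and whether that file changed after a given instant (for example, the start of the test run).

    ```java
    import java.io.File;
    import java.net.URISyntaxException;
    import java.security.CodeSource;
    import java.time.Instant;

    /** Hypothetical diagnostic helper, not taken from the original setup. */
    public final class ClassOrigin {
        private ClassOrigin() {}

        /** Returns the jar or directory a class was loaded from, or null if unknown. */
        public static File locate(Class<?> type) {
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return null; // e.g. JDK classes such as java.lang.String
            }
            try {
                return new File(source.getLocation().toURI());
            } catch (URISyntaxException | IllegalArgumentException e) {
                return null; // location is not a plain file: URI
            }
        }

        /**
         * True if the file the class came from was modified after 'since'.
         * Most meaningful for jars; a directory's timestamp only changes
         * when entries are added or removed.
         */
        public static boolean modifiedSince(Class<?> type, Instant since) {
            File origin = locate(type);
            return origin != null && origin.lastModified() > since.toEpochMilli();
        }
    }
    ```

    Logging `ClassOrigin.locate(SomeInterface.class)` together with `modifiedSince(...)` when a test fails would show whether the jar in the shared repository was rewritten while Job1 was still running (`SomeInterface` stands for any of the shared interface classes).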

    The solution was the following: we moved Job2 to an earlier time, so that it now finishes before Job1 starts the tests.
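
    A different mitigation, which we did not use, would be to load the shared interface classes eagerly before the first test case runs, so that a missing class shows up immediately instead of half an hour into the suite. The sketch below assumes the class names are known up front; `ClassPreloader` and `preload` are hypothetical names. It does not remove the underlying conflict over the shared repository, and classes referenced by the listed ones are still resolved lazily, so every class of interest has to be listed.

    ```java
    import java.util.ArrayList;
    import java.util.List;

    /** Hypothetical helper: an alternative sketch, not the fix described above. */
    public final class ClassPreloader {
        private ClassPreloader() {}

        /**
         * Tries to load every named class through the given loader without
         * running static initializers, and returns the names that failed.
         */
        public static List<String> preload(ClassLoader loader, String... classNames) {
            List<String> missing = new ArrayList<>();
            for (String name : classNames) {
                try {
                    Class.forName(name, false, loader);
                } catch (ClassNotFoundException | LinkageError e) {
                    // NoClassDefFoundError is a LinkageError, so it is caught here too
                    missing.add(name);
                }
            }
            return missing;
        }
    }
    ```

    Calling `ClassPreloader.preload(loader, names)` once at the start of the suite and failing the run if the returned list is not empty would turn a sporadic mid-run error into an early, clearly attributable one.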