Search code examples
jenkinspersistencespring-boot-testokd

Jenkins agent jobs killed when running Spring Boot Tests with DB


I am running a self-hosted OKD 4 cluster with minimum production requirements (3 control planes and two compute nodes). This setup includes a Jenkins installation - installed via Helm (https://www.jenkins.io/doc/book/installing/kubernetes/) So far everything worked fine: builds start automatically when changes are pushed to Github and when they are successful are deployed to the same cluster where Jenkins runs in.

But currently I am facing the problem that when a build job executes a Spring Boot test which fires up a persistence context. The build agent (a jdk-11 image, see additionalAgent configuration below) gets killed as soon as Spring starts up the persistence context. Downloading dependencies and compilation works fine, btw.

additionalAgents:
  jdk-11:
    podName: jdk-11
    customJenkinsLabels: jdk-11
    image: jenkins/jnlp-agent-jdk11
    tag: latest
...

When the tests are disabled the job runs fine. But as soon as the persistence gets initialised the agent gets killed.

Those are the configurations I have tried for the test:

  1. Starting with an in-memory h2 database and flyway provisioning.
  2. Without flyway provisioning.
  3. Even without the database connection string set.

The time where the job gets killed is almost the same: For 1. it is

2021-10-20 22:44:06.637  INFO 299 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2021-10-20 22:44:07.032  INFO 299 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 310 ms. Found 2 JPA repository interfaces.
2021-10-20 22:44:08.240  INFO 299 --- [           main] o.s.cloud.context.scope.GenericScope     : BeanFactory id=1c9e8306-7514-338e-8a9f-3cfba5c1169b
2021-10-20 22:44:10.527  INFO 299 --- [           main] o.f.c.internal.license.VersionPrinter    : Flyway Community Edition 7.7.3 by Redgate
2021-10-20 22:44:10.532  INFO 299 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Starting...
2021-10-20 22:44:11.744  INFO 299 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Start completed.
2021-10-20 22:44:12.041  INFO 299 --- [           main] o.f.c.i.database.base.DatabaseType       : Database: jdbc:h2:mem:testdb (H2 1.4)
Killed

For 2.

2021-10-21 19:50:51.604  INFO 306 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2021-10-21 19:50:52.005  INFO 306 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 391 ms. Found 2 JPA repository interfaces.
2021-10-21 19:50:53.510  INFO 306 --- [           main] o.s.cloud.context.scope.GenericScope     : BeanFactory id=0fd77ef3-b5a2-35cb-b157-6d27c0cfe9a5
2021-10-21 19:50:56.405  INFO 306 --- [           main] o.hibernate.jpa.internal.util.LogHelper  : HHH000204: Processing PersistenceUnitInfo [name: default]
2021-10-21 19:50:56.708  INFO 306 --- [           main] org.hibernate.Version                    : HHH000412: Hibernate ORM core version 5.4.32.Final
2021-10-21 19:50:57.503  INFO 306 --- [           main] o.hibernate.annotations.common.Version   : HCANN000001: Hibernate Commons Annotations {5.1.2.Final}
Killed

And for 3.

2021-10-21 22:02:48.810  INFO 309 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2021-10-21 22:02:49.198  INFO 309 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 380 ms. Found 2 JPA repository interfaces.
2021-10-21 22:02:50.509  INFO 309 --- [           main] o.s.cloud.context.scope.GenericScope     : BeanFactory id=0fd77ef3-b5a2-35cb-b157-6d27c0cfe9a5
2021-10-21 22:02:53.523  INFO 309 --- [           main] o.hibernate.jpa.internal.util.LogHelper  : HHH000204: Processing PersistenceUnitInfo [name: default]
2021-10-21 22:02:53.898  INFO 309 --- [           main] org.hibernate.Version                    : HHH000412: Hibernate ORM core version 5.4.32.Final
Killed

The log of the Jenkins pod just states

Terminated Kubernetes instance for agent jenkins/jdk-11-bjtz5
Disconnected computer jdk-11-bjtz5
2021-10-21 22:02:57.342+0000 [id=465]   INFO    o.c.j.p.k.KubernetesSlave#deleteSlavePod: Terminated Kubernetes instance for agent jenkins/jdk-11-bjtz5
2021-10-21 22:02:57.342+0000 [id=465]   INFO    o.c.j.p.k.KubernetesSlave#_terminate: Disconnected computer jdk-11-bjtz5
2021-10-21 22:02:57.356+0000 [id=436]   INFO    j.s.DefaultJnlpSlaveReceiver#channelClosed: Computer.threadPoolForRemoting [#56] for jdk-11-bjtz5 terminated: java.nio.channels.ClosedChannelException

In all cases there are no exceptions, stacktraces or suspicious events. And these steps are reproducable - when I run the build with the same configuration again the agents gets killed at exactly the same step in the test.

The setup:

  • Jenkins Version 2.303.2
  • Jenkins uses a MySQL database running in the same cluster
  • all Jenkins plugins are up-to-date
  • OKD currently running at version 4.8.0-0.okd-2021-10-10-030117
  • currently there are no resource quotas set and the system still has plenty of free resources

I am presuming that a little bit of configuration is missing to make this work. But I just cannot find what it could be. So I am asking: have had anyone the same issue here? Or any guesses what the missing part could be?

When there is some information missing please point it out and I will add it.


Solution

  • After a little step back I took a look into the actual pod which runs the build. And found out that the memory limit of the agents was the problem.

    So increasing the limit solved the problem!

    I have done that by modifying the local jenkins-values.yaml and updated the limits section of the agent: block in it.

    A little confusing for me was the fact that no log entry stated the exceed of memory usage.

    Next thought is that I will set a memory limit for the test step via Java options to kill the maven process before the pod exceeds the limit. Guess it would be more transparent in the build.

    And as a sidenote: the limit has been set to 512Mi previously and has been exceeded by ~10 MiB -.-
    Luckily for me I have found it at this point and the other build-jobs where running fine was just due to the lack of resource-usage (haven't figured that just starting Hibernate would exceed the 512 MiB mark)