Tags: postgresql, spring-boot-test, testcontainers, testcontainers-junit5

Tests using Postgres images via Testcontainers time out


I'm working on a Spring Boot project with a Postgres database backend, where JUnit 5 and Testcontainers are used for integration tests that involve database access.

Testcontainers is set up by modifying the JDBC URL like this:

spring:
  datasource:
    url: jdbc:tc:postgresql:9.6.8:///test

This setup worked fine for many months, but now I'm hitting a roadblock.

There are already 20 integration test classes, and adding another one causes tests to fail with an error that looks like a timeout to me.

When adding the 21st test class, another test (let's call it RandomTest) hangs for a few minutes and then fails with this error:

 java.lang.IllegalStateException at DefaultCacheAwareContextLoaderDelegate.java:98
        Caused by: org.springframework.beans.factory.BeanCreationException at AbstractAutowireCapableBeanFactory.java:1804
            Caused by: org.flywaydb.core.internal.exception.FlywaySqlException at JdbcUtils.java:68
                Caused by: java.sql.SQLException at JdbcDatabaseContainer.java:263
                    Caused by: org.postgresql.util.PSQLException at ConnectionFactoryImpl.java:659
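
The Gradle summary above shows only exception types and source locations, not their messages, and the message of the innermost PSQLException is what would distinguish a timeout from, say, exhausted connections. A minimal sketch in plain Java of walking a cause chain so every message becomes visible; the class name CauseChain and the method describe are hypothetical helpers, not part of Spring, Flyway or Testcontainers, and the exception in main is a made-up placeholder rather than the real error:

```java
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists every exception in a cause chain with its message.
public class CauseChain {

    // Returns one "type: message" line per throwable, outermost first.
    // The depth cap guards against accidental cycles in the chain.
    static List<String> describe(Throwable top) {
        List<String> lines = new ArrayList<>();
        int depth = 0;
        for (Throwable t = top; t != null && depth < 50; t = t.getCause(), depth++) {
            lines.add(t.getClass().getName() + ": " + t.getMessage());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Placeholder chain for demonstration only; the messages are invented.
        Throwable demo = new IllegalStateException("context failed",
                new RuntimeException("bean creation failed",
                        new SQLException("placeholder message")));
        describe(demo).forEach(System.out::println);
    }
}
```

In a test, the same method could be called on a caught exception (for example inside a try/catch around the failing call) to log the full chain.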

I know it can't be a problem with the test per se, because when I run it individually, there's no problem:

./gradlew test --tests RandomTest
[...]
BUILD SUCCESSFUL in 16s

It may also be noteworthy that I only have this problem when running the tests with Gradle (both locally and on the CI server). I don't see this problem when running them in IntelliJ.

So it looks to me like some kind of resource problem, for example the Postgres instance that Testcontainers starts running out of memory or connections, but that's just a guess.

I tried different configuration modifications that I found in the Testcontainers docs:

  1. Running the container in daemon mode like this:

     spring:
       datasource:
         url: jdbc:tc:postgresql:9.6.8:///test?TC_DAEMON=true

  2. Disabling Ryuk by setting TESTCONTAINERS_RYUK_DISABLED=true
  3. Starting Ryuk in (un-)privileged mode explicitly with ryuk.container.privileged=true|false (I tried both because I'm not sure what the default is)

None of these had a noticeable effect on the problem.

I'm wondering whether we are using Testcontainers for too many tests. Should I instead use H2 for most integration tests and use Testcontainers only for a few selected tests to make sure that everything works with the production database?

Or am I missing something?


Solution

  • Okay, it turned out that it actually was a problem with the newly added test.

    The test author had added a method that was supposed to clean up the database after the test like this:

    @AfterEach
    public void beforeEach() {
        fooRepository.deleteAll();
        barRepository.deleteAll();
        bazRepository.deleteAll();
    }
    

    After removing this method, all the tests pass again. My guess is that the cleanup takes a bit longer than the test itself, so the database connection is not released in time for the next test to use it, or something along those lines.
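
    The guess about a connection not being released in time can be pictured with a toy model. This is only a sketch in plain Java and does not confirm that this is what actually happened in the project: a one-permit Semaphore stands in for a connection pool, a sleeping thread stands in for the deleteAll() calls, and the names PoolStarvationSketch and nextTestGetsConnection as well as all timings are made up for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Toy model only: a one-permit semaphore stands in for a connection pool.
// Nothing here is Spring, Testcontainers or PostgreSQL API.
public class PoolStarvationSketch {

    // Returns true if the "next test" obtains the single connection within
    // waitMillis while a "cleanup" thread holds it for cleanupMillis.
    static boolean nextTestGetsConnection(long cleanupMillis, long waitMillis) {
        Semaphore pool = new Semaphore(1);
        pool.acquireUninterruptibly(); // the cleanup takes the only connection

        Thread cleanup = new Thread(() -> {
            try {
                Thread.sleep(cleanupMillis); // stands in for the deleteAll() calls
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                pool.release(); // the connection goes back to the pool
            }
        });
        cleanup.start();

        try {
            // The next test waits for a connection, but only up to waitMillis.
            boolean acquired = pool.tryAcquire(waitMillis, TimeUnit.MILLISECONDS);
            cleanup.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("short cleanup: " + nextTestGetsConnection(50, 2000));
        System.out.println("long cleanup:  " + nextTestGetsConnection(1000, 100));
    }
}
```

    With these margins the first call should report true and the second false: the waiter only fails when the cleanup outlasts its timeout.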