Search code examples
scalaslickslick-pg

slick-pg filter rows by column with specific array_length


According to https://www.postgresql.org/docs/8.4/functions-array.html, there is an array_length PG Oper/Function. Then, according to https://github.com/tminglei/slick-pg/tree/master/core/src/main/scala/com/github/tminglei/slickpg/array, I can see that it is documented that there exists a Slick Oper/Function,. I have been looking through the documentation and various posts. However, I cannot seem to connect the dots on how to leverage this PG Oper/Function in a query similar to:

    import DatabaseObjects.profile.api._
    val randomFunction: Rep[Double] = SimpleFunction.nullary[Double]("random")
    val query: Query[GeneratorV1, GeneratorV1Row, Seq] = GeneratorV1
      .sortBy(_ => randomFunction)
      .filter(_.rasterizationJobConfig +>> "activity" === activityTypeEnum)
      .filter(_.jobStatus === GeneratorStatuses.success)
      .take(N)

I need to change this query to add the last filter. Below is the modified query with pseudo-code for the additional array_length filter. As in the following SQL query:

SELECT * from generator_v1 WHERE rasterization_job_config->>'activity' = 'PLANTING'
  AND job_status = 'success' AND array_length(worker_job_ids, 1) = 2 LIMIT 5;
    import DatabaseObjects.profile.api._
    val randomFunction: Rep[Double] = SimpleFunction.nullary[Double]("random")
    val query: Query[GeneratorV1, GeneratorV1Row, Seq] = GeneratorV1
      .sortBy(_ => randomFunction)
      .filter(_.rasterizationJobConfig +>> "activity" === activityTypeEnum)
      .filter(_.jobStatus === GeneratorStatuses.success)
      .filter(/* pseudo-code */ _.jobIds.array_length > 0 && _.jobIds.array_length < N)
      .take(N)

So far, I have tried:

    import DatabaseObjects.profile.api._
    val randomFunction: Rep[Double] = SimpleFunction.nullary[Double]("random")

    def arrayLength(rep: Rep[List[Long]]): Rep[Boolean] = {
      val expression = SimpleExpression.unary[List[Long], Boolean] { (s, queryBuilder) =>
        queryBuilder.sqlBuilder += "array_length("
        queryBuilder.expr(s)
        queryBuilder.sqlBuilder += ", 1) = 2"
      }
      expression.apply(rep)
    }

    val query: Query[GeneratorV1, GeneratorV1Row, Seq] = GeneratorV1
      .sortBy(_ => randomFunction)
      .filter(_.rasterizationJobConfig +>> "activity" === activityTypeEnum)
      .filter(_.jobStatus === GeneratorStatuses.success)
      .filter(row => (row.workerJobIds, arrayLength(row.workerJobIds)))
      .take(N)

Solution

  • As it is detailed in the slick-pg supported array functions docs you shared

    Supported Array Oper/Functions

    Slick Oper/Function PG Oper/Function
    length array_length

    The Slick Oper/Function is length and not array_length as you did in the code. Which means, if slick-pg was correctly setup, your code should look like

    val query: Query[GeneratorV1, GeneratorV1Row, Seq] = GeneratorV1
          .sortBy(_ => randomFunction)
          .filter(_.rasterizationJobConfig +>> "activity" === activityTypeEnum)
          .filter(_.jobStatus === GeneratorStatuses.success)
          .filter(row => row.jobIds.length().>(0) && row.jobIds.length().<(N))
          .take(N)
    

    Here you have an integration test using scalatest, testcontainer-postgresql, slick, slick-pg.

    • build.sbt
    ThisBuild / scalaVersion := "2.13.15"
    
    lazy val root = (project in file("."))
      .settings(
        name := "stackoverflow-pocs-scala",
        libraryDependencies ++= Seq(
          "com.typesafe.slick"         %% "slick"           % "3.5.1",
          "com.github.tminglei"        %% "slick-pg"        % "0.22.2",
          "ch.qos.logback"              % "logback-classic" % "1.5.6",
          "com.typesafe.scala-logging" %% "scala-logging"   % "3.9.5",
          "org.postgresql"              % "postgresql"      % "42.7.3",
          "org.scalatest"              %% "scalatest"       % "3.2.19" % Test,
          "com.dimafeng" %% "testcontainers-scala-scalatest"  % "0.41.4" % Test,
          "com.dimafeng" %% "testcontainers-scala-postgresql" % "0.41.4" % Test
        )
      )
    
    • src/test/resources/logback-test.xml (just to printl the sql during test execution)
    <configuration>
        <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
            <encoder>
                <pattern>%highlight([%-5level]) - time: [%d{HH:mm:ss.SSS}] - thread: [%thread] - logger: %logger - msg: %msg%n</pattern>
            </encoder>
        </appender>
    
        <appender name="ASYNC" class="ch.qos.logback.classic.AsyncAppender">
            <appender-ref ref="STDOUT" />
        </appender>
    
        <root level="INFO">
            <appender-ref ref="ASYNC"/>
        </root>
    
        <!-- SHOW SQL STATEMENTS GENERATED AND EXECUTED BY SLICK -->
        <logger name="slick.jdbc.JdbcBackend.statement"         level="DEBUG" />
        <logger name="slick.jdbc.JdbcBackend.parameter"         level="DEBUG" />
        <logger name="slick.jdbc.StatementInvoker.result"       level="DEBUG" />
    
        <logger name="tc" level="WARN"/>
        <logger name="org.testcontainers" level="WARN"/>
    </configuration>
    
    • src/test/scala/MyPostgresProfile.scala
    import com.github.tminglei.slickpg.{ExPostgresProfile, PgArraySupport}
    
    trait MyPostgresProfile extends ExPostgresProfile with PgArraySupport {
      override val api: MyAPI.type = MyAPI
    
      object MyAPI extends ExtPostgresAPI with ArrayImplicits
    }
    
    object MyPostgresProfile extends MyPostgresProfile
    
    • src/test/scala/PostgresArrayLengthTest.scala
    import com.dimafeng.testcontainers.PostgreSQLContainer
    import com.dimafeng.testcontainers.scalatest.TestContainerForAll
    import dev.gaston.stackoverflow.poc.scala.MyPostgresProfile.api._
    import org.scalatest.funsuite.AsyncFunSuite
    import org.scalatest.matchers.should.Matchers
    import org.testcontainers.utility.DockerImageName
    import slick.dbio.Effect
    import slick.lifted.ProvenShape
    import slick.sql.FixedSqlAction
    
    class PostgresArrayLengthTest
        extends AsyncFunSuite
        with Matchers
        with TestContainerForAll {
    
      // postgresql container
      override val containerDef: PostgreSQLContainer.Def =
        PostgreSQLContainer.Def(DockerImageName.parse("postgres:15.4-alpine3.18"))
    
      // case class to map with the schema
      case class Row(id: Int, ints: List[Int])
    
      // slick schema
      case class Rows(tag: Tag) extends Table[Row](tag, "rows") {
        def id: Rep[Int] = column[Int]("id", O.AutoInc, O.PrimaryKey) // some column
        def ints: Rep[List[Int]] = column[List[Int]]("ints") // the array column
        override def * : ProvenShape[Row] = (id, ints) <> (Row.tupled, Row.unapply)
      }
    
      // rows to insert
      val row1: Row = Row(1, List(1, 2, 3, 4))
      val row2: Row = Row(2, List(5, 6, 7))
      val row3: Row = Row(3, List(8, 9))
      val row4: Row = Row(4, Nil)
      val row5: Row = Row(5, List(10, 11, 12))
      val row6: Row = Row(6, Nil)
    
      val rowsQuery = TableQuery[Rows]
    
      // all the rows to be inserted in a list
      val rowsList: List[Row] = List(row1, row2, row3, row4, row5, row6)
    
      // sql create schema
      val createSchema: FixedSqlAction[Unit, NoStream, Effect.Schema] =
        rowsQuery.schema.create
    
      // sql insert rows
      val insertRows = DBIO.sequence(rowsList.map(row => rowsQuery += row))
    
      test("dummy test that using postgres array functions with slick") {
        withContainers { pg =>
          // slick db connection
          val db = Database.forURL(pg.jdbcUrl, pg.username, pg.password)
          for {
            _ <- db.run(createSchema) // create the table
            _ <- db.run(insertRows) // insert the rows
            rowsFromQuery <-
              db.run(
                rowsQuery
                  // filter rows where array length is > 0 and <= 2
                  .filter(row =>
                    row.ints.length().>(0) // array length >  0
                      && row.ints.length().<=(2)
                  ) // array length <= 2
                  .result
              )
          } yield {
            rowsFromQuery should
              contain theSameElementsAs
              rowsList.filter(row => row.ints.nonEmpty && row.ints.size <= 2)
          }
        }
      }
    
    }
    

    Once everything is setup, sbt test can be executed and you should see messages similar to

    [DEBUG] - Preparing statement: create table "rows" ("id" SERIAL NOT NULL PRIMARY KEY,"ints" int4 [] NOT NULL)
    [DEBUG] - Preparing statement: insert into "rows" ("ints")  values (?)
    [DEBUG] - /-----------\
    [DEBUG] - | 1         |
    [DEBUG] - | Array     |
    [DEBUG] - |-----------|
    [DEBUG] - | {"1","2"} |
    [DEBUG] - \-----------/
    [DEBUG] - Preparing statement: insert into "rows" ("ints")  values (?)
    [DEBUG] - /-----------\
    [DEBUG] - | 1         |
    [DEBUG] - | Array     |
    [DEBUG] - |-----------|
    [DEBUG] - | {"3","4"} |
    [DEBUG] - \-----------/
    [DEBUG] - Preparing statement: insert into "rows" ("ints")  values (?)
    [DEBUG] - /---------------\
    [DEBUG] - | 1             |
    [DEBUG] - | Array         |
    [DEBUG] - |---------------|
    [DEBUG] - | {"5","6","7"} |
    [DEBUG] - \---------------/
    [DEBUG] - Preparing statement: insert into "rows" ("ints")  values (?)
    [DEBUG] - /-------\
    [DEBUG] - | 1     |
    [DEBUG] - | Array |
    [DEBUG] - |-------|
    [DEBUG] - | {}    |
    [DEBUG] - \-------/
    [DEBUG] - Preparing statement: insert into "rows" ("ints")  values (?)
    [DEBUG] - /-----------\
    [DEBUG] - | 1         |
    [DEBUG] - | Array     |
    [DEBUG] - |-----------|
    [DEBUG] - | {"8","9"} |
    [DEBUG] - \-----------/
    [DEBUG] - Preparing statement: insert into "rows" ("ints")  values (?)
    [DEBUG] - /-------\
    [DEBUG] - | 1     |
    [DEBUG] - | Array |
    [DEBUG] - |-------|
    [DEBUG] - | {}    |
    [DEBUG] - \-------/
    [DEBUG] - Preparing statement: select "id", "ints" from "rows" where (coalesce(array_length("ints",1), 0) > 0) and (coalesce(array_length("ints",1), 0) <= 2)
    [DEBUG] - /----+-------\
    [DEBUG] - | 1  | 2     |
    [DEBUG] - | id | ints  |
    [DEBUG] - |----+-------|
    [DEBUG] - | 1  | {1,2} |
    [DEBUG] - | 2  | {3,4} |
    [DEBUG] - | 5  | {8,9} |
    [DEBUG] - \----+-------/