Search code examples
elixirectopostgrex

How do I prevent a cumbersome query from timing out in Postgrex?


I'm running a query and loading the results into a Stream using Postgrex, like so:

{:ok, host_pid} = Postgrex.start_link(hostname: "somewhere.hostname.io", username: "myuser", password: "mypass", database: "mydb")

Postgrex.transaction(host_pid, fn(conn) ->
    # do query that takes 5 seconds
    # with timeout set to be really big
    query = Postgrex.prepare!(conn, "", "SELECT pg_sleep(5)", timeout: 50_000)
    stream = Postgrex.stream(conn, query)
    result_to_iodata = fn(%Postgrex.Result{rows: rows}) -> format_query_result(rows) end
    Enum.into(stream, File.stream!("eg"), result_to_iodata)
end)

But I get the following error:

localhost$ mix run lib/MyPostgrexScript.exs 
** (DBConnection.ConnectionError) connection not available and request was dropped 
from queue after 2950ms. You can configure how long requests wait in the queue 
using :queue_target and :queue_interval. See DBConnection.start_link/2 for more information
    (db_connection) lib/db_connection.ex:836: DBConnection.transaction/3
    lib/MyPostgrexScript.exs:3: MyPostgrexModule.sleep/0
    (elixir) lib/code.ex:767: Code.require_file/2
    (mix) lib/mix/tasks/run.ex:147: Mix.Tasks.Run.run/5

Since I want to do cumbersome queries which will certainly require more than 2950ms to run, I am wondering how I configure Postgrex to let my query take more time. I read about the :timeout option at https://hexdocs.pm/postgrex/Postgrex.html#transaction/3 but I'm not sure to how include it, or whether it's what I'm looking for.

Any guidance is immensely appreciated, thanks!


Solution

  • I read about the :timeout option at https://hexdocs.pm/postgrex/Postgrex.html#transaction/3 but I'm not sure to how include it,

    Like this (see the last line):

    Postgrex.transaction(
        host_pid,
    
        fn(conn) ->
            # do query that takes 5 seconds
            # with timeout set to be really big
            query = Postgrex.prepare!(conn, "", "SELECT pg_sleep(5)", timeout: 50_000)
            stream = Postgrex.stream(conn, query)
            result_to_iodata = fn(%Postgrex.Result{rows: rows}) ->
                                     format_query_result(rows) 
                               end
            Enum.into(stream, File.stream!("eg"), result_to_iodata)
        end,
    
        timeout: 30_000  #30 seconds
    )
    

    Whenever the elixir docs define a function like this:

    func_name(arg1, ...argN, opts \\ [] )
    

    opts is a keyword list, e.g:

    [{:a, 1}, {:b, 2}]
    

    However, if a keyword list is the last argument in a function call, the keyword list can be written like this:

    func(arg1, arg2, a: 1, b: 2)
    

    and the function definition will receive the three arguments

    arg1, arg2, [{:a, 1}, {:b, 2}]
    

    In any case, the default for :timeout is:

    :timeout - Transaction timeout (default: 15000);
    

    and the error says:

    connection not available and request was dropped from queue after 2950ms

    Because 2950 < 15000 it doesn't seem like the :timeout value is the source of the error.

    The error message continues:

    connection not available.... You can configure how long requests wait in the queue using :queue_target and :queue_interval. See DBConnection.start_link/2 for more information

    This explains how to configure those timeouts:

    In config/<env>.exs (where <env> is dev, test, or prod):

    config :my_app, MyApp.Repo,
      adapter: Ecto.Adapters.Postgres,
      pool_size: 10,
      migration_timestamps: [type: :utc_datetime_usec],
      migration_lock: nil,
      queue_target: 5000
    

    Is what we had to do recently due to increased number of errors in production.

    Also,

    Handling requests is done through a queue. When DBConnection is started, there are two relevant options to control the queue:

    :queue_target in milliseconds, defaults to 50ms
    :queue_interval in milliseconds, defaults to 1000ms
    

    Our goal is to wait at most :queue_target for a connection. If all connections checked out during a :queue_interval takes more than :queue_target, then we double the :queue_target. If checking out connections take longer than the new target, then we start dropping messages.

    For example, by default our target is 50ms. If all connections checkouts take longer than 50ms for a whole second, we double the target to 100ms and we start dropping messages if the time to checkout goes above the new limit.

    This allows us to better plan for overloads as we can refuse requests before they are sent to the database, which would otherwise increase the burden on the database, making the overload worse.

    But, if you haven't touched those defaults, then I wonder why you are seeing 2950ms in the error message and not something closer to 50ms or 1000ms.