Search code examples
windows-10elixirpostgrex

Elixir postgrex with poolboy example on Windows fails with 'module DBConnection.Poolboy not available'


I am exploring using Elixir for fast Postgres data imports of mixed types (CSV, JSON). Being new to Elixir, I am following the the example given in the youtube video "Fast Import and Export with Elixir and Postgrex - Elixir Hex package showcase" (https://www.youtube.com/watch?v=YQyKRXCtq4s). The basic mix application works until the point that Poolboy is introduced, i.e. Postgrex successfully loads records into the database using a single connection.

When I try to follow the Poolboy configuration, and test it by running

FastIoWithPostgrex.import("./data_with_ids.txt")

in iex or the command line, I get the following error, for which I can not determine the cause (username and password removed):

** (UndefinedFunctionError) function DBConnection.Poolboy.child_spec/1 is 
undefined (module DBConnection.Poolboy is not available)
DBConnection.Poolboy.child_spec({Postgrex.Protocol, [types: 
Postgrex.DefaultTypes, name: :pg, pool: DBConnection.Poolboy, pool_size: 4, 
hostname: "localhost", port: 9000, username: "XXXX", password: 
"XXXX", database: "ASDDataAnalytics-DEV"]})
(db_connection) lib/db_connection.ex:383: DBConnection.start_link/2
(fast_io_with_postgrex) lib/fast_io_with_postgrex.ex:8: 
FastIoWithPostgrex.import/1

I am running this on Windows 10, connecting to a PostgreSQL 10.x Server through a local SSH tunnel. Here is the lib/fast_io_with_postgrex.ex file:

defmodule FastIoWithPostgrex do
  @moduledoc """
  Documentation for FastIoWithPostgrex.
  """

  def import(filepath) do

    {:ok, pid} = Postgrex.start_link(name: :pg,
      pool: DBConnection.Poolboy,
      pool_size: 4,
      hostname: "localhost",
      port: 9000,
      username: "XXXX", password: "XXXX", database: "ASDDataAnalytics-DEV")

    File.stream!(filepath)
    |> Stream.map(fn line ->
        [id_str, word] = line |> String.trim |> String.split("\t", trim: true, parts: 2)
        {id, ""} = Integer.parse(id_str)
        [id, word]
    end)
    |> Stream.chunk_every(10_000, 10_000, [])
    |> Task.async_stream(fn word_rows ->
      Enum.each(word_rows, fn word_sql_params ->
        Postgrex.transaction(:pg, fn conn ->
          IO.inspect Postgrex.query!(conn, "INSERT INTO asdda_dataload.words (id, word) VALUES ($1, $2)", word_sql_params)
#        IO.inspect Postgrex.query!(pid, "INSERT INTO asdda_dataload.words (id, word) VALUES ($1, $2)", word_sql_params)    
        end , pool: DBConnection.Poolboy, pool_timeout: :infinity, timeout: :infinity) 
      end)
    end, timeout: :infinity)
    |> Stream.run

  end # def import(file)
end

Here is the mix.exs file:

defmodule FastIoWithPostgrex.MixProject do
  use Mix.Project

  def project do
    [
      app: :fast_io_with_postgrex,
      version: "0.1.0",
      elixir: "~> 1.7",
      start_permanent: Mix.env() == :prod,
      deps: deps()
    ]
  end

  # Run "mix help compile.app" to learn about applications.
  def application do
    [
      extra_applications: [:logger, :poolboy, :connection]
    ]
  end

  # Run "mix help deps" to learn about dependencies.
  defp deps do
    [
      # {:dep_from_hexpm, "~> 0.3.0"},
      # {:dep_from_git, git: "https://github.com/elixir-lang/my_dep.git", 
tag: "0.1.0"},

      {:postgrex, "~>0.14.1"},
      {:poolboy, "~>1.5.1"}
    ]
  end
end

Here is the config/config.exs file:

# This file is responsible for configuring your application
# and its dependencies with the aid of the Mix.Config module.

use Mix.Config

config :fast_io_with_postgrex, :postgrex,
  database: "ASDDataAnalytics-DEV",
  username: "XXXX",
  password: "XXXX",
  name: :pg,
  pool: DBConnection.Poolboy,
  pool_size: 4

# This configuration is loaded before any dependency and is restricted
# to this project. If another project depends on this project, this
# file won't be loaded nor affect the parent project. For this reason,
# if you want to provide default values for your application for
# 3rd-party users, it should be done in your "mix.exs" file.

# You can configure your application as:
#
#     config :fast_io_with_postgrex, key: :value
#
# and access this configuration in your application as:
#
#     Application.get_env(:fast_io_with_postgrex, :key)
#
# You can also configure a 3rd-party app:
#
#     config :logger, level: :info
#

# It is also possible to import configuration files, relative to this
# directory. For example, you can emulate configuration per environment
# by uncommenting the line below and defining dev.exs, test.exs and such.
# Configuration from the imported file will override the ones defined
# here (which is why it is important to import them last).
#
#     import_config "#{Mix.env()}.exs"

Any assistance with finding the cause of this error would be greatly appreciated!


Solution

  • I didn't want to goo to deep into how this isn't working, but that example is a little old, and that poolboy 1.5.1 you get pulled with deps.get is from 2015.. and the example uses elixir 1.4

    Also, if you see Postgrex's mix.exs deps, you will notice your freshly installed lib (1.14) depends on elixir_ecto/db_connection 2.x

    The code you are referring uses Postgres 1.13.x, which depends on {:db_connection, "~> 1.1"}. So i would expect incompatibilities.

    I would play with the versions of the libs you see in examples code mix.lock file, an the elixir version if I wanted to see that working.

    Maybe try lowering the Postgrex version first to something around that time (maybe between 0.12.2 and the locked version of the example).

    Also, the version of elixir might have some play here, check this

    Greetings!

    • have fun

    EDIT:

    You can use DBConnection.ConnectionPool instead of poolboy and get way using the latest postgrexand elixir versions, not sure about the performance difference but you can compare, just do this:

    on config/config.exs (check if you need passwords, etc..)

    config :fast_io_with_postgrex, :postgrex,
      database: "fp",
      name: :pg,
      pool: DBConnection.ConnectionPool,
      pool_size: 4
    

    And in lib/fast_io_with.....ex replace both Postgrex.start_link(... lines with:

    {:ok, pid} = Application.get_env(:fast_io_with_postgrex, :postgrex)
              |> Postgrex.start_link
    

    That gives me:

    mix run -e 'FastIoWithPostgrex.import("./data_with_ids.txt")'
    1.76s user 0.69s system 106% cpu 2.294 total
    

    on Postgrex 0.14.1 and Elixir 1.7.3