Search code examples
elixirecto

Map and Zip SQL query result to create grouped data structure


I have a data structure from SQL.query(Repo...) as such:

IO.inspect results.columns:

["tour_id", "name", "year", "num_shows", "num_cities", "num_festivals", "num_countries"]

and the actual rows, IO.inspect results.rows:

[
  [5, "tour", 2022, 13, 10, 4, 2],
  [1, "asia", 2018, 4, 4, 3, 3],
  [2, "north america", 2018, 39, 17, 2, 2],
  [3, "europe", 2018, 13, 12, 9, 10],
  [4, "las vegas shows", 2018, 3, 1, 0, 1]
]

I am struggling to do something like Enum.zip where the result would be:

[
 [ tour_id: 5, name: "tour", year: 2022, num_shows: 13, ....],
 [ tour_id: 1, name: "asia", year: 2018, num_shows: 4, ....],
]

Ultimately, I want to do a group_by based on year:

2018: [
        [ tour_id: 1, name: "asia", year: 2018, num_shows: 4, ....],
        [ tour_id: 3, name: "europe", year: 2018, num_shows: 13, ....],
      ],
2022: [ ... ]

Or, however else one may suggest it should be formatted. I thought converting the rows and columns to keyword lists would do the trick, and then I can do the group_by.

Thank you


Solution

  • If you want a keyword list of column name -> row value at the end, the first step would be to convert the column names to atoms:

    columns =
      ["tour_id", "name", "year", "num_shows", "num_cities", "num_festivals", "num_countries"]
      |> Enum.map(&String.to_atom/1)
    

    Then you can map over the rows and zip each one with the columns:

    rows = [
      [5, "tour", 2022, 13, 10, 4, 2],
      [1, "asia", 2018, 4, 4, 3, 3],
      [2, "north america", 2018, 39, 17, 2, 2],
      [3, "europe", 2018, 13, 12, 9, 10],
      [4, "las vegas shows", 2018, 3, 1, 0, 1]
    ]
    
    zipped = for row <- rows, do: Enum.zip(columns, row)
    
    IO.inspect(zipped)
    
    [
      [
        tour_id: 5,
        name: "tour",
        year: 2022,
        num_shows: 13,
        num_cities: 10,
        num_festivals: 4,
        num_countries: 2
      ],
      [
        tour_id: 1,
        name: "asia",
        year: 2018,
        num_shows: 4,
        num_cities: 4,
        num_festivals: 3,
        num_countries: 3
      ],
      [
        tour_id: 2,
        name: "north america",
        year: 2018,
        num_shows: 39,
        num_cities: 17,
        num_festivals: 2,
        num_countries: 2
      ],
      [
        tour_id: 3,
        name: "europe",
        year: 2018,
        num_shows: 13,
        num_cities: 12,
        num_festivals: 9,
        num_countries: 10
      ],
      [
        tour_id: 4,
        name: "las vegas shows",
        year: 2018,
        num_shows: 3,
        num_cities: 1,
        num_festivals: 0,
        num_countries: 1
      ]
    ]
    

    And finally group by the year:

    IO.inspect(Enum.group_by(zipped, fn row -> row[:year] end))
    
    %{
      2018 => [
        [
          tour_id: 1,
          name: "asia",
          year: 2018,
          num_shows: 4,
          num_cities: 4,
          num_festivals: 3,
          num_countries: 3
        ],
        [
          tour_id: 2,
          name: "north america",
          year: 2018,
          num_shows: 39,
          num_cities: 17,
          num_festivals: 2,
          num_countries: 2
        ],
        [
          tour_id: 3,
          name: "europe",
          year: 2018,
          num_shows: 13,
          num_cities: 12,
          num_festivals: 9,
          num_countries: 10
        ],
        [
          tour_id: 4,
          name: "las vegas shows",
          year: 2018,
          num_shows: 3,
          num_cities: 1,
          num_festivals: 0,
          num_countries: 1
        ]
      ],
      2022 => [
        [
          tour_id: 5,
          name: "tour",
          year: 2022,
          num_shows: 13,
          num_cities: 10,
          num_festivals: 4,
          num_countries: 2
        ]
      ]
    }