All combinations of the rows in two data frames


I have the following two data frames:

using DataFrames
a = DataFrame(x1=["a";"b";"c"], y1=1:3)
b = DataFrame(x2=["d";"e"], y2=4:5)

How can I create a data frame c that contains every possible combination of the rows in a and b, i.e.

6×4 DataFrame
 Row │ x1      x2      y1     y2    
     │ String  String  Int64  Int64 
─────┼──────────────────────────────
   1 │ a       d           1      4
   2 │ b       d           2      4
   3 │ c       d           3      4
   4 │ a       e           1      5
   5 │ b       e           2      5
   6 │ c       e           3      5

There surely must be something more elegant than

hcat(rename!(DataFrame(Iterators.product(a.x1, b.x2)), [:x1, :x2]), rename!(DataFrame(Iterators.product(a.y1, b.y2)), [:y1, :y2]))

After all, this becomes quite impractical with larger and more complex data frames.


Solution

  • You're looking for crossjoin: https://dataframes.juliadata.org/stable/lib/functions/#DataAPI.crossjoin

    julia> crossjoin(a,b)
    6×4 DataFrame
     Row │ x1      y1     x2      y2
         │ String  Int64  String  Int64
    ─────┼──────────────────────────────
       1 │ a           1  d           4
       2 │ a           1  e           5
       3 │ b           2  d           4
       4 │ b           2  e           5
       5 │ c           3  d           4
       6 │ c           3  e           5
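
The crossjoin(a, b) result contains the same six combinations as the table in the question, but with the rows and columns in a different order. If the exact layout matters, the following is a minimal sketch of one way to get it, assuming the data frames a and b from the question: swap the arguments so that the rows of a cycle fastest, then reorder the columns with select. The name c is taken from the question.

```julia
using DataFrames

a = DataFrame(x1=["a", "b", "c"], y1=1:3)
b = DataFrame(x2=["d", "e"], y2=4:5)

# crossjoin varies the rows of its first argument slowest (see the output
# above), so passing b first makes the rows of a cycle fastest.
# select then arranges the columns as x1, x2, y1, y2.
c = select(crossjoin(b, a), :x1, :x2, :y1, :y2)
```

Sorting the crossjoin(a, b) result with sort(..., [:y2, :y1]) should give the same row order for this data, but it depends on the values being sortable in the desired order, whereas swapping the arguments does not.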