Search code examples
dataframejoinjuliatranslatecrosswalk

translate Dataframe using crosswalk in julia


I have a very large dataframe (original_df) with columns of codes

14 15
21 22
18 16

And a second dataframe (crosswalk) which maps 'old_codes' to 'new_codes'

14 104
15 105
16 106
18 108
21 201
22 202

Of course, the resultant df (resultant_df) that I would like would have values:

104 105
201 202
108 106

I am aware of two ways to accomplish this. First, I could iterate through each code in original_df, find the code in crosswalk, then rewrite the corresponding cell in original_df with the translated code from crosswalk. The faster and more natural option would be to leftjoin() each column of original_df on 'old_codes'. Unfortunately, it seems I have to do this separately for each column, and then delete each column after its conversion column has been created -- this feels unnecessarily complicated. Is there a simpler way to convert all of original_df at once using the crosswalk?


Solution

  • You can do the following (I am using column numbers as you have not provided column names):

    d = Dict(crosswalk[!, 1] .=> crosswalk[!, 2])
    resultant_df = select(original_df, [i => ByRow(x -> d[x]) for i in 1:ncol(original_df)], renamecols=false)