
How to change the color key using Gadfly's `Scale.color_discrete_manual` in Julia?


I have imported a DataFrame as below:

julia> df
100×3 DataFrames.DataFrame
│ Row │ ex1     │ ex2     │ admit │
├─────┼─────────┼─────────┼───────┤
│ 1   │ 34.6237 │ 78.0247 │ 0     │
│ 2   │ 30.2867 │ 43.895  │ 0     │
│ 3   │ 35.8474 │ 72.9022 │ 0     │
│ 4   │ 60.1826 │ 86.3086 │ 1     │
│ 5   │ 79.0327 │ 75.3444 │ 1     │
│ 6   │ 45.0833 │ 56.3164 │ 0     │
│ 7   │ 61.1067 │ 96.5114 │ 1     │
│ 8   │ 75.0247 │ 46.554  │ 1     │
⋮
│ 92  │ 90.4486 │ 87.5088 │ 1     │
│ 93  │ 55.4822 │ 35.5707 │ 0     │
│ 94  │ 74.4927 │ 84.8451 │ 1     │
│ 95  │ 89.8458 │ 45.3583 │ 1     │
│ 96  │ 83.4892 │ 48.3803 │ 1     │
│ 97  │ 42.2617 │ 87.1039 │ 1     │
│ 98  │ 99.315  │ 68.7754 │ 1     │
│ 99  │ 55.34   │ 64.9319 │ 1     │
│ 100 │ 74.7759 │ 89.5298 │ 1     │

I want to plot this DataFrame with ex1 on the x-axis and ex2 on the y-axis. In addition, the rows are categorized by the third column, :admit, so I want to color the points according to their :admit value.

I used Scale.color_discrete_manual to set the colors, and I tried Guide.manual_color_key to change the color key legend. However, Gadfly then drew two color keys.

p = plot(df, x = :ex1, y = :ex2, color=:admit,
         Scale.color_discrete_manual(colorant"deep sky blue",
                                     colorant"light pink"),
         Guide.manual_color_key("Legend", ["Failure", "Success"],
                                ["deep sky blue", "light pink"]))


My question is: how can I change the color key legend when using Scale.color_discrete_manual?

One related question is "Remove automatically generated color key in Gadfly plot", where the best answer suggests using two layers plus Guide.manual_color_key. Is there a better solution that works with a DataFrame and Scale.color_discrete_manual?


Solution

  • Currently, based on the discussion, it looks like users cannot customize the color legend generated by the color aesthetic or by Scale.color_discrete_manual.

    From the same source, Mattriks suggested using an extra column as a "label". Although this is not a "natural" way to change the color key, it works pretty well.

    Therefore, for the same dataset as in the question, we add one more column (note that the score columns are named exam1 and exam2 here, rather than ex1 and ex2):

    df[:admission] = map(df[:admit]) do x
        if x == 1
            return "Success"
        else
            return "Failure"
        end
    end
    
    julia> df
    100×4 DataFrames.DataFrame
    │ Row │ exam1   │ exam2   │ admit │ admission │
    ├─────┼─────────┼─────────┼───────┼───────────┤
    │ 1   │ 34.6237 │ 78.0247 │ 0     │ "Failure" │
    │ 2   │ 30.2867 │ 43.895  │ 0     │ "Failure" │
    │ 3   │ 35.8474 │ 72.9022 │ 0     │ "Failure" │
    │ 4   │ 60.1826 │ 86.3086 │ 1     │ "Success" │
    │ 5   │ 79.0327 │ 75.3444 │ 1     │ "Success" │
    │ 6   │ 45.0833 │ 56.3164 │ 0     │ "Failure" │
    │ 7   │ 61.1067 │ 96.5114 │ 1     │ "Success" │
    │ 8   │ 75.0247 │ 46.554  │ 1     │ "Success" │
    ⋮
    │ 92  │ 90.4486 │ 87.5088 │ 1     │ "Success" │
    │ 93  │ 55.4822 │ 35.5707 │ 0     │ "Failure" │
    │ 94  │ 74.4927 │ 84.8451 │ 1     │ "Success" │
    │ 95  │ 89.8458 │ 45.3583 │ 1     │ "Success" │
    │ 96  │ 83.4892 │ 48.3803 │ 1     │ "Success" │
    │ 97  │ 42.2617 │ 87.1039 │ 1     │ "Success" │
    │ 98  │ 99.315  │ 68.7754 │ 1     │ "Success" │
    │ 99  │ 55.34   │ 64.9319 │ 1     │ "Success" │
    │ 100 │ 74.7759 │ 89.5298 │ 1     │ "Success" │
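
    The same label column can also be built without an explicit if/else. A minimal sketch using a broadcasted `ifelse` on a plain vector; the sample values are just the first eight :admit entries shown above, and `admit` and `admission` are illustrative variable names, not part of the original answer:

    ```julia
    # First eight values of the :admit column from the table above.
    admit = [0, 0, 0, 1, 1, 0, 1, 1]

    # Broadcast over the vector: 1 becomes "Success", anything else "Failure",
    # mirroring the if/else inside the `map` version.
    admission = ifelse.(admit .== 1, "Success", "Failure")
    ```

    Assigning the result to a new column, as in `df[:admission] = admission`, should then give the same table as the `map` version.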
    

    Then color the data by this new column with Scale.color_discrete_manual:

    plot(df, x = :exam1, y = :exam2, color = :admission,
         Scale.color_discrete_manual(colorant"deep sky blue",
                                     colorant"light pink"))
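
    Which label receives which color appears to follow the order of the levels in the data, so it can help to pin the mapping explicitly. A hedged sketch, assuming a recent Gadfly release in which `Scale.color_discrete_manual` accepts a `levels` keyword and `Guide.colorkey` accepts a `title` keyword, and a current DataFrames.jl for the `df.admission = ...` column syntax. The small DataFrame is only a stand-in built from the first eight rows above:

    ```julia
    using DataFrames, Gadfly

    # Stand-in for the 100-row dataset: the first eight rows shown above.
    df = DataFrame(exam1 = [34.6237, 30.2867, 35.8474, 60.1826, 79.0327, 45.0833, 61.1067, 75.0247],
                   exam2 = [78.0247, 43.895, 72.9022, 86.3086, 75.3444, 56.3164, 96.5114, 46.554],
                   admit = [0, 0, 0, 1, 1, 0, 1, 1])

    # Label column that drives both the colors and the legend entries.
    df.admission = ifelse.(df.admit .== 1, "Success", "Failure")

    # `levels` (assumed keyword, see above) fixes the label-to-color order:
    # "Failure" gets the first color, "Success" the second.
    p = plot(df, x = :exam1, y = :exam2, color = :admission,
             Scale.color_discrete_manual(colorant"deep sky blue",
                                         colorant"light pink";
                                         levels = ["Failure", "Success"]),
             Guide.colorkey(title = "Legend"))
    ```

    Because the legend is generated from the :admission column itself, only one color key is drawn, and no Guide.manual_color_key is needed.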
    
