Conditional Histograms Using Lattice Package, Output Plots Incorrect


I'm using histogram from the lattice package to plot two histograms conditioning on a variable with two options: Male or Female.

histogram(~ raw$Housework_Tot_Min [(raw$Housework_Tot_Min != 0) & 
(raw$Housework_Tot_Min < 1000)] | raw$Gender)

Output of code: two histograms, minutes doing housework by gender

But, when I actually look at the data, these histograms are not correct. By plotting:

histogram(~ raw$Housework_Tot_Min [(raw$Housework_Tot_Min != 0) & 
(raw$Housework_Tot_Min < 1000) & (raw$Gender == "Female")])

and:

histogram(~ raw$Housework_Tot_Min [(raw$Housework_Tot_Min != 0) & 
(raw$Housework_Tot_Min < 1000) & (raw$Gender == "Male")])

I get two histograms again, but they look very different

Does anyone have insight on why these outputs don't match? I have a bunch more binary-type panels to plot, and having to do them separately really defeats the purpose of working with the lattice package!

I apologize if this betrays a fundamental misunderstanding of an easy concept; I'm still very much a beginner at R! Many thanks for the help.


Solution

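  • The problem turned out to be a mismatch created by the bracket subsetting: the exclusions shortened the housework vector, while raw$Gender kept its full length, so the minutes and the gender labels no longer lined up row by row. Instead of: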

    histogram(~ raw$Housework_Tot_Min [(raw$Housework_Tot_Min != 0) & 
    (raw$Housework_Tot_Min < 1000)] | raw$Gender)
    

    It should read:

    histogram(~ Housework_Tot_Min [(Housework_Tot_Min != 0) & (Housework_Tot_Min < 1000)] | 
            Gender [(Housework_Tot_Min != 0) & (Housework_Tot_Min < 1000)], data = raw,
          main = "Time Observed Housework by Gender",
          xlab = "Minutes spent",
          breaks = seq(from = 0, to = 400, by = 20))
    

    Note that the same exclusions are now applied to both the housework time and the gender variable, so the two stay the same length and each observation keeps its own gender label.

    The correct plot has been pasted below. Thanks again to all for the guidance.

    Updated Histogram
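
For anyone hitting the same mismatch, here is a minimal sketch of two ways to avoid repeating the bracket condition on every variable. The small `raw` data frame below is made-up stand-in data, not the data from the question, and the names `keep`, `clean`, `p1` and `p2` are only illustrative. It assumes the lattice package is installed (it ships with standard R distributions). The first option filters the data frame once before plotting; the second uses the `subset` argument that lattice's high-level functions accept, which is evaluated inside `data`.

```r
library(lattice)

# Hypothetical stand-in for the `raw` data frame from the question
raw <- data.frame(
  Housework_Tot_Min = c(0, 30, 45, 1200, 60, 0, 90, 15, 240, 1500),
  Gender = factor(c("Female", "Male", "Female", "Male", "Female",
                    "Male", "Female", "Male", "Female", "Male"))
)

# The exclusion from the question, written once
keep <- raw$Housework_Tot_Min != 0 & raw$Housework_Tot_Min < 1000

# Why the first attempt went wrong: the filtered vector is shorter
# than the unfiltered Gender column it was conditioned on
length(raw$Housework_Tot_Min[keep])
length(raw$Gender)

# Option 1: filter the whole data frame first, so every column
# keeps the same rows
clean <- raw[keep, ]
p1 <- histogram(~ Housework_Tot_Min | Gender, data = clean,
                main = "Time Observed Housework by Gender",
                xlab = "Minutes spent",
                breaks = seq(from = 0, to = 400, by = 20))

# Option 2: let lattice apply the filter to all variables via `subset`
p2 <- histogram(~ Housework_Tot_Min | Gender, data = raw,
                subset = Housework_Tot_Min != 0 & Housework_Tot_Min < 1000,
                main = "Time Observed Housework by Gender",
                xlab = "Minutes spent",
                breaks = seq(from = 0, to = 400, by = 20))

# At the console these objects draw when auto-printed;
# inside a script or function, call print(p1) or print(p2).
```

Either way the condition is written in one place, which should scale better when there are many more binary panels to plot. If the real data contains missing values in Housework_Tot_Min, the logical index in option 1 would need extra care (for example wrapping it in `which()`).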