r plot histogram

histogram with multiple binary variables


I have a dataset where columns represent variables and rows represent respondents. Five binary variables indicate the region the respondent comes from (Central, South, North, East, Ovest). Central is 1 if the individual lives there and 0 otherwise; the same holds for the other four dummies.

I want to draw a bar plot (or histogram) with the absolute frequencies on the y axis and two groups, 0 and 1, on the x axis. The 0 group should contain one bar per region, showing how many respondents have a 0 for Central, how many have a 0 for South, and so on. The 1 group should show the same for the value 1.

Like in the figure


Solution

  • You will get a faster response if you provide a reproducible data set. From your description, your data seem to be something like this:

    set.seed(42)
    x <- sample.int(5, 15, replace=TRUE)
    dta <- matrix(0, 15, 5)
    dta[cbind(1:15, x)] <- 1
    colnames(dta) <- c("Central", "South", "North", "East", "Ovest")
    head(dta)
    #      Central South North East Ovest
    # [1,]       1     0     0    0     0
    # [2,]       0     0     0    0     1
    # [3,]       1     0     0    0     0
    # [4,]       1     0     0    0     0
    # [5,]       0     1     0    0     0
    # [6,]       0     0     0    1     0
    

    To draw a pair of bars for each region, count the 1s in each column and take the 0s as the remainder:

    freq <- colSums(dta)
    bars <- rbind(Present=freq, Absent=nrow(dta) - freq)
    bars
    #         Central South North East Ovest
    # Present       5     5     0    3     2
    # Absent       10    10    15   12    13
    barplot(bars, beside=TRUE, legend=TRUE)
    

    [Figure: the resulting bar plot]
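
The plot above groups the bars by region, whereas the question asks for two groups on the x axis (0 and 1) with one bar per region inside each. A minimal sketch of that layout, assuming the same simulated matrix as in the answer: `barplot()` with `beside=TRUE` draws one group per column of its input, so transposing the count table swaps the grouping. The names `counts` and `by_value` and the axis labels are illustrative choices, not part of the original answer.

```r
# Rebuild the example data with the same seed and steps as in the answer.
set.seed(42)
x <- sample.int(5, 15, replace = TRUE)
dta <- matrix(0, 15, 5)
dta[cbind(1:15, x)] <- 1
colnames(dta) <- c("Central", "South", "North", "East", "Ovest")

# Counts per region: row "0" holds the number of 0s, row "1" the number of 1s.
counts <- rbind("0" = nrow(dta) - colSums(dta), "1" = colSums(dta))

# Transpose so that each column is a dummy value (0 or 1) and each row a region.
# With beside = TRUE, barplot() draws one group per column and one bar per row.
by_value <- t(counts)
barplot(by_value, beside = TRUE, legend.text = TRUE,
        xlab = "Value of the dummy", ylab = "Absolute frequency")
```

With real data stored in a data frame, the same idea should work after selecting only the five region columns and converting them with `as.matrix()`.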