r, plot, categorical-data, lme4, mixed-models

A plot for cross tabulations in R


I'm trying to check whether each unique childid occurs in only one unique schoolid. I have plotted the cross tabulation, but the plot is very busy and hard to read.

Is there a better way (by plotting or otherwise) to achieve my goal in R?

(P.S. As an alternative, I was told to fit a mixed model and plot the random-effects design matrix, but as shown below, the resulting image is very small and unclear.)

dd <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/3.csv')
cross_tab <- xtabs(~ schoolid + childid, dd)

plot(cross_tab)

library(lme4)

m31 <- lmer(math~year+(1|schoolid/childid), data = dd)

image(getME(m31,"Zt"))



Solution

  • You can check this without a plot.

    With base R. Build a contingency table of schoolid by childid, then count, for each childid, how many schoolids it has at least one observation in. Any count greater than 1 means that child occurs in more than one school. An empty result, as here, means no child occurs in more than one school.

    x <- colSums(table(dd$schoolid, dd$childid) > 0) 
    x[x>1]
    #> named numeric(0)
    

    With dplyr. Keep the distinct schoolid/childid pairs, count how many times each childid appears among them, and keep only those that appear more than once.

    library(dplyr)
    
    dd %>% distinct(schoolid, childid) %>% count(childid) %>% filter(n>1)
    #> [1] childid n      
    #> <0 rows> (or 0-length row.names)
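
Both checks above return an empty result for this data, so it can help to try the same idea on data where the answer is known. The following is a minimal base R sketch, not part of the original answer. The `toy` data frame and its ids are hypothetical, made up so that one child ("c3") deliberately appears in two schools. It uses `tapply()` instead of `table()`, which is a swapped-in but equivalent way to count distinct schools per child.

```r
# Hypothetical toy data: child "c3" appears in both school "s1" and "s2"
toy <- data.frame(
  schoolid = c("s1", "s1", "s1", "s2", "s2", "s2"),
  childid  = c("c1", "c2", "c3", "c3", "c4", "c4")
)

# Number of distinct schools each child appears in
n_schools <- tapply(toy$schoolid, toy$childid, function(s) length(unique(s)))

# Children that occur in more than one school
offenders <- names(n_schools)[n_schools > 1]
offenders
#> [1] "c3"

# The distinct school/child pairs for those children, for inspection
unique(toy[toy$childid %in% offenders, c("schoolid", "childid")])

# TRUE only if every child belongs to exactly one school
fully_nested <- all(n_schools == 1)
fully_nested
#> [1] FALSE
```

Replacing `toy` with `dd` should give an empty `offenders` vector and `fully_nested` equal to `TRUE` if the results shown above hold.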