
plyr R empty inner join


I have a large data table X (504x9) and a smaller data frame bzShare (323x4), both with the columns top.sector and sizeClass. I want to join a value from bzShare into X so that the dimensions of X become (504x10). If I am right, an inner join selects only rows with matching keys in both x and y, but I always get zero rows :-(

> dim(X)
[1] 504   9
> names(X)
[1] "sizeClass" "top.sector"    "year" "period" "somevar"
[6] "sumTest"   "sumTestTotal"  "AN"   "share"                    
> names(bzShare)
[1] "top.sector" "sizeClass"  "bzShare"   
> join(X,bzShare,type="inner",by=c("top.sector","sizeClass"))
NULL data table

Why don't I get a (504x10) data frame?


Solution

  • Just because two data.frames or matrices share the same column names doesn't mean that they will join/merge nicely. Among other things, they may not have any key values in common, and an inner join then returns zero rows, which is the result you are describing.

    Also check that your bzShare object is not empty, even though it has valid column names (the output of dim(bzShare) is missing from your question).

    Start with:

    count(X$top.sector %in% bzShare$top.sector)
    count(X$sizeClass %in% bzShare$sizeClass)
    

    and see whether each set intersection is non-empty, that is, whether any TRUE values are counted.
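The per-column checks above can be extended to the combined key, since a join on two columns needs both values to match in the same row. The following is a minimal sketch in base R, not the asker's actual data: the toy values, the trailing space in "A " and the helper vectors key_x, key_y and matched are assumptions made purely for illustration, and merge() stands in for plyr's join().

```r
# Toy stand-ins for X and bzShare (hypothetical values, not the asker's data).
X <- data.frame(
  top.sector = c("A", "A", "B", "C"),
  sizeClass  = c("small", "large", "small", "large"),
  somevar    = 1:4,
  stringsAsFactors = FALSE
)
bzShare <- data.frame(
  top.sector = c("A ", "B", "C"),   # assumed defect: trailing space in "A "
  sizeClass  = c("small", "small", "large"),
  bzShare    = c(0.2, 0.5, 0.3),
  stringsAsFactors = FALSE
)

# 1. Is the lookup table empty?
dim(bzShare)

# 2. Per-column overlap, as suggested in the answer (base R table() here).
table(X$top.sector %in% bzShare$top.sector)
table(X$sizeClass %in% bzShare$sizeClass)

# 3. Overlap of the combined key: both columns must match in the same row.
key_x <- paste(X$top.sector, X$sizeClass, sep = "|")
key_y <- paste(bzShare$top.sector, bzShare$sizeClass, sep = "|")
matched <- key_x %in% key_y
sum(matched)                                   # rows of X that will find a partner
X[!matched, c("top.sector", "sizeClass")]      # rows of X that will not

# 4. One possible cause is stray whitespace in character keys; trim and retry.
bzShare$top.sector <- trimws(bzShare$top.sector)
joined <- merge(X, bzShare, by = c("top.sector", "sizeClass"))  # inner join
dim(joined)

# A left join keeps every row of X and fills unmatched rows with NA.
joined_left <- merge(X, bzShare, by = c("top.sector", "sizeClass"), all.x = TRUE)
dim(joined_left)
```

If the combined-key check reports zero matches while each column overlaps on its own, the key values exist but never occur together in the same row. If the goal is to keep all 504 rows of X, a left join (type="left" in plyr's join, all.x = TRUE in merge) is probably closer to what is wanted than an inner join.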