Add labels to calculated cross product matrix in R


I have a table that I created as follows:

A_ID<-c(111,116,111,112,112,114,116,113,114,111,114,116,115,116,116)
U_ID<-c(221,221,222,222,223,223,223,224,224,225,225,225,226,226,226)

df_u_a<-data.frame(U_ID,A_ID)

myTab <- table(df_u_a) # count
myTab[] <- as.integer(as.logical(myTab)) # binary map

I then created a cross-product matrix from it as follows:

CProd.Matrix <- crossprod(myTab[] %*% diag(1 / sqrt(colSums(myTab[]^2))))

This produced the following output:

> myTab[]
     A_ID
U_ID  111 112 113 114 115 116
  221   1   0   0   0   0   1
  222   1   1   0   0   0   0
  223   0   1   0   1   0   1
  224   0   0   1   1   0   0
  225   1   0   0   1   0   1
  226   0   0   0   0   1   1
> CProd.Matrix
          [,1]      [,2]      [,3]      [,4] [,5]      [,6]
[1,] 1.0000000 0.4082483 0.0000000 0.3333333  0.0 0.5773503
[2,] 0.4082483 1.0000000 0.0000000 0.4082483  0.0 0.3535534
[3,] 0.0000000 0.0000000 1.0000000 0.5773503  0.0 0.0000000
[4,] 0.3333333 0.4082483 0.5773503 1.0000000  0.0 0.5773503
[5,] 0.0000000 0.0000000 0.0000000 0.0000000  1.0 0.5000000
[6,] 0.5773503 0.3535534 0.0000000 0.5773503  0.5 1.0000000

I don't know how to carry the column headers of myTab[] over to CProd.Matrix, for example like this:

          111       112       113       114  115       116
111 1.0000000 0.4082483 0.0000000 0.3333333  0.0 0.5773503
112 0.4082483 1.0000000 0.0000000 0.4082483  0.0 0.3535534
113 0.0000000 0.0000000 1.0000000 0.5773503  0.0 0.0000000
114 0.3333333 0.4082483 0.5773503 1.0000000  0.0 0.5773503
115 0.0000000 0.0000000 0.0000000 0.0000000  1.0 0.5000000
116 0.5773503 0.3535534 0.0000000 0.5773503  0.5 1.0000000

What I want to achieve is

1- to be able to query for a specific ID such as 111 and get its values. At the moment I can only query by row/column position, as below, and cannot query by the ID 111 itself:

> CProd.Matrix [1,]
[1] 1.0000000 0.4082483 0.0000000 0.3333333 0.0000000 0.5773503

2- see the corresponding headers for each number, like

          111       112       113       114  115       116
111 1.0000000 0.4082483 0.0000000 0.3333333  0.0 0.5773503

3- sort the values like

          111       116       112       114       113  115
111 1.0000000 0.5773503 0.4082483 0.3333333 0.0000000  0.0

Any ideas on how to achieve any of the above?


Solution

  • We can use dimnames to assign the row names and column names. Because CProd.Matrix is the cross product of the columns of 'myTab', only the column names of 'myTab' are needed, and they are used for both dimensions

    dimnames(CProd.Matrix) <- rep(list(colnames(myTab)), 2)
    CProd.Matrix
    #          111       112       113       114 115       116
    #111 1.0000000 0.4082483 0.0000000 0.3333333 0.0 0.5773503
    #112 0.4082483 1.0000000 0.0000000 0.4082483 0.0 0.3535534
    #113 0.0000000 0.0000000 1.0000000 0.5773503 0.0 0.0000000
    #114 0.3333333 0.4082483 0.5773503 1.0000000 0.0 0.5773503
    #115 0.0000000 0.0000000 0.0000000 0.0000000 1.0 0.5000000
    #116 0.5773503 0.3535534 0.0000000 0.5773503 0.5 1.0000000
    

    Now the matrix can be subset by row name. The name must be given as a character string, and drop = FALSE keeps the result as a one-row matrix

    CProd.Matrix["111", , drop = FALSE]
    #    111       112 113       114 115       116
    #111   1 0.4082483   0 0.3333333   0 0.5773503
    

    To sort the values after subsetting:

    t(apply(CProd.Matrix["111", , drop = FALSE], 1, sort, decreasing = TRUE))
    #    111       116       112       114 113 115
    #111   1 0.5773503 0.4082483 0.3333333   0   0
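
    If a one-row matrix is not required, the lookup and the sort can probably be combined more simply, since sorting a named vector keeps its names. The following is a minimal sketch under that assumption; it rebuilds the data from the question so it runs on its own, and get_similar is a hypothetical helper name introduced here for illustration, not something from the original answer.

```r
# Rebuild the example data from the question
A_ID <- c(111,116,111,112,112,114,116,113,114,111,114,116,115,116,116)
U_ID <- c(221,221,222,222,223,223,223,224,224,225,225,225,226,226,226)
df_u_a <- data.frame(U_ID, A_ID)

myTab <- table(df_u_a)                      # count
myTab[] <- as.integer(as.logical(myTab))    # binary map

# Cross product of the column-normalised table, labelled with the A_ID values
CProd.Matrix <- crossprod(myTab[] %*% diag(1 / sqrt(colSums(myTab[]^2))))
dimnames(CProd.Matrix) <- rep(list(colnames(myTab)), 2)

# Hypothetical helper: return the row for one ID as a named vector,
# sorted from the largest value to the smallest
get_similar <- function(m, id) {
  # dimnames are character, so a numeric ID is converted before indexing
  sort(m[as.character(id), ], decreasing = TRUE)
}

res <- get_similar(CProd.Matrix, 111)
print(res)

# A single cell can be read by name as well
CProd.Matrix["111", "116"]
```

    Note that indexing with the number 111 instead of the string "111" would be treated as a position and fail with a subscript error on this 6 x 6 matrix, which is why the helper converts the ID with as.character.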