I have a matrix, which represents mobility between various jobs:
jobnames <- c("job 1","job 2","job 3","job 4","job 5","job 6","job 7")
jobdat <- matrix(c(
5, 5, 5, 0, 0, 5, 5,
5, 5, 2, 5, 5, 1, 5,
1, 5, 5, 5, 0, 0, 1,
1, 0, 5, 5, 8, 0, 1,
0, 5, 0, 0, 5, 5, 1,
0, 0, 5, 5, 0, 5, 5,
0, 1, 0, 0, 5, 1, 5
),
nrow = 7, ncol = 7, byrow = TRUE,
dimnames = list(jobnames,jobnames
))
This is treated as a directed, weighted adjacency matrix in a social network analysis. The network is directed from rows to columns, so mobility is defined as moving from a row job to a column job. The diagonal is relevant, since it is possible to move to the same job in another firm.
I need to collapse this matrix according to a predefined list containing the indexes of the jobs that should be combined:
group.list <- list(grp1=c(1,2) ,grp2 =c(3,4))
Now, since it is an adjacency matrix, this is a bit different from the other answers about how to collapse a matrix that I've found here and elsewhere. The collapse has to be simultaneous on both the rows and the columns, and some jobs aren't grouped at all. So the result in this example should look like this:
group.jobnames <- c("job 1 and 2","job 3 and 4","job 5","job 6","job 7")
group.jobdat <- matrix(c(
20,12,5,6,10,
7,17,8,0,2,
5,0,5,5,1,
0,10,0,5,5,
1,0,5,1,5
),
nrow = 5, ncol = 5, byrow = TRUE,
dimnames = list(group.jobnames,group.jobnames
))
This example groups the first two jobs and then the next two, but in my actual data it could be any combination of (indexes of) jobs, and any number of jobs in each group. So jobs [1,7] could be one group and jobs [2,3,6] another, while jobs 4 and 5 are not grouped. Or any other combination.
Thank you for your time.
I believe there are some typos in the intended output, and the group.list definition. If I am correct in my interpretation, here is a solution.
Here is a new group.list to conform with the names of the desired output. In this version, job 2 is mapped to group 1 and job 4 is mapped to group 3, which conforms with the names in group.jobnames.
group.list <- list(grp1=c(1, 3), grp2=c(2, 4))
Given this list, construct a grouping vector:
# initial grouping
groups <- seq_len(ncol(jobdat))
# map elements of second list item to values of first list item
groups[match(group.list[["grp2"]], groups)] <- group.list[["grp1"]]
groups
[1] 1 1 3 3 5 6 7
So now jobs 1 and 2 share a group, as do jobs 3 and 4. Next, we use rowsum and a couple of transposes to calculate the output.
myMat <- t(rowsum(t(rowsum(jobdat, groups)), groups))
# add the group names
dimnames(myMat) <- list(group.jobnames,group.jobnames)
myMat
            job 1 and 2 job 3 and 4 job 5 job 6 job 7
job 1 and 2          20          12     5     6    10
job 3 and 4           7          20     8     0     2
job 5                 5           0     5     5     1
job 6                 0          10     0     5     5
job 7                 1           0     5     1     5
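As a cross-check on the double rowsum, the same collapse can be written as a matrix product with a 0/1 membership matrix. This is only a sketch of an alternative formulation, not part of the original answer, and the names G, by_rowsum, and by_matmul are made up for illustration.

```r
# Example data from the question (rows = origin job, columns = destination job)
jobnames <- paste("job", 1:7)
jobdat <- matrix(c(
  5, 5, 5, 0, 0, 5, 5,
  5, 5, 2, 5, 5, 1, 5,
  1, 5, 5, 5, 0, 0, 1,
  1, 0, 5, 5, 8, 0, 1,
  0, 5, 0, 0, 5, 5, 1,
  0, 0, 5, 5, 0, 5, 5,
  0, 1, 0, 0, 5, 1, 5
), nrow = 7, ncol = 7, byrow = TRUE, dimnames = list(jobnames, jobnames))

# Grouping vector: jobs 1 and 2 share label 1, jobs 3 and 4 share label 3
groups <- c(1, 1, 3, 3, 5, 6, 7)

# Collapse rows, transpose, collapse the former columns, transpose back
by_rowsum <- t(rowsum(t(rowsum(jobdat, groups)), groups))

# Hypothetical alternative: G[i, k] is 1 when job i belongs to group k
G <- outer(groups, sort(unique(groups)), "==") * 1
by_matmul <- t(G) %*% jobdat %*% G

# Compare values only, since the two results carry different dimnames
all_equal <- all(unname(by_rowsum) == unname(by_matmul))
# Collapsing only moves flows between cells, so the grand total is unchanged
total_kept <- sum(by_rowsum) == sum(jobdat)
print(c(all_equal = all_equal, total_kept = total_kept))
```

Both checks should come out TRUE. The product form makes the simultaneous row and column collapse explicit: cell [a, b] of the result is the sum of every jobdat[i, j] with job i in group a and job j in group b.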
In response to the OP's comments below, the grouping was intended to be within list elements, rather than between corresponding positions of the list elements as I had originally interpreted. To form this grouping, repeatedly feeding replace to Reduce will accomplish the task.
With group.list as in the question,
group.list <- list(grp1=c(1, 2), grp2=c(3, 4))
# start again from one group per job
groups <- seq_len(ncol(jobdat))
groups <- Reduce(function(x, y) replace(x, x[x %in% y], min(y)),
                 c(list(groups), unname(group.list)))
groups
[1] 1 1 3 3 5 6 7
Here, replace takes the current grouping, finds the elements that appear in one of the vectors in group.list, and replaces them with the minimum value of that vector. Reduce applies this operation once per vector in group.list, starting from the original grouping and carrying the modified grouping into each iteration.
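To see what Reduce does at each step, the sketch below reruns the same function with accumulate = TRUE on the other grouping mentioned in the question (jobs 1 and 7 together, jobs 2, 3 and 6 together). The variable name steps is made up for illustration.

```r
# One group per job to start with (7 jobs, as in the question)
groups <- seq_len(7)

# The alternative grouping described in the question
group.list <- list(grp1 = c(1, 7), grp2 = c(2, 3, 6))

# accumulate = TRUE keeps every intermediate grouping, not just the last one
steps <- Reduce(function(x, y) replace(x, x[x %in% y], min(y)),
                c(list(groups), unname(group.list)),
                accumulate = TRUE)

steps[[1]]  # initial grouping: 1 2 3 4 5 6 7
steps[[2]]  # after grp1:       1 2 3 4 5 6 1
steps[[3]]  # after grp2:       1 2 2 4 5 2 1
```

Note that x[x %in% y] returns values that are then used as positions. That works here because the grouping starts as 1, 2, ..., n and the groups do not overlap; with overlapping groups this assumption would no longer hold.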
With this result, we use the same rowsum and transpose steps as above to get:
myMat
            job 1 and 2 job 3 and 4 job 5 job 6 job 7
job 1 and 2          20          12     5     6    10
job 3 and 4           7          20     8     0     2
job 5                 5           0     5     5     1
job 6                 0          10     0     5     5
job 7                 1           0     5     1     5
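Since the real data can have any number of groups of any size, the steps can be wrapped in a small helper. This is a minimal sketch, assuming the matrix is square, has row names, and that the groups in group.list do not overlap; collapse_adjacency and the " + " label format are hypothetical choices, not something from the question.

```r
# Hypothetical helper: collapse rows and columns of a square adjacency matrix
collapse_adjacency <- function(mat, group.list) {
  stopifnot(nrow(mat) == ncol(mat), !is.null(rownames(mat)))
  # Start with one group per job, then give each listed group a shared label
  groups <- seq_len(ncol(mat))
  for (members in group.list) {
    groups[members] <- min(members)
  }
  # Sum rows by group, transpose, sum the former columns, transpose back
  out <- t(rowsum(t(rowsum(mat, groups)), groups))
  # Build labels by pasting member names, in the same sorted order rowsum uses
  labels <- unname(vapply(split(rownames(mat), groups),
                          paste, character(1), collapse = " + "))
  dimnames(out) <- list(labels, labels)
  out
}

# Example data from the question
jobnames <- paste("job", 1:7)
jobdat <- matrix(c(
  5, 5, 5, 0, 0, 5, 5,
  5, 5, 2, 5, 5, 1, 5,
  1, 5, 5, 5, 0, 0, 1,
  1, 0, 5, 5, 8, 0, 1,
  0, 5, 0, 0, 5, 5, 1,
  0, 0, 5, 5, 0, 5, 5,
  0, 1, 0, 0, 5, 1, 5
), nrow = 7, ncol = 7, byrow = TRUE, dimnames = list(jobnames, jobnames))

# Grouping from the question, and the alternative grouping it mentions
res1 <- collapse_adjacency(jobdat, list(grp1 = c(1, 2), grp2 = c(3, 4)))
res2 <- collapse_adjacency(jobdat, list(grp1 = c(1, 7), grp2 = c(2, 3, 6)))
res1
res2
```

Here res1 holds the same values as myMat above, and res2 is a 4 x 4 matrix whose rows are labelled "job 1 + job 7", "job 2 + job 3 + job 6", "job 4" and "job 5".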