I have a large CSV file, e.g.:
> data <- read.csv("data2006.csv", header = TRUE, sep = ";")
> data
cntry aa ab ac ad
1 AT 3 4 3 2
2 AT 1 2 3 2
3 AT 2 3 3 4
I want to demean this data, i.e. subtract the mean of each row from every element of that row. This should apply only to the columns with numeric values, i.e. the columns 'aa', 'ab', 'ac' and 'ad', while the column 'cntry' is left unchanged. So the desired outcome looks like:
cntry aa ab ac ad
1 AT 0 1 0 -1
2 AT -1 0 1 0
3 AT -1 0 0 1
In the article on mean-centering (http://www.gastonsanchez.com/visually-enforced/how-to/2014/01/15/Center-data-in-R/) I've found that one can use rowMeans for that:
center_rowmeans <- function(x) {
    xcenter = rowMeans(x)
    x - rep(xcenter, rep.int(nrow(x), ncol(x)))
}
but I cannot adapt this code to process my data. Could someone help?
All you are really missing is a way to identify the class of each column and to index the data frame with that information:
anatasia <- read.table(text=" cntry aa ab ac ad
1 AT 3 4 3 2
2 AT 1 2 3 2
3 AT 2 3 3 4 ")
# Logical index of the numeric columns (is.numeric is TRUE for integer and double)
num <- sapply(anatasia, is.numeric)
rmeans <- rowMeans(anatasia[, num])
# Keep the non-numeric columns as they are; subtract each row's mean from the rest
dat <- cbind(anatasia[, !num, drop = FALSE], anatasia[, num] - rmeans)
dat
cntry aa ab ac ad
1 AT 0 1 0 -1
2 AT -1 0 1 0
3 AT -1 0 0 1
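If you need to do this for more than one file, the same idea can be wrapped in a small function. This is only a sketch under a couple of assumptions: the name demean_rows is made up for illustration, and every numeric column is assumed to take part in the row mean. It assigns back into the numeric columns, so the column order and names stay as they were.

```r
# Hypothetical helper (the name demean_rows is not from the code above):
# subtract each row's mean from the numeric columns, leave other columns alone.
demean_rows <- function(df) {
  num <- vapply(df, is.numeric, logical(1))  # TRUE for integer and double columns
  df[num] <- df[num] - rowMeans(df[num])     # row means are recycled down each column
  df
}

# Same toy data as above; the first field of each data row becomes the row name
anatasia <- read.table(text = "cntry aa ab ac ad
1 AT 3 4 3 2
2 AT 1 2 3 2
3 AT 2 3 3 4", header = TRUE)

demean_rows(anatasia)
```

With your own file this would be called as demean_rows(data), where data is the result of your read.csv call.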