Suppose I have a simple data frame
test_df <- data.frame(c(0,0,1,0,0,1,1,1,1,1),c(1,0,0,0,0,0,0,0,0,0))
I want to find which value (0 or 1) occurs most often for each row. In my example that is 1 for the first vector (6 occurrences) and 0 for the second one (9 occurrences).
I started with:
> sapply(test_df,table)
  c.0..0..1..0..0..1..1..1..1..1. c.1..0..0..0..0..0..0..0..0..0.
0                               4                               9
1                               6                               1
So far this looks fine. Then:
> sapply((sapply(test_df,table)),max)
[1] 4 6 9 1
This is where I got lost: did I lose the associations (1 -> 6, 0 -> 9)? What I want returned is a vector with the "winner" for each vector: 1, 0, ...
1 for the first vector (6 occurrences)
0 for the second vector (9 occurrences)
...
This can be done in one apply statement. However, it's unclear whether you want the most frequent value for each row or for each column, so here are both (using @akrun's cleaner data set). Each returns a vector showing the 'winner' (either 1 or 0) for each row/column.
## Data
test_df <- data.frame(v1= c(0,0,1,0,0,1,1,1,1,1),
v2= c(1,0,0,0,0,0,0,0,0,0),
v3= c(1,0,0,0,0,0,0,0,0,1))
# v1 v2 v3
# 1 0 1 1
# 2 0 0 0
# 3 1 0 0
# 4 0 0 0
# 5 0 0 0
# 6 1 0 0
# 7 1 0 0
# 8 1 0 0
# 9 1 0 0
# 10 1 0 1
## Solution - For each row
# 1 if ones outnumber zeros in the row, otherwise 0 (a tie gives 0)
apply(test_df, 1, function(x) as.integer(sum(x == 1) > sum(x == 0)))
## Result
# [1] 1 0 0 0 0 0 0 0 0 1
## Solution - For each column
# 1 if ones outnumber zeros in the column, otherwise 0 (a tie gives 0)
apply(test_df, 2, function(x) as.integer(sum(x == 1) > sum(x == 0)))
## Result
# v1 v2 v3
# 1 0 0
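If you would rather stay close to the table() attempt from the question, here is a minimal sketch that keeps the association between each value and its count. It assumes the same test_df as above; most_common is a hypothetical helper name introduced only for this illustration. Unlike the comparison above, it is not limited to 0/1 data, and on a tie which.max() returns the first (smallest) value.

```r
## Data (same as above)
test_df <- data.frame(v1 = c(0,0,1,0,0,1,1,1,1,1),
                      v2 = c(1,0,0,0,0,0,0,0,0,0),
                      v3 = c(1,0,0,0,0,0,0,0,0,1))

# Hypothetical helper: return the most frequent value in x.
# table() names each count by the value it belongs to, and which.max()
# keeps that name, so the value -> count association is not lost.
most_common <- function(x) {
  counts <- table(x)
  as.numeric(names(which.max(counts)))
}

## For each column
winner_by_col <- sapply(test_df, most_common)
winner_by_col
# v1 v2 v3
#  1  0  0

## For each row
winner_by_row <- apply(test_df, 1, most_common)
unname(winner_by_row)
# [1] 1 0 0 0 0 0 0 0 0 1
```

The original sapply(sapply(test_df, table), max) lost the association because the inner call returns a matrix of counts, and the outer sapply then walks over its four cells one at a time.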