I am trying to use dplyr::mutate() to return the integer value of the level of a factor column for each row. Here's what I have so far:
library(dplyr)

a <- tibble(group = factor(c(rep(c('group1', 'groupA', 'groupB'), 4), rep('groupC', 3))),
            name1 = factor(c(rep(c('gene1', 'gene2', 'geneA', 'geneB'), 3),
                             c('gene1', 'gene2', 'geneA'))),
            name2 = factor(c(rep(c('geneB', 'geneA', 'gene2', 'gene1'), 3),
                             c('geneB', 'geneA', 'gene2')))) %>%
  arrange(group)

a <- group_by(a, group) %>%
  mutate(n = row_number(),
         n_max = max(n),
         lev1 = which(levels(a$name1) == name1))
which causes the error message:
Error in mutate_impl(.data, dots) :
Column `lev1` must be length 4 (the group size) or one, not 2
but if I just run which(levels(a$name1) == 'gene2') I get the desired value 2.
What is causing this error and how do I get around it?
Are you after this?
group_by(a, group) %>%
  mutate(
    n = row_number(),
    n_max = max(n),
    lev1 = as.numeric(name1))
# A tibble: 15 x 6
# Groups: group [4]
#    group  name1 name2     n n_max  lev1
#    <fct>  <fct> <fct> <int> <dbl> <dbl>
#  1 group1 gene1 geneB     1    4.    1.
#  2 group1 geneB gene1     2    4.    4.
#  3 group1 geneA gene2     3    4.    3.
#  4 group1 gene2 geneA     4    4.    2.
#  5 groupA gene2 geneA     1    4.    2.
#  6 groupA gene1 geneB     2    4.    1.
#  7 groupA geneB gene1     3    4.    4.
#  8 groupA geneA gene2     4    4.    3.
#  9 groupB geneA gene2     1    4.    3.
# 10 groupB gene2 geneA     2    4.    2.
# 11 groupB gene1 geneB     3    4.    1.
# 12 groupB geneB gene1     4    4.    4.
# 13 groupC gene1 geneB     1    3.    1.
# 14 groupC gene2 geneA     2    3.    2.
# 15 groupC geneA gene2     3    3.    3.
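name1 is already a factor, so as.numeric returns the level index of each row directly (use as.integer if you want an integer column).

As for the error: inside a grouped mutate(), name1 is the whole group's vector, not a single value. levels(a$name1) == name1 therefore compares two length-4 vectors element by element, and which() returns the positions where they happen to agree. In the first group that is two positions, hence "must be length 4 (the group size) or one, not 2". Your test with 'gene2' works only because it compares the levels against a single string.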
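
To see the difference outside of dplyr, here is a minimal base-R sketch. It rebuilds only the name1 values of the first group (group1) by hand, so the variable lv and the standalone name1 vector are illustrative assumptions rather than part of the original code. match() is shown as one more vectorised option alongside as.integer().

```r
# Levels as R would sort them for the question's name1 column.
lv <- c("gene1", "gene2", "geneA", "geneB")

# name1 values of group1, in the row order shown in the output above.
name1 <- factor(c("gene1", "geneB", "geneA", "gene2"), levels = lv)

# Element-wise comparison of two length-4 vectors: which() reports the
# positions where they happen to agree, not one level index per row.
which(levels(name1) == name1)
# 1 3

# Vectorised alternatives that return one index per element.
as.integer(name1)
# 1 4 3 2
match(as.character(name1), levels(name1))
# 1 4 3 2
```

Note that as.integer() and as.numeric() on a factor always return level codes, never the labels, so this only does what you want when the level index itself is the value you are after.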