I have a list of tibbles that looks like this:
$WT_top_markers
# A tibble: 128 × 2
# Groups: cluster [26]
cluster gene
<fct> <chr>
1 0 Abi3bp
2 0 Apoe
3 0 Apoc1
4 0 Tgm2
5 0 Bcam
6 1 Aqp3
7 1 Sult1d1
8 1 Dapl1
9 1 Fxyd3
10 1 Pir
# … with 118 more rows
$F7KO_top_markers
# A tibble: 125 × 2
# Groups: cluster [25]
cluster gene
<fct> <chr>
1 0 Abi3bp
2 0 Apoe
3 0 Apoc1
4 0 Dapl1
5 0 Tgm2
6 1 Scgb3a1
7 1 Sftpa1
8 1 Reg3g
9 1 Bpifb1
10 1 Itln1
# … with 115 more rows
$F8HET_top_markers
# A tibble: 147 × 2
# Groups: cluster [30]
cluster gene
<fct> <chr>
1 0 Abi3bp
2 0 Apoe
3 0 Apoc1
4 0 1600014C10Rik
5 0 Bcam
6 1 Krt14
7 1 Krt17
8 1 Krt5
9 1 Bcam
10 1 Cav1
# … with 137 more rows
I want to pull out the genes from the first tibble where cluster = 20. I have tried:
features_to_plot <- unlist(top_markers[[1]][[which(top_markers[[1]]$cluster == 20)]])
but am getting an error:
! Must extract column with a single valid subscript.
✖ Subscript which(top_markers[[1]]$cluster == 20) has size 5 but must be size 1.
Can anyone tell me how to do this properly?
Thanks, Stacy
We can use lapply to loop over the list and subset the rows of each tibble where the 'cluster' value is 20:
lapply(top_markers, \(x) subset(x, cluster == 20))
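As a rough sketch of how this behaves, the example below builds a small stand-in for top_markers (the cluster-20 gene names GeneA to GeneE are made-up placeholders, not values from the real data) and then takes one extra step to keep only the gene column, since the question asks for the genes rather than the whole rows. It assumes the tibble package is installed and R >= 4.1 for the \(x) shorthand.

```r
library(tibble)

# Toy stand-in for the real list; gene names for cluster 20 are hypothetical
top_markers <- list(
  WT_top_markers = tibble(
    cluster = factor(c(0, 0, 20, 20)),
    gene    = c("Abi3bp", "Apoe", "GeneA", "GeneB")
  ),
  F7KO_top_markers = tibble(
    cluster = factor(c(0, 20, 20, 21)),
    gene    = c("Abi3bp", "GeneC", "GeneD", "GeneE")
  )
)

# Keep only the cluster-20 rows in every tibble of the list
cluster20 <- lapply(top_markers, \(x) subset(x, cluster == 20))

# Reduce each filtered tibble to a plain character vector of gene names
genes20 <- lapply(cluster20, \(x) x$gene)
```

The result is a named list with one element per tibble, so genes20$WT_top_markers would hold the cluster-20 genes for the first sample.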
The error in the OP's code comes from using [[ to extract more than one element; [[ only accepts a single subscript. Use [ with a comma instead. Here top_markers[[1]] is the first list element, which is a tibble, and which(top_markers[[1]]$cluster == 20) gives the row index. To subset rows, the indexing has the form [rowindex, columnindex], so here we need [rowindex, ]. By default, a single index without a comma on a data.frame or tibble is treated as a column index. For example, tibble(col1 = 1:5)[1:2, ] returns the first two rows, whereas tibble(col1 = 1:5)[1:2] returns an error, because there is only a single column and we asked for two.
top_markers[[1]][which(top_markers[[1]]$cluster == 20),]
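If only the gene names are needed as a vector (as the name features_to_plot in the question suggests), one possible sketch is shown below. It uses a small hypothetical tibble named wt in place of top_markers[[1]]; the cluster-20 gene names are placeholders, and the tibble package is assumed to be installed.

```r
library(tibble)

# Hypothetical stand-in for top_markers[[1]]; cluster-20 genes are made up
wt <- tibble(
  cluster = factor(c(0, 0, 20, 20, 20)),
  gene    = c("Abi3bp", "Apoe", "GeneA", "GeneB", "GeneC")
)

# Row positions where cluster is 20 (the factor is compared by its label)
rows <- which(wt$cluster == 20)

# Row subsetting needs the comma: [rows, columns]
wt[rows, ]

# Select those rows and the gene column, then extract it as a character vector
features_to_plot <- wt[rows, "gene"][["gene"]]

# Equivalent shortcut: index the gene column directly with the logical test
features_alt <- wt$gene[wt$cluster == 20]
```

Both forms give the same character vector; the second avoids building the intermediate one-column tibble.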