I have a large metadata file with 79 columns and 78,687 rows from our cancer experiment results. I am using dplyr to query the cell counts for each sample in that metadata.
I have 16 samples:
I need to find the cell counts for each condition (Tumor or Normal, plus MSI_Status) in each sample. So far I have been doing it one sample at a time, as follows:
dim(meta %>% filter(Condition == "Tumor" & MSI_Status == "MSS" & Location == "Left" & orig.ident == "B_cac10"))
# 689 24
I am sure there is a smarter way to do this. How can I loop over the samples to get all the answers in one go?
P.S.: I am a biologist, and my knowledge of looping and coding is very limited.
EDIT 1:
Reproducible example:
df <- data.frame(Condition = c("Normal","Normal","Normal","Tumor","Tumor","Tumor"),
MSI_Status = c("High", "High", "High", "Low", "Low", "Low"),
Location = c("Lungs", "Lungs", "Lungs", "Kidney", "Kidney", "Liver"),
Clusters = c(1,2,4,2,2,6),
orig.ident = c("B-cac10","B-cac11","T-cac15","B-cac15","B-cac19","T-cac22"))
My code:
df %>% filter(Condition == "Tumor" & MSI_Status == "Low" &
              Location == "Kidney" & orig.ident == "B-cac15")
Expected results:
The count for each orig.ident should be reported for rows where Condition == "Tumor", MSI_Status == "Low", and Location == "Kidney".
Thanks a lot for your Help, Stay Safe. Dave
You can use the dplyr function filter to subset the data based on your criteria, and then the dplyr function count to count how many rows there are for each unique value of orig.ident. As alluded to in the comments, you can opt to set name = "Freq" from within count. I opted to use the rename function instead, to be as explicit as possible since you are new to R.
Data
df <- data.frame(
  Condition  = c("Normal", "Normal", "Normal", "Tumor", "Tumor", "Tumor"),
  MSI_Status = c("High", "High", "High", "Low", "Low", "Low"),
  Location   = c("Lungs", "Lungs", "Lungs", "Kidney", "Kidney", "Liver"),
  Clusters   = c(1, 2, 4, 2, 2, 6),
  orig.ident = c("B-cac10", "B-cac11", "T-cac15", "B-cac15", "B-cac19", "T-cac22")
)
Code
library(dplyr)
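df %>%
  # keep only the rows that match all three criteria
  filter(Condition == "Tumor" &
           MSI_Status == "Low" &
           Location == "Kidney") %>%
  # count the remaining rows per sample
  count(orig.ident) %>%
  # rename the default count column n to Freq
  rename(Freq = n)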
#> orig.ident Freq
#> 1 B-cac15 1
#> 2 B-cac19 1
Created on 2020-09-05 by the reprex package (v0.3.0)
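If you want the counts for every combination of Condition, MSI_Status and Location in every sample at once, rather than filtering for one combination at a time, one possible approach is to pass all of the grouping columns to count. This is only a sketch on the toy data from the question; the object name all_counts is made up for illustration, and the name = "Freq" argument is used here in place of rename.

```r
library(dplyr)

# Toy data copied from the question's reproducible example
df <- data.frame(
  Condition  = c("Normal", "Normal", "Normal", "Tumor", "Tumor", "Tumor"),
  MSI_Status = c("High", "High", "High", "Low", "Low", "Low"),
  Location   = c("Lungs", "Lungs", "Lungs", "Kidney", "Kidney", "Liver"),
  Clusters   = c(1, 2, 4, 2, 2, 6),
  orig.ident = c("B-cac10", "B-cac11", "T-cac15", "B-cac15", "B-cac19", "T-cac22")
)

# One row per observed combination of the four columns,
# with the number of matching rows stored in a column called Freq
all_counts <- df %>%
  count(Condition, MSI_Status, Location, orig.ident, name = "Freq")

all_counts
```

On the real metadata you would presumably replace df with meta. A single combination can still be pulled out afterwards by applying filter to all_counts, so no explicit loop is needed.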