
Created a new column using mutate, result is a column full of NAs (but this code worked for a different file I have)


I am using tidyr and creating a new column with mutate to count how many 0s appear in a different column. The new column is created, but it contains NA throughout, even where I can see there should be a count of at least one (e.g. I see a 0 in a column, but the "Count" (total) column still reads NA).

This code worked previously on a nearly identical dataset for the same type of question. Can someone explain what is going on? A copy of my code is below.

    Gathered <- ScottCrkMeta250918 %>%
        gather(SNP, Genotype, 43:234)

    Prefailed <- Gathered %>%
        group_by(NMFS_DNA_ID, BOX_ID, BOX_POSITION) %>%
        mutate(Count = sum(Genotype == 0))

I am trying to see how many SNPs failed. A failure is recorded as a 0, so I want R to tally these zeroes for each sample and report the total in a separate column.


Solution

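  • Unfortunately you don't share data, so this is a bit of a guess: I suspect that Genotype contains NAs. In R, Genotype == 0 evaluates to NA for a missing value, and sum() returns NA if any element is NA unless na.rm = TRUE is set. That would make Count NA for every group that contains at least one missing genotype. In that case, try replacing your code with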

    Prefailed <- Gathered %>% 
        group_by(NMFS_DNA_ID, BOX_ID, BOX_POSITION) %>% 
        mutate(Count = sum(Genotype == 0, na.rm = TRUE))
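
    Since the presence of NAs is only a guess, it may be worth checking for them first. A minimal base-R sketch, using a made-up stand-in (gathered_toy, not the real Gathered data):

    ```r
    # Toy stand-in for the gathered data; the values are made up for illustration
    gathered_toy <- data.frame(
        SNP = c("snp1", "snp2", "snp3", "snp4"),
        Genotype = c(0, NA, 1, NA)
    )

    # TRUE if at least one genotype is missing
    has_missing <- anyNA(gathered_toy$Genotype)

    # Number of missing genotypes
    n_missing <- sum(is.na(gathered_toy$Genotype))

    print(has_missing)   # TRUE
    print(n_missing)     # 2
    ```

    Running the same two calls on Gathered$Genotype should show whether missing values are plausibly the cause.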
    

    Here is a minimal reproducible example to demonstrate:

    library(dplyr)

    set.seed(2018)
    df <- data.frame(
        Genotype = sample(c(NA, 0, 1), 10, replace = TRUE))
    
    df %>%
        mutate(
            Count_without_NA_removed = sum(Genotype == 0),
            Count_with_NA_removed = sum(Genotype == 0, na.rm = TRUE))
    #   Genotype Count_without_NA_removed Count_with_NA_removed
    #1         0                       NA                     5
    #2         0                       NA                     5
    #3        NA                       NA                     5
    #4        NA                       NA                     5
    #5         0                       NA                     5
    #6        NA                       NA                     5
    #7         0                       NA                     5
    #8        NA                       NA                     5
    #9         1                       NA                     5
    #10        0                       NA                     5
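
    The same idea carries over to grouped data. The sketch below assumes a made-up data frame (toy) that borrows only the NMFS_DNA_ID column name from the question. It uses summarise() rather than mutate(), so each sample gets one row, and it reports the number of NAs separately so that missing genotypes are not silently dropped:

    ```r
    library(dplyr)

    # Made-up data: two samples, three SNPs each; 0 = failed, NA = missing
    toy <- data.frame(
        NMFS_DNA_ID = c("A", "A", "A", "B", "B", "B"),
        Genotype = c(0, NA, 1, 0, 0, 1)
    )

    per_sample <- toy %>%
        group_by(NMFS_DNA_ID) %>%
        summarise(
            Count = sum(Genotype == 0, na.rm = TRUE),  # failures, ignoring NAs
            Missing = sum(is.na(Genotype))             # NAs, reported separately
        )

    print(per_sample)
    # Sample A: Count 1, Missing 1; sample B: Count 2, Missing 0
    ```

    Whether an NA should be treated as a failure or as a separate category depends on how the genotype calls were recorded, so the Missing column is only a suggestion.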