
Remove rows from dataframe that have an infinite value in one column, but not others


I have a dataframe with multiple columns that contain both Inf and -Inf values. I want to remove every row that has Inf/-Inf in one particular column (LRR in the example below), but keep rows where Inf/-Inf appears only in the other columns.

So, if I start with the following dataframe:

 Group<-c("A","B","C","D","E","F","G")
 LRR <- c(Inf, 1,2,3,-Inf,4, 5)
 LRR.var <- c(Inf, Inf, 3, -Inf, -Inf, 6,7)
 data<-data.frame(cbind(Group, LRR, LRR.var))
 data

 Group  LRR  LRR.var
 A      Inf  Inf
 B      1    Inf
 C      2    3
 D      3   -Inf
 E     -Inf -Inf
 F      4    6
 G      5    7

I would like it to ultimately look like this:

Group<-c("B","C","D","F","G")
LRR <- c(1,2,3,4, 5)
LRR.var <- c( Inf, 3,-Inf, 6,7)
data1<-data.frame(cbind(Group, LRR, LRR.var))
data1

Group  LRR  LRR.var
 B      1    Inf
 C      2    3
 D      3   -Inf
 F      4    6
 G      5    7   

All the solutions I have found to remove infinite values from dataframes remove all infinite values, and not just those based on one column in the dataset. Thanks for your help!


Solution

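  • Your columns are not numeric: cbind() coerces everything to a character matrix, so data.frame() stores the columns as factors (or as character strings in R 4.0.0 and later). Convert them to numeric first, or build the data frame with data.frame() directly, as below. Then there are several ways to remove the rows with Inf values. The simplest is to use is.finite() to select rows.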

    data <- data.frame(
        Group = c("A","B","C","D","E","F","G"),
        LRR = c(Inf, 1,2,3,-Inf,4, 5),
        LRR.var = c(Inf, Inf, 3, -Inf, -Inf, 6,7), 
        stringsAsFactors = FALSE
    )
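
    If you already have the data frame built as in the question, a conversion along these lines should work. This is a minimal sketch: the name data_raw is only used here for illustration, and going through as.character() is an assumption that covers both the factor case (older R) and the character case (R 4.0.0 and later).

    ```r
    # Rebuild the data frame the way the question does:
    # cbind() produces a character matrix, so no column is numeric.
    Group <- c("A", "B", "C", "D", "E", "F", "G")
    LRR <- c(Inf, 1, 2, 3, -Inf, 4, 5)
    LRR.var <- c(Inf, Inf, 3, -Inf, -Inf, 6, 7)
    data_raw <- data.frame(cbind(Group, LRR, LRR.var))

    # as.character() first, so that factor levels are converted by label
    # rather than by their internal integer codes.
    data_raw$LRR <- as.numeric(as.character(data_raw$LRR))
    data_raw$LRR.var <- as.numeric(as.character(data_raw$LRR.var))

    str(data_raw)
    ```

    After this, is.finite(data_raw$LRR) behaves the same as on the data frame defined above.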
    

    Base R:

    data[is.finite(data$LRR),]
    
      Group LRR LRR.var
    2     B   1     Inf
    3     C   2       3
    4     D   3    -Inf
    6     F   4       6
    7     G   5       7
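
    One thing to be aware of: is.finite() is also FALSE for NA and NaN, so the filter above drops rows with missing values in LRR as well. If you want to keep those, !is.infinite() may be the better test. A small sketch of the difference, using a made-up data frame df that is not part of the question:

    ```r
    x <- c(1, Inf, -Inf, NA, NaN)
    is.finite(x)      # TRUE only for the ordinary number
    !is.infinite(x)   # TRUE for the ordinary number, NA and NaN

    # Hypothetical data with a missing value in the filtered column
    df <- data.frame(id = c("a", "b", "c"), value = c(1, NA, Inf))

    finite_only <- df[is.finite(df$value), ]     # drops the NA row and the Inf row
    not_infinite <- df[!is.infinite(df$value), ] # drops only the Inf row
    ```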
    

    You can also refer to the column by its position:

    data[is.finite(data[,2]),]
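
    Selecting by position gives the wrong rows if the column order ever changes, so selecting by name is usually safer. Here is a minimal sketch that wraps the filter in a function. The helper name keep_finite_rows is hypothetical (it is not a base R function), and resetting the row names is optional:

    ```r
    data <- data.frame(
        Group = c("A", "B", "C", "D", "E", "F", "G"),
        LRR = c(Inf, 1, 2, 3, -Inf, 4, 5),
        LRR.var = c(Inf, Inf, 3, -Inf, -Inf, 6, 7),
        stringsAsFactors = FALSE
    )

    # Hypothetical helper: keep the rows where the named column is finite.
    keep_finite_rows <- function(df, col) {
        kept <- df[is.finite(df[[col]]), , drop = FALSE]
        rownames(kept) <- NULL  # renumber rows from 1 instead of keeping 2, 3, 4, 6, 7
        kept
    }

    result <- keep_finite_rows(data, "LRR")
    result
    ```

    The Inf and -Inf values in LRR.var are left alone, because only the column passed as col is tested.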
    

    data.table:

    With data.table, a single argument inside the brackets filters rows, so you don't need the trailing comma:

    library(data.table)
    as.data.table(data)[is.finite(LRR)]
    
       Group LRR LRR.var
    1:     B   1     Inf
    2:     C   2       3
    3:     D   3    -Inf
    4:     F   4       6
    5:     G   5       7
    

    dplyr:

    With dplyr, you can use filter():

    library(dplyr)
    data %>% filter(is.finite(LRR))
    
      Group LRR LRR.var
    1     B   1     Inf
    2     C   2       3
    3     D   3    -Inf
    4     F   4       6
    5     G   5       7