Replace inequalities with values in dataframe in R


I have a dataframe with several columns, many of which contain inequalities such as "<2". I would like an R script that identifies these inequalities and replaces them with actual values. More specifically, I want to replace each "<x" entry with half of x (for example, "<2" -> 1.0). Is there a generic way to do this, so that I do not have to find and replace every inequality in the dataframe manually?

A simple example might be the following:

Col1,Col2,Col3,Col4
2.2,<3,4,<2
3.4,4,<5,3
4.2,2,2.1,5
1.3,1,4,<8

And I want to get something like this:

Col1,Col2,Col3,Col4
2.2,1.5,4,1
3.4,4,2.5,3
4.2,2,2.1,5
1.3,1,4,4


Solution

  • We can replace the "<" with the expression "(1/2)*" and then evaluate each resulting string, so that "<3" becomes (1/2)*3 = 1.5

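    # Turn "<3" into "(1/2)*3", then parse and evaluate every cell as an R expression.
    # Use this only on trusted data: eval(parse()) runs whatever code a cell contains.
    df1[] <- lapply(df1, function(x) sapply(sub("<", "(1/2)*", x, fixed = TRUE),
         function(y) eval(parse(text = y))))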
    
    df1
    #  Col1 Col2 Col3 Col4
    #1  2.2  1.5  4.0    1
    #2  3.4  4.0  2.5    3
    #3  4.2  2.0  2.1    5
    #4  1.3  1.0  4.0    4
    

  • Data

    df1 <- structure(list(Col1 = c(2.2, 3.4, 4.2, 1.3), Col2 = c("<3", "4", 
    "2", "1"), Col3 = c("4", "<5", "2.1", "4"), Col4 = c("<2", "3", 
    "5", "<8")), row.names = c(NA, -4L), class = "data.frame")
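
  • As a possible alternative, the same replacement can be sketched without eval(parse()), by flagging the "<" entries, stripping the sign, and halving only the flagged values. This is a minimal sketch, not part of the original answer: halve_censored() and df2 are hypothetical names introduced for illustration, and the code assumes every cell is either a plain number or a number prefixed with "<".

```r
# Example data, same values as df1 in the answer above
df1 <- data.frame(
  Col1 = c(2.2, 3.4, 4.2, 1.3),
  Col2 = c("<3", "4", "2", "1"),
  Col3 = c("4", "<5", "2.1", "4"),
  Col4 = c("<2", "3", "5", "<8"),
  stringsAsFactors = FALSE
)

# halve_censored() is a hypothetical helper name, not a base R function.
# Assumption: each cell is a number, optionally prefixed with "<".
halve_censored <- function(x) {
  x <- trimws(as.character(x))          # work on text, ignore stray spaces
  is_less <- startsWith(x, "<")         # TRUE where the cell is "<value"
  num <- as.numeric(sub("^<", "", x))   # drop the leading "<" and convert
  ifelse(is_less, num / 2, num)         # halve only the flagged cells
}

# Apply column by column; df2[] keeps the data frame structure
df2 <- df1
df2[] <- lapply(df1, halve_censored)
df2
```

    Because no cell is ever evaluated as code, unexpected text in a cell ends up as NA (with a warning from as.numeric) instead of being executed.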