Splitting a column into two and replacing characters with numbers - R


I have obtained results in the following format:

     Parameter  Wert
...
99      se.m  0.1000
100     se.m  0.1000
101    se.st  0.5000
102    se.st  0.500
...

I want to split the column Parameter into two columns: Parameter and Intensität. The split should happen at the dot (.). Then, in the resulting column Intensität, I want to replace every m with the value 2 and every st with the value 3. The result should look like this:

    Parameter Intensität   Wert
...
99         se          2 0.4000
100        se          2 0.0396
101        se          3 0.2702
102        se          3 1.1566
...

I have managed to obtain this format; however, I am sure there is a more elegant way to do it.

The way I obtained the results is clumsy. I originally had two columns in the output: se.m and se.st. I manually changed the column names:

colnames(results) <- c("2", "3")

and then combined the two columns into one column. Then I added a column containing the characters se in every row.

results <- melt(results)
results <- cbind(Parameter = "se", results)

I know there are other alternatives, for example extract from tidyr, but I cannot get the expression syntax right. With the stringr package I could use the str_match function, or maybe strsplit. All of these look great, but I seem unable to apply them to my problem, and I am stuck. There are similar questions, but I couldn't find a solution that works for me.

PS: I appreciate any input: comments, critique, tips. I am a learner, and any piece of advice is of great value to me.


Solution

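  • We can use separate from tidyr (by default it splits at any non-alphanumeric character, so the dot needs no escaping) and then recode the new column with dplyr: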

    library(tidyr)
    library(dplyr)
    separate(df1, Parameter, into = c("Parameter", "Intensitat")) %>%
                    mutate(Intensitat = recode(Intensitat, m = 2, st = 3))
    
    #  Parameter Intensitat Wert
    #1        se          2  0.1
    #2        se          2  0.1
    #3        se          3  0.5
    #4        se          3  0.5
    

    Or we can split the column with read.table by specifying sep, transform the result, and cbind it with the "Wert" column:

    cbind(transform(read.table(text= as.character(df1$Parameter), 
        col.names = c("Parameter", "Intensitat"), sep="."), 
          Intensitat = ifelse(Intensitat=="m", 2, 3)), df1["Wert"])
    #    Parameter Intensitat Wert
    #99         se          2  0.1
    #100        se          2  0.1
    #101        se          3  0.5
    #102        se          3  0.5
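
The question also mentions strsplit. As a rough sketch of that route in base R only (no packages), the split and the replacement can be done with strsplit and a named lookup vector. The data frame df1 below is an assumed reconstruction of the four rows shown in the question, and the names parts, lookup and result are made up for illustration.

```r
# Assumed reconstruction of the four rows shown in the question
df1 <- data.frame(
  Parameter = c("se.m", "se.m", "se.st", "se.st"),
  Wert = c(0.1, 0.1, 0.5, 0.5),
  row.names = 99:102,
  stringsAsFactors = FALSE
)

# Split at a literal dot; fixed = TRUE keeps "." from being read as a
# regex wildcard. rbind stacks the pieces into a two-column matrix.
parts <- do.call(rbind, strsplit(df1$Parameter, ".", fixed = TRUE))

# Named lookup vector (hypothetical helper): suffix -> number
lookup <- c(m = 2, st = 3)

result <- data.frame(
  Parameter  = parts[, 1],
  Intensitat = unname(lookup[parts[, 2]]),  # unknown suffixes become NA
  Wert       = df1$Wert,
  row.names  = rownames(df1)
)
print(result)
```

Unlike the ifelse call above, which maps everything that is not "m" to 3, the lookup vector yields NA for a suffix it does not know, so unexpected values are easier to spot. This assumes every Parameter value contains exactly one dot; otherwise the pieces have different lengths and rbind will not produce a clean two-column matrix.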