I have a data frame like the following:
df:
S S1 S2 S3 S4
100130426 0 0 0.9066 0
100133144 16.3644 9.2659 11.6228 12.0894
100134869 12.9316 17.379 9.2294 11.0799
3457 1910.3 2453.50 2695.37 1372.3624
9834 1660.13 857.30 1240.53 1434.6463
ATP5L2|267 0 0 0.9066 0
ATP5L|1063 1510.29 1270.79 2965.54 2397.1866
ATP5O|539 2176.17 1868.95 2004.53 2360.3641
I want to remove the "|" and the numbers after it in the first column. For example, ATP5L2|267 should become ATP5L2.
So I tried the following:
SD <- sapply(strsplit(df$s, split='|', fixed=TRUE), function(x) (x[1]))
But this gave me an error:
Error in strsplit(s, split = "|", fixed = TRUE) : non-character argument.
The output should look like the following:
df:
S S1 S2 S3 S4
100130426 0 0 0.9066 0
100133144 16.3644 9.2659 11.6228 12.0894
100134869 12.9316 17.379 9.2294 11.0799
3457 1910.3 2453.50 2695.37 1372.3624
9834 1660.13 857.30 1240.53 1434.6463
ATP5L2 0 0 0.9066 0
ATP5L 1510.29 1270.79 2965.54 2397.1866
ATP5O 2176.17 1868.95 2004.53 2360.3641
You can do this with sub and a regular expression.
df$S = sub("\\|.*", "", as.character(df$S))
df
S S1 S2 S3 S4
1 100130426 0.0000 0.0000 0.9066 0.0000
2 100133144 16.3644 9.2659 11.6228 12.0894
3 100134869 12.9316 17.3790 9.2294 11.0799
4 3457 1910.3000 2453.5000 2695.3700 1372.3624
5 9834 1660.1300 857.3000 1240.5300 1434.6463
6 ATP5L2 0.0000 0.0000 0.9066 0.0000
7 ATP5L 1510.2900 1270.7900 2965.5400 2397.1866
8 ATP5O 2176.1700 1868.9500 2004.5300 2360.3641
Details:
sub substitutes the second argument for whatever matches the first argument. In this case, we want to match | and everything after it. You can't just write | because it has a special meaning in regular expressions (alternation), so you "escape" it by writing \\|. It is followed by .*. The . means "any character" and * means "any number of times", so together \\|.* means | followed by any number of characters. We replace that match with the empty string "". We apply this operation to as.character(df$S) because your error message suggests that df$S may be a factor rather than a character vector.
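To check the factor explanation yourself, here is a minimal sketch on a made-up three-row data frame (the values are copied from the question, but the construction with stringsAsFactors = TRUE is my assumption about how your data was read in). It also shows that your original strsplit approach gives the same result once the column is converted to character. Note that R is case-sensitive: if the column is named S, then df$s is NULL, and strsplit on NULL raises the same "non-character argument" error.

```r
# Toy data (assumption): stringsAsFactors = TRUE mimics the default before R 4.0
df <- data.frame(
  S  = c("100130426", "ATP5L2|267", "ATP5L|1063"),
  S1 = c(0, 0, 1510.29),
  stringsAsFactors = TRUE
)

# If this is TRUE, strsplit(df$S, ...) fails with "non-character argument"
is.factor(df$S)

# Regex approach: "\\|" matches a literal pipe, ".*" matches everything after it
cleaned_sub <- sub("\\|.*", "", as.character(df$S))

# The strsplit approach from the question, with the factor converted first;
# fixed = TRUE treats "|" literally, so no escaping is needed here
cleaned_split <- sapply(
  strsplit(as.character(df$S), split = "|", fixed = TRUE),
  function(x) x[1]
)

# Both approaches should agree
identical(cleaned_sub, unname(cleaned_split))

# Write the cleaned values back to the data frame
df$S <- cleaned_sub
df
```

If is.factor(df$S) returns FALSE on your real data, check the column name's capitalization before anything else.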