I have a gene expression file whose row names look like this: GTEX.1117F.3226.SM.5N9CT. I want to edit the row names to look like this:
GTEX-1117F and so on.
I used these commands:
row.names(gene_exp_transpose) <- data
gsub(".","-",row.names(gene_exp_transpose)) #this just gives ----- to all the rownames data
row.names(gene_exp) substr(data, 0,5) ## but for the last rows, it has 4 character instead of 5.
A base R solution. Data borrowed from TarJae's answer.
In the first instruction, the regex is almost identical to TarJae's: it captures the first two period-separated fields and drops everything after them.
Then the only remaining period is replaced by a dash, "-".
row.names(df) <- sub('^([^.]+\\.[^.]+).*$', '\\1', row.names(df))
row.names(df) <- sub('\\.', '-', row.names(df))
row.names(df)
#> [1] "GTEX-1117F" "GTEX-111FC" "GTEX-1128S" "GTEX-117XS" "GTEX-1192X"
Created on 2022-07-02 by the reprex package (v2.0.1)
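Since the borrowed data is not shown here, the two steps can be tried on a small stand-in data frame. This is only a sketch: the first row name comes from the question, while the other row names, the `expr` column and the `short` variable are made up for illustration. One ID is deliberately four characters long, because a fixed-width `substr()` cannot handle that case, whereas the regex splits on the periods.

```r
# Hypothetical stand-in data: only the first row name is from the question;
# the remaining IDs and the 'expr' column are invented for illustration.
ids <- c("GTEX.1117F.3226.SM.5N9CT",
         "GTEX.111FC.0001.SM.AAAAA",
         "GTEX.ZZ64.0002.SM.BBBBB")   # 4-character ID, as in the "last rows"
df <- data.frame(expr = c(1.2, 3.4, 5.6), row.names = ids)

# Step 1: keep the first two period-separated fields, drop the rest
short <- sub('^([^.]+\\.[^.]+).*$', '\\1', row.names(df))

# Step 2: replace the single remaining period with a dash
row.names(df) <- sub('\\.', '-', short)

row.names(df)
#> [1] "GTEX-1117F" "GTEX-111FC" "GTEX-ZZ64"
```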
onyambu's comment makes the above code a one-liner.
sub('^([^.]+)\\.([^.]+).*', '\\1-\\2', rownames(df))
#> [1] "GTEX-1117F" "GTEX-111FC" "GTEX-1128S" "GTEX-117XS" "GTEX-1192X"
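As a side note on why the original `gsub(".", "-", ...)` attempt turned every name into dashes: in a regular expression an unescaped `.` matches any character. A minimal sketch of the difference, using the row name from the question (the variable names `x`, `all_dashes`, `literal_fixed` and `literal_escaped` are illustrative):

```r
x <- "GTEX.1117F.3226.SM.5N9CT"

# "." is a regex wildcard here, so every character is replaced
all_dashes <- gsub(".", "-", x)

# Two ways to match a literal period instead
literal_fixed   <- gsub(".", "-", x, fixed = TRUE)  # treat pattern as plain text
literal_escaped <- gsub("\\.", "-", x)              # escape the period

all_dashes
#> [1] "------------------------"
literal_fixed
#> [1] "GTEX-1117F-3226-SM-5N9CT"
```

Either literal form replaces all four periods, which is why the answer's code first trims the name to two fields and only then swaps the period for a dash.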