Remove all quotation marks from a data frame


I have a data frame rep that looks like this:

> head(rep)
     position chrom  value label  
[1,] "17408"  "chr1" "0"   "miRNA"
[2,] "17409"  "chr1" "0"   "miRNA"
[3,] "17410"  "chr1" "0"   "miRNA"
[4,] "17411"  "chr1" "0"   "miRNA"
[5,] "17412"  "chr1" "0"   "miRNA"
[6,] "17413"  "chr1" "0"   "miRNA"

How can I remove the quotation marks from all elements?

Note: rep$position and rep$value should be numeric type, rep$chrom and rep$label should be character type.


Solution

  • As indicated by @Roland, you have a matrix, not a data.frame, and these have different default print methods. Sticking with a matrix, you can set quote = FALSE explicitly in print or you can use noquote.

    Here is a basic example:

    ## Sample data
    x <- matrix(c(17, "chr1", 0, "miRNA", 18, "chr1", 0, "miRNA"), nrow = 2, 
                byrow = TRUE, dimnames = list(
                  NULL, c("position", "chrom", "value", "label")))
    
    ## Default printing
    x
    #      position chrom  value label  
    # [1,] "17"     "chr1" "0"   "miRNA"
    # [2,] "18"     "chr1" "0"   "miRNA"
    
    ## Two options to make the quotes disappear
    print(x, quote = FALSE)
    #      position chrom value label
    # [1,] 17       chr1  0     miRNA
    # [2,] 18       chr1  0     miRNA
    noquote(x)
    #      position chrom value label
    # [1,] 17       chr1  0     miRNA
    # [2,] 18       chr1  0     miRNA
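
    Both options only change how the matrix is printed; the stored values are still character strings. A minimal sketch to check this, reusing the sample matrix from above (the name nq is just an illustrative variable):

    ```r
    ## Same sample data as above
    x <- matrix(c(17, "chr1", 0, "miRNA", 18, "chr1", 0, "miRNA"), nrow = 2,
                byrow = TRUE, dimnames = list(
                  NULL, c("position", "chrom", "value", "label")))

    ## The object is a character matrix, not a data.frame
    is.matrix(x)      # TRUE
    is.data.frame(x)  # FALSE
    typeof(x)         # "character"

    ## noquote() only adds a class that changes the print method
    nq <- noquote(x)
    inherits(nq, "noquote")  # TRUE
    typeof(nq)               # "character": the data itself is unchanged
    ```

    So if you need position and value to be numeric, hiding the quotes is not enough; the columns have to be converted.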
    

    Also, as you figured out on your own, converting your matrix to a data.frame makes the quotes disappear. A data.frame is the more appropriate structure when the columns hold different types of data (numeric, character, factor, and so on). However, converting a character matrix to a data.frame does not convert the columns to their natural types: every column stays character. To fix that, you can use type.convert (which read.table and family also use when creating a data.frame):

    y <- data.frame(x, stringsAsFactors = FALSE)
    str(y)
    # 'data.frame':  2 obs. of  4 variables:
    #  $ position: chr  "17" "18"
    #  $ chrom   : chr  "chr1" "chr1"
    #  $ value   : chr  "0" "0"
    #  $ label   : chr  "miRNA" "miRNA"
    ## as.is = TRUE keeps strings as character instead of converting them to factors
    y[] <- lapply(y, type.convert, as.is = TRUE)
    str(y)
    # 'data.frame':  2 obs. of  4 variables:
    #  $ position: int  17 18
    #  $ chrom   : chr  "chr1" "chr1"
    #  $ value   : int  0 0
    #  $ label   : chr  "miRNA" "miRNA"
    y
    #   position chrom value label
    # 1       17  chr1     0 miRNA
    # 2       18  chr1     0 miRNA
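
    If you would rather state the target types explicitly than let type.convert guess them, you can convert only the columns you name. This is a sketch under the assumption that the column names match the question (position and value numeric, chrom and label character); num_cols is a hypothetical helper variable:

    ```r
    ## Same sample data as above
    x <- matrix(c(17, "chr1", 0, "miRNA", 18, "chr1", 0, "miRNA"), nrow = 2,
                byrow = TRUE, dimnames = list(
                  NULL, c("position", "chrom", "value", "label")))

    ## All columns start out as character
    y <- as.data.frame(x, stringsAsFactors = FALSE)

    ## Convert only the columns that should be numeric
    num_cols <- c("position", "value")
    y[num_cols] <- lapply(y[num_cols], as.numeric)

    ## Inspect the resulting column classes
    sapply(y, class)
    ```

    The explicit version is more verbose, but a column will not silently change type if its contents happen to look numeric.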