Tags: r, text, dplyr

How to export a list to a custom-formatted text file in R?


I have a list which looks like this:

name <- c("A","B","C","D","E")
n <- c("1, 2, 3, 4, 5, 6, 7, 8, 9, 10", "11, 12, 13, 14, 15", "16, 17, 18, 19, 20", "21, 22, 23, 24, 25", "26, 27, 28, 29, 30, 31, 32, 33, 34, 35")
type <- c("a", "b", "c", "d", "e")

list <- list(name=name,n=n,type=type)

$name
[1] "A" "B" "C" "D" "E"

$n
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"          "11, 12, 13, 14, 15"                     "16, 17, 18, 19, 20"                    
[4] "21, 22, 23, 24, 25"                     "26, 27, 28, 29, 30, 31, 32, 33, 34, 35"

$type
[1] "a" "b" "c" "d" "e"

I need to export the list to a text file that looks like this:

name = "A",
type = "a",
n = 1, 2, 3, 4, 5,
    6, 7, 8, 9, 10
name = "B",
type = "b",
n = 11, 12, 13, 14, 15
name = "C",
type = "c",
n = 16, 17, 18, 19, 20
name = "D",
type = "d",
n = 21, 22, 23, 24, 25
name = "E",
type = "e",
n = 26, 27, 28, 29, 30,
    31, 32, 33, 34, 35

The text file follows this format:

  • name, n and type form a group: for example, name = "A", n = 1, 2, ..., 10 and type = "a" make up one record.
  • Each element name (name, n or type) is placed at the beginning of a row.
  • Each element's contents (e.g., 11, 12, 13, 14, 15 for n) follow the element name. The contents of name and type are surrounded by double quotation marks ", but those of n are not. This setting may need to change depending on requirements.
  • Each element is followed by a comma , at the end of its contents, as in name = "A",.
  • Note that the contents of n already contain commas because they are stored as strings. However, if n contains more than 5 numbers, a new line must start after every fifth number. (This data seems tricky. It may be better to convert the strings to integers first, as the numbers differ in length.)
  • The last element of each record (n in the output above), however, has no trailing comma, because a new set of name, n and type starts on the next row.

I have used R for simpler data manipulation but never for this kind of text manipulation. Your suggestions are highly appreciated.

I would prefer to use the pipe operator from dplyr, but the solution does not have to be limited to that package. Thank you in advance for your kind suggestions.


Solution

  • This isn't a very straightforward string manipulation in R. The heart of string concatenation is paste() and paste0(), and mapply() lets you iterate over the elements of your list in parallel. Basically, you just have to get everything to line up. For example:

    rows <- paste0(mapply(function(name, n, type) {
      # Split the comma-separated string into individual numbers
      n <- strsplit(n, ", ")[[1]]
      # Group the numbers five at a time and re-join each group
      parts <- unlist(tapply(n, (seq_along(n) - 1) %/% 5, paste, collapse = ", "))
      # Prefix the first group with "n = " and indent any continuation lines
      paste0(paste(
        paste0("name = ", dQuote(name, q = FALSE)),
        paste0("type = ", dQuote(type, q = FALSE)),
        paste0(rep(c("n = ", "    "), c(1, length(parts) - 1)), parts, collapse = ",\n"),
        sep = ",\n"), "\n")
    },
    list$name, list$n, list$type, SIMPLIFY = TRUE), collapse = "")
    
    cat(rows)
    

    which prints:

    name = "A",
    type = "a",
    n = 1, 2, 3, 4, 5,
        6, 7, 8, 9, 10
    name = "B",
    type = "b",
    n = 11, 12, 13, 14, 15
    name = "C",
    type = "c",
    n = 16, 17, 18, 19, 20
    name = "D",
    type = "d",
    n = 21, 22, 23, 24, 25
    name = "E",
    type = "e",
    n = 26, 27, 28, 29, 30,
        31, 32, 33, 34, 35
    

    Be sure to use cat() so that the newline characters are expanded in the output.
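
Since the question asks for a text file rather than console output, the same approach can be pointed at a file. The following is only a sketch under a few assumptions: the helper name format_record, its chunk argument, the variable name records (used instead of list to avoid masking the base function), and the temporary output path are all illustrative and not part of the original answer. It uses shortened two-record data so it runs on its own.

```r
# Illustrative data with the same shape as in the question.
records <- list(
  name = c("A", "B"),
  n    = c("1, 2, 3, 4, 5, 6, 7, 8, 9, 10", "11, 12, 13, 14, 15"),
  type = c("a", "b")
)

# Hypothetical helper: format one record, breaking n after every `chunk` values.
format_record <- function(name, n, type, chunk = 5) {
  values <- strsplit(n, ", ", fixed = TRUE)[[1]]
  # Integer division assigns positions 1..chunk to group 0, the next chunk to group 1, ...
  groups <- (seq_along(values) - 1) %/% chunk
  parts  <- vapply(split(values, groups), paste, character(1), collapse = ", ")
  # First group gets the "n = " label; later groups are indented to align with it.
  n_lines <- paste0(c("n = ", rep("    ", length(parts) - 1)), parts,
                    collapse = ",\n")
  paste(
    paste0("name = ", dQuote(name, q = FALSE)),
    paste0("type = ", dQuote(type, q = FALSE)),
    n_lines,
    sep = ",\n"
  )
}

# One formatted string per record (each contains embedded newlines).
rows <- mapply(format_record, records$name, records$n, records$type,
               USE.NAMES = FALSE)

# writeLines() writes each element followed by a newline.
out_file <- tempfile(fileext = ".txt")
writeLines(rows, out_file)

# Read the file back to inspect the result.
cat(readLines(out_file), sep = "\n")
```

Replacing tempfile() with a real path such as "output.txt" writes the file to the working directory. cat(rows, file = out_file, sep = "\n") should also work, although it does not add a newline after the last record.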