
Can R read irregularly formatted xlsx files?


[Image: screenshot of one of the worksheets]

I have about 1,000 xlsx files laid out like the picture above. I want to read every file and extract each candidate's name, number, and age, but I don't know how to read xlsx files with this irregular layout.


Solution

  • I don't know whether any R Excel API is smart enough to handle your column formatting, but there is an easy workaround: save the worksheet in CSV format. Doing this for the data you showed left me with the following three CSV lines:

    Title,,,,,
    name,mike,number,123214,age,28
    ,score,,ddd,aaa,bbb
    

    You can try the following code:

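    # header=FALSE because the first line is a title, not column names;
    # stringsAsFactors=FALSE keeps the cells as plain character strings
    df <- read.csv(file="path/to/your/file.csv", header=FALSE, stringsAsFactors=FALSE)
    df <- df[2:nrow(df), ]      # drop the first (title) row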
    

    To get the name, number, and age for Mike:

    name   <- df[1, 2]
    number <- df[1, 4]
    age    <- df[1, 6]
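
Since the question mentions about 1,000 files, the same lookup can be wrapped in a function and applied to a whole folder. The following is only a sketch under assumptions: `read_candidate` is a hypothetical helper name, every exported CSV is assumed to share the layout shown above, and the second demo file (anna) is made-up data so the example runs on its own.

```r
# Hypothetical helper: pull name, number and age out of one exported CSV.
# Assumes the layout from the sample: a title row, then a row of the form
#   name,<name>,number,<number>,age,<age>
read_candidate <- function(path) {
  # Read every cell as character so mixed columns are not converted
  df <- read.csv(path, header = FALSE, colClasses = "character")
  row <- df[2, ]  # second row of the file holds the name/number/age pairs
  data.frame(
    file   = basename(path),
    name   = row[[2]],
    number = row[[4]],              # kept as text to preserve leading zeros
    age    = as.integer(row[[6]]),
    stringsAsFactors = FALSE
  )
}

# Demo data (made up apart from the first file): two CSVs in a temp folder
demo_dir <- tempfile("candidates_demo")
dir.create(demo_dir)
writeLines(c("Title,,,,,", "name,mike,number,123214,age,28", ",score,,ddd,aaa,bbb"),
           file.path(demo_dir, "a.csv"))
writeLines(c("Title,,,,,", "name,anna,number,98765,age,31", ",score,,eee,fff,ggg"),
           file.path(demo_dir, "b.csv"))

# Apply the helper to every CSV in the folder and stack the results
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
candidates <- do.call(rbind, lapply(files, read_candidate))
print(candidates)
```

To use it on real data, point `list.files()` at the folder holding the exported CSVs instead of `demo_dir`. If the readxl package is installed, `as.data.frame(readxl::read_excel(path, col_names = FALSE))` could probably replace the `read.csv()` call and avoid the manual export, but cell positions may differ from the CSV version (for example with merged cells), so check one file first.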