r, rstudio, special-characters, stringr

Extract number between underscore in text


I have files with names like

  • Hughson.George_54_4
  • Ifran.Dean_51_3
  • Houston.Amanda_49_6

I'd like to create a data frame where each row holds the Author, Volume, and Issue extracted from one file name.

I'm able to extract the name and volume, but can't seem to get the issue number. Using the "stringr" package, I tried the following pattern, which gives me _4 instead of just 4.

[^a-z](?:[^_]+_){0}([^_ ]+$)  

How do I fix this?


Solution

  • You are looking for read.table() with '_' as the field separator:

    read.table(text = string, sep ='_', col.names = c('Author', 'Volume', 'Issue'))
    
              Author Volume Issue
    1 Hughson.George     54     4
    2     Ifran.Dean     51     3
    3 Houston.Amanda     49     6
    

    where string is the vector of file names:

    string <- c("Hughson.George_54_4", "Ifran.Dean_51_3", "Houston.Amanda_49_6")
    

    edit: If some names have fewer fields than others, add fill = TRUE so that short rows are padded with blank fields:

    read.table(text = string, sep = '_', fill = TRUE)
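
If you would rather stay with stringr, as in the question, here is a minimal sketch. The original pattern returns _4 because its leading [^a-z] consumes the underscore, and str_extract() returns the whole match rather than the capture group. The sketch assumes every name has exactly three underscore-separated fields; the names parts, df and issue are only illustrative.

```r
library(stringr)

string <- c("Hughson.George_54_4", "Ifran.Dean_51_3", "Houston.Amanda_49_6")

# str_match() returns a character matrix: column 1 is the full match,
# columns 2 to 4 are the three capture groups.
# Assumption: each name has exactly three fields separated by "_".
parts <- str_match(string, "^([^_]+)_([^_]+)_([^_]+)$")

df <- data.frame(
  Author = parts[, 2],
  Volume = as.integer(parts[, 3]),
  Issue  = as.integer(parts[, 4]),
  stringsAsFactors = FALSE
)

# If only the issue number is needed: take the trailing run of
# characters that are not underscores, so no "_" ends up in the match.
issue <- str_extract(string, "[^_]+$")
```

str_extract() gives character values, so wrap the result in as.integer() if you need numbers.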