r rentrez

Using pubmedR to search by DOI


I'm trying to use the methods here and here to extract author affiliations from a list of DOIs, but I can't find any information on using pubmedR to search by DOI. I have looked into Entrez syntax, but no luck there either. Please help! Thank you.


Solution

  • Figured it out [this post was a great help] and wanted to post for anyone else who may have the same issue in the future. I ended up using the easyPubMed package instead, though.

    library(easyPubMed)
    library(xml2)      # read_xml(), xml_find_first(), xml_find_all(), xml_text()
    library(magrittr)  # provides the %>% pipe
    
    # load the list of DOIs (assumes they are in the first column of dois.csv)
    dois <- as.character(read.csv("dois.csv")[[1]])
    
    # convert each DOI to its associated PMID(s)
    pmids <- lapply(dois, get_pubmed_ids)
    
    # use the PMIDs to fetch abstract and article information as XML text;
    # entries stay NA when a DOI returned no PMID
    abstracts <- rep(NA_character_, length(pmids))
    for (i in seq_along(pmids)) {
      if (is.character(pmids[[i]][["IdList"]][["Id"]])) {
        abstracts[i] <- fetch_pubmed_data(pmids[[i]])
      }
    }
    
    # parse the XML text so it can be queried from R
    readAbstracts <- vector("list", length(abstracts))
    for (i in seq_along(abstracts)) {
      if (!is.na(abstracts[i])) {
        readAbstracts[[i]] <- read_xml(abstracts[i])
      }
    }
    
    # extract the desired fields from each parsed record
    extractedInfo <- data.frame()
    for (i in seq_along(pmids)) {
      # only keep DOIs that were fetched and resolved to a single PMID
      if (!is.na(abstracts[i]) && length(pmids[[i]]$IdList) < 2) {
        curRent <- readAbstracts[[i]]
        pmid    <- xml2::xml_find_first(curRent, ".//PMID") %>% xml2::xml_text()
        title   <- xml2::xml_find_first(curRent, ".//ArticleTitle") %>% xml2::xml_text()
        authors <- paste(
          xml2::xml_find_all(curRent, ".//AuthorList/Author/LastName") %>% xml2::xml_text(),
          xml2::xml_find_all(curRent, ".//AuthorList/Author/ForeName") %>% xml2::xml_text(),
          sep = ", ")
        affiliate <- xml2::xml_find_all(curRent, ".//AuthorList/Author/AffiliationInfo[1]/Affiliation") %>% xml2::xml_text()
        # fall back to NA when no affiliations or no authors were found
        if (is.na(affiliate[1])) {
          affiliate <- NA
        }
        if (is.na(authors[1])) {
          authors <- NA
        }
        # trim authors to the number of affiliations so the columns line up
        # (note: this pairs them by position, which may not match the true author)
        if (length(authors) > length(affiliate)) {
          authors <- authors[seq_along(affiliate)]
        }
        df <- data.frame(pmid = pmid, title = title, authors = authors, affiliate = affiliate, i = i)
        extractedInfo <- rbind(extractedInfo, df)
      }
    }
    

    You may then use various methods to format 'extractedInfo' as you desire. Cheers!
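
As one possible way to do that formatting, here is a minimal sketch (not part of the original solution) that collapses 'extractedInfo' to one row per article using base R only. The toy data frame stands in for the real result and its values are made up; `collapse_unique` and `perArticle` are hypothetical names introduced just for this example.

```r
# Toy stand-in for 'extractedInfo' (values are made up for illustration).
extractedInfo <- data.frame(
  pmid      = c("111", "111", "222"),
  title     = c("Paper A", "Paper A", "Paper B"),
  authors   = c("Smith, Jane", "Lee, Min", "Garcia, Ana"),
  affiliate = c("Univ X", "Univ Y", NA),
  i         = c(1, 1, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join the unique non-NA values with "; ",
# or return NA when nothing is left.
collapse_unique <- function(x) {
  x <- unique(x[!is.na(x)])
  if (length(x) == 0) NA_character_ else paste(x, collapse = "; ")
}

# One row per article: split by PMID, then summarise each piece.
perArticle <- do.call(rbind, lapply(
  split(extractedInfo, extractedInfo$pmid),
  function(d) {
    data.frame(
      pmid       = d$pmid[1],
      title      = d$title[1],
      authors    = collapse_unique(d$authors),
      affiliates = collapse_unique(d$affiliate),
      stringsAsFactors = FALSE
    )
  }
))
rownames(perArticle) <- NULL

print(perArticle)
```

If you need a file instead, `write.csv(perArticle, "affiliations.csv", row.names = FALSE)` should work on the result (the file name is just a placeholder).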