Search code examples
sql-serverrxmladventureworks

Export XML data from SQL Server to R


I need to export the attribute Demographics from the table Person.Person -- in XML -- from SQL Server database AdventureWorks2014 into R to make some statistical analysis. I would like to use the package XML but it looks like R doesn't recognize my file as XML when I am exporting it.

Does someone know a way to go straight from SQL to R with some package? Without exporting the data into CSV before?

One of my tuple in SQL looks like this:

<IndividualSurvey xmlns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/IndividualSurvey">
    <TotalPurchaseYTD>-31</TotalPurchaseYTD>
    <DateFirstPurchase>2003-11-01Z</DateFirstPurchase>
    <BirthDate>1962-07-26Z</BirthDate>
    <MaritalStatus>S</MaritalStatus>
    <YearlyIncome>0-25000</YearlyIncome>
    <Gender>M</Gender>
    <TotalChildren>1</TotalChildren>
    <NumberChildrenAtHome>0</NumberChildrenAtHome>
    <Education>Graduate Degree</Education>
    <Occupation>Manual</Occupation>
    <HomeOwnerFlag>0</HomeOwnerFlag>
    <NumberCarsOwned>0</NumberCarsOwned>
    <CommuteDistance>0-1 Miles</CommuteDistance>
</IndividualSurvey>

I would like to have different attributes in R like this:

TotalPurchaseYTD  DateFirstPurchase  BirthDate   MaritalStatus  YearlyIncome...
-31                  2003-11-01    1962-07-26      S            0-25000     ....

Solution

  • we can use:

    library(xml2)
    df <- read_xml(x) %>% as_list %>% sapply(rbind) %>% as.data.frame
    
    df
    # TotalPurchaseYTD DateFirstPurchase   BirthDate MaritalStatus YearlyIncome Gender TotalChildren NumberChildrenAtHome Education Occupation HomeOwnerFlag NumberCarsOwned CommuteDistance
    # 1              -31       2003-11-01Z 1962-07-26Z             S      0-25000      M             1                    0 1 Graduate Degree     Manual             0               0       0-1 Miles