I have election results data in XML files that I am trying to import into R. This is my first time working with XML, but I can't make sense of the .xls version of the data that is also available for download, so I'm attempting to work with the XML instead.
There isn't a direct link to the XML file, but it can be reached from https://results.enr.clarityelections.com/IL/Bloomington/109017/web.276013/#/summary by scrolling down a bit on the right side to "Reports" and downloading "Detail XML".
I've been trying to use xml2 to get it into a data frame. I can read_xml the file and then turn it into a list, but after that my attempts have given me only a variety of errors or more lists with a lot of NULLs. It's possible the weirdness is caused by the XML file itself, but I don't know enough about XML to know whether that is the case.
Here's the solution I ended up with: use XSLT to restructure the XML before trying to construct a data frame. The basics of the solution came from "R: convert XML data to data frame" (coincidentally also about election data).
XSLT - The stylesheet restructures the document into one long list of every Precinct node, with the applicable info from its Choice, Contest, and VoteType ancestors (plus the election name) copied onto it as attributes.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="/ElectionResult">
    <xsl:copy>
      <xsl:apply-templates select="descendant::Precinct"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="Precinct">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:attribute name="election">
        <xsl:value-of select="ancestor::ElectionResult/ElectionName"/>
      </xsl:attribute>
      <xsl:attribute name="contest">
        <xsl:value-of select="ancestor::Contest/@text"/>
      </xsl:attribute>
      <xsl:attribute name="choice">
        <xsl:value-of select="ancestor::Choice/@text"/>
      </xsl:attribute>
      <xsl:attribute name="votetype">
        <xsl:value-of select="ancestor::VoteType[1]/@name"/>
      </xsl:attribute>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
R - The xslt package works as an extension for xml2 to apply the .xsl file.
library(xml2)
library(xslt)
library(tidyverse)
# Parse XML and XSL (the stylesheet above is saved locally as style.xsl)
xml <- read_xml("electionresults.xml")
style <- read_xml("style.xsl")
# Transform XML
new_xml <- xslt::xml_xslt(xml, style)
# Build data frame: one row per Precinct node, one column per attribute
elections <- new_xml %>%
  xml_find_all("//Precinct") %>%
  map_dfr(~list(election = xml_attr(., "election"),
                contest = xml_attr(., "contest"),
                choice = xml_attr(., "choice"),
                votetype = xml_attr(., "votetype"),
                precinct = xml_attr(., "name"),
                votes = xml_attr(., "votes"))) %>%
  # Guess column types so that e.g. votes is parsed as a number
  type_convert()
The mapping process for building the data frame came from "R XML - combining parent and child nodes into data frame".
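For anyone who would rather skip the XSLT step, the same flattening can probably be done with xml2 alone by walking up from each Precinct node with XPath axes. This is only a minimal sketch on a made-up toy document: the element and attribute names are assumed to mirror the ones the stylesheet relies on, the sample values are invented, and restricting the search to //Choice/VoteType/Precinct is my assumption about which Precinct nodes hold per-choice counts. I have not run it against the real "Detail XML" file.

```r
library(xml2)

# Toy document (hypothetical data); structure assumed to match what the
# stylesheet above expects: Contest > Choice > VoteType > Precinct
doc <- read_xml('<ElectionResult>
  <ElectionName>Sample Election</ElectionName>
  <Contest text="Mayor">
    <Choice text="Alice">
      <VoteType name="Election Day">
        <Precinct name="P1" votes="10"/>
        <Precinct name="P2" votes="7"/>
      </VoteType>
      <VoteType name="Early">
        <Precinct name="P1" votes="3"/>
        <Precinct name="P2" votes="4"/>
      </VoteType>
    </Choice>
    <Choice text="Bob">
      <VoteType name="Election Day">
        <Precinct name="P1" votes="5"/>
        <Precinct name="P2" votes="9"/>
      </VoteType>
    </Choice>
  </Contest>
</ElectionResult>')

# Select only Precinct nodes nested under a Choice (assumption, see above)
precincts <- xml_find_all(doc, "//Choice/VoteType/Precinct")

# xml_find_first() on a node set returns one match per input node, so each
# column below lines up row for row with `precincts`
elections <- data.frame(
  election = xml_text(xml_find_first(doc, "/ElectionResult/ElectionName")),
  contest  = xml_attr(xml_find_first(precincts, "ancestor::Contest"), "text"),
  choice   = xml_attr(xml_find_first(precincts, "ancestor::Choice"), "text"),
  votetype = xml_attr(xml_find_first(precincts, "parent::VoteType"), "name"),
  precinct = xml_attr(precincts, "name"),
  votes    = as.integer(xml_attr(precincts, "votes")),
  stringsAsFactors = FALSE
)
```

Because every call here is vectorised over the whole node set, there is no per-node map step, which may matter on a large results file. The XSLT route above is still what I actually used.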