How can I extract an attribute value from an XML file using a custom extractor in a U-SQL job? I am able to extract the sub-element values from the XML file.
Sample XML file:
<?xml version="1.0" encoding="UTF-8"?>
<User ID="001">
<User ID="002">
I am able to extract FirstName and LastName using the code below. How can I get the ID value as part of the CSV file?
Sample U-SQL job:
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
@input = EXTRACT
FirstName string,
LastName string
FROM @"/USERS.xml"
USING new Microsoft.Analytics.Samples.Formats.Xml.XmlExtractor("User",
new SQL.MAP<string, string> {
{ "FirstName", "FirstName" },
{ "LastName", "LastName" } });
@output = SELECT * FROM @input;
OUTPUT @output
TO "/USERS.csv"
USING Outputters.Csv();
You can do this easily in Databricks, e.g.:
CREATE TABLE User
USING com.databricks.spark.xml
OPTIONS (path "/FileStore/tables/input42.xml", rowTag "User")
Then read the table:
SELECT *
FROM User;
If you must do it with U-SQL, then using the XmlDomExtractor from the Formats assembly worked for me:
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
DECLARE @inputFile string = "/input/input40.xml";
@input =
    EXTRACT id string,
            firstName string,
            lastName string
    FROM @inputFile
    USING new Microsoft.Analytics.Samples.Formats.Xml.XmlDomExtractor(rowPath : "/Users/User",
          columnPaths : new SQL.MAP<string, string>{
              { "@ID", "id" },
              { "FirstName", "firstName" },
              { "LastName", "lastName" }
          });
@output =
    SELECT *
    FROM @input;
OUTPUT @output
TO "/output/output.csv"
USING Outputters.Csv();
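To see what the column map is doing, here is a rough Python sketch of the same idea using only the standard library. It is not how XmlDomExtractor is implemented; it just mimics the mapping, treating a path that starts with "@" as an attribute of the row element and anything else as a child element. The sample XML (a Users root, the names Ann/Lee and Bob/Ray) and the helper names extract_rows and to_csv are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: the root element and the names are placeholders.
SAMPLE_XML = """<Users>
  <User ID="001">
    <FirstName>Ann</FirstName>
    <LastName>Lee</LastName>
  </User>
  <User ID="002">
    <FirstName>Bob</FirstName>
    <LastName>Ray</LastName>
  </User>
</Users>"""

# Same shape as the U-SQL column map: source path -> output column name.
COLUMN_PATHS = {"@ID": "id", "FirstName": "firstName", "LastName": "lastName"}


def extract_rows(xml_text, row_tag, column_paths):
    """Return one dict per row element, keyed by output column name."""
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.findall(row_tag):
        record = {}
        for path, column in column_paths.items():
            if path.startswith("@"):
                # "@ID" means: read the ID attribute of the row element.
                record[column] = row.get(path[1:])
            else:
                # Otherwise read the text of the named child element.
                record[column] = row.findtext(path)
        rows.append(record)
    return rows


def to_csv(rows, columns):
    """Serialise the rows as CSV text without a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_rows(SAMPLE_XML, "User", COLUMN_PATHS)
csv_text = to_csv(rows, list(COLUMN_PATHS.values()))
print(csv_text)
```

The point to take away is that the attribute is addressed relative to the row element, which is why "@ID" sits next to "FirstName" and "LastName" in the map rather than needing a separate row path.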