Search code examples
xmlcsvxmlstarlet

Recursive loop xml to csv with xmlstarlet


I would like to convert a complex xml to csv.

<tests>
    <test>
        <name>AAA</name> 
        <language>BBB</language> 
        <Project>
            <name>XXX</name>
            <id>123</id>
        </Project>
        <fac>
            <name>XXX</name>
            <idt>
                <number>99</number>
            <idt>
            <pers>YYY</pers>
        </fac>
        <fac>
            <name>BBB</name>
            <idt>
                <number>70</number>
            <idt>
            <pers>MMM</pers>
        </fac>
        <fac>
            <name>XXX</name>
            <idt>
                <number>40</number>
            <idt>
            <pers>XXX</pers>
        </fac>
        <date>2018</date>
    </test>
    <test>
    <name>BBB</name> 
    <language>CCC</language> 
    <Project>
        <name>AAA</name>
        <id>12</id>
    </Project>
    <fac>
        <name>YXX</name>
        <idt>
            <number>10</number>
        <idt>
        <pers>LLL</pers>
    </fac>
    <fac>
        <name>BB</name>
        <idt>
            <number>7</number>
        <idt>
        <pers>MM</pers>
    </fac>
    <fac>
        <name>XX</name>
        <idt>
            <number>40</number>
        <idt>
        <pers>XXX</pers>
    </fac>
    <date>2018</date>
</test>
 <tests> 

What I've done so far :

xmlstarlet \
    sel -T -t -m /tests/test \
-v "concat(name,';'
,language,';'
,Project/name,';'
,Project/id,';'
,fac/name,';'
,fac/idt/number,';'
,fac/pers,';'
,date)"
   -n test.xml > test.csv

Everything works has expected but I can only get the data contained in the first node. What I would like to have something like that :

For the first

    name;language;name;id;data contained in the first node"fac";date 
    name(same as first line);language(same as first line); etc..; data contained in the second "fac" node;date (same as first line)
etc... as much as there are face nodes

and then for the second node.

I don't know if this can be done using xmlstarlet ?

Thank you in advance for your help RFlow


Solution

  • If you want an entry for each fac, that's what you should match. Then you can go up to the ancestor test to get the other data you need.

    Example...

    xmlstarlet sel -T -t -m "//fac" -v "concat(ancestor::test/name,';',ancestor::test/language,';',ancestor::test/Project/name,';',ancestor::test/Project/id,';',name,';',idt/number,';',pers,';',ancestor::test/date)" -n test.xml
    

    Output...

    AAA;BBB;XXX;123;XXX;99;YYY;2018
    AAA;BBB;XXX;123;BBB;70;MMM;2018
    AAA;BBB;XXX;123;XXX;40;XXX;2018
    BBB;CCC;AAA;12;YXX;10;LLL;2018
    BBB;CCC;AAA;12;BB;7;MM;2018
    BBB;CCC;AAA;12;XX;40;XXX;2018
    

    Here's the concat() broken down for readabilty...

    concat(
        ancestor::test/name,';',
        ancestor::test/language,';',
        ancestor::test/Project/name,';',
        ancestor::test/Project/id,';',
        name,';',
        idt/number,';',
        pers,';',
        ancestor::test/date
    )