Tags: xml, bash, parsing, xsd

Script to parse XML, download a file from a given URL, then transfer it to another computer


I need a master computer to download a predefined XML file and parse it. Using the information in that file, a script should identify the correct file and download it from a web server periodically. Finally, that file must be transferred to another computer on the same LAN, which has no internet access, and extracted to a given location there.

However, I have constraints: I can't use PHP to accomplish this, and I can't install anything there. I've proposed doing this via MD5 sums, but a day and hour must be specified for downloading the file to the master computer, and another day and hour for transferring it to the slim terminal.

I've designed this XML:

<?xml version="1.0"?>
<updateXML id="subdirectory1">
    <file>
        <tarname>compressedFile.tar.gz</tarname>
        <name>fileInsideName</name>
        <filExtension>.ext</filExtension>
        <url protocol="http://">someurl.com/mainDirectory</url>
    </file>
    <update>
        <download>2012-02-02T03:00:00.00000</download>
        <copyTo terminal="1">2012-02-02T09:00:00.00000</copyTo>
    </update>
</updateXML>

And this working XML Schema (XSD):

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="updateXML">
        <xsd:complexType>
            <xsd:sequence>
            
                <xsd:element name="file" minOccurs="1" maxOccurs="unbounded">
                    <xsd:complexType>
                        <xsd:all>
                            <xsd:element name="tarname" type="xsd:string"/>
                            <xsd:element name="name" type="xsd:string"/>
                            <xsd:element name="filExtension" type="xsd:string"/>
                            <xsd:element name="url">
                                <xsd:complexType>
                                    <xsd:simpleContent>
                                        <xsd:extension base="xsd:string">
                                            <xsd:attribute name="protocol" type="xsd:string" />
                                        </xsd:extension>
                                    </xsd:simpleContent>
                                </xsd:complexType>
                            </xsd:element>
                        </xsd:all>
                    </xsd:complexType>
                </xsd:element>
                
                <xsd:element name="update" minOccurs="1" maxOccurs="1">
                    <xsd:complexType>
                        <xsd:sequence>
                            <xsd:element name="download" type="xsd:dateTime" />
                            <xsd:element name="copyTo">
                                <xsd:complexType>
                                    <xsd:simpleContent>
                                        <xsd:extension base="xsd:string">
                                            <xsd:attribute name="terminal" type="xsd:int" />
                                        </xsd:extension>
                                    </xsd:simpleContent>
                                </xsd:complexType>
                            </xsd:element>
                        </xsd:sequence>
                    </xsd:complexType>
                </xsd:element>

            </xsd:sequence>
            <xsd:attribute name="id" type="xsd:string"/>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

I've done some research and found that xmllint can parse and validate XML files. No problems so far:

$ xmllint --noout --schema updatemenus.xsd updatemenus.xml 
updatemenus.xml validates
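
In a script, the same check can gate the later steps through xmllint's exit status. The sketch below assumes xmllint is on the PATH and reuses the file names from the command above; the function name validate_update_xml is made up for illustration.

```shell
#!/usr/bin/env bash
# Succeed only if the XML file is well-formed and valid against the schema.
# Hypothetical helper; assumes xmllint (libxml2) is installed.
validate_update_xml() {
    local xsd=$1 xml=$2
    # --noout suppresses the document dump; only the exit status matters here.
    xmllint --noout --schema "$xsd" "$xml" 2>/dev/null
}

if validate_update_xml updatemenus.xsd updatemenus.xml; then
    echo "updatemenus.xml is valid, continuing"
else
    echo "updatemenus.xml failed validation, stopping here" >&2
fi
```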

I've thought of a process that could help me accomplish the task. I'd like to know whether my proposed steps are correct:

  1. Generate the XML and validate it against its XSD.
  2. Download it to the master computer and parse it there. Use the available info to construct the complete download URL.
  3. Once the file is downloaded, secure-copy it to the target terminal in the LAN.
  4. Check whether a newer XML is available and download it (a cron job?).
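
Steps 2 and 3 could be sketched roughly as follows. This is only an illustration under several assumptions: xmllint is a build that supports the --xpath option, wget and scp are available on the master computer, and the final URL is formed as protocol + url + "/" + id + "/" + tarname, which the XML does not actually state. The function names and the scp destination are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of steps 2 and 3. Assumes xmllint with --xpath support, wget and scp.

# Print the text value selected by an XPath expression.
# string(...) yields the text content rather than the node markup.
xml_value() {
    local xml=$1 xpath=$2
    xmllint --xpath "string($xpath)" "$xml"
}

# ASSUMPTION: the final URL is <protocol><url>/<id>/<tarname>.
build_url() {
    local protocol=$1 base=$2 subdir=$3 tarname=$4
    # ${base%/} drops one trailing slash so the join never yields "//".
    printf '%s%s/%s/%s\n' "$protocol" "${base%/}" "$subdir" "$tarname"
}

# Download the archive named in the XML, then copy it to the terminal.
# $2 is a hypothetical scp destination such as user@terminal1:/some/dir/
fetch_and_copy() {
    local xml=$1 dest=$2 protocol base subdir tarname url
    protocol=$(xml_value "$xml" '/updateXML/file/url/@protocol') || return 1
    base=$(xml_value "$xml" '/updateXML/file/url') || return 1
    subdir=$(xml_value "$xml" '/updateXML/@id') || return 1
    tarname=$(xml_value "$xml" '/updateXML/file/tarname') || return 1
    url=$(build_url "$protocol" "$base" "$subdir" "$tarname")
    wget -q -O "$tarname" "$url" || return 1
    scp "$tarname" "$dest"
}
```

A call would look like `fetch_and_copy updatemenus.xml user@terminal1:/some/dir/`. Since the schema allows several file elements, note that string() only returns the value of the first match; handling more than one would need an index such as /updateXML/file[2]/tarname. For step 4, the same script could be run from cron.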

Is this correct? If so, now that my XML is valid, what comes next? How can I use its info? I'm new to xmllint and to XPath.


Solution

  • I can't help with the whole thing, but you'll find dateTime defined in

    http://www.w3.org/2001/XMLSchema.xsd
    

    Edit: my point is that your document specifies

     <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    

    which is an existing document but not a true .xsd file, and on its own it will not validate any of your XML.

    I hope this helps.
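
Since the download and copyTo values are written in the dateTime format, a script can compare them with the current time to decide whether an action is due. The sketch below is a rough illustration that assumes GNU date (for the -d option) and timestamps without a time zone suffix, as in the sample XML; the function names to_epoch and is_due are made up.

```shell
#!/usr/bin/env bash
# Convert a timestamp such as 2012-02-02T03:00:00.00000 to epoch seconds.
# ASSUMPTION: GNU date (-d) and no time zone suffix on the value.
to_epoch() {
    local stamp=$1
    stamp=${stamp%%.*}   # drop the fractional seconds
    stamp=${stamp/T/ }   # 2012-02-02T03:00:00 becomes "2012-02-02 03:00:00"
    date -d "$stamp" +%s
}

# Succeed when the scheduled time is now or already in the past.
# Returns 2 if the timestamp could not be parsed.
is_due() {
    local due now
    due=$(to_epoch "$1") || return 2
    now=$(date +%s)
    [ "$now" -ge "$due" ]
}
```

For example, `is_due '2012-02-02T03:00:00.00000' && echo "time to download"` would run the download branch once the scheduled time has passed. The timestamp is interpreted in the local time zone of the machine running the script.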