
Reading an XML file with XPath queries


Good morning. I need to extract a series of values with XPath queries from the XML file shown at the end of this post.

I can obtain the values I need down to the FileGroup level of the schema, for example with a query like //FileGroup/File/Path.

However, I cannot read the data contained in the EmbeddedMetadata node.

I have tried various queries, for example //FileGroup/File/MoreInfo/EmbeddedMetadata/SoggettoDestinatario/Denominazione, but I cannot find the correct one. In particular, I would like to read the values contained in elements such as:

<FileNameOriginale>IT09533610011_173.xml</FileNameOriginale>

<SoggettoDestinatario role="Addressee" type="organization">
   <Denominazione>REWIND Srl</Denominazione>

<MetadataAggiuntivi name="NumeroDocumento"
   value="2020047"

Specifically, in this case I would like to read: IT09533610011_173.xml, REWIND Srl, 2020047

Thank you very much in advance to anyone kind enough to help me.

<?xml version="1.0" encoding="utf-8"?>
<SIP url="http://www.archismall.com" version="1.0" xmlns:s="http://www.uni.com/U3011/sincro/">
    <SelfDescription>
        <ID s:scheme="local">IDVc9df21aa-f2f9-4345-a580-c8f777367adc1</ID>
        <CreatingApplication>
            <Name>ArchiSMALL</Name>
            <Version>1.12.2</Version>
            <Producer>Archivist SRL</Producer>
        </CreatingApplication>
    </SelfDescription>
    <VdV>
        <ID s:scheme="local">SIPc9df21aa-f2f9-4345-a580-c8f777367adc</ID>
    </VdV>
    <FileGroup>
        <File encoding="binary" format="text/xml">
            <ID s:scheme="local">a6b72ff7-4287-4f56-96d8-1da9a89d2316</ID>
            <Path>document/1/IT09533610011_173.xml</Path>
            <Hash function="SHA-256">6fb942e36b879764cb5cf95a2bffa3585f7f04155447ccc38d38441ed7dd6852</Hash>
            <MoreInfo xmlns="http://archismall.com/IDV_EmbeddedMetadata_XSD.xsd">
                <EmbeddedMetadata>
                    <FileNameOriginale>IT09533610011_173.xml</FileNameOriginale>
                    <IdDocumento s:scheme="local">a6b72ff7-4287-4f56-96d8-1da9a89d2316</IdDocumento>
                    <ImprontaDocumento function="SHA-256">6fb942e36b879764cb5cf95a2bffa3585f7f04155447ccc38d38441ed7dd6852</ImprontaDocumento>
                    <OggettoDocumento>Fattura Elettronica Passiva</OggettoDocumento>
                    <DataChiusura normal="+01">2021-01-07 12:41:38</DataChiusura>
                    <SoggettoProduttore role="Producer" type="organization">
                        <Denominazione>GEFIR IMMOBILIARE S.R.L.</Denominazione>
                        <PartitaIva scheme="VATRegistrationNumber">09533610011</PartitaIva>
                        <CodiceFiscale scheme="TaxCode">09533610011</CodiceFiscale>
                    </SoggettoProduttore>
                    <SoggettoDestinatario role="Addressee" type="person">
                        <Nome>Mario</Nome>
                        <Cognome>Infanti</Cognome>
                        <CodiceFiscale scheme="TaxCode">NFNMRA46R09F463G</CodiceFiscale>
                    </SoggettoDestinatario>
                    <SoggettoDestinatario role="Addressee" type="organization">
                        <Denominazione>REWIND Srl</Denominazione>
                        <PartitaIva scheme="VATRegistrationNumber"/>
                        <CodiceFiscale scheme="TaxCode">02406910352</CodiceFiscale>
                    </SoggettoDestinatario>
                    <MetadataAggiuntivi name="PeriodoEsercizio"
                        value="2020" xmlns="http://archismall.com/Metadata.xsd"/>
                    <MetadataAggiuntivi name="NumeroDocumento"
                        value="2020047" xmlns="http://archismall.com/Metadata.xsd"/>
                    <MetadataAggiuntivi name="DataDocumento"
                        value="2020-12-31" xmlns="http://archismall.com/Metadata.xsd"/>
                    <MetadataAggiuntivi name="TipoDocumento"
                        value="Fattura" xmlns="http://archismall.com/Metadata.xsd"/>
                    <MetadataAggiuntivi name="ProgressivoInvio"
                        value="173" xmlns="http://archismall.com/Metadata.xsd"/>
                    <MetadataAggiuntivi name="ResponsabileConservazione"
                        value="Mario Infanti" xmlns="http://archismall.com/Metadata.xsd"/>
                </EmbeddedMetadata>
            </MoreInfo>
        </File>
    </FileGroup>
    <Process>
        <Agent role="Producer" type="person">
            <AgentName>
                <NameAndSurname>
                    <FirstName>Mario</FirstName>
                    <LastName>Infanti</LastName>
                </NameAndSurname>
            </AgentName>
            <Agent_ID scheme="TaxCode">NFNMRA46R09F463G</Agent_ID>
        </Agent>
        <Agent role="Producer" type="organization">
            <AgentName>
                <FormalName>REWIND Srl</FormalName>
            </AgentName>
            <Agent_ID scheme="TaxCode">02406910352</Agent_ID>
        </Agent>
        <TimeReference>
            <TimeInfo normal="+01">2021-01-07 12:41:38</TimeInfo>
        </TimeReference>
    </Process>
</SIP>

Solution

  • The MoreInfo element declares the default namespace http://archismall.com/IDV_EmbeddedMetadata_XSD.xsd, so unprefixed names in a path do not match it or its descendants. If you are using XPath 1, either bind a prefix (e.g. idv) to that namespace in your environment and use it, e.g. //idv:MoreInfo/idv:EmbeddedMetadata/idv:FileNameOriginale, or match on the local name, e.g. //*[local-name() = 'MoreInfo']/*[local-name() = 'EmbeddedMetadata']/*[local-name() = 'FileNameOriginale'].

    With XPath 2 or 3 you can always use namespace wildcards, e.g. //*:MoreInfo/*:EmbeddedMetadata/*:FileNameOriginale, for elements that are in a namespace. In XPath 3 you can even include the namespace URI in each step of a path, e.g. Q{http://archismall.com/IDV_EmbeddedMetadata_XSD.xsd}EmbeddedMetadata.
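As a rough illustration of the prefix-binding approach, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The question does not say which environment is in use, so Python is an assumption, and ElementTree supports only a subset of XPath 1 (for instance, it has no local-name() function). The prefixes idv and md are arbitrary names chosen for this example, and the XML is a trimmed copy of the document in the question. Note that the MetadataAggiuntivi elements declare a different namespace, http://archismall.com/Metadata.xsd, so they need their own prefix.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the SIP document from the question (only the queried parts).
XML = """
<SIP url="http://www.archismall.com" version="1.0" xmlns:s="http://www.uni.com/U3011/sincro/">
  <FileGroup>
    <File encoding="binary" format="text/xml">
      <Path>document/1/IT09533610011_173.xml</Path>
      <MoreInfo xmlns="http://archismall.com/IDV_EmbeddedMetadata_XSD.xsd">
        <EmbeddedMetadata>
          <FileNameOriginale>IT09533610011_173.xml</FileNameOriginale>
          <SoggettoDestinatario role="Addressee" type="person">
            <Nome>Mario</Nome>
          </SoggettoDestinatario>
          <SoggettoDestinatario role="Addressee" type="organization">
            <Denominazione>REWIND Srl</Denominazione>
          </SoggettoDestinatario>
          <MetadataAggiuntivi name="NumeroDocumento"
              value="2020047" xmlns="http://archismall.com/Metadata.xsd"/>
        </EmbeddedMetadata>
      </MoreInfo>
    </File>
  </FileGroup>
</SIP>
"""

# Prefix-to-namespace bindings. The prefix names are hypothetical; only the
# URIs have to match the xmlns declarations in the document.
NS = {
    "idv": "http://archismall.com/IDV_EmbeddedMetadata_XSD.xsd",
    "md": "http://archismall.com/Metadata.xsd",
}

root = ET.fromstring(XML)

# FileGroup and File are in no namespace; MoreInfo and below need the prefix.
file_name = root.findtext(
    ".//FileGroup/File/idv:MoreInfo/idv:EmbeddedMetadata/idv:FileNameOriginale",
    namespaces=NS,
)

# Pick the organization recipient by its (un-namespaced) type attribute.
recipient = root.findtext(
    ".//idv:SoggettoDestinatario[@type='organization']/idv:Denominazione",
    namespaces=NS,
)

# MetadataAggiuntivi lives in the second namespace; its value is an attribute.
doc_number = root.find(
    ".//md:MetadataAggiuntivi[@name='NumeroDocumento']", NS
).get("value")

# ElementTree's namespace wildcard (Python 3.8+), similar in spirit to *:name.
wildcard_name = root.findtext(".//{*}FileNameOriginale")

print(file_name, recipient, doc_number)
```

In an XPath 2 or 3 processor the same three lookups could be written with *: wildcards instead of bound prefixes, which avoids having to track which namespace each element belongs to.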