I have a string that needs to be parsed using XSLT 2.0.
Input string
Hoffmann, Rüdiger (Universtiy-A, SomeCity, (SomeCountry); University-B, SomeCity, (SomeCountry)); Author, X; Author, B. (University-C, SomeCity (SomeCountry))
Expected output
Hoffmann, Rüdiger (Universtiy-A, SomeCity, (SomeCountry); University-B, SomeCity, (SomeCountry))
Author, X
Author, B. (University-C, SomeCity (SomeCountry))
Each entry is an author name, optionally followed by the author's universities in parentheses. One author can have more than one university, and the same delimiter (a semicolon in this case) separates both the universities of one author and the author groups themselves.
I need to split the string on the semicolons that separate author-affiliation groups, ignoring the semicolons between affiliations.
I believe this can be done with a regex, but I don't have much experience writing regexes myself.
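As long as the parentheses around the list of universities and around each country are always present, you can match each author group directly instead of splitting on the delimiter. The pattern below covers authors with one or two affiliations. An author without any affiliation, such as "Author, X" in your input, is not matched and is dropped from the result: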
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs mf">
  <xsl:output method="text"/>
  <xsl:param name="authors">Author, A. (Universtiy-A, SomeCity, (SomeCountry); University-B, SomeCity, (SomeCountry));Author, B. (University-C, SomeCity (SomeCountry))</xsl:param>
  <xsl:template match="/">
    <!-- Write each author group on its own line -->
    <xsl:value-of select="mf:split($authors)" separator="&#10;"/>
  </xsl:template>
  <!-- Returns one string per author group; text between matches is discarded -->
  <xsl:function name="mf:split" as="xs:string*">
    <xsl:param name="input" as="xs:string"/>
    <xsl:analyze-string select="$input" regex="[^;)]*?\([^(]*?\([^(]*?\)\)">
      <xsl:matching-substring>
        <xsl:sequence select="."/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:function>
</xsl:transform>
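If some authors have no affiliation at all, as with "Author, X" in the question's input, one possible variation is to make the parenthesized block optional and let it contain any number of nested country groups. The following is only a sketch under some assumptions: parentheses are balanced and nest at most one level deep inside the affiliation list, and nothing follows the closing parenthesis of an author group except the delimiter. The function name mf:split-groups is made up for this example, and normalize-space() is used to trim each group, which also collapses runs of whitespace inside it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs mf">

  <xsl:output method="text"/>

  <!-- Input string taken from the question -->
  <xsl:param name="authors">Hoffmann, Rüdiger (Universtiy-A, SomeCity, (SomeCountry); University-B, SomeCity, (SomeCountry)); Author, X; Author, B. (University-C, SomeCity (SomeCountry))</xsl:param>

  <xsl:template match="/">
    <xsl:value-of select="mf:split-groups($authors)" separator="&#10;"/>
  </xsl:template>

  <!-- Hypothetical helper: returns one trimmed string per author group.
       Regex: a name made of characters other than ; ( ) followed by an
       optional parenthesized block. Inside that block, each item is either
       a nested (...) pair or any single non-parenthesis character, so
       semicolons between affiliations stay part of the match. -->
  <xsl:function name="mf:split-groups" as="xs:string*">
    <xsl:param name="input" as="xs:string"/>
    <xsl:analyze-string select="$input"
        regex="[^;()]+(\((\([^()]*\)|[^()])*\))?">
      <xsl:matching-substring>
        <!-- Skip matches that contain only whitespace, e.g. after a trailing delimiter -->
        <xsl:if test="normalize-space(.)">
          <xsl:sequence select="normalize-space(.)"/>
        </xsl:if>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:function>
</xsl:transform>
```

With the question's input this should give the three expected entries, one per line. The semicolons between author groups fall outside every match, so xsl:analyze-string treats them as non-matching substrings and they are simply not output.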