I am trying to add dynamically generated XML content to an eXist-db collection using Perl (see addFile.pl below). Whenever the content contains UTF-8 characters, I receive this error: Failed to parse XML-RPC request: Byte "195" is not a member of the (7-bit) ASCII character set.
#!/usr/bin/perl
use RPC::XML;
use RPC::XML::Client;
my ($sec, $min, $hour, $mday, $mon, $year) = localtime();
my $timestamp = sprintf("%04d%02d%02d%02d%02d%02d",$year+1900,$mon+1,$mday,$hour,$min,$sec);
print("Timestamp: $timestamp\n");
my $FILENAME = "$timestamp.xml";
my $COLLECTION = 'output';
my $record = <<END;
<document id="doc_20150419014112">
<text>ñáéíóú</text>
</document>
END
$query = <<END;
xquery version "3.0";
import module namespace xmldb="http://exist-db.org/xquery/xmldb";
declare variable \$filename := '$FILENAME';
declare variable \$record := '';
let \$log-in := xmldb:login("/db", "admin", "admin")
(: let \$create-collection := xmldb:create-collection("/db", "$COLLECTION") :)
let \$record :=
$record
for \$target in ('/db/$COLLECTION')
return xmldb:store(\$target, \$filename, \$record)
END
print $query;
$URL = "http://admin:admin\@localhost:8080/exist/xmlrpc";
# connecting to $URL...
$client = RPC::XML::Client->new($URL);
# Output options
$options = RPC::XML::struct->new(
    'indent'            => 'yes',
    'encoding'          => 'UTF-8',
    'highlight-matches' => 'none',
);
$req = RPC::XML::request->new("query", $query, 20, 1, $options);
$response = $client->send_request($req);
if ($response->is_fault) {
    die "An error occurred: " . $response->string . "\n";
}
my $result = $response->value;
print $result;
When I run the XQuery script (shown in the output below) directly in eXide, it runs normally, but when I run it through the Perl script I receive the following:
$ perl addFile.pl
Timestamp: 20150428162016
xquery version "3.0";
import module namespace xmldb="http://exist-db.org/xquery/xmldb";
declare variable $filename := '20150428162016.xml';
declare variable $record := '';
let $log-in := xmldb:login("/db", "admin", "admin")
(: let $create-collection := xmldb:create-collection("/db", "output") :)
let $record :=
<document id="doc_20150419014112">
<text>ñáéíóú</text>
</document>
for $target in ('/db/output')
return xmldb:store($target, $filename, $record)
An error occurred: Failed to parse XML-RPC request: Byte "195" is not a member of the (7-bit) ASCII character set.
I found the solution here. I will quote the answer just in case:
The RPC::XML Perl module uses us-ascii as the XML encoding by default. If you are delivering UTF-8 content from a database or other sources, RPC::XML produces invalid XML with the default setting.
The XML encoding used by RPC::XML can only be changed globally:
#!/usr/bin/perl
use RPC::XML;
use RPC::XML::Client;
$RPC::XML::ENCODING = 'utf-8';
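To see what the setting changes, the following minimal sketch serializes a request before and after the assignment and prints only the XML declaration. It does not contact eXist-db. The helper name xml_declaration and the dummy query string are made up for illustration, and the sketch assumes the declaration is the first thing in the output of as_string.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use RPC::XML;

# Hypothetical helper: build a throwaway request and return the XML
# declaration found at the start of its serialized form.
sub xml_declaration {
    my $req = RPC::XML::request->new('query', 'xquery version "3.0"; 1');
    my ($decl) = $req->as_string =~ /\A(<\?xml[^>]*\?>)/;
    return $decl;
}

# Declaration written with the module's default encoding.
print "default:  ", xml_declaration(), "\n";

# Package variable, so it applies to every request serialized afterwards.
$RPC::XML::ENCODING = 'utf-8';
print "override: ", xml_declaration(), "\n";
```

If the quoted answer is right, the first line should name us-ascii and the second utf-8. In addFile.pl the assignment only needs to run before $client->send_request($req) is called.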