I expect the case-insensitive "i" flag to only ever increase the number of matches, never decrease it. However, the following SPARQL query (endpoint http://www.snik.eu/sparql) returns one match without the flag but none with it:
select * { ?s rdfs:label ?l. filter(regex(str(?l),"قانون بیمارستان")) }
-> 1 match
select * { ?s rdfs:label ?l. filter(regex(str(?l),"قانون بیمارستان","i")) }
-> no match
With non-Persian letters it works as expected:
select count(*) { ?s rdfs:label ?l. filter(regex(str(?l),"Information")) }
-> 319 matches
select count(*) { ?s rdfs:label ?l. filter(regex(str(?l),"Information","i")) }
-> 363 matches
What is the reason for this behaviour and how can I change it to behave as expected?
Virtuoso version 07.20.3217 on Linux (x86_64-unknown-linux-gnu), Single Server Edition
P.S.: The problem persists after an upgrade to 07.20.3229.
It also occurs on DBpedia, which currently runs the same version:
select *
{
<http://dbpedia.org/resource/Persian_language> dbo:abstract ?l.
filter(regex(str(?l),"فارسی","i")).
}
I found an open issue about this problem on the Virtuoso GitHub repository at https://github.com/openlink/virtuoso-opensource/issues/705; it seems to be under investigation.
Thanks to all the commenters for helping with the investigation and for giving great workarounds and alternatives.
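For later readers, here is one possible workaround as a sketch. It is an assumption on my part, not necessarily what the commenters proposed, and I have not verified it against the affected Virtuoso versions. Persian script has no upper/lower case distinction, so for purely Persian patterns the "i" flag can simply be left out, as in the first query above. For patterns that mix scripts, the standard SPARQL 1.1 functions contains and lcase avoid the regex engine entirely. Note that contains does plain substring matching, so it only replaces regex when the pattern contains no regex metacharacters.

```sparql
# Sketch only: assumes the endpoint implements the SPARQL 1.1 string functions
# contains() and lcase() correctly for non-Latin text (not verified here).
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

select *
{
  ?s rdfs:label ?l.
  # Case-insensitive substring test without regex and without the "i" flag.
  filter(contains(lcase(str(?l)), lcase("قانون بیمارستان")))
}
```

If this query returns the same single match as the flag-less regex version, that would suggest the problem is limited to how the regex "i" flag is handled.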