
SPARQL regex doesn't match Persian characters with the "i" flag

I expect the ignore-case "i" flag to only increase the number of matches, never to decrease it, but the following SPARQL query (endpoint) returns one match without the flag and no matches with it:

select * { ?s rdfs:label ?l. filter(regex(str(?l),"قانون بیمارستان")) }

-> 1 match

select * { ?s rdfs:label ?l. filter(regex(str(?l),"قانون بیمارستان","i")) }

-> no match

With non-Persian letters it works as expected:

select count(*) { ?s rdfs:label ?l.filter(regex(str(?l),"Information"))}

-> 319 matches

select count(*) { ?s rdfs:label ?l.filter(regex(str(?l),"Information","i"))}

-> 363 matches

What is the reason for this behaviour and how can I change it to behave as expected?
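
For reference, the behaviour I expect can be reproduced outside the endpoint. This is only a minimal client-side sketch using Python's standard `re` module and says nothing about how Virtuoso evaluates `regex`. The sample labels and the helper name `matching_labels` are made up for illustration. Persian letters have no upper or lower case, so the ignore-case flag should leave the Persian result unchanged and can only add matches for the Latin pattern:

```python
import re

# Hypothetical sample labels standing in for rdfs:label values from the endpoint.
LABELS = [
    "قانون بیمارستان",
    "Information system",
    "information retrieval",
]


def matching_labels(pattern, labels, ignore_case=False):
    """Return the labels containing `pattern` (substring search, as in SPARQL regex)."""
    flags = re.IGNORECASE if ignore_case else 0
    compiled = re.compile(pattern, flags)
    return [label for label in labels if compiled.search(label)]


persian = "قانون بیمارستان"
print(len(matching_labels(persian, LABELS)))                          # 1
print(len(matching_labels(persian, LABELS, ignore_case=True)))        # 1
print(len(matching_labels("Information", LABELS)))                    # 1
print(len(matching_labels("Information", LABELS, ignore_case=True)))  # 2
```

Here the flag never removes a match, which is what I would also expect from the SPARQL queries above.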

Virtuoso version 07.20.3217 on Linux (x86_64-unknown-linux-gnu), Single Server Edition

P.S.: The problem still persists after an upgrade to 07.20.3229.

The problem also occurs on DBpedia, which has the same version right now:

select *
  <> dbo:abstract ?l.    


  • I found an open issue on the Virtuoso GitHub repository regarding this problem; it seems to be under investigation.

    Thanks to all the commenters for helping with the investigation and for giving great workarounds and alternatives.