sparql rdf virtuoso openlink-virtuoso dcat

SPARQL UNION returns Virtuoso 37000 Error SP031


I have the query shown below:

SELECT DISTINCT ?dataset ?title WHERE { 

      ?dataset a dcat:Dataset ; 
      dcterms:title ?title ; 
      dcterms:description ?description .

      { ?dataset dcterms:title ?title . 
        ?title bif:contains "'keyword_1'" }        
      UNION
      { ?dataset dcterms:description ?description . 
        ?description bif:contains "'keyword_1'" }

      { ?dataset dcterms:title ?title . 
        ?title bif:contains "'keyword_2'" }
      UNION
      { ?dataset dcterms:description ?description . 
        ?description bif:contains "'keyword_2'" }
    }

Semantically, this query is supposed to return all datasets that have "keyword_1" in either their title or their description (the first UNION clause) and "keyword_2" in either their title or their description (the second UNION clause). The intent is to intersect the two UNION clauses, that is, to return only those datasets that satisfy both.
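To make the intended semantics concrete, here is a small Python sketch of "intersection of two unions". The dataset names and texts are made up for illustration, and plain substring matching stands in for bif:contains, which in Virtuoso is a free-text index lookup rather than a substring test.

```python
# Toy data: hypothetical datasets, not taken from any real catalogue.
datasets = {
    "ds1": {"title": "keyword_1 report", "description": "mentions keyword_2"},
    "ds2": {"title": "keyword_1 only", "description": "nothing else"},
    "ds3": {"title": "untitled", "description": "keyword_1 and keyword_2"},
}


def matches_any_field(record, keyword):
    """One UNION clause: the keyword occurs in the title OR the description."""
    return keyword in record["title"] or keyword in record["description"]


def datasets_matching(keyword):
    """Names of all datasets that satisfy the UNION clause for this keyword."""
    return {name for name, record in datasets.items()
            if matches_any_field(record, keyword)}


# Intersect the two UNION results: only datasets matching both keywords remain.
both = datasets_matching("keyword_1") & datasets_matching("keyword_2")
print(sorted(both))
```

Here ds2 drops out because it only mentions keyword_1; that is the behaviour I expect from the query.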

A SPARQL validator tells me that the query is syntactically correct. However, when I send the query to Virtuoso, the following error is returned:

Virtuoso 37000 Error SP031: SPARQL compiler: Internal error: sparp_find_triple_with_var_obj_of_freetext(): lost connection between triple pattern and an ft predicate


SPARQL query:
define sql:big-data-const 0 

output-format:text/html
define sql:signal-void-variables 1 

Do you have any idea what's going on? I don't understand what Virtuoso is trying to tell me with "lost connection between triple pattern and an ft predicate".

Thanks in advance!


Solution

  • This may be a bug in the query executor or optimizer. Virtuoso experts such as TallTed know better and can give you proper support. I can at least reproduce it on e.g. https://www.europeandataportal.eu/sparql, which runs Virtuoso version 07.20.3230 on Linux (x86_64-unknown-linux-gnu), Single Server Edition.

    More importantly, your query looks more complex than it needs to be: you could use a FILTER that combines logical || and &&. At least that's what I thought.

    Unfortunately, that approach fails with the error

    Virtuoso 37000 Error SP031: SPARQL compiler: No suitable triple pattern is found for a variable $description in special predicate bif:contains() at line 7 of query
    

    and neither

    SELECT DISTINCT ?dataset ?title WHERE { 
      ?dataset a dcat:Dataset ; 
      dcterms:title ?title ; 
      dcterms:description ?description .
      filter( (bif:contains(?title, "'keyword_1'") || bif:contains(?description,"'keyword_1'")) 
                && 
              (bif:contains(?title, "'keyword_2'") || bif:contains(?description,"'keyword_2'"))
      )   
    }
    

    nor

    SELECT DISTINCT ?dataset ?title WHERE { 
      ?dataset a dcat:Dataset ; 
      dcterms:title ?title ; 
      dcterms:description ?description .
      filter(bif:contains(?title, "'keyword_1'") || bif:contains(?description,"'keyword_1'"))
      filter(bif:contains(?title, "'keyword_2'") || bif:contains(?description,"'keyword_2'"))         
    }
    

    works as I'd expect.

    (Verbose) workaround using subqueries:

    SELECT DISTINCT ?dataset ?title WHERE { 
     {
      select ?dataset ?title { 
      ?dataset a dcat:Dataset ; 
               dcterms:title ?title ; 
               dcterms:description ?description .
      filter( bif:contains(?title, "'keyword_1'") || bif:contains(?description,"'keyword_1'")) 
      }
     }
     {
      select ?dataset ?title { 
      ?dataset a dcat:Dataset ; 
               dcterms:title ?title ; 
               dcterms:description ?description .
      filter( bif:contains(?title, "'keyword_2'") || bif:contains(?description,"'keyword_2'"))
      } 
     }     
    }
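If you need more than two keywords, the subquery workaround above can be generated instead of written by hand. The following is only a sketch: build_keyword_query is a hypothetical helper name, it assumes the dcat: and dcterms: prefixes are predefined on the endpoint (as in the queries above), and it rejects keywords containing quotes rather than trying to escape them. I have not checked how Virtuoso behaves with a larger number of subqueries.

```python
# One subquery per keyword, mirroring the hand-written workaround.
# Doubled braces are literal braces for str.format.
SUBQUERY_TEMPLATE = """  {{
    SELECT ?dataset ?title {{
      ?dataset a dcat:Dataset ;
               dcterms:title ?title ;
               dcterms:description ?description .
      FILTER( bif:contains(?title, "'{kw}'") || bif:contains(?description, "'{kw}'") )
    }}
  }}"""


def build_keyword_query(keywords):
    """Return a SPARQL query string that joins one subquery per keyword."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    blocks = []
    for kw in keywords:
        # Keep the sketch simple: refuse quotes instead of escaping them.
        if "'" in kw or '"' in kw:
            raise ValueError(f"unsupported character in keyword: {kw!r}")
        blocks.append(SUBQUERY_TEMPLATE.format(kw=kw))
    # Adjacent group patterns are joined on ?dataset and ?title,
    # which gives the intersection across keywords.
    return ("SELECT DISTINCT ?dataset ?title WHERE {\n"
            + "\n".join(blocks)
            + "\n}")


print(build_keyword_query(["keyword_1", "keyword_2"]))
```

For the two keywords from the question this prints a query equivalent to the workaround above, just with different whitespace.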