dc:creator string literal vs. regex FILTER in SPARQL


I am using Europeana's Virtuoso SPARQL Endpoint.

I have been trying to write a SPARQL query that finds content by a specific creator. To my understanding, this can be done as follows:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?title 
WHERE {
     ?objectInfo dc:title ?title .
     ?objectInfo dc:creator 'Picasso' .
}

Nevertheless, I get nothing in return.

Alternatively, I used FILTER regex to search for the literal.

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?title ?creator
WHERE {
     ?objectInfo dc:title ?title .
     ?objectInfo dc:creator ?creator .
     FILTER regex(?creator, 'Picasso')
}

This worked well and returned the correct results.

My question is: is it possible to write a SPARQL query that finds the work of a particular artist without using FILTER?

Many thanks.


Solution

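  • I don't think there are any objects whose dc:creator is exactly the string 'Picasso'. A literal in a triple pattern has to match the stored value in full, whereas regex matches any value that contains 'Picasso'. So a regex filter is a good choice, but it is slow, since the endpoint generally has to test each creator value against the pattern.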
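
    A quick way to check the exact-match case on its own is an ASK query, which only reports whether at least one matching triple exists. This is a minimal sketch that reuses the triple pattern from the question; if it returns false, no object has a creator value that is exactly the plain literal 'Picasso':

    ```sparql
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    # Returns true only if some object has exactly 'Picasso' as its creator.
    ASK {
         ?objectInfo dc:creator 'Picasso' .
    }
    ```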

    Here's a way to find the strings your regex is matching:

    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT ?creator (COUNT(?creator) AS ?ccount)
    WHERE {
         ?objectInfo dc:title ?title .
         ?objectInfo dc:creator ?creator .
         FILTER regex(?creator, 'Picasso')
    }
    GROUP BY ?creator
    ORDER BY ?ccount
    

    It might have been easier for you to see that if you had displayed all variables in the SELECT clause:

    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT *
    WHERE {
         ?objectInfo dc:title ?title .
         ?objectInfo dc:creator ?creator .
         FILTER regex(?creator, 'Picasso')
    }
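
    Note that regex is case-sensitive by default, so the filter above would miss a value such as 'PICASSO, Pablo'. Standard SPARQL accepts an optional third flags argument; the sketch below assumes you want to ignore case, and the lowercase pattern is only there to make that visible:

    ```sparql
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT *
    WHERE {
         ?objectInfo dc:title ?title .
         ?objectInfo dc:creator ?creator .
         # The third argument 'i' makes the match case-insensitive.
         FILTER regex(?creator, 'picasso', 'i')
    }
    ```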
    

    If you don't want to use a regex filter, you could enumerate all of the Picasso variants you are looking for:

    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT *
    WHERE {
         VALUES ?creator { "Picasso, Pablo" "Pablo Picasso" }
         ?objectInfo dc:title ?title .
         ?objectInfo dc:creator ?creator .
    }
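
    One caveat with exact matching: a plain literal and a language-tagged literal are different RDF terms, so "Pablo Picasso" does not match "Pablo Picasso"@en. If the data carries language tags, each tagged variant has to be listed too. The sketch below is only an illustration; the @en and @es tags are assumptions, not values confirmed on this endpoint:

    ```sparql
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT *
    WHERE {
         # The @en and @es tags are hypothetical; list the tags
         # that actually occur in the data you are querying.
         VALUES ?creator { "Pablo Picasso" "Pablo Picasso"@en "Pablo Picasso"@es }
         ?objectInfo dc:title ?title .
         ?objectInfo dc:creator ?creator .
    }
    ```

    You can inspect the tag of each matched value by adding (LANG(?creator) AS ?lang) to a query that already returns ?creator.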
    

    bif:contains, Virtuoso's built-in full-text search predicate, works on this endpoint and is pretty fast:

    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT *
    WHERE {
         ?objectInfo dc:title ?title .
         ?objectInfo dc:creator ?creator .
         ?creator bif:contains 'Picasso'
         #FILTER regex(?creator, 'Picasso')
    }
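
    Keep in mind that bif:contains is a Virtuoso extension, so the query will not run on other SPARQL engines. Its argument is a full-text search expression rather than a plain string, which means terms can be combined. A sketch of that, assuming the endpoint's text index covers the dc:creator values:

    ```sparql
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT *
    WHERE {
         ?objectInfo dc:title ?title .
         ?objectInfo dc:creator ?creator .
         # Both words must occur in the creator string, in any order.
         ?creator bif:contains 'Picasso AND Pablo' .
    }
    ```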