
Filter language only if the type is literal


This is probably almost the same question as "Filter by language only if the object is a literal".

The problem is that the answer there doesn't work in my case.

I have this query:

SELECT ?property ?value
WHERE { <http://dbpedia.org/resource/Facebook> ?property ?value
FILTER(STRSTARTS(STR(?property), "http://dbpedia.org/property") || STRSTARTS(STR(?property), "http://dbpedia.org/ontology"))}

Result in Virtuoso

There you would see a list of properties including "alexa rating 2" and "abstract" in many languages.

If I then try the suggested solution in the mentioned question above:

SELECT ?property ?value
WHERE { <http://dbpedia.org/resource/Facebook> ?property ?value
FILTER(STRSTARTS(STR(?property), "http://dbpedia.org/property") || STRSTARTS(STR(?property), "http://dbpedia.org/ontology"))
FILTER(!isLiteral(?value) || langMatches(lang(?value), "EN"))}

Result in Virtuoso

Now only the English version of "abstract" is there, but "alexa rating 2" and many other non-string values are gone.

Does anyone know how to get all properties as in the first query and, for literals only, keep just the English ones?


Solution

  • Your second query does filter out literals that have a language tag other than English, but it also drops every literal that has no language tag at all. In RDF 1.0, there are three types of literals:

    • plain literals (no datatype, and no language tag)
    • language tagged literals (a string and a language tag)
    • datatype literals (a lexical form and a datatype)

    So the Alexa rating, which has a value of 2, is a literal, and it doesn't have a language tag, so the language tag certainly isn't "EN" (and more importantly, doesn't match "EN"; langMatches does some more complex checks). What you want is to filter out non-English language tagged literals. That's not hard; you just need to add lang(?value) = "" to the filter (since lang returns "" for literals with no language tag):

    SELECT ?property ?value
    WHERE { <http://dbpedia.org/resource/Facebook> ?property ?value
    FILTER(STRSTARTS(STR(?property), "http://dbpedia.org/property") || STRSTARTS(STR(?property), "http://dbpedia.org/ontology"))
    FILTER(!isLiteral(?value) || lang(?value) = "" || langMatches(lang(?value), "EN"))}
    

    SPARQL results

    The way to read that filter is:

    Keep values that

    1. are not literals; or
    2. are literals, but don't have a language tag; or
    3. are literals with a language tag that matches "EN" (langMatches compares case-insensitively, so "en" and "en-GB" both match).
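
To make the difference between the two filters concrete, here is a minimal Python sketch that models them over a handful of made-up values. It is not a SPARQL engine: `Term`, `lang_matches`, `keep_original`, and `keep_fixed` are hypothetical names introduced only for this illustration, the sample values are invented rather than taken from DBpedia, and `lang_matches` is a simplified approximation of the basic language-range matching that langMatches performs.

```python
from typing import NamedTuple, Optional


class Term(NamedTuple):
    """Hypothetical stand-in for an RDF term (not a real library type)."""
    lexical: str
    is_literal: bool = True
    lang: str = ""                  # "" mirrors what lang() returns for untagged literals
    datatype: Optional[str] = None


def lang_matches(tag: str, lang_range: str) -> bool:
    """Simplified, case-insensitive approximation of langMatches."""
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        return tag != ""            # "*" matches any non-empty tag
    return tag == lang_range or tag.startswith(lang_range + "-")


def keep_original(t: Term) -> bool:
    # FILTER(!isLiteral(?value) || langMatches(lang(?value), "EN"))
    return (not t.is_literal) or lang_matches(t.lang, "EN")


def keep_fixed(t: Term) -> bool:
    # FILTER(!isLiteral(?value) || lang(?value) = "" || langMatches(lang(?value), "EN"))
    return (not t.is_literal) or t.lang == "" or lang_matches(t.lang, "EN")


# Invented sample values, one per case discussed above.
values = [
    Term("http://example.org/some-resource", is_literal=False),
    Term("2", datatype="http://www.w3.org/2001/XMLSchema#integer"),
    Term("plain text"),
    Term("English text", lang="en"),
    Term("British text", lang="en-GB"),
    Term("Deutscher Text", lang="de"),
]

kept_original = [t.lexical for t in values if keep_original(t)]
kept_fixed = [t.lexical for t in values if keep_fixed(t)]

print(kept_original)
print(kept_fixed)
```

With these sample values, the original filter keeps only the resource and the two English-tagged literals, while the fixed filter also keeps "2" and "plain text". Only the German-tagged literal is dropped in both cases.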