SPARQL - Filter comment, label and abstract to English and keep integer as well as URI values


I'm trying to get the available information about certain resources from DBpedia. However, I want the label, comment and abstract to be returned only in English.

My original query is:

SELECT ?property ?value (lang(?value ) as ?lang) { <http://dbpedia.org/resource/England> ?property ?value . }

To keep only the English results, I modified it to:

SELECT ?property ?value (lang(?value ) as ?lang) { <http://dbpedia.org/resource/England> ?property ?value .  FILTER(LANG(?value) = "en") }

However, doing this I lose a lot of entries, such as population, density, position, latitude, longitude and many more.

I was wondering if there is a way to get all the available entries but restrict the abstract, label and comment to English only.

I have tried to modify my query to the following:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?property ?value (lang(?value ) as ?lang) { <http://dbpedia.org/resource/England> ?property ?value .   FILTER(LANG(?rdfs:label) = "en")}

Unfortunately, I did not get any results because the query is invalid. Any help would be greatly appreciated.


Solution

  • I answered my own question after some research. A FILTER can keep every value that either has no language tag at all or has an English one (@en). The condition is:

    FILTER(LANG(?value) = "" || LANGMATCHES(LANG(?value), "en"))
    

    However, this drops all URIs and other links: per the SPARQL specification, LANG() raises an error when it is applied to a URI rather than a literal, and a FILTER whose expression errors removes that row. To keep the links alongside the English-filtered results, you need something like a startsWith function; its SPARQL equivalent is strstarts. I added the following condition to the FILTER in my query and got all the results I originally needed:

    || strstarts(str(?value), 'http')
    

    So this is the resulting query. Note: I removed the PREFIX declarations since the query does not need them.

     SELECT ?property ?value { <http://dbpedia.org/resource/England> ?property ?value .
          FILTER(LANG(?value) = "" || LANGMATCHES(LANG(?value), "en") || strstarts(str(?value), 'http'))}
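
The query above drops non-English language-tagged values for every property. The question only asked to restrict the label, comment and abstract, so a narrower variant might look like the sketch below. It is not part of the original answer. It assumes the three properties are rdfs:label, rdfs:comment and dbo:abstract (with dbo: bound to the DBpedia ontology namespace) and that the endpoint supports the SPARQL 1.1 NOT IN operator.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>   # assumed namespace for dbo:abstract

SELECT ?property ?value (LANG(?value) AS ?lang)
WHERE {
  <http://dbpedia.org/resource/England> ?property ?value .

  # Rows for any other property pass through untouched (URIs, numbers,
  # literals in any language). Rows for the three text properties are
  # kept only when the value carries an English language tag.
  FILTER(
    ?property NOT IN (rdfs:label, rdfs:comment, dbo:abstract)
    || LANGMATCHES(LANG(?value), "en")
  )
}
```

Because the property test comes out true for URI values of other properties, the error that LANG() raises on a URI does not remove those rows. Separately, SPARQL has a built-in isIRI(?value) test, which could stand in for the strstarts(str(?value), 'http') check in the resulting query above, since it does not depend on how the URI is spelled.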