SPARQL: select item and count occurrences of its label

I have this SPARQL query directed to the Open Research Knowledge Graph (ORKG):

PREFIX orkgr: <>
PREFIX orkgc: <>
PREFIX orkgp: <>
PREFIX rdfs: <>
PREFIX xsd: <>
PREFIX rdf: <>

SELECT ?o1Label (COUNT(?o1Label) AS ?o1LabelCount)
WHERE {
  ?o1 a orkgc:Paper.
  ?o1 rdfs:label ?o1Label.
  FILTER (strlen(?o1Label) > 1).
}
GROUP BY ?o1Label
ORDER BY DESC(?o1LabelCount)

This returns each label (?o1Label) together with the number of times that label occurs (?o1LabelCount).

How can I extend this query to also include a column for the actual item (?o1)?

Because there might be multiple candidates (when ?o1LabelCount is greater than 1), there should be one row for each of these items (each with the same label and the same label count).


  • I see two options:

    The first (and probably better) option is to use GROUP_CONCAT to collect the entities into a single field, which is then split again on the application side. It could look like this (link):

    PREFIX orkgr: <>
    PREFIX orkgc: <>
    PREFIX orkgp: <>
    PREFIX rdfs: <>
    PREFIX xsd: <>
    PREFIX rdf: <>
    SELECT ?o1Label (GROUP_CONCAT(STR(?o1); separator="\t") AS ?o1s) (COUNT(?o1Label) AS ?o1LabelCount)
    WHERE {
      ?o1 a orkgc:Paper.
      ?o1 rdfs:label ?o1Label.
      FILTER (strlen(?o1Label) > 1).
    }
    GROUP BY ?o1Label
    ORDER BY DESC(?o1LabelCount)

    An alternative is to use a nested query (subquery), which returns one row per item, as you described (link):

    PREFIX orkgr: <>
    PREFIX orkgc: <>
    PREFIX orkgp: <>
    PREFIX rdfs: <>
    PREFIX xsd: <>
    PREFIX rdf: <>
    SELECT ?o1Label ?o1 ?o1LabelCount
    WHERE {
      ?o1 rdfs:label ?o1Label .
      {
        # Subquery: count how often each paper label occurs
        SELECT ?o1Label (COUNT(?o1Label) AS ?o1LabelCount)
        WHERE {
          [
            a orkgc:Paper;
            rdfs:label ?o1Label
          ]
          FILTER (strlen(?o1Label) > 1).
        }
        GROUP BY ?o1Label
      }
    }
    ORDER BY DESC(?o1LabelCount)
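
For the first option, the concatenated ?o1s field still has to be split on the application side. Below is a minimal Python sketch of that step. It assumes the endpoint returns the standard SPARQL 1.1 JSON results format and that the separator is a tab, as in the GROUP_CONCAT query above. The function name expand_rows and the sample response (including the example.org IRIs and labels) are made up for illustration and are not real ORKG data.

```python
from typing import Iterator, Tuple


def expand_rows(response: dict, separator: str = "\t") -> Iterator[Tuple[str, str, int]]:
    """Yield one (label, item, count) tuple per item in each grouped row."""
    for binding in response["results"]["bindings"]:
        label = binding["o1Label"]["value"]
        count = int(binding["o1LabelCount"]["value"])
        # The GROUP_CONCAT field holds all items for this label, tab-separated.
        for item in binding["o1s"]["value"].split(separator):
            yield label, item, count


# Hypothetical response in the SPARQL 1.1 JSON results format.
sample_response = {
    "head": {"vars": ["o1Label", "o1s", "o1LabelCount"]},
    "results": {
        "bindings": [
            {
                "o1Label": {"type": "literal", "value": "Example paper"},
                "o1s": {
                    "type": "literal",
                    "value": "http://example.org/R1\thttp://example.org/R2",
                },
                "o1LabelCount": {"type": "literal", "value": "2"},
            },
            {
                "o1Label": {"type": "literal", "value": "Another paper"},
                "o1s": {"type": "literal", "value": "http://example.org/R3"},
                "o1LabelCount": {"type": "literal", "value": "1"},
            },
        ]
    },
}

rows = list(expand_rows(sample_response))
for label, item, count in rows:
    print(label, item, count, sep=" | ")
```

A tab is a reasonably safe separator here because the concatenated values are IRIs, which cannot contain tab characters. The result has the same shape as the nested-query variant: one row per item, with the label and its count repeated.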