Search code examples
sparqlopendata

SPARQL group by a substring and average


I am querying a large data set (temperatures recorded hourly for nearly 20 years) and I'd rather get a summary, e.g. daily temperatures.

An example query is here: http://www.boisvert.me.uk/opendata/sparql_aq+.html?pasteid=hu5rbc7W

PREFIX opensheff: <uri://opensheffield.org/properties#>

select ?time ?temp where {
    ?m opensheff:sensor <uri://opensheffield.org/datagrid/sensors/Weather_Mast/Weather_Mast.ic> ;
       opensheff:rawValue ?temp ;
       <http://purl.oclc.org/NET/ssnx/ssn#endTime> ?time .
  FILTER (str(?time) > "2011-09-24")
}
ORDER BY ASC(?time)

And the results look like this:

time                temp
"2011-09-24T00:00Z" 12.31
"2011-09-24T01:00Z" 11.68
"2011-09-24T02:00Z" 11.92
"2011-09-24T03:00Z" 11.59

Now I would like to group by a part of the date string, so as to get a daily average temperature:

time            temp
"2011-09-24"    12.3  # or whatever
"2011-09-23"    11.7
"2011-09-22"    11.9
"2011-09-21"    11.6

So, how do I group by a substring of ?time ?


Solution

  • Eventually solved it. Running here:

    http://www.boisvert.me.uk/opendata/sparql_aq+.html?pasteid=j8m0Qk6s

    Code: PREFIX opensheff:

    select ?d AVG(?temp) as ?day_temp
    where {
        ?m opensheff:sensor <uri://opensheffield.org/datagrid/sensors/Weather_Mast/Weather_Mast.ic> ;
           opensheff:rawValue ?temp ;
           <http://purl.oclc.org/NET/ssnx/ssn#endTime> ?time .
        BIND( SUBSTR(?time, 1, 10) AS ?d ) .
    }
    GROUP BY ?d
    ORDER BY ASC(?d)
    

    We use BIND to set a new variable to the substring required, and then grouping and averaging by that variable is simple enough.