I wanted to calculate variance (or standard deviation) of wikipedia page length history, for light novel books (Light_novels), with dbpedia (http://live.dbpedia.org/sparql).
The SPARQL is here:
PREFIX dbpedia0: <http://dbpedia.org/ontology/>
PREFIX term: <http://purl.org/dc/terms/>
SELECT DISTINCT ?item ((SUM((?n - ?mean)*(?n - ?mean)))/(COUNT(?n) - 1) AS ?variance)
WHERE
{
?item term:subject dbc:Light_novels .
?item dbpedia0:wikiPageLength ?n
{
SELECT ?item (AVG(?n) AS ?mean) WHERE {
?item dbpedia0:wikiPageLength ?n .
}
}
}
This gives an error Virtuoso 37000 Error SP031: SPARQL compiler: The name ?n is used in multiple clauses, without any logical connection
and I tried modifying the code blindlessly countless times, but ended up no idea how can I achieve that.
I really appreciate your helpful comments. Thanks.
Normally this should work
PREFIX dbpedia0: <http://dbpedia.org/ontology/>
PREFIX term: <http://purl.org/dc/terms/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?item IF(?count=1,0,(SUM(?x)/(?count-1))) as ?var
{
SELECT ?item ?n xsd:integer(?n) as ?ni ?mean xsd:integer(?ni-?mean) as ?nmean ((?nmean)*(?nmean)) as ?x ?count
{
?item dbpedia0:wikiPageLength ?n
{
SELECT ?item AVG(?n2) as ?mean (COUNT(?item)) as $count
{
?item term:subject dbc:Light_novels .
?item dbpedia0:wikiPageLength ?n2
}
GROUP BY ?item
}
}
} GROUP BY ?item ?count
but it throws this error
Internal Optimized compiler error : Bad dfe in sqlo_place_exp in sqldf.c:2491
So I have modified it to this one:
PREFIX dbpedia0: <http://dbpedia.org/ontology/>
PREFIX term: <http://purl.org/dc/terms/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?item IF(?count=1,-1,(?sx/(?count-1))) as ?var
{
SELECT ?item SUM(?x) as ?sx ?count
{
SELECT ?item ?n xsd:integer(?n) as ?ni ?mean xsd:integer(?ni-?mean) as ?nmean ((?nmean)*(?nmean)) as ?x ?count
{
?item dbpedia0:wikiPageLength ?n
{
SELECT ?item AVG(?n2) as ?mean (COUNT(?item)) as $count
{
?item term:subject dbc:Light_novels .
?item dbpedia0:wikiPageLength ?n2
}
GROUP BY ?item
}
}
} GROUP BY ?item ?count
}
we need to aggregations a. for calculating the mean and b. to calculate the sum of differences of each n
from mean.
I needed to convert type to integer the variable xsd:integer(?ni-?mean)
, otherwise it does not show the result of multiply
With running the second inner subquery you can see how the variables are modified