I'm using DSE solr to index a cassandra table that contains a collection of UDTs. I want to be able to sort search results based on a value inside those UDTs.
Given a simplistic example table...
create type test_score (
test_name text,
percentile double,
score int,
description text
);
create table students (
id int,
name text,
test_scores set<frozen<test_score>>,
...
);
... and assuming I'm auto-generating the solr schema via dsetool
, I want to be able to write a solr query that finds students who have taken a test (by a specific test_name), and sort them by that test's score (or percentile, or whatever).
Unfortunately you can't sort by UDT fields.
However, I'm not sure what the value of a UDT is here. Perhaps I don't know enough about your use case. Another issue I see is that each partition key is a student id, so you can only store one test result per student. A better approach might be to use test id as a clustering column so you can store all the test results for a student in a single partition. Something like this:
CREATE TABLE students (
id int,
student_name text,
test_name text,
score int,
percentile double,
description text,
PRIMARY KEY (id, student_name, test_name)
);
Student name is kind of redundant (it should be the same for every row in each partition), but it doesn't have to be a clustering column.
Then you can sort on any field like so:
SELECT * FROM students WHERE solr_query='{"q":"test_name:Biology", "sort":"percentile desc"}' LIMIT 10;
I've used the JSON syntax described here: https://docs.datastax.com/en/datastax_enterprise/4.8/datastax_enterprise/srch/srchJSON.html