Why is the DSE Search unique key the partition key in Cassandra?

I have a column family that I expose to some applications via DataStax Enterprise Search's Solr HTTP API. For some use cases, I thought it might be preferable to access the CQL layer directly.

When taking a closer look at the underlying data model, though, I see that the unique key in Solr is mapped to the partition key in Cassandra, without making use of compound keys with clustering columns.

Won't this produce a single wide row per partition? And isn't that a "poor" data model for large data sets?

Solution

The unique key in your Solr schema should be a comma-separated list of all of the partition and clustering columns, enclosed within parentheses. Composite partition keys are supported as well as compound primary keys.

See the doc: http://www.datastax.com/documentation/datastax_enterprise/4.5/datastax_enterprise/srch/srchConfSkema.html
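As a rough illustration of the format described above, here is a minimal Python sketch that builds the `<uniqueKey>` element from a table's key columns. The helper `solr_unique_key` and the column names `sensor_id` and `reading_time` are made up for this example and are not part of DSE; check the linked doc for the exact rules in your version.

```python
def solr_unique_key(partition_cols, clustering_cols):
    """Render a Solr <uniqueKey> element listing every primary key column.

    Hypothetical helper for illustration only; it just formats a string.
    """
    cols = list(partition_cols) + list(clustering_cols)
    if not cols:
        raise ValueError("a primary key needs at least one column")
    if len(cols) == 1:
        # A single-column key is written as a plain field name.
        return "<uniqueKey>%s</uniqueKey>" % cols[0]
    # Several columns: comma-separated list enclosed in parentheses.
    return "<uniqueKey>(%s)</uniqueKey>" % ",".join(cols)


# Assumed example table with PRIMARY KEY (sensor_id, reading_time)
unique_key = solr_unique_key(["sensor_id"], ["reading_time"])
print(unique_key)
```

The columns are listed partition columns first, then clustering columns, following the order of the CQL primary key.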

Yes, you do get a single wide storage row for each partition key, but it's your choice whether a column in your Cassandra primary key should be used as a clustering column or in the partition key. If you feel that your storage rows in Cassandra are too wide, move one of the clustering columns into a composite partition key, or add another column for that purpose.
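To make the trade-off concrete, the following Python sketch renders the CQL `PRIMARY KEY` clause for two layouts of the same three columns. The helper `primary_key_clause` and the column names (`sensor_id`, `day`, `reading_time`) are assumptions chosen for illustration, not anything taken from your schema.

```python
def primary_key_clause(partition_cols, clustering_cols):
    """Render a CQL PRIMARY KEY clause from lists of column names.

    Hypothetical helper for illustration only; it just formats a string.
    """
    if len(partition_cols) > 1:
        # A composite partition key gets its own inner parentheses.
        partition = "(%s)" % ", ".join(partition_cols)
    else:
        partition = partition_cols[0]
    return "PRIMARY KEY (%s)" % ", ".join([partition] + list(clustering_cols))


# Layout A: one partition per sensor, so every reading lands in one wide storage row.
wide = primary_key_clause(["sensor_id"], ["day", "reading_time"])

# Layout B: 'day' moved into a composite partition key, so each partition holds one day.
narrow = primary_key_clause(["sensor_id", "day"], ["reading_time"])

print(wide)
print(narrow)
```

Both layouts use the same three primary key columns; only the partition boundary changes, and with it the width of each storage row.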

Balancing the number of partitions and partition width is of course critical, but DSE/Solr is not restricting your choice.