Cassandra Error - Clustering column cannot be restricted (preceding column is restricted by a non-EQ relation)

select * from journeydetails where bucketid in('2015-12') and vehicleid in('1234567')
  and starttime > '2015-12-1 00:00:00' and starttime < '2015-12-3 23:59:59' 
  and travelduration > 1800000;

That's not going to work. The reason goes back to how Cassandra stores data on-disk. The idea with Cassandra is that it is very efficient at returning a single row with a precise key, or at returning a continuous range of rows from the disk.

Your rows are partitioned by bucketid, and then sorted on disk by vehicleid, starttime, and travelduration. Because you are already executing a range query (non-EQ relation) on starttime, you cannot restrict the key that follows. This is because the travelduration restriction may disqualify some of the rows in your range condition. This would result in an inefficient, non-continuous read. Cassandra is designed to protect you from writing queries (such as this), which may have unpredictable performance.

Here are two alternatives:

1- If you could restrict all of your key columns prior to travelduration (with an equals relation), then you could apply a your greater-than condition:

select * from journeydetails where bucketid='2015-12' and vehicleid='1234567'
  and starttime='2015-12-1 00:00:00' and travelduration > 1800000;

Of course, restricting on an exact starttime may not be terribly useful.

2- Another approach would be to omit travelduration altogether, and then your original query would work.

select * from journeydetails where bucketid='2015-12' and vehicleid='1234567'
  and starttime > '2015-12-1 00:00:00' and starttime < '2015-12-3 23:59:59';

Unfortunately, Cassandra does not offer a large degree of query flexibility. Many people have found success using a solution like Spark (alongside Cassandra) to achieve this level of reporting.

And just a side note, but don't use IN unless you have to. Querying with IN is similar to using a secondary index, in that Cassandra has to talk to several nodes to satisfy your query. Calling it with a single item probably isn't too big of a deal. But IN is one of those old RDBMS habits that you should really break before getting too deep into Cassandra.

Cassandra Error - Clustering column cannot be restricted (preceding column is restricted by a non-EQ relation)

Tags:

Cassandra

Cql

Cql3

Datastax Enterprise

Related

Recent Posts