How queries on Cassandra clustering keys work


Suppose your clustering keys are

k1 t1, k2 t2, ..., kn tn

where ki is the ith key name and ti is the ith key type. Then data is stored in lexicographic order, where each dimension is compared using the comparator for its type.

So (a1, a2, ..., an) < (b1, b2, ..., bn) if a1 < b1 using the t1 comparator, or a1 = b1 and a2 < b2 using the t2 comparator, or (a1 = b1 and a2 = b2) and a3 < b3 using the t3 comparator, and so on.
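The comparison rule above can be sketched in a few lines of Python. This is only an illustration of the ordering, not Cassandra's implementation; lex_less and the per-type "less than" functions passed to it are hypothetical names introduced here.

```python
def lex_less(a, b, comparators):
    """Return True if tuple a sorts before tuple b.

    comparators[i](p, q) is a hypothetical stand-in for the type-ti
    comparator: it returns True when p sorts before q.
    """
    for x, y, less in zip(a, b, comparators):
        if less(x, y):
            return True   # first differing dimension decides: a < b
        if less(y, x):
            return False  # first differing dimension decides: a > b
        # equal in this dimension, so fall through to the next one
    return False          # all dimensions equal, so a is not less than b


# Assumption: Python's built-in < stands in for every type's comparator.
natural = lambda p, q: p < q

print(lex_less(("a", 2), ("b", 1), [natural, natural]))  # True: k1 decides
```

Note that the second component is never inspected once the first components differ, which is why k1 dominates the ordering.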

This means that it is efficient to find all rows with a certain k1=a, since the data is stored together. But it is inefficient to find all rows with ki=x for i > 1. In fact, such a query isn't allowed - the only clustering key constraints that are allowed specify zero or more clustering keys, starting from the first with none missing.
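The "starting from the first with none missing" rule can be modelled as a prefix check. The sketch below is a simplified model of the rule as stated here, covering equality restrictions only; is_valid_restriction is a hypothetical helper, not part of any Cassandra driver or of Cassandra's own validation code.

```python
def is_valid_restriction(clustering_keys, restricted):
    """Return True if the restricted columns form a prefix of the clustering keys.

    clustering_keys: clustering column names in declaration order.
    restricted: names of the clustering columns constrained in the query.
    """
    restricted = set(restricted)
    # The first len(restricted) clustering keys must be exactly the restricted set.
    prefix = clustering_keys[:len(restricted)]
    return set(prefix) == restricted


keys = ["k1", "k2", "k3"]
print(is_valid_restriction(keys, ["k1", "k2"]))  # True: no gap
print(is_valid_restriction(keys, ["k2"]))        # False: k1 is missing
```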

For example, consider the schema

create table clustering (
    x text,
    k1 text,
    k2 int,
    k3 timestamp,
    y text,
    primary key (x, k1, k2, k3)
);

If you did the following inserts:

insert into clustering (x, k1, k2, k3, y) values ('x', 'a', 1, '2013-09-10 14:00+0000', '1');
insert into clustering (x, k1, k2, k3, y) values ('x', 'b', 1, '2013-09-10 13:00+0000', '1');
insert into clustering (x, k1, k2, k3, y) values ('x', 'a', 2, '2013-09-10 13:00+0000', '1');
insert into clustering (x, k1, k2, k3, y) values ('x', 'b', 1, '2013-09-10 14:00+0000', '1');

then they are stored in this order on disk (the order in which select * from clustering where x = 'x' returns them):

 x | k1 | k2 | k3                       | y
---+----+----+--------------------------+---
 x |  a |  1 | 2013-09-10 14:00:00+0000 | 1
 x |  a |  2 | 2013-09-10 13:00:00+0000 | 1
 x |  b |  1 | 2013-09-10 13:00:00+0000 | 1
 x |  b |  1 | 2013-09-10 14:00:00+0000 | 1

k1 ordering dominates, then k2, then k3.
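The stored order can be reproduced by sorting the inserted (k1, k2, k3) values as tuples. This is a sketch under the assumption that Python's natural ordering of str, int and datetime stands in for the text, int and timestamp comparators; ts is a hypothetical helper for building the timestamps used in the inserts.

```python
from datetime import datetime, timezone


def ts(hour):
    # Hypothetical helper: 2013-09-10 at the given hour, UTC.
    return datetime(2013, 9, 10, hour, 0, tzinfo=timezone.utc)


# Clustering key values (k1, k2, k3) in the order they were inserted.
inserted = [
    ("a", 1, ts(14)),
    ("b", 1, ts(13)),
    ("a", 2, ts(13)),
    ("b", 1, ts(14)),
]

# Python compares tuples lexicographically: k1 first, then k2, then k3.
stored = sorted(inserted)

for k1, k2, k3 in stored:
    print(k1, k2, k3.isoformat())
```

The two rows with k1 = 'b' and k2 = 1 are the only pair where k3 decides the order.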


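The partition key (x here, the first component of the primary key) determines which node a row lives on. The clustering keys determine the order in which rows are stored within the partition: sorted by clustering key 1, then clustering key 2, then clustering key 3. So, in the example above: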

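select * from clustering where x='x' and k1='a' is easy to answer, because the matching rows are stored next to each other. But select * from clustering where x='x' and k2=1 would first have to read the rows for every value of k1 and then pick out those with k2=1, so the clustering order gives no benefit for that query.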
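The difference in cost can be illustrated on a sorted in-memory list. This is only a model of the access pattern, not of Cassandra's storage engine; lower_bound, find_by_k1 and find_by_k2 are hypothetical helpers, and k3 is abbreviated to the hour for brevity.

```python
# Rows of one partition as (k1, k2, k3), already in clustering order.
rows = [("a", 1, 14), ("a", 2, 13), ("b", 1, 13), ("b", 1, 14)]


def lower_bound(rows, pred):
    """Binary search: index of the first row where pred(row) is True.

    Requires pred to be False for a leading run of rows and True afterwards.
    """
    lo, hi = 0, len(rows)
    while lo < hi:
        mid = (lo + hi) // 2
        if pred(rows[mid]):
            hi = mid
        else:
            lo = mid + 1
    return lo


def find_by_k1(rows, k1):
    # Rows with the same k1 are contiguous, so two binary searches find the slice.
    start = lower_bound(rows, lambda r: r[0] >= k1)
    end = lower_bound(rows, lambda r: r[0] > k1)
    return rows[start:end]


def find_by_k2(rows, k2):
    # Rows with the same k2 are scattered across k1 groups: every row is examined.
    return [r for r in rows if r[1] == k2]


print(find_by_k1(rows, "a"))  # one contiguous slice
print(find_by_k2(rows, 1))    # matches come from both k1 groups
```

In this model the k1 lookup touches a logarithmic number of rows to locate the slice, while the k2 lookup is a full scan of the partition.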
