Apache Cassandra

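Apache Cassandra is an open-source distributed NoSQL database system. It was originally developed at Facebook to store simply structured data such as the inbox, and it combines the data model of Google BigTable with the fully distributed architecture of Amazon Dynamo. Facebook open-sourced Cassandra in 2008. Since then, thanks to its good scalability, it has been adopted by well-known Web 2.0 sites such as Digg and Twitter, and it has become a popular distributed store for structured data.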

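Cassandra was built to store very large volumes of data. It is a hybrid, non-relational data store. Its main characteristic is that it is not a single database but a distributed network service made up of many data storage nodes: a write to Cassandra is replicated to other nodes, and a read is routed to one of the nodes to be served.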




Cassandra vs MySQL with 50GB of data

            MySQL       Cassandra
  Write     ~300 ms     ~0.12 ms
  Read      ~350 ms     ~15 ms


http://www.youtube.com/watch?v=fWkEeyT3e2Y





Related posts on my blog




Cassandra Summit 2010





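Features


  1. Flexible schema: there is no need to design the schema up front as with a relational database; fields can be added or removed on the fly.
  2. Range queries: keys can be queried by range.
  3. High availability and scalability: a single node failure does not affect the cluster service, and the cluster scales linearly.
  4. No central node as in traditional clusters: all nodes are peers, and cluster membership information is maintained through the Gossip protocol.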


http://wiki.apache.org/cassandra/GettingStarted


Quick installation


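Install JDK 6, then unpack Cassandra and create its log and data directories:

   # Unpack the release tarball and enter the directory
   tar -zxvf cassandra-$VERSION.tgz
   cd cassandra-$VERSION

   # Create the log and data directories and make the current user their owner
   sudo mkdir -p /var/log/cassandra
   sudo chown -R `whoami` /var/log/cassandra
   sudo mkdir -p /var/lib/cassandra
   sudo chown -R `whoami` /var/lib/cassandra

Edit the startup port in bin/cassandra.in.sh (-Dcom.sun.management.jmxremote.port=8080).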




Commands




cassandra-cli



cassandra@wisdomfish:/opt/apache-cassandra-0.6.1/bin# ./cassandra-cli
Welcome to cassandra CLI.

Type 'help' or '?' for help. Type 'quit' or 'exit' to quit.
cassandra> help
List of all CLI commands:
?                                                                  Same as help.
help                                                          Display this help.
connect <hostname>/<port>                             Connect to thrift service.
describe keyspace <keyspacename>                              Describe keyspace.
exit                                                                   Exit CLI.
quit                                                                   Exit CLI.
show config file                                Display contents of config file.
show cluster name                                          Display cluster name.
show keyspaces                                           Show list of keyspaces.
show api version                                        Show server API version.
get <ksp>.<cf>['<key>']                                  Get a slice of columns.
get <ksp>.<cf>['<key>']['<super>']                   Get a slice of sub columns.
get <ksp>.<cf>['<key>']['<col>']                             Get a column value.
get <ksp>.<cf>['<key>']['<super>']['<col>']              Get a sub column value.
set <ksp>.<cf>['<key>']['<col>'] = '<value>'                       Set a column.
set <ksp>.<cf>['<key>']['<super>']['<col>'] = '<value>'        Set a sub column.
del <ksp>.<cf>['<key>']                                           Delete record.
del <ksp>.<cf>['<key>']['<col>']                                  Delete column.
del <ksp>.<cf>['<key>']['<super>']['<col>']                   Delete sub column.
count <ksp>.<cf>['<key>']                               Count columns in record.
count <ksp>.<cf>['<key>']['<super>']            Count columns in a super column.
cassandra>




cassandra> set Keyspace1.Standard1['wisdomfish']['first'] = 'Kuo'
Value inserted.
cassandra> set Keyspace1.Standard1['wisdomfish']['last'] = 'Chaoyi'
Value inserted.
cassandra> set Keyspace1.Standard1['wisdomfish']['age'] = '88'
Value inserted.
cassandra> get Keyspace1.Standard1['wisdomfish']
=> (column=6c617374, value=Kuo, timestamp=1271804967320000)
=> (column=6669727374, value=Chaoyi, timestamp=1271804952384000)
=> (column=616765, value=88, timestamp=1271804984260000)
Returned 3 results.
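
The column names in the get output above are printed as hexadecimal bytes rather than as text. A minimal Python sketch, assuming the names were stored as plain UTF-8 strings (the helper name decode_column_name is made up for illustration), turns them back into readable names:

```python
def decode_column_name(hex_name: str) -> str:
    """Decode a hex-encoded column name, assuming the bytes are UTF-8 text."""
    return bytes.fromhex(hex_name).decode("utf-8")


# Column names copied from the CLI output above.
for hex_name in ("6c617374", "6669727374", "616765"):
    print(hex_name, "->", decode_column_name(hex_name))
```

If a column name was not written as text (for example a binary or numeric name), this decoding will fail or produce nonsense, so treat it only as a reading aid.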

Timestamp

cassandra> get Keyspace1.Standard1['1']
=> (column=66756c6c4e616d65, value=大智若魚, timestamp=1271944955968)
=> (column=616765, value=49, timestamp=1271944955968)
Returned 2 results.

cassandra> get Keyspace1.Standard1['1']
=> (column=66756c6c4e616d65, value=大智若魚, timestamp=1271968924124)
=> (column=616765, value=49, timestamp=1271968924124)
Returned 2 results.
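
The two reads above return the same values with different timestamps, which is what a rewrite of the same columns would produce. The timestamp is a number supplied by the writing client, and when a column is written more than once the value with the higher timestamp is the one kept. The sketch below converts the numbers to dates; reading the 16-digit values from the earlier example as microseconds and the 13-digit values here as milliseconds since the Unix epoch is an assumption, and the function name to_utc is hypothetical.

```python
from datetime import datetime, timezone


def to_utc(timestamp: int) -> datetime:
    """Convert a column timestamp to a UTC datetime.

    Assumption: values of 10**15 or more are microseconds since the
    Unix epoch, smaller values are milliseconds.
    """
    divisor = 1_000_000 if timestamp >= 10**15 else 1_000
    return datetime.fromtimestamp(timestamp / divisor, tz=timezone.utc)


# Timestamps copied from the CLI output above.
for ts in (1271804967320000, 1271944955968, 1271968924124):
    print(ts, "->", to_utc(ts).isoformat())
```

Because the unit is chosen by the client, all writers to the same data should agree on one unit; otherwise comparing timestamps is meaningless.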





nodetool







DataModel


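At first glance the data model looks like a four- or five-dimensional hash.
  1. http://wiki.apache.org/cassandra/DataModel
  2. Cassandra's data model, http://thebestsolutions.cn/bbs/read.php?tid=18410
  3. 大話Cassandra資料模型 (A Casual Guide to the Cassandra Data Model)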
  4. WTF is a SuperColumn? An Intro to the Cassandra Data Model
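
The "four- or five-dimensional hash" can be pictured as nested maps. The sketch below uses plain Python dictionaries and mirrors the CLI addressing shown earlier: keyspace.columnfamily[key][column] has four levels, and keyspace.columnfamily[key][super][column] has five. The Super1 contents and the email column are made up for illustration, and timestamps are left out.

```python
# Four levels: keyspace -> column family -> row key -> column -> value
# Five levels: keyspace -> column family -> row key -> super column -> column -> value
store = {
    "Keyspace1": {
        "Standard1": {
            "wisdomfish": {"first": "Kuo", "last": "Chaoyi", "age": "88"},
        },
        "Super1": {  # hypothetical super column family contents
            "wisdomfish": {
                "address": {"city": "Taipei"},
            },
        },
    }
}

# Equivalent of: get Keyspace1.Standard1['wisdomfish']['age']
print(store["Keyspace1"]["Standard1"]["wisdomfish"]["age"])

# Rows need not share the same columns: a new one can be added on the fly.
store["Keyspace1"]["Standard1"]["wisdomfish"]["email"] = "fish@example.com"

# Equivalent of: get Keyspace1.Super1['wisdomfish']['address']['city']
print(store["Keyspace1"]["Super1"]["wisdomfish"]["address"]["city"])
```

This is only a mental model of the addressing scheme; it says nothing about how the data is sorted, stored on disk, or distributed across nodes.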





Thrift-Java


http://wiki.apache.org/cassandra/ThriftExamples


High Level Clients
http://wiki.apache.org/cassandra/ClientOptions






Cluster


ring
The term comes from consistent hashing, in which the nodes are arranged in a circle that is usually called the ring.
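
The ring idea can be sketched in a few lines of Python. This is a toy model, not Cassandra's actual partitioner or replication strategy: the node names, the Ring class, the one-token-per-node layout, and the use of MD5 to place items on the ring are all assumptions made for illustration.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 chosen for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes):
        # Each node owns one position on the ring, kept in clockwise order.
        self._entries = sorted((token(node), node) for node in nodes)
        self._tokens = [t for t, _ in self._entries]

    def replicas(self, key: str, count: int = 1):
        """Return up to `count` distinct nodes, walking clockwise from the key."""
        start = bisect.bisect_left(self._tokens, token(key))
        size = len(self._entries)
        # The modulo wraps around past the last node back to the first.
        return [self._entries[(start + i) % size][1] for i in range(min(count, size))]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("wisdomfish", count=3))
```

The point of the circle is that removing or adding one node only changes the owner of keys next to that node's position; keys owned by other nodes stay where they are.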

Seeds
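Cassandra has no central node as in traditional clusters; all nodes are peers, and cluster membership information is maintained through the Gossip protocol. So that each node can discover the others at startup, seed nodes (seeds) must be specified: every node first contacts the seeds, obtains the list of other nodes from them, and then communicates with those nodes. More than one seed can be specified; they are configured through the seeds setting in conf/storage-conf.xml.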
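
To check which seeds a node is configured with, the setting can be read from the configuration file. The sketch below parses an inline sample instead of a real file; the Storage, Seeds and Seed element names and the IP addresses are assumptions for illustration and should be checked against your own conf/storage-conf.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of conf/storage-conf.xml; element names are assumed.
SAMPLE_CONF = """
<Storage>
  <Seeds>
    <Seed>192.168.0.1</Seed>
    <Seed>192.168.0.2</Seed>
  </Seeds>
</Storage>
"""


def read_seeds(xml_text):
    """Return the seed addresses listed in the given configuration text."""
    root = ET.fromstring(xml_text.strip())
    return [seed.text.strip() for seed in root.iter("Seed") if seed.text]


print(read_seeds(SAMPLE_CONF))
```

Every node in the cluster would normally list the same seeds, so that all nodes start from the same contacts when discovering each other.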

How to install and configure Cassandra


http://hudson.zones.apache.org/hudson/job/Cassandra/changes




Cassandra 0.7 is gearing up for release



Cassandra

Nosql Cassandra study [1]

http://genius-bai.javaeye.com/blog/639820
http://kauu.net/2010/02/27/cassandra%E5%88%9D%E4%BD%93%E9%AA%8C/

