Tuesday, March 11, 2014

More Fun With Cassandra

Last time I just went over the very basics of setting up Cassandra.   But to review:

  1. Spend an hour or two playing with Cassandra, following the Getting Started guide in the Cassandra wiki.
  2. Go get CCM so the process of making clusters and administering them on localhost is easy.
  3. Maybe follow these simple steps to get CCM working. 
Now that Cassandra is all set up, what to do with it? 

The sample code for this post and the companion posts is on GitHub at https://github.com/fwelland/CassandraStatementTools.

Surprise, Cassandra is NoSQL

So all the ER stuff I am familiar with from relational systems doesn't really apply to Cassandra; well, mostly doesn't apply. Cassandra modeling could be some sort of art form, and I will leave it to the reader to go learn; perhaps try here. Recall my problem: I just needed something simple that approximates this:
  • some sort of unique key per statement. 
  • customer id 
  • statement type
  • statement file name
  • year
  • month
  • day
  • statement
A note on these attributes: I made separate attributes for the date parts because I know that some common ways customers ask for statements are like: "can I have all statements for January 2013" or "I'd like all statements for the 15th day of every month in 2012".

From Attributes To A Cassandra Table

Cassandra's native 'tongue' is "CQL". You can read lots about it here and here. In brief, it is a query language for DDL and DML that smells a bit like SQL. I will not spend a bunch of time/space blabbing about this; it is pretty well documented in many other places.

So I took what I learned earlier and came up with this schema:

CREATE KEYSPACE statementarchive
WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

CREATE TABLE statementarchive.statements (
    archived_statement_id uuid,
    customer_id int,
    statement_type text,
    statement_filename text,
    year int,
    month int,
    day int,
    statement blob,
    PRIMARY KEY (archived_statement_id));
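
With archived_statement_id as the only primary key column, this table can be looked up directly only by that id. Here is a rough sketch of how the two customer requests mentioned earlier might be expressed in CQL. It assumes a secondary index on customer_id, which is not part of the schema above; the index name and the customer id 42 are made up for illustration.

```cql
-- Assumption: a secondary index on customer_id (not part of the schema above).
CREATE INDEX statements_customer_id_idx
    ON statementarchive.statements (customer_id);

-- "Can I have all statements for January 2013", for hypothetical customer 42.
SELECT archived_statement_id, statement_type, statement_filename
  FROM statementarchive.statements
 WHERE customer_id = 42
   AND year = 2013
   AND month = 1
 ALLOW FILTERING;

-- "All statements for the 15th day of every month in 2012", same customer.
SELECT archived_statement_id, statement_type, statement_filename
  FROM statementarchive.statements
 WHERE customer_id = 42
   AND year = 2012
   AND day = 15
 ALLOW FILTERING;
```

ALLOW FILTERING is needed because year, month and day are not indexed: Cassandra uses the customer_id index to find candidate rows and then discards the ones that do not match. That is probably fine for a small prototype, though a primary key built around customer_id and the date parts might serve these queries better.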

How To Get This Schema 'Into' Cassandra

Since I'd gone through "getting started", I naturally turned to 'cqlsh'. Recall that in this prototype, I had been using CCM to easily administer a 3-node cluster on localhost. Here are a few facts that I learned:
  • Maybe it goes without saying, but you only need to apply the schema to one node in the cluster. The cluster takes care of applying the schema everywhere.
  • Maybe don't think about connecting to a node; just think about connecting to a cluster.
  • I had installed the binary download of Cassandra during my earliest experiments. Just running the cqlsh tool from that distribution was enough to connect to MyTestCluster, created and managed by CCM.
  • CCM has a node-level command 'cqlsh' for connecting to a node. For example: /opt/ccm-master/ccm node1 cqlsh . (Curious, why can't it be just /opt/ccm-master/ccm cqlsh ?)

Now just apply the script. Copy-n-paste into cqlsh works more or less like it should.
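
As an alternative to pasting, cqlsh can also read statements from a file. A minimal sketch, assuming the schema was saved as schema.cql (a hypothetical file name) in the directory where cqlsh was started:

```cql
-- Run inside a cqlsh session connected to any node of the cluster.
-- Assumption: schema.cql holds the CREATE KEYSPACE and CREATE TABLE statements.
SOURCE 'schema.cql';

-- cqlsh command: print the keyspace definition, including its tables,
-- to confirm that the schema was applied.
DESCRIBE KEYSPACE statementarchive;
```

SOURCE and DESCRIBE are cqlsh commands rather than CQL proper, so they work in the cqlsh shell but not necessarily in other clients.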

But I Want a GUI for Doing CQL Stuff

IRL, I spent a lot of time on this and documented a bunch of stuff about clients that did or didn't work. I found that there are several clients out there, but many of them only support Cassandra 1.x. So I will skip over most of that stuff and just point out DataStax DevCenter. It is free and it works OK with Cassandra 2.x. So here is a quick how-to on how I installed DevCenter and connected to MyTestCluster with it:

  • Download and install. It is up to you to figure this out.
  • Run it (again, up to you).
  • Run this CCM command: /opt/ccm-master/ccm liveset   Note the IP list it provides.
  • Back in DevCenter create a Connection and add each IP, and provide a name for the connection 
  • Open The Connection
  • Explore your cluster with DevCenter.
I left out lots of details, but honestly, most of it is obvious.   But here is a quick screen shot of my connection to MyTestCluster



There is lots to play with in DevCenter; it is best to go explore on your own. In a future post, I will start presenting some Java code I've built to insert some records into this table.
