Cassandra Developer JumpStart

Okay, so you decided to start playing with Cassandra to get an idea of one of the most widely used big data databases!

In this post I will give a (hopefully useful) walkthrough of how to get started with Cassandra and perform a few simple but important actions that apply to any database environment:

  • create a new schema
  • create a new table
  • insert, update, delete and select

I assume that Cassandra is installed in your environment. If you have not yet installed Cassandra, you can find installers and documentation on either the PlanetCassandra or the DataStax websites. The DataStax distribution, which I also use, is a customised version of Cassandra that includes a few enhancements.

Assuming that you are using Windows, make sure that the Cassandra services are running (for other operating systems, the DataStax documentation explains how to start and stop Cassandra)

DataStax Cassandra Services

and let’s get started!

 

Go to [InstallationFolder]/apache-cassandra/bin and run the CQL shell, cqlsh.bat.

CQL shell

Just as relational database engines use schemas/users as containers of data, Cassandra uses keyspaces. Let’s list all the keyspaces that currently exist in the database:

Type: DESC KEYSPACES;

Desc KeySpaces

Editing the system keyspaces is not recommended. For this reason, let’s create a new keyspace called demo_db.

Type: CREATE KEYSPACE demo_db WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1};

Create keyspace

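The options passed to WITH REPLICATION control how the data of the keyspace is replicated:

  • class = ‘SimpleStrategy’ places replicas on consecutive nodes of the ring without considering data center or rack layout, so it is intended for clusters with a single data center, such as this test setup.
  • replication_factor = 1 means that the cluster keeps only one copy of each row, stored on a single node.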
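
For comparison, a keyspace that should be aware of data centers would typically use NetworkTopologyStrategy instead. The following is only a sketch: the keyspace name demo_db_multi and the data center name dc1 are placeholders I made up for illustration, and the data center name has to match whatever your own cluster reports.

```cql
-- Hypothetical keyspace using NetworkTopologyStrategy.
-- 'dc1' is an assumed data center name; replace it with the name your cluster reports.
-- The value 3 asks for three copies of each row in that data center.
CREATE KEYSPACE demo_db_multi
  WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'dc1' : 3 };

-- Show the definition of the demo_db keyspace, including its replication settings.
DESCRIBE KEYSPACE demo_db;
```

For a single-node playground like the one in this post, SimpleStrategy with a replication factor of 1 is enough.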

Now that we have our own keyspace, we can create a new table. Before executing the CREATE TABLE command, we need to execute a USE command to select our target keyspace:

Type: USE demo_db; CREATE TABLE movies (movieId int, name text, year int, PRIMARY KEY (movieId));

table_movies
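
As a possible alternative to the USE command, the table can be addressed by its fully qualified name (keyspace.table). The sketch below assumes the demo_db keyspace from above; IF NOT EXISTS is my addition so that the statement can be run again without an error.

```cql
-- Same table as above, addressed as keyspace.table, so no USE statement is needed.
-- IF NOT EXISTS turns the statement into a no-op when the table is already there.
CREATE TABLE IF NOT EXISTS demo_db.movies (
  movieId int,
  name text,
  year int,
  PRIMARY KEY (movieId)
);

-- Print the table definition as Cassandra stored it.
DESCRIBE TABLE demo_db.movies;
```

Note that unquoted identifiers are case-insensitive in CQL and stored in lower case, so the column appears as movieid in the output.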

Now that our table is created, we can execute INSERT commands to fill it with data.

Type:

INSERT INTO movies (movieId, name, year) VALUES (1, 'The Illusionist', 2006);
INSERT INTO movies (movieId, name, year) VALUES (2, 'The Prestige', 2006);

Then execute a SELECT command to return all the records of the movies table:

Type: SELECT * FROM movies;

insert_select_movies
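
One thing worth knowing at this point: an INSERT with a primary key that already exists does not fail, it silently overwrites the existing values. A minimal sketch of how to guard against that, assuming a Cassandra version that supports lightweight transactions (2.0 or later); the changed year value is made up purely for illustration:

```cql
-- Without a condition, this would overwrite the row with movieId = 2.
-- With IF NOT EXISTS, the insert is applied only when no such row exists yet.
INSERT INTO movies (movieId, name, year)
  VALUES (2, 'The Prestige', 1999) IF NOT EXISTS;

-- The row still holds the original year, because the conditional insert was not applied.
SELECT movieId, name, year FROM movies WHERE movieId = 2;
```

Conditional statements are more expensive than plain writes, so they are best kept for cases where overwriting would really be a problem.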

SELECT can also use filters (WHERE clauses) to restrict the results by primary key or by indexed columns.

Type: SELECT * FROM movies WHERE movieId = 1;

select filter

However, trying to filter on a column that is neither part of the primary key nor indexed is rejected with an invalid query error (code 2200):

select filter invalid
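
One way around this is to create a secondary index on the column you want to filter on. The sketch below uses the year column as an example; the query and the index name movies_year_idx are my own illustration, not taken from the screenshot above.

```cql
-- Rejected while year is neither part of the primary key nor indexed.
SELECT * FROM movies WHERE year = 2006;

-- Create a secondary index on the year column (the index name is hypothetical).
CREATE INDEX movies_year_idx ON movies (year);

-- The same query is now accepted.
SELECT * FROM movies WHERE year = 2006;
```

Secondary indexes are convenient for experiments like this one, but for larger data sets it is usually better to design the table around the queries you need.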

Cassandra also allows updating records, just like any other database engine. This is done by executing UPDATE statements. Let’s update our second row to change the year of the movie.

Type: UPDATE movies SET year = 2015 WHERE movieId = 2;
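
Unlike in most relational databases, an UPDATE in Cassandra also works when the row does not exist yet: it simply creates it (an "upsert"). A small sketch, where movie 3 is made-up sample data that is not part of the walkthrough:

```cql
-- No row with movieId = 3 exists, so this UPDATE creates it.
UPDATE movies SET name = 'Inception', year = 2010 WHERE movieId = 3;

-- Check both the row changed earlier and the newly created one.
SELECT * FROM movies WHERE movieId IN (2, 3);
```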

Finally, Cassandra enables deleting rows by executing DELETE statements. Let’s delete our first row:

Type: DELETE FROM movies WHERE movieId = 1;

update_delete
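
DELETE can also remove individual column values instead of the whole row. A minimal sketch, assuming the row with movieId = 2 from the earlier INSERT is still present:

```cql
-- Remove only the year value of movie 2; the row and its name remain.
DELETE year FROM movies WHERE movieId = 2;

-- The year column now shows null for that row.
SELECT movieId, name, year FROM movies WHERE movieId = 2;
```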

 

This concludes the most important operations of the Cassandra database from the developer’s perspective. Hopefully you found this post helpful and you can now start doing some real work with Cassandra.
