Meeting Cassandra: Adding a database to your service

Adding a high-performance database such as Cassandra to our Spring Boot microservice.

Marco Capo
10 min read · Aug 1, 2019
photo by: Iñaki del Olmo

Hello,

Today we will implement a REST service that stores user data in Cassandra, a database that offers good scalability and high availability without compromising performance.
This article focuses on the development process of a microservice and its code, so you can skip through the operational steps and start building brilliant data models and quick prototypes.

First, we will run Cassandra in a Docker container and create the keyspace and the table we need, so you can focus on the code examples.

We are creating a lot of services these days, and in the next post we will create another one to retrieve current weather information for a specific location. Why another service instead of putting everything in the logic layer of our previous service? Because each service should have one responsibility. One service will store user information (address, name, tags, among other properties) and will expose some HTTP methods to operate on user resources. The other service will obtain current weather information for a city, which will teach us how to connect to a third-party weather API and make REST calls.

Finally, in later articles, we will learn how to put all those services together and orchestrate them in a microservices architecture: get the user data, gather weather information for the user’s location, and then check whether any rule applies so we can return a message to the user, using the service we created in the first article of the series.

You may be wondering why we need so many services. Well, first because we need to get used to creating new, fully functional services, discover the advantages of microservices, and learn how to orchestrate them with Docker.

Let’s review some key facts about Cassandra and why it is a good database for a high-performance application.

Cassandra

I will give you a bit of background on how Cassandra works so you can picture each step in this article. Apache Cassandra is a distributed NoSQL database: it replicates your data across multiple nodes so it can handle a large volume of reads and writes while providing high availability, at the cost of strict consistency. If some nodes go down for any reason, Cassandra keeps operating and limits data loss as long as at least one node holding a replica of the data is still running. Of course, if all nodes stop working you’re screwed and Cassandra won’t rescue you.

So there you go, now you are a Cassandra expert. Congratulate yourself and celebrate with a beer. But if you feel it was a very short and loose explanation of Cassandra, which it is, you can read more about it on the DataStax website. They also have great courses that I highly recommend.

What you will learn in this article

  • Build a REST service with CRUD functions.
  • Create a Cassandra database container.
  • Store data in Cassandra with the Spring Boot data module.
  • Gain a better understanding of Spring Boot and its components.
  • Run your unit tests and integration tests against an embedded Cassandra.
  • Use JUnit 5 with parameterized tests.
  • Package your service for delivery with Docker.
  • Handle properties in a production-ready way.

Pre-requirements

Project

KISS Cassandra

Let’s keep it simple. We run a container named “my-cassandra” (--name) in detached mode (-d), publish port 9042, which is Cassandra’s default port, so we can connect to it, and cap the memory at 3 gigabytes. The -g flag starts a node with Graph Model enabled, the -s flag starts a node with Search Engine enabled, and the -k flag starts a node with Spark Analytics enabled, just in case you want to connect to Cassandra through a DataStax Studio container to inspect the database.

The command line in the terminal would look something like:

Well, that was easy. Now we need to create our KEYSPACE and the data structure to be stored so we can start building beautiful services.

Here we are executing cqlsh inside our my-cassandra container so we can run CQL scripts (similar to SQL scripts) to create the keyspace, types, and tables. If you already know SQL you will find a lot of similarities in CQL; that was done on purpose so you can pick up the syntax easily. Keep in mind, though, that Cassandra is a NoSQL database. It does have tables, and the TABLE keyword defines them, but it is a partitioned wide-column store rather than a relational database: rows are distributed across nodes by their partition key, and there are no joins or foreign keys. I don’t want you to confuse Cassandra with a SQL database, even though the similar syntax makes that easy to do.

Let’s create our KEYSPACE and our table.

A KEYSPACE defines a unique space for your data to be stored; I generally use one KEYSPACE per application in a database. Cassandra has two replication strategy classes, SimpleStrategy and NetworkTopologyStrategy. We won’t get into details, but they control how each row is replicated. In this example we use a replication factor of 1, so each row is stored once in the cluster. NetworkTopologyStrategy is the one to use in production, since it lets you replicate across multiple data centers.

I also want you to know how to create a custom type in Cassandra. It is nothing complicated: just use the keyword TYPE. We will use the user location to request weather information in another service (keep reading, you will see).

Lastly, create a table “user” to store all the user properties we want: the creation time of the user, a name, some tags to do stuff (?), and a location using the custom type we just created.
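As a rough sketch of the three steps above, the CQL could look like the statements below, kept here as Java constants so the application and the tests can share them. The keyspace name “tutorial”, the type name “location” and the table name “users” appear elsewhere in this article; the column names, the column types and the fields of the location type are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;

/**
 * CQL statements for the keyspace, custom type and table described above.
 * Column names and types are illustrative assumptions, not the original schema.
 */
final class TutorialSchema {

    // SimpleStrategy with replication_factor 1: a single copy of each row,
    // which is enough for a local, single-node container.
    static final String CREATE_KEYSPACE =
            "CREATE KEYSPACE IF NOT EXISTS tutorial "
          + "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};";

    // User-defined type holding the user's location (fields are assumed).
    static final String CREATE_LOCATION_TYPE =
            "CREATE TYPE IF NOT EXISTS tutorial.location (city text, country text);";

    // A UDT used as a column is declared frozen here, so it is written as a whole.
    static final String CREATE_USERS_TABLE =
            "CREATE TABLE IF NOT EXISTS tutorial.users ("
          + "id uuid PRIMARY KEY, "
          + "created timestamp, "
          + "name text, "
          + "tags set<text>, "
          + "location frozen<location>);";

    private TutorialSchema() {
    }

    /** Statements in the order they must run: keyspace, then type, then table. */
    static List<String> statements() {
        return Arrays.asList(CREATE_KEYSPACE, CREATE_LOCATION_TYPE, CREATE_USERS_TABLE);
    }
}
```

Each statement can be pasted into cqlsh as is, or executed one by one through a driver session.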

Spring boot pimp my app!

First, add the dependencies to our build.gradle file (check the link if you want the full code).

Gradle

Quick note: the “starter” part of a dependency name means it is a module created by Spring to autoconfigure the library you are using. Since we are exposing a REST service, we add the web-services dependency, and to use Cassandra in our data layer, we add data-cassandra. There are other useful dependencies that could improve our system, like data-cassandra-reactive, but we will get into those once we work on the performance of our app. Big note here: there is no need to get ahead of ourselves and start optimizing things just because. This app will do the work just fine, and later we can give it superpowers by making it reactive or adding a cache.

Config

We are getting familiar with Spring Boot (this is our third application together), so let’s recap a couple of things. Spring uses configuration classes to create beans, and beans are objects added to the Spring context. How do we set a class as configuration? By marking it with the @Configuration annotation. We also add @EnableCassandraRepositories (love self-explanatory names) and specify the base package where Spring will look for the models that map Cassandra data to POJOs.

If you look at the Gradle build file you will notice we are using the latest Spring Boot release at the time of writing (2.1.4.RELEASE). With this version you need to either disable Cassandra metrics if you are not using them or add the Cassandra metrics module when you need it; otherwise the configuration will end up throwing an error saying, ‘Hey! I can’t configure metrics’. These metrics can be queried via JMX or pushed to external monitoring systems using a number of built-in and third-party reporter plugins, and it’s a good idea to have a monitoring system for your DB.

OK, we are extending an abstract Spring configuration class for Cassandra that already has everything we need, so we only override the key methods to make this work. We need to specify the keyspace we are using in Cassandra. Ours is “tutorial”; if you use another keyspace, and I hope you will, specify the name you chose when you created it. A contact point is the address (or addresses) where Cassandra is located; ours is localhost, since we are running Cassandra locally in a Docker container. The port is the one we exposed, 9042 by default, and we disable metrics since we won’t collect any for now.

In case you are wondering about the @Value annotation, it maps Spring application properties to variables so we can read those values; it’s a clean way to keep all the configurable properties in one place. Oh! And by the way, the colon (:) after the property name specifies a default value: if the property is not present, the default is used instead.
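To make the configuration described above concrete, here is a minimal sketch of what such a class could look like with Spring Data Cassandra 2.1.x (the version pulled in by Spring Boot 2.1.4). The class name, the package name com.example.users and the property keys (cassandra.keyspace, cassandra.contact-points, cassandra.port) are assumptions for illustration; use whatever names your project defines.

```java
package com.example.users; // hypothetical package name

import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.cassandra.config.AbstractCassandraConfiguration;
import org.springframework.data.cassandra.repository.config.EnableCassandraRepositories;

@Configuration
@EnableCassandraRepositories(basePackages = "com.example.users")
public class CassandraConfig extends AbstractCassandraConfiguration {

    // The text after the colon is the default used when the property is absent.
    // The property keys themselves are assumed names.
    @Value("${cassandra.keyspace:tutorial}")
    private String keyspace;

    @Value("${cassandra.contact-points:localhost}")
    private String contactPoints;

    @Value("${cassandra.port:9042}")
    private int port;

    @Override
    protected String getKeyspaceName() {
        return keyspace;
    }

    @Override
    protected String getContactPoints() {
        return contactPoints;
    }

    @Override
    protected int getPort() {
        return port;
    }

    // Metrics are switched off because no metrics module is on the classpath.
    @Override
    protected boolean getMetricsEnabled() {
        return false;
    }

    // Package scanned for classes annotated with @Table and @UserDefinedType.
    @Override
    public String[] getEntityBasePackages() {
        return new String[] {"com.example.users"};
    }
}
```

With the defaults in place the service starts against a local container without any extra configuration, and each value can still be overridden per environment through application properties.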

Model

We need to configure our models so Spring can map Cassandra objects to POJOs (plain old Java objects). There are a couple of annotations that come in handy, and since I want you to write clean code from the start, let’s review the key ones I use here.

Look at the User model class. First, we have @Table(“users”), which tells Spring: hey, create this object whenever a row from the table users is returned. @PrimaryKey says this is the unique identifier; of course, you can use composite keys, but we won’t cover that in this article. Finally, how do we tell Spring that we are storing a nested custom object in Cassandra? By using @CassandraType(type = Name.UDT, userTypeName = “location”), which says: Hey! This is a Cassandra custom type, you can find it by the type name “location”, and its type is UDT, which stands for User Defined Type.
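The annotations above could be combined roughly as follows. This is a sketch against Spring Data Cassandra 2.1.x, not the original class: the field names and the contents of Location are assumptions, and getters and setters are left out to keep it short.

```java
package com.example.users; // hypothetical package name

import com.datastax.driver.core.DataType;
import java.util.Date;
import java.util.Set;
import java.util.UUID;
import org.springframework.data.cassandra.core.mapping.CassandraType;
import org.springframework.data.cassandra.core.mapping.PrimaryKey;
import org.springframework.data.cassandra.core.mapping.Table;
import org.springframework.data.cassandra.core.mapping.UserDefinedType;

// Maps rows of the "users" table to this class.
@Table("users")
public class User {

    // Single-column primary key; composite keys are not covered here.
    @PrimaryKey
    private UUID id;

    private Date created;

    private String name;

    private Set<String> tags;

    // Nested custom object stored with the Cassandra type named "location".
    @CassandraType(type = DataType.Name.UDT, userTypeName = "location")
    private Location location;

    // Getters and setters omitted for brevity.
}

// Maps the Cassandra user-defined type "location"; its fields are assumed.
@UserDefinedType("location")
class Location {

    private String city;

    private String country;
}
```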

Repository

Spring Data’s most compelling feature is the ability to create repository implementations automatically, at runtime, from a repository interface. Spring Data JPA does this on top of JPA (Java Persistence API); Spring Data Cassandra, which we use here, does the same for Cassandra.

The goal of the repository interface is to significantly reduce the amount of boilerplate code required to implement data access layers for various persistence stores, in our case Cassandra.

The repository interface takes the domain class to manage as well as the id type of the domain class as type arguments. This interface acts primarily as a marker interface to capture the types to work with and to help you to discover interfaces that extend this one. The CassandraRepository provides sophisticated CRUD (Create, read, update and delete) functionality for the entity class that is being managed.

What this means is that you can create an interface extending a Spring repository interface (in our case CassandraRepository), add the @Repository annotation, and get methods to create a new element in our database, update it or delete it, all implemented by Spring Data without the need to write any CQL. Isn’t that amazing?
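A minimal sketch of such an interface, plus a small service that uses it, might look like this. User is the model class from the Model section and is assumed to live in the same package; UserRepository and UserService are illustrative names, and the key type UUID is an assumption.

```java
package com.example.users; // hypothetical package name

import java.util.Optional;
import java.util.UUID;
import org.springframework.data.cassandra.repository.CassandraRepository;
import org.springframework.stereotype.Repository;
import org.springframework.stereotype.Service;

// No method bodies needed: Spring Data generates the implementation at runtime.
@Repository
public interface UserRepository extends CassandraRepository<User, UUID> {
}

// Hypothetical service showing the inherited CRUD methods in use.
@Service
class UserService {

    private final UserRepository repository;

    UserService(UserRepository repository) {
        this.repository = repository;
    }

    // Create or update: save writes the whole entity.
    User save(User user) {
        return repository.save(user);
    }

    // Read: empty Optional when no row has this id.
    Optional<User> find(UUID id) {
        return repository.findById(id);
    }

    // Delete by primary key.
    void delete(UUID id) {
        repository.deleteById(id);
    }
}
```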

Tests

It took me some time to configure the tests needed for this article, since Spring Boot had changed with the latest release (I always try to use the latest version of any dependency when I write an article) and I wasn’t able to configure an embedded Cassandra for test cases with custom types. I finally managed to do it, so let’s find out how it’s done.

Before anything else, we need to configure an embedded database to run inside our tests. That way we don’t need one running locally and we don’t interfere with existing data, which keeps the tests autonomous and encapsulated. We use the JUnit 5 annotation @BeforeAll, which says: hey, run this static method before you start bootstrapping the app, and do it only once.

We start our embedded Cassandra and configure the port it will expose and the contact point, and we disable metrics. After that, we need to create the keyspace and the custom types we will be using. Remember that this is a new database, different from the local one we created before, but it uses the same scripts.

Then, after each test (@AfterEach methods are executed after every test), we want to make sure the database is empty, so we remove any data stored by the previous test.

Each test exercises one of the CRUD methods we expose in our controller, to make sure everything from the communication layer down to the storage layer works. By this I mean the service can receive REST requests (GET, POST, PUT and DELETE), validate them and store the information in a database using the Spring Boot framework. I encourage you to read each of these methods and figure out how it executes the REST request and how it validates the result.

Lastly, after all the tests (@AfterAll), we clean up our embedded Cassandra, and that’s it, we have our REST user store service done! Congratulations!
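The lifecycle described above can be sketched with JUnit 5 alone. In this simplified example an in-memory map stands in for the embedded Cassandra instance, so the class, the method names and the test data are all assumptions; in the real test the three lifecycle methods would start, empty and clean the embedded database instead. It needs the junit-jupiter-params module for the parameterized test.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;

class UserStoreLifecycleTest {

    // Stand-in for the embedded database shared by all tests in this class.
    private static Map<String, String> store;

    // Runs once, before any test: this is where the embedded Cassandra
    // would be started and the keyspace, type and table created.
    @BeforeAll
    static void startStore() {
        store = new ConcurrentHashMap<>();
    }

    // Runs after every test invocation: remove whatever the test stored.
    @AfterEach
    void emptyStore() {
        store.clear();
    }

    // Runs once, after the last test: release the shared resource.
    @AfterAll
    static void stopStore() {
        store = null;
    }

    // One invocation per value; each one starts from an empty store.
    @ParameterizedTest
    @ValueSource(strings = {"alice", "bob", "carol"})
    void storesOneUserPerInvocation(String name) {
        assertTrue(store.isEmpty(), "data from the previous test should be gone");
        store.put(name, "some-city");
        assertEquals(1, store.size());
    }
}
```

Keeping the cleanup in @AfterEach rather than at the start of each test means a failing test cannot leave data behind for the next one.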

Conclusion

Service diagram

We learned how to create a REST service that can scale very easily, using a highly scalable database like Cassandra to handle tons of IO per second, all with Spring Boot, a Java framework that revived my love for Java.

We can’t cover Cassandra fully here, since there is a lot to say about it, but I want you to get through the Cassandra configuration quickly so you can focus on developing a robust service first and optimize it later. I also gave you the tools to start your code base with integration tests, so you can make sure your code works.

Do you know what else we need to talk a lot about? Microservices. We are barely starting to understand how they work and what they are capable of (we aren’t there yet), but if you are following my series of posts you are picking up a lot of tools and learning how and when to use them.

My hope is that you are already working on services with Spring Boot and that these posts unblock you on whatever you are doing and help you gain some understanding.

Marco Capo

I’m a senior software engineer who has worked in technology development for 12+ years, from marketing and videogames to microservices for big companies.