5 basic steps for Key-value store database design

by László Wágner, Nexsis Mobile
Database design

Nowadays NoSql databases are very popular. There are lots of vendors with different technical approach. CAP theorem applies for most of them:

  • Consistency – we can see same data at the same time on all nodes
  • Availability - every request is guaranteed to be processed and responded regardless of success factor
  • Partition tolerance - system remains accessible if some messages are lost or failure of part of the system (e.g. we lost some nodes)
Sometimes we can tune two features of the three. However tons of documents can be found about NoSQL technology but data modeling part is not the best studied part. I also found different implementations supporting different techniques, having some sort of advantages or shortcomings. I picked one of my favorite key value store implementation, Redis, to show some real working modeling techniques. Our demo database must serve a Web application, store users and their click actions.

1. Define objects

In the first step of data modeling we need to identify objects. In the relational database design they will be tables most likely. In Redis they can be strings, lists, hashes, etc. Regardless the physical implementation we need to separate objects by good names / keys. We can identify USER and CLICK as objects in our example. We will use USER: and CLICK: as a prefix. This will be part of our naming convention.

2. Define identifiers

Since Redis is a key value store we need to define keys to store data. We defined two objects above we always have to think how they are connected. It seems obvious if we store users we want to know where and when they clicked. This is the purpose of our demo database. We know from the application designer the LOGIN_ID is a unique identifier for the user. So we will store user data in this format: USER: e.g. USER:100AB. That was easy. For CLICK object we have multiple options to define key as identifier.

3. Analyze requirement for compound keys

For the key definition of the CLICK object we need to know what the main purpose is of the objects and what the requirements are. We got the information from the application designer we need to answer to this question: How many times a certain user clicked to an element of the application. To fulfill this requirement we create key like this CLICK:: e.g. CLICK:100AB:Button_Submit

4. Define the physical implementation

To have real data in the database we have to define and choose physical implementation of our objects. For this we have to know the database implementation what kind of options we have. Redis is not a plain key-value store, but a data structures server, supporting different kind of values. Most obvious choices are String – store one value, Hash – store multiple fields. In our example we will use hash for USER and string for CLICK.

5. Create mockup to test your database design

So let’s just create a working version and test it with the application. We will use HMSET for USER and SET for CLICK to create them.
HMSET USER:100AB Name “John Doe” Email “john@doe.com” LastLogin “2015-05-05 05:05:05”
SET CLICK: 100AB:Button_Submit 1

We would like to support counting of click actions by database we just need to increment the number of clicks.
INCR CLICK: 100AB:Button_Submit

NoSQL databases are not the best in aggregation however there are some very good features in Redis to overcome this shortcoming. We can support the application with top 10 lists by different dimension. For examples: Top 10 clicked elements by user:
Store data: ZINCRBY CLICK: 100AB 1 Button_Submit
Query data: ZREVRANGE CLICK: 100AB 0 9 WITHSCORES

Top 10 active users by number of click:
Store data: ZINCRBY CLICK:ActiveUsers 1 USER:100AB
Query data: ZREVRANGE CLICK:ActiveUsers 0 9 ->
HGET Name -> “John Doe”

Here you can see the beauty of our design. We can easily “join” click information to users in order to get the name of the active person.

Surely there are some other conceptual techniques and principles of NoSQL data modeling like denormalization, index tables, composite keys, etc. However the common things, I found most important, we need to analyze the requirements and transform them into a physical model corresponding to the database implementation.

We'd like to help and talk with you

Contact Us