Cassandra is an alternative to MySQL, Oracle or other database managers for workloads with a very large number of queries.
It is suited to fully distributed, highly scalable databases, and therefore to the heavy traffic of sites such as Facebook, which handle millions of queries per hour.
The distributed model stores information across many different servers. In Cassandra these servers are peers: no central master coordinates them.
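To illustrate how a distributed store can spread records over many servers, here is a minimal Python sketch of hash-based placement on a ring. The node names, the Ring class and the choice of MD5 are assumptions made for this illustration; it is a toy model of the idea, not Cassandra's actual partitioner code.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: a key belongs to the first node found clockwise from its token."""

    def __init__(self, nodes):
        # Sort the nodes by token so the ring can be binary-searched.
        self._entries = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def node_for(self, key: str) -> str:
        # First node whose token is >= the key's token, wrapping around at the end.
        i = bisect.bisect_left(self._tokens, token(key))
        return self._entries[i % len(self._entries)][1]


ring = Ring(["node-a", "node-b", "node-c"])  # hypothetical server names
placement = {key: ring.node_for(key) for key in ("user:1", "user:2", "user:3", "user:4")}
```

Any machine that knows the list of nodes can compute where a key lives, so in this sketch no single server has to hold a directory of all records.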
Cassandra combines the non-relational data model of BigTable, created by Google and used for the index of its search engine, with the distributed design of Dynamo, the storage system from Amazon.
It was open sourced by Facebook in 2008 and is now supported by the Apache Foundation as the Cassandra Project.
The name Cassandra comes from Greek mythology: it is the name of a princess who had the power to predict the future but whose fate was never to be believed. The logo, a woman's eye, refers to the idea of vision. Presumably the developers expected not to be believed about the future of this system.
SQL or not SQL?
Cassandra is part of NoSQL, the movement that aims to simplify databases by removing the relational aspect.
Tables no longer have a predetermined, fixed schema (one that can admittedly be changed later); they can grow horizontally (more columns) as well as vertically (more rows, that is, more records).
NoSQL is usually read as Not Only SQL, so the name is about the data model rather than the query language. Cassandra itself is queried with CQL, a language whose syntax closely resembles SQL.
Cassandra vs. MySQL
Here is a benchmark comparison provided by Apache:
- Writing: MySQL: 300 ms. Cassandra: 0.12 ms.
- Reading: MySQL: 350 ms. Cassandra: 15 ms.
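To put these figures in proportion, the short Python sketch below computes the ratio between the two systems for each operation. The dictionary layout and the speedup function are illustrative assumptions; the latencies are simply the ones quoted in the list above.

```python
# Latencies in milliseconds, copied from the benchmark figures quoted above.
latency_ms = {
    "write": {"mysql": 300.0, "cassandra": 0.12},
    "read": {"mysql": 350.0, "cassandra": 15.0},
}


def speedup(op: str) -> float:
    """Return how many times lower Cassandra's latency is than MySQL's for one operation."""
    row = latency_ms[op]
    return row["mysql"] / row["cassandra"]


for op in latency_ms:
    print(f"{op}: about {speedup(op):.0f}x faster")
```

By this arithmetic the gap is far larger for writes (a factor of about 2500) than for reads (a factor of about 23).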
Differences in features:
- Number of columns: MySQL: 4096. Cassandra: 2 billion.
Cassandra is schemaless and has no tables in the relational sense: the number of columns can vary from one row to another.
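The idea of rows that each carry their own set of columns can be sketched in Python as a dictionary of dictionaries. The names used here (users, insert, the sample columns) are hypothetical; this models the concept only and is neither Cassandra's storage engine nor its API.

```python
from collections import defaultdict

# A column family modeled as: row key -> {column name -> value}.
users = defaultdict(dict)


def insert(row_key, **columns):
    """Add or overwrite columns on one row; other rows are unaffected."""
    users[row_key].update(columns)


insert("alice", email="alice@example.com")
insert("bob", email="bob@example.com", city="Paris", age=42)
insert("alice", last_login="2010-05-01")  # a new column on an existing row

# Each row ends up with its own number of columns.
column_counts = {key: len(cols) for key, cols in users.items()}
```

No schema change is needed to give one row a column that the others lack, which is the contrast with a fixed relational table.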
Cassandra originated at Facebook, even though the project was later handed over to Apache.
See Software powering Facebook.
Twitter does not yet use Cassandra to manage tweets, because that would require rewriting its system, but it does use it for statistical data and geolocation.
Complaining of slowness in MySQL, Digg decided to completely reimplement its data management on Cassandra.
See Why Digg replaced MySQL.
Netflix, the streaming TV company, prefers to forgo the benefits of a relational database in exchange for the scalability of Cassandra.
Cassandra was originally designed for Facebook on the basis of a model made for Google.
It is being adopted by more and more different players, but it is a new product that has not yet been confronted with the full variety of possible uses. Setbacks can be expected when adapting it to a new application.