FAQ

How should I use DynomiteDB?

The currently recommended use case is to use DynomiteDB with a Redis backend as a globally distributed, highly available cache.

Using Redis as a backend will provide you with a high performance cache that can scale to support your entire dataset in memory.

What is different about DynomiteDB?

DynomiteDB is a modern NoSQL distributed database that achieves high performance and high availability (HA) via a modular architecture.

DynomiteDB supports pluggable backends which means that you can use the Dynomite clustering layer with different backends to support a variety of use cases. For example, you can combine Dynomite with Redis to use DynomiteDB as a big data cache. Alternatively, you can setup another cluster that combines Dynomite with RocksDB as a big data database. These two solutions reuse the same Dynomite layer to support vastly different use cases. With Redis, you gain exceptionally fast in-memory performance. With RocksDB, you gain strong persistent performance combined with minimal write amplification (which is ideal when you need to store massive amounts of data).

The fact that you can use DynomiteDB for both your cache and persistent database means that your operations staff can reuse tooling and knowledge for both solutions.

Last, and certainly not least, DynomiteDB helps improve developer velocity by abstracting away sharding, replication and other distributed systems concepts. Developers can focus on application requirements by using a simple, Redis-compatible API to access DynomiteDB.

What is the difference between a rack and a data center?

Let’s start at the top of the DynomiteDB hierarchy. The top level contain is the cluster. A cluster contains one or more data centers (DC) and each DC contains multiple racks. Racks contain multiple servers.

Additional information is provided in the Terminology document.

What is a backend?

A backend is a database storage engine and can be either volatile (in-memory) or persistent (on disk).

Which backends are supported by DynomiteDB?

Redis
Memcached

Why not Cassandra?

While DynomiteDB and Cassandra are both Dynamo-inspired systems, each database has very different performance characteristics.

Cassandra is written in Java and uses compaction in its storage engine. As a result, Cassandra experiences garbage collection (GC) pauses at runtime which can result in unexpected latency.

While Cassandra has fast write performance under normal conditions, the use of compaction hurts Cassandra’s write performance under very high frequency writes which causes node failure when compaction cannot catch up with the frequency of writes. Compaction also increases read latency due to the use of extra CPU and disk.

A benefit if DynomiteDB is that it achieves the same throughput and latency as Cassandra at a 15x to 20x cost reduction. In other words, DynomiteDB fundamentally improves the economics of a distributed database.

Why not Redis cluster?

DynomiteDB provides substantially better availability and scalability than Redis cluster.

Redis cluster is a multi-master/slave system and therefore suffers from the problems of master/slave systems including coordination overhead.

DynomiteDB is a masterless system that scales linearly. It supports multiple data centers (DC) and is highly available as each node is independent of every other node. In other words, DynomiteDB will continue to function normally even when a server, rack or DC is offline.

Why not use Twemproxy?

Twemproxy and DynomiteDB solve different problems. Twemproxy provides consistent hashing, proxying and connection pooling. Dynomite, which was originally a fork of Twemproxy, add a complete Dynamo-inspired layer on top of the base Twemproxy functionality.

Is Dynomite a fork from Twemproxy?

Yes, Dynomite is a fork of Twemproxy. Twemproxy was used to jump start development. However, Dynomite’s design is substantially different from Twemproxy, so eventually Dynomite went off in its own direction. Importantly, Dynomite has retained all of Twemproxy’s commit histories, its copyright notices and copyright notices for other open source software that Twemproxy uses (such as the FreeBSD code from University of Berkeley).

What do I need to install before building Dynomite?

The following prerequisites are required to build Dynomite:

libevent
autoconf
libtool
Redis or Memcached

How can I monitor a Dynomite process?

Dynomite provides statistics on the running process on port 22222 by default. The statistics are accessible via the HTTP protocol.

curl 'http://localhost:22222/info'

Why is Dynomite written in C?

Dynomite was written in C for the following reasons:

C/C++ is fast and provides excellent performance for Dynomite
C/C++ provides total control
C was chosen over C++ as the additional features provided by C++ are not used by Dynomite

JVM languages (Java/Scala) were evaluated before Dynomite development began. Unlike C, a JVM language suffers from GC pauses and would provide with no deterministic control over GC.

Go was also evaluated before development began. While Go was a new and interesting language, at the time it was evaluated it still suffered from long GC pauses.

Is there something like Priam for Cassandra?

Yes, there is. We call this ‘Florida’ and it will play the similar role as Priam for helping to manage a multi-region Dynomite cluster in AWS cloud. We will open source this soon.

What client can we use to connect to DynomiteDB?

You can use any Redis client to connect to DynomiteDB.

However, if you are using a JVM language then we strongly recommend that you use the Dyno client. Dyno provides additional functional beyond a default Redis client, including:

Connection pooling
Network topology aware routing (intelligent request routing)
Connection pool metrics
Ability to route traffic away from nodes (helpful for maintenance)
And many other advanced features

Can I access an individual backend?

Yes, DynomiteDB allows you to directly access an individual backend on a single host. This capability is useful when you went to perform maintainance on a single node, or want to inspect the data on a single node without interacting with the cluster.