NoSQL Review: Aerospike 3.14

Updated. Review of the latest Aerospike NoSQL database: version 3.14

Aerospike is a key-value store NoSQL database with a unique native flash storage layer, making it blindingly fast. It is one of my favourite NoSQL databases thanks to its architecture and focus on performance.

Updated 22/Jun/2017: Added information about the new Incremental Backup and Filtering features. Also added info from the just-released 3.14 blog post. Sod’s law that the 3.13 and 3.14 blog posts were released a couple of days after my review! Ah well… Updates highlighted in underlined text.

Vital Statistics

Latest release: Version 3.14 (Jun 6th, 2017) (Press release latest is 3.11 only, Jan 31st)
Commercial backer: Aerospike Inc
Twitter: @aerospikedb
Licensing: Community core (Affero GPL V3.0) with Enterprise supported version
Sales model: Innovative pricing based on the volume of data managed
Release Press Release: http://www.aerospike.com/press-releases/aerospike-3-11/
Release Full Details: 3.11, 3.12, 3.13 & 3.14

What’s new

Aerospike are positioning themselves as having a ‘hybrid memory database’ rather than a high-performance key-value store, which is what it is functionally. With improvements in RAM usage and a more fine-grained locking capability in this release, you can see why they position themselves this way.

Transport Layer Security (TLS 1.2) was added in version 3.11, on both the server and the client libraries. I must admit, I was a bit surprised that Aerospike didn’t have this before. It’s very common across the industry and a requirement for highly security-conscious customers. If you need this, upgrade to version 3.11 or above.

To be fair to Aerospike, the database is normally deployed deep in networks next to the application servers that will communicate with it, so unencrypted communications weren’t crossing the Internet or anything, just staying within the datacentre.
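As a rough illustration of what "TLS 1.2 or better" means on the client side, here is a minimal sketch using Python's standard `ssl` module. This is not Aerospike's client API; the function name `make_tls12_context` is my own, and how you hand such settings to a particular Aerospike client is something to check in that client's documentation.

```python
import ssl

def make_tls12_context():
    """Build a client-side TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() turns on certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Reject TLS 1.0/1.1 and SSLv3 handshakes outright.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx

ctx = make_tls12_context()
print(ctx.minimum_version.name, ctx.verify_mode.name)
```

The point is simply that the floor is set once on the context, and every connection made with it inherits that floor.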

IPv6 support is now available in all client drivers, whereas this feature was introduced in the server for version 3.10. Although I don’t know many places using just IPv6 in anger, it’s the kinda feature that if you need it, then you REALLY need it!

Improved support for the Kafka connector architecture is included in this release. This should help those that have adopted Kafka to more easily load data into Aerospike.

What I like

Official client APIs that support all the new functionality are available for Java, C#, C and Go, a nice range of languages that should suffice for most development houses.

Rolling updates are not new in Aerospike, but I love seeing companies just drop in the fact they can do this across the cluster. For administrators of large, live, mission-critical clusters this is a godsend!

New clustering approaches that have culminated in the V 3.14 release are a great boon. By Aerospike’s own admission they previously aimed at clusters of 10-80 servers. Some clients obviously need more than this, and now they claim the ability to scale to 128 nodes (Aerospike’s configuration limit).

It’s worth noting that this is hard to do. Cross-talk between sharded servers is a known engineering problem for NoSQL databases. Aerospike claim this release’s cluster comms are sped up ‘100x’, which is truly impressive if true. You should definitely check this out on your existing cluster if you’re nearing 80 servers. Measure network use before and after the upgrade, over the same length of time, for a rough indication.
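One rough way to take that before-and-after measurement on a Linux node is to snapshot the interface byte counters in `/proc/net/dev` at the start and end of a fixed window. The sketch below parses that format; the interface name `eth0` and the sample counter values are made up for illustration, and it counts all traffic on the interface, not just cluster heartbeats.

```python
def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev-style text."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters

def traffic_delta(before, after, interface):
    """Total bytes sent plus received on `interface` between two snapshots."""
    rx0, tx0 = before[interface]
    rx1, tx1 = after[interface]
    return (rx1 - rx0) + (tx1 - tx0)

# Made-up snapshots; on a real node use open("/proc/net/dev").read() instead.
SAMPLE_BEFORE = """Inter-|   Receive                  |  Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
  eth0: 1000 10 0 0 0 0 0 0 2000 20 0 0 0 0 0 0
"""
SAMPLE_AFTER = SAMPLE_BEFORE.replace("1000 10", "1600 16").replace("2000 20", "2900 29")

delta = traffic_delta(parse_net_dev(SAMPLE_BEFORE), parse_net_dev(SAMPLE_AFTER), "eth0")
print("bytes in window:", delta)
```

Run it for the same window length on the old and new versions under similar load; the ratio of the two deltas gives a crude feel for how much quieter the cluster has become.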

Incremental backup and restore is new in 3.12. This relies on the filtering functionality I mention below. One downside is that incremental backups are limited to timestamps, which requires your servers’ clocks to be very closely synchronised. I would prefer to see a cluster timestamp rather than server time, as the current approach may be prone to ‘missing’ key data. The current feature feels more like a timed export than a true backup, but I expect this to improve in future versions. Incremental backup is also like catnip for administrators, especially those with, for example, large data churn during the working week. Full details on their website.
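To show why server-time stamps worry me, here is a toy simulation of a timestamp-based incremental backup. It is not Aerospike's backup tool; the `Record` type, the `write` and `incremental_backup` helpers, and the five-second skew are all invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Record:
    key: str
    last_update: float  # stamped with the writing node's local clock

def write(key, true_time, node_clock_offset):
    """Stamp a record using the node's local clock (true time + offset)."""
    return Record(key, true_time + node_clock_offset)

def incremental_backup(records, modified_after):
    """Select records whose timestamp is later than the previous backup time."""
    return [r.key for r in records if r.last_update > modified_after]

# Hypothetical scenario: the previous backup ran at true time 100.
previous_backup_time = 100.0
records = [
    write("a", true_time=103.0, node_clock_offset=0.0),   # well-synced node
    write("b", true_time=103.0, node_clock_offset=-5.0),  # node running 5s slow
]

selected = incremental_backup(records, previous_backup_time)
missed = [r.key for r in records if r.key not in selected]
print("backed up:", selected, "missed:", missed)

# Workaround: pull the cut-off back by the worst clock skew you expect.
safe = incremental_backup(records, previous_backup_time - 5.0)
print("with safety margin:", safe)
```

Record "b" was written after the previous backup, yet its stamp says 98, so a plain "modified after 100" filter skips it. Pulling the cut-off back catches it at the cost of re-exporting a few records twice.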

In general, I just love the approach Aerospike take to high performance. They wrote their own Flash SSD native handling layer rather than rely on the underlying operating system in order to squeeze the best performance out of the drives.

The focus on high performance across its product’s features and client libraries is a winning combination that I’m sure will keep customers with acute performance problems knocking on Aerospike’s door for years to come.

What’s not so good

The improvements to the SortedMap type will be welcome, allowing higher-speed interaction with a more complex data type than just keys and values. I would like to see support for more complex data types and operations than is currently provided, however. I do love the fact that Aerospike are very open about the computational complexity of their operations.

The Redis database is still the king in this area, supporting a wide range of data types and operations in its key-value store. Aerospike’s support of User Defined Functions (UDFs) means that customers can work around this if they need to.

Geospatial support appears limited to spheres (which the Earth isn’t), and the library used appears not to have been updated in six years, leading me to wonder how much investment is going into Geo beyond simple storage of geospatial data.

There’s a hint in the documentation page that Geospatial precision may not be accurate to very short distances, and that coordinates may be indexed as single-precision floating point rather than double precision. This means centimetre resolution may not be possible. Please get in contact if you’ve tested this, I’d love to hear about your experience.
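To put a number on that worry, the sketch below rounds a longitude to single precision and converts the rounding error to metres at the equator. It assumes coordinates really are held as 32-bit floats, which I have not confirmed; the sample longitude and the figure of roughly 111,320 metres per degree are illustrative values of mine.

```python
import struct

METRES_PER_DEGREE = 111_320.0  # approx. length of one degree of longitude at the equator

def to_float32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def rounding_error_metres(degrees):
    """Distance, in metres at the equator, lost by storing `degrees` as float32."""
    return abs(to_float32(degrees) - degrees) * METRES_PER_DEGREE

# For values between 128 and 256 degrees, adjacent float32 values are 2**-16 apart.
spacing_metres = 2.0 ** -16 * METRES_PER_DEGREE

lon = 151.2093001  # hypothetical longitude in degrees
err = rounding_error_metres(lon)
print(f"float32 grid spacing: {spacing_metres:.2f} m, error for {lon}: {err:.3f} m")
```

At large longitudes the float32 grid is well over a metre wide, so if single precision is in use then centimetre resolution is out of reach there. Double precision has no such problem at these scales.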

Filtering of read results and scans is a new feature, with low-level API support added to just the Java and C APIs for now. The filters create C code that is compiled for speed of execution, though. It will be interesting to see the higher-level APIs that are built on this capability. This is an oft-needed feature, as Aerospike’s blog post itself mentions. Be sure also to read up about Secondary Index Support.

The biggest technical issue I’ve found (thanks to the person who got in contact after the original version of this post was published) is that secondary indexes are not persisted: they are recreated from logs on restart. This means you are not up and running with reliable queries until all indexes are rebuilt, which could take hours. This is simply unacceptable and should be fixed ASAP. I hope Aerospike will look into this as soon as they can. See discussion of this problem from this user and this ranting user.
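To see why a rebuild takes so long on a big dataset, here is a toy in-memory secondary index built by scanning every record. It is in no way Aerospike's implementation; the record layout, the `city` bin, and the `rebuild_secondary_index` name are invented for the example.

```python
from collections import defaultdict

def rebuild_secondary_index(records, bin_name):
    """Scan every record once and map each bin value to the keys that hold it."""
    index = defaultdict(set)
    for key, bins in records.items():
        if bin_name in bins:
            index[bins[bin_name]].add(key)
    return index

# Hypothetical records keyed by primary key, each holding a dict of bins.
records = {
    "user:1": {"city": "Leeds"},
    "user:2": {"city": "York"},
    "user:3": {"city": "Leeds"},
}

index = rebuild_secondary_index(records, "city")
print(sorted(index["Leeds"]))
```

The work is proportional to the number of records, and a query against the index cannot be trusted until the loop has finished. With billions of records per node that is how you end up waiting hours.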

Although this is the ‘not so good’ section, I have to say that I do love how open Aerospike are by listing their known limitations! Good job! I’d advise people to check this list for any deal breakers… Then let Aerospike know so they can improve what is a great product.

Where is it used

Financial Services is a key area that Aerospike focus on. It would be interesting to see how many people use Aerospike as an alternative architecture to Software AG’s Terracotta or Oracle’s Coherence shared Java memory approach, which is also prevalent in Financial Services.

Advertising is also a key space, as is common for all key-value stores. Telecom and e-commerce are other customer areas too.

There is a complete list of customers on their website.
