Has anyone worked with Aerospike? How does it compare to MongoDB?

Can anyone say if Aerospike is as good as they claim it to be? I'm a bit skeptical since it's a commercial enterprise. As far as I understand, they just released an open source version, but the claims on their website could still be exaggerated.

I'm especially interested in how Aerospike compares to MongoDB.

I have used Aerospike, MongoDB, Redis, and tested many other NoSQL databases, and I would say Aerospike is very good at what it does, but it is different from MongoDB. Everything depends on what you are planning to use a database for. I can give you an example of what I am using my different databases for, and go over the differences between them and Aerospike, and the benefits of Aerospike.

MongoDB

I am using MongoDB as a SQL alternative. My MongoDB database has many different fields, the fields change often, and I randomly need to query on various fields. It is a very unstructured database, and MongoDB is amazing at that. I have also used MongoDB as a standard key-value store. It performs well, but I have had MongoDB perform suboptimally at scale (both transaction scale and database size scale). Admittedly, the database could probably have been optimized a little better, but I find it very hard to find documentation on configuring MongoDB correctly for different situations.

Redis

Redis is a pure key-value store. Redis' biggest problem is that it is purely in memory (there are on-disk backups, but you cannot store more information than you have memory available). It is extremely fast for what it is used for. I personally use it for a small transactional database: I do very simple operations on keys, like counting how many times an event happened for a certain user. I also do quick in-memory lookups that I need mapped to different values. Redis is a great tool for a small dataset, and it is extremely fast. Configuration is very easy as well.
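The per-user event counter described above can be sketched roughly like this. The key layout and the `FakeRedis` stand-in are hypothetical and only there so the example runs without a server; with the redis-py client the same `incr` call would go to a real Redis instance.

```python
from collections import defaultdict


class FakeRedis:
    """In-memory stand-in exposing only incr(), so the sketch runs without a server."""

    def __init__(self):
        self._data = defaultdict(int)

    def incr(self, key, amount=1):
        # Same idea as Redis INCR: a missing key starts at 0.
        self._data[key] += amount
        return self._data[key]


def count_event(client, user_id, event):
    """Increment and return the counter for one (user, event) pair."""
    key = f"events:{user_id}:{event}"  # hypothetical key layout
    return client.incr(key)


client = FakeRedis()  # assumption: swap for redis.Redis(...) against a real server
count_event(client, 42, "click")
clicks = count_event(client, 42, "click")
views = count_event(client, 42, "view")
print(clicks, views)
```

Each (user, event) pair gets its own key, so counting stays a single atomic increment per event.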

Aerospike

I personally use Aerospike as a solution to scaling a Redis database, although from my understanding, it can be used for more. Like Redis, Aerospike is a Key-Value store. I believe the open source edition also supports secondary indexes, which Redis does not (I have not used secondary indexes in production, but I have done a little testing on them).

Aerospike's best feature is its ability to scale. The biggest problem I needed to solve when looking into Aerospike was how to scale my system for large data sets and still be extremely fast. The project I use Aerospike for has very stringent requirements on speed. I usually make 3-4 database lookups, plus other processing, and need to have sub-50ms transaction times. A few of those lookups are on data sets which are 300GB+. I could not find a solution to hold this data and make it accessible in a reasonable amount of time. Redis obviously won't work unless I had a machine with 300GB+ of RAM. MongoDB started to perform extremely poorly at a size much lower than 300GB. So I gave Aerospike a shot, and it was able to handle everything very well. The best thing about Aerospike has been that as my data set has grown, I have not had to do much more than stand up a new box when needed, and the speed has stayed consistent.
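To make the latency requirement above concrete, here is a rough budget calculation. The 50ms total and the 4 lookups come from the description above; the 10ms for "other processing" is a made-up number for illustration only.

```python
def per_lookup_budget_ms(total_budget_ms, other_processing_ms, lookups):
    """Split what is left of the transaction budget evenly across lookups."""
    remaining = total_budget_ms - other_processing_ms
    if remaining <= 0:
        raise ValueError("processing alone exceeds the budget")
    return remaining / lookups


# Assumption: 10 ms of non-database work per transaction.
budget = per_lookup_budget_ms(total_budget_ms=50, other_processing_ms=10, lookups=4)
print(budget)
```

Under that assumption each lookup gets about 10ms if they run one after another, which is why a store that slows down on a 300GB+ data set quickly breaks the whole transaction.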

I also find Aerospike's documentation very good. It isn't too hard to configure, and it's pretty easy to find answers for any issues that have come up.

Conclusion

So back to your question of whether Aerospike is as good as they claim. Personally, I have not seen anything less than what they have claimed in my uses. I haven't had to scale to 1 million TPS, but I do believe that would be possible with enough hardware. I also believe their numbers on the speed difference compared to MongoDB. Aerospike is a much more "configured" and "planned out" database than MongoDB. Because of this, at scale Aerospike will be much faster than MongoDB. It only has to worry about a single index (or, with secondary indexes, up to a few hundred), unlike MongoDB, where they can change dynamically. The question you really need to be asking is what goal you are trying to accomplish with your database, and then look into which database will fit your needs best. If you need a scalable, fast key-value store, I would say Aerospike is probably the best out there.

Let me know if you have any specific questions or need anything clarified, I would probably be able to help you out.

I've built several ad networks with data up to 80TB using MongoDB and Aerospike.

Speed

Aerospike is absolutely unbeatable in speed. Almost any system is lightning fast with low load or simple data access but Aerospike has stayed consistently fast with < 5ms for > 99% of lookups at around 200k TPS. Mongo is fast when used standalone or in a small cluster but starts to have severe problems when doing lots of writes (anywhere close to 25% writes = slow).

Reliability

Aerospike has given us 0 downtime. MongoDB has caused 2 major outages: 1 was a bug in their slave oplog code and the other was a config mistake we made which was hard to recover from. The clustering with Aerospike is far easier to set up and just works. We can restart/add/delete a server anytime without worry, the official drivers handle everything about cluster connectivity, and we've never had to manage anything manually. There's a lot to be said for that peace of mind.

Setup/Configuration

Aerospike wins. There is no contest here, and after managing large datasets growing rapidly with strict performance requirements, it's so much nicer to just use something that works without worry. MongoDB can be easier if you're just setting up a single server, as it runs on more platforms natively and you can start it without any config, but there's no production use for a single-node database.

MongoDB has two major ways of clustering, replica sets (for data availability) and sharding (for data scalability). We had 5 shards and each shard had a replica-set of 3 servers. That's 15 "servers" and in our case we used virtual machines for each. Then we had (3) config servers that maintained the cluster config and had to add (2) arbiter processes after our 1st major outage to deal with properly escalating a slave to master.
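The moving pieces in that layout can be tallied with a small sketch. The class and field names are hypothetical; the numbers are the ones described above. Note that this counts processes, and some of them (arbiters, for example) may share a machine, so the machine count can be lower.

```python
from dataclasses import dataclass


@dataclass
class MongoTopology:
    shards: int
    replicas_per_shard: int
    config_servers: int
    arbiters: int

    def data_servers(self):
        # Every shard is a full replica set.
        return self.shards * self.replicas_per_shard

    def total_processes(self):
        return self.data_servers() + self.config_servers + self.arbiters


topology = MongoTopology(shards=5, replicas_per_shard=3, config_servers=3, arbiters=2)
print(topology.data_servers(), topology.total_processes())
```

Adding one more shard in this model adds three data servers at once, which is part of why changing the layout later is painful.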

That's a lot of moving pieces and also makes it incredibly hard to change your layout in the future. We originally started with just a replica-set and when performance wasn't enough we started sharding. That was complicated and I don't want to even think about what would happen if we picked the wrong shard key.

In contrast, Aerospike took me about 20 mins to set up a 2 node cluster on EC2. And that same cluster is now in production with just 1 more server added. All the nodes are exactly the same, no special master or config nodes anywhere. You just have to configure your namespaces and the network settings. We use mesh networking so just point 1 node at the IP of another node in the cluster and that's it, your cluster is up and running. Adding a new server is the same thing, you don't have to touch the old machines. The only issue is if you use mesh network (instead of multicast) and you remove a machine that's being pointed at by another node, make sure to update that existing node so if it ever loses sync it can seed itself again.
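The mesh caveat above (a node still pointing at a seed that has been removed) is easy to check for. This is only a model of the idea in plain Python, not Aerospike tooling; the addresses and the `stale_seeds` helper are hypothetical.

```python
def stale_seeds(seed_config, live_nodes):
    """Return {node: [seed addresses that are no longer in the cluster]}."""
    live = set(live_nodes)
    stale = {}
    for node, seeds in seed_config.items():
        if node not in live:
            continue  # removed nodes need no fixing
        missing = [s for s in seeds if s not in live]
        if missing:
            stale[node] = missing
    return stale


# Hypothetical addresses: the later nodes were all seeded with 10.0.0.1.
seed_config = {
    "10.0.0.1": [],
    "10.0.0.2": ["10.0.0.1"],
    "10.0.0.3": ["10.0.0.1"],
}
# 10.0.0.1 has since been removed from the cluster.
to_fix = stale_seeds(seed_config, live_nodes=["10.0.0.2", "10.0.0.3"])
print(to_fix)
```

Any node that shows up in the result should have its seed address updated to a machine that is still in the cluster.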

Aerospike also has the Aerospike Management Console which gives you a nice GUI into status of your cluster. No unwanted sign up for MMS necessary.

Data Access

This is a tie. MongoDB has database > collection > record, with each record just being a JSON document with a key. Aerospike has pre-configured namespace > set > record, where each record is a collection of key-value "bins". This really depends on what you're doing with your application. I generally like Aerospike more here: although key-value bins are a little more to deal with in your app than a bit of JSON, Aerospike comes with built-in server-side support for more complex data types like large sets, stacks and lists, where the server just handles everything for you (as long as you use their official drivers, but why wouldn't you?).
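As a rough illustration of the two shapes described above, here is a plain-Python sketch that splits a Mongo-style document into an Aerospike-style key and bins. No database client is used. The namespace, set and field names are made up, and the (namespace, set, key) tuple is, as I understand it, the shape the Aerospike Python client uses for keys.

```python
def mongo_doc_to_aerospike(namespace, set_name, doc):
    """Split a Mongo-style document into an Aerospike-style (key, bins) pair."""
    # Every field except the document key becomes a bin.
    bins = {name: value for name, value in doc.items() if name != "_id"}
    key = (namespace, set_name, doc["_id"])
    return key, bins


# Hypothetical record: in MongoDB this would live in some database > collection.
doc = {"_id": "user:42", "name": "Ada", "visits": 3}
key, bins = mongo_doc_to_aerospike("test", "users", doc)
print(key, bins)
```

The main practical difference is that the namespace has to exist in the server config beforehand, while the set and bins can be chosen by the application.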

Both have secondary indexes, although MongoDB lets you query immediately by anything while Aerospike requires some setup first (unless you're getting a specific key or scanning all records). Both have built-in aggregation frameworks. Aerospike clients have more first-class support for Lua scripting, but MongoDB supports MapReduce and custom JavaScript functions as well. I'd say they're equal here and it really depends on what you're trying to do.

Cost

Both are now open-source and free. Both have enterprise versions with a few extra features and if you need them it's worth it, but licensing is expensive if you have lots of data.

Aerospike ended up being a lot cheaper to actually run since it requires far fewer machines (3 nodes vs 17 with our MongoDB setup), and their SSD-based performance is incredible, which means less RAM required on the nodes.

Overall

I'd almost always choose Aerospike at this point. There's just no contest. Outside of relational stuff or needing the ability to run any random query, you can store pretty much anything you want in this database and they have a powerful aggregation framework and scripting support for anything fancy.

The best part, though, is that you can get massive, reliable performance with just a few nodes and can get it all set up in minutes, without having to constantly worry about scaling or maintenance or what happens if a node goes down. Isn't that the whole point of having high availability, so you don't have to worry? Aerospike more than meets its claims and I'm glad we're using it.

Update October 2014: Aerospike now offers a special program for qualified startups to get free access to the enterprise version, contact them or feel free to message me directly and I'll put you in touch with someone on their team.

I have used MongoDB (2.4) and Aerospike 3 in our production systems. These are a few observations from our team:

1) Read/write throughput with Aerospike is unbeatable. MongoDB usually works up to a certain scale if the workload is read-heavy. With concurrent reads and writes at a 95/5 ratio, MongoDB degrades badly. With Aerospike we saw very little impact even at a 90/10 ratio. On AWS we have achieved 200k TPS using Aerospike.

2) Latency in Aerospike is very low. Read latency was sub-millisecond at the 99th percentile on the server side. Write latency was sub-millisecond at the 80th percentile and within 8ms at the 100th percentile. Best of all, we got almost the same numbers in different POCs, so performance is consistent.

3) Very few nodes are sufficient in an Aerospike cluster compared to other solutions. The SSD-based data store also gives quite impressive numbers, so it is very cost effective with little maintenance overhead.

4)Now Aerospike is open source, so hope for wider community support :-)
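For anyone who wants to check percentile figures like the ones in point 2 against their own measurements, here is a minimal sketch using the nearest-rank method. The sample latencies are made up for illustration and are not our production data.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up write latencies in milliseconds, for illustration only.
latencies_ms = [0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.6, 0.7, 0.9, 8.0]
p80 = percentile(latencies_ms, 80)
p100 = percentile(latencies_ms, 100)
print(p80, p100)
```

The 100th percentile is simply the slowest request, which is why a single outlier can sit at 8ms while most requests stay under a millisecond.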

So we are using Aerospike for all the new systems and trying to migrate from MongoDB.

MongoDB and Aerospike are not built for the same kind of data management, just as SQL is not dead either.

We have built cache systems with sharded clusters on MongoDB using TokuMX (version 2.0.0, based on MongoDB 2.4.10). The system is still running well, with only 0.1% of queries taking more than 100ms on 65 million queries per day and about 10 million updates per day. We're now trying Aerospike, which seems to be great and is now open source. There is only one problem with this open-source version:

DON'T USE IT IN PRODUCTION SERVERS !

The security management is only available in Enterprise distribution. It means that

YOU CAN NOT SECURE ANYTHING WITH PASSWORD AND USER !

Now, if you don't mind, you can use it in production, but don't forget that any of your clients can ask for a security audit, and then you'll have to pay a lot.

Aerospike is an in-memory DB and Mongo is a document DB, so I doubt how anyone can compare the two. I would recommend Redis as of today, because Aerospike has only very recently gone open source. Mongo has huge backing and wonderful contributors too. It surely is one of the best document DBs in production. Many mobile apps have Mongo as their back end. I even use it for my own apps.