Been crazily busy, in a good way, the last six months. Finally got a little time to catch you all up on all things me and NoSQL!
The retail version of NoSQL for Dummies [Amazon UK] [Amazon USA], not to be confused with MarkLogic’s free Enterprise NoSQL for Dummies download, has been on sale for 6 months now. I’ve sold well over a thousand copies. It was even No1 in databases for over a month on Amazon.com!
I spoke at the Big Data Meetup in Munich earlier this year. Talked about Hadoop storage and batch processing versus an OLTP NoSQL database being used on top. Great feedback and involvement from the audience… and great beer and local food too! Managed to get a Wiener Schnitzel whilst I was over there – not had one in years!
I spoke at MarkLogic World San Francisco, helping out with the Semantics Hands-On session. Apologies to all who went to Amsterdam and London to see a repeat of the session – I couldn’t be there having injured my foot with the cadet forces on a mountain a couple of weeks before. I know my colleagues ably delivered the session anyway.
I also spoke with my colleague Barry Lloyd at the Public Sector Show in London’s ExCeL centre last month. This was a great event, with lots of people seeing data problems that NoSQL databases can solve. We had standing room only in our breakout session, and over 20 people listening and watching a screen outside of the session. I think it’s down to the Department for Communities and Local Government (DCLG) speaker who was delivering some of the session, showing how business-critical applications with large data submissions can be built using NoSQL databases like MarkLogic.
I’ll also be at a BBC Lunch and Learn event in London in August for their technology staff talking about all things NoSQL, with possibly a repeat session in Salford, Manchester (practically local for me!) at some later point if there’s enough interest.
Biggest story first: FoundationDB were bought by Apple of all people!!! They wanted FoundationDB for their own nefarious needs in their cloud. Suspicion is a new online service they want to offer needs an ACID compliant NoSQL database, so Apple stumped up the cash and bought FoundationDB. FoundationDB’s current customers will need to move off and migrate, so plenty of opportunity for MarkLogic there as one of the few other ACID compliant NoSQL databases.
A little older news from December 2014 (I’ve not blogged since then!), but MongoDB acquired WiredTiger. This was a smart move as MongoDB’s default storage tech (which weirdly is STILL their default) sucked performance-wise and could never achieve ACID compliance, which is something on their roadmap. WiredTiger with MongoDB 3.0 still has teething problems, but in a couple of years’ time they may have a very good storage and data consistency story… not yet though unfortunately.
Most Enterprises tell me they will only adopt something like MongoDB when it can provide true ACID compliance and data integrity. I think MongoDB bought WiredTiger to compete with the likes of Cassandra on performance. I would personally prefer to see a bit more thought leadership on actually doing something with the data they store – at the moment it feels like a ‘dumb’ store with just very basic element=value style querying. Recent advances in multi-term queries are good, but there’s still a way to go.
It also turns out some DB admins have, stupidly, been leaving their MongoDB instances without any security, which is how they come by default. Over 40 000 MongoDB instances were exposed earlier this year. Oops. Not really MongoDB’s fault, but personally I’d try and fire as many warning messages into the logs and API responses as possible about unsecured use – so there is something they could have done.
A bit of a mixed 6 months for MongoDB all in all.
IBM has also today announced it is buying Compose. Doesn’t sound huge… until you realise that in Jan 2014 they bought Cloudant. They’ve now bought Compose, which is a major operator of NoSQL databases in the cloud. This puts IBM in a strong position to learn how to put NoSQL databases in large cloud infrastructure. I wouldn’t be surprised if in another 6 months’ time they announce a major overhaul of Cloudant and a massive NoSQL cloud hosting capability.
Funny Defence story
A defence software firm, who shall remain nameless, also went to the Defence Geospatial Intelligence conference in January 2015 in London. They claimed a major ISR (Imagery) database could be built with their Couchbase Server based product. Naturally I couldn’t resist putting my hand up and asking the question – how can you rely on a non-ACID compliant database for an ISR database where any data loss could result in loss of life?
Turns out, worryingly, they didn’t know it wasn’t ACID. I was even pointed to their technical architect for the product afterwards and he said it doesn’t lose data. I pointed out that it can: Couchbase can only provide “Strong” consistency and its replicas are asynchronous. So if the primary node dies, not only is the data unavailable on a replica, but it may also have been lost from the primary because it was never journaled to disc.
This seemed to make him go pale… This is why people need to read my book NoSQL for Dummies! NoSQL is a great technology – but you need to know what you’re buying, and why you’re using it.
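The failure mode I described to the architect can be sketched in a few lines of Python. This is a toy model of the general pattern only (a write acknowledged from memory, with asynchronous replication and a later flush to disc), not Couchbase's actual implementation. Every name in it (Node, ToyCluster, simulate_primary_failure) is made up for illustration.

```python
class Node:
    """Toy storage node: writes land in memory first and reach disc only on flush."""

    def __init__(self, name):
        self.name = name
        self.memory = {}  # acknowledged writes, not yet durable
        self.disk = {}    # contents that survive a crash

    def flush(self):
        # Journal everything currently in memory to disc.
        self.disk.update(self.memory)

    def crash_and_restart(self):
        # Anything that only lived in memory is gone after a crash.
        self.memory = dict(self.disk)


class ToyCluster:
    """Hypothetical primary plus one asynchronously updated replica."""

    def __init__(self):
        self.primary = Node("primary")
        self.replica = Node("replica")
        self.replication_queue = []  # writes waiting to be shipped to the replica

    def write(self, key, value):
        # The client gets its acknowledgement as soon as the primary holds
        # the write in memory, before any flush or replication has happened.
        self.primary.memory[key] = value
        self.replication_queue.append((key, value))
        return "ack"

    def replicate(self):
        # Asynchronous replication, run at some later point in time.
        for key, value in self.replication_queue:
            self.replica.memory[key] = value
        self.replication_queue.clear()


def simulate_primary_failure():
    cluster = ToyCluster()

    # First write has time to be flushed and replicated.
    cluster.write("report-1", "first imagery report")
    cluster.primary.flush()
    cluster.replicate()

    # Second write is acknowledged, then the primary dies straight away.
    ack = cluster.write("report-2", "second imagery report")
    cluster.primary.crash_and_restart()
    cluster.replication_queue.clear()  # the queue lived in the primary's memory

    return (
        ack,
        "report-2" in cluster.primary.memory,
        "report-2" in cluster.replica.memory,
        "report-1" in cluster.replica.memory,
    )


if __name__ == "__main__":
    print(simulate_primary_failure())
```

The client was told "ack" for report-2, yet after the crash it exists on neither node, while report-1 is still safely on the replica. An ACID-compliant store would not acknowledge the write until it was durable.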
Semantics for Dummies
MarkLogic have commissioned Semantics for Dummies as a free download. This adds all the semantics goodness that was missing in the original NoSQL for Dummies 18 months ago. Definitely worth a read for those who are RDF challenged. I helped with some of the thoughts going in to the book. I’m glad to say it’s very well written as it turns out, and a great intro read.
Cool stuff I’ve been working on
I’ve been working on an idea for an SQL over HTTP layer for MarkLogic. A bit like Microsoft’s SQL over HTTP in DocumentDB, which was a great idea. A colleague, Gabo Manuel, is working on a first proper SQL compliant prototype for that right now. (Mine had very, very basic “select * from collectionA where element=value” style queries originally!). This is similar to MarkLogic’s ODBC connector, but doesn’t need ODBC (obviously), and uses the universal index for exact matches rather than requiring range indexes for every field in a view.
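To give a flavour of how basic that first style of query is, here is a minimal Python sketch that turns a "select * from X where element=value" string into a neutral, structured query description. It is not my prototype and not any MarkLogic API: the function name parse_basic_select and the output dictionary shape are assumptions made purely for illustration, and a real layer would then hand something like this to the database's own search API.

```python
import re

# Only the very basic form is recognised:
#   select * from <collection> [where <element>=<value>]
_BASIC_SELECT = re.compile(
    r"^\s*select\s+\*\s+from\s+(?P<collection>\w+)"
    r"(?:\s+where\s+(?P<element>\w+)\s*=\s*(?P<value>'[^']*'|[^\s;]+))?"
    r"\s*;?\s*$",
    re.IGNORECASE,
)


def parse_basic_select(sql):
    """Parse a very basic SELECT into a hypothetical structured query dict."""
    match = _BASIC_SELECT.match(sql)
    if match is None:
        raise ValueError("unsupported SQL: " + sql)

    query = {"collection": match.group("collection")}

    element = match.group("element")
    if element is not None:
        # Strip optional single quotes around the value.
        value = match.group("value").strip("'")
        # An exact element=value match, the kind of lookup an index
        # over every element could answer without a per-field range index.
        query["element_equals"] = {"element": element, "value": value}

    return query


if __name__ == "__main__":
    print(parse_basic_select("select * from collectionA where element=value"))
```

Anything beyond this shape (joins, column lists, ordering) raises ValueError in this sketch, which is roughly why a proper SQL compliant prototype is a separate piece of work.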
I’ve also launched MarkLogic Workflow. This uses MarkLogic’s existing Content Processing Framework, with its Finite State Automata-like document lifecycle processing, to create a high-level BPM layer for human- and content-centric processing. This may sound familiar to my old colleagues from IBM FileNet!
We needed this for a customer, so I used the Eclipse BPMN 2.0 modeler to create a plugin for a MarkLogic BPMN runtime. You upload these standard BPMN 2.0 compliant process models to MarkLogic Workflow’s REST API and it makes them executable. It works great, and is being added to by consultants in the field right now.
This is targeted at document approval workflows, production and publishing workflows, and those sorts of things. It’s not meant for system-to-system flows.
A really cool thing is that a colleague, Balvinder Dang, has integrated Orbeon Forms into MarkLogic! This enables you to store form definitions and form output in MarkLogic Server! He’s also integrating this into MarkLogic Workflow, to provide a human UI over workflow steps. Very, very cool! He’s been working closely with a customer and the guys from Orbeon on this.
As for what’s next: not sure! I have plenty of blog ideas though, so watch this space over summer!!!