Of late I have been pondering what I have to say about :
- Distributed MVCC and write-scaling
- Different approaches to eventual consistency with replicated RDBMS
- Various MySQL Cluster related topics
- Various general rambling and unstructured topics
In the meantime here are some things I have found interesting recently :
- Learn You Some Erlang for Great Good
I actually rediscovered this online book after watching some Joe Armstrong + Erlang videos, after watching some spoof video about bringing Erlang up to date. All recommended. Erlang and Ndb Cluster share some Plex heritage, which can still be seen in their architectures today. Since Plex, Erlang has mated with Prolog, and Ndb Cluster was involved in a car crash with C++.
- HyperDex + HyperDex Warp
Something I discovered last year from Emin Gün Sirer's blog and have returned to since. There are a number of nice ideas combined here (chain replication, value dependent chaining, hyperspace hashing, subspaces). My favourites are the concept of 'spurious coordination' and their solution w.r.t. transaction consistency : ordering the route of the optimistic 'distributed commit' based on the affected keys. I guess we need more independent analysis and evaluation to understand the strengths and weaknesses of these techniques.
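To make hyperspace hashing a little more concrete, here is a toy sketch of the idea as I understand it: each searchable attribute becomes an axis, each attribute value hashes to a bucket on its axis, and the resulting coordinate names the region (and so the server) that stores the object. Everything here is an assumption made for illustration, including the names `axis_bucket`, `region_for` and `regions_for_search` and the grid size. None of it is HyperDex's actual API or partitioning scheme.

```python
import hashlib
from itertools import product

BUCKETS_PER_AXIS = 4  # assumed grid resolution, chosen only for this sketch


def axis_bucket(value):
    """Hash one attribute value to a bucket on its axis (stable across runs)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS_PER_AXIS


def region_for(obj, axes):
    """Coordinate of the region holding obj: one bucket per attribute axis."""
    return tuple(axis_bucket(obj[a]) for a in axes)


def regions_for_search(query, axes):
    """Regions that could hold a match for a partial-attribute query.

    Attributes named in the query pin their axis to a single bucket;
    unnamed attributes leave that axis unconstrained.
    """
    per_axis = [
        [axis_bucket(query[a])] if a in query else range(BUCKETS_PER_AXIS)
        for a in axes
    ]
    return set(product(*per_axis))


axes = ("first", "last", "phone")
person = {"first": "Joe", "last": "Armstrong", "phone": "555-0100"}
home = region_for(person, axes)
candidates = regions_for_search({"last": "Armstrong"}, axes)
```

In this sketch a search on one attribute out of three only has to contact the regions sharing that attribute's bucket, rather than every region in the space, and the object's own region is always among them.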
- Kronos
This is a distributed HA 'event ordering system' from the same Cornell HyperDex team. Thinking about distributed MVCC led me to thinking about efficiently maintaining a distributed partial ordering of events while avoiding 'spurious coordination'. Kronos is an attempt to solve part of that problem in a kind of abstracted SOA way. There is some nice detail in the paper about their dependency graph traversal optimisations, and how dependencies are immutable once discovered, so can be cached, replicated for read scale-out etc. This could be a great systems building block.
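As a rough illustration of why immutable dependencies are so cache-friendly, here is a minimal single-process sketch of an event ordering structure. It is not the Kronos API: the class `EventOrder` and its method names are hypothetical, and the real system is distributed and replicated. The sketch only shows a partial order kept as a graph, ordering queries answered by traversal, and positive answers cached forever because an ordering, once established, is never retracted.

```python
class EventOrder:
    """Toy in-memory partial order of events (hypothetical, for illustration)."""

    def __init__(self):
        self._successors = {}       # event -> set of events known to follow it
        self._known_before = set()  # cache of confirmed (earlier, later) pairs

    def _reachable(self, src, dst):
        # Iterative depth-first search along happens-before edges.
        stack, seen = [src], {src}
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            for nxt in self._successors.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    def happens_before(self, a, b):
        """True if a is already ordered before b, directly or transitively."""
        if a == b:
            return False
        if (a, b) in self._known_before:
            return True
        if self._reachable(a, b):
            # Safe to cache: edges are only ever added, never removed.
            self._known_before.add((a, b))
            return True
        # Negative answers are not cached: the pair may be ordered later.
        return False

    def assign_order(self, a, b):
        """Record 'a before b'; refuse if that contradicts the existing order."""
        if a == b or self.happens_before(b, a):
            return False
        self._successors.setdefault(a, set()).add(b)
        return True

    def concurrent(self, a, b):
        """True if neither event is ordered before the other (so far)."""
        return (a != b
                and not self.happens_before(a, b)
                and not self.happens_before(b, a))


order = EventOrder()
order.assign_order("w1", "w2")
order.assign_order("w2", "w3")
```

Only events that actually depend on each other ever get an edge, so unrelated events stay concurrent and need no coordination, which is roughly the 'spurious coordination' point in miniature.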
- Systems Performance book
I am slowly reading this doorstop book from Brendan Gregg, an ex-Solaris kernel engineer at Sun, now at Joyent. It contains a great amount of recent practical information about Linux + Solaris performance analysis and optimisation. Unix performance tools have always been a little opaque to me, with very little about how to approach a performance problem ever being documented. This book covers many old and new tools, but also includes rare information on how to analyse problems with these tools, rather than just syntax and units of values returned. Perhaps even better is his supreme confidence about tackling and solving any performance problem that foolishly catches his eye. I guess that comes from experience, but maybe a little can be conveyed to his readers by this book.
- Google Spanner, Galera Cluster, MoSQL, RAMCloud, NuoDB, OpenReplica
Different approaches, ideas, hidden tradeoffs, strengths and weaknesses!
"On two occasions I have been asked 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to comprehend the kind of confusion of ideas that could provoke such a question."
Strange how often this response has been on my lips since!