The Data Dichotomy

Data systems are about exposing data; services are about hiding it.

Dec 14th, 2016

Elements of Scale: Composing and Scaling Data Platforms

This transcribed talk explores a range of data platforms through a lens of basic hardware and software tradeoffs.

Apr 28th, 2015

Log Structured Merge Trees

A detailed look at the interesting LSM file organisation seen in Bigtable, Cassandra and, most recently, MongoDB.

Feb 14th, 2015

A Story about George

A lighthearted look at Oracle & Google using a metaphorical format. The style won’t suit everyone, but it’s a bit of fun!

Jun 3rd, 2012


Devoxx London 2017 – Rethinking Services With Stateful Streams
May 12th, 2017

Posted on May 12th | Filed under: Blog

Slides from Strata Software Architecture: The Data Dichotomy – Rethinking data and services with streams
Apr 5th, 2017

Posted on Apr 5th | Filed under: Blog, Uncategorized

QCon 2017: The Power of the Log
Mar 8th, 2017

Video (currently available to attendees only).

This talk is about the beauty of sequential access and append-only data structures. We'll explore this in the context of a little-known paper entitled “Log Structured Merge Trees”. LSM describes a surprisingly counterintuitive approach to storing and accessing data in a sequential fashion. It came to prominence in Google's Bigtable paper, and today the use of logs, LSM and append-only data structures drives many of the world's most influential storage systems: Cassandra, HBase, RocksDB, Kafka and more. Finally, we'll look at how the beauty of sequential access goes beyond database internals, right through to how applications communicate, share data and scale.
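To make the LSM idea a little more concrete, here is a minimal sketch in Python. It is not taken from the talk or from any of the systems named above: the class name `TinyLSM`, the `memtable_limit` parameter and the in-memory "runs" are all assumptions made purely for illustration. Writes land in a small in-memory table, full tables are flushed as immutable sorted runs (standing in for sequential file writes), and reads check the newest data first.

```python
import bisect


class TinyLSM:
    """Toy LSM-style store (hypothetical, for illustration only)."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # assumed flush threshold
        self.memtable = {}  # most recent writes, held in memory
        self.runs = []      # immutable sorted runs, oldest first

    def put(self, key, value):
        # Writes never modify existing runs; they only touch the memtable.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Append the memtable as one sorted batch. On disk this would be
        # a single sequential write of a new file.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then runs newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, keeping only the latest value per key.
        merged = {}
        for run in self.runs:  # oldest to newest, so newer values overwrite
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)  # memtable is full, so it is flushed as the first run
db.put("a", 3)
db.put("c", 4)  # second flush; "a" now exists in two runs
db.put("d", 5)  # stays in the memtable
```

The design choice to notice is that an update to `"a"` does not rewrite the older run. The stale value simply sits there until `compact()` merges the runs, which trades some read and background work for cheap, sequential writes.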



Posted on Mar 8th | Filed under: Blog, Uncategorized

The Data Dichotomy
Dec 14th, 2016

A post about services and data, published on the Confluent site.

Posted on Dec 14th | Filed under: Analysis and Opinion, Blog, Top4

Streaming, Databases & Distributed Systems – Bridging the Divide
Nov 23rd, 2016

This talk introduces Stateful Stream Processing and makes a case for SSP as a general approach to data computation in distributed environments.

Slides, alone, can be found here:


Posted on Nov 23rd | Filed under: Blog

Slides from Codemesh & BigDataLdn
Nov 4th, 2016

Posted on Nov 4th | Filed under: Blog
