Book: Designing Event Driven Systems

I wrote a book: Designing Event Driven Systems PDF EPUB

Apr 27th, 2018

The Data Dichotomy

Data Systems are about exposing data, Services are about hiding it.

Dec 14th, 2016

Elements of Scale: Composing and Scaling Data Platforms

This transcribed talk explores a range of data platforms through a lens of basic hardware and software tradeoffs.

Apr 28th, 2015

Log Structured Merge Trees

A detailed look at the interesting LSM file organisation seen in BigTable, Cassandra and most recently MongoDB

Feb 14th, 2015

Blog/News



Slides from Craft Meetup
May 9th, 2018

The slides for the Craft Meetup can be found here.

Posted at May 9th |Filed Under: Blog, Uncategorized - read on


Book: Designing Event Driven Systems
Apr 27th, 2018

I wrote a book: Designing Event Driven Systems

PDF

EPUB

Posted at Apr 27th |Filed Under: Blog, Top4 - read on


Building Event Driven Services with Kafka Streams (Kafka Summit Edition)
Apr 23rd, 2018

The Kafka Summit version of this talk is more practical and includes code examples which walk though how to build a streaming application with Kafka Streams.

Posted at Apr 23rd |Filed Under: Blog, Uncategorized - read on


Slides fo NDC – The Data Dichotomy
Jan 19th, 2018

When building service-based systems, we don’t generally think too much about data. If we need data from another service, we ask for it. This pattern works well for whole swathes of use cases, particularly ones where datasets are small and requirements are simple. But real business services have to join and operate on datasets from many different sources. This can be slow and cumbersome in practice.

These problems stem from an underlying dichotomy. Data systems are built to make data as accessible as possible—a mindset that focuses on getting the job done. Services, instead, focus on encapsulation—a mindset that allows independence and autonomy as we evolve and grow. But these two forces inevitably compete in most serious service-based architectures.

Ben Stopford explains why understanding and accepting this dichotomy is an important part of designing service-based systems at any significant scale. Ben looks at how companies make use of a shared, immutable sequence of records to balance data that sits inside their services with data that is shared, an approach that allows the likes of Uber, Netflix, and LinkedIn to scale to millions of events per second.

Ben concludes by examining the potential of stream processors as a mechanism for joining significant, event-driven datasets across a whole host of services and explains why stream processing provides much of the benefits of data warehousing but without the same degree of centralization.

Posted at Jan 19th |Filed Under: Blog - read on


Handling GDPR: How to make Kafka Forget
Dec 4th, 2017

If you follow the press around Kafka you’ll probably know it’s pretty good at tracking and retaining messages, but sometimes removing messages is important too. GDPR is a good example of this as, amongst other things, it includes the right to be forgotten. This begs a very obvious question: how do you delete arbitrary data from Kafka? It’s an immutable log after all. (more…)

Posted at Dec 4th |Filed Under: Blog - read on


View full blogroll