January, 2011
Beyond the Data Grid: Coherence, Normalisation, Joins and Linear Scalability (QCon)
Thursday, January 27th, 2011
Normalisation is, in many ways, the antithesis of typical cache design: we tend to denormalise for speed. Building a data store (rather than a cache) is a little different: manageability, versioning and bi-temporal reconstitution become more important factors. Normalisation helps solve these problems, but in a distributed architecture it brings the problem of distributed joins, which require iterative network calls.
We’ve developed a mechanism for managing normalisation based on a variant of the Star Schema model used in data warehousing. In our implementation Facts are held distributed (partitioned) in the data nodes and Dimensions are replicated throughout the query-processing nodes. To save space we track ‘used’, or as we term them ‘connected’ data, to ensure only useful objects are replicated.
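As a rough illustration of the idea (not the actual implementation, and not the Coherence API), the sketch below models partitioned Facts, replicated Dimensions and the 'connected' tracking with plain Python dictionaries. Every name in it (`Trade`, `Counterparty`, `Cluster`, `put_fact`, `put_dimension`, `query`) is hypothetical, and the hash-by-modulo partitioning is an assumption made to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    """A Fact (hypothetical example type): large in number, so it is partitioned."""
    trade_id: int
    counterparty_id: int
    amount: float


@dataclass(frozen=True)
class Counterparty:
    """A Dimension (hypothetical example type): small, so it can be replicated."""
    counterparty_id: int
    name: str


class Cluster:
    """Toy model: dicts stand in for data nodes and query-processing nodes."""

    def __init__(self, data_nodes: int, query_nodes: int):
        self.data_nodes = [{} for _ in range(data_nodes)]     # partitioned Facts
        self.query_nodes = [{} for _ in range(query_nodes)]   # replicated Dimensions
        self.dimension_store = {}  # every Dimension, connected or not
        self.connected = set()     # Dimension keys referenced by at least one Fact

    def _replicate(self, dim: Counterparty) -> None:
        for replica in self.query_nodes:
            replica[dim.counterparty_id] = dim

    def put_dimension(self, dim: Counterparty) -> None:
        self.dimension_store[dim.counterparty_id] = dim
        # Only Dimensions that some Fact already references are replicated.
        if dim.counterparty_id in self.connected:
            self._replicate(dim)

    def put_fact(self, fact: Trade) -> None:
        # Assumed partitioning scheme: key modulo the number of data nodes.
        self.data_nodes[fact.trade_id % len(self.data_nodes)][fact.trade_id] = fact
        # Writing a Fact marks the Dimension it references as 'connected'.
        if fact.counterparty_id not in self.connected:
            self.connected.add(fact.counterparty_id)
            dim = self.dimension_store.get(fact.counterparty_id)
            if dim is not None:
                self._replicate(dim)

    def query(self, query_node: int, predicate):
        """Join Facts to Dimensions using the query node's local replica."""
        replica = self.query_nodes[query_node]
        results = []
        for node in self.data_nodes:  # one scatter over the partitions
            for fact in node.values():
                dim = replica.get(fact.counterparty_id)  # local lookup, no extra hop
                if dim is not None and predicate(fact, dim):
                    results.append((fact, dim))
        return results


cluster = Cluster(data_nodes=2, query_nodes=2)
for cid, name in [(1, "Acme"), (2, "Globex"), (3, "Initech")]:
    cluster.put_dimension(Counterparty(cid, name))
cluster.put_fact(Trade(1, 1, 100.0))
cluster.put_fact(Trade(2, 2, 250.0))
cluster.put_fact(Trade(3, 1, 75.0))

acme_trades = cluster.query(0, lambda fact, dim: dim.name == "Acme")
```

In this sketch the join needs no per-row network call for the Dimension side, because each query node resolves it from its own replica. `Initech` is stored but never replicated, since no Fact refers to it.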
This model was presented at QCon 2011 and at the Coherence SIG.
You can find the slides here (PowerPoint, 7MB).
QCon 2011
Tuesday, January 11th, 2011
Just a little plug for the 5th annual QCon London on March 7-11, 2011. There are a bunch of cool speakers, including Craig Larman and Juergen Hoeller, as well as the obligatory set of Ex-TW types. I’ll be doing a session on Going beyond the Data Grid.
You can save £100 and give £100 to charity if you book with this code: STOP100
Interesting Links Dec 2010
Monday, January 3rd, 2011
More discussions on the move to in-memory storage:
- RAM is my friend
- LMAX – How to Do 100K TPS at Less than 1ms Latency
- The problems with ACID, and how to fix them without going NoSQL
- Basho Riak: An Open Source Scalable Data Store
- Facebook’s belief in HBase
- Numbers Everyone Should Know
- Google Dremel Paper
- Facebook’s New Year Performance Stats