A look at where change in the database market has come from and where it might be going
Whilst Big Data contains the promise of fame and fortune, the signal-to-noise ratio is low. How do you make sense of the marketing blurb?
Once upon a time, in a land of metaphor, there lived a database called George…
Have you ever wished that code were easier to test? The talk, and its associated paper, explore how things might be different if we started from scratch.
My recent experience has largely focused on distributed data storage, in particular Oracle Coherence, and this has inevitably shaped my understanding of systems, so keep that in mind as you read on.
The articles above are those that are recent or popular. There are a variety of other articles, categorised to the right and more traditional blog entries below.
- JAX-2013: The Return of Big Iron?
- QCon-2012: Where does Big Data meet Big Database?
- QCon-2012: Progressive Architectures at RBS
- JavaOne-2011: Balancing Replication and Partitioning in a Distributed Java Database
- QCon-2011: Beyond the Data Grid: Coherence, Normalisation, Joins and Linear Scalability
- OpenWorld-2011: Adopting Oracle Coherence as an Enterprise Standard
- UCL-2011: A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for High Performance Data Access
- CoSIG-2011: Oracle Coherence Implementation Patterns (Special Interest Group)
- ICST-2011: Test-Oriented Languages: a new era?
- ICST-2011: Enabling Testing, Design and Refactoring Practices in Remote Locations
- Birkbeck-2011: Data Storage for Extreme Use Cases
- RefTest-2010: Has Mocking Gone Wrong?
- RBS-2009: Data Grids with Oracle Coherence
- Brunel-2008: The Architect's Two Hats
- Brunel-2007: Architecture and Design in Industry
- Database Y
- The Big Data Conundrum
- The Best of VLDB 2012 (Very Large Database Conference)
- Thinking in Graphs: Neo4J
- A Brief Summary of the NoSQL World
- ODC – RBS’s Distributed Datastore
- Big Data, an Epic Essay
- A Story about George
- Data Storage for Extreme Use Cases
- The Rebirth of the In-Memory Database
- Is the Traditional Database a Thing of the Past?
- Shared Nothing vs. Shared Disk Architectures: An Independent View
- Component Software. Where is it going?
- Do Metrics Have a Place in Software Engineering Today?
- Distributing Skills Across a Continental Divide
- Four HPC Architecture Questions – With Answers
- Interviewing: The Importance of Examining Applied Knowledge
- Learning Practices for Distributed Teams (ICST)
- Mapping Personal Practices
- The Business Analyst Test
- Beyond the Data Grid: Coherence, Normalisation, Joins and Linear Scalability (QCon)
- Coherence Part I: An Introduction
- Coherence Part II: Delving a Little Deeper
- Coherence Part III: The Coherence Toolbox
- Coherence Part IV: Merging Data And Processing
- Coherence: The Fallacy of Linear Scalability
- How Fault Tolerant Is Coherence Really?
- Merging Data And Processing: Why it doesn’t “just work”
- An Overview of the Best Coherence Patterns
- Cluster Time and Consistent Snapshotting
- GUI Sorting and Pagination with Chained CQCs
- Joins: Advanced Patterns for Data Stores
- Joins: Simple joins using CQC or Key-Association
- Latest-Versioned/Marker Patterns and MVCC
- Reliable version of putAll()
- Singleton Service
- The Collections Cache
Something close to my own heart – an interesting paper on lightweight multi-key transactions for KV stores.
Slides for today’s talk at RBS Techstock:
Similar name to the Big Data 2013 talk, but a very different deck:
The slides from yesterday’s guest lecture on NoSQL, NewSQL and Big Data can be found here.
Slides from today’s European Trading Architecture Summit 2012 are here.
Over the last few years we’ve had a fair few discussions around the various different ways to branch and how they fit into a world of Continuous Integration (and more recently Continuous Delivery). It’s so fundamental that it’s worth a post of its own!
Dave Farley (the man who literally wrote the book on it) penned the best advice I’ve seen on the topic a while back. Worth a read, or even a reread (it gets better towards the end).
InfoQ published the video for my Where does Big Data meet Big Database talk at QCon this year.
- Intel’s new MIC ‘Knights Corner’ coprocessor (in the Intel Xeon Phi line) is targeted at the high-concurrency market, previously dominated by GPGPUs, but without the need for code to be rewritten in CUDA etc. (note Knights Ferry is the older prototype version).
- The chip has 64 cores and 8GB of RAM with a 512-bit vector engine. Clock speed is ~1.1GHz and each core has a 512KB L1 cache. The Linux kernel runs on two 2.2GHz processors.
- It comes on a card that drops into a PCI slot so machines can install multiple units.
- It uses a MESI protocol for cache coherence.
- There is a slimmed down linux OS that can run on the processor.
- Code must be compiled to two binaries, one for the main processor and one for Knights Corner.
- Compilers are currently available only for C++ and Fortran, and only from Intel at present.
- It’s on the cusp of being released (Q4 this year) for NDA partners (though we – GBM – have access to one off-site at Maidenhead). Due to be announced at the Supercomputing conference in November(?).
- KC is 4-6 GFLOPS/W – which works out at 0.85-1.8 TFLOPS for double precision.
- It is expected to be GA Q1 ‘13.
- It’s a large ‘device’: the wafer is a 70mm-square form factor!
- Access to a separate board over PCI is a temporary step. Expected that future versions will be a tightly-coupled co-processor. This will also be on the back of the move to the 14nm process.
- A single host can (depending on OEM design) support several PCI cards.
- Similarly, power draw and heat dispersal are OEM decisions.
- Reduced instruction set e.g. no VM support instructions or context-switch optimisations.
- Performance is now being expressed as GFLOPS per watt – a result of US Government (efficiency) requirements.
- A single machine can now go faster than ASCI Red, the room-filling supercomputer of ‘97!
- The main constraint to doing even more has been the limited volume production pipeline.
- Pricing not announced, but expected to be ‘consistent with’ GPGPUs.
- The key goal is to make programming it ‘easy’, or rather a lot easier than platform-dedicated approaches or abstraction mechanisms such as OpenCL.
- Once booted (probably by a push of an OS image from the main host’s store to the device) it can appear as a distinct host over the network.
- The key point is that Knights Corner provides most of the advantages of a GPGPU but without the painful and costly exercise of migrating software from one language to another (that is to say it is based on the familiar x86 programming model).
- Work is offloaded to the card either through the offload pragma or through offload keywords using shared virtual memory.
- Computation occurs in a heterogeneous environment that spans both the main CPU and the MIC card which is how execution can be performed with minimal code changes.
- There is a reduced instruction set for Knights Corner but the majority of the x86 instructions are there.
- There is support for OpenCL, although Intel are not recommending that route to customers due to performance constraints.
- Real-world testing has shown a provisional 4x improvement in throughput using an early version of the card running some real programs. However, results from a sample test show perfect scaling. Some restructuring of the code was necessary – not huge, but not insignificant.
- There are currently only C++ and Fortran interfaces (so not much use if you’re running Java or C#).
- You need to remember that you are on PCI Express, so you don’t have the memory bandwidth you might want.
Other things worth thinking about:
Thanks to Mark Atwell for his help with this post.
Read the article here.
Watch the video here.
A big thanks to Fuzz, Mark and Ciaran for making this happen.
I really enjoyed Harvey’s ‘POF Art’ talk at the Coherence SIG. Slides are here if you’re into that kind of thing: POF-Art.
What if, more than anything else, we valued helping each other out? What if this was the ultimate praise, not the best technologists, not an ability to hit deadlines, not production stability. What if the ultimate accolade was to consistently help others get things done? Is that crazy? It’s certainly not always natural; we innately divide into groups, building psychological boundaries. Politics erupts from trivial things. And what about the business? How would we ever deliver anything if we spent all our time helping each other out? But maybe we’d deliver quite a lot.
If helping each other out were our default position wouldn’t we be more efficient? We’d have less politics, less conflict, fewer empires and we’d spend less money managing them.
We probably can’t change who we are. We’ll always behave a bit like we do now. Conflict will always arise and it will always cause problems; we all have tempers, we play games, we frustrate others and react to slights and injustices.
But what if it were simply our default position, our core value, the thing we fall back on? It wouldn’t change the world, but it might make us a little more efficient.
… right back to the real world
Valve handbook. Very cool:
Jon ‘The Gridman’ Knight has finally dusted off his keyboard and entered the blogosphere with a fantastic post on how we implement a reliable version of Coherence’s putAll() here on ODC. One to add to your feed if you are interested in all things Coherence.
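I won’t reproduce Jon’s implementation here, but to give a feel for the shape of the problem, here is a minimal sketch of one way to make putAll() more resilient: write in small batches and retry a batch if the grid throws mid-write (for example because a storage-enabled member departed). The batch size, retry policy and class name are my own illustrative choices, not Jon’s – see his post for the real thing.

```java
import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

import java.util.HashMap;
import java.util.Map;

public class ReliablePutAll {

    private static final int BATCH_SIZE  = 1000; // illustrative; tune for your cluster
    private static final int MAX_RETRIES = 3;

    // Write entries in small batches, retrying each batch if the grid
    // throws mid-write (e.g. a storage-enabled member departs).
    @SuppressWarnings("unchecked")
    public static void putAllReliably(NamedCache cache, Map<?, ?> entries) {
        Map batch = new HashMap();
        for (Map.Entry<?, ?> e : entries.entrySet()) {
            batch.put(e.getKey(), e.getValue());
            if (batch.size() == BATCH_SIZE) {
                putBatchWithRetry(cache, batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            putBatchWithRetry(cache, batch);
        }
    }

    private static void putBatchWithRetry(NamedCache cache, Map batch) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
            try {
                cache.putAll(batch); // simple puts are idempotent, so a repeat is safe
                return;
            } catch (RuntimeException e) {
                last = e; // real code should inspect the cause before retrying
            }
        }
        throw last;
    }
}
```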
- Intel managing to squeeze 50 cores on a single chip, breaking through the teraflop boundary as they do so: Brier Dudley’s Blog | Wow: Intel unveils 1 teraflop chip with 50-plus cores | Seattle Times Newspaper
- RISC architectures have had a renaissance thanks largely to the needs of the mobile sector, could their low power consumption make them a serious contender for enterprise space? x86 Faces Unexpected RISC Competition
- AMD announce 4 memory channels allowing massive addressable spaces up to 364GB per CMP : AMD’s Interlagos and Valencia finally emerge
- Anyone who follows my blog will know of my belief in large address spaces reshaping the landscape, certainly for enterprise applications. This article echoes those views: Megatrend: Cheap RAM Reshaping All of Computing | Dr Dobb’s
- IBM’s Lime is an interesting approach to simplifying the programming of secondary devices. See Lime paper and the related Liquid Metal project.
- JVM on FPGA: JOP: A Tiny Java Processor Core for FPGA
- An interesting paper on using FPGA for Monte Carlo Simulation: FPGA for monte carlos
High Performance Java
- An excellent talk about using memory efficiently in Java applications; the costs are often higher than we think. It includes clear descriptions of the footprint of all Java objects and utilities (there’s a rough illustration of the point after this list): Building Memory Efficient Java Applications
- There has been a flurry of activity coming from Azul Systems recently, most notably the release of Zing, their pauseless garbage collector. Gil Tene’s talk on the state of the art in GC from QCon SF 2011 is one of the best I’ve seen (QConSF 2011: State of the Art in Garbage Collection).
- Azul have also recently released jHiccup, an interesting utility that measures operating-system stalls: Java Developer Tools: jHiccup Java Performance Analysis
- Charles Nutter’s comments on his favourite JVM flags including my favourite (-XX:+PrintOptoAssembly): Headius: My Favorite Hotspot JVM Flags
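As a rough illustration of the point the first talk makes (the numbers here are indicative for a 64-bit HotSpot JVM; the talk itself has the precise figures):

```java
import java.util.HashMap;
import java.util.Map;

public class FootprintDemo {
    public static void main(String[] args) {
        // One million ints as primitives: roughly 4MB of actual data.
        int[] primitives = new int[1000000];

        // The 'same' data boxed into a HashMap costs far more once object
        // headers (~12-16 bytes each), HashMap.Entry objects, and the
        // reference arrays are counted -- easily an order of magnitude.
        Map<Integer, Integer> boxed = new HashMap<Integer, Integer>();
        for (int i = 0; i < 1000000; i++) {
            boxed.put(i, i);
        }

        System.out.println(primitives.length + " primitives vs "
                + boxed.size() + " boxed entries");
    }
}
```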
Distributed Data Storage
- A great paper from VLDB describing an approach for balancing replication and partitioning, something close to my own heart: Schism: a Workload-Driven Approach to Database Replication and Partitioning
- Hasso Plattner (the P in SAP) wrote this paper, which provides an insightful view of where he believes the field should be going (and, of course, SAP’s solution Hana): Hasso Plattner on In-Memory OLAP & OLTP
- I enjoyed watching this talk about Mongo: InfoQ: Scaling with MongoDB
- An entertaining article from the Economist about David Gelernter’s predictions of the future of computing: Brain scan: Seer of the mirror world | The Economist
- Could Prezi really dislodge PowerPoint? Prezi
- Double Loop Learning – a different view on organizational learning. Chris Argyris.
- Worth reading if you are not familiar with the idea already: CQRS
- An interesting twist on the traditional storyboard approach: Our Story Board is Better Than Yours… I’m a big fan of replacing estimation with uniformly sized stories.
- Booked your next holiday? What about a Code Retreat with Corey Haines
High Performance Java
- Not exactly lightweight reading but one of the most detailed and influential papers on tuning your software for processing efficiency: What Developers Should Understand About Memory
- If you read the above and want to put some of it into action then VTune should be your next port of call. Diagnostic software for CPU cache hits etc: VTune™ Amplifier XE 2011 from Intel – Intel® Software Network
- When it really won’t go any faster, look at the Assembler: Deep dive into assembly code from Java | Java.net
- In anticipation of G1 (in case they ever get it finished) here’s the original paper with anticipated performance figures: G1 paper with figures
- A different approach to GC using processor specific minor collections (in Haskell): Multicore Garbage Collection with Local Heaps
Distributed Data Storage:
- The new Oracle NoSQL database – this is the best article I’ve read summarising its position in the market: DBMS Musings: Overview of the Oracle NoSQL Database
- The official Oracle NoSQL Whitepaper: Oracle NoSQL Database White Paper
- An interesting approach to data storage: an FPGA based data warehouse: FPGA Data Warehouse
- Google’s interesting SQL wrapped MapReduce framework: Tenzing A SQL Implementation On The MapReduce Framework
- The Actors Model – just in case you’re not familiar with it: Actors model for distribution
- Gluster – an open source distributed file system: Gluster
- Running CUDA natively on x86 processors: Running CUDA Code Natively on x86 Processors | Dr Dobb’s Journal
- Thinking about using 64bit JVMs with compressed pointers : 32-bit or 64-bit JVM? How about a Hybrid?
- Using different caches for reads and writes – a sensible pattern for Coherence implementations: Alexey Ragozin’s Blog
- OCZ Z-Drive – an interesting and competitively priced alternative to FusionIO:
- The architecture of the transputer – an interesting reflection on a couple of Bristol’s finest exports (other than Portishead): the Transputer and the occam programming language. David May, parallel processing pioneer • reghardware
- Is your brain like an iPhone? Is Your Brain Like an iPhone? Which App is Running Now? – Novato, CA Patch
- Just be still for once: No Shame in Stillness « Under the Apricot Tree
- Of the huge amount of writing about Steve Jobs I thought the Economist’s coverage was the best: Steve Jobs: The magician | The Economist
- Scott Marcar’s thought-provoking dialogue on technology through a financial crisis: The Long Haul: Scott Marcar Leads RBS’ Tech Team Through the Financial Crisis – WatersTechnology.com
- Short but thought-provoking article on company culture: Why You Should Question Your Culture – Ron Ashkenas – Harvard Business Review
Here are the slides for the talk I gave at JavaOne:
Balancing Replication and Partitioning in a Distributed Java Database
This session describes the ODC, a distributed, in-memory database built in Java that holds objects in a normalized form in a way that alleviates the traditional degradation in performance associated with joins in shared-nothing architectures. The presentation describes the two patterns that lie at the core of this model. The first is an adaptation of the Star Schema model, used to hold data either replicated or partitioned depending on whether the data is a fact or a dimension. In the second pattern, the data store tracks arcs on the object graph to ensure that only the minimum amount of data is replicated. Through these mechanisms, almost any join can be performed across the various entities stored in the grid, without the need for key shipping or iterative wire calls.
- Original talk, given at QCon London, which is more Coherence specific
- [pptx-9MB] [pdf-48MB]
- A related post documenting the main points covered in the talk
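To give a flavour of the first pattern (the fact/dimension split from the abstract above), here is a minimal, hypothetical sketch: dimensions, being small and join-heavy, go to a replicated cache, while facts are partitioned across the grid. The cache names and the isDimension flag are illustrative only; the talk and the related post describe how ODC actually does it.

```java
import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

public class StarSchemaRouter {

    // Hypothetical cache names: a replicated scheme for dimensions and a
    // partitioned (distributed) scheme for facts would be defined in the
    // cluster's cache configuration.
    private final NamedCache dimensions = CacheFactory.getCache("replicated-dimensions");
    private final NamedCache facts      = CacheFactory.getCache("partitioned-facts");

    @SuppressWarnings("unchecked")
    public void write(Object key, Object value, boolean isDimension) {
        // Dimensions are small and used in most joins, so replicate them to
        // every node; facts are large, so spread them across the grid.
        (isDimension ? dimensions : facts).put(key, value);
    }
}
```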
I’m heading to JavaOne in October to talk about some of the stuff we’ve been doing at RBS. The talk is entitled “Balancing Replication and Partitioning in a Distributed Java Database”.
Is anyone else going?
Because the future will inevitably be in-memory databases:
- SAP (slightly weirdly) is leading the way with Hana
- SSD makes a new kind of database possible
- The move away from clusters is not restricted to the enterprise
- More drinking of the Hana Kool-Aid
- Fusion IO
- Phase Change Memory breakthrough at IBM
Other interesting stuff:
- Interesting retrospective on computing giants of the past and future (in typical Economist style)
- A mathematician’s lament
- The next generation of Map Reduce
- Where google may be going wrong
The LMAX guys have open-sourced their Disruptor queue implementation. Their stats show some significant improvements (over an order of magnitude) over standard ArrayBlockingQueues in a range of concurrent tests. Both interesting and useful.
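For a feel of the programming model, here’s a minimal producer/consumer sketch. Note this is written against the Disruptor’s newer DSL, so details may differ from the freshly open-sourced release discussed above; the core idea – claim a pre-allocated slot, write into it, publish its sequence – is the same.

```java
import com.lmax.disruptor.EventHandler;
import com.lmax.disruptor.RingBuffer;
import com.lmax.disruptor.dsl.Disruptor;
import com.lmax.disruptor.util.DaemonThreadFactory;

public class DisruptorExample {

    // Events are mutable slots, pre-allocated once in the ring buffer,
    // so steady-state publishing creates no garbage.
    static class ValueEvent {
        long value;
    }

    public static void main(String[] args) throws InterruptedException {
        Disruptor<ValueEvent> disruptor = new Disruptor<ValueEvent>(
                ValueEvent::new, 1024, DaemonThreadFactory.INSTANCE);

        // Consumer: runs on its own thread, called back as events arrive.
        EventHandler<ValueEvent> handler =
                (event, sequence, endOfBatch) -> System.out.println("consumed " + event.value);
        disruptor.handleEventsWith(handler);
        disruptor.start();

        // Producer: claim a slot, write into it, then publish its sequence.
        RingBuffer<ValueEvent> ring = disruptor.getRingBuffer();
        for (long i = 0; i < 10; i++) {
            long seq = ring.next();
            try {
                ring.get(seq).value = i;
            } finally {
                ring.publish(seq);
            }
        }
        Thread.sleep(100); // give the daemon consumer a moment before exit
    }
}
```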
The slides/video from my talk at QCon London have been put up on InfoQ.
An effort well worthy of its own post: http://www.christof-strauch.de/nosqldbs.pdf
- Nice talk covering optimising code in a single JVM: LMAX
- Biased locking in Hotspot: biased_locking_in_hotspot
- Good overview of caching: intel-cpu-caches
- Good overview of lock free algorithms: lock-free-algorithms
- Nice overview of the key NoSQL players: cassandra-vs-mongodb-vs-couchdb-vs-redis
- Google’s layering of ACID over BigTable (at least ACID inside a partition):
- Typically Economist: economist.com
Just a little plug for the 5th annual QCon London on March 7-11, 2011. There’s a bunch of cool speakers including Craig Larman and Juergen Hoeller, as well as the obligatory set of ex-TW types. I’ll be doing a session on Going Beyond the Data Grid.
You can save £100 and give £100 to charity if you book with this code: STOP100
More discussions on the move to in memory storage:
- RAM is my friend
- LMAX – How to Do 100K TPS at Less than 1ms Latency
- The problems with ACID, and how to fix them without going NoSQL
- Basho Riak: An Open Source Scalable Data Store
- Facebook’s belief in HBase
- Numbers Everyone Should Know
- Google Dremel Paper
- Facebook’s New Year Performance Stats
I’ve been working on a medium sized data store (around half a TB) that provides high bandwidth and low latency access to data.
Caching and warehousing techniques push you towards denormalisation, but this becomes increasingly problematic when you move to a highly distributed environment (certainly if the data is long-lived). We’ve worked on a model that is semi-normalised whilst retaining the performance benefits associated with denormalisation.
The other somewhat novel attribute of the system is its use of Messaging as a system of record.
I’ll also be adding some more posts in the near future to flesh out how this all works.
Submissions are being accepted for RefTest at IEEE International Conference on Testing, Verification and Validation.
Submissions can be short (2-page) or full-length conference papers. The deadline is Jan 4th 2011.
Full details are here.