The Best of VLDB 2012 (Very Large Database Conference)
Here are some of the highlights of the 210 papers presented at VLDB earlier this year. You can find the full list here.
From Cooperative Scans to Predictive Buffer Management (here)
Intriguing paper from the Vectorwise guys for improving IO efficiency under load. LRU/MRU caching policies are known to break down under large, concurrent workloads. SQL Server and DB2 both have mechanisms for sharing IO between queries (by attaching to an existing scan or throttling faster queries so that IO can be shared). The Cooperative Scans discussed here takes this a step further by incorporating an active buffer manager which scans use to register their interest in data. The manager then adaptively chooses which pages to load and pass to the various concurrent requests.
There is another related paper at this conference SharedDB: Killing One Thousand Queries With One Stone (here)
Processing a Trillion Cells per Mouse Click (Google) (here)
Interesting paper from Google suggesting an alternative to the approach to column orientation taken in Dremel. PowerDrill uses a double-dictionary encoded column store where the encodings live largely in memory. Further optimisations are made at load time to ensure minimal access to persistent storage. This makes it more akin to column stores like ParAccel or Vectorwise, applied to analytical workloads (aggregates, group bys etc).
Can the elephants handle the NoSQL onslaught (here)
Another paper comparing the performance of Hadoop with a relational database (in a similar vein to the Sigmod 09 paper DeWitt published previously here). I sympathise with the message – databases outperform hadoop on small to medium workloads – but I hope that most people know that already. This time the comparison is with Microsoft’s Sql Server PDW (Parallel Data Warehouse). The choice of data sizes between 250Gb and 16TB means that the study has the same failing as the previous Sigmod one; it’s not looking at large dataset performance.
Interactive Query Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads (here)
Useful, empirically driven paper with detailed data sets from a number of NoSQL implementations including Facebook. Chen et al. performed an empirical study on the implementation of Hadoop at a number of companies including Facebook. It hints at the current ‘elephant in the room’ that is Hadoop’s focus on batch-time over real-time performance (roll on Impala!) . Having data of this level of granularity over a range of real time systems in itself is quite valuable. They note that 90% of jobs are small (resulting in MBs of data returned).
High-Performance Concurrency Control Mechanisms for Main-Memory Databases (here)
Proposes an optimistic MVCC method for in memory concurrency control. The conclusion: single-version locking performs well only when transactions are short and contention is low; higher contention or workloads including some long transactions favor the multiversion methods, and the optimistic method performs better than the pessimistic one.
Blink and It’s Done: Interactive Queries on Very Large Data (here)
Blink is different to the mainstream database as it’s not designed to give you an exact answer. Instead you specify either error (confidence) or maximum time constraints on your query. The approach uses a number of sampling based strategies to achieve the required confidence level. There is a related paper: Model-based Integration of Past & Future in TimeTravel (here)
Developing and Analyzing XSDs through BonXai (here)
B+-tree Index Optimization by Exploiting Internal Parallelism of Flash-based Solid State Drives (here)
SSDs performance increases (initially) with the number of concurrent executions (in stark contrast with magnetic drives). This paper looks into maximising this with the use of concurrent B-trees that utalise parallel IO. Useful research as flash is only going to get cheaper.
SCOUT: Prefetching for Latent Structure Following Queries (here)
I quite like the ideas in this paper around prefetching data based on a known structure (probably because it’s similar to some of the stuff we do).
Fast Updates on Read-Optimized Databases Using Multi-Core CPUs (here)
Addresses the problem some columnar architectures suffer where they accumulate writes in a separate partition, which must be periodically merged with the read-optimised main one.
FDB: A Query Engine for Factorised Relational Databases (here)
I hadn’t come across the idea of Factorised Databsaes before. An interesting concept. The paper demonstrates performance improvements over traditional methods for many-to-many join criteria.
Only Agressive Elephants are Fast Elephants (here)
Interesting approach to indexing Hadoop that claims to improve both read and write performance. I couldn’t find the code though so couldn’t try it.
The Vertica Analytic Database: C-Store 7 Years Later (here)
A good summary of this mature shared-everything, columnar database. They discuss their use of super projections over join indexes, due to the overheads associated with tuple construction and the verbosity of storing the associated rowids. There is a summary of the encoding types used as well as partitioning and locking strategies.
Muppet: MapReduce-Style Processing of Fast Data (here)
Whilst the majority of MapReduce commentary focuses on improving MR query performance this paper looks at the problem of injesting data quickly for high throughput, streaming workloads. The interesting approach focuses on data as streams (in and out) in association with a moving historical window (they denote a slate). To me there seems to be a lot of similarity between this approach the one taken by products like StreamBase and Cloudscale but the authors differentiate themselves my being less schema oriented, more akin to the traditional MR style.
Serializable Snapshot Isolation in PostgreSQL (here)
Interesting paper on the implementation of serializable isolation using the snapshot model.
Other papers of note:
- Minuet: A Scalable Distributed Multiversion B-Tree (here)
- A Statistical Approach Towards Robust ProgressEstimation (here)
- Efﬁcient Multi-way Theta-Join Processing UsingMapReduce (here)
- Avatara: OLAP for Web-scale Analytics Products (OLAP cubes over a NoSQL @LinkedIn) (here)
- 10 Year Best Paper Award: Approximate Frequency Counts over Data Streams (here)