Coherence Part I: An Introduction

You can think of Coherence as simply being a distributed cache. That is, after all, what it was designed to be. But doing so would be something of an injustice. If a caching layer is all you need there are probably cheaper options. What you get with Coherence is a well thought out, simple framework for dealing with distributed data.

In one dimension it has moved towards the traditional database space, offering query functionality, indexing etc. In another it has encroached on the world of the application container by providing a framework for low latency, highly available, distributed systems in Java. It is its evolution into both of these traditionally disparate technology spaces that makes it such a unique and useful product.

Coherence is still a traditional distributed cache under the covers, and a pretty good one at that. So if you simply require fast access to prefabricated data (that is to say, data that has been pre-processed into the required form), and you work in one of the three main supported languages (Java, .NET or C++, but particularly Java), Coherence is still likely to be a decent choice. There are quite a few cheaper alternatives these days though, so bear that in mind.

It’s also important to understand the limits of the technology, and Coherence certainly has its limits. A large proportion of Coherence’s performance and scalability gains come from its adoption of a shared nothing architecture (I’ve written more on shared nothing architectures here). This means it excels in certain situations and does quite the opposite in others. Learning to use the technology is about learning its limits. It should be one of the many tools in your architectural toolbox, but it is a fantastic tool to have.

Coherence is laid out over three distinct layers: client, cluster and persistence (see opening figure). The Coherence cluster itself is sandwiched between the client on the left and the persistent data source on the right. The client has its own in-process second level cache. The persistent data source is usually only used for data writes; it does not contribute to data retrieval (the cluster, in the centre of the diagram, will typically be pre-populated with data, but more on that later).

Coherence has three major things going for it: it is fast, fault tolerant and scalable. Let’s look at each of these in turn…

Coherence is Fast

Coherence’s speed comes from five major attributes of its design:

It stores all data solely in memory. There is no need to go to disk.
Objects are always held in their serialised form (using an efficient binary encoding named POF – find out more about this here). Holding data in a serialised form allows Coherence to skip the serialisation step on the server meaning that data requests only have one serialisation hit, occurring when they are deserialised on the client after a response. Note that both keys and values are held in their serialised form (and in fact the hash code has to be cached as a result of this).
Writes to the database are usually performed asynchronously (this is configurable). Asynchronous persistence of data is desirable as it means Coherence does not have to wait for disk access on a potentially bottlenecked resource. As we’ll see later it also does some clever stuff to batch writes to persistent stores to make them more efficient. The result of asynchronous database access is that writes to the Coherence cluster are fast and will stay fast as the cluster scales. The downside is that data could be lost should a critical failure occur before it has been persisted. As a result you should only use this asynchronous behaviour for data you don’t mind losing.
Queries use indexes which are sharded across the data grid. Thus queries follow a divide and conquer approach.
Coherence includes a second level cache that sits in process on the client. This is analogous to a typical caching layer, holding an in-process copy of the data. This copy can be kept coherent either by setting the near cache’s invalidation strategy to ‘present’ or by using a ‘continuous query’.
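
The asynchronous, batched write path described in the list above can be sketched in a few lines of plain Java. This is only an illustration of the idea, not Coherence’s implementation or API: the class `WriteBehindSketch` and all of its methods are hypothetical, the "database" is just a list of batches, and the coalescing of repeated writes to the same key is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of write-behind: puts update memory immediately,
// and pending changes are flushed to the slow store later as a single batch.
class WriteBehindSketch {
    private final Map<String, String> memory = new LinkedHashMap<>();
    private final Map<String, String> pending = new LinkedHashMap<>();
    // Each element is one batch that was written to the simulated database.
    final List<Map<String, String>> storeBatches = new ArrayList<>();

    // Fast path: no database access happens here.
    void put(String key, String value) {
        memory.put(key, value);
        pending.put(key, value); // repeated writes to the same key coalesce
    }

    String get(String key) {
        return memory.get(key);
    }

    // In a real system a background thread would call this after a delay.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        storeBatches.add(new LinkedHashMap<>(pending));
        pending.clear();
    }

    // Whatever is still pending is what would be lost if the process died now.
    int unflushedCount() {
        return pending.size();
    }
}
```

Calling `put` three times and then `flush` once produces a single batch, which is why the write path stays fast; the entries counted by `unflushedCount` are the data at risk between flushes.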

Coherence is Fault Tolerant

Coherence is both fault tolerant and highly available. That is to say that the loss of a single machine will not significantly impact the operation of the cluster. The reason for this resilience is that loss of a single node will result in a seamless failover to a backup copy held elsewhere in the cluster. All operations that were running on the node when it went down will also be re-executed elsewhere.

It is worth emphasising that this is one of the most powerful features of the product. Coherence will efficiently detect node loss and deal with it. It also deals with the addition of new nodes in the same seamless manner.
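
To make the failover idea concrete, here is a minimal sketch in plain Java. It is not how Coherence is implemented: `BackupFailoverSketch` and its methods are hypothetical names, the placement rule (backup on the next node along) is an assumption, and it ignores the re-balancing and re-execution of in-flight operations that the real product performs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: every entry has a primary copy on one node and
// a backup copy on a different node, so losing one node loses no data.
class BackupFailoverSketch {
    private final List<Map<String, String>> primary = new ArrayList<>();
    private final List<Map<String, String>> backup = new ArrayList<>();
    private final boolean[] dead;

    // Requires at least two nodes so primary and backup are never co-located.
    BackupFailoverSketch(int nodes) {
        dead = new boolean[nodes];
        for (int i = 0; i < nodes; i++) {
            primary.add(new HashMap<>());
            backup.add(new HashMap<>());
        }
    }

    int primaryNode(String key) {
        return Math.floorMod(key.hashCode(), dead.length);
    }

    // Assumed placement rule: the backup lives on the next node along.
    int backupNode(String key) {
        return (primaryNode(key) + 1) % dead.length;
    }

    void put(String key, String value) {
        primary.get(primaryNode(key)).put(key, value);
        backup.get(backupNode(key)).put(key, value);
    }

    // Simulate losing a machine: everything it held is gone.
    void killNode(int node) {
        dead[node] = true;
        primary.get(node).clear();
        backup.get(node).clear();
    }

    // Reads fall back to the backup copy when the primary node is down.
    String get(String key) {
        int p = primaryNode(key);
        if (!dead[p]) {
            return primary.get(p).get(key);
        }
        int b = backupNode(key);
        return dead[b] ? null : backup.get(b).get(key);
    }
}
```

Killing the node that owns a key and then reading that key still returns the value, because the read is served from the backup held elsewhere. Losing both the primary and the backup node at once would lose the entry, which matches the single-machine guarantee described above.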

Coherence is Scalable

Coherence holds each piece of data on only one machine (two if you include the backup). Thus adding a new machine to a cluster of n nodes increases the storage capacity by 1/n. CPU and bandwidth capacity will obviously be increased too as machines are added. This allows the cluster to scale linearly through the simple addition of commodity hardware. There is no need to buy bigger and bigger boxes. It should be noted that this scalability only comes with key-based access. As noted previously (here), queries will not scale linearly as you increase the number of nodes.
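
The difference between key-based access and queries can be seen in a small sketch. This is plain Java under stated assumptions rather than Coherence code: `PartitionedLookupSketch`, the modulo placement rule and the `nodesContacted` counter are all hypothetical, and each "node" is just an in-memory map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration: a key lookup is routed to exactly one node,
// whereas a query has to visit every node and merge the partial results.
class PartitionedLookupSketch {
    private final List<Map<String, Integer>> nodes = new ArrayList<>();
    int nodesContacted = 0;

    PartitionedLookupSketch(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    // Assumed placement rule: hash of the key modulo the number of nodes.
    private int owner(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    void put(String key, int value) {
        nodes.get(owner(key)).put(key, value);
    }

    // Key-based access: one node is contacted regardless of cluster size.
    Integer get(String key) {
        nodesContacted++;
        return nodes.get(owner(key)).get(key);
    }

    // Query: every node evaluates the predicate over its own share of the data.
    List<String> keysWhere(Predicate<Integer> predicate) {
        List<String> matches = new ArrayList<>();
        for (Map<String, Integer> node : nodes) {
            nodesContacted++;
            for (Map.Entry<String, Integer> entry : node.entrySet()) {
                if (predicate.test(entry.getValue())) {
                    matches.add(entry.getKey());
                }
            }
        }
        Collections.sort(matches);
        return matches;
    }
}
```

With four nodes a `get` contacts one node while `keysWhere` contacts all four. Each node scans less data as the cluster grows (the divide and conquer part), but the number of nodes involved in every query grows with it, which is one way to see why queries do not scale linearly.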

So we can summarise why Coherence is faster than traditional data repositories:

Coherence works to a simpler contract. It is efficient only for simple data access. As such it can do this one job quickly and scalably.
Databases are constrained by the wealth of features they must implement. Most notably (from a latency perspective) ACID.
High performance users are often happy to sacrifice ACID transactions for speed and scalability.

So What Is Coherence Really?

Most importantly, Coherence is just a map. All data is stored as key value pairs. It offers ‘some’ functionality that goes beyond this but it is still the fundamental structure of the product and hash based access to the key/value pairs it contains is fundamental to the way it works at the lowest level.

In a typical installation Coherence will be prepopulated with data so that the cluster becomes the primary data source rather than just a caching layer sitting above it (Coherence offers both modes of operation, it just so happens that almost everyone I know does it this way). The main reasons that ‘read through’ is not often used are that (i) it adds latency to early client transactions and (ii) the map contains an indeterminate quantity of data, meaning that searches (queries) against the cache will return indeterminate results.
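
The second point is easy to demonstrate. The sketch below is plain Java with hypothetical names (`ReadThroughSketch`, `preload`, `countGreaterThan`), not the Coherence API, and it assumes queries only ever see what is currently in the cache, as described above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: with read-through, the cache only holds what
// clients happen to have asked for, so a query over the cache is incomplete.
class ReadThroughSketch {
    private final Map<String, Integer> database;
    private final Map<String, Integer> cache = new HashMap<>();

    ReadThroughSketch(Map<String, Integer> database) {
        this.database = database;
    }

    // Read-through: a miss is loaded from the database, adding latency
    // to the first request for each key.
    Integer get(String key) {
        return cache.computeIfAbsent(key, database::get);
    }

    // Prepopulation: load everything up front so the cache is complete.
    void preload() {
        cache.putAll(database);
    }

    // A query runs against the cache contents only, never the database.
    long countGreaterThan(int threshold) {
        return cache.values().stream().filter(v -> v > threshold).count();
    }
}
```

If the database holds three trades but only one has been read so far, `countGreaterThan` sees one entry; after `preload` it sees all three. The answer to the same query depends on what happened to be loaded, which is the indeterminacy that pushes most installations towards prepopulation.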

Coherence is not a database. It is a much lighter-weight product designed for fast data retrieval operations. Databases provide a variety of additional functionality which Coherence does not support including ACID (Atomic, Consistent, Isolated and Durable), the joining of data in different caches (or tables) and all the features of the SQL language.

Coherence is not a Database

Coherence does however support an object based query language which is not dissimilar to SQL. There is now even an SQL-like declarative language you can use too. However Coherence is not suited to complex data operations or long transactions. It is designed for fast data access via lookups based on simple attributes (e.g. retrieving a trade by its trade ID, writing a new trade, retrieving trades in a date range) as well as for executing data-centric custom functions (more to come on this later).

Coherence does not support:

Transactions (ACID)*
Joins
SQL**

* There is now (as of 3.6 I think) support for transactional caches. I’ve not used them to be honest and they have a number of restrictions. If you need transactions though you should probably look at alternative technologies.

** Coherence does support a simpler, object based query language but it is important to note that Coherence does not lend itself to certain types of query, in particular large joins across multiple fact tables. There is now a newer declarative language option too.

Comparing Coherence with Other High Performance Data Repositories

Now let’s compare Coherence with some other prominent products in the Oracle suite. Firstly let’s look at the relationship with Oracle RAC (Real Application Clusters).

RAC is a clustered database technology. Being clustered, it is, like Coherence, fault tolerant and highly available. That is to say that the loss of a single machine will not significantly affect the running of the application. Unlike Coherence, RAC is durable to almost any failure as data is persisted to (potentially several different) disks. Coherence’s lack of disk access, on the other hand, makes it significantly faster and thus the choice for many highly performant applications. Finally, RAC supports SQL and thus can handle complex data processing. RAC however is limited by the fact that it is a Shared Disk Architecture, whereas Coherence is Shared Nothing (this difference is beyond the scope of this article but is discussed in full here).

TimesTen is a totally different Oracle technology. It is a completely in-memory implementation of an Oracle database supporting most standard database functionality, but at much lower latency.

The support for in memory storage is clearly a feature of both TimesTen and Coherence thus making them both suitable for low latency applications.

However the big advantage of using Coherence is that it is distributed i.e. the data is spread across multiple machines. TimesTen is restricted to a single process and thus is neither highly available nor scalable beyond the confines of a single machine (although it can be configured for fault tolerance).

However TimesTen offers most of the support that a database offers including:

Transactions
Complex query language (SQL) joins etc
Heavily optimised query execution.

This makes it the obvious choice if complex data processing is required or there is an existing dependence on SQL.

The other comparable technological space is the Shared Nothing database. These are databases that share the same architectural style, where each node has sole ownership of the data it holds. Such systems are currently used for a rather different use case: data warehousing as opposed to OLTP applications. However this is likely to change in the near future. You can find more discussion of Shared Nothing databases here. My SNDB of choice is ParAccel.

Finally, there are a number of other competitors to Coherence out there which are pretty good. If you’re reading this today (I’m updating this in 2013) you should be checking out some of the open source alternatives. Hazelcast is the most obvious; it is now a mature and well funded project that plays in the same product space. Gemfire, Terracotta and Gigaspaces are the direct competitors. If you are just looking for a scalable caching layer with query semantics you might be better off looking at a NoSQL disk based solution. These are much cheaper to run in the long term, and keeping all your data in memory is often overkill if you are not operating on it directly. Check out MongoDB and Couchbase, which are the two most closely related NoSQL stores and are both open source.

Posted on March 4th, 2009 in Coherence

Manjunath Reddy

September 25th, 2011
18:59 GMT

Excellent article. Though I’m very new to coherence I could catch up easily after going through your blog. Appreciate your time and efforts in sharing your knowledge.

ben

September 26th, 2011
8:44 GMT

Thanks Manjunath – I’m very glad you are finding it useful 🙂

enyi

May 22nd, 2012
7:53 GMT

Good stuff. Ben S not only knows Coherence but makes it so easy to understand. Wish I knew about this blog a lot earlier.

Ravi Mishra

January 6th, 2014
7:23 GMT

Hi Ben,

I have a 5+ years experience in Java and recently landed to Coherence role. Really the article injects the jump start for any new guy. Would love to read more from you on this.

If possible send me directly few links of the latest description of coherence.

Arjun

January 9th, 2014
0:32 GMT

Is there a way, I can query to a particular node of coherence ?
Say suppose there are 5 nodes and I want to see number of keys in each nodes rather then cumulative number.

January 22nd, 2014
22:49 GMT

There are a number of ways to do this but the simplest is to use an Invocable. They are just an invocation that executes either on a specific node or on all of them. From there you can do whatever you want. See the api doc here:

http://download.oracle.com/otn_hosted_doc/coherence/340/com/tangosol/util/InvocableMap.html

Vlad

March 5th, 2014
14:38 GMT

Hi Ben, I just downloaded the zip file and the build failed with these errors:
[junit] at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:136)
[junit] at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onNotify(DistributedCache.CDB:3)
[junit] at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
[junit] at java.lang.Thread.run(Thread.java:662)
[junit] Caused by: (Wrapped: error configuring class “com.tangosol.io.pof.ConfigurablePofContext”) (Wrapped: Unable to load class for user type (Config=config/my-pof-config.xml, Type-Id=1000, Class-Name=com.benstopford.coherence.bootstrap.structures.MyPofObject)) (Wrapped) java.lang.ClassNotFoundException: com.benstopford.coherence.bootstrap.structures.MyPofObject
[junit] at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.ensureSerializer(Service.CDB:49)
[junit] at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.instantiateFromBinaryConverter(DistributedCache.CDB:3)
[junit] at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$BackingMapContext.setClassLoader(DistributedCache.CDB:6)
[junit] at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$BackingMapContext.onInit(D

March 20th, 2014
7:49 GMT

Sorry about that. I must have committed a dodgy version. It should be ok now.

Arif Iftekhar

March 21st, 2018
15:21 GMT

Is there a database query to find what job is running on what node when coherence is enabled ?
