Part of building stateless systems that scale horizontally is using a distributed cache, which is where the state is actually stored. This guide outlines the types of items you will probably need to cache in such a system, where to cache them (locally or in the distributed cache), how to use the cache, what timeouts to use, and so on.
Horizontal Scale
First, let's review what a horizontally scaled system looks like.
Machines 1 and 2 accept requests from the load balancer in a non-deterministic way, meaning there is no affinity and there are no sticky sessions. So the machines need to be stateless, meaning they don’t manage state themselves. The state is stored in a central place, the distributed cache. Machines can be taken offline without killing a bunch of user sessions, and more machines can be added so that load can be distributed as needed.
Types of Cache
Notice there are two types of caches here:
1) Local caches - these are in-memory caches on each machine. This is where we want to store stuff that has long timeouts and is not session specific.
2) Distributed cache - this is a high-performance cluster of machines with a lot of memory, built specifically to provide an out-of-process memory store for other machines/services to use. This is where we want to store stuff that is session specific.
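To make the local side concrete, here is a minimal sketch of an in-memory cache with per-item timeouts, written in Python. The class name `LocalCache`, its method names, and the injectable `clock` parameter are assumptions made for illustration and do not come from any particular library. It is not thread-safe. A distributed cache client would typically expose similar get/set/remove operations, but every call crosses the network and values have to be serialized.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class LocalCache:
    """Hypothetical in-memory cache with a time-to-live per item (one per machine)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        """Store a value that expires ttl_seconds from now."""
        self._items[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # drop the expired item lazily, on read
            return None
        return value

    def remove(self, key: str) -> None:
        """Remove an item if present; a missing key is not an error."""
        self._items.pop(key, None)
```

Because expired items are only dropped when they are read, a production cache would usually also cap its size or sweep periodically.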
Using Cache
When using information that is cached, you should always try to get the information from the cache first. If it’s not cached, get it from the source, store it in the cache, and return it to the caller. This is commonly called the read-through cache pattern (when the calling code performs these steps itself, it is also known as cache-aside). It ensures you get data in the most efficient way possible and only go back to the source when needed.
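The lookup order described above can be sketched in a few lines of Python. This is only an illustration: the helper name `get_or_load` is hypothetical, a plain dictionary stands in for the cache, and timeouts are left out to keep the flow visible.

```python
from typing import Any, Callable, Dict

_MISSING = object()  # sentinel so that a cached None still counts as a hit


def get_or_load(cache: Dict[str, Any], key: str,
                load_from_source: Callable[[], Any]) -> Any:
    """Return the cached value for key, loading and caching it on a miss."""
    value = cache.get(key, _MISSING)
    if value is not _MISSING:
        return value                  # cache hit: the source is not touched
    value = load_from_source()        # cache miss: go back to the source
    cache[key] = value                # store it for the next caller
    return value


def load_currencies() -> list:
    """Hypothetical source call (a database or service in a real system)."""
    return ["USD", "EUR", "GBP"]


cache: Dict[str, Any] = {}
first = get_or_load(cache, "currencies", load_currencies)   # loads from source
second = get_or_load(cache, "currencies", load_currencies)  # served from cache
```

Passing the loader as a function keeps the caching logic in one place, so every caller follows the same cache-first order.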
Cached Item Lifespan
There are basically two things to think about when considering a cached item's lifespan.
1) How long should something remain in the cache before it has to be updated? This will vary depending on the type of data cached. Some things, like the logo of a tenant on a multi-tenant system, should have a long timeout (hours or days). Other things, like an array of balances in a banking system, should have a short timeout (seconds or minutes) so they are almost always up to date.
2) When should stuff be removed from cache? You should always remove stuff from cache if you know you are about to do something that will invalidate the information previously cached. For example, if you are about to execute a transfer, you should invalidate the cached balances, because you’ll want to get the latest balances from the source after the transfer has happened. Basically, any time you can identify an action that will invalidate something in the cache (make it inconsistent with the source), you should remove that item so it can be refreshed.
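The transfer example can be sketched as follows. Everything here is hypothetical: the `BalanceService` class, the `balances:<session id>` key format, and the dictionary standing in for the banking system of record. Timeouts are omitted so the invalidation step stands out.

```python
from typing import Dict


class BalanceService:
    """Hypothetical service; `source` stands in for the system of record."""

    def __init__(self, source: Dict[str, int]) -> None:
        self._source = source                          # account -> balance in cents
        self._cache: Dict[str, Dict[str, int]] = {}    # cache key -> cached balances

    def get_balances(self, session_id: str) -> Dict[str, int]:
        """Return balances from cache, loading them from the source on a miss."""
        key = f"balances:{session_id}"
        cached = self._cache.get(key)
        if cached is not None:
            return cached
        balances = dict(self._source)   # snapshot read from the source
        self._cache[key] = balances
        return balances

    def transfer(self, session_id: str, from_account: str,
                 to_account: str, cents: int) -> None:
        """Move money, removing the cached balances that the update invalidates."""
        self._cache.pop(f"balances:{session_id}", None)  # invalidate before updating
        self._source[from_account] -= cents
        self._source[to_account] += cents


service = BalanceService({"checking": 1000, "savings": 500})
before = service.get_balances("session-1")   # loaded from source, then cached
service.transfer("session-1", "checking", "savings", 250)
after = service.get_balances("session-1")    # cache was cleared, so this reloads
```

In a concurrent system, a read that lands between the invalidation and the update could re-cache the old balances, so some designs also remove the item again after the update completes.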
Designing Cache Keys
You should take the time to design a good cache key strategy. The strategy should make it clear to your development team how keys are constructed. I’ll present one way to do this (but not the only way). First think about the types of data you’ll be caching. Let's say a typical multi-tenant system will consist of the following categories of cached items:
1) Application - this is stuff that applies to the whole system/application.
2) Tenant - this is stuff that is specific to a tenant. A tenant is a specific organization/company that is running software in your system.
3) Session - this is stuff that is specific to a session. A session is what a specific user of an organization creates and uses as they interact with your software.
The whole point of key design is to figure out how to develop unique keys. So let's start with the categories. We can do something simple like Application = “A”, Tenant = “T”, Session = “S”. The category becomes the first part of the cache key.
We can use nested static classes to define parts of the key, starting with the categories. In the code sample above we start with an Application class that uses “A” as the KeyPattern. Next we build a nested class, “Currencies”, which extends the KeyPattern with its own unique signature. Notice that the signature in this case takes in parameters to create the unique key. Here we are using page and page size to build the key. This way we can cache a specific set of results for a query that uses paging. There is also a property to get the TimeToLive and another to construct the key, based on the pattern.
The above example is caching stuff in a “local cache”, not a distributed cache. This is because the information in this example is not specific to a user or session, so each machine can load it and keep its own copy. Generally you want to do this for anything that doesn’t need to be distributed, because it performs much better (think local memory vs. serialization, deserialization, and a network round trip).
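The nested-class key design described above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the original sample: the colon separator, the one-day time to live, and the method name `key` are all choices made here, and Python nested classes with static methods stand in for nested static classes.

```python
class Application:
    """Category for items that apply to the whole system/application."""

    KEY_PATTERN = "A"  # category prefix, the first part of every key

    class Currencies:
        """A paged list of currencies, cached per (page, page size)."""

        # This item's own signature, appended to the category prefix.
        KEY_PATTERN = "Currencies:{page}:{page_size}"
        TIME_TO_LIVE_SECONDS = 24 * 60 * 60  # assumed value: one day

        @staticmethod
        def key(page: int, page_size: int) -> str:
            """Build the unique cache key for one page of results."""
            suffix = Application.Currencies.KEY_PATTERN.format(
                page=page, page_size=page_size)
            return f"{Application.KEY_PATTERN}:{suffix}"


key = Application.Currencies.key(page=2, page_size=50)  # "A:Currencies:2:50"
ttl = Application.Currencies.TIME_TO_LIVE_SECONDS
```

Keeping the pattern, the time to live, and the key builder together means a developer only has to look in one place to see how an item is cached.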
When thinking about unique keys for things like session, you should consider using the session identifier as an input to the key, since that should guarantee uniqueness (per session). Remember, you basically just have a really big name/value dictionary to fill up, and you have to manage the uniqueness of the keys yourself.
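Extending the same idea to the tenant and session categories might look like the sketch below. The item names (`Logo`, `Balances`), the timeout values, and the colon-separated format are assumptions for illustration; the point is only that the tenant or session identifier becomes part of the key.

```python
class Tenant:
    """Category for items that are specific to one tenant."""

    KEY_PATTERN = "T"

    class Logo:
        TIME_TO_LIVE_SECONDS = 24 * 60 * 60  # assumed: long timeout, rarely changes

        @staticmethod
        def key(tenant_id: str) -> str:
            if not tenant_id:
                raise ValueError("tenant_id is required to build a unique key")
            return f"{Tenant.KEY_PATTERN}:{tenant_id}:Logo"


class Session:
    """Category for items that are specific to one user session."""

    KEY_PATTERN = "S"

    class Balances:
        TIME_TO_LIVE_SECONDS = 30  # assumed: short timeout, must stay fresh

        @staticmethod
        def key(session_id: str) -> str:
            if not session_id:
                raise ValueError("session_id is required to build a unique key")
            return f"{Session.KEY_PATTERN}:{session_id}:Balances"


alice_key = Session.Balances.key("session-abc")  # "S:session-abc:Balances"
bob_key = Session.Balances.key("session-xyz")    # "S:session-xyz:Balances"
```

Rejecting an empty identifier guards against two sessions accidentally sharing the key `S::Balances`.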
Takeaways
1) Use both a local and a distributed cache. Only put session-specific or short-lived stuff in the distributed cache; cache everything else locally.
2) Set appropriate timeouts for items. This will vary depending on the type of information and how closely it needs to match the source.
3) Remove stuff from cache when you know it will become inconsistent (because of updates, deletes, etc.).
4) Take care to design cache keys that are unique. Build a model of the type of information you plan to cache and use that as a template for building keys.