The problem of cache dependencies

When some data has been updated, all dependent caches should be reset. Suppose, the cache of main page of a company site contains an instance of Product model. When the instance of Product model has been updated, the cache should be updated too. Another example, when attributes of User model (for example, last_name) has been updated, all caches of posts of the user, contained the last_name, should be reset too.

Usually, the pattern Observer (or its variety pattern Multicast) is responsible for the cache invalidation. But even in this case the invalidation logic becomes too complicated, the achieved accuracy is too low, the Coupling is growing fast, and encapsulation is disclosed.

The solution of the issue can be cache tagging (i.e. marking cache by tags). For example, the cache of main page can be marked by tag product.id:635. The all user’s posts can be marked by tag user.id:10. Post lists can be marked by composite tag, composed of selection criteria, for example type.id:1;category.id:15;region.id:239.

Now it’s enough to invalidate a tag to invalidate all dependent caches. This approach is not new, and widely used in other programming languages. At one time it was even implemented in memcache, see memcached-tag.

See also:

Overhead of cache reading vs overhead of cache creation

How to implement invalidation of caches dependent on the tag? There are two options:

1. Destroy physically all dependent caches on the tag invalidation. Implementation of this approach requires some overhead on the cache creation to add key of the cache into the cache list (or set) of the tag (for example, using SADD). The disadvantage is that the invalidation of too many dependent caches takes some time.

2. Just change the version of tag on the tag invalidation. Implementation of this approach requires some overhead on the cache reading to verify the version of each tag of the cache with the actual tag version. So, the cache should contain all tag versions on the cache creation. If any tag version is expired on the cache reading, the cache is invalid. The advantage of this approach is immediate invalidation of the tag and all dependent caches. Another advantage is that premature discarding of a tag info is not possible (using LRU), because the tag info is read mush often than dependent caches.

I’ve chosen the second option.

Tagging of nested caches

Because tags are verified at the moment of cache reading, let’s imagine, what happens, when one cache will be nested to other cache. Multi-level cache is not so rarely. In this case, the tags of inner cache will never be verified, and outer cache will alive with outdated data. At the moment of creation the outer cache, it must accept the all tags from the inner cache into own tag list. If we pass the tags from the inner cache to the outer cache in an explicit way, it violates encapsulation! So, cache system must keep track the relations between all nested caches, and pass automatically all tags from an inner cache to its outer cache.

Replication problem

When tag has been invalidated, a concurrent thread/process can recreate a dependent cache with outdated data from a slave, before the slave will be updated.

The best solution of this problem is a locking the tag for cache creation until slave will be updated. But, first, this implies a certain overhead, and secondly, all threads (including current one) continue to read outdated data from the slave (unless reading from the master is specified explicitly).

A compromise solution can be simple re-invalidation of the tag after period of time when the slave is guaranteed to be updated.

I saw also an approach of cache regeneration instead of removing/invalidation. This approach entail ineffective memory usage (in case LRU principle). Also, this approach does not resolve the problem of complexity of invalidation logic. Usually, this approach is the cause of a lot of bugs. For example, it requires to use a high quality ORM. Some ORMs does not perform type conversion of instance attributes on save, therefore, the cache can be wrong (for example, there can be a string instead of a datetime instance). I saw such case in my practice, the cache saved the string from the HTTP-client instead of the datatime instance. Although the data had been saved correctly, the model logic didn’t performed type conversion until some another method had been called (semantic coupling).

Updated on Nov 10, 2016

Added description of implementation of tag locking.

Implementation of tag locking

The main purpose of tag locking is a preventing of substitution of actual data by outdated data by concurent threads/processes, if it’s needed by transaction isolation level or a delay of replication.

The tag locking is implemented by library as preventing the dependent cache creation by concurent threads/processes while the tag is locked.

Why was not implemented a Pessimistic Offline Lock or Mutual Exclusion? This is a resonable question, because the cached logic can be too resource intensive. This implementation requires concurent threads/processes are waiting untile the locked tag will be released.

Constructive obstacle to implementing pessimistic locking

The main purpose of the library is cache invalidation.

Suppose, the process P1 has begun transaction with isolation level of “Repeatable read”.

Then the process P2 has begun the transaction, updated data in the DB, invalidated tag T1, and ascuired the lock for the tag T1 until the transaction will be committed.

Process P1 are trying to read the cache with key C1, which is tagged by the tag T1, and is not valid anymore. Not being able to read the invalid cache C1, the process P1 receives the outdated data from the DB (remember, the transaction isolation level is “Repeatable read”). Then the process P1 are trying to create the cache C1, and waiting while the tag T1 will be released.

When the transaction of process P2 is committed, the process P2 releases the tag T1. Then the process P1 writes the outdated data into the cache C1. This locking does not make sense.

But what will be happened, if we check the status of tag T1 on the cache reading (not writing)? Can this approach to change something?

Yes, it can. First, it adds an overhead to reading logic. The second, it can has an effect if transaction isolation level is not higher than “Read committed”. For the transaction isolation level “Repeatable read” (which is default for some DB, and at least required for the correct work of pattern Identity Map) and higher, it does not has any effect. In this case, the process P2 would be locked before the transaction beginning.

Thus, this solution would be partial, not universal, and would contain an uncontrolled dependence. For 2 from 4 of transaction isolation level it would not work.

Accompanying obstacle to implementing pessimistic locking

Except the constructive obstacle to implementing pessimistic locking, there is also some other obstacles.

The library is focused mainly on web applications. Waiting for parallel process until the end of the transaction, or until the slave is updated, which in some cases can take 8 seconds or more, is practically not feasible in web applications.

There is the 3 main reasons:

  • The quickness of response is important for web-application, otherwise a client simply can not wait for the response.
  • There is no any reason to wait for lock release longer than it takes time to create the cache itself.
  • An increase in the number of pending processes can lead to a memory overflow, or reaching of available workers of the server, or reaching of the maximum allowed number of connections to the database or other resources.

Also, there would be a problem with the implementation, since it is impossible to correctly block all tags by single query.

  • First, we have to use method cache.add() instead of cache.set_many() for locking, to ensure the atomicity of the existence check and cache creation.
  • Second, each tag should be locked by separate query, that increases the overhead.
  • Third, the locking by single query per tag can lead to Deadlock, the probability of which can be significantly reduced by topological sorting.

We should also mention the possibility of row-level locking by DB using SELECT FOR UPDATE. But it works only when both transactions use SELECT FOR UPDATE, otherwise it does not work:

When a transaction uses this isolation level, a SELECT query (without a FOR UPDATE/SHARE clause) sees only data committed before the query began; it never sees either uncommitted data or changes committed during query execution by concurrent transactions. In effect, a SELECT query sees a snapshot of the database as of the instant the query begins to run.

But no one uses cache of select for update (it doesn’t make sense to do it, and usually select for update is not used by web-applications because business transaction is used instead). Also, this approach is not able to solve the problem of replication.

Thundering herd

But what we can to do if cached logic is really resource intensive?

Dogpile is also known as Thundering Herd effect or cache stampede.

The answer is simple - Pessimistic Lock. But we have to lock not tags, but the key of the cache (or group of related keys, see Coarse-Grained Lock, especially when using aggregate queries). It’s because of when the cache key is released, the cache must be guaranteed to be created (but tags has many-to-many relation to caches).

The lock must cover the entire code fragment from reading the cache to creating it. And this responsibility is not related to invalidation.

There is a lot of libraries which solve the issue, for example:

Transaction problem

When web-application has good traffic, it’s possible the concurrent process recreates the cache with the outdated data since the tag has been invalidated but before the transaction is committed. In contrast to replication problem, here is the manifestation of the problem strongly depends on the quality of the ORM, and the probability of problems is reduced when you use a pattern Unit of Work.

Let to consider the problem for each transaction isolation level separately.

Read uncommitted

This is a simple case without any problems. If replication is used, it’s enough to repeat invalidation when the slave is guaranteed to be updated.

Read committed

There is a problem, especially when you are using the pattern ActiveRecord. The probability of the problem can be reduced by using the pattern DataMapper together with Unit of Work, this reduces the interval of time between data saving and transaction commit. But the problem is still possible.

In contrast to the replication problem, it would be preferable to use tag locking here until the transaction will be committed, because the current process reads different data than concurrent processes. It’s impossible to say which process (the current process or concurrent one) will have created the cache, thus it would be desirable to avoid cache creation until transaction is committed.

But this transaction isolation level is not so serious, and most often used to increase the degree of parallelism, i.e. has the same purpose as replication. In this case, the problem of the transaction isolation level “Read committed” is usually absorbed by the replication problem, because process usually reads data from a slave.

Therefore, the expensive lock can be replaced by a re-invalidation when transaction is committed, as tradeoff.

Repeatable read

This case is more interesting. We can’t avoid tag locking here because we have to know not only the list of cache’s tags, but also the time of each transaction commit which has invalidated the tag.

Thus, we have to lock the tag from the moment of the invalidation, but, moreover, we are not able to create cache in transactions which has been begun earlier than the current transaction is committed.

The good news is that we can lock the tag until the slave will be updated, if we have to use tag locking in any case.

Serializable

Because non-existent objects usually are not cached, we are able to limit the problem of this transaction isolation level by the level of Repeatable read.

Multiple connections to Database

When you use multiple databases, and its transactions are synchronous, or you use simple replication, then you can use by one instance of outer cache (wrapper) per one instance of inner cache (backend). The transaction of the cache does not have to strictly follow to system transactions of the DB. It is enough to fulfill its purpose - to prevent the substitution of the cached actual data by concurrent process until the actual data will be visible for the concurrent process. Therefore, a single transaction of the cache can cover several system database transactions.

When you use multiple connections to the same database (it sounds a little strange, but theoretically it’s possible, for example, when you don’t have ability to share connection between several ORMs in the single application), or the system database transactions are not synchronous, then you can configure the outer cache (wrapper) in the way to have by one instance of outer cache (wrapper) per one connection to DB for each instance of inner cache (backend).

Non-cached fragment. Dynamic window in cache.

There are two mutually complementary patterns based on diametrically opposite principles - Decorator и Strategy. The first one places variable logic around a code, In the first case, the variable logic is placed around the declared code, in the second case it is passed into it. Usual cache is similar to the pattern Decorator, when the dynamic logic is located around the cached logic. But sometimes a little fragment of the logic should not to be cached inside the cache. For example, it can be some data of user, permission checking etc.

This problem can be solved by using Server Side Includes.

Another approach is using two-phase template rendering, for example django-phased. To be honest, this approach has a considerable resource consumption, and in some cases the achieved effect can be gone. Probably, due to this reason the approach is not widely used.

The popular template engine Smarty written by PHP has the function {nocache}.

But the more interesting approach would be to use python code inside the dynamic window to abstract from third-party technologies.

Updated on Nov 06, 2016

Added abstract dependency manager.

Abstract dependency manager

For a long time I did not like the fact that several classes with different responsibilities were aware about the logic of tags handling.

It would be good to encapsulate this logic into separate class strategy, for example, similar to TagDependency of YII framework, but this approach creates overhead as extra query per each cache key to verify its tags, that means depriving the method cache.get_many() of the sense - aggregation queries. I think, the overhead should not be more than one extra query per action, even for case this action is aggregated like cache.get_many().

Also I had another method with tangled responsibilities to provide aggregation queries, that does not cause delight.

But I like the idea to extract an abstract dependency manager, and obtain ability to use not only tags for invalidation, but any another principle, even an composite principle.

The problem was solved by class Deferred. It’s not pure Deferred as we know it from asynchronous programming, otherwise I would like to use this elegant and lightweight library, kindly provided by the guys from Canonical.

My case requires not only delay the query execution, but also aggregation queries when it possible, for example, by using of cache.get_many().

Probably, the name Queue or Aggregator would be better, but since from the interface point of view we just postpone the task execution without going into details of its implementation, I preferred to leave the name Deferred.

This approach allows me to extract the abstract dependency manager, and now the logic of invalidation by cache tagging is simple an implementation of the intarface as class strategy TagsDependency.

This opens prospects for the creation of other implementations of dependency management, for example, by observing a file changing, or SQL query, or some system events.

Gratitude

Thanks a lot to @akorn for the meaningful discussion of the problem of caching.

Comments

comments powered by Disqus