Consistent fault-tolerant mechanism for distributed cache

It is known that computer engineering has two open problems: variables naming and cache invalidation. Although, there are approaches for keeping cache consistent and obtain performance gain in reads-optimized distributed systems.

This article based on another paper “Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency”.

Model

Consider the following scheme: there is a single server, where all data is stored. Clients can read or write data.

According to the title, the mechanism should have two properties:

  • Consistency - for a given moment the request made at any part of the system must be processed with the same result
  • Fault-tolerant - the system must keep working after some of the nodes will go offline. In this particular case, it means surviving client’s fault but not server’s one

Algorithm

When a client attempts to read the data from the server, the server responses with the result and the lease. Lease - is a guarantee of data will not be changed during the specified timeframe. The client may hold the data in local cache while the lease is active.

When the server is requested to write data, it makes an attempt to invalidate all active leases. If some of the clients experiencing network fault, the server is obligated to delay the write operation until all active leases either invalided or expired.

Lease terms

Choosing the lease timeframe is a trade-off. If it too high, writing in case of failing clients is too expensive. If it is too low, clients make more requests to the server. Furthermore, zero lease timeframe might be better than very small lease timeframe because it would expire before a client will be able to access cached value.

Let’s consider a simple analytic model. For simplicity assume, that all clients are equal.

Parameters of the system are:

$N$ - is a total number of clients.

$R$ - rate of reads for each client (means in average there are $R$ reads per time unit)

$W$ - rate of writes for each client

$S$ - a maximum number of active leases in any specific moment. A client can have at most one lease.

$m_{prop}$ - time for a message to be transferred between the client and the server

$m_{proc}$ - time for client or server to process the message

$\varepsilon$ - clock uncertainty between the client and the server

$t_S$ - lease term

When the message is sent, it requires

$$\underbrace{m_{proc}}{client} \underbrace{m_{prop}}{client -> server} + \underbrace{m_{proc}}{server} = m{prop} + 2m_{proc}$$

An effective lease term (lease term from the client’s point of view):

$$ t_C = \max ( 0, t_{S} - (m_{prop} + 2m_{proc}) - (m_{prop} + 2m_{proc}) - \varepsilon) $$

Assume that $$t_S » m_{prop} + 2m_{proc}$$

Let’s investigate how lease timeframe influences a total number of network interactions made by the server.

If no lease is present, there are expected $NR$ reads, and $NW$ writes total. Read requires two interactions, therefore total load is $2NR + NW$.

When leases are active, $Rt_C$ requests per time unit would go to local cache, plus 1 request to obtain the lease. Therefore, the probability of client accessing the server is $\frac{1}{1+Rt_C}$. Expected read server load is $\frac{2NR}{1+Rt_C}$

On the other hand, now it is required to make attempts to withdraw the lease on write. During each writing request, it is necessary to withdraw $S$ leases. Therefore, new write cost is $(S+1)NW$

The goal is to select lease terms to make a profit.

$$2NR + NW > \frac{2NR}{1 + Rt_C} + (S+1)NW$$

If $2R > SW$ the following inequality can be derived.

$$ t_C > \frac{1}{R(\frac{2R}{SW} - 1)} $$

If this condition holds, there is a performance benefit for a server.

Analysis

The algorithm allows implementing caching scheme in an arbitrary distributed system without loss of any guarantees. Furthermore, the model has a free variable - lease term - which allows adapting the system for any specific usage pattern. Experiments show that server load can be only 10% for 10s lease term, comparing with lease-less architecture.

There are several possibilities for algorithm improvements:

  1. Set different lease terms for different data, depending on $\alpha$
  2. Renew leases before it has expired

The remaining drawbacks are:

  1. The model assumes equal delays during message processing and propagation in different requests. That is not true in real systems.
  2. Clocks in clients and the server must be in sync. It might be hard to guarantee a low $\varepsilon$ due to unpredictable networking, and high $\varepsilon$ would require higher uneffective lease term.

Overall, the algorithm, despite its simplicity, shows good empiric results. Since the article was first published in 1989, there must have been made several improvements for the idea.

Share