Ensuring Performance in a Multi-Tenant Kafka Cluster


I went down the rabbit hole of learning about performance in a multi-tenant Kafka cluster.

The Apache Kafka documentation covers the basics, but I found that the Confluent blog post on multi-tenancy in the cloud and the YouTube video provide good insights into operating a performant Kafka as a service in the cloud.

My notes are a summary of the excellent resources listed below. 

https://kafka.apache.org/documentation/#multitenancy
https://www.youtube.com/watch?v=8hcUBhLE6_U

Multi Tenant Kafka with Confluent Cloud -
https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/
https://www.youtube.com/watch?v=8ti63z3idbs&t=2s

Optimizing Performance

An application's performance is usually bounded by the following resources -

  • CPU
  • Memory
  • Network
  • Disk

In most systems, the CPU is faster than the memory. Memory in turn is faster than the network and disk.

When data is cached, an application's read performance is bounded by memory. Its write performance depends on how fast data can be written to disk.

Quotas

A quota is not a reservation, but a means to ensure that tenants don't consume all the resources of a system.

Kafka provides 3 quotas that correlate to CPU, Memory and Disk. Each of these quotas can be configured per broker per tenant to match the finite resources available to the underlying operating system. 

  • Disk - Produce Bandwidth Quotas
  • Memory - Consume Bandwidth Quotas
  • CPU - Request Quotas

Configuring Broker Performance

What is a Tenant?

A tenant in Kafka can be any entity or application that expects a certain level of performance from the cluster. It is represented by any of the following -
  • The User Principal
  • The Client ID
  • Both the User Principal and the Client ID

Bandwidth Quotas

There are two types of Bandwidth Quotas -
  • Produce Bandwidth Quota - This throttles the rate at which clients write data into Kafka
  • Consume Bandwidth Quota - This throttles the rate at which clients read data from Kafka

Bandwidth quotas are measured in bytes per second.

Request Quotas

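Request quotas limit the share of time a client's requests may spend on the broker's network and request handler threads. They are expressed as a percentage of thread time within a quota window, not as a number of requests per second.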

For newer clients, the broker communicates that the client needs to throttle its requests for a certain amount of time. For older clients, the broker automatically mutes the channel for the duration of the throttle.
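To make the throttling step concrete, here is a minimal sketch of how a throttle duration could be derived from an observed rate and a quota. It assumes the delay is chosen so that the average rate over the window plus the delay falls back to the quota. The function name and the sample numbers are illustrative and do not come from the sources above.

```python
def throttle_delay_ms(observed_rate: float, quota: float, window_ms: float) -> float:
    """Delay needed so the average rate over (window + delay) equals the quota.

    Solves observed_rate * window / (window + delay) = quota for delay.
    Hypothetical helper for illustration; not a Kafka API.
    """
    if quota <= 0:
        raise ValueError("quota must be positive")
    if observed_rate <= quota:
        return 0.0  # within quota, no throttling needed
    return (observed_rate - quota) / quota * window_ms


# A client producing 15 MB/s against a 10 MB/s quota, measured over a 1 s window
delay = throttle_delay_ms(15_000_000, 10_000_000, 1_000)
print(delay)
```

The further a client overshoots its quota, the longer it is asked to wait, which is what keeps one tenant from starving the others.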

Request quotas help protect CPU performance. The right value can be difficult to reason about and even harder to measure.

Each request and response from and to the client is associated with network threads and request handler threads. Network threads handle communication with the client, while request handler threads handle processing the requests. These threads are bound to the CPU. Hence the request quota is a function of both the number of network threads and the number of request handler threads.

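A quota of n% represents n% of one thread, so the total request quota capacity of a broker, in percent, can be calculated using the following formula -

(num.network.threads + num.io.threads) * 100

Here num.io.threads is the broker setting for the number of request handler threads.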
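The formula can be worked through with a small sketch. The thread counts below assume the broker defaults of 3 network threads (num.network.threads) and 8 request handler threads (num.io.threads). The function name and the tenant's quota of 200 are illustrative.

```python
def request_quota_capacity(num_network_threads: int, num_io_threads: int) -> int:
    """Total request quota capacity of one broker, in percent.

    Each thread contributes 100%, so a quota of 200 means
    "up to two threads' worth of time".
    """
    return (num_network_threads + num_io_threads) * 100


# Assumed broker defaults: 3 network threads, 8 request handler threads
capacity = request_quota_capacity(3, 8)

# Fraction of the broker's thread time a tenant with request_percentage=200 may use
tenant_share = 200 / capacity
print(capacity, round(tenant_share, 3))
```

Expressing the quota relative to thread capacity makes it easier to see how much of a broker a single tenant is allowed to occupy.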

Configuring a Kafka broker

Quotas on a Kafka broker can be configured via the kafka-configs command, with the following parameters -

  • producer_byte_rate
  • consumer_byte_rate
  • request_percentage
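As a rough illustration of how these parameters are applied to a tenant, the sketch below builds the argument list for the kafka-configs.sh tool that ships with Kafka. It only assembles the command and does not run it. The helper name, the bootstrap address, and the user and client names are assumptions made for the example.

```python
from typing import List, Optional


def quota_command(
    bootstrap_server: str,
    user: Optional[str] = None,
    client_id: Optional[str] = None,
    producer_byte_rate: Optional[int] = None,
    consumer_byte_rate: Optional[int] = None,
    request_percentage: Optional[int] = None,
) -> List[str]:
    """Build a kafka-configs.sh invocation that sets quotas for one tenant.

    Hypothetical helper; the tenant is a user, a client id, or both.
    """
    if user is None and client_id is None:
        raise ValueError("a tenant needs a user, a client id, or both")
    quotas = {
        "producer_byte_rate": producer_byte_rate,
        "consumer_byte_rate": consumer_byte_rate,
        "request_percentage": request_percentage,
    }
    # Only include the quotas that were actually given
    configs = ",".join(f"{k}={v}" for k, v in quotas.items() if v is not None)
    if not configs:
        raise ValueError("at least one quota must be set")
    cmd = ["kafka-configs.sh", "--bootstrap-server", bootstrap_server,
           "--alter", "--add-config", configs]
    if user is not None:
        cmd += ["--entity-type", "users", "--entity-name", user]
    if client_id is not None:
        cmd += ["--entity-type", "clients", "--entity-name", client_id]
    return cmd


# Example tenant identified by both user principal and client id (made-up names)
cmd = quota_command("localhost:9092", user="user1", client_id="clientA",
                    producer_byte_rate=1024, consumer_byte_rate=2048,
                    request_percentage=200)
print(" ".join(cmd))
```

Building the arguments as a list keeps quoting simple if the command is later passed to something like subprocess.run.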

Other Quotas

Besides Bandwidth and Request Quotas, it is also possible to limit the creation of topics and the number of connections per broker.

Effective Capacity

When configuring quotas, it is important to recognize that brokers are also responsible for replicating their data, and this should be taken into account during capacity planning. The effective capacity (diagram below) is what is available to be divided among the various tenants.
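A back-of-the-envelope sketch of this idea follows. It relies on a simplifying assumption that is not taken from the sources: load is spread evenly, so with a replication factor of RF each broker spends (RF - 1) / RF of its write bandwidth on replica traffic. All names and numbers are made up for illustration.

```python
def effective_produce_capacity(raw_bytes_per_sec: float, replication_factor: int) -> float:
    """Write bandwidth left for tenants once replication is accounted for.

    Simplified model (an assumption): every produced byte is written
    replication_factor times across an evenly loaded cluster.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return raw_bytes_per_sec / replication_factor


# Hypothetical broker that can sustain 300 MB/s of writes, replication factor 3
effective = effective_produce_capacity(300_000_000, 3)

# Produce quotas handed out so far (made-up tenants), in bytes per second
tenant_quotas = {"tenant-a": 40_000_000, "tenant-b": 30_000_000}
remaining = effective - sum(tenant_quotas.values())
print(effective, remaining)
```

Since a quota is a limit rather than a reservation, the sum of tenant quotas may be allowed to exceed the effective capacity. The calculation simply shows how much headroom exists before that happens.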


Sources

https://kafka.apache.org/documentation/#multitenancy

https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/

https://www.youtube.com/watch?v=8ti63z3idbs&t=2s

https://www.youtube.com/watch?v=8hcUBhLE6_U