Configuring Kafka on Kubernetes to be reachable from an external client with Helm

Tsuyoshi Ushio
7 min read · Apr 9, 2019


A lot of people struggle to configure Kafka on Kubernetes, especially when they want to use it from outside the cluster. I'll explain how to make it work. I also wrote a PR to the incubator/kafka Helm chart to enable it; if it gets merged, it might be helpful. I tested this on Azure; however, the principle is the same for other cloud platforms or on-premises.

What is the issue?

To reach a Kafka cluster deployed on Kubernetes from an external client, you need to go through a Service. It can use NodePorts, a LoadBalancer, or an ingress controller. Some cloud platforms can't use NodePorts; in that case, we need to use a LoadBalancer or ingress. However, this scenario can cause a problem. Let's look at this diagram.

Kafka on Kubernetes with LoadBalancer

When you start talking to Kafka from an external client, you send a request to the Service at its IP address (12.345.67:31090). Kafka then returns the endpoint the client should use. It should be `12.345.67:31090`. However, you might see something like `kafka-0.kafka-headless.default:9092`, which is the internal access point for Kafka from resources inside Kubernetes. The Kafka client then tries to access `kafka-0.kafka-headless.default:9092` and fails, because that address is not reachable from outside the Kubernetes cluster.

We can use kafkacat to test this. As you can see, it returns the wrong address:

$ kafkacat -b 12.345.67:31090 -L
Metadata for all topics (from broker -1: 12.345.67:31090/bootstrap):
1 brokers:
broker 0 at kafka-0.kafka-headless.default:9092
2 topics:
topic "test2" with 1 partitions:
partition 0, leader 0, replicas: 0, isrs: 0

Root cause

Kafka advertises an endpoint based on the port number of the incoming request. We configure advertised.listeners, listener.security.protocol.map, listeners, and inter.broker.listener.name to enable this.

"advertised.listeners": PLAINTEXT://kafka-0.kafka-headless.default:9092,EXTERNAL://12.345.67:31090
"listener.security.protocol.map": PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT
"listeners": PLAINTEXT://:9092,EXTERNAL://:31090
"inter.broker.listener.name": "PLAINTEXT"

If a broker receives a request on port 9092, it advertises "PLAINTEXT://kafka-0.kafka-headless.default:9092" as the listener endpoint. For port 31090, it advertises "EXTERNAL://12.345.67:31090".

This looks simple; however, it doesn't work with a lot of Helm charts. The LoadBalancer scenario is either not implemented or has a bug. Let's solve this.

For more detail on the broker configuration, see the Kafka documentation listed in the Resources section below.

Solution

Let me expand on this. Why does it happen even if I configure advertised.listeners and the other settings? I investigated the incubator/kafka Helm chart and found the root cause. I'll explain what happens based on the chart's configuration.

Ports for Kafka on Kubernetes with LoadBalancer

TargetPort of Service

Look at the picture above. When the Service (LoadBalancer) receives a request on port 31090, it forwards the request to the Pod. According to the default targetPort configuration, it dispatches the request to port 9092 of the container. From Kafka's point of view, "a request to 9092 is coming, so let's advertise the endpoint for it: kafka-0.kafka-headless.default:9092". The targetPort therefore needs to be the same as the Service (LoadBalancer) port. If you configure advertised.listeners as above, Kafka creates listeners on both port 9092 and port 31090. We need to fix the original Helm chart to enable this. You can find my PR later.
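As a sketch, the external Service needs its targetPort to match the external listener port rather than 9092. The Service name and label selector here are illustrative, not the chart's exact names:

```
# Hypothetical per-broker external Service: port and targetPort must match
# the EXTERNAL listener port so Kafka sees the request arrive on 31090.
apiVersion: v1
kind: Service
metadata:
  name: kafka-0-external
spec:
  type: LoadBalancer
  loadBalancerIP: 13.77.176.999   # pre-reserved IP (fake, as elsewhere in this article)
  selector:
    statefulset.kubernetes.io/pod-name: kafka-0
  ports:
    - name: external
      port: 31090
      targetPort: 31090   # NOT 9092 -- otherwise Kafka advertises the internal listener
```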

Assign IP Address in advance

You need to assign a pre-reserved IP address to the Service (LoadBalancer), because we need to tell the broker the IP address or DNS name via advertised.listeners when we deploy Kafka with the Helm chart. For Type=LoadBalancer, we can configure loadBalancerIP. The IP addresses below are fake.

values.yaml


## External access.
##
external:
  type: LoadBalancer
  # annotations:
  #   service.beta.kubernetes.io/openstack-internal-load-balancer: "true"
  dns:
    useInternal: false
    useExternal: false
  # create an A record for each statefulset pod
  distinct: false
  enabled: true
  servicePort: 19092
  firstListenerPort: 31090
  domain: cluster.local
  loadBalancerIP:
    - 13.77.176.999
    - 52.247.212.999
    - 13.66.160.999

If you are using Azure with AKS, you need to create the IP addresses within the resource group of the AKS nodes. If you create an AKS cluster in "KafkaResource", the target resource group will be something like "MC_KafkaResource_clustername_westus2"; it is generated automatically by AKS. For more details, see the Azure documentation.
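As a sketch, a static public IP can be reserved with the Azure CLI. The resource group and IP names below are examples; replace them with your own values:

```shell
# Reserve a static public IP in the AKS-managed node resource group
# (names are examples, matching the article's hypothetical cluster).
az network public-ip create \
  --resource-group MC_KafkaResource_clustername_westus2 \
  --name kafka-broker-0-ip \
  --allocation-method Static

# Show the allocated address so it can be copied into values.yaml
az network public-ip show \
  --resource-group MC_KafkaResource_clustername_westus2 \
  --name kafka-broker-0-ip \
  --query ipAddress --output tsv
```

Repeat this once per broker, then put the three addresses into the loadBalancerIP list above.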

Pass the IP address to the broker side with the same port number

Configuration with Pre-defined IP Addresses

The last fix is to pass the IP address to the broker side. At first, I thought I could configure something like this:


"advertised.listeners": EXTERNAL0://12.345.67:31090,EXTERNAL1://12.345.68:31091,EXTERNAL2://12.345.69:31092
"listener.security.protocol.map": PLAINTEXT:PLAINTEXT,EXTERNAL0:PLAINTEXT,EXTERNAL1:PLAINTEXT,EXTERNAL2:PLAINTEXT
"listeners": PLAINTEXT://:9092,EXTERNAL0://:31090,EXTERNAL1://:31091,EXTERNAL2://:31092
"inter.broker.listener.name": "PLAINTEXT"

However, this setting doesn't work. The first broker pod starts fine, but the second pod complains with an error along the lines of "this is already configured". It is also redundant. The Helm chart automatically increments the port number from external.firstListenerPort; that feature is good for NodePort, but it doesn't make sense for LoadBalancer. Let's change the Helm chart a little. I added two features: 1. pass the IP address to each pod; 2. in the LoadBalancer case, don't increment the port number.

The same port number for LoadBalancers

The configuration will be like this.

"advertised.listeners": EXTERNAL://${LOAD_BALANCER_IP}:31090
"listener.security.protocol.map": PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT
"listeners": PLAINTEXT://:9092,EXTERNAL://:31090
"inter.broker.listener.name": "PLAINTEXT"
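A minimal sketch of how a per-pod startup script could compose that advertised listener from an injected IP. The variable names here are hypothetical, not necessarily the chart's actual ones:

```shell
# Hypothetical: each pod gets its own reserved load balancer IP injected
# as an environment variable (e.g. by an init script in the chart).
LOAD_BALANCER_IP="13.77.176.999"   # broker 0's pre-reserved IP (fake)
EXTERNAL_PORT=31090                # the same port for every broker's LoadBalancer

# Compose the EXTERNAL advertised listener for this pod
ADVERTISED="EXTERNAL://${LOAD_BALANCER_IP}:${EXTERNAL_PORT}"
echo "$ADVERTISED"   # -> EXTERNAL://13.77.176.999:31090
```

Because every broker has its own LoadBalancer, each pod can advertise the same port with a different IP, which is why incrementing the port is unnecessary here.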

You can see the full configuration in the external-access sample listed in the Resources section.

Deployment

You can deploy with Helm like this. Copy the values.yaml above and modify it.

$ helm install -f values.yaml -n mh-kafkas ./incubator/kafka
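After deployment, it's worth checking that each external Service actually got its reserved IP before testing with kafkacat. A sketch with kubectl; the Service name here is an assumption and may differ in your chart:

```shell
# List Services and confirm each EXTERNAL-IP matches a reserved address
kubectl get svc -n default

# Inspect one Service's assigned load balancer IP directly
kubectl get svc kafka-0-external \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```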

Make sure the advertised listener is correct

Check that the advertised listeners are correct:

$ kafkacat -b 13.77.176.999:31090 -L
Metadata for all topics (from broker 0: 13.77.176.999:31090/0):
3 brokers:
broker 2 at 13.66.160.999:31090
broker 1 at 52.247.212.999:31090
broker 0 at 13.77.176.999:31090

Create a topic

$ kubectl -n default exec testclient -- /usr/bin/kafka-topics --zookeeper mh-kafkas-zookeeper:2181 --topic test4 --create --partitions 3 --replication-factor 1

Talk with Consumer/Producer

Console 1

$ kafkacat -b 13.77.176.999:31090 -C -t test4

Console 2

$ kafkacat -b 13.77.176.999:31090 -P -t test4

If you type something in Console 2, you will see the output on the Console 1 side. :)

Pull Request

This is my PR for the incubator/kafka repo. I hope it gets merged. :) If not, you can apply the same fix from my repository. The changes are just a couple of lines.

Update, 22nd April 2019: my PR was merged. :)

Conclusion

Enjoy Kafka with Helm on Kubernetes! (and hopefully on Azure. lol)

Resources

There are a lot of great resources out there. You can find a very good example with a YAML file, as well as an example of how to configure Kafka on Azure with external access.

The best blog for understanding the behavior of the Kafka Listeners.

Another solution with domain name

Reference of Kafkacat

Spec request

Helm

Kafka documentation

Metadata logic on Kafka

values.yaml whole sample
