Configuring Kafka on Kubernetes to be accessible from an external client with Helm
A lot of people struggle to configure Kafka on Kubernetes, especially when they want to use it from outside the cluster. I’ll explain how we can make it work. I also wrote a PR to the incubator/kafka Helm chart to enable it. If it is merged, it might be helpful. I tested it on Azure; however, the principle is the same for other cloud platforms or on-premises.
What is the issue?
To access a Kafka cluster deployed on Kubernetes from an external client, you need to go through a Service. It can be a NodePort, a LoadBalancer, or an ingress controller. Some cloud platforms can’t use NodePorts; in this case, we need to use a LoadBalancer or ingress. However, this scenario can cause a problem. Let’s look at this diagram.
When you start talking with Kafka from the external client, you send a request to the Service, which has an IP address and port (12.345.67:31090). Kafka then returns the endpoint that the client should use from then on. It should be `12.345.67:31090`. However, you might see something like `kafka-0.kafka-headless.default:9092`, which is the internal access point of Kafka for resources inside Kubernetes. The Kafka client then tries to access `kafka-0.kafka-headless.default:9092`, and it fails because that endpoint is not reachable from outside Kubernetes.
We can use kafkacat to test this. In the output below, the broker returns the wrong (internal) address.
$ kafkacat -b 12.345.67:31090 -L
Metadata for all topics (from broker -1: 12.345.67:31090/bootstrap):
1 brokers:
broker 0 at kafka-0.kafka-headless.default:9092
2 topics:
topic "test2" with 1 partitions:
partition 0, leader 0, replicas: 0, isrs: 0
Root cause
Kafka advertises an endpoint based on the listener (port) on which the request arrived. To enable this, we configure advertised.listeners, listener.security.protocol.map, listeners, and inter.broker.listener.name.
"advertised.listeners": PLAINTEXT://kafka-0.kafka-headless.default:9092,EXTERNAL://12.345.67:31090
"listener.security.protocol.map": PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT
"listeners": PLAINTEXT://:9092,EXTERNAL://:31090
"inter.broker.listener.name": "PLAINTEXT"
If a request arrives on port 9092, Kafka advertises “PLAINTEXT://kafka-0.kafka-headless.default:9092” as the listener endpoint. If it arrives on port 31090, Kafka advertises “EXTERNAL://12.345.67:31090”.
It looks simple; however, it doesn’t work with a lot of Helm charts. The LoadBalancer scenario is either not implemented or has a bug. Let’s solve this.
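To make the mapping concrete, here is a small Python sketch of the lookup described above. It is only an illustration under my own assumptions, not Kafka's actual code: the helpers parse_listeners and advertised_endpoint_for_port are names made up for this post, and the values are the fake ones from the configuration above.

```python
# Illustrative sketch only: mimics how a broker picks the endpoint to
# advertise, based on the listener (port) a request arrived on.
# parse_listeners and advertised_endpoint_for_port are hypothetical helpers.

def parse_listeners(value):
    """Turn 'NAME://host:port,NAME2://host:port' into {NAME: (host, port)}."""
    result = {}
    for entry in value.split(","):
        name, address = entry.strip().split("://", 1)
        host, port = address.rsplit(":", 1)
        result[name] = (host, int(port))
    return result


def advertised_endpoint_for_port(listeners, advertised, port):
    """Return the advertised 'host:port' for the listener bound to `port`."""
    for name, (_, listener_port) in parse_listeners(listeners).items():
        if listener_port == port:
            host, adv_port = parse_listeners(advertised)[name]
            return f"{host}:{adv_port}"
    raise ValueError(f"no listener is bound to port {port}")


LISTENERS = "PLAINTEXT://:9092,EXTERNAL://:31090"
ADVERTISED = (
    "PLAINTEXT://kafka-0.kafka-headless.default:9092,"
    "EXTERNAL://12.345.67:31090"
)

internal = advertised_endpoint_for_port(LISTENERS, ADVERTISED, 9092)
external = advertised_endpoint_for_port(LISTENERS, ADVERTISED, 31090)
print(internal)
print(external)
```

The point of the sketch is that the answer depends only on which listener port the request reached, not on the address the client originally dialed.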
For more detail on the broker configuration, see the Kafka documentation listed under Resources.
Solution
Let me expand. Why does this happen even if I configure advertised.listeners and the other settings? I investigated the incubator/kafka Helm chart and found the root cause. I’ll explain what happens based on the configuration of that chart.
TargetPort of Service
Look at the picture above. When the Service (LoadBalancer) receives a request on port 31090, it forwards the request to the Pod. With the default TargetPort configuration, the request is dispatched to port 9092 of the container, so Kafka receives it on port 9092. From Kafka’s point of view: “Hey, a request is coming in on 9092. Let’s advertise the endpoint for that listener. It should be kafka-0.kafka-headless.default:9092”. The TargetPort should therefore be the same as the port of the Service (LoadBalancer), 31090. With the listener settings shown earlier, Kafka listens on both ports 9092 and 31090. We need to fix the original Helm chart to enable this. You can find my PR later.
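The effect of the TargetPort can be modelled in a few lines of Python. This is a toy model I wrote for illustration, not real Kubernetes or Kafka code; the dictionary and the function endpoint_seen_by_client are hypothetical names, and the endpoints are the fake ones used earlier.

```python
# Illustrative model of Service -> Pod port forwarding; not real Kubernetes
# or Kafka code. All names here are hypothetical.

# Endpoint the broker advertises for each container port it listens on.
ADVERTISED_BY_CONTAINER_PORT = {
    9092: "kafka-0.kafka-headless.default:9092",  # PLAINTEXT (internal)
    31090: "12.345.67:31090",                     # EXTERNAL
}


def endpoint_seen_by_client(service_port, target_port_map):
    """Follow a request from the Service port to the container port and
    return the endpoint the broker would advertise back to the client."""
    container_port = target_port_map[service_port]
    return ADVERTISED_BY_CONTAINER_PORT[container_port]


# Default behaviour described above: Service port 31090 forwards to 9092.
broken = endpoint_seen_by_client(31090, {31090: 9092})

# Fixed: TargetPort equals the Service port.
fixed = endpoint_seen_by_client(31090, {31090: 31090})

print(broken)
print(fixed)
```

With the default mapping the external client is handed the internal hostname; once the TargetPort matches the Service port, it gets the external address.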
Assign IP Address in advance
You need to assign a pre-reserved IP address to the Service (LoadBalancer). When you deploy Kafka with the Helm chart, we need to tell the broker that IP address or DNS name through the advertised.listeners setting. For Type=LoadBalancer, we can configure loadBalancerIP. The IP addresses in the example below are fake.
values.yaml
## External access.
##
external:
  type: LoadBalancer
  # annotations:
  #  service.beta.kubernetes.io/openstack-internal-load-balancer: "true"
  dns:
    useInternal: false
    useExternal: false
  # create an A record for each statefulset pod
  distinct: false
  enabled: true
  servicePort: 19092
  firstListenerPort: 31090
  domain: cluster.local
  loadBalancerIP:
    - 13.77.176.999
    - 52.247.212.999
    - 13.66.160.999
If you are using Azure with AKS, you need to create the IP addresses in the resource group that AKS manages. If you create an AKS cluster in “KafkaResource”, the target resource group will be something like “MC_KafkaResource_clustername_westus2”. AKS generates it automatically.
Pass the IP Address to the Broker side with the same Port number.
The last fix is to pass the IP address to the broker. At first, I thought I could configure something like this.
"advertised.listeners": EXTERNAL0://12.345.67:31090,EXTERNAL1://12.345.68:31091,EXTERNAL2://12.345.69:31092
"listener.security.protocol.map": PLAINTEXT:PLAINTEXT,EXTERNAL0:PLAINTEXT,EXTERNAL1:PLAINTEXT,EXTERNAL2:PLAINTEXT
"listeners": PLAINTEXT://:9092,EXTERNAL0://:31090,EXTERNAL1://:31091,EXTERNAL2://:31092
"inter.broker.listener.name": "PLAINTEXT"
However, this setting doesn’t work. The first broker pod works fine, but the second pod complains with something like “This is already configured”. It is also redundant. The Helm chart automatically increments the port number starting from external.firstListenerPort. This feature is good for NodePort; however, it does not make sense for LoadBalancer. Let’s change the Helm chart a little. I added two features: 1. Give the IP address to the Pod. 2. In the case of LoadBalancer, don’t increment the port number.
The configuration will be like this.
"advertised.listeners": EXTERNAL://${LOAD_BALANCER_IP}:31090
"listener.security.protocol.map": PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT
"listeners": PLAINTEXT://:9092,EXTERNAL://:31090
"inter.broker.listener.name": "PLAINTEXT"
You can see the whole configuration in the external access sample.
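The per-pod substitution of ${LOAD_BALANCER_IP} can be pictured with the Python sketch below. The chart itself does this with Helm templates and shell, so this is only an illustration. It assumes that the i-th entry of loadBalancerIP belongs to the pod with ordinal i, and the pod names kafka-0 to kafka-2 and the helper advertised_listeners_for_pod are hypothetical.

```python
# Sketch of the per-pod substitution; the real chart uses Helm templates
# and shell, so this Python is only an illustration.
# advertised_listeners_for_pod is a hypothetical helper.

LOAD_BALANCER_IPS = ["13.77.176.999", "52.247.212.999", "13.66.160.999"]
EXTERNAL_PORT = 31090  # the same port for every broker in the LoadBalancer case


def advertised_listeners_for_pod(pod_name, ips, port):
    """Pick the IP for a StatefulSet pod from its ordinal (kafka-0 -> 0)
    and build the EXTERNAL part of advertised.listeners."""
    ordinal = int(pod_name.rsplit("-", 1)[1])
    return f"EXTERNAL://{ips[ordinal]}:{port}"


settings = {
    f"kafka-{i}": advertised_listeners_for_pod(
        f"kafka-{i}", LOAD_BALANCER_IPS, EXTERNAL_PORT
    )
    for i in range(len(LOAD_BALANCER_IPS))
}
for pod, value in settings.items():
    print(pod, value)
```

Each broker ends up with a single EXTERNAL listener on its own IP address, and the port stays at 31090 for all of them.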
Deployment
You can deploy the chart with Helm like this. Copy the values.yaml from here and modify it.
$ helm install -f values.yaml -n mh-kafkas ./incubator/kafka
Make sure the advertised listener is correct
Check that the brokers advertise the correct listeners.
$ kafkacat -b 13.77.176.999:31090 -L
Metadata for all topics (from broker 0: 13.77.176.999:31090/0):
3 brokers:
broker 2 at 13.66.160.999:31090
broker 1 at 52.247.212.999:31090
broker 0 at 13.77.176.999:31090
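If you want to automate this check, a rough Python sketch could parse the kafkacat output and compare the advertised hosts with the IP addresses you reserved. The helpers parse_brokers and unexpected_brokers are hypothetical names for this post, and the sample text is simply the output shown above.

```python
import re

# Hypothetical helpers for checking `kafkacat -L` output; the sample text
# is the output shown above.

SAMPLE = """\
Metadata for all topics (from broker 0: 13.77.176.999:31090/0):
3 brokers:
broker 2 at 13.66.160.999:31090
broker 1 at 52.247.212.999:31090
broker 0 at 13.77.176.999:31090
"""

# Matches lines such as "broker 2 at 13.66.160.999:31090".
BROKER_LINE = re.compile(r"^\s*broker (\d+) at (\S+)")


def parse_brokers(metadata_text):
    """Return {broker_id: 'host:port'} from kafkacat metadata output."""
    brokers = {}
    for line in metadata_text.splitlines():
        match = BROKER_LINE.match(line)
        if match:
            brokers[int(match.group(1))] = match.group(2)
    return brokers


def unexpected_brokers(brokers, expected_hosts):
    """List broker ids whose advertised host is not one we reserved."""
    return sorted(
        broker_id
        for broker_id, endpoint in brokers.items()
        if endpoint.rsplit(":", 1)[0] not in expected_hosts
    )


brokers = parse_brokers(SAMPLE)
bad = unexpected_brokers(
    brokers, {"13.77.176.999", "52.247.212.999", "13.66.160.999"}
)
print(brokers)
print(bad)
```

An empty list means every broker advertises one of the reserved addresses; an internal name such as kafka-0.kafka-headless.default would show up in the list.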
Create a topic
$ kubectl -n default exec testclient -- /usr/bin/kafka-topics --zookeeper mh-kafkas-zookeeper:2181 --topic test4 --create --partitions 3 --replication-factor 1
Talk with Consumer/Producer
Console 1
$ kafkacat -b 13.77.176.999:31090 -C -t test4
Console 2
$ kafkacat -b 13.77.176.999:31090 -P -t test4
If you type something in Console 2, you will see it appear in Console 1. :)
Pull Request
This is my PR for the incubator/kafka repo. I hope this PR gets merged. :) If not, you can apply the same fix as in my repository. The changes are just a couple of lines.
Update, 22nd April 2019: my PR was merged. :)
Conclusion
Enjoy Kafka with Helm on Kubernetes! (and hopefully on Azure. lol)
Resources
There are a lot of great resources out there. You can find a very good example with a YAML file. You can see an example of how to configure Kafka on Azure with external access.
The best blog for understanding the behavior of the Kafka Listeners.
Another solution with domain name
Reference of Kafkacat
Spec request
Helm
Kafka documentation
Metadata logic on Kafka
Whole values.yaml sample