Encryption, authentication and external access for Confluent Kafka on Kubernetes

Ryan Morris · Cognizant Servian · May 14, 2019


Confluent provides a Helm chart that makes installing their Kafka platform on a Kubernetes cluster super easy. This guide is aimed at those who have used that Helm chart to create a Kafka installation, or have otherwise rolled their own Kubernetes installation using the Kafka Docker images, and wish to expose it outside the cluster with SSL encryption and authentication.

Out of the box, the Helm chart doesn't support SSL for encryption and authentication, or exposing the platform for access from outside the Kubernetes cluster. To meet these requirements, a few modifications to the installation are needed. In summary, they are:

  • Generate some private keys/certificates for brokers and clients
  • Create Kubernetes Secrets to provide them within your cluster
  • Update the broker StatefulSet with your Secrets and SSL configuration
  • Expose each broker pod via an external service

This article will run you through a solution for creating a secured, externally accessible Kafka cluster on Kubernetes.

Create some keys and certs

Kafka uses two-way (mutual) SSL for its authentication, which means you'll need to generate keys and certificates for each broker and client of the Kafka cluster. Kafka, which is written mostly in Java, uses the Java KeyStore (JKS) format for its key/certificate management. You'll need to create both a keystore (holding its own private key and certificate) and a truststore (holding the certificates of other clients/brokers, or the CA that signed them) for each client and broker.

Generating these for Kafka is fairly well documented already and covered in Confluent's security docs.
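As a rough sketch of that flow for a broker, using a self-signed CA (aliases, file names and validity below are placeholders; keytool and openssl will prompt you for passwords and distinguished names):

# Create a CA to sign all broker/client certificates
openssl req -new -x509 -keyout ca-key -out ca-cert -days 365
# Generate the broker's key pair in a keystore, then get its certificate signed by the CA
keytool -keystore kafka.broker.keystore.jks -alias broker -genkey -keyalg RSA -validity 365
keytool -keystore kafka.broker.keystore.jks -alias broker -certreq -file cert-req
openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-req -out cert-signed -days 365 -CAcreateserial
# Import the CA cert and then the signed cert back into the keystore (CA first)
keytool -keystore kafka.broker.keystore.jks -alias CARoot -import -file ca-cert
keytool -keystore kafka.broker.keystore.jks -alias broker -import -file cert-signed
# The truststore only needs the CA certificate
keytool -keystore kafka.broker.truststore.jks -alias CARoot -import -file ca-cert

Repeat the keystore steps for each client, signing every certificate with the same CA.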

Access them using Kubernetes Secrets

Once the broker's keystore and truststore have been generated, they can be provided to the brokers via Kubernetes Secrets.

We're going to use Kubernetes Secrets to make these files accessible to the broker pods. Basically, the Secret will hold base64 encoded strings representing your keystore, truststore and passwords, which you'll mount as files into a volume in your StatefulSet definition, i.e. the broker pods.

So first, get the base64 encoded value for your broker’s keystore and truststore:

cat kafka.broker.keystore.jks | base64
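The password entries in the Secret are base64 encoded the same way. Note the -n below: a trailing newline encoded into the value becomes part of the password file Kafka reads, which can break authentication later (the password itself is a placeholder):

echo -n "<KEYSTORE_PW>" | base64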

Create a secrets.yaml file as below with the encoded files:

apiVersion: v1
kind: Secret
metadata:
  name: ssl-config
type: Opaque
data:
  kafka.broker.truststore.jks: MIIDxgIBAzCCA38GCSqGSIb3DQEHAaCCA3AEggNsMIIDaDCCA2QGCSqGSIb3DQEHBqCCA1UwggNRAgEAMIIDSgYJKoZIhvcNAQcBMCkGCiqGSIb3DQEMAQYwGwQUy0PLhIuIqPBeAr8IbK5A5DJCfgECAwDDUICCAxCyWyVw+cA1hGo9CdWHJJAHVjudmN3GmtXOv7IqhxM2AOV5jmbf+g6+Y/S66PpmG68q5Jn+bhzeIqgig4He+uPou2TJ
  kafka.broker.keystore.jks: MIIPUQIBAzCCDwoGCSqGSIb3DQEHAaCCDvsEgg73MIIO8zCCAw8GCSqGSIb3DQEHAaCCAwAEggL8MIIC+DCCAvQGCyqGSIb3DQEMCgECoIICmzCCApcwKQYKKoZIhvcNAQwBAzAbBBSYLzcNVIWY5U
  truststore-creds: <<<BASE64_TRUSTSTORE_PW>>>
  keystore-creds: <<<BASE64_KEYSTORE_PW>>>
  key-creds: <<<BASE64_KEY_PW>>>

Apply the secrets.yaml to your cluster. Given that you'll want to keep your key passwords even more secret, a good idea is to inject them at deploy time with your CI/CD process.
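Applying it is a one-liner:

kubectl apply -f secrets.yaml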

Update the broker StatefulSet with the secrets and SSL configuration

Mount a secrets volume to make them available to the brokers in the broker StatefulSet:

volumeMounts:
- mountPath: /etc/kafka/secrets
  name: secrets-vol
...
volumes:
- name: secrets-vol
  secret:
    secretName: ssl-config

Now that the keystore and truststore are available to your broker pods, we can configure them to enable SSL encryption and authentication. Add or modify the following environment variables in the existing Kafka broker StatefulSet created by the Helm template to enable SSL:

- name: KAFKA_SSL_KEYSTORE_FILENAME
  value: kafka.broker.keystore.jks
- name: KAFKA_SSL_KEYSTORE_CREDENTIALS
  value: keystore-creds
- name: KAFKA_SSL_TRUSTSTORE_FILENAME
  value: kafka.broker.truststore.jks
- name: KAFKA_SSL_TRUSTSTORE_CREDENTIALS
  value: truststore-creds
- name: KAFKA_SSL_KEY_CREDENTIALS
  value: key-creds
- name: KAFKA_SECURITY_INTER_BROKER_LISTENER_NAME
  value: SSL
- name: KAFKA_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM
  value: ' '
- name: KAFKA_SSL_CLIENT_AUTH
  value: required
- name: KAFKA_AUTHORIZER_CLASS_NAME
  value: kafka.security.auth.SimpleAclAuthorizer
- name: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP
  value: SSL:SSL,PLAINTEXT:PLAINTEXT,EXTERNAL:SSL
- name: KAFKA_ALLOW_EVERYONE_IF_NO_ACL_FOUND
  value: "true"

The Kafka Docker image appears to be hardcoded to look for the keystore and credentials files under /etc/kafka/secrets, so there's no need to specify the mount path in these variables.
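If you want to double-check the mount, list that directory inside a broker pod (pod and container names here assume a hypothetical funny-porcupine Helm release; adjust to your chart's naming):

kubectl exec funny-porcupine-cp-kafka-0 -c cp-kafka-broker -- ls /etc/kafka/secrets

You should see the keystore, truststore and the three *-creds files from the Secret.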

Modify the ADVERTISED_LISTENERS environment variable to specify SSL as the protocol for the listeners:

export KAFKA_ADVERTISED_LISTENERS=SSL://${POD_IP}:9092,EXTERNAL://${HOST_IP}:$((31090 + ${KAFKA_BROKER_ID})) && \

This creates two listeners; one for inter-broker communication and the other for external clients. The difference between the two is the advertised address, which is the address clients are directed to when they look up a broker. Now kubectl apply the statefulset.yaml updates and check the container logs. At this point inter-broker communication should be happening successfully over SSL.
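A quick way to eyeball this (again assuming the hypothetical funny-porcupine release and container naming):

kubectl logs funny-porcupine-cp-kafka-0 -c cp-kafka-broker | grep -i ssl

A healthy broker logs its registered SSL endpoints without repeated handshake failures.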

Generate secure topics using Kubernetes Jobs

Two short Jobs handle the setup: the first creates the topic, the second adds a producer ACL for the client certificate's principal:

apiVersion: batch/v1
kind: Job
metadata:
  name: kafka-topic-jobs
spec:
  template:
    metadata:
      name: topic-jobs
    spec:
      containers:
      - name: create-topic
        image: confluentinc/cp-kafka
        command: ["kafka-topics", "--zookeeper", "funny-porcupine-cp-zookeeper:2181", "--create", "--topic", "secured-topic", "--partitions", "1", "--replication-factor", "1", "--if-not-exists"]
      restartPolicy: Never
---
apiVersion: batch/v1
kind: Job
metadata:
  name: kafka-acl-jobs
spec:
  template:
    metadata:
      name: acl-jobs
    spec:
      containers:
      - name: create-acl
        image: confluentinc/cp-kafka
        command: ["kafka-acls", "--topic", "secured-topic", "--producer", "--authorizer-properties", "zookeeper.connect=funny-porcupine-cp-zookeeper:2181", "--add", "--allow-principal", "User:CN=Dataflow,OU=1,O=1,L=1,ST=1,C=1"]
      restartPolicy: Never

This creates a "secured-topic" topic that is only writable by a client holding the private key for the certificate CN=Dataflow,OU=1,O=1,L=1,ST=1,C=1, signed by the root CA in the broker's truststore. The principal there should be changed to match the client certificate you produced earlier.
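Once the Jobs have been applied, you can confirm the ACL took effect with kafka-acls --list (the file name and release prefix here are hypothetical):

kubectl apply -f topic-jobs.yaml
kubectl run acl-check --rm -it --image confluentinc/cp-kafka --restart=Never -- kafka-acls --list --topic secured-topic --authorizer-properties zookeeper.connect=funny-porcupine-cp-zookeeper:2181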

Access from outside the Kubernetes cluster

The default Helm chart installation gives you three Kafka brokers, each only accessible from clients within the Kubernetes cluster. Allowing access from outside the cluster is tricky, particularly with a multi-broker setup, but possible. The first step is to create an external service for every broker pod in your cluster.

Why can’t I just have one external load balancer, balancing the load to all the broker pods, doing its load balancing thing?

Each client requires the ability to connect to a specific broker, depending on which broker is the leader of the partition being written to. This means that each broker, which is a pod in this setup, needs to be externally addressable by the client. One way of achieving this is to create a LoadBalancer service for each broker, and have that service's selector route to its broker pod.

Repeat the below for each broker, incrementing the service name each time, to create a LoadBalancer with a public IP (you can do it like this on GKE):

kubectl expose service funny-porcupine-cp-kafka --type=LoadBalancer --name=kafka-lb-0 --port 31090 --target-port 31090

Then modify the LoadBalancer services to point to a particular broker pod using the pod name label:

selector:
  release: funny-porcupine
  statefulset.kubernetes.io/pod-name: funny-porcupine-cp-kafka-0
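Put together, each per-broker service looks something like this (a sketch, reusing the hypothetical funny-porcupine release and the kafka-lb-0 name from above):

apiVersion: v1
kind: Service
metadata:
  name: kafka-lb-0
spec:
  type: LoadBalancer
  ports:
  - port: 31090
    targetPort: 31090
  selector:
    release: funny-porcupine
    statefulset.kubernetes.io/pod-name: funny-porcupine-cp-kafka-0

One thing to watch: with the listener definition above, broker N binds its EXTERNAL listener on 31090 + N, so the targetPort for brokers 1 and 2 may need to be 31091 and 31092 respectively.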

Now we need a way to set each broker's ADVERTISED_LISTENERS variable in the broker StatefulSet to its unique external LoadBalancer address, which will be different for each broker pod. A relatively simple way to do this is to have a DNS sub-domain for each broker that resolves to its corresponding LoadBalancer IP address; you can then use the KAFKA_BROKER_ID to form each address, like below:

export KAFKA_ADVERTISED_LISTENERS=SSL://${POD_IP}:9092,EXTERNAL://broker${KAFKA_BROKER_ID}.kafka.ryancm.net:31090 && \

So with the default three broker install, your subdomain mapping would be something like:

broker0.kafka.ryancm.net:31090 ---> EXT_ADDRESS_1_IP
broker1.kafka.ryancm.net:31090 ---> EXT_ADDRESS_2_IP
broker2.kafka.ryancm.net:31090 ---> EXT_ADDRESS_3_IP

With those changes applied, external access to each broker is now possible.
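To sanity-check external access end to end, you can run a console producer from outside the cluster against one of the advertised addresses, using the client keystore and truststore generated earlier (a sketch; file paths and passwords are placeholders):

cat > client.properties <<EOF
security.protocol=SSL
ssl.truststore.location=/path/to/kafka.client.truststore.jks
ssl.truststore.password=<TRUSTSTORE_PW>
ssl.keystore.location=/path/to/kafka.client.keystore.jks
ssl.keystore.password=<KEYSTORE_PW>
ssl.key.password=<KEY_PW>
EOF

kafka-console-producer --broker-list broker0.kafka.ryancm.net:31090 --topic secured-topic --producer.config client.properties

If both the SSL handshake and the ACL check pass, anything you type will land on the secured topic.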

Testing it out with Google Cloud Dataflow

In case you're using Google Cloud Dataflow, the configuration is essentially the client setup from above applied to your pipeline: supply the same SSL keystore and truststore properties to the Kafka consumer (e.g. via KafkaIO's consumer properties), point it at the external address, and make the JKS files themselves accessible to your Dataflow job.

In summary, getting the Helm chart install of the Confluent Kafka platform to support SSL, while not configured out of the box, is simple enough to set up. And if you want it accessible from outside your cluster, you'll almost certainly want that security in place.
