
Running Kafka on kubernetes for local development

In this post I will cover the steps to run Kafka locally on your development machine using Kubernetes. I usually go with docker-compose for that, as it is simpler to get going, but given how omnipresent Kubernetes is in companies these days, I decided to also port my local Confluent Kafka setup to Kubernetes. It is good practice and gets me closer to production environments, where my applications will most likely also run on Kubernetes.

This is a minimal local Kafka setup meant for development: a single instance each of Kafka, Schema Registry and Zookeeper. Running Kafka locally is handy for quick tests and prototyping, and it saves the cost of running a cloud cluster just for functional development. If you also package and run your application on Kubernetes, you end up with a very efficient setup that closely mirrors the cloud environments where your app, and the Kubernetes cluster backing those environments, will probably run.

Strimzi is an awesome, simpler alternative to achieve the same result, so check it out. My goal with this setup is to learn and to have a more "realistic" Kubernetes setup on my local development machine, so I opted not to use Strimzi or Helm charts.

In the past couple of days I came up with two local setups running Kafka, Schema Registry and Zookeeper on a local development machine with Kubernetes using Kind. In this first post I will cover a setup using Persistent Volumes and Persistent Volume Claims, and in the next one I will cover using Storage Classes.

I created and tested these approaches on a Linux development machine. They should also work on Mac and Windows, but I have not tried that.

You can get the full source from the GitHub repo, where you will find the files and a Quick Start for both aforementioned approaches. To clone the repo: git clone git@github.com:mmaia/kafka-local-kubernetes.git.

Well, I guess this is more than enough introduction, so let's have some fun.

The setup using Persistent Volumes and Persistent Volume Claims

If you checked out the repo described above, the setup presented here is under the pv-pvc-setup folder. You will find multiple Kubernetes declarative files in this folder. Notice that you could also combine all files into a single one, separating them with a line containing triple dashes (---). If combining them is your preference, open a terminal and run the following from the pv-pvc-setup folder:

for each in ./kafka-k8s/*; do cat $each; echo "---"; done > local-kafka-combined.yaml

This concatenates all files into a single one called local-kafka-combined.yaml.

I keep them separate to make each object type explicit, and because it's convenient: you can just point kubectl at the directory, as described in the "Running it" section below.

kind-config.yaml - This file configures Kind to expose the Kafka and Schema Registry ports on the local host machine, so that while developing you can connect to Kafka running on Kubernetes from your IDE or the command line.

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - role: control-plane
  - role: worker
    extraPortMappings:
      - containerPort: 30092 # internal kafka nodeport
        hostPort: 9092 # port exposed on "host" machine for kafka
      - containerPort: 30081 # internal schema-registry nodeport
        hostPort: 8081 # port exposed on "host" machine for schema-registry
    extraMounts:
      - hostPath: ./tmp/kafka-data
        containerPath: /var/lib/kafka/data
        readOnly: false
        selinuxRelabel: false
        propagation: Bidirectional
      - hostPath: ./tmp/zookeeper-data/data
        containerPath: /var/lib/zookeeper/data
        readOnly: false
        selinuxRelabel: false
        propagation: Bidirectional
      - hostPath: ./tmp/zookeeper-data/log
        containerPath: /var/lib/zookeeper/log
        readOnly: false
        selinuxRelabel: false
        propagation: Bidirectional

Notice the mapping from the internal container paths to the external hostPath entries on the local machine. The local paths need to be created manually before running the setup, as per the instructions in the "Running it" section below.
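
For reference, from the pv-pvc-setup folder (the same level as kind-config.yaml), creating the three folders the extraMounts point at boils down to something like:

# run as the same user that will start the Kind cluster
mkdir -p ./tmp/kafka-data ./tmp/zookeeper-data/data ./tmp/zookeeper-data/log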

This is it for the Kind configuration. Now let's check the Kubernetes files (under kafka-k8s if you checked out the project):

kafka-deployment.yaml - Configures the Kafka broker, exposes an internal port (for the Kubernetes network) and an external port (for Kafka clients), and mounts a volume for the Kafka data files.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    service: kafka
  name: kafka
spec:
  replicas: 1
  selector:
    matchLabels:
      service: kafka
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        network/kafka-network: "true"
        service: kafka
    spec:
      enableServiceLinks: false
      containers:
        - name: kafka
          imagePullPolicy: IfNotPresent
          image: confluentinc/cp-kafka:7.0.1
          ports:
            - containerPort: 29092
            - containerPort: 9092
          env:
            - name: CONFLUENT_SUPPORT_CUSTOMER_ID
              value: "anonymous"
            - name: KAFKA_ADVERTISED_LISTENERS
              value: "INTERNAL://kafka:29092,LISTENER_EXTERNAL://kafka:9092"
            - name: KAFKA_AUTO_CREATE_TOPICS_ENABLE
              value: "true"
            - name: KAFKA_BROKER_ID
              value: "1"
            - name: KAFKA_DEFAULT_REPLICATION_FACTOR
              value: "1"
            - name: KAFKA_INTER_BROKER_LISTENER_NAME
              value: "INTERNAL"
            - name: KAFKA_LISTENERS
              value: "INTERNAL://:29092,LISTENER_EXTERNAL://:9092"
            - name: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP
              value: "INTERNAL:PLAINTEXT,LISTENER_EXTERNAL:PLAINTEXT"
            - name: KAFKA_NUM_PARTITIONS
              value: "1"
            - name: KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR
              value: "1"
            - name: KAFKA_LOG_CLEANUP_POLICY
              value: "compact"
            - name: KAFKA_ZOOKEEPER_CONNECT
              value: "zookeeper:2181"
          resources: {}
          volumeMounts:
            - mountPath: /var/lib/kafka/data
              name: kafka-data
      hostname: kafka
      restartPolicy: Always
      volumes:
        - name: kafka-data
          persistentVolumeClaim:
            claimName: kafka-pvc
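
Once the manifests are applied (see "Running it" below), you can check that the broker pod came up and inspect its startup logs with standard kubectl commands, for example:

kubectl get pods -l service=kafka          # the kafka pod should reach Running
kubectl logs deployment/kafka --tail=50    # broker startup logs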

kafka-network-np.yaml - Sets up the network policy for the internal Kubernetes network shared by the Kafka, Zookeeper and Schema Registry pods (the ones labeled network/kafka-network: "true").

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: kafka-network
spec:
  ingress:
    - from:
        - podSelector:
            matchLabels:
              network/kafka-network: "true"
  podSelector:
    matchLabels:
      network/kafka-network: "true"
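
If you want to confirm the policy was created after applying the manifests, a quick check is:

kubectl describe networkpolicy kafka-network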

kafka-pv.yaml - This file describes the Persistent Volume used for the Kafka data.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: kafka-pv
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: kafka-local-storage
  capacity:
    storage: 5Gi
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /var/lib/kafka/data

kafka-pvc.yaml - This file is the claim used by the Kafka pod, as referenced in the deployment file above, and it binds to the Persistent Volume.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kafka-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: kafka-local-storage
  resources:
    requests:
      storage: 5Gi
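
After applying the manifests you can verify that each claim is bound to its volume; the Kafka and Zookeeper PVs and PVCs should all show a STATUS of Bound:

kubectl get pv,pvc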

kafka-service.yaml - This file defines the Service that maps the container ports to ports exposed on the cluster nodes (a NodePort Service in Kubernetes).

apiVersion: v1
kind: Service
metadata:
  labels:
    service: kafka
  name: kafka
spec:
  selector:
    service: kafka
  ports:
    - name: internal
      port: 29092
      targetPort: 29092
    - name: external
      port: 30092
      targetPort: 9092
      nodePort: 30092
  type: NodePort
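
Because the kind config maps nodePort 30092 to port 9092 on the host, and the external listener is advertised as kafka:9092, you may need an /etc/hosts entry mapping kafka to 127.0.0.1 to reach the broker from your IDE or terminal. With that in place, and assuming you have the Kafka command-line tools installed on your host (this is just a hedged smoke test, not part of the repo; "smoke-test" is an example topic name and auto topic creation is enabled in the deployment), something like this should work:

kafka-console-producer --bootstrap-server kafka:9092 --topic smoke-test
kafka-console-consumer --bootstrap-server kafka:9092 --topic smoke-test --from-beginning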

The remaining files are the declarative Kubernetes configuration files for Schema Registry and Zookeeper.

schema-registry-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    service: schema-registry
  name: schema-registry
spec:
  replicas: 1
  selector:
    matchLabels:
      service: schema-registry
  strategy: {}
  template:
    metadata:
      labels:
        network/kafka-network: "true"
        service: schema-registry
    spec:
      enableServiceLinks: false
      containers:
        - env:
            - name: SCHEMA_REGISTRY_HOST_NAME
              value: "schema-registry"
            - name: SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS
              value: "kafka:29092"
            - name: SCHEMA_REGISTRY_LISTENERS
              value: "http://0.0.0.0:30081"
          image: confluentinc/cp-schema-registry:7.0.1
          name: schema-registry
          ports:
            - containerPort: 30081
          resources: {}
      hostname: schema-registry
      restartPolicy: Always

schema-registry-service.yaml

apiVersion: v1
kind: Service
metadata:
  labels:
    service: schema-registry
  name: schema-registry
spec:
  ports:
    - port: 30081
      name: outport
      targetPort: 30081
      nodePort: 30081
  type: NodePort
  selector:
    service: schema-registry
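
Schema Registry ends up reachable on port 8081 of the host (nodePort 30081 is mapped by the kind config), so once everything is running you can hit its REST API from the host, for example:

curl http://localhost:8081/subjects
# returns [] on a fresh setup, since no schemas have been registered yet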

zookeeper-data-pv.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: zookeeper-data-pv
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: zookeeper-data-local-storage
  capacity:
    storage: 5Gi
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /var/lib/zookeeper/data

zookeeper-data-pvc.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: zookeeper-data-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: zookeeper-data-local-storage
  resources:
    requests:
      storage: 5Gi

zookeeper-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    service: zookeeper
  name: zookeeper
spec:
  replicas: 1
  selector:
    matchLabels:
      service: zookeeper
  strategy: {}
  template:
    metadata:
      labels:
        network/kafka-network: "true"
        service: zookeeper
    spec:
      containers:
        - env:
            - name: TZ
            - name: ZOOKEEPER_CLIENT_PORT
              value: "2181"
            - name: ZOOKEEPER_DATA_DIR
              value: "/var/lib/zookeeper/data"
            - name: ZOOKEEPER_LOG_DIR
              value: "/var/lib/zookeeper/log"
            - name: ZOOKEEPER_SERVER_ID
              value: "1"
          image: confluentinc/cp-zookeeper:7.0.1
          name: zookeeper
          ports:
            - containerPort: 2181
          resources: {}
          volumeMounts:
            - mountPath: /var/lib/zookeeper/data
              name: zookeeper-data
            - mountPath: /var/lib/zookeeper/log
              name: zookeeper-log
      hostname: zookeeper
      restartPolicy: Always
      volumes:
        - name: zookeeper-data
          persistentVolumeClaim:
            claimName: zookeeper-data-pvc
        - name: zookeeper-log
          persistentVolumeClaim:
            claimName: zookeeper-log-pvc

zookeeper-log-pv.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: zookeeper-log-pv
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: zookeeper-log-local-storage
  capacity:
    storage: 5Gi
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /var/lib/zookeeper/log

zookeeper-log-pvc.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: zookeeper-log-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: zookeeper-log-local-storage
  resources:
    requests:
      storage: 5Gi

zookeeper-service.yaml

apiVersion: v1
kind: Service
metadata:
  labels:
    service: zookeeper
  name: zookeeper
spec:
  ports:
    - name: "2181"
      port: 2181
      targetPort: 2181
  selector:
    service: zookeeper
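
To confirm that the broker registered itself in Zookeeper once everything is up, one option is to run zookeeper-shell from inside the Kafka pod (assuming, as I believe, that the cp-kafka image ships the Kafka CLI scripts):

kubectl exec deployment/kafka -- zookeeper-shell zookeeper:2181 ls /brokers/ids
# should list [1], the KAFKA_BROKER_ID configured above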

Running it

  1. After cloning the project, open a terminal and cd into the pv-pvc-setup folder. Or, if you're creating the files yourself, navigate to the folder where kind-config.yaml is located.

  2. Create the folders on your local host machine so the persistent volumes are persisted to the file system and you
    can restart the Kafka and Zookeeper pods without losing topic data. Restarting the Kind cluster itself, however,
    will delete the contents of the persistent volumes; this is by design how Kind works with a
    propagation: Bidirectional configuration. Create the folders with the same uid and gid as the user running the
    Kind cluster, otherwise the data will not be properly persisted. That is, make sure to create tmp/kafka-data,
    tmp/zookeeper-data/data and tmp/zookeeper-data/log at the same level as the kind-config.yaml file you're running
    Kind with (the mkdir example in the kind-config.yaml section above covers this), or it won't work as expected.

  3. Start Kind with the configuration file: kind create cluster --config=kind-config.yaml. This will start a Kubernetes
    control plane + worker.

  4. Apply the Kubernetes configuration for Kafka: kubectl apply -f kafka-k8s

  5. When done, delete the Kubernetes objects with kubectl delete -f kafka-k8s. If you also want to stop the Kind cluster,
    which will also delete the storage on the host machine, run: kind delete cluster

You can check the Kind Docker containers running the Kubernetes control plane and worker with docker ps.
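
Putting the whole flow together, a minimal end-to-end session from the pv-pvc-setup folder looks roughly like this (the same commands as the steps above, just in one place):

mkdir -p ./tmp/kafka-data ./tmp/zookeeper-data/data ./tmp/zookeeper-data/log
kind create cluster --config=kind-config.yaml
kubectl apply -f kafka-k8s
kubectl get pods                 # wait until kafka, schema-registry and zookeeper are Running
# ... develop and test against kafka:9092 / localhost:8081 ...
kubectl delete -f kafka-k8s
kind delete cluster              # also wipes the persistent storage on the host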

That's all for now folks.

Stay tuned, as I will post about a simpler approach using the default Storage Class automatically provisioned by Kind (Rancher's local-path-provisioner). It simplifies the setup considerably, with the trade-off of having less control over where the Kafka and Zookeeper files are stored on the host machine.

Photo by Fotis Fotopoulos on Unsplash


Original Link: https://dev.to/thegroo/running-kafka-on-kubernetes-for-local-development-2a54
