Building a Highly Available Kubernetes Event Collection Pipeline with Fluent Bit


Kubernetes events are one of the most valuable yet frequently overlooked data sources in a cluster. They provide insights into scheduling failures, pod restarts, node issues, and configuration problems. However, Kubernetes events are ephemeral and can disappear quickly if not centrally collected.

This post describes a production-ready, highly available approach to collecting Kubernetes cluster events and forwarding them to Amazon CloudWatch Logs using Fluent Bit and Kubernetes leader election.


Why Centralize Kubernetes Events?

By default, Kubernetes events are stored in etcd and kept for only one hour (the kube-apiserver --event-ttl default). Once they expire, critical troubleshooting information is permanently lost.

Centralizing events in Amazon CloudWatch Logs provides:

  • Long-term retention

  • Search and filtering capabilities

  • Audit and compliance visibility

  • Correlation with metrics and application logs

Solution Overview

This solution uses a leader-elected event collector that streams Kubernetes events using kubectl. Only one pod acts as the leader at any given time, ensuring that duplicate events are not generated.

Fluent Bit tails the event log file and forwards events to CloudWatch Logs. Multiple replicas ensure high availability and automatic failover.


Architecture Overview

  1. Multiple replicas of the event collector are deployed

  2. Kubernetes Lease API is used for leader election

  3. Only the leader pod streams events

  4. Events are written to a log file

  5. Fluent Bit ships events to Amazon CloudWatch Logs
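The election steps above can be sketched with kubectl alone. This is an illustration under stated assumptions, not the collector's actual implementation: the helper names, the 15-second lease duration, and the use of the pod hostname as the holder identity are all hypothetical, and parsing renewTime relies on GNU date. The takeover uses a JSON Patch "test" operation so that only one candidate can replace an expired holder.

```shell
#!/bin/sh
# Sketch of Lease-based leader election driven by kubectl.
# LEASE_SECONDS, IDENTITY and all function names are illustrative assumptions.
NAMESPACE="amazon-cloudwatch"
LEASE_NAME="fluent-bit-events-leader"
LEASE_SECONDS=15
IDENTITY="${HOSTNAME:-$(hostname)}"   # pod name when run inside Kubernetes

# Exit 0 when the lease was renewed more than $3 seconds before $1.
# $1 = now (epoch seconds), $2 = last renewal (epoch seconds), $3 = duration.
lease_expired() {
  [ $(( $1 - $2 )) -gt "$3" ]
}

# Current UTC time in the MicroTime format used by Lease.spec.renewTime.
micro_now() { date -u +%Y-%m-%dT%H:%M:%S.000000Z; }

# JSON Patch that sets holder $2 only if the holder is still $1.
# The "test" op makes the whole patch fail if another pod won first.
takeover_patch() {
  printf '[{"op":"test","path":"/spec/holderIdentity","value":"%s"},' "$1"
  printf '{"op":"replace","path":"/spec/holderIdentity","value":"%s"},' "$2"
  printf '{"op":"add","path":"/spec/renewTime","value":"%s"}]' "$3"
}

# Read one field of the lease spec; prints nothing if the lease is missing.
lease_field() {
  kubectl get lease "$LEASE_NAME" -n "$NAMESPACE" \
    -o "jsonpath={.spec.$1}" 2>/dev/null
}

# Create the lease with this pod as holder; fails if it already exists.
create_lease() {
  kubectl create -f - >/dev/null 2>&1 <<EOF
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: $LEASE_NAME
  namespace: $NAMESPACE
spec:
  holderIdentity: $IDENTITY
  leaseDurationSeconds: $LEASE_SECONDS
  renewTime: "$(micro_now)"
EOF
}

# One election round: exit 0 if this pod holds the lease afterwards.
try_lead() {
  holder="$(lease_field holderIdentity)"
  if [ -z "$holder" ]; then
    create_lease
    return $?
  fi
  renewed="$(date -u -d "$(lease_field renewTime)" +%s)"   # GNU date assumed
  if [ "$holder" != "$IDENTITY" ] &&
     ! lease_expired "$(date +%s)" "$renewed" "$LEASE_SECONDS"; then
    return 1   # another pod holds a live lease
  fi
  # Renew our own lease, or take over an expired one.
  kubectl patch lease "$LEASE_NAME" -n "$NAMESPACE" --type=json \
    -p "$(takeover_patch "$holder" "$IDENTITY" "$(micro_now)")" >/dev/null
}

# Stream events while leading; stop the stream when leadership is lost.
run() {
  pid=""
  while true; do
    if try_lead; then
      if [ -z "$pid" ] || ! kill -0 "$pid" 2>/dev/null; then
        kubectl get events --all-namespaces --watch-only -o json \
          >> /var/log/events.log &
        pid=$!
      fi
    elif [ -n "$pid" ]; then
      kill "$pid" 2>/dev/null
      pid=""
    fi
    sleep $((LEASE_SECONDS / 3))
  done
}
```

Calling run as the collector container's entry point would renew the lease every five seconds and restart the event stream if kubectl exits. A purpose-built client such as client-go's leader election package handles clock skew and conflicts more rigorously than this sketch.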


Scope and Environment

  • Cluster Name: test

  • Region: ap-south-1

  • Namespace: amazon-cloudwatch

  • Log Group: /aws/eks/test/cluster-events

Required IAM Permissions

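For Fluent Bit to push Kubernetes events to Amazon CloudWatch Logs, the IAM role attached to the EKS node group must allow the following CloudWatch Logs actions:

  • Create log groups (logs:CreateLogGroup)

  • Create log streams (logs:CreateLogStream)

  • Put log events (logs:PutLogEvents)

  • Describe log groups and streams (logs:DescribeLogGroups, logs:DescribeLogStreams)

  • Set log group retention (logs:PutRetentionPolicy), needed because the Fluent Bit output sets log_retention_days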
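The permissions above could be granted with an inline IAM policy along the following lines. This is a minimal sketch: the role name and policy name are hypothetical placeholders, and Resource is left as "*" for brevity, so narrowing it to the log group ARN would be closer to least privilege. logs:PutRetentionPolicy is included on the assumption that the Fluent Bit output sets log_retention_days.

```shell
#!/bin/sh
# Sketch: attach CloudWatch Logs permissions to the node group role.
# ROLE_NAME and POLICY_NAME are hypothetical placeholders.
ROLE_NAME="${ROLE_NAME:-my-eks-nodegroup-role}"
POLICY_NAME="${POLICY_NAME:-fluent-bit-events-logs}"

# Print the policy document as JSON.
logs_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams",
        "logs:PutRetentionPolicy"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

# Attach the document to the role as an inline policy.
attach_logs_policy() {
  aws iam put-role-policy \
    --role-name "$ROLE_NAME" \
    --policy-name "$POLICY_NAME" \
    --policy-document "$(logs_policy)"
}
```

Running attach_logs_policy with ROLE_NAME set to the actual node group role applies the policy; logs_policy on its own prints the document for review.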

Core Components

  • Service Account with least privilege access

  • ClusterRole and ClusterRoleBinding

  • ConfigMaps for Fluent Bit and event collector

  • Deployment with multiple replicas

  • Pod Disruption Budget for availability


Deployment Procedure

  1. Create the YAML file using the naming convention:

    fluent-bit-events-ha-<cluster-name>.yaml
                                    
  2. Validate the YAML:

    kubectl apply --dry-run=client -f fluent-bit-events-ha-test.yaml
                                    
  3. Deploy to the cluster:

    kubectl apply -f fluent-bit-events-ha-test.yaml
                                    
  4. Validate the leader lease:

    kubectl get lease fluent-bit-events-leader -n amazon-cloudwatch
                                    


Complete Deployment YAML


apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit-events-sa
  namespace: amazon-cloudwatch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-events-cluster-role
rules:
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["namespaces", "pods"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-events-cluster-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-events-cluster-role
subjects:
  - kind: ServiceAccount
    name: fluent-bit-events-sa
    namespace: amazon-cloudwatch
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-events-config
  namespace: amazon-cloudwatch
data:
  fluent-bit.conf: |
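    [SERVICE]
        Flush        5
        Log_Level    info
        # Load parser definitions so that "Parser json" in the tail input resolves.
        # The path assumes the aws-for-fluent-bit image layout.
        Parsers_File /fluent-bit/parsers/parsers.conf
        HTTP_Server  On
        HTTP_Listen  0.0.0.0
        HTTP_Port    2020
        Health_Check On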

    [INPUT]
        Name              tail
        Tag               kube.events
        Path              /var/log/events.log
        Parser            json
        DB                /var/log/flb_events.db
        Refresh_Interval  2
        Read_from_Head    false

    [FILTER]
        Name   modify
        Match  kube.events
        Add    cluster test
        Add    region ap-south-1

    [OUTPUT]
        Name                cloudwatch_logs
        Match               kube.events
        region              ap-south-1
        log_group_name      /aws/eks/test/cluster-events
        log_stream_prefix   events-
        auto_create_group   true
        log_retention_days  30
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: event-collector-script
  namespace: amazon-cloudwatch
data:
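  collect-events.sh: |
    #!/bin/sh
    # Intended to run only on the pod that holds the fluent-bit-events-leader Lease.
    # Streams new events from every namespace (no initial listing) and appends
    # them to the log file that Fluent Bit tails. The tail input parses line by
    # line, so each event must end up as a single-line JSON object for the json
    # parser to apply.
    kubectl get events --all-namespaces --watch-only -o json >> /var/log/events.log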
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fluent-bit-events
  namespace: amazon-cloudwatch
spec:
  replicas: 3
  selector:
    matchLabels:
      app: fluent-bit-events
  template:
    metadata:
      labels:
        app: fluent-bit-events
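    spec:
      serviceAccountName: fluent-bit-events-sa
      containers:
        - name: event-collector
          image: bitnami/kubectl:latest
          # Run the script from the event-collector-script ConfigMap.
          # The /scripts mount path is an assumption.
          command: ["/bin/sh", "/scripts/collect-events.sh"]
          volumeMounts:
            - name: events-log
              mountPath: /var/log
            - name: collector-script
              mountPath: /scripts
        - name: fluent-bit
          image: amazon/aws-for-fluent-bit:2.31.12
          ports:
            - containerPort: 2020
          volumeMounts:
            - name: events-log
              mountPath: /var/log
            # Replace only the main config file so the image's other files stay visible
            - name: fluent-bit-config
              mountPath: /fluent-bit/etc/fluent-bit.conf
              subPath: fluent-bit.conf
      volumes:
        # emptyDir shared by both containers; no hostPath is used
        - name: events-log
          emptyDir: {}
        - name: collector-script
          configMap:
            name: event-collector-script
        - name: fluent-bit-config
          configMap:
            name: fluent-bit-events-config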
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: fluent-bit-events-pdb
  namespace: amazon-cloudwatch
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: fluent-bit-events


Operational Verification

  • Verify pods are running

  • Verify leader lease

  • Verify Fluent Bit health endpoint

  • Verify CloudWatch Logs are receiving events
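The four checks above might be run as follows. This is a sketch with illustrative function names; it assumes kubectl, curl, and the AWS CLI are available on the operator's machine and that local port 2020 is free for the port-forward.

```shell
#!/bin/sh
# Sketch of the four verification checks; function names are illustrative.
NAMESPACE="amazon-cloudwatch"
REGION="ap-south-1"
LOG_GROUP="/aws/eks/test/cluster-events"

# 1. Pods: list the collector replicas and their status.
check_pods() {
  kubectl get pods -n "$NAMESPACE" -l app=fluent-bit-events
}

# 2. Lease: print the identity currently holding leadership.
current_leader() {
  kubectl get lease fluent-bit-events-leader -n "$NAMESPACE" \
    -o jsonpath='{.spec.holderIdentity}'
}

# 3. Health: query Fluent Bit's HTTP health endpoint through a port-forward.
check_health() {
  kubectl port-forward -n "$NAMESPACE" deploy/fluent-bit-events 2020:2020 \
    >/dev/null 2>&1 &
  pf_pid=$!
  sleep 2   # give the port-forward a moment to start
  curl -fsS http://127.0.0.1:2020/api/v1/health
  status=$?
  kill "$pf_pid" 2>/dev/null
  return "$status"
}

# 4. CloudWatch: show the most recently written log streams.
check_streams() {
  aws logs describe-log-streams --region "$REGION" \
    --log-group-name "$LOG_GROUP" \
    --order-by LastEventTime --descending --max-items 5
}
```

Calling the four functions in order gives a quick end-to-end check: pods running, a leader elected, Fluent Bit reporting healthy, and fresh log streams in CloudWatch.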

Failover and Recovery

Leader failover is automatic. If the leader pod fails, another replica acquires leadership using the Lease API and resumes event collection without manual intervention.
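A failover drill along these lines can confirm that behaviour. It is a sketch under one notable assumption: that the lease's holderIdentity is the leader's pod name, which depends on how the collector sets its identity. The function names and the 60-second timeout are illustrative.

```shell
#!/bin/sh
# Sketch of a failover drill; assumes holderIdentity equals the pod name.
NAMESPACE="amazon-cloudwatch"
LEASE_NAME="fluent-bit-events-leader"

# Print the identity currently recorded on the lease.
current_leader() {
  kubectl get lease "$LEASE_NAME" -n "$NAMESPACE" \
    -o jsonpath='{.spec.holderIdentity}'
}

# Poll once per second until the holder differs from $1.
# Prints the new holder, or exits 1 after $2 seconds (default 60).
wait_for_new_leader() {
  old="$1"
  remaining="${2:-60}"
  while [ "$remaining" -gt 0 ]; do
    new="$(current_leader)"
    if [ -n "$new" ] && [ "$new" != "$old" ]; then
      echo "$new"
      return 0
    fi
    sleep 1
    remaining=$((remaining - 1))
  done
  return 1
}

# Delete the current leader pod and report which replica takes over.
failover_drill() {
  leader="$(current_leader)"
  kubectl delete pod "$leader" -n "$NAMESPACE" --wait=false
  wait_for_new_leader "$leader" 60
}
```

Running failover_drill in a test cluster should print a different pod name within roughly one lease duration; a timeout suggests the remaining replicas are not competing for the lease.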


Security and Compliance

  • Least privilege RBAC

  • Namespace isolation

  • No hostPath usage

  • Cloud-native authentication


