Building a Highly Available Kubernetes Event Collection Pipeline with Fluent Bit

Muzammil Lohare 20th Jan 2026 - 7 mins read

Kubernetes events are one of the most valuable yet frequently overlooked data sources in a cluster. They provide insights into scheduling failures, pod restarts, node issues, and configuration problems. However, Kubernetes events are ephemeral and can disappear quickly if not centrally collected.

This post describes a production-ready, highly available approach to collecting Kubernetes cluster events and forwarding them to Amazon CloudWatch Logs using Fluent Bit and Kubernetes leader election.

Why Centralize Kubernetes Events?

By default, Kubernetes events are stored in etcd and retained for only one hour (the kube-apiserver --event-ttl default). Once they expire, critical troubleshooting information is permanently lost.

Centralizing events in Amazon CloudWatch Logs provides:

Long-term retention
Search and filtering capabilities
Audit and compliance visibility
Correlation with metrics and application logs

Solution Overview

This solution uses a leader-elected event collector that streams Kubernetes events using kubectl. Only one pod acts as the leader at any given time, ensuring that duplicate events are not generated.

Fluent Bit tails the event log file and forwards events to CloudWatch Logs. Multiple replicas ensure high availability and automatic failover.

Architecture Overview

Multiple replicas of the event collector are deployed
Kubernetes Lease API is used for leader election
Only the leader pod streams events
Events are written to a log file
Fluent Bit ships events to Amazon CloudWatch Logs

Scope and Environment

Cluster Name: test
Region: ap-south-1
Namespace: amazon-cloudwatch
Log Group: /aws/eks/test/cluster-events

Required IAM Permissions

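For Fluent Bit to push Kubernetes events to Amazon CloudWatch Logs, the EKS node group IAM role must allow the following CloudWatch Logs operations:

Create log groups (logs:CreateLogGroup)
Create log streams (logs:CreateLogStream)
Put log events (logs:PutLogEvents)
Describe log groups and streams (logs:DescribeLogGroups, logs:DescribeLogStreams)
Set log group retention (logs:PutRetentionPolicy), needed because the Fluent Bit output sets log_retention_days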
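
The permissions listed above can be expressed as an IAM policy document. The following is a minimal sketch, not the author's exact policy: the function name render_logs_policy is hypothetical, and "Resource": "*" is used for simplicity and can be narrowed to the log group ARN. It also includes logs:PutRetentionPolicy, on the assumption that the log_retention_days setting in the Fluent Bit output is kept.

```shell
#!/bin/sh
# Sketch: print an IAM policy covering the CloudWatch Logs actions that
# Fluent Bit uses in this setup. The function name is an assumption.
render_logs_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams",
        "logs:PutRetentionPolicy"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}
```

One way to use it is to write the output to a file with render_logs_policy > policy.json and attach it to the node role with aws iam put-role-policy --role-name <node-role-name> --policy-name fluent-bit-events-logs --policy-document file://policy.json, where the role name and policy name are placeholders.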

Core Components

Service Account with least privilege access
ClusterRole and ClusterRoleBinding
ConfigMaps for Fluent Bit and event collector
Deployment with multiple replicas
Pod Disruption Budget for availability

Deployment Procedure

Create the YAML file using the naming convention:

fluent-bit-events-ha-<cluster-name>.yaml

Validate the YAML:

kubectl apply --dry-run=client -f fluent-bit-events-ha-test.yaml

Deploy to the cluster:

kubectl apply -f fluent-bit-events-ha-test.yaml
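
The naming convention and the validate-then-deploy order can be wrapped in a small helper so the apply never runs against a manifest that failed the dry run. This is a sketch only; the function names manifest_name and deploy_manifest are hypothetical.

```shell
#!/bin/sh
# Build the manifest file name from the cluster name,
# following fluent-bit-events-ha-<cluster-name>.yaml.
manifest_name() {
  printf 'fluent-bit-events-ha-%s.yaml\n' "$1"
}

# Validate client-side first; apply only if validation succeeds.
deploy_manifest() {
  file=$(manifest_name "$1")
  kubectl apply --dry-run=client -f "$file" && kubectl apply -f "$file"
}
```

For the cluster in this post, deploy_manifest test validates and then applies fluent-bit-events-ha-test.yaml.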

Validate Leader

kubectl get lease fluent-bit-events-leader -n amazon-cloudwatch
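
To see which replica currently holds the Lease, the holder and its last renewal time can be read directly from the Lease spec. The sketch below assumes the collector records its pod name in holderIdentity, which the article does not state; the function names are hypothetical.

```shell
#!/bin/sh
# Print the Lease holder and its last renewal time, tab-separated.
lease_status() {
  kubectl get lease fluent-bit-events-leader -n amazon-cloudwatch \
    -o jsonpath='{.spec.holderIdentity}{"\t"}{.spec.renewTime}{"\n"}'
}

# Succeed only when the holder is one of the collector pods.
# Assumption: holderIdentity contains the pod name.
leader_is_collector_pod() {
  holder=$(lease_status | cut -f1)
  kubectl get pods -n amazon-cloudwatch -l app=fluent-bit-events -o name \
    | grep -qx "pod/$holder"
}
```

A renewal time that keeps advancing between calls to lease_status indicates that the leader is alive and renewing.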

Complete Deployment YAML


apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit-events-sa
  namespace: amazon-cloudwatch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-events-cluster-role
rules:
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["namespaces", "pods"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-events-cluster-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-events-cluster-role
subjects:
  - kind: ServiceAccount
    name: fluent-bit-events-sa
    namespace: amazon-cloudwatch
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-events-config
  namespace: amazon-cloudwatch
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush        5
        Log_Level    info
        Parsers_File parsers.conf
        HTTP_Server  On
        HTTP_Listen  0.0.0.0
        HTTP_Port    2020
        Health_Check On

    [INPUT]
        Name              tail
        Tag               kube.events
        Path              /var/log/events.log
        Parser            json
        DB                /var/log/flb_events.db
        Refresh_Interval  2
        Read_from_Head    false

    [FILTER]
        Name   modify
        Match  kube.events
        Add    cluster test
        Add    region ap-south-1

    [OUTPUT]
        Name                cloudwatch_logs
        Match               kube.events
        region              ap-south-1
        log_group_name      /aws/eks/test/cluster-events
        log_stream_prefix   events-
        auto_create_group   true
        log_retention_days  30
  # The tail input references a parser named "json", so it is defined here
  # and loaded through Parsers_File in the [SERVICE] section.
  parsers.conf: |
    [PARSER]
        Name    json
        Format  json
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: event-collector-script
  namespace: amazon-cloudwatch
data:
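  collect-events.sh: |
    #!/bin/sh
    # Stream new events only, one JSON object per line, so the tail input's
    # json parser can decode each line (plain -o json is pretty-printed).
    # Assumption: kubectl 1.19+ prints the object selected by '{@}' as compact JSON.
    kubectl get events --all-namespaces --watch-only -o jsonpath='{@}{"\n"}' >> /var/log/events.log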
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fluent-bit-events
  namespace: amazon-cloudwatch
spec:
  replicas: 3
  selector:
    matchLabels:
      app: fluent-bit-events
  template:
    metadata:
      labels:
        app: fluent-bit-events
    spec:
      serviceAccountName: fluent-bit-events-sa
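      containers:
        - name: event-collector
          image: bitnami/kubectl:latest
          # Run the collector script mounted from the ConfigMap
          # (the /scripts mount path is an assumption).
          command: ["/bin/sh", "/scripts/collect-events.sh"]
          volumeMounts:
            - name: event-log
              mountPath: /var/log
            - name: collector-script
              mountPath: /scripts
        - name: fluent-bit
          image: amazon/aws-for-fluent-bit:2.31.12
          ports:
            - containerPort: 2020
          volumeMounts:
            - name: event-log
              mountPath: /var/log
            - name: fluent-bit-config
              mountPath: /fluent-bit/etc/
      volumes:
        # emptyDir shared by both containers, so no hostPath is needed
        - name: event-log
          emptyDir: {}
        - name: fluent-bit-config
          configMap:
            name: fluent-bit-events-config
        - name: collector-script
          configMap:
            name: event-collector-script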
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: fluent-bit-events-pdb
  namespace: amazon-cloudwatch
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: fluent-bit-events

Operational Verification

Verify pods are running
Verify leader lease
Verify Fluent Bit health endpoint
Verify CloudWatch Logs are receiving events
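
The four checks above map to a handful of commands. The following is a sketch using the names from this post (namespace, labels, Lease, port 2020, log group, region); the function names are hypothetical, and the health check assumes curl and a free local port 2020 on the machine running it.

```shell
#!/bin/sh
NS=amazon-cloudwatch

# 1. Pods are running
check_pods() { kubectl get pods -n "$NS" -l app=fluent-bit-events; }

# 2. Leader lease exists and has a holder
check_lease() { kubectl get lease fluent-bit-events-leader -n "$NS"; }

# 3. Fluent Bit health endpoint (enabled by Health_Check On)
check_health() {
  # Forward the Fluent Bit HTTP port locally, probe it, then clean up.
  kubectl port-forward -n "$NS" deploy/fluent-bit-events 2020:2020 >/dev/null 2>&1 &
  pf=$!
  sleep 2
  rc=0
  curl -fsS http://127.0.0.1:2020/api/v1/health || rc=$?
  kill "$pf" 2>/dev/null || true
  return "$rc"
}

# 4. CloudWatch Logs is receiving events: list the most recently written streams
check_cloudwatch() {
  aws logs describe-log-streams --region ap-south-1 \
    --log-group-name /aws/eks/test/cluster-events \
    --order-by LastEventTime --descending --max-items 5
}
```

Running the four functions in order after a deployment gives a quick end-to-end check from pod status through to log delivery.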

Failover and Recovery

Leader failover is automatic. If the leader pod fails, another replica acquires leadership using the Lease API and resumes event collection without manual intervention.
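
The manifest above does not show the election logic itself, so the following is one possible sketch of Lease-based election driven by kubectl, not the author's implementation. The Lease name and namespace come from this post; DURATION, RETRY, the function names, and the use of the pod hostname as holder identity are assumptions. A replica treats the Lease as expired when its resourceVersion has not changed for DURATION seconds, and takes over with kubectl replace carrying the observed resourceVersion so that only one contender can succeed.

```shell
#!/bin/sh
# Sketch only: Lease-based leader election driven by kubectl.
LEASE=fluent-bit-events-leader
NS=amazon-cloudwatch
ME=${HOSTNAME:-unknown-pod}   # pod name identifies this replica (assumption)
DURATION=15                   # seconds without renewal before takeover (assumption)
RETRY=5                       # seconds between election rounds (assumption)

# Render a Lease held by this replica.
# $1 = observed resourceVersion (empty when the Lease does not exist yet)
lease_manifest() {
  cat <<EOF
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: $LEASE
  namespace: $NS
  resourceVersion: "$1"
spec:
  holderIdentity: $ME
  leaseDurationSeconds: $DURATION
  renewTime: "$(date -u +%Y-%m-%dT%H:%M:%S.000000Z)"
EOF
}

# Decide whether to take or renew the Lease.
# $1 = current holder, $2 = resourceVersion, $3 = seconds since it last changed
should_acquire() {
  [ -z "$2" ] && return 0        # no Lease yet: try to create it
  [ "$1" = "$ME" ] && return 0   # already leader: renew
  [ "$3" -ge "$DURATION" ]       # holder stopped renewing: take over
}

# create fails if the Lease already exists, and replace is rejected when the
# observed resourceVersion is stale, so only one replica wins a contested round.
try_acquire() {
  if [ -z "$1" ]; then
    lease_manifest "" | kubectl create -f - >/dev/null 2>&1
  else
    lease_manifest "$1" | kubectl replace -f - >/dev/null 2>&1
  fi
}

# Run the election loop; "$@" is the streaming command to run while leader.
elect_loop() {
  last_rv=""; last_change=$(date +%s); pid=""
  while true; do
    state=$(kubectl get lease "$LEASE" -n "$NS" \
      -o jsonpath='{.spec.holderIdentity} {.metadata.resourceVersion}' 2>/dev/null)
    holder=${state%% *}; rv=${state##* }
    now=$(date +%s)
    if [ "$rv" != "$last_rv" ]; then last_rv=$rv; last_change=$now; fi
    if should_acquire "$holder" "$rv" $((now - last_change)) && try_acquire "$rv"; then
      # Leader: make sure the event stream is running.
      if [ -z "$pid" ] || ! kill -0 "$pid" 2>/dev/null; then
        "$@" >> /var/log/events.log &
        pid=$!
      fi
    elif [ -n "$pid" ]; then
      # Not the leader (any more): stop streaming.
      kill "$pid" 2>/dev/null; pid=""
    fi
    sleep "$RETRY"
  done
}

# Start the loop only when invoked as: <script> run <streaming command...>
if [ "${1:-}" = "run" ]; then shift; elect_loop "$@"; fi
```

Invoked as run followed by the kubectl event-streaming command, each replica either streams events or idles as a standby. With these values a standby takes over roughly 15 to 20 seconds after the leader stops renewing. Events emitted during that window are not captured, because the stream only watches for new events.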

Security and Compliance

Least privilege RBAC
Namespace isolation
No hostPath usage
AWS authentication through the node IAM role, with no static credentials in the manifest
