Event-Driven Processing with Kafka

**Referenced Files in This Document**

  • [consumer.go](file/bi-common/mq/kafkax/consumer.go)
  • [producer.go](file/bi-common/mq/kafkax/producer.go)
  • [multi_consumer.go](file/bi-common/mq/kafkax/multi-consumer.go)
  • [worker_pool.go](file/bi-common/mq/kafkax/worker-pool.go)
  • [config.go](file/bi-common/mq/kafkax/config.go)
  • [consume.go](file/bi-common/mq/kafkax/consume.go)
  • [retry.go](file/bi-common/mq/kafkax/retry.go)
  • [consumer_manager.go](file/bi-common/mq/kafkax/consumer-manager.go)
  • [schema_registry.go](file/bi-common/mq/kafkax/schema-registry.go)
  • [kafkax-default.yaml](file/bi-common/mq/kafkax/kafkax-default.yaml)
  • [option.go](file/bi-common/mq/kafkax/option.go)
  • [compression.go](file/bi-common/mq/kafkax/compression.go)
  • [balancer.go](file/bi-common/mq/kafkax/balancer.go)
  • [kafka_grpc.pb.go](file/bi-server/api/bi-base/v1/kafka/kafka-grpc.pb.go)
  • [kafka.go](file/bi-basic/internal/service/kafka.go)

Table of Contents

  1. Introduction
  2. Project Structure
  3. Core Components
  4. Architecture Overview
  5. Detailed Component Analysis
  6. Dependency Analysis
  7. Performance Considerations
  8. Troubleshooting Guide
  9. Conclusion
  10. Appendices

Introduction

This document describes the event-driven processing architecture built on Apache Kafka within the project. It covers producers and consumers, the multi-consumer architecture with worker pools, serialization formats and schema registry support, message routing and topic organization, replay mechanisms, dead-letter queues, and ordering guarantees. It also discusses consumer group management and offset handling, fault tolerance, microservice communication and inter-service event coordination, distributed transactions, monitoring and lag detection, and performance optimization.

Project Structure

The Kafka integration resides under the shared library bi-common/mq/kafkax, providing reusable primitives for producers, consumers, batching, retries, dead-letter queues, worker pools, and schema registry support. Example microservice usage appears in bi-server and bi-basic.

Core Components

  • Producer: Asynchronous or synchronous message publishing with batching, compression, and optional JSON helpers.
  • Consumer: Single-topic consumption with automatic/manual commit modes, retries, DLQ, and lag reporting.
  • MultiTopicConsumer: Parallel consumers across multiple topics with shared configuration.
  • WorkerPool: In-process worker pool with channel buffering, configurable concurrency, and retry/backoff.
  • ConsumerManager: Runtime orchestration of topic-to-handler mappings and worker pool startup.
  • Config & Defaults: YAML-based defaults for producer/consumer with environment overrides.
  • Serialization: JSON for simple payloads; optional Confluent Schema Registry integration for Avro/Protobuf/JSON schemas.
  • Retry & DLQ: Exponential backoff with jitter and dead-letter topic routing.

Architecture Overview

The system supports two primary patterns:

  • Microservice gRPC-to-Kafka bridge: Services expose gRPC APIs to accept requests and publish Kafka events via the shared producer.
  • Event-driven consumers: Services subscribe to topics, optionally using MultiTopicConsumer or ConsumerManager to run worker pools per topic.

Detailed Component Analysis

Producer Implementation

  • Supports batching, compression, and partition balancing.
  • Provides JSON helpers and raw message sending with optional headers.
  • Health checks and statistics exposed.

Consumer Implementation

  • Single-topic consumption with automatic/manual commit modes.
  • Built-in retry with exponential backoff and DLQ routing.
  • Graceful shutdown, lag reporting, and stats.

Multi-Topic Consumer

  • Spawns independent consumers per topic with shared configuration.
  • Supports manual commit and graceful shutdown across topics.

Worker Pool Management

  • Channel-based dispatcher with configurable buffer and concurrency.
  • Optional retry/backoff and DLQ integration.
  • Non-blocking start and graceful stop.

Consumer Manager Orchestration

  • Registers handlers globally or per-manager instance.
  • Starts worker pools per topic with runtime configuration.
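
A topic-to-handler registry is the core of such orchestration. The shape below is hypothetical, not the kafkax ConsumerManager API; in the real library each registered topic would also get a worker pool at startup.

```go
package main

import (
	"fmt"
	"sync"
)

// registry maps topics to handlers, the way a consumer manager wires
// subscriptions to processing code at startup.
type registry struct {
	mu       sync.RWMutex
	handlers map[string]func([]byte) error
}

func newRegistry() *registry {
	return &registry{handlers: map[string]func([]byte) error{}}
}

func (r *registry) register(topic string, h func([]byte) error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.handlers[topic] = h
}

// dispatch looks up the handler for a topic; unknown topics are an
// error so misrouted messages surface instead of vanishing.
func (r *registry) dispatch(topic string, payload []byte) error {
	r.mu.RLock()
	h, ok := r.handlers[topic]
	r.mu.RUnlock()
	if !ok {
		return fmt.Errorf("no handler registered for topic %q", topic)
	}
	return h(payload)
}

func main() {
	r := newRegistry()
	r.register("orders", func(p []byte) error {
		fmt.Println("orders:", string(p))
		return nil
	})
	_ = r.dispatch("orders", []byte(`{"id":1}`))
	fmt.Println(r.dispatch("unknown", nil))
}
```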

Configuration and Defaults

  • YAML defaults for producer and consumer.
  • Environment-based overrides and option builders.
  • TLS/SASL and partition change monitoring.

Serialization Formats and Schema Registry

  • JSON payloads supported directly by producer helpers.
  • Confluent Schema Registry integration for Avro/Protobuf/JSON with caching and wire format.
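
The Confluent wire format itself is simple: a magic byte (0) followed by the 4-byte big-endian schema ID, then the serialized payload. A minimal encode/decode pair:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeWire frames a payload in the Confluent wire format so consumers
// can look the schema up in the registry before decoding.
func encodeWire(schemaID uint32, payload []byte) []byte {
	buf := make([]byte, 5+len(payload))
	buf[0] = 0 // magic byte
	binary.BigEndian.PutUint32(buf[1:5], schemaID)
	copy(buf[5:], payload)
	return buf
}

// decodeWire splits a framed message back into schema ID and payload.
func decodeWire(msg []byte) (schemaID uint32, payload []byte, err error) {
	if len(msg) < 5 || msg[0] != 0 {
		return 0, nil, errors.New("not Confluent wire format")
	}
	return binary.BigEndian.Uint32(msg[1:5]), msg[5:], nil
}

func main() {
	msg := encodeWire(42, []byte(`{"event":"order.created"}`))
	id, payload, _ := decodeWire(msg)
	fmt.Println(id, string(payload)) // 42 {"event":"order.created"}
}
```

Caching the ID-to-schema lookup, as the integration does, avoids a registry round trip per message.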

Event Replay, Ordering, and Dead Letter Queues

  • Replay: StartOffset controls earliest/latest; SetOffset/SetOffsetAt enable precise replay.
  • Ordering: Per-partition ordering preserved; consumer groups balance partitions.
  • DLQ: On max retries, messages are sent to configured DLQ topic with original metadata.

Consumer Group Management and Offset Handling

  • Consumer groups automatically rebalance partitions; heartbeat/session timeouts govern liveness.
  • Manual commit mode allows idempotent processing with explicit Ack/Nack semantics.
  • Graceful shutdown waits for in-flight work completion.
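
Idempotent processing under manual commit can be sketched as a dedupe step before the side effect; the in-memory map is a stand-in for whatever persistent store a real deployment would use, and these names are not the kafkax API.

```go
package main

import "fmt"

// dedupe tracks processed message keys so redeliveries (normal under
// at-least-once delivery) are acknowledged without re-running side
// effects.
type dedupe struct {
	seen map[string]bool
}

// handle returns true when the message was newly processed and false
// for a duplicate; in both cases the caller should Ack so the offset
// advances.
func (d *dedupe) handle(key string, process func()) bool {
	if d.seen[key] {
		return false
	}
	process()
	d.seen[key] = true
	return true
}

func main() {
	d := &dedupe{seen: map[string]bool{}}
	applied := 0
	for _, key := range []string{"evt-1", "evt-2", "evt-1"} { // evt-1 redelivered
		if d.handle(key, func() { applied++ }) {
			fmt.Println("processed", key)
		} else {
			fmt.Println("duplicate", key)
		}
	}
	fmt.Println("side effects:", applied) // side effects: 2
}
```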

Fault Tolerance Strategies

  • Partition change monitoring prevents initial-start consumption issues.
  • Retries with backoff and jitter reduce transient failure impact.
  • DLQ decouples poison-pill messages from live processing.
  • TLS/SASL protect transport and authentication.

Event-Driven Microservice Communication

  • Example gRPC-to-Kafka bridge exposes RPC methods that serialize request data and publish to Kafka topics.
  • Example service method converts protobuf data to Kafka payload and delegates to producer.
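
Stripped of the gRPC plumbing, the bridge method is a convert-and-delegate step. Everything below is hypothetical: `PublishEventRequest` stands in for the protobuf request type and `publish` for the shared producer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PublishEventRequest stands in for a protobuf request message.
type PublishEventRequest struct {
	Topic string `json:"-"` // routing info, kept out of the payload
	ID    int64  `json:"id"`
	Kind  string `json:"kind"`
}

type server struct {
	publish func(topic string, value []byte) error
}

// PublishEvent converts the RPC payload to a Kafka message and
// delegates to the producer — the whole job of the bridge method.
func (s *server) PublishEvent(req PublishEventRequest) error {
	value, err := json.Marshal(req)
	if err != nil {
		return err
	}
	return s.publish(req.Topic, value)
}

func main() {
	s := &server{publish: func(topic string, value []byte) error {
		fmt.Printf("-> %s: %s\n", topic, value)
		return nil
	}}
	_ = s.PublishEvent(PublishEventRequest{Topic: "orders", ID: 7, Kind: "order.created"})
}
```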

Distributed Transactions and Coordination

  • The codebase does not implement end-to-end transactional producers/consumers; it focuses on at-least-once delivery with retries and DLQ.
  • For cross-service saga/eventual consistency, coordinate via idempotent event handlers and external state stores.

Dependency Analysis

  • Producer depends on kafka-go Writer and optional Schema Registry.
  • Consumer depends on kafka-go Reader and provides retry/DLQ logic.
  • MultiTopicConsumer composes multiple Consumers.
  • WorkerPool wraps Consumer with channel dispatch and worker goroutines.
  • ConsumerManager orchestrates handlers and worker pools.

Performance Considerations

  • Tune batching (size/bytes/timeout) and compression for throughput/latency trade-offs.
  • Choose partition balancers appropriate to key distribution.
  • Use worker pools sized to CPU and IO characteristics.
  • Monitor lag via ReaderStats and adjust consumer concurrency/partitions accordingly.
  • Enable partition change monitoring to avoid cold-start consumption delays.

Troubleshooting Guide

  • Health checks: Use Ping on Producer/Consumer to validate broker connectivity.
  • Logs: Set Logger/ErrorLogger on Producer/Consumer/ProducerConfig/ConsumerConfig.
  • DLQ: Verify DLQ topic exists and permissions; inspect headers for original topic and error.
  • Retries: Adjust MaxRetries and backoff parameters; consider jitter for thundering herd mitigation.
  • Offsets: Use SetOffset/SetOffsetAt for targeted replay; confirm commit mode (auto/manual).
  • TLS/SASL: Validate credentials and certificate paths; disable verification only in controlled environments.

Conclusion

The Kafka integration provides robust primitives for event-driven architectures: producers with batching and serialization, consumers with retries and DLQ, multi-topic orchestration, and worker pools for parallel processing. Combined with configuration defaults, TLS/SASL, and schema registry support, it enables scalable, fault-tolerant systems suitable for microservice communication and eventual consistency patterns.

Appendices

Topic Organization and Routing

  • Use hierarchical naming conventions for discoverability and team ownership.
  • Route by domain entity and event type to minimize cross-team coupling.
  • Consider idempotency keys and partitioning strategies aligned to event keys.
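
One common hierarchical convention is `<domain>.<entity>.<event>`; the helper below illustrates that choice, which is a team convention rather than anything the library mandates.

```go
package main

import (
	"fmt"
	"strings"
)

// topicName builds a hierarchical topic name so topics sort and filter
// naturally by owning domain.
func topicName(domain, entity, event string) string {
	return strings.ToLower(strings.Join([]string{domain, entity, event}, "."))
}

func main() {
	fmt.Println(topicName("billing", "invoice", "created")) // billing.invoice.created
}
```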

Monitoring and Lag Detection

  • Use ReaderStats and WriterStats to track lag and throughput.
  • Alert on rising lag, increased DLQ rates, and retry saturation.
  • Correlate consumer lag with partition counts and worker pool sizes.
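
Lag itself is just offset arithmetic: the distance between each partition's latest offset (high watermark) and the group's committed offset, summed across partitions. The maps below stand in for broker metadata.

```go
package main

import "fmt"

// totalLag sums per-partition lag (high watermark minus committed
// offset), clamping negative values that can appear transiently during
// rebalances.
func totalLag(highWatermark, committed map[int]int64) int64 {
	var lag int64
	for partition, hw := range highWatermark {
		if d := hw - committed[partition]; d > 0 {
			lag += d
		}
	}
	return lag
}

func main() {
	hw := map[int]int64{0: 1500, 1: 900}
	committed := map[int]int64{0: 1480, 1: 900}
	fmt.Println("total lag:", totalLag(hw, committed)) // total lag: 20
}
```

A steadily rising value of this sum is the signal to add partitions or consumer concurrency.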
