Event-Driven Processing with Kafka
Table of Contents
- Introduction
- Project Structure
- Core Components
- Architecture Overview
- Detailed Component Analysis
- Dependency Analysis
- Performance Considerations
- Troubleshooting Guide
- Conclusion
- Appendices
Introduction
This document describes the event-driven processing architecture built on Apache Kafka within the project. It covers producers and consumers, the event sourcing pattern, event stores, multi-consumer architectures with worker pools, serialization formats, message routing and topic organization, replay mechanisms, dead-letter queues, ordering guarantees, consumer group management and offset handling, fault tolerance, microservice communication and inter-service event coordination, distributed transactions, and monitoring, including lag detection and performance optimization.
Project Structure
The Kafka integration resides under the shared library bi-common/mq/kafkax, providing reusable primitives for producers, consumers, batching, retries, dead-letter queues, worker pools, and schema registry support. Example microservice usage appears in bi-server and bi-basic.
Diagram sources
- [producer.go]
- [consumer.go]
- [multi_consumer.go]
- [worker_pool.go]
- [config.go]
- [kafkax-default.yaml]
- [option.go]
- [compression.go]
- [balancer.go]
- [schema_registry.go]
- [consume.go]
- [retry.go]
- [consumer_manager.go]
- [kafka_grpc.pb.go]
- [kafka.go]
Core Components
- Producer: Asynchronous or synchronous message publishing with batching, compression, and optional JSON helpers.
- Consumer: Single-topic consumption with automatic/manual commit modes, retries, DLQ, and lag reporting.
- MultiTopicConsumer: Parallel consumers across multiple topics with shared configuration.
- WorkerPool: In-process worker pool with channel buffering, configurable concurrency, and retry/backoff.
- ConsumerManager: Runtime orchestration of topic-to-handler mappings and worker pool startup.
- Config & Defaults: YAML-based defaults for producer/consumer with environment overrides.
- Serialization: JSON for simple payloads; optional Confluent Schema Registry integration for Avro/Protobuf/JSON schemas.
- Retry & DLQ: Exponential backoff with jitter and dead-letter topic routing.
Section sources
- [producer.go]
- [consumer.go]
- [multi_consumer.go]
- [worker_pool.go]
- [consumer_manager.go]
- [config.go]
- [kafkax-default.yaml]
- [schema_registry.go]
- [consume.go]
- [retry.go]
Architecture Overview
The system supports two primary patterns:
- Microservice gRPC-to-Kafka bridge: Services expose gRPC APIs to accept requests and publish Kafka events via the shared producer.
- Event-driven consumers: Services subscribe to topics, optionally using MultiTopicConsumer or ConsumerManager to run worker pools per topic.
Detailed Component Analysis
Producer Implementation
- Supports batching, compression, and partition balancing.
- Provides JSON helpers and raw message sending with optional headers.
- Health checks and statistics exposed.
Consumer Implementation
- Single-topic consumption with automatic/manual commit modes.
- Built-in retry with exponential backoff and DLQ routing.
- Graceful shutdown, lag reporting, and stats.
Multi-Topic Consumer
- Spawns independent consumers per topic with shared configuration.
- Supports manual commit and graceful shutdown across topics.
Worker Pool Management
- Channel-based dispatcher with configurable buffer and concurrency.
- Optional retry/backoff and DLQ integration.
- Non-blocking start and graceful stop.
Consumer Manager Orchestration
- Registers handlers globally or per-manager instance.
- Starts worker pools per topic with runtime configuration.
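A simplified registry of this kind might look like the following; the type and method names are illustrative rather than the actual consumer_manager.go API, and worker pool startup is omitted.

```go
package main

import (
	"fmt"
	"sync"
)

// Handler processes one message payload for a topic.
type Handler func(payload []byte) error

// ConsumerManager keeps a runtime topic-to-handler mapping; a fuller manager
// would also start a worker pool per registered topic.
type ConsumerManager struct {
	mu       sync.RWMutex
	handlers map[string]Handler
}

func NewConsumerManager() *ConsumerManager {
	return &ConsumerManager{handlers: make(map[string]Handler)}
}

// Register binds a handler to a topic, replacing any previous binding.
func (m *ConsumerManager) Register(topic string, h Handler) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.handlers[topic] = h
}

// Dispatch routes a payload to the handler registered for its topic.
func (m *ConsumerManager) Dispatch(topic string, payload []byte) error {
	m.mu.RLock()
	h, ok := m.handlers[topic]
	m.mu.RUnlock()
	if !ok {
		return fmt.Errorf("no handler registered for topic %q", topic)
	}
	return h(payload)
}
```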
Configuration and Defaults
- YAML defaults for producer and consumer.
- Environment-based overrides and option builders.
- TLS/SASL and partition change monitoring.
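A defaults file of this kind might take the shape sketched below; every key and value here is an illustrative assumption about kafkax-default.yaml, not its actual contents.

```yaml
# Illustrative only: key names and values are assumptions, not the real file.
producer:
  batch_size: 100
  batch_timeout_ms: 50
  compression: snappy
  required_acks: all
consumer:
  start_offset: latest   # or: earliest
  commit_mode: auto      # or: manual
  max_retries: 3
  dlq_topic: ""          # empty disables dead-letter routing
```

Environment-based overrides and option builders would then replace individual keys at startup.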
Serialization Formats and Schema Registry
- JSON payloads supported directly by producer helpers.
- Confluent Schema Registry integration for Avro/Protobuf/JSON with caching and wire format.
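The Confluent wire format frames every registry-encoded payload with a zero magic byte followed by the 4-byte big-endian schema ID. A standard-library sketch of that framing:

```go
package main

import (
	"encoding/binary"
	"errors"
)

// encodeWireFormat prepends the Confluent wire-format header: a zero magic
// byte, then the 4-byte big-endian schema ID, then the encoded payload.
func encodeWireFormat(schemaID uint32, payload []byte) []byte {
	buf := make([]byte, 5+len(payload))
	buf[0] = 0 // magic byte
	binary.BigEndian.PutUint32(buf[1:5], schemaID)
	copy(buf[5:], payload)
	return buf
}

// decodeWireFormat splits a wire-format message back into schema ID and payload.
func decodeWireFormat(msg []byte) (uint32, []byte, error) {
	if len(msg) < 5 || msg[0] != 0 {
		return 0, nil, errors.New("not a Confluent wire-format message")
	}
	return binary.BigEndian.Uint32(msg[1:5]), msg[5:], nil
}
```

A consumer uses the decoded schema ID to fetch (and cache) the schema from the registry before deserializing the payload as Avro, Protobuf, or JSON Schema.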
Event Replay, Ordering, and Dead Letter Queues
- Replay: StartOffset selects earliest or latest consumption on first start; SetOffset and SetOffsetAt rewind a reader to an exact offset or timestamp for precise replay.
- Ordering: Per-partition ordering preserved; consumer groups balance partitions.
- DLQ: On max retries, messages are sent to configured DLQ topic with original metadata.
Consumer Group Management and Offset Handling
- Consumer groups automatically rebalance partitions; heartbeat/session timeouts govern liveness.
- Manual commit mode provides at-least-once delivery with explicit Ack/Nack semantics; pair it with idempotent handlers so redelivered messages are safe to reprocess.
- Graceful shutdown waits for in-flight work completion.
Fault Tolerance Strategies
- Partition change monitoring prevents initial-start consumption issues.
- Retries with backoff and jitter reduce transient failure impact.
- DLQ decouples poison-pill messages from live processing.
- TLS/SASL protect transport and authentication.
Event-Driven Microservice Communication
- Example gRPC-to-Kafka bridge exposes RPC methods that serialize request data and publish to Kafka topics.
- Example service method converts protobuf data to Kafka payload and delegates to producer.
Distributed Transactions and Coordination
- The codebase does not implement end-to-end transactional producers/consumers; it focuses on at-least-once delivery with retries and DLQ.
- For cross-service saga/eventual consistency, coordinate via idempotent event handlers and external state stores.
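An idempotent event handler can be sketched as a deduplicating wrapper; the in-memory map below is a stand-in for the external state store a real multi-instance deployment would use.

```go
package main

import "sync"

// Dedup wraps a handler so each event ID is applied at most once, making
// at-least-once delivery safe for saga-style coordination.
type Dedup struct {
	mu   sync.Mutex
	seen map[string]bool
	next func(id string, payload []byte) error
}

func NewDedup(next func(id string, payload []byte) error) *Dedup {
	return &Dedup{seen: make(map[string]bool), next: next}
}

func (d *Dedup) Handle(id string, payload []byte) error {
	d.mu.Lock()
	done := d.seen[id]
	d.mu.Unlock()
	if done {
		return nil // duplicate delivery: already applied
	}
	if err := d.next(id, payload); err != nil {
		return err // not marked seen, so a retry re-runs the handler
	}
	d.mu.Lock()
	d.seen[id] = true
	d.mu.Unlock()
	return nil
}
```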
Dependency Analysis
- Producer depends on kafka-go Writer and optional Schema Registry.
- Consumer depends on kafka-go Reader and provides retry/DLQ logic.
- MultiTopicConsumer composes multiple Consumers.
- WorkerPool wraps Consumer with channel dispatch and worker goroutines.
- ConsumerManager orchestrates handlers and worker pools.
Performance Considerations
- Tune batching (size/bytes/timeout) and compression for throughput/latency trade-offs.
- Choose partition balancers appropriate to key distribution.
- Use worker pools sized to CPU and IO characteristics.
- Monitor lag via ReaderStats and adjust consumer concurrency/partitions accordingly.
- Enable partition change monitoring to avoid cold-start consumption delays.
Troubleshooting Guide
- Health checks: Use Ping on Producer/Consumer to validate broker connectivity.
- Logs: Set Logger/ErrorLogger on Producer/Consumer/ProducerConfig/ConsumerConfig.
- DLQ: Verify DLQ topic exists and permissions; inspect headers for original topic and error.
- Retries: Adjust MaxRetries and backoff parameters; consider jitter for thundering herd mitigation.
- Offsets: Use SetOffset/SetOffsetAt for targeted replay; confirm commit mode (auto/manual).
- TLS/SASL: Validate credentials and certificate paths; disable verification only in controlled environments.
Conclusion
The Kafka integration provides robust primitives for event-driven architectures: producers with batching and serialization, consumers with retries and DLQ, multi-topic orchestration, and worker pools for parallel processing. Combined with configuration defaults, TLS/SASL, and schema registry support, it enables scalable, fault-tolerant systems suitable for microservice communication and eventual consistency patterns.
Appendices
Topic Organization and Routing
- Use hierarchical naming conventions for discoverability and team ownership.
- Route by domain entity and event type to minimize cross-team coupling.
- Consider idempotency keys and partitioning strategies aligned to event keys.
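A hypothetical helper for such a naming convention (the `<domain>.<entity>.<event>` ordering and the dot separator are illustrative choices, not a project standard):

```go
package main

import (
	"fmt"
	"strings"
)

// topicName builds a hierarchical topic name of the form
// <domain>.<entity>.<event>, normalized to lower case.
func topicName(domain, entity, event string) string {
	return strings.ToLower(fmt.Sprintf("%s.%s.%s", domain, entity, event))
}
```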
Monitoring and Lag Detection
- Use ReaderStats and WriterStats to track lag and throughput.
- Alert on rising lag, increased DLQ rates, and retry saturation.
- Correlate consumer lag with partition counts and worker pool sizes.