Conversation Management System
Table of Contents
- Introduction
- Project Structure
- Core Components
- Architecture Overview
- Detailed Component Analysis
- Dependency Analysis
- Performance Considerations
- Troubleshooting Guide
- Conclusion
Introduction
This document describes the conversation management system responsible for maintaining context and state across multi-agent interactions. It covers session lifecycle management (creation, persistence, cleanup), memory store implementation (short-term and long-term), context preservation mechanisms, integration with external storage systems, caching strategies, and error recovery. It also provides examples of conversation state serialization, context extraction algorithms, and conversation flow control.
Project Structure
The conversation management system spans backend Python services and frontend React context/state management:
- Backend: FastAPI routes, asynchronous session service, conversation manager, agent orchestration, memory stores, and caching.
- Frontend: React context and hooks for managing session state and UI interactions.
Section sources
- [chat.py]
- [conversation_manager.py]
- [session_service_async.py]
- [models.py]
- [long_term_memory.py]
- [redis_cache.py]
- [ChatContext.tsx]
- [use-chat.ts]
Core Components
- ConversationManager: Central orchestrator for session lifecycle, concurrency control, memory loading/persistence, and long-term memory integration.
- AsyncSessionService: Asynchronous database operations for session creation, updates, retrieval, and message persistence.
- BIReActAgent: Agent wrapper that auto-persists incremental conversation messages and supports long-term memory.
- LongTermMemoryService: User-scoped long-term memory backed by Mem0 and Milvus.
- RedisCacheManager: Cost-saving cache for LLM responses, vector search, and statistical computations.
- FastAPI Routes: Expose session and chat endpoints for frontend integration.
Section sources
- [conversation_manager.py]
- [session_service_async.py]
- [bi_agent.py]
- [long_term_memory.py]
- [redis_cache.py]
- [chat.py]
Architecture Overview
The system integrates frontend session state with backend session management and memory persistence. Sessions are created via API, cached in-memory with LRU eviction, and persisted to PostgreSQL. Long-term memory is enabled per user and stored in Milvus. Redis caches frequently accessed results to reduce latency and cost.
Diagram sources
Detailed Component Analysis
Session Lifecycle Management
- Creation: If no conversation_id is provided, a UUID is generated. Sessions are created/updated asynchronously in PostgreSQL via AsyncSessionService.
- Persistence: Messages are saved incrementally after each agent reply. The agent tracks saved message count to avoid duplication.
- Cleanup: Idle sessions are evicted based on max_idle_seconds; removed sessions are marked archived asynchronously.
- Concurrency: Per-session asyncio locks prevent race conditions during creation and access updates.
Diagram sources
Section sources
Memory Store Implementation
- Short-term memory: InMemoryMemory per session, loaded from PostgreSQL session_messages with an intelligent character threshold to avoid triggering compression. History sanitization ensures valid assistant-tool sequences.
- Long-term memory: User-scoped instances via LongTermMemoryService, backed by Mem0 and Milvus. Retrieval includes deduplication to avoid repeated memories.
- Serialization: Messages are stored as JSON with a "messages" array containing role/type/timestamped entries.
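A sketch of the serialization shape described above. The exact field set in the stored JSON is an assumption beyond the documented "messages" array with role/type/timestamped entries:

```python
import json
from datetime import datetime, timezone

def serialize_history(messages: list[dict]) -> str:
    """Pack messages into a JSON document with a top-level "messages"
    array of role/type/timestamped entries, as described above."""
    return json.dumps({
        "messages": [
            {
                "role": m["role"],               # e.g. "user" | "assistant" | "tool"
                "type": m.get("type", "text"),
                "content": m["content"],
                # Stamp at serialization time if the message carries no timestamp.
                "timestamp": m.get(
                    "timestamp",
                    datetime.now(timezone.utc).isoformat(),
                ),
            }
            for m in messages
        ]
    })
```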
Context Preservation Mechanisms
- Intelligent memory loading: Iterates through recent messages in reverse chronological order, estimates cumulative character count using a token counter, and stops before exceeding a configured threshold to avoid compression triggers.
- History sanitization: Ensures assistant tool-call sequences are complete; strips incomplete tool-calls and discards orphan tool messages to prevent downstream errors.
- Context extraction: Provides a formatted history endpoint that flattens and cleans messages for agent consumption, preserving assistant-tool continuity.
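The sanitization rule can be illustrated with a small sketch. The message shape (a `tool_calls` list on assistant messages, `tool_call_id` on tool messages) is an assumed OpenAI-style layout, not necessarily the system's exact schema:

```python
def sanitize_history(messages: list[dict]) -> list[dict]:
    """Strip assistant tool-calls that never received a result, and drop
    orphan tool messages whose triggering call is missing."""
    # Ids of tool results present in the history.
    result_ids = {m["tool_call_id"] for m in messages if m["role"] == "tool"}
    # Ids of assistant tool-calls present in the history.
    call_ids = set()
    for m in messages:
        if m["role"] == "assistant":
            call_ids.update(c["id"] for c in m.get("tool_calls", []))

    cleaned = []
    for m in messages:
        if m["role"] == "tool":
            # Orphan tool result: no matching assistant call -> drop.
            if m["tool_call_id"] in call_ids:
                cleaned.append(m)
        elif m["role"] == "assistant" and m.get("tool_calls"):
            # Keep only calls that have a matching result.
            kept = [c for c in m["tool_calls"] if c["id"] in result_ids]
            if kept or m.get("content"):
                cleaned.append({**m, "tool_calls": kept})
        else:
            cleaned.append(m)
    return cleaned
```

After this pass, every assistant tool-call is paired with a tool result and vice versa, which prevents the downstream errors the section mentions.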
Integration with External Storage Systems
- PostgreSQL: Stores sessions and session_messages with foreign key relationships and timestamps.
- Milvus: Vector store for long-term memory embeddings; includes a patched vector store to handle metadata-only updates safely.
- Redis: Caches LLM responses, vector search results, web search results, and computed statistics to reduce latency and cost.
Caching Strategies for Performance Optimization
- LLM cache: Prompts hashed to keys with TTL; reduces repeated LLM calls.
- Vector search cache: Stores retrieval results for queries with shorter TTL.
- Web search cache: Stores web search results with moderate TTL.
- Statistics cache: Stores computed analytics with metadata and recommended refresh time.
Conversation State Serialization and Context Extraction
- Serialization: Messages stored as JSON with a "messages" array; includes role, type, content, and optional tool_call_id/timestamps.
- Context extraction: Endpoint returns flattened, cleaned messages suitable for agent prompts, skipping intermediate tool_use/tool_result blocks.
Conversation Flow Control
- Streamed responses: SSE generator emits conversation_id once, followed by text deltas, tool_use, and tool_result events, ending with [DONE].
- Non-streamed responses: Collects full response, tool calls, and tool results in a single JSON object.
- Frontend integration: React context manages session ID, messages, and API version; hooks create/delete sessions and load histories.
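The streamed-response framing can be sketched as an async generator. The event payload fields are illustrative; only the ordering (conversation_id first, events in between, a final `[DONE]`) comes from the description above:

```python
import asyncio
import json
from typing import AsyncIterator

async def sse_stream(conversation_id: str,
                     events: AsyncIterator[dict]) -> AsyncIterator[str]:
    """Sketch of the SSE framing: emit conversation_id once, then text
    deltas / tool_use / tool_result events, always ending with [DONE]."""
    yield f"data: {json.dumps({'conversation_id': conversation_id})}\n\n"
    try:
        async for event in events:
            # event["type"] would be "text", "tool_use", or "tool_result".
            yield f"data: {json.dumps(event)}\n\n"
    finally:
        # Emit the terminal marker even if the producer errors or is cancelled.
        yield "data: [DONE]\n\n"
```

FastAPI would wrap such a generator in a `StreamingResponse` with the `text/event-stream` media type.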
Error Recovery and Session Timeout Handling
- Idle cleanup: Periodic cleanup removes sessions older than max_idle_seconds; marks status as archived asynchronously.
- Graceful cancellation: Pipeline handles asyncio.CancelledError and interrupts agent execution.
- Frontend fallback: On API errors, frontend displays user-friendly messages and resets streaming state.
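One cleanup pass of the idle-eviction logic can be sketched as follows; the session shape and function name are illustrative, and the real ConversationManager would run this on a timer and archive the returned ids asynchronously:

```python
def evict_idle(sessions: dict[str, dict], max_idle_seconds: float,
               now: float) -> list[str]:
    """Remove sessions idle longer than max_idle_seconds and return their
    ids so the caller can mark them archived in the database."""
    expired = [
        sid for sid, s in sessions.items()
        if now - s["last_access"] > max_idle_seconds
    ]
    for sid in expired:
        del sessions[sid]
    return expired
```

Taking `now` as a parameter keeps the pass deterministic and easy to test; the periodic task would pass the current monotonic time.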
Dependency Analysis
- ConversationManager depends on AsyncSessionService for DB operations and optionally LongTermMemoryService for persistent user memory.
- BIReActAgent persists messages via AsyncSessionService after each reply.
- Routes depend on ConversationManager and expose session and chat endpoints.
- Frontend depends on API endpoints for session CRUD and chat streaming.
Section sources
- [chat.py]
- [sessions.py]
- [conversation_manager.py]
- [session_service_async.py]
- [models.py]
- [long_term_memory.py]
- [redis_cache.py]
Performance Considerations
- Memory compression: Enabled by default with configurable thresholds and recent rounds retention to balance context length and cost.
- Caching: Redis caches frequently accessed results to reduce latency and cost; cache statistics provide visibility into hit rates.
- Asynchronous operations: DB writes and long-term memory operations are offloaded to background tasks to minimize request latency.
- Token counting: Uses a character/token counter aligned with compression logic to avoid unnecessary recompression.
Troubleshooting Guide
- Empty or malformed histories: The system sanitizes histories to ensure assistant-tool continuity; incomplete tool-call sequences are stripped and orphan tool messages are dropped.
- Session not found or deleted: Deleting a session soft-deletes it; frontend should create a new session or refresh the list.
- Streaming issues: The SSE generator guarantees a final [DONE] marker; errors are sent with finish_reason set appropriately.
- Idle sessions disappearing: Configure max_idle_seconds and ensure periodic activity to keep sessions alive.
Conclusion
The conversation management system provides robust session lifecycle control, efficient memory persistence, and scalable context preservation across multi-agent workflows. It integrates PostgreSQL for short-term history, Milvus for long-term memory, and Redis for performance optimization. The design balances concurrency safety, error resilience, and extensibility for future enhancements.