Natural Language to SQL Integration
Table of Contents
- Introduction
- Project Structure
- Core Components
- Architecture Overview
- Detailed Component Analysis
- Dependency Analysis
- Performance Considerations
- Troubleshooting Guide
- Conclusion
- Appendices
Introduction
This document explains the NL2SQL integration with AskTable in the bi-chat-asktable service. It covers how natural language queries are transformed into safe, executable SQL against a StarRocks data source, how row-level access control is enforced, and how AskTable is integrated for chat-based query experiences. It also documents the AskTable service integration (authentication, API communication, and response handling), the template-like access policy system that adapts generic SQL patterns to specific business contexts, and practical troubleshooting and performance guidance.
Project Structure
The bi-chat-asktable module is a FastAPI service that:
- Exposes REST endpoints for token creation, chat session management, embed URL generation, access policy creation, and datasource setup.
- Manages configuration via environment variables and optional Nacos overrides.
- Integrates with AskTable SDK and HTTP APIs to provision tokens, chats, policies, and datasources.
- Provides a convenience endpoint that composes token creation, chat creation, and embed URL building.
Diagram sources
Section sources
Core Components
- FastAPI application factory and lifecycle events
- Configuration loader with Nacos support and environment fallback
- AskTableService for SDK and HTTP integrations
- Route handlers for token, chat, embed URL, policy, and datasource operations
- Pydantic models for request/response schemas and standardized API responses
Key responsibilities:
- Token creation with row-level access control variables
- Chat session provisioning per user and role
- Access policy creation for table-level and row-level filtering
- Datasource setup for StarRocks with metadata building
- Embed URL composition for client-side widgets
Section sources
- [app/main.py]
- [app/core/config.py]
- [app/services/asktable_service.py]
- [app/api/routes.py]
- [app/models/schemas.py]
Architecture Overview
The NL2SQL sandbox architecture focuses on secure, context-aware query execution:
- Schema retrieval: Obtain metadata from StarRocks via AskTable datasource.
- SQL generation: Use LLM prompts optimized for StarRocks to produce SQL.
- Execution sandbox: Enforce read-only access, inject tenant filters, and cap scan volume/time.
Diagram sources
Section sources
Detailed Component Analysis
AskTableService: AskTable Integration and Access Control
AskTableService encapsulates all AskTable operations:
- Token creation with role variables for row-level filtering
- Chat session creation and retrieval
- Access policy creation with regex-based table patterns and row filters
- Datasource setup for StarRocks with metadata building
- Embed URL derivation based on API URL
Diagram sources
Key behaviors:
- Row-level filtering: Access policies inject conditions like shop_id filters for basic_ and diy_ tables.
- Datasource scoping: Regex patterns limit policies to specific table families.
- Embed URL construction: Derives embed base URL from configured API URL.
Section sources
Route Handlers: API Workflows
The route handlers orchestrate AskTable operations and return standardized responses.
Diagram sources
Additional workflows:
- Embed URL composition: Creates token, chat, and returns embed URL with chatId and apiKey.
- Access policy creation: Builds policies scoped to StarRocks tables and applies row filters.
- Datasource setup: Provisions StarRocks datasource and metadata.
Section sources
Configuration Management: Environment and Nacos
Configuration supports environment variables and optional Nacos overrides:
- AskTable API URL, API key, bot ID, datasource ID
- StarRocks connection details
- Redis settings
- Token TTL
Diagram sources
Section sources
Template-Friendly Access Policies: Adapting Generic Patterns to Business Contexts
Access policies define generic SQL patterns that adapt to specific schemas:
- Regex-based table pattern matching for basic_ and diy_ tables
- Row-level filters injected via role variables (e.g., shop_id IN (...))
- Datasource-scoped permissions ensuring policies apply only to configured StarRocks database
Diagram sources
Section sources
NL2SQL Workflow: From Natural Language to Executable SQL
The NL2SQL workflow documented here focuses on the AskTable integration layer:
- Schema retrieval: AskTable datasource metadata provides table/column context for LLM prompting.
- SQL generation: LLM generates StarRocks-compatible SQL using curated system prompts.
- Execution sandbox: Enforce read-only access, inject tenant filters (e.g., i_tenant_id), and constrain scan volume/time.
Diagram sources
Section sources
Dependency Analysis
External dependencies and integrations:
- AskTable SDK and HTTP APIs for tokens, chats, policies, and datasources
- StarRocks datasource connection for metadata and query execution
- Nacos for centralized configuration management
- FastAPI for routing and HTTP transport
Diagram sources
Section sources
Performance Considerations
- Limit query execution time and scanned rows in the sandbox to prevent resource exhaustion.
- Prefer read-only connections and scoped access policies to minimize overhead.
- Cache frequently accessed metadata and tokens where appropriate.
- Use regex-based policies to avoid expensive joins and broad scans.
- Tune token TTL to balance security and UX latency.
[No sources needed since this section provides general guidance]
Troubleshooting Guide
Common issues and resolutions:
- Invalid or missing AskTable credentials
- Verify ASKTABLE_API_KEY and ASKTABLE_API_URL in environment or Nacos.
- Confirm bot ID and datasource ID are set and valid.
- Placeholder keys in configuration
- Warning logs indicate placeholder values; replace with real keys.
- Token creation failures
- Ensure shop_ids are provided and formatted correctly.
- Check token TTL and role variables.
- Chat creation errors
- Validate token and bot_id; confirm chat permissions.
- Access policy creation failures
- Confirm datasource_id matches the configured StarRocks datasource.
- Ensure regex patterns match target table families.
- Datasource setup failures
- Verify StarRocks connectivity (host, port, user, password, database).
- Re-run setup script to rebuild metadata.
Operational checks:
- Health endpoint: GET /api/v1/asktable/health
- Embed URL endpoint: POST /api/v1/asktable/embed-url for quick validation
Section sources
Conclusion
The bi-chat-asktable service provides a secure, configurable bridge between natural language queries and StarRocks-backed SQL execution via AskTable. By combining row-level access control, regex-scoped policies, and a streamlined token/chat lifecycle, it enables context-aware NL2SQL experiences while enforcing safety and performance constraints.
[No sources needed since this section summarizes without analyzing specific files]
Appendices
API Reference Summary
- POST /api/v1/asktable/token: Create a temporary token with row-level access variables.
- POST /api/v1/asktable/chat: Create a chat session bound to a token.
- GET /api/v1/asktable/chat/{chat_id}: Retrieve chat details.
- POST /api/v1/asktable/embed-url: Compose embed URL with token and chat.
- POST /api/v1/asktable/policy: Create access policy with row filters.
- POST /api/v1/asktable/setup-datasource: Provision StarRocks datasource and metadata.
- GET /api/v1/asktable/health: Health check.
Section sources
Setup Script Usage
Run the setup script once to configure:
- StarRocks datasource
- Metadata
- AI assistant bot
- Role with shop-based row filters
The script prints generated IDs for configuration updates.
Section sources