Database Schema and Data Model

**Referenced Files in This Document** - [[schema.sql]](file/bi-analysis/docs/database/schema.sql) - [[init.sql]](file/bi-analysis/docs/database/init.sql) - [[design.md]](file/bi-analysis/docs/database/design.md) - [[table.go]](file/bi-analysis/internal/data/model/table.go) - [[dimension.go]](file/bi-analysis/internal/data/model/dimension.go) - [[field.go]](file/bi-analysis/internal/data/model/field.go) - [[field_meta.go]](file/bi-analysis/internal/data/model/field-meta.go) - [[field_group.go]](file/bi-analysis/internal/data/model/field-group.go)

Introduction
Project Structure
Core Components
Architecture Overview
Detailed Component Analysis
Dependency Analysis
Performance Considerations
Troubleshooting Guide
Conclusion

Introduction

This document provides comprehensive database schema and data model documentation for the bi-analysis StarRocks OLAP system. It covers table structures, field definitions, data types, primary keys, indexes, constraints, and the metric data model. It also explains dimension grouping, hierarchical relationships, design principles, partitioning strategies, and data lifecycle management. The document connects configuration tables with the underlying analytical data model and clarifies the relationship between configuration metadata and physical data sources.

Project Structure

The database design for bi-analysis consists of four core configuration tables that define data sources, dimensions, metrics, and field metadata. These tables are designed for StarRocks with Primary Key tables, BITMAP indexes, and soft-delete semantics.

Diagram sources

[schema.sql]

Section sources

Core Components

This section documents the core configuration tables and their roles in the bi-analysis data model.

anls_table: Defines data source tables used for analysis, including associations with dimensions and date types.
anls_dimension: Defines analytical dimensions and supports mutual exclusion rules via c_mutex_keys.
anls_field: Defines metric fields, formulas, functions, and their relationships to data sources and dimensions.
anls_field_meta: Provides metadata for available fields in source tables, enabling dynamic metric creation and validation.
anls_date: Provides a complete date dimension table for time-series analysis.
anls_template: Stores reusable analysis templates per tenant.
anls_user_home_layout: Stores user-specific dashboard layouts.
anls_field_visible_template: Stores personalized field visibility and ordering configurations.

Section sources

Architecture Overview

The bi-analysis database design centers on configuration-driven metric definition and flexible dimension modeling. The configuration tables guide how metrics are constructed, validated, and presented, while the date dimension table supports time-based analytics.

Diagram sources

[schema.sql]

Detailed Component Analysis

Data Source Configuration Table (anls_table)

Purpose:

Define data source tables used for analysis.
Associate dimensions and date types to tables.

Key fields and constraints:

Primary key: id
Tenant scoping via tenant_id
Unique constraint on c_table_name enforced by index
Soft delete via deleted_at
Status flag and sort order

Indexes:

Unique index on c_table_name
BITMAP index on status
BITMAP index on deleted_at

Partitioning and distribution:

Distributed by HASH(id) with 8 buckets
Replication set to 1

Operational notes:

Used to link metrics to source tables and dimensions.

Section sources

Dimension Configuration Table (anls_dimension)

Purpose:

Define analytical dimensions and enforce mutual exclusivity.

Key fields and constraints:

Primary key: id
Unique key: c_key
Mutual exclusion: c_mutex_keys stores comma-separated keys to exclude when selecting dimensions
Soft delete via deleted_at
Status flag and sort order

Indexes:

Unique index on c_key
BITMAP index on status
BITMAP index on deleted_at

Partitioning and distribution:

Distributed by HASH(id) with 8 buckets
Replication set to 1

Front-end logic:

The design document outlines a filtering algorithm to compute mutually exclusive dimensions based on selected keys.

Section sources

Metric Field Configuration Table (anls_field)

Purpose:

Define metric fields, formulas, functions, and their relationships to data sources and dimensions.

Key fields and constraints:

Primary key: id
Tenant scoping via tenant_id (0 indicates system-wide)
Unique key: c_field
Supports three types: field, formula, function
Units: number or percent
Expression and filters stored as JSON
Date range controls via offsets and fixed dates
Soft delete via deleted_at
Status flag and sort order

Indexes:

Unique index on c_field
BITMAP index on tenant_id
BITMAP index on c_group
BITMAP index on c_type
BITMAP index on status
BITMAP index on deleted_at

Partitioning and distribution:

Distributed by HASH(id) with 8 buckets
Replication set to 1

Relationships:

References data source tables via c_table_name and c_table_field
References dimensions via c_dimension
Supports derived metrics via c_field_p and c_expression

Initialization:

The init.sql script seeds system-level metrics across multiple groups (e.g., flow, finance, ad, sku, consult, region, wujie, amortiz, other).

Section sources

Field Metadata Table (anls_field_meta)

Purpose:

Provide metadata for available fields in source tables to enable dynamic metric creation and validation.

Key fields and constraints:

Primary key: id
Composite unique key: (c_group, c_table_field)
Grouping: c_group identifies data source type (order, refund, sd, apportion, bill_period)
Data typing: c_data_type (amount, count, rate, flag, enum)
Date support: c_date_type as JSON array
Metric capability: is_metric and default aggregation via c_func
Filter capability: is_filter and supported operators via c_operators
Enumerations: c_enum_values as JSON
Related filters: c_related_filters for metric-level filter sets
Soft delete via deleted_at
Status flag and sort order

Indexes:

Unique composite index on (c_group, c_table_field)
BITMAP index on c_group
BITMAP index on is_metric
BITMAP index on is_filter
BITMAP index on status
BITMAP index on deleted_at

Partitioning and distribution:

Distributed by HASH(id) with 4 buckets
Replication set to 1

Relationships:

Validates fields referenced by anls_field
Supports dynamic UI generation for metric creation

Section sources

Date Dimension Table (anls_date)

Purpose:

Provide a complete date dimension for time-series analysis.

Key fields and constraints:

Primary key: t_date (DATE)
Computed attributes: ISO week, year-month, quarter, day of week/month, week start/end, formatted week display
Pre-populated for years 2020–2050

Partitioning and distribution:

Distributed by HASH(t_date) with 4 buckets
Replication set to 1

Initialization:

The schema initializes the date dimension using a sequence generator and date arithmetic.

Section sources

[schema.sql]

Additional Configuration Tables

These tables support template-based analysis, user dashboards, and field visibility personalization.

anls_template: Tenant-scoped templates for analysis layouts and metric selections.
anls_user_home_layout: User-specific dashboard layouts.
anls_field_visible_template: Personalized field visibility and ordering.

Indexes and distribution:

All use distributed HASH(id) with 8 buckets and replication set to 1.
BITMAP indexes on tenant_id, status, and deleted_at where applicable.

Section sources

Dependency Analysis

The configuration tables form a layered dependency model that governs metric creation and validation.

Diagram sources

[schema.sql]

Section sources

[design.md]

Performance Considerations

Distribution and bucket sizing:
- Primary key tables use HASH(id) distribution with 4–8 buckets depending on cardinality needs.
- Date dimension uses HASH(t_date) with fewer buckets due to low-cardinality date keys.
Indexing strategy:
- BITMAP indexes on status and deleted_at enable fast filtering for soft-deleted records and active/inactive rows.
- Unique indexes on identifiers (c_table_name, c_key, c_field) prevent duplicates and accelerate joins.
Replication:
- All tables set replication_num to 1, suitable for read-heavy analytical workloads.
Data lifecycle:
- Soft delete pattern via deleted_at allows safe archival and recovery without physical deletion.
Partitioning:
- No explicit partitioning directives are present; time-series queries leverage the date dimension and BITMAP filters.

[No sources needed since this section provides general guidance]

Troubleshooting Guide

Common issues and resolutions:

Duplicate metric field names:
- Symptom: Insert fails on unique index for c_field.
- Resolution: Ensure c_field uniqueness per tenant; system-level metrics use tenant_id=0.
Invalid dimension selection:
- Symptom: Front-end shows no available dimensions after selection.
- Resolution: Verify c_mutex_keys on selected dimensions; mutual exclusion prevents overlapping selections.
Missing source field metadata:
- Symptom: Metric creation fails validation.
- Resolution: Confirm c_group and c_table_field exist in anls_field_meta; ensure is_metric/is_filter flags match intended usage.
Soft-deleted records interfering with queries:
- Symptom: Queries return unexpected inactive rows.
- Resolution: Apply deleted_at=0 filter in queries; BITMAP indexes on deleted_at support efficient filtering.

Section sources

Conclusion

The bi-analysis StarRocks database design leverages configuration-driven metadata to power flexible, tenant-aware metric creation and time-series analysis. The schema emphasizes soft deletes, BITMAP indexing, and primary key tables aligned with StarRocks best practices. The date dimension table and configuration tables collectively support robust analytical workflows, from dimension grouping and mutual exclusion to dynamic metric composition and personalization.

brownfield

Multi Agent System Architecture

Data Flow And Processing Architecture

Microservices Architecture And Design Patterns

Core Services

Bi Analysis Analytics Engine

Bi Basic Foundation Data Services

Bi Server Business Orchestration Center

Management Services

Bi Sys System Management Service

Bi Tenant Multi Tenant Management Service

Shared Infrastructure

External Data Integration

Jushuitan Erp Integration

Leke Erp Integration

Kafka Data Synchronization Pipeline

Starrocks Olap Database

Admin Panel Ui Web Admin

Tenant Console Ui Web

Database Schema and Data Model

Table of Contents

Introduction

Project Structure

Core Components

Architecture Overview

Detailed Component Analysis

Data Source Configuration Table (anls_table)

Dimension Configuration Table (anls_dimension)

Metric Field Configuration Table (anls_field)

Field Metadata Table (anls_field_meta)

Date Dimension Table (anls_date)

Additional Configuration Tables

Dependency Analysis

Performance Considerations

Troubleshooting Guide

Conclusion

Bi Analysis Analytics Engine

Bi Basic Foundation Data Services

Bi Server Business Orchestration Center

Bi Sys System Management Service

Bi Tenant Multi Tenant Management Service

Jushuitan Erp Integration

Leke Erp Integration

Database Schema and Data Model ​

Table of Contents ​

Introduction ​

Project Structure ​

Core Components ​

Architecture Overview ​

Detailed Component Analysis ​

Data Source Configuration Table (anls_table) ​

Dimension Configuration Table (anls_dimension) ​

Metric Field Configuration Table (anls_field) ​

Field Metadata Table (anls_field_meta) ​

Date Dimension Table (anls_date) ​

Additional Configuration Tables ​

Dependency Analysis ​

Performance Considerations ​

Troubleshooting Guide ​

Conclusion ​

Database Schema and Data Model

Table of Contents

Introduction

Project Structure

Core Components

Architecture Overview

Detailed Component Analysis

Data Source Configuration Table (anls_table)

Dimension Configuration Table (anls_dimension)

Metric Field Configuration Table (anls_field)

Field Metadata Table (anls_field_meta)

Date Dimension Table (anls_date)

Additional Configuration Tables

Dependency Analysis

Performance Considerations

Troubleshooting Guide

Conclusion