Skip to content

Database Schema Overview

**Referenced Files in This Document** - [[schema.sql]](file/bi-analysis/docs/database/schema.sql) - [[init.sql]](file/bi-analysis/docs/database/init.sql) - [[design.md]](file/bi-analysis/docs/database/design.md) - [[table.go]](file/bi-analysis/internal/data/model/table.go) - [[dimension.go]](file/bi-analysis/internal/data/model/dimension.go) - [[field.go]](file/bi-analysis/internal/data/model/field.go) - [[field_meta.go]](file/bi-analysis/internal/data/model/field-meta.go)

Table of Contents

  1. Introduction
  2. Project Structure
  3. Core Components
  4. Architecture Overview
  5. Detailed Component Analysis
  6. Dependency Analysis
  7. Performance Considerations
  8. Troubleshooting Guide
  9. Conclusion

Introduction

This document provides a comprehensive database schema overview for the StarRocks OLAP database supporting the bi-analysis BI system. It documents all nine tables, their purposes, relationships, multi-tenant design, partitioning strategies, and operational patterns optimized for analytical workloads and BI queries.

Project Structure

The database schema is defined under the bi-analysis module with two primary artifacts:

  • Schema definition script containing all nine tables and indexes
  • Initialization script with seed data for field groups and system-level metrics
  • Design documentation detailing the conceptual model and tenant-aware patterns

Diagram sources

Section sources

Core Components

This section introduces the nine tables that form the analytical metadata and configuration layer for the BI system. Each table serves a distinct role in defining data sources, dimensions, metrics, templates, and user personalization.

  • anls_table: Defines BI data source tables and their relationships to dimensions and date types.
  • anls_dimension: Configures analytical dimensions and supports mutual exclusivity among dimensions.
  • anls_field: Stores tenant-aware metric definitions, formulas, and filters.
  • anls_field_group: Categorizes metrics into logical groups and controls template creation permissions.
  • anls_date: Provides a complete date dimension table for time-based analysis.
  • anls_template: Saves user-defined analysis templates with row/column fields and configurations.
  • anls_field_meta: Captures metadata for data source fields enabling dynamic metric creation and validation.
  • anls_user_home_layout: Stores user-specific home dashboard layouts per device type.
  • anls_field_visible_template: Maintains per-user visibility and ordering preferences for templates.

Section sources

Architecture Overview

The schema follows a StarRocks OLAP design optimized for analytical queries with:

  • Primary Key tables with BITMAP indexes on status and soft-delete fields
  • Hash distribution on surrogate keys for balanced data locality
  • JSON fields for flexible expressions and filter configurations
  • Tenant-aware fields to isolate multi-tenant data and configurations

Diagram sources

Detailed Component Analysis

anls_table: Data Source Configuration

Purpose:

  • Defines BI data source tables and associates them with dimensions and date types.
  • Supports tenant scoping for multi-tenancy.

Key attributes:

  • Primary key: id
  • Tenant scoping: tenant_id
  • Unique constraint: c_table_name
  • Status and soft delete: status, deleted_at
  • Audit fields: created_by, updated_by, timestamps

Indexing and distribution:

  • Hash distribution on id
  • BITMAP index on status
  • BITMAP index on deleted_at
  • Unique index on c_table_name

Relationships:

  • Associates with anls_field via c_table_name for metric-to-source linkage.

Section sources

anls_dimension: Dimension Configuration and Mutual Exclusivity

Purpose:

  • Defines analytical dimensions (e.g., time, product, shop, geography).
  • Implements mutual exclusivity among dimensions using c_mutex_keys.

Key attributes:

  • Primary key: id
  • Unique constraint: c_key
  • Mutual exclusion: c_mutex_keys (comma-separated dimension keys)
  • Sorting and status: i_sort, status
  • Soft delete: deleted_at
  • Audit fields: created_by, updated_by, timestamps

Indexing and distribution:

  • Hash distribution on id
  • BITMAP index on status
  • BITMAP index on deleted_at
  • Unique index on c_key

Frontend usage:

  • Supports dynamic dimension selection with mutual exclusivity enforcement.

Section sources

anls_field: Metric Definitions and Tenant Isolation

Purpose:

  • Stores tenant-aware metric definitions, formulas, and filters.
  • Supports system-level metrics (tenant_id = 0) shared across tenants.

Key attributes:

  • Primary key: id
  • Tenant scoping: tenant_id (0 = system-wide)
  • Unique constraint: c_field
  • Metric metadata: c_display, c_group, c_type, c_unit
  • Expression and filtering: c_expression (JSON), c_where (JSON)
  • Date range controls: i_start_diff_day, i_end_diff_day, t_start_date, t_end_date
  • Template flag: is_template
  • Sorting and status: i_sort, status
  • Soft delete: deleted_at
  • Audit fields: created_by, updated_by, timestamps

Indexing and distribution:

  • Hash distribution on id
  • BITMAP indexes on tenant_id, c_group, c_type, status, deleted_at
  • Unique index on c_field

Relationships:

  • References anls_field_meta for field validation and metadata.
  • References anls_table for data source linkage.

Section sources

anls_field_group: Metric Grouping and Template Controls

Purpose:

  • Groups metrics into logical categories (e.g., traffic, finance, SKU).
  • Controls whether tenants can create template-based metrics.

Key attributes:

  • Primary key: id
  • Unique constraint: c_key
  • Name and sort: c_name, i_sort
  • Template permission: allow_template
  • Status and soft delete: status, deleted_at
  • Audit fields: created_by, updated_by, timestamps

Indexing and distribution:

  • Hash distribution on id
  • BITMAP indexes on status, deleted_at

Section sources

anls_date: Complete Date Dimension

Purpose:

  • Provides a comprehensive date dimension table for time-based analysis.
  • Range: 2020-01-01 to 2050-12-31.

Key attributes:

  • Primary key: t_date
  • ISO week, month, quarter, year, day-of-week, day-of-month, week-of-year
  • Week boundaries and display formatting

Distribution and initialization:

  • Hash distribution on t_date
  • Pre-populated with generated sequences for full coverage.

Section sources

anls_template: User-Defined Analysis Templates

Purpose:

  • Saves user-defined analysis templates with row/column fields and configurations.
  • Supports tenant scoping.

Key attributes:

  • Primary key: id
  • Tenant scoping: tenant_id
  • Owner identification: oper_id
  • Template metadata: c_name, c_icon, c_color, c_row, c_col, c_field, c_field_config, c_remark, i_sort
  • Status and soft delete: status, deleted_at
  • Audit fields: created_by, updated_by, timestamps

Indexing and distribution:

  • Hash distribution on id
  • BITMAP indexes on tenant_id, status, deleted_at

Section sources

anls_field_meta: Dynamic Field Metadata and Validation

Purpose:

  • Captures metadata for data source fields enabling dynamic metric creation and validation.
  • Uses c_group to map to dynamic base tables (e.g., order/refund/sd/apportion/bill_period).

Key attributes:

  • Primary key: id
  • Field identity: c_table_field, c_display
  • Source grouping: c_group
  • Data typing: c_data_type, c_date_type (JSON), c_func
  • Metric/filter capability: is_metric, is_filter
  • Related filters and operators: c_related_filters (JSON), c_operators (JSON)
  • Enumerations: c_enum_values (JSON)
  • Sorting and status: i_sort, status
  • Soft delete: deleted_at
  • Audit fields: timestamps

Indexing and distribution:

  • Hash distribution on id
  • BITMAP indexes on is_metric, is_filter, status, deleted_at
  • Unique composite index on (c_group, c_table_field)

Relationships:

  • Validates fields referenced by anls_field.
  • Supports dynamic metric creation and filter rendering.

Section sources

anls_user_home_layout: Personalized Dashboard Layouts

Purpose:

  • Stores user-specific home dashboard layouts per device type.
  • Supports tenant scoping.

Key attributes:

  • Primary key: id
  • Tenant and user identifiers: tenant_id, user_id
  • Device type: c_type (default: pc)
  • Layout configuration: c_layout_data (JSON)
  • Audit fields: created_by, updated_by, timestamps
  • Soft delete: deleted_at

Indexing and distribution:

  • Hash distribution on id
  • BITMAP indexes on tenant_id, user_id, c_type, deleted_at

Section sources

anls_field_visible_template: Per-User Template Visibility

Purpose:

  • Maintains per-user visibility, column ordering, and summary preferences for templates.
  • Mirrors external system compatibility (FieldVisibleTemplate).

Key attributes:

  • Primary key: id
  • Tenant scoping: tenant_id
  • User/operator identifiers: i_user_id, i_oper_id
  • Template key: c_table_key (e.g., templateTable/618)
  • Configuration name/value: c_name, c_value (JSON)
  • Audit fields: created_by, updated_by, timestamps
  • Soft delete: deleted_at

Indexing and distribution:

  • Hash distribution on id
  • BITMAP indexes on tenant_id, i_user_id, c_table_key, deleted_at

Section sources

Dependency Analysis

The following diagram illustrates the relationships among the nine tables and how they collaborate to support the BI configuration and analysis workflow.

Diagram sources

Section sources

Performance Considerations

  • Distribution strategy:
    • Hash distribution on primary keys ensures balanced data locality and efficient joins.
    • Bucket counts are tuned per table to balance cardinality and concurrency.
  • Indexing strategy:
    • BITMAP indexes on status and deleted_at enable fast filtering for active records and soft-deleted isolation.
    • Unique indexes prevent duplicates and accelerate lookups for identifiers.
  • Storage properties:
    • Replication set to 1 for development/stage; production may require higher replication for durability.
  • Analytical workload optimization:
    • JSON fields (c_expression, c_where, c_date_type, c_operators, c_enum_values) support flexible configurations while maintaining query performance through targeted filtering and indexing.
    • Date dimension table (anls_date) precomputes derived attributes to reduce runtime computation in analytical queries.

[No sources needed since this section provides general guidance]

Troubleshooting Guide

Common operational issues and resolutions:

  • Duplicate metric creation:

    • Symptom: Insert fails due to unique constraint on c_field.
    • Resolution: Ensure c_field uniqueness within tenant scope; system-level metrics use tenant_id = 0.
  • Soft deletion and visibility:

    • Symptom: Records appear missing in queries.
    • Resolution: Filter by deleted_at = 0 or use views that exclude deleted records.
  • Tenant isolation failures:

    • Symptom: Metrics/templates visible across tenants.
    • Resolution: Verify tenant_id filtering in queries and application logic; ensure BITMAP index on tenant_id is leveraged.
  • Dimension mutual exclusivity:

    • Symptom: Conflicting selections persist.
    • Resolution: Validate c_mutex_keys configuration and enforce at the application level during selection.
  • Date dimension gaps:

    • Symptom: Missing dates in analysis.
    • Resolution: Confirm anls_date initialization covers the required date range.

Section sources

Conclusion

The bi-analysis StarRocks schema provides a robust foundation for multi-tenant BI analytics. It balances flexibility with performance through tenant-aware fields, BITMAP indexing, hash distribution, and JSON-based extensibility. The nine-table design cleanly separates concerns across data sources, dimensions, metrics, templates, metadata, and user personalization, enabling scalable and maintainable analytical workflows.