Analytical Processing and Data Transformation

**Referenced Files in This Document** - [[README.md]](file/bi-analysis/readme.md) - [[go.mod]](file/bi-analysis/go.mod) - [[sql_builder.go]](file/bi-analysis/internal/biz/sql-builder.go) - [[sql_formula.go]](file/bi-analysis/internal/biz/sql-formula.go) - [[sql_validator.go]](file/bi-analysis/internal/biz/sql-validator.go) - [[sql_order_view.go]](file/bi-analysis/internal/biz/sql-order-view.go) - [[sql_product_view.go]](file/bi-analysis/internal/biz/sql-product-view.go) - [[query.go]](file/bi-analysis/internal/biz/query.go) - [[field.go]](file/bi-analysis/internal/biz/field.go) - [[dimension.go]](file/bi-analysis/internal/biz/dimension.go) - [[table.go]](file/bi-analysis/internal/biz/table.go) - [[data.go]](file/bi-analysis/internal/data/data.go)

Introduction
Project Structure
Core Components
Architecture Overview
Detailed Component Analysis
Dependency Analysis
Performance Considerations
Troubleshooting Guide
Conclusion

Introduction

This document explains the analytical processing capabilities of bi-analysis, focusing on the dynamic field processing system, SQL query building mechanisms, and data transformation pipelines. It covers metric calculation algorithms, aggregation functions, formula processing, query optimization strategies, caching mechanisms, performance tuning, and integration with external data sources. Real-time data processing workflows are addressed through the SQL builder’s support for order-related views and computed fields.

Project Structure

The bi-analysis module is a Kratos-based microservice that integrates with bi-common and exposes analytical APIs. The internal/biz package implements the business logic for query orchestration, SQL generation, formula parsing, and validation. The internal/data package provides repositories and persistence wiring via GORM and Redis.

Diagram sources

Section sources

Core Components

QueryUsecase orchestrates analytical queries: parses requests, resolves field dependencies, builds SQL, executes, and formats results.
SQLBuilder constructs complex CTE-based SQL tailored to field categories (base, formula, order/finance, DIY).
FormulaParser converts JSON-formatted formula expressions into safe SQL with division-by-zero protection and IFNULL wrapping.
Validators enforce safe WHERE conditions and date formats.
OrderViewBuilder and ProductViewBuilder generate canonical CTEs for order/finance and product analytics aligned with “Yingfeng” semantics.
Repositories and Transaction abstraction provide persistence and optional Redis caching.

Section sources

Architecture Overview

The analytical pipeline follows a layered architecture:

API layer (Protobuf) receives structured requests.
Business layer (usecases) validates inputs, resolves dependencies, and delegates SQL construction.
SQL builder composes CTEs for product info, sales/refunds, and derived views.
Formula parser transforms expressions into SQL with safety checks.
Data layer executes SQL and returns typed results, formatted for clients.

Diagram sources

Detailed Component Analysis

Dynamic Field Processing System

Field categorization: Fields are classified into base, formula, period (derived), DIY, and order/finance groups. Order/finance fields trigger the use of a unified sale day view.
Runtime configuration: Per-field aggregation modes (SUM, AVG, MAX, MIN, AVG_CLEAN) are resolved from runtime config or defaults.
Formula dependencies: Formula fields are parsed and dependencies are recursively loaded to ensure correct evaluation order.

Diagram sources

Section sources

SQL Query Building Mechanisms

CTE composition: The builder generates multiple CTEs (product_info, data sources, product_date, merge_base, product_diy, dimension_group, merge_formula, result, result_limit, result_count).
Simplified paths: Dedicated SQL builders for pure DIY metrics and pure order/finance metrics enable efficient execution plans.
Ranking and pagination: Uses dense ranking and an i_order_by column derived from the first dimension to ensure stable ordering and pagination.
Performance hint: Emits a SQL hint to guide query planner nodes.

Diagram sources

Section sources

Data Transformation Pipelines

Aggregation functions: getAggFunc/getAggExpr/getDimensionAggExpr select appropriate aggregate functions and wrap expressions for null handling and zero-div prevention.
Formula evaluation: FormulaParser wraps numeric operands and division operators with IFNULL/NULLIF to avoid nulls and division-by-zero.
Result formatting: Values are converted to strings with date/time formatting and nil normalization.

Diagram sources

Section sources

Metric Calculation Algorithms and Aggregation Functions

Finance metrics mapping: A strict mapping ensures fields like payment amount, refunds, and derived purity metrics resolve to canonical columns in the unified sale-day view.
Product metrics mapping: Product indicators (sales, refunds,试用, purity) map to fields in the product merge view.
Group detection: isOrderRelatedTable, IsFinanceGroup, IsProductGroup route fields to appropriate CTEs.

Diagram sources

Section sources

Formula Processing Capabilities

Token-based parsing: JSON arrays of tokens (field, symbol, parentheses, number) are validated and transformed into SQL.
Safety checks: Division operators receive NULLIF wrapping; unknown operators and invalid numbers are rejected.
Dependency sorting: Topological sort ensures composite formulas evaluate inner dependencies first.

Diagram sources

Section sources

Query Optimization Strategies

Early routing: Pure-DIY and pure-order/finance paths bypass unnecessary joins and reduce plan complexity.
Date filtering: getDateRange enforces bounds and supports flexible date inputs.
Planner hint: A SQL hint is embedded to influence node placement.
Ranking window functions: Dense rank and partitioned ordering ensure stable pagination without OFFSET scans.

Section sources

Caching Mechanisms and Performance Tuning

Optional Redis: The data layer initializes Redis when configured; while not explicitly used in analytical queries here, it is available for caching metadata or intermediate results.
Transaction isolation: InTx ensures consistent reads/writes across operations.
Pagination: Uses window functions and dense ranks to minimize overhead compared to OFFSET-based pagination.

Section sources

Examples of Complex Analytical Operations

Multi-dimensional aggregations: Rows and columns are split and trimmed, then used to construct GROUP BY clauses and ORDER BY expressions.
Time-series analysis: When time dimensions are present, time series CTEs are generated and LEFT JOINed to ensure zero-fill behavior for missing dates.
Derived metrics: Purity metrics (net items/orders/buyers, net amount) and cost/fee adjustments are computed in dedicated CTEs and reused across queries.

Section sources

Data Validation, Error Handling, and Logging Patterns

WHERE condition parser: Validates operators, escapes values, and rejects invalid field names or empty IN lists.
Date validator: Accepts standard dates or allowed SQL date functions; falls back to current date when invalid.
Logging: Extensive debug logs during SQL generation and shop resolution; errors are wrapped with context for traceability.

Section sources

Integration with External Data Sources and Real-Time Workflows

Tenant-scoped tables: Order/refund/sd tables are dynamically constructed per tenant and platform/shop, enabling multi-tenant isolation.
Unified sale-day view: Orders, refunds, and试用 are merged into a single view to compute derived metrics consistently.
Shop-aware queries: When order/finance fields are present, shop lists are resolved to build per-shop CTEs and union them.

Section sources

Dependency Analysis

The module depends on bi-common for configuration, database, cache, and API response types. Wire is used for dependency injection.

Diagram sources

[go.mod]

Section sources

[go.mod]

Performance Considerations

Prefer pure-DIY or pure-order/finance routes when feasible to minimize joins.
Use dense rank and partitioned ordering for stable pagination.
Wrap division operands with NULLIF to prevent expensive divide-by-zero exceptions.
Limit result sets and leverage time-series zero-fill only when needed.

Troubleshooting Guide

Unsupported base metrics: Queries requiring base tenant metrics return an error; switch to DIY or order/finance fields.
Empty shop list: If no shops are found for a tenant, the builder returns an empty CTE; verify tenant/shop configuration.
Invalid formula or WHERE: Parsing errors are returned with context; validate JSON tokens and operators.
Date range issues: Invalid dates fall back to sensible defaults; ensure correct date formats or supported SQL functions.

Section sources

Conclusion

Bi-analysis provides a robust, extensible framework for analytical queries. Its SQL builder composes canonical CTEs for order/finance and product domains, safely evaluates formulas, and enforces validations. With optional Redis caching and transactional guarantees, it supports scalable, accurate analytics across tenants and time-series workloads.

brownfield

Multi Agent System Architecture

Data Flow And Processing Architecture

Microservices Architecture And Design Patterns

Core Services

Bi Analysis Analytics Engine

Bi Basic Foundation Data Services

Bi Server Business Orchestration Center

Management Services

Bi Sys System Management Service

Bi Tenant Multi Tenant Management Service

Shared Infrastructure

External Data Integration

Jushuitan Erp Integration

Leke Erp Integration

Kafka Data Synchronization Pipeline

Starrocks Olap Database

Admin Panel Ui Web Admin

Tenant Console Ui Web

Analytical Processing and Data Transformation

Table of Contents

Introduction

Project Structure

Core Components

Architecture Overview

Detailed Component Analysis

Dynamic Field Processing System

SQL Query Building Mechanisms

Data Transformation Pipelines

Metric Calculation Algorithms and Aggregation Functions

Formula Processing Capabilities

Query Optimization Strategies

Caching Mechanisms and Performance Tuning

Examples of Complex Analytical Operations

Data Validation, Error Handling, and Logging Patterns

Integration with External Data Sources and Real-Time Workflows

Dependency Analysis

Performance Considerations

Troubleshooting Guide

Conclusion

Bi Analysis Analytics Engine

Bi Basic Foundation Data Services

Bi Server Business Orchestration Center

Bi Sys System Management Service

Bi Tenant Multi Tenant Management Service

Jushuitan Erp Integration

Leke Erp Integration

Analytical Processing and Data Transformation ​

Table of Contents ​

Introduction ​

Project Structure ​

Core Components ​

Architecture Overview ​

Detailed Component Analysis ​

Dynamic Field Processing System ​

SQL Query Building Mechanisms ​

Data Transformation Pipelines ​

Metric Calculation Algorithms and Aggregation Functions ​

Formula Processing Capabilities ​

Query Optimization Strategies ​

Caching Mechanisms and Performance Tuning ​

Examples of Complex Analytical Operations ​

Data Validation, Error Handling, and Logging Patterns ​

Integration with External Data Sources and Real-Time Workflows ​

Dependency Analysis ​

Performance Considerations ​

Troubleshooting Guide ​

Conclusion ​

Analytical Processing and Data Transformation

Table of Contents

Introduction

Project Structure

Core Components

Architecture Overview

Detailed Component Analysis

Dynamic Field Processing System

SQL Query Building Mechanisms

Data Transformation Pipelines

Metric Calculation Algorithms and Aggregation Functions

Formula Processing Capabilities

Query Optimization Strategies

Caching Mechanisms and Performance Tuning

Examples of Complex Analytical Operations

Data Validation, Error Handling, and Logging Patterns

Integration with External Data Sources and Real-Time Workflows

Dependency Analysis

Performance Considerations

Troubleshooting Guide

Conclusion