Product Catalog Synchronization

**Referenced Files in This Document**

  • [[goods.go]](file/bi-api-leke/internal/service/consumer-handlers/goods.go)
  • [[goods.go]](file/bi-api-leke/internal/biz/usecase/goods.go)
  • [[data.go]](file/bi-api-leke/internal/biz/usecase/data.go)
  • [[goods.go]](file/bi-api-leke/internal/biz/bo/data.go)
  • [[goods.proto]](file/bi-server/api/bi-leke/v1/goods/goods.proto)
  • [[goods.pb.go]](file/bi-basic/api/goods/v1/goods.pb.go)
  • [[goods_category.pb.go]](file/bi-basic/api/good-category/v1/goods-category.pb.go)
  • [[goods_category.pb.go]](file/bi-proto/basic/v1/category/goods-category.pb.go)
  • [[kafka.pb.validate.go]](file/bi-proto/kafka/v1/kafka.pb.validate.go)
  • [[kafka.pb.validate.go]](file/bi-server/api/bi-base/v1/kafka/kafka.pb.validate.go)
  • [[goods.pb.validate.go]](file/bi-basic/api/goods/v1/goods.pb.validate.go)
  • [[product.ts]](file/ui-web/src/api/product.ts)

Table of Contents

  1. Introduction
  2. Project Structure
  3. Core Components
  4. Architecture Overview
  5. Detailed Component Analysis
  6. Dependency Analysis
  7. Performance Considerations
  8. Troubleshooting Guide
  9. Conclusion

Introduction

This document describes the product catalog synchronization system. The system extracts product data from external platforms via APIs, manages SKU variants, maps product attributes, synchronizes categories and brands, tracks product lifecycle and inventory, validates data, detects duplicates, and keeps prices consistent. The focus is the Taobao/Leke integration implemented in the bi-api-leke module, together with the downstream consumers that process product and SKU events.

Project Structure

The product catalog synchronization spans several modules:

  • Consumer handlers orchestrate API calls, pagination, and event publishing to Kafka.
  • Business use cases coordinate data fetching, parsing, and message generation.
  • Data parsing utilities handle flexible JSON structures (single vs array) for items and SKUs.
  • Protobuf models define product and SKU data contracts and validation rules.
  • Category and brand metadata are managed via category APIs and standardized category displays.
  • Frontend components consume product and SKU details for display and selection.

Core Components

  • GoodsUsecase: Builds Kafka messages for paginated ItemsOnsale requests, validates DataKey, and coordinates asynchronous dispatch.
  • ConsumerHandlers: Processes ItemsOnsale messages, fetches ItemSellerGet for details, parses SKUs, converts to Kafka messages, and publishes batches.
  • DataUsecase: Parses ItemSkusGet and ItemSellerGet responses, normalizes SKU lists, handles property images, and cleans property names.
  • Protobuf models: Define Product and SkuWithInventory structures, including validation rules for price, quantities, and status.
  • Category and Brand: Standard category display and category tree APIs support category mapping and hierarchy.
  • Frontend: Provides product and SKU detail interfaces for UI consumption.

Architecture Overview

The synchronization pipeline:

  1. GoodsUsecase queries shop authorization details and builds ItemsOnsale fetch messages per shop/DataKey.
  2. ConsumerHandlers receives ItemsOnsale messages, validates DataKey, and paginates through ItemsOnsale.
  3. For each item, ConsumerHandlers calls ItemSellerGet to retrieve detailed product and SKU information.
  4. SKUs are parsed, property images mapped, and product/SKU data normalized.
  5. Converted messages are batched and sent to Kafka for downstream processing.
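
The batching in step 5 can be pictured as splitting the converted messages into fixed-size groups before each publish call. The following is a minimal sketch, not the repository's code: the `Message` type, the `chunk` helper, and the batch size are all hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for a converted product/SKU Kafka message.
type Message struct {
	Key   string
	Value []byte
}

// chunk splits msgs into consecutive batches of at most size elements.
// The returned batches share the backing array of msgs.
func chunk(msgs []Message, size int) [][]Message {
	if size <= 0 {
		return nil
	}
	var batches [][]Message
	for start := 0; start < len(msgs); start += size {
		end := start + size
		if end > len(msgs) {
			end = len(msgs)
		}
		batches = append(batches, msgs[start:end])
	}
	return batches
}

func main() {
	msgs := make([]Message, 5)
	for i := range msgs {
		msgs[i].Key = fmt.Sprintf("item-%d", i)
	}
	// Each batch would be handed to the Kafka publisher in one call.
	for i, b := range chunk(msgs, 2) {
		fmt.Printf("batch %d: %d messages\n", i, len(b))
	}
}
```

Sharing the backing array avoids copying messages; a publisher that mutates its input would need a copy instead.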

Detailed Component Analysis

Product Data Extraction with Pagination

  • ItemsOnsale pagination: The system determines total results, computes total pages using a fixed page size, and emits one Kafka message per page.
  • DataKey validation: Each ItemsOnsale request is validated against the seller nickname and DataKey before proceeding.
  • Parallel detail fetching: For each page, multiple ItemSellerGet calls are executed concurrently up to a configurable limit.
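
The page computation described above amounts to a ceiling division followed by one request per page. This sketch assumes a hypothetical `PageRequest` payload and 1-based page numbers; the actual page size and message shape used by GoodsUsecase are not shown in this document.

```go
package main

import "fmt"

// PageRequest is a hypothetical payload for one ItemsOnsale page fetch.
type PageRequest struct {
	PageNo   int
	PageSize int
}

// totalPages returns the number of pages needed to cover total results,
// using ceiling division. It returns 0 for non-positive inputs.
func totalPages(total, pageSize int) int {
	if total <= 0 || pageSize <= 0 {
		return 0
	}
	return (total + pageSize - 1) / pageSize
}

// pageRequests builds one request per page, numbered from 1 (an assumption).
func pageRequests(total, pageSize int) []PageRequest {
	n := totalPages(total, pageSize)
	reqs := make([]PageRequest, 0, n)
	for p := 1; p <= n; p++ {
		reqs = append(reqs, PageRequest{PageNo: p, PageSize: pageSize})
	}
	return reqs
}

func main() {
	// 95 results at 40 per page need three pages; the last one is partial.
	for _, r := range pageRequests(95, 40) {
		fmt.Printf("page %d (size %d)\n", r.PageNo, r.PageSize)
	}
}
```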

SKU Synchronization and Variant Management

  • ItemSellerGet provides product details and SKUs. The handler parses SKUs whether single or array, maps property images to SKU URLs, cleans property names, and extracts outer IDs.
  • ItemSkusGet is used by DataUsecase to retrieve SKU-level details when product-level SKUs are insufficient (e.g., missing images).
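
One common way to handle a field that arrives as either a single object or an array is a custom `UnmarshalJSON` on a slice type. The sketch below illustrates that technique only; the `Sku` fields, the `SkuList` type, and `parseSkus` are hypothetical and are not taken from DataUsecase.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Sku is a hypothetical, trimmed-down SKU record; field names are illustrative.
type Sku struct {
	SkuID    int64  `json:"sku_id"`
	Price    string `json:"price"`
	Quantity int64  `json:"quantity"`
}

// SkuList decodes from either a single JSON object or a JSON array.
type SkuList []Sku

// UnmarshalJSON normalizes both shapes (and null) into a slice.
func (l *SkuList) UnmarshalJSON(data []byte) error {
	trimmed := bytes.TrimSpace(data)
	if len(trimmed) == 0 || bytes.Equal(trimmed, []byte("null")) {
		*l = nil
		return nil
	}
	if trimmed[0] == '[' {
		// Decode into the plain slice type to avoid recursing into this method.
		var many []Sku
		if err := json.Unmarshal(trimmed, &many); err != nil {
			return err
		}
		*l = many
		return nil
	}
	var one Sku
	if err := json.Unmarshal(trimmed, &one); err != nil {
		return err
	}
	*l = SkuList{one}
	return nil
}

// parseSkus decodes raw JSON into a normalized SKU list.
func parseSkus(data []byte) (SkuList, error) {
	var l SkuList
	if err := json.Unmarshal(data, &l); err != nil {
		return nil, err
	}
	return l, nil
}

func main() {
	inputs := []string{
		`{"sku_id": 1, "price": "9.90", "quantity": 3}`,
		`[{"sku_id": 1, "price": "9.90"}, {"sku_id": 2, "price": "12.00"}]`,
	}
	for _, in := range inputs {
		skus, err := parseSkus([]byte(in))
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Println(len(skus), "SKU(s) decoded")
	}
}
```

Callers then always iterate over a slice, regardless of which shape the response used.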

Product Attribute Mapping (ItemProps and ItemPropValues)

  • Standard properties and values are represented via ItemPropValues API structures. Attributes include property keys, values, aliases, and status.
  • These mappings enable consistent attribute representation across products and SKUs.

Product Hierarchy, Category Mapping, and Brand Information

  • Category display and category tree APIs provide hierarchical category information and standardized category metadata.
  • Product messages include category IDs and names to align with platform category structures.
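
To illustrate how category IDs on a product message can be resolved against a hierarchy, the sketch below walks parent links to build a root-to-leaf name path. The `Category` shape, the `pathTo` helper, and the convention that a parent ID of 0 marks a root are assumptions for illustration; the real category tree API may represent hierarchy differently.

```go
package main

import "fmt"

// Category is a hypothetical flat category record.
type Category struct {
	ID       int64
	ParentID int64 // 0 is assumed to mark a root category
	Name     string
}

// pathTo returns category names from the root down to id.
// It returns nil if id is unknown, and a partial path if a parent is missing.
// The seen set guards against cycles in malformed data.
func pathTo(cats map[int64]Category, id int64) []string {
	var path []string
	seen := map[int64]bool{}
	for {
		c, ok := cats[id]
		if !ok || seen[id] {
			break
		}
		seen[id] = true
		path = append([]string{c.Name}, path...)
		if c.ParentID == 0 {
			break
		}
		id = c.ParentID
	}
	return path
}

func main() {
	cats := map[int64]Category{
		1: {ID: 1, ParentID: 0, Name: "Apparel"},
		2: {ID: 2, ParentID: 1, Name: "Shirts"},
		3: {ID: 3, ParentID: 2, Name: "T-Shirts"},
	}
	fmt.Println(pathTo(cats, 3))
}
```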

Product Lifecycle Management and Inventory Integration

  • Lifecycle tracking: Product status (e.g., approve status) and timestamps (first starts time, modified) are captured during synchronization.
  • Inventory integration: SKU inventory quantities are parsed from ItemSellerGet responses and included in Kafka messages for downstream systems.

Data Validation Rules

  • Product and SKU validation rules are defined in protobuf validation files. Examples include minimum length constraints for price strings and explicit inventory validations.
  • Validation ensures downstream consumers receive consistent and constrained data.
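
The kind of constraint described above can be sketched as a hand-written check. This is not the generated validation code: the struct fields, the minimum price length of 1, and the non-negative quantity rule are assumptions chosen to mirror the rules named in this section.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// SkuWithInventory here is a simplified, hypothetical version of the
// protobuf message; the real message has its own field set.
type SkuWithInventory struct {
	SkuID    string
	Price    string // prices travel as strings
	Quantity int64
}

// validateSku returns the first rule violation found, or nil if the SKU passes.
func validateSku(s SkuWithInventory) error {
	// Assumed rule: price must contain at least one character.
	if utf8.RuneCountInString(s.Price) < 1 {
		return fmt.Errorf("sku %q: price must be at least 1 character long", s.SkuID)
	}
	// Assumed rule: inventory cannot be negative.
	if s.Quantity < 0 {
		return fmt.Errorf("sku %q: quantity must be >= 0, got %d", s.SkuID, s.Quantity)
	}
	return nil
}

func main() {
	skus := []SkuWithInventory{
		{SkuID: "1", Price: "9.90", Quantity: 3},
		{SkuID: "2", Price: "", Quantity: 3},
		{SkuID: "3", Price: "1.00", Quantity: -1},
	}
	for _, s := range skus {
		if err := validateSku(s); err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Println("ok:", s.SkuID)
	}
}
```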

Duplicate Detection and Price Synchronization Strategies

  • Duplicate detection: The system relies on platform identifiers (product ID, SKU ID) and normalized property names to avoid duplication. Property name cleaning ensures consistent attribute-value pairs.
  • Price synchronization: Prices are extracted from ItemSellerGet responses and included in Kafka messages. Validation enforces string-based price constraints.
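
Deduplication on platform identifiers can be sketched with a composite map key. In this illustration the `SkuRecord` type and the `dedupe` helper are hypothetical, and keeping the last record seen for a key (so the most recently fetched price wins) is a design assumption rather than documented behavior.

```go
package main

import "fmt"

// SkuRecord is a hypothetical normalized SKU row.
type SkuRecord struct {
	ProductID int64
	SkuID     int64
	Price     string
}

// skuKey identifies a SKU by its platform identifiers.
type skuKey struct {
	ProductID int64
	SkuID     int64
}

// dedupe keeps one record per (ProductID, SkuID). Later records overwrite
// earlier ones, while the output keeps first-seen order.
func dedupe(records []SkuRecord) []SkuRecord {
	index := make(map[skuKey]int, len(records))
	out := make([]SkuRecord, 0, len(records))
	for _, r := range records {
		k := skuKey{ProductID: r.ProductID, SkuID: r.SkuID}
		if i, ok := index[k]; ok {
			out[i] = r // same SKU seen again: keep the newer values
			continue
		}
		index[k] = len(out)
		out = append(out, r)
	}
	return out
}

func main() {
	records := []SkuRecord{
		{ProductID: 10, SkuID: 1, Price: "9.90"},
		{ProductID: 10, SkuID: 2, Price: "12.00"},
		{ProductID: 10, SkuID: 1, Price: "8.90"},
	}
	for _, r := range dedupe(records) {
		fmt.Printf("product %d sku %d price %s\n", r.ProductID, r.SkuID, r.Price)
	}
}
```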

Product Synchronization Workflows

  • New product addition: ItemsOnsale pagination identifies new items; ItemSellerGet retrieves details and SKUs; normalized messages are published.
  • Product updates: Modified timestamps and approve status reflect lifecycle changes; downstream systems can reconcile updates based on these signals.
  • Deletions: Approve status and lifecycle fields help identify removed or delisted items; consumers can mark SKUs as inactive accordingly.
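
A downstream consumer could combine these signals into a single decision per incoming product. The sketch below is one possible reconciliation rule, not the consumers' actual logic: the `Snapshot` type, the `decide` function, the Unix-seconds timestamp, and the `"onsale"` status value are all assumptions.

```go
package main

import "fmt"

// Snapshot is a hypothetical view of the lifecycle fields for one product.
type Snapshot struct {
	ModifiedUnix  int64  // last-modified time as Unix seconds (assumed representation)
	ApproveStatus string // listing status; "onsale" is an assumed value
}

// Action is the reconciliation outcome for one incoming product.
type Action string

const (
	Create     Action = "create"
	Update     Action = "update"
	Deactivate Action = "deactivate"
	Skip       Action = "skip"
)

// decide compares the stored snapshot (nil if the product is unknown)
// with the incoming one and picks an action.
func decide(stored *Snapshot, incoming Snapshot) Action {
	switch {
	case stored == nil:
		return Create
	case incoming.ModifiedUnix <= stored.ModifiedUnix:
		return Skip // nothing newer than what is already stored
	case incoming.ApproveStatus != "onsale":
		return Deactivate
	default:
		return Update
	}
}

func main() {
	stored := &Snapshot{ModifiedUnix: 100, ApproveStatus: "onsale"}
	fmt.Println(decide(nil, Snapshot{ModifiedUnix: 100, ApproveStatus: "onsale"}))
	fmt.Println(decide(stored, Snapshot{ModifiedUnix: 100, ApproveStatus: "onsale"}))
	fmt.Println(decide(stored, Snapshot{ModifiedUnix: 200, ApproveStatus: "onsale"}))
	fmt.Println(decide(stored, Snapshot{ModifiedUnix: 300, ApproveStatus: "instock"}))
}
```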

Dependency Analysis

The synchronization system exhibits clear separation of concerns:

  • GoodsUsecase depends on shop authorization services and Kafka producer factory to emit ItemsOnsale fetch messages.
  • ConsumerHandlers depends on Taobao/Leke clients, request logging, and KafkaServiceClient to publish product/SKU events.
  • DataUsecase depends on Taobao SDK helpers to parse ItemSkusGet and ItemSellerGet responses.

Performance Considerations

  • Concurrency control: Parallel detail fetching limits are enforced to prevent API throttling and resource exhaustion.
  • Efficient parsing: Flexible JSON parsing accepts either a single object or an array for items and SKUs and normalizes both into a list, so callers do not need separate code paths for each response shape.
  • Batch publishing: Kafka batching consolidates messages to minimize network overhead.
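
The concurrency limit mentioned above is commonly implemented with a buffered channel used as a semaphore. This sketch shows that pattern under stated assumptions: `fetchAll` and its `fetch` callback are hypothetical stand-ins for the ItemSellerGet calls, and the limit value is illustrative.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll calls fetch for every item ID with at most limit calls in flight.
// Results and errors are returned in the same order as ids.
func fetchAll(ids []int64, limit int, fetch func(id int64) (string, error)) ([]string, []error) {
	if limit < 1 {
		limit = 1
	}
	results := make([]string, len(ids))
	errs := make([]error, len(ids))
	sem := make(chan struct{}, limit) // one slot per allowed concurrent call
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit calls are already running
		go func(i int, id int64) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			// Each goroutine writes only its own index, so no lock is needed.
			results[i], errs[i] = fetch(id)
		}(i, id)
	}
	wg.Wait()
	return results, errs
}

func main() {
	ids := []int64{101, 102, 103, 104}
	results, errs := fetchAll(ids, 2, func(id int64) (string, error) {
		return fmt.Sprintf("detail for item %d", id), nil
	})
	for i := range ids {
		fmt.Println(results[i], errs[i])
	}
}
```

Collecting per-item errors instead of aborting on the first failure lets one bad item be retried or logged without discarding the rest of the page.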

Troubleshooting Guide

Common issues and resolutions:

  • DataKey validation failures: Verify seller nicknames and DataKey correctness before invoking ItemsOnsale.
  • Empty or zero total results: Confirm that the shop authorization and DataKey are valid, then re-run the ItemsOnsale query and check that it reports a non-zero total.
  • Parsing errors for items/SKUs: Validate JSON structures and handle both single-object and array cases.
  • Request logging: Use saved request logs to diagnose API failures and measure processing times.

Conclusion

The product catalog synchronization system integrates Taobao/Leke APIs with robust pagination, SKU variant handling, attribute mapping, category alignment, lifecycle tracking, and inventory ingestion. Protobuf validation and structured workflows ensure reliable data delivery to downstream consumers, while concurrency controls and batch publishing optimize performance.