System Architecture

DataMate system architecture design documentation

This document details DataMate’s system architecture, tech stack, and design philosophy.

Overall Architecture

DataMate adopts a microservices architecture: the system is split into independent services, each responsible for a specific business function. Because services are deployed and scaled independently, a failure or load spike in one service is less likely to affect the others, and each service can be maintained on its own.

┌─────────────────────────────────────────────────────────────────┐
│                           Frontend Layer                        │
│                    (React + TypeScript)                         │
│                      Ant Design + Tailwind                      │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│                        API Gateway Layer                        │
│                    (Spring Cloud Gateway)                       │
│                      Port: 8080                                 │
└────────────────────────┬────────────────────────────────────────┘
                         │
         ┌───────────────┼───────────────┐
         ▼               ▼               ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│  Java Backend│ │ Python Backend│ │  Runtime     │
│   Services   │ │    Service    │ │   Service    │
├──────────────┤ ├──────────────┤ ├──────────────┤
│· Main App    │ │· RAG Service  │ │· Operator    │
│· Data Mgmt   │ │· LangChain    │ │  Execution   │
│· Collection  │ │· FastAPI      │ │              │
│· Cleaning    │ │              │ │              │
│· Annotation  │ │              │ │              │
│· Synthesis   │ │              │ │              │
│· Evaluation  │ │              │ │              │
│· Operator    │ │              │ │              │
│· Pipeline    │ │              │ │              │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
       │                │                │
       └────────────────┼────────────────┘
                        ▼
         ┌──────────────┴──────────────┐
         │                              │
    ┌────▼────┐    ┌─────────┐   ┌─────▼────┐
    │PostgreSQL│    │  Redis  │   │  Milvus  │
    │  (5432)  │    │ (6379)  │   │ (19530)  │
    └──────────┘    └─────────┘   └──────────┘
                                              │
                                        ┌─────▼─────┐
                                        │   MinIO   │
                                        │  (9000)   │
                                        └───────────┘

Tech Stack

Frontend Tech Stack

| Technology | Version | Purpose |
| --- | --- | --- |
| React | 18.x | UI framework |
| TypeScript | 5.x | Type safety |
| Ant Design | 5.x | UI component library |
| Tailwind CSS | 3.x | Styling framework |
| Redux Toolkit | 2.x | State management |
| React Router | 6.x | Routing management |
| Vite | 5.x | Build tool |

Backend Tech Stack (Java)

| Technology | Version | Purpose |
| --- | --- | --- |
| Java | 21 | Runtime environment |
| Spring Boot | 3.5.6 | Application framework |
| Spring Cloud | 2023.x | Microservices framework |
| MyBatis Plus | 3.x | ORM framework |
| PostgreSQL Driver | 42.x | Database driver |
| Redis | 5.x | Cache client |
| MinIO | 8.x | Object storage client |

Backend Tech Stack (Python)

| Technology | Version | Purpose |
| --- | --- | --- |
| Python | 3.11+ | Runtime environment |
| FastAPI | 0.100+ | Web framework |
| LangChain | 0.1+ | LLM application framework |
| Ray | 2.x | Distributed computing |
| Pydantic | 2.x | Data validation |

Data Storage

| Technology | Version | Purpose |
| --- | --- | --- |
| PostgreSQL | 15+ | Main database |
| Redis | 8.x | Cache and message queue |
| Milvus | 2.6.5 | Vector database |
| MinIO | RELEASE.2024+ | Object storage |

Microservices Architecture

Service List

| Service Name | Port | Tech Stack | Description |
| --- | --- | --- | --- |
| API Gateway | 8080 | Spring Cloud Gateway | Unified entry, routing, auth |
| Frontend | 30000 | React | Frontend UI |
| Main Application | - | Spring Boot | Core business logic |
| Data Management Service | 8092 | Spring Boot | Dataset management |
| Data Collection Service | - | Spring Boot | Data collection tasks |
| Data Cleaning Service | - | Spring Boot | Data cleaning tasks |
| Data Annotation Service | - | Spring Boot | Data annotation tasks |
| Data Synthesis Service | - | Spring Boot | Data synthesis tasks |
| Data Evaluation Service | - | Spring Boot | Data evaluation tasks |
| Operator Market Service | - | Spring Boot | Operator marketplace |
| RAG Indexer Service | - | Spring Boot | Knowledge base indexing |
| Runtime Service | 8081 | Python + Ray | Operator execution engine |
| Backend Python Service | 18000 | FastAPI | Python backend service |
| Database | 5432 | PostgreSQL | Database |

Service Communication

Synchronous Communication

  • API Gateway → Backend Services: HTTP/REST
  • Frontend → API Gateway: HTTP/REST
  • Backend Services ↔ Backend Services: HTTP/REST (Feign Client)
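
The gateway hop in the list above can be pictured as prefix-based routing. The following is only a minimal sketch of that idea, not DataMate's actual gateway configuration: the path prefixes and hostnames are hypothetical, and only the ports are taken from the Service List. Stripping the matched prefix before forwarding is likewise an assumption of this sketch.

```python
from typing import Optional

# Hypothetical route table: path prefixes and hostnames are illustrative
# assumptions; only the ports come from the Service List.
ROUTES = {
    "/api/data-management": "http://data-management:8092",
    "/api/runtime": "http://runtime:8081",
    "/api/python": "http://backend-python:18000",
}


def resolve_upstream(path: str) -> Optional[str]:
    """Return the upstream URL for a request path, or None if no route matches.

    A prefix matches only on a whole path segment, and the longest matching
    prefix wins so that more specific routes take priority.
    """
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    if not matches:
        return None
    prefix = max(matches, key=len)
    # Assumption of this sketch: the matched prefix is stripped before forwarding.
    return ROUTES[prefix] + path[len(prefix):]
```

With this table, a request for `/api/runtime/jobs/1` would be forwarded to the Runtime Service on port 8081, while an unknown path yields `None` (which a gateway would typically answer with a 404).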

Asynchronous Communication

  • Task Execution: Database task queue
  • Event Notification: Redis Pub/Sub
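
To make the "database task queue" idea concrete, here is a minimal sketch of workers claiming tasks from a table. It uses Python's built-in `sqlite3` as a stand-in so the example is self-contained; the table name, column names, and status values are hypothetical and not taken from DataMate's schema. On PostgreSQL, a common way to make the claim step safe under concurrency is `SELECT ... FOR UPDATE SKIP LOCKED`.

```python
import sqlite3
from typing import Optional, Tuple


def init_queue(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per task, with a simple status column.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " kind TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'PENDING')"
    )
    conn.commit()


def enqueue(conn: sqlite3.Connection, kind: str) -> int:
    """Insert a new PENDING task and return its id."""
    cur = conn.execute("INSERT INTO tasks (kind) VALUES (?)", (kind,))
    conn.commit()
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection) -> Optional[Tuple[int, str]]:
    """Move the oldest PENDING task to RUNNING and return (id, kind)."""
    row = conn.execute(
        "SELECT id, kind FROM tasks WHERE status = 'PENDING' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # The status check in the WHERE clause guards against a second worker
    # having claimed the same row between the SELECT and the UPDATE.
    cur = conn.execute(
        "UPDATE tasks SET status = 'RUNNING' WHERE id = ? AND status = 'PENDING'",
        (row[0],),
    )
    conn.commit()
    return row if cur.rowcount == 1 else None


def complete(conn: sqlite3.Connection, task_id: int) -> None:
    """Mark a task as DONE."""
    conn.execute("UPDATE tasks SET status = 'DONE' WHERE id = ?", (task_id,))
    conn.commit()
```

A worker loop would call `claim_next` repeatedly, run the task, and then call `complete`. Keeping the queue in the main database avoids running a separate message broker, at the cost of polling.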

Data Architecture

Data Flow

┌─────────────┐
│  Data       │ Collection task config
│  Collection │ → DataX → Raw data
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Data       │ Dataset management, file upload
│  Management │ → Structured storage
└──────┬──────┘
       │
       ├──────────────┐
       ▼              ▼
┌─────────────┐  ┌─────────────┐
│  Data       │  │ Knowledge   │
│  Cleaning   │  │ Base        │
│             │  │             │
└──────┬──────┘  └──────┬──────┘
       │                │
       ▼                ▼
┌─────────────┐  ┌─────────────┐
│  Data       │  │ Vector      │
│  Annotation │  │ Index       │
└──────┬──────┘  └──────┬──────┘
       │                │
       ▼                │
┌─────────────┐          │
│  Data       │          │
│  Synthesis  │          │
└──────┬──────┘          │
       │                │
       ▼                ▼
┌─────────────┐  ┌─────────────┐
│  Data       │  │  RAG        │
│  Evaluation │  │ Retrieval   │
└─────────────┘  └─────────────┘
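
The flow in the diagram is a small dependency graph with two branches after Data Management. As an illustration only, the sketch below encodes the arrows as a dictionary and derives a valid processing order from it; the stage identifiers are shorthand invented for this example, not names used by DataMate.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each stage maps to the stages it depends on, following the arrows in the
# diagram. The identifiers are shorthand chosen for this sketch.
DEPENDS_ON: Dict[str, Set[str]] = {
    "collection": set(),
    "management": {"collection"},
    "cleaning": {"management"},
    "annotation": {"cleaning"},
    "synthesis": {"annotation"},
    "evaluation": {"synthesis"},
    "knowledge_base": {"management"},
    "vector_index": {"knowledge_base"},
    "rag_retrieval": {"vector_index"},
}


def execution_order() -> List[str]:
    """Return one ordering in which every stage follows its dependencies."""
    return list(TopologicalSorter(DEPENDS_ON).static_order())


def upstream_of(stage: str) -> Set[str]:
    """Return every stage that must have run before `stage` can start."""
    seen: Set[str] = set()
    stack = list(DEPENDS_ON[stage])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(DEPENDS_ON[current])
    return seen
```

For example, `upstream_of("rag_retrieval")` contains the knowledge-base branch plus collection and management, but none of the cleaning, annotation, or synthesis stages, which reflects that the two branches are independent once data is under management.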

Deployment Architecture

Docker Compose Deployment

┌────────────────────────────────────────────────┐
│              Docker Network                    │
│            datamate-network                    │
│                                                │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │Frontend  │  │ Gateway  │  │ Backend  │   │
│  │ :30000   │  │  :8080   │  │          │   │
│  └──────────┘  └──────────┘  └──────────┘   │
│                                                │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │Backend   │  │ Runtime  │  │Database  │   │
│  │  Python  │  │  :8081   │  │  :5432   │   │
│  └──────────┘  └──────────┘  └──────────┘   │
│                                                │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │  Milvus  │  │  MinIO   │  │  etcd    │   │
│  │  :19530  │  │  :9000   │  │          │   │
│  └──────────┘  └──────────┘  └──────────┘   │
└────────────────────────────────────────────────┘

Kubernetes Deployment

┌────────────────────────────────────────────────┐
│           Kubernetes Cluster                   │
│                                                │
│  Namespace: datamate                           │
│                                                │
│  ┌────────────┐  ┌────────────┐              │
│  │ Deployment │  │ Deployment │              │
│  │  Frontend  │  │  Gateway   │              │
│  │   (3 Pods) │  │  (2 Pods)  │              │
│  └─────┬──────┘  └─────┬──────┘              │
│        │                │                     │
│  ┌─────▼────────────────▼──────┐              │
│  │       Service (LoadBalancer) │              │
│  └──────────────────────────────┘              │
│                                                │
│  ┌────────────┐  ┌────────────┐              │
│  │ StatefulSet│  │ Deployment │              │
│  │  Database  │  │  Backend   │              │
│  └────────────┘  └────────────┘              │
└────────────────────────────────────────────────┘

Security Architecture

Authentication & Authorization

JWT Authentication (Optional)

datamate:
  jwt:
    enable: true  # Enable JWT authentication
    secret: your-secret-key
    expiration: 86400  # 24 hours
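
To show how the `secret` and `expiration` settings above interact, here is a minimal sketch of issuing and verifying a token using only the Python standard library. It assumes HS256 signing, which the configuration does not state, and the function names are hypothetical. A production service would normally rely on a maintained JWT library rather than hand-rolled code like this.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(subject: str, secret: str, expiration: int = 86400,
                now: Optional[int] = None) -> str:
    """Build an HS256-signed token that expires `expiration` seconds after issue."""
    issued_at = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": subject, "iat": issued_at, "exp": issued_at + expiration}
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_sign(header + '.' + payload, secret)}"


def verify_token(token: str, secret: str, now: Optional[int] = None) -> Optional[dict]:
    """Return the claims if the signature is valid and the token is unexpired."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    header, payload, signature = parts
    expected = _sign(header + "." + payload, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return None
    claims = json.loads(_b64url_decode(payload))
    current = int(time.time()) if now is None else now
    return claims if current < claims["exp"] else None
```

With `expiration: 86400`, a token issued at time `t` is accepted until just before `t + 86400` and rejected afterwards. The `secret` value in the configuration is a placeholder and should be replaced with a long random value in any real deployment.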

API Key Authentication

datamate:
  api-key:
    enable: false
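
As an illustration of what an API key check might look like when `enable` is switched on, the sketch below stores only SHA-256 hashes of issued keys and compares them in constant time. The function names and the choice to store hashes are assumptions made for this example; the document does not describe how DataMate stores or validates keys.

```python
import hashlib
import hmac
import secrets
from typing import Iterable, Optional


def generate_api_key() -> str:
    """Create a random, URL-safe API key."""
    return secrets.token_urlsafe(32)


def hash_api_key(key: str) -> str:
    """Hash a key so that only the digest needs to be stored."""
    return hashlib.sha256(key.encode()).hexdigest()


def is_authorized(presented_key: Optional[str],
                  stored_hashes: Iterable[str],
                  enabled: bool) -> bool:
    """Check a presented key against the stored hashes.

    When `enabled` is False (mirroring `enable: false`), the check is skipped
    and every request passes this particular gate.
    """
    if not enabled:
        return True
    if not presented_key:
        return False
    presented_hash = hash_api_key(presented_key)
    # compare_digest keeps each comparison constant-time.
    return any(hmac.compare_digest(presented_hash, h) for h in stored_hashes)
```

Storing digests instead of raw keys means a leaked key table does not directly reveal usable credentials.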

Data Security

Transport Encryption

  • API Gateway supports HTTPS/TLS
  • Internal service communication can be encrypted

Storage Encryption

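  • Database: Transparent data encryption (TDE). Community PostgreSQL does not include TDE, so this requires an extension, a vendor distribution, or volume-level encryption.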
  • MinIO: Server-side encryption
  • Milvus: Encryption at rest


Last modified February 9, 2026: add english docs (3868c82)