Backend Architecture

DataMate Java backend architecture design

The DataMate backend adopts a microservices architecture built on Spring Boot 3.x and Spring Cloud.

Architecture Overview

The backend is split into multiple independent services that sit behind a single API gateway and share one PostgreSQL database:

┌─────────────────────────────────────────────┐
│              API Gateway                    │
│         (Spring Cloud Gateway)              │
│              Port: 8080                     │
└──────────────┬──────────────────────────────┘
               │
       ┌───────┴───────┬───────────────┐
       ▼               ▼               ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│   Main       │ │  Data        │ │  Data        │
│ Application  │ │  Management  │ │  Collection  │
└──────────────┘ └──────────────┘ └──────────────┘
       │               │               │
       └───────────────┴───────────────┘
                       │
                       ▼
              ┌────────────────┐
              │   PostgreSQL   │
              │   Port: 5432   │
              └────────────────┘

Tech Stack

Core Frameworks

Technology     Version   Purpose
Java           21        Programming language
Spring Boot    3.5.6     Application framework
Spring Cloud   2023.x    Microservices framework
MyBatis Plus   3.5.x     ORM framework

Support Components

Technology   Version   Purpose
Redis        5.x       Cache and message queue
MinIO        8.x       Object storage
Milvus SDK   2.3.x     Vector database

Microservices List

API Gateway

Port: 8080

Functions:

  • Unified entry point
  • Route forwarding
  • Authentication and authorization
  • Rate limiting and circuit breaking

Tech: Spring Cloud Gateway, JWT authentication
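
Route forwarding, as listed above, comes down to matching the request path against a set of configured prefixes and passing the request on to the matching service. The following is a minimal plain-Java sketch of that idea, not Spring Cloud Gateway configuration. The class name `RouteTable`, the host `localhost`, and the assumption that the path is forwarded unchanged are all hypothetical; only the `/data-management` prefix and port 8092 come from this document.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of prefix-based route forwarding.
// This is NOT Spring Cloud Gateway code; host names are assumed.
class RouteTable {
    // Insertion-ordered so that earlier routes win when prefixes overlap.
    private final Map<String, String> routes = new LinkedHashMap<>();

    // Register a path prefix and the base URL of the service behind it.
    RouteTable add(String pathPrefix, String targetBaseUrl) {
        routes.put(pathPrefix, targetBaseUrl);
        return this;
    }

    // Return the upstream URL for a request path, or empty if no prefix matches.
    Optional<String> resolve(String requestPath) {
        for (Map.Entry<String, String> route : routes.entrySet()) {
            String prefix = route.getKey();
            // Match whole path segments only: "/data-managementX" must not match.
            if (requestPath.equals(prefix) || requestPath.startsWith(prefix + "/")) {
                return Optional.of(route.getValue() + requestPath);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        RouteTable table = new RouteTable()
                .add("/data-management", "http://localhost:8092");
        // prints http://localhost:8092/data-management/datasets/42
        System.out.println(table.resolve("/data-management/datasets/42").orElse("no route"));
        // prints no route
        System.out.println(table.resolve("/unknown").orElse("no route"));
    }
}
```

In a real gateway the same mapping would normally be declared as route definitions rather than written by hand.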

Main Application

Functions:

  • User management
  • Permission management
  • System configuration
  • Task scheduling

Data Management Service

Port: 8092

Functions:

  • Dataset management
  • File management
  • Tag management
  • Statistics

API Endpoints:

  • /data-management/datasets - Dataset management
  • /data-management/datasets/{id}/files - File management
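
As a rough illustration of how a client might address these two endpoints, the sketch below builds (but does not send) requests with the JDK's `java.net.http` API. Several details are assumptions rather than documented behaviour: that both endpoints answer `GET`, that they return JSON, that the gateway on port 8080 forwards the path unchanged, and the sample dataset id `ds-001`. The class name `DatasetRequests` is hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds requests for the endpoints listed above without sending them.
// HTTP method, Accept header, base URL, and dataset id are assumed values.
class DatasetRequests {

    // Request for the dataset collection endpoint.
    static HttpRequest listDatasets(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/data-management/datasets"))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Request for the files of one dataset; the id is inserted as-is,
    // so it must not contain characters that need URL encoding.
    static HttpRequest listFiles(String baseUrl, String datasetId) {
        return HttpRequest.newBuilder(
                        URI.create(baseUrl + "/data-management/datasets/" + datasetId + "/files"))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        String gateway = "http://localhost:8080"; // assumed host; 8080 is the gateway port
        System.out.println(listDatasets(gateway).uri());
        System.out.println(listFiles(gateway, "ds-001").uri());
    }
}
```

Sending either request would take one extra call such as `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, plus whatever JWT header the gateway expects.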

Runtime Service

Port: 8081

Functions:

  • Operator execution
  • Ray integration
  • Task scheduling

Tech: Python + Ray, FastAPI

Database Design

Main Tables

users (User Table)

Field        Type           Description
id           BIGINT         Primary key
username     VARCHAR(50)    Username
password     VARCHAR(255)   Password (encrypted)
email        VARCHAR(100)   Email
role         VARCHAR(20)    Role
created_at   TIMESTAMP      Creation time

datasets (Dataset Table)

Field         Type           Description
id            VARCHAR(50)    Primary key
name          VARCHAR(100)   Name
description   TEXT           Description
type          VARCHAR(20)    Type
status        VARCHAR(20)    Status
created_by    VARCHAR(50)    Creator
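
To make the column definitions concrete, the datasets table can be mirrored by a small Java record. This is only a sketch: the record name `DatasetRow`, the camel-case field names, and the length checks are illustrative and are not taken from the DataMate code base, which uses MyBatis Plus entities.

```java
// Hypothetical in-memory mirror of the datasets table.
// The length checks approximate the VARCHAR limits from the table definition.
record DatasetRow(String id, String name, String description,
                  String type, String status, String createdBy) {

    // Compact constructor: reject values that would not fit their column.
    DatasetRow {
        requireMaxLength("id", id, 50);
        requireMaxLength("name", name, 100);
        requireMaxLength("type", type, 20);
        requireMaxLength("status", status, 20);
        requireMaxLength("created_by", createdBy, 50);
        // description is TEXT, so no length limit is applied.
    }

    private static void requireMaxLength(String column, String value, int max) {
        if (value != null && value.length() > max) {
            throw new IllegalArgumentException(column + " exceeds " + max + " characters");
        }
    }
}
```

For example, `new DatasetRow("ds-001", "demo", null, "TEXT", "ACTIVE", "admin")` succeeds, while a 101-character name throws `IllegalArgumentException`. The values `"TEXT"` and `"ACTIVE"` are made up; the document does not list the allowed types or statuses.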

Service Communication

Synchronous Communication

Services communicate via HTTP/REST:

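import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;

// Using Feign Client: a declarative HTTP client for the Data Management Service
@FeignClient(name = "data-management-service")
public interface DataManagementClient {
    // Name the path variable explicitly so binding does not rely on the -parameters compiler flag
    @GetMapping("/data-management/datasets/{id}")
    DatasetResponse getDataset(@PathVariable("id") String id);
}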

Asynchronous Communication

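Services exchange asynchronous events over Redis pub/sub channels:

// Send message: publish to the "task.created" channel
redisTemplate.convertAndSend("task.created", taskMessage);

// Receive message
// Note: @RedisListener is not part of Spring Data Redis; it is assumed here to be a
// project-defined annotation. With stock Spring Data Redis, register a MessageListener
// for ChannelTopic("task.created") on a RedisMessageListenerContainer instead.
@RedisListener(topic = "task.created")
public void handleTaskCreated(TaskMessage message) {
    // Handle task creation event
}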
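
The publish/subscribe flow above can be sketched without Redis to show its delivery semantics: a message reaches only the handlers subscribed at the moment of publishing, and nothing is stored for later subscribers. The in-process class below is a hypothetical stand-in, not a Redis client; `InProcessTopicBus` and its methods are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// In-process stand-in for a pub/sub channel registry (NOT a Redis client).
class InProcessTopicBus {
    private final Map<String, List<Consumer<String>>> subscribers = new ConcurrentHashMap<>();

    // Register a handler for a topic.
    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>()).add(handler);
    }

    // Deliver the message to current subscribers and return how many received it.
    // Messages published with no subscribers are dropped, as in fire-and-forget pub/sub.
    int publish(String topic, String message) {
        List<Consumer<String>> handlers = subscribers.getOrDefault(topic, List.of());
        handlers.forEach(handler -> handler.accept(message));
        return handlers.size();
    }

    public static void main(String[] args) {
        InProcessTopicBus bus = new InProcessTopicBus();
        List<String> received = new ArrayList<>();

        System.out.println(bus.publish("task.created", "task-1")); // prints 0 (dropped)
        bus.subscribe("task.created", received::add);
        System.out.println(bus.publish("task.created", "task-2")); // prints 1
        System.out.println(received);                              // prints [task-2]
    }
}
```

Because delivery is not durable in this model, events that must not be lost need a persistent queue or a retry mechanism on top.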

Authentication & Authorization

JWT Authentication

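import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Configuration;

@Configuration
public class JwtConfig {
    // Signing secret, read from the datamate.jwt.secret property
    @Value("${datamate.jwt.secret}")
    private String secret;

    // Token lifetime, read from the datamate.jwt.expiration property
    @Value("${datamate.jwt.expiration}")
    private Long expiration;
}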
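
To show how a secret and an expiration value like the two configured above typically feed into token handling, here is a minimal JDK-only sketch of issuing and checking an HMAC-SHA256 (HS256) signed token. It rests on assumptions: the document does not say which signing algorithm DataMate uses or in which unit the expiration is expressed (milliseconds are assumed here), and the class `JwtSketch` is hypothetical. A production service would normally rely on a maintained JWT library instead.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative HS256 token sketch using only the JDK.
// Assumes an HMAC secret and an expiration given in milliseconds.
class JwtSketch {
    private static final Base64.Encoder B64 = Base64.getUrlEncoder().withoutPadding();

    // Build header.payload.signature; the subject must not need JSON escaping.
    static String issue(String subject, String secret, long expirationMillis, long nowMillis) {
        String header = encode("{\"alg\":\"HS256\",\"typ\":\"JWT\"}");
        long expSeconds = (nowMillis + expirationMillis) / 1000; // "exp" is in seconds
        String payload = encode("{\"sub\":\"" + subject + "\",\"exp\":" + expSeconds + "}");
        String signingInput = header + "." + payload;
        return signingInput + "." + sign(signingInput, secret);
    }

    // Verifies the signature only; checking "exp" needs JSON parsing and is left out.
    static boolean hasValidSignature(String token, String secret) {
        int lastDot = token.lastIndexOf('.');
        if (lastDot < 0) {
            return false;
        }
        String expected = sign(token.substring(0, lastDot), secret);
        String actual = token.substring(lastDot + 1);
        // MessageDigest.isEqual compares in constant time.
        return MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.US_ASCII),
                actual.getBytes(StandardCharsets.US_ASCII));
    }

    private static String encode(String json) {
        return B64.encodeToString(json.getBytes(StandardCharsets.UTF_8));
    }

    private static String sign(String input, String secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            return B64.encodeToString(mac.doFinal(input.getBytes(StandardCharsets.UTF_8)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable", e);
        }
    }
}
```

A token issued with `JwtSketch.issue("alice", secret, expiration, System.currentTimeMillis())` passes `hasValidSignature` only when checked with the same secret, which is why the secret belongs in external configuration rather than in source code.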

RBAC

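// Enforced only when method security is enabled, e.g. @EnableMethodSecurity on a configuration class
@PreAuthorize("hasRole('ADMIN')")
public void adminOperation() {
    // Admin operations
}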
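
The `hasRole('ADMIN')` expression above does not look for an authority named `ADMIN`: with Spring Security's default settings it checks for the authority `ROLE_ADMIN`. The plain-Java sketch below reproduces just that comparison. The class and method names are hypothetical, and how DataMate maps the `role` column of the users table onto authorities is not stated in this document.

```java
import java.util.Set;

// Plain-Java illustration of the comparison behind hasRole('ADMIN').
// "ROLE_" is Spring Security's default role prefix.
class RoleCheck {
    static final String ROLE_PREFIX = "ROLE_";

    // True when the caller holds the prefixed authority for the given role.
    static boolean hasRole(Set<String> grantedAuthorities, String role) {
        return grantedAuthorities.contains(ROLE_PREFIX + role);
    }

    public static void main(String[] args) {
        System.out.println(hasRole(Set.of("ROLE_ADMIN", "ROLE_USER"), "ADMIN")); // prints true
        System.out.println(hasRole(Set.of("ROLE_USER"), "ADMIN"));               // prints false
        System.out.println(hasRole(Set.of("ADMIN"), "ADMIN"));                   // prints false
    }
}
```

The last case is a common pitfall: an authority stored without the prefix does not satisfy `hasRole`, although it would satisfy `hasAuthority('ADMIN')`.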

Performance Optimization

Database Connection Pool

spring:
  datasource:
    hikari:
      maximum-pool-size: 20
      minimum-idle: 5
      connection-timeout: 30000

Caching Strategy

// Requires @EnableCaching; results are stored in the "datasets" cache, keyed by the id argument
@Cacheable(value = "datasets", key = "#id")
public Dataset getDataset(String id) {
    return datasetRepository.findById(id);
}
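
The effect of the `@Cacheable` method above can be sketched by hand: look the key up in the cache and call the underlying loader only on a miss. The class below is a simplified in-memory illustration with hypothetical names; it has no expiry or size limit, unlike a cache backed by Redis with a configured time-to-live.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hand-rolled equivalent of read-through caching: load on miss, reuse on hit.
class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, invoking the loader only if the key is absent.
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    // Drops an entry so the next get() reloads it (the role @CacheEvict plays in Spring).
    void evict(K key) {
        entries.remove(key);
    }

    public static void main(String[] args) {
        AtomicInteger databaseHits = new AtomicInteger();
        ReadThroughCache<String, String> cache = new ReadThroughCache<>(id -> {
            databaseHits.incrementAndGet(); // stands in for the repository call
            return "dataset-" + id;
        });

        cache.get("42");
        cache.get("42");
        System.out.println(databaseHits.get()); // prints 1: second call was a cache hit

        cache.evict("42");
        cache.get("42");
        System.out.println(databaseHits.get()); // prints 2: reloaded after eviction
    }
}
```

Eviction matters as much as lookup: whenever a dataset is updated or deleted, its cache entry has to be removed or refreshed, otherwise readers keep seeing stale data.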

Last modified February 9, 2026: add English docs (3868c82)