API Reference

DataMate API documentation

DataMate provides a complete REST API that gives programmatic access to all core features.

API Overview

The DataMate API follows REST conventions and is organized into the following services:

  • Data Management API: Dataset and file management
  • Data Cleaning API: Data cleaning task management
  • Data Collection API: Data collection task management
  • Data Annotation API: Data annotation task management
  • Data Synthesis API: Data synthesis task management
  • Data Evaluation API: Data evaluation task management
  • Operator Market API: Operator management
  • RAG Indexer API: Knowledge base and vector retrieval
  • Pipeline Orchestration API: Pipeline orchestration management

Authentication

DataMate supports two authentication methods: JWT bearer tokens and API keys.

JWT Authentication

GET /api/v1/data-management/datasets
Authorization: Bearer <your-jwt-token>

Get JWT Token:

POST /api/v1/auth/login
Content-Type: application/json

{
  "username": "admin",
  "password": "password"
}

Response:

{
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "expiresIn": 86400
}
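
The login flow above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: `BASE_URL`, `build_login_request`, `bearer_headers`, and `login` are hypothetical names, and the host and port are assumptions to replace with your own deployment.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: replace with your DataMate host


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """Build the POST /api/v1/auth/login request shown above."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/v1/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def bearer_headers(login_response_body: bytes) -> dict:
    """Read the token from a login response and build the Authorization header."""
    payload = json.loads(login_response_body)
    return {"Authorization": "Bearer " + payload["token"]}


def login(username: str, password: str) -> dict:
    """Send the login request over the network and return headers for later calls."""
    with urllib.request.urlopen(build_login_request(username, password)) as resp:
        return bearer_headers(resp.read())
```

The returned dictionary can be passed as the headers of any later request, such as GET /api/v1/data-management/datasets.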

API Key Authentication

GET /api/v1/data-management/datasets
X-API-Key: <your-api-key>
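
Since a request carries either a bearer token or an API key, a small helper can pick the right header. The sketch below assumes the two methods are not combined in one request; `auth_headers` is a hypothetical name, not part of DataMate.

```python
from typing import Optional


def auth_headers(jwt_token: Optional[str] = None, api_key: Optional[str] = None) -> dict:
    """Return the header for exactly one of the two documented methods."""
    # Assumption: a request uses one method, so reject both or neither.
    if (jwt_token is None) == (api_key is None):
        raise ValueError("provide exactly one of jwt_token or api_key")
    if jwt_token is not None:
        return {"Authorization": "Bearer " + jwt_token}
    return {"X-API-Key": api_key}
```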

Common Response Format

Success Response

{
  "code": 200,
  "message": "success",
  "data": {
    // Response data
  }
}

Error Response

{
  "code": 400,
  "message": "Bad Request",
  "error": "Invalid parameter: datasetId",
  "timestamp": "2024-01-15T10:30:00Z",
  "path": "/api/v1/data-management/datasets"
}
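
A client will usually want to return `data` on success and raise on failure. The sketch below shows one way to do that, assuming any 2xx `code` without an `error` field means success. `DataMateError` and `unwrap` are hypothetical names, and bodies without a `code` field are treated as errors here.

```python
import json


class DataMateError(Exception):
    """Hypothetical exception carrying the fields of an error response."""

    def __init__(self, code, message, error=None, path=None):
        super().__init__(f"{code} {message}: {error}")
        self.code = code
        self.message = message
        self.error = error
        self.path = path


def unwrap(body: str):
    """Return `data` from a success envelope, or raise DataMateError."""
    payload = json.loads(body)
    code = payload.get("code")
    # Assumption: a 2xx code with no "error" field marks a success envelope.
    if code is not None and 200 <= code < 300 and "error" not in payload:
        return payload.get("data")
    raise DataMateError(
        code, payload.get("message"), payload.get("error"), payload.get("path")
    )
```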

Paged Response

{
  "content": [],
  "page": 0,
  "size": 20,
  "totalElements": 100,
  "totalPages": 5,
  "first": true,
  "last": false
}
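
The `last` flag makes it straightforward to walk all pages. The sketch below assumes zero-based page numbers, as the example suggests. `iter_items` and the `fetch_page(page, size)` callable are hypothetical: this document does not specify how the page and size are passed to the server, so that part is left to the caller.

```python
from typing import Callable, Iterator


def iter_items(fetch_page: Callable[[int, int], dict], size: int = 20) -> Iterator:
    """Yield every item across pages, stopping when `last` is true."""
    page = 0  # assumption: pages are zero-based, as in the example above
    while True:
        body = fetch_page(page, size)
        yield from body["content"]
        # Treat a missing "last" field as the final page to avoid looping forever.
        if body.get("last", True):
            break
        page += 1
```

Here `fetch_page` would perform the HTTP call and return the decoded paged response as a dictionary.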

API Endpoints

Paths in the tables below are relative to the /api/v1 prefix, for example /api/v1/data-management/datasets.

Data Management

Endpoint                                     Method  Description
/data-management/datasets                    GET     Get dataset list
/data-management/datasets                    POST    Create dataset
/data-management/datasets/{id}               GET     Get dataset details
/data-management/datasets/{id}               PUT     Update dataset
/data-management/datasets/{id}               DELETE  Delete dataset
/data-management/datasets/{id}/files         GET     Get file list
/data-management/datasets/{id}/files/upload  POST    Upload files
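
Endpoint paths contain `{id}` placeholders and sit under the /api/v1 prefix used in the authentication examples. A minimal sketch of turning a path template into a full URL follows; `endpoint` and `API_PREFIX` are hypothetical names, and the base URL is an assumption.

```python
from urllib.parse import quote

API_PREFIX = "/api/v1"  # prefix used in the authentication examples


def endpoint(base_url: str, template: str, **params: str) -> str:
    """Fill {placeholders} in an endpoint path and prepend base URL and prefix."""
    path = template
    for name, value in params.items():
        # Percent-encode each value so it cannot alter the path structure.
        path = path.replace("{" + name + "}", quote(str(value), safe=""))
    if "{" in path:
        raise ValueError("unfilled placeholder in " + path)
    return base_url.rstrip("/") + API_PREFIX + path
```

For example, `endpoint("http://localhost:8080", "/data-management/datasets/{id}/files", id="42")` yields the URL for the file list of dataset 42.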

Data Cleaning

Endpoint                           Method  Description
/data-cleaning/tasks               GET     Get cleaning task list
/data-cleaning/tasks               POST    Create cleaning task
/data-cleaning/tasks/{id}          GET     Get task details
/data-cleaning/tasks/{id}          PUT     Update task
/data-cleaning/tasks/{id}          DELETE  Delete task
/data-cleaning/tasks/{id}/execute  POST    Execute task

Data Collection

Endpoint                             Method  Description
/data-collection/tasks               GET     Get collection task list
/data-collection/tasks               POST    Create collection task
/data-collection/tasks/{id}          GET     Get task details
/data-collection/tasks/{id}/execute  POST    Execute collection task

Data Synthesis

Endpoint                   Method  Description
/data-synthesis/tasks      GET     Get synthesis task list
/data-synthesis/tasks      POST    Create synthesis task
/data-synthesis/templates  GET     Get instruction template list
/data-synthesis/templates  POST    Create instruction template

Operator Market

Endpoint                                 Method  Description
/operator-market/operators               GET     Get operator list
/operator-market/operators               POST    Publish operator
/operator-market/operators/{id}          GET     Get operator details
/operator-market/operators/{id}/install  POST    Install operator

RAG Indexer

Endpoint                             Method  Description
/rag/knowledge-bases                 GET     Get knowledge base list
/rag/knowledge-bases                 POST    Create knowledge base
/rag/knowledge-bases/{id}/documents  POST    Upload documents
/rag/knowledge-bases/{id}/search     POST    Vector search

Error Codes

Code  Description
200   Success
201   Created
400   Bad Request
401   Unauthorized
403   Forbidden
404   Not Found
409   Conflict
429   Too Many Requests
500   Internal Server Error

Rate Limiting

API call rate limits:

  • Default limit: 1000 requests/hour
  • Burst limit: 100 requests/minute

Exceeding the limit returns 429 Too Many Requests.

Response headers contain rate limiting information:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1642252800
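
A client can use these headers to decide how long to pause after a 429. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the example value suggests but this document does not state. `seconds_until_reset` is a hypothetical helper.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before retrying, or 0 if requests remain."""
    # Header names are matched exactly here; a plain dict lookup is case-sensitive.
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers["X-RateLimit-Reset"])  # assumed Unix epoch seconds
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```

Passing the result to `time.sleep` before the next request keeps the client within the documented limits.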

Version Management

API versions are specified through URL paths:

  • Current version: /api/v1/
  • Future versions: /api/v2/


Last modified February 9, 2026: add english docs (3868c82)