Data Management
The data management module provides unified dataset management: it stores, queries, and operates on datasets of multiple data types.
Features Overview
The module provides:
- Multiple data types: Image, text, audio, video, and multimodal support
- File management: Upload, download, preview, delete operations
- Directory structure: Support for hierarchical directory organization
- Tag management: Use tags to categorize and retrieve data
- Statistics: Dataset size, file count, and other statistics
Dataset Types
| Type | Description | Supported Formats |
|---|---|---|
| Image | Image data | JPG, PNG, GIF, BMP, WebP |
| Text | Text data | TXT, MD, JSON, CSV |
| Audio | Audio data | MP3, WAV, FLAC, AAC |
| Video | Video data | MP4, AVI, MOV, MKV |
| Multimodal | Multimodal data | Mixed formats |
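As an illustration of how the table above could be applied before creating a dataset, the following minimal sketch maps file extensions to a dataset type. The format lists are copied from the table; the names `SUPPORTED_FORMATS`, `file_type`, and `suggest_dataset_type` are hypothetical helpers and not part of the platform.

```python
from pathlib import Path
from typing import Iterable, Optional

# Extensions copied from the Dataset Types table; the mapping is illustrative.
SUPPORTED_FORMATS = {
    "Image": {"jpg", "png", "gif", "bmp", "webp"},
    "Text": {"txt", "md", "json", "csv"},
    "Audio": {"mp3", "wav", "flac", "aac"},
    "Video": {"mp4", "avi", "mov", "mkv"},
}


def file_type(filename: str) -> Optional[str]:
    """Return the dataset type for a filename, or None if the extension is not listed."""
    ext = Path(filename).suffix.lower().lstrip(".")
    for dataset_type, extensions in SUPPORTED_FORMATS.items():
        if ext in extensions:
            return dataset_type
    return None


def suggest_dataset_type(filenames: Iterable[str]) -> str:
    """Suggest a single type if all files share it, otherwise 'Multimodal'."""
    types = set()
    for name in filenames:
        detected = file_type(name)
        if detected is None:
            raise ValueError(f"Unsupported format: {name}")
        types.add(detected)
    if not types:
        raise ValueError("No files given")
    return types.pop() if len(types) == 1 else "Multimodal"
```

For example, `suggest_dataset_type(["a.png", "b.mp3"])` returns `"Multimodal"` because the files belong to two different types.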
Quick Start
1. Create Dataset
Step 1: Enter Data Management Page
In the left navigation, select Data Management.
Step 2: Create Dataset
Click the Create Dataset button in the upper right corner.
Step 3: Fill Basic Information
- Dataset name: e.g., user_images_dataset
- Dataset type: Select data type (e.g., Image)
- Description: Dataset purpose description (optional)
- Tags: Add tags for categorization (optional)
Step 4: Create Dataset
Click the Create button to complete.
2. Upload Files
Method 1: Drag & Drop
- Enter dataset details page
- Drag files directly to the upload area
- Wait for upload completion
Method 2: Click Upload
- Click Upload File button
- Select local files
- Wait for upload completion
Method 3: Chunked Upload (Large Files)
For large files (>100MB), the system automatically uses chunked upload:
- Select large file to upload
- System automatically splits the file
- Upload chunks one by one
- Automatically merge
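The split-and-merge idea behind chunked upload can be sketched locally as follows. This is not the platform's implementation: the chunk size, the function names `split_file` and `merge_chunks`, and the use of local files instead of network requests are all assumptions made for illustration.

```python
CHUNK_SIZE = 8 * 1024 * 1024  # assumed value; the real chunk size is not documented


def split_file(path, chunk_size=CHUNK_SIZE):
    """Yield (index, data) pairs, reading the file one chunk at a time."""
    with open(path, "rb") as source:
        index = 0
        while True:
            data = source.read(chunk_size)
            if not data:
                break
            yield index, data
            index += 1


def merge_chunks(chunks, out_path):
    """Write chunks to out_path in index order, regardless of arrival order."""
    with open(out_path, "wb") as target:
        for _, data in sorted(chunks, key=lambda chunk: chunk[0]):
            target.write(data)
```

Numbering each chunk lets the merge step restore the original byte order even if chunks arrive out of sequence.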
3. Create Directory
Step 1: Enter Dataset
Click dataset name to enter details.
Step 2: Create Directory
- Click Create Directory button
- Enter directory name
- Select parent directory (optional)
- Click confirm
Directory structure example:
user_images_dataset/
├── train/
│   ├── cat/
│   └── dog/
├── test/
│   ├── cat/
│   └── dog/
└── validation/
    ├── cat/
    └── dog/
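If the files are prepared locally before upload, the same layout can be created on disk first. The sketch below uses only the Python standard library; `create_layout` is a hypothetical helper, and the split and class names are taken from the example tree above.

```python
from pathlib import Path


def create_layout(root, splits=("train", "test", "validation"), classes=("cat", "dog")):
    """Create root/<split>/<class>/ for every combination and return the paths."""
    created = []
    for split in splits:
        for label in classes:
            directory = Path(root) / split / label
            # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
            directory.mkdir(parents=True, exist_ok=True)
            created.append(directory)
    return created
```

Calling `create_layout("user_images_dataset")` creates six leaf directories that mirror the example structure.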
4. Manage Files
View Files
On the dataset details page, you can see all files:
| Filename | Size | File Count | Upload Time | Tags | Tag Update Time | Actions |
|---|---|---|---|---|---|---|
| image1.jpg | 2.3 MB | 1 | 2024-01-15 | Training Set | 2024-01-16 | Download Rename Delete |
| image2.png | 1.8 MB | 1 | 2024-01-15 | Validation Set | 2024-01-16 | Download Rename Delete |
Preview File
Click the Preview button to preview a file in the browser:
- Image: Display thumbnail and details
- Text: Display text content
- Audio: Online playback
- Video: Online playback
Download File
- Single file download: Click Download button
Currently, batch download and package download are not supported.
5. Dataset Operations
View Statistics
On the dataset details page, you can see:
- Total files: Total number of files in dataset
- Total size: Total size of all files
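To cross-check these two figures against a local copy of the data, a sketch like the following can compute them from a directory. It assumes the statistics count every regular file in all subdirectories; `dataset_stats` and `human_size` are hypothetical helpers, and the platform may compute or round its values differently.

```python
from pathlib import Path


def dataset_stats(root):
    """Count regular files under root (recursively) and sum their sizes in bytes."""
    files = [path for path in Path(root).rglob("*") if path.is_file()]
    return {
        "total_files": len(files),
        "total_size": sum(path.stat().st_size for path in files),
    }


def human_size(num_bytes):
    """Format a byte count with binary units (1 KB = 1024 B), capped at GB."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024 or unit == "GB":
            return f"{size:.1f} {unit}"
        size /= 1024
```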
Edit Dataset
Click the Edit button to modify:
- Dataset name
- Description
- Tags
- Associated collection task
Delete Dataset
Click the Delete button to delete the entire dataset.
Note: Deleting a dataset will also delete all files within it. This action cannot be undone.
Advanced Features
Tag Management
Create Tag
- On the dataset list page, click Tag Management
- Click Create Tag
- Enter tag name
Use Tags
- Edit dataset
- Select existing tags in tag bar
- Save dataset
Filter by Tags
On the dataset list page, click a tag to show only the datasets that carry it.
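Conceptually, tag filtering is a membership test on each dataset's tag set. The sketch below models that with a hypothetical `Dataset` record; it illustrates the behavior only and does not reflect the platform's data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    # Hypothetical record: a dataset name plus the set of tags attached to it.
    name: str
    tags: set = field(default_factory=set)


def filter_by_tag(datasets, tag):
    """Return the datasets that carry the given tag, preserving input order."""
    return [dataset for dataset in datasets if tag in dataset.tags]
```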
Best Practices
1. Dataset Organization
Recommended directory organization:
project_dataset/
├── raw/ # Raw data
├── processed/ # Processed data
├── train/ # Training data
├── validation/ # Validation data
└── test/ # Test data
2. Naming Conventions
- Dataset name: Use lowercase letters and underscores, e.g., user_images_2024
- Directory name: Use meaningful English names, e.g., train, test, processed
- File name: Keep original filename or use standardized naming
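The dataset naming convention can be checked with a small script before creating a dataset. The sketch below is an assumption-laden illustration: it allows digits because the example name contains them, requires a leading letter, and the helpers `is_valid_dataset_name` and `to_dataset_name` are hypothetical. The platform itself may accept other names.

```python
import re

# Assumed pattern: a lowercase letter first, then lowercase letters, digits, or underscores.
DATASET_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_dataset_name(name: str) -> bool:
    """Check a name against the recommended convention."""
    return DATASET_NAME_PATTERN.fullmatch(name) is not None


def to_dataset_name(title: str) -> str:
    """Lowercase a free-form title and replace other character runs with underscores."""
    return re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
```

For example, `to_dataset_name("User Images 2024")` returns `"user_images_2024"`, which passes `is_valid_dataset_name`.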
3. Tag Usage
Recommended tag categories:
- Project tags: project-a, project-b
- Status tags: raw, processed, validated
- Type tags: image, text, audio
- Purpose tags: training, testing, evaluation
4. Data Backup
The system does not currently support automatic backup. To back up data, manually download individual files:
- Enter dataset details page
- Find the file you need to backup
- Click the Download button of the file
Common Questions
Q: Large file upload fails?
A: Suggestions for large file uploads:
- Use chunked upload: System automatically enables chunked upload
- Check network: Ensure stable network connection
- Adjust upload parameters: Increase timeout
- Use FTP/SFTP: For very large files, upload via FTP or SFTP
Q: How to import existing data?
A: Three methods to import existing data:
- Upload files: Upload via interface
- Add files: If the files are already on the server, use the add file feature
- Data collection: Use data collection module to collect from external sources
Q: Dataset size limit?
A: Dataset size limits:
- Single file: Maximum 5GB (chunked upload)
- Total dataset: Limited by storage space
- File count: No explicit limit
Regularly clean unnecessary files to free up space.
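The limits stated on this page (chunked upload above 100MB, a 5GB maximum per file) can be checked before starting an upload. The sketch below assumes binary units (1 MB = 1024 × 1024 bytes) and assumes that files at or below 100MB are uploaded in a single request; `upload_mode` and its return values are hypothetical.

```python
MB = 1024 * 1024
GB = 1024 * MB

CHUNKED_THRESHOLD = 100 * MB  # files larger than this use chunked upload
MAX_FILE_SIZE = 5 * GB        # documented single-file maximum


def upload_mode(size_bytes: int) -> str:
    """Classify a file size as 'direct', 'chunked', or 'rejected'."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes > MAX_FILE_SIZE:
        return "rejected"
    if size_bytes > CHUNKED_THRESHOLD:
        return "chunked"
    return "direct"
```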
API Reference
For detailed API documentation, see:
Related Documentation
- Data Collection - Collect data to datasets
- Data Cleaning - Clean dataset data
- Data Annotation - Annotate dataset data