Operator Market

Manage and use DataMate operators

The operator market provides a library of built-in data processing operators and supports developing custom ones.

Features Overview

The operator market provides:

  • Built-in Operators: Rich built-in data processing operators
  • Operator Publishing: Publish and share custom operators
  • Operator Installation: Install third-party operators
  • Custom Development: Develop custom operators

Built-in Operators

Data Cleaning Operators

Operator          Function             Input             Output
Deduplication     Remove duplicates    Dataset           Deduplicated data
Null Handler      Handle nulls         Dataset           Filled data
Format Converter  Convert format       Original format   New format
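
To make the table above concrete, here is a minimal sketch of what deduplication and null handling do conceptually, written in plain Python over a list of records. The function names `deduplicate` and `fill_nulls` and the record layout are assumptions for illustration only; they are not DataMate's built-in implementations.

```python
from typing import Any


def deduplicate(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop records that exactly repeat an earlier one, keeping first-seen order."""
    seen = set()
    unique = []
    for record in records:
        # Sort the items so key order does not affect equality.
        # Assumes all values are hashable (str, int, None, ...).
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def fill_nulls(records: list[dict[str, Any]], default: Any) -> list[dict[str, Any]]:
    """Replace every None value with the given default."""
    return [
        {k: (default if v is None else v) for k, v in record.items()}
        for record in records
    ]


rows = [
    {"id": 1, "name": "a"},
    {"id": 1, "name": "a"},   # exact duplicate of the first row
    {"id": 2, "name": None},  # null to be filled
]
cleaned = fill_nulls(deduplicate(rows), default="unknown")
```

Here `cleaned` holds two records: the duplicate is dropped and the missing name is replaced with "unknown".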

Text Processing Operators

Operator           Function
Text Segmentation  Chinese word segmentation
Remove Stopwords   Remove common stopwords
Text Cleaning      Clean special characters
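
As a rough illustration of the text operators above, the following sketch cleans special characters and removes stopwords. The `STOPWORDS` set and both function names are hypothetical, and the whitespace split is only a stand-in for segmentation: Chinese text has no spaces between words, so a real segmentation operator would need a dedicated segmenter.

```python
import re

# Hypothetical stopword list; a real operator would use a much larger one
STOPWORDS = {"the", "a", "an", "of", "and"}


def clean_text(text: str) -> str:
    """Remove characters that are neither word characters nor whitespace."""
    return re.sub(r"[^\w\s]", "", text)


def remove_stopwords(tokens: list[str]) -> list[str]:
    """Drop tokens that appear in the stopword list, ignoring case."""
    return [token for token in tokens if token.lower() not in STOPWORDS]


# Whitespace split stands in for a real segmentation step
tokens = clean_text("The quick, brown fox!").split()
filtered = remove_stopwords(tokens)
```

After this runs, `filtered` contains the tokens `quick`, `brown` and `fox`.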

Quick Start

1. Browse Operators

Step 1: Enter Operator Market

Select Operator Market in the left navigation.

Step 2: Browse Operators

View all available operators, along with their ratings and installation counts.

2. Install Operator

Install Built-in Operator

Built-in operators are installed by default.

Install Third-party Operator

  1. On the operator's details page, click Install
  2. Wait for the installation to complete

3. Use Operator

After installation, the operator can be used in:

  • Data Cleaning: Add operator node to cleaning pipeline
  • Pipeline Orchestration: Add operator node to workflow
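
Conceptually, adding operator nodes to a pipeline means passing data through each operator in order. The sketch below assumes every operator exposes a `process(data)` method, as the custom operator example on this page does. `run_pipeline` and the two small operators are hypothetical and do not represent DataMate's actual pipeline engine.

```python
class StripWhitespace:
    """Hypothetical operator: trims leading and trailing whitespace."""

    def process(self, data):
        return data.strip() if isinstance(data, str) else data


class Lowercase:
    """Hypothetical operator: converts text to lowercase."""

    def process(self, data):
        return data.lower() if isinstance(data, str) else data


def run_pipeline(operators, data):
    """Feed the output of each operator into the next one, in order."""
    for operator in operators:
        data = operator.process(data)
    return data


result = run_pipeline([StripWhitespace(), Lowercase()], "  Hello World  ")
```

The order of the nodes matters in general: each operator only sees what the previous one produced.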

Advanced Features

Develop Custom Operator

Create Operator

  1. On the operator market page, click Create Operator
  2. Fill in the operator information
  3. Write the operator code (Python)
  4. Package and publish the operator

Python Operator Example:

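import re


class MyTextCleaner:
    """Example operator that strips special characters from text."""

    def __init__(self, config):
        # Special characters are removed unless the config disables it
        self.remove_special_chars = config.get('remove_special_chars', True)

    def process(self, data):
        # Non-string input is passed through unchanged
        if not isinstance(data, str):
            return data
        if self.remove_special_chars:
            # Keep only word characters and whitespace
            return re.sub(r'[^\w\s]', '', data)
        return data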
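
The following usage sketch shows how the example operator behaves with different configurations. The class is repeated in compact form only so the snippet runs on its own; how DataMate itself instantiates and calls operators is not shown here and may differ.

```python
import re


class MyTextCleaner:
    # Compact copy of the example class, repeated so this snippet is self-contained
    def __init__(self, config):
        self.remove_special_chars = config.get('remove_special_chars', True)

    def process(self, data):
        if isinstance(data, str) and self.remove_special_chars:
            return re.sub(r'[^\w\s]', '', data)
        return data


# An empty config falls back to the default (remove_special_chars=True)
cleaner = MyTextCleaner({})
cleaned = cleaner.process("Hello, world! #1")

# With the option disabled, text is returned unchanged
passthrough = MyTextCleaner({'remove_special_chars': False})
unchanged = passthrough.process("Hello, world! #1")

# Non-string input is passed through as-is
non_string = cleaner.process(42)
```

Note that in Python 3, `\w` on `str` patterns is Unicode-aware, so letters from other scripts (including Chinese characters) and the underscore are kept rather than removed.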

Best Practices

1. Operator Design

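A well-designed operator follows these principles:

  • Single responsibility: One operator does one thing
  • Configurable: Expose behavior through configuration options instead of hard-coded values
  • Error handling: Validate configuration and input, and report failures clearly
  • Performance: Design for large datasets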
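
One way these principles could look in code is sketched below. `TruncateText`, its `max_length` option and the `process_stream` method are hypothetical names chosen for illustration; they are not part of a documented DataMate interface.

```python
from typing import Iterable, Iterator


class TruncateText:
    """Hypothetical operator with one job: truncate strings to max_length characters."""

    def __init__(self, config: dict):
        # Configurable: the limit comes from the config, with a default
        max_length = config.get("max_length", 100)
        # Error handling: reject bad configuration early, before any data is processed
        if not isinstance(max_length, int) or max_length <= 0:
            raise ValueError(f"max_length must be a positive integer, got {max_length!r}")
        self.max_length = max_length

    def process(self, data: str) -> str:
        # Error handling: report unexpected input types explicitly
        if not isinstance(data, str):
            raise TypeError(f"expected str, got {type(data).__name__}")
        return data[: self.max_length]

    def process_stream(self, items: Iterable[str]) -> Iterator[str]:
        # Performance: yield results one at a time so the full dataset
        # never has to be held in memory at once
        for item in items:
            yield self.process(item)


operator = TruncateText({"max_length": 5})
results = list(operator.process_stream(["abcdefgh", "xy"]))
```

Whether an operator should raise on bad input or pass it through unchanged, as the earlier `MyTextCleaner` example does, is a design choice; the point is to make that behavior deliberate and visible.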

Common Questions

Q: What should I do if operator execution fails?

A: Troubleshoot in this order:

  1. View the execution logs
  2. Check the operator configuration
  3. Check the input data format
  4. Test the operator locally
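
For the local testing step, a small harness like the one below can help isolate which inputs make an operator fail. It is only a sketch: `try_operator` and `UppercaseOperator` are hypothetical names, and the harness assumes the operator exposes a `process(data)` method as in the example on this page.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("operator_test")


def try_operator(operator, samples):
    """Run operator.process() on each sample and collect failures instead of stopping."""
    failures = []
    for index, sample in enumerate(samples):
        try:
            output = operator.process(sample)
            logger.info("sample %d: %r -> %r", index, sample, output)
        except Exception as exc:
            # Record the failing input so its data format can be inspected
            logger.error("sample %d failed on %r: %s", index, sample, exc)
            failures.append((index, sample, exc))
    return failures


class UppercaseOperator:
    # Hypothetical stand-in for the operator being debugged
    def process(self, data):
        return data.upper()


# The None sample simulates a record with an unexpected data format
failures = try_operator(UppercaseOperator(), ["ok", None, "fine"])
```

In this run, only the `None` sample ends up in `failures`, which points to an input format problem rather than a configuration issue.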