Lesson
3.2 Transformers
Transformers are the workhorses of your data pipelines: they clean, reshape, and enhance the data produced by other blocks. Every data engineering project requires transformation logic, and Mage keeps that logic modular by giving each step its own block.
Understanding data transformation
Every block except Scratchpads and Sensors passes the value returned by its decorated function to all of its downstream blocks. This creates a clear data flow: each transformer receives the output of its upstream blocks, applies its transformations, and passes the result forward.
The power of transformers lies in their modularity. Instead of writing one large transformation script, you can break complex logic into smaller, testable pieces that can be reused across different pipelines.
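The flow of return values through small, single-purpose steps can be sketched in plain Python. This is only an illustration: the function names and sample rows are hypothetical, lists of dicts stand in for DataFrames to keep the sketch dependency-free, and in a real pipeline Mage does the chaining itself from the block graph.

```python
def load_orders():
    """Stand-in for a data loader block: its return value feeds downstream."""
    return [
        {'order_id': 1, 'amount': '10.50'},
        {'order_id': 2, 'amount': None},
        {'order_id': 3, 'amount': '4.25'},
    ]


def drop_missing_amounts(rows):
    """First small transformer: remove rows that have no amount."""
    return [row for row in rows if row['amount'] is not None]


def parse_amounts(rows):
    """Second small transformer: convert amount strings to floats."""
    return [{**row, 'amount': float(row['amount'])} for row in rows]


# Calling the steps by hand only makes the data flow visible;
# each step receives exactly what the previous one returned.
orders = parse_amounts(drop_missing_amounts(load_orders()))
print(orders)
```

Because each step is an ordinary function with one job, it can be tested on its own and reused in another pipeline without dragging the rest of the logic along.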
Transformer block structure
Every transformer follows this basic pattern:
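A minimal sketch of that pattern, assuming the decorator import path used by Mage's generated Python blocks (`mage_ai.data_preparation.decorators`). The fallback decorator and the column names `price`, `quantity`, and `total` are hypothetical and exist only so the sketch runs outside a Mage project.

```python
import pandas as pd

try:
    # Import path used by Mage's generated Python blocks.
    from mage_ai.data_preparation.decorators import transformer
except ImportError:
    # Hypothetical stand-in so the sketch also runs without Mage installed.
    def transformer(func):
        return func


@transformer
def transform(data, *args, **kwargs):
    """Receive the upstream block's output, change it, and return it."""
    # `data` is whatever the upstream block returned (a DataFrame here).
    df = data.copy()
    # Example transformation: add a derived column (names are illustrative).
    df['total'] = df['price'] * df['quantity']
    # The returned value is what downstream blocks receive.
    return df
```

The three parts to notice are the `@transformer` decorator, the function that takes upstream data as its first argument, and the `return` statement that hands the result to downstream blocks.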
Data transformation types in Mage
When creating transformer blocks in Mage, you can choose from several built-in templates that handle common data transformation scenarios:
Python transformers:
Generic (no template): Start with a blank Python transformer for custom logic
Clean column names: Standardize column naming conventions and remove special characters
Remove duplicate rows: Eliminate duplicate records based on specified criteria
Select columns: Choose specific columns to keep while dropping others
Filter rows: Apply conditional filtering logic to subset your data
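The operations behind the four named templates can be approximated with plain pandas. This is a rough sketch of what each template's name describes, not the code Mage generates; the helper names, sample data, and the `min_amount` threshold are all hypothetical.

```python
import re

import pandas as pd


def clean_column_names(df):
    """Lowercase names and collapse runs of special characters into '_'."""
    cleaned = [
        re.sub(r'[^0-9a-z]+', '_', str(col).strip().lower()).strip('_')
        for col in df.columns
    ]
    return df.rename(columns=dict(zip(df.columns, cleaned)))


def remove_duplicate_rows(df, subset=None):
    """Drop repeated rows, optionally comparing only the `subset` columns."""
    return df.drop_duplicates(subset=subset).reset_index(drop=True)


def select_columns(df, columns):
    """Keep only the listed columns."""
    return df[columns]


def filter_rows(df, min_amount):
    """Keep rows whose `amount` is at least `min_amount`."""
    return df[df['amount'] >= min_amount].reset_index(drop=True)


raw = pd.DataFrame({
    'Customer ID': [1, 1, 2, 3],
    'Amount ($)': [10.0, 10.0, 5.0, 20.0],
    'Notes': ['a', 'a', 'b', 'c'],
})

result = filter_rows(
    select_columns(
        remove_duplicate_rows(clean_column_names(raw)),
        ['customer_id', 'amount'],
    ),
    min_amount=10.0,
)
print(result)
```

In a pipeline, each helper would typically live in its own transformer block so the steps can be reordered or reused independently.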
SQL transformers:
Generic SQL: Write custom SQL transformation logic with full database capabilities
Automated SQL: Use Mage's visual interface for simple transformations without writing code
Raw SQL: Handle complex SQL operations with multiple statements and advanced database features
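To give a feel for the kind of logic a SQL transformer expresses, the sketch below runs an aggregation query from Python. It makes several assumptions: an in-memory SQLite database stands in for whatever database a Mage SQL block would be configured to use, and the `orders` table and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in connection.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE orders (customer_id INTEGER, amount REAL)')
conn.executemany(
    'INSERT INTO orders VALUES (?, ?)',
    [(1, 10.0), (1, 5.0), (2, 20.0)],
)

# The transformation itself: aggregate order amounts per customer.
transform_sql = """
SELECT customer_id, SUM(amount) AS total_amount
FROM orders
GROUP BY customer_id
ORDER BY customer_id
"""

totals = conn.execute(transform_sql).fetchall()
print(totals)
conn.close()
```

The point of a SQL transformer is that the query runs inside the database, so only the `SELECT` statement would be written in the block; the setup code here exists only to make the sketch self-contained.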
Language-specific options:
Python transformer: For pandas-based data manipulation and complex business logic
SQL transformer: For database-native operations and optimized query performance
R transformer: For statistical analysis and specialized R packages
PySpark transformer: For big data processing and distributed computing
Python data transformation block template:

Conclusion
Transformers represent the analytical heart of your data engineering workflows in Mage. By breaking complex data processing into modular, reusable blocks, you create pipelines that are not only more maintainable but also more reliable and easier to debug. Whether you're cleaning messy data with Python, performing complex aggregations with SQL, or building sophisticated features for machine learning, Mage's transformer ecosystem provides the tools and flexibility you need.