1.3 Introduction to blocks

Central to Mage Pro's functionality are its modular components, known as blocks, the units from which scalable and efficient data workflows are built. Each block encapsulates a specific function, allowing data engineers to design pipelines that are both flexible and maintainable. By composing these components, users can streamline data ingestion, transformation, and export, ensuring smooth data flow and integration across systems and platforms.
Mage Pro offers eight basic block types:
Data Loader blocks: Facilitate the extraction of data from diverse source systems such as APIs, data warehouses, data lakes, and relational databases, supporting various protocols and authentication methods for secure and efficient data retrieval.
Transformer blocks: Execute data cleansing, transformation, and enrichment operations using advanced processing engines, enabling tasks like normalization, aggregation, and custom data manipulation to prepare data for analysis.
Data Exporter blocks: Enable the transfer of processed data to external destinations, including data synchronization services and cloud storage solutions, with support for multiple data formats and robust error handling mechanisms.
Scratchpad blocks: Provide a sandboxed environment for testing and prototyping code snippets outside the main pipeline execution, allowing developers to validate logic and debug scripts without affecting production workflows.
Sensor blocks: Continuously monitor specific conditions or metrics within the pipeline, triggering actions or alerts based on real-time data states or predefined thresholds to implement conditional logic.
dbt blocks: Integrate with dbt Core to execute dbt commands within Mage Pro, facilitating the orchestration of ELT workflows by managing dbt models, tests, and documentation.
Extension blocks: Enhance pipeline functionality by integrating external libraries and tools, such as the Great Expectations data quality framework, to enforce data validation rules and governance policies without acting as standalone processing units.
Callback blocks: Execute predefined functions in response to the success or failure of parent blocks, enabling automated post-processing actions like logging, notifications, and error handling to maintain pipeline resilience.
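To make the sensor and callback ideas above concrete, here is a rough pure-Python sketch of the control flow they implement: poll until a condition holds, then run a block and react to its success or failure. The `wait_for`, `run_with_callbacks`, `on_success`, and `on_failure` names are illustrative only, not Mage Pro's actual API.

```python
import time

def wait_for(condition, poll_seconds=0.01, timeout_seconds=1.0):
    """Illustrative sensor: poll a condition until it holds or we time out."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll_seconds)
    return False

def run_with_callbacks(block, on_success=None, on_failure=None):
    """Illustrative callback wiring: react to a parent block's outcome."""
    try:
        result = block()
    except Exception as exc:
        if on_failure:
            on_failure(exc)  # e.g. log the error or send an alert
        raise
    if on_success:
        on_success(result)   # e.g. send a notification
    return result

# Example: a "file" becomes available after a few polls.
state = {'polls': 0}

def file_has_landed():
    state['polls'] += 1
    return state['polls'] >= 3

events = []
if wait_for(file_has_landed):
    run_with_callbacks(
        lambda: 'loaded 3 rows',
        on_success=lambda r: events.append(f'notify: {r}'),
        on_failure=lambda e: events.append(f'alert: {e}'),
    )
print(events)  # ['notify: loaded 3 rows']
```

In Mage Pro itself, the sensor is a dedicated block that gates downstream blocks, and callbacks are attached to a parent block rather than wired by hand, but the success/failure branching follows this shape.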
Figure 1: Data loader block example
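As a rough standalone sketch of the loader-transformer-exporter pattern shown above: in a Mage project the `@data_loader`, `@transformer`, and `@data_exporter` decorators come from `mage_ai.data_preparation.decorators`, and Mage chains the blocks via the pipeline DAG; the no-op stand-ins and inline CSV below are assumptions made so the snippet runs on its own.

```python
import csv
import io

# No-op stand-ins for Mage's decorators, so this sketch runs outside a project.
def data_loader(fn):
    return fn

def transformer(fn):
    return fn

def data_exporter(fn):
    return fn

@data_loader
def load_data(**kwargs):
    # A real block might call an API or query a warehouse;
    # here we parse an inline CSV to keep the example self-contained.
    raw = "id,amount\n1,10\n2,25\n3,40\n"
    return list(csv.DictReader(io.StringIO(raw)))

@transformer
def transform(rows, **kwargs):
    # Cast types and keep only rows at or above a threshold.
    return [
        {'id': int(r['id']), 'amount': int(r['amount'])}
        for r in rows
        if int(r['amount']) >= 20
    ]

@data_exporter
def export_data(rows, **kwargs):
    # A real exporter would write to cloud storage or a database;
    # here we just render the rows back to CSV text.
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=['id', 'amount'])
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Mage wires these calls together from the pipeline DAG; chained by hand:
exported = export_data(transform(load_data()))
print(exported)
```

Each block receives its parent's output as its first argument, which is exactly how data flows between blocks in a Mage pipeline.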
In this section, you have explored Mage Pro's modular architecture and the eight fundamental block types that constitute its data pipelines. As you work further with Data Loader, Transformer, Data Exporter, Scratchpad, Sensor, dbt, Extension, and Callback blocks in Module 5, you will be equipped to design and implement scalable, efficient data workflows tailored to a range of data engineering needs. Next, we will introduce the Chicago Crime Analytics project in Lesson 1.4.
Proof of work
Earn 10 runs
Paste the link to your pipeline for this lesson. Our AI mentor will step inside, check your work, and reward you with free compute credits if you’ve nailed it.