8.2 Part 1: Build the Dimension Tables

Final Project

8.2 Part 1: Build the Dimension Tables

Scenario

Your analytics team needs to analyze complaints by agency, type, and location. You'll create three dimension tables that provide context and enable efficient filtering and grouping.


Exercise 1A: Agency Dimension

Requirements

Create a dimension table that stores unique agencies responding to 311 complaints.

Step 1: Create SQL transformer block

  1. Click "Blocks""Transformer" → "SQL: Base template (generic)”

  2. Name the block: build_dim_agency

  3. Connect it to the transform_to_silver_layer block

Step 2: Implement the Dimension SQL

-- Agency Dimension Table

CREATE TABLE IF NOT EXISTS `gcp_project.schema_name.table_name` (
    agency_key STRING,
    agency_code STRING,
    agency_name STRING
);

WITH new_agencies AS (
    SELECT DISTINCT
        agency_key,
        agency as agency_code,
        agency as agency_name
    FROM `gcp_project.schema_name.table_name`
    WHERE agency IS NOT NULL
)
MERGE `gcp_project.schema_name.table_name` AS target
USING new_agencies AS source
ON target.agency_key = source.agency_key

WHEN NOT MATCHED THEN INSERT (
    agency_key,
    agency_code,
    agency_name
) VALUES (
    source.agency_key,
    source.agency_code,
    source.agency_name
)

Step 3: Validate

  1. Run the block

  2. Query the table to verify unique agencies were loaded

  3. Confirm hash keys match those in the silver layer


Exercise 1B: Complaint Type Dimension

Requirements

Create a dimension table for complaint types and their descriptors.

Step 1: Create SQL Data Loader Block

  1. Click "Blocks""Transformer" SQL: Base template (generic)”

  2. Name the block: build_dim_complaint_type

  3. Connect it to the transform_to_silver_layer block

Step 2: Implement the Dimension SQL

-- Complaint Type Dimension Table

CREATE TABLE IF NOT EXISTS `gcp_project.schema_name.table_name` (
    complaint_type_key STRING,
    complaint_type STRING,
    descriptor STRING
);

WITH new_complaint_types AS (
    SELECT DISTINCT
        complaint_type_key,
        complaint_type,
        descriptor
    FROM `gcp_project.schema_name.table_name`
    WHERE complaint_type IS NOT NULL
)
MERGE `gcp_project.schema_name.table_name` AS target
USING new_complaint_types AS source
ON target.complaint_type_key = source.complaint_type_key

WHEN NOT MATCHED THEN INSERT (
    complaint_type_key,
    complaint_type,
    descriptor
) VALUES (
    source.complaint_type_key,
    source.complaint_type,
    source.descriptor
)

Step 3: Validate

  1. Run the block

  2. Query the table to verify unique agencies were loaded

  3. Confirm hash keys match those in the silver layer


Exercise 1C: Location Dimension

Requirements

Create a dimension table for geographic analysis by borough and zip code.

Step 1: Create SQL Data Loader Block

  1. Click "Blocks""Transformer" SQL: Base template (generic)”

  2. Name the block: build_dim_location

  3. Connect it to the transform_to_silver_layer block

Step 2: Implement the Dimension SQL

-- Location Dimension Table 

CREATE TABLE IF NOT EXISTS `gcp_project.schema_name.table_name` (
    location_key STRING,
    borough STRING,
    incident_zip STRING
);

WITH new_locations AS (
    SELECT DISTINCT
        location_key,
        borough,
        incident_zip
    FROM `gcp_project.schema_name.table_name`
    WHERE borough IS NOT NULL
        AND incident_zip IS NOT NULL
)
MERGE `gcp_project.schema_name.table_name` AS target
USING new_locations AS source
ON target.location_key = source.location_key

WHEN NOT MATCHED THEN INSERT (
    location_key,
    borough,
    incident_zip
) VALUES (
    source.location_key,
    source.borough,
    source.incident_zip
)

Step 3: Validate

  1. Run the block

  2. Query the table to verify unique agencies were loaded

  3. Confirm hash keys match those in the silver layer

Learning Objectives

After completing these dimension exercises, you should understand:

  • ✅ Star schema dimensional modeling concepts

  • ✅ Creating dimension tables with hash-based surrogate keys

  • ✅ Handling slowly changing dimensions (Type 1)

  • ✅ Using MERGE for incremental dimension loads

  • ✅ Ensuring referential integrity through consistent key generation