This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Ingest and transform data (30–35%)
   --> Design and implement loading patterns
      --> Prepare data for loading into a dimensional model

Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

One of the primary goals of data engineering is to transform raw operational data into a structure that supports efficient reporting, analytics, and business intelligence. In Microsoft Fabric, this often involves preparing data for loading into a dimensional model.

Dimensional modeling is a foundational concept in data warehousing and analytics. It organizes data into fact tables and dimension tables, enabling fast query performance, simplified reporting, and intuitive business analysis.

For the DP-700 exam, you should understand:

Dimensional modeling concepts
Fact and dimension tables
Star and snowflake schemas
Data preparation requirements
Surrogate keys
Slowly Changing Dimensions (SCDs)
Data cleansing and conformance
Loading sequence considerations
Fabric implementation patterns using Lakehouses, Warehouses, Notebooks, Dataflows Gen2, and Pipelines

Many DP-700 scenario questions focus on selecting the correct transformations and loading patterns to support dimensional models.

What Is a Dimensional Model?

A dimensional model organizes data into structures optimized for analytics rather than transaction processing.

The model consists primarily of:

Fact tables
Dimension tables

Example:

                Product Dimension
                        |
Customer Dimension --- Fact Sales --- Date Dimension
                        |
                 Store Dimension

This structure allows users to analyze business measures from multiple perspectives.

Why Use a Dimensional Model?

Dimensional models provide:

Improved Query Performance

Analytics queries often execute faster than on normalized transactional systems.

Easier Reporting

Business users can understand facts and dimensions more easily than complex normalized schemas.

Better Scalability

Supports large-scale reporting and analytical workloads.

Simplified BI Development

Tools such as Power BI work exceptionally well with dimensional models.

Fact Tables

Fact tables contain measurable business events.

Examples:

Fact Table	Business Event
FactSales	Sales transactions
FactOrders	Customer orders
FactInventory	Inventory balances
FactClaims	Insurance claims

Fact tables typically contain:

Numeric measures
Foreign keys to dimensions

Example:

SalesKey	DateKey	ProductKey	CustomerKey	SalesAmount
1001	20260101	201	501	250.00

Dimension Tables

Dimension tables provide descriptive attributes used for filtering and grouping.

Examples:

Dimension	Example Attributes
Customer	Name, Region, Age
Product	Category, Brand
Date	Year, Month, Quarter
Store	Location, Territory

Example:

CustomerKey	CustomerName	Region
501	Smith Corp	East

Understanding the Star Schema

The most common dimensional design is the star schema.

         Product
            |
Customer -- Fact Sales -- Date
            |
          Store

Characteristics:

Central fact table
Multiple dimensions
Simple joins
Excellent reporting performance

For DP-700, the star schema is typically the preferred analytical design.

Understanding the Snowflake Schema

A snowflake schema normalizes dimension tables.

Example:

			
Product
   |
Category
   |
Department

		

Advantages:

Reduced redundancy

Disadvantages:

More joins
Increased complexity

Most Fabric analytics workloads favor star schemas over snowflake schemas.

Data Preparation Before Loading

Raw source data rarely fits directly into a dimensional model.

Preparation typically includes:

Data cleansing
Standardization
Deduplication
Business rule application
Surrogate key generation
Data quality validation

Data Cleansing

Before loading dimensions and facts, incorrect data must be corrected.

Examples:

Inconsistent Values

			
FL
Florida
Fla

Standardized to:

Florida

Invalid Dates

01/45/2026

Must be corrected or rejected.

Handling Missing Values

Example:

CustomerID	Email
101	NULL

Possible approaches:

Default values
Unknown members
Data quality workflows

A common dimensional modeling practice is using “Unknown” dimension records.

Deduplication

Source systems often contain duplicate records.

Example:

CustomerID	Name
100	Smith
100	Smith

Duplicates should be removed before loading.

Conformed Dimensions

A conformed dimension is shared across multiple fact tables.

Example:

			
Fact Sales
     |
Customer Dimension
     |
Fact Orders

		

Benefits:

Consistent reporting
Unified business definitions
Simplified analytics

DP-700 questions often reference conformed dimensions.

Surrogate Keys

Dimension tables typically use surrogate keys instead of business keys.

Example:

Source System:

CustomerID
CUST100

Dimension:

CustomerKey	CustomerID
501	CUST100

Why Use Surrogate Keys?

Advantages include:

Independence from Source Systems

Source keys can change.

Improved Performance

Integer keys are more efficient than text values.

Support for Slowly Changing Dimensions

Surrogate keys help track historical changes.

Dimension Loading Sequence

Dimension tables are usually loaded before fact tables.

Why?

Fact tables require dimension keys.

Typical workflow:

			
Load Dimensions
       ↓
Generate Surrogate Keys
       ↓
Load Fact Tables

		

Key Lookup Process

During fact loading:

Source business key identified
Matching dimension record located
Surrogate key retrieved
Fact record loaded

Example:

			
CustomerID = CUST100
       ↓
CustomerKey = 501
       ↓
FactSales loaded

		

Slowly Changing Dimensions (SCD)

Dimensions often change over time.

Example:

Customer moves from:

Florida

Texas

The organization must decide how historical records should be handled.

SCD Type 1

Type 1 overwrites existing values.

Example:

Before:

Customer	State
Smith	Florida

After:

Customer	State
Smith	Texas

History is lost.

SCD Type 2

Type 2 preserves history.

Example:

Customer	State	Current
Smith	Florida	No
Smith	Texas	Yes

Benefits:

Historical reporting
Auditability
Trend analysis

Type 2 SCD is heavily tested in data engineering certifications.

Date Dimensions

Date dimensions are one of the most important dimensions.

Typical attributes:

DateKey	Year	Quarter	Month
20260101	2026	Q1	January

Benefits:

Consistent date calculations
Faster reporting
Simplified filtering

Fact Table Preparation

Before loading facts:

Validate Measures

Example:

SalesAmount >= 0

Verify Foreign Keys

Ensure referenced dimensions exist.

Remove Invalid Records

Reject records with missing required fields.

Apply Business Rules

Example:

			
NetSales =
SalesAmount - DiscountAmount

Fact Table Granularity

Granularity defines the level of detail stored.

Examples:

Transaction-Level

One row per sale.

Daily Summary

One row per day.

Monthly Summary

One row per month.

The chosen grain should be clearly defined before loading.

Preparing Data in Microsoft Fabric

Several Fabric tools support dimensional modeling.

Dataflows Gen2

Useful for:

Data cleansing
Standardization
Deduplication
Business rule transformations

Best for low-code scenarios.

Notebooks

Useful for:

Complex transformations
Spark processing
SCD implementation
Large-scale dimension preparation

Common languages:

PySpark
Spark SQL

Data Pipelines

Useful for:

Orchestration
Scheduling
Parameterized execution
End-to-end ETL workflows

Fabric Warehouses

Useful for:

Dimensional storage
SQL-based transformations
Star schema implementation

Typical Fabric Dimensional Loading Pattern

			
Source Systems
       ↓
Bronze Layer
(Raw Data)
       ↓
Silver Layer
(Cleansed Data)
       ↓
Dimension Processing
       ↓
Fact Processing
       ↓
Gold Layer
(Analytics Model)

		

This Medallion Architecture pattern is commonly used in Fabric implementations.

Common DP-700 Exam Scenarios

Scenario 1

A reporting system requires historical customer address tracking.

Best solution:

SCD Type 2

Scenario 2

A sales fact table references customer and product dimensions.

Best practice:

Load dimensions before facts.

Scenario 3

A product code changes in the source system.

Best solution:

Use surrogate keys.

Scenario 4

Multiple fact tables require consistent customer reporting.

Best solution:

Conformed dimensions.

Best Practices

Define Granularity Early

Determine the fact table grain before development.

Use Surrogate Keys

Avoid using business keys directly in fact tables.

Load Dimensions First

Fact loads depend on dimension keys.

Implement Data Quality Checks

Prevent invalid data from entering the warehouse.

Use Conformed Dimensions

Promote consistency across analytical models.

Preserve History When Needed

Use SCD Type 2 for historical reporting requirements.

DP-700 Exam Focus Areas

You should understand:

✓ Fact tables

✓ Dimension tables

✓ Star schemas

✓ Snowflake schemas

✓ Conformed dimensions

✓ Surrogate keys

✓ Business keys

✓ Fact table granularity

✓ Dimension loading strategies

✓ Fact loading strategies

✓ SCD Type 1

✓ SCD Type 2

✓ Data cleansing

✓ Data standardization

✓ Medallion Architecture support for dimensional models

Practice Exam Questions

Question 1

Which type of table stores measurable business events in a dimensional model?

A. Staging table

B. Dimension table

C. Lookup table

D. Fact table

Answer: D

Explanation

Fact tables store measurable events such as sales, orders, inventory quantities, and revenue.

Question 2

A data engineer needs to store customer attributes such as customer name, city, and region.

Which table type should be used?

A. Fact table

B. Bridge table

C. Aggregate table

D. Dimension table

Answer: D

Explanation

Dimension tables contain descriptive attributes used for filtering, grouping, and reporting.

Question 3

What is the primary advantage of a star schema?

A. More normalization

B. Simplified queries and better reporting performance

C. Reduced storage requirements

D. Elimination of dimensions

Answer: B

Explanation

Star schemas reduce join complexity and are optimized for analytical workloads.

Question 4

Which key type is typically used as the primary key in a dimension table?

A. Natural key

B. Foreign key

C. Composite key

D. Surrogate key

Answer: D

Explanation

Surrogate keys are system-generated identifiers that improve performance and support Slowly Changing Dimensions.

Question 5

A customer changes states from Florida to Texas, and historical reporting must be preserved.

Which Slowly Changing Dimension type should be used?

A. Type 0

B. Type 1

C. Type 2

D. Type 3

Answer: C

Explanation

Type 2 creates a new dimension record and preserves historical values.

Question 6

What should generally be loaded first during dimensional processing?

A. Dimension tables

B. Aggregate tables

C. Materialized views

D. Fact tables

Answer: A

Explanation

Fact tables require dimension keys, so dimensions are loaded first.

Question 7

Which activity is most commonly performed during data preparation for dimensional modeling?

A. Encrypting storage accounts

B. Data cleansing and standardization

C. Creating dashboards

D. Configuring network firewalls

Answer: B

Explanation

Data cleansing and standardization improve data quality before loading into dimensions and facts.

Question 8

A company wants multiple fact tables to use the same customer dimension.

What type of dimension should be implemented?

A. Slowly Changing Dimension

B. Role-playing Dimension

C. Junk Dimension

D. Conformed Dimension

Answer: D

Explanation

Conformed dimensions provide consistent business definitions across multiple fact tables.

Question 9

What is the primary purpose of a Date dimension?

A. Store transaction details

B. Manage security permissions

C. Provide standardized calendar attributes for reporting

D. Store surrogate key mappings

Answer: C

Explanation

Date dimensions simplify filtering, aggregation, and time-based reporting.

Question 10

A data engineer must ensure that every sales transaction is stored individually.

What fact table grain should be selected?

A. Monthly summary

B. Quarterly summary

C. Daily summary

D. Transaction-level detail

Answer: D

Explanation

Transaction-level grain stores one row per business event, providing the highest level of detail and analytical flexibility.

Exam Tip

For DP-700, remember this fundamental sequence:

			
Cleanse Data
      ↓
Build Dimensions
      ↓
Generate Surrogate Keys
      ↓
Load Fact Tables
      ↓
Publish Analytics Model

		

When an exam question discusses historical tracking, think SCD Type 2. When it discusses reporting performance and simplicity, think Star Schema. When it discusses multiple fact tables sharing the same business entity, think Conformed Dimension. These concepts appear frequently in real-world Fabric data warehouse implementations and certification exam scenarios.

Go to the DP-700 Exam Prep Hub main page.