This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Ingest and transform data (30–35%)
--> Design and implement loading patterns
--> Prepare data for loading into a dimensional model
Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.
Introduction
One of the primary goals of data engineering is to transform raw operational data into a structure that supports efficient reporting, analytics, and business intelligence. In Microsoft Fabric, this often involves preparing data for loading into a dimensional model.
Dimensional modeling is a foundational concept in data warehousing and analytics. It organizes data into fact tables and dimension tables, enabling fast query performance, simplified reporting, and intuitive business analysis.
For the DP-700 exam, you should understand:
- Dimensional modeling concepts
- Fact and dimension tables
- Star and snowflake schemas
- Data preparation requirements
- Surrogate keys
- Slowly Changing Dimensions (SCDs)
- Data cleansing and conformance
- Loading sequence considerations
- Fabric implementation patterns using Lakehouses, Warehouses, Notebooks, Dataflows Gen2, and Pipelines
Many DP-700 scenario questions focus on selecting the correct transformations and loading patterns to support dimensional models.
What Is a Dimensional Model?
A dimensional model organizes data into structures optimized for analytics rather than transaction processing.
The model consists primarily of:
- Fact tables
- Dimension tables
Example:
Product Dimension
|
Customer Dimension --- Fact Sales --- Date Dimension
|
Store Dimension
This structure allows users to analyze business measures from multiple perspectives.
Why Use a Dimensional Model?
Dimensional models provide:
Improved Query Performance
Analytics queries often execute faster than on normalized transactional systems.
Easier Reporting
Business users can understand facts and dimensions more easily than complex normalized schemas.
Better Scalability
Supports large-scale reporting and analytical workloads.
Simplified BI Development
Tools such as Power BI work exceptionally well with dimensional models.
Fact Tables
Fact tables contain measurable business events.
Examples:
| Fact Table | Business Event |
|---|---|
| FactSales | Sales transactions |
| FactOrders | Customer orders |
| FactInventory | Inventory balances |
| FactClaims | Insurance claims |
Fact tables typically contain:
- Numeric measures
- Foreign keys to dimensions
Example:
| SalesKey | DateKey | ProductKey | CustomerKey | SalesAmount |
|---|---|---|---|---|
| 1001 | 20260101 | 201 | 501 | 250.00 |
Dimension Tables
Dimension tables provide descriptive attributes used for filtering and grouping.
Examples:
| Dimension | Example Attributes |
|---|---|
| Customer | Name, Region, Age |
| Product | Category, Brand |
| Date | Year, Month, Quarter |
| Store | Location, Territory |
Example:
| CustomerKey | CustomerName | Region |
|---|---|---|
| 501 | Smith Corp | East |
Understanding the Star Schema
The most common dimensional design is the star schema.
Product
|
Customer -- Fact Sales -- Date
|
Store
Characteristics:
- Central fact table
- Multiple dimensions
- Simple joins
- Excellent reporting performance
For DP-700, the star schema is typically the preferred analytical design.
Understanding the Snowflake Schema
A snowflake schema normalizes dimension tables.
Example:
Product |Category |Department
Advantages:
- Reduced redundancy
Disadvantages:
- More joins
- Increased complexity
Most Fabric analytics workloads favor star schemas over snowflake schemas.
Data Preparation Before Loading
Raw source data rarely fits directly into a dimensional model.
Preparation typically includes:
- Data cleansing
- Standardization
- Deduplication
- Business rule application
- Surrogate key generation
- Data quality validation
Data Cleansing
Before loading dimensions and facts, incorrect data must be corrected.
Examples:
Inconsistent Values
FLFloridaFla
Standardized to:
Florida
Invalid Dates
01/45/2026
Must be corrected or rejected.
Handling Missing Values
Example:
| CustomerID | |
|---|---|
| 101 | NULL |
Possible approaches:
- Default values
- Unknown members
- Data quality workflows
A common dimensional modeling practice is using “Unknown” dimension records.
Deduplication
Source systems often contain duplicate records.
Example:
| CustomerID | Name |
|---|---|
| 100 | Smith |
| 100 | Smith |
Duplicates should be removed before loading.
Conformed Dimensions
A conformed dimension is shared across multiple fact tables.
Example:
Fact Sales |Customer Dimension |Fact Orders
Benefits:
- Consistent reporting
- Unified business definitions
- Simplified analytics
DP-700 questions often reference conformed dimensions.
Surrogate Keys
Dimension tables typically use surrogate keys instead of business keys.
Example:
Source System:
| CustomerID |
|---|
| CUST100 |
Dimension:
| CustomerKey | CustomerID |
|---|---|
| 501 | CUST100 |
Why Use Surrogate Keys?
Advantages include:
Independence from Source Systems
Source keys can change.
Improved Performance
Integer keys are more efficient than text values.
Support for Slowly Changing Dimensions
Surrogate keys help track historical changes.
Dimension Loading Sequence
Dimension tables are usually loaded before fact tables.
Why?
Fact tables require dimension keys.
Typical workflow:
Load Dimensions ↓Generate Surrogate Keys ↓Load Fact Tables
Key Lookup Process
During fact loading:
- Source business key identified
- Matching dimension record located
- Surrogate key retrieved
- Fact record loaded
Example:
CustomerID = CUST100 ↓CustomerKey = 501 ↓FactSales loaded
Slowly Changing Dimensions (SCD)
Dimensions often change over time.
Example:
Customer moves from:
Florida
to
Texas
The organization must decide how historical records should be handled.
SCD Type 1
Type 1 overwrites existing values.
Example:
Before:
| Customer | State |
|---|---|
| Smith | Florida |
After:
| Customer | State |
|---|---|
| Smith | Texas |
History is lost.
SCD Type 2
Type 2 preserves history.
Example:
| Customer | State | Current |
|---|---|---|
| Smith | Florida | No |
| Smith | Texas | Yes |
Benefits:
- Historical reporting
- Auditability
- Trend analysis
Type 2 SCD is heavily tested in data engineering certifications.
Date Dimensions
Date dimensions are one of the most important dimensions.
Typical attributes:
| DateKey | Year | Quarter | Month |
|---|---|---|---|
| 20260101 | 2026 | Q1 | January |
Benefits:
- Consistent date calculations
- Faster reporting
- Simplified filtering
Fact Table Preparation
Before loading facts:
Validate Measures
Example:
SalesAmount >= 0
Verify Foreign Keys
Ensure referenced dimensions exist.
Remove Invalid Records
Reject records with missing required fields.
Apply Business Rules
Example:
NetSales =SalesAmount - DiscountAmount
Fact Table Granularity
Granularity defines the level of detail stored.
Examples:
Transaction-Level
One row per sale.
Daily Summary
One row per day.
Monthly Summary
One row per month.
The chosen grain should be clearly defined before loading.
Preparing Data in Microsoft Fabric
Several Fabric tools support dimensional modeling.
Dataflows Gen2
Useful for:
- Data cleansing
- Standardization
- Deduplication
- Business rule transformations
Best for low-code scenarios.
Notebooks
Useful for:
- Complex transformations
- Spark processing
- SCD implementation
- Large-scale dimension preparation
Common languages:
- PySpark
- Spark SQL
Data Pipelines
Useful for:
- Orchestration
- Scheduling
- Parameterized execution
- End-to-end ETL workflows
Fabric Warehouses
Useful for:
- Dimensional storage
- SQL-based transformations
- Star schema implementation
Typical Fabric Dimensional Loading Pattern
Source Systems ↓Bronze Layer(Raw Data) ↓Silver Layer(Cleansed Data) ↓Dimension Processing ↓Fact Processing ↓Gold Layer(Analytics Model)
This Medallion Architecture pattern is commonly used in Fabric implementations.
Common DP-700 Exam Scenarios
Scenario 1
A reporting system requires historical customer address tracking.
Best solution:
SCD Type 2
Scenario 2
A sales fact table references customer and product dimensions.
Best practice:
Load dimensions before facts.
Scenario 3
A product code changes in the source system.
Best solution:
Use surrogate keys.
Scenario 4
Multiple fact tables require consistent customer reporting.
Best solution:
Conformed dimensions.
Best Practices
Define Granularity Early
Determine the fact table grain before development.
Use Surrogate Keys
Avoid using business keys directly in fact tables.
Load Dimensions First
Fact loads depend on dimension keys.
Implement Data Quality Checks
Prevent invalid data from entering the warehouse.
Use Conformed Dimensions
Promote consistency across analytical models.
Preserve History When Needed
Use SCD Type 2 for historical reporting requirements.
DP-700 Exam Focus Areas
You should understand:
✓ Fact tables
✓ Dimension tables
✓ Star schemas
✓ Snowflake schemas
✓ Conformed dimensions
✓ Surrogate keys
✓ Business keys
✓ Fact table granularity
✓ Dimension loading strategies
✓ Fact loading strategies
✓ SCD Type 1
✓ SCD Type 2
✓ Data cleansing
✓ Data standardization
✓ Medallion Architecture support for dimensional models
Practice Exam Questions
Question 1
Which type of table stores measurable business events in a dimensional model?
A. Staging table
B. Dimension table
C. Lookup table
D. Fact table
Answer: D
Explanation
Fact tables store measurable events such as sales, orders, inventory quantities, and revenue.
Question 2
A data engineer needs to store customer attributes such as customer name, city, and region.
Which table type should be used?
A. Fact table
B. Bridge table
C. Aggregate table
D. Dimension table
Answer: D
Explanation
Dimension tables contain descriptive attributes used for filtering, grouping, and reporting.
Question 3
What is the primary advantage of a star schema?
A. More normalization
B. Simplified queries and better reporting performance
C. Reduced storage requirements
D. Elimination of dimensions
Answer: B
Explanation
Star schemas reduce join complexity and are optimized for analytical workloads.
Question 4
Which key type is typically used as the primary key in a dimension table?
A. Natural key
B. Foreign key
C. Composite key
D. Surrogate key
Answer: D
Explanation
Surrogate keys are system-generated identifiers that improve performance and support Slowly Changing Dimensions.
Question 5
A customer changes states from Florida to Texas, and historical reporting must be preserved.
Which Slowly Changing Dimension type should be used?
A. Type 0
B. Type 1
C. Type 2
D. Type 3
Answer: C
Explanation
Type 2 creates a new dimension record and preserves historical values.
Question 6
What should generally be loaded first during dimensional processing?
A. Dimension tables
B. Aggregate tables
C. Materialized views
D. Fact tables
Answer: A
Explanation
Fact tables require dimension keys, so dimensions are loaded first.
Question 7
Which activity is most commonly performed during data preparation for dimensional modeling?
A. Encrypting storage accounts
B. Data cleansing and standardization
C. Creating dashboards
D. Configuring network firewalls
Answer: B
Explanation
Data cleansing and standardization improve data quality before loading into dimensions and facts.
Question 8
A company wants multiple fact tables to use the same customer dimension.
What type of dimension should be implemented?
A. Slowly Changing Dimension
B. Role-playing Dimension
C. Junk Dimension
D. Conformed Dimension
Answer: D
Explanation
Conformed dimensions provide consistent business definitions across multiple fact tables.
Question 9
What is the primary purpose of a Date dimension?
A. Store transaction details
B. Manage security permissions
C. Provide standardized calendar attributes for reporting
D. Store surrogate key mappings
Answer: C
Explanation
Date dimensions simplify filtering, aggregation, and time-based reporting.
Question 10
A data engineer must ensure that every sales transaction is stored individually.
What fact table grain should be selected?
A. Monthly summary
B. Quarterly summary
C. Daily summary
D. Transaction-level detail
Answer: D
Explanation
Transaction-level grain stores one row per business event, providing the highest level of detail and analytical flexibility.
Exam Tip
For DP-700, remember this fundamental sequence:
Cleanse Data ↓Build Dimensions ↓Generate Surrogate Keys ↓Load Fact Tables ↓Publish Analytics Model
When an exam question discusses historical tracking, think SCD Type 2. When it discusses reporting performance and simplicity, think Star Schema. When it discusses multiple fact tables sharing the same business entity, think Conformed Dimension. These concepts appear frequently in real-world Fabric data warehouse implementations and certification exam scenarios.
Go to the DP-700 Exam Prep Hub main page.
