This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Ingest and transform data (30–35%)
--> Design and implement loading patterns
--> Design and implement full and incremental data loads
Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.
Introduction
One of the most important responsibilities of a data engineer is moving data efficiently from source systems into analytical platforms. As data volumes grow, loading strategies become critical for performance, scalability, cost management, and data freshness.
In Microsoft Fabric, data engineers frequently design and implement two primary loading patterns:
- Full Loads
- Incremental Loads
Understanding when and how to use each approach is a core skill measured in the DP-700 exam.
For the exam, you should understand:
- The differences between full and incremental loading
- Advantages and disadvantages of each approach
- Change detection techniques
- Watermarking strategies
- Merge and upsert operations
- Loading patterns in Lakehouses and Warehouses
- Performance considerations
- Common implementation methods using Dataflows Gen2, Pipelines, Notebooks, and SQL
Many DP-700 scenario questions focus on selecting the most appropriate loading strategy for a given business requirement.
Understanding Data Loading Patterns
A data load is the process of moving data from a source system into a destination such as:
- Lakehouse
- Data Warehouse
- Delta Table
- Analytical Model
The loading strategy determines how data is transferred.
The two most common strategies are:
Full LoadIncremental Load
What Is a Full Load?
A full load reloads all records from the source into the destination every time the process runs.
Example:
Source Table:
| CustomerID | Name |
|---|---|
| 1 | Smith |
| 2 | Jones |
| 3 | Brown |
Every execution loads all three records regardless of whether changes occurred.
Full Load Process
Source Data ↓Delete Existing Data ↓Reload Entire Dataset ↓Destination Table
Some implementations truncate the destination before reloading.
Full Load Characteristics
Loads All Records
Every execution processes the entire dataset.
Simple Logic
No need to identify changed records.
Easy Troubleshooting
The destination always reflects a complete source snapshot.
Higher Resource Consumption
Processing time grows as data volume increases.
Advantages of Full Loads
Simplicity
The implementation is straightforward.
Reduced Change Tracking Requirements
No need for timestamps or change detection.
Easier Validation
Source and destination can be directly compared.
Useful for Small Datasets
When data volume is low, the overhead is minimal.
Disadvantages of Full Loads
Longer Processing Times
Large datasets take longer to reload.
Increased Compute Costs
More resources are consumed.
Greater Network Utilization
All records must be transferred.
Limited Scalability
May become impractical as data volumes grow.
Common Full Load Scenarios
Small Reference Tables
Examples:
- Countries
- Product Categories
- Currency Codes
Initial Data Loads
The first load into a new Lakehouse or Warehouse is typically a full load.
Periodic Snapshot Tables
Some reporting systems intentionally capture full snapshots.
What Is an Incremental Load?
An incremental load transfers only data that has changed since the previous execution.
Instead of processing millions of records, only new or modified records are loaded.
Incremental Load Process
Source Data ↓Identify Changes ↓Load New/Changed Records ↓Update Destination
Incremental Load Characteristics
Processes Only Changes
Only inserts, updates, and sometimes deletes are handled.
Faster Execution
Smaller data volumes are processed.
Lower Costs
Reduced compute and storage consumption.
Greater Complexity
Requires change detection mechanisms.
Advantages of Incremental Loads
Better Performance
Only changed records are processed.
Lower Resource Usage
Reduces compute requirements.
Improved Scalability
Handles large datasets efficiently.
Faster Refresh Times
Supports more frequent data updates.
Disadvantages of Incremental Loads
More Complex Design
Must accurately identify changes.
Requires Metadata
Change tracking information is often needed.
Potential Data Quality Risks
Missed changes can lead to inconsistent results.
Common Incremental Load Scenarios
Large Transaction Tables
Examples:
- Sales transactions
- Orders
- IoT telemetry
- Customer interactions
Near Real-Time Analytics
Frequent updates benefit from incremental processing.
Enterprise Data Warehouses
Most modern warehouse architectures rely heavily on incremental loads.
Comparing Full and Incremental Loads
| Feature | Full Load | Incremental Load |
|---|---|---|
| Data Processed | All Records | Changed Records Only |
| Complexity | Low | Higher |
| Performance | Lower | Higher |
| Scalability | Limited | Excellent |
| Compute Cost | Higher | Lower |
| Initial Load | Common | Rare |
| Large Datasets | Poor Fit | Ideal |
Change Detection Techniques
The success of incremental loading depends on identifying changed records.
Common methods include:
- Timestamps
- Watermarks
- Change Data Capture (CDC)
- Delta tables
- Version columns
Timestamp-Based Loading
Many source systems contain columns such as:
| Column |
|---|
| CreatedDate |
| ModifiedDate |
| LastUpdated |
Example:
SELECT *FROM SalesWHERE ModifiedDate > '2026-01-01'
Only records changed after the previous load are returned.
Watermarking
A watermark stores the most recent processed value.
Example:
Last Processed Date:2026-01-01 12:00 PM
Next execution:
SELECT *FROM SalesWHERE ModifiedDate > Watermark
After successful processing:
Watermark Updated
High-Watermark Pattern
One of the most common DP-700 exam topics.
Workflow:
Read Watermark ↓Extract New Records ↓Load Destination ↓Update Watermark
This pattern minimizes duplicate processing.
Change Data Capture (CDC)
CDC tracks:
- Inserts
- Updates
- Deletes
Instead of comparing entire tables, the source system records changes.
Benefits:
- Highly efficient
- Accurate change detection
- Reduced processing overhead
CDC is commonly used in enterprise environments.
Delta Tables and Incremental Loads
Microsoft Fabric Lakehouses use Delta Lake technology.
Delta tables support:
- ACID transactions
- Versioning
- Efficient updates
- Merge operations
These features make incremental processing much easier.
Upserts
An upsert combines:
UPDATE+INSERT
Logic:
- Update existing records
- Insert new records
Example:
Customer 100 exists:
UPDATE
Customer 101 does not exist:
INSERT
Merge Operations
MERGE is commonly used for incremental loads.
Example logic:
MERGE TargetTableUSING SourceTableON TargetTable.CustomerID = SourceTable.CustomerIDWHEN MATCHED THEN UPDATEWHEN NOT MATCHED THEN INSERT
This is one of the most important SQL patterns for DP-700.
Handling Deletes
Deletes are often overlooked.
Possible approaches:
Soft Deletes
Add:
IsDeleted = True
CDC Delete Tracking
Capture delete operations directly.
Full Reconciliation
Periodically compare source and target.
Loading Patterns in Microsoft Fabric
Several Fabric components support loading operations.
Dataflow Gen2
Best for:
- Low-code ingestion
- Power Query transformations
- Small-to-medium workloads
Data Pipelines
Best for:
- Orchestration
- Scheduling
- Parameterized loads
- End-to-end automation
Notebooks
Best for:
- Spark-based transformations
- Delta MERGE operations
- Advanced incremental logic
SQL Warehouses
Best for:
- T-SQL-based loading
- MERGE statements
- Data warehouse processing
Common Enterprise Pattern
A common Fabric implementation:
Pipeline ↓Read Watermark ↓Notebook ↓MERGE Into Delta Table ↓Update Watermark
This pattern is frequently used in production environments.
Choosing Between Full and Incremental Loads
Choose Full Loads When
- Data volume is small
- Simplicity is important
- Initial loading is required
- Change tracking is unavailable
Choose Incremental Loads When
- Data volume is large
- Frequent refreshes are required
- Performance matters
- Compute costs must be minimized
Common DP-700 Exam Scenarios
Scenario 1
A table contains 50 million records and receives 10,000 new records daily.
Best approach:
Incremental Load
Scenario 2
A lookup table contains 50 records and changes monthly.
Best approach:
Full Load
Scenario 3
A source system includes a LastModifiedDate column.
Best approach:
Timestamp-Based Incremental Load
Scenario 4
A Lakehouse Delta table must update existing rows and insert new rows.
Best approach:
MERGE Operation
Best Practices
Use Incremental Loads for Large Tables
Avoid unnecessary processing.
Maintain Watermarks Carefully
Improper watermark handling can cause missed records.
Use MERGE for Upserts
This simplifies insert/update processing.
Validate Change Detection Logic
Ensure no records are missed.
Periodically Reconcile Data
Occasional full validation improves data quality.
DP-700 Exam Focus Areas
You should understand:
✓ Full load design
✓ Incremental load design
✓ Watermark patterns
✓ High-watermark processing
✓ Timestamp-based loading
✓ Change Data Capture (CDC)
✓ Delta Lake incremental processing
✓ MERGE operations
✓ Upserts
✓ Delete handling
✓ Lakehouse loading strategies
✓ Pipeline orchestration for data loads
Practice Exam Questions
Question 1
A source table contains 100 million records, but only 5,000 records change daily. Which loading strategy is most appropriate?
A. Incremental load
B. Full load
C. Snapshot load
D. Manual reload
Answer: A
Explanation
Incremental loading minimizes processing by loading only changed records, making it ideal for large datasets with relatively small daily changes.
Question 2
Which loading strategy reloads all source records during each execution?
A. Full load
B. Watermark load
C. Incremental load
D. CDC load
Answer: A
Explanation
A full load processes every record regardless of whether changes have occurred.
Question 3
What is the primary purpose of a watermark?
A. Store the last successfully processed change value
B. Encrypt source data
C. Compress Delta tables
D. Improve security permissions
Answer: A
Explanation
Watermarks track the latest processed timestamp or key value, enabling future incremental loads.
Question 4
A data engineer wants to update existing rows and insert new rows in a Delta table during an incremental load.
Which operation should be used?
A. DELETE
B. TRUNCATE
C. APPEND
D. MERGE
Answer: D
Explanation
MERGE supports upsert functionality by updating matching rows and inserting non-matching rows.
Question 5
Which source column is most commonly used for timestamp-based incremental loading?
A. LastModifiedDate
B. CustomerName
C. ProductCategory
D. PostalCode
Answer: A
Explanation
Timestamp columns such as LastModifiedDate are commonly used to identify changed records.
Question 6
What is a major disadvantage of full loads?
A. Increased complexity
B. Difficulty validating results
C. Higher resource consumption
D. Inability to load new records
Answer: C
Explanation
Full loads require processing all records, increasing compute, storage, and network usage.
Question 7
Which technology captures inserts, updates, and deletes directly from a source system?
A. Data masking
B. Change Data Capture (CDC)
C. Endorsement
D. Dynamic expressions
Answer: B
Explanation
CDC records source-system changes and allows efficient incremental processing.
Question 8
A company needs the simplest possible solution for a reference table containing only 100 rows.
Which approach is most appropriate?
A. CDC
B. Watermark processing
C. Delta MERGE
D. Full load
Answer: D
Explanation
For very small datasets, the simplicity of a full load often outweighs the benefits of incremental processing.
Question 9
Which Microsoft Fabric storage technology provides efficient support for incremental processing through MERGE operations and transaction history?
A. CSV files
B. JSON files
C. Delta tables
D. Excel workbooks
Answer: C
Explanation
Delta tables support ACID transactions, versioning, and MERGE operations, making them ideal for incremental loads.
Question 10
What is the primary benefit of incremental loading compared to full loading?
A. More complex implementation
B. Faster processing and lower resource usage
C. Requires no change tracking
D. Eliminates the need for validation
Answer: B
Explanation
Incremental loading processes only changed records, improving performance and reducing resource consumption.
Exam Tip
For DP-700 scenario questions, focus on the size of the dataset and the frequency of change:
| Scenario | Best Choice |
|---|---|
| Small lookup table | Full Load |
| Initial migration | Full Load |
| Large transaction table | Incremental Load |
| Daily updates to large datasets | Incremental Load |
| Delta Lake upserts | MERGE |
| Timestamp tracking available | Watermark-Based Incremental Load |
A common exam clue is a statement such as:
“The table contains millions of rows, but only a small percentage changes each day.”
When you see this, it may indicate an Incremental Loading scenario.
Go to the DP-700 Exam Prep Hub main page.
