This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Ingest and transform data (30–35%)
--> Ingest and transform batch data
--> Implement mirroring
Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.
Introduction
One of the most important capabilities in Microsoft Fabric for modern data engineering is Mirroring. Mirroring enables organizations to continuously replicate data from operational databases and external data platforms into Microsoft Fabric with minimal configuration and without requiring complex ETL pipelines.
For the DP-700 exam, you should understand:
- What Mirroring is
- When to use Mirroring
- Supported source systems
- How Mirroring works
- Mirroring architecture and components
- Benefits and limitations
- Security considerations
- Differences between Mirroring and other ingestion methods
- Monitoring and managing mirrored databases
What Is Mirroring?
Mirroring is a Microsoft Fabric capability that continuously replicates data from supported source systems into OneLake.
Unlike traditional batch ingestion approaches, Mirroring provides near real-time synchronization of source data changes into Fabric.
The primary goal is to simplify operational analytics by allowing organizations to:
- Keep transactional systems as the system of record
- Replicate data into Fabric automatically
- Analyze data using Fabric workloads without building custom ingestion pipelines
Think of Mirroring as:
“Continuously copying operational database changes into Fabric while keeping the source database independent.”
Why Use Mirroring?
Traditionally, moving data into an analytics platform requires:
- ETL pipelines
- Dataflows
- Custom code
- Scheduling
- Change Data Capture (CDC) implementation
- Ongoing maintenance
Mirroring removes much of this complexity.
Benefits include:
Reduced Data Movement Complexity
No need to create:
- Copy activities
- Incremental load logic
- Watermark tracking
- Custom CDC solutions
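For contrast, here is a minimal sketch of the kind of hand-written watermark logic that Mirroring makes unnecessary. It is plain Python over in-memory rows, not Fabric code. The `modified_at` column and the `incremental_load` function are hypothetical names used only for illustration.

```python
from datetime import datetime


def incremental_load(source_rows, last_watermark):
    """Return rows changed after last_watermark, plus the new watermark.

    Assumes every source row carries a hypothetical `modified_at` timestamp.
    """
    changed = [row for row in source_rows if row["modified_at"] > last_watermark]
    # Advance the watermark only if something changed; otherwise keep the old one.
    new_watermark = max(
        (row["modified_at"] for row in changed), default=last_watermark
    )
    return changed, new_watermark


source = [
    {"id": 1, "amount": 10, "modified_at": datetime(2024, 1, 1)},
    {"id": 2, "amount": 25, "modified_at": datetime(2024, 1, 3)},
    {"id": 3, "amount": 40, "modified_at": datetime(2024, 1, 5)},
]

# Only rows modified after the stored watermark are picked up.
rows, watermark = incremental_load(source, datetime(2024, 1, 2))
print([row["id"] for row in rows], watermark)
```

Note that a watermark like this cannot see hard deletes, because a deleted row no longer has a timestamp to compare. That gap is one reason teams end up building custom CDC solutions.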
Near Real-Time Analytics
Changes made in source databases are replicated continuously.
Faster Time to Value
Data engineers can begin analyzing data as soon as the initial load completes, without first building and testing ingestion pipelines.
Centralized Data Access
Mirrored data becomes available within:
- OneLake
- Lakehouses
- Warehouses
- Notebooks
- Power BI
- SQL Analytics Endpoints
Mirroring Architecture
A typical architecture consists of:
Source System
Examples:
- Azure SQL Database
- Azure SQL Managed Instance
- SQL Server
- Azure Cosmos DB
- Snowflake
- Other supported sources
↓
Change Tracking / CDC
Fabric captures changes from the source.
↓
Mirroring Service
Fabric continuously reads changes.
↓
OneLake
Data is stored as Delta tables (Parquet data files plus a Delta transaction log).
↓
Analytics Workloads
Data can be consumed by:
- Lakehouses
- Data Warehouses
- Notebooks
- Spark
- Power BI
- Real-Time Intelligence (formerly Real-Time Analytics)
How Mirroring Works
The process typically follows these stages:
Step 1: Initial Snapshot
Fabric performs an initial load of source tables.
This creates a baseline copy in OneLake.
Step 2: Continuous Change Capture
Fabric captures:
- Inserts
- Updates
- Deletes
from the source system.
Step 3: Synchronization
Changes are continuously applied to the mirrored data.
Step 4: Analytics
Users query the replicated copy in Fabric, so analytical queries do not run against the operational system.
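The stages above can be sketched as a small in-memory simulation. This is a conceptual model only, not how the Fabric service is implemented. The change-record shape (`op` and `row`) and the function names are assumptions made for the example.

```python
def initial_snapshot(source_table):
    """Step 1: copy every source row into the mirror, keyed by primary key."""
    return {row["id"]: dict(row) for row in source_table}


def apply_changes(mirror, changes):
    """Steps 2-3: apply captured inserts, updates, and deletes in order."""
    for change in changes:
        op, row = change["op"], change["row"]
        if op in ("insert", "update"):
            mirror[row["id"]] = dict(row)
        elif op == "delete":
            mirror.pop(row["id"], None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return mirror


source_table = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
mirror = initial_snapshot(source_table)

# Hypothetical change records captured from the source after the snapshot.
captured = [
    {"op": "insert", "row": {"id": 3, "status": "new"}},
    {"op": "update", "row": {"id": 1, "status": "shipped"}},
    {"op": "delete", "row": {"id": 2}},
]
apply_changes(mirror, captured)

# Step 4: analytics reads the mirror, never the source table.
print(mirror)
```

The source table is never modified here, which mirrors the idea that the operational database stays the system of record.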
Mirrored Databases
When mirroring is configured, Fabric creates a Mirrored Database. This is a Fabric item that represents the source system.
The mirrored database:
- Stores replicated tables
- Maintains synchronization metadata
- Tracks replication status
- Exposes data to Fabric workloads
A mirrored database is not simply a copy of files.
It is a managed Fabric object that continuously synchronizes with the source.
Supported Mirroring Sources
Microsoft continues expanding supported sources.
Examples include:
Azure SQL Database
One of the most common mirroring sources.
Azure SQL Managed Instance
Supports enterprise operational workloads.
SQL Server
Supported in many hybrid scenarios.
Azure Cosmos DB
Supports analytical access to operational NoSQL data.
Snowflake
Allows integration of external cloud data platforms.
Exam Tip: Always verify supported sources based on the latest Microsoft documentation because supported systems continue to expand.
Mirroring vs Dataflows Gen2
A common DP-700 exam objective is choosing the appropriate ingestion method.
| Feature | Mirroring | Dataflow Gen2 |
|---|---|---|
| Continuous synchronization | Yes | No |
| Data transformation | Limited | Extensive |
| Low-code experience | Yes | Yes |
| Incremental changes handled automatically | Yes | Requires configuration |
| Near real-time updates | Yes | No |
| ETL processing | Not primary purpose | Primary purpose |
Use Mirroring when:
- You need operational analytics.
- Data should remain synchronized automatically.
- Minimal transformation is required.
Use Dataflows Gen2 when:
- Complex transformations are required.
- Data cleansing is needed.
- Business logic must be applied during ingestion.
Mirroring vs Pipelines
| Feature | Mirroring | Pipeline |
|---|---|---|
| Continuous replication | Yes | No |
| Orchestration | Limited | Extensive |
| Scheduling | Automatic | Configurable |
| Multiple system workflows | No | Yes |
| Transformation support | Limited | Extensive |
Use Mirroring for continuous replication.
Use Pipelines for orchestration and workflow automation.
Mirroring vs Shortcuts
Many exam questions compare Mirroring and OneLake Shortcuts.
OneLake Shortcut
- References data in another location
- Does not copy data
- Virtual access layer
Mirroring
- Creates replicated copies
- Synchronizes changes
- Stores data in OneLake
| Capability | Shortcut | Mirroring |
|---|---|---|
| Copies data | No | Yes |
| Continuous synchronization | Not needed (reads the source in place) | Yes |
| Storage in OneLake | Referenced | Replicated |
| Data movement | None | Yes |
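As a study aid, the comparisons in the three sections above can be condensed into a small decision helper. This is a simplification: the function name, its parameters, and the order in which requirements are checked are assumptions made for this sketch, and real solutions often combine several of these options.

```python
def choose_ingestion_method(
    needs_continuous_sync=False,
    needs_complex_transforms=False,
    needs_orchestration=False,
    must_avoid_copying=False,
):
    """Map scenario requirements to the option the comparison tables favor.

    The priority order below is an assumption for illustration only.
    """
    if must_avoid_copying:
        return "OneLake shortcut"
    if needs_orchestration:
        return "Pipeline"
    if needs_complex_transforms:
        return "Dataflow Gen2"
    if needs_continuous_sync:
        return "Mirroring"
    return "No single best fit; review the requirements"


# Near real-time operational reporting with minimal transformation.
print(choose_ingestion_method(needs_continuous_sync=True))

# Heavy cleansing and business logic during ingestion.
print(choose_ingestion_method(needs_complex_transforms=True))
```

Working through a scenario this way is a quick check on exam questions that ask for the most appropriate ingestion method.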
Security Considerations
Mirroring respects Fabric security controls.
Security areas include:
Source Authentication
Secure connections are required to source systems.
Workspace Permissions
Users need appropriate access to mirrored database items.
OneLake Security
Access controls apply to replicated data.
Sensitivity Labels
Labels can be applied to mirrored data assets.
Auditing
Mirroring activities can be monitored through Fabric auditing and monitoring tools.
Monitoring Mirroring
Data engineers should monitor:
Replication Health
Shows whether synchronization is functioning correctly.
Replication Status
Examples:
- Running
- Initializing
- Warning
- Failed
Synchronization Latency
Measures how current the replicated data is compared to the source.
Error Logs
Useful for troubleshooting:
- Authentication failures
- Network issues
- Schema changes
- Permission problems
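The monitoring signals above can be combined into a simple check. The sketch below is not a Fabric API: the status strings are taken from the examples in this section, while the `needs_attention` function, its inputs, and the 15-minute latency threshold are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def needs_attention(status, last_sync, now, max_lag=timedelta(minutes=15)):
    """Return a reason string if the mirror needs a look, else None."""
    if status in ("Failed", "Warning"):
        return f"status is {status}"
    lag = now - last_sync  # synchronization latency
    if status == "Running" and lag > max_lag:
        return f"latency {lag} exceeds {max_lag}"
    # "Initializing" or a healthy "Running" state needs no action.
    return None


now = datetime(2024, 1, 1, 12, 0)
print(needs_attention("Running", datetime(2024, 1, 1, 11, 58), now))
print(needs_attention("Running", datetime(2024, 1, 1, 11, 0), now))
print(needs_attention("Failed", datetime(2024, 1, 1, 11, 59), now))
```

In practice the status and last synchronization time would come from the monitoring view of the mirrored database, and the acceptable latency depends on the reporting requirements.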
Schema Changes and Mirroring
Source systems often evolve over time.
Examples:
- New columns added
- Columns removed
- Data type modifications
- New tables created
Data engineers should understand how schema evolution affects mirrored databases.
Potential actions include:
- Refreshing metadata
- Revalidating mappings
- Reviewing replication health
Exam questions may present scenarios involving schema modifications and synchronization behavior.
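To reason about such scenarios, it can help to classify a schema change first. The sketch below compares two versions of a table schema and reports added columns, removed columns, and data type modifications. It is a generic illustration, not a description of how Fabric detects schema drift. The `diff_schema` function and the column names are hypothetical.

```python
def diff_schema(old, new):
    """Compare two {column_name: data_type} mappings for one table."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    retyped = sorted(
        col for col in set(old) & set(new) if old[col] != new[col]
    )
    return {"added": added, "removed": removed, "retyped": retyped}


before = {"order_id": "int", "amount": "decimal(10,2)", "note": "varchar(50)"}
after = {"order_id": "bigint", "amount": "decimal(10,2)", "region": "varchar(20)"}

print(diff_schema(before, after))
```

Each category may call for a different follow-up, so identifying the type of change is a useful first step before reviewing replication health.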
Common Mirroring Use Cases
Operational Analytics
Analyze transactional data without impacting production systems.
Example:
- Sales application database
- Replicated to Fabric
- Power BI dashboards updated continuously
Hybrid Analytics
Combine:
- SQL Server
- Azure SQL
- Cosmos DB
into a unified Fabric environment.
Data Modernization
Organizations migrating toward Fabric can begin replicating source systems immediately without redesigning all ETL processes.
Self-Service Analytics
Business users gain access to current data through Fabric and Power BI.
DP-700 Exam Tips
Remember the following:
✓ Mirroring continuously replicates source data into Fabric.
✓ Mirroring reduces the need for custom ETL and CDC implementations.
✓ Mirrored data is stored in OneLake.
✓ Mirrored databases are managed Fabric items.
✓ Mirroring is best for operational analytics and near real-time reporting.
✓ Shortcuts reference data without copying it; Mirroring copies and synchronizes data.
✓ Pipelines orchestrate workflows; Mirroring synchronizes data.
✓ Dataflows Gen2 are designed for transformation and ETL workloads.
✓ Monitor replication health, synchronization status, and latency.
✓ Understand the differences between Mirroring, Pipelines, Dataflows Gen2, and Shortcuts.
Practice Exam Questions
Question 1
A company wants to continuously replicate data from Azure SQL Database into Fabric with minimal engineering effort. Which feature should be used?
A. Dataflow Gen2
B. Mirroring
C. Notebook
D. Warehouse
Correct Answer: B
Explanation:
Mirroring continuously synchronizes data from supported operational systems into Fabric with minimal configuration.
Question 2
Which statement best describes a OneLake shortcut?
A. It creates a replicated copy of data in OneLake.
B. It continuously synchronizes source changes.
C. It provides virtual access to data without copying it.
D. It performs CDC automatically.
Correct Answer: C
Explanation:
Shortcuts provide access to external data without physically copying it into OneLake.
Question 3
A data engineer needs extensive data cleansing and transformation during ingestion. Which option is most appropriate?
A. Dataflow Gen2
B. Mirroring
C. Shortcut
D. Workspace role assignment
Correct Answer: A
Explanation:
Dataflows Gen2 are designed for ETL and transformation scenarios.
Question 4
What is typically performed first when configuring Mirroring?
A. Initial snapshot of source data
B. Continuous CDC synchronization
C. Power BI semantic modeling
D. Delta optimization
Correct Answer: A
Explanation:
Mirroring generally begins with an initial snapshot before applying incremental changes.
Question 5
Which benefit is most directly associated with Mirroring?
A. Eliminates workspace permissions
B. Replaces Power BI semantic models
C. Automatically synchronizes source changes into Fabric
D. Converts all data into KQL format
Correct Answer: C
Explanation:
The primary purpose of Mirroring is continuous synchronization of source data.
Question 6
A Fabric administrator wants to determine whether a mirrored database is successfully synchronizing. Which metric should be reviewed?
A. Semantic model refresh duration
B. Replication health and status
C. Capacity SKU name
D. Workspace description
Correct Answer: B
Explanation:
Replication health and synchronization status indicate whether mirroring is functioning properly.
Question 7
Which Fabric item represents a continuously synchronized copy of a source system?
A. Lakehouse shortcut
B. Notebook
C. Pipeline
D. Mirrored Database
Correct Answer: D
Explanation:
A Mirrored Database is the Fabric item created and maintained by the Mirroring feature.
Question 8
Which scenario is the best fit for Mirroring?
A. Complex multi-step ETL workflow across ten systems
B. Monthly batch processing only
C. Near real-time operational reporting from a transactional database
D. Interactive notebook development
Correct Answer: C
Explanation:
Mirroring excels at near real-time analytics on operational data sources.
Question 9
Which Fabric feature is most commonly used to orchestrate multiple workflows and dependencies?
A. Mirroring
B. Sensitivity labels
C. Pipelines
D. OneLake shortcuts
Correct Answer: C
Explanation:
Pipelines are designed for orchestration, dependency management, and workflow automation.
Question 10
A company wants analytics users to query current operational data without directly querying production databases. What is the primary advantage of Mirroring?
A. It replicates data into Fabric for analytical workloads.
B. It encrypts all source databases automatically.
C. It removes the need for OneLake.
D. It replaces Delta Lake storage.
Correct Answer: A
Explanation:
Mirroring creates synchronized copies of operational data inside Fabric, allowing analytical workloads to run without impacting production systems.
