
Implement mirroring (DP-700 Exam Prep)

This post is part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Ingest and transform data (30–35%)
   --> Ingest and transform batch data
      --> Implement mirroring


Each section ends with 10 practice questions (with answers) to help you reinforce the material. Two practice tests of 60 questions each are also available from the hub's main page, below the exam topics section.

Introduction

One of the most important capabilities in Microsoft Fabric for modern data engineering is Mirroring. Mirroring enables organizations to continuously replicate data from operational databases and external data platforms into Microsoft Fabric with minimal configuration and without requiring complex ETL pipelines.

For the DP-700 exam, you should understand:

  • What Mirroring is
  • When to use Mirroring
  • Supported source systems
  • How Mirroring works
  • Mirroring architecture and components
  • Benefits and limitations
  • Security considerations
  • Differences between Mirroring and other ingestion methods
  • Monitoring and managing mirrored databases

What Is Mirroring?

Mirroring is a Microsoft Fabric capability that continuously replicates data from supported source systems into OneLake.

Unlike traditional batch ingestion approaches, Mirroring provides near real-time synchronization of source data changes into Fabric.

The primary goal is to simplify operational analytics by allowing organizations to:

  • Keep transactional systems as the system of record
  • Replicate data into Fabric automatically
  • Analyze data using Fabric workloads without building custom ingestion pipelines

Think of Mirroring as:

“Continuously copying operational database changes into Fabric while keeping the source database independent.”


Why Use Mirroring?

Traditionally, moving data into an analytics platform requires:

  • ETL pipelines
  • Dataflows
  • Custom code
  • Scheduling
  • Change Data Capture (CDC) implementation
  • Ongoing maintenance

Mirroring removes much of this complexity.

Benefits include:

Reduced Data Movement Complexity

No need to create:

  • Copy activities
  • Incremental load logic
  • Watermark tracking
  • Custom CDC solutions

Near Real-Time Analytics

Changes made in source databases are replicated continuously.

Faster Time to Value

Data engineers can begin analyzing data as soon as the initial snapshot completes, without first designing and building an ingestion pipeline.

Centralized Data Access

Mirrored data becomes available within:

  • OneLake
  • Lakehouses
  • Warehouses
  • Notebooks
  • Power BI
  • SQL Analytics Endpoints

Mirroring Architecture

A typical architecture consists of:

Source System

Examples:

  • Azure SQL Database
  • Azure SQL Managed Instance
  • SQL Server
  • Azure Cosmos DB
  • Snowflake
  • Other supported sources

Change Tracking / CDC

A change-capture mechanism on the source records row-level changes (inserts, updates, and deletes).

Mirroring Service

The Fabric mirroring service continuously reads the captured changes and applies them to the replicated tables.

OneLake

Replicated data is stored in OneLake as Delta tables (Parquet data files plus a Delta transaction log).

Analytics Workloads

Data can be consumed by:

  • Lakehouses
  • Data Warehouses
  • Notebooks
  • Spark
  • Power BI
  • Real-Time Intelligence (formerly Real-Time Analytics)

How Mirroring Works

The process typically follows these stages:

Step 1: Initial Snapshot

Fabric performs an initial load of source tables.

This creates a baseline copy in OneLake.

Step 2: Continuous Change Capture

Fabric captures:

  • Inserts
  • Updates
  • Deletes

from the source system.

Step 3: Synchronization

Changes are continuously applied to the mirrored data.

Step 4: Analytics

Users query the replicated data without impacting operational systems.
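
The four stages above can be illustrated with a small, self-contained simulation. This is only a conceptual sketch, not how Fabric implements Mirroring: the `Change` class, the `initial_snapshot` and `apply_changes` functions, and the sample rows are all hypothetical names invented for illustration, and plain dictionaries stand in for the source database and the OneLake replica.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


@dataclass
class Change:
    """One hypothetical change event captured from the source."""
    op: str                     # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    row: Optional[Row] = None   # new row image; None for deletes


def initial_snapshot(source: Dict[int, Row]) -> Dict[int, Row]:
    """Step 1: copy every source row to build the baseline replica."""
    return {key: dict(row) for key, row in source.items()}


def apply_changes(replica: Dict[int, Row], changes: List[Change]) -> None:
    """Steps 2-3: apply captured changes to the replica, in order."""
    for change in changes:
        if change.op in ("insert", "update"):
            replica[change.key] = dict(change.row or {})
        elif change.op == "delete":
            replica.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op}")


# Illustrative data only; a dict stands in for the operational database.
source = {
    1: {"customer": "Ana", "total": 10},
    2: {"customer": "Ben", "total": 25},
}
replica = initial_snapshot(source)

changes = [
    Change("insert", 3, {"customer": "Cy", "total": 7}),
    Change("update", 1, {"customer": "Ana", "total": 12}),
    Change("delete", 2),
]
apply_changes(replica, changes)

# Step 4: analytics read the replica, never the source.
print(sorted(replica))                                 # prints [1, 3]
print(sum(row["total"] for row in replica.values()))   # prints 19
```

The point to take away for the exam is the ordering: a full baseline copy first, then an ordered stream of inserts, updates, and deletes applied on top of it.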


Mirrored Databases

When mirroring is configured, Fabric creates a Mirrored Database: a Fabric item that represents the source system.

The mirrored database:

  • Stores replicated tables
  • Maintains synchronization metadata
  • Tracks replication status
  • Exposes data to Fabric workloads

A mirrored database is not simply a copy of files.

It is a managed Fabric object that continuously synchronizes with the source.


Supported Mirroring Sources

Microsoft continues expanding supported sources.

Examples include:

Azure SQL Database

One of the most common mirroring sources.

Azure SQL Managed Instance

Supports enterprise operational workloads.

SQL Server

Supported in many hybrid scenarios.

Azure Cosmos DB

Supports analytical access to operational NoSQL data.

Snowflake

Allows integration of external cloud data platforms.

Exam Tip: Always verify supported sources against the latest Microsoft documentation, because the list continues to expand.


Mirroring vs Dataflows Gen2

A common DP-700 exam objective is choosing the appropriate ingestion method.

Feature                                    | Mirroring           | Dataflow Gen2
Continuous synchronization                 | Yes                 | No
Data transformation                        | Limited             | Extensive
Low-code experience                        | Yes                 | Yes
Incremental changes handled automatically  | Yes                 | Requires configuration
Near real-time updates                     | Yes                 | No
ETL processing                             | Not primary purpose | Primary purpose

Use Mirroring when:

  • You need operational analytics.
  • Data should remain synchronized automatically.
  • Minimal transformation is required.

Use Dataflows Gen2 when:

  • Complex transformations are required.
  • Data cleansing is needed.
  • Business logic must be applied during ingestion.

Mirroring vs Pipelines

Feature                    | Mirroring | Pipeline
Continuous replication     | Yes       | No
Orchestration              | Limited   | Extensive
Scheduling                 | Automatic | Configurable
Multiple system workflows  | No        | Yes
Transformation support     | Limited   | Extensive

Use Mirroring for continuous replication.

Use Pipelines for orchestration and workflow automation.
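
The guidance in the two comparisons above can be condensed into a rule-of-thumb helper. This is a study aid under simplifying assumptions, not an official decision tree: the function name `suggest_ingestion_tools` and its three boolean parameters are hypothetical, and real solutions often combine several tools.

```python
from typing import List


def suggest_ingestion_tools(
    continuous_sync: bool,
    complex_transformations: bool,
    multi_step_orchestration: bool,
) -> List[str]:
    """Map each requirement to the tool this article associates with it."""
    tools: List[str] = []
    if continuous_sync:
        # Continuous, near real-time replication with minimal transformation.
        tools.append("Mirroring")
    if complex_transformations:
        # Cleansing and business logic applied during ingestion.
        tools.append("Dataflow Gen2")
    if multi_step_orchestration:
        # Workflows and dependencies across multiple systems.
        tools.append("Pipeline")
    return tools


# Operational analytics on a transactional database:
print(suggest_ingestion_tools(True, False, False))   # prints ['Mirroring']

# Scheduled cleansing job coordinated across several systems:
print(suggest_ingestion_tools(False, True, True))    # prints ['Dataflow Gen2', 'Pipeline']
```

Returning a list rather than a single answer reflects that the tools are complementary: a pipeline can orchestrate work that runs against data a mirrored database keeps current.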


Mirroring vs Shortcuts

Many exam questions compare Mirroring and OneLake Shortcuts.

OneLake Shortcut

  • References data in another location
  • Does not copy data
  • Virtual access layer

Mirroring

  • Creates replicated copies
  • Synchronizes changes
  • Stores data in OneLake

Capability                  | Shortcut   | Mirroring
Copies data                 | No         | Yes
Continuous synchronization  | No         | Yes
Storage in OneLake          | Referenced | Replicated
Data movement               | None       | Yes
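
The reference-versus-replica distinction can be made concrete with a toy model. This is purely illustrative and does not use any Fabric API: the `Shortcut` and `Mirror` classes are hypothetical, a dictionary stands in for the external data, and `sync()` recopies everything for brevity, whereas Mirroring applies incremental changes continuously.

```python
import copy
from typing import Any, Dict


class Shortcut:
    """Holds only a pointer to data that lives somewhere else."""

    def __init__(self, target: Dict[str, Any]) -> None:
        self._target = target  # a reference, not a copy

    def read(self) -> Dict[str, Any]:
        return self._target


class Mirror:
    """Holds its own replica and must synchronize to pick up changes."""

    def __init__(self, source: Dict[str, Any]) -> None:
        self._source = source
        self._replica = copy.deepcopy(source)  # data is physically copied

    def sync(self) -> None:
        # Simplification: full recopy instead of incremental change capture.
        self._replica = copy.deepcopy(self._source)

    def read(self) -> Dict[str, Any]:
        return self._replica


external = {"orders": [1, 2]}
shortcut = Shortcut(external)
mirror = Mirror(external)

external["orders"].append(3)        # the source changes

print(shortcut.read()["orders"])    # prints [1, 2, 3]
print(mirror.read()["orders"])      # prints [1, 2]

mirror.sync()
print(mirror.read()["orders"])      # prints [1, 2, 3]
```

The shortcut has nothing to synchronize because it never held a copy; the mirror holds its own data, so reads do not touch the source, but freshness depends on synchronization.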

Security Considerations

Mirroring respects Fabric security controls.

Security areas include:

Source Authentication

Secure connections are required to source systems.

Workspace Permissions

Users need appropriate access to mirrored database items.

OneLake Security

Access controls apply to replicated data.

Sensitivity Labels

Labels can be applied to mirrored data assets.

Auditing

Mirroring activities can be monitored through Fabric auditing and monitoring tools.


Monitoring Mirroring

Data engineers should monitor:

Replication Health

Shows whether synchronization is functioning correctly.

Replication Status

Examples:

  • Running
  • Initializing
  • Warning
  • Failed

Synchronization Latency

Measures how current the replicated data is compared to the source.

Error Logs

Useful for troubleshooting:

  • Authentication failures
  • Network issues
  • Schema changes
  • Permission problems
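
As a rough illustration of how status and latency combine into an alerting decision, consider the sketch below. It assumes you have already obtained the two timestamps and the status string by some means; the function names `sync_latency` and `needs_attention`, the five-minute threshold, and the timestamps are hypothetical and do not correspond to a Fabric API.

```python
from datetime import datetime, timedelta, timezone


def sync_latency(last_source_commit: datetime,
                 last_replicated_commit: datetime) -> timedelta:
    """How far the replica trails the source; never negative."""
    return max(last_source_commit - last_replicated_commit, timedelta(0))


def needs_attention(status: str, latency: timedelta,
                    max_latency: timedelta) -> bool:
    """Flag unhealthy statuses, or a healthy status with excessive lag."""
    if status in ("Warning", "Failed"):
        return True
    return latency > max_latency


# Illustrative timestamps only.
source_time = datetime(2025, 1, 1, 12, 0, 30, tzinfo=timezone.utc)
replica_time = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

latency = sync_latency(source_time, replica_time)
print(latency.total_seconds())                                     # prints 30.0
print(needs_attention("Running", latency, timedelta(minutes=5)))   # prints False
print(needs_attention("Failed", latency, timedelta(minutes=5)))    # prints True
```

A "Running" status alone is not sufficient evidence of health: replication can be running while falling behind, which is why latency is checked separately.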

Schema Changes and Mirroring

Source systems often evolve over time.

Examples:

  • New columns added
  • Columns removed
  • Data type modifications
  • New tables created

Data engineers should understand how schema evolution affects mirrored databases.

Potential actions include:

  • Refreshing metadata
  • Revalidating mappings
  • Reviewing replication health

Exam questions may present scenarios involving schema modifications and synchronization behavior.
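
To reason about such scenarios, it can help to think of schema drift as a comparison between two column lists. The sketch below is a generic illustration, not a Fabric feature: `diff_schema` is a hypothetical helper, and the schemas are plain dictionaries mapping column names to type names that you would have to collect yourself.

```python
from typing import Dict, List

Schema = Dict[str, str]  # column name -> data type name


def diff_schema(source: Schema, mirrored: Schema) -> Dict[str, List[str]]:
    """Report columns added, removed, or retyped on the source side."""
    source_cols = set(source)
    mirrored_cols = set(mirrored)
    return {
        "added": sorted(source_cols - mirrored_cols),
        "removed": sorted(mirrored_cols - source_cols),
        "retyped": sorted(
            col for col in source_cols & mirrored_cols
            if source[col] != mirrored[col]
        ),
    }


# Illustrative schemas only.
mirrored_schema = {"id": "int", "amount": "int", "note": "varchar"}
source_schema = {"id": "int", "amount": "decimal", "region": "varchar"}

print(diff_schema(source_schema, mirrored_schema))
# prints {'added': ['region'], 'removed': ['note'], 'retyped': ['amount']}
```

Each category maps to one of the examples listed above (new columns, removed columns, data type modifications), which is a useful way to classify what an exam scenario is describing before deciding how synchronization is affected.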


Common Mirroring Use Cases

Operational Analytics

Analyze transactional data without impacting production systems.

Example:

  • Sales application database
  • Replicated to Fabric
  • Power BI dashboards updated continuously

Hybrid Analytics

Combine:

  • SQL Server
  • Azure SQL
  • Cosmos DB

into a unified Fabric environment.


Data Modernization

Organizations migrating toward Fabric can begin replicating source systems immediately without redesigning all ETL processes.


Self-Service Analytics

Business users gain access to current data through Fabric and Power BI.


DP-700 Exam Tips

Remember the following:

✓ Mirroring continuously replicates source data into Fabric.

✓ Mirroring reduces the need for custom ETL and CDC implementations.

✓ Mirrored data is stored in OneLake.

✓ Mirrored databases are managed Fabric items.

✓ Mirroring is best for operational analytics and near real-time reporting.

✓ Shortcuts reference data without copying it; Mirroring copies and synchronizes data.

✓ Pipelines orchestrate workflows; Mirroring synchronizes data.

✓ Dataflows Gen2 are designed for transformation and ETL workloads.

✓ Monitor replication health, synchronization status, and latency.

✓ Understand the differences between Mirroring, Pipelines, Dataflows Gen2, and Shortcuts.


Practice Exam Questions

Question 1

A company wants to continuously replicate data from Azure SQL Database into Fabric with minimal engineering effort. Which feature should be used?

A. Dataflow Gen2
B. Mirroring
C. Notebook
D. Warehouse

Correct Answer: B

Explanation:
Mirroring continuously synchronizes data from supported operational systems into Fabric with minimal configuration.


Question 2

Which statement best describes a OneLake shortcut?

A. It creates a replicated copy of data in OneLake.
B. It continuously synchronizes source changes.
C. It provides virtual access to data without copying it.
D. It performs CDC automatically.

Correct Answer: C

Explanation:
Shortcuts provide access to external data without physically copying it into OneLake.


Question 3

A data engineer needs extensive data cleansing and transformation during ingestion. Which option is most appropriate?

A. Dataflow Gen2
B. Mirroring
C. Shortcut
D. Workspace role assignment

Correct Answer: A

Explanation:
Dataflows Gen2 are designed for ETL and transformation scenarios.


Question 4

What is typically performed first when configuring Mirroring?

A. Initial snapshot of source data
B. Continuous CDC synchronization
C. Power BI semantic modeling
D. Delta optimization

Correct Answer: A

Explanation:
Mirroring generally begins with an initial snapshot before applying incremental changes.


Question 5

Which benefit is most directly associated with Mirroring?

A. Eliminates workspace permissions
B. Replaces Power BI semantic models
C. Automatically synchronizes source changes into Fabric
D. Converts all data into KQL format

Correct Answer: C

Explanation:
The primary purpose of Mirroring is continuous synchronization of source data.


Question 6

A Fabric administrator wants to determine whether a mirrored database is successfully synchronizing. Which indicators should be reviewed?

A. Semantic model refresh duration
B. Replication health and status
C. Capacity SKU name
D. Workspace description

Correct Answer: B

Explanation:
Replication health and synchronization status indicate whether mirroring is functioning properly.


Question 7

Which Fabric item represents a continuously synchronized copy of a source system?

A. Lakehouse shortcut
B. Notebook
C. Pipeline
D. Mirrored Database

Correct Answer: D

Explanation:
A Mirrored Database is the Fabric item created and maintained by the Mirroring feature.


Question 8

Which scenario is the best fit for Mirroring?

A. Complex multi-step ETL workflow across ten systems
B. Monthly batch processing only
C. Near real-time operational reporting from a transactional database
D. Interactive notebook development

Correct Answer: C

Explanation:
Mirroring excels at near real-time analytics on operational data sources.


Question 9

Which Fabric feature is most commonly used to orchestrate multiple workflows and dependencies?

A. Mirroring
B. Sensitivity labels
C. Pipelines
D. OneLake shortcuts

Correct Answer: C

Explanation:
Pipelines are designed for orchestration, dependency management, and workflow automation.


Question 10

A company wants analytics users to query current operational data without directly querying production databases. What is the primary advantage of Mirroring?

A. It replicates data into Fabric for analytical workloads.
B. It encrypts all source databases automatically.
C. It removes the need for OneLake.
D. It replaces Delta Lake storage.

Correct Answer: A

Explanation:
Mirroring creates synchronized copies of operational data inside Fabric, allowing analytical workloads to run without impacting production systems.

