Category: DP-700

DP-700, Microsoft Certification, Microsoft Fabric June 3, 2026

Handle duplicate, missing, and late-arriving data (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Ingest and transform data (30–35%)
   --> Ingest and transform batch data
      --> Handle duplicate, missing, and late-arriving data

Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

One of the most important responsibilities of a data engineer is ensuring data quality. In real-world data environments, source systems frequently produce data issues that can negatively impact reporting, analytics, machine learning, and business decision-making.

Three of the most common data quality challenges are:

Duplicate data
Missing data
Late-arriving data

Microsoft Fabric provides multiple tools for addressing these challenges, including:

Dataflows Gen2
Data Pipelines
Lakehouses
Data Warehouses
PySpark Notebooks
SQL
Delta Lake tables

For the DP-700 exam, you should understand:

How duplicate data occurs
How to identify and remove duplicates
How to handle missing values
Strategies for managing late-arriving facts and dimensions
Data quality best practices in Fabric
Tools and techniques available across Fabric workloads

Understanding Duplicate Data

Duplicate data occurs when the same logical record appears more than once.

Example:

CustomerID	Name	Email
1001	John Smith	john@email.com
1001	John Smith	john@email.com

Duplicates can occur due to:

Multiple source system loads
Pipeline reruns
Source application errors
Data integration mistakes
Incomplete change tracking

Duplicate records can lead to:

Inflated sales totals
Incorrect customer counts
Poor analytics results
Data integrity issues

Types of Duplicates

Exact Duplicates

Every field is identical.

Example:

OrderID	Amount
5001	250
5001	250

These are usually the easiest to detect.

Partial Duplicates

Records represent the same entity but differ slightly.

Example:

CustomerID	Name
1001	John Smith
1001	Jon Smith

These may require business rules to resolve.

Detecting Duplicate Records Using SQL

A common approach is grouping records and counting occurrences.

Example:

			
SELECT
    CustomerID,
    COUNT(*) AS RecordCount
FROM Customers
GROUP BY CustomerID
HAVING COUNT(*) > 1;

		

This identifies customers appearing multiple times.

Removing Duplicates Using SQL

A common technique uses ROW_NUMBER().

Example:

			
WITH RankedRows AS
(
    SELECT *,
           ROW_NUMBER() OVER
           (
             PARTITION BY CustomerID
             ORDER BY LastModifiedDate DESC
           ) AS RowNum
    FROM Customers
)
SELECT *
FROM RankedRows
WHERE RowNum = 1;

		

This keeps only the most recent record for each customer.

Handling Duplicates Using PySpark

PySpark provides the dropDuplicates() method.

Example:

customers_df = customers_df.dropDuplicates(["CustomerID"])

This removes duplicate records based on CustomerID.

For multiple columns:

			
customers_df = customers_df.dropDuplicates(
    ["CustomerID","Email"]
)

This is frequently used during Lakehouse ingestion processes.

Delta Lake and Duplicate Prevention

Delta Lake tables support MERGE operations that help prevent duplicates.

Example:

			
MERGE INTO Customers AS Target
USING StagingCustomers AS Source
ON Target.CustomerID = Source.CustomerID
WHEN MATCHED THEN
    UPDATE SET *
WHEN NOT MATCHED THEN
    INSERT *

		

Benefits include:

Upserts
Incremental loads
Duplicate prevention
Simplified ETL logic

This is a common Fabric pattern.

Understanding Missing Data

Missing data occurs when expected values are absent.

Example:

CustomerID	Name	Email
1001	John Smith	NULL

Missing values may result from:

Source system limitations
User entry errors
Integration failures
Incomplete records

Why Missing Data Matters

Missing values can cause:

Reporting inaccuracies
Failed transformations
Machine learning issues
Data quality concerns

Example:

Calculating average salary becomes unreliable if many salary values are missing.

Identifying Missing Values

SQL example:

			
SELECT *
FROM Customers
WHERE Email IS NULL;

PySpark example:

			
customers_df.filter(
    customers_df.Email.isNull()
)

Handling Missing Data

Several approaches exist.

Option 1: Replace Missing Values

Example:

			
SELECT
    CustomerID,
    COALESCE(Email,'Unknown')
FROM Customers;

PySpark:

df.fillna("Unknown")

Useful when business rules permit default values.

Option 2: Exclude Records

Example:

			
SELECT *
FROM Customers
WHERE Email IS NOT NULL;

Useful when incomplete records cannot be trusted.

Option 3: Use Business Defaults

Example:

Missing Value	Replacement
Country	Unknown
Discount	0
Status	Pending

Common in dimensional modeling.

Dataflows Gen2 and Missing Data

Dataflows Gen2 provides low-code capabilities for:

Replacing null values
Filtering null rows
Conditional column creation
Data validation

These transformations are frequently used during ingestion and cleansing processes.

Understanding Late-Arriving Data

Late-arriving data refers to records that arrive after the expected processing window.

Example:

An order occurs on January 1st but is not received until January 5th.

Causes of Late-Arriving Data

Common causes include:

Network interruptions
Application outages
Batch processing delays
Replication latency
Manual data entry

Late-arriving data is extremely common in enterprise systems.

Late-Arriving Facts

A late-arriving fact occurs when a transaction arrives after related reporting periods have already been processed.

Example:

OrderDate	Arrival Date
Jan 1	Jan 5

The order belongs to January 1 but arrives later.

Challenges include:

Incorrect historical totals
Incomplete reporting
Fact table inconsistencies

Handling Late-Arriving Facts

Common approaches include:

Reprocessing Historical Periods

Reload affected partitions.

Example:

Reload the previous 7 days during each load.

Benefits:

Simple implementation
Common Fabric approach

Watermark-Based Processing

Track the latest processed timestamp.

Example:

			
WHERE LastModifiedDate >
      @LastProcessedTimestamp

This supports efficient incremental loads.

Sliding Window Loads

Instead of loading only today’s data:

Load the last:

3 days
7 days
30 days

This helps capture delayed records.

Late-Arriving Dimensions

A late-arriving dimension occurs when a fact record arrives before its related dimension record.

Example:

Fact arrives:

CustomerID	Sales
1001	500

Dimension record for CustomerID 1001 has not yet arrived.

Handling Late-Arriving Dimensions

Create an Unknown Member

Dimension table:

CustomerKey	CustomerName
-1	Unknown

Fact records temporarily reference the unknown member.

When the actual dimension arrives:

Update dimension
Correct foreign key relationships

This is a common data warehouse design pattern.

Retry Processing

Pause loading until dimensions arrive.

Useful when:

Data quality requirements are strict
Fact loads depend on dimension availability

Data Quality Monitoring

Fabric solutions should include monitoring for:

Duplicate counts
Null counts
Data freshness
Late-arriving records
Invalid keys

Typical monitoring techniques include:

SQL validation queries
Notebook checks
Pipeline activities
Custom logging tables

Common Fabric Tools for Data Quality

Tool	Use Case
Dataflows Gen2	Data cleansing
Pipelines	Orchestration and validation
PySpark	Large-scale deduplication
SQL	Validation and quality checks
Delta Tables	Upserts and merge operations
Warehouse	Data quality reporting

Best Practices

Prevent Duplicates Early

Validate source data before loading.

Use Incremental Loading

Avoid repeatedly processing entire datasets.

Implement MERGE Operations

Upserts help prevent duplicate records.

Monitor Null Values

Track trends in missing data over time.

Create Unknown Dimension Members

Support late-arriving dimensions.

Use Sliding Window Loads

Capture delayed records efficiently.

Maintain Audit Columns

Examples:

			
CreatedDate
ModifiedDate
LoadDate
SourceSystem
BatchID

		

These assist with troubleshooting and recovery.

DP-700 Exam Tips

Remember these key concepts:

Duplicates can be identified using GROUP BY and COUNT().
ROW_NUMBER() is commonly used for deduplication.
PySpark supports deduplication through dropDuplicates().
Delta Lake MERGE operations support upserts and duplicate prevention.
Missing values can be replaced, filtered, or assigned defaults.
COALESCE() is commonly used in SQL to handle NULL values.
Late-arriving facts often require partition reprocessing or sliding windows.
Late-arriving dimensions are commonly handled using Unknown members.
Incremental loading and watermarking help manage delayed data efficiently.
Data quality monitoring is an essential responsibility of data engineers.

Practice Exam Questions

Question 1

A data engineer wants to identify duplicate customer records in a Fabric Warehouse.

Which SQL technique should be used?

A. GROUP BY with COUNT()

B. ORDER BY

C. DISTINCT

D. UNION

Correct Answer: A

Explanation: GROUP BY combined with COUNT() can identify keys that appear multiple times, making it a common duplicate detection technique.

Question 2

Which PySpark method is specifically designed to remove duplicate records?

A. filter()

B. collect()

C. sort()

D. dropDuplicates()

Correct Answer: D

Explanation: dropDuplicates() removes duplicate records based on one or more specified columns.

Question 3

A data engineer wants to keep only the most recent customer record when duplicates exist.

Which SQL function is most appropriate?

A. SUM()

B. MAX()

C. ROW_NUMBER()

D. COUNT()

Correct Answer: C

Explanation: ROW_NUMBER() allows ranking records within groups so the preferred record can be retained.

Question 4

Which SQL function is commonly used to replace NULL values with a default value?

A. CAST()

B. CONCAT()

C. GROUP BY

D. COALESCE()

Correct Answer: D

Explanation: COALESCE() returns the first non-NULL value and is frequently used to replace missing data.

Question 5

A sales transaction occurred on January 1 but was not received until January 5.

What type of issue is this?

A. Duplicate fact

B. Missing dimension

C. Late-arriving fact

D. Slowly changing dimension

Correct Answer: C

Explanation: The transaction belongs to an earlier business date but arrives after the expected processing window.

Question 6

Which loading strategy helps capture records that arrive several days late?

A. Full reload every hour

B. Sliding window load

C. Static partitioning

D. Manual refresh

Correct Answer: B

Explanation: Sliding window loads reprocess recent periods, allowing delayed records to be captured.

Question 7

A fact record arrives before its related dimension record.

What is this situation called?

A. Incremental loading

B. Duplicate key processing

C. Data drift

D. Late-arriving dimension

Correct Answer: D

Explanation: A late-arriving dimension scenario occurs when fact data is received before the associated dimension data.

Question 8

What is the most common solution for handling late-arriving dimensions in a dimensional model?

A. Delete the fact record

B. Skip the load permanently

C. Create an Unknown member in the dimension table

D. Disable incremental processing

Correct Answer: C

Explanation: An Unknown member allows fact records to load while preserving referential integrity until the dimension arrives.

Question 9

Which Delta Lake operation is commonly used to perform inserts and updates while minimizing duplicate records?

A. DELETE

B. MERGE

C. TRUNCATE

D. CREATE VIEW

Correct Answer: B

Explanation: MERGE supports upsert operations and is widely used for incremental loading and deduplication.

Question 10

Which column is most useful when implementing watermark-based incremental loads?

A. LastModifiedDate

B. Region

C. Category

D. ProductName

Correct Answer: A

Explanation: Watermarking relies on timestamps such as LastModifiedDate to identify new or changed records since the previous load.

Go to the DP-700 Exam Prep Hub main page.

Data Development, DP-700, Microsoft Certification, Microsoft Fabric June 3, 2026

Group and aggregate data (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Ingest and transform data (30–35%)
   --> Ingest and transform batch data
      --> Group and aggregate data

Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Grouping and aggregating data are among the most common and important data transformation operations performed by data engineers. Organizations rarely analyze raw transactional data directly. Instead, they summarize, categorize, and aggregate data to create meaningful business metrics such as total sales, average order value, monthly revenue, customer counts, inventory levels, and operational KPIs.

In Microsoft Fabric, grouping and aggregation can be performed using several technologies, including:

SQL in Fabric Data Warehouses and Lakehouses
PySpark notebooks
Dataflows Gen2
KQL (Kusto Query Language)
Data pipelines as part of larger ETL/ELT processes

For the DP-700 exam, you should understand:

Why grouping and aggregation are important
When to aggregate data
How to implement aggregations using SQL, PySpark, KQL, and Dataflows Gen2
Common aggregation functions
Performance considerations
Aggregations in dimensional modeling and analytics solutions

Why Group and Aggregate Data?

Raw data often contains millions or billions of records.

For example:

OrderID	CustomerID	OrderDate	Amount
1001	C001	2026-01-01	250
1002	C001	2026-01-02	150
1003	C002	2026-01-02	300

Business users usually want summarized information such as:

CustomerID	TotalSales
C001	400
C002	300

This transformation is accomplished through grouping and aggregation.

Benefits include:

Faster analytics
Reduced storage requirements
Easier reporting
Improved dashboard performance
Simplified business intelligence models

Understanding Grouping

Grouping combines records that share common values.

Examples:

Group by:

Customer
Product
Region
Department
Date
Month
Year

Example:

Region	Sales
East	100
East	200
West	300

Grouped by Region:

Region	TotalSales
East	300
West	300

The GROUP BY operation creates logical categories before aggregation calculations occur.

Common Aggregation Functions

Data engineers should be familiar with the most common aggregation functions.

SUM()

Calculates totals.

Example:

			
SELECT Region,
       SUM(SalesAmount) AS TotalSales
FROM Sales
GROUP BY Region;

Result:

Region	TotalSales
East	500000
West	750000

COUNT()

Counts rows.

			
SELECT Region,
       COUNT(*) AS OrderCount
FROM Sales
GROUP BY Region;

Used for:

Number of customers
Number of transactions
Number of products

AVG()

Calculates averages.

			
SELECT ProductCategory,
       AVG(SalesAmount) AS AverageSale
FROM Sales
GROUP BY ProductCategory;

Used for:

Average order value
Average response time
Average inventory level

MIN()

Returns the smallest value.

			
SELECT MIN(OrderDate)
FROM Orders;

Used for:

Earliest order
Lowest temperature
Minimum cost

MAX()

Returns the largest value.

			
SELECT MAX(OrderDate)
FROM Orders;

Used for:

Latest transaction
Highest sales amount
Maximum inventory quantity

Grouping and Aggregation Using SQL

SQL is the most common approach for aggregation in Fabric.

Example:

			
SELECT
    ProductCategory,
    SUM(SalesAmount) AS TotalSales,
    COUNT(*) AS OrderCount,
    AVG(SalesAmount) AS AverageSale
FROM Sales
GROUP BY ProductCategory;

		

This query:

Groups records by category
Calculates total sales
Counts orders
Calculates average sales

Multi-Column Grouping

You can group by multiple columns.

Example:

			
SELECT
    Year,
    Region,
    SUM(SalesAmount) AS TotalSales
FROM Sales
GROUP BY Year, Region;

		

Result:

Year	Region	TotalSales
2025	East	500000
2025	West	700000
2026	East	550000

This provides more granular analysis.

Filtering Aggregated Results with HAVING

WHERE filters rows before aggregation.

HAVING filters after aggregation.

Example:

			
SELECT Region,
       SUM(SalesAmount) AS TotalSales
FROM Sales
GROUP BY Region
HAVING SUM(SalesAmount) > 1000000;

		

Only regions exceeding $1 million in sales are returned.

Aggregation Using PySpark

PySpark is commonly used for large-scale aggregation operations in Lakehouses.

Example:

			
from pyspark.sql.functions import sum
sales_df.groupBy("Region") \
        .agg(sum("SalesAmount").alias("TotalSales")) \
        .show()

Result:

Region	TotalSales
East	500000
West	750000

Multiple Aggregations in PySpark

			
from pyspark.sql.functions import sum, avg, count
sales_df.groupBy("Region").agg(
    sum("SalesAmount").alias("TotalSales"),
    avg("SalesAmount").alias("AvgSales"),
    count("*").alias("OrderCount")
)

		

This performs several calculations in a single operation.

Aggregation Using KQL

KQL is commonly used in Real-Time Intelligence workloads.

Example:

			
Sales
| summarize TotalSales=sum(SalesAmount)
    by Region

Result:

Region	TotalSales
East	500000
West	750000

Multiple Aggregations in KQL

			
Sales
| summarize
    TotalSales=sum(SalesAmount),
    AvgSales=avg(SalesAmount),
    OrderCount=count()
    by Region

		

This pattern is common in real-time analytics.

Aggregation in Dataflows Gen2

Dataflows Gen2 provides a low-code interface.

Using the Group By transformation, users can:

Sum values
Count rows
Calculate averages
Find minimum values
Find maximum values

Typical steps:

Connect to source
Select Group By
Choose grouping columns
Define aggregation functions
Load results

This approach is useful for citizen developers and low-code ETL scenarios.

Aggregation in Dimensional Models

Aggregations are commonly used before loading data into fact and dimension tables.

Example:

Raw transactions:

OrderID	Customer	Amount
1	A	100
2	A	200
3	B	300

Aggregated customer sales:

Customer	TotalSales
A	300
B	300

This summary table can support reporting and dashboards.

Fact Table Aggregations

Fact tables often store:

Transaction-level facts
Daily summaries
Monthly summaries

Examples:

Transaction Fact

OrderID	Amount
1001	50

Daily Aggregate Fact

Date	TotalSales
2026-01-01	50000

Aggregated fact tables improve query performance.

Window Aggregations vs Group Aggregations

Data engineers should understand the difference.

Group Aggregation

Returns one row per group.

			
SELECT Region,
       SUM(SalesAmount)
FROM Sales
GROUP BY Region;

Window Aggregation

Preserves detail rows.

			
SELECT
    OrderID,
    Region,
    SalesAmount,
    SUM(SalesAmount)
        OVER(PARTITION BY Region)
        AS RegionTotal
FROM Sales;

		

Useful for:

Running totals
Rankings
Percentages
Advanced analytics

Performance Considerations

Grouping and aggregation can be expensive.

Best practices include:

Filter Early

Reduce data before aggregation.

WHERE OrderDate >= '2026-01-01'

Aggregate Close to the Source

Avoid moving unnecessary detailed records.

Use Partitioning

Partitioning helps Spark process data efficiently.

Examples:

Date
Region
Customer segment

Use Delta Tables

Delta tables improve performance through:

Data skipping
File optimization
Efficient query execution

Avoid Excessive Cardinality

Grouping on highly unique columns can reduce efficiency.

Bad example:

GROUP BY TransactionID

Good example:

GROUP BY Region

DP-700 Exam Tips

Remember the following:

GROUP BY creates logical groups before aggregation.
HAVING filters aggregated results.
SQL uses GROUP BY.
PySpark uses groupBy() and agg().
KQL uses summarize.
Dataflows Gen2 provides Group By transformations.
Aggregated fact tables improve reporting performance.
Window functions preserve detailed rows while performing calculations.
Aggregations are frequently used when preparing dimensional models.
Filtering before aggregation improves performance.

Practice Exam Questions

Question 1

A data engineer needs to calculate total sales by region in a Fabric Warehouse.

Which SQL function should be used?

A. AVG()

B. COUNT()

C. SUM()

D. MAX()

Correct Answer: C

Explanation: SUM() calculates the total of numeric values. AVG() calculates averages, COUNT() counts rows, and MAX() returns the largest value.

Question 2

A Fabric notebook must calculate the number of orders per customer.

Which aggregation function should be used?

A. COUNT()

B. AVG()

C. MIN()

D. MAX()

Correct Answer: A

Explanation: COUNT() returns the number of rows in each group, making it ideal for counting orders.

Question 3

You need to remove regions with total sales less than $500,000 after aggregation.

Which SQL clause should you use?

A. ORDER BY

B. WHERE

C. DISTINCT

D. HAVING

Correct Answer: D

Explanation: HAVING filters aggregated results after the GROUP BY operation is completed.

Question 4

Which KQL operator is primarily used for aggregation?

A. project

B. summarize

C. extend

D. join

Correct Answer: B

Explanation: The summarize operator performs grouping and aggregation in KQL.

Question 5

A Fabric Dataflow Gen2 developer wants to calculate average sales by product category.

Which transformation should be used?

A. Merge

B. Append

C. Group By

D. Split Column

Correct Answer: C

Explanation: The Group By transformation supports aggregation operations such as averages, sums, counts, minimums, and maximums.

Question 6

What is the primary purpose of a GROUP BY clause?

A. Sort rows

B. Remove duplicates

C. Filter rows

D. Create logical groups for aggregation

Correct Answer: D

Explanation: GROUP BY organizes rows into groups before aggregate calculations are performed.

Question 7

Which PySpark operation performs grouping before aggregation?

A. select()

B. filter()

C. groupBy()

D. orderBy()

Correct Answer: C

Explanation: groupBy() defines the grouping columns that will be used by aggregation functions.

Question 8

Which scenario is most appropriate for a window aggregation?

A. Total sales by region only

B. Average salary by department only

C. Customer counts by state only

D. Display each transaction along with the total sales for its region

Correct Answer: D

Explanation: Window functions preserve detail rows while calculating aggregates across a defined partition.

Question 9

A data engineer groups a dataset by TransactionID, where every TransactionID is unique.

What is the likely result?

A. Improved aggregation performance

B. Reduced cardinality

C. Limited performance benefits because each group contains one row

D. Automatic partition optimization

Correct Answer: C

Explanation: Grouping by a highly unique column creates many groups and often provides little analytical value.

Question 10

When preparing data for a dimensional model, why are aggregated tables often created?

A. To increase data duplication

B. To improve reporting and query performance

C. To eliminate dimension tables

D. To replace fact tables entirely

Correct Answer: B

Explanation: Pre-aggregated tables reduce the amount of data that must be processed during reporting, improving performance and user experience.

Go to the DP-700 Exam Prep Hub main page.

DP-700, Microsoft Certification, Microsoft Fabric June 3, 2026

Endorse items (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
   --> Configure security and governance
      --> Endorse items

Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

As organizations adopt Microsoft Fabric, the number of available data assets can grow rapidly. Data engineers, analysts, business users, and executives may encounter hundreds or even thousands of reports, semantic models, dashboards, warehouses, lakehouses, notebooks, and other data assets.

A common challenge is determining:

Which data assets are trustworthy?
Which reports should be used for executive reporting?
Which semantic models represent official business definitions?
Which datasets have been reviewed and approved?

To address these governance challenges, Microsoft Fabric supports endorsements.

Endorsements help organizations identify trusted and authoritative data assets, making it easier for users to discover and use approved content.

For the DP-700 exam, it is important to understand endorsement types, governance benefits, use cases, and how endorsements differ from security and sensitivity labels.

What Are Endorsements?

An endorsement is a governance feature that allows organizations to identify and promote trusted data assets.

Endorsements help users answer the question:

“Can I trust this data asset?”

Instead of searching through numerous reports and datasets, users can quickly identify endorsed items that have been reviewed and approved.

Purpose of Endorsements

Organizations use endorsements to:

Improve data discoverability
Promote trusted assets
Reduce duplicate reports
Encourage consistent reporting
Improve governance
Increase user confidence
Establish authoritative data sources

Endorsement Types

Microsoft Fabric supports two primary endorsement levels:

Promoted

Certified

These endorsement levels indicate different degrees of trust and governance.

Promoted Items

A Promoted item indicates:

The content creator believes the item is valuable.
The item is recommended for broader use.
The item may not have gone through formal governance review.

Think of Promoted as:

Recommended Content

Examples:

Frequently used reports
Department dashboards
Common semantic models
Team-approved datasets

Characteristics of Promoted Items

Promoted items:

Are easier to discover
Indicate useful content
Can be designated by authorized users
Do not necessarily represent official organizational standards

Example

A Sales team creates a dashboard used by dozens of users.

The dashboard is reliable and widely used.

The owner marks it as:

Promoted

This helps users identify it as recommended content.

Certified Items

Certified is a higher endorsement level.

Certified items have typically undergone formal review and approval processes.

Think of Certified as:

Official Trusted Content

Examples:

Executive reporting datasets
Enterprise semantic models
Corporate KPI reports
Official financial dashboards

Characteristics of Certified Items

Certified items:

Represent authoritative data
Follow governance standards
Have undergone validation
Are approved by designated governance teams
Should be used whenever possible

Example

A Finance semantic model contains:

Revenue
Expenses
Profit
Corporate KPIs

The governance team validates the model and certifies it.

The model becomes:

Certified

Users now know it represents official business definitions.

Comparing Promoted and Certified

Feature	Promoted	Certified
Recommended by creator	Yes	Yes
Formal review required	No	Yes
Governance approval	Optional	Required
Official organizational source	Not necessarily	Yes
Highest trust level	No	Yes

Why Endorsements Matter

Without endorsements:

			
Sales Report V1
Sales Report V2
Sales Report Final
Sales Report Final2
Sales Dashboard New

		

Users may not know which asset to trust.

With endorsements:

			
Sales Dashboard
(Certified)

The preferred asset becomes obvious.

Supported Fabric Items

Endorsements can be applied to many Fabric assets, including:

Semantic Models
Reports
Dashboards
Data Warehouses
Lakehouses
Dataflows
Other supported Fabric artifacts

Supported item types may evolve as Microsoft Fabric continues to expand.

Endorsements and Data Discovery

One major benefit of endorsements is improved discoverability.

Users searching for assets can identify:

Promoted content
Certified content

This reduces confusion and encourages reuse of trusted assets.

Governance Benefits

Endorsements support governance initiatives by helping organizations:

Establish trusted data sources
Reduce shadow analytics
Minimize duplicate content
Improve reporting consistency
Promote enterprise standards

Endorsements vs Security Permissions

A common DP-700 exam topic is distinguishing endorsements from security.

Endorsements	Permissions
Identify trusted content	Control access
Governance feature	Security feature
Improve discoverability	Restrict usage
Indicate quality	Grant authorization

Example:

A report may be:

Certified

But users still require permissions to access it.

Certification does not grant access.

Endorsements vs Sensitivity Labels

Another frequently tested distinction.

Endorsements	Sensitivity Labels
Indicate trustworthiness	Indicate sensitivity
Governance and quality	Classification and protection
Help users find trusted content	Help users identify sensitive content

Example:

			
Certified Report
Highly Confidential

Both labels may exist simultaneously.

The report is:

Trusted (Certified)
Sensitive (Highly Confidential)

Endorsements vs Data Lineage

Endorsements	Data Lineage
Indicates trust	Shows data flow
Governance tool	Dependency tracking tool

Data lineage answers:

Where did this data come from?

Endorsements answer:

Can I trust this asset?

Common DP-700 Exam Scenarios

Scenario 1

Requirement:

Users need to identify official KPI definitions.

Solution:

Use Certified semantic models.

Scenario 2

Requirement:

A department wants to recommend a dashboard without formal review.

Solution:

Use Promoted endorsement.

Scenario 3

Requirement:

An executive dashboard has been validated by the governance team.

Solution:

Apply Certified endorsement.

Scenario 4

Requirement:

A report contains highly sensitive financial information.

Solution:

Apply a sensitivity label.

Not an endorsement.

Endorsement Workflow

A common governance workflow:

			
Create Asset
      ↓
Validate Asset
      ↓
Promote Asset
      ↓
Governance Review
      ↓
Certify Asset

		

This process improves trust and consistency.

Best Practices

Certify Enterprise Assets

Certify:

Corporate KPI datasets
Financial reports
Enterprise semantic models

Promote Useful Content

Promote:

Department dashboards
Frequently used reports
Shared analytics assets

Establish Governance Processes

Define:

Who can certify content
Review procedures
Approval standards

Avoid Certifying Everything

Certification should remain meaningful and reserved for truly authoritative assets.

Combine Governance Features

Use endorsements alongside:

Sensitivity labels
Lineage tracking
Security permissions
Data cataloging

DP-700 Exam Focus Areas

You should understand:

✓ Purpose of endorsements

✓ Promoted endorsements

✓ Certified endorsements

✓ Governance benefits

✓ Data discovery improvements

✓ Trusted data sources

✓ Promoted versus Certified

✓ Endorsements versus permissions

✓ Endorsements versus sensitivity labels

✓ Endorsements versus lineage

✓ Common governance scenarios

Practice Exam Questions

Question 1

What is the primary purpose of endorsements in Microsoft Fabric?

A. Encrypt sensitive data

B. Identify trusted and recommended data assets

C. Filter rows of data

D. Control workspace permissions

Answer: B

Explanation

Endorsements help users identify trusted, recommended, and authoritative data assets within Fabric.

Question 2

Which endorsement level represents the highest level of organizational trust?

A. Endorsed

B. Promoted

C. Confidential

D. Certified

Answer: D

Explanation

Certified is the highest endorsement level and indicates formal governance review and approval.

Question 3

A department wants to highlight a useful dashboard without requiring formal governance approval.

Which endorsement should be used?

A. Certified

B. Promoted

C. Confidential

D. Restricted

Answer: B

Explanation

Promoted endorsements indicate recommended content without requiring formal certification processes.

Question 4

What is a key characteristic of a Certified item?

A. It automatically grants workspace access.

B. It is encrypted.

C. It automatically receives a sensitivity label.

D. It has undergone formal validation and approval.

Answer: D

Explanation

Certified items have been reviewed and approved according to organizational governance standards.

Question 5

How do endorsements differ from security permissions?

A. Endorsements classify sensitivity levels.

B. Endorsements indicate trustworthiness, while permissions control access.

C. Endorsements encrypt content.

D. Endorsements implement Row-Level Security.

Answer: B

Explanation

Permissions determine who can access an asset, while endorsements indicate whether the asset is trusted.

Question 6

Which statement about Promoted items is correct?

A. They require formal governance certification.

B. They cannot be used by business users.

C. They indicate content that is recommended for broader use.

D. They automatically become Certified after publication.

Answer: C

Explanation

Promoted items highlight useful and recommended content without formal certification requirements.

Question 7

A governance team reviews and approves an enterprise semantic model that contains official KPI definitions.

Which endorsement should be applied?

A. Public

B. Promoted

C. Internal

D. Certified

Answer: D

Explanation

Certified endorsement is appropriate for formally reviewed and approved enterprise assets.

Question 8

What problem do endorsements primarily help solve?

A. Unauthorized access

B. Data encryption

C. User identification

D. Difficulty identifying trusted content

Answer: D

Explanation

Endorsements help users distinguish trusted assets from numerous available reports and datasets.

Question 9

A report is marked as Certified.

What does this indicate?

A. It is an authoritative and approved data asset.

B. It is automatically encrypted.

C. It is accessible to all users.

D. It contains confidential information.

Answer: A

Explanation

Certification indicates that the asset has been validated and approved as a trusted source.

Question 10

Which statement best describes the relationship between endorsements and sensitivity labels?

A. They are identical governance features.

B. Sensitivity labels replace endorsements.

C. Endorsements indicate trustworthiness, while sensitivity labels indicate data sensitivity.

D. Certified items cannot have sensitivity labels.

Answer: C

Explanation

Endorsements focus on trust and quality, while sensitivity labels focus on classification and protection requirements.

Exam Tip

One of the most common DP-700 exam traps is confusing endorsements, sensitivity labels, and security permissions.

Remember:

Requirement	Solution
Identify trusted content	Endorsements
Classify sensitive data	Sensitivity Labels
Control who can access data	Permissions
Track data origins	Lineage

A useful memory aid is:

Promoted = Recommended
Certified = Official
Sensitivity Label = Sensitive
Permission = Access

If the exam question focuses on helping users identify the most trustworthy or authoritative asset, the correct answer is often Promoted or Certified endorsement, not a security control.

Go to the DP-700 Exam Prep Hub main page.

Data Security, DP-700, Microsoft Certification, Microsoft Fabric June 3, 2026

Apply sensitivity labels to items (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
   --> Configure security and governance
      --> Apply sensitivity labels to items

Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Data is one of an organization’s most valuable assets, and not all data carries the same level of sensitivity. Some information can be shared broadly across the organization, while other information must be protected due to regulatory, legal, contractual, or business requirements.

Examples include:

Financial reports
Employee records
Customer information
Healthcare data
Intellectual property
Confidential business plans

Microsoft Fabric integrates with Microsoft Information Protection (MIP) and Microsoft Purview sensitivity labels, enabling organizations to classify and protect data assets throughout their lifecycle.

Sensitivity labels help users understand the importance of data, enforce governance policies, and support compliance initiatives. They can be applied to many Fabric items, including reports, semantic models, dashboards, and other assets.

For the DP-700 exam, it is important to understand what sensitivity labels are, how they work, how they are applied, and how they differ from other security mechanisms such as Row-Level Security (RLS), Object-Level Security (OLS), and workspace permissions.

What Are Sensitivity Labels?

A sensitivity label is a classification tag that identifies the sensitivity level of data or content.

Examples include:

Label	Typical Meaning
Public	Safe for general sharing
General	Internal business use
Confidential	Restricted business information
Highly Confidential	Highly sensitive information
Restricted	Maximum protection required

Organizations can create custom sensitivity labels to align with their governance policies.

Purpose of Sensitivity Labels

Sensitivity labels help organizations:

Classify data consistently
Protect sensitive information
Improve data governance
Support regulatory compliance
Enable data discovery
Increase user awareness
Reduce accidental data exposure

Sensitivity labels serve as both a classification mechanism and, in some scenarios, a protection mechanism.

Sensitivity Labels in Microsoft Fabric

Microsoft Fabric integrates with Microsoft Purview Information Protection.

This integration allows labels to be applied to Fabric assets and propagated through data workflows.

Examples of Fabric items that can support sensitivity labels include:

Reports
Semantic Models
Dashboards
Data Warehouses
Lakehouses
Notebooks
Dataflows
Other supported Fabric artifacts

How Sensitivity Labels Work

A sensitivity label is associated with an item.

Example:

			
Executive Financial Report
Label: Highly Confidential

Users accessing the item can immediately identify its classification level.

The label travels with the item and may also propagate to downstream artifacts depending on organizational policies and supported scenarios.

Common Sensitivity Label Classifications

Public

Data intended for unrestricted access.

Examples:

Public website content
Marketing brochures
Published documentation

General

Data intended for internal use.

Examples:

Departmental reports
Internal project tracking

Confidential

Data requiring controlled access.

Examples:

Financial reports
Customer information
Internal analytics

Highly Confidential

Data requiring strict protection.

Examples:

Payroll information
Acquisition plans
Executive strategy documents

Restricted

Data requiring maximum protection.

Examples:

Legal investigations
Security credentials
Highly regulated information

Applying Sensitivity Labels

Sensitivity labels can be applied manually or automatically depending on organizational configurations.

Manual Labeling

Users select the appropriate label.

Example:

			
Report
 ↓
Apply Label
 ↓
Confidential

		

Manual labeling is commonly used when users understand the business context of the data.

Automatic Labeling

Organizations may configure policies that automatically apply labels based on:

Sensitive information types
Data patterns
Business rules
Compliance requirements

Example:

			
Contains Credit Card Data
           ↓
Apply Highly Confidential

Label Inheritance and Propagation

One of the most important DP-700 exam topics related to sensitivity labels is inheritance.

Labels may propagate from source items to downstream artifacts.

Example:

			
Lakehouse
(Confidential)
      ↓
Semantic Model
      ↓
Report

		

The downstream item may inherit the sensitivity label from its source.

This helps maintain governance consistency throughout the analytics lifecycle.

Benefits of Label Propagation

Without propagation:

			
Sensitive Data
       ↓
Unlabeled Report

Risk:

Users may unknowingly share sensitive information.

With propagation:

			
Sensitive Data
       ↓
Confidential Report

Users are immediately aware of the sensitivity level.

Sensitivity Labels vs Security Permissions

This distinction is frequently tested on certification exams.

Sensitivity Labels	Security Permissions
Classify data	Control access
Provide governance context	Enforce authorization
Improve awareness	Restrict usage
Support compliance	Protect resources

Example:

A report may be labeled:

Highly Confidential

But unless appropriate permissions exist, the label alone does not automatically prevent access.

Sensitivity Labels vs Row-Level Security

Sensitivity Labels	Row-Level Security
Classify content	Filter data rows
Governance feature	Access control feature
Applies to items	Applies to data records

Example:

RLS:

			
East Manager
↓
East Region Rows Only

Sensitivity Label:

Confidential Report

Both can be used together.

Sensitivity Labels vs Dynamic Data Masking

Sensitivity Labels	Dynamic Data Masking
Classifies data	Obscures sensitive values
Governance-focused	Security-focused
Does not change data display	Changes displayed values

Example:

Label:

Highly Confidential

Masking:

XXXX-XXXX-1234

Sensitivity Labels and Compliance

Sensitivity labels play an important role in compliance initiatives such as:

GDPR
HIPAA
PCI DSS
SOX
Internal governance programs

They help organizations:

Identify sensitive assets
Demonstrate governance controls
Improve audit readiness

Sensitivity Labels and Microsoft Purview

Microsoft Purview provides centralized governance capabilities.

Organizations can use Purview to:

Define sensitivity labels
Publish labels
Manage classification policies
Track protected content
Support compliance reporting

Fabric integrates with these governance capabilities.

Real-World Scenarios

Scenario 1

A finance report contains quarterly earnings information.

Solution:

Apply a Confidential sensitivity label.

Scenario 2

A payroll dataset contains salary and compensation data.

Solution:

Apply a Highly Confidential sensitivity label.

Scenario 3

A public product catalog is intended for external customers.

Solution:

Apply a Public label.

Scenario 4

A report inherits data from a Confidential semantic model.

Result:

The report may inherit the Confidential label through label propagation.

Best Practices

Establish a Clear Classification Framework

Create standardized labels such as:

Public
General
Confidential
Highly Confidential

Use Consistent Labeling

Apply labels consistently across Fabric assets.

Leverage Label Propagation

Allow downstream artifacts to inherit labels when appropriate.

Train Users

Ensure users understand:

Label meanings
Sharing responsibilities
Governance requirements

Combine Labels with Security Controls

Use labels alongside:

Workspace permissions
Item permissions
Row-Level Security
Object-Level Security
Dynamic Data Masking

Review Labels Regularly

Data classifications may change over time.

DP-700 Exam Focus Areas

You should understand:

✓ Sensitivity label concepts

✓ Microsoft Purview integration

✓ Classification levels

✓ Manual labeling

✓ Automatic labeling

✓ Label inheritance

✓ Label propagation

✓ Governance benefits

✓ Compliance scenarios

✓ Sensitivity labels versus security permissions

✓ Sensitivity labels versus RLS

✓ Sensitivity labels versus Dynamic Data Masking

Practice Exam Questions

Question 1

What is the primary purpose of a sensitivity label in Microsoft Fabric?

A. To classify and identify the sensitivity level of data

B. To encrypt data at rest

C. To filter rows of data

D. To assign workspace roles

Answer: A

Explanation

Sensitivity labels classify data according to its sensitivity and governance requirements. They are primarily used for data classification and protection awareness.

Question 2

Which Microsoft service provides the sensitivity labeling framework used by Microsoft Fabric?

A. Microsoft Purview

B. Microsoft Defender

C. Microsoft Sentinel

D. Azure Key Vault

Answer: A

Explanation

Microsoft Fabric integrates with Microsoft Purview Information Protection to provide sensitivity labeling capabilities.

Question 3

A company wants reports containing payroll information to be clearly identified as highly sensitive.

Which feature should be used?

A. Dynamic Data Masking

B. Row-Level Security

C. Sensitivity Labels

D. Deployment Pipelines

Answer: C

Explanation

Sensitivity labels classify and identify the sensitivity level of data assets such as payroll reports.

Question 4

What is label propagation?

A. Automatic workspace creation

B. Automatic dataset refresh

C. Automatic role assignment

D. Automatic inheritance of sensitivity labels to downstream items

Answer: D

Explanation

Label propagation helps maintain consistent governance by carrying sensitivity classifications to derived artifacts.

Question 5

Which statement best describes the relationship between sensitivity labels and security permissions?

A. Sensitivity labels replace security permissions.

B. Security permissions automatically create labels.

C. Sensitivity labels classify data, while permissions control access.

D. Sensitivity labels filter rows of data.

Answer: C

Explanation

Labels provide classification and governance context, while permissions determine who can access resources.

Question 6

A report inherits data from a semantic model labeled Confidential.

What may happen if label propagation is enabled?

A. The report may inherit the Confidential label.

B. The report becomes encrypted automatically.

C. The report is deleted after publication.

D. Workspace permissions are removed.

Answer: A

Explanation

Label propagation can automatically apply inherited classifications to downstream assets.

Question 7

Which classification would generally represent the highest level of protection?

A. Public

B. General

C. Confidential

D. Highly Confidential

Answer: D

Explanation

Highly Confidential labels are typically used for the most sensitive business information.

Question 8

Which statement about sensitivity labels is correct?

A. They filter records based on user identity.

B. They hide columns from users.

C. They help classify and govern data assets.

D. They assign workspace roles.

Answer: C

Explanation

Sensitivity labels primarily support classification, governance, and compliance initiatives.

Question 9

A company wants to automatically classify files containing credit card information.

Which capability supports this requirement?

A. Automatic sensitivity labeling

B. Dynamic Data Masking

C. Workspace Viewer permissions

D. Object-Level Security

Answer: A

Explanation

Automatic labeling policies can identify sensitive information patterns and apply appropriate labels.

Question 10

Why should sensitivity labels be combined with security controls?

A. Labels automatically replace encryption.

B. Labels alone do not control access to data.

C. Labels remove the need for governance policies.

D. Labels prevent all data leakage scenarios.

Answer: B

Explanation

Sensitivity labels provide classification and governance information, but access controls such as permissions, RLS, and masking are still required to secure data.

Exam Tip

A common DP-700 exam challenge is distinguishing classification technologies from access control technologies.

Remember:

Requirement	Solution
Classify data sensitivity	Sensitivity Labels
Restrict who can access data	Permissions
Restrict which rows users see	Row-Level Security
Hide sensitive values	Dynamic Data Masking
Hide tables or objects	Object-Level Security

If the question focuses on identifying, classifying, labeling, governing, or tracking sensitive data, the correct answer is often Sensitivity Labels rather than a traditional security control.

Go to the DP-700 Exam Prep Hub main page.

DP-700, Microsoft Certification, Microsoft Fabric June 3, 2026June 3, 2026

Implement dynamic data masking (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
   --> Configure security and governance
      --> Implement dynamic data masking

Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Protecting sensitive data is a critical responsibility for data engineers. Organizations routinely store confidential information such as:

Personally Identifiable Information (PII)
Social Security numbers
Credit card information
Email addresses
Phone numbers
Employee salaries
Customer account details

While some users require access to this information, many others only need access to the surrounding business data. Granting unrestricted visibility to sensitive values can increase security risks and create compliance concerns.

Dynamic Data Masking (DDM) is a security feature that limits the exposure of sensitive data by masking values for non-privileged users while allowing authorized users to see the original values.

For the DP-700 exam, it is important to understand how Dynamic Data Masking works, its use cases, limitations, and how it differs from other security mechanisms such as Row-Level Security (RLS), Column-Level Security (CLS), and encryption.

What Is Dynamic Data Masking?

Dynamic Data Masking is a security feature that obscures sensitive data at query time.

The actual data remains unchanged in storage.

Instead, unauthorized users see a masked version of the data.

Example:

Actual data:

CustomerName	Email
John Smith	john.smith@contoso.com

Masked view:

CustomerName	Email
John Smith	jXXXXXXX@XXXX.com

The original data still exists in the database.

Only the displayed results are modified.

Why Use Dynamic Data Masking?

Organizations often need to:

Protect confidential information
Limit exposure of sensitive fields
Support regulatory compliance
Reduce accidental data disclosure
Allow broader access to datasets without exposing confidential values

Dynamic Data Masking provides a simple way to accomplish these goals.

How Dynamic Data Masking Works

The masking process occurs during query execution.

			
User Query
     ↓
Security Evaluation
     ↓
Masking Applied
     ↓
Results Returned

		

Authorized users:

john.smith@contoso.com

Unauthorized users:

jXXXXXXX@XXXX.com

The underlying stored value never changes.

Common Dynamic Data Masking Use Cases

Customer Contact Information

Sensitive fields:

Email addresses
Phone numbers
Mailing addresses

Example:

			
Actual:
john.smith@contoso.com
Masked:
jXXXXXXX@XXXX.com

Employee Information

Sensitive fields:

Salary
Bonus information
Tax identifiers

Example:

			
Actual:
$120,000
Masked:
$XXXXXX

Financial Information

Sensitive fields:

Credit card numbers
Bank account numbers
Account balances

Example:

			
Actual:
4321-5678-9876-1234
Masked:
XXXX-XXXX-XXXX-1234

Types of Data That Can Be Masked

Common candidates include:

Email addresses
Phone numbers
National identification numbers
Credit card numbers
Salary data
Medical information
Customer account information

Generally, highly sensitive columns are good candidates for masking.

Dynamic Data Masking vs Encryption

This distinction is frequently tested on certification exams.

Dynamic Data Masking	Encryption
Protects displayed results	Protects stored data
Data remains visible to privileged users	Data is encrypted at rest or in transit
User-facing security feature	Storage and transport security feature
Does not alter stored values	Changes stored representation

Example

Dynamic Data Masking:

			
Stored:
123-45-6789
Displayed:
XXX-XX-6789

Encryption:

			
Stored:
A7F4D93C12...

Dynamic Data Masking vs Row-Level Security

These concepts are often confused.

Dynamic Data Masking	Row-Level Security
Hides data values	Filters rows
Same rows visible	Different rows visible
Column-focused	Row-focused
Data remains visible in masked form	Rows may be completely hidden

Example:

RLS:

			
East Region Manager
→ East Region Rows Only

DDM:

			
All Rows Visible
→ Sensitive Values Masked

Dynamic Data Masking vs Column-Level Security

Another important distinction.

Dynamic Data Masking	Column-Level Security
Shows masked values	Hides column entirely
User sees partial data	User cannot access column
More flexible visibility	More restrictive security

Example:

DDM:

			
Salary
XXXXXX

CLS:

			
Salary Column
Not Visible

Dynamic Data Masking vs Object-Level Security

Dynamic Data Masking	Object-Level Security
Masks data values	Hides objects
User accesses table	Table may be hidden
Granular data visibility	Object visibility control

Example:

DDM:

Salary = XXXXX

OLS:

Payroll Table Hidden

Benefits of Dynamic Data Masking

Simplified Security

Protects sensitive values without redesigning datasets.

Reduced Data Exposure

Users only see the information necessary for their role.

Regulatory Support

Can help support compliance initiatives involving:

GDPR
HIPAA
PCI DSS
Internal governance policies

Easier Data Sharing

Organizations can provide broader dataset access while reducing risk.

Limitations of Dynamic Data Masking

For the DP-700 exam, understanding limitations is important.

Dynamic Data Masking:

Does NOT Encrypt Data

Data remains stored in its original form.

Does NOT Replace Access Controls

Users still require appropriate permissions.

Does NOT Replace RLS

Rows remain visible.

Does NOT Replace CLS

Columns remain accessible.

Is Not a Complete Security Solution

DDM should be combined with other security mechanisms.

Layered Security Approach

Organizations commonly combine:

			
Workspace Security
        ↓
Item Security
        ↓
Row-Level Security
        ↓
Column-Level Security
        ↓
Dynamic Data Masking
        ↓
Encryption

		

Each layer provides additional protection.

Common DP-700 Exam Scenarios

Scenario 1

Requirement:

Customer service representatives should view customer records but not full credit card numbers.

Solution:

Implement Dynamic Data Masking.

Scenario 2

Requirement:

Managers should only see employees within their region.

Solution:

Implement Row-Level Security.

Scenario 3

Requirement:

Payroll data should be completely hidden from analysts.

Solution:

Implement Object-Level Security or Column-Level Security.

Scenario 4

Requirement:

Protect sensitive data stored on disk.

Solution:

Use encryption rather than Dynamic Data Masking.

Best Practices

Mask Sensitive Columns

Focus on:

PII
Financial data
Healthcare information
Confidential business information

Combine DDM with Other Controls

Use:

Workspace permissions
Item permissions
RLS
CLS
OLS

for comprehensive protection.

Follow Least Privilege

Limit access to unmasked data.

Regularly Review Security Policies

Verify masking requirements align with governance policies.

Protect Production Data

Apply masking wherever sensitive data exposure could occur.

DP-700 Exam Focus Areas

You should understand:

✓ Dynamic Data Masking concepts

✓ How masking works

✓ Common masking scenarios

✓ Sensitive data protection

✓ Dynamic Data Masking vs Encryption

✓ Dynamic Data Masking vs RLS

✓ Dynamic Data Masking vs CLS

✓ Dynamic Data Masking vs OLS

✓ Security best practices

✓ Layered security approaches

Practice Exam Questions

Question 1

What is the primary purpose of Dynamic Data Masking?

A. Encrypt stored data

B. Restrict workspace access

C. Filter rows returned by a query

D. Hide sensitive data values from unauthorized users

Answer: D

Explanation

Dynamic Data Masking obscures sensitive data values in query results while leaving the underlying stored data unchanged.

Question 2

Which statement about Dynamic Data Masking is true?

A. It permanently modifies stored data.

B. It encrypts data at rest.

C. It masks data at query time for unauthorized users.

D. It removes sensitive columns.

Answer: C

Explanation

DDM operates at query time and displays masked values to users who do not have permission to view the actual data.

Question 3

A company wants customer service agents to view customer records while masking credit card numbers.

Which feature should be implemented?

A. Dynamic Data Masking

B. Row-Level Security

C. Deployment Rules

D. Workspace Viewer Role

Answer: A

Explanation

DDM allows users to view records while hiding sensitive portions of specific data fields.

Question 4

What is the primary difference between Dynamic Data Masking and Row-Level Security?

A. DDM encrypts data while RLS does not.

B. DDM controls workspace permissions while RLS controls item permissions.

C. DDM hides columns while RLS hides tables.

D. DDM masks values while RLS filters rows.

Answer: D

Explanation

RLS determines which rows are visible, while DDM determines how sensitive values are displayed.

Question 5

Which security feature completely hides a column from users?

A. Dynamic Data Masking

B. Column-Level Security

C. Row-Level Security

D. Encryption

Answer: B

Explanation

Column-Level Security removes access to the column entirely, whereas DDM displays masked values.

Question 6

A company needs to protect sensitive data stored on disk.

Which technology should be used?

A. Dynamic Data Masking

B. Build Permission

C. Encryption

D. Row-Level Security

Answer: C

Explanation

Encryption protects stored data, while DDM only affects how data is displayed.

Question 7

Which type of data is commonly protected using Dynamic Data Masking?

A. Email addresses

B. Credit card numbers

C. Social Security numbers

D. All of the above

Answer: D

Explanation

DDM is commonly used to protect various forms of sensitive personal and financial information.

Question 8

A user can access a salary column but sees masked values instead of actual salaries.

Which security feature is being used?

A. Row-Level Security

B. Dynamic Data Masking

C. Object-Level Security

D. Folder-Level Security

Answer: B

Explanation

DDM allows access to the column while masking sensitive values.

Question 9

Which statement accurately describes Dynamic Data Masking?

A. It replaces all other security controls.

B. It prevents users from accessing tables.

C. It should be combined with other security mechanisms.

D. It filters data based on user region.

Answer: C

Explanation

DDM is one layer of security and should be used alongside permissions, RLS, CLS, and encryption.

Question 10

A company wants users to see the last four digits of credit card numbers while masking the rest.

Which solution is most appropriate?

A. Object-Level Security

B. Workspace-Level Security

C. Encryption

D. Dynamic Data Masking

Answer: D

Explanation

Dynamic Data Masking can reveal portions of sensitive values while masking the remaining characters.

Exam Tip

One of the most common DP-700 exam traps is confusing Dynamic Data Masking with other security technologies.

Remember:

Requirement	Solution
Hide sensitive values	Dynamic Data Masking
Filter rows	Row-Level Security
Hide columns	Column-Level Security
Hide tables or measures	Object-Level Security
Protect stored data	Encryption

If users should still be able to access a column but only see a masked version of its contents, Dynamic Data Masking is usually the correct answer.

Go to the DP-700 Exam Prep Hub main page.

DP-700, Microsoft Certification, Microsoft Fabric June 3, 2026

Create and configure deployment pipelines (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
   --> Implement lifecycle management in Fabric
      --> Create and configure deployment pipelines

Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

One of the most important aspects of enterprise analytics development is ensuring that changes move safely and consistently from development environments into production. Without a structured deployment process, organizations face risks such as inconsistent configurations, accidental overwrites, insufficient testing, and production outages.

Microsoft Fabric Deployment Pipelines provide a controlled mechanism for promoting content through different stages of the software development lifecycle. Deployment Pipelines help organizations implement DataOps and DevOps practices by enabling repeatable, auditable, and governed deployments.

For the DP-700 exam, you should understand how Deployment Pipelines work, how to create and configure them, how content is promoted between stages, how deployment rules work, and how Deployment Pipelines integrate with source control and lifecycle management strategies.

What Are Deployment Pipelines?

A Deployment Pipeline is a Fabric feature that enables content to move through multiple environments in a controlled manner.

A typical deployment pipeline consists of three stages:

			
Development
      ↓
Test
      ↓
Production

		

Each stage is associated with a separate Fabric workspace.

Deployment Pipelines help ensure that content is properly validated before reaching production.

Why Deployment Pipelines Are Important

Without Deployment Pipelines:

Developers may deploy directly to production.
Environment configurations may become inconsistent.
Testing may be skipped.
Rollbacks become more difficult.

Deployment Pipelines provide:

Controlled releases
Environment separation
Repeatable deployments
Reduced deployment risk
Improved governance
Better collaboration

Deployment Pipeline Architecture

A common architecture consists of:

			
Development Workspace
          ↓
     Deploy
          ↓
Test Workspace
          ↓
     Deploy
          ↓
Production Workspace

		

Each workspace contains a version of the solution appropriate for that stage.

Deployment Pipeline Stages

Development Stage

The Development stage is where new work is performed.

Activities include:

Creating notebooks
Building pipelines
Modifying warehouses
Developing Dataflows Gen2
Testing new features

Characteristics:

Frequent changes
Active development
Potential instability

Test Stage

The Test stage is used for validation.

Activities include:

Functional testing
Integration testing
User acceptance testing
Performance testing

Characteristics:

Controlled environment
Representative data
Validation before production

Production Stage

The Production stage contains approved content used by business users.

Characteristics:

Stable environment
Business-critical workloads
Strict change control

Creating a Deployment Pipeline

Creating a Deployment Pipeline generally involves the following steps.

Step 1: Create the Pipeline

From the Fabric workspace experience:

Create a new Deployment Pipeline.
Assign a pipeline name.
Configure the stages.

Example:

Sales Analytics Pipeline

Step 2: Assign Workspaces

Associate each stage with a Fabric workspace.

Example:

Stage	Workspace
Development	Sales-Dev
Test	Sales-Test
Production	Sales-Prod

Each workspace should be dedicated to its specific purpose.

Step 3: Validate Content

Ensure workspace content is configured properly before deployment.

Examples:

Notebooks
Data Pipelines
Dataflows Gen2
Warehouses
Lakehouses
Reports
Semantic Models

Step 4: Deploy Content

Deploy content from one stage to the next.

Example:

			
Development
      ↓
Deploy
      ↓
Test

		

After validation:

			
Test
      ↓
Deploy
      ↓
Production

		

Supported Fabric Items

Deployment Pipelines support many Fabric artifacts.

Common examples include:

Data Pipelines
Notebooks
Lakehouses
Warehouses
Semantic Models
Reports
Dataflows Gen2
Environments

The exact list of supported items may evolve as Fabric continues to expand.

Understanding Deployment

A deployment copies supported metadata and configurations from one stage to another.

Deployment typically includes:

Definitions
Metadata
Configurations

Deployment does not generally mean copying large volumes of production data between workspaces.

This distinction is important for exam questions.

Comparing Stages

Deployment Pipelines provide stage comparison capabilities.

Administrators can identify:

New items
Modified items
Missing items

Example:

Item	Development	Test
Customer Notebook	Updated	Old Version
Sales Pipeline	New	Missing

Comparison helps determine what should be deployed.

Deployment Rules

One of the most important DP-700 topics is Deployment Rules.

Deployment Rules allow environment-specific settings to be maintained during deployment.

Without Deployment Rules:

			
Development Connection String
          ↓
Deploy
          ↓
Production Connection String Overwritten

		

With Deployment Rules:

			
Development Connection String
          ↓
Deploy
          ↓
Production Connection String Preserved

		

Why Deployment Rules Matter

Organizations often have different settings per environment.

Examples include:

Environment	Data Source
Development	Dev Database
Test	Test Database
Production	Production Database

Deployment Rules prevent deployments from overwriting these environment-specific settings.

Common Deployment Rule Scenarios

Data Source Connections

Development:

SalesDB-Dev

Production:

SalesDB-Prod

Deployment Rules preserve the correct environment-specific connection.

Parameter Values

Different environments may use:

Different storage accounts
Different schemas
Different APIs

Deployment Rules ensure correct values remain in place.

Integration with Git

Deployment Pipelines and Git serve different purposes.

Feature	Purpose
Git	Source control
Deployment Pipeline	Environment promotion

Git manages:

Version history
Branching
Collaboration

Deployment Pipelines manage:

Test promotion
Production releases
Environment consistency

Exam Tip

A common DP-700 question asks which technology should be used.

If the requirement is:

Track changes → Git
Promote between environments → Deployment Pipeline

Deployment Pipeline Workflow

A common enterprise workflow:

			
Developer
      ↓
Git Repository
      ↓
Development Workspace
      ↓
Deployment Pipeline
      ↓
Test Workspace
      ↓
Deployment Pipeline
      ↓
Production Workspace

		

This architecture aligns with modern DevOps practices.

Security and Permissions

To manage Deployment Pipelines, users typically require:

Appropriate Fabric permissions
Access to associated workspaces

Common workspace roles include:

Role	Capability
Admin	Full pipeline management
Member	Manage workspace content
Contributor	Create and edit content
Viewer	Read-only access

Administrators generally oversee deployments.

Monitoring Deployments

Deployment Pipelines provide visibility into:

Deployment history
Successful deployments
Failed deployments
Content differences

Benefits include:

Troubleshooting support
Change auditing
Governance tracking

Common DP-700 Exam Scenarios

Scenario 1

A company wants separate development, test, and production environments.

Solution:

Create a Deployment Pipeline with three stages.

Scenario 2

A deployment should not overwrite production database connections.

Solution:

Configure Deployment Rules.

Scenario 3

Developers need to promote tested notebooks into production.

Solution:

Deploy through a Deployment Pipeline.

Scenario 4

A company needs to track source code history.

Solution:

Use Git, not Deployment Pipelines.

Best Practices

Use Separate Workspaces

Maintain dedicated workspaces for:

Development
Test
Production

Never Develop in Production

All changes should originate in Development.

Use Deployment Rules

Prevent environment-specific settings from being overwritten.

Validate Before Promotion

Test thoroughly before promoting content.

Combine Git and Deployment Pipelines

Use:

Git for version control
Deployment Pipelines for deployment

Document Deployment Processes

Establish consistent promotion procedures.

DP-700 Exam Focus Areas

You should understand:

✓ Deployment Pipeline architecture

✓ Development, Test, and Production stages

✓ Workspace assignment

✓ Deployment operations

✓ Supported Fabric items

✓ Stage comparisons

✓ Deployment Rules

✓ Environment-specific configurations

✓ Integration with Git

✓ Security considerations

✓ Monitoring and auditing

Practice Exam Questions

Question 1

What is the primary purpose of a Deployment Pipeline in Microsoft Fabric?

A. Track source code changes

B. Manage Spark clusters

C. Promote content between environments

D. Store warehouse data

Answer: C

Explanation

Deployment Pipelines are designed to move content through development, test, and production environments in a controlled manner.

Question 2

Which stage typically contains content actively being developed?

A. Development

B. Test

C. Production

D. Deployment

Answer: A

Explanation

The Development stage is where developers create and modify Fabric artifacts before testing and deployment.

Question 3

A company wants to validate notebooks before releasing them to business users.

Which deployment stage should be used before Production?

A. Archive

B. Sandbox

C. Workspace

D. Test

Answer: D

Explanation

The Test stage allows organizations to validate functionality before promoting content to Production.

Question 4

What is the purpose of Deployment Rules?

A. Improve Spark performance

B. Control notebook versioning

C. Create Git branches

D. Preserve environment-specific settings

Answer: D

Explanation

Deployment Rules allow different environments to maintain their own configuration values during deployments.

Question 5

Which technology should be used to maintain source code history?

A. Deployment Pipeline

B. Monitoring Hub

C. Git Integration

D. Capacity Metrics

Answer: C

Explanation

Git provides version control, branching, collaboration, and historical tracking capabilities.

Question 6

Which Fabric item can be promoted through a Deployment Pipeline?

A. Data Pipeline

B. Notebook

C. Lakehouse

D. All of the above

Answer: D

Explanation

Deployment Pipelines support many Fabric artifacts, including pipelines, notebooks, and Lakehouses.

Question 7

A deployment from Development to Production overwrites a production connection string.

Which feature could have prevented this?

A. Lineage View

B. Deployment Rule

C. Gateway Cluster

D. Dataflow Refresh

Answer: B

Explanation

Deployment Rules preserve environment-specific values such as connection strings and parameter settings.

Question 8

What is the recommended sequence for a deployment pipeline?

A. Production → Test → Development

B. Test → Development → Production

C. Development → Test → Production

D. Workspace → Git → Production

Answer: C

Explanation

Changes should move from Development to Test and finally to Production after validation.

Question 9

Which statement best describes the relationship between Git and Deployment Pipelines?

A. They serve the same purpose.

B. Deployment Pipelines replace Git.

C. Git replaces Deployment Pipelines.

D. Git manages source control while Deployment Pipelines manage environment promotion.

Answer: D

Explanation

Git handles version control and collaboration, while Deployment Pipelines handle content promotion between environments.

Question 10

A company wants to compare the contents of Development and Test before deployment.

Which Deployment Pipeline capability should they use?

A. Stage comparison

B. Dataflow lineage

C. Capacity monitoring

D. Spark history

Answer: A

Explanation

Stage comparison identifies differences between environments, helping administrators determine what should be deployed.

Exam Tip

For DP-700, one of the most common lifecycle management distinctions is:

Requirement	Solution
Track changes and maintain history	Git Integration
Promote content across environments	Deployment Pipeline
Preserve environment-specific values	Deployment Rules
Validate content before production	Test Stage

When you see questions involving Development, Test, and Production workspaces, the correct answer is often related to Deployment Pipelines. When you see questions involving version history, branching, merging, or rollback, the correct answer is typically Git integration.

Go to the DP-700 Exam Prep Hub main page.

DP-700, Microsoft Certification, Microsoft Fabric June 3, 2026

Implement row-level, column-level, object-level, and folder/file-level access controls (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
   --> Configure security and governance
      --> Implement row-level, column-level, object-level, and folder/file-level access controls

Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Modern data platforms must secure information at multiple layers. While workspace-level and item-level permissions determine who can access Fabric assets, organizations often require much more granular control over the data itself.

For example:

A regional sales manager should only see sales data for their region.
Human Resources staff should see salary information, while other users should not.
Analysts should access specific tables but not highly sensitive tables.
Data engineers should have access to specific folders within OneLake while being restricted from others.

Microsoft Fabric supports multiple layers of security to address these requirements:

Row-Level Security (RLS)
Column-Level Security (CLS)
Object-Level Security (OLS)
Folder/File-Level Security

These controls help organizations implement the Principle of Least Privilege while supporting regulatory compliance, governance requirements, and data protection initiatives.

For the DP-700 exam, understanding the differences between these security mechanisms and when to use each one is extremely important.

Understanding Security Layers

Security in Fabric is often implemented as multiple layers.

			
Workspace Security
        ↓
Item Security
        ↓
Row-Level Security
        ↓
Column-Level Security
        ↓
Object-Level Security
        ↓
Folder/File Security

		

Each layer controls access to increasingly granular portions of data.

Row-Level Security (RLS)

What Is Row-Level Security?

Row-Level Security restricts which rows of data a user can view.

Users access the same table or report but see different subsets of data.

Example table:

Employee	Region	Sales
John	East	100,000
Sarah	West	150,000
Mike	East	120,000

With RLS:

User	Visible Data
East Manager	East rows only
West Manager	West rows only

Why Use RLS?

Organizations commonly use RLS to:

Restrict regional data
Secure department-specific data
Support multi-tenant solutions
Enforce business ownership boundaries

Static RLS

Static RLS uses predefined security roles.

Example:

Region = "East"

Users assigned to the role see only East region data.

Dynamic RLS

Dynamic RLS evaluates the currently logged-in user.

Example:

USERPRINCIPALNAME()

The system automatically determines which rows the user should access.

Dynamic RLS is commonly used in enterprise implementations because it scales better than manually assigning users to roles.

Common DP-700 RLS Scenario

A company has regional managers.

Requirement:

Each manager should only see data for their assigned region.

Solution:

Implement Row-Level Security.

Column-Level Security (CLS)

What Is Column-Level Security?

Column-Level Security restricts access to specific columns while allowing access to the rest of the table.

Example table:

EmployeeID	Name	Salary
1001	John	90,000
1002	Sarah	110,000

With CLS:

HR Users:

EmployeeID	Name	Salary
1001	John	90,000

Non-HR Users:

EmployeeID	Name
1001	John

The Salary column is hidden.

Why Use CLS?

CLS is commonly used to protect:

Salary information
Personally identifiable information (PII)
Social Security numbers
Healthcare data
Financial account information

Benefits of CLS

Protects sensitive fields
Simplifies compliance efforts
Reduces data exposure
Supports privacy regulations

Common DP-700 CLS Scenario

Requirement:

Managers need employee information but must not see salary data.

Solution:

Implement Column-Level Security.

Object-Level Security (OLS)

What Is Object-Level Security?

Object-Level Security controls access to entire database objects.

Examples include:

Tables
Views
Columns
Measures

Rather than filtering data, OLS completely hides objects from users.

Example

Database Objects:

			
Sales Table
Customer Table
Payroll Table

Finance Users:

			
Sales Table
Customer Table
Payroll Table

Sales Users:

			
Sales Table
Customer Table

The Payroll table is completely hidden.

OLS vs CLS

This distinction is frequently tested on the DP-700 exam.

Feature	CLS	OLS
Hides columns	Yes	Yes
Hides tables	No	Yes
Hides measures	No	Yes
Hides entire objects	No	Yes

Why Use OLS?

Organizations use OLS when users should not even know an object exists.

Examples:

Payroll tables
Executive compensation data
Audit tables
Compliance datasets

Common DP-700 OLS Scenario

Requirement:

Analysts should not see the Payroll table at all.

Solution:

Implement Object-Level Security.

Folder-Level Security

What Is Folder-Level Security?

Folder-level security controls access to folders within storage structures such as OneLake.

Example:

			
Finance
│
├── Payroll
├── Budgets
└── Forecasts
Sales
│
├── Regional
└── Territory

		

Finance users may access:

Finance/*

while Sales users may access:

Sales/*

Why Folder-Level Security Matters

Benefits include:

Departmental separation
Data governance
Simplified access management
Better organization

OneLake Considerations

Microsoft Fabric’s OneLake serves as the unified storage layer.

Organizations often structure OneLake data using:

			
Department
    ↓
Project
    ↓
Files

		

Folder-level controls help restrict access appropriately.

File-Level Security

What Is File-Level Security?

File-level security controls access to individual files.

Example:

			
EmployeeData.parquet
Payroll.parquet
Benefits.parquet

A user may be granted access to:

EmployeeData.parquet

while being denied access to:

Payroll.parquet

Use Cases

File-level security is useful when:

Sensitive files exist within shared folders
Regulatory restrictions apply
Individual datasets require additional protection

Folder-Level vs File-Level Security

Folder Security	File Security
Controls entire folder	Controls individual files
Easier to manage	More granular
Less administrative effort	Greater precision

Comparing All Four Security Models

Security Type	Controls
Row-Level Security	Which rows users can see
Column-Level Security	Which columns users can see
Object-Level Security	Which tables, views, measures, or columns exist for users
Folder/File Security	Which storage locations users can access

Security Layer Examples

Example 1: Regional Sales

Requirement:

Users should only see sales for their region.

Solution:

Row-Level Security

Example 2: Salary Protection

Requirement:

Users can see employee records but not salary information.

Solution:

Column-Level Security

Example 3: Payroll Table Protection

Requirement:

Payroll tables should be invisible to most users.

Solution:

Object-Level Security

Example 4: Departmental Data Separation

Requirement:

Finance files should not be accessible by Sales users.

Solution:

Folder-Level Security

Combining Security Controls

Enterprise environments often combine multiple controls.

Example:

			
Workspace Permission
        ↓
Item Permission
        ↓
RLS
        ↓
CLS
        ↓
OLS
        ↓
Folder Security

		

This layered approach provides stronger protection.

Best Practices

Follow Least Privilege

Grant only required access.

Prefer Dynamic RLS

Dynamic RLS scales better than manually maintained security roles.

Use OLS for Highly Sensitive Objects

Hide entire tables when appropriate.

Protect Sensitive Columns

Use CLS for PII and confidential information.

Organize OneLake Carefully

Use logical folder structures to simplify governance.

Audit Security Regularly

Review permissions and security configurations periodically.

DP-700 Exam Focus Areas

You should understand:

✓ Row-Level Security (RLS)

✓ Static versus Dynamic RLS

✓ Column-Level Security (CLS)

✓ Object-Level Security (OLS)

✓ Folder-level access controls

✓ File-level access controls

✓ OneLake security concepts

✓ Security layering

✓ Least-privilege principles

✓ Common security implementation scenarios

Practice Exam Questions

Question 1

A company wants regional managers to see only sales records from their assigned region.

Which security feature should be implemented?

A. File-Level Security

B. Object-Level Security

C. Workspace Security

D. Row-Level Security

Answer: D

Explanation

Row-Level Security filters rows based on user identity or role, allowing managers to view only relevant regional data.

Question 2

A user should be able to view employee records but not salary information.

Which security feature should be used?

A. Row-Level Security

B. Workspace Viewer Role

C. Column-Level Security

D. Deployment Rules

Answer: C

Explanation

Column-Level Security restricts access to specific columns while allowing access to the rest of the table.

Question 3

Which security feature can completely hide a table from users?

A. Row-Level Security

B. Object-Level Security

C. File-Level Security

D. Workspace Security

Answer: B

Explanation

Object-Level Security can hide entire tables, views, columns, and measures.

Question 4

A company wants to prevent users from accessing a specific file within a shared OneLake folder.

Which security mechanism is most appropriate?

A. Row-Level Security

B. Column-Level Security

C. Object-Level Security

D. File-Level Security

Answer: D

Explanation

File-Level Security controls access to individual files.

Question 5

What is a key advantage of Dynamic RLS compared to Static RLS?

A. Faster query execution

B. Automatically adjusts security based on the logged-in user

C. Eliminates the need for semantic models

D. Removes workspace permissions

Answer: B

Explanation

Dynamic RLS uses user identity functions to determine appropriate data access automatically.

Question 6

Which security control is best suited for hiding a Payroll table from analysts?

A. Column-Level Security

B. Folder-Level Security

C. Object-Level Security

D. Row-Level Security

Answer: C

Explanation

OLS completely hides the table from unauthorized users.

Question 7

A company wants Finance users to access the Finance folder but not the Sales folder.

Which security mechanism should be implemented?

A. Folder-Level Security

B. Row-Level Security

C. Column-Level Security

D. Build Permission

Answer: A

Explanation

Folder-Level Security restricts access to specific storage locations.

Question 8

Which statement correctly describes Column-Level Security?

A. It filters rows of data.

B. It hides specific columns while keeping the table accessible.

C. It hides entire tables.

D. It controls workspace membership.

Answer: B

Explanation

CLS allows access to the table while restricting access to designated columns.

Question 9

What is the primary purpose of Object-Level Security?

A. Filter data rows

B. Control notebook execution

C. Hide entire data objects from users

D. Manage workspace permissions

Answer: C

Explanation

OLS controls visibility of tables, views, columns, and measures.

Question 10

Which combination provides the most granular data protection?

A. Workspace Security only

B. Workspace Security and Item Security only

C. Row-Level Security only

D. Layered security using RLS, CLS, OLS, and folder/file controls

Answer: D

Explanation

Layering multiple security mechanisms provides comprehensive protection across multiple levels of the data platform.

Exam Tip

For the DP-700 exam, remember these distinctions:

Requirement	Solution
Restrict specific records	Row-Level Security (RLS)
Hide specific columns	Column-Level Security (CLS)
Hide entire tables, views, or measures	Object-Level Security (OLS)
Restrict storage locations	Folder/File-Level Security

A common exam strategy is to identify what is being protected:

Rows → RLS
Columns → CLS
Objects (tables/views/measures) → OLS
Files/Folders → Folder/File-Level Security

Many DP-700 questions present multiple security options that seem plausible. Focusing on the exact scope of the restriction usually leads directly to the correct answer.

Go to the DP-700 Exam Prep Hub main page.

DP-700, Microsoft Certification, Microsoft Fabric June 3, 2026

Implement item-level access controls (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
   --> Configure security and governance
      --> Implement item-level access controls

Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

While workspace-level permissions provide the first layer of security in Microsoft Fabric, they are often not sufficient for enterprise environments. Organizations frequently need to grant users access to specific assets without providing access to everything within a workspace.

This is where item-level access controls become important.

Item-level access controls allow administrators and data engineers to secure individual Fabric items such as Lakehouses, Warehouses, Semantic Models, Reports, Dashboards, and other assets. These controls provide more granular security than workspace permissions and help organizations implement governance, compliance, and least-privilege access models.

For the DP-700 exam, it is important to understand how item-level permissions work, how they interact with workspace permissions, and when to use item-level controls instead of workspace-level access.

What Are Item-Level Access Controls?

Item-level access controls are permissions assigned directly to specific Fabric assets rather than to an entire workspace.

For example:

			
Workspace
│
├── Lakehouse A
├── Lakehouse B
├── Warehouse A
├── Warehouse B
└── Notebook A

		

A user might need access only to:

Lakehouse A
Warehouse A

without gaining access to the entire workspace.

Item-level permissions make this possible.

Why Item-Level Security Is Important

Many organizations have:

Sensitive financial data
Human resources data
Customer information
Regulatory data
Executive reporting data

Providing broad workspace access could expose data unnecessarily.

Item-level controls allow organizations to:

Restrict sensitive assets
Share specific content
Improve governance
Support compliance requirements
Implement least-privilege security

Workspace Permissions vs Item-Level Permissions

One of the most frequently tested DP-700 topics is understanding the difference between workspace-level and item-level permissions.

Workspace-Level Access	Item-Level Access
Applies to entire workspace	Applies to specific items
Broader permissions	Granular permissions
Easier administration	More precise security
Used for collaboration	Used for controlled sharing

Example

A data engineer may need:

Contributor access to a workspace

while an executive may only need:

Access to a single report

In this scenario, item-level permissions are the preferred solution.

Permission Inheritance

Understanding inheritance is critical for the DP-700 exam.

Workspace permissions often provide access to contained items.

However, item-level permissions can be used to:

Grant access beyond workspace membership
Share specific assets
Restrict access to sensitive content

Exam Tip

Many questions focus on determining whether workspace-level permissions or item-level permissions should be used.

A good rule is:

Access to many assets → Workspace permissions
Access to one or a few assets → Item-level permissions

Item-Level Security for Lakehouses

Lakehouses contain:

Delta tables
Files
Structured data
Unstructured data

Organizations often need to control who can access these assets.

Examples include:

Finance Lakehouse
HR Lakehouse
Customer Analytics Lakehouse

Item-level permissions can limit access to specific Lakehouses without exposing all workspace assets.

Item-Level Security for Warehouses

Fabric Data Warehouses frequently contain business-critical data.

Examples include:

Sales metrics
Financial transactions
Customer information

Warehouse permissions can be assigned independently of workspace permissions.

This helps organizations:

Restrict data access
Support regulatory requirements
Enforce data ownership

Item-Level Security for Reports

Reports are one of the most commonly shared Fabric assets.

A user may need:

Access to a report
No access to development artifacts

Examples:

Executive dashboards
Department reports
Operational reporting

Item-level sharing enables this scenario.

Semantic Model Permissions

Semantic Models serve as the foundation for many reporting solutions.

Permissions can control who may:

View the model
Build reports from the model
Query the model

Common permissions include:

Read access
Build permission

Understanding Build Permission

Build permission is a commonly tested Fabric security topic.

Users with Build permission can:

Create reports from a semantic model
Analyze data using approved tools
Reuse trusted datasets

Without Build permission, users may be able to view reports but not create new reports from the underlying semantic model.

Example

			
Semantic Model
       ↓
Build Permission
       ↓
Create New Reports

		

This distinction frequently appears in exam questions.

Sharing Items

Fabric allows users to share individual assets.

Examples include:

Reports
Semantic Models
Dashboards

Sharing provides targeted access without adding users to the workspace.

Benefits include:

Reduced administrative overhead
Improved security
Easier collaboration

Row-Level Security (RLS)

Although RLS is often discussed separately from item permissions, it is closely related to access control.

Row-Level Security restricts which rows a user can view.

Example:

User	Visible Region
Alice	East
Bob	West

Both users access the same report but see different data.

Important Distinction

Item-level permissions control:

Who can access the item

RLS controls:

What data they can see within the item

This distinction is commonly tested.

Object-Level Security (OLS)

Object-Level Security provides even more granular control.

OLS can restrict access to:

Tables
Columns
Measures

Example:

Finance users:

Salary Column = Visible

General users:

Salary Column = Hidden

Combining Security Layers

Enterprise security often combines:

			
Workspace Security
        ↓
Item Security
        ↓
Row-Level Security
        ↓
Object-Level Security

		

Each layer provides additional protection.

Security Groups and Item Permissions

Best practice is to assign item permissions through Microsoft Entra ID groups rather than individual users.

Example:

			
Finance Executives Group
       ↓
Executive Dashboard Access

Benefits include:

Easier administration
Consistent security
Simplified auditing

Common DP-700 Exam Scenarios

Scenario 1

A CEO needs access to a financial dashboard but should not access the entire workspace.

Solution:

Use item-level sharing on the report.

Scenario 2

Analysts need to create reports from a semantic model.

Solution:

Grant Build permission.

Scenario 3

Regional managers should only see sales data for their region.

Solution:

Implement Row-Level Security.

Scenario 4

Users should not see salary-related columns.

Solution:

Implement Object-Level Security.

Best Practices

Follow Least Privilege

Grant only necessary permissions.

Use Security Groups

Avoid assigning permissions to individual users whenever possible.

Use Build Permission Carefully

Build permission enables report creation and data exploration.

Combine Security Layers

Use:

Workspace permissions
Item permissions
RLS
OLS

where appropriate.

Audit Permissions Regularly

Review access assignments periodically.

Secure Sensitive Assets Separately

Finance, HR, and compliance data should receive additional scrutiny.

DP-700 Exam Focus Areas

You should understand:

✓ Item-level permissions

✓ Workspace vs item-level security

✓ Semantic model permissions

✓ Build permission

✓ Report sharing

✓ Lakehouse permissions

✓ Warehouse permissions

✓ Permission inheritance

✓ Row-Level Security (RLS)

✓ Object-Level Security (OLS)

✓ Security group assignments

✓ Least-privilege principles

Practice Exam Questions

Question 1

A user needs access to a single report but should not have access to the workspace.

What should be used?

A. Workspace Admin role

B. Workspace Member role

C. Item-level sharing

D. Capacity assignment

Answer: C

Explanation

Item-level sharing allows access to a specific report without granting workspace access.

Question 2

What is the primary purpose of item-level access controls?

A. Manage Fabric capacities

B. Secure specific Fabric assets

C. Configure deployment pipelines

D. Manage Spark pools

Answer: B

Explanation

Item-level permissions provide granular security for individual Fabric items.

Question 3

Which permission allows users to create reports from an existing semantic model?

A. Viewer

B. Contributor

C. Read

D. Build

Answer: D

Explanation

Build permission allows users to create new reports and analyses from a semantic model.

Question 4

A user can access a report but should only see rows for their assigned sales region.

Which security feature should be implemented?

A. Item sharing

B. Workspace Viewer role

C. Object-Level Security

D. Row-Level Security

Answer: D

Explanation

Row-Level Security filters data based on user identity or role.

Question 5

What is the primary difference between workspace-level and item-level permissions?

A. Item-level permissions apply to specific assets

B. Workspace permissions only apply to reports

C. Item permissions control Spark resources

D. Workspace permissions cannot be assigned to groups

Answer: A

Explanation

Workspace permissions affect the entire workspace, while item-level permissions affect specific assets.

Question 6

A company wants users to access an executive dashboard without viewing development notebooks.

What should be implemented?

A. Workspace Admin access

B. Workspace Contributor access

C. Item-level access to the dashboard

D. Capacity permissions

Answer: C

Explanation

Item-level sharing allows access to specific assets without exposing the broader workspace.

Question 7

What does Object-Level Security (OLS) control?

A. Workspace membership

B. Data refresh schedules

C. Deployment permissions

D. Access to tables, columns, and measures

Answer: D

Explanation

OLS provides granular security at the database object level.

Question 8

A user belongs to a workspace but should not access a highly sensitive financial warehouse.

What security approach is most appropriate?

A. Use item-level controls on the warehouse

B. Increase capacity

C. Configure deployment rules

D. Create a notebook

Answer: A

Explanation

Item-level permissions provide additional control over access to sensitive assets.

Question 9

Which statement about Build permission is correct?

A. It grants workspace administration rights.

B. It allows users to create reports from a semantic model.

C. It controls deployment pipelines.

D. It replaces Row-Level Security.

Answer: B

Explanation

Build permission enables report creation and data exploration based on a semantic model.

Question 10

What is considered a best practice when assigning item-level permissions?

A. Assign permissions directly to every user

B. Use only Admin roles

C. Use Microsoft Entra ID groups

D. Disable item sharing

Answer: C

Explanation

Group-based permission management improves consistency, scalability, and governance.

Exam Tip

For the DP-700 exam, remember the following hierarchy:

Security Layer	Purpose
Workspace Security	Controls access to the workspace
Item-Level Security	Controls access to specific assets
Row-Level Security (RLS)	Controls which rows users can see
Object-Level Security (OLS)	Controls which tables, columns, or measures users can access

A common exam strategy is to identify whether the question is asking who can access an asset (item permissions) or what data they can see after gaining access (RLS/OLS). This distinction often leads directly to the correct answer.

Go to the DP-700 Exam Prep Hub main page.

Analytics, DP-700, Microsoft Certification, Microsoft Fabric June 3, 2026

Implement workspace-level access controls (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
   --> Configure security and governance
      --> Implement workspace-level access controls

Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Security and governance are foundational components of any enterprise analytics platform. In Microsoft Fabric, workspaces serve as the primary organizational boundary for managing content, collaboration, and permissions. Because workspaces often contain sensitive data assets such as Lakehouses, Warehouses, Data Pipelines, Notebooks, and Reports, controlling who can access and modify these assets is critical.

Workspace-level access controls provide the first layer of security within Fabric. They determine who can view, create, modify, share, and administer workspace content. Properly configured access controls help organizations implement the principle of least privilege, improve governance, reduce security risks, and ensure compliance with organizational policies.

For the DP-700 exam, you should understand workspace roles, permission inheritance, Microsoft Entra ID integration, security best practices, and common access-control scenarios.

Understanding Fabric Workspaces

A workspace is a collaborative environment used to organize and manage Fabric assets.

Examples of assets stored within a workspace include:

Lakehouses
Data Warehouses
Data Pipelines
Dataflows Gen2
Notebooks
Semantic Models
Reports
Eventstreams
Environments

Workspaces serve as the primary security boundary for these resources.

Why Workspace-Level Access Controls Matter

Without proper access controls:

Unauthorized users may access sensitive data.
Critical assets may be modified accidentally.
Governance requirements may not be met.
Production environments may be compromised.

Workspace-level security helps organizations:

Restrict access
Protect sensitive data
Separate responsibilities
Support auditing and compliance
Implement least-privilege security

Microsoft Entra ID Integration

Microsoft Fabric uses Microsoft Entra ID for authentication and identity management.

Users access Fabric using their organizational accounts.

Benefits include:

Centralized identity management
Single sign-on (SSO)
Multi-factor authentication support
Group-based security management
Conditional Access integration

Fabric does not maintain a separate user authentication system.

Workspace Roles

Workspace access is controlled through predefined roles.

The four primary workspace roles are:

Role	Purpose
Admin	Full workspace control
Member	Create, edit, and publish content
Contributor	Create and modify content
Viewer	Read-only access

Understanding these roles is extremely important for the DP-700 exam.

Admin Role

Admins have complete control over the workspace.

Capabilities include:

Manage workspace settings
Add or remove users
Assign roles
Delete workspace content
Configure Git integration
Configure deployment pipelines
Manage permissions

Admins effectively own the workspace.

Use Cases

Platform administrators
Workspace owners
Data engineering leads

Member Role

Members can actively participate in workspace development.

Capabilities include:

Create content
Modify content
Publish content
Collaborate with team members

However, Members do not have all administrative capabilities.

Use Cases

Senior developers
Data engineers
Analytics developers

Contributor Role

Contributors can create and modify content but have fewer management capabilities than Members.

Capabilities include:

Create notebooks
Create pipelines
Modify assets
Build solutions

Contributors generally focus on development activities rather than workspace administration.

Use Cases

Developers
Data engineers
ETL specialists

Viewer Role

Viewers have read-only access.

Capabilities include:

View reports
View data assets
Review content

Restrictions include:

Cannot modify content
Cannot create content
Cannot administer the workspace

Use Cases

Business users
Auditors
Stakeholders

Workspace Permission Assignment

Permissions can be assigned to:

Individual users
Security groups
Microsoft Entra ID groups

Best practice is to assign permissions through groups whenever possible.

Example:

			
Finance-DataEngineers → Contributor
Finance-Developers → Member
Finance-Managers → Viewer

Benefits include:

Easier administration
Reduced maintenance
Improved consistency

Principle of Least Privilege

One of the most important security concepts for DP-700 is the Principle of Least Privilege.

This principle states:

Users should receive only the permissions necessary to perform their job functions.

Example:

User Type	Recommended Role
Report Consumer	Viewer
Data Engineer	Contributor
Team Lead	Member
Workspace Owner	Admin

Over-permissioning increases security risks.

Permission Inheritance

Workspace-level permissions often provide access to items contained within the workspace.

Examples include:

Lakehouses
Warehouses
Notebooks
Dataflows

A user with workspace access generally gains access to supported content based on their assigned role.

However, some Fabric items support additional item-level permissions that can supplement workspace-level controls.

Exam Tip

Workspace permissions and item-level permissions are related but not identical.

Many exam questions test your understanding of this distinction.

Workspace Access and OneLake

OneLake security is closely tied to Fabric permissions.

When users access:

Lakehouses
Warehouse data
OneLake files

their permissions are generally governed through Fabric security controls.

This means workspace permissions play a significant role in determining data accessibility.

Separating Development, Test, and Production Access

Organizations commonly implement separate workspaces for:

			
Development
      ↓
Test
      ↓
Production

		

Different access controls are applied to each environment.

Example:

Environment	Typical Permissions
Development	Contributors and Members
Test	Limited Contributors
Production	Mostly Viewers

This reduces the risk of unauthorized production changes.

Workspace Security Best Practices

Use Security Groups

Prefer:

Sales-DataEngineers

instead of assigning permissions to individual users.

Minimize Admins

Only a small number of users should have Admin privileges.

Separate Production Access

Production workspaces should have stricter permissions.

Review Permissions Regularly

Conduct periodic audits of workspace access.

Follow Least Privilege

Assign the lowest role necessary.

Use Dedicated Service Principals

Automated processes should use service principals rather than personal accounts.

Common Security Scenarios

Scenario 1

A business analyst needs to view reports but should not modify content.

Solution:

Assign the Viewer role.

Scenario 2

A data engineer needs to build pipelines and notebooks but should not manage workspace permissions.

Solution:

Assign the Contributor role.

Scenario 3

A workspace owner needs to manage users and configure workspace settings.

Solution:

Assign the Admin role.

Scenario 4

A team lead needs to create and manage content while collaborating with developers.

Solution:

Assign the Member role.

Auditing and Governance

Workspace access controls support governance by enabling:

Access reviews
Compliance reporting
Security audits
Change tracking

Administrators should periodically verify:

User memberships
Group assignments
Admin privileges
Production access

These activities help maintain a secure Fabric environment.

DP-700 Exam Focus Areas

You should understand:

✓ Workspace roles

✓ Admin, Member, Contributor, and Viewer permissions

✓ Microsoft Entra ID integration

✓ Security group assignments

✓ Least-privilege principles

✓ Workspace permission inheritance

✓ Item-level versus workspace-level security

✓ Production environment security

✓ Service principal usage

✓ Governance and auditing practices

Practice Exam Questions

Question 1

Which workspace role provides full control over workspace settings and permissions?

A. Admin

B. Member

C. Contributor

D. Viewer

Answer: A

Explanation

Admins have complete control over workspace management, including permissions, settings, and content administration.

Question 2

A user needs read-only access to reports and data assets in a workspace.

Which role should be assigned?

A. Admin

B. Member

C. Contributor

D. Viewer

Answer: D

Explanation

The Viewer role allows users to access and view content without modifying it.

Question 3

Which Microsoft service provides identity and authentication for Fabric users?

A. Azure Data Lake Storage

B. Microsoft Entra ID

C. OneLake

D. Fabric Capacity

Answer: B

Explanation

Microsoft Entra ID provides authentication, identity management, and access control for Fabric users.

Question 4

A data engineer needs to create notebooks and pipelines but should not manage workspace permissions.

Which role is most appropriate?

A. Viewer

B. Admin

C. Contributor

D. Workspace Owner

Answer: C

Explanation

Contributors can create and modify content without having full administrative privileges.

Question 5

What is the primary goal of the Principle of Least Privilege?

A. Maximize workspace access

B. Reduce storage costs

C. Improve Spark performance

D. Grant only the permissions required to perform a job

Answer: D

Explanation

Least privilege reduces security risks by ensuring users receive only the permissions necessary for their responsibilities.

Question 6

Which approach is generally recommended for assigning workspace permissions?

A. Assign permissions directly to every user

B. Use Microsoft Entra ID security groups

C. Give all users Member access

D. Assign Admin access broadly

Answer: B

Explanation

Group-based permission management simplifies administration and improves consistency.

Question 7

A team lead needs to create content, collaborate with developers, and participate in solution management but does not require full administrative control.

Which role is most appropriate?

A. Viewer

B. Contributor

C. Member

D. Admin

Answer: C

Explanation

Members can actively manage and collaborate on workspace content without having full administrative authority.

Question 8

Why should organizations limit the number of workspace Admins?

A. To reduce Spark resource consumption

B. To simplify notebook development

C. To improve deployment speed

D. To reduce security risk and administrative exposure

Answer: D

Explanation

Admin roles have extensive privileges and should be assigned only when necessary.

Question 9

A company wants automated deployment processes that are not dependent on employee accounts.

What should be used?

A. Viewer accounts

B. Personal accounts

C. Service principals

D. Shared passwords

Answer: C

Explanation

Service principals provide stable, secure identities for automation and deployment activities.

Question 10

What is the primary benefit of separating Development, Test, and Production workspaces?

A. Increased storage capacity

B. Improved security and change control

C. Reduced OneLake storage usage

D. Faster notebook execution

Answer: B

Explanation

Environment separation helps prevent accidental production changes and supports proper testing and governance.

Exam Tip

For the DP-700 exam, many security questions can be solved by understanding the differences between the four workspace roles:

Role	Key Capability
Admin	Full control and permissions management
Member	Create, manage, and collaborate on content
Contributor	Create and modify content
Viewer	Read-only access

When evaluating scenarios, choose the lowest role that satisfies the requirement. Microsoft frequently tests the Principle of Least Privilege, making it one of the most important security concepts to master for the exam.

Go to the DP-700 Exam Prep Hub main page.

DP-700, Microsoft Certification, Microsoft Fabric June 3, 2026

Implement database projects (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
   --> Implement lifecycle management in Fabric
      --> Implement database projects

Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

As organizations adopt DevOps and DataOps practices, managing database changes through source control and automated deployments has become a critical requirement. Traditionally, database development was performed directly against production systems, often leading to inconsistent environments, deployment risks, and limited change tracking.

Database projects address these challenges by allowing database objects and schema definitions to be treated as code. This approach enables version control, collaboration, automated testing, continuous integration (CI), and continuous deployment (CD).

In Microsoft Fabric, database projects are particularly important when working with Fabric Data Warehouses. Database projects allow teams to manage warehouse schemas using modern software development practices and integrate them into broader lifecycle management processes.

For the DP-700 exam, you should understand the purpose of database projects, how they support DevOps workflows, their relationship to source control and deployment pipelines, and how they are used to manage Fabric Warehouse schemas.

What Is a Database Project?

A database project is a collection of files that define database objects and schema structures as source code.

Instead of creating objects directly within a database, developers define them in project files.

Examples include:

Tables
Views
Stored procedures
Functions
Security objects
Schemas
Constraints

The database project becomes the authoritative source for database definitions.

Why Database Projects Matter

Without database projects:

Schema changes may be undocumented.
Developers may overwrite each other’s work.
Production environments can drift from development environments.
Rollbacks become difficult.

Database projects provide:

Version control
Repeatable deployments
Change tracking
Environment consistency
Team collaboration

These capabilities align with modern DataOps and DevOps practices.

Database-as-Code

Database projects support the concept of Database-as-Code.

Database-as-Code means:

Database objects are stored as code files.
Changes are tracked through source control.
Deployments are automated.
Changes can be reviewed before implementation.

Instead of manually executing SQL scripts against production systems, organizations deploy controlled changes from source-controlled projects.

Database Projects and Fabric Warehouses

In Microsoft Fabric, database projects are primarily associated with Data Warehouses.

Fabric Warehouse objects such as:

Tables
Views
Stored procedures
Security definitions

can be managed through database projects.

This allows warehouse development to follow the same lifecycle management practices used for application development.

Components of a Database Project

A database project typically contains:

Schema Definitions

Definitions of database structures.

Examples:

			
CREATE TABLE Sales
(
    SalesID INT,
    Amount DECIMAL(18,2)
);

		

Views

Reusable query definitions.

Example:

			
CREATE VIEW vwSalesSummary
AS
SELECT SalesID, Amount
FROM Sales;

Stored Procedures

Reusable business logic.

Example:

			
CREATE PROCEDURE uspLoadSales
AS
BEGIN
    -- ETL logic
END;

		

Security Objects

Objects such as:

Users
Roles
Permissions

can also be defined and managed.

Database Projects and Source Control

One of the primary benefits of database projects is Git integration.

Database project files can be stored in repositories such as:

Azure DevOps
GitHub

Benefits include:

Change tracking
Auditability
Collaboration
Rollback capability

For DP-700, understand that database projects are often managed using the same source control processes as notebooks, pipelines, and other Fabric assets.

Version Control Workflow

A typical workflow looks like:

			
Developer
    ↓
Database Project
    ↓
Git Repository
    ↓
Validation
    ↓
Deployment

		

Benefits include:

Controlled releases
Code review processes
Consistent environments

Branching Strategies

Database projects commonly use standard Git branching practices.

Main Branch

Contains production-ready code.

main

Development Branch

Contains active development work.

			
main
 │
 └── develop

Feature Branches

Used for individual enhancements.

			
main
 │
 ├── feature/new-customer-table
 ├── feature/security-update
 └── feature-reporting-view

		

Feature branches reduce conflicts and improve collaboration.

Database Project Deployment

After changes are approved, they must be deployed.

The deployment process typically:

Compares source and target schemas.
Identifies differences.
Generates deployment actions.
Applies approved changes.

This process reduces manual effort and deployment errors.

Schema Comparison

Schema comparison is a key database project capability.

It identifies differences between:

Development and test environments
Test and production environments
Project definitions and deployed databases

Examples:

Object	Development	Production
Sales Table	Exists	Missing
Customer View	Updated	Old Version

Schema comparison helps maintain consistency across environments.

Database Projects and CI/CD

Database projects are frequently integrated into CI/CD pipelines.

CI/CD stands for:

Continuous Integration
Continuous Deployment

Typical process:

			
Developer Changes
        ↓
Git Commit
        ↓
Build Validation
        ↓
Testing
        ↓
Deployment Pipeline
        ↓
Production

		

Benefits:

Faster releases
Reduced risk
Increased automation
Improved reliability

Database Projects and Fabric Deployment Pipelines

Deployment pipelines complement database projects.

Database projects manage:

Database definitions
Source code
Schema changes

Deployment pipelines manage:

Promotion between environments
Release processes

Typical environment flow:

			
Development
      ↓
Test
      ↓
Production

		

This separation of responsibilities is important for the exam.

Managing Warehouse Objects Through Projects

Database projects help manage common warehouse objects.

Tables

Examples:

Fact tables
Dimension tables

Views

Used for:

Business reporting
Data abstraction
Security

Stored Procedures

Used for:

ETL operations
Data loading
Data quality checks

Security Definitions

Used for:

Role management
Permission assignments

Benefits of Database Projects

Improved Collaboration

Multiple developers can work simultaneously.

Repeatable Deployments

Deployments become consistent across environments.

Auditability

All changes are tracked through source control.

Rollback Capability

Previous versions can be restored.

Reduced Human Error

Automation reduces deployment mistakes.

Common DP-700 Exam Scenarios

Scenario 1

Multiple developers are modifying warehouse schemas.

Requirement:

Track changes and prevent overwriting.

Solution:

Implement a database project with Git integration.

Scenario 2

A company needs consistent schemas across development, test, and production environments.

Solution:

Use database projects and deployment pipelines.

Scenario 3

An accidental schema change is deployed.

Requirement:

Restore a previous version.

Solution:

Rollback using source control history.

Best Practices

Store All Database Objects in Source Control

Treat schemas as code.

Use Feature Branches

Avoid direct modifications to production branches.

Perform Code Reviews

Review schema changes before deployment.

Automate Deployments

Use deployment pipelines whenever possible.

Maintain Environment Consistency

Use schema comparison tools to identify drift.

Document Changes

Use meaningful commit messages.

Example:

Added customer dimension surrogate key support

DP-700 Exam Focus Areas

You should understand:

✓ Purpose of database projects

✓ Database-as-Code concepts

✓ Source control integration

✓ Git repositories

✓ Schema comparison

✓ Deployment processes

✓ CI/CD integration

✓ Branching strategies

✓ Warehouse schema management

✓ Rollback and versioning

✓ Relationship to deployment pipelines

Practice Exam Questions

Question 1

What is the primary purpose of a database project?

A. Execute Spark workloads

B. Store warehouse data files

C. Manage database schemas as source-controlled code

D. Monitor Fabric capacities

Answer: C

Explanation

Database projects allow database objects and schemas to be defined, tracked, and managed as code. This supports version control, collaboration, and automated deployments.

Question 2

Which Fabric item is most commonly managed through a database project?

A. Data Warehouse

B. Eventstream

C. Notebook

D. Lakehouse Shortcut

Answer: A

Explanation

Database projects are primarily associated with Fabric Data Warehouses and their schema objects.

Question 3

What is a key benefit of storing database projects in source control?

A. Increased storage capacity

B. Change tracking and version history

C. Faster SQL execution

D. Reduced OneLake usage

Answer: B

Explanation

Source control provides auditability, rollback capabilities, collaboration support, and historical tracking of schema changes.

Question 4

Which Git branching strategy allows developers to work on isolated enhancements?

A. Feature branch

B. Production branch

C. Release branch

D. Main-only development

Answer: A

Explanation

Feature branches enable developers to work independently and merge approved changes later.

Question 5

What is the purpose of schema comparison in a database project?

A. Increase warehouse performance

B. Compare Fabric capacities

C. Identify differences between database environments

D. Create deployment pipelines

Answer: C

Explanation

Schema comparison identifies discrepancies between source and target databases, helping maintain consistency.

Question 6

Which process typically occurs after a developer commits database project changes?

A. Data replication

B. Build validation and testing

C. Spark optimization

D. Capacity scaling

Answer: B

Explanation

CI/CD workflows generally include validation and testing before deployment occurs.

Question 7

Which database object can be managed within a database project?

A. Table

B. View

C. Stored Procedure

D. All of the above

Answer: D

Explanation

Database projects can manage tables, views, stored procedures, functions, schemas, and security objects.

Question 8

What is the primary role of deployment pipelines when used with database projects?

A. Store source code

B. Manage Git repositories

C. Promote changes between environments

D. Execute SQL queries

Answer: C

Explanation

Deployment pipelines move validated changes through development, test, and production environments.

Question 9

A team needs the ability to restore a previous schema version after a failed deployment.

Which capability should they use?

A. Version control rollback

B. Capacity monitoring

C. Spark session recovery

D. Dataflow refresh

Answer: A

Explanation

Source control systems maintain historical versions, enabling rollback to previous states when necessary.

Question 10

What does the term “Database-as-Code” refer to?

A. Storing data in code files

B. Executing SQL through notebooks

C. Converting SQL to Spark code

D. Managing database objects through source-controlled definitions

Answer: D

Explanation

Database-as-Code treats database definitions as managed source code artifacts that can be versioned, reviewed, tested, and deployed through automated processes.

Exam Tip

For the DP-700 exam, remember that database projects are fundamentally about applying software engineering practices to database development. Questions often focus on source control, schema management, CI/CD, deployment consistency, and collaboration. If a scenario involves managing warehouse schemas across multiple environments, tracking changes, enabling rollbacks, or automating deployments, a database project is often a key part of the solution.

Go to the DP-700 Exam Prep Hub main page.