
Choose Between a Lakehouse, Warehouse, or Eventhouse

This post is part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub, and this topic falls under these sections:
Prepare data
--> Get data
--> Choose Between a Lakehouse, Warehouse, or Eventhouse

One of the most important architectural decisions a Microsoft Fabric Analytics Engineer must make is selecting the right analytical store for a given workload. For the DP-600 exam, this topic tests your ability to choose between a Lakehouse, Warehouse, or Eventhouse based on data type, query patterns, latency requirements, and user personas.

Overview of the Three Options

Microsoft Fabric provides three primary analytics storage and query experiences:

Option      | Primary Purpose
Lakehouse   | Flexible analytics on files and tables using Spark and SQL
Warehouse   | Enterprise-grade SQL analytics and BI reporting
Eventhouse  | Real-time and near-real-time analytics on streaming data

Understanding why and when to use each is critical for DP-600 success.

Lakehouse

What Is a Lakehouse?

A Lakehouse combines the flexibility of a data lake with the structure of a data warehouse. Data is stored in Delta Lake format in OneLake and can be accessed using both Spark and SQL.

When to Choose a Lakehouse

Choose a Lakehouse when you need:

  • Flexible schema (schema-on-read or schema-on-write)
  • Support for data engineering and data science
  • Access to raw, curated, and enriched data
  • Spark-based transformations and notebooks
  • Mixed workloads (batch analytics, exploration, ML)

Key Characteristics

  • Supports files and tables
  • Uses Spark SQL and T-SQL endpoints
  • Ideal for ELT and advanced transformations
  • Easy integration with notebooks and pipelines

Exam signal words: flexible, raw data, Spark, data science, experimentation
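Because a Lakehouse exposes its Delta tables to both Spark and SQL, a short notebook snippet helps make this duality concrete. The sketch below is a minimal, hypothetical example: it assumes a Fabric notebook attached to a Lakehouse that contains a sales table with Region and SalesAmount columns, and it uses the built-in spark session that Fabric notebooks provide.

# Minimal sketch (assumptions noted above): read a Lakehouse Delta table with PySpark,
# then query the same table with Spark SQL.
df = spark.read.table("sales")
df.printSchema()

summary = spark.sql("""
    SELECT Region, SUM(SalesAmount) AS TotalSales
    FROM sales
    GROUP BY Region
""")
summary.show()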

Warehouse

What Is a Warehouse?

A Warehouse is a fully managed, SQL-first analytical store optimized for business intelligence and reporting. It enforces schema-on-write and provides a traditional relational experience.

When to Choose a Warehouse

Choose a Warehouse when you need:

  • Strong SQL-based analytics
  • High-performance reporting
  • Well-defined schemas and governance
  • Centralized enterprise BI
  • Compatibility with Power BI Import or DirectQuery

Key Characteristics

  • T-SQL only (no Spark)
  • Optimized for structured data
  • Best for star/snowflake schemas
  • Familiar experience for SQL developers

Exam signal words: enterprise BI, reporting, structured, governed, SQL-first

Eventhouse

What Is an Eventhouse?

An Eventhouse is optimized for real-time and streaming analytics, built on KQL (Kusto Query Language). It is designed to handle high-velocity event data.

When to Choose an Eventhouse

Choose an Eventhouse when you need:

  • Near-real-time or real-time analytics
  • Streaming data ingestion
  • Operational or telemetry analytics
  • Event-based dashboards and alerts

Key Characteristics

  • Uses KQL for querying
  • Integrates with Eventstreams
  • Handles massive ingestion rates
  • Optimized for time-series data

Exam signal words: streaming, telemetry, IoT, real-time, events

Choosing the Right Option (Exam-Critical)

The DP-600 exam often presents scenarios where multiple options could work, but only one best fits the requirements.

Decision Matrix

Requirement                    | Best Choice
Raw + curated data             | Lakehouse
Complex Spark transformations  | Lakehouse
Enterprise BI reporting        | Warehouse
Strong governance and schemas  | Warehouse
Streaming or telemetry data    | Eventhouse
Near-real-time dashboards      | Eventhouse
SQL-only users                 | Warehouse
Data science workloads         | Lakehouse

Common Exam Scenarios

You may be asked to:

  • Choose a storage type for a new analytics solution
  • Migrate from traditional systems to Fabric
  • Support both engineers and analysts
  • Enable real-time monitoring
  • Balance governance with flexibility

Always identify:

  1. Data type (batch vs streaming)
  2. Latency requirements
  3. User personas
  4. Query language
  5. Governance needs

Best Practices to Remember

  • Use Lakehouse as a flexible foundation for analytics
  • Use Warehouse for polished, governed BI solutions
  • Use Eventhouse for real-time operational insights
  • Avoid forcing one option to handle all workloads
  • Let business requirements—not familiarity—drive the choice

Key Takeaway
For the DP-600 exam, choosing between a Lakehouse, Warehouse, or Eventhouse is about aligning data characteristics and access patterns with the right Fabric experience. Lakehouses provide flexibility, Warehouses deliver enterprise BI performance, and Eventhouses enable real-time analytics. The correct answer is almost always the one that best fits the scenario constraints.

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect) — not just which one
  • Look for and understand the usage scenario of keywords in exam questions, with these possible associations:
    • Spark, raw, experimentation → Lakehouse
    • Enterprise BI, governed, SQL reporting → Warehouse
    • Streaming, telemetry, real-time → Eventhouse
  • Expect scenario-based questions rather than direct definitions

1. Which Microsoft Fabric component is BEST suited for flexible analytics on both files and tables using Spark and SQL?

A. Warehouse
B. Eventhouse
C. Lakehouse
D. Semantic model

Correct Answer: C

Explanation:
A Lakehouse stores data in Delta format in OneLake and supports both Spark and SQL, making it ideal for flexible analytics across files and tables.

2. A team of data scientists needs to experiment with raw and curated data using notebooks. Which option should they choose?

A. Warehouse
B. Eventhouse
C. Semantic model
D. Lakehouse

Correct Answer: D

Explanation:
Lakehouses are designed for data engineering and data science workloads, offering Spark-based notebooks and flexible schema handling.

3. Which option is MOST appropriate for enterprise BI reporting with well-defined schemas and strong governance?

A. Lakehouse
B. Warehouse
C. Eventhouse
D. OneLake

Correct Answer: B

Explanation:
Warehouses are SQL-first, schema-on-write systems optimized for structured data, governance, and high-performance BI reporting.

4. A solution must support near-real-time analytics on streaming IoT telemetry data. Which Fabric component should be used?

A. Lakehouse
B. Warehouse
C. Eventhouse
D. Dataflow Gen2

Correct Answer: C

Explanation:
Eventhouses are optimized for high-velocity streaming data and real-time analytics using KQL.

5. Which query language is primarily used to analyze data in an Eventhouse?

A. T-SQL
B. Spark SQL
C. DAX
D. KQL

Correct Answer: D

Explanation:
Eventhouses are built on KQL (Kusto Query Language), which is optimized for querying event and time-series data.

6. A business analytics team requires fast dashboard performance and is familiar only with SQL. Which option best meets this requirement?

A. Lakehouse
B. Warehouse
C. Eventhouse
D. Spark notebook

Correct Answer: B

Explanation:
Warehouses provide a traditional SQL experience optimized for BI dashboards and reporting performance.

7. Which characteristic BEST distinguishes a Lakehouse from a Warehouse?

A. Lakehouses support Power BI
B. Warehouses store data in OneLake
C. Lakehouses support Spark-based processing
D. Warehouses cannot be governed

Correct Answer: C

Explanation:
Lakehouses uniquely support Spark-based processing, enabling advanced transformations and data science workloads.

8. A solution must store structured batch data and unstructured files in the same analytical store. Which option should be selected?

A. Warehouse
B. Eventhouse
C. Semantic model
D. Lakehouse

Correct Answer: D

Explanation:
Lakehouses support both structured tables and unstructured or semi-structured files within the same environment.

9. Which scenario MOST strongly indicates the need for an Eventhouse?

A. Monthly financial reporting
B. Slowly changing dimension modeling
C. Real-time operational monitoring
D. Ad hoc SQL analysis

Correct Answer: C

Explanation:
Eventhouses are designed for real-time analytics on streaming data, making them ideal for operational monitoring scenarios.

10. When choosing between a Lakehouse, Warehouse, or Eventhouse on the DP-600 exam, which factor is MOST important?

A. Personal familiarity with the tool
B. The default Fabric option
C. Data characteristics and latency requirements
D. Workspace size

Correct Answer: C

Explanation:
DP-600 emphasizes selecting the correct component based on data type (batch vs streaming), latency needs, user personas, and governance—not personal preference.

Ingest or Access Data as Needed

This post is part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub, and this topic falls under these sections:
Prepare data
--> Get data
--> Ingest or access data as needed

A core responsibility of a Microsoft Fabric Analytics Engineer is deciding how data should be brought into Fabric—or whether it should be brought in at all. For the DP-600 exam, this topic focuses on selecting the right ingestion or access pattern based on performance, freshness, cost, and governance requirements.

Ingest vs. Access: Key Concept

Before choosing a tool or method, understand the distinction:

  • Ingest data: Physically copy data into Fabric-managed storage (OneLake)
  • Access data: Query or reference data where it already lives, without copying

The exam frequently tests your ability to choose the most appropriate option—not just a working one.

Common Data Ingestion Methods in Microsoft Fabric

1. Dataflows Gen2

Best for:

  • Low-code ingestion and transformation
  • Reusable ingestion logic
  • Business-friendly data preparation

Key characteristics:

  • Uses Power Query Online
  • Supports scheduled refresh
  • Stores results in OneLake (Lakehouse or Warehouse)
  • Ideal for centralized, governed ingestion

Exam tip:
Use Dataflows Gen2 when reuse, transformation, and governance are priorities.

2. Data Pipelines (Copy Activity)

Best for:

  • High-volume or frequent ingestion
  • Orchestration across multiple sources
  • ELT-style workflows

Key characteristics:

  • Supports many source and sink types
  • Enables scheduling, dependencies, and retries
  • Minimal transformation (primarily copy)

Exam tip:
Choose pipelines when performance and orchestration matter more than transformation.

3. Notebooks (Spark)

Best for:

  • Complex transformations
  • Data science or advanced engineering
  • Custom ingestion logic

Key characteristics:

  • Full control using Spark (PySpark, Scala, SQL)
  • Suitable for large-scale processing
  • Writes directly to OneLake

Exam tip:
Notebooks are powerful but require engineering skills—don’t choose them for simple ingestion scenarios.
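To make the notebook option concrete, here is a minimal, hedged sketch of Spark-based ingestion. It assumes a Fabric notebook attached to a Lakehouse, a hypothetical CSV file at Files/raw/orders.csv, and an OrderId column; adjust the names to your environment.

# Hypothetical ingestion sketch: land a raw CSV as a managed Delta table.
raw = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("Files/raw/orders.csv")      # relative path into the attached Lakehouse
)

# Light cleanup before persisting
cleaned = raw.dropDuplicates().filter("OrderId IS NOT NULL")

# Write as a Delta table in the Lakehouse (overwrite keeps the example simple)
cleaned.write.mode("overwrite").format("delta").saveAsTable("orders")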

Accessing Data Without Ingesting

1. OneLake Shortcuts

Best for:

  • Avoiding data duplication
  • Reusing data across workspaces
  • Accessing external storage

Key characteristics:

  • Logical reference only (no copy)
  • Supports ADLS Gen2 and Amazon S3
  • Appears native in Lakehouse tables or files

Exam tip:
Shortcuts are often the best answer when the question mentions avoiding duplication or reducing storage cost.

2. DirectQuery

Best for:

  • Near-real-time data access
  • Large datasets that cannot be imported
  • Centralized source-of-truth systems

Key characteristics:

  • Queries run against the source system
  • Performance depends on source
  • Limited modeling flexibility compared to Import

Exam tip:
Expect trade-off questions involving DirectQuery vs. Import.

3. Real-Time Access (Eventstreams / KQL)

Best for:

  • Streaming and telemetry data
  • Operational and real-time analytics

Key characteristics:

  • Event-driven ingestion
  • Supports near-real-time dashboards
  • Often discovered via Real-Time hub

Exam tip:
Use real-time ingestion when freshness is measured in seconds, not hours.

Choosing the Right Approach (Exam-Critical)

You should be able to decide based on these factors:

Requirement               | Best Option
Reusable ingestion logic  | Dataflows Gen2
High-volume copy          | Data pipelines
Complex transformations   | Notebooks
Avoid duplication         | OneLake shortcuts
Near real-time reporting  | DirectQuery / Eventstreams
Governance and trust      | Ingestion + endorsement

Governance and Security Considerations

  • Ingested data can inherit sensitivity labels
  • Access-based methods rely on source permissions
  • Workspace roles determine who can ingest or access data
  • Endorsed datasets should be preferred for reuse

DP-600 often frames ingestion questions within a governance context.

Common Exam Scenarios

You may be asked to:

  • Choose between ingesting data or accessing it directly
  • Identify when shortcuts are preferable to ingestion
  • Select the right tool for a specific ingestion pattern
  • Balance data freshness vs. performance
  • Reduce duplication across workspaces

Best Practices to Remember

  • Ingest when performance and modeling flexibility are required
  • Access when freshness, cost, or duplication is a concern
  • Centralize ingestion logic for reuse
  • Prefer Fabric-native patterns over external tools
  • Let business requirements drive architectural decisions

Key Takeaway
For the DP-600 exam, “Ingest or access data as needed” is about making intentional, informed choices. Microsoft Fabric provides multiple ways to bring data into analytics solutions, and the correct approach depends on scale, freshness, reuse, governance, and cost. Understanding why one method is better than another is far more important than memorizing features.

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect) — not just which one
  • Look for and understand the usage scenario of keywords in exam questions (for example, low code/no code, large dataset, high-volume data, reuse, complex transformations)
  • Expect scenario-based questions rather than direct definitions

Also, keep in mind that …

  • DP-600 questions often include multiple valid options, but only one that best aligns with the scenario’s constraints. Always identify and consider factors such as:
    • Data volume
    • Freshness requirements
    • Reuse and duplication concerns
    • Transformation complexity

1. What is the primary difference between ingesting data and accessing data in Microsoft Fabric?

A. Ingested data cannot be secured
B. Accessed data is always slower
C. Ingesting copies data into OneLake, while accessing queries data in place
D. Accessed data requires a gateway

Correct Answer: C

Explanation:
Ingestion physically copies data into Fabric-managed storage (OneLake), while access-based approaches query or reference data where it already exists.

2. Which option is BEST when the goal is to avoid duplicating large datasets across multiple workspaces?

A. Import mode
B. Dataflows Gen2
C. OneLake shortcuts
D. Notebooks

Correct Answer: C

Explanation:
OneLake shortcuts allow data to be referenced without copying it, making them ideal for reuse and cost control.

3. A team needs reusable, low-code ingestion logic with scheduled refresh. Which Fabric feature should they use?

A. Spark notebooks
B. Data pipelines
C. Dataflows Gen2
D. DirectQuery

Correct Answer: C

Explanation:
Dataflows Gen2 provide Power Query–based ingestion with refresh scheduling and reuse across Fabric items.

4. Which ingestion method is MOST appropriate for complex transformations requiring custom logic?

A. Dataflows Gen2
B. Copy activity in pipelines
C. OneLake shortcuts
D. Spark notebooks

Correct Answer: D

Explanation:
Spark notebooks offer full control over transformation logic and are suited for complex, large-scale processing.

5. When should DirectQuery be preferred over Import mode?

A. When the dataset is small
B. When data freshness is critical
C. When transformations are complex
D. When performance must be maximized

Correct Answer: B

Explanation:
DirectQuery is preferred when near-real-time access to data is required, even though performance depends on the source system.

6. Which Fabric component is BEST suited for orchestrating high-volume data ingestion with dependencies and retries?

A. Dataflows Gen2
B. Data pipelines
C. Semantic models
D. Power BI Desktop

Correct Answer: B

Explanation:
Data pipelines are designed for orchestration, handling large volumes of data, scheduling, and dependency management.

7. A dataset is queried infrequently but must support advanced modeling features. Which approach is most appropriate?

A. DirectQuery
B. Access via shortcut
C. Import into OneLake
D. Eventstream ingestion

Correct Answer: C

Explanation:
Import mode supports full modeling capabilities and high query performance, making it suitable even for infrequently accessed data.

8. Which scenario best fits the use of real-time ingestion methods such as Eventstreams or KQL databases?

A. Monthly financial reporting
B. Static reference data
C. IoT telemetry and operational monitoring
D. Slowly changing dimensions

Correct Answer: C

Explanation:
Real-time ingestion is designed for continuous, event-driven data such as IoT telemetry and operational metrics.

9. Why might ingesting data be preferred over accessing it directly?

A. It always reduces storage costs
B. It eliminates the need for security
C. It improves performance and modeling flexibility
D. It avoids data refresh

Correct Answer: C

Explanation:
Ingesting data into OneLake enables faster query performance and full support for modeling features.

10. Which factor is MOST important when deciding between ingesting data and accessing it?

A. The color of the dashboard
B. The number of reports
C. Business requirements such as freshness, scale, and governance
D. The Fabric region

Correct Answer: C

Explanation:
The decision to ingest or access data should be driven by business needs, including performance, freshness, cost, and governance—not technical convenience alone.

Discover Data by Using OneLake Catalog and Real-Time Hub

This post is part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub, and this topic falls under these sections:
Prepare data
--> Get data
--> Discover data by using OneLake catalog and Real-Time hub

Discovering existing data assets efficiently is a critical capability for a Microsoft Fabric Analytics Engineer. For the DP-600 exam, this topic emphasizes how to find, understand, and evaluate data sources using Fabric’s built-in discovery experiences: OneLake catalog and Real-Time hub.

Purpose of Data Discovery in Microsoft Fabric

In large Fabric environments, data already exists across:

  • Lakehouses
  • Warehouses
  • Semantic models
  • Streaming and event-based sources

The goal of data discovery is to:

  • Avoid duplicate ingestion
  • Promote reuse of trusted data
  • Understand data ownership, sensitivity, and freshness
  • Accelerate analytics development

OneLake Catalog

What Is the OneLake Catalog?

The OneLake catalog is a centralized metadata and discovery experience that allows users to browse and search data assets stored in OneLake, Fabric’s unified data lake.

It provides visibility into:

  • Lakehouses and Warehouses
  • Tables, views, and files
  • Shortcuts to external data
  • Endorsement and sensitivity metadata

Key Capabilities of the OneLake Catalog

For the exam, you should understand that the OneLake catalog enables users to:

  • Search and filter data assets across workspaces
  • View schema details (columns, data types)
  • Identify endorsed (Certified or Promoted) assets
  • See sensitivity labels applied to data
  • Discover data ownership and location
  • Reuse existing data rather than re-ingesting it

This supports both governance and efficiency.

Endorsement and Trust Signals

Within the OneLake catalog, users can quickly identify:

  • Certified items (approved and governed)
  • Promoted items (recommended but not formally certified)

These trust signals are important in exam scenarios that ask how to guide users toward reliable data sources.

Shortcuts and External Data

The catalog also exposes OneLake shortcuts, which allow data from:

  • Azure Data Lake Storage Gen2
  • Amazon S3
  • Other Fabric workspaces

to appear as native OneLake data without duplication. This is a key discovery mechanism tested in DP-600.

Real-Time Hub

What Is the Real-Time Hub?

The Real-Time hub is a discovery experience focused on streaming and event-driven data sources in Microsoft Fabric.

It centralizes access to:

  • Eventstreams
  • Azure Event Hubs
  • Azure IoT Hub
  • Azure Data Explorer (KQL databases)
  • Other real-time data producers

Key Capabilities of the Real-Time Hub

For exam purposes, understand that the Real-Time hub allows users to:

  • Discover available streaming data sources
  • Preview live event data
  • Subscribe to or reuse existing event streams
  • Understand data velocity and schema
  • Reduce duplication of real-time ingestion pipelines

This is especially important in architectures involving operational analytics or near real-time reporting.

OneLake Catalog vs. Real-Time Hub

Feature             | OneLake Catalog                 | Real-Time Hub
Primary focus       | Stored data                     | Streaming / event data
Data types          | Tables, files, shortcuts        | Events, streams, telemetry
Use case            | Analytical and historical data  | Real-time and operational analytics
Governance signals  | Endorsement, sensitivity        | Ownership, stream metadata

Understanding when to use each is a common exam theme.

Security and Governance Considerations

Data discovery respects Fabric security:

  • Users only see items they have permission to access
  • Sensitivity labels are visible in discovery views
  • Workspace roles control discovery depth

This ensures compliance while still promoting self-service analytics.

Exam-Relevant Scenarios

On the DP-600 exam, you may be asked to:

  • Identify how users can discover existing datasets before ingesting new data
  • Choose between OneLake catalog and Real-Time hub based on data type
  • Locate endorsed or certified data assets
  • Reduce duplication by reusing existing tables or streams
  • Enable self-service discovery while maintaining governance

Best Practices (Aligned to DP-600)

  • Use OneLake catalog first before creating new data connections
  • Encourage use of endorsed and certified assets
  • Use Real-Time hub to discover existing event streams
  • Leverage shortcuts to reuse data without copying
  • Combine discovery with proper labeling and endorsement

Key Takeaway
For the DP-600 exam, discovering data in Microsoft Fabric is about visibility, trust, and reuse. The OneLake catalog helps users find and understand stored analytical data, while the Real-Time hub enables discovery of live streaming sources. Together, they reduce redundancy, improve governance, and accelerate analytics development.

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect) — not just which one
  • Pay close attention to when to use OneLake catalog vs. Real-Time hub
  • Look for and understand the usage scenario of keywords in exam questions (for example, discover, reuse, streaming, endorsed, shortcut)
  • Expect scenario-based questions that test architecture choices, rather than direct definitions

1. What is the primary purpose of the OneLake catalog in Microsoft Fabric?

A. To ingest streaming data
B. To schedule data refreshes
C. To discover and explore data stored in OneLake
D. To manage workspace permissions

Correct Answer: C

Explanation:
The OneLake catalog is a centralized discovery and metadata experience that helps users find, understand, and reuse data stored in OneLake across Fabric workspaces.

2. Which type of data is the Real-Time hub primarily designed to help users discover?

A. Historical data in Lakehouses
B. Structured warehouse tables
C. Streaming and event-driven data sources
D. Power BI semantic models

Correct Answer: C

Explanation:
The Real-Time hub focuses on streaming and event-based data such as Eventstreams, Azure Event Hubs, IoT Hub, and KQL databases.

3. A user wants to avoid re-ingesting data that already exists in another workspace. Which Fabric feature best supports this goal?

A. Data pipelines
B. OneLake shortcuts
C. Import mode
D. DirectQuery

Correct Answer: B

Explanation:
OneLake shortcuts allow data stored externally or in another workspace to appear as native OneLake data without physically copying it.

4. Which metadata element in the OneLake catalog helps users identify trusted and approved data assets?

A. Workspace name
B. File size
C. Endorsement status
D. Refresh schedule

Correct Answer: C

Explanation:
Endorsements (Promoted and Certified) act as trust signals, helping users quickly identify reliable and governed data assets.

5. Which statement about data visibility in the OneLake catalog is true?

A. All users can see all data across the tenant
B. Only workspace admins can see catalog entries
C. Users can only see items they have permission to access
D. Sensitivity labels hide data from discovery

Correct Answer: C

Explanation:
The OneLake catalog respects Fabric security boundaries—users only see data assets they are authorized to access.

6. A team is building a real-time dashboard and wants to see what streaming data already exists. Where should they look first?

A. OneLake catalog
B. Power BI Service
C. Dataflows Gen2
D. Real-Time hub

Correct Answer: D

Explanation:
The Real-Time hub centralizes discovery of streaming and event-based data sources, making it the best starting point for real-time analytics scenarios.

7. Which of the following items is most likely discovered through the Real-Time hub?

A. Parquet files in OneLake
B. Lakehouse Delta tables
C. Azure Event Hub streams
D. Warehouse SQL views

Correct Answer: C

Explanation:
Azure Event Hubs and other event-driven sources are exposed through the Real-Time hub, not the OneLake catalog.

8. What advantage does data discovery provide in large Fabric environments?

A. Faster Power BI rendering
B. Reduced licensing costs
C. Reduced data duplication and improved reuse
D. Automatic data modeling

Correct Answer: C

Explanation:
Discovering existing data assets helps teams reuse trusted data, reducing redundant ingestion and improving governance.

9. Which information is commonly visible when browsing an asset in the OneLake catalog?

A. User passwords
B. Column-level schema details
C. Tenant-wide permissions
D. Gateway configuration

Correct Answer: B

Explanation:
The OneLake catalog exposes metadata such as table schemas, column names, and data types to help users evaluate suitability before use.

10. Which scenario best demonstrates correct use of OneLake catalog and Real-Time hub together?

A. Using DirectQuery for all reports
B. Creating a new pipeline for every dataset
C. Discovering historical data in OneLake and live events in Real-Time hub
D. Applying sensitivity labels to dashboards

Correct Answer: C

Explanation:
OneLake catalog is optimized for discovering stored analytical data, while Real-Time hub is designed for discovering live streaming sources. Using both ensures comprehensive data discovery.

Create a Data Connection in Microsoft Fabric

This post is part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub, and this topic falls under these sections:
Prepare data
--> Get data
--> Create a data connection

Creating data connections is a foundational skill for a Microsoft Fabric Analytics Engineer. In the DP-600 exam, this topic focuses on how to securely and efficiently connect Fabric workloads—such as Lakehouses, Warehouses, Dataflows Gen2, and semantic models—to a wide variety of data sources.

What a Data Connection Means in Microsoft Fabric

A data connection defines how Fabric authenticates to, accesses, and retrieves data from a source system. It includes:

  • The data source type
  • Connection details (server, database, endpoint, file path, etc.)
  • Authentication method
  • Optional privacy and credential reuse settings

Once created, a data connection can often be reused across multiple items within a workspace.

Common Data Sources in Fabric

For the exam, you should be familiar with connecting to the following categories of data sources:

1. Azure and Microsoft Data Sources

  • Azure SQL Database
  • Azure Synapse (dedicated and serverless pools)
  • Azure Data Lake Storage Gen2
  • Azure Blob Storage
  • OneLake (Fabric-native storage)
  • Power BI semantic models (DirectQuery)

2. On-Premises Data Sources

  • SQL Server
  • Oracle
  • Other relational databases

These typically require an On-premises Data Gateway.

3. Files and Semi-Structured Data

  • CSV, JSON, Parquet, Excel
  • Files stored in OneLake, ADLS Gen2, SharePoint, or local file systems

Where Data Connections Are Created

In Microsoft Fabric, data connections can be created from several entry points:

  • Lakehouse: Add data via shortcuts or ingestion
  • Warehouse: Connect external data or ingest via pipelines
  • Dataflows Gen2: Define connections as part of Power Query Online
  • Pipelines: Configure source connections in copy activities
  • Semantic models: Connect via Import or DirectQuery

Understanding where the connection is configured is important for exam scenarios.
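As a rough illustration of how a notebook-side connection resolves in practice, the sketch below reads the same hypothetical file through a relative Lakehouse path and through a fully qualified OneLake (ABFS) path. The workspace, Lakehouse, and file names are placeholders, not real items.

# Hypothetical example: two ways to reference data from a Fabric notebook.

# 1) Relative path into the attached Lakehouse's Files area
local_df = spark.read.option("header", True).csv("Files/landing/customers.csv")

# 2) Fully qualified OneLake path (ABFS driver); workspace and item names are placeholders
onelake_path = (
    "abfss://SalesWorkspace@onelake.dfs.fabric.microsoft.com/"
    "SalesLakehouse.Lakehouse/Files/landing/customers.csv"
)
remote_df = spark.read.option("header", True).csv(onelake_path)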

Authentication Methods

The DP-600 exam commonly tests authentication concepts. Be familiar with:

  • Microsoft Entra ID (OAuth) – Recommended and most secure
  • Service principal – Common for automation and CI/CD
  • Account key / Shared Access Signature (SAS) – Often used for storage
  • Username and password – Less secure, sometimes legacy

You should also understand when credentials are:

  • Stored at the connection level
  • Managed per workspace
  • Reused across multiple items

Gateways and Connectivity Modes

On-Premises Data Gateway

Required when connecting Fabric to on-premises sources. Key points:

  • Can be standard or personal (standard is preferred)
  • Must be online for refresh and query operations
  • Uses outbound connections only

Connectivity Modes

  • Import: Data is loaded into Fabric storage
  • DirectQuery: Queries run against the source system
  • Shortcut-based access: Data remains external but appears native in OneLake

Security and Governance Considerations

When creating data connections, Fabric enforces governance through:

  • Workspace roles (Viewer, Contributor, Member, Admin)
  • Credential isolation per workspace
  • Sensitivity labels inherited from data sources (when applicable)

Exam questions may test your ability to choose the most secure and scalable connection method.

Best Practices (Exam-Relevant)

  • Prefer Entra ID authentication over credentials or keys
  • Use OneLake shortcuts to avoid unnecessary data duplication
  • Centralize connections in Dataflows Gen2 for reuse
  • Validate gateway availability for on-premises sources
  • Align connection methods with performance needs (Import vs DirectQuery)

How This Appears on the DP-600 Exam

You may be asked to:

  • Identify the correct data connection method for a scenario
  • Choose the appropriate authentication type
  • Determine when a gateway is required
  • Decide where to create a connection for reuse and governance
  • Troubleshoot refresh or connectivity issues

Key Takeaway
Creating data connections in Microsoft Fabric is about more than just accessing data—it’s about security, performance, reusability, and governance. For the DP-600 exam, focus on understanding source types, authentication options, gateways, and where connections are defined within the Fabric ecosystem.

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect) — not just which one
  • Look for and understand the usage scenario of keywords in exam questions (for example, gateway, authentication, reuse, DirectQuery vs Import)
  • Expect scenario-based questions rather than direct definitions

1. Which authentication method is generally recommended when creating data connections in Microsoft Fabric?

A. Username and password
B. Shared Access Signature (SAS)
C. Microsoft Entra ID (OAuth)
D. Account key

Correct Answer: C

Explanation:
Microsoft Entra ID (OAuth) is the recommended authentication method because it provides centralized identity management, better security, support for conditional access, and easier credential rotation compared to passwords or keys.

2. When is an On-premises Data Gateway required in Microsoft Fabric?

A. When connecting to Azure SQL Database
B. When connecting to OneLake
C. When connecting to an on-premises SQL Server
D. When connecting to Azure Data Lake Storage Gen2

Correct Answer: C

Explanation:
An On-premises Data Gateway is required when Fabric needs to access data sources that are hosted on-premises. Cloud-based sources such as Azure SQL Database or ADLS Gen2 do not require a gateway.

3. Which Fabric feature allows external data to appear as if it is stored in OneLake without copying the data?

A. Import mode
B. DirectQuery mode
C. OneLake shortcuts
D. Data pipelines

Correct Answer: C

Explanation:
OneLake shortcuts provide a logical reference to external storage locations (such as ADLS Gen2 or S3) without physically moving or duplicating the data.

4. You want multiple Fabric items in the same workspace to reuse a single data connection. Where should you create the connection?

A. In each semantic model
B. In Dataflows Gen2
C. In Power BI Desktop only
D. In Excel

Correct Answer: B

Explanation:
Dataflows Gen2 are designed for centralized data ingestion and transformation, making them ideal for creating reusable data connections across multiple Fabric items.

5. Which connectivity mode loads data into Fabric storage and provides the best query performance?

A. DirectQuery
B. Live connection
C. Shortcut-based access
D. Import

Correct Answer: D

Explanation:
Import mode copies data into Fabric-managed storage, enabling high-performance queries and full modeling capabilities at the cost of data freshness.

6. Which statement about DirectQuery connections in Fabric is true?

A. Data is stored in OneLake
B. Queries are always faster than Import mode
C. Queries are executed against the source system
D. A gateway is never required

Correct Answer: C

Explanation:
With DirectQuery, queries are sent directly to the source system at runtime. Performance depends on the source, and a gateway may be required for on-premises sources.

7. Which role is required to create or edit data connections within a Fabric workspace?

A. Viewer
B. Contributor
C. Member
D. Admin

Correct Answer: B

Explanation:
Users must have at least Contributor permissions to create or modify data connections. Viewers have read-only access and cannot manage connections.

8. Which file formats are commonly supported when creating file-based data connections in Fabric?

A. CSV only
B. CSV, JSON, Parquet, Excel
C. TXT only
D. XML only

Correct Answer: B

Explanation:
Microsoft Fabric supports a wide range of structured and semi-structured file formats, including CSV, JSON, Parquet, and Excel, especially when stored in OneLake or ADLS Gen2.

9. What is the primary security benefit of using a service principal for data connections?

A. Faster query performance
B. No need for a gateway
C. Automated, non-interactive authentication
D. Unlimited access to all workspaces

Correct Answer: C

Explanation:
Service principals enable secure, automated authentication scenarios (such as CI/CD pipelines) without relying on individual user credentials.

10. A data refresh in Fabric fails because credentials are missing. What is the most likely cause?

A. The dataset is in Import mode
B. The gateway is offline or misconfigured
C. The semantic model contains calculated columns
D. The file format is unsupported

Correct Answer: B

Explanation:
If a data source requires an On-premises Data Gateway and the gateway is offline or incorrectly configured, Fabric cannot access the credentials, causing refresh failures.

Improve DAX performance

This post is part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub, and this topic falls under these sections:
Implement and manage semantic models (25-30%)
--> Optimize enterprise-scale semantic models
--> Improve DAX performance

Effective DAX (Data Analysis Expressions) is essential for high-performance semantic models in Microsoft Fabric. As datasets and business logic become more complex, inefficient DAX can slow down query execution and degrade report responsiveness. This article explains why DAX performance matters, common performance pitfalls, and best practices to optimize DAX in enterprise-scale semantic models.


Why DAX Performance Matters

In Fabric semantic models (Power BI datasets + Direct Lake / Import / composite models), DAX is used to define:

  • Measures (dynamic calculations)
  • Calculated columns (row-level expressions)
  • Calculated tables (derived data structures)

When improperly written, DAX can become a bottleneck — especially on large models or highly interactive reports (many slicers, visuals, etc.). Optimizing DAX ensures:

  • Faster query execution
  • Better user experience
  • Lower compute consumption
  • More efficient use of memory

The DP-600 exam tests your ability to identify and apply performance-aware DAX patterns.


Understand DAX Execution Engines

DAX queries are executed by two engines:

  • Formula Engine (FE) — handles complex, row-by-row logic that cannot be pushed down to the Storage Engine
  • Storage Engine (SE) — handles optimized columnar scans and aggregations

Performance improves when more computation can be done in the Storage Engine (columnar operations) rather than the Formula Engine (row-by-row logic).

Rule of thumb: Favor patterns that minimize work done in the Formula Engine.


Common DAX Performance Anti-Patterns

1. Repeated Calculations Without Variables

Example:

Total Sales + Total Cost - Total Discount

If Total Sales, Total Cost, and Total Discount all compute the same sub-expressions repeatedly, the engine may evaluate redundant logic multiple times.

Anti-Pattern:

Repeated expressions without variables.
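As a hedged illustration (the table, columns, and measure names are assumptions), the first measure below evaluates SUM(FactSales[SalesAmount]) twice, while the second computes it once in a variable and reuses it:

-- Repeated sub-expression (anti-pattern)
Margin Pct =
DIVIDE(
    SUM(FactSales[SalesAmount]) - SUM(FactSales[TotalCost]),
    SUM(FactSales[SalesAmount])
)

-- Same logic with the shared sub-expression stored in a variable
Margin Pct (Optimized) =
VAR SalesTotal = SUM(FactSales[SalesAmount])
RETURN
DIVIDE(SalesTotal - SUM(FactSales[TotalCost]), SalesTotal)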


2. Nested Iterator Functions

Using iterators like SUMX or FILTER on large tables many times in a measure increases compute overhead.

Example:

SUMX(
    FILTER(FactSales, FactSales[SalesAmount] > 0),
    FactSales[Quantity] * FactSales[UnitPrice]
)

Filtering inside iterators and then iterating again adds overhead.


3. Large Row Context with Filters

Complex FILTER expressions that operate on large intermediate tables will push computation into the Formula Engine, which is slower.


4. Frequent Use of EARLIER

While useful, EARLIER is often replaced with clearer, faster patterns using variables or iterator functions.
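For example, a running-total calculated column written with EARLIER can be rewritten with a variable; the table and column names below are illustrative assumptions:

-- Calculated column using EARLIER (harder to read and optimize)
Running Total =
CALCULATE(
    SUM(FactSales[SalesAmount]),
    FILTER(FactSales, FactSales[OrderDate] <= EARLIER(FactSales[OrderDate]))
)

-- Equivalent calculated column using a variable instead of EARLIER
Running Total =
VAR CurrentDate = FactSales[OrderDate]
RETURN
CALCULATE(
    SUM(FactSales[SalesAmount]),
    FILTER(FactSales, FactSales[OrderDate] <= CurrentDate)
)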


Best Practices for Optimizing DAX


1. Use Variables (VAR)

Variables reduce redundant computations, enhance readability, and often improve performance:

Measure Optimized =
VAR BaseTotal = SUM(FactSales[SalesAmount])
RETURN
IF(BaseTotal > 0, BaseTotal, BLANK())

Benefits:

  • Computed once per filter context
  • Reduces repeated expression evaluation

2. Favor Storage Engine Over Formula Engine

Use functions that can be processed by the Storage Engine:

  • SUM, COUNT, AVERAGE, MIN, MAX run faster
  • Avoid SUMX when a plain SUM suffices

Example:

Total Sales = SUM(FactSales[SalesAmount])

Over:

Total Sales =
SUMX(FactSales, FactSales[SalesAmount])


3. Simplify Filter Expressions

When possible, use simpler filter arguments:

Better:

CALCULATE([Total Sales], DimDate[Year] = 2025)

Instead of:

CALCULATE([Total Sales], FILTER(DimDate, DimDate[Year] = 2025))

Why?
The simpler condition is more likely to push to the Storage Engine without extra row processing.


4. Use TRUE/FALSE Filters

When filtering on a Boolean or condition:

Better:

CALCULATE([Total Sales], FactSales[IsActive] = TRUE())

Instead of:

CALCULATE([Total Sales], FILTER(FactSales, FactSales[IsActive] = TRUE()))


5. Limit Column and Table Scans

  • Remove unused columns from the model
  • Avoid high-cardinality columns in calculations where unnecessary
  • Use star schema design to improve filter propagation

6. Reuse Measures

Instead of duplicating logic:

Total Profit =
[Total Sales] - [Total Cost]

Reuse basic measures within more complex logic.


7. Prefer Measures Over Calculated Columns

Measures calculate at query time and respect filter context; calculated columns are evaluated during refresh. Use calculated columns only when necessary.
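A small, hedged illustration (names are assumptions): the calculated column below is computed and stored for every row during refresh, while the measure produces the same total at query time and responds to slicers and filters:

-- Calculated column: materialized per row at refresh time
FactSales[LineAmount] = FactSales[Quantity] * FactSales[UnitPrice]

-- Measure: evaluated at query time in the current filter context
Total Line Amount =
SUMX(FactSales, FactSales[Quantity] * FactSales[UnitPrice])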


8. Reduce Iterators on Large Tables

If SUMX is needed for row-level expressions, consider summarizing first or using aggregation tables.
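One hedged pattern (assuming [Total Sales] is a simple SUM-based measure and DimDate is a date dimension) is to iterate a summarized set of values rather than the fact table itself:

-- Iterates one row per distinct date instead of every fact row
Avg Daily Sales =
AVERAGEX(
    VALUES(DimDate[Date]),
    [Total Sales]
)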


9. Understand Evaluation Context

Complex measures often inadvertently alter filter context. Use functions like:

  • ALL
  • REMOVEFILTERS
  • KEEPFILTERS

…carefully, as they affect performance and results.
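A typical, hedged example (table and measure names are assumptions) is a percent-of-total measure that deliberately removes product filters from the denominator:

-- Share of sales across all products, ignoring any DimProduct filters
Sales % of All Products =
DIVIDE(
    [Total Sales],
    CALCULATE([Total Sales], REMOVEFILTERS(DimProduct))
)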


10. Leverage DAX Studio or Performance Analyzer

While not directly tested with UI steps, knowing when to use tools to diagnose DAX is helpful:

  • Performance Analyzer identifies slow visuals
  • DAX Studio exposes query plans and engine timings (see the sample query below)
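For reference, a standalone DAX query such as the hedged sketch below can be pasted into DAX Studio to capture Formula Engine and Storage Engine timings; table, column, and measure names are assumptions:

-- Query-scoped measure plus EVALUATE, suitable for testing in DAX Studio
DEFINE
    MEASURE FactSales[Total Sales Test] = SUM(FactSales[SalesAmount])
EVALUATE
    SUMMARIZECOLUMNS(
        DimDate[Year],
        "Total Sales", [Total Sales Test]
    )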

Performance Patterns and Anti-Patterns

Pattern                       | Good / Bad          | Notes
VAR usage                     | Good                | Makes measures efficient and readable
SUM over SUMX                 | Good if applicable  | Leverages the Storage Engine
FILTER inside SUMX            | Bad                 | Forces row context early
EARLIER / nested row context  | Bad                 | Hard to optimize, slows performance
Simple CALCULATE filters      | Good                | More likely to be handled by the Storage Engine

Example Before / After

Before (inefficient):

Measure = 
SUMX(
    FILTER(FactSales, FactSales[SalesAmount] > 1000),
    FactSales[Quantity] * FactSales[UnitPrice]
)

After (optimized):

Measure =
VAR FilteredSales =
    CALCULATETABLE(
        FactSales,
        FactSales[SalesAmount] > 1000
    )
RETURN
SUMX(
    FilteredSales,
    FactSales[Quantity] * FactSales[UnitPrice]
)

Why better?
Explicit filtering via CALCULATETABLE often pushes more work to the Storage Engine than iterating within FILTER.


Exam-Focused Takeaways

For DP-600 questions related to DAX performance:

  • Identify inefficient row context patterns
  • Prefer variables and simple aggregations
  • Favor Storage Engine–friendly functions
  • Avoid unnecessary nested iterators
  • Recognize when a measure should be rewritten for performance

Summary

Improving DAX performance is about writing efficient calculations and avoiding patterns that force extra processing in the Formula Engine. By using variables, minimizing iterator overhead, simplifying filter expressions, and leveraging star schema design, you can significantly improve query responsiveness — a key capability for enterprise semantic models and the DP-600 exam.

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect) — not just which one
  • Look for and understand the usage scenario of keywords in exam questions to guide you
  • Expect scenario-based questions rather than direct definitions

Question 1

You have a DAX measure that repeats the same complex calculation multiple times. Which change is most likely to improve performance?

A. Convert the calculation into a calculated column
B. Use a DAX variable (VAR) to store the calculation result
C. Replace CALCULATE with SUMX
D. Enable bidirectional relationships

Correct Answer: B

Explanation:
DAX variables evaluate their expression once per query context and reuse the result. This avoids repeated execution of the same logic and reduces Formula Engine overhead, making variables one of the most effective performance optimization techniques.


Question 2

Which aggregation function is generally the most performant when no row-by-row logic is required?

A. SUMX
B. AVERAGEX
C. SUM
D. FILTER

Correct Answer: C

Explanation:
Native aggregation functions like SUM, COUNT, and AVERAGE are optimized to run in the Storage Engine, which is much faster than iterator-based functions such as SUMX that require row-by-row evaluation in the Formula Engine.


Question 3

Why is this DAX pattern potentially slow on large tables?

CALCULATE([Total Sales], FILTER(FactSales, FactSales[SalesAmount] > 1000))

A. FILTER disables relationship filtering
B. FILTER forces evaluation in the Formula Engine
C. CALCULATE cannot push filters to the Storage Engine
D. The expression produces incorrect results

Correct Answer: B

Explanation:
The FILTER function iterates over rows, forcing Formula Engine execution. When possible, using simple Boolean expressions inside CALCULATE (e.g., FactSales[SalesAmount] > 1000) allows the Storage Engine to handle filtering more efficiently.


Question 4

Which CALCULATE filter expression is more performant?

A. FILTER(Sales, Sales[Year] = 2024)
B. Sales[Year] = 2024
C. ALL(Sales[Year])
D. VALUES(Sales[Year])

Correct Answer: B

Explanation:
Simple Boolean filters allow DAX to push work to the Storage Engine, while FILTER requires row-by-row evaluation. This distinction is frequently tested on the DP-600 exam.


Question 5

Which practice helps reduce the Formula Engine workload?

A. Using nested iterator functions
B. Replacing measures with calculated columns
C. Reusing base measures in more complex calculations
D. Increasing column cardinality

Correct Answer: C

Explanation:
Reusing base measures promotes efficient evaluation plans and avoids duplicated logic. Nested iterators and high cardinality columns increase computational complexity and slow down queries.


Question 6

Which modeling choice can indirectly improve DAX query performance?

A. Using snowflake schemas
B. Increasing the number of calculated columns
C. Removing unused columns and tables
D. Enabling bidirectional relationships by default

Correct Answer: C

Explanation:
Removing unused columns reduces memory usage, dictionary size, and scan costs. Smaller models lead to faster Storage Engine operations and improved overall query performance.


Question 7

Which DAX pattern is considered a performance anti-pattern?

A. Using measures instead of calculated columns
B. Using SUMX when SUM would suffice
C. Using star schema relationships
D. Using single-direction filters

Correct Answer: B

Explanation:
Iterator functions like SUMX should only be used when row-level logic is required. Replacing simple aggregations with iterators unnecessarily shifts work to the Formula Engine.


Question 8

Why can excessive use of EARLIER negatively impact performance?

A. It prevents relationship traversal
B. It creates complex nested row contexts
C. It only works in measures
D. It disables Storage Engine scans

Correct Answer: B

Explanation:
EARLIER introduces nested row contexts that are difficult for the DAX engine to optimize. Modern DAX best practices recommend using variables instead of EARLIER.


Question 9

Which relationship configuration can negatively affect DAX performance if overused?

A. Single-direction filtering
B. Many-to-one relationships
C. Bidirectional filtering
D. Active relationships

Correct Answer: C

Explanation:
Bidirectional relationships increase filter propagation paths and query complexity. While useful in some scenarios, overuse can significantly degrade performance in enterprise-scale models.


Question 10

Which tool should you use to identify slow visuals caused by inefficient DAX measures?

A. Power Query Editor
B. Model View
C. Performance Analyzer
D. Deployment Pipelines

Correct Answer: C

Explanation:
Performance Analyzer captures visual query durations, DAX query times, and rendering times, making it the primary tool for diagnosing DAX and visual performance issues in Power BI and Fabric semantic models.

Configure Direct Lake, including default fallback and refresh behavior

This post is part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub, and this topic falls under these sections:
Implement and manage semantic models (25-30%)
--> Optimize enterprise-scale semantic models
--> Configure Direct Lake, including default fallback and refresh behavior

Overview

Direct Lake is a storage and connectivity mode in Microsoft Fabric semantic models that enables Power BI to query data directly from OneLake without importing data into VertiPaq or sending queries back to the data source (as in DirectQuery). It is designed to deliver near–Import performance with DirectQuery-like freshness, making it a key feature for enterprise-scale analytics.

For the DP-600 exam, you are expected to understand:

  • How Direct Lake works
  • When and why fallback occurs
  • How default fallback behavior is configured
  • How refresh behaves in Direct Lake models
  • Common performance and design considerations

How Direct Lake Works

In Direct Lake mode:

  • Data resides in Delta tables stored in OneLake (typically from a Lakehouse or Warehouse).
  • The semantic model reads Parquet/Delta files directly, bypassing data import.
  • Metadata and file statistics are cached to optimize query performance.
  • Queries are executed without duplicating data into VertiPaq storage.

This architecture reduces data duplication while still enabling fast, interactive analytics.


Default Fallback Behavior

What Is Direct Lake Fallback?

Fallback occurs when a query or operation cannot be executed using Direct Lake. In these cases, the semantic model automatically falls back to another mode to ensure the query still returns results.

When fallback occurs, the affected queries run in DirectQuery mode against the SQL analytics endpoint of the underlying Lakehouse or Warehouse; data is not imported into VertiPaq as part of fallback.

Fallback is automatic and transparent to report users unless explicitly restricted.


Common Causes of Fallback

Direct Lake fallback can be triggered by:

  • Unsupported DAX functions or expressions
  • Unsupported data types in Delta tables
  • Complex model features (certain calculation patterns, security scenarios)
  • Queries that cannot be resolved efficiently using file-based access
  • Temporary unavailability of OneLake files

Understanding these triggers is important for diagnosing performance issues.


Configuring Default Fallback Behavior

In Fabric semantic model settings, you can configure:

  • Allow fallback (default) – Ensures queries continue to work even when Direct Lake is not supported.
  • Disable fallback – Queries fail instead of falling back, which is useful for enforcing performance expectations or testing Direct Lake compatibility.

From an exam perspective:

  • Allowing fallback prioritizes reliability
  • Disabling fallback prioritizes predictability and performance validation

Refresh Behavior in Direct Lake Models

Do Direct Lake Models Require Refresh?

Unlike Import mode:

  • Direct Lake does not require scheduled data refresh to reflect new data in OneLake.
  • New or updated Delta files are automatically visible to the semantic model.

However, metadata refreshes are still relevant.


Types of Refresh in Direct Lake

  1. Metadata Refresh
    • Updates table schemas, partitions, and statistics
    • Required when:
      • Columns are added or removed
      • Table structures change
    • Lightweight compared to Import refresh
  2. Hybrid Scenarios
    • If the model also contains Import tables (composite scenarios), those imported tables still require data refresh
    • Mixed refresh behavior may exist in composite or fallback-heavy models

Impact of Refresh on Performance

  • No large-scale data movement during refresh
  • Faster model readiness after schema changes
  • Reduced refresh windows compared to Import models
  • Lower memory pressure in capacity

This makes Direct Lake especially suitable for large, frequently updated datasets.


Performance and Design Considerations

To optimize Direct Lake usage:

  • Use supported Delta table features and data types
  • Keep models simple and star-schema based
  • Avoid unnecessary bidirectional relationships
  • Monitor fallback behavior using performance tools
  • Test critical DAX measures for Direct Lake compatibility

From an exam standpoint, expect scenario-based questions asking you to choose Direct Lake and configure fallback appropriately for scale, freshness, and reliability.


When to Use Direct Lake

Direct Lake is best suited for:

  • Large datasets stored in OneLake
  • Near-real-time analytics
  • Enterprise models that need both performance and freshness
  • Organizations standardizing on Fabric Lakehouse or Warehouse architectures

Key DP-600 Takeaways

  • Direct Lake queries Delta tables directly in OneLake
  • Default fallback ensures query continuity when Direct Lake isn’t supported
  • Fallback behavior can be enabled or disabled
  • Data refresh is not required, but metadata refresh still matters
  • Understanding fallback and refresh behavior is critical for enterprise-scale optimization

DP-600 Exam Tip 💡

Expect scenario-based questions where you must decide:

  • Whether to enable or disable fallback
  • How refresh behaves after schema changes
  • Why a query is falling back unexpectedly

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect) — not just which one
  • Look for and understand the usage scenario of keywords in exam questions to guide you
  • Expect scenario-based questions rather than direct definitions

1. What is the primary benefit of using Direct Lake mode in a Fabric semantic model?

A. It fully imports data into VertiPaq for maximum compression
B. It queries Delta tables in OneLake directly without data import
C. It sends all queries back to the source system
D. It eliminates the need for semantic models

Correct Answer: B

Explanation:
Direct Lake reads Delta/Parquet files directly from OneLake, avoiding both data import (Import mode) and source query execution (DirectQuery), enabling near-Import performance with fresher data.


2. When does a Direct Lake semantic model fall back to another query mode?

A. When scheduled refresh fails
B. When unsupported features or queries are encountered
C. When the dataset exceeds 1 GB
D. When row-level security is enabled

Correct Answer: B

Explanation:
Fallback occurs when a query or model feature is not supported by Direct Lake, such as certain DAX expressions or unsupported data types.


3. What is the default behavior of Direct Lake when a query cannot be executed in Direct Lake mode?

A. The query fails immediately
B. The query retries using Import mode only
C. The query automatically falls back to another supported mode
D. The semantic model is disabled

Correct Answer: C

Explanation:
By default, Direct Lake allows fallback to ensure query reliability. This allows reports to continue functioning even if Direct Lake cannot handle a specific request.


4. Why might an organization choose to disable fallback in a Direct Lake semantic model?

A. To reduce OneLake storage costs
B. To enforce consistent Direct Lake performance and detect incompatibilities
C. To allow automatic data imports
D. To improve data refresh frequency

Correct Answer: B

Explanation:
Disabling fallback ensures queries only run in Direct Lake mode. This is useful for performance validation and preventing unexpected query behavior.


5. Which action typically requires a metadata refresh in a Direct Lake semantic model?

A. Adding new rows to a Delta table
B. Updating existing fact table values
C. Adding a new column to a Delta table
D. Running a Power BI report

Correct Answer: C

Explanation:
Schema changes such as adding or removing columns require a metadata refresh so the semantic model can recognize structural changes.


6. How does Direct Lake handle new data written to Delta tables in OneLake?

A. Data is visible only after a scheduled refresh
B. Data is visible automatically without data refresh
C. Data is visible only after manual import
D. Data is cached permanently

Correct Answer: B

Explanation:
Direct Lake reads data directly from OneLake, so new or updated data becomes available without needing a traditional Import refresh.


7. Which scenario is MOST likely to cause Direct Lake fallback?

A. Simple SUM aggregation on a fact table
B. Querying a supported Delta table
C. Using unsupported DAX functions in a measure
D. Filtering data using slicers

Correct Answer: C

Explanation:
Certain complex or unsupported DAX functions can force fallback because Direct Lake cannot execute them efficiently using file-based access.


8. What happens if fallback is disabled and a query cannot be executed in Direct Lake mode?

A. The query automatically switches to DirectQuery
B. The query fails and returns an error
C. The semantic model imports the data
D. The model switches to Import mode permanently

Correct Answer: B

Explanation:
When fallback is disabled, unsupported queries fail instead of switching modes, making incompatibilities more visible during testing.


9. Which statement about refresh behavior in Direct Lake models is TRUE?

A. Full data refresh is always required
B. Direct Lake models do not support refresh
C. Only metadata refresh may be required
D. Refresh behaves the same as Import mode

Correct Answer: C

Explanation:
Direct Lake does not require full data refreshes because it reads data directly from OneLake. Metadata refresh is needed only for structural changes.


10. Why is Direct Lake well suited for enterprise-scale semantic models?

A. It eliminates the need for Delta tables
B. It supports unlimited bidirectional relationships
C. It combines near-Import performance with fresh data access
D. It forces all data into memory

Correct Answer: C

Explanation:
Direct Lake offers high performance without importing data, making it ideal for large datasets that require frequent updates and scalable analytics.

Choose Between Direct Lake on OneLake and Direct Lake on SQL Endpoints

This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections: 
Implement and manage semantic models (25-30%)
--> Optimize enterprise-scale semantic models
--> Choose between Direct Lake on OneLake and Direct Lake on SQL endpoints

In Microsoft Fabric, Direct Lake is a high-performance semantic model storage mode that allows Power BI and Fabric semantic models to query data directly from OneLake without importing it into VertiPaq. When implementing Direct Lake, you must choose where the semantic model reads from, either:

  • Direct Lake on OneLake
  • Direct Lake on SQL endpoints

Understanding the differences, trade-offs, and use cases for each option is critical for optimizing enterprise-scale semantic models, and this topic appears explicitly in the DP-600 exam blueprint.


Direct Lake on OneLake

What It Is

Direct Lake on OneLake connects the semantic model directly to Delta tables stored in OneLake, bypassing SQL engines entirely. Queries operate directly on Parquet/Delta files using the Fabric Direct Lake engine.

Key Characteristics

  • Reads Delta tables directly from OneLake
  • No dependency on a SQL query engine
  • Near-Import performance with zero data duplication
  • Minimal latency between data ingestion and reporting
  • Requires supported Delta table structures and data types

Advantages

  • Best performance for large-scale analytics
  • Always reflects the latest data written to OneLake
  • Eliminates Import refresh overhead
  • Ideal for lakehouse-centric architectures

Limitations

  • Some complex DAX patterns may cause fallback
  • Requires schema compatibility with Direct Lake
  • Less flexibility for SQL-based transformations

Typical Use Cases

  • Enterprise lakehouse analytics
  • High-volume fact tables
  • Near-real-time reporting
  • Fabric-native data pipelines
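
To make the refresh behavior above concrete, here is a minimal PySpark sketch for a Fabric notebook. It assumes a default Lakehouse is attached to the notebook; the fact_orders table and its columns are hypothetical names used only for illustration.

```python
# Minimal sketch (PySpark): append rows to a Delta table in a Fabric Lakehouse.
# Assumes the notebook has a default Lakehouse attached; table/column names are hypothetical.
from pyspark.sql import SparkSession, Row

# In a Fabric notebook a SparkSession already exists; getOrCreate() simply reuses it.
spark = SparkSession.builder.getOrCreate()

new_orders = spark.createDataFrame([
    Row(order_id=1001, region="West", amount=250.0),
    Row(order_id=1002, region="East", amount=120.5),
])

# The Delta files land in OneLake. A Direct Lake semantic model built on fact_orders
# can surface these rows without an Import-style data refresh; only a schema change
# (for example, a new column) would call for a metadata refresh.
new_orders.write.format("delta").mode("append").saveAsTable("fact_orders")
```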

Direct Lake on SQL Endpoints

What It Is

Direct Lake on SQL endpoints connects the semantic model to the SQL analytics endpoint of a Lakehouse or Warehouse, while still using Direct Lake storage mode behind the scenes.

Instead of reading files directly, the semantic model relies on the SQL endpoint to expose the data.

Key Characteristics

  • Queries go through the SQL endpoint
  • Still benefits from Direct Lake storage
  • Enables SQL views and transformations
  • Slightly higher latency than pure OneLake access

Advantages

  • Supports SQL-based modeling (views, joins, calculated columns)
  • Easier integration with existing SQL logic
  • Familiar experience for SQL-first teams
  • Useful when business logic is already defined in SQL

Limitations

  • Additional query layer may impact performance
  • Less efficient than direct file access
  • SQL endpoint availability becomes a dependency

Typical Use Cases

  • Organizations with strong SQL development practices
  • Reuse of existing SQL views and transformations
  • Gradual migration from Warehouse or SQL models
  • Mixed BI and ad-hoc SQL workloads
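
As a sketch of the SQL-reuse pattern, the snippet below creates a view through a Lakehouse SQL analytics endpoint, which a Direct Lake on SQL endpoints model could then consume. It assumes pyodbc and the ODBC Driver 18 for SQL Server are installed and that the endpoint address is copied from the Lakehouse settings; the server, database, table, and view names are placeholders.

```python
# Hedged sketch: define reusable SQL logic on a Lakehouse SQL analytics endpoint.
# The endpoint address, database, and object names are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<your-sql-analytics-endpoint>;"        # copy from the Lakehouse settings
    "DATABASE=<your-lakehouse-name>;"
    "Authentication=ActiveDirectoryInteractive;",  # interactive Entra ID sign-in
    autocommit=True,
)

conn.cursor().execute(
    """
    CREATE VIEW dbo.vw_sales_by_region AS
    SELECT region, SUM(amount) AS total_amount
    FROM dbo.fact_orders
    GROUP BY region;
    """
)
conn.close()
```

The design trade-off is visible here: the view gives SQL-first teams a familiar place to keep business logic, but every query from the semantic model now passes through the SQL endpoint rather than reading the Delta files directly.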

Key Comparison Summary

Aspect | Direct Lake on OneLake | Direct Lake on SQL Endpoint
Data access | Direct file access | Via SQL analytics endpoint
Performance | Highest | Slightly lower
SQL dependency | None | Required
Schema flexibility | Lower | Higher
Transformation style | Lakehouse / Spark | SQL-based
Ideal for | Scale and performance | SQL reuse and flexibility

Choosing Between the Two (Exam-Focused Guidance)

On the DP-600 exam, questions typically focus on architectural intent and performance optimization:

Choose Direct Lake on OneLake when:

  • Performance is the top priority
  • Data is already modeled in Delta tables
  • You want the simplest, most scalable architecture
  • Near-real-time analytics are required

Choose Direct Lake on SQL endpoints when:

  • You need SQL views or transformations
  • Existing logic already exists in SQL
  • Teams are more comfortable with SQL than Spark
  • Some flexibility is preferred over maximum performance

Exam Tip 💡

If a question emphasizes:

  • Maximum performance, minimal latency, or scalability/large-scale analytics → Direct Lake on OneLake
  • SQL views, SQL transformations, or SQL reuse → Direct Lake on SQL endpoints

Expect scenario-based questions where both options are technically valid, but only one best aligns with the business and performance requirements.


Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect), not just which one is right
  • Look for scenario keywords in each question and let them guide your answer
  • Expect scenario-based questions rather than direct definitions

Question 1

A company has Delta tables stored in OneLake and wants the lowest possible query latency for Power BI reports without using SQL views. Which option should they choose?

A. Import mode
B. DirectQuery on SQL endpoint
C. Direct Lake on SQL endpoint
D. Direct Lake on OneLake

Correct Answer: D

Explanation:
Direct Lake on OneLake reads Delta tables directly from OneLake without a SQL layer, delivering the best performance and lowest latency.


Question 2

Which requirement would most strongly favor Direct Lake on SQL endpoints over Direct Lake on OneLake?

A. Maximum performance
B. Real-time data visibility
C. Use of SQL views for business logic
D. Minimal infrastructure dependencies

Correct Answer: C

Explanation:
Direct Lake on SQL endpoints allows semantic models to consume SQL views and transformations, making it ideal when business logic is defined in SQL.


Question 3

What is a key architectural difference between Direct Lake on OneLake and Direct Lake on SQL endpoints?

A. Only OneLake supports Delta tables
B. SQL endpoints require data import
C. OneLake access bypasses the SQL engine
D. SQL endpoints cannot be used with semantic models

Correct Answer: C

Explanation:
Direct Lake on OneLake reads Delta files directly from storage, while SQL endpoints introduce an additional SQL query layer.


Question 4

A Fabric semantic model uses Direct Lake on OneLake. Under which condition might it fall back to DirectQuery?

A. The model contains calculated columns
B. The dataset exceeds 1 TB
C. The Delta table schema is unsupported
D. The SQL endpoint is unavailable

Correct Answer: C

Explanation:
If the Delta table schema or data types are not supported by Direct Lake, Fabric automatically falls back to DirectQuery.


Question 5

Which scenario is best suited for Direct Lake on SQL endpoints?

A. High-volume streaming telemetry
B. SQL-first team reusing existing warehouse views
C. Near-real-time dashboards on raw lake data
D. Large fact tables optimized for scan performance

Correct Answer: B

Explanation:
Direct Lake on SQL endpoints is ideal when teams rely on SQL views and want to reuse existing SQL logic.


Question 6

Which statement about performance is most accurate?

A. SQL endpoints always outperform OneLake
B. OneLake always requires Import mode
C. Direct Lake on OneLake typically offers better performance
D. Direct Lake on SQL endpoints does not use Direct Lake

Correct Answer: C

Explanation:
Direct Lake on OneLake avoids the SQL layer, resulting in faster query execution in most scenarios.


Question 7

A Power BI model must reflect new data immediately after ingestion into OneLake. Which option best supports this requirement?

A. Import mode
B. DirectQuery
C. Direct Lake on SQL endpoint
D. Direct Lake on OneLake

Correct Answer: D

Explanation:
Direct Lake on OneLake reads data directly from Delta tables and reflects changes immediately without refresh.


Question 8

Which dependency exists when using Direct Lake on SQL endpoints that does not exist with Direct Lake on OneLake?

A. Delta Lake support
B. VertiPaq compression
C. SQL analytics endpoint availability
D. Semantic model compatibility

Correct Answer: C

Explanation:
Direct Lake on SQL endpoints depends on the SQL analytics endpoint being available, while OneLake access does not.


Question 9

From a DP-600 exam perspective, which factor most often determines the correct choice between these two options?

A. Dataset size alone
B. Whether SQL transformations are required
C. Number of report users
D. Power BI license type

Correct Answer: B

Explanation:
Exam questions typically focus on whether SQL logic (views, joins, transformations) is needed, which drives the choice.


Question 10

You are designing an enterprise semantic model focused on scalability and minimal complexity. The data is already curated as Delta tables. What is the best choice?

A. Import mode
B. DirectQuery on SQL endpoint
C. Direct Lake on SQL endpoint
D. Direct Lake on OneLake

Correct Answer: D

Explanation:
Direct Lake on OneLake offers the simplest architecture with the highest scalability and performance when Delta tables are already prepared.


Implement workspace-level access controls in Microsoft Fabric

This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections: 
Maintain a data analytics solution
--> Implement security and governance
--> Implement workspace-level access controls

To Do:
Complete the related module for this topic in the Microsoft Learn course: Secure data access in Microsoft Fabric

Workspace-level access control is the first and most fundamental security boundary in Microsoft Fabric. It determines who can access a workspace, what actions they can perform, and how they can interact with Fabric items such as Lakehouses, Warehouses, semantic models, reports, notebooks, and pipelines.

For the DP-600 exam, you should clearly understand workspace roles, their permissions, and how workspace security integrates with broader governance practices.

What Are Workspace-Level Access Controls?

Workspace-level access controls define permissions at the workspace scope, applying to all items within that workspace unless further restricted by item-level or data-level security.

These controls are managed through workspace roles, which are assigned to:

  • Individual users
  • Microsoft Entra ID (Azure AD) security groups
  • Distribution lists (limited scenarios)

Workspace Roles in Microsoft Fabric

Microsoft Fabric workspaces use role-based access control (RBAC). There are four roles that users can be assigned for workspace access, and each role grants a predefined set of permissions.

1. Admin

Highest level of access

Admins can:

  • Manage workspace settings
  • Add or remove users and assign roles
  • Delete the workspace
  • Control capacity assignment
  • Access and manage all items

Typical use cases

  • Platform administrators
  • Lead analytics engineers

Exam note
Admins automatically have all permissions of lower roles.

2. Member

Full content creation and collaboration role

Members can:

  • Create, edit, and delete Fabric items
  • Publish and update semantic models and reports
  • Share content
  • Run pipelines and notebooks

Members cannot:

  • Delete the workspace
  • Manage capacity settings

Typical use cases

  • Analytics engineers
  • Senior analysts

3. Contributor

Content creation with limited governance control

Contributors can:

  • Create and modify items they have access to
  • Run notebooks, pipelines, and queries
  • Publish reports and datasets

Contributors cannot:

  • Manage workspace users
  • Modify workspace settings

Typical use cases

  • Data analysts
  • Developers contributing content

4. Viewer

Read-only access

Viewers can:

  • View reports and dashboards
  • Read data from semantic models
  • Execute queries if explicitly allowed

Viewers cannot:

  • Create or edit items
  • Publish or share content

Typical use cases

  • Business users
  • Report consumers

Summary table:

Role | Description | Can / Cannot | Typical use cases
Admin | Highest level of access; full workspace administration, including the ability to delete the workspace | Can: manage workspace settings; add or remove users and assign roles; delete the workspace; control capacity assignment; access and manage all items | Platform administrators; lead analytics engineers
Member | Full content creation and collaboration role; can manage members with the same or lower permissions | Can: create, edit, and delete Fabric items; publish and update semantic models and reports; share content; run pipelines and notebooks. Cannot: delete the workspace; manage capacity settings | Analytics engineers; senior analysts
Contributor | Content creation with limited governance control; can create and manage workspace content | Can: create and modify items they have access to; run notebooks, pipelines, and queries; publish reports and datasets. Cannot: manage workspace users; modify workspace settings | Data analysts; developers contributing content
Viewer | Read-only access to the workspace | Can: view reports and dashboards; read data from semantic models; execute queries if explicitly allowed. Cannot: create or edit items; publish or share content | Business users; report consumers

How Workspace-Level Security Is Enforced

Workspace-level access controls:

  • Are evaluated before item-level or data-level security
  • Determine whether a user can even see workspace content
  • Apply consistently across all Fabric workloads (Power BI, Lakehouse, Warehouse, Data Factory, Real-Time Analytics)

This makes workspace roles the entry point for all other security mechanisms.

Best Practices for Workspace-Level Access Control

Use Security Groups Instead of Individuals

  • Assign Microsoft Entra ID security groups to workspace roles
  • Simplifies access management
  • Supports scalable governance
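
For example, rather than adding individual users, you can assign an Entra ID security group to a workspace role programmatically. The sketch below uses the Power BI REST API's add-group-user call; the workspace ID, group object ID, token acquisition, and required scopes are placeholders to confirm against the current API reference.

```python
# Hedged sketch: assign an Entra ID security group to a Fabric/Power BI workspace role.
# Workspace ID, group object ID, and the access token are placeholders/assumptions.
import requests

workspace_id = "<workspace-guid>"
security_group_object_id = "<entra-security-group-object-id>"
access_token = "<Entra token with the appropriate Power BI API scope>"

response = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}/users",
    headers={"Authorization": f"Bearer {access_token}"},
    json={
        "identifier": security_group_object_id,
        "principalType": "Group",
        # One of: Admin, Member, Contributor, Viewer
        "groupUserAccessRight": "Member",
    },
    timeout=30,
)
response.raise_for_status()
```

Because the group, not individual users, holds the role, onboarding and offboarding become Entra group membership changes rather than workspace edits.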

Separate Workspaces by Purpose

Common patterns include:

  • Development vs Test vs Production
  • Department-specific workspaces
  • Consumer-only (Viewer) workspaces

Apply Least Privilege

  • Grant users the lowest role necessary
  • Avoid overusing Admin and Member roles

Relationship to Other Security Layers

Workspace-level access controls work alongside:

  • Item-level permissions (e.g., sharing a report)
  • Row-level, column-level, and object-level security in semantic models
  • File-level security in OneLake
  • Capacity-level governance

For exam scenarios, always identify which security layer is being tested.

Common Exam Scenarios to Watch For

You may be asked to:

  • Choose the correct workspace role for a given user persona
  • Identify why a user cannot see or edit workspace content
  • Decide when to use Viewer vs Contributor
  • Understand how workspace roles interact with RLS or file access

Key Exam Takeaways

  • Workspace roles control who can access a workspace and what actions they can perform
  • Admin, Member, Contributor, and Viewer each have distinct permission boundaries
  • Workspace security is broader than item-level sharing
  • Always think workspace first, data second when designing security

Exam Tips

If the question is about who can create, edit, share, or manage content, the answer almost always involves workspace-level access controls.

Expect scenario-based questions that test:

  • Choosing the least-privileged role
  • Understanding the difference between Member vs Contributor
  • Knowing when workspace security is not enough and must be combined with RLS or item-level access

Practice Questions

Question 1 (Single choice)

Which workspace role in Microsoft Fabric allows a user to publish content, manage permissions, and delete the workspace?

A. Viewer
B. Contributor
C. Member
D. Admin

Correct Answer: D

Explanation:

  • Admin is the highest workspace role and includes full control, including managing access, deleting the workspace, and assigning roles.
  • Contributors and Members cannot manage workspace-level permissions.
  • Viewers have read-only access.

Question 2 (Scenario-based)

You want analysts to create and edit items (lakehouses, notebooks, reports) but prevent them from managing access or deleting the workspace. Which role should you assign?

A. Viewer
B. Contributor
C. Member
D. Admin

Correct Answer: C

Explanation:

  • Members can create, edit, and publish content but cannot manage workspace access or delete the workspace.
  • Contributors have more limited permissions.
  • Admins have excessive privileges for this scenario.

Question 3 (Multi-select)

Which actions are possible for a user assigned the Contributor role? (Select all that apply.)

A. Create new items
B. Edit existing items
C. Manage workspace permissions
D. Publish reports to the workspace

Correct Answers: A, B, D

Explanation:

  • Contributors can create and edit items and can publish reports to the workspace, consistent with the Contributor role described earlier.
  • They cannot manage workspace permissions or users.
  • Managing access and updating apps for broad audiences require Member or Admin.

Question 4 (Scenario-based)

A workspace contains sensitive data. You want executives to view reports only, without seeing datasets, lakehouses, or notebooks. What is the BEST approach?

A. Assign Viewer role
B. Assign Contributor role
C. Assign Member role
D. Assign Admin role

Correct Answer: A

Explanation:

  • Viewer role provides read-only access and prevents exposure to underlying assets beyond consumption.
  • Other roles expose authoring and object-level visibility.

Question 5 (Single choice)

Workspace-level access controls in Fabric are applied to:

A. Individual tables only
B. Semantic models only
C. All items within the workspace
D. Reports published to apps only

Correct Answer: C

Explanation:

  • Workspace-level roles apply across all items in the workspace unless further restricted using item-level or semantic-model security.
  • Finer-grained security must be implemented separately.

Question 6 (Scenario-based)

You need to ensure that workspace access is centrally governed and users cannot self-assign roles. What is the BEST practice?

A. Allow Members to manage access
B. Restrict access management to Admins only
C. Use Viewer roles exclusively
D. Disable workspace sharing

Correct Answer: B

Explanation:

  • Only Admins should manage workspace access for governance and compliance.
  • Members should not be allowed to assign roles in controlled environments.

Question 7 (Multi-select)

Which of the following are valid workspace roles in Microsoft Fabric? (Select all that apply.)

A. Viewer
B. Contributor
C. Member
D. Owner

Correct Answers: A, B, C

Explanation:

  • Valid Fabric workspace roles are Viewer, Contributor, Member, and Admin.
  • “Owner” is not a Fabric workspace role.

Question 8 (Scenario-based)

A user can view reports but receives an error when attempting to open a semantic model directly. What is the MOST likely reason?

A. They are a Contributor
B. They are a Viewer
C. The dataset is in Import mode
D. XMLA endpoint is disabled

Correct Answer: B

Explanation:

  • Viewers can consume reports but may not have permissions to explore or access underlying semantic models directly.
  • This behavior aligns with workspace-level access restrictions.

Question 9 (Single choice)

Which statement about workspace-level access vs. item-level security is TRUE?

A. Workspace access overrides all other security
B. Workspace access is more granular than item-level security
C. Item-level security can further restrict access granted by workspace roles
D. Workspace access only applies to reports

Correct Answer: C

Explanation:

  • Workspace roles grant baseline access, which can then be restricted using item-level security, RLS, or object-level permissions.
  • Workspace access does not override more restrictive controls.

Question 10 (Scenario-based)

You want to minimize administrative overhead while allowing self-service analytics. Which workspace role strategy is MOST appropriate?

A. Assign Admin to all users
B. Assign Member to authors and Viewer to consumers
C. Assign Contributor to executives
D. Assign Viewer to data engineers

Correct Answer: B

Explanation:

  • This is a recommended best practice:
    • Members for authors/builders
    • Viewers for consumers
  • It balances governance and agility while minimizing risk.

Implement item-level access controls in Microsoft Fabric

This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections: 
Maintain a data analytics solution
--> Implement security and governance
--> Implement item-level access controls

To Do:
Complete the related module for this topic in the Microsoft Learn course: Secure data access in Microsoft Fabric

Item-level access controls in Microsoft Fabric determine who can access or interact with specific items inside a workspace, rather than the entire workspace. Items include reports, semantic models, Lakehouses, Warehouses, notebooks, pipelines, dashboards, and other Fabric artifacts.

For the DP-600 exam, it’s important to understand how item-level permissions differ from workspace roles, when to use them, and how they interact with data-level security such as RLS.

What Are Item-Level Access Controls?

Item-level access controls:

  • Apply to individual Fabric items
  • Are more granular than workspace-level roles
  • Allow selective sharing without granting broad workspace access

They are commonly used when:

  • Users need access to one report or dataset, not the whole workspace
  • Consumers should view content without seeing development artifacts
  • External or business users need limited access

Common Items That Support Item-Level Permissions

In Microsoft Fabric, item-level permissions can be applied to:

  • Power BI reports
  • Semantic models (datasets)
  • Dashboards
  • Lakehouses and Warehouses
  • Notebooks and pipelines (via workspace + item context)

The most frequently tested scenarios in DP-600 involve reports and semantic models.

Sharing Reports and Dashboards

Report Sharing

Reports can be shared directly with users or groups.

When you share a report:

  • Users can be granted View or Reshare permissions
  • The report appears in the recipient’s “Shared with me” section
  • Access does not automatically grant workspace access

Exam considerations

  • Sharing a report does not grant edit permissions
  • Sharing does not bypass data-level security (RLS still applies)
  • Users must also have access to the underlying semantic model

Semantic Model (Dataset) Permissions

Semantic models support explicit permissions that control how users interact with data.

Common permissions include:

  • Read – View and query the model
  • Build – Create reports using the model
  • Write – Modify the model (typically for owners)
  • Reshare – Share the model with others

Typical use cases

  • Allow analysts to build their own reports (Build permission)
  • Allow consumers to view reports without building new ones
  • Restrict direct querying of datasets

Exam tips

  • Build permission is required for “Analyze in Excel” and report creation
  • RLS and OLS are enforced at the semantic model level
  • Dataset permissions can be granted independently of report sharing
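
As a concrete illustration of the tips above, the sketch below grants a user Read plus Build permission on a semantic model using the Power BI REST API. The dataset ID, user principal name, token handling, and the exact access-right string are assumptions to verify against the current API reference.

```python
# Hedged sketch: grant Read + Build on a semantic model (dataset) to a user.
# Dataset ID, UPN, and the access token are placeholders/assumptions.
import requests

dataset_id = "<dataset-guid>"
user_upn = "analyst@contoso.com"  # hypothetical user
access_token = "<Entra token with the appropriate Power BI API scope>"

response = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/datasets/{dataset_id}/users",
    headers={"Authorization": f"Bearer {access_token}"},
    json={
        "identifier": user_upn,
        "principalType": "User",
        # Assumed mapping: "ReadExplore" corresponds to Read + Build in the service UI
        "datasetUserAccessRight": "ReadExplore",
    },
    timeout=30,
)
response.raise_for_status()
```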

Item-Level Access vs Workspace-Level Roles

Understanding this distinction is critical for the exam.

Feature | Workspace-Level Access | Item-Level Access
Scope | Entire workspace | Single item
Typical roles | Admin, Member, Contributor, Viewer | View, Build, Reshare
Best for | Team collaboration | Targeted sharing
Granularity | Coarse | Fine-grained

Key exam insight:
Item-level access does not override workspace permissions. A user cannot edit an item if their workspace role is Viewer, even if the item is shared.

Interaction with Data-Level Security

Item-level access works together with:

  • Row-Level Security (RLS)
  • Column-Level Security (CLS)
  • Object-Level Security (OLS)

Important behaviors:

  • Sharing a report does not expose restricted rows or columns
  • RLS is evaluated based on the user’s identity
  • Item access only determines whether a user can query the item, not what data they see

Common Exam Scenarios

You may encounter questions such as:

  • A user can see a report but cannot build a new one → missing Build permission
  • A user has report access but sees no data → likely RLS
  • A business user needs access to one report only → item-level sharing, not workspace access
  • An analyst can’t query a dataset in Excel → lacks Build permission

Best Practices to Remember

  • Use item-level access for consumers and ad-hoc sharing
  • Use workspace roles for development teams
  • Assign permissions to Entra ID security groups when possible
  • Always pair item access with appropriate semantic model permissions

Key Exam Takeaways

  • Item-level access controls provide fine-grained security
  • Reports and semantic models are the most tested items
  • Build permission is critical for self-service analytics
  • Item-level access complements, but does not replace, workspace roles

Exam Tips

  • Think “Can they see the object at all?”
  • Combine:
    • Workspace roles → broad access
    • Item-level access → fine-grained control
    • RLS/CLS → data-level restrictions
  • Expect scenarios involving:
    • Preventing access to lakehouses
    • Separating authors from consumers
    • Protecting production assets
  • If a question asks who can view or build from a specific report or dataset without granting workspace access, the correct answer almost always involves item-level access controls.

Practice Questions:

Question 1 (Single choice)

What is the PRIMARY purpose of item-level access controls in Microsoft Fabric?

A. Control which rows a user can see
B. Control which columns a user can see
C. Control access to specific workspace items
D. Control DAX query execution speed

Correct Answer: C

Explanation:

  • Item-level access controls determine who can access specific items (lakehouses, warehouses, semantic models, notebooks, reports).
  • Row-level and column-level security are semantic model features, not item-level controls.

Question 2 (Scenario-based)

A user should be able to view reports but must NOT access the underlying lakehouse or semantic model. Which control should you use?

A. Workspace Viewer role
B. Item-level permissions on the lakehouse and semantic model
C. Row-level security
D. Column-level security

Correct Answer: B

Explanation:

  • Item-level access allows you to block direct access to specific items even when the user has workspace access.
  • Viewer role alone may still expose certain metadata.

Question 3 (Multi-select)

Which Fabric items support item-level access control? (Select all that apply.)

A. Lakehouses
B. Warehouses
C. Semantic models
D. Power BI reports

Correct Answers: A, B, C, D

Explanation:

  • Item-level access can be applied to most Fabric artifacts, including data storage, models, and reports.
  • This allows fine-grained governance beyond workspace roles.

Question 4 (Scenario-based)

You want data engineers to manage a lakehouse, but analysts should only consume a semantic model built on top of it. What is the BEST approach?

A. Assign Analysts as Workspace Viewers
B. Deny item-level access to the lakehouse for Analysts
C. Use Row-Level Security only
D. Disable SQL endpoint access

Correct Answer: B

Explanation:

  • Analysts can access the semantic model while being explicitly denied access to the lakehouse via item-level permissions.
  • This is a common enterprise pattern in Fabric.

Question 5 (Single choice)

Which permission is required for a user to edit or manage an item at the item level?

A. Read
B. View
C. Write
D. Execute

Correct Answer: C

Explanation:

  • Write permissions allow editing, updating, or managing an item.
  • Read/View permissions are consumption-only.

Question 6 (Scenario-based)

A user can see a report but receives an error when trying to connect to its semantic model using Power BI Desktop. Why?

A. XMLA endpoint is disabled
B. They lack item-level permission on the semantic model
C. The dataset is in Direct Lake mode
D. The report uses DirectQuery

Correct Answer: B

Explanation:

  • Viewing a report does not automatically grant access to the underlying semantic model.
  • Item-level access must explicitly allow it.

Question 7 (Multi-select)

Which statements about workspace access vs item-level access are TRUE? (Select all that apply.)

A. Workspace access automatically grants access to all items
B. Item-level access can further restrict workspace permissions
C. Item-level access overrides Row-Level Security
D. Workspace roles are broader than item-level permissions

Correct Answers: B, D

Explanation:

  • Workspace roles define baseline access.
  • Item-level access can tighten restrictions on specific assets.
  • RLS still applies within semantic models.

Question 8 (Scenario-based)

You want to prevent accidental modification of a production semantic model while still allowing users to query it. What should you do?

A. Assign Viewer role at the workspace level
B. Grant Read permission at the item level
C. Disable the SQL endpoint
D. Remove the semantic model

Correct Answer: B

Explanation:

  • Read item-level permission allows querying and consumption without edit rights.
  • This is safer than relying on workspace roles alone.

Question 9 (Single choice)

Which security layer is MOST appropriate for restricting access to entire objects rather than data within them?

A. Row-level security
B. Column-level security
C. Object-level security
D. Item-level access control

Correct Answer: D

Explanation:

  • Item-level access controls whether a user can access an object at all.
  • Object-level security applies inside semantic models.

Question 10 (Scenario-based)

A compliance requirement states that only approved users can access notebooks in a workspace. What is the BEST solution?

A. Place notebooks in a separate workspace
B. Apply item-level access controls to notebooks
C. Use Row-Level Security
D. Restrict workspace Viewer access

Correct Answer: B

Explanation:

  • Item-level access allows targeted restriction without restructuring workspaces.
  • This is the preferred Fabric governance approach.

Implement Row-Level, Column-Level, Object-Level, and File-Level Access Controls in Microsoft Fabric

This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections: 
Maintain a data analytics solution
--> Implement security and governance
--> Implement row-level, column-level, object-level, and file-level access control

To Do:
Complete the related module for this topic in the Microsoft Learn course: Secure data access in Microsoft Fabric

Security and governance are foundational responsibilities of a Fabric Analytics Engineer. Microsoft Fabric provides multiple layers of access control to ensure users can only see and interact with the data they are authorized to access. For the DP-600 exam, it is important to understand what each access control type does, where it is applied, and when to use it.

1. Row-Level Security (RLS)

What it is

Row-Level Security (RLS) restricts access to specific rows in a table based on the identity or role of the user querying the data.

Where it is implemented

  • Power BI semantic models (datasets)
  • Direct Lake or Import models in Fabric
  • Applies at query time

How it works

  • You define DAX filter expressions on tables.
  • Users are assigned to roles, and those roles determine which rows are visible.
  • The filtering is enforced automatically whenever the model is queried.

Common use cases

  • Sales users see only their assigned regions
  • Managers see only their department’s data
  • Multi-tenant reporting scenarios

Exam tips

  • RLS filters rows, not columns
  • RLS is evaluated dynamically based on user context
  • Know the difference between static RLS (hard-coded filters such as [Region] = "West") and dynamic RLS (per-user filters such as [SalesRepEmail] = USERPRINCIPALNAME(), often combined with lookup tables); a quick way to test an RLS role is sketched below
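
One practical way to verify dynamic RLS is to run a DAX query while impersonating a test user through the Power BI REST API executeQueries endpoint, as sketched below. The dataset ID, user principal name, table and column names in the DAX query, and token handling are placeholders; impersonation also requires sufficient permissions on the model.

```python
# Hedged sketch: validate an RLS role by querying a semantic model as a specific user.
# Dataset ID, UPN, the DAX query, and the token are placeholders/assumptions.
import requests

dataset_id = "<dataset-guid>"
access_token = "<Entra token with the appropriate Power BI API scope>"

body = {
    "queries": [{
        "query": 'EVALUATE SUMMARIZECOLUMNS(Sales[Region], "Total", SUM(Sales[Amount]))'
    }],
    # Rows returned should reflect the RLS filters that apply to this user.
    "impersonatedUserName": "salesrep.west@contoso.com",
}

response = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/datasets/{dataset_id}/executeQueries",
    headers={"Authorization": f"Bearer {access_token}"},
    json=body,
    timeout=60,
)
response.raise_for_status()
print(response.json()["results"][0]["tables"][0]["rows"])
```

If the impersonated user sees rows outside their assigned region, the role's filter expression or the role membership needs attention.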

2. Column-Level Security (CLS)

What it is

Column-Level Security (CLS) restricts access to specific columns within a table, preventing sensitive fields from being exposed.

Where it is implemented

  • Power BI semantic models
  • Defined within the model, not in reports

How it works

  • Columns are marked as hidden for certain roles
  • Users in those roles cannot query or visualize the restricted columns

Common use cases

  • Hiding personally identifiable information (PII)
  • Restricting access to salary, cost, or confidential metrics

Exam tips

  • CLS does not hide entire rows
  • Users without access cannot bypass CLS using visuals or queries
  • CLS is evaluated before data reaches the report layer

3. Object-Level Security (OLS)

What it is

Object-Level Security (OLS) controls access to entire objects within a semantic model, such as:

  • Tables
  • Columns
  • Measures

Where it is implemented

  • Power BI semantic models in Fabric
  • Typically managed using external tools (for example, Tabular Editor) or script-based model editing through the XMLA endpoint

How it works

  • Objects are explicitly denied to specific roles
  • Denied objects are completely invisible to the user

Common use cases

  • Hiding technical or staging tables
  • Preventing access to internal calculation measures
  • Supporting multiple audiences from the same model

Exam tips

  • OLS is stronger than CLS (objects are invisible, not just hidden)
  • OLS affects metadata discovery
  • Users cannot query objects they do not have access to

4. File-Level Access Controls

What it is

File-level access control governs who can access files stored in OneLake, including:

  • Lakehouse files
  • Warehouse data
  • Files accessed via notebooks or Spark jobs

Where it is implemented

  • OneLake
  • Workspace permissions
  • Underlying Azure Data Lake Gen2 permission model

How it works

  • Permissions are assigned at:
    • Workspace level
    • Item level (Lakehouse, Warehouse)
    • Folder or file level (where applicable)
  • Uses role-based access control (RBAC)

Common use cases

  • Restricting raw data access to engineers only
  • Allowing analysts read-only access to curated zones
  • Enforcing separation between development and production data

Exam tips

  • File-level security applies before data reaches semantic models
  • Workspace roles (Admin, Member, Contributor, Viewer) matter
  • OneLake follows a centralized storage model across Fabric workloads
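
Because OneLake exposes ADLS Gen2-compatible APIs, file-level access can be exercised (and therefore tested) with standard storage tooling. The sketch below lists files in a Lakehouse folder using the azure-identity and azure-storage-file-datalake packages; the workspace, lakehouse, and folder names are placeholders, and the call only succeeds if the signed-in identity has been granted access through Fabric.

```python
# Hedged sketch: list Lakehouse files through OneLake's ADLS Gen2-compatible endpoint.
# Workspace, lakehouse, and folder names are placeholders; access is governed by the
# caller's Fabric/OneLake permissions, not by anything in this code.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://onelake.dfs.fabric.microsoft.com",
    credential=DefaultAzureCredential(),
)

# In OneLake, the "file system" is the workspace and each item is a top-level folder.
file_system = service.get_file_system_client("<workspace-name>")

for path in file_system.get_paths(path="<lakehouse-name>.Lakehouse/Files/raw"):
    print(path.name)
```

If the identity lacks the relevant Fabric permissions, the same call fails with an authorization error, which is exactly the behavior file-level access control is meant to enforce.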

Key Comparisons to Remember for the Exam

Security Type | Scope | Enforced At | Typical Use
Row-Level (RLS) | Rows | Query time | User-specific data filtering
Column-Level (CLS) | Columns | Model level | Protect sensitive fields
Object-Level (OLS) | Tables, columns, measures | Model metadata | Hide entire objects
File-Level | Files and folders | Storage/workspace | Control raw and curated data access

How This Fits into Fabric Governance

In Microsoft Fabric, these access controls work together:

  • File-level security protects data at rest
  • Object-, column-, and row-level security protect data at the semantic model layer
  • Workspace roles govern who can create, modify, or consume items

For the DP-600 exam, expect scenario-based questions that test:

  • Choosing the right level of security
  • Understanding where security is enforced
  • Knowing limitations and interactions between security types

Final Exam Tips

If the question mentions who can see which data values, think RLS or CLS.
If it mentions who can see which objects, think OLS.
If it mentions access to files or raw data, think file-level and workspace permissions.

DP-600 Exam Strategy Notes

  • Security evaluation order (exam favorite):
    1. Workspace access
    2. Item-level access
    3. Object-level security
    4. Column-level security
    5. Row-level security
  • Use:
    • RLS → Who sees which rows?
    • CLS → Who sees which columns?
    • OLS → Who sees which tables/measures?
    • File-level → Who sees which files?


Practice Questions

Question 1 (Single choice)

Which access control mechanism restricts which rows of data a user can see in a semantic model?

A. Column-level security
B. Object-level security
C. Row-level security
D. Item-level access

Correct Answer: C

Explanation:

  • Row-level security (RLS) filters rows dynamically based on user identity.
  • CLS restricts columns, OLS restricts objects, and item-level controls access to the artifact itself.

Question 2 (Scenario-based)

A sales manager should only see sales data for their assigned region across all reports. Which solution should you implement?

A. Column-level security
B. Row-level security with dynamic DAX
C. Object-level security
D. Workspace Viewer role

Correct Answer: B

Explanation:

  • Dynamic RLS uses functions like USERPRINCIPALNAME() to filter rows per user.
  • Workspace roles do not filter data.

Question 3 (Multi-select)

Which security types are configured within a Power BI semantic model? (Select all that apply.)

A. Row-level security
B. Column-level security
C. Object-level security
D. File-level security

Correct Answers: A, B, C

Explanation:

  • RLS, CLS, and OLS are semantic model features.
  • File-level security applies to OneLake files, not semantic models.

Question 4 (Scenario-based)

You want to prevent users from seeing a Salary column but still allow access to other columns in the table. What should you use?

A. Row-level security
B. Object-level security
C. Column-level security
D. Item-level access

Correct Answer: C

Explanation:

  • Column-level security hides specific columns from unauthorized users.
  • RLS filters rows, not columns.

Question 5 (Single choice)

Which access control hides entire tables or measures from users?

A. Row-level security
B. Column-level security
C. Object-level security
D. File-level security

Correct Answer: C

Explanation:

  • Object-level security (OLS) hides tables, columns, or measures completely.
  • Users won’t even see them in the field list.

Question 6 (Scenario-based)

A user should be able to query a semantic model but must not see a calculated measure used only internally. Which control is BEST?

A. Column-level security
B. Object-level security
C. Row-level security
D. Workspace permission

Correct Answer: B

Explanation:

  • OLS can hide measures entirely.
  • CLS only applies to columns, not measures.

Question 7 (Multi-select)

Which scenarios require file-level access controls in Microsoft Fabric? (Select all that apply.)

A. Restricting access to specific Parquet files in OneLake
B. Limiting access to a lakehouse table
C. Controlling access to raw ingestion files
D. Filtering rows in a semantic model

Correct Answers: A, C

Explanation:

  • File-level access applies to files and folders in OneLake.
  • Table and row access are handled elsewhere.

Question 8 (Scenario-based)

A data engineer needs access to raw files in OneLake, but analysts should only see curated tables. What should you implement?

A. Row-level security
B. Column-level security
C. File-level access controls
D. Object-level security

Correct Answer: C

Explanation:

  • File-level access ensures analysts cannot browse or access raw files.
  • RLS and CLS don’t apply at the file system level.

Question 9 (Single choice)

Which security type is evaluated first when a user attempts to access data?

A. Row-level security
B. Column-level security
C. Item-level access
D. Object-level security

Correct Answer: C

Explanation:

  • Item-level access determines whether the user can access the artifact at all.
  • If denied, other security layers are never evaluated.

Question 10 (Scenario-based)

A user can access a report but receives an error when querying a table directly from the semantic model. What is the MOST likely cause?

A. Missing Row-Level Security role
B. Column-level security blocking access
C. Object-level security hiding the table
D. File-level security restriction

Correct Answer: C

Explanation:

  • If OLS hides a table, it cannot be queried—even if reports still function.
  • Reports may rely on cached or abstracted queries.