
Practice Questions: Create and configure a workspace (PL-300 Exam Prep)

This post is a part of the PL-300: Microsoft Power BI Data Analyst Exam Prep Hub; and this topic falls under these sections:
Manage and secure Power BI (15–20%)
--> Create and manage workspaces and assets
--> Create and configure a workspace


Below are 10 practice questions (with answers and explanations) for this topic of the exam.
There are also 2 practice tests for the PL-300 exam with 60 questions each (with answers) available on the hub.

Practice Questions


Question 1

You want to allow a user to publish reports and update semantic models, but not manage users or workspace settings. Which workspace role should you assign?

A. Viewer
B. Contributor
C. Member
D. Admin

Correct Answer: C. Member

Explanation:
Members can create, edit, publish, and update content but cannot manage workspace settings or users. Contributors can also create and edit content but cannot publish or update apps. Admins have full control.


Question 2

Which workspace role provides read-only access to reports and dashboards?

A. Viewer
B. Contributor
C. Member
D. Admin

Correct Answer: A. Viewer

Explanation:
Viewers can only consume content. They cannot edit, publish, or share assets. This role is commonly used for business users.


Question 3

You need to use Copilot, deployment pipelines, and large semantic models in a workspace. What must you configure?

A. Assign Admin role to all users
B. Enable dataset endorsements
C. Move the workspace to Premium capacity
D. Publish a Power BI App

Correct Answer: C. Move the workspace to Premium capacity

Explanation:
Advanced features such as Copilot, deployment pipelines, and large semantic models require Premium (or Fabric) capacity.


Question 4

Which object is NOT stored inside a Power BI workspace?

A. Report
B. Dashboard
C. Semantic model
D. Power BI App

Correct Answer: D. Power BI App

Explanation:
Apps are published from workspaces but are not stored as workspace assets themselves.


Question 5

Who can add or remove users from a workspace?

A. Viewer
B. Contributor
C. Member
D. Admin

Correct Answer: D. Admin

Explanation:
Only workspace Admins can manage access and change workspace-level settings.


Question 6

A team wants to collaborate on report development but ensure that end users cannot modify reports. What is the BEST approach?

A. Give all users Member access
B. Share individual reports
C. Publish an app from the workspace
D. Use row-level security

Correct Answer: C. Publish an app from the workspace

Explanation:
Publishing an app provides a read-only, curated experience for consumers while allowing developers to work in the workspace.


Question 7

Which workspace role can create and modify content but cannot publish apps or manage access?

A. Viewer
B. Contributor
C. Member
D. Admin

Correct Answer: B. Contributor

Explanation:
Contributors can build and modify content but cannot manage users or publish apps.


Question 8

You want to ensure support requests and ownership questions are directed correctly. Which workspace setting helps with this?

A. Assigning Admin role
B. Workspace description
C. Contact list
D. App permissions

Correct Answer: C. Contact list

Explanation:
The contact list identifies workspace owners or support contacts and is a recommended best practice.


Question 9

Which scenario BEST justifies creating multiple workspaces instead of a single one?

A. Different teams own different datasets
B. Users want different slicer behavior
C. Reports use different visuals
D. Dashboards need bookmarks

Correct Answer: A. Different teams own different datasets

Explanation:
Workspaces should align with ownership, security boundaries, and business domains, not visual design preferences.


Question 10

Which statement about Power BI workspaces is TRUE?

A. Workspaces can only contain reports
B. Workspace roles apply only to dashboards
C. Apps must be published from a workspace
D. Viewers can publish new content

Correct Answer: C. Apps must be published from a workspace

Explanation:
All Power BI Apps are published from workspaces. Viewers cannot publish content, and workspaces contain multiple asset types.


Quick Exam Tips

  • Admins manage access and settings
  • Members publish and collaborate
  • Contributors build but don’t manage
  • Viewers consume only
  • Premium capacity unlocks advanced features
  • Apps = curated, read-only distribution

Go back to the PL-300 Exam Prep Hub main page

Practice Questions: Identify when a gateway is required (PL-300 Exam Prep)

This post is a part of the PL-300: Microsoft Power BI Data Analyst Exam Prep Hub; and this topic falls under these sections:
Manage and secure Power BI (15–20%)
--> Create and manage workspaces and assets
--> Identify when a gateway is required


Below are 10 practice questions (with answers and explanations) for this topic of the exam.
There are also 2 practice tests for the PL-300 exam with 60 questions each (with answers) available on the hub.

Practice Questions


Question 1

You publish a Power BI report that imports data from an on-premises SQL Server and want to schedule daily refreshes in the Power BI service. What is required?

A. No additional configuration
B. A Power BI app
C. An on-premises data gateway
D. A premium capacity workspace

Correct Answer: C

Explanation:
Scheduled refresh from an on-premises data source requires a gateway to securely connect Power BI service to the local SQL Server.


Question 2

A dataset uses Azure SQL Database in Import mode with scheduled refresh enabled. Is a gateway required?

A. Yes, because scheduled refresh is enabled
B. Yes, because Import mode is used
C. No, because the data source is cloud-based
D. No, because the dataset is small

Correct Answer: C

Explanation:
Azure SQL Database is a cloud data source that Power BI can access directly, so no gateway is needed.


Question 3

You create a Power BI report using DirectQuery to an on-premises SQL Server. When users view the report in the Power BI service, what is required?

A. A gateway
B. A scheduled refresh
C. Import mode
D. Power BI Premium

Correct Answer: A

Explanation:
DirectQuery sends queries at report view time. A gateway is required for on-premises sources.


Question 4

Which scenario does NOT require a Power BI gateway?

A. Importing data from SharePoint Online
B. DirectQuery to an on-premises database
C. Refreshing an on-premises dataflow
D. Live connection to on-premises SSAS

Correct Answer: A

Explanation:
SharePoint Online is a cloud-based service and does not require a gateway.


Question 5

A report combines data from Azure Data Lake Storage and an on-premises file share. What is true?

A. No gateway is required because one source is cloud-based
B. A gateway is required for the on-premises source
C. A gateway is required for both sources
D. Gateways are not supported for mixed data sources

Correct Answer: B

Explanation:
Any on-premises data source used in the Power BI service requires a gateway, even in hybrid datasets.


Question 6

While working in Power BI Desktop, you connect to an on-premises SQL Server and refresh data locally. Is a gateway required?

A. Yes, always
B. Yes, if Import mode is used
C. No, gateways are only needed in the Power BI service
D. No, if DirectQuery is used

Correct Answer: C

Explanation:
Power BI Desktop connects directly to local data sources. Gateways are only required after publishing to the Power BI service.


Question 7

You want to refresh a Power BI dataflow that connects to an on-premises Oracle database. What is required?

A. Power BI Premium
B. A gateway
C. A paginated report
D. An app workspace

Correct Answer: B

Explanation:
Dataflows that use on-premises data sources require a gateway to refresh in the Power BI service.


Question 8

Which connection type always requires a gateway when the data source is on-premises?

A. Import with manual refresh
B. Import with scheduled refresh
C. DirectQuery
D. Both B and C

Correct Answer: D

Explanation:
Scheduled refresh and DirectQuery both require a gateway for on-premises data sources.


Question 9

A report uses a Live connection to an on-premises Analysis Services model. What is required?

A. A dataset refresh schedule
B. A gateway
C. Import mode
D. A certified dataset

Correct Answer: B

Explanation:
Live connections to on-premises Analysis Services require a gateway for real-time queries.


Question 10

Which factor is the most important when deciding if a gateway is required?

A. Dataset size
B. Data refresh frequency
C. Location of the data source
D. Number of report users

Correct Answer: C

Explanation:
Gateway requirements are based on whether the data source is accessible from the cloud or located on-premises.


Exam Tips

  • On-premises + Power BI service = Gateway
  • Cloud sources do not require gateways
  • DirectQuery and Live connections still require gateways
  • Desktop-only work never requires a gateway

Go back to the PL-300 Exam Prep Hub main page

Practice Questions: Configure item-level access in Power BI (PL-300 Exam Prep)

This post is a part of the PL-300: Microsoft Power BI Data Analyst Exam Prep Hub; and this topic falls under these sections:
Manage and secure Power BI (15–20%)
--> Secure and govern Power BI items
--> Configure item-level access


Below are 10 practice questions (with answers and explanations) for this topic of the exam.
There are also 2 practice tests for the PL-300 exam with 60 questions each (with answers) available on the hub.

Practice Questions


Question 1

You want business users to create their own reports using an existing semantic model, but you do not want them to edit the model. What should you grant them?

A. Workspace Viewer role
B. Workspace Contributor role
C. Build permission on the semantic model
D. Read permission on the report

Correct Answer: C

Explanation:
The Build permission allows users to create new reports using a semantic model without modifying it. Viewer access alone does not allow report creation, and Contributor access is broader than required.


Question 2

A user can view a dashboard but sees broken tiles that fail to load data. What is the most likely cause?

A. The dataset refresh failed
B. The user lacks Build permission
C. The user does not have access to the underlying report
D. The dashboard was shared incorrectly

Correct Answer: C

Explanation:
Dashboard tiles link back to underlying reports. If the user does not have access to those reports, the tiles will not display correctly—even if the dashboard itself is shared.


Question 3

Which permission allows a user to create a new report in Power BI Desktop using a published semantic model?

A. Read
B. Viewer
C. Contributor
D. Build

Correct Answer: D

Explanation:
Only the Build permission enables users to create new reports from an existing semantic model, including using Power BI Desktop or Analyze in Excel.


Question 4

You need to limit who can see specific reports within a Power BI app without creating multiple apps. What should you use?

A. Row-level security (RLS)
B. Workspace roles
C. App audiences
D. Dataset permissions

Correct Answer: C

Explanation:
App audiences provide item-level visibility within an app, allowing different user groups to see different reports or dashboards.


Question 5

Which statement best describes item-level access?

A. It controls what data rows users can see
B. It controls access to entire workspaces
C. It controls access to individual Power BI items
D. It replaces workspace roles

Correct Answer: C

Explanation:
Item-level access applies to individual items such as reports, dashboards, and datasets. It does not control row-level data access and does not replace workspace roles.


Question 6

A user has access to a report but cannot export data from it. What is the most likely explanation?

A. The dataset is using DirectQuery
B. The report is in a Premium workspace
C. Export permissions are restricted at the report or tenant level
D. The user lacks RLS permissions

Correct Answer: C

Explanation:
Export behavior is governed by item-level settings and tenant-level policies, not RLS or workspace type alone.


Question 7

When sharing a report, which permission must be explicitly granted if the user needs to reshare it with others?

A. Build
B. Viewer
C. Contributor
D. Reshare

Correct Answer: D

Explanation:
The Reshare permission must be explicitly enabled when sharing an item. Without it, users can view the report but cannot share it further.


Question 8

Which scenario requires item-level access instead of workspace roles?

A. Granting full control of all assets
B. Managing dataset refresh schedules
C. Allowing users to view only specific reports in a workspace
D. Enabling paginated report creation

Correct Answer: C

Explanation:
Item-level access allows fine-grained control over individual assets, making it ideal when users should only see specific reports.


Question 9

How does item-level access differ from row-level security (RLS)?

A. Item-level access controls data rows
B. RLS controls report visibility
C. Item-level access controls content access; RLS controls data visibility
D. They serve the same purpose

Correct Answer: C

Explanation:
Item-level access determines whether a user can open or interact with content, while RLS limits the data shown within that content.


Question 10

What is the recommended best practice when assigning item-level access at scale?

A. Assign permissions to individual users
B. Use workspace roles only
C. Use Azure AD security groups
D. Share reports anonymously

Correct Answer: C

Explanation:
Using Azure AD security groups improves scalability, simplifies maintenance, and aligns with enterprise governance best practices.


Exam Readiness Tip

If you can confidently answer questions about:

  • Build vs Read vs Reshare
  • Dashboards vs reports vs datasets
  • Item-level access vs workspace roles vs RLS

…you are in excellent shape for PL-300 questions in this domain.


Go back to the PL-300 Exam Prep Hub main page

Practice Questions: Configure Access to Semantic Models (PL-300 Exam Prep)

This post is a part of the PL-300: Microsoft Power BI Data Analyst Exam Prep Hub; and this topic falls under these sections:
Manage and secure Power BI (15–20%)
--> Secure and govern Power BI items
--> Configure access to semantic models


Below are 10 practice questions (with answers and explanations) for this topic of the exam.
There are also 2 practice tests for the PL-300 exam with 60 questions each (with answers) available on the hub.

Practice Questions


Question 1

A user can view reports in a workspace but cannot create a new report using the existing semantic model. What is the most likely reason?

A. The user does not have Read permission on the semantic model
B. The user does not have Build permission on the semantic model
C. The user is not assigned a Row-Level Security role
D. The semantic model is not endorsed

Correct Answer: B

Explanation:
Creating new reports from a semantic model requires Build permission. A user can still view reports without Build permission, which makes this a common exam scenario.


Question 2

Which workspace role allows a user to edit semantic models and manage permissions?

A. Viewer
B. Contributor
C. Member
D. App user

Correct Answer: C

Explanation:
Members can publish, update, and manage semantic models, including assigning permissions. Contributors can edit content but cannot manage access.


Question 3

You want business users to create their own reports while preventing them from modifying the semantic model. What is the best approach?

A. Assign users the Viewer role and grant Build permission on the semantic model
B. Assign users the Contributor role
C. Assign users the Admin role
D. Publish the reports through a Power BI App only

Correct Answer: A

Explanation:
Granting Viewer role + Build permission enables self-service report creation without allowing model changes—this is a best practice and frequently tested.


Question 4

Where is Row-Level Security (RLS) enforced?

A. At the report level
B. At the dashboard level
C. At the semantic model level
D. At the workspace level

Correct Answer: C

Explanation:
RLS is defined in Power BI Desktop and enforced at the semantic model level, applying to all reports that use the model.


Question 5

Which DAX function is commonly used to implement dynamic Row-Level Security?

A. SELECTEDVALUE()
B. USERELATIONSHIP()
C. USERPRINCIPALNAME()
D. LOOKUPVALUE()

Correct Answer: C

Explanation:
USERPRINCIPALNAME() returns the logged-in user’s email or UPN and is commonly used in dynamic RLS filters.
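
As a minimal illustration of dynamic RLS (the 'Sales' table and SalespersonEmail column are hypothetical placeholders, not part of the exam content), the role's table filter defined in Power BI Desktop might be a single DAX expression:

  'Sales'[SalespersonEmail] = USERPRINCIPALNAME()

At query time the filter keeps only the rows whose email value matches the signed-in user's UPN, so one role can serve every user.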


Question 6

A user with Viewer access can see a report but receives an error when using Analyze in Excel. What is the most likely issue?

A. The user is not licensed for Power BI
B. The semantic model is not certified
C. The user does not have Build permission
D. RLS is incorrectly configured

Correct Answer: C

Explanation:
Analyze in Excel requires Build permission on the semantic model. Viewer role alone is insufficient.


Question 7

Which permission allows a user to share a semantic model with others?

A. Read
B. Build
C. Reshare
D. Admin

Correct Answer: C

Explanation:
The Reshare permission explicitly allows users to share the semantic model with other users or groups.


Question 8

What is the primary purpose of certifying a semantic model?

A. To apply Row-Level Security automatically
B. To improve query performance
C. To indicate the model is an approved and trusted data source
D. To allow external tool access

Correct Answer: C

Explanation:
Certification signals that a semantic model is officially approved and governed, helping users identify trusted data sources.


Question 9

Which approach is recommended for managing access to semantic models at scale?

A. Assign permissions to individual users
B. Use Microsoft Entra ID (Azure AD) security groups
C. Share semantic models directly from Power BI Desktop
D. Grant Admin role to all analysts

Correct Answer: B

Explanation:
Using security groups simplifies access management, supports scalability, and aligns with governance best practices.


Question 10

A report is published using a semantic model that has RLS enabled. A user accesses the report through a Power BI App. What happens?

A. RLS is ignored when using apps
B. RLS must be reconfigured for the app
C. RLS is enforced automatically
D. Only static RLS is applied

Correct Answer: C

Explanation:
Row-Level Security is always enforced at the semantic model level, regardless of whether content is accessed via a workspace, report, or app.


Final Exam Tips

  • Build permission is the most frequently tested concept
  • Viewer + Build is a common least-privilege design pattern
  • RLS always applies at the semantic model level
  • Certification is about trust and governance, not security
  • Apps do not bypass semantic model security

Go back to the PL-300 Exam Prep Hub main page

Practice Questions: Implement Row-Level Security Roles (PL-300 Exam Prep)

This post is a part of the PL-300: Microsoft Power BI Data Analyst Exam Prep Hub; and this topic falls under these sections:
Manage and secure Power BI (15–20%)
--> Secure and govern Power BI items
--> Implement row-level security roles


Below are 10 practice questions (with answers and explanations) for this topic of the exam.
There are also 2 practice tests for the PL-300 exam with 60 questions each (with answers) available on the hub.

Practice Questions


Question 1

Where are Row-Level Security roles and filters created?

A. In the Power BI Service
B. In Power BI Desktop
C. In Microsoft Entra ID
D. In Power BI Apps

Correct Answer: B

Explanation:
RLS roles and DAX filters are created in Power BI Desktop. Users and groups are assigned to those roles later in the Power BI Service.


Question 2

Which DAX function is most commonly used to implement dynamic RLS?

A. USERELATIONSHIP()
B. USERNAME()
C. USERPRINCIPALNAME()
D. SELECTEDVALUE()

Correct Answer: C

Explanation:
USERPRINCIPALNAME() returns the logged-in user’s email/UPN and is the most commonly used function for dynamic RLS scenarios.


Question 3

A single semantic model must filter sales data so that users only see rows matching their email address. What is the best approach?

A. Create one role per user
B. Create static RLS roles by region
C. Use dynamic RLS with a user-mapping table
D. Use Object-Level Security

Correct Answer: C

Explanation:
Dynamic RLS with a user-to-dimension mapping table scales efficiently and avoids creating many static roles.
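
As a hedged sketch of the mapping-table pattern, assume a hypothetical UserRegionMapping table with UserEmail and Region columns, related to the Region dimension that filters the fact table. The role filter is defined once, on the mapping table:

  'UserRegionMapping'[UserEmail] = USERPRINCIPALNAME()

For the filter to reach the fact table, the relationship between the dimension and the mapping table typically needs the "Apply security filter in both directions" option. Each user then sees only the regions mapped to their email, and a single role covers all users.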


Question 4

What happens if a user belongs to multiple RLS roles?

A. Access is denied
B. Only the most restrictive role is applied
C. The union of all role filters is applied
D. The first role alphabetically is applied

Correct Answer: C

Explanation:
Power BI applies the union of RLS role filters, meaning users see data allowed by any role they belong to.


Question 5

Which statement about Row-Level Security behavior is correct?

A. RLS is applied at the report level
B. RLS applies only to dashboards
C. RLS is enforced at the semantic model level
D. RLS must be reconfigured for each report

Correct Answer: C

Explanation:
RLS is enforced at the semantic model level and automatically applies to all reports and apps using that model.


Question 6

You test RLS using View as role in Power BI Desktop. What does this feature do?

A. Permanently applies RLS to the model
B. Bypasses RLS for the model author
C. Simulates how the report appears for a role
D. Assigns users to roles automatically

Correct Answer: C

Explanation:
View as allows you to simulate role behavior to validate RLS logic before publishing.


Question 7

Which type of RLS is least scalable in enterprise environments?

A. Dynamic RLS
B. RLS using USERPRINCIPALNAME()
C. Static RLS with hard-coded values
D. Group-based RLS

Correct Answer: C

Explanation:
Static RLS requires separate roles for each data segment, making it difficult to maintain at scale.


Question 8

A user accesses a report through a Power BI App. How does RLS behave?

A. RLS is ignored
B. RLS must be redefined in the app
C. RLS is enforced automatically
D. Only static RLS is enforced

Correct Answer: C

Explanation:
RLS is always enforced at the semantic model level, including when content is accessed through apps.


Question 9

Which security feature should be used if you need to hide entire columns or tables from certain users?

A. Row-Level Security
B. Workspace roles
C. Object-Level Security
D. Build permission

Correct Answer: C

Explanation:
RLS controls rows only. Object-Level Security (OLS) is used to hide tables or columns.


Question 10

Which best practice is recommended when assigning users to RLS roles?

A. Assign individual users directly
B. Assign workspace Admins only
C. Assign Microsoft Entra ID security groups
D. Assign report-level permissions

Correct Answer: C

Explanation:
Using security groups improves scalability, governance, and ease of maintenance.


Final PL-300 Exam Reminders

  • RLS controls data visibility, not report access
  • Dynamic RLS is heavily tested
  • RLS applies everywhere the semantic model is used
  • Users see the union of multiple roles
  • RLS is defined in Desktop, enforced in the Service

Go back to the PL-300 Exam Prep Hub main page

Create Views, Functions, and Stored Procedures

This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections: 
Prepare data
--> Transform data
--> Create views, functions, and stored procedures

Creating views, functions, and stored procedures is a core data transformation and modeling skill for analytics engineers working in Microsoft Fabric. These objects help abstract complexity, improve reusability, enforce business logic, and optimize downstream analytics and reporting.

This section of the DP-600 exam focuses on when, where, and how to use these objects effectively across Fabric components such as Lakehouses, Warehouses, and SQL analytics endpoints.

Views

What are Views?

A view is a virtual table defined by a SQL query. It does not store data itself but presents data dynamically from underlying tables.

Where Views Are Used in Fabric

  • Fabric Data Warehouse
  • Lakehouse SQL analytics endpoint
  • Exposed to Power BI semantic models and other consumers

Common Use Cases

  • Simplify complex joins and transformations
  • Present curated, analytics-ready datasets
  • Enforce column-level or row-level filtering logic
  • Provide a stable schema over evolving raw data

Key Characteristics

  • Always reflect the latest data
  • Can be used like tables in SELECT statements
  • Improve maintainability and readability
  • Can support security patterns when combined with permissions

Exam Tip

Know that views are ideal for logical transformations, not heavy compute or data persistence.
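
For context, a minimal sketch of a curated view in a Fabric Warehouse or SQL analytics endpoint might look like the following T-SQL (the schema, table, and column names are hypothetical):

  -- Analytics-ready view over hypothetical fact and dimension tables
  CREATE VIEW dbo.vw_SalesByRegion
  AS
  SELECT
      r.RegionName,
      SUM(s.SalesAmount) AS TotalSales
  FROM dbo.FactSales AS s
  JOIN dbo.DimRegion AS r
      ON s.RegionKey = r.RegionKey
  GROUP BY r.RegionName;

The view persists no data; every query against vw_SalesByRegion reflects the current contents of FactSales and DimRegion.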

Functions

What are Functions?

Functions encapsulate reusable logic and return a value or a table. They help standardize calculations and transformations across queries.

Types of Functions (SQL)

  • Scalar functions: Return a single value (e.g., formatted date, calculated metric)
  • Table-valued functions (TVFs): Return a result set that behaves like a table

Where Functions Are Used in Fabric

  • Fabric Warehouses
  • SQL analytics endpoints for Lakehouses

Common Use Cases

  • Standardized business calculations
  • Reusable transformation logic
  • Parameterized filtering or calculations
  • Cleaner and more modular SQL code

Key Characteristics

  • Improve consistency across queries
  • Can be referenced in views and stored procedures
  • May impact performance if overused in large queries

Exam Tip

Functions promote reuse and consistency, but should be used thoughtfully to avoid performance overhead.
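
As a sketch of a table-valued function (the names are hypothetical), a parameterized filter can be packaged once and reused across queries and views:

  -- Inline table-valued function: reusable, parameterized filtering logic
  CREATE FUNCTION dbo.fn_SalesSince (@StartDate DATE)
  RETURNS TABLE
  AS
  RETURN
  (
      SELECT SalesOrderID, OrderDate, SalesAmount
      FROM dbo.FactSales
      WHERE OrderDate >= @StartDate
  );

  -- Used like a table in the FROM clause
  SELECT * FROM dbo.fn_SalesSince('2024-01-01');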

Stored Procedures

What are Stored Procedures?

Stored procedures are precompiled SQL code blocks that can accept parameters and perform multiple operations.

Where Stored Procedures Are Used in Fabric

  • Fabric Data Warehouses
  • SQL endpoints that support procedural logic

Common Use Cases

  • Complex transformation workflows
  • Batch processing logic
  • Conditional logic and control-of-flow (IF/ELSE, loops)
  • Data loading, validation, and orchestration steps

Key Characteristics

  • Can perform multiple SQL statements
  • Can accept input and output parameters
  • Improve performance by reducing repeated compilation
  • Support automation and operational workflows

Exam Tip

Stored procedures are best for procedural logic and orchestration, not ad-hoc analytics queries.
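
As a minimal sketch (the staging and fact table names are hypothetical assumptions), a parameter-driven load procedure with simple control-of-flow could look like this:

  -- Idempotent daily load: remove any existing rows for the date, then insert
  CREATE PROCEDURE dbo.sp_LoadDailySales
      @LoadDate DATE
  AS
  BEGIN
      IF EXISTS (SELECT 1 FROM dbo.FactSales WHERE OrderDate = @LoadDate)
          DELETE FROM dbo.FactSales WHERE OrderDate = @LoadDate;

      INSERT INTO dbo.FactSales (SalesOrderID, OrderDate, SalesAmount)
      SELECT SalesOrderID, OrderDate, SalesAmount
      FROM dbo.stg_DailySales
      WHERE OrderDate = @LoadDate;
  END;

A pipeline or notebook can then call the procedure for each load date, keeping the orchestration logic out of ad-hoc queries.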

Choosing Between Views, Functions, and Stored Procedures

  • Views → Simplifying data access and shaping datasets
  • Functions → Reusable calculations and logic
  • Stored Procedures → Complex, parameter-driven workflows

Understanding why you would choose one over another is frequently tested on the DP-600 exam.

Integration with Power BI and Analytics

  • Views are commonly consumed by Power BI semantic models
  • Functions help ensure consistent calculations across reports
  • Stored procedures are typically part of data preparation or orchestration, not directly consumed by reports

Governance and Best Practices

  • Use clear naming conventions (e.g., vw_, fn_, sp_)
  • Document business logic embedded in SQL objects
  • Minimize logic duplication across objects
  • Apply permissions carefully to control access
  • Balance reusability with performance considerations

What to Know for the DP-600 Exam

You should be comfortable with:

  • When to use views vs. functions vs. stored procedures
  • How these objects support data transformation
  • Their role in analytics-ready data preparation
  • How they integrate with Lakehouses, Warehouses, and Power BI
  • Performance and governance implications

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect) — not just which one
  • Look for keywords in exam questions and understand the usage scenario they point to
  • Expect scenario-based questions rather than direct definitions

1. What is the primary purpose of creating a view in a Fabric lakehouse or warehouse?

A. To permanently store transformed data
B. To execute procedural logic with parameters
C. To provide a virtual, query-based representation of data
D. To orchestrate batch data loads

Correct Answer: C

Explanation:
A view is a virtual table defined by a SQL query. It does not store data but dynamically presents data from underlying tables, making it ideal for simplifying access and shaping analytics-ready datasets.

2. Which Fabric component commonly exposes views directly to Power BI semantic models?

A. Eventhouse
B. SQL analytics endpoint
C. Dataflow Gen2
D. Real-Time hub

Correct Answer: B

Explanation:
The SQL analytics endpoint (for lakehouses and warehouses) exposes tables and views that Power BI semantic models can consume using SQL-based connectivity.

3. When should you use a scalar function instead of a view?

A. When you need to return a dataset with multiple rows
B. When you need to encapsulate reusable calculation logic
C. When you need to perform batch updates
D. When you want to persist transformed data

Correct Answer: B

Explanation:
Scalar functions are designed to return a single value and are ideal for reusable calculations such as formatting, conditional logic, or standardized metrics.

4. Which object type can return a result set that behaves like a table?

A. Scalar function
B. Stored procedure
C. Table-valued function
D. View index

Correct Answer: C

Explanation:
A table-valued function (TVF) returns a table and can be used in FROM clauses, similar to a view but with parameterization support.

5. Which scenario is the best use case for a stored procedure?

A. Creating a simplified reporting dataset
B. Applying row-level filters for security
C. Running conditional logic with multiple SQL steps
D. Exposing data to Power BI reports

Correct Answer: C

Explanation:
Stored procedures are best suited for procedural logic, including conditional branching, looping, and executing multiple SQL statements as part of a workflow.

6. Why are views commonly preferred over duplicating transformation logic in reports?

A. Views improve report rendering speed automatically
B. Views centralize and standardize transformation logic
C. Views permanently store transformed data
D. Views replace semantic models

Correct Answer: B

Explanation:
Views allow transformation logic to be defined once and reused consistently across multiple reports and consumers, improving maintainability and governance.

7. What is a potential downside of overusing functions in large SQL queries?

A. Increased storage costs
B. Reduced data freshness
C. Potential performance degradation
D. Loss of security enforcement

Correct Answer: C

Explanation:
Functions, especially scalar functions, can negatively impact query performance when used extensively on large datasets due to repeated execution per row.

8. Which object is most appropriate for parameter-driven data preparation steps in a warehouse?

A. View
B. Scalar function
C. Table
D. Stored procedure

Correct Answer: D

Explanation:
Stored procedures support parameters, control-of-flow logic, and multiple statements, making them ideal for complex, repeatable data preparation tasks.

9. How do views support governance and security in Microsoft Fabric?

A. By encrypting data at rest
B. By defining workspace-level permissions
C. By exposing only selected columns or filtered rows
D. By controlling OneLake storage access

Correct Answer: C

Explanation:
Views can limit the columns and rows exposed to users, helping implement logical data access patterns when combined with permissions and security models.

10. Which statement best describes how these objects fit into Fabric’s analytics lifecycle?

A. They replace Power BI semantic models
B. They are primarily used for real-time streaming
C. They prepare and standardize data for downstream analytics
D. They manage infrastructure-level security

Correct Answer: C

Explanation:
Views, functions, and stored procedures play a key role in transforming, standardizing, and preparing data for consumption by semantic models, reports, and analytics tools.

Implement OneLake Integration for Eventhouse and Semantic Models

This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections: 
Prepare data
--> Get data
--> Implement OneLake Integration for Eventhouse and Semantic Models

Microsoft Fabric is designed around the principle of OneLake as a single, unified data foundation. For the DP-600 exam, the topic “Implement OneLake integration for Eventhouse and semantic models” focuses on how both streaming data and analytical models can integrate with OneLake to enable reuse, governance, and multi-workload analytics.

This topic frequently appears in architecture and scenario-based questions, not as a pure feature checklist.

Why OneLake Integration Is Important

OneLake integration enables:

  • A single copy of data to support multiple analytics workloads
  • Reduced data duplication and ingestion complexity
  • Consistent governance and security
  • Seamless movement between real-time, batch, and BI analytics

For the exam, this is about understanding how data flows across Fabric experiences, not just where it lives.

OneLake Integration for Eventhouse

Eventhouse Recap

An Eventhouse is optimized for:

  • Real-time and near-real-time analytics
  • Streaming and telemetry data
  • High-ingestion rates
  • Querying with KQL (Kusto Query Language)

By default, Eventhouse is focused on real-time querying—but many solutions require more.

How Eventhouse Integrates with OneLake

When OneLake integration is implemented for an Eventhouse:

  • Streaming data ingested into the Eventhouse is persisted in OneLake
  • The same data becomes available for:
    • Lakehouses (Spark / SQL)
    • Warehouses (T-SQL reporting)
    • Notebooks
    • Semantic models
  • Real-time and historical analytics can coexist

This allows streaming data to participate in downstream analytics without re-ingestion.

Exam Signals for Eventhouse + OneLake

Look for phrases like:

  • Persist streaming data
  • Reuse event data
  • Combine real-time and batch analytics
  • Avoid duplicate ingestion pipelines

These strongly indicate OneLake integration for Eventhouse.

OneLake Integration for Semantic Models

Semantic Models Recap

A semantic model (Power BI dataset) defines:

  • Business-friendly tables and relationships
  • Measures and calculations (DAX)
  • Security rules (RLS, OLS)
  • A curated layer for reporting and analysis

Semantic models do not store raw data themselves—they rely on underlying data sources.

How Semantic Models Integrate with OneLake

Semantic models integrate with OneLake when their data source is:

  • A Lakehouse
  • A Warehouse
  • Eventhouse data persisted to OneLake

In these cases:

  • Data physically resides in OneLake
  • The semantic model acts as a logical abstraction
  • Multiple reports can reuse the same curated model

This supports the Fabric design pattern of shared semantic models over shared data.

Import vs DirectQuery (Exam-Relevant)

Semantic models can connect to OneLake-backed data using:

  • Import mode – best performance, scheduled refresh
  • DirectQuery – near-real-time access, source-dependent performance

DP-600 often tests your ability to choose the appropriate mode based on:

  • Data freshness requirements
  • Dataset size
  • Performance expectations

Eventhouse + OneLake + Semantic Models (End-to-End View)

A common DP-600 architecture looks like this:

  1. Streaming data is ingested into an Eventhouse
  2. Event data is persisted to OneLake
  3. Data is accessed by:
    • Lakehouse (for transformations)
    • Warehouse (for BI-friendly schemas)
  4. A shared semantic model is built on top
  5. Multiple Power BI reports reuse the model

This architecture supports real-time insights and historical analysis from the same data.

Governance and Security Benefits

OneLake integration ensures:

  • Centralized security and permissions
  • Sensitivity labels applied consistently
  • Reduced risk of shadow datasets
  • Clear lineage across streaming, batch, and BI layers

Exam questions often frame this as a governance or compliance requirement.

Common Exam Scenarios

You may be asked to:

  • Enable downstream analytics from streaming data
  • Avoid duplicating event ingestion
  • Support real-time dashboards and historical reports
  • Reuse a semantic model across teams
  • Align streaming analytics with enterprise BI

Always identify:

  • Where the data is persisted
  • Who needs access
  • How fresh the data must be
  • Which query language is required

Best Practices (DP-600 Focus)

  • Use Eventhouse for real-time ingestion and KQL analytics
  • Enable OneLake integration for reuse and persistence
  • Build shared semantic models on OneLake-backed data
  • Avoid multiple ingestion paths for the same data
  • Let OneLake act as the single source of truth

Key Takeaway
For the DP-600 exam, implementing OneLake integration for Eventhouse and semantic models is about enabling streaming data to flow seamlessly into governed, reusable analytical solutions. Eventhouse delivers real-time insights, OneLake provides a unified storage layer, and semantic models expose trusted, business-ready analytics—all without unnecessary duplication.

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect) — not just which one
  • Look for keywords in exam questions and understand the usage scenario they point to
  • Expect scenario-based questions rather than direct definitions

And also keep in mind …

  • When you see streaming data + reuse + BI or ML, think:
    Eventhouse → OneLake → Lakehouse/Warehouse → Semantic model

1. What is the primary benefit of integrating an Eventhouse with OneLake?

A. Faster Power BI rendering
B. Ability to query event data using DAX
C. Persistence and reuse of streaming data across Fabric workloads
D. Elimination of real-time ingestion

Correct Answer: C

Explanation:
OneLake integration allows streaming data ingested into an Eventhouse to be persisted and reused by Lakehouses, Warehouses, notebooks, and semantic models—without re-ingestion.

2. Which query language is used for real-time analytics directly in an Eventhouse?

A. T-SQL
B. Spark SQL
C. DAX
D. KQL

Correct Answer: D

Explanation:
Eventhouses are built on KQL (Kusto Query Language), which is optimized for querying streaming and time-series data.

3. A team wants to combine real-time event data with historical batch data in Power BI. What is the BEST approach?

A. Build separate semantic models for each data source
B. Persist event data to OneLake and build a semantic model on top
C. Use DirectQuery to the Eventhouse only
D. Export event data to Excel

Correct Answer: B

Explanation:
Persisting event data to OneLake allows it to be combined with historical data and exposed through a single semantic model.

4. How do semantic models integrate with OneLake in Microsoft Fabric?

A. Semantic models store data directly in OneLake
B. Semantic models replace OneLake storage
C. Semantic models reference OneLake-backed sources such as Lakehouses and Warehouses
D. Semantic models only support streaming data

Correct Answer: C

Explanation:
Semantic models do not store raw data; they reference OneLake-backed sources like Lakehouses, Warehouses, or persisted Eventhouse data.

5. Which scenario MOST strongly indicates the need for OneLake integration for Eventhouse?

A. Ad hoc SQL reporting on static data
B. Monthly batch ETL processing
C. Reusing streaming data for BI, ML, and historical analysis
D. Creating a single real-time dashboard

Correct Answer: C

Explanation:
OneLake integration is most valuable when streaming data must be reused across multiple analytics workloads beyond real-time querying.

6. Which storage principle best describes the benefit of OneLake integration?

A. Multiple copies for better performance
B. One copy of data, many analytics experiences
C. Schema-on-read only
D. Real-time only storage

Correct Answer: B

Explanation:
Microsoft Fabric promotes the principle of storing one copy of data in OneLake and enabling multiple analytics experiences on top of it.

7. Which connectivity mode should be chosen for a semantic model when near-real-time access to event data is required?

A. Import
B. Cached mode
C. DirectQuery
D. Snapshot mode

Correct Answer: C

Explanation:
DirectQuery enables near-real-time access to the underlying data, making it suitable when freshness is critical.

8. What governance advantage does OneLake integration provide?

A. Automatic deletion of sensitive data
B. Centralized security and sensitivity labeling
C. Removal of workspace permissions
D. Unlimited data access

Correct Answer: B

Explanation:
OneLake integration supports centralized governance, including consistent permissions and sensitivity labels across streaming and batch data.

9. Which end-to-end architecture BEST supports both real-time dashboards and historical reporting?

A. Eventhouse only
B. Lakehouse only
C. Eventhouse with OneLake integration and a shared semantic model
D. Warehouse without ingestion

Correct Answer: C

Explanation:
This architecture enables real-time ingestion via Eventhouse, persistence in OneLake, and curated reporting through a shared semantic model.

10. On the DP-600 exam, which phrase is MOST likely to indicate the need for OneLake integration for Eventhouse?

A. “SQL-only reporting solution”
B. “Single-user analysis”
C. “Avoid duplicating streaming ingestion pipelines”
D. “Static reference data”

Correct Answer: C

Explanation:
Avoiding duplication and enabling reuse of streaming data across analytics workloads is a key signal for OneLake integration.

Choose Between a Lakehouse, Warehouse, or Eventhouse

This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections: 
Prepare data
--> Get data
--> Choose Between a Lakehouse, Warehouse, or Eventhouse

One of the most important architectural decisions a Microsoft Fabric Analytics Engineer must make is selecting the right analytical store for a given workload. For the DP-600 exam, this topic tests your ability to choose between a Lakehouse, Warehouse, or Eventhouse based on data type, query patterns, latency requirements, and user personas.

Overview of the Three Options

Microsoft Fabric provides three primary analytics storage and query experiences:

  • Lakehouse → Flexible analytics on files and tables using Spark and SQL
  • Warehouse → Enterprise-grade SQL analytics and BI reporting
  • Eventhouse → Real-time and near-real-time analytics on streaming data

Understanding why and when to use each is critical for DP-600 success.

Lakehouse

What Is a Lakehouse?

A Lakehouse combines the flexibility of a data lake with the structure of a data warehouse. Data is stored in Delta Lake format in OneLake and can be accessed using both Spark and SQL.

When to Choose a Lakehouse

Choose a Lakehouse when you need:

  • Flexible schema (schema-on-read or schema-on-write)
  • Support for data engineering and data science
  • Access to raw, curated, and enriched data
  • Spark-based transformations and notebooks
  • Mixed workloads (batch analytics, exploration, ML)

Key Characteristics

  • Supports files and tables
  • Uses Spark SQL and T-SQL endpoints
  • Ideal for ELT and advanced transformations
  • Easy integration with notebooks and pipelines

Exam signal words: flexible, raw data, Spark, data science, experimentation

Warehouse

What Is a Warehouse?

A Warehouse is a fully managed, SQL-first analytical store optimized for business intelligence and reporting. It enforces schema-on-write and provides a traditional relational experience.

When to Choose a Warehouse

Choose a Warehouse when you need:

  • Strong SQL-based analytics
  • High-performance reporting
  • Well-defined schemas and governance
  • Centralized enterprise BI
  • Compatibility with Power BI Import or DirectQuery

Key Characteristics

  • T-SQL only (no Spark)
  • Optimized for structured data
  • Best for star/snowflake schemas
  • Familiar experience for SQL developers

Exam signal words: enterprise BI, reporting, structured, governed, SQL-first

Eventhouse

What Is an Eventhouse?

An Eventhouse is optimized for real-time and streaming analytics, built on KQL (Kusto Query Language). It is designed to handle high-velocity event data.

When to Choose an Eventhouse

Choose an Eventhouse when you need:

  • Near-real-time or real-time analytics
  • Streaming data ingestion
  • Operational or telemetry analytics
  • Event-based dashboards and alerts

Key Characteristics

  • Uses KQL for querying
  • Integrates with Eventstreams
  • Handles massive ingestion rates
  • Optimized for time-series data

Exam signal words: streaming, telemetry, IoT, real-time, events

Choosing the Right Option (Exam-Critical)

The DP-600 exam often presents scenarios where multiple options could work, but only one best fits the requirements.

Decision Matrix

  • Raw + curated data → Lakehouse
  • Complex Spark transformations → Lakehouse
  • Enterprise BI reporting → Warehouse
  • Strong governance and schemas → Warehouse
  • Streaming or telemetry data → Eventhouse
  • Near-real-time dashboards → Eventhouse
  • SQL-only users → Warehouse
  • Data science workloads → Lakehouse

Common Exam Scenarios

You may be asked to:

  • Choose a storage type for a new analytics solution
  • Migrate from traditional systems to Fabric
  • Support both engineers and analysts
  • Enable real-time monitoring
  • Balance governance with flexibility

Always identify:

  1. Data type (batch vs streaming)
  2. Latency requirements
  3. User personas
  4. Query language
  5. Governance needs

Best Practices to Remember

  • Use Lakehouse as a flexible foundation for analytics
  • Use Warehouse for polished, governed BI solutions
  • Use Eventhouse for real-time operational insights
  • Avoid forcing one option to handle all workloads
  • Let business requirements—not familiarity—drive the choice

Key Takeaway
For the DP-600 exam, choosing between a Lakehouse, Warehouse, or Eventhouse is about aligning data characteristics and access patterns with the right Fabric experience. Lakehouses provide flexibility, Warehouses deliver enterprise BI performance, and Eventhouses enable real-time analytics. The correct answer is almost always the one that best fits the scenario constraints.

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect) — not just which one
  • Look for keywords in exam questions and map them to the most likely component:
    • Spark, raw, experimentation → Lakehouse
    • Enterprise BI, governed, SQL reporting → Warehouse
    • Streaming, telemetry, real-time → Eventhouse
  • Expect scenario-based questions rather than direct definitions

1. Which Microsoft Fabric component is BEST suited for flexible analytics on both files and tables using Spark and SQL?

A. Warehouse
B. Eventhouse
C. Lakehouse
D. Semantic model

Correct Answer: C

Explanation:
A Lakehouse stores data in Delta format in OneLake and supports both Spark and SQL, making it ideal for flexible analytics across files and tables.

2. A team of data scientists needs to experiment with raw and curated data using notebooks. Which option should they choose?

A. Warehouse
B. Eventhouse
C. Semantic model
D. Lakehouse

Correct Answer: D

Explanation:
Lakehouses are designed for data engineering and data science workloads, offering Spark-based notebooks and flexible schema handling.

3. Which option is MOST appropriate for enterprise BI reporting with well-defined schemas and strong governance?

A. Lakehouse
B. Warehouse
C. Eventhouse
D. OneLake

Correct Answer: B

Explanation:
Warehouses are SQL-first, schema-on-write systems optimized for structured data, governance, and high-performance BI reporting.

4. A solution must support near-real-time analytics on streaming IoT telemetry data. Which Fabric component should be used?

A. Lakehouse
B. Warehouse
C. Eventhouse
D. Dataflow Gen2

Correct Answer: C

Explanation:
Eventhouses are optimized for high-velocity streaming data and real-time analytics using KQL.

5. Which query language is primarily used to analyze data in an Eventhouse?

A. T-SQL
B. Spark SQL
C. DAX
D. KQL

Correct Answer: D

Explanation:
Eventhouses are built on KQL (Kusto Query Language), which is optimized for querying event and time-series data.

6. A business analytics team requires fast dashboard performance and is familiar only with SQL. Which option best meets this requirement?

A. Lakehouse
B. Warehouse
C. Eventhouse
D. Spark notebook

Correct Answer: B

Explanation:
Warehouses provide a traditional SQL experience optimized for BI dashboards and reporting performance.

7. Which characteristic BEST distinguishes a Lakehouse from a Warehouse?

A. Lakehouses support Power BI
B. Warehouses store data in OneLake
C. Lakehouses support Spark-based processing
D. Warehouses cannot be governed

Correct Answer: C

Explanation:
Lakehouses uniquely support Spark-based processing, enabling advanced transformations and data science workloads.

8. A solution must store structured batch data and unstructured files in the same analytical store. Which option should be selected?

A. Warehouse
B. Eventhouse
C. Semantic model
D. Lakehouse

Correct Answer: D

Explanation:
Lakehouses support both structured tables and unstructured or semi-structured files within the same environment.

9. Which scenario MOST strongly indicates the need for an Eventhouse?

A. Monthly financial reporting
B. Slowly changing dimension modeling
C. Real-time operational monitoring
D. Ad hoc SQL analysis

Correct Answer: C

Explanation:
Eventhouses are designed for real-time analytics on streaming data, making them ideal for operational monitoring scenarios.

10. When choosing between a Lakehouse, Warehouse, or Eventhouse on the DP-600 exam, which factor is MOST important?

A. Personal familiarity with the tool
B. The default Fabric option
C. Data characteristics and latency requirements
D. Workspace size

Correct Answer: C

Explanation:
DP-600 emphasizes selecting the correct component based on data type (batch vs streaming), latency needs, user personas, and governance—not personal preference.

Ingest or Access Data as Needed

This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections: 
Prepare data
--> Get data
--> Ingest or access data as needed

A core responsibility of a Microsoft Fabric Analytics Engineer is deciding how data should be brought into Fabric—or whether it should be brought in at all. For the DP-600 exam, this topic focuses on selecting the right ingestion or access pattern based on performance, freshness, cost, and governance requirements.

Ingest vs. Access: Key Concept

Before choosing a tool or method, understand the distinction:

  • Ingest data: Physically copy data into Fabric-managed storage (OneLake)
  • Access data: Query or reference data where it already lives, without copying

The exam frequently tests your ability to choose the most appropriate option—not just a working one.

Common Data Ingestion Methods in Microsoft Fabric

1. Dataflows Gen2

Best for:

  • Low-code ingestion and transformation
  • Reusable ingestion logic
  • Business-friendly data preparation

Key characteristics:

  • Uses Power Query Online
  • Supports scheduled refresh
  • Stores results in OneLake (Lakehouse or Warehouse)
  • Ideal for centralized, governed ingestion

Exam tip:
Use Dataflows Gen2 when reuse, transformation, and governance are priorities.

2. Data Pipelines (Copy Activity)

Best for:

  • High-volume or frequent ingestion
  • Orchestration across multiple sources
  • ELT-style workflows

Key characteristics:

  • Supports many source and sink types
  • Enables scheduling, dependencies, and retries
  • Minimal transformation (primarily copy)

Exam tip:
Choose pipelines when performance and orchestration matter more than transformation.

3. Notebooks (Spark)

Best for:

  • Complex transformations
  • Data science or advanced engineering
  • Custom ingestion logic

Key characteristics:

  • Full control using Spark (PySpark, Scala, SQL)
  • Suitable for large-scale processing
  • Writes directly to OneLake

Exam tip:
Notebooks are powerful but require engineering skills—don’t choose them for simple ingestion scenarios.

Accessing Data Without Ingesting

1. OneLake Shortcuts

Best for:

  • Avoiding data duplication
  • Reusing data across workspaces
  • Accessing external storage

Key characteristics:

  • Logical reference only (no copy)
  • Supports ADLS Gen2 and Amazon S3
  • Appears native in Lakehouse tables or files

Exam tip:
Shortcuts are often the best answer when the question mentions avoiding duplication or reducing storage cost.

2. DirectQuery

Best for:

  • Near-real-time data access
  • Large datasets that cannot be imported
  • Centralized source-of-truth systems

Key characteristics:

  • Queries run against the source system
  • Performance depends on source
  • Limited modeling flexibility compared to Import

Exam tip:
Expect trade-off questions involving DirectQuery vs. Import.

3. Real-Time Access (Eventstreams / KQL)

Best for:

  • Streaming and telemetry data
  • Operational and real-time analytics

Key characteristics:

  • Event-driven ingestion
  • Supports near-real-time dashboards
  • Often discovered via Real-Time hub

Exam tip:
Use real-time ingestion when freshness is measured in seconds, not hours.

Choosing the Right Approach (Exam-Critical)

You should be able to decide based on these factors:

  • Reusable ingestion logic → Dataflows Gen2
  • High-volume copy → Data pipelines
  • Complex transformations → Notebooks
  • Avoid duplication → OneLake shortcuts
  • Near-real-time reporting → DirectQuery / Eventstreams
  • Governance and trust → Ingestion + endorsement

Governance and Security Considerations

  • Ingested data can inherit sensitivity labels
  • Access-based methods rely on source permissions
  • Workspace roles determine who can ingest or access data
  • Endorsed datasets should be preferred for reuse

DP-600 often frames ingestion questions within a governance context.

Common Exam Scenarios

You may be asked to:

  • Choose between ingesting data or accessing it directly
  • Identify when shortcuts are preferable to ingestion
  • Select the right tool for a specific ingestion pattern
  • Balance data freshness vs. performance
  • Reduce duplication across workspaces

Best Practices to Remember

  • Ingest when performance and modeling flexibility are required
  • Access when freshness, cost, or duplication is a concern
  • Centralize ingestion logic for reuse
  • Prefer Fabric-native patterns over external tools
  • Let business requirements drive architectural decisions

Key Takeaway
For the DP-600 exam, “Ingest or access data as needed” is about making intentional, informed choices. Microsoft Fabric provides multiple ways to bring data into analytics solutions, and the correct approach depends on scale, freshness, reuse, governance, and cost. Understanding why one method is better than another is far more important than memorizing features.

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect) — not just which one
  • Look for keywords in exam questions that signal the usage scenario (for example, low code/no code, large dataset, high-volume data, reuse, complex transformations)
  • Expect scenario-based questions rather than direct definitions

Also, keep in mind that …

  • DP-600 questions often include multiple valid options, but only one that best aligns with the scenario’s constraints. Always identify and consider factors such as:
    • Data volume
    • Freshness requirements
    • Reuse and duplication concerns
    • Transformation complexity

1. What is the primary difference between ingesting data and accessing data in Microsoft Fabric?

A. Ingested data cannot be secured
B. Accessed data is always slower
C. Ingesting copies data into OneLake, while accessing queries data in place
D. Accessed data requires a gateway

Correct Answer: C

Explanation:
Ingestion physically copies data into Fabric-managed storage (OneLake), while access-based approaches query or reference data where it already exists.

2. Which option is BEST when the goal is to avoid duplicating large datasets across multiple workspaces?

A. Import mode
B. Dataflows Gen2
C. OneLake shortcuts
D. Notebooks

Correct Answer: C

Explanation:
OneLake shortcuts allow data to be referenced without copying it, making them ideal for reuse and cost control.

3. A team needs reusable, low-code ingestion logic with scheduled refresh. Which Fabric feature should they use?

A. Spark notebooks
B. Data pipelines
C. Dataflows Gen2
D. DirectQuery

Correct Answer: C

Explanation:
Dataflows Gen2 provide Power Query–based ingestion with refresh scheduling and reuse across Fabric items.

4. Which ingestion method is MOST appropriate for complex transformations requiring custom logic?

A. Dataflows Gen2
B. Copy activity in pipelines
C. OneLake shortcuts
D. Spark notebooks

Correct Answer: D

Explanation:
Spark notebooks offer full control over transformation logic and are suited for complex, large-scale processing.

5. When should DirectQuery be preferred over Import mode?

A. When the dataset is small
B. When data freshness is critical
C. When transformations are complex
D. When performance must be maximized

Correct Answer: B

Explanation:
DirectQuery is preferred when near-real-time access to data is required, even though performance depends on the source system.

6. Which Fabric component is BEST suited for orchestrating high-volume data ingestion with dependencies and retries?

A. Dataflows Gen2
B. Data pipelines
C. Semantic models
D. Power BI Desktop

Correct Answer: B

Explanation:
Data pipelines are designed for orchestration, handling large volumes of data, scheduling, and dependency management.

7. A dataset is queried infrequently but must support advanced modeling features. Which approach is most appropriate?

A. DirectQuery
B. Access via shortcut
C. Import into OneLake
D. Eventstream ingestion

Correct Answer: C

Explanation:
Import mode supports full modeling capabilities and high query performance, making it suitable even for infrequently accessed data.

8. Which scenario best fits the use of real-time ingestion methods such as Eventstreams or KQL databases?

A. Monthly financial reporting
B. Static reference data
C. IoT telemetry and operational monitoring
D. Slowly changing dimensions

Correct Answer: C

Explanation:
Real-time ingestion is designed for continuous, event-driven data such as IoT telemetry and operational metrics.

9. Why might ingesting data be preferred over accessing it directly?

A. It always reduces storage costs
B. It eliminates the need for security
C. It improves performance and modeling flexibility
D. It avoids data refresh

Correct Answer: C

Explanation:
Ingesting data into OneLake enables faster query performance and full support for modeling features.

10. Which factor is MOST important when deciding between ingesting data and accessing it?

A. The color of the dashboard
B. The number of reports
C. Business requirements such as freshness, scale, and governance
D. The Fabric region

Correct Answer: C

Explanation:
The decision to ingest or access data should be driven by business needs, including performance, freshness, cost, and governance—not technical convenience alone.

Discover Data by Using OneLake Catalog and Real-Time Hub

This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections: 
Prepare data
--> Get data
--> Discover data by using OneLake catalog and Real-Time hub

Discovering existing data assets efficiently is a critical capability for a Microsoft Fabric Analytics Engineer. For the DP-600 exam, this topic emphasizes how to find, understand, and evaluate data sources using Fabric’s built-in discovery experiences: OneLake catalog and Real-Time hub.

Purpose of Data Discovery in Microsoft Fabric

In large Fabric environments, data already exists across:

  • Lakehouses
  • Warehouses
  • Semantic models
  • Streaming and event-based sources

The goal of data discovery is to:

  • Avoid duplicate ingestion
  • Promote reuse of trusted data
  • Understand data ownership, sensitivity, and freshness
  • Accelerate analytics development

OneLake Catalog

What Is the OneLake Catalog?

The OneLake catalog is a centralized metadata and discovery experience that allows users to browse and search data assets stored in OneLake, Fabric’s unified data lake.

It provides visibility into:

  • Lakehouses and Warehouses
  • Tables, views, and files
  • Shortcuts to external data
  • Endorsement and sensitivity metadata

Key Capabilities of the OneLake Catalog

For the exam, you should understand that the OneLake catalog enables users to:

  • Search and filter data assets across workspaces
  • View schema details (columns, data types)
  • Identify endorsed (Certified or Promoted) assets
  • See sensitivity labels applied to data
  • Discover data ownership and location
  • Reuse existing data rather than re-ingesting it

This supports both governance and efficiency.
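
The catalog itself is a UI experience, but discovery can also be scripted. The sketch below is an illustrative call to the Fabric REST API to list items in a workspace; the workspace ID, access token, and response fields shown are assumptions, so verify them against the current API reference before relying on this.

  import requests

  # Placeholder values - a real call needs a valid workspace ID and an
  # Azure AD access token issued for the Fabric API
  WORKSPACE_ID = "<workspace-guid>"
  TOKEN = "<azure-ad-access-token>"

  resp = requests.get(
      f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}/items",
      headers={"Authorization": f"Bearer {TOKEN}"},
      timeout=30,
  )
  resp.raise_for_status()

  # Print the name and type of each item (Lakehouse, Warehouse, SemanticModel, ...)
  for item in resp.json().get("value", []):
      print(item.get("displayName"), "-", item.get("type"))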

Endorsement and Trust Signals

Within the OneLake catalog, users can quickly identify:

  • Certified items (approved and governed)
  • Promoted items (recommended but not formally certified)

These trust signals are important in exam scenarios that ask how to guide users toward reliable data sources.

Shortcuts and External Data

The catalog also exposes OneLake shortcuts, which allow data from:

  • Azure Data Lake Storage Gen2
  • Amazon S3
  • Other Fabric workspaces

to appear as native OneLake data without duplication. This is a key discovery mechanism tested in DP-600.

Real-Time Hub

What Is the Real-Time Hub?

The Real-Time hub is a discovery experience focused on streaming and event-driven data sources in Microsoft Fabric.

It centralizes access to:

  • Eventstreams
  • Azure Event Hubs
  • Azure IoT Hub
  • Azure Data Explorer (KQL databases)
  • Other real-time data producers

Key Capabilities of the Real-Time Hub

For exam purposes, understand that the Real-Time hub allows users to:

  • Discover available streaming data sources
  • Preview live event data
  • Subscribe to or reuse existing event streams
  • Understand data velocity and schema
  • Reduce duplication of real-time ingestion pipelines

This is especially important in architectures involving operational analytics or near real-time reporting.
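
As a hedged companion to the Real-Time hub experience, the sketch below queries a KQL database (for example, one used as an Eventstream destination) from Python to preview recent events. The cluster URI, database, and table names are placeholders, and the azure-kusto-data package is assumed to be installed.

  from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

  # Placeholder query URI and database - copy these from the KQL database item
  CLUSTER = "https://<your-kql-database-query-uri>"
  kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(CLUSTER)
  client = KustoClient(kcsb)

  # Preview the ten most recently ingested rows from a hypothetical telemetry table
  query = "Telemetry | top 10 by ingestion_time() desc"
  response = client.execute("<database-name>", query)
  for row in response.primary_results[0]:
      print(row.to_dict())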

OneLake Catalog vs. Real-Time Hub

Feature | OneLake Catalog | Real-Time Hub
Primary focus | Stored data | Streaming / event data
Data types | Tables, files, shortcuts | Events, streams, telemetry
Use case | Analytical and historical data | Real-time and operational analytics
Governance signals | Endorsement, sensitivity | Ownership, stream metadata

Understanding when to use each is a common exam theme.

Security and Governance Considerations

Data discovery respects Fabric security:

  • Users only see items they have permission to access
  • Sensitivity labels are visible in discovery views
  • Workspace roles control discovery depth

This ensures compliance while still promoting self-service analytics.

Exam-Relevant Scenarios

On the DP-600 exam, you may be asked to:

  • Identify how users can discover existing datasets before ingesting new data
  • Choose between OneLake catalog and Real-Time hub based on data type
  • Locate endorsed or certified data assets
  • Reduce duplication by reusing existing tables or streams
  • Enable self-service discovery while maintaining governance

Best Practices (Aligned to DP-600)

  • Use OneLake catalog first before creating new data connections
  • Encourage use of endorsed and certified assets
  • Use Real-Time hub to discover existing event streams
  • Leverage shortcuts to reuse data without copying
  • Combine discovery with proper labeling and endorsement

Key Takeaway
For the DP-600 exam, discovering data in Microsoft Fabric is about visibility, trust, and reuse. The OneLake catalog helps users find and understand stored analytical data, while the Real-Time hub enables discovery of live streaming sources. Together, they reduce redundancy, improve governance, and accelerate analytics development.

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect), not just which one
  • Pay close attention to when to use OneLake catalog vs. Real-Time hub
  • Look for keywords in exam questions (for example, discover, reuse, streaming, endorsed, shortcut) and understand the scenario each one signals
  • Expect scenario-based questions that test architecture choices, rather than direct definitions

1. What is the primary purpose of the OneLake catalog in Microsoft Fabric?

A. To ingest streaming data
B. To schedule data refreshes
C. To discover and explore data stored in OneLake
D. To manage workspace permissions

Correct Answer: C

Explanation:
The OneLake catalog is a centralized discovery and metadata experience that helps users find, understand, and reuse data stored in OneLake across Fabric workspaces.

2. Which type of data is the Real-Time hub primarily designed to help users discover?

A. Historical data in Lakehouses
B. Structured warehouse tables
C. Streaming and event-driven data sources
D. Power BI semantic models

Correct Answer: C

Explanation:
The Real-Time hub focuses on streaming and event-based data such as Eventstreams, Azure Event Hubs, IoT Hub, and KQL databases.

3. A user wants to avoid re-ingesting data that already exists in another workspace. Which Fabric feature best supports this goal?

A. Data pipelines
B. OneLake shortcuts
C. Import mode
D. DirectQuery

Correct Answer: B

Explanation:
OneLake shortcuts allow data stored externally or in another workspace to appear as native OneLake data without physically copying it.

4. Which metadata element in the OneLake catalog helps users identify trusted and approved data assets?

A. Workspace name
B. File size
C. Endorsement status
D. Refresh schedule

Correct Answer: C

Explanation:
Endorsements (Promoted and Certified) act as trust signals, helping users quickly identify reliable and governed data assets.

5. Which statement about data visibility in the OneLake catalog is true?

A. All users can see all data across the tenant
B. Only workspace admins can see catalog entries
C. Users can only see items they have permission to access
D. Sensitivity labels hide data from discovery

Correct Answer: C

Explanation:
The OneLake catalog respects Fabric security boundaries—users only see data assets they are authorized to access.

6. A team is building a real-time dashboard and wants to see what streaming data already exists. Where should they look first?

A. OneLake catalog
B. Power BI Service
C. Dataflows Gen2
D. Real-Time hub

Correct Answer: D

Explanation:
The Real-Time hub centralizes discovery of streaming and event-based data sources, making it the best starting point for real-time analytics scenarios.

7. Which of the following items is most likely discovered through the Real-Time hub?

A. Parquet files in OneLake
B. Lakehouse Delta tables
C. Azure Event Hub streams
D. Warehouse SQL views

Correct Answer: C

Explanation:
Azure Event Hubs and other event-driven sources are exposed through the Real-Time hub, not the OneLake catalog.

8. What advantage does data discovery provide in large Fabric environments?

A. Faster Power BI rendering
B. Reduced licensing costs
C. Reduced data duplication and improved reuse
D. Automatic data modeling

Correct Answer: C

Explanation:
Discovering existing data assets helps teams reuse trusted data, reducing redundant ingestion and improving governance.

9. Which information is commonly visible when browsing an asset in the OneLake catalog?

A. User passwords
B. Column-level schema details
C. Tenant-wide permissions
D. Gateway configuration

Correct Answer: B

Explanation:
The OneLake catalog exposes metadata such as table schemas, column names, and data types to help users evaluate suitability before use.

10. Which scenario best demonstrates correct use of OneLake catalog and Real-Time hub together?

A. Using DirectQuery for all reports
B. Creating a new pipeline for every dataset
C. Discovering historical data in OneLake and live events in Real-Time hub
D. Applying sensitivity labels to dashboards

Correct Answer: C

Explanation:
OneLake catalog is optimized for discovering stored analytical data, while Real-Time hub is designed for discovering live streaming sources. Using both ensures comprehensive data discovery.