
Implement workspace-level access controls (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
   --> Configure security and governance
      --> Implement workspace-level access controls


Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Security and governance are foundational components of any enterprise analytics platform. In Microsoft Fabric, workspaces serve as the primary organizational boundary for managing content, collaboration, and permissions. Because workspaces often contain sensitive data assets such as Lakehouses, Warehouses, Data Pipelines, Notebooks, and Reports, controlling who can access and modify these assets is critical.

Workspace-level access controls provide the first layer of security within Fabric. They determine who can view, create, modify, share, and administer workspace content. Properly configured access controls help organizations implement the principle of least privilege, improve governance, reduce security risks, and ensure compliance with organizational policies.

For the DP-700 exam, you should understand workspace roles, permission inheritance, Microsoft Entra ID integration, security best practices, and common access-control scenarios.


Understanding Fabric Workspaces

A workspace is a collaborative environment used to organize and manage Fabric assets.

Examples of assets stored within a workspace include:

  • Lakehouses
  • Data Warehouses
  • Data Pipelines
  • Dataflows Gen2
  • Notebooks
  • Semantic Models
  • Reports
  • Eventstreams
  • Environments

Workspaces serve as the primary security boundary for these resources.


Why Workspace-Level Access Controls Matter

Without proper access controls:

  • Unauthorized users may access sensitive data.
  • Critical assets may be modified accidentally.
  • Governance requirements may not be met.
  • Production environments may be compromised.

Workspace-level security helps organizations:

  • Restrict access
  • Protect sensitive data
  • Separate responsibilities
  • Support auditing and compliance
  • Implement least-privilege security

Microsoft Entra ID Integration

Microsoft Fabric uses Microsoft Entra ID for authentication and identity management.

Users access Fabric using their organizational accounts.

Benefits include:

  • Centralized identity management
  • Single sign-on (SSO)
  • Multi-factor authentication support
  • Group-based security management
  • Conditional Access integration

Fabric does not maintain a separate user authentication system.


Workspace Roles

Workspace access is controlled through predefined roles.

The four primary workspace roles are:

Role | Purpose
Admin | Full workspace control
Member | Create, edit, and publish content
Contributor | Create and modify content
Viewer | Read-only access

Understanding these roles is extremely important for the DP-700 exam.


Admin Role

Admins have complete control over the workspace.

Capabilities include:

  • Manage workspace settings
  • Add or remove users
  • Assign roles
  • Delete workspace content
  • Configure Git integration
  • Configure deployment pipelines
  • Manage permissions

Admins effectively own the workspace.

Use Cases

  • Platform administrators
  • Workspace owners
  • Data engineering leads

Member Role

Members can actively participate in workspace development.

Capabilities include:

  • Create content
  • Modify content
  • Publish content
  • Collaborate with team members

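However, Members do not have all administrative capabilities. They can add users with the Member role or lower, but they cannot add or remove Admins, and they cannot update or delete the workspace itself.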

Use Cases

  • Senior developers
  • Data engineers
  • Analytics developers

Contributor Role

Contributors can create and modify content, but unlike Members they cannot add users to the workspace or manage its permissions.

Capabilities include:

  • Create notebooks
  • Create pipelines
  • Modify assets
  • Build solutions

Contributors generally focus on development activities rather than workspace administration.

Use Cases

  • Developers
  • Data engineers
  • ETL specialists

Viewer Role

Viewers have read-only access.

Capabilities include:

  • View reports
  • View data assets
  • Review content

Restrictions include:

  • Cannot modify content
  • Cannot create content
  • Cannot administer the workspace

Use Cases

  • Business users
  • Auditors
  • Stakeholders

Workspace Permission Assignment

Workspace roles can be assigned to:

  • Individual users
  • Security groups and Microsoft 365 groups (both managed in Microsoft Entra ID)
  • Service principals

Best practice is to assign permissions through groups whenever possible.

Example:

Finance-DataEngineers → Contributor
Finance-Developers → Member
Finance-Managers → Viewer

Benefits include:

  • Easier administration
  • Reduced maintenance
  • Improved consistency
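
The group-to-role example above can be sketched as a set of role-assignment request bodies. This is a minimal sketch, assuming a payload shape modeled on the Fabric REST API's workspace role-assignment endpoint (a principal with an ID and type, plus a role). The group object IDs are hypothetical placeholders, and no request is actually sent.

```python
import json

VALID_ROLES = {"Admin", "Member", "Contributor", "Viewer"}

# Hypothetical Microsoft Entra ID group object IDs (placeholders, not real IDs).
GROUP_IDS = {
    "Finance-DataEngineers": "00000000-0000-0000-0000-000000000001",
    "Finance-Developers": "00000000-0000-0000-0000-000000000002",
    "Finance-Managers": "00000000-0000-0000-0000-000000000003",
}

# Group-to-role mapping taken from the example above.
GROUP_ROLES = {
    "Finance-DataEngineers": "Contributor",
    "Finance-Developers": "Member",
    "Finance-Managers": "Viewer",
}


def build_role_assignment(group_name: str, role: str) -> dict:
    """Build one role-assignment body for a group (assumed payload shape)."""
    if role not in VALID_ROLES:
        raise ValueError(f"Unknown workspace role: {role}")
    return {
        "principal": {"id": GROUP_IDS[group_name], "type": "Group"},
        "role": role,
    }


assignments = [build_role_assignment(g, r) for g, r in GROUP_ROLES.items()]
print(json.dumps(assignments, indent=2))
```

Keeping the mapping in one place means a change in team membership is handled in Microsoft Entra ID, while the workspace assignments themselves stay untouched.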

Principle of Least Privilege

One of the most important security concepts for DP-700 is the Principle of Least Privilege.

This principle states:

Users should receive only the permissions necessary to perform their job functions.

Example:

User Type | Recommended Role
Report Consumer | Viewer
Data Engineer | Contributor
Team Lead | Member
Workspace Owner | Admin

Over-permissioning increases security risks.
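
The "lowest role that satisfies the requirement" rule can be expressed as a small lookup. This is a sketch under a stated assumption: the capability sets below are a simplified, illustrative model of the four roles, not the full Fabric permission matrix.

```python
# Roles ordered from least to most privileged.
ROLE_ORDER = ["Viewer", "Contributor", "Member", "Admin"]

# Simplified, illustrative capability sets (an assumption, not the full Fabric matrix).
CAPABILITIES = {
    "Viewer": {"view"},
    "Contributor": {"view", "create", "modify"},
    "Member": {"view", "create", "modify", "add_users"},
    "Admin": {"view", "create", "modify", "add_users", "manage_settings"},
}


def lowest_sufficient_role(required: set[str]) -> str:
    """Return the least-privileged role whose capabilities cover `required`."""
    for role in ROLE_ORDER:
        if required <= CAPABILITIES[role]:
            return role
    raise ValueError(f"No role grants: {sorted(required)}")


print(lowest_sufficient_role({"view"}))
print(lowest_sufficient_role({"create", "modify"}))
print(lowest_sufficient_role({"add_users"}))
```

Because the roles are checked from least to most privileged, the first match is always the smallest grant that meets the requirement.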


Permission Inheritance

Workspace-level permissions often provide access to items contained within the workspace.

Examples include:

  • Lakehouses
  • Warehouses
  • Notebooks
  • Dataflows

A user with workspace access generally gains access to supported content based on their assigned role.

However, some Fabric items support additional item-level permissions that can supplement workspace-level controls.

Exam Tip

Workspace permissions and item-level permissions are related but not identical.

Many exam questions test your understanding of this distinction.


Workspace Access and OneLake

OneLake security is closely tied to Fabric permissions.

When users access:

  • Lakehouses
  • Warehouse data
  • OneLake files

their permissions are generally governed through Fabric security controls.

This means workspace permissions play a significant role in determining data accessibility.


Separating Development, Test, and Production Access

Organizations commonly implement separate workspaces for:

  • Development
  • Test
  • Production

Different access controls are applied to each environment.

Example:

Environment | Typical Permissions
Development | Contributors and Members
Test | Limited Contributors
Production | Mostly Viewers

This reduces the risk of unauthorized production changes.
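
One way to make the per-environment rules checkable is to give each environment a role ceiling and flag anything above it. This is a minimal sketch: the ceilings are a hypothetical policy loosely based on the table above, and "Platform-Admins" is an invented group name for the exempt workspace owners.

```python
ROLE_RANK = {"Viewer": 0, "Contributor": 1, "Member": 2, "Admin": 3}

# Hypothetical policy: the highest role a non-exempt principal may hold per environment.
MAX_ROLE = {"Development": "Member", "Test": "Contributor", "Production": "Viewer"}


def find_violations(environment: str, assignments: dict[str, str],
                    exempt: frozenset[str] = frozenset()) -> list[str]:
    """List principals whose role exceeds the environment's ceiling."""
    ceiling = ROLE_RANK[MAX_ROLE[environment]]
    return sorted(
        name for name, role in assignments.items()
        if name not in exempt and ROLE_RANK[role] > ceiling
    )


prod = {
    "Finance-Managers": "Viewer",
    "Finance-DataEngineers": "Contributor",
    "Platform-Admins": "Admin",  # hypothetical owner group, exempted below
}
print(find_violations("Production", prod, exempt=frozenset({"Platform-Admins"})))
```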


Workspace Security Best Practices

Use Security Groups

Prefer:

Sales-DataEngineers

instead of assigning permissions to individual users.


Minimize Admins

Only a small number of users should have Admin privileges.


Separate Production Access

Production workspaces should have stricter permissions.


Review Permissions Regularly

Conduct periodic audits of workspace access.


Follow Least Privilege

Assign the lowest role necessary.


Use Dedicated Service Principals

Automated processes should use service principals rather than personal accounts.


Common Security Scenarios

Scenario 1

A business analyst needs to view reports but should not modify content.

Solution:

Assign the Viewer role.


Scenario 2

A data engineer needs to build pipelines and notebooks but should not manage workspace permissions.

Solution:

Assign the Contributor role.


Scenario 3

A workspace owner needs to manage users and configure workspace settings.

Solution:

Assign the Admin role.


Scenario 4

A team lead needs to create and manage content, collaborate with developers, and add new developers to the workspace.

Solution:

Assign the Member role.


Auditing and Governance

Workspace access controls support governance by enabling:

  • Access reviews
  • Compliance reporting
  • Security audits
  • Change tracking

Administrators should periodically verify:

  • User memberships
  • Group assignments
  • Admin privileges
  • Production access

These activities help maintain a secure Fabric environment.
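
A periodic access review can be sketched as a summary over the workspace's role assignments. The record layout (`principal`, `type`, `role`) and the `max_admins` threshold are assumptions made for illustration. In practice the assignments would come from whatever export or API the organization uses.

```python
from collections import Counter


def review_workspace(assignments: list[dict], max_admins: int = 2) -> dict:
    """Summarize role assignments and flag items worth a closer look."""
    role_counts = Counter(a["role"] for a in assignments)
    # Direct user assignments bypass group-based management, so list them.
    direct_users = [a["principal"] for a in assignments if a["type"] == "User"]
    return {
        "role_counts": dict(role_counts),
        "too_many_admins": role_counts["Admin"] > max_admins,
        "direct_user_assignments": direct_users,
    }


sample = [
    {"principal": "Finance-Managers", "type": "Group", "role": "Viewer"},
    {"principal": "Finance-DataEngineers", "type": "Group", "role": "Contributor"},
    {"principal": "jsmith@contoso.com", "type": "User", "role": "Admin"},
]
report = review_workspace(sample)
print(report)
```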


DP-700 Exam Focus Areas

You should understand:

✓ Workspace roles

✓ Admin, Member, Contributor, and Viewer permissions

✓ Microsoft Entra ID integration

✓ Security group assignments

✓ Least-privilege principles

✓ Workspace permission inheritance

✓ Item-level versus workspace-level security

✓ Production environment security

✓ Service principal usage

✓ Governance and auditing practices


Practice Exam Questions

Question 1

Which workspace role provides full control over workspace settings and permissions?

A. Admin

B. Member

C. Contributor

D. Viewer

Answer: A

Explanation

Admins have complete control over workspace management, including permissions, settings, and content administration.


Question 2

A user needs read-only access to reports and data assets in a workspace.

Which role should be assigned?

A. Admin

B. Member

C. Contributor

D. Viewer

Answer: D

Explanation

The Viewer role allows users to access and view content without modifying it.


Question 3

Which Microsoft service provides identity and authentication for Fabric users?

A. Azure Data Lake Storage

B. Microsoft Entra ID

C. OneLake

D. Fabric Capacity

Answer: B

Explanation

Microsoft Entra ID provides authentication, identity management, and access control for Fabric users.


Question 4

A data engineer needs to create notebooks and pipelines but should not manage workspace permissions.

Which role is most appropriate?

A. Viewer

B. Admin

C. Contributor

D. Workspace Owner

Answer: C

Explanation

Contributors can create and modify content without having full administrative privileges.


Question 5

What is the primary goal of the Principle of Least Privilege?

A. Maximize workspace access

B. Reduce storage costs

C. Improve Spark performance

D. Grant only the permissions required to perform a job

Answer: D

Explanation

Least privilege reduces security risks by ensuring users receive only the permissions necessary for their responsibilities.


Question 6

Which approach is generally recommended for assigning workspace permissions?

A. Assign permissions directly to every user

B. Use Microsoft Entra ID security groups

C. Give all users Member access

D. Assign Admin access broadly

Answer: B

Explanation

Group-based permission management simplifies administration and improves consistency.


Question 7

A team lead needs to create content, collaborate with developers, and add new team members to the workspace, but does not require full administrative control.

Which role is most appropriate?

A. Viewer

B. Contributor

C. Member

D. Admin

Answer: C

Explanation

Members can actively manage and collaborate on workspace content without having full administrative authority.


Question 8

Why should organizations limit the number of workspace Admins?

A. To reduce Spark resource consumption

B. To simplify notebook development

C. To improve deployment speed

D. To reduce security risk and administrative exposure

Answer: D

Explanation

Admin roles have extensive privileges and should be assigned only when necessary.


Question 9

A company wants automated deployment processes that are not dependent on employee accounts.

What should be used?

A. Viewer accounts

B. Personal accounts

C. Service principals

D. Shared passwords

Answer: C

Explanation

Service principals provide stable, secure identities for automation and deployment activities.


Question 10

What is the primary benefit of separating Development, Test, and Production workspaces?

A. Increased storage capacity

B. Improved security and change control

C. Reduced OneLake storage usage

D. Faster notebook execution

Answer: B

Explanation

Environment separation helps prevent accidental production changes and supports proper testing and governance.


Exam Tip

For the DP-700 exam, many security questions can be solved by understanding the differences between the four workspace roles:

Role | Key Capability
Admin | Full control and permissions management
Member | Create, manage, and collaborate on content
Contributor | Create and modify content
Viewer | Read-only access

When evaluating scenarios, choose the lowest role that satisfies the requirement. Microsoft frequently tests the Principle of Least Privilege, making it one of the most important security concepts to master for the exam.


Go to the DP-700 Exam Prep Hub main page.

Implement and use Microsoft Fabric audit logs (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
   --> Configure security and governance
      --> Implement and use Microsoft Fabric audit logs



Introduction

As organizations adopt Microsoft Fabric for enterprise analytics, data engineering, and business intelligence workloads, maintaining visibility into user activity becomes increasingly important. Administrators and governance teams need to answer questions such as:

  • Who accessed a specific report?
  • Who deleted a workspace item?
  • When was a dataset modified?
  • Which users shared sensitive information?
  • What actions were performed during a security incident investigation?

Microsoft Fabric Audit Logs provide a detailed record of user and administrative activities across the Fabric environment. These logs are essential for governance, security monitoring, compliance reporting, operational troubleshooting, and forensic investigations.

For the DP-700 exam, you should understand what audit logs are, how they work, what information they capture, where they can be accessed, and how they support security and governance requirements.


What Are Microsoft Fabric Audit Logs?

Audit logs are records of activities performed within Microsoft Fabric.

They capture information about:

  • User actions
  • Administrative actions
  • Security-related events
  • Content access
  • Item modifications
  • Sharing activities
  • Workspace operations

Audit logs provide a historical record that organizations can use for monitoring and investigation purposes.


Why Audit Logging Is Important

Audit logging helps organizations:

  • Monitor user activity
  • Detect suspicious behavior
  • Support compliance requirements
  • Investigate security incidents
  • Verify governance policies
  • Track administrative changes
  • Understand platform usage

Without audit logs, organizations have limited visibility into how Fabric resources are being used.


Types of Activities Captured

Microsoft Fabric audit logs can capture many types of events.

Examples include:

Workspace Activities

  • Workspace creation
  • Workspace deletion
  • Workspace updates
  • Membership changes

Item Activities

  • Report creation
  • Report deletion
  • Dataset creation
  • Semantic model updates
  • Lakehouse modifications
  • Warehouse modifications

Sharing Activities

  • Sharing reports
  • Sharing datasets
  • Permission changes
  • External sharing actions

Security Activities

  • Role assignments
  • Permission updates
  • Access changes
  • Governance actions

Administrative Activities

  • Tenant setting modifications
  • Capacity changes
  • Configuration updates

Audit Log Architecture

A simplified workflow looks like this:

User Action → Fabric Records Event → Audit Log Entry Created → Administrator Reviews Activity

Every significant operation can generate an audit event that becomes part of the organization’s audit trail.


Information Captured in Audit Logs

A typical audit log entry may contain:

Field | Description
Timestamp | When the action occurred
User | Who performed the action
Activity | What action occurred
Item Name | Object involved
Workspace | Location of activity
Operation Status | Success or failure
Additional Details | Context information

Example:

Timestamp: 2026-01-15 10:42 AM
User: jsmith@contoso.com
Activity: Deleted Report
Report: Executive Dashboard
Workspace: Finance
Status: Success
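
The entry above can be modeled as a record and queried to answer "who did what, and when". This is a sketch only: the field names mirror the simplified table above rather than the actual audit log schema, and the second user (`adoe@contoso.com`) is a hypothetical addition for contrast.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AuditEvent:
    """Simplified audit entry; fields follow the table above, not the real schema."""
    timestamp: datetime
    user: str
    activity: str
    item_name: str
    workspace: str
    status: str


events = [
    AuditEvent(datetime(2026, 1, 15, 9, 5), "adoe@contoso.com",
               "Viewed Report", "Executive Dashboard", "Finance", "Success"),
    AuditEvent(datetime(2026, 1, 15, 10, 42), "jsmith@contoso.com",
               "Deleted Report", "Executive Dashboard", "Finance", "Success"),
]


def who_did(events, activity, item_name):
    """Return (user, timestamp) pairs for successful matching events."""
    return [
        (e.user, e.timestamp)
        for e in events
        if e.activity == activity and e.item_name == item_name and e.status == "Success"
    ]


print(who_did(events, "Deleted Report", "Executive Dashboard"))
```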

Microsoft Fabric and Microsoft 365 Audit Logs

Fabric auditing is integrated into the broader Microsoft ecosystem.

Fabric audit events are recorded in the Microsoft 365 unified audit log, which administrators search through Microsoft Purview. This allows organizations to centralize monitoring and investigation activities.

This integration provides:

  • Unified auditing
  • Centralized investigation
  • Compliance support
  • Enterprise-wide visibility

Common Audit Log Use Cases

Security Investigations

A sensitive report is accidentally deleted.

Administrators can review audit logs to determine:

  • Who deleted the report
  • When the deletion occurred
  • Which workspace was affected

Compliance Audits

Regulators request evidence of access controls.

Audit logs provide historical records of:

  • User access
  • Permission changes
  • Administrative actions

Governance Reviews

An organization wants to understand how frequently critical assets are used.

Audit logs can reveal:

  • Access patterns
  • Sharing activities
  • Usage trends

Operational Troubleshooting

A workspace suddenly becomes unavailable.

Audit logs may identify:

  • Recent configuration changes
  • Role assignments
  • Administrative actions

Audit Logs vs Monitoring Metrics

This distinction is commonly tested.

Audit Logs | Monitoring Metrics
Who performed an action | Resource performance
Historical activity records | Capacity utilization
Security and governance focus | Performance focus
User behavior tracking | System behavior tracking

Example:

Audit Log:

User deleted dataset

Monitoring Metric:

CPU utilization reached 85%

Audit Logs vs Activity Monitoring

Although related, they serve different purposes.

Audit Logs

Focus on:

  • Security
  • Governance
  • Compliance
  • User activity

Monitoring Tools

Focus on:

  • Performance
  • Capacity utilization
  • Query execution
  • System health

Audit Logs and Compliance

Audit logging plays an important role in regulatory frameworks such as:

  • GDPR
  • HIPAA
  • SOX
  • PCI DSS
  • Internal governance standards

Organizations often require audit trails to demonstrate:

  • Accountability
  • Access monitoring
  • Change tracking
  • Security oversight

Key Security Benefits

Audit logs help organizations:

Detect Unauthorized Activity

Example:

Multiple unexpected permission changes

Investigate Security Incidents

Example:

Who accessed sensitive data?

Support Forensics

Example:

Timeline of events before a breach

Improve Accountability

Every action is associated with a user identity.


Common Audit Events for DP-700

Candidates should recognize events such as:

  • Create Workspace
  • Delete Workspace
  • Update Workspace
  • Create Report
  • Delete Report
  • Modify Dataset
  • Share Content
  • Change Permissions
  • Update Tenant Settings
  • Assign Roles

Audit Log Retention

Audit logs are kept only for a limited period, set by Microsoft's default retention policies and by any retention policies the organization configures.

Longer retention periods support:

  • Compliance investigations
  • Historical analysis
  • Security reviews

Retention capabilities may vary depending on licensing and organizational configuration.


Best Practices

Enable Auditing

Ensure audit logging is enabled and properly configured.


Review Logs Regularly

Perform periodic reviews for:

  • Security incidents
  • Governance violations
  • Unusual activity

Protect Audit Data

Audit logs themselves may contain sensitive information and should be protected appropriately.


Integrate with Security Processes

Use audit data alongside:

  • Security monitoring
  • Governance reviews
  • Compliance audits

Establish Alerting Procedures

Monitor for:

  • Unexpected permission changes
  • Mass deletions
  • Excessive sharing
  • Administrative changes
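
As an illustration of one such check, mass deletions can be detected with a sliding time window over exported audit events. This is a minimal sketch: the `(user, activity, timestamp)` tuple layout, the threshold, and the window length are all assumptions, and the sample users are placeholders.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def detect_mass_deletions(events, threshold=3, window=timedelta(minutes=10)):
    """Return users with at least `threshold` delete events inside any `window`."""
    by_user = defaultdict(list)
    for user, activity, ts in events:
        if activity.startswith("Delete"):
            by_user[user].append(ts)

    flagged = set()
    for user, stamps in by_user.items():
        stamps.sort()
        start = 0
        for end in range(len(stamps)):
            # Advance the left edge until the span fits inside `window`.
            while stamps[end] - stamps[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(user)
                break
    return sorted(flagged)


sample_events = [
    ("jsmith@contoso.com", "Delete Report", datetime(2026, 1, 15, 10, 0)),
    ("jsmith@contoso.com", "Delete Report", datetime(2026, 1, 15, 10, 2)),
    ("jsmith@contoso.com", "Delete Workspace", datetime(2026, 1, 15, 10, 5)),
    ("adoe@contoso.com", "Delete Report", datetime(2026, 1, 15, 9, 0)),
    ("adoe@contoso.com", "Delete Report", datetime(2026, 1, 15, 10, 0)),
    ("adoe@contoso.com", "Create Report", datetime(2026, 1, 15, 10, 1)),
]
flagged_users = detect_mass_deletions(sample_events)
print(flagged_users)
```

The same windowing approach extends to permission changes or sharing events by changing the activity filter.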

Retain Logs Appropriately

Align retention periods with:

  • Regulatory requirements
  • Organizational policies
  • Security needs

Common DP-700 Exam Scenarios

Scenario 1

A report is unexpectedly deleted.

Question:

How do you determine who deleted it?

Solution:

Review Microsoft Fabric audit logs.


Scenario 2

Management requests evidence showing who modified workspace permissions.

Solution:

Use audit logs to review permission-change events.


Scenario 3

A compliance auditor requests historical access records.

Solution:

Provide relevant audit log entries.


Scenario 4

An administrator wants to determine which users shared a sensitive dashboard.

Solution:

Review sharing-related audit events.


DP-700 Exam Focus Areas

You should understand:

✓ Purpose of audit logging

✓ Types of activities captured

✓ Security investigation scenarios

✓ Compliance use cases

✓ Governance monitoring

✓ Audit log contents

✓ Audit logs versus monitoring metrics

✓ Audit logs versus performance monitoring

✓ User activity tracking

✓ Administrative activity tracking

✓ Best practices for auditing


Practice Exam Questions

Question 1

What is the primary purpose of Microsoft Fabric audit logs?

A. To track user and administrative activities

B. To improve query performance

C. To optimize storage usage

D. To automate data ingestion

Answer: A

Explanation

Audit logs provide a historical record of user and administrative actions for governance, compliance, and security purposes.


Question 2

Which activity would most likely appear in a Fabric audit log?

A. CPU utilization reaching 90%

B. Network latency measurements

C. A user deleting a report

D. Spark memory allocation

Answer: C

Explanation

Audit logs capture user actions such as creating, modifying, sharing, and deleting Fabric items.


Question 3

A compliance auditor asks for evidence showing who changed workspace permissions last month.

Which feature should be used?

A. Audit logs

B. Capacity Metrics App

C. Query Insights

D. Spark Monitoring

Answer: A

Explanation

Audit logs record permission changes and can be used to identify who performed administrative actions.


Question 4

Which information is commonly included in an audit log entry?

A. CPU utilization percentage

B. Cluster memory consumption

C. Spark executor count

D. Timestamp, user, and activity performed

Answer: D

Explanation

Audit logs typically record who performed an action, when it occurred, and what operation was performed.


Question 5

A report was accidentally deleted. What is the best way to determine who deleted it?

A. Review workspace endorsements

B. Review sensitivity labels

C. Review audit logs

D. Review data lineage

Answer: C

Explanation

Audit logs provide detailed records of item deletion events and the users responsible for them.


Question 6

How do audit logs differ from monitoring metrics?

A. Audit logs track activities, while monitoring metrics track performance and resource usage.

B. Audit logs improve query performance.

C. Monitoring metrics identify user actions.

D. Monitoring metrics replace audit logs.

Answer: A

Explanation

Audit logs focus on user and administrative actions, whereas monitoring metrics focus on system and workload performance.


Question 7

Which scenario represents a common use of audit logs?

A. Scaling Spark clusters

B. Monitoring storage capacity growth

C. Determining who shared a sensitive report

D. Configuring deployment pipelines

Answer: C

Explanation

Audit logs capture sharing events and can be used to investigate who shared content.


Question 8

Which governance objective is best supported by audit logs?

A. Data compression

B. Accountability and traceability

C. Capacity scaling

D. Schema optimization

Answer: B

Explanation

Audit logs establish accountability by recording user actions and maintaining an activity history.


Question 9

Why are audit logs important during a security investigation?

A. They automatically restore deleted content.

B. They optimize warehouse performance.

C. They classify data sensitivity.

D. They provide a timeline of user and administrative activities.

Answer: D

Explanation

Audit logs help investigators reconstruct events and determine what actions occurred during a security incident.


Question 10

An organization wants to review all permission changes made during the last quarter.

Which Microsoft Fabric capability should be used?

A. Capacity Metrics

B. Query Monitoring

C. Audit Logs

D. Dataflows Gen2

Answer: C

Explanation

Audit logs record permission modifications and provide historical visibility into administrative actions.


Exam Tip

A frequent DP-700 exam challenge is distinguishing between audit logs, monitoring tools, and governance features.

Remember:

Requirement | Solution
Determine who performed an action | Audit Logs
Monitor system performance | Monitoring Metrics
Track capacity utilization | Capacity Monitoring
Classify sensitive content | Sensitivity Labels
Identify trusted content | Endorsements

If a question asks who did something, when it happened, or what changes were made, the correct answer is usually Audit Logs. If the question focuses on CPU, memory, performance, or utilization, the answer is likely a monitoring tool rather than auditing.



Choose between Dataflow Gen2, a pipeline and a notebook (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
   --> Orchestrate processes
      --> Choose between Dataflow Gen2, a pipeline and a notebook



Introduction

One of the most important skills for a Microsoft Fabric Data Engineer is selecting the appropriate tool for a particular task. Microsoft Fabric provides several powerful technologies for data ingestion, transformation, orchestration, and automation. Three of the most commonly used are:

  • Dataflow Gen2
  • Data Pipelines
  • Notebooks

Although these tools often work together, they serve different purposes. Choosing the wrong tool can lead to unnecessary complexity, reduced maintainability, and increased development effort.

For the DP-700 exam, you should understand:

  • The primary purpose of each tool
  • When to use each tool
  • Strengths and limitations
  • Common design patterns
  • How these tools interact with one another

A significant number of DP-700 scenario questions are likely to test your ability to determine which Fabric component best fits a given business requirement.


Understanding the Three Tools

Before comparing them, it is important to understand their primary functions.

Tool | Primary Purpose
Dataflow Gen2 | Low-code data ingestion and transformation
Data Pipeline | Workflow orchestration and automation
Notebook | Code-based data processing and advanced analytics

A useful way to remember this is:

Dataflow Gen2 = Transform Data
Pipeline = Orchestrate Processes
Notebook = Execute Code

What Is Dataflow Gen2?

Dataflow Gen2 is a low-code/no-code data integration and transformation tool built on Power Query technology.

It allows users to:

  • Connect to data sources
  • Clean data
  • Transform data
  • Merge datasets
  • Filter records
  • Load data into Fabric destinations

Dataflow Gen2 is designed for users who prefer visual development rather than coding.


Dataflow Gen2 Architecture

Data Source → Power Query Transformations → Dataflow Gen2 → Lakehouse / Warehouse

The transformation logic is built using a graphical interface.


Common Dataflow Gen2 Tasks

Examples include:

  • Removing duplicates
  • Filtering rows
  • Renaming columns
  • Data cleansing
  • Combining files
  • Joining datasets
  • Type conversions
  • Data standardization

These activities require little or no programming.


Advantages of Dataflow Gen2

Low-Code Development

Business analysts and citizen developers can build transformations without extensive coding knowledge.

Reusable Transformations

Transformations can be reused across multiple projects.

Familiar Power Query Experience

Users familiar with Power BI often adapt quickly.

Large Connector Library

Supports many cloud and on-premises data sources.


Limitations of Dataflow Gen2

Dataflow Gen2 is not ideal for:

  • Complex machine learning workloads
  • Advanced Spark processing
  • Custom Python development
  • Large-scale distributed programming

For those scenarios, notebooks are often more appropriate.


What Is a Data Pipeline?

A Data Pipeline is an orchestration tool.

Its primary purpose is not data transformation but rather coordinating and automating activities.

Think of a pipeline as a workflow engine.


Pipeline Architecture

Activity 1 → Activity 2 → Activity 3 → Activity 4

Pipelines determine:

  • What runs
  • When it runs
  • In what order it runs
  • Under what conditions it runs

Common Pipeline Activities

Examples include:

  • Copy Data
  • Execute Notebook
  • Execute Dataflow
  • Stored Procedures
  • Web Activities
  • Conditional Logic
  • Scheduling Jobs

Pipelines coordinate these activities into a complete workflow.


Advantages of Pipelines

Workflow Automation

Automates complex end-to-end processes.

Scheduling

Supports recurring execution schedules.

Dependency Management

Controls execution order.

Error Handling

Supports retries and failure paths.

Integration

Can orchestrate multiple Fabric components.
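
Dependency management and retries, the two ideas at the core of orchestration, can be illustrated outside Fabric with a small scheduler. This is a conceptual sketch, not how Fabric pipelines are implemented: the activity names are hypothetical stand-ins, the actions are plain Python callables, and no Fabric API is called.

```python
from graphlib import TopologicalSorter

# Each activity maps to the set of activities it depends on (hypothetical names).
DEPENDENCIES = {
    "run_dataflow": set(),
    "run_notebook": {"run_dataflow"},
    "load_warehouse": {"run_notebook"},
    "send_notification": {"load_warehouse"},
}


def run_with_retries(action, max_attempts=3):
    """Call `action` until it succeeds; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            action()
            return attempt
        except RuntimeError:
            if attempt == max_attempts:
                raise  # failure path: give up after the last attempt


def run_pipeline(dependencies, actions):
    """Run every activity after its dependencies, retrying transient failures."""
    order = list(TopologicalSorter(dependencies).static_order())
    attempts = {name: run_with_retries(actions[name]) for name in order}
    return order, attempts


calls = {"run_notebook": 0}


def flaky_notebook():
    """Stand-in activity that fails on its first call only."""
    calls["run_notebook"] += 1
    if calls["run_notebook"] == 1:
        raise RuntimeError("transient failure")


actions = {name: (lambda: None) for name in DEPENDENCIES}
actions["run_notebook"] = flaky_notebook

order, attempts = run_pipeline(DEPENDENCIES, actions)
print(order)
print(attempts)
```

The orchestrator only decides what runs, in which order, and what happens on failure. The actual data work stays inside the activities it calls.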


Limitations of Pipelines

Pipelines are not intended for:

  • Complex data transformations
  • Interactive analysis
  • Advanced programming

Pipelines orchestrate work; they do not replace transformation tools.


What Is a Notebook?

A notebook is a code-based environment that allows developers and data engineers to execute code directly against Fabric data.

Notebooks commonly use:

  • Python
  • PySpark
  • Spark SQL
  • Scala (where supported)

They run on Spark compute engines.


Notebook Architecture

Data Source → Spark Processing → Notebook → Lakehouse / Warehouse

Notebooks provide maximum flexibility and control.


Common Notebook Tasks

Examples include:

  • PySpark transformations
  • Data engineering workflows
  • Machine learning preparation
  • Advanced data cleansing
  • Streaming data processing
  • Delta table optimization
  • Custom business logic

Advantages of Notebooks

Full Programming Flexibility

Developers can implement virtually any logic.

Spark Integration

Supports distributed processing.

Advanced Transformations

Suitable for highly complex data engineering workloads.

Machine Learning Support

Works well with AI and ML frameworks.

Scalability

Can process very large datasets.


Limitations of Notebooks

Coding Required

Requires programming knowledge.

Higher Complexity

Can be more difficult to maintain.

Less Accessible

Business users typically prefer Dataflow Gen2.


Side-by-Side Comparison

Feature | Dataflow Gen2 | Pipeline | Notebook
Primary Purpose | Data Transformation | Orchestration | Advanced Processing
Coding Required | Minimal | Minimal | Extensive
Scheduling | Limited | Yes | Usually via Pipeline
Spark Support | No Direct Coding | No | Yes
Visual Interface | Yes | Yes | No
Advanced Logic | Limited | Limited | Extensive
Best for ETL | Yes | Coordinates ETL | Yes
Machine Learning | No | No | Yes

When to Choose Dataflow Gen2

Choose Dataflow Gen2 when:

  • Data cleansing is required
  • Users prefer visual tools
  • Power Query transformations are sufficient
  • Business analysts are building solutions
  • Coding should be minimized

Example

Requirement:

Import CSV files
Remove duplicates
Rename columns
Load into Lakehouse

Best Choice:

Dataflow Gen2


When to Choose a Pipeline

Choose a Pipeline when:

  • Multiple tasks must be coordinated
  • Processes require scheduling
  • Activities depend on one another
  • Workflows need monitoring
  • Automation is required

Example

Requirement:

Run Dataflow
Run Notebook
Load Warehouse
Send Notification

Best Choice:

Pipeline


When to Choose a Notebook

Choose a Notebook when:

  • Complex transformations are required
  • PySpark processing is needed
  • Machine learning is involved
  • Custom code is necessary
  • Large-scale distributed processing is required

Example

Requirement:

Apply custom PySpark transformation
Process 10 TB dataset
Optimize Delta tables

Best Choice:

Notebook


Common Real-World Pattern

In many Fabric environments, all three tools are used together.

Example:

Dataflow Gen2 → Pipeline → Notebook → Warehouse

Workflow:

  1. Dataflow Gen2 cleans source files.
  2. Pipeline orchestrates execution.
  3. Notebook performs advanced transformations.
  4. Results load into a Warehouse.

This layered approach is common in enterprise solutions.
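
To make the orchestration role in this pattern concrete, here is a minimal Python sketch of what a pipeline conceptually does: run activities in order and stop as soon as one fails. It is not the Fabric pipeline API. `run_pipeline` and the three activity functions are hypothetical stand-ins that simply report success.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins for Fabric activities; each returns True on success.
def run_dataflow() -> bool:
    return True   # Dataflow Gen2 cleans source files

def run_notebook() -> bool:
    return True   # Notebook performs advanced transformations

def load_warehouse() -> bool:
    return True   # Results load into a Warehouse


def run_pipeline(activities: List[Tuple[str, Callable[[], bool]]]) -> List[str]:
    """Run activities in order; stop at the first failure (on-success dependency)."""
    log = []
    for name, activity in activities:
        succeeded = activity()
        log.append(f"{name}: {'Succeeded' if succeeded else 'Failed'}")
        if not succeeded:
            break  # downstream activities depend on this one, so skip them
    return log


log = run_pipeline([
    ("Dataflow Gen2", run_dataflow),
    ("Notebook", run_notebook),
    ("Load Warehouse", load_warehouse),
])
print(log)
```

The point of the sketch is the division of labor: the pipeline contains no transformation logic of its own, only sequencing, dependency handling, and a run log that can be monitored.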


Decision Framework for DP-700

When reading exam questions, ask:

Is the requirement primarily data transformation?

Choose:

Dataflow Gen2


Is the requirement workflow orchestration?

Choose:

Pipeline


Is the requirement advanced coding or Spark processing?

Choose:

Notebook


Common Exam Traps

Trap #1

Question mentions:

  • Scheduling
  • Dependencies
  • Automation

Correct answer:

Pipeline

This holds even if the scenario also involves transformations, because the pipeline orchestrates the tools that perform them.


Trap #2

Question mentions:

  • PySpark
  • Python
  • Machine Learning
  • Spark

Correct answer:

Notebook


Trap #3

Question mentions:

  • Power Query
  • Visual transformation
  • No-code development

Correct answer:

Dataflow Gen2
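
As a study aid, the decision framework and the three traps above can be written as a small rule-based function. This is only a sketch: `choose_tool` is a hypothetical helper, the keyword sets are taken from the lists above, and checking orchestration cues first is an assumption based on Trap #1.

```python
def choose_tool(keywords: set) -> str:
    """Map scenario keywords to a Fabric tool, following the rules above."""
    k = {word.lower() for word in keywords}
    # Trap #1: orchestration cues win even when transformations are mentioned.
    if k & {"scheduling", "dependencies", "automation", "orchestration"}:
        return "Pipeline"
    # Trap #2: code and Spark cues point to a notebook.
    if k & {"pyspark", "python", "spark", "machine learning"}:
        return "Notebook"
    # Trap #3: visual, Power Query, and no-code cues point to Dataflow Gen2.
    if k & {"power query", "visual transformation", "no-code", "transformation"}:
        return "Dataflow Gen2"
    return "Unclear: re-read the scenario"


print(choose_tool({"Scheduling", "Transformation"}))  # Pipeline
print(choose_tool({"PySpark"}))                       # Notebook
print(choose_tool({"Power Query"}))                   # Dataflow Gen2
```

Real exam questions need judgment rather than keyword matching, but the ordering of the checks mirrors how the traps are meant to be read.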


DP-700 Exam Focus Areas

You should understand:

✓ Purpose of Dataflow Gen2

✓ Purpose of Data Pipelines

✓ Purpose of Notebooks

✓ Visual versus code-based development

✓ Workflow orchestration

✓ Spark processing

✓ Power Query transformations

✓ Scheduling and automation

✓ Common integration patterns

✓ Appropriate tool selection for business scenarios


Practice Exam Questions

Question 1

A business analyst needs to import CSV files, remove duplicate rows, and standardize column names using a visual interface with minimal coding.

Which Fabric component should be used?

A. Notebook

B. Data Pipeline

C. Dataflow Gen2

D. Deployment Pipeline

Answer: C

Explanation

Dataflow Gen2 is designed for low-code data ingestion and transformation using Power Query.


Question 2

A data engineering solution must execute the following sequence:

  1. Run a Dataflow Gen2 process
  2. Execute a Notebook
  3. Load a Warehouse
  4. Send a notification email

Which Fabric component should coordinate this workflow?

A. Lakehouse

B. Data Pipeline

C. Notebook

D. Semantic Model

Answer: B

Explanation

Pipelines are designed to orchestrate and automate multiple activities and dependencies.


Question 3

A team needs to perform complex PySpark transformations against several terabytes of data.

Which Fabric component is most appropriate?

A. Dataflow Gen2

B. Pipeline

C. Dashboard

D. Notebook

Answer: D

Explanation

Notebooks provide Spark-based programming environments suitable for large-scale transformations.


Question 4

Which Fabric component is primarily responsible for workflow orchestration?

A. Dataflow Gen2

B. Lakehouse

C. Warehouse

D. Data Pipeline

Answer: D

Explanation

Data Pipelines coordinate and automate execution of multiple activities.


Question 5

A solution requires users with no programming experience to create reusable data cleansing transformations.

Which component should be selected?

A. Notebook

B. Dataflow Gen2

C. Pipeline

D. Spark Job Definition

Answer: B

Explanation

Dataflow Gen2 provides a low-code visual environment for data preparation.


Question 6

Which Fabric component offers the greatest flexibility for implementing custom business logic?

A. Dataflow Gen2

B. Warehouse

C. Notebook

D. Data Pipeline

Answer: C

Explanation

Notebooks support Python, PySpark, and Spark SQL, allowing virtually unlimited customization.


Question 7

A company wants to schedule nightly execution of several notebooks and monitor failures.

Which Fabric component should be used?

A. Dataflow Gen2

B. Notebook

C. Lakehouse

D. Data Pipeline

Answer: D

Explanation

Pipelines provide scheduling, monitoring, dependencies, and failure handling.


Question 8

Which statement best describes Dataflow Gen2?

A. It is primarily a workflow orchestration tool.

B. It is a low-code data transformation solution based on Power Query.

C. It is designed for machine learning development.

D. It replaces Spark notebooks.

Answer: B

Explanation

Dataflow Gen2 is optimized for low-code ETL and data transformation workloads.


Question 9

A data engineer must optimize Delta tables using Spark commands and Python code.

Which Fabric component should be used?

A. Notebook

B. Data Pipeline

C. Dataflow Gen2

D. Warehouse

Answer: A

Explanation

Notebook environments provide direct access to Spark capabilities and custom code execution.


Question 10

Which scenario is the best fit for a Data Pipeline?

A. Creating Power Query transformations

B. Applying machine learning algorithms

C. Coordinating multiple Fabric activities into an automated workflow

D. Writing custom PySpark code

Answer: C

Explanation

Pipelines are specifically designed for orchestration, automation, scheduling, dependency management, and monitoring.


Exam Tip

A useful DP-700 memory aid is:

Requirement | Best Tool
Visual ETL and data preparation | Dataflow Gen2
Scheduling and orchestration | Data Pipeline
Spark, Python, and advanced processing | Notebook

When a scenario focuses on automation and coordinating activities, think Pipeline.

When it focuses on Power Query transformations, think Dataflow Gen2.

When it focuses on PySpark, Spark SQL, machine learning, or custom code, think Notebook.


Go to the DP-700 Exam Prep Hub main page.

Practice Questions: Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric) (DP-900 Exam Prep)

Practice Questions


Question 1

What is the primary purpose of Azure Databricks?

A. Hosting relational databases
B. Managing file shares
C. Processing large-scale data using Apache Spark
D. Running virtual machines

Answer: C

Explanation:
Azure Databricks is built on Apache Spark for large-scale data processing.


Question 2

Which feature is a key characteristic of Azure Databricks?

A. Fixed schema relational tables
B. Distributed data processing
C. File-based storage only
D. Limited scalability

Answer: B

Explanation:
Databricks uses distributed computing to process large datasets efficiently.


Question 3

Which scenario is BEST suited for Azure Databricks?

A. Hosting a transactional database
B. Running large-scale ETL pipelines and machine learning models
C. Managing shared file storage
D. Serving static web pages

Answer: B

Explanation:
Databricks is ideal for data engineering and machine learning at scale.


Question 4

What is Microsoft Fabric primarily designed for?

A. Running operating systems
B. Providing a unified, end-to-end analytics platform
C. Managing virtual networks
D. Hosting relational databases only

Answer: B

Explanation:
Microsoft Fabric integrates multiple analytics capabilities into one unified platform.


Question 5

Which component of Microsoft Fabric serves as a unified data storage layer?

A. Azure Blob Storage
B. SQL Database
C. OneLake
D. Azure Files

Answer: C

Explanation:
OneLake is the centralized storage layer within Microsoft Fabric.


Question 6

Which service is BEST suited for organizations that want a single platform for data engineering, data warehousing, and BI?

A. Azure Virtual Machines
B. Azure Databricks
C. Microsoft Fabric
D. Azure Table Storage

Answer: C

Explanation:
Fabric provides an end-to-end unified analytics experience.


Question 7

Which of the following best describes the difference between Azure Databricks and Microsoft Fabric?

A. Databricks is for storage, Fabric is for compute
B. Databricks focuses on big data processing, Fabric provides a unified analytics platform
C. Fabric only supports relational data, Databricks does not
D. Databricks cannot scale, Fabric can

Answer: B

Explanation:
Databricks focuses on processing and ML, while Fabric provides end-to-end analytics.


Question 8

Which programming environments are commonly supported in Azure Databricks notebooks?

A. HTML and CSS only
B. Python, SQL, Scala, and R
C. JavaScript only
D. PowerShell only

Answer: B

Explanation:
Databricks notebooks support multiple languages including Python, SQL, Scala, and R.


Question 9

Which scenario is NOT ideal for Azure Databricks?

A. Large-scale data transformation
B. Machine learning model training
C. Managing simple file shares
D. Processing streaming data

Answer: C

Explanation:
Databricks is not designed for file-sharing scenarios.


Question 10

Which statement about Microsoft Fabric is TRUE?

A. It requires manual infrastructure management
B. It is a SaaS-based unified analytics platform
C. It only supports batch processing
D. It replaces all Azure services

Answer: B

Explanation:
Microsoft Fabric is a fully managed SaaS platform that integrates analytics services.


✅ Quick Exam Takeaways

Azure Databricks

  • Apache Spark-based
  • Distributed processing
  • Data engineering & machine learning

Microsoft Fabric

  • Unified analytics platform
  • End-to-end solution (data + analytics + BI)
  • Includes OneLake storage

✔ Key differences:

  • Databricks → processing & ML
  • Fabric → all-in-one analytics platform

✔ Exam tip:
👉 Big data processing → Azure Databricks
👉 Unified analytics platform → Microsoft Fabric


Go to the DP-900 Exam Prep Hub main page.

Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric) (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
--> Describe common elements of large-scale analytics
--> Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric)


Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Modern analytics workloads often require processing massive volumes of data quickly and efficiently. Microsoft provides powerful cloud services to meet these needs, including Azure Databricks and Microsoft Fabric.

For the DP-900 exam, you should understand what these services are, their key features, and when to use each.


Why Large-Scale Analytics Services Matter

Large-scale analytics involves:

  • Processing big data (TBs to PBs)
  • Supporting batch and real-time workloads
  • Enabling advanced analytics and machine learning

✔ Traditional tools often cannot scale to meet these demands.


Azure Databricks


What Is Azure Databricks?

Azure Databricks is a cloud-based analytics platform built on Apache Spark.

It is designed for:

  • Big data processing
  • Data engineering
  • Machine learning
  • Collaborative analytics

Key Features


1. Apache Spark-Based Processing

  • Distributed computing engine
  • Processes large datasets in parallel

✔ Ideal for big data workloads
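
The idea of processing a dataset in parallel can be illustrated with a small Python sketch. This is only a single-machine analogy using threads from the standard library, not Spark or the Databricks API. The helpers `partition` and `partial_sum` are hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(data, n_parts):
    """Split data into n_parts roughly equal slices."""
    return [data[i::n_parts] for i in range(n_parts)]

def partial_sum(chunk):
    """Work done independently on one partition."""
    return sum(chunk)

data = list(range(1, 101))           # the numbers 1..100
chunks = partition(data, 4)

# Each partition is processed independently, then the results are combined.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(partial_sum, chunks))

total = sum(partials)
print(total)  # 5050
```

A distributed engine follows the same split, process, and combine pattern, but spreads the partitions across many machines instead of threads.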


2. Collaborative Workspace

  • Notebooks (Python, SQL, Scala, R)
  • Multiple users can collaborate

3. Integration with Azure

  • Works with Azure Data Lake Storage
  • Integrates with Azure Synapse Analytics

4. Machine Learning Support

  • Built-in ML capabilities
  • Supports advanced analytics workflows

Common Use Cases

  • Big data processing (ETL/ELT pipelines)
  • Data science and machine learning
  • Real-time analytics
  • Data transformation at scale

Best for: Data engineers and data scientists working with large datasets


Microsoft Fabric


What Is Microsoft Fabric?

Microsoft Fabric is an end-to-end, unified analytics platform that brings together multiple data services into a single environment.

It integrates:

  • Data engineering
  • Data warehousing
  • Data science
  • Real-time analytics
  • Business intelligence

Key Features


1. Unified Platform

  • Combines multiple services into one
  • Reduces complexity of managing separate tools

2. OneLake (Unified Storage Layer)

  • Centralized data lake for all workloads
  • Eliminates data silos

3. Integrated Analytics Experiences

  • Data Factory (ingestion)
  • Data Warehouse
  • Real-Time Analytics
  • Power BI integration

4. SaaS-Based Model

  • Fully managed platform
  • Minimal infrastructure management

Common Use Cases

  • End-to-end analytics solutions
  • Unified data platform for organizations
  • Business intelligence and reporting
  • Data integration and transformation

Best for: Organizations wanting a single, unified analytics solution


Azure Databricks vs Microsoft Fabric

Feature | Azure Databricks | Microsoft Fabric
Focus | Big data processing & ML | End-to-end analytics platform
Engine | Apache Spark | Multiple integrated engines
Users | Data engineers, data scientists | Broad (engineers, analysts, business users)
Complexity | More flexible, more technical | Simpler, unified experience
Use Case | Advanced analytics & ML | Unified analytics and BI

How They Fit in an Analytics Architecture

Typical roles:

  • Azure Databricks
    • Data processing
    • Advanced transformations
    • Machine learning
  • Microsoft Fabric
    • End-to-end pipeline
    • Storage (OneLake)
    • Reporting (Power BI integration)

✔ They can complement each other in modern architectures.


Key Considerations When Choosing


Choose Azure Databricks when:

  • You need advanced data engineering or machine learning
  • You require Spark-based processing
  • You want full control and flexibility

Choose Microsoft Fabric when:

  • You want a unified analytics platform
  • You prefer simplified, integrated workflows
  • You need end-to-end analytics in one place

Why This Matters for DP-900

On the exam, you may be asked to:

  • Identify the purpose of Azure Databricks
  • Recognize Microsoft Fabric as a unified analytics platform
  • Choose the right service for a scenario
  • Understand how these services support large-scale analytics

Summary — Exam-Relevant Takeaways

Azure Databricks

  • Apache Spark-based
  • Big data processing
  • Machine learning
  • Flexible and powerful

Microsoft Fabric

  • Unified analytics platform
  • End-to-end solution
  • Includes data engineering, warehousing, and BI

✔ Key difference:

  • Databricks → advanced processing & ML
  • Fabric → all-in-one analytics platform

✔ Exam tip:
👉 Spark + big data processing → Azure Databricks
👉 Unified analytics platform → Microsoft Fabric


Go to the Practice Exam Questions for this topic.

Go to the DP-900 Exam Prep Hub main page.

Describe the difference between Batch and Streaming data (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
--> Describe considerations for real-time data analytics
--> Describe the difference between Batch and Streaming data


Understanding the difference between batch data and streaming data is fundamental for designing modern analytics solutions. These two approaches define how data is ingested, processed, and analyzed.


What Is Batch Data?

Batch data refers to data that is:

  • Collected over a period of time
  • Processed in large chunks (batches)
  • Handled at scheduled intervals

Key Characteristics of Batch Data

  • High latency (minutes, hours, or days)
  • Processes large volumes at once
  • Typically scheduled (e.g., nightly jobs)
  • Efficient and cost-effective

Common Use Cases

  • Daily sales reports
  • Monthly financial summaries
  • Historical data analysis
  • Data warehousing workloads

Azure Services for Batch Processing

  • Azure Data Factory → batch ingestion and orchestration
  • Azure Synapse Analytics → batch processing and analytics

What Is Streaming Data?

Streaming data refers to data that is:

  • Generated continuously
  • Processed in real time (or near real time)
  • Handled as individual events or small micro-batches

Key Characteristics of Streaming Data

  • Low latency (seconds or milliseconds)
  • Continuous data flow
  • Enables real-time insights
  • Often requires more complex processing

Common Use Cases

  • IoT sensor monitoring
  • Fraud detection
  • Live dashboards
  • Website activity tracking

Azure Services for Streaming

  • Azure Event Hubs → event ingestion
  • Azure Stream Analytics → real-time processing

Batch vs Streaming — Key Differences

Feature | Batch Processing | Streaming Processing
Data Flow | Periodic | Continuous
Latency | High | Low
Data Size | Large chunks | Small events
Complexity | Simpler | More complex
Cost | Lower | Higher
Use Case | Historical analysis | Real-time insights
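
The difference in data flow and latency can be sketched in a few lines of Python. This is an illustration only, not an Azure API. The function names `batch_total` and `streaming_totals` are hypothetical.

```python
from typing import Iterable, Iterator, List

def batch_total(events: List[float]) -> float:
    """Batch: wait until all events are collected, then process them at once."""
    return sum(events)

def streaming_totals(events: Iterable[float]) -> Iterator[float]:
    """Streaming: update the result as each event arrives."""
    running = 0.0
    for amount in events:
        running += amount
        yield running   # a fresh result is available after every event

sales = [10.0, 5.0, 20.0]
print(batch_total(sales))              # 35.0, one result after the batch completes
print(list(streaming_totals(sales)))   # [10.0, 15.0, 35.0], one result per event
```

Both approaches reach the same final total. What differs is when results become available: once per scheduled run for batch, continuously for streaming.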

When to Use Batch Processing

Choose batch when:

  • Real-time data is not required
  • You are working with large historical datasets
  • Cost efficiency is important
  • Processing can occur on a schedule

When to Use Streaming Processing

Choose streaming when:

  • You need real-time or near real-time insights
  • Data is generated continuously
  • Immediate action is required

Hybrid Approaches (Lambda / Modern Architectures)

Many modern systems use both:

  • Batch layer → historical analysis
  • Streaming layer → real-time insights

✔ Example:

  • Real-time dashboard + nightly aggregated reports

Why This Matters for DP-900

On the exam, you may be asked to:

  • Distinguish between batch and streaming scenarios
  • Choose the appropriate processing method
  • Identify Azure services for each approach
  • Understand trade-offs (latency, cost, complexity)

Summary — Exam-Relevant Takeaways

Batch processing

  • Processes data in chunks
  • Higher latency
  • Lower cost
  • Best for historical analysis

Streaming processing

  • Processes data continuously
  • Low latency
  • Enables real-time insights
  • More complex

✔ Azure services:

  • Batch → Azure Data Factory, Azure Synapse Analytics
  • Streaming → Azure Event Hubs, Azure Stream Analytics

✔ Exam tip:
👉 Real-time requirement → Streaming
👉 Scheduled / historical → Batch


Go to the Practice Exam Questions for this topic.

Go to the DP-900 Exam Prep Hub main page.

Describe features of data models in Power BI (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
--> Describe data visualization in Microsoft Power BI
--> Describe features of data models in Power BI


A data model is the foundation of any effective report in Microsoft Power BI. It defines how data is structured, related, and calculated, enabling efficient analysis and meaningful visualizations.

For the DP-900 exam, you should understand how data models work, their key components, and best practices.


What Is a Data Model in Power BI?

A data model is a logical representation of data that includes:

  • Tables
  • Relationships
  • Calculations

It allows Power BI to:

  • Combine data from multiple sources
  • Enable filtering and aggregation
  • Support interactive reporting

Key Features of Power BI Data Models


1. Tables

Data models consist of one or more tables, which can come from:

  • Databases
  • Files (Excel, CSV)
  • Cloud sources

✔ Tables contain rows (records) and columns (fields)


2. Relationships

Relationships define how tables are connected.

Types of Relationships

  • One-to-many (1:*) → Most common
  • Many-to-one (*:1)
  • Many-to-many (*:*)

Key Concepts

  • Primary key → Unique identifier in one table
  • Foreign key → Reference in another table

✔ Relationships enable filtering across tables


3. Schema Design (Star Schema)

Power BI models commonly follow a star schema:

  • Fact tables → Contain measurable data (e.g., sales)
  • Dimension tables → Contain descriptive data (e.g., customer, product)

✔ This structure improves performance and usability
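
To see how a fact table, a dimension table, and the relationship between them work together, consider this small Python sketch. It is a conceptual illustration, not how Power BI stores or queries data. The tables, column names, and the `sales_for_category` helper are hypothetical.

```python
# Dimension table: one row per product (ProductID is the primary key).
dim_product = [
    {"ProductID": 1, "Category": "Bikes"},
    {"ProductID": 2, "Category": "Helmets"},
]

# Fact table: one row per sale (ProductID is the foreign key).
fact_sales = [
    {"ProductID": 1, "Amount": 500},
    {"ProductID": 1, "Amount": 700},
    {"ProductID": 2, "Amount": 40},
]

def sales_for_category(category: str) -> int:
    """Filter the dimension, then let the filter flow to the fact table via the key."""
    product_ids = {p["ProductID"] for p in dim_product if p["Category"] == category}
    return sum(s["Amount"] for s in fact_sales if s["ProductID"] in product_ids)

print(sales_for_category("Bikes"))  # 1200
```

The one-to-many relationship is visible in the data: each product appears once in the dimension table but can appear many times in the fact table.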


4. Measures and Calculated Columns

Power BI uses DAX (Data Analysis Expressions) for calculations.

Measures

  • Calculated at query time
  • Used in aggregations (e.g., SUM, AVERAGE)

Calculated Columns

  • Computed row by row during data load (refresh)
  • Stored in the model

✔ Measures are preferred for performance
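
In Power BI both kinds of calculation are written in DAX. Purely to illustrate the timing difference, the Python sketch below mimics a calculated column (computed once per row and stored) and a measure (evaluated on demand over the rows the current filter leaves visible). The data and the `total_sales` function are hypothetical.

```python
sales = [
    {"Region": "East", "Quantity": 2, "UnitPrice": 10.0},
    {"Region": "West", "Quantity": 1, "UnitPrice": 30.0},
    {"Region": "East", "Quantity": 3, "UnitPrice": 5.0},
]

# Calculated column: computed once per row when data is loaded, then stored.
for row in sales:
    row["LineTotal"] = row["Quantity"] * row["UnitPrice"]

# Measure: not stored; evaluated at query time over whatever rows the
# current filter context leaves visible.
def total_sales(rows, region=None):
    visible = [r for r in rows if region is None or r["Region"] == region]
    return sum(r["LineTotal"] for r in visible)

print(total_sales(sales))           # 65.0 with no filter
print(total_sales(sales, "East"))   # 35.0 when filtered to East
```

The stored column adds a value to every row of the table, while the measure occupies no storage and adapts to each filter, which is the reasoning behind preferring measures for aggregations.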


5. Data Types

Each column has a defined data type:

  • Text
  • Number
  • Date/Time
  • Boolean

✔ Correct data types ensure accurate calculations and visuals


6. Hierarchies

Hierarchies allow users to drill down into data.

Example

  • Year → Quarter → Month → Day

✔ Used for interactive reporting and exploration
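
Drilling down a date hierarchy amounts to aggregating the same data at progressively finer levels. A minimal Python sketch of that idea follows. The sample data, the `LEVELS` mapping, and `totals_at` are hypothetical and only stand in for what Power BI does when a user drills down in a visual.

```python
from collections import defaultdict
from datetime import date

sales = [
    (date(2024, 1, 15), 100),
    (date(2024, 2, 3), 50),
    (date(2024, 5, 20), 70),
    (date(2025, 1, 9), 30),
]

# Each hierarchy level maps a date to a grouping key; deeper levels are finer.
LEVELS = {
    "Year": lambda d: (d.year,),
    "Quarter": lambda d: (d.year, (d.month - 1) // 3 + 1),
    "Month": lambda d: (d.year, (d.month - 1) // 3 + 1, d.month),
}

def totals_at(level: str) -> dict:
    """Aggregate sales at the requested hierarchy level."""
    totals = defaultdict(int)
    for day, amount in sales:
        totals[LEVELS[level](day)] += amount
    return dict(totals)

print(totals_at("Year"))     # {(2024,): 220, (2025,): 30}
print(totals_at("Quarter"))  # {(2024, 1): 150, (2024, 2): 70, (2025, 1): 30}
```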


7. Filtering and Cross-Filtering

Relationships enable:

  • Filter propagation between tables
  • Cross-filtering in visuals

✔ Example:
Selecting a product filters related sales data


8. Data Granularity

Granularity refers to the level of detail in data.

  • Fine-grained → detailed (e.g., individual transactions)
  • Coarse-grained → summarized (e.g., monthly totals)

✔ Consistent granularity is important for accurate analysis


9. Model Optimization

Well-designed models:

  • Use fewer tables when possible
  • Avoid unnecessary columns
  • Use measures instead of calculated columns
  • Follow star schema design

✔ Improves performance and usability


10. Relationships Direction (Filter Direction)

Relationships can filter:

  • Single direction (default, recommended)
  • Both directions (used cautiously)

✔ Incorrect settings can lead to ambiguous results


Typical Data Modeling Workflow in Power BI

  1. Load data into Power BI
  2. Clean and transform data (Power Query)
  3. Define relationships
  4. Create measures and calculations
  5. Build reports and visuals

Why This Matters for DP-900

On the exam, you may be asked to:

  • Identify components of a data model
  • Understand relationships and keys
  • Differentiate between measures and calculated columns
  • Recognize star schema design
  • Understand filtering behavior

Summary — Exam-Relevant Takeaways

✔ A data model includes:

  • Tables
  • Relationships
  • Calculations

✔ Key features:

  • Relationships (1:*, *:*)
  • Star schema (fact + dimension tables)
  • Measures vs calculated columns
  • Hierarchies and filtering

✔ Best practices:

  • Use star schema
  • Prefer measures over calculated columns
  • Maintain consistent granularity

✔ Exam tips:
👉 Fact table = metrics (numbers)
👉 Dimension table = descriptive attributes
👉 Measure = dynamic calculation
👉 Calculated column = stored value


Go to the Practice Exam Questions for this topic.

Go to the DP-900 Exam Prep Hub main page.

Practice Questions: Describe features of data models in Power BI (DP-900 Exam Prep)

Practice Questions


Question 1

What is the primary purpose of a data model in Power BI?

A. To store raw files
B. To define relationships and enable data analysis
C. To manage network connections
D. To create dashboards only

Answer: B

Explanation:
A data model organizes data and defines relationships and calculations for analysis.


Question 2

Which component connects tables together in a Power BI data model?

A. Measures
B. Relationships
C. Dashboards
D. Queries

Answer: B

Explanation:
Relationships define how tables interact and allow filtering across them.


Question 3

Which type of relationship is MOST common in Power BI models?

A. Many-to-many
B. One-to-many
C. One-to-one
D. No relationship

Answer: B

Explanation:
The one-to-many (1:*) relationship is the most common in analytical models.


Question 4

In a star schema, which table typically contains numeric values used for analysis?

A. Dimension table
B. Lookup table
C. Fact table
D. Bridge table

Answer: C

Explanation:
Fact tables store measurable data (e.g., sales, revenue).


Question 5

What is the role of a dimension table in a data model?

A. Store raw transaction data
B. Store aggregated values only
C. Provide descriptive attributes for filtering and grouping
D. Execute calculations

Answer: C

Explanation:
Dimension tables contain descriptive data like customer or product details.


Question 6

Which type of calculation is evaluated at query time in Power BI?

A. Calculated column
B. Measure
C. Table relationship
D. Data type

Answer: B

Explanation:
Measures are calculated dynamically during query execution.


Question 7

Which language is used to create measures and calculated columns in Power BI?

A. SQL
B. Python
C. DAX
D. Java

Answer: C

Explanation:
DAX (Data Analysis Expressions) is used for calculations in Power BI.


Question 8

What is the benefit of using a star schema in Power BI?

A. Increased data duplication
B. Simplified relationships and improved performance
C. Elimination of fact tables
D. Reduced data types

Answer: B

Explanation:
Star schema improves performance and usability by simplifying relationships.


Question 9

What happens when you create a relationship between two tables?

A. Data is duplicated
B. Tables are merged into one
C. Filters can propagate between tables
D. Data types are changed

Answer: C

Explanation:
Relationships allow filtering across related tables.


Question 10

Which feature allows users to drill down through levels such as Year → Month → Day?

A. Measures
B. Hierarchies
C. Relationships
D. Dashboards

Answer: B

Explanation:
Hierarchies enable drill-down analysis in reports.


✅ Quick Exam Takeaways

✔ Data model components:

  • Tables
  • Relationships
  • Measures & calculated columns

✔ Key concepts:

  • Fact table → numeric data
  • Dimension table → descriptive data
  • Relationships → connect tables

✔ Calculations:

  • Measures → dynamic
  • Calculated columns → stored

✔ Design best practice:

  • Use star schema

✔ Exam tip:
👉 Measure = calculated at query time
👉 Calculated column = stored in table
👉 Fact = numbers, Dimension = descriptions


Go to the DP-900 Exam Prep Hub main page.

Identify capabilities of Power BI (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
--> Describe data visualization in Microsoft Power BI
--> Identify capabilities of Power BI


Microsoft Power BI is Microsoft’s business intelligence (BI) and data visualization platform. It enables users to connect to data, transform it, and create interactive reports and dashboards for data-driven decision-making.

For the DP-900 exam, you should understand what Power BI can do, its core components, and its role in an analytics solution.


What Is Power BI?

Power BI is a self-service and enterprise BI tool that allows users to:

  • Connect to multiple data sources
  • Transform and model data
  • Create visualizations and reports
  • Share insights across an organization

Core Capabilities of Power BI


1. Data Connectivity

Power BI can connect to a wide range of data sources:

  • Cloud services (Azure, SaaS apps)
  • Databases (SQL Server, Azure SQL)
  • Files (Excel, CSV)
  • Streaming data sources

✔ Supports both Import and DirectQuery modes


2. Data Transformation (Power Query)

Power BI includes Power Query, a tool for:

  • Cleaning data
  • Shaping and transforming data
  • Merging and filtering datasets

✔ Uses a visual interface (no coding required, though the M language is available for advanced steps)


3. Data Modeling

Power BI enables users to create data models by:

  • Defining relationships between tables
  • Creating calculated columns and measures
  • Using DAX (Data Analysis Expressions)

✔ Supports star schema design (common in analytics)


4. Data Visualization

Power BI provides a rich set of visualizations:

  • Charts (bar, line, pie, etc.)
  • Tables and matrices
  • Maps and geographic visuals
  • KPIs and gauges

✔ Visuals are interactive and dynamic


5. Reports

A report in Power BI:

  • Is a collection of visualizations
  • Typically spans multiple pages
  • Allows filtering, slicing, and drill-down

✔ Built in Power BI Desktop and published to the cloud


6. Dashboards

A dashboard:

  • Is a single-page view of key metrics
  • Displays pinned visuals from reports
  • Provides a high-level overview

✔ Used for quick insights and monitoring


7. Data Refresh

Power BI supports:

  • Scheduled refresh (periodic updates)
  • Real-time/streaming data updates

✔ Ensures reports reflect current data


8. Sharing and Collaboration

Power BI enables users to:

  • Publish reports to the Power BI Service
  • Share dashboards with others
  • Collaborate across teams

✔ Integrates with Microsoft 365 (Teams, SharePoint)


9. Security

Power BI provides:

  • Row-Level Security (RLS)
  • Data access controls
  • Integration with Microsoft Entra ID (formerly Azure Active Directory)

✔ Ensures users only see authorized data
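
In Power BI, Row-Level Security is configured through roles with DAX filter expressions. The Python sketch below is only a conceptual analogue of the effect, showing each user restricted to the rows for an assigned region. The user names, the `USER_REGION` mapping, and `visible_rows` are hypothetical.

```python
# Hypothetical mapping of users to the region they are allowed to see.
USER_REGION = {"ana@contoso.com": "East", "raj@contoso.com": "West"}

SALES = [
    {"Region": "East", "Amount": 100},
    {"Region": "West", "Amount": 250},
    {"Region": "East", "Amount": 75},
]

def visible_rows(user: str) -> list:
    """Return only the rows whose Region matches the user's assigned region."""
    region = USER_REGION.get(user)   # unknown users have no region assigned
    return [row for row in SALES if row["Region"] == region]

print(visible_rows("ana@contoso.com"))    # the two East rows
print(visible_rows("guest@contoso.com"))  # [] because no region is assigned
```

The same report therefore shows different data to different users without maintaining separate copies of the data.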


10. Integration with Azure and Microsoft Ecosystem

Power BI integrates with:

  • Azure Synapse Analytics
  • Azure Data Lake Storage
  • Microsoft Fabric
  • Excel and other Microsoft tools

✔ Plays a key role in end-to-end analytics solutions


Power BI Components


Power BI Desktop

  • Authoring tool for reports
  • Installed on a local machine

Power BI Service

  • Cloud-based platform
  • Used for sharing and collaboration

Power BI Mobile

  • View dashboards and reports on mobile devices

Typical Analytics Workflow with Power BI

  1. Connect to data sources
  2. Transform data (Power Query)
  3. Model data (relationships, DAX)
  4. Create visualizations
  5. Publish reports
  6. Share dashboards

Why This Matters for DP-900

On the exam, you may be asked to:

  • Identify Power BI capabilities
  • Differentiate between reports and dashboards
  • Understand data connectivity and refresh options
  • Recognize Power BI’s role in analytics solutions

Summary — Exam-Relevant Takeaways

✔ Power BI is used for:

  • Data visualization
  • Reporting
  • Business intelligence

✔ Key capabilities:

  • Data connectivity
  • Data transformation (Power Query)
  • Data modeling (relationships, DAX)
  • Interactive visualizations
  • Sharing and collaboration

✔ Key components:

  • Power BI Desktop → report creation
  • Power BI Service → sharing
  • Dashboards → single-page overview
  • Reports → multi-page detailed analysis

✔ Exam tips:
👉 Report = multi-page, detailed
👉 Dashboard = single-page, summary
👉 Power Query = data transformation
👉 DAX = calculations and measures


Go to the Practice Exam Questions for this topic.

Go to the DP-900 Exam Prep Hub main page.

Practice Questions: Identify capabilities of Power BI (DP-900 Exam Prep)

Practice Questions


Question 1

What is the primary purpose of Microsoft Power BI?

A. Managing databases
B. Running virtual machines
C. Creating reports and visualizations from data
D. Developing applications

Answer: C

Explanation:
Power BI is a business intelligence tool used to create reports, dashboards, and visualizations.


Question 2

Which Power BI component is used to create reports?

A. Power BI Service
B. Power BI Mobile
C. Power BI Desktop
D. Azure Portal

Answer: C

Explanation:
Power BI Desktop is the primary tool for building reports and data models.


Question 3

What is the main difference between a report and a dashboard in Power BI?

A. Reports are single-page, dashboards are multi-page
B. Reports are multi-page, dashboards are single-page
C. Reports are only for developers
D. Dashboards cannot contain visuals

Answer: B

Explanation:
Reports can span multiple pages of detailed, interactive visuals, while a dashboard is a single-page summary made up of tiles pinned from reports.


Question 4

Which feature in Power BI is used to clean and transform data?

A. DAX
B. Power Query
C. Power Pivot
D. Azure Data Factory

Answer: B

Explanation:
Power Query is used for data transformation and preparation.


Question 5

Which language is used in Power BI for creating calculations and measures?

A. SQL
B. Python
C. DAX
D. Java

Answer: C

Explanation:
DAX (Data Analysis Expressions) is used for calculations and measures.
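
A measure differs from a stored column in that it is recalculated for whatever filters are currently in effect, such as a slicer selection. The Python sketch below imitates that idea only; it is not DAX, and the `SALES` rows and the `total_sales` function are hypothetical names chosen for illustration.

```python
# Hypothetical fact rows; in Power BI these would live in a model table.
SALES = [
    {"year": 2023, "region": "East", "amount": 100.0},
    {"year": 2023, "region": "West", "amount": 250.0},
    {"year": 2024, "region": "East", "amount": 75.0},
]


def total_sales(rows, **filter_context):
    """Sum 'amount' over the rows that match every filter currently in effect."""
    return sum(
        row["amount"]
        for row in rows
        if all(row[column] == value for column, value in filter_context.items())
    )


grand_total = total_sales(SALES)                          # no filters: all rows
east_total = total_sales(SALES, region="East")            # as if a slicer selected East
east_2024 = total_sales(SALES, region="East", year=2024)  # two filters combined
```

The same definition yields a different number in each call because only the filters change, which is roughly how one measure can feed many visuals.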


Question 6

Which Power BI feature allows users to restrict data access to specific rows?

A. Data refresh
B. Row-Level Security (RLS)
C. Power Query
D. Dashboards

Answer: B

Explanation:
Row-Level Security (RLS) restricts the rows a user can see, based on roles whose filter rules are written in DAX, so users only see data they are authorized to access.
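
As a rough illustration of the idea, the Python sketch below filters rows by role. It is a simplified model under stated assumptions, not Power BI's implementation: the role names, the user addresses, and the `visible_rows` function are hypothetical, and real RLS rules are DAX filter expressions rather than Python functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SalesRow:
    region: str
    amount: float


ROWS = [SalesRow("East", 100.0), SalesRow("West", 250.0), SalesRow("East", 75.0)]

# Hypothetical roles: each role maps to a predicate that keeps the rows it allows.
ROLE_FILTERS = {
    "EastSales": lambda row: row.region == "East",
    "WestSales": lambda row: row.region == "West",
}

# Hypothetical role membership.
USER_ROLES = {
    "ana@contoso.com": ["EastSales"],
    "raj@contoso.com": ["EastSales", "WestSales"],
}


def visible_rows(user, rows):
    """Return the rows allowed by at least one of the user's roles."""
    filters = [ROLE_FILTERS[role] for role in USER_ROLES.get(user, [])]
    return [row for row in rows if any(allowed(row) for allowed in filters)]


print(visible_rows("ana@contoso.com", ROWS))  # East rows only
```

In this sketch a user who belongs to several roles sees the union of what each role allows, and a user with no role sees nothing.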


Question 7

Which of the following is a key capability of Power BI?

A. Running operating systems
B. Hosting web applications
C. Connecting to multiple data sources
D. Managing network traffic

Answer: C

Explanation:
Power BI can connect to many different data sources, including databases, files, and cloud services.


Question 8

Where are Power BI reports typically published for sharing and collaboration?

A. Power BI Desktop
B. Power BI Service
C. Azure Virtual Machines
D. SQL Server

Answer: B

Explanation:
Reports are published to the Power BI Service for sharing and collaboration.


Question 9

Which capability allows Power BI to display near real-time data?

A. Scheduled refresh only
B. Streaming datasets
C. Static reports
D. Data export

Answer: B

Explanation:
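Streaming datasets receive data as it is pushed to the Power BI Service, so dashboard tiles can update in real time or near real time instead of waiting for a scheduled refresh.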


Question 10

What is the purpose of a Power BI dashboard?

A. To store raw data
B. To create data pipelines
C. To provide a single-page view of key metrics
D. To manage user accounts

Answer: C

Explanation:
Dashboards provide a high-level, single-page summary of important data.


✅ Quick Exam Takeaways

✔ Power BI is used for:

  • Data visualization
  • Reporting
  • Business intelligence

✔ Key features:

  • Power Query → data transformation
  • DAX → calculations
  • Reports → multi-page
  • Dashboards → single-page

✔ Components:

  • Power BI Desktop → build reports
  • Power BI Service → share and collaborate

✔ Security:

  • Row-Level Security (RLS)

✔ Exam tip:
👉 Transform data → Power Query
👉 Create calculations → DAX
👉 Share reports → Power BI Service

