Configure Dataflows Gen2 workspace settings (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
--> Configure Microsoft Fabric workspace settings
--> Configure Dataflows Gen2 workspace settings


Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Dataflows Gen2 are a core component of Microsoft Fabric’s data ingestion and transformation capabilities. They provide a low-code/no-code method for extracting, transforming, and loading (ETL) data into Fabric destinations such as Lakehouses, Warehouses, and other analytics assets.

For the DP-700 exam, it is important to understand not only how to create Dataflows Gen2, but also how workspace settings affect their operation, governance, security, performance, and administration.

Workspace-level settings help administrators establish standards and controls for how Dataflows Gen2 are used within a Fabric environment. Understanding these settings enables data engineers to create scalable, maintainable, and governed data integration solutions.


What Are Dataflows Gen2?

Dataflows Gen2 are cloud-based data transformation solutions built on Power Query technology.

They allow users to:

  • Connect to data sources
  • Clean and transform data
  • Combine multiple datasets
  • Perform data quality operations
  • Load data into Fabric destinations

Unlike notebooks or Spark jobs that require coding skills, Dataflows Gen2 provide a graphical interface for data preparation.

Common use cases include:

  • Data ingestion
  • Data cleansing
  • Dimension table creation
  • Data enrichment
  • ETL and ELT workflows
  • Self-service data preparation

Dataflows Gen2 Architecture

A typical Dataflow Gen2 process consists of:

Data Source → Power Query Transformations → Dataflow Gen2 → Destination

Possible destinations include:

  • Lakehouse Tables
  • Warehouse Tables
  • Azure SQL Database
  • Other supported destinations, such as a Fabric KQL database

Why Workspace Settings Matter

In small environments, Dataflows Gen2 can be managed individually.

However, in enterprise environments, administrators need centralized control over:

  • Dataflow creation
  • Dataflow execution
  • Compute usage
  • Security
  • Data destinations
  • Governance

Workspace settings help establish consistent behavior across all Dataflows Gen2 within a workspace.


Dataflows Gen2 Workspace Administration

Workspace administrators control who can:

  • Create Dataflows Gen2
  • Modify Dataflows Gen2
  • Schedule refreshes
  • Access source data
  • Access destinations

These permissions are governed through Fabric workspace roles.

Workspace Role    Dataflow Capability
Admin             Full control
Member            Create and manage
Contributor       Create and edit
Viewer            Read-only
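
As a study aid, the role table above can be written as a small lookup. This is a minimal sketch, not a Fabric API: the dictionary simply restates the table, and the helper name can_create_dataflow is a hypothetical name chosen for illustration.

```python
# Restates the role table above; this is an illustrative model, not a Fabric API.
ROLE_CAPABILITY = {
    "Admin": "Full control",
    "Member": "Create and manage",
    "Contributor": "Create and edit",
    "Viewer": "Read-only",
}


def can_create_dataflow(role: str) -> bool:
    """Return True when the role's capability in the table covers creating Dataflows Gen2."""
    capability = ROLE_CAPABILITY[role]
    # "Full control" implies every capability, including creation.
    return capability == "Full control" or "Create" in capability


creators = [role for role in ROLE_CAPABILITY if can_create_dataflow(role)]
```

Under this model, every role except Viewer ends up in creators, which matches the read-only row in the table.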

DP-700 Exam Tip

Remember that Dataflows Gen2 do not have a separate security model.

They inherit Fabric workspace permissions.


Configure Dataflow Creation Permissions

Organizations often restrict who can create Dataflows Gen2.

Reasons include:

  • Governance requirements
  • Cost management
  • Data quality controls
  • Standardization

A common enterprise pattern is:

  • Contributors create Dataflows
  • Members manage Dataflows
  • Admins govern Dataflows

This prevents uncontrolled proliferation of ETL processes.


Configure Data Destinations

One of the most important Dataflows Gen2 settings involves destination configuration.

Supported destinations include:

Lakehouse

The most common destination.

Benefits:

  • Delta table storage
  • Integration with Spark
  • Medallion architecture support

Common usage:

  • Bronze layer ingestion
  • Silver layer transformation

Warehouse

Dataflows can load directly into Fabric Warehouses.

Benefits:

  • Structured analytics
  • SQL querying
  • Dimensional modeling support

Multiple Destinations

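Each query in a Dataflow Gen2 can have its own data destination, so a single dataflow can load its outputs into multiple destinations (for example, one query to a Lakehouse and another to a Warehouse).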

Benefits include:

  • Reduced duplication of transformation logic
  • Improved maintainability
  • Consistent outputs

Configure Refresh Settings

Refresh configuration is one of the most frequently tested Dataflow topics.

Refresh settings determine:

  • When Dataflows execute
  • How often they run
  • How data is updated

Options include:

Manual Refresh

Execution occurs only when initiated by a user.

Best for:

  • Testing
  • Development
  • Small workloads

Scheduled Refresh

Execution occurs automatically based on a defined schedule.

Examples:

  • Hourly
  • Daily
  • Weekly

Most production Dataflows use scheduled refresh.


Pipeline-Orchestrated Refresh

Dataflows can be executed through Fabric Data Factory pipelines.

Benefits:

  • End-to-end orchestration
  • Dependency management
  • Complex workflow support

This is commonly used in enterprise ETL solutions.
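
The dependency management that a pipeline provides can be pictured as ordering activities so that each one runs only after its prerequisites. The sketch below uses Python's standard graphlib module to illustrate the idea; the activity names are hypothetical, and this is not how Fabric pipelines are authored or executed.

```python
from graphlib import TopologicalSorter

# Hypothetical activities; each key maps to the set of activities it depends on.
dependencies = {
    "copy_source_files": set(),
    "refresh_dataflow_gen2": {"copy_source_files"},
    "run_notebook": {"refresh_dataflow_gen2"},
}


def execution_order(deps: dict[str, set[str]]) -> list[str]:
    """Return an order in which every activity follows its prerequisites."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
```

Here the Dataflow refresh is placed after the copy step and before the notebook, which is the same guarantee a pipeline gives when activities are chained on success.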


Refresh Failure Notifications

Administrators can configure monitoring and notifications for refresh failures.

Benefits:

  • Faster troubleshooting
  • Improved reliability
  • Reduced downtime

Monitoring is particularly important when Dataflows support business-critical reporting systems.


Configure Data Source Credentials

Dataflows require access credentials for source systems.

Supported authentication methods vary by connector and may include:

  • Organizational account
  • OAuth
  • Basic authentication
  • Service principals
  • API keys

Workspace administrators often establish governance policies around credential management.

Best Practice

Use service accounts or service principals whenever possible for production workloads.

This avoids refresh failures caused by employee account changes.


Configure Gateway Usage

Some data sources reside inside private corporate networks.

Examples:

  • On-premises SQL Server
  • Oracle databases
  • File shares

In these scenarios, Dataflows Gen2 may require an On-Premises Data Gateway.

Gateway settings determine:

  • Connectivity
  • Authentication
  • Data access paths

A common DP-700 scenario involves selecting a gateway for on-premises data access.


Dataflow Compute and Performance Considerations

Dataflows Gen2 execute within Fabric-managed infrastructure.

Administrators should understand factors that impact performance:

Data Volume

Larger datasets increase:

  • Refresh duration
  • Resource consumption

Transformation Complexity

Operations such as:

  • Merges
  • Joins
  • Group By
  • Aggregations

increase processing requirements.


Number of Refreshes

Frequent refresh schedules can consume additional capacity resources.

Administrators should balance:

  • Data freshness
  • Capacity utilization
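
A quick calculation makes the trade-off concrete. The sketch below counts how many refreshes a schedule triggers per day and how much total run time that implies; the four-minute average duration is an assumed figure for illustration, not a measured value.

```python
from datetime import timedelta


def refreshes_per_day(interval: timedelta) -> int:
    """Number of refreshes that fit into 24 hours at the given interval."""
    return timedelta(days=1) // interval


def daily_runtime_minutes(interval: timedelta, avg_duration_minutes: float) -> float:
    """Total minutes per day spent refreshing, given an assumed average duration."""
    return refreshes_per_day(interval) * avg_duration_minutes


every_15_min = refreshes_per_day(timedelta(minutes=15))  # 96 refreshes per day
once_a_day = refreshes_per_day(timedelta(days=1))        # 1 refresh per day
# Assumed 4-minute average refresh: 96 * 4 = 384 minutes of run time per day.
busy_minutes = daily_runtime_minutes(timedelta(minutes=15), 4)
```

A 15-minute schedule runs 96 times as often as a daily one, so it is only worth the capacity if consumers actually need data that fresh.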

Dataflow Lineage and Impact Analysis

Fabric automatically captures lineage information.

Administrators can view:

Source → Dataflow Gen2 → Lakehouse → Semantic Model → Report

Benefits include:

  • Impact analysis
  • Dependency tracking
  • Governance visibility
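
Impact analysis amounts to walking the lineage chain downstream from the item you plan to change. The following is a minimal sketch over a hand-written graph that mirrors the chain described above; it does not read lineage from Fabric, and the function name downstream is a hypothetical name for illustration.

```python
from collections import deque

# Hand-written lineage graph: each item maps to the items that consume it.
LINEAGE = {
    "Source": ["Dataflow Gen2"],
    "Dataflow Gen2": ["Lakehouse"],
    "Lakehouse": ["Semantic Model"],
    "Semantic Model": ["Report"],
    "Report": [],
}


def downstream(item: str, graph: dict[str, list[str]]) -> list[str]:
    """Return every item affected by a change to `item`, in breadth-first order."""
    affected: list[str] = []
    queue = deque(graph.get(item, []))
    while queue:
        node = queue.popleft()
        if node not in affected:
            affected.append(node)
            queue.extend(graph.get(node, []))
    return affected


impacted = downstream("Dataflow Gen2", LINEAGE)
```

With this graph, a change to the Dataflow affects the Lakehouse, the Semantic Model, and the Report, while the Source is untouched.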

Lineage is an important governance feature frequently associated with Dataflows.


Dataflow Monitoring

Workspace administrators can monitor:

  • Refresh history
  • Success rates
  • Failure messages
  • Duration metrics

Monitoring tools include:

  • Refresh history
  • Monitoring Hub
  • Fabric capacity metrics
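
The metrics listed above can be derived from a set of refresh records. The sketch below assumes a hypothetical RefreshRun record with a status, a duration, and an error message; these field names and the sample values are illustrative and do not reflect the actual export format of refresh history or the Monitoring Hub.

```python
from dataclasses import dataclass


@dataclass
class RefreshRun:
    """Hypothetical shape of one refresh history entry."""
    status: str              # "Succeeded" or "Failed" in this sketch
    duration_seconds: float
    error: str = ""


def summarize(runs: list[RefreshRun]) -> dict:
    """Compute success rate, average duration, and collected failure messages."""
    total = len(runs)
    if total == 0:
        return {"success_rate": 0.0, "avg_duration_seconds": 0.0, "failure_messages": []}
    succeeded = sum(1 for run in runs if run.status == "Succeeded")
    return {
        "success_rate": succeeded / total,
        "avg_duration_seconds": sum(run.duration_seconds for run in runs) / total,
        "failure_messages": [run.error for run in runs if run.status != "Succeeded"],
    }


# Made-up sample data for illustration only.
history = [
    RefreshRun("Succeeded", 100),
    RefreshRun("Succeeded", 120),
    RefreshRun("Failed", 40, "Credentials expired"),
    RefreshRun("Succeeded", 140),
]
summary = summarize(history)
```

For this sample, three of four runs succeed, the average duration is 100 seconds, and the single failure message points at a credential problem.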

Common Troubleshooting Areas

  • Credential failures
  • Gateway connectivity issues
  • Schema changes
  • Destination write failures
  • Capacity limitations

Dataflow Governance Best Practices

Standardize Naming Conventions

Example:

DF_Bronze_Customer_Ingestion
DF_Silver_Sales_Transform
DF_Gold_Product_Aggregation

Consistent naming improves maintainability.
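
A naming convention is easier to keep when it can be checked automatically. The sketch below assumes the pattern DF_<Layer>_<Subject>_<Action> inferred from the three examples above; the regular expression and the function name is_valid_dataflow_name are illustrative assumptions, not a Fabric feature.

```python
import re

# Assumed convention: DF_<Layer>_<Subject>_<Action>, with Layer limited to the medallion layers.
NAME_PATTERN = re.compile(r"DF_(Bronze|Silver|Gold)_[A-Za-z0-9]+_[A-Za-z0-9]+")


def is_valid_dataflow_name(name: str) -> bool:
    """Return True when the whole name matches the assumed convention."""
    return NAME_PATTERN.fullmatch(name) is not None


examples = [
    "DF_Bronze_Customer_Ingestion",
    "DF_Silver_Sales_Transform",
    "DF_Gold_Product_Aggregation",
]
all_valid = all(is_valid_dataflow_name(name) for name in examples)
```

A name such as "Customer Ingestion v2" fails the check because it lacks the DF_ prefix and the layer segment.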


Use Scheduled Refresh Sparingly

Avoid unnecessary refresh frequency.

Example:

Do not refresh every 15 minutes if daily updates are sufficient.


Implement Service Principals

Reduce dependency on individual user accounts.


Leverage Lineage Views

Monitor downstream dependencies before making changes.


Align with Medallion Architecture

Use Dataflows strategically within:

  • Bronze Layer
  • Silver Layer
  • Gold Layer

Common DP-700 Exam Scenarios

Scenario 1

A Dataflow must load data from an on-premises SQL Server.

Solution:

Configure an On-Premises Data Gateway.


Scenario 2

A Dataflow should execute only after a source ingestion process completes.

Solution:

Use a Data Factory pipeline to orchestrate execution.


Scenario 3

A Dataflow should load transformed data into a Lakehouse for downstream Spark processing.

Solution:

Configure the Lakehouse as the destination.


DP-700 Exam Focus Areas

You should understand:

✓ Dataflows Gen2 architecture

✓ Workspace permissions

✓ Dataflow creation governance

✓ Data destinations

✓ Refresh scheduling

✓ Pipeline orchestration

✓ Credential management

✓ Gateway configuration

✓ Monitoring and troubleshooting

✓ Lineage and impact analysis

✓ Performance considerations


10 Practice Exam Questions

Question 1

Which technology provides the transformation engine used by Dataflows Gen2?

A. Power Query

B. Apache Spark

C. Kusto Query Language (KQL)

D. T-SQL

Answer: A

Explanation

Dataflows Gen2 use Power Query as their transformation engine, providing a low-code interface for data preparation.


Question 2

A Dataflow Gen2 needs to access an on-premises SQL Server database.

What must be configured?

A. Eventstream

B. Data Activator

C. On-Premises Data Gateway

D. OneLake Shortcut

Answer: C

Explanation

An On-Premises Data Gateway enables Fabric services to securely access data sources located inside private networks.


Question 3

Which destination is most commonly used for storing Dataflow Gen2 outputs within a medallion architecture?

A. Semantic Model

B. Dashboard

C. Notebook

D. Lakehouse

Answer: D

Explanation

Lakehouses are commonly used as Bronze, Silver, and Gold layers within Fabric medallion architectures.


Question 4

What is the primary advantage of scheduled refresh?

A. Eliminates authentication requirements

B. Automatically updates data without manual intervention

C. Increases storage capacity

D. Creates backup copies of source systems

Answer: B

Explanation

Scheduled refresh ensures that data remains current without requiring users to manually run the Dataflow.


Question 5

Which Fabric feature can orchestrate Dataflow Gen2 execution as part of a larger workflow?

A. Data Factory Pipeline

B. Lakehouse Explorer

C. Monitoring Hub

D. OneLake File Explorer

Answer: A

Explanation

Data Factory pipelines provide orchestration, dependency management, and scheduling capabilities.


Question 6

What information can lineage views provide?

A. Network bandwidth consumption

B. Spark executor utilization

C. Upstream and downstream dependencies

D. Gateway installation logs

Answer: C

Explanation

Lineage views show how data moves between sources, Dataflows, Lakehouses, semantic models, and reports.


Question 7

Which workspace role has full administrative control over Dataflows Gen2?

A. Viewer

B. Contributor

C. Member

D. Admin

Answer: D

Explanation

Workspace Admins have complete control over all workspace items, including Dataflows Gen2.


Question 8

A company wants to minimize production refresh failures caused by employee account changes.

What is the recommended approach?

A. Increase refresh frequency

B. Use service principals or service accounts

C. Disable scheduled refresh

D. Use Viewer permissions

Answer: B

Explanation

Service principals provide stable authentication that is not tied to individual users.


Question 9

Which factor is most likely to increase Dataflow refresh duration?

A. Smaller datasets

B. Reduced transformations

C. Complex joins and aggregations

D. Fewer destination tables

Answer: C

Explanation

Complex transformation logic increases processing requirements and refresh times.


Question 10

What is the primary purpose of Dataflow monitoring?

A. Create semantic models

B. Manage workspace domains

C. Configure Spark runtimes

D. Identify refresh failures and performance issues

Answer: D

Explanation

Monitoring helps administrators detect failures, troubleshoot issues, and optimize performance.


Final Exam Tip

For DP-700, Dataflows Gen2 questions typically focus on data ingestion, destinations, refresh management, gateways, orchestration, and governance. When evaluating exam scenarios, remember that Dataflows Gen2 are designed to provide a low-code ETL experience using Power Query, while Fabric Pipelines provide orchestration and Lakehouses commonly serve as the destination within modern medallion architectures.


