This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Implement and manage an analytics solution (30–35%)
--> Configure Microsoft Fabric workspace settings
--> Configure Dataflows Gen2 workspace settings
Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.
Introduction
Dataflows Gen2 are a core component of Microsoft Fabric’s data ingestion and transformation capabilities. They provide a low-code/no-code method for extracting, transforming, and loading (ETL) data into Fabric destinations such as Lakehouses, Warehouses, and other analytics assets.
For the DP-700 exam, it is important to understand not only how to create Dataflows Gen2, but also how workspace settings affect their operation, governance, security, performance, and administration.
Workspace-level settings help administrators establish standards and controls for how Dataflows Gen2 are used within a Fabric environment. Understanding these settings enables data engineers to create scalable, maintainable, and governed data integration solutions.
What Are Dataflows Gen2?
Dataflows Gen2 are cloud-based data transformation solutions built on Power Query technology.
They allow users to:
- Connect to data sources
- Clean and transform data
- Combine multiple datasets
- Perform data quality operations
- Load data into Fabric destinations
Unlike notebooks or Spark jobs that require coding skills, Dataflows Gen2 provide a graphical interface for data preparation.
Common use cases include:
- Data ingestion
- Data cleansing
- Dimension table creation
- Data enrichment
- ETL and ELT workflows
- Self-service data preparation
Dataflows Gen2 Architecture
A typical Dataflow Gen2 process consists of:
Data Source
↓
Power Query Transformations
↓
Dataflow Gen2
↓
Destination
Possible destinations include:
- Lakehouse Tables
- Warehouse Tables
- Azure SQL Database
- Other supported destinations
Why Workspace Settings Matter
In small environments, Dataflows Gen2 can be managed individually.
However, in enterprise environments, administrators need centralized control over:
- Dataflow creation
- Dataflow execution
- Compute usage
- Security
- Data destinations
- Governance
Workspace settings help establish consistent behavior across all Dataflows Gen2 within a workspace.
Dataflows Gen2 Workspace Administration
Workspace administrators control who can:
- Create Dataflows Gen2
- Modify Dataflows Gen2
- Schedule refreshes
- Access source data
- Access destinations
These permissions are governed through Fabric workspace roles.
| Workspace Role | Dataflow Capability |
|---|---|
| Admin | Full control |
| Member | Create and manage |
| Contributor | Create and edit |
| Viewer | Read-only |
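The role table above can be sketched as a simple lookup. This is only an illustration of the table, not a Fabric API: the names `ROLE_CAPABILITIES`, `AUTHORING_ROLES`, and `can_author_dataflow` are hypothetical, and the rule that Admin, Member, and Contributor can author while Viewer cannot is read directly from the table.

```python
# Hypothetical mapping that mirrors the role table; not a Fabric API.
ROLE_CAPABILITIES = {
    "Admin": "Full control",
    "Member": "Create and manage",
    "Contributor": "Create and edit",
    "Viewer": "Read-only",
}

# Roles that, per the table, are allowed to create or edit Dataflows Gen2.
AUTHORING_ROLES = {"Admin", "Member", "Contributor"}


def can_author_dataflow(role: str) -> bool:
    """Return True if the workspace role can create or edit a Dataflow Gen2."""
    if role not in ROLE_CAPABILITIES:
        raise ValueError(f"Unknown workspace role: {role}")
    return role in AUTHORING_ROLES


for role, capability in ROLE_CAPABILITIES.items():
    print(f"{role}: {capability} (can author: {can_author_dataflow(role)})")
```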
DP-700 Exam Tip
Remember that Dataflows Gen2 do not have a separate security model.
They inherit Fabric workspace permissions.
Configure Dataflow Creation Permissions
Organizations often restrict who can create Dataflows Gen2.
Reasons include:
- Governance requirements
- Cost management
- Data quality controls
- Standardization
A common enterprise pattern is:
- Contributors create Dataflows
- Members manage Dataflows
- Admins govern Dataflows
This prevents uncontrolled proliferation of ETL processes.
Configure Data Destinations
One of the most important Dataflows Gen2 settings involves destination configuration.
Supported destinations include:
Lakehouse
The most common destination.
Benefits:
- Delta table storage
- Integration with Spark
- Medallion architecture support
Common usage:
- Bronze layer ingestion
- Silver layer transformation
Warehouse
Dataflows can load directly into Fabric Warehouses.
Benefits:
- Structured analytics
- SQL querying
- Dimensional modeling support
Multiple Destinations
A single Dataflow Gen2 can load data into multiple destinations, because each query in the dataflow can be assigned its own destination.
Benefits include:
- Reduced duplication of transformation logic
- Improved maintainability
- Consistent outputs
Configure Refresh Settings
Refresh configuration is one of the most frequently tested Dataflow topics.
Refresh settings determine:
- When Dataflows execute
- How often they run
- How data is updated
Options include:
Manual Refresh
Execution occurs only when initiated by a user.
Best for:
- Testing
- Development
- Small workloads
Scheduled Refresh
Execution occurs automatically based on a defined schedule.
Examples:
- Hourly
- Daily
- Weekly
Most production Dataflows use scheduled refresh.
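To make the hourly, daily, and weekly options concrete, the following minimal sketch computes the next run times for a fixed interval. It only illustrates the arithmetic of a schedule; actual schedules are configured in Fabric, and the `INTERVALS` table, the `next_runs` helper, and the start time are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Illustrative intervals only; real schedules are configured in Fabric.
INTERVALS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def next_runs(start: datetime, frequency: str, count: int) -> list[datetime]:
    """Return the first `count` run times after `start` for a fixed interval."""
    step = INTERVALS[frequency]
    return [start + step * i for i in range(1, count + 1)]


start = datetime(2025, 1, 6, 2, 0)  # hypothetical anchor time: 02:00
for frequency in INTERVALS:
    runs = next_runs(start, frequency, 2)
    print(frequency, [run.isoformat() for run in runs])
```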
Pipeline-Orchestrated Refresh
Dataflows can be executed through Fabric Data Factory pipelines.
Benefits:
- End-to-end orchestration
- Dependency management
- Complex workflow support
This is commonly used in enterprise ETL solutions.
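The dependency management a pipeline provides can be pictured as ordering activities so that each one runs only after the activities it depends on. The sketch below uses Python's standard `graphlib` module to show the idea; the activity names are hypothetical, and this is not how Fabric pipelines are defined.

```python
from graphlib import TopologicalSorter

# Hypothetical activities; each key lists the activities it depends on.
dependencies = {
    "ingest_source": set(),
    "refresh_dataflow": {"ingest_source"},
    "refresh_semantic_model": {"refresh_dataflow"},
}


def execution_order(deps: dict[str, set[str]]) -> list[str]:
    """Return an order in which every activity runs after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

In this chain the Dataflow refresh is placed after the ingestion step, which is the behavior an orchestrated refresh is meant to guarantee.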
Refresh Failure Notifications
Administrators can configure monitoring and notifications for refresh failures.
Benefits:
- Faster troubleshooting
- Improved reliability
- Reduced downtime
Monitoring is particularly important when Dataflows support business-critical reporting systems.
Configure Data Source Credentials
Dataflows require access credentials for source systems.
Supported authentication methods vary by connector and may include:
- Organizational account
- OAuth
- Basic authentication
- Service principals
- API keys
Workspace administrators often establish governance policies around credential management.
Best Practice
Use service accounts or service principals whenever possible for production workloads.
This avoids refresh failures caused by employee account changes.
Configure Gateway Usage
Some data sources reside inside private corporate networks.
Examples:
- On-premises SQL Server
- Oracle databases
- File shares
In these scenarios, Dataflows Gen2 may require an On-Premises Data Gateway.
Gateway settings determine:
- Connectivity
- Authentication
- Data access paths
A common DP-700 scenario involves selecting a gateway for on-premises data access.
Dataflow Compute and Performance Considerations
Dataflows Gen2 execute within Fabric-managed infrastructure.
Administrators should understand factors that impact performance:
Data Volume
Larger datasets increase:
- Refresh duration
- Resource consumption
Transformation Complexity
Operations such as:
- Merges
- Joins
- Group By
- Aggregations
increase processing requirements.
Number of Refreshes
Frequent refresh schedules can consume additional capacity resources.
Administrators should balance:
- Data freshness
- Capacity utilization
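The trade-off between freshness and capacity can be estimated with simple arithmetic. The sketch below assumes a hypothetical refresh that takes 5 minutes per run and compares how much refresh activity per day different schedules generate. It is a rough illustration, not a model of how Fabric meters capacity.

```python
def daily_refresh_minutes(interval_minutes: int, run_minutes: float) -> float:
    """Total minutes of refresh activity per day for a fixed interval."""
    runs_per_day = (24 * 60) // interval_minutes
    return runs_per_day * run_minutes


RUN_MINUTES = 5  # assumed duration of a single refresh

for label, interval in [("every 15 minutes", 15), ("hourly", 60), ("daily", 1440)]:
    total = daily_refresh_minutes(interval, RUN_MINUTES)
    print(f"{label}: {total} minutes of refresh activity per day")
```

Under these assumptions, a 15-minute schedule produces 96 runs and 480 minutes of refresh activity per day, compared with 5 minutes for a single daily run.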
Dataflow Lineage and Impact Analysis
Fabric automatically captures lineage information.
Administrators can view:
Source
↓
Dataflow Gen2
↓
Lakehouse
↓
Semantic Model
↓
Report
Benefits include:
- Impact analysis
- Dependency tracking
- Governance visibility
Lineage is an important governance feature frequently associated with Dataflows.
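Impact analysis amounts to walking the lineage graph downstream from the item being changed. The following sketch models the chain described in this section as a small dictionary and lists everything that depends on a given item. The `lineage` structure and the `downstream` helper are hypothetical and do not represent a Fabric API.

```python
from collections import deque

# Hypothetical lineage edges: each item points to the items that consume it.
lineage = {
    "Source": ["Dataflow Gen2"],
    "Dataflow Gen2": ["Lakehouse"],
    "Lakehouse": ["Semantic Model"],
    "Semantic Model": ["Report"],
    "Report": [],
}


def downstream(graph: dict[str, list[str]], start: str) -> list[str]:
    """Breadth-first walk returning every item downstream of `start`."""
    seen, queue, result = {start}, deque([start]), []
    while queue:
        for child in graph.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                result.append(child)
                queue.append(child)
    return result


print(downstream(lineage, "Dataflow Gen2"))
```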
Dataflow Monitoring
Workspace administrators can monitor:
- Refresh history
- Success rates
- Failure messages
- Duration metrics
Monitoring tools include:
- Refresh history
- Monitoring Hub
- Fabric capacity metrics
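The metrics listed above (success rate, failure messages, duration) can be derived from refresh history records. The sketch below works on hand-written sample records; the field names `status`, `duration_s`, and `error` are assumptions for illustration and are not the schema of any Fabric monitoring tool.

```python
from statistics import mean

# Hypothetical refresh history records; field names are illustrative.
history = [
    {"status": "Succeeded", "duration_s": 120, "error": None},
    {"status": "Failed", "duration_s": 45, "error": "Credential failure"},
    {"status": "Succeeded", "duration_s": 150, "error": None},
    {"status": "Succeeded", "duration_s": 150, "error": None},
]


def summarize(runs: list[dict]) -> dict:
    """Compute success rate, average successful duration, and failure messages."""
    ok = [run for run in runs if run["status"] == "Succeeded"]
    return {
        "success_rate": len(ok) / len(runs),
        # Average only successful runs; None when nothing has succeeded yet.
        "avg_success_duration_s": mean(run["duration_s"] for run in ok) if ok else None,
        "failures": [run["error"] for run in runs if run["status"] == "Failed"],
    }


summary = summarize(history)
print(summary)
```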
Common Troubleshooting Areas
- Credential failures
- Gateway connectivity issues
- Schema changes
- Destination write failures
- Capacity limitations
Dataflow Governance Best Practices
Standardize Naming Conventions
Example:
DF_Bronze_Customer_Ingestion
DF_Silver_Sales_Transform
DF_Gold_Product_Aggregation
Consistent naming improves maintainability.
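A naming convention is easier to enforce when it can be checked automatically. The sketch below assumes the pattern `DF_<Layer>_<Subject>_<Action>`, which is inferred from the example names in this section and is not an official Fabric rule; `NAME_PATTERN` and `is_valid_name` are hypothetical helpers.

```python
import re

# Assumed convention inferred from the examples: DF_<Layer>_<Subject>_<Action>.
NAME_PATTERN = re.compile(r"DF_(Bronze|Silver|Gold)_[A-Za-z0-9]+_[A-Za-z0-9]+")


def is_valid_name(name: str) -> bool:
    """Return True if the dataflow name follows the assumed convention."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in [
    "DF_Bronze_Customer_Ingestion",
    "DF_Silver_Sales_Transform",
    "DF_Gold_Product_Aggregation",
    "customer ingest v2",
]:
    print(f"{candidate!r}: {is_valid_name(candidate)}")
```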
Use Scheduled Refresh Sparingly
Do not schedule refreshes more often than the data changes or the business requires.
Example:
Do not refresh every 15 minutes if daily updates are sufficient.
Implement Service Principals
Reduce dependency on individual user accounts.
Leverage Lineage Views
Monitor downstream dependencies before making changes.
Align with Medallion Architecture
Use Dataflows strategically within:
- Bronze Layer
- Silver Layer
- Gold Layer
Common DP-700 Exam Scenarios
Scenario 1
A Dataflow must load data from an on-premises SQL Server.
Solution:
Configure an On-Premises Data Gateway.
Scenario 2
A Dataflow should execute only after a source ingestion process completes.
Solution:
Use a Data Factory pipeline to orchestrate execution.
Scenario 3
A Dataflow should load transformed data into a Lakehouse for downstream Spark processing.
Solution:
Configure the Lakehouse as the destination.
DP-700 Exam Focus Areas
You should understand:
✓ Dataflows Gen2 architecture
✓ Workspace permissions
✓ Dataflow creation governance
✓ Data destinations
✓ Refresh scheduling
✓ Pipeline orchestration
✓ Credential management
✓ Gateway configuration
✓ Monitoring and troubleshooting
✓ Lineage and impact analysis
✓ Performance considerations
10 Practice Exam Questions
Question 1
Which technology provides the transformation engine used by Dataflows Gen2?
A. Power Query
B. Apache Spark
C. Kusto Query Language (KQL)
D. T-SQL
Answer: A
Explanation
Dataflows Gen2 use Power Query as their transformation engine, providing a low-code interface for data preparation.
Question 2
A Dataflow Gen2 needs to access an on-premises SQL Server database.
What must be configured?
A. Eventstream
B. Data Activator
C. On-Premises Data Gateway
D. OneLake Shortcut
Answer: C
Explanation
An On-Premises Data Gateway enables Fabric services to securely access data sources located inside private networks.
Question 3
Which destination is most commonly used for storing Dataflow Gen2 outputs within a medallion architecture?
A. Semantic Model
B. Dashboard
C. Notebook
D. Lakehouse
Answer: D
Explanation
Lakehouses are commonly used as Bronze, Silver, and Gold layers within Fabric medallion architectures.
Question 4
What is the primary advantage of scheduled refresh?
A. Eliminates authentication requirements
B. Automatically updates data without manual intervention
C. Increases storage capacity
D. Creates backup copies of source systems
Answer: B
Explanation
Scheduled refresh ensures that data remains current without requiring users to manually run the Dataflow.
Question 5
Which Fabric feature can orchestrate Dataflow Gen2 execution as part of a larger workflow?
A. Data Factory Pipeline
B. Lakehouse Explorer
C. Monitoring Hub
D. OneLake File Explorer
Answer: A
Explanation
Data Factory pipelines provide orchestration, dependency management, and scheduling capabilities.
Question 6
What information can lineage views provide?
A. Network bandwidth consumption
B. Spark executor utilization
C. Upstream and downstream dependencies
D. Gateway installation logs
Answer: C
Explanation
Lineage views show how data moves between sources, Dataflows, Lakehouses, semantic models, and reports.
Question 7
Which workspace role has full administrative control over Dataflows Gen2?
A. Viewer
B. Contributor
C. Member
D. Admin
Answer: D
Explanation
Workspace Admins have complete control over all workspace items, including Dataflows Gen2.
Question 8
A company wants to minimize production refresh failures caused by employee account changes.
What is the recommended approach?
A. Increase refresh frequency
B. Use service principals or service accounts
C. Disable scheduled refresh
D. Use Viewer permissions
Answer: B
Explanation
Service principals provide stable authentication that is not tied to individual users.
Question 9
Which factor is most likely to increase Dataflow refresh duration?
A. Smaller datasets
B. Reduced transformations
C. Complex joins and aggregations
D. Fewer destination tables
Answer: C
Explanation
Complex transformation logic increases processing requirements and refresh times.
Question 10
What is the primary purpose of Dataflow monitoring?
A. Create semantic models
B. Manage workspace domains
C. Configure Spark runtimes
D. Identify refresh failures and performance issues
Answer: D
Explanation
Monitoring helps administrators detect failures, troubleshoot issues, and optimize performance.
Final Exam Tip
For DP-700, Dataflows Gen2 questions typically focus on data ingestion, destinations, refresh management, gateways, orchestration, and governance. When evaluating exam scenarios, remember three points: Dataflows Gen2 provide a low-code ETL experience built on Power Query, Data Factory pipelines provide orchestration, and Lakehouses commonly serve as the destination in medallion architectures.
