This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Ingest and transform data (30–35%)
--> Ingest and transform batch data
--> Choose Between Dataflows Gen2, Notebooks, KQL, and T-SQL for data transformation
Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.
Introduction
Microsoft Fabric provides multiple technologies for transforming data. One of the most common challenges for a Data Engineer is determining which transformation tool is best suited for a specific business requirement.
The DP-700 exam frequently tests your ability to select the appropriate transformation technology based on:
- Data volume
- Data complexity
- Required programming skills
- Data source type
- Performance requirements
- Real-time versus batch processing needs
- User expertise
- Maintainability
The four most important transformation technologies covered in the exam are:
- Dataflows Gen2
- Notebooks
- KQL
- T-SQL
Although all four can transform data, they are optimized for different workloads and use cases.
Understanding their strengths, limitations, and ideal scenarios is critical for success on the DP-700 exam.
Overview of Transformation Technologies
| Technology | Primary Purpose | Best For |
|---|---|---|
| Dataflows Gen2 | Low-code ETL | Business-friendly transformations |
| Notebooks | Advanced engineering and Spark processing | Large-scale data engineering |
| T-SQL | Relational transformations | Warehouses and SQL workloads |
| KQL | Real-time analytics and telemetry processing | Logs and streaming data |
Dataflows Gen2
What Are Dataflows Gen2?
Dataflows Gen2 are low-code data transformation tools within Microsoft Fabric that use Power Query.
They allow users to:
- Connect to data sources
- Clean data
- Transform data
- Load data into Fabric destinations
without writing significant amounts of code.
Transformation Engine
Dataflows Gen2 use:
- Power Query
- M Language (behind the scenes)
Most transformations are performed through a graphical interface.
Typical Transformations
Examples include:
- Renaming columns
- Removing duplicates
- Filtering rows
- Merging datasets
- Splitting columns
- Data type conversions
- Calculated columns
When to Use Dataflows Gen2
Choose Dataflows Gen2 when:
- Low-code development is desired
- Data volumes are moderate
- Business analysts participate in development
- Transformations are relatively straightforward
- Self-service data preparation is required
Examples:
- Preparing Excel data
- Cleaning CSV files
- Combining multiple business datasets
- Standard ETL processes
Advantages
Low-Code Experience
Minimal coding required.
Large Connector Library
Supports numerous source systems.
Easy Maintenance
Visual transformation steps are easier to understand.
Integration with Fabric
Loads directly into:
- Lakehouses
- Warehouses
- Other Fabric destinations
Limitations
Less Flexible
Complex logic may become difficult.
Not Ideal for Very Large Data Volumes
Spark-based solutions often scale better.
Limited Advanced Programming
Compared to notebooks.
Notebooks
What Are Notebooks?
Notebooks are code-based development environments that support:
- PySpark
- Python
- Scala
- Spark SQL
- R
within Microsoft Fabric.
Transformation Engine
Notebooks execute on Spark clusters.
This enables:
- Distributed processing
- Parallel execution
- Large-scale transformations
Typical Transformations
Examples:
- Complex joins
- Data enrichment
- Machine learning preparation
- Feature engineering
- Data quality validation
- Custom business logic
When to Use Notebooks
Choose notebooks when:
- Large data volumes exist
- Spark processing is required
- Advanced transformations are needed
- Custom programming is necessary
- Machine learning integration is planned
Examples:
- Processing billions of records
- Data science workflows
- Medallion architecture pipelines
- Complex transformations
Advantages
Massive Scalability
Handles large datasets efficiently.
Flexible Programming
Supports multiple languages.
Machine Learning Integration
Works with Spark ML libraries.
Advanced Data Engineering
Ideal for enterprise-scale pipelines.
Limitations
Requires Coding Skills
Less accessible for business users.
More Complex Development
Compared to Dataflows Gen2.
T-SQL
What Is T-SQL?
T-SQL (Transact-SQL) is Microsoft’s extension of SQL.
Fabric Warehouses and SQL endpoints support T-SQL for:
- Querying
- Transforming
- Managing relational data
Transformation Techniques
Common operations include:
SELECTJOINGROUP BYCASECTEMERGEWINDOW FUNCTIONS
When to Use T-SQL
Choose T-SQL when:
- Data resides in a Warehouse
- Relational transformations are required
- SQL expertise already exists
- Dimensional models are being built
Examples:
- Fact table loading
- Dimension updates
- Data warehouse ETL
- Reporting data preparation
Advantages
Familiar Language
Widely used by data professionals.
Excellent Relational Processing
Optimized for structured data.
Strong Performance
Particularly for warehouse workloads.
Easy Integration
Works naturally with BI tools.
Limitations
Less Suitable for Unstructured Data
Not ideal for files and raw data.
Limited Distributed Processing
Compared to Spark.
KQL
What Is KQL?
Kusto Query Language (KQL) is designed for:
- Log analytics
- Telemetry analysis
- Real-time data processing
- Event analytics
KQL is commonly used in:
- KQL Databases
- Eventhouse
- Real-Time Intelligence
Typical Transformations
Examples include:
- Filtering events
- Aggregations
- Pattern detection
- Time-series analysis
- Stream transformations
When to Use KQL
Choose KQL when:
- Working with telemetry data
- Processing logs
- Analyzing streaming events
- Building real-time dashboards
Examples:
- Sensor monitoring
- Application logs
- Security analytics
- Operational monitoring
Advantages
Optimized for Time-Series Data
Excellent for event-driven workloads.
Fast Query Performance
Handles large event volumes efficiently.
Real-Time Analytics
Supports low-latency analysis.
Limitations
Not a General ETL Tool
Less suitable for traditional batch ETL.
Not Designed for Dimensional Modeling
Warehouses are generally better for reporting models.
Comparing Transformation Technologies
| Requirement | Dataflows Gen2 | Notebooks | T-SQL | KQL |
|---|---|---|---|---|
| Low-Code Development | Excellent | Poor | Moderate | Moderate |
| Large-Scale Processing | Moderate | Excellent | Good | Excellent |
| Relational Transformations | Moderate | Good | Excellent | Limited |
| Streaming Analytics | Limited | Moderate | Poor | Excellent |
| Machine Learning Support | Poor | Excellent | Poor | Limited |
| Telemetry Analytics | Poor | Moderate | Poor | Excellent |
| Business User Friendly | Excellent | Poor | Moderate | Moderate |
| Advanced Programming | Limited | Excellent | Moderate | Limited |
Decision Framework
Choose Dataflows Gen2 When:
- Low-code ETL is preferred
- Business users are involved
- Data volumes are moderate
- Transformations are straightforward
Choose Notebooks When:
- Spark processing is required
- Data volumes are large
- Complex transformations exist
- Machine learning is involved
Choose T-SQL When:
- Working with a Warehouse
- Building dimensional models
- SQL skills are available
- Data is highly structured
Choose KQL When:
- Processing logs
- Analyzing telemetry
- Supporting streaming analytics
- Building operational monitoring solutions
Common DP-700 Scenario Questions
Scenario 1
A business analyst needs to combine Excel spreadsheets and remove duplicate rows using a visual interface.
Best choice:
Dataflows Gen2
Scenario 2
A data engineer must transform billions of records stored in a Lakehouse.
Best choice:
Notebook
Scenario 3
A warehouse team must populate fact and dimension tables.
Best choice:
T-SQL
Scenario 4
An operations team analyzes millions of application log events each hour.
Best choice:
KQL
Scenario 5
A machine learning team requires custom Python transformations.
Best choice:
Notebook
Exam Tips
Many DP-700 questions are not asking what can perform a transformation, but what should perform the transformation.
Remember these associations:
| Requirement | Best Choice |
|---|---|
| Visual ETL | Dataflows Gen2 |
| Spark processing | Notebook |
| Data warehouse transformations | T-SQL |
| Telemetry and logs | KQL |
| Machine learning preparation | Notebook |
| Self-service data preparation | Dataflows Gen2 |
| Streaming analytics | KQL |
Practice Exam Questions
Question 1
A business analyst needs to cleanse CSV files using a graphical interface with minimal coding. Which transformation technology should be used?
A. T-SQL
B. Notebook
C. KQL
D. Dataflows Gen2
Answer: D
Explanation
Dataflows Gen2 provide a low-code, visual interface that is ideal for business users and simple ETL processes.
Question 2
A data engineer must process several billion records stored in a Lakehouse using distributed computing.
Which option should be selected?
A. Notebook
B. Dataflows Gen2
C. T-SQL
D. KQL
Answer: A
Explanation
Notebooks leverage Spark for distributed processing and are designed for large-scale data transformations.
Question 3
Which technology is specifically optimized for transforming and analyzing telemetry and log data?
A. Dataflows Gen2
B. Notebook
C. KQL
D. T-SQL
Answer: C
Explanation
KQL is designed for log analytics, telemetry processing, and real-time operational analytics.
Question 4
A team is loading dimension and fact tables within a Fabric Warehouse.
Which transformation technology is most appropriate?
A. Notebook
B. Dataflows Gen2
C. KQL
D. T-SQL
Answer: D
Explanation
T-SQL is the preferred technology for relational transformations in Fabric Warehouses.
Question 5
A company requires machine learning feature engineering using Python libraries.
Which technology should be selected?
A. Notebook
B. Dataflows Gen2
C. T-SQL
D. KQL
Answer: A
Explanation
Notebooks support Python, Spark, and machine learning frameworks, making them ideal for feature engineering.
Question 6
Which technology relies primarily on Power Query transformations?
A. Notebook
B. Dataflows Gen2
C. T-SQL
D. KQL
Answer: B
Explanation
Dataflows Gen2 use Power Query and the M language behind the scenes for data transformations.
Question 7
An operations team needs to perform real-time aggregations on streaming sensor data.
Which option should be used?
A. Dataflows Gen2
B. Notebook
C. KQL
D. T-SQL
Answer: C
Explanation
KQL is optimized for real-time event processing and telemetry analysis.
Question 8
A data engineer needs maximum flexibility to implement custom business logic across multiple data sources.
Which technology is most appropriate?
A. KQL
B. Dataflows Gen2
C. T-SQL
D. Notebook
Answer: D
Explanation
Notebooks provide the highest degree of customization through programming languages such as Python and PySpark.
Question 9
A team already has extensive SQL expertise and needs to transform highly structured relational data in a Warehouse.
Which option is best?
A. Notebook
B. T-SQL
C. Dataflows Gen2
D. KQL
Answer: B
Explanation
T-SQL is optimized for relational transformations and leverages existing SQL skills.
Question 10
Which technology is generally the most business-user-friendly option for creating batch data transformation processes?
A. Notebook
B. KQL
C. T-SQL
D. Dataflows Gen2
Answer: D
Explanation
Dataflows Gen2 provide a visual, low-code experience that is easier for business users and citizen developers than code-based solutions.
DP-700 Exam Summary
When deciding between transformation technologies, focus on the primary workload:
- Dataflows Gen2 → Low-code ETL and self-service data preparation
- Notebooks → Spark, large-scale processing, advanced engineering, and machine learning
- T-SQL → Relational transformations and warehouse development
- KQL → Telemetry, logs, time-series analytics, and real-time event processing
A common DP-700 exam strategy is to identify the keywords in the scenario:
- Visual interface → Dataflows Gen2
- Billions of rows / Spark → Notebook
- Warehouse / dimensional model → T-SQL
- Logs / telemetry / real-time analytics → KQL
These keywords often point directly to the correct answer.
Go to the DP-700 Exam Prep Hub main page.
