Choose Between Dataflows Gen2, Notebooks, KQL, and T-SQL for data transformation (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Ingest and transform data (30–35%)
   --> Ingest and transform batch data
      --> Choose Between Dataflows Gen2, Notebooks, KQL, and T-SQL for data transformation


Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Microsoft Fabric provides multiple technologies for transforming data. One of the most common challenges for a Data Engineer is determining which transformation tool is best suited for a specific business requirement.

The DP-700 exam frequently tests your ability to select the appropriate transformation technology based on:

  • Data volume
  • Data complexity
  • Required programming skills
  • Data source type
  • Performance requirements
  • Real-time versus batch processing needs
  • User expertise
  • Maintainability

The four most important transformation technologies covered in the exam are:

  • Dataflows Gen2
  • Notebooks
  • KQL
  • T-SQL

Although all four can transform data, they are optimized for different workloads and use cases.

Understanding their strengths, limitations, and ideal scenarios is critical for success on the DP-700 exam.


Overview of Transformation Technologies

TechnologyPrimary PurposeBest For
Dataflows Gen2Low-code ETLBusiness-friendly transformations
NotebooksAdvanced engineering and Spark processingLarge-scale data engineering
T-SQLRelational transformationsWarehouses and SQL workloads
KQLReal-time analytics and telemetry processingLogs and streaming data

Dataflows Gen2

What Are Dataflows Gen2?

Dataflows Gen2 are low-code data transformation tools within Microsoft Fabric that use Power Query.

They allow users to:

  • Connect to data sources
  • Clean data
  • Transform data
  • Load data into Fabric destinations

without writing significant amounts of code.


Transformation Engine

Dataflows Gen2 use:

  • Power Query
  • M Language (behind the scenes)

Most transformations are performed through a graphical interface.


Typical Transformations

Examples include:

  • Renaming columns
  • Removing duplicates
  • Filtering rows
  • Merging datasets
  • Splitting columns
  • Data type conversions
  • Calculated columns

When to Use Dataflows Gen2

Choose Dataflows Gen2 when:

  • Low-code development is desired
  • Data volumes are moderate
  • Business analysts participate in development
  • Transformations are relatively straightforward
  • Self-service data preparation is required

Examples:

  • Preparing Excel data
  • Cleaning CSV files
  • Combining multiple business datasets
  • Standard ETL processes

Advantages

Low-Code Experience

Minimal coding required.

Large Connector Library

Supports numerous source systems.

Easy Maintenance

Visual transformation steps are easier to understand.

Integration with Fabric

Loads directly into:

  • Lakehouses
  • Warehouses
  • Other Fabric destinations

Limitations

Less Flexible

Complex logic may become difficult.

Not Ideal for Very Large Data Volumes

Spark-based solutions often scale better.

Limited Advanced Programming

Compared to notebooks.


Notebooks

What Are Notebooks?

Notebooks are code-based development environments that support:

  • PySpark
  • Python
  • Scala
  • Spark SQL
  • R

within Microsoft Fabric.


Transformation Engine

Notebooks execute on Spark clusters.

This enables:

  • Distributed processing
  • Parallel execution
  • Large-scale transformations

Typical Transformations

Examples:

  • Complex joins
  • Data enrichment
  • Machine learning preparation
  • Feature engineering
  • Data quality validation
  • Custom business logic

When to Use Notebooks

Choose notebooks when:

  • Large data volumes exist
  • Spark processing is required
  • Advanced transformations are needed
  • Custom programming is necessary
  • Machine learning integration is planned

Examples:

  • Processing billions of records
  • Data science workflows
  • Medallion architecture pipelines
  • Complex transformations

Advantages

Massive Scalability

Handles large datasets efficiently.

Flexible Programming

Supports multiple languages.

Machine Learning Integration

Works with Spark ML libraries.

Advanced Data Engineering

Ideal for enterprise-scale pipelines.


Limitations

Requires Coding Skills

Less accessible for business users.

More Complex Development

Compared to Dataflows Gen2.


T-SQL

What Is T-SQL?

T-SQL (Transact-SQL) is Microsoft’s extension of SQL.

Fabric Warehouses and SQL endpoints support T-SQL for:

  • Querying
  • Transforming
  • Managing relational data

Transformation Techniques

Common operations include:

SELECT
JOIN
GROUP BY
CASE
CTE
MERGE
WINDOW FUNCTIONS

When to Use T-SQL

Choose T-SQL when:

  • Data resides in a Warehouse
  • Relational transformations are required
  • SQL expertise already exists
  • Dimensional models are being built

Examples:

  • Fact table loading
  • Dimension updates
  • Data warehouse ETL
  • Reporting data preparation

Advantages

Familiar Language

Widely used by data professionals.

Excellent Relational Processing

Optimized for structured data.

Strong Performance

Particularly for warehouse workloads.

Easy Integration

Works naturally with BI tools.


Limitations

Less Suitable for Unstructured Data

Not ideal for files and raw data.

Limited Distributed Processing

Compared to Spark.


KQL

What Is KQL?

Kusto Query Language (KQL) is designed for:

  • Log analytics
  • Telemetry analysis
  • Real-time data processing
  • Event analytics

KQL is commonly used in:

  • KQL Databases
  • Eventhouse
  • Real-Time Intelligence

Typical Transformations

Examples include:

  • Filtering events
  • Aggregations
  • Pattern detection
  • Time-series analysis
  • Stream transformations

When to Use KQL

Choose KQL when:

  • Working with telemetry data
  • Processing logs
  • Analyzing streaming events
  • Building real-time dashboards

Examples:

  • Sensor monitoring
  • Application logs
  • Security analytics
  • Operational monitoring

Advantages

Optimized for Time-Series Data

Excellent for event-driven workloads.

Fast Query Performance

Handles large event volumes efficiently.

Real-Time Analytics

Supports low-latency analysis.


Limitations

Not a General ETL Tool

Less suitable for traditional batch ETL.

Not Designed for Dimensional Modeling

Warehouses are generally better for reporting models.


Comparing Transformation Technologies

RequirementDataflows Gen2NotebooksT-SQLKQL
Low-Code DevelopmentExcellentPoorModerateModerate
Large-Scale ProcessingModerateExcellentGoodExcellent
Relational TransformationsModerateGoodExcellentLimited
Streaming AnalyticsLimitedModeratePoorExcellent
Machine Learning SupportPoorExcellentPoorLimited
Telemetry AnalyticsPoorModeratePoorExcellent
Business User FriendlyExcellentPoorModerateModerate
Advanced ProgrammingLimitedExcellentModerateLimited

Decision Framework

Choose Dataflows Gen2 When:

  • Low-code ETL is preferred
  • Business users are involved
  • Data volumes are moderate
  • Transformations are straightforward

Choose Notebooks When:

  • Spark processing is required
  • Data volumes are large
  • Complex transformations exist
  • Machine learning is involved

Choose T-SQL When:

  • Working with a Warehouse
  • Building dimensional models
  • SQL skills are available
  • Data is highly structured

Choose KQL When:

  • Processing logs
  • Analyzing telemetry
  • Supporting streaming analytics
  • Building operational monitoring solutions

Common DP-700 Scenario Questions

Scenario 1

A business analyst needs to combine Excel spreadsheets and remove duplicate rows using a visual interface.

Best choice:

Dataflows Gen2


Scenario 2

A data engineer must transform billions of records stored in a Lakehouse.

Best choice:

Notebook


Scenario 3

A warehouse team must populate fact and dimension tables.

Best choice:

T-SQL


Scenario 4

An operations team analyzes millions of application log events each hour.

Best choice:

KQL


Scenario 5

A machine learning team requires custom Python transformations.

Best choice:

Notebook


Exam Tips

Many DP-700 questions are not asking what can perform a transformation, but what should perform the transformation.

Remember these associations:

RequirementBest Choice
Visual ETLDataflows Gen2
Spark processingNotebook
Data warehouse transformationsT-SQL
Telemetry and logsKQL
Machine learning preparationNotebook
Self-service data preparationDataflows Gen2
Streaming analyticsKQL

Practice Exam Questions

Question 1

A business analyst needs to cleanse CSV files using a graphical interface with minimal coding. Which transformation technology should be used?

A. T-SQL

B. Notebook

C. KQL

D. Dataflows Gen2

Answer: D

Explanation

Dataflows Gen2 provide a low-code, visual interface that is ideal for business users and simple ETL processes.


Question 2

A data engineer must process several billion records stored in a Lakehouse using distributed computing.

Which option should be selected?

A. Notebook

B. Dataflows Gen2

C. T-SQL

D. KQL

Answer: A

Explanation

Notebooks leverage Spark for distributed processing and are designed for large-scale data transformations.


Question 3

Which technology is specifically optimized for transforming and analyzing telemetry and log data?

A. Dataflows Gen2

B. Notebook

C. KQL

D. T-SQL

Answer: C

Explanation

KQL is designed for log analytics, telemetry processing, and real-time operational analytics.


Question 4

A team is loading dimension and fact tables within a Fabric Warehouse.

Which transformation technology is most appropriate?

A. Notebook

B. Dataflows Gen2

C. KQL

D. T-SQL

Answer: D

Explanation

T-SQL is the preferred technology for relational transformations in Fabric Warehouses.


Question 5

A company requires machine learning feature engineering using Python libraries.

Which technology should be selected?

A. Notebook

B. Dataflows Gen2

C. T-SQL

D. KQL

Answer: A

Explanation

Notebooks support Python, Spark, and machine learning frameworks, making them ideal for feature engineering.


Question 6

Which technology relies primarily on Power Query transformations?

A. Notebook

B. Dataflows Gen2

C. T-SQL

D. KQL

Answer: B

Explanation

Dataflows Gen2 use Power Query and the M language behind the scenes for data transformations.


Question 7

An operations team needs to perform real-time aggregations on streaming sensor data.

Which option should be used?

A. Dataflows Gen2

B. Notebook

C. KQL

D. T-SQL

Answer: C

Explanation

KQL is optimized for real-time event processing and telemetry analysis.


Question 8

A data engineer needs maximum flexibility to implement custom business logic across multiple data sources.

Which technology is most appropriate?

A. KQL

B. Dataflows Gen2

C. T-SQL

D. Notebook

Answer: D

Explanation

Notebooks provide the highest degree of customization through programming languages such as Python and PySpark.


Question 9

A team already has extensive SQL expertise and needs to transform highly structured relational data in a Warehouse.

Which option is best?

A. Notebook

B. T-SQL

C. Dataflows Gen2

D. KQL

Answer: B

Explanation

T-SQL is optimized for relational transformations and leverages existing SQL skills.


Question 10

Which technology is generally the most business-user-friendly option for creating batch data transformation processes?

A. Notebook

B. KQL

C. T-SQL

D. Dataflows Gen2

Answer: D

Explanation

Dataflows Gen2 provide a visual, low-code experience that is easier for business users and citizen developers than code-based solutions.


DP-700 Exam Summary

When deciding between transformation technologies, focus on the primary workload:

  • Dataflows Gen2 → Low-code ETL and self-service data preparation
  • Notebooks → Spark, large-scale processing, advanced engineering, and machine learning
  • T-SQL → Relational transformations and warehouse development
  • KQL → Telemetry, logs, time-series analytics, and real-time event processing

A common DP-700 exam strategy is to identify the keywords in the scenario:

  • Visual interface → Dataflows Gen2
  • Billions of rows / Spark → Notebook
  • Warehouse / dimensional model → T-SQL
  • Logs / telemetry / real-time analytics → KQL

These keywords often point directly to the correct answer.


Go to the DP-700 Exam Prep Hub main page.

Leave a comment