
This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections:
Prepare data
--> Query and analyze data
--> Select, filter, and aggregate data by using KQL
The Kusto Query Language (KQL) is a read-only request language used for querying large, distributed, event-driven datasets — especially within Eventhouse and Azure Data Explorer–backed workloads in Microsoft Fabric. KQL enables you to select, filter, and aggregate data efficiently in scenarios involving high-velocity data like telemetry, logs, and streaming events.
For the DP-600 exam, you should understand KQL basics and how it supports data exploration and analytical summarization in a real-time analytics context.
KQL Basics
KQL is designed to be expressive and performant for time-series or log-like data. Queries are built as a pipeline of operations, where each operator transforms the data and passes it to the next.
Selecting Data
In KQL, the project operator performs the equivalent of selecting columns:
EventHouseTable
| project Timestamp, Country, EventType, Value
projectlets you choose which fields to include- You can rename fields inline:
| project Time=Timestamp, Sales=Value
Exam Tip:
Use project early to limit data to relevant columns and reduce processing downstream.
Filtering Data
Filtering in KQL is done using the where operator:
EventHouseTable
| where Country == "USA"
Multiple conditions can be combined with and/or:
| where Value > 100 and EventType == "Purchase"
Filtering early in the pipeline improves performance by reducing the dataset before subsequent transformations.
Aggregating Data
KQL uses the summarize operator to perform aggregations such as counts, sums, averages, min, max, etc.
Example – Aggregate Total Sales:
EventHouseTable
| where EventType == "Purchase"
| summarize TotalSales = sum(Value)
Example – Grouped Aggregation:
EventHouseTable
| where EventType == "Purchase"
| summarize CountEvents = count(), TotalSales = sum(Value) by Country
Time-Bucketed Aggregation
KQL supports time binning using bin():
EventHouseTable
| where EventType == "Purchase"
| summarize TotalSales = sum(Value) by Country, bin(Timestamp, 1h)
This groups results into hourly buckets, which is ideal for time-series analytics and dashboards.
Common KQL Aggregation Functions
| Function | Description |
|---|---|
count() | Total number of records |
sum(column) | Sum of numeric values |
avg(column) | Average value |
min(column) / max(column) | Minimum / maximum value |
percentile(column, p) | Percentile calculation |
Combining Operators
KQL queries are often a combination of select, filter, and aggregation:
EventHouseTable
| where EventType == "Purchase" and Timestamp >= ago(7d)
| project Country, Value, Timestamp
| summarize TotalSales = sum(Value), CountPurchases = count() by Country
| order by TotalSales desc
This pipeline:
- Filters for purchases in the last 7 days
- Projects relevant fields
- Aggregates totals and counts
- Orders the result by highest total sales
KQL vs SQL: What’s Different?
| Feature | SQL | KQL |
|---|---|---|
| Syntax | Declarative | Pipeline-based |
| Joins | Extensive support | Limited pivot semantics |
| Use cases | Relational data | Time-series, event, logs |
| Aggregation | GROUP BY | summarize |
KQL shines when querying streaming or event data at scale — exactly the kinds of scenarios Eventhouse targets.
Performance Considerations in KQL
- Apply
whereas early as possible. - Use
projectto keep only necessary fields. - Time-range filters (e.g., last 24h) drastically reduce scan size.
- KQL runs distributed and is optimized for large event streams.
Practical Use Cases
Example – Top Countries by Event Count:
EventHouseTable
| summarize EventCount = count() by Country
| top 10 by EventCount
Example – Average Value of Events per Day:
EventHouseTable
| where EventType == "SensorReading"
| summarize AvgValue = avg(Value) by bin(Timestamp, 1d)
Exam Relevance
In DP-600 exam scenarios involving event or near-real-time analytics (such as with Eventhouse or KQL-backed lakehouse sources), you may be asked to:
- Write or interpret KQL that:
- projects specific fields
- filters records based on conditions
- aggregates and groups results
- Choose the correct operator (
where,project,summarize) for a task - Understand how KQL can be optimized with time-based filtering
Key Takeaways
projectselects specific fields.wherefilters rows based on conditions.summarizeperforms aggregations.- Time-series queries often use
bin()for bucketing. - The KQL pipeline enables modular, readable, and optimized queries for large datasets.
Final Exam Tips
If a question involves event streams, telemetry, metrics over time, or real-time analytics, and asks about summarizing values after filtering, think KQL with where, project, and summarize.
- project → select columns
- where → filter rows
- summarize → aggregate and group
- bin() → time-based grouping
- KQL is pipeline-based, not declarative like SQL
- Used heavily in Eventhouse / real-time analytics
Practice Questions:
Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …
- Identifying and understand why an option is correct (or incorrect) — not just which one
- Look for and understand the usage scenario of keywords in exam questions to guide you
- Expect scenario-based questions rather than direct definitions
1. Which KQL operator is used to select specific columns from a dataset?
A. select
B. where
C. project
D. summarize
✅ Correct Answer: C
Explanation:project is the KQL operator used to select and optionally rename columns. KQL does not use SELECT like SQL.
2. Which operator is used to filter rows in a KQL query?
A. filter
B. where
C. having
D. restrict
✅ Correct Answer: B
Explanation:
The where operator filters rows based on conditions and is typically placed early in the query pipeline for performance.
3. How do you count the number of records in a table using KQL?
A. count(*)
B. summarize count()
C. summarize count(*)
D. summarize count()
✅ Correct Answer: D
Explanation:
In KQL, aggregation functions are used inside summarize. count() counts rows; count(*) is SQL syntax.
4. Which KQL operator performs aggregations similar to SQL’s GROUP BY?
A. group
B. aggregate
C. summarize
D. partition
✅ Correct Answer: C
Explanation:summarize is the KQL operator used for aggregation and grouping.
5. Which query returns total sales grouped by country?
A.
| group by Country sum(Value)
B.
| summarize sum(Value) Country
C.
| summarize TotalSales = sum(Value) by Country
D.
| aggregate Value by Country
✅ Correct Answer: C
Explanation:
KQL requires explicit naming of aggregates and grouping using summarize … by.
6. What is the purpose of the bin() function in KQL?
A. To sort data
B. To group numeric values
C. To bucket values into time intervals
D. To remove null values
✅ Correct Answer: C
Explanation:bin() groups values—commonly timestamps—into fixed-size intervals (for example, hourly or daily buckets).
7. Which query correctly summarizes event counts per hour?
A.
| summarize count() by Timestamp
B.
| summarize count() by hour(Timestamp)
C.
| summarize count() by bin(Timestamp, 1h)
D.
| count() by Timestamp
✅ Correct Answer: C
Explanation:
Time-based grouping in KQL requires bin() to define the interval size.
8. Which operator should be placed as early as possible in a KQL query for performance reasons?
A. summarize
B. project
C. order by
D. where
✅ Correct Answer: D
Explanation:
Applying where early reduces the dataset size before further processing, improving performance.
9. Which KQL query returns the top 5 countries by event count?
A.
| top 5 Country by count()
B.
| summarize count() by Country | top 5 by count_
C.
| summarize EventCount = count() by Country | top 5 by EventCount
D.
| order by Country limit 5
✅ Correct Answer: C
Explanation:
You must first aggregate using summarize, then use top based on the aggregated column.
10. In Microsoft Fabric, KQL is primarily used with which workload?
A. Warehouse
B. Lakehouse SQL endpoint
C. Eventhouse
D. Semantic model
✅ Correct Answer: C
Explanation:
KQL is the primary query language for Eventhouse and real-time analytics scenarios in Microsoft Fabric.
