
This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections:
Prepare data
--> Transform data
--> Merge or join data
Merging or joining data is a fundamental transformation task in Microsoft Fabric. It enables you to combine related data from multiple tables or sources into a single dataset for analysis, modeling, or reporting. This skill is essential for preparing clean, well-structured data in lakehouses, warehouses, dataflows, and Power BI semantic models.
For the DP-600 exam, you are expected to understand when, where, and how to merge or join data using the appropriate Fabric tools, as well as the implications for performance, data quality, and modeling.
Merge vs. Join: Key Distinction
Although often used interchangeably, the terms have slightly different meanings depending on the tool:
- Merge
- Commonly used in Power Query
- Combines tables by matching rows based on one or more key columns
- Produces a new column that can be expanded
- Join
- Commonly used in SQL and Spark
- Combines tables using explicit join logic (JOIN clauses)
- Output schema is defined directly in the query
Where Merging and Joining Occur in Fabric
| Fabric Experience | How It’s Done |
| Power Query (Dataflows Gen2, Lakehouse) | Merge Queries UI |
| Warehouse | SQL JOIN statements |
| Lakehouse (Spark notebooks) | DataFrame joins |
| Power BI Desktop | Power Query merges |
Common Join Types (Exam-Critical)
Understanding join types is heavily tested:
- Inner Join
- Returns only matching rows from both tables
- Left Outer Join
- Returns all rows from the left table and matching rows from the right
- Right Outer Join
- Returns all rows from the right table and matching rows from the left
- Full Outer Join
- Returns all rows from both tables
- Left Anti / Right Anti Join
- Returns rows with no match in the other table
👉 Exam tip: Anti joins are commonly used to identify missing or unmatched data.
Join Keys and Data Quality Considerations
Before merging or joining data, it’s critical to ensure:
- Join columns:
- Have matching data types
- Are cleaned and standardized
- Represent the same business entity
- Duplicate values in join keys can:
- Create unexpected row multiplication
- Impact aggregations and performance
Performance and Design Considerations
- Prefer SQL joins or Spark joins for large datasets rather than Power Query
- Filter and clean data before joining to reduce data volume
- In dimensional modeling:
- Fact tables typically join to dimension tables using left joins
- Avoid unnecessary joins in the semantic layer when they can be handled upstream
Common Use Cases
- Combining fact data with descriptive attributes
- Enriching transactional data with reference or lookup tables
- Building dimension tables for star schema models
- Validating data completeness using anti joins
Exam Tips and Pitfalls
- Don’t confuse merge vs. append (append stacks rows vertically)
- Know which tool to use based on:
- Data size
- Refresh frequency
- Complexity
- Expect scenario questions asking:
- Which join type to use
- Where the join should occur in the architecture
Key Takeaways
- Merging and joining data is essential for data preparation in Fabric
- Different Fabric experiences offer different ways to join data
- Correct join type and clean join keys are critical for accuracy
- Performance and modeling best practices matter for the DP-600 exam
Practice Questions:
Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …
- Identifying and understand why an option is correct (or incorrect) — not just which one
- Look for and understand the usage scenario of keywords in exam questions to guide you
- Expect scenario-based questions rather than direct definitions
Question 1
What is the primary purpose of merging or joining data in Microsoft Fabric?
A. To reduce storage costs
B. To vertically stack tables
C. To combine related data based on a common key
D. To encrypt sensitive columns
✅ Correct Answer: C
Explanation:
Merging or joining data combines related datasets horizontally using shared key columns so that related attributes appear in a single dataset.
Question 2
In Power Query, what is the result of a Merge Queries operation?
A. Rows from both tables are appended
B. A new table is automatically created
C. A new column containing related table data is added
D. A relationship is created in the semantic model
✅ Correct Answer: C
Explanation:
Power Query merges add a column that contains matching rows from the second table, which can then be expanded.
Question 3
Which join type returns only rows that exist in both tables?
A. Left outer join
B. Right outer join
C. Full outer join
D. Inner join
✅ Correct Answer: D
Explanation:
An inner join returns only rows with matching keys in both tables.
Question 4
You want to keep all rows from a fact table and bring in matching dimension attributes. Which join type should you use?
A. Inner join
B. Left outer join
C. Right outer join
D. Full outer join
✅ Correct Answer: B
Explanation:
A left outer join preserves all rows from the left (fact) table while bringing in matching rows from the dimension table.
Question 5
Which join type is most useful for identifying records that do not have a match in another table?
A. Inner join
B. Full outer join
C. Left anti join
D. Right outer join
✅ Correct Answer: C
Explanation:
A left anti join returns rows from the left table that do not have matching rows in the right table, making it ideal for data quality checks.
Question 6
What issue can occur when joining tables that contain duplicate values in the join key?
A. Data type conversion errors
B. Row multiplication
C. Data loss
D. Query failure
✅ Correct Answer: B
Explanation:
Duplicate keys can cause one-to-many or many-to-many matches, resulting in more rows than expected after the join.
Question 7
Which Fabric experience is best suited for performing joins on very large datasets?
A. Power BI Desktop
B. Power Query
C. Warehouse using SQL
D. Excel
✅ Correct Answer: C
Explanation:
SQL joins in a warehouse are optimized for large-scale data processing and typically outperform Power Query for large datasets.
Question 8
Which operation should not be confused with merging or joining data?
A. Append
B. Inner join
C. Left join
D. Anti join
✅ Correct Answer: A
Explanation:
Append stacks tables vertically (row-wise), while merges and joins combine tables horizontally (column-wise).
Question 9
What should you verify before merging two tables?
A. That both tables have the same number of rows
B. That join columns use compatible data types
C. That all columns are indexed
D. That the tables are in the same workspace
✅ Correct Answer: B
Explanation:
Join columns must have compatible data types and clean values; otherwise, matches may fail or produce incorrect results.
Question 10
From a modeling best-practice perspective, where should complex joins ideally be performed?
A. In Power BI visuals
B. In DAX measures
C. Upstream in lakehouse or warehouse transformations
D. At query time in reports
✅ Correct Answer: C
Explanation:
Performing joins upstream simplifies semantic models, improves performance, and ensures consistency across reports.

One thought on “Merge or Join Data”