Enrich data by adding new columns and tables

This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections: 
Prepare data 
    --> Transform data 
        --> Enrich data by adding new columns and tables

Data enrichment is a critical step in preparing analytics-ready datasets in Microsoft Fabric. This section of the DP-600 exam focuses on how analytics engineers enhance existing data by adding derived columns, augmenting datasets with new tables, and combining multiple data sources to provide more business context and analytical value.

What Does Data Enrichment Mean in Fabric?

Data enrichment involves extending raw or curated datasets with additional attributes, calculations, or related entities so that the data is more useful for reporting, analytics, and downstream consumption.

In Microsoft Fabric, enrichment can occur across:

Lakehouses
Data Warehouses
Dataflows Gen2
Power BI semantic models
SQL analytics endpoints

Adding New Columns

Common Ways to Add Columns

Calculated columns using SQL expressions
Derived columns in Dataflows Gen2
Computed columns in warehouses or lakehouses
Calculated columns in semantic models (DAX)

Typical Use Cases

Creating business-friendly attributes (e.g., full name, year-month)
Applying business rules (e.g., customer segment, status flags)
Performing type conversions or formatting
Adding derived metrics (e.g., profit, margin, age)

Where This Is Done

Lakehouse / Warehouse (SQL): Persistent, reusable transformations
Dataflows Gen2: Low-code, ETL-style transformations
Semantic models (DAX): Report-specific or analytical calculations

Exam Tip

Know where to add a column based on reuse, performance, and governance. Foundational logic belongs earlier in the data pipeline.

Adding New Tables

How New Tables Are Created

Creating tables from queries or transformations
Materializing enriched datasets
Joining multiple source tables into curated outputs
Creating dimension or fact tables for analytics

Common Use Cases

Creating lookup or reference tables
Building star schema components
Storing aggregated or summarized data
Supporting reuse across multiple reports and models

Fabric Components Involved

Lakehouse tables (Delta format)
Warehouse tables
Dataflows Gen2 outputs
Shared semantic models

Enrichment Through Joins and Relationships

Enrichment Patterns

Joining transactional data with reference data
Adding descriptive attributes from lookup tables
Combining data from multiple domains (e.g., sales + geography)

Best Practices

Use appropriate join types (inner, left, etc.)
Ensure consistent data types and keys
Validate row counts and data integrity
Avoid unnecessary denormalization when not required

Exam Tip

Understand the impact of joins on data volume, performance, and data correctness.

Enrichment Using Dataflows Gen2

Dataflows Gen2 are a key enrichment tool in Fabric:

Low-code Power Query transformations
Combine, append, and merge datasets
Add derived and conditional columns
Output enriched tables to OneLake

Ideal for:

Source-level enrichment
Repeatable, governed transformations
Non-SQL-based data preparation

Enrichment in Semantic Models

Some enrichment happens at the modeling layer:

Calculated columns (DAX)
Calculated tables
Role-playing dimensions

However:

Semantic-layer enrichment is best for analysis, not heavy transformation
Overuse can impact model performance and complexity

Governance and Performance Considerations

Prefer enriching data upstream when logic is reusable
Document derived columns and tables
Apply consistent naming conventions
Avoid duplicating enrichment logic across layers
Balance flexibility with maintainability

What to Know for the DP-600 Exam

You should be comfortable with:

When to add columns vs. when to add tables
Choosing the right Fabric component for enrichment
SQL vs. Power Query vs. DAX enrichment
Performance and governance trade-offs
Supporting analytics-ready and reusable datasets

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

Identifying and understand why an option is correct (or incorrect) — not just which one
Look for and understand the usage scenario of keywords in exam questions to guide you
Expect scenario-based questions rather than direct definitions

1. What is the primary goal of data enrichment in Microsoft Fabric?

A. Reduce data storage costs
B. Improve infrastructure security
C. Increase analytical value by adding context
D. Replace raw data sources

Correct Answer: C

Explanation:
Data enrichment enhances datasets by adding derived attributes or related data so that the data becomes more meaningful and useful for analytics and reporting.

2. Where should reusable business logic for derived columns ideally be implemented?

A. Power BI report visuals
B. Semantic model calculated columns
C. Lakehouse or warehouse SQL transformations
D. Ad-hoc DAX measures

Correct Answer: C

Explanation:
Reusable business logic should be implemented upstream (lakehouse or warehouse) to promote consistency, reuse, and better governance across multiple reports and models.

3. Which Fabric feature is best suited for low-code enrichment such as merging datasets and adding conditional columns?

A. SQL analytics endpoint
B. Dataflows Gen2
C. Eventhouse
D. Real-Time hub

Correct Answer: B

Explanation:
Dataflows Gen2 use Power Query to perform low-code transformations, including merges, derived columns, and conditional logic, making them ideal for enrichment scenarios.

4. When enriching data by joining tables, which join type preserves all rows from the primary dataset?

A. Inner join
B. Right join
C. Left join
D. Cross join

Correct Answer: C

Explanation:
A left join preserves all rows from the left (primary) table while adding matching data from the right table where available.

5. Which scenario best justifies creating a new enriched table instead of adding columns to an existing one?

A. Adding a formatting column for display
B. Creating a reusable dimension table
C. Renaming an existing column
D. Filtering rows for a specific report

Correct Answer: B

Explanation:
Creating a new table is appropriate when building reusable dimension or reference tables that support multiple fact tables or analytics use cases.

6. Why should heavy transformation logic generally be avoided in Power BI semantic models?

A. Semantic models cannot handle transformations
B. It increases OneLake storage usage
C. It can negatively affect performance and maintainability
D. Semantic models do not support calculated columns

Correct Answer: C

Explanation:
While semantic models support calculated columns and tables, heavy transformation logic is better handled upstream to improve performance and simplify model maintenance.

7. Which of the following is an example of enriching data by adding a new column?

A. Changing workspace permissions
B. Adding a “Customer Segment” column based on business rules
C. Publishing a semantic model
D. Creating a new workspace

Correct Answer: B

Explanation:
Derived columns such as customer segmentation add business context to data, which is a classic enrichment scenario.

8. What is a key benefit of enriching data earlier in the data pipeline?

A. Faster report publishing
B. Reduced need for Power BI licenses
C. Improved consistency across analytics assets
D. Automatic index creation

Correct Answer: C

Explanation:
Enriching data upstream ensures that all downstream consumers use the same logic and definitions, improving consistency and governance.

9. Which Fabric storage format is typically used when creating enriched tables in a lakehouse?

A. CSV
B. Parquet
C. Delta
D. JSON

Correct Answer: C

Explanation:
Lakehouse tables in Microsoft Fabric are stored in Delta format, which supports ACID transactions and efficient analytics.

10. Which factor should most influence where enrichment logic is implemented?

A. User interface preferences
B. Reusability and performance considerations
C. The number of Power BI visuals
D. Workspace naming conventions

Correct Answer: B

Explanation:
Choosing where to enrich data depends on how reusable the logic is and how it affects performance, scalability, and governance.

The Data Community

Enrich data by adding new columns and tables

Practice Questions:

One thought on “Enrich data by adding new columns and tables”

Leave a comment Cancel reply

Information and resources for the data professionals' community

Practice Questions:

Share this:

Related

One thought on “Enrich data by adding new columns and tables”

Leave a comment Cancel reply

Information and resources for the data professionals' community