Category: Data Cleaning

Glossary – 100 “Data Quality & Data Validation” terms

Below is a glossary that includes 100 common “Data Quality & Data Validation” terms and phrases in alphabetical order. Enjoy!

Term – Definition & Example
Business Rule – Business-defined constraint on data. Example: Credit limit approval rules.
Check Constraint – SQL rule enforcing condition. Example: Age > 0.
Constraint – Rule enforced at database level. Example: NOT NULL constraint.
Continuous Validation – Ongoing automated validation. Example: Streaming pipelines.
Corrective Control – Fixes identified errors. Example: Data reload.
Data Accuracy – Degree to which data correctly represents reality. Example: Correct customer addresses.
Data Accuracy Rate – Percentage of correct values. Example: 99.5% accurate.
Data Anomaly – Unexpected or suspicious data value. Example: Sudden traffic spike.
Data Bias – Systematic data distortion. Example: Sampling bias.
Data Certification – Marking trusted datasets. Example: Certified gold tables.
Data Cleansing – Correcting or removing invalid data. Example: Fixing malformed phone numbers.
Data Completeness – Presence of all required data elements. Example: No missing customer IDs.
Data Completeness Rate – Percentage of populated fields. Example: 97% filled.
Data Confidence – Trust users have in data. Example: Executive reporting trust.
Data Conformance – Adherence to standards or schemas. Example: ISO country codes.
Data Consistency – Uniformity of data across systems. Example: Same currency code everywhere.
Data Deduplication – Removing duplicate records. Example: Merge customer profiles.
Data Defect – Specific instance of poor quality. Example: Invalid customer record.
Data Drift – Gradual change in data patterns. Example: Customer behavior shifts.
Data Enrichment – Enhancing data with additional attributes. Example: Adding demographic data.
Data Error – Incorrect or invalid data value. Example: Misspelled city name.
Data Exception – Approved rule deviation. Example: Legacy records.
Data Exception Handling – Process for managing violations. Example: Manual review.
Data Freshness – How current the data is. Example: Last updated timestamp.
Data Governance – Framework overseeing data quality. Example: Stewardship model.
Data Imputation – Filling missing values. Example: Replacing null with average.
Data Integrity – Accuracy and consistency over the lifecycle. Example: Foreign key relationships enforced.
Data Issue – Identified quality problem. Example: Missing values.
Data Latency – Delay between event and availability. Example: 2-hour ingestion lag.
Data Lineage – Tracking data flow and transformations. Example: Source to dashboard.
Data Matching – Identifying records referring to same entity. Example: Customer record linkage.
Data Noise – Irrelevant or misleading data. Example: Test records in prod.
Data Observability – Visibility into data health and behavior. Example: Pipeline monitoring.
Data Ownership – Accountability for data quality. Example: Business owner.
Data Precision – Level of detail in data. Example: Decimal places.
Data Profiling – Analyzing data to understand structure and quality. Example: Null percentage analysis.
Data Quality – Measure of how fit data is for its intended use. Example: Accurate sales totals in reports.
Data Quality Alert – Notification of quality issue. Example: Slack alert.
Data Quality Audit – Formal assessment of data quality. Example: Quarterly review.
Data Quality Automation – Automated quality processes. Example: CI/CD checks.
Data Quality Backlog – Tracked list of quality issues. Example: Jira tickets.
Data Quality Benchmark – Comparison standard. Example: Industry averages.
Data Quality Dashboard – Visual view of quality metrics. Example: Completeness trends.
Data Quality Dimension – Category used to measure quality. Example: Accuracy, completeness.
Data Quality Framework – Structured quality approach. Example: DAMA dimensions.
Data Quality Incident – Major quality failure. Example: Incorrect financial report.
Data Quality KPI – Metric tracking quality performance. Example: Duplicate rate.
Data Quality Maturity – Level of quality capability. Example: Reactive vs proactive.
Data Quality Monitoring – Ongoing quality measurement. Example: Daily freshness checks.
Data Quality Ownership Matrix – Mapping quality responsibility. Example: RACI chart.
Data Quality Program – Organization-wide quality initiative. Example: Enterprise DQ strategy.
Data Quality Regression – Reintroduced quality issue. Example: After schema change.
Data Quality Rule Engine – System executing validation rules. Example: Automated checks.
Data Quality Rule Violation – Failure to meet a rule. Example: Negative balance.
Data Quality Score – Numeric representation of data quality. Example: 98% completeness.
Data Quality SLA – Quality expectations agreement. Example: 99% accuracy target.
Data Quality SLA Breach – Failure to meet quality targets. Example: Accuracy below SLA.
Data Quality Trend – Quality performance over time. Example: Monthly improvement.
Data Reconciliation – Comparing datasets for consistency. Example: Finance system vs warehouse.
Data Reliability – Consistent data performance over time. Example: Stable metrics.
Data Remediation – Fixing data quality issues. Example: Reprocessing failed loads.
Data Sampling – Checking subset of data. Example: Random record review.
Data Standardization – Transforming data into a common format. Example: Converting dates to ISO format.
Data Steward – Role responsible for data quality. Example: Customer data steward.
Data Threshold – Acceptable quality limit. Example: ≤ 1% nulls.
Data Timeliness – Data availability within required timeframes. Example: Daily data refresh by 6 AM.
Data Trust Score – Composite measure of reliability. Example: Internal trust index.
Data Uniqueness – No unintended duplicates exist. Example: One row per customer.
Data Validation – Process of checking data against rules. Example: Rejecting invalid dates.
Data Validation Pipeline – Automated validation process. Example: Ingestion checks.
Data Validity – Data conforms to defined formats and rules. Example: Email follows standard pattern.
Data Verification – Confirming data accuracy. Example: Source system comparison.
Detective Control – Finds errors after entry. Example: Quality audits.
Domain Validation – Restricting values to a set. Example: Status = Active/Inactive.
Downstream Validation – Validating analytical outputs. Example: Dashboard totals.
Duplicate Detection – Identifying duplicate records. Example: Same email address twice.
Error Rate – Proportion of invalid records. Example: 2% failures.
Foreign Key – Reference to another table. Example: Order → Customer.
Format Validation – Ensuring correct data format. Example: YYYY-MM-DD dates.
Golden Dataset – Highest-quality dataset version. Example: Curated finance data.
Hard Validation – Blocking invalid data. Example: Reject invalid IDs.
Null Check – Ensuring required fields are populated. Example: Order ID not null.
Outlier Detection – Identifying abnormal values. Example: Negative revenue amounts.
Pattern Matching – Validating via regex patterns. Example: Postal code validation.
Post-Load Validation – Checks after data load. Example: Row count comparisons.
Pre-Load Validation – Checks before data ingestion. Example: File schema validation.
Preventive Control – Stops errors before entry. Example: Input validation.
Primary Key – Unique record identifier. Example: CustomerID.
Quality Gate – Mandatory validation checkpoint. Example: Before publishing data.
Range Validation – Checking values fall within limits. Example: Age between 0 and 120.
Referential Integrity – Valid relationships between tables. Example: Orders reference valid customers.
Root Cause Analysis – Identifying source of data issues. Example: ETL failure investigation.
Schema Validation – Checking data structure against schema. Example: Column data types.
Soft Validation – Warning without rejecting data. Example: Flag unusual values.
Source System Validation – Checking upstream data. Example: CRM record checks.
Statistical Validation – Using statistics to validate data. Example: Distribution checks.
Trusted Dataset – Data approved for consumption. Example: Executive KPIs.
Validation Coverage – Proportion of data checked. Example: 100% of critical fields.
Validation Rule – Condition data must satisfy. Example: Quantity must be ≥ 0.
Validation Threshold – Limit triggering failure. Example: >5% nulls.

Self-Service Analytics: Empowering Users While Maintaining Trust and Control

Self-service analytics has become a cornerstone of modern data strategies. As organizations generate more data and business users demand faster insights, relying solely on centralized analytics teams creates bottlenecks. Self-service analytics shifts part of the analytical workload closer to the business—while still requiring strong foundations in data quality, governance, and enablement.

This article is based on a detailed presentation I did at a HIUG conference a few years ago.


What Is Self-Service Analytics?

Self-service analytics refers to the ability for business users—such as analysts, managers, and operational teams—to access, explore, analyze, and visualize data on their own, without requiring constant involvement from IT or centralized data teams.

Instead of submitting requests and waiting days or weeks for reports, users can:

  • Explore curated datasets
  • Build their own dashboards and reports
  • Answer ad-hoc questions in real time
  • Make data-driven decisions within their daily workflows

Self-service does not mean unmanaged or uncontrolled analytics. Successful self-service environments combine user autonomy with governed, trusted data and clear usage standards.


Why Implement or Provide Self-Service Analytics?

Organizations adopt self-service analytics to address speed, scalability, and empowerment challenges.

Key Benefits

  • Faster Decision-Making
    Users can answer questions immediately instead of waiting in a reporting queue.
  • Reduced Bottlenecks for Data Teams
    Central teams spend less time producing basic reports and more time on high-value work such as modeling, optimization, and advanced analytics.
  • Greater Business Engagement with Data
    When users interact directly with data, data literacy improves and analytics becomes part of everyday decision-making.
  • Scalability
    A small analytics team cannot serve hundreds or thousands of users manually. Self-service scales insight generation across the organization.
  • Better Alignment with Business Context
    Business users understand their domain best and can explore data with that context in mind, uncovering insights that might otherwise be missed.

Why Not Implement Self-Service Analytics? (Challenges & Risks)

While powerful, self-service analytics introduces real risks if implemented poorly.

Common Challenges

  • Data Inconsistency & Conflicting Metrics
    Without shared definitions, different users may calculate the same KPI differently, eroding trust.
  • “Spreadsheet Chaos” at Scale
    Self-service without governance can recreate the same problems seen with uncontrolled Excel usage—just in dashboards.
  • Overloaded or Misleading Visuals
    Users may build reports that look impressive but lead to incorrect conclusions due to poor data modeling or statistical misunderstandings.
  • Security & Privacy Risks
    Improper access controls can expose sensitive or regulated data.
  • Low Adoption or Misuse
    Without training and support, users may feel overwhelmed or misuse tools, resulting in poor outcomes.
  • Shadow IT
    If official self-service tools are too restrictive or confusing, users may turn to unsanctioned tools and data sources.

What an Environment Looks Like Without Self-Service Analytics

In organizations without self-service analytics, patterns tend to repeat:

  • Business users submit report requests via tickets or emails
  • Long backlogs form for even simple questions
  • Analytics teams become report factories
  • Insights arrive too late to influence decisions
  • Users create their own disconnected spreadsheets and extracts
  • Trust in data erodes due to multiple versions of the truth

Decision-making becomes reactive, slow, and often based on partial or outdated information.


How Things Change With Self-Service Analytics

When implemented well, self-service analytics fundamentally changes how an organization works with data.

  • Users explore trusted datasets independently
  • Analytics teams focus on enablement, modeling, and governance
  • Insights are discovered earlier in the decision cycle
  • Collaboration improves through shared dashboards and metrics
  • Data becomes part of daily conversations, not just monthly reports

The organization shifts from report consumption to insight exploration. Well, that’s the goal.


How to Implement Self-Service Analytics Successfully

Self-service analytics is as much an operating model as it is a technology choice. The areas below must be considered, decided on, and put in place when planning a self-service analytics implementation.

1. Data Foundation

  • Curated, well-modeled datasets (often star schemas or semantic models)
  • Clear metric definitions and business logic
  • Certified or “gold” datasets for common use cases
  • Data freshness aligned with business needs

A strong semantic layer is critical—users should not have to interpret raw tables.


2. Processes

  • Defined workflows for dataset creation and certification
  • Clear ownership for data products and metrics
  • Feedback loops for users to request improvements or flag issues
  • Change management processes for metric updates

3. Security

  • Role-based access control (RBAC)
  • Row-level and column-level security where needed
  • Separation between sensitive and general-purpose datasets
  • Audit logging and monitoring of usage

Security must be embedded, not bolted on.


4. Users & Roles

Successful self-service environments recognize different user personas:

  • Consumers: View and interact with dashboards
  • Explorers: Build their own reports from curated data
  • Power Users: Create shared datasets and advanced models
  • Data Teams: Govern, enable, and support the ecosystem

Not everyone needs the same level of access or capability.


5. Training & Enablement

  • Tool-specific training (e.g., how to build reports correctly)
  • Data literacy education (interpreting metrics, avoiding bias)
  • Best practices for visualization and storytelling
  • Office hours, communities of practice, and internal champions

Training is ongoing—not a one-time event.


6. Documentation

  • Metric definitions and business glossaries
  • Dataset descriptions and usage guidelines
  • Known limitations and caveats
  • Examples of certified reports and dashboards

Good documentation builds trust and reduces rework.


7. Data Governance

Self-service requires guardrails, not gates.

Key governance elements include:

  • Data ownership and stewardship
  • Certification and endorsement processes
  • Naming conventions and standards
  • Quality checks and validation
  • Policies for personal vs shared content

Governance should enable speed while protecting consistency and trust.


8. Technology & Tools

Modern self-service analytics typically includes:

Data Platforms

  • Cloud data warehouses or lakehouses
  • Centralized semantic models

Data Visualization & BI Tools

  • Interactive dashboards and ad-hoc analysis
  • Low-code or no-code report creation
  • Sharing and collaboration features

Supporting Capabilities

  • Metadata management
  • Cataloging and discovery
  • Usage monitoring and adoption analytics

The key is selecting tools that balance ease of use with enterprise-grade governance.


Conclusion

Self-service analytics is not about giving everyone raw data and hoping for the best. It is about empowering users with trusted, governed, and well-designed data experiences.

Organizations that succeed treat self-service analytics as a partnership between data teams and the business—combining strong foundations, thoughtful governance, and continuous enablement. When done right, self-service analytics accelerates decision-making, scales insight creation, and embeds data into the fabric of everyday work.

Thanks for reading!

Data Conversions: Steps, Best Practices, and Considerations for Success

Introduction

Data conversions are critical undertakings in the world of IT and business, often required during system upgrades, migrations, mergers, or to meet new regulatory requirements. I have been involved in many data conversions over the years, and this article shares lessons from that experience: a comprehensive guide to the stages, steps, and best practices for executing successful data conversions. It was created from a detailed presentation I gave some time back at a SQL Saturday event.


What Is Data Conversion and Why Is It Needed?

Data conversion involves transforming data from one format, system, or structure to another. Common scenarios include application upgrades, migrating to new systems, adapting to new business or regulatory requirements, and integrating data after mergers or acquisitions. For example, merging two customer databases into a new structure is a typical conversion challenge.


Stages of a Data Conversion Project

Let’s take a look at the stages of a data conversion project.

Stage 1: Big Picture, Analysis, and Feasibility

The first stage is about understanding the overall impact and feasibility of the conversion:

  • Understand the Big Picture: Identify what the conversion is about, which systems are involved, the reasons for conversion, and its importance. Assess the size, complexity, and impact on business and system processes, users, and external parties. Determine dependencies and whether the conversion can be done in phases.
  • Know Your Sources and Destinations: Profile the source data, understand its use, and identify key measurements for success. Compare source and destination systems, noting differences and existing data in the destination.
  • Feasibility – Proof of Concept: Test with the most critical or complex data to ensure the conversion will meet the new system’s needs before proceeding further.
  • Project Planning: Draft a high-level project plan and requirements document, estimate complexity and resources, assemble the team, and officially launch the project.

Stage 2: Impact, Mappings, and QA Planning

Once the conversion is likely, the focus shifts to detailed impact analysis and mapping:

  • Impact Analysis: Assess how business and system processes, reports, and users will be affected. Consider equipment and resource needs, and make a go/no-go decision.
  • Source/Destination Mapping & Data Gap Analysis: Profile the data, create detailed mappings, list included and excluded data, and address gaps where source or destination fields don’t align. Maintain legacy keys for backward compatibility.
  • QA/Verification Planning: Plan for thorough testing, comparing aggregates and detailed records between source and destination, and involve both IT and business teams in verification.
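
To make the comparison idea concrete, here is a minimal sketch of an aggregate reconciliation check (shown in Power Query M, though the same check is often written in SQL); the table and column names and the sample rows are made up for illustration:

let
    // Hypothetical extracts from the source and destination systems
    Source = #table({"ID", "Amount"}, {{1, 100}, {2, 200}}),
    Destination = #table({"ID", "Amount"}, {{1, 100}, {2, 200}}),
    // Compare row counts and key aggregates side by side
    Comparison = #table(
        {"Check", "SourceValue", "DestinationValue"},
        {
            {"Row count", Table.RowCount(Source), Table.RowCount(Destination)},
            {"Amount total", List.Sum(Source[Amount]), List.Sum(Destination[Amount])}
        }
    )
in
    Comparison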

Stage 3: Project Execution, Development, and QA

With the project moving forward, detailed planning, development and validation, and user involvement become the priority:

  • Detailed Project Planning: Refine requirements, assign tasks, and ensure all parties are aligned. Communication is key.
  • Development: Set up environments, develop conversion scripts and programs, determine order of processing, build in logging, and ensure processes can be restarted if interrupted. Optimize for performance and parallel processing where possible.
  • Testing and Verification: Test repeatedly, verify data integrity and functionality, and involve all relevant teams. Business users should provide final sign-off.
  • Other Considerations: Train users, run old and new systems in parallel, set a firm cut-off for source updates, consider archiving, determine whether any SLAs need to be adjusted, and ensure compliance with regulations.

Stage 4: Execution and Post-Conversion Tasks

The final stage is about production execution and transition:

  • Schedule and Execute: Stick to the schedule, monitor progress, keep stakeholders informed, lock out users where necessary, and back up data before running conversion processes.
  • Post-Conversion: Run post-conversion scripts, allow limited access for verification, and where applicable, provide close monitoring and support as the new system goes live.

Best Practices and Lessons Learned

  • Involve All Stakeholders Early: Early engagement ensures smoother execution and better outcomes.
  • Analyze and Plan Thoroughly: A well-thought-out plan is the foundation of a successful conversion.
  • Develop Smartly and Test Vigorously: Build robust, traceable processes and test extensively.
  • Communicate Throughout: Keep all team members and stakeholders informed at every stage.
  • Pay Attention to Details: Watch out for tricky data types like DATETIME and time zones, and never underestimate the effort required.

Conclusion

Data conversions are complex, multi-stage projects that require careful planning, execution, and communication. By following the structured approach and best practices outlined above, organizations can minimize risks and ensure successful outcomes.

Thanks for reading!

Resolve inconsistencies, unexpected or null values, and data quality issues (PL-300 Exam Prep)

This post is a part of the PL-300: Microsoft Power BI Data Analyst Exam Prep Hub; and this topic falls under these sections: 
Prepare the data (25–30%)
--> Profile and clean the data
--> Resolve inconsistencies, unexpected or null values, and data quality issues


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub's main page.

High-quality data is essential for accurate analysis and trustworthy reports. In the PL-300 exam, Microsoft expects candidates to understand how to identify and resolve common data quality problems using Power Query before data is loaded into the model.

This section focuses on handling inconsistencies, unexpected values, nulls, and errors—all of which can negatively impact calculations, relationships, and visuals if left unresolved.


Why This Topic Matters for the Exam

From an exam perspective, this topic tests your ability to:

  • Diagnose data quality problems using profiling tools
  • Choose the correct transformation to fix an issue
  • Understand when to remove, replace, or transform data
  • Prevent downstream modeling and reporting issues

Most questions are scenario-based, asking what action you should take to fix a specific data issue.


Common Data Quality Issues You Must Recognize

1. Null (Blank) Values

Nulls represent missing or unknown data and can cause:

  • Incorrect aggregations
  • Broken relationships
  • Visuals that behave unexpectedly

Common causes:

  • Incomplete source data
  • Left joins with no matching rows
  • Data entry gaps

2. Unexpected or Invalid Values

These include:

  • Negative values where only positives make sense
  • Text values in numeric columns
  • Dates outside expected ranges
  • Misspelled or inconsistent category names

3. Inconsistent Data

Inconsistencies often appear as:

  • Mixed casing (USA vs usa)
  • Trailing or leading spaces
  • Multiple spellings for the same value
  • Different date or number formats

4. Error Values

Errors usually occur when:

  • Converting data types
  • Performing calculations
  • Parsing malformed data

Examples include:

  • Conversion failed
  • Divide by zero
  • Invalid date format

Identifying Data Quality Issues in Power Query

Power Query provides built-in data profiling tools to quickly detect problems:

Column Quality

  • Shows percentages of Valid, Error, and Empty values
  • Ideal for spotting nulls and errors

Column Distribution

  • Displays value frequency and distinct counts
  • Helps identify unexpected or inconsistent values

Column Profile

  • Provides min, max, average, and other statistics
  • Useful for detecting outliers and invalid ranges

Exam Tip: Profiling tools only analyze a sample by default. You may need to enable “Column profiling based on entire dataset” for accuracy.


Techniques to Resolve Null Values

Remove Rows

  • Used when nulls make a record unusable
  • Common for missing primary keys or required fields

Replace Values

  • Replace nulls with:
    • 0 (for numeric measures)
    • “Unknown” or “Not Provided” (for text)
    • A default date

Fill Down / Fill Up

  • Used for hierarchical or grouped data
  • Common in spreadsheets with merged cells

Exam Insight: Replacing nulls should be a business-justified decision, not automatic.
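
To make these options concrete, here is a minimal Power Query (M) sketch; the table and the OrderID, Quantity, and Region columns are hypothetical and exist only for illustration.

let
    // Hypothetical sample data standing in for a real source query
    Source = #table(
        {"OrderID", "Quantity", "Region"},
        {{1, 5, "West"}, {2, null, null}, {null, 3, "East"}}
    ),
    // Remove rows where a required key is missing
    RemovedMissingKeys = Table.SelectRows(Source, each [OrderID] <> null),
    // Replace nulls in a numeric column with 0 (the step the Replace Values dialog generates)
    ReplacedNulls = Table.ReplaceValue(RemovedMissingKeys, null, 0, Replacer.ReplaceValue, {"Quantity"}),
    // Fill grouped values downward, as you would with merged spreadsheet cells
    FilledDown = Table.FillDown(ReplacedNulls, {"Region"})
in
    FilledDown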


Resolving Inconsistencies

Standardizing Text

  • Use Transform → Format:
    • Uppercase
    • Lowercase
    • Capitalize Each Word

Trimming and Cleaning

  • Trim removes leading/trailing spaces
  • Clean removes non-printable characters

Replacing Values

  • Normalize spelling differences (e.g., “US”, “USA”, “United States”)
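
As a rough sketch (one of several ways to express this in M), the steps above could look like the following, using a hypothetical Country column with made-up values:

let
    // Hypothetical sample with inconsistent casing, spacing, and spellings
    Source = #table({"Country"}, {{" usa "}, {"US"}, {"United States"}}),
    // Trim spaces, remove non-printable characters, then standardize casing
    Cleaned = Table.TransformColumns(
        Source,
        {{"Country", each Text.Upper(Text.Clean(Text.Trim(_))), type text}}
    ),
    // Normalize whole-cell spelling variations to a single standard value
    StandardizedShort = Table.ReplaceValue(Cleaned, "US", "USA", Replacer.ReplaceValue, {"Country"}),
    Standardized = Table.ReplaceValue(StandardizedShort, "UNITED STATES", "USA", Replacer.ReplaceValue, {"Country"})
in
    Standardized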

Handling Unexpected or Invalid Values

Filtering

  • Remove values outside acceptable ranges
  • Exclude invalid categories

Conditional Columns

  • Create logic to flag or correct invalid data
  • Example: Replace negative sales with null or zero

Data Type Corrections

  • Ensure columns use appropriate data types
  • Prevents aggregation and calculation errors later
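
The sketch below shows one way these techniques combine in M; the OrderDate, SalesAmount, and Age columns and the sample rows are hypothetical, so treat it as an illustration rather than a prescribed pattern.

let
    // Hypothetical sample rows; dates and amounts arrive as text
    Source = #table(
        {"OrderDate", "SalesAmount", "Age"},
        {{"2024-01-15", "100.50", 34}, {"2024-02-03", "-25.00", 150}}
    ),
    // Correct data types first so numeric and date logic behaves predictably
    Typed = Table.TransformColumnTypes(
        Source,
        {{"OrderDate", type date}, {"SalesAmount", type number}},
        "en-US"
    ),
    // Filter out values that fall outside the acceptable range
    ValidAges = Table.SelectRows(Typed, each [Age] >= 0 and [Age] <= 120),
    // Use a conditional column to flag or correct invalid values
    Flagged = Table.AddColumn(
        ValidAges,
        "SalesAmountClean",
        each if [SalesAmount] < 0 then null else [SalesAmount],
        type nullable number
    )
in
    Flagged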

Fixing Error Values

Replace Errors

  • Replace with null or a default value

Remove Errors

  • Used when rows are unreliable
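
Both options can be expressed directly in M. In this minimal sketch, the SalesAmount column and the failing "abc" value are made up to force a conversion error:

let
    // "abc" will fail the numeric conversion and produce an error cell
    Source = #table({"SalesAmount"}, {{"100"}, {"abc"}, {"250"}}),
    Typed = Table.TransformColumnTypes(Source, {{"SalesAmount", type number}}),
    // Option 1: replace error cells with null and keep the rows
    ErrorsReplaced = Table.ReplaceErrorValues(Typed, {{"SalesAmount", null}}),
    // Option 2: drop the unreliable rows entirely (alternative to Option 1)
    ErrorsRemoved = Table.RemoveRowsWithErrors(Typed, {"SalesAmount"})
in
    ErrorsReplaced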

Fix the Root Cause

  • Change transformation order
  • Adjust data type conversion
  • Clean data before applying calculations

Exam Tip: Microsoft often tests whether you know why an error occurs, not just how to remove it.


Best Practices for PL-300 Candidates

  • Always profile before transforming
  • Fix issues in Power Query, not DAX, when possible
  • Understand the impact of removing vs replacing data
  • Keep transformations repeatable and documented
  • Prefer clean data models over complex report logic

Key Takeaways for the Exam

You should be able to:

  • Identify different types of data quality issues
  • Choose the correct Power Query tool to resolve them
  • Understand the downstream impact on models and visuals
  • Interpret profiling results correctly

Mastering this topic ensures cleaner datasets, better models, and fewer surprises during analysis—exactly what the PL-300 exam is designed to validate.


Practice Questions

Go to the Practice Exam Questions for this topic.

Evaluate Data including Data Statistics & Column Properties (PL-300 Exam Prep)

This post is a part of the PL-300: Microsoft Power BI Data Analyst Exam Prep Hub; and this topic falls under these sections: 
Prepare the data (25–30%)
--> Profile and clean the data
--> Evaluate data, including data statistics and column properties


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Before cleaning, transforming, or modeling data, a Power BI Data Analyst must first evaluate the quality and structure of the data. The PL-300 exam tests your ability to profile data, interpret data statistics, and understand column properties to identify issues such as missing values, incorrect data types, outliers, and inconsistent formats.

This topic lives under Profile and clean the data because effective data preparation starts with understanding what the data looks like and how it behaves.


What Does “Evaluate Data” Mean in Power BI?

Evaluating data means using Power BI (specifically Power Query) to:

  • Understand data distribution and completeness
  • Identify data quality issues
  • Verify correct data types and formats
  • Decide what cleaning or transformation steps are required

Rather than guessing, Power BI provides built-in profiling tools that summarize data characteristics automatically.


Data Profiling Tools in Power Query

Power BI includes several profiling features that appear in the Power Query Editor, primarily within the View tab.

Key Data Profiling Options

  • Column quality
  • Column distribution
  • Column profile

These tools help you quickly assess whether a column is usable, trustworthy, and correctly defined.


Column Quality

Column quality provides a high-level overview of data completeness and validity.

It visually displays:

  • Valid values
  • Error values
  • Empty (null) values

Why Column Quality Matters

  • Quickly highlights missing or broken data
  • Helps determine whether rows should be filtered, fixed, or removed
  • Useful for early detection of refresh or ingestion issues

📌 Exam insight:
Questions often test whether you can identify which tool reveals missing or invalid values—column quality is the answer.


Column Distribution

Column distribution shows how values are spread across a column.

It provides:

  • Frequency of values
  • Distinct vs unique counts
  • A histogram-style visualization (for numeric fields)

Common Uses

  • Spotting unexpected duplicates
  • Identifying skewed data
  • Detecting outliers
  • Validating categorical values

📌 Exam insight:
Column distribution is used to understand value frequency, not just nulls or errors.


Column Profile

Column profile gives the most detailed statistical view of a column.

Depending on the data type, it may include:

  • Minimum and maximum values
  • Average
  • Standard deviation
  • Count and distinct count
  • Null count

Typical Use Cases

  • Verifying numeric ranges (e.g., negative values where none should exist)
  • Checking date ranges
  • Understanding overall data shape before modeling

📌 Exam insight:
Column profile helps validate statistical characteristics, not formatting or naming.
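
If you want the same statistics as a query result rather than in the profiling pane, M's Table.Profile function returns them per column. Here is a minimal sketch over made-up sample data:

let
    // Hypothetical sample data
    Source = #table(
        type table [SalesAmount = number, Region = text],
        {{120.5, "West"}, {null, "East"}, {-40, "West"}}
    ),
    // Returns Min, Max, Average, StandardDeviation, Count, NullCount, and DistinctCount for each column
    Profile = Table.Profile(Source)
in
    Profile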


Understanding Column Properties

Beyond statistics, Power BI also evaluates column properties, which affect how data behaves in the model and visuals.

Key Column Properties to Evaluate

Data Type

Examples:

  • Whole number
  • Decimal number
  • Text
  • Date / DateTime
  • Boolean

Incorrect data types can:

  • Break visuals
  • Prevent aggregations
  • Cause relationship issues

📌 Exam tip:
Always verify data types before applying transformations or creating measures.


Format

Controls how values appear (e.g., currency, percentage, date format).

  • Affects display, not calculation logic
  • Often adjusted after validating data type

Default Summarization

Determines how numeric columns aggregate in visuals:

  • Sum
  • Average
  • Count
  • Do not summarize

📌 Exam insight:
Default summarization is evaluated when deciding how columns behave in visuals—not during Power Query transformations.


Column Name & Description

  • Clear names improve usability
  • Descriptions help report consumers understand the data

While not deeply technical, the exam may include best-practice questions around data clarity and usability.


Evaluating Data at the Right Stage

Most evaluation tasks occur in Power Query, before data is loaded into the model.

Why?

  • Faster detection of issues
  • Prevents poor-quality data from entering the model
  • Reduces downstream modeling complexity

📌 Key distinction for the exam:

  • Power Query → data evaluation & cleaning
  • Model view → relationships & behavior
  • Report view → visualization

Common Exam Scenarios

You may encounter questions like:

Scenario 1

You need to quickly identify columns with missing or invalid values.

Correct concept: Column quality


Scenario 2

You want to understand how frequently values appear in a categorical column.

Correct concept: Column distribution


Scenario 3

You need to verify numeric ranges and detect outliers.

Correct concept: Column profile


Scenario 4

A numeric column is being treated as text and cannot be aggregated.

Correct concept: Incorrect data type (column property)


Best Practices to Remember

  • Enable profiling tools early in data preparation
  • Validate data types before transformations
  • Use statistics to guide cleaning decisions
  • Don’t rely on visuals alone to detect data quality issues

Key Exam Takeaways

For the PL-300 exam, remember:

  • Column quality → valid, error, and null values
  • Column distribution → frequency and distinct values
  • Column profile → statistical insights
  • Column properties affect aggregation, relationships, and visuals
  • Data evaluation happens primarily in Power Query

Understanding how to interpret what Power BI is telling you about your data is just as important as knowing how to clean it.


Practice Questions

Go to the Practice Exam Questions for this topic.

Glossary – 100 “Data Engineering” Terms

Below is a glossary that includes 100 common “Data Engineering” terms and phrases in alphabetical order. Enjoy!

Term – Definition & Example
Access Control – Managing who can access data. Example: Role-based permissions.
At-Least-Once Processing – Data may be processed more than once. Example: Duplicate-safe pipelines.
At-Most-Once Processing – Data processed zero or one time. Example: No retries on failure.
Backfill – Processing historical data. Example: Reloading last year’s data.
Batch Processing – Processing data in scheduled chunks. Example: Daily sales aggregation.
Blue-Green Deployment – Deployment strategy minimizing downtime. Example: Switching pipeline versions.
Canary Release – Gradual rollout to detect issues. Example: New pipeline tested on 5% of data.
Change Data Capture (CDC) – Capturing database changes. Example: Streaming updates from OLTP DB.
Checkpointing – Saving progress during processing. Example: Spark streaming checkpoints.
Cloud Storage – Scalable remote data storage. Example: Azure Data Lake Storage.
Cold Storage – Low-cost storage for infrequent access. Example: Archived logs.
Columnar Storage – Data stored by column instead of row. Example: Parquet files.
Compression – Reducing data size. Example: Gzip-compressed files.
Compute Engine – System performing data processing. Example: Spark cluster.
Consumption Layer – Data prepared for analytics. Example: Gold layer.
Cost Optimization – Reducing infrastructure costs. Example: Query optimization.
Curated Layer – Cleaned and transformed data. Example: Silver layer.
DAG (Directed Acyclic Graph) – Workflow structure with dependencies. Example: Airflow pipeline.
Data Catalog – Searchable inventory of data assets. Example: Azure Purview.
Data Contract – Agreement defining data structure and expectations. Example: Producer guarantees column names and types.
Data Engineering – The practice of designing, building, and maintaining data systems. Example: Creating pipelines that feed analytics dashboards.
Data Governance – Policies for data management and usage. Example: Access control rules.
Data Ingestion – Collecting data from source systems. Example: Ingesting API data hourly.
Data Lake – Centralized storage for raw data. Example: S3-based data lake.
Data Latency – Time delay in data availability. Example: 5-minute pipeline delay.
Data Lineage – Tracking data flow from source to output. Example: Source-to-dashboard trace.
Data Mart – Subset of warehouse for specific use. Example: Finance data mart.
Data Masking – Obscuring sensitive data. Example: Masked credit card numbers.
Data Mesh – Domain-oriented decentralized data ownership. Example: Teams own their data products.
Data Modeling – Designing data structures for usage. Example: Star schema design.
Data Observability – Monitoring data health and pipelines. Example: Freshness alerts.
Data Partition Pruning – Skipping irrelevant partitions. Example: Querying one date only.
Data Pipeline – An automated process that moves and transforms data. Example: Nightly ETL job from CRM to warehouse.
Data Platform – Integrated set of data tools. Example: End-to-end analytics stack.
Data Product – A dataset treated as a product. Example: Curated customer table.
Data Profiling – Analyzing data characteristics. Example: Value distributions.
Data Quality – Accuracy, completeness, and reliability of data. Example: No duplicate records.
Data Replay – Reprocessing historical events. Example: Rebuilding aggregates from logs.
Data Retention – Rules for data lifespan. Example: Delete logs after 1 year.
Data Security – Protecting data from unauthorized access. Example: Encryption at rest.
Data Serialization – Converting data for storage or transport. Example: Avro encoding.
Data Sink – The destination where data is stored. Example: Data warehouse.
Data Source – The origin of data. Example: ERP system, SaaS application.
Data Validation – Ensuring data meets expectations. Example: Null checks.
Data Versioning – Tracking dataset changes. Example: Snapshot tables.
Data Warehouse – Optimized storage for analytics queries. Example: Azure Synapse Analytics.
Dead Letter Queue (DLQ) – Storage for failed records. Example: Invalid messages routed for review.
Dimension Table – Table storing descriptive attributes. Example: Customer details.
ELT – Extract, Load, Transform approach. Example: Transforming data inside Snowflake.
ETL – Extract, Transform, Load process. Example: Cleaning data before loading into a database.
Event Time – Timestamp when event occurred. Example: User click time.
Event-Driven Architecture – Systems reacting to events in real time. Example: Trigger pipeline on file arrival.
Exactly-Once Processing – Ensuring data is processed only once. Example: Preventing duplicate events.
Fact Table – Table storing quantitative measures. Example: Order transactions.
Fault Tolerance – System resilience to failures. Example: Node failure recovery.
File Format – How data is stored on disk. Example: Parquet, CSV.
Foreign Key – Field linking tables together. Example: CustomerID in orders table.
Full Load – Reloading all data. Example: Initial table population.
High Availability – System uptime and reliability. Example: Multi-zone deployment.
Hot Storage – High-performance storage for frequent access. Example: Real-time tables.
Idempotency – Ability to rerun pipelines safely. Example: Reprocessing without duplicates.
Incremental Load – Loading only new or changed data. Example: CDC-based ingestion.
Indexing – Creating structures to speed queries. Example: Index on order date.
Infrastructure as Code (IaC) – Managing infrastructure via code. Example: Terraform scripts.
Lakehouse – Hybrid of data lake and warehouse. Example: Databricks Lakehouse.
Late-Arriving Data – Data that arrives after expected time. Example: Delayed event logs.
Logging – Recording system events. Example: Job execution logs.
Message Queue – Buffer for asynchronous data transfer. Example: Kafka topic for events.
Metadata – Data about data. Example: Table definitions and lineage.
Metrics – Quantitative indicators of performance. Example: Rows processed per run.
Orchestration – Coordinating pipeline execution. Example: DAG scheduling.
Partitioning – Dividing data for performance. Example: Partitioning by date.
Personally Identifiable Information (PII) – Data identifying individuals. Example: Email addresses.
Pipeline Monitoring – Tracking pipeline execution status. Example: Failure notifications.
Primary Key – Unique identifier for a record. Example: CustomerID.
Processing Time – Timestamp when data is processed. Example: Ingestion time.
Query Optimization – Improving query efficiency. Example: Predicate pushdown.
Raw Layer – Storage of unprocessed data. Example: Bronze layer.
Real-Time Data – Data available with minimal latency. Example: Live dashboard updates.
Retry Logic – Automatic reruns on failure. Example: Retry failed ingestion job.
Scalability – Ability to handle growing workloads. Example: Auto-scaling clusters.
Scheduler – Tool managing execution timing. Example: Cron, Airflow.
Schema – The structure of a dataset. Example: Table columns and data types.
Schema Evolution – Handling schema changes over time. Example: Adding new columns safely.
Secrets Management – Secure handling of credentials. Example: Key Vault for passwords.
Semi-Structured Data – Data with flexible schema. Example: JSON, XML.
Serverless – Infrastructure managed by provider. Example: Serverless SQL pools.
Serving Layer – Layer optimized for consumption. Example: BI-ready tables.
Sharding – Distributing data across nodes. Example: User data split across servers.
Snowflake Schema – Normalized version of star schema. Example: Product broken into sub-dimensions.
Star Schema – Fact table surrounded by dimensions. Example: Sales fact with date dimension.
Stream Processing – Processing data in real time. Example: Clickstream event processing.
Structured Data – Data with a fixed schema. Example: SQL tables.
Technical Debt – Long-term cost of quick fixes. Example: Hardcoded transformations.
Throughput – Amount of data processed per unit time. Example: Records per second.
Transformation Layer – Layer where business logic is applied. Example: dbt models.
Unstructured Data – Data without a predefined structure. Example: Images, PDFs.
Watermark – Marker for processed data. Example: Last processed timestamp.
Windowing – Grouping stream data by time windows. Example: 5-minute aggregations.
Workload Isolation – Separating workloads to avoid contention. Example: Dedicated compute pools.

Please share your suggestions for any terms that should be added.

How to Perform a Safe DIVIDE in Power BI (DAX and Power Query)

Division is a common operation in Power BI, but it can cause errors when the divisor is zero. Both DAX and Power Query provide built-in ways to handle these scenarios safely.

Safe DIVIDE in DAX

In DAX, the DIVIDE function is the recommended approach. Its syntax is:

DIVIDE(numerator, divisor [, alternateResult])

If the divisor is zero (or BLANK), the function returns the optional alternateResult; otherwise, it performs the division normally.

Examples:

  • DIVIDE(10, 2) returns 5
  • DIVIDE(10, 0) returns BLANK
  • DIVIDE(10, 0, 0) returns 0

This makes DIVIDE safer and cleaner than using conditional logic.

Safe DIVIDE in Power Query

In Power Query (M language), you can use the try … otherwise expression to handle divide-by-zero errors gracefully. The syntax is:

try [expression] otherwise [alternateValue]

Example:

try [Sales] / [Quantity] otherwise 0

If the division fails (such as when Quantity is zero), Power Query returns 0 instead of an error.
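
For example, the same pattern works when adding a custom column. In this minimal sketch, the Sales and Quantity column names and the sample values are placeholders:

let
    // Hypothetical sample data; the second row would normally raise a divide-by-zero error
    Source = #table({"Sales", "Quantity"}, {{100, 4}, {50, 0}}),
    // Wrap the division in try ... otherwise so failures return 0 instead of an error
    AddedUnitPrice = Table.AddColumn(Source, "UnitPrice", each try [Sales] / [Quantity] otherwise 0, type number)
in
    AddedUnitPrice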

Using DIVIDE in DAX and try … otherwise in Power Query ensures your division calculations remain error-free.

How to replace a NULL value in Power BI Power Query

In Power BI, handling NULL values is a common data-preparation step to get your data ready for analysis, and Power Query makes this easy using the Replace Values feature.

This option is available from both the Home and Transform menus in the Power Query Editor.

To replace NULLs, first select the column where the NULL values exist. Then choose Replace Values. When the dialog box appears, enter null as the value to find, and specify the value you want to use instead—such as 0 for numeric columns or “Unknown” for text columns.

After confirming, Power Query automatically updates the column and records the step.
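
Behind the scenes, the dialog writes an M step similar to the sketch below; the Quantity column and the #"Previous Step" reference are placeholders for your own query:

// Replace null with 0 in the Quantity column; Replacer.ReplaceValue matches whole cell values
= Table.ReplaceValue(#"Previous Step", null, 0, Replacer.ReplaceValue, {"Quantity"})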

Thanks for reading!

Exam Prep Hub for DP-600: Implementing Analytics Solutions Using Microsoft Fabric

This is your one-stop hub with information for preparing for the DP-600: Implementing Analytics Solutions Using Microsoft Fabric certification exam. Upon successful completion of the exam, you earn the Fabric Analytics Engineer Associate certification.

This hub provides information directly here, links to a number of external resources, tips for preparing for the exam, practice tests, and section questions to help you prepare. Bookmark this page and use it as a guide to ensure that you are fully covering all relevant topics for the exam and using as many of the resources available as possible. We hope you find it convenient and helpful.

Why take the DP-600: Implementing Analytics Solutions Using Microsoft Fabric exam to gain the Fabric Analytics Engineer Associate certification?

Most likely, you already know why you want to earn this certification, but in case you are seeking information on its benefits, here are a few:
(1) there is a possibility for career advancement, because Microsoft Fabric is a leading data platform used by companies of all sizes all over the world and is likely to become even more popular;
(2) greater job opportunities due to the edge provided by the certification;
(3) higher earnings potential;
(4) you will expand your knowledge of the Fabric platform by going beyond what you would normally do on the job;
(5) it will provide immediate credibility about your knowledge; and
(6) it may, and it should, give you greater confidence in your knowledge and skills.


Important DP-600 resources:


DP-600: Skills measured as of October 31, 2025:

Here you can learn in a structured manner by going through the topics of the exam one-by-one to ensure full coverage; click on each hyperlinked topic below to go to more information about it:

Skills at a glance

  • Maintain a data analytics solution (25%-30%)
  • Prepare data (45%-50%)
  • Implement and manage semantic models (25%-30%)

Maintain a data analytics solution (25%-30%)

Implement security and governance

Maintain the analytics development lifecycle

Prepare data (45%-50%)

Get Data

Transform Data

Query and analyze data

Implement and manage semantic models (25%-30%)

Design and build semantic models

Optimize enterprise-scale semantic models


Practice Exams:

We have provided 2 practice exams with answers to help you prepare.

DP-600 Practice Exam 1 (60 questions with answer key)

DP-600 Practice Exam 2 (60 questions with answer key)


Good luck to you passing the DP-600: Implementing Analytics Solutions Using Microsoft Fabric certification exam and earning the Fabric Analytics Engineer Associate certification!

Select, Filter, and Aggregate Data Using DAX

This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections: 
Prepare data
--> Query and analyze data
--> Select, Filter, and Aggregate Data Using DAX

Data Analysis Expressions (DAX) is a formula language used to create dynamic calculations in Power BI semantic models. Unlike SQL or KQL, DAX works within the analytical model and is designed for filter context–aware calculations, interactive reporting, and business logic. For DP-600, you should understand how to use DAX to select, filter, and aggregate data within a semantic model for analytics and reporting.


What Is DAX?

DAX is similar to Excel formulas but optimized for relational, in-memory analytics. It is used in:

  • Measures (dynamic calculations)
  • Calculated columns (row-level derived values)
  • Calculated tables (additional, reusable query results)

In a semantic model, DAX queries run in response to visuals and can produce results based on current filters and slicers.


Selecting Data in DAX

DAX itself doesn’t use a traditional SELECT statement like SQL. Instead:

  • Data is selected implicitly by filter context
  • DAX measures operate over table columns referenced in expressions

Example of a simple DAX measure selecting and displaying sales:

Total Sales = SUM(Sales[SalesAmount])

Here:

  • Sales[SalesAmount] references the column in the Sales table
  • The measure returns the sum of all values in that column

Filtering Data in DAX

Filtering in DAX is context-driven and can be applied in multiple ways:

1. Implicit Filters

Visual-level filters and slicers automatically apply filters to DAX measures.

Example:
A card visual showing Total Sales will reflect only the subset filtered by product or date.

2. FILTER Function

Used within measures or calculated tables to narrow down rows:

HighValueSales = CALCULATE(
    SUM(Sales[SalesAmount]),
    FILTER(Sales, Sales[SalesAmount] > 1000)
)

Here:

  • FILTER returns a table with rows meeting the condition
  • CALCULATE modifies the filter context

3. CALCULATE as Filter Modifier

CALCULATE changes the context under which a measure evaluates:

SalesLastYear = CALCULATE(
    [Total Sales],
    SAMEPERIODLASTYEAR(Date[Date])
)

This measure selects data for the previous year based on current filters.


Aggregating Data in DAX

Aggregation in DAX is done using built-in functions and is influenced by filter context.

Common Aggregation Functions

  • SUM() — totals a numeric column
  • AVERAGE() — computes the mean
  • COUNT() / COUNTA() — row counts
  • MAX() / MIN() — extreme values
  • SUMX() — row-by-row iteration and sum

Example of row-by-row aggregation:

Total Profit = SUMX(
    Sales,
    Sales[SalesAmount] - Sales[Cost]
)

This computes the difference per row and then sums it.


Filter Context and Row Context

Understanding how DAX handles filter context and row context is essential:

  • Filter context: Set by the report (slicers, column filters) or modified by CALCULATE
  • Row context: Used in calculated columns and iteration functions (SUMX, FILTER)

DAX measures always respect the current filter context unless explicitly modified.


Grouping and Summarization

While DAX doesn’t use GROUP BY in the same way SQL does, measures inherently aggregate over groups determined by filter context or visual grouping.

Example:
In a table visual grouped by Product Category, the measure Total Sales returns aggregated values per category automatically.


Time Intelligence Functions

DAX includes built-in functions for time-based aggregation:

  • TOTALYTD(), TOTALQTD(), TOTALMTD() — year-to-date, quarter-to-date, month-to-date
  • SAMEPERIODLASTYEAR() — compare values year-over-year
  • DATESINPERIOD() — custom period

Example:

SalesYTD = TOTALYTD(
    [Total Sales],
    Date[Date]
)


Best Practices

  • Use measures, not calculated columns, for dynamic, filter-sensitive aggregations.
  • Let visuals control filter context via slicers, rows, and columns.
  • Avoid unnecessary row-by-row calculations when simple aggregation functions suffice.
  • Explicitly use CALCULATE to modify filter context for advanced scenarios.

When to Use DAX vs SQL/KQL

Scenario – Best Tool
Static relational querying – SQL
Streaming/event analytics – KQL
Report-level dynamic calculations – DAX
Interactive dashboards with slicers – DAX

Example Use Cases

1. Total Sales Measure

Total Sales = SUM(Sales[SalesAmount])

2. Filtered Sales for Big Orders

Big Orders Sales = CALCULATE(
    [Total Sales],
    Sales[SalesAmount] > 1000
)

3. Year-over-Year Sales

Sales YOY = CALCULATE(
    [Total Sales],
    SAMEPERIODLASTYEAR(Date[Date])
)


Key Takeaways for the Exam

  • DAX operates based on filter context and evaluates measures dynamically.
  • There is no explicit SELECT statement — rather, measures compute values based on current context.
  • Use CALCULATE to change filter context.
  • Aggregation functions (e.g., SUM, COUNT, AVERAGE) are fundamental to summarizing data.
  • Filtering functions like FILTER and time intelligence functions enhance analytical flexibility.

Final Exam Tips

  • If a question mentions interactive reports, dynamic filters, slicers, or time-based comparisons, DAX is likely the right language to use for the solution.
  • Measures + CALCULATE + filter context appear frequently.
  • If the question mentions slicers, visuals, or dynamic results, think DAX measure.
  • Time intelligence functions are high-value topics.

Practice Questions:

Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to …

  • Identify and understand why an option is correct (or incorrect) — not just which one
  • Look for and understand the usage scenario of keywords in exam questions to guide you
  • Expect scenario-based questions rather than direct definitions

1. Which DAX function is primarily used to modify the filter context of a calculation?

A. FILTER
B. SUMX
C. CALCULATE
D. ALL

Correct answer: ✅ C
Explanation: CALCULATE changes the filter context under which an expression is evaluated.


2. A Power BI report contains slicers for Year and Product. A measure returns different results as slicers change. What concept explains this behavior?

A. Row context
B. Filter context
C. Evaluation context
D. Query context

Correct answer: ✅ B
Explanation: Filter context is affected by slicers, filters, and visual interactions.


3. Which DAX function iterates row by row over a table to perform a calculation?

A. SUM
B. COUNT
C. AVERAGE
D. SUMX

Correct answer: ✅ D
Explanation: SUMX evaluates an expression for each row and then aggregates the results.


4. You want to calculate total sales only for transactions greater than $1,000. Which approach is correct?

A.

SUM(Sales[SalesAmount] > 1000)

B.

FILTER(Sales, Sales[SalesAmount] > 1000)

C.

CALCULATE(
    SUM(Sales[SalesAmount]),
    Sales[SalesAmount] > 1000
)

D.

SUMX(Sales, Sales[SalesAmount] > 1000)

Correct answer: ✅ C
Explanation: CALCULATE applies a filter condition while aggregating.


5. Which DAX object is evaluated dynamically based on report filters and slicers?

A. Calculated column
B. Calculated table
C. Measure
D. Relationship

Correct answer: ✅ C
Explanation: Measures respond dynamically to filter context; calculated columns do not.


6. Which function is commonly used to calculate year-to-date (YTD) values in DAX?

A. DATESINPERIOD
B. SAMEPERIODLASTYEAR
C. TOTALYTD
D. CALCULATE

Correct answer: ✅ C
Explanation: TOTALYTD is designed for year-to-date aggregations.


7. A DAX measure returns different totals when placed in a table visual grouped by Category. Why does this happen?

A. The measure contains row context
B. The table visual creates filter context
C. The measure is recalculated per row
D. Relationships are ignored

Correct answer: ✅ B
Explanation: Visual grouping applies filter context automatically.


8. Which DAX function returns a table instead of a scalar value?

A. SUM
B. AVERAGE
C. FILTER
D. COUNT

Correct answer: ✅ C
Explanation: FILTER returns a table that can be consumed by other functions like CALCULATE.


9. Which scenario is the best use case for DAX instead of SQL or KQL?

A. Cleaning raw data before ingestion
B. Transforming streaming event data
C. Creating interactive report-level calculations
D. Querying flat files in a lakehouse

Correct answer: ✅ C
Explanation: DAX excels at dynamic, interactive calculations in semantic models.


10. What is the primary purpose of the SAMEPERIODLASTYEAR function?

A. Aggregate values by fiscal year
B. Remove filters from a date column
C. Compare values to the previous year
D. Calculate rolling averages

Correct answer: ✅ C
Explanation: It shifts the date context back one year for year-over-year analysis.