Category: Data Engineering

Business Intelligence, Data Engineering, Microsoft Certification, PL-300, Power BI January 17, 2026

Create and Transform Columns (PL-300 Exam Prep)

This post is a part of the PL-300: Microsoft Power BI Data Analyst Exam Prep Hub; and this topic falls under these sections:
Prepare the data (25–30%)
   --> Transform and load the data
      --> Create and Transform Columns

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Columns are the foundation of data analysis in Power BI. The ability to create new columns and transform existing ones is essential for shaping your dataset into a structure that supports meaningful insights and accurate reports.

In the PL-300 exam, Microsoft tests not only whether you can perform transformations but also whether you understand when and why to apply them.

Why Create and Transform Columns?

Before data can be modeled and visualized:

It must be clean, consistent, and in the right format
New columns may be needed to support business logic
Transformations ensure data is reliable and analysis-ready

For example:

Converting text dates into true Date types
Extracting parts of a string (e.g., Year from a date)
Splitting a full name into first and last names
Normalizing inconsistent text values

These are not just useful—they are often necessary for accurate DAX measures and reporting.

Where Column Transformations Happen

Most column creation and transformation tasks happen in Power Query Editor (before the data loads into the model).

Key places include:

Transform tab
Add Column tab
Applied Steps pane
Advanced Editor (for M code)

Power BI also allows column creation after loading the data through:

DAX Calculated Columns (in the data model)

The exam may present scenarios where you choose which tool (Power Query vs DAX) to use.

Common Column Transformations

Here are the main categories of column operations you should be ready to apply:

1. Basic Transformations

These change existing columns:

Rename columns
Change data types
Trim, clean, or format text
Replace values
Remove columns

These are the bread-and-butter tasks that clean and standardize data.

2. Splitting and Merging Columns

When data is combined within one field:

Split Column (by delimiter or number of characters)
- Example: Split Full Name → First Name and Last Name
Merge Columns
- Example: Combine City and State into a single location field

This is essential when data needs to be restructured for modeling.

3. Extracting Components

Examples include:

Extracting Year, Month, or Day from a Date column
Taking the first/last characters from a text string
Extracting text before or after a specific character

These operations prepare granular fields needed for grouping or calculations.

4. Calculations Using “Add Column”

You can create derived columns based on logic:

Custom Columns (via M formulas)
Conditional Columns
- Example: Flag High Value sales where sales > $1,000
Index Columns
- Useful for row ordering

These columns often support business metrics or classifications.

Text Transformations

Text columns commonly require cleaning and standardization:

Uppercase / Lowercase
Trim (removes leading/trailing spaces)
Clean (removes non-printable characters)
Replace Values (e.g., “N/A” → null)

The exam often tests whether you know how to fix inconsistent text data.

Date and Time Transformations

Working with dates is core to analysis:

Change text to date/time type
Extract Year, Quarter, Month, Day
Add custom time intelligence columns
Use locale conversion for date parsing

This enables time-based grouping and accurate measures like YTD (Year-to-Date).

Conditional and Custom Columns

Conditional Columns

Created through UI (Add Column → Conditional Column)
Define logic visually (e.g., if Sales > 500 then "High" else "Low")

Custom Columns

Created using Power Query M code
More advanced logic and functions

Both are useful depending on the complexity of your requirement. Exam questions often compare these approaches.

Column Transformations vs DAX Calculated Columns

Power Query Column

Transformation occurs before data loads into model
Changes physical data shape
Useful for cleaning and structuring data

DAX Calculated Column

Created after data loads into the model
Evaluated per row in the model
Useful for measures and relationships tied to data model context

Exam insight:
Use Power Query transformations for structural cleanup. Use DAX calculated columns when the logic depends on model relationships or evaluation context.

Best Practices for the Exam

Clean data before creating columns.
Don’t derive new data from dirty input.
Apply the right transformation tool.
Power Query for structural cleanup; DAX for model-aware calculations.
Name columns clearly.
Report consumers and measures depend on intuitive names.
Avoid unnecessary columns.
Only keep what’s needed for reporting to improve model performance.
Group related transformations logically.
Use Query folding where possible (especially for large datasets).

How This Appears on the PL-300 Exam

You might see scenarios like:

You need to split a full address column into street, city, and postal code for better filtering. Which transformation should you use?

This tests:

Knowledge of Split Column
When to apply it
How to maintain data type integrity afterward

Or:

Your date column is text and not aggregating correctly. What do you do?

This tests:

Understanding of data types
Ability to convert to proper Date/Time

Most questions are scenario-based, requiring both decision and action reasoning.

Key Takeaways

✔ Column transformations are a core part of shaping data
✔ Power Query is the primary environment for creating and transforming columns
✔ Use Add Column for new fields and Transform for modifying existing fields
✔ Know the difference between Power Query and DAX calculated columns
✔ Common transformations include text, date, splitting/merging, conditional logic, and custom formulas

Practice Questions

Go to the Practice Exam Questions for this topic.

Analytics, Data Development, Data Engineering, Data Integration, Microsoft Certification, PL-300, Power BI January 17, 2026

Resolve Data Import Errors (PL-300 Exam Prep)

This post is a part of the PL-300: Microsoft Power BI Data Analyst Exam Prep Hub; and this topic falls under these sections: 
Prepare the data (25–30%) 
   --> Profile and clean the data 
      --> Resolve data import errors

Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub's main page.

Data import errors are a common issue when bringing data into Power BI. These errors typically arise during the Power Query stage and must be resolved before data can be successfully loaded into the data model. The PL-300 exam tests your ability to identify, interpret, and fix these errors using Power Query’s built-in tools and transformations.

What Are Data Import Errors?

Import errors occur when Power BI cannot process or convert incoming data as expected. These errors can arise from:

Invalid data formats
Incompatible data types
Data corruption
Unexpected null or missing values
Transformation steps that fail

Identifying and resolving these errors early ensures that your dataset is clean, consistent, and ready for modeling and reporting.

Where Import Errors Occur

Import errors are most commonly encountered:

🧩 During Data Type Conversion

When the source value cannot be converted to the target type
(e.g., text "N/A" converted to number)

🧩 In Applied Steps

If a transformation step references a column that doesn’t exist
or expects a format that isn’t present

🧩 While Combining Queries

When merging or appending tables with mismatched structures

🧩 When Parsing Complex Formats

Such as dates in nonstandard formats or malformed JSON

How Power BI Signals Import Errors

In Power Query Editor, import errors are typically shown as:

Error icons in the preview cells
A warning message in the query results (“Error” link)
Red dotted underlines or warnings in applied steps
The “Load failed” message when refreshing

The first step in resolving errors is to examine the error details.

Viewing Error Details

When an error appears in Power Query:

Click the Error indicator in the cell or
Use View → Column quality / Column profile

You can also filter the column to show only error values by filtering on Errors.

Exam tip:
Power BI often shows technical error messages, so part of the task is interpreting what the underlying issue is (e.g., type mismatch, invalid format, null where not expected).

Common Import Errors & How to Fix Them

1. Type Conversion Errors

Scenario: A column expected to be numeric contains text such as "Unknown".

Fix Options:

Use Replace Errors to substitute a default value
Use Replace Values to convert specific text to numeric (e.g., "Unknown" → 0)
Adjust data type after cleaning

Key Idea: Always fix the root cause before changing the data type.

2. Unexpected Null Values

Scenario: A key column has nulls where values are required, causing subsequent transformations to fail.

Fix Options:

Replace nulls with default values via Replace Values
Remove rows where the column is null
Use conditional logic (Add Column → Conditional Column) to handle nulls appropriately

Key Idea: Nulls can break transformations (like merges) if not handled first.

3. Transformation Step Errors

Scenario: A transformation step refers to a column removed or renamed earlier in the applied steps.

Fix Options:

Review and reorder steps in the APPLIED STEPS pane
Rename the column consistently before referencing it
Delete the problematic step and reapply it correctly

Key Idea: Power BI applies steps sequentially. A downstream step can fail if an upstream change invalidates assumptions.

4. Merge/Append Structure Errors

Scenario: You merge or append tables that don’t share compatible column structures (e.g., mismatched data types).

Fix Options:

Ensure columns used for merger/join have identical data types
Rename or reorder columns to match structures
Preclean individual tables before combining

Key Idea: Always validate structure and types before merging or appending tables.

5. Parsing & Date Format Errors

Scenario: Date values import as text due to regional format differences (MM/DD/YYYY vs DD/MM/YYYY).

Fix Options:

Change the column data type to Date after validating format
Use Transform → Using Locale to define the correct regional format
Use Custom Columns to parse dates manually with Date.FromText

Key Idea: Locale-aware parsing helps resolve ambiguous date formats.

Tools to Help Diagnose Import Errors

Power BI provides several tools to help you locate and fix import errors:

🔍 Error Filtering

Filter columns to show only error rows.

📊 Column Quality / Distribution / Profile

Use profiling tools to identify patterns, nulls, and anomalies.

🧠 Step Validation

Hover over each Applied Step to see whether it is valid or failing.

📝 Advanced Editor

Review M code for logic errors or incorrect references.

Best Practices for Fixing Import Errors

1. Clean Before Converting Types
Always fix textual anomalies and nulls before assigning data types.

2. Avoid Hard-Coding Values
Replace problematic values using conditional logic or parameters for maintenance.

3. Inspect Impact of Each Step
Use the Applied Steps pane to ensure each transformation is valid.

4. Test Incrementally
Fix errors one at a time and refresh often to confirm success.

5. Document Assumptions
Add comments or descriptive step names to make logic clearer.

How This Appears on the PL-300 Exam

The exam commonly tests your ability to:

✔ Identify why a query fails (type mismatch, nulls, missing column)
✔ Choose the correct sequence to fix the issue
✔ Understand the difference between Replace Errors and Remove Errors
✔ Apply transformations in the correct order (clean → convert → transform)

Most questions are scenario-based, asking what action you would take next to successfully import data.

Key Exam Takeaways

Import errors can be caused by data type mismatches, unexpected nulls, invalid formats, and broken transformation steps.
Use Power Query tools to diagnose and resolve errors before loading data into the model.
Always understand the root cause before applying a fix.
Knowing how to use Replace Errors, Replace Values, Conditional Columns, and Data Type changes is essential.

Practice Questions

Go to the Practice Exam Questions for this topic.

Data Cleaning, Data Development, Data Engineering, Data Integration, Data Security, Data Strategy, Glossary of Data Terms January 10, 2026January 19, 2026

Glossary – 100 “Data Engineering” Terms

Below is a glossary that includes 100 common “Data Engineering” terms and phrases in alphabetical order. Enjoy!

Term	Definition & Example
Access Control	Managing who can access data. Example: Role-based permissions.
At-Least-Once Processing	Data may be processed more than once. Example: Duplicate-safe pipelines.
At-Most-Once Processing	Data processed zero or one time. Example: No retries on failure.
Backfill	Processing historical data. Example: Reloading last year’s data.
Batch Processing	Processing data in scheduled chunks. Example: Daily sales aggregation.
Blue-Green Deployment	Deployment strategy minimizing downtime. Example: Switching pipeline versions.
Canary Release	Gradual rollout to detect issues. Example: New pipeline tested on 5% of data.
Change Data Capture (CDC)	Capturing database changes. Example: Streaming updates from OLTP DB.
Checkpointing	Saving progress during processing. Example: Spark streaming checkpoints.
Cloud Storage	Scalable remote data storage. Example: Azure Data Lake Storage.
Cold Storage	Low-cost storage for infrequent access. Example: Archived logs.
Columnar Storage	Data stored by column instead of row. Example: Parquet files.
Compression	Reducing data size. Example: Gzip-compressed files.
Compute Engine	System performing data processing. Example: Spark cluster.
Consumption Layer	Data prepared for analytics. Example: Gold layer.
Cost Optimization	Reducing infrastructure costs. Example: Query optimization.
Curated Layer	Cleaned and transformed data. Example: Silver layer.
DAG (Directed Acyclic Graph)	Workflow structure with dependencies. Example: Airflow pipeline.
Data Catalog	Searchable inventory of data assets. Example: Azure Purview.
Data Contract	Agreement defining data structure and expectations. Example: Producer guarantees column names and types.
Data Engineering	The practice of designing, building, and maintaining data systems. Example: Creating pipelines that feed analytics dashboards.
Data Governance	Policies for data management and usage. Example: Access control rules.
Data Ingestion	Collecting data from source systems. Example: Ingesting API data hourly.
Data Lake	Centralized storage for raw data. Example: S3-based data lake.
Data Latency	Time delay in data availability. Example: 5-minute pipeline delay.
Data Lineage	Tracking data flow from source to output. Example: Source-to-dashboard trace.
Data Mart	Subset of warehouse for specific use. Example: Finance data mart.
Data Masking	Obscuring sensitive data. Example: Masked credit card numbers.
Data Mesh	Domain-oriented decentralized data ownership. Example: Teams own their data products.
Data Modeling	Designing data structures for usage. Example: Star schema design.
Data Observability	Monitoring data health and pipelines. Example: Freshness alerts.
Data Partition Pruning	Skipping irrelevant partitions. Example: Querying one date only.
Data Pipeline	An automated process that moves and transforms data. Example: Nightly ETL job from CRM to warehouse.
Data Platform	Integrated set of data tools. Example: End-to-end analytics stack.
Data Product	A dataset treated as a product. Example: Curated customer table.
Data Profiling	Analyzing data characteristics. Example: Value distributions.
Data Quality	Accuracy, completeness, and reliability of data. Example: No duplicate records.
Data Replay	Reprocessing historical events. Example: Rebuilding aggregates from logs.
Data Retention	Rules for data lifespan. Example: Delete logs after 1 year.
Data Security	Protecting data from unauthorized access. Example: Encryption at rest.
Data Serialization	Converting data for storage or transport. Example: Avro encoding.
Data Sink	The destination where data is stored. Example: Data warehouse.
Data Source	The origin of data. Example: ERP system, SaaS application.
Data Validation	Ensuring data meets expectations. Example: Null checks.
Data Versioning	Tracking dataset changes. Example: Snapshot tables.
Data Warehouse	Optimized storage for analytics queries. Example: Azure Synapse Analytics.
Dead Letter Queue (DLQ)	Storage for failed records. Example: Invalid messages routed for review.
Dimension Table	Table storing descriptive attributes. Example: Customer details.
ELT	Extract, Load, Transform approach. Example: Transforming data inside Snowflake.
ETL	Extract, Transform, Load process. Example: Cleaning data before loading into a database.
Event Time	Timestamp when event occurred. Example: User click time.
Event-Driven Architecture	Systems reacting to events in real time. Example: Trigger pipeline on file arrival.
Exactly-Once Processing	Ensuring data is processed only once. Example: Preventing duplicate events.
Fact Table	Table storing quantitative measures. Example: Order transactions.
Fault Tolerance	System resilience to failures. Example: Node failure recovery.
File Format	How data is stored on disk. Example: Parquet, CSV.
Foreign Key	Field linking tables together. Example: CustomerID in orders table.
Full Load	Reloading all data. Example: Initial table population.
High Availability	System uptime and reliability. Example: Multi-zone deployment.
Hot Storage	High-performance storage for frequent access. Example: Real-time tables.
Idempotency	Ability to rerun pipelines safely. Example: Reprocessing without duplicates.
Incremental Load	Loading only new or changed data. Example: CDC-based ingestion.
Indexing	Creating structures to speed queries. Example: Index on order date.
Infrastructure as Code (IaC)	Managing infrastructure via code. Example: Terraform scripts.
Lakehouse	Hybrid of data lake and warehouse. Example: Databricks Lakehouse.
Late-Arriving Data	Data that arrives after expected time. Example: Delayed event logs.
Logging	Recording system events. Example: Job execution logs.
Message Queue	Buffer for asynchronous data transfer. Example: Kafka topic for events.
Metadata	Data about data. Example: Table definitions and lineage.
Metrics	Quantitative indicators of performance. Example: Rows processed per run.
Orchestration	Coordinating pipeline execution. Example: DAG scheduling.
Partitioning	Dividing data for performance. Example: Partitioning by date.
Personally Identifiable Information (PII)	Data identifying individuals. Example: Email addresses.
Pipeline Monitoring	Tracking pipeline execution status. Example: Failure notifications.
Primary Key	Unique identifier for a record. Example: CustomerID.
Processing Time	Timestamp when data is processed. Example: Ingestion time.
Query Optimization	Improving query efficiency. Example: Predicate pushdown.
Raw Layer	Storage of unprocessed data. Example: Bronze layer.
Real-Time Data	Data available with minimal latency. Example: Live dashboard updates.
Retry Logic	Automatic reruns on failure. Example: Retry failed ingestion job.
Scalability	Ability to handle growing workloads. Example: Auto-scaling clusters.
Scheduler	Tool managing execution timing. Example: Cron, Airflow.
Schema	The structure of a dataset. Example: Table columns and data types.
Schema Evolution	Handling schema changes over time. Example: Adding new columns safely.
Secrets Management	Secure handling of credentials. Example: Key Vault for passwords.
Semi-Structured Data	Data with flexible schema. Example: JSON, Parquet.
Serverless	Infrastructure managed by provider. Example: Serverless SQL pools.
Serving Layer	Layer optimized for consumption. Example: BI-ready tables.
Sharding	Distributing data across nodes. Example: User data split across servers.
Snowflake Schema	Normalized version of star schema. Example: Product broken into sub-dimensions.
Star Schema	Fact table surrounded by dimensions. Example: Sales fact with date dimension.
Stream Processing	Processing data in real time. Example: Clickstream event processing.
Structured Data	Data with a fixed schema. Example: SQL tables.
Technical Debt	Long-term cost of quick fixes. Example: Hardcoded transformations.
Throughput	Amount of data processed per unit time. Example: Records per second.
Transformation Layer	Layer where business logic is applied. Example: dbt models.
Unstructured Data	Data without a predefined structure. Example: Images, PDFs.
Watermark	Marker for processed data. Example: Last processed timestamp.
Windowing	Grouping stream data by time windows. Example: 5-minute aggregations.
Workload Isolation	Separating workloads to avoid contention. Example: Dedicated compute pools.

Please share your suggestions for any terms that should be added.

Columns are the foundation of data analysis in Power BI. The ability to create new columns and transform existing ones is essential for shaping your dataset into a structure that supports meaningful insights and accurate reports.

In the PL-300 exam, Microsoft tests not only whether you can perform transformations but also whether you understand when and why to apply them.

Why Create and Transform Columns?

Where Column Transformations Happen

Common Column Transformations

1. Basic Transformations

2. Splitting and Merging Columns

3. Extracting Components

4. Calculations Using “Add Column”

Text Transformations

Date and Time Transformations

Conditional and Custom Columns

Conditional Columns

Custom Columns

Column Transformations vs DAX Calculated Columns

Power Query Column

DAX Calculated Column

Best Practices for the Exam

How This Appears on the PL-300 Exam

Key Takeaways

Practice Questions

What Are Data Import Errors?

Where Import Errors Occur

🧩 During Data Type Conversion

🧩 In Applied Steps

🧩 While Combining Queries

🧩 When Parsing Complex Formats

How Power BI Signals Import Errors

Viewing Error Details

Common Import Errors & How to Fix Them

1. Type Conversion Errors

2. Unexpected Null Values

3. Transformation Step Errors

4. Merge/Append Structure Errors

5. Parsing & Date Format Errors

Tools to Help Diagnose Import Errors

🔍 Error Filtering

📊 Column Quality / Distribution / Profile

🧠 Step Validation

📝 Advanced Editor

Best Practices for Fixing Import Errors

How This Appears on the PL-300 Exam

Key Exam Takeaways

Practice Questions

Information and resources for the data professionals' community