Identify and resolve notebook errors (DP-700 Exam Prep)

This post is a part of the DP-700: Implementing Data Engineering Solutions Using Microsoft Fabric Exam Prep Hub.
This topic falls under these sections:
Monitor and optimize an analytics solution (30–35%)
   --> Identify and resolve errors
      --> Identify and resolve notebook errors


Note that there are 10 practice questions (with answers) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Notebook troubleshooting is an important skill for the DP-700 certification exam because notebooks are one of the primary tools used for data ingestion, transformation, orchestration, machine learning, and advanced analytics in Microsoft Fabric. Data engineers must be able to quickly identify failures, interpret error messages, diagnose root causes, and implement corrective actions.

This topic focuses on understanding notebook execution, common notebook errors, monitoring tools, debugging techniques, and best practices for building reliable notebook solutions in Microsoft Fabric.


Understanding Notebooks in Microsoft Fabric

A notebook is an interactive development environment that allows engineers to write and execute code using:

  • PySpark
  • Spark SQL
  • Scala
  • Python
  • R (where supported)

Fabric notebooks run on Spark clusters and are commonly used to:

  • Ingest data into Lakehouses
  • Transform data
  • Build ETL processes
  • Execute streaming workloads
  • Perform data quality checks
  • Orchestrate complex data engineering workflows

Because notebooks often process large datasets and depend on external systems, failures are inevitable. Effective troubleshooting is therefore a critical data engineering skill.


Common Categories of Notebook Errors

Notebook failures generally fall into several categories:

Syntax Errors

These occur when code violates language rules.

Example

df = spark.read.csv("/Files/data.csv"

Error:

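SyntaxError: unexpected EOF while parsing

(The exact wording depends on the Python version. Newer versions may instead report "'(' was never closed" or "incomplete input".)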

Cause:

  • Missing closing parenthesis

Resolution:

  • Review code carefully
  • Use notebook syntax highlighting
  • Validate code before execution
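
One way to act on "validate code before execution" is to parse a snippet without running it. The sketch below uses Python's standard ast module. The helper name check_syntax is hypothetical, and the error text it returns varies by Python version.

```python
import ast
from typing import Optional


def check_syntax(source: str) -> Optional[str]:
    """Return None if the source parses, otherwise a short error description."""
    try:
        ast.parse(source)  # parses only; nothing is executed
    except SyntaxError as err:
        return f"line {err.lineno}: {err.msg}"
    return None


broken = 'df = spark.read.csv("/Files/data.csv"'
fixed = 'df = spark.read.csv("/Files/data.csv")'

print(check_syntax(broken))  # a message describing the unclosed parenthesis
print(check_syntax(fixed))   # None
```

Because the code is only parsed, the undefined name spark does not matter here; only the missing parenthesis is reported.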

Runtime Errors

Runtime errors occur when code is syntactically correct but fails during execution.

Example

value = 100 / 0

Error:

ZeroDivisionError

Cause:

  • Division by zero

Resolution:

  • Add validation logic
  • Implement exception handling
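
The two resolutions above can be sketched side by side. This is a minimal illustration in plain Python; the function names safe_ratio and ratio_or_default are hypothetical, and returning None or a default is only one possible policy.

```python
from typing import Optional


def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Validation logic: check the input and return None instead of dividing by zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def ratio_or_default(numerator: float, denominator: float, default: float = 0.0) -> float:
    """Exception handling: attempt the division and fall back to a default on failure."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return default


print(safe_ratio(100, 0))        # None
print(safe_ratio(100, 4))        # 25.0
print(ratio_or_default(100, 0))  # 0.0
```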

Data Access Errors

These are among the most common notebook failures.

Examples

  • File not found
  • Table not found
  • Permission denied
  • Invalid storage path

Example:

df = spark.read.parquet("/Files/Sales2025")

Error:

Path does not exist

Possible causes:

  • Incorrect path
  • Deleted file
  • Typographical error
  • Missing shortcut

Resolution:

  • Verify file location
  • Confirm OneLake shortcut configuration
  • Check permissions
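
Verifying a file location can be partly automated. The sketch below is a hypothetical helper, diagnose_path, that checks whether a path exists and, if not, suggests similarly named entries in the same folder to catch typographical errors. It uses only local file APIs and a temporary folder for the demonstration; it assumes the Lakehouse files are reachable through a local mount (in Fabric this is typically a path under /lakehouse/default/Files when a default Lakehouse is attached, but treat that as an assumption for your environment).

```python
import difflib
import tempfile
from pathlib import Path


def diagnose_path(path: str) -> str:
    """Describe why a path cannot be read, with a typo hint when possible."""
    target = Path(path)
    if target.exists():
        return f"OK: {target} exists"
    parent = target.parent
    if not parent.exists():
        return f"Parent folder missing: {parent}"
    # Look for similarly named siblings to catch typographical errors
    siblings = [entry.name for entry in parent.iterdir()]
    close = difflib.get_close_matches(target.name, siblings, n=3)
    hint = f" Did you mean: {', '.join(close)}?" if close else ""
    return f"Not found: {target}.{hint}"


# Demonstration against a temporary folder standing in for the Lakehouse
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "Sales_2025").mkdir()
    message = diagnose_path(f"{root}/Sales2025")

print(message)
```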

Authentication and Authorization Errors

A notebook may be unable to access resources because the user or service principal lacks required permissions.

Examples:

Access Denied
Unauthorized
Permission denied

Common causes:

  • Workspace role limitations
  • Missing Lakehouse permissions
  • Source-system authentication failures

Resolution:

  • Verify workspace access
  • Confirm security settings
  • Validate credentials

Spark Resource Errors

Spark jobs require compute resources.

Failures may occur because of:

  • Insufficient memory
  • Driver overload
  • Executor failures
  • Large shuffle operations

Typical errors:

OutOfMemoryError
ExecutorLostFailure
Driver memory exceeded

Resolution:

  • Increase Spark resources
  • Optimize queries
  • Partition data appropriately
  • Reduce data movement
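
As a rough illustration of "partition data appropriately", a partition count can be estimated from the data size and a target size per partition. The 128 MB target below is a common rule of thumb rather than a Fabric recommendation, and the function name suggested_partitions is hypothetical.

```python
import math


def suggested_partitions(total_bytes: int, target_bytes: int = 128 * 1024**2) -> int:
    """Estimate how many partitions keep each one near the target size."""
    if total_bytes < 0 or target_bytes <= 0:
        raise ValueError("sizes must be positive")
    return max(1, math.ceil(total_bytes / target_bytes))


ten_gib = 10 * 1024**3
print(suggested_partitions(ten_gib))  # 80
```

In a notebook, a number like this could be passed to df.repartition(n) before a heavy write or join, keeping in mind that repartitioning itself triggers a shuffle.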

Dependency Errors

Notebook code may depend on external packages.

Example:

import pandas_profiling

Error:

ModuleNotFoundError

Cause:

  • Package not installed

Resolution:

  • Install required libraries
  • Use supported package versions
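
A notebook can check its dependencies up front and fail with a clear message instead of a ModuleNotFoundError halfway through a run. The sketch below uses the standard importlib module; the helper name missing_packages and the package names in the demonstration are hypothetical. How a missing package then gets installed depends on the workspace setup and is not shown.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is deliberately not installed
required = ["json", "package_that_is_not_installed_xyz"]
missing = missing_packages(required)

if missing:
    print(f"Install before running: {', '.join(missing)}")
```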

Monitoring Notebook Execution

Fabric provides several methods for monitoring notebook runs.

Notebook Run Status

Execution status may show:

  • Running
  • Completed
  • Failed
  • Cancelled

A failed run should always be investigated using execution logs.


Cell-Level Error Analysis

Notebook failures typically identify:

  • Failed cell
  • Error type
  • Line number
  • Stack trace

Example:

Cell 8 failed
AnalysisException
Table not found

This information significantly narrows troubleshooting efforts.
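
The same fields (error type, message, line number) can be pulled out of a Python exception programmatically, which is useful when writing errors to a log. This is a minimal sketch using the standard traceback module; summarize_exception and load_table are hypothetical names, and load_table simply simulates a failure.

```python
import traceback


def summarize_exception(exc: BaseException) -> dict:
    """Extract the error type, message, and failing location from an exception."""
    frames = traceback.extract_tb(exc.__traceback__)
    last = frames[-1] if frames else None  # innermost frame, where the error was raised
    return {
        "error_type": type(exc).__name__,
        "message": str(exc),
        "function": last.name if last else None,
        "line": last.lineno if last else None,
    }


def load_table(name: str):
    # Stand-in for a real read; always fails for the demonstration
    raise LookupError(f"Table {name} not found")


try:
    load_table("customer_dim")
except Exception as exc:
    summary = summarize_exception(exc)

print(summary)
```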


Spark Job Monitoring

Fabric allows engineers to inspect Spark jobs generated by notebook execution.

Useful information includes:

  • Job duration
  • Task failures
  • Stage failures
  • Resource utilization
  • Data shuffle activity

This information is particularly valuable for performance-related issues.


Reading Spark Error Messages

One of the most important DP-700 skills is interpreting Spark exceptions.

AnalysisException

Example:

AnalysisException:
Table customer_dim not found

Cause:

  • Missing table
  • Incorrect table name
  • Incorrect Lakehouse attachment

Resolution:

  • Verify table existence
  • Check notebook Lakehouse context
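
Table existence can be verified before a query runs. The sketch below relies on spark.catalog.tableExists, which is available in PySpark 3.3 and later. The helper name missing_tables is hypothetical, and the fake_spark object is only a stand-in so the example runs outside a Spark session; in a notebook you would pass the real spark session instead.

```python
from types import SimpleNamespace


def missing_tables(spark, table_names):
    """Return the table names that the session's catalog cannot find."""
    return [name for name in table_names if not spark.catalog.tableExists(name)]


# Stand-in for a SparkSession whose attached Lakehouse only contains sales_fact
existing = {"sales_fact"}
fake_spark = SimpleNamespace(
    catalog=SimpleNamespace(tableExists=lambda name: name in existing)
)

not_found = missing_tables(fake_spark, ["sales_fact", "customer_dim"])
print(not_found)  # ['customer_dim']
```

If every expected table is reported missing, the attached Lakehouse is a more likely culprit than the individual table names.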

FileNotFoundException

Example:

FileNotFoundException

Cause:

  • Missing file
  • Incorrect path

Resolution:

  • Validate storage path
  • Confirm file availability

OutOfMemoryError

Example:

java.lang.OutOfMemoryError: Java heap space

Cause:

  • Dataset too large
  • Inefficient transformations

Resolution:

  • Optimize Spark processing
  • Use partitioning
  • Increase cluster resources

NullPointerException

Cause:

  • Unexpected null values
  • Missing objects

Resolution:

  • Validate inputs
  • Add null handling
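
Null handling usually means deciding explicitly what a function returns for a missing value. The sketch below shows a null-safe cleaning function in plain Python; the name clean_customer_name is hypothetical. Note that in Python code the equivalent failure typically surfaces as an AttributeError or TypeError on None rather than a JVM NullPointerException.

```python
from typing import Optional


def clean_customer_name(value: Optional[str]) -> Optional[str]:
    """Trim and upper-case a name, passing missing or blank values through as None."""
    if value is None:
        return None
    cleaned = value.strip()
    return cleaned.upper() if cleaned else None


rows = ["  alice ", None, "", "Bob"]
cleaned_rows = [clean_customer_name(value) for value in rows]
print(cleaned_rows)  # ['ALICE', None, None, 'BOB']
```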

Debugging Techniques

Execute Incrementally

Rather than running an entire notebook:

  1. Run cells individually
  2. Verify outputs
  3. Isolate failures

This approach greatly reduces troubleshooting time.


Inspect Intermediate Results

Example:

df.show()

or

display(df)

Benefits:

  • Verify schema
  • Validate transformations
  • Detect null values
  • Confirm expected row counts

Check Schemas

Schema mismatches are a common source of errors.

Example:

df.printSchema()

Verify:

  • Column names
  • Data types
  • Nullable settings
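
A schema check can also be automated by comparing the actual columns and types against an expected definition. The sketch below works on plain dictionaries of column name to type string; in PySpark, dict(df.dtypes) produces a dictionary of that shape. The function name schema_diff and the sample columns are hypothetical, and nullability is not covered here (it is available per field through df.schema.fields).

```python
def schema_diff(expected: dict, actual: dict) -> dict:
    """Compare two {column: type} mappings and report the differences."""
    shared = set(expected) & set(actual)
    return {
        "missing": sorted(set(expected) - set(actual)),
        "unexpected": sorted(set(actual) - set(expected)),
        "type_changed": sorted(col for col in shared if expected[col] != actual[col]),
    }


expected = {"order_id": "bigint", "amount": "double", "order_date": "date"}
actual = {"order_id": "bigint", "amount": "string", "channel": "string"}

diff = schema_diff(expected, actual)
print(diff)
# {'missing': ['order_date'], 'unexpected': ['channel'], 'type_changed': ['amount']}
```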

Validate Row Counts

Example:

df.count()

Useful for identifying:

  • Missing records
  • Unexpected filtering
  • Data quality issues
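
Row counts become more useful when source and target are compared against a tolerance. The sketch below takes two counts (for example, the results of df.count() before and after a transformation) and returns a short verdict. The function name row_count_check and the tolerance parameter are hypothetical.

```python
def row_count_check(source_count: int, target_count: int, max_loss_pct: float = 0.0) -> str:
    """Compare row counts and describe any loss or unexpected growth."""
    if source_count < 0 or target_count < 0:
        raise ValueError("row counts must be non-negative")
    if source_count == 0:
        return "ok" if target_count == 0 else "target has rows but source is empty"
    loss_pct = (source_count - target_count) / source_count * 100
    if loss_pct < 0:
        return f"target has {target_count - source_count} more rows than source"
    if loss_pct > max_loss_pct:
        return f"lost {loss_pct:.1f}% of rows"
    return "ok"


print(row_count_check(1000, 1000))                  # ok
print(row_count_check(1000, 900))                   # lost 10.0% of rows
print(row_count_check(1000, 990, max_loss_pct=5))   # ok
print(row_count_check(100, 120))                    # target has 20 more rows than source
```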

Exception Handling

PySpark notebooks can implement error handling using Python exceptions.

Example:

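try:
    df = spark.read.parquet(path)
except Exception as e:
    # Log the error, then re-raise so the run is still reported as failed
    print(f"Failed to read {path}: {e}")
    raise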

Benefits:

  • Graceful failure handling
  • Better logging
  • Easier troubleshooting

Logging Best Practices

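Instead of relying solely on ad hoc notebook output, log each stage of the notebook consistently, ideally as structured records that can be stored and queried. At a minimum, emit a progress message at each stage.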

Example:

print("Starting ingestion...")
print("Reading source data...")
print("Writing destination table...")

Benefits:

  • Easier root-cause analysis
  • Better operational monitoring
  • Faster issue resolution
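
Going one step beyond print statements, Python's standard logging module can emit one structured record per event. The sketch below is one possible setup, not a Fabric-specific API: the JsonFormatter class, the logger name, and the stage field are all hypothetical, and an in-memory buffer stands in for wherever the logs would actually be stored.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "stage": getattr(record, "stage", "unknown"),
            "message": record.getMessage(),
        })


buffer = io.StringIO()  # stand-in sink; a real notebook might write elsewhere
handler = logging.StreamHandler(buffer)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("ingestion_notebook")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep records out of the root logger's output

logger.info("Starting ingestion", extra={"stage": "start"})
logger.info("Reading source data", extra={"stage": "read"})
logger.error("Writing destination table failed", extra={"stage": "write"})

records = [json.loads(line) for line in buffer.getvalue().splitlines()]
print(records[-1]["level"], records[-1]["stage"])  # ERROR write
```

Because every record carries the same fields, the output can be filtered by level or stage once it has been loaded into a table.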

Many organizations write logs to:

  • Lakehouse tables
  • Monitoring databases
  • Log Analytics environments

Notebook Failures in Pipelines

Many Fabric notebooks are executed through Data Pipelines.

When a notebook activity fails, pipeline monitoring provides:

  • Activity status
  • Error messages
  • Execution duration
  • Retry history

Common troubleshooting process:

  1. Identify failed activity
  2. Open notebook run details
  3. Review Spark logs
  4. Identify root cause
  5. Correct notebook logic

Common Production Notebook Issues

Lakehouse Not Attached

Symptoms:

Table not found

Resolution:

  • Attach correct Lakehouse

Schema Drift

Symptoms:

  • New columns appear
  • Data types change

Resolution:

  • Add schema validation logic
  • Handle schema evolution

Large Data Volumes

Symptoms:

  • Slow execution
  • Memory failures

Resolution:

  • Optimize partitions
  • Filter data earlier
  • Reduce shuffle operations

Missing Upstream Data

Symptoms:

File not found

Resolution:

  • Verify ingestion completion
  • Add dependency checks
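
A dependency check can run at the top of a notebook and stop early with one clear message when upstream data is not ready. The sketch below uses local file APIs and a temporary folder for the demonstration; the helper name require_inputs and the file names are hypothetical, and it assumes the input files are reachable through a local path.

```python
import tempfile
from pathlib import Path


def require_inputs(paths):
    """Raise one clear error listing every missing or empty input file."""
    problems = []
    for path in map(Path, paths):
        if not path.exists():
            problems.append(f"missing: {path}")
        elif path.is_file() and path.stat().st_size == 0:
            problems.append(f"empty: {path}")
    if problems:
        raise FileNotFoundError("Upstream data not ready -> " + "; ".join(problems))


with tempfile.TemporaryDirectory() as root:
    ready = Path(root) / "orders.csv"
    ready.write_text("order_id,amount\n1,9.99\n")
    late = Path(root) / "customers.csv"  # never written by the upstream step
    try:
        require_inputs([ready, late])
        outcome = "ready"
    except FileNotFoundError as err:
        outcome = str(err)

print(outcome)
```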

Notebook Optimization to Prevent Errors

Proactive optimization reduces future failures.

Best practices include:

  • Use partition pruning
  • Cache only when necessary
  • Avoid excessive collect() operations
  • Filter data early
  • Use Delta tables
  • Monitor Spark resource usage
  • Implement retry logic where appropriate
  • Validate input datasets before processing
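
Retry logic is worth sketching because it is easy to get wrong. The example below retries only on errors that are plausibly transient and doubles the wait after each failure. The function name with_retries, the chosen exception types, and the delays are assumptions for illustration; the flaky_read function simulates a source that succeeds on the third call.

```python
import time


def with_retries(action, attempts=3, base_delay=1.0,
                 retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call action(); on a listed transient error, wait and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on as err:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            delay = base_delay * 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            print(f"attempt {attempt} failed ({err}); retrying in {delay:.0f}s")
            sleep(delay)


calls = {"count": 0}


def flaky_read():
    # Simulated source that fails twice before succeeding
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("source temporarily unavailable")
    return "loaded"


# The sleep is replaced with a no-op so the demonstration finishes instantly
result = with_retries(flaky_read, sleep=lambda seconds: None)
print(result, calls["count"])  # loaded 3
```

Errors such as a missing table or a syntax error are not transient, so they are deliberately left out of retry_on and fail on the first attempt.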

Exam Tips

For the DP-700 exam, remember:

  • AnalysisException usually indicates missing tables, views, or schema issues.
  • FileNotFoundException typically indicates invalid paths or missing files.
  • OutOfMemoryError often indicates resource constraints or inefficient Spark processing.
  • Notebook debugging frequently involves reviewing Spark logs and cell-level errors.
  • Lakehouse attachment problems commonly cause table-access failures.
  • Pipelines provide monitoring information when notebook activities fail.
  • Exception handling and logging improve operational reliability.
  • Schema validation helps prevent runtime failures caused by schema drift.
  • Spark monitoring tools help diagnose performance and execution problems.
  • Resource optimization can prevent many notebook failures before they occur.

Practice Exam Questions

Question 1

A Fabric notebook fails with the following error:

AnalysisException: Table sales_fact not found

What is the MOST likely cause?

A. Spark cluster memory exhaustion

B. The referenced table does not exist or the wrong Lakehouse is attached

C. Network connectivity failure

D. Missing Python package

Correct Answer: B

Explanation:
AnalysisException commonly occurs when a referenced table, view, or schema object cannot be found. An incorrect Lakehouse attachment is also a frequent cause.


Question 2

A notebook fails with a FileNotFoundException when reading a parquet file.

What should be investigated first?

A. Spark executor configuration

B. Notebook language version

C. Storage path and file existence

D. Semantic model refresh history

Correct Answer: C

Explanation:
FileNotFoundException generally indicates an incorrect path, deleted file, missing shortcut, or unavailable source file.


Question 3

Which tool provides the MOST detailed information about Spark stage failures and executor issues?

A. Semantic model refresh history

B. Power BI usage metrics

C. Workspace role assignments

D. Spark job monitoring details

Correct Answer: D

Explanation:
Spark monitoring provides insight into jobs, stages, tasks, executor failures, and resource utilization.


Question 4

A notebook consistently fails due to Java heap space errors.

What is the MOST likely root cause?

A. Lakehouse attachment issue

B. Missing notebook parameter

C. Insufficient memory for the workload

D. Authentication failure

Correct Answer: C

Explanation:
Java heap space errors typically indicate memory pressure caused by large datasets or inefficient Spark operations.


Question 5

Which practice is MOST useful for isolating the source of a notebook failure?

A. Executing the entire notebook repeatedly

B. Running notebook cells individually and validating outputs

C. Increasing semantic model refresh frequency

D. Deleting Spark logs

Correct Answer: B

Explanation:
Executing cells incrementally helps identify exactly where a failure occurs and simplifies troubleshooting.


Question 6

A notebook references a Python package that is unavailable in the Spark environment.

Which error is MOST likely?

A. ModuleNotFoundError

B. AnalysisException

C. FileNotFoundException

D. TimeoutException

Correct Answer: A

Explanation:
ModuleNotFoundError occurs when required libraries or dependencies are unavailable.


Question 7

Which technique helps detect schema drift before downstream failures occur?

A. Increasing cluster size

B. Restarting the Spark session

C. Validating schemas during ingestion and transformation

D. Disabling logging

Correct Answer: C

Explanation:
Schema validation identifies unexpected columns, missing fields, or data type changes before they impact processing.


Question 8

A notebook activity fails within a Fabric pipeline.

Where should an engineer typically begin troubleshooting?

A. Power BI report usage metrics

B. Semantic model refresh schedule

C. Workspace branding settings

D. Pipeline activity run details and notebook execution logs

Correct Answer: D

Explanation:
Pipeline activity logs provide error messages, execution status, duration, and links to notebook execution details.


Question 9

Which action can help reduce the likelihood of OutOfMemoryError exceptions?

A. Using partition pruning and filtering data early

B. Disabling Spark monitoring

C. Removing notebook logging

D. Creating additional semantic models

Correct Answer: A

Explanation:
Reducing data volume processed by Spark lowers memory requirements and improves execution efficiency.


Question 10

Why should exception handling be implemented in production notebooks?

A. To eliminate all Spark errors

B. To increase Lakehouse storage capacity

C. To improve report rendering speed

D. To capture errors gracefully and improve troubleshooting

Correct Answer: D

Explanation:
Exception handling enables controlled failure behavior, better logging, easier diagnosis, and more resilient notebook execution.


