DP-700 Practice Exam #2 (30 questions with answers)


Question 1

A company plans to ingest customer data from Azure Data Lake Storage Gen2 into a Fabric Lakehouse. The source data changes daily and must be copied automatically.

Which Fabric component should perform the data movement?

A. Data Pipeline
B. KQL Queryset
C. Semantic Model
D. Warehouse View

Answer: A

Explanation

Data Pipelines are designed to orchestrate and automate data movement between sources and destinations.


Question 2

You are designing a medallion architecture.

Which layer should contain data that has been standardized, validated, and enriched but is not yet optimized for business reporting?

A. Gold
B. Bronze
C. Silver
D. Semantic

Answer: C

Explanation

The Silver layer contains cleansed and transformed data that serves as an intermediate layer between raw and business-ready data.


Question 3

Which THREE actions can be performed using Dataflow Gen2?

(Choose three.)

A. Apply Power Query transformations
B. Join datasets from multiple sources
C. Create streaming windows on IoT events
D. Filter rows before loading data

Answers: A, B, D

Explanation

Dataflow Gen2 supports Power Query-based transformations including filtering, joining, and shaping data. Streaming windows are typically handled through Eventstreams, KQL, or Spark Structured Streaming.


Question 4

Match each Fabric item with its primary workload.

Fabric Item       Workload
1. Warehouse      A. Real-time analytics
2. Eventhouse     B. Relational analytics
3. Eventstream    C. Event ingestion

Answer

  • 1 → B
  • 2 → A
  • 3 → C

Explanation

Warehouses support relational analytics, Eventhouses support real-time analytics, and Eventstreams handle event ingestion.


Question 5

Fill in the blank.

A OneLake __________ allows data to be referenced from another location without physically copying the data.

Answer

shortcut

Explanation

Shortcuts provide virtual access to data while avoiding duplication.


Question 6

A Fabric data engineer wants to create a Spark DataFrame from a Delta table.

Which language is most commonly used to do this in a Fabric notebook?

A. DAX
B. MDX
C. PySpark
D. Power Query M

Answer: C

Explanation

PySpark is the most common language used in Fabric notebooks for Spark processing.


Question 7

A table contains duplicate customer records.

Which Spark operation is most appropriate?

A. cache()
B. dropDuplicates()
C. repartition()
D. collect()

Answer: B

Explanation

dropDuplicates() removes duplicate rows from a DataFrame.
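
As a rough illustration of what removing duplicates means, here is a minimal plain-Python sketch. It is not the Spark API: the helper drop_duplicates and the sample customer rows are hypothetical, and the helper keeps the first row it sees for each key, whereas Spark does not guarantee which duplicate survives. In PySpark the comparable calls would be df.dropDuplicates() and df.dropDuplicates(["customer_id"]).

```python
def drop_duplicates(rows, subset=None):
    """Keep the first row seen for each distinct key.

    rows   -- list of dicts, one per record
    subset -- column names that define a duplicate; None means all columns
    """
    seen = set()
    unique = []
    for row in rows:
        cols = subset if subset is not None else sorted(row)
        key = tuple(row[c] for c in cols)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data, for illustration only.
customers = [
    {"customer_id": 1, "name": "Ada", "city": "London"},
    {"customer_id": 2, "name": "Grace", "city": "New York"},
    {"customer_id": 1, "name": "Ada", "city": "London"},    # exact duplicate
    {"customer_id": 2, "name": "Grace", "city": "Boston"},  # same id, new city
]

exact = drop_duplicates(customers)                           # compares every column
by_id = drop_duplicates(customers, subset=["customer_id"])  # compares the id only
```

Comparing every column removes only the exact copy, while comparing on customer_id alone also collapses the two records that differ by city. Which behavior is wanted depends on what counts as a duplicate customer.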


Question 8

A company wants to analyze machine telemetry arriving every second.

Which solution is most appropriate?

A. Dataflow Gen2
B. Warehouse
C. Eventhouse
D. SQL Analytics Endpoint

Answer: C

Explanation

Eventhouse is optimized for high-volume streaming and telemetry analytics.


Question 9

You need to aggregate website clicks into five-minute windows.

Which technology is best suited?

A. Eventstream alone
B. Structured Streaming window functions
C. OneLake Shortcut
D. Semantic Model

Answer: B

Explanation

Window functions in Structured Streaming are designed specifically for time-based aggregations.
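
The idea behind a five-minute window can be sketched in plain Python, assuming non-overlapping (tumbling) windows aligned to the clock. The names window_start and count_clicks and the click timestamps are hypothetical. In PySpark the grouping would typically be expressed with pyspark.sql.functions.window on a timestamp column and a "5 minutes" duration; the sketch below only shows the bucketing logic.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1)


def window_start(ts, window=WINDOW):
    """Floor a timestamp to the start of the fixed-size window containing it."""
    whole_windows = (ts - EPOCH) // window  # number of complete windows since the epoch
    return EPOCH + whole_windows * window


def count_clicks(timestamps):
    """Count events per five-minute window, keyed by window start."""
    return Counter(window_start(ts) for ts in timestamps)


# Hypothetical click timestamps, for illustration only.
clicks = [
    datetime(2024, 1, 1, 9, 0, 30),
    datetime(2024, 1, 1, 9, 4, 59),
    datetime(2024, 1, 1, 9, 5, 0),
    datetime(2024, 1, 1, 9, 12, 10),
]

counts = count_clicks(clicks)
```

A real streaming job also has to decide how long to wait for late events before a window is finalized, which this batch-style sketch leaves out.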


Question 10

Which statement about Delta Lake is correct?

A. Delta tables support ACID transactions.
B. Delta tables cannot be queried through SQL.
C. Delta tables require Eventhouse.
D. Delta tables are read-only.

Answer: A

Explanation

Delta Lake provides ACID transaction support and is queryable through SQL, Spark, and Fabric workloads.


Question 11

A data engineer needs to query real-time events using KQL.

Which Fabric item should store the data?

A. Dataflow Gen2
B. Semantic Model
C. Eventhouse
D. Notebook

Answer: C

Explanation

Eventhouse stores data for KQL-based analytics.


Question 12

Which TWO advantages does Direct Lake provide?

(Choose two.)

A. Near-import performance
B. No requirement for OneLake
C. Reduced data duplication
D. Requires continuous ETL refreshes

Answers: A, C

Explanation

Direct Lake reads Delta tables directly from OneLake, which gives near-import query performance without copying the data into the semantic model.


Question 13

You need to troubleshoot a failed Spark notebook execution.

Where should you review execution logs first?

A. Capacity Metrics App
B. Spark Monitoring
C. Semantic Model Refresh History
D. Eventstream Destination Settings

Answer: B

Explanation

Spark Monitoring provides execution details, stages, and error information.


Question 14

A Fabric Warehouse query frequently filters by ProductCategory.

What optimization technique may reduce scanning?

A. Partitioning
B. Removing statistics
C. Converting all values to VARCHAR(MAX)
D. Disabling caching

Answer: A

Explanation

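Partitioning can reduce the amount of data scanned, because a query that filters on the partition column can skip partitions that cannot contain matching rows.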


Question 15

Match each KQL operator with its function.

Operator        Function
1. where        A. Create calculated column
2. summarize    B. Aggregate results
3. extend       C. Filter rows

Answer

  • 1 → C
  • 2 → B
  • 3 → A

Explanation

where filters rows, summarize aggregates data, and extend creates calculated columns.
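
For readers more familiar with Python than KQL, the three operators can be mirrored with list comprehensions and a dictionary. This is only an analogy, not KQL itself; the events list and its column names are hypothetical.

```python
from collections import defaultdict

# Hypothetical telemetry rows, for illustration only.
events = [
    {"device": "a", "status": "ok", "bytes": 1200},
    {"device": "a", "status": "error", "bytes": 300},
    {"device": "b", "status": "ok", "bytes": 800},
    {"device": "a", "status": "ok", "bytes": 500},
]

# where: keep only the rows that satisfy a predicate
filtered = [e for e in events if e["status"] == "ok"]

# extend: add a calculated column to every remaining row
extended = [{**e, "kilobytes": e["bytes"] / 1000} for e in filtered]

# summarize: aggregate the rows by a grouping key
totals = defaultdict(float)
for e in extended:
    totals[e["device"]] += e["kilobytes"]
totals = dict(totals)
```

The order matters in the same way it does in a KQL pipeline: filtering first means the calculated column and the aggregation only touch the rows that survive the filter.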


Question 16

Which feature allows querying historical versions of Delta tables?

A. Mirroring
B. Time Travel
C. DirectQuery
D. Event Processing

Answer: B

Explanation

Time Travel enables access to previous Delta table versions.


Question 17

A company requires event enrichment by joining streaming data with reference data.

Which technology should be used?

A. Structured Streaming
B. Dataflow Gen2
C. Warehouse Views
D. Semantic Relationships

Answer: A

Explanation

Structured Streaming supports stream-static joins.
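
A stream-static join can be pictured as looking up each arriving event in a reference table that was loaded once. The plain-Python sketch below assumes an inner join on a hypothetical device_id key; DEVICE_CATALOG, enrich, and event_source are illustrative names, and the generator merely stands in for an unbounded stream.

```python
# Static reference data, loaded once (hypothetical device catalog).
DEVICE_CATALOG = {
    "d1": {"site": "Plant A", "model": "X100"},
    "d2": {"site": "Plant B", "model": "X200"},
}


def enrich(events, reference):
    """Yield each incoming event merged with its matching reference row."""
    for event in events:
        ref = reference.get(event["device_id"])
        if ref is not None:  # inner join: events without a match are dropped
            yield {**event, **ref}


def event_source():
    """Stand-in for a stream; a real source would keep producing events."""
    yield {"device_id": "d1", "temp": 71.5}
    yield {"device_id": "d9", "temp": 64.0}  # unknown device, no reference row
    yield {"device_id": "d2", "temp": 80.2}


enriched = list(enrich(event_source(), DEVICE_CATALOG))
```

Because the reference side is static, each event can be enriched as soon as it arrives, with no need to buffer or wait for the other side of the join.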


Question 18

Which Fabric feature enables near real-time movement of streaming data from sources to destinations?

A. Warehouse
B. Semantic Model
C. Eventstream
D. Dataflow Gen2

Answer: C

Explanation

Eventstreams route and process streaming events.


Question 19

You need to monitor workspace-wide execution history across notebooks, pipelines, and dataflows.

Which tool should you use?

A. Spark UI
B. Monitoring Hub
C. Warehouse Explorer
D. Notebook View

Answer: B

Explanation

Monitoring Hub provides centralized monitoring across Fabric items.


Question 20

A Lakehouse contains thousands of tiny Delta files.

Which command should be executed?

A. CACHE
B. ANALYZE
C. VACUUM
D. OPTIMIZE

Answer: D

Explanation

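OPTIMIZE compacts small files into larger ones, which reduces per-file overhead when the table is read. VACUUM, by contrast, removes old files that the table no longer references.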


Question 21

Which THREE sources are commonly used with OneLake shortcuts?

(Choose three.)

A. Azure Data Lake Storage Gen2
B. Another Fabric Lakehouse
C. Amazon S3
D. Local Excel file on a desktop

Answers: A, B, C

Explanation

Shortcuts can reference supported cloud storage systems and Fabric items.


Question 22

A Fabric engineer needs to investigate why a semantic model refresh failed.

Where should they begin?

A. Refresh History
B. Spark Job Definitions
C. Eventhouse Metrics
D. Notebook Parameters

Answer: A

Explanation

Refresh History provides details about semantic model refresh failures.


Question 23

Fill in the blank.

The KQL operator used to create a new calculated column is __________.

Answer

extend

Explanation

extend creates calculated columns during query execution.


Question 24

A Fabric Warehouse contains a very large fact table and several small dimension tables.

Which join strategy generally performs best?

A. Cross Join
B. Joining on mismatched datatypes
C. Star schema joins
D. Cartesian joins

Answer: C

Explanation

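Star schemas are optimized for analytical workloads: the large fact table joins to each small dimension table on a key column, which avoids the row explosion caused by cross or Cartesian joins.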


Question 25

A pipeline activity occasionally fails due to temporary network issues.

What should be configured first?

A. Retry policy
B. Additional semantic models
C. KQL cache
D. OneLake replication

Answer: A

Explanation

Retry policies help recover from transient failures.
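
The behavior of a retry policy can be sketched in plain Python. This is a conceptual illustration, not pipeline configuration: run_with_retry, its max_retries and interval parameters, and the FlakyCopy stand-in are all hypothetical, and the sketch assumes a fixed wait between attempts.

```python
import time


def run_with_retry(activity, max_retries=3, interval=5.0, sleep=time.sleep):
    """Run activity(); on ConnectionError, wait and retry up to max_retries times."""
    attempt = 0
    while True:
        try:
            return activity()
        except ConnectionError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the failure
            sleep(interval)
            attempt += 1


class FlakyCopy:
    """Hypothetical activity that fails a set number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("temporary network issue")
        return "copied"


waits = []  # record the waits instead of actually sleeping
copy = FlakyCopy(failures=2)
result = run_with_retry(copy, max_retries=3, interval=5.0, sleep=waits.append)
```

Only the transient error type is retried here. A failure that will never succeed on its own, such as a missing table or bad credentials, is better surfaced immediately than retried.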


Question 26

Which TWO actions improve Spark performance?

(Choose two.)

A. Cache frequently used DataFrames
B. Reduce unnecessary shuffles
C. Use SELECT *
D. Create duplicate notebooks

Answers: A, B

Explanation

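Caching avoids recomputing a DataFrame that is reused several times, and reducing shuffles cuts the amount of data moved between executors. Both can significantly improve Spark performance.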


Question 27

A company wants to query operational data from Azure SQL Database without building a separate ingestion process.

Which Fabric capability should be considered?

A. Dataflow Gen2
B. Mirroring
C. Spark Streaming
D. Semantic Refresh

Answer: B

Explanation

Mirroring continuously replicates the source database into OneLake in near real time, so the data can be queried in Fabric without a separately built ingestion process.


Question 28

You are creating a streaming analytics solution.

Which window type continuously moves forward as time progresses?

A. Tumbling Window
B. Fixed Window
C. Sliding Window
D. Batch Window

Answer: C

Explanation

Sliding windows overlap and move continuously over time.
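
The difference between tumbling and sliding windows can be made concrete with a small plain-Python sketch. It models a sliding window the way Spark's window function expresses it, as a window length plus a shorter slide interval; other engines define sliding windows somewhat differently. The helper windows_for and the timestamps in seconds are hypothetical.

```python
def windows_for(ts, size, slide):
    """Return the [start, end) windows that contain timestamp ts.

    Windows are `size` seconds long and a new one starts every `slide`
    seconds, beginning at 0. With slide == size the windows tumble
    (no overlap); with slide < size they overlap.
    """
    start = (ts // slide) * slide  # latest window starting at or before ts
    starts = []
    while start > ts - size and start >= 0:
        starts.append(start)
        start -= slide
    return [(s, s + size) for s in reversed(starts)]


# An event at t = 130 seconds
tumbling = windows_for(130, size=60, slide=60)  # one window contains the event
sliding = windows_for(130, size=60, slide=20)   # several overlapping windows do
```

With a 60-second window that advances every 20 seconds, the same event is counted in three windows, which is why sliding aggregations produce smoother but more numerous results than tumbling ones.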


Question 29

A notebook runs successfully but takes significantly longer than expected.

Which monitoring tool provides stage-level Spark execution details?

A. Monitoring Hub
B. Spark Monitoring
C. Workspace Settings
D. Dataflow History

Answer: B

Explanation

Spark Monitoring provides detailed stage and task-level performance information.


Question 30

A data engineer wants to improve SQL query performance in a Warehouse.

Which action is generally recommended?

A. Use SELECT * in production queries
B. Disable statistics collection
C. Remove partitioning
D. Filter data as early as possible

Answer: D

Explanation

Applying filters early reduces the volume of processed data and improves query performance.

