Exam Prep Hubs available on The Data Community

Below are the free Exam Prep Hubs currently available on The Data Community.
Bookmark the hubs you are interested in and use them to ensure you are fully prepared for the respective exam.

Each hub contains:

  1. Topic-by-topic coverage of the material (following the official study guide), making it easy to confirm you are covering every aspect of the exam.
  2. Practice exam questions for each section.
  3. Bonus material to help you prepare.
  4. Two (2) Practice Exams with 60 questions each, along with answer keys.
  5. Links to useful resources, such as Microsoft Learn content, YouTube video series, and more.


AI-900: Microsoft Azure AI Fundamentals

WARNING: AI-900 will retire on June 30, 2026. It will be replaced with AI-901. You can continue to earn this certification after AI-900 retires by passing AI-901.



DP-900: Microsoft Azure Data Fundamentals certification exam – Frequently Asked Questions (FAQs)

Below are some commonly asked questions about the DP-900: Microsoft Azure Data Fundamentals certification exam. Upon successfully passing this exam, you earn the Microsoft Certified: Azure Data Fundamentals certification.


What is the DP-900 certification exam?

The DP-900: Microsoft Azure Data Fundamentals exam validates your foundational knowledge of core data concepts and how data is implemented using Microsoft Azure services.

Candidates who pass the exam demonstrate understanding of:

  • Core data concepts (relational vs non-relational data, transactional vs analytical workloads)
  • Relational data workloads in Azure (Azure SQL Database, SQL Server on Azure Virtual Machines, Azure SQL Managed Instance)
  • Non-relational data workloads in Azure (Azure Cosmos DB)
  • Analytical workloads in Azure (Azure Synapse Analytics, Azure Data Factory, Azure Data Lake, Power BI)

This certification is designed for individuals who want to build a baseline understanding of data in the cloud.


Is the DP-900 certification exam worth it?

The short answer is yes.

DP-900 is an excellent entry point into Microsoft’s data certification ecosystem. Preparing for this exam helps you:

  • Build a solid foundation in data concepts
  • Understand how Azure supports different data workloads
  • Gain confidence working with cloud-based data platforms
  • Prepare for further certifications such as DP-203, PL-300, or AI-900

For beginners, career switchers, students, and professionals new to Azure or data, DP-900 provides structured learning and practical context that transfers directly to real-world scenarios.


How many questions are on the DP-900 exam?

The DP-900 exam typically contains between 40 and 60 questions.

Question formats may include:

  • Single-choice and multiple-choice questions
  • Multi-select questions
  • Drag-and-drop or matching questions
  • Short scenario-based questions

The exact number and format can vary from exam to exam.


How hard is the DP-900 exam?

DP-900 is considered a fundamentals-level exam and is generally easier than associate-level certifications such as PL-300 or DP-203.

That said, it still requires preparation.

The challenge comes from:

  • Understanding when to use relational vs non-relational data
  • Recognizing Azure services and their purposes
  • Interpreting scenario-based questions
  • Learning basic analytics concepts

With focused study and practice, most candidates find the exam very achievable.

Helpful preparation resources include:

  • Microsoft Learn (official and free)
  • The official DP-900 study guide
  • Practice exams
  • Community resources and blogs
  • YouTube tutorials and walkthroughs

How much does the DP-900 certification exam cost?

As of early 2026, the standard exam pricing is approximately:

  • United States: $99 USD
  • Other countries: Regionally adjusted pricing applies

Microsoft frequently offers student discounts, academic pricing, and exam vouchers, so it’s worth checking the official Microsoft certification site before scheduling.


How do I prepare for the Microsoft DP-900 certification exam?

The most important advice is not to rush.

Recommended preparation steps:

  1. Review the official DP-900 exam skills outline.
  2. Complete the free Microsoft Learn DP-900 learning path.
  3. Study core data concepts (relational vs non-relational, OLTP vs OLAP).
  4. Learn the purpose of key Azure services such as Azure SQL Database, Azure Cosmos DB, Azure Synapse Analytics, and Power BI.
  5. Take practice exams to confirm your readiness.

Hands-on labs are helpful but not strictly required for DP-900; conceptual understanding is the primary focus.


How do I pass the DP-900 exam?

To maximize your chances of passing:

  • Focus on understanding concepts rather than memorization
  • Learn what each Azure data service is designed for
  • Carefully read scenario questions before answering
  • Eliminate obviously incorrect choices
  • Manage your time effectively

Consistently performing well on reputable practice exams is usually a good indicator that you’re ready.


What is the best site for DP-900 certification dumps?

Using exam dumps is not recommended and may violate Microsoft’s exam policies.

Instead, rely on legitimate preparation resources such as:

  • Microsoft’s official practice exam
  • High-quality community-created practice tests
  • Scenario-based questions that reinforce understanding

Legitimate preparation builds real skills that extend beyond passing the exam.


How long should I study for the DP-900 exam?

Study time varies based on background.

General guidelines:

  • Prior data or Azure experience: 2–4 weeks
  • Some technical background: 3–5 weeks
  • Beginners or career changers: 4–8 weeks

Rather than focusing strictly on time, aim to understand all exam topics and perform well on practice tests before scheduling.


Where can I find training or a course for the DP-900 exam?

Training options include:

  • Microsoft Learn: Free, official learning path
  • Online platforms: Udemy, Coursera, Exam Prep Hub for DP-900: Azure Data Fundamentals, and similar providers
  • YouTube: Free DP-900 playlists and walkthroughs
  • Subscription platforms: Datacamp and others offering Azure or data fundamentals
  • Microsoft partners: Instructor-led courses

A mix of structured learning and light hands-on exploration works well.


What skills should I have before taking the DP-900 exam?

Before attempting the exam, it helps to understand:

  • Basic data concepts (tables, rows, columns)
  • Differences between relational and non-relational data
  • Basic analytics terminology
  • General cloud computing concepts

No coding experience is required.

DP-900 is designed specifically for beginners.


What score do I need to pass the DP-900 exam?

Microsoft exams are scored on a scale of 1–1000, and a score of 700 or higher is required to pass.

Scores are scaled based on question difficulty, not simply percentage correct.


How long is the DP-900 exam?

You are given approximately 60 minutes to complete the exam, not including onboarding and instructions.

Time pressure is generally lower than on associate-level exams.


How long is the DP-900 certification valid?

The Microsoft Certified: Azure Data Fundamentals certification does not expire.

Unlike associate-level certifications, DP-900 currently does not require renewal.


Is DP-900 suitable for beginners?

Yes — DP-900 is specifically designed for beginners.

It’s ideal for:

  • Students
  • Career switchers
  • Business professionals entering data or analytics
  • Technical professionals new to Azure

No prior Azure or database experience is required.


What roles benefit most from the DP-900 certification?

DP-900 is especially valuable for:

  • Aspiring Data Analysts or Data Engineers
  • Business Analysts
  • Students and graduates
  • Cloud beginners
  • Professionals exploring data careers

It also serves as a strong foundation before pursuing PL-300, DP-203, or AI-900.


What languages is the DP-900 exam offered in?

The DP-900 certification exam is commonly offered in:

English, Japanese, Chinese (Simplified), Korean, German, French, Spanish, Portuguese (Brazil), Chinese (Traditional), Italian

Availability may vary by region.


Have additional questions? Post them in the comments.

Thanks for reading and good luck on your data journey!

Exam Prep Hub for DP-900: Azure Data Fundamentals

Welcome to the DP-900: Azure Data Fundamentals Exam Prep Hub!

Welcome to the one-stop hub for preparing for the DP-900: Microsoft Azure Data Fundamentals certification exam. This exam is your opportunity to “Demonstrate foundational knowledge of core data concepts related to Microsoft Azure data services.” Upon successful completion of the exam, you earn the Microsoft Certified: Azure Data Fundamentals certification.

This hub provides information directly here (topic-by-topic, as outlined in the official study guide), links to a number of external resources, tips for preparing for the exam, practice tests, and section questions to help you prepare. Bookmark this page and use it as a guide to ensure that you fully cover all relevant topics for the DP-900 exam and make use of as many of the available resources as possible.


Audience profile (from Microsoft’s site)

This exam is an opportunity to demonstrate your knowledge of core data concepts and related Microsoft Azure data services. As a candidate for this exam, you should have familiarity with Exam DP-900’s self-paced or instructor-led learning material.
This exam is intended for you if you’re beginning to work with data in the cloud.
You should be familiar with:

  • The concepts of relational and non-relational data.
  • Different types of data workloads, such as transactional or analytical.

You can use Azure Data Fundamentals to prepare for other Azure role-based certifications like Azure Database Administrator Associate or Azure Data Engineer Associate, but it is not a prerequisite for any of them.

Skills at a glance (as specified in the official study guide)

  • Describe core data concepts (25–30%)
  • Identify considerations for relational data on Azure (20–25%)
  • Describe considerations for working with non-relational data on Azure (15–20%)
  • Describe an analytics workload on Azure (25–30%)

Topic-by-Topic Exam Content

Describe core data concepts (25–30%)

Describe ways to represent data

Identify options for data storage

Describe common data workloads

Identify roles and responsibilities for data workloads

Identify considerations for relational data on Azure (20–25%)

Describe relational concepts

Describe relational Azure data services

Describe considerations for working with non-relational data on Azure (15–20%)

Describe capabilities of Azure storage

Describe capabilities and features of Azure Cosmos DB

Describe an analytics workload on Azure (25–30%)

Describe common elements of large-scale analytics

Describe considerations for real-time data analytics

Describe data visualization in Microsoft Power BI


DP-900 Practice Exams

DP-900 Practice Exam 1 (60 questions with answers)

DP-900 Practice Exam 2 (60 questions with answers)


Important DP-900 Resources

YouTube video series: Microsoft Learn DP-900 Azure Data Fundamentals YouTube series

A book you may find useful (on Amazon): Exam Ref DP-900 Microsoft Azure Data Fundamentals 2nd Edition


Good luck to you on your data journey!

DP-900: Azure Data Fundamentals – Advanced Practice Exam – 60 questions

Advanced Practice Exam (60 Questions)

This advanced practice exam contains:

  • Higher-difficulty questions
  • More scenario-based questions
  • Multi-answer questions
  • Matching questions
  • Fill-in-the-blank questions
  • SQL and architecture concepts
  • Azure service selection scenarios

Section 1 — Core Data Concepts


Question 1 (Scenario-Based)

A company stores customer survey responses in JSON format. Each survey can contain different fields depending on the survey type.

How should this data be classified?

A. Structured
B. Semi-structured
C. Unstructured
D. Transactional

Answer: B — Semi-structured

Explanations

A. Incorrect
Structured data requires a rigid schema.

B. Correct
JSON is semi-structured because it contains flexible tagged fields.

C. Incorrect
Unstructured data has little or no organization.

D. Incorrect
Transactional refers to workload type, not structure.
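To make the distinction concrete, here is a minimal Python sketch (standard library only; the survey field names are hypothetical) showing why JSON survey responses count as semi-structured: each document is self-describing, but the set of fields varies by survey type.

```python
import json

# Two survey responses of different types: each document carries its own
# fields, so there is no single fixed schema (field names are illustrative).
product_survey = '{"survey_type": "product", "rating": 4, "would_recommend": true}'
support_survey = '{"survey_type": "support", "agent_id": "A17", "wait_minutes": 12}'

docs = [json.loads(product_survey), json.loads(support_survey)]

# Semi-structured: keys travel with values, but the keys differ per document.
for doc in docs:
    print(doc["survey_type"], "->", sorted(doc.keys()))
```

A relational table, by contrast, would force both responses into one fixed set of columns.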


Question 2 (Multi-Answer)

Which characteristics are associated with transactional workloads? (Choose TWO)

A. High concurrency
B. Historical aggregations
C. Fast insert/update operations
D. Large-scale reporting queries

Answers: A and C

Explanations

A. Correct
Transactional systems support many simultaneous users.

B. Incorrect
Historical aggregations are analytical.

C. Correct
OLTP systems perform fast write operations.

D. Incorrect
Large reporting queries belong to analytics workloads.


Question 3 (Scenario-Based)

A database contains duplicated customer addresses across multiple tables. The database architect wants to reduce redundancy and improve consistency.

Which process should be used?

A. Partitioning
B. Normalization
C. Encryption
D. Replication

Answer: B — Normalization

Explanations

A. Incorrect
Partitioning improves scalability.

B. Correct
Normalization reduces duplication.

C. Incorrect
Encryption secures data.

D. Incorrect
Replication copies data.
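A quick sketch can show what normalization buys you. This example uses Python's built-in sqlite3 module as a stand-in SQL engine (table and column names are illustrative, not from the exam): the duplicated address moves into its own table, so a correction touches exactly one row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# After normalization: addresses live once in their own table,
# and orders reference them by key instead of repeating the text.
cur.executescript("""
CREATE TABLE Address (
    AddressId INTEGER PRIMARY KEY,
    Street    TEXT NOT NULL
);
CREATE TABLE CustomerOrder (
    OrderId   INTEGER PRIMARY KEY,
    AddressId INTEGER NOT NULL REFERENCES Address(AddressId)
);
INSERT INTO Address VALUES (1, '12 High St');
INSERT INTO CustomerOrder VALUES (100, 1), (101, 1), (102, 1);
""")

# Fixing a typo in the address now updates a single row,
# and every order sees the corrected value through the join.
cur.execute("UPDATE Address SET Street = '12 High Street' WHERE AddressId = 1")
row = cur.execute("""
    SELECT a.Street, COUNT(*)
    FROM CustomerOrder o
    JOIN Address a ON a.AddressId = o.AddressId
    GROUP BY a.Street
""").fetchone()
print(row)  # ('12 High Street', 3)
```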


Question 4 (Single Answer)

Which SQL statement removes an existing table and all its data?

A. DELETE
B. REMOVE
C. DROP
D. ERASE

Answer: C — DROP

Explanations

A. Incorrect
DELETE removes rows only.

B. Incorrect
REMOVE is not standard SQL.

C. Correct
DROP deletes the table structure and data.

D. Incorrect
ERASE is not standard SQL.
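The DELETE-vs-DROP difference is easy to demonstrate. Here is a small sketch using Python's built-in sqlite3 module (the table name is made up): DELETE leaves an empty but queryable table, while DROP removes the table itself.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE Demo (Id INTEGER)")
cur.execute("INSERT INTO Demo VALUES (1), (2)")

# DELETE removes rows; the (now empty) table still exists.
cur.execute("DELETE FROM Demo")
count = cur.execute("SELECT COUNT(*) FROM Demo").fetchone()[0]
print(count)  # 0 rows, but the table can still be queried

# DROP removes the table definition itself; querying it now fails.
cur.execute("DROP TABLE Demo")
try:
    cur.execute("SELECT * FROM Demo")
    table_exists = True
except sqlite3.OperationalError:
    table_exists = False
print("table still exists:", table_exists)  # False
```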


Question 5 (Matching)

Match the role to the responsibility.

Role                Responsibility
1. DBA              A. Creates dashboards
2. Data Analyst     B. Maintains database performance
3. Data Engineer    C. Builds data pipelines

Answers

  • 1 → B
  • 2 → A
  • 3 → C

Question 6 (Scenario-Based)

A retail company needs a database for processing thousands of purchases per minute with guaranteed consistency.

Which workload type is MOST appropriate?

A. Analytical
B. Streaming
C. Transactional
D. Archival

Answer: C — Transactional

Explanations

A. Incorrect
Analytical systems focus on reporting.

B. Incorrect
Streaming processes event flows.

C. Correct
Transactional systems support operational consistency and speed.

D. Incorrect
Archival systems store inactive data.


Question 7 (Fill in the Blank)

The SQL statement used to add new rows to a table is __________.

Answer: INSERT


Question 8 (Multi-Answer)

Which file formats are commonly used in analytics workloads? (Choose TWO)

A. Parquet
B. ORC
C. BMP
D. EXE

Answers: A and B

Explanations

A. Correct
Parquet is optimized for analytics.

B. Correct
ORC is another columnar analytics format.

C. Incorrect
BMP is an image format.

D. Incorrect
EXE is executable software.


Question 9 (Scenario-Based)

An organization wants to analyze 10 years of sales history for trends and forecasting.

Which workload type is BEST suited?

A. OLTP
B. Analytical
C. Streaming
D. Operational

Answer: B — Analytical


Question 10 (Single Answer)

Which database object contains reusable SQL logic?

A. View
B. Index
C. Stored Procedure
D. Key

Answer: C — Stored Procedure


Section 2 — Relational Data on Azure


Question 11 (Scenario-Based)

A company is migrating an on-premises SQL Server application that relies heavily on SQL Server Agent, cross-database queries, and instance-level features.

Which Azure service is MOST appropriate?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. Azure Cosmos DB
D. Azure Blob Storage

Answer: B — Azure SQL Managed Instance

Explanations

A. Incorrect
Azure SQL Database has fewer instance-level features.

B. Correct
Managed Instance offers near full SQL Server compatibility.

C. Incorrect
Cosmos DB is NoSQL.

D. Incorrect
Blob Storage stores files.


Question 12 (Single Answer)

Which Azure SQL offering provides the HIGHEST level of infrastructure control?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. SQL Server on Azure Virtual Machines
D. Azure Synapse Analytics

Answer: C — SQL Server on Azure Virtual Machines


Question 13 (Multi-Answer)

Which are advantages of Platform as a Service (PaaS) databases? (Choose TWO)

A. Automatic patching
B. Reduced administrative overhead
C. Full operating system control
D. Manual backups only

Answers: A and B


Question 14 (Scenario-Based)

A company wants automatic scaling, backups, and minimal management overhead for a new cloud-native application.

Which solution is BEST?

A. SQL Server on Azure VMs
B. Azure SQL Database
C. Windows Server Failover Cluster
D. Self-hosted SQL Server

Answer: B — Azure SQL Database


Question 15 (Single Answer)

What is the purpose of a foreign key?

A. Encrypt data
B. Create indexes
C. Enforce relationships between tables
D. Remove duplicates

Answer: C — Enforce relationships between tables


Question 16 (Scenario-Based)

A company needs a managed PostgreSQL service in Azure.

Which service should be used?

A. Azure SQL Database
B. Azure Database for PostgreSQL
C. Azure Blob Storage
D. Azure Cosmos DB

Answer: B — Azure Database for PostgreSQL


Question 17 (Single Answer)

Which normalization form removes transitive dependencies?

A. 1NF
B. 2NF
C. 3NF
D. 4NF

Answer: C — 3NF


Question 18 (Multi-Answer)

Which SQL statements are Data Manipulation Language (DML)? (Choose TWO)

A. SELECT
B. INSERT
C. CREATE
D. DROP

Answers: A and B


Question 19 (Scenario-Based)

A query needs to return ALL customers, including those without orders.

Which JOIN should be used?

A. INNER JOIN
B. CROSS JOIN
C. LEFT JOIN
D. SELF JOIN

Answer: C — LEFT JOIN
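To see why LEFT JOIN is the right choice here, this sketch (Python's built-in sqlite3 as a stand-in engine; sample data invented) compares it with INNER JOIN on a customer without orders.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE "Order" (Id INTEGER PRIMARY KEY, CustomerId INTEGER);
INSERT INTO Customer VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO "Order" VALUES (10, 1);   -- only Ada has an order
""")

# LEFT JOIN keeps every customer; order columns are NULL when unmatched.
left_rows = cur.execute("""
    SELECT c.Name, o.Id FROM Customer c
    LEFT JOIN "Order" o ON o.CustomerId = c.Id
    ORDER BY c.Name
""").fetchall()
print(left_rows)   # [('Ada', 10), ('Grace', None)]

# An INNER JOIN would silently drop Grace, who has no orders.
inner_rows = cur.execute("""
    SELECT c.Name, o.Id FROM Customer c
    JOIN "Order" o ON o.CustomerId = c.Id
""").fetchall()
print(inner_rows)  # [('Ada', 10)]
```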


Question 20 (Single Answer)

Which object improves query performance but does NOT store actual business data?

A. Table
B. View
C. Index
D. Row

Answer: C — Index


Section 3 — Non-Relational Data


Question 21 (Scenario-Based)

A media company needs to store petabytes of video content at low cost.

Which Azure service is MOST appropriate?

A. Azure SQL Database
B. Azure Blob Storage
C. Azure Table Storage
D. Azure Cache for Redis

Answer: B — Azure Blob Storage


Question 22 (Single Answer)

Which Azure Blob Storage tier is optimized for infrequently accessed data?

A. Premium
B. Hot
C. Cool
D. Archive

Answer: C — Cool


Question 23 (Scenario-Based)

An organization needs cloud-hosted SMB file shares accessible by both cloud and on-premises servers.

Which service should be used?

A. Azure Cosmos DB
B. Azure Files
C. Azure Table Storage
D. Azure SQL Database

Answer: B — Azure Files


Question 24 (Multi-Answer)

Which APIs are supported by Azure Cosmos DB? (Choose TWO)

A. MongoDB
B. Cassandra
C. Oracle
D. SMB

Answers: A and B


Question 25 (Scenario-Based)

A gaming company needs globally distributed low-latency data access for player profiles.

Which Azure service is BEST?

A. Azure Cosmos DB
B. Azure Files
C. Azure SQL Database
D. Azure Blob Storage

Answer: A — Azure Cosmos DB


Question 26 (Single Answer)

What is a major benefit of Azure Cosmos DB partitioning?

A. Reduces security
B. Enables scalability
C. Removes replication
D. Prevents indexing

Answer: B — Enables scalability


Question 27 (Fill in the Blank)

Azure Cosmos DB provides multi-region __________ to improve availability and performance.

Answer: replication


Question 28 (Scenario-Based)

A company needs a NoSQL key-value store for massive telemetry ingestion.

Which service is MOST appropriate?

A. Azure Table Storage
B. Azure SQL Database
C. Azure Files
D. Azure DNS

Answer: A — Azure Table Storage


Question 29 (Single Answer)

Which storage service stores data as objects inside containers?

A. Azure Files
B. Azure Blob Storage
C. Azure SQL Database
D. Azure Cosmos DB

Answer: B — Azure Blob Storage


Question 30 (Multi-Answer)

Which are characteristics of non-relational databases? (Choose TWO)

A. Flexible schemas
B. Strict relational constraints
C. Horizontal scalability
D. Mandatory JOIN operations

Answers: A and C


Section 4 — Analytics Workloads


Question 31 (Scenario-Based)

A company collects IoT sensor readings every second and needs near real-time dashboards.

Which processing approach is MOST appropriate?

A. Batch processing
B. Streaming processing
C. Archival processing
D. Offline reporting

Answer: B — Streaming processing


Question 32 (Single Answer)

Which Azure service is designed for high-throughput event ingestion?

A. Azure Event Hubs
B. Azure Backup
C. Azure Files
D. Azure DNS

Answer: A — Azure Event Hubs


Question 33 (Scenario-Based)

An organization needs Apache Spark-based analytics with collaborative notebooks.

Which service is BEST?

A. Azure Databricks
B. Azure Files
C. Azure DNS
D. Azure Firewall

Answer: A — Azure Databricks


Question 34 (Single Answer)

Which architecture commonly includes fact tables and dimension tables?

A. OLTP schema
B. Star schema
C. Graph schema
D. XML schema

Answer: B — Star schema
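A star schema is just a fact table joined to dimension tables, which is easy to sketch. The following uses Python's built-in sqlite3 module with invented sample tables to show the typical analytical pattern: aggregate the facts, sliced by a dimension attribute.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Minimal star schema: a fact table of sales keyed to a product dimension.
cur.executescript("""
CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY, Category TEXT);
CREATE TABLE FactSales  (ProductKey INTEGER, Amount REAL);
INSERT INTO DimProduct VALUES (1, 'Bikes'), (2, 'Helmets');
INSERT INTO FactSales  VALUES (1, 250.0), (1, 300.0), (2, 45.0);
""")

# Typical analytical query: sum the fact measures, grouped by a dimension.
rows = cur.execute("""
    SELECT d.Category, SUM(f.Amount)
    FROM FactSales f
    JOIN DimProduct d ON d.ProductKey = f.ProductKey
    GROUP BY d.Category
    ORDER BY d.Category
""").fetchall()
print(rows)  # [('Bikes', 550.0), ('Helmets', 45.0)]
```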


Question 35 (Multi-Answer)

Which are characteristics of a data warehouse? (Choose TWO)

A. Optimized for analytics
B. Stores historical data
C. Primarily supports OLTP transactions
D. Limited aggregations

Answers: A and B


Question 36 (Scenario-Based)

A company wants a unified analytics platform combining engineering, warehousing, data science, and BI.

Which Microsoft service BEST fits?

A. Microsoft Fabric
B. Azure Files
C. Azure Firewall
D. Azure DNS

Answer: A — Microsoft Fabric


Question 37 (Single Answer)

Which service allows SQL-like queries against streaming data?

A. Azure Stream Analytics
B. Azure Files
C. Azure Backup
D. Azure Monitor

Answer: A — Azure Stream Analytics


Question 38 (Scenario-Based)

An organization processes payroll data once nightly.

Which processing type is MOST appropriate?

A. Streaming
B. Batch
C. Event-driven only
D. Real-time analytics

Answer: B — Batch


Question 39 (Single Answer)

Which process extracts, transforms, and loads data into analytical systems?

A. ETL
B. DNS
C. RAID
D. OLTP

Answer: A — ETL


Question 40 (Multi-Answer)

Which services are commonly associated with real-time analytics? (Choose TWO)

A. Azure Event Hubs
B. Azure Stream Analytics
C. Azure Files
D. Azure Backup

Answers: A and B


Section 5 — Power BI


Question 41 (Scenario-Based)

An executive wants a single-page overview showing KPIs and summary visuals.

Which Power BI object should be used?

A. Dataset
B. Dashboard
C. Dataflow
D. Semantic model

Answer: B — Dashboard


Question 42 (Single Answer)

Which Power BI component is primarily used for data transformation?

A. DAX
B. Power Query
C. Azure Functions
D. Power Automate

Answer: B — Power Query


Question 43 (Scenario-Based)

A report must show revenue trends over 24 months.

Which visualization is BEST?

A. Pie chart
B. Gauge chart
C. Line chart
D. Scatter chart

Answer: C — Line chart


Question 44 (Single Answer)

Which visualization is BEST for displaying proportions?

A. Scatter chart
B. Pie chart
C. Card
D. Gauge chart

Answer: B — Pie chart


Question 45 (Scenario-Based)

A company wants users to filter reports interactively by region and year.

Which feature should be used?

A. Indexes
B. Slicers
C. Measures
D. Triggers

Answer: B — Slicers


Question 46 (Single Answer)

Which Power BI language creates measures and calculated columns?

A. SQL
B. Python
C. DAX
D. XML

Answer: C — DAX


Question 47 (Scenario-Based)

A business analyst wants to identify the relationship between advertising spend and revenue.

Which visualization is BEST?

A. Pie chart
B. Scatter chart
C. Gauge chart
D. Card

Answer: B — Scatter chart


Question 48 (Single Answer)

Which Power BI visualization is BEST for detailed row-level data?

A. Table
B. Gauge
C. Pie chart
D. Card

Answer: A — Table


Question 49 (Multi-Answer)

Which are benefits of Power BI dashboards? (Choose TWO)

A. Real-time monitoring
B. Single-page summaries
C. Operating system administration
D. SQL indexing

Answers: A and B


Question 50 (Scenario-Based)

A company needs a geographic visualization of sales by country.

Which visualization is BEST?

A. Matrix
B. Map
C. Gauge
D. Card

Answer: B — Map


Section 6 — Comprehensive Scenarios


Question 51 (Scenario-Based)

A healthcare organization requires:

  • Globally distributed NoSQL storage
  • Automatic replication
  • Low latency worldwide
  • Flexible schema support

Which solution BEST fits?

A. Azure SQL Database
B. Azure Cosmos DB
C. Azure Files
D. Azure Synapse Analytics

Answer: B — Azure Cosmos DB


Question 52 (Scenario-Based)

A manufacturing company collects sensor telemetry every second from thousands of devices.

Which Azure service should ingest the streaming events?

A. Azure Event Hubs
B. Azure Files
C. Azure SQL Managed Instance
D. Azure Backup

Answer: A — Azure Event Hubs


Question 53 (Scenario-Based)

A company wants full control of SQL Server patching, OS configuration, and backups.

Which deployment option should be used?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. SQL Server on Azure Virtual Machines
D. Azure Cosmos DB

Answer: C — SQL Server on Azure Virtual Machines


Question 54 (Single Answer)

Which Azure service is MOST optimized for unstructured object storage?

A. Azure Blob Storage
B. Azure SQL Database
C. Azure Files
D. Azure Synapse Analytics

Answer: A — Azure Blob Storage


Question 55 (Scenario-Based)

An analytics team needs to store historical sales data optimized for aggregation queries.

Which solution is BEST?

A. Transactional database
B. Data warehouse
C. Azure Files
D. DNS server

Answer: B — Data warehouse


Question 56 (Single Answer)

Which SQL statement changes existing records?

A. CREATE
B. UPDATE
C. INSERT
D. ALTER

Answer: B — UPDATE
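A short sketch (Python's built-in sqlite3; the Employee table is invented) shows UPDATE changing existing rows in place, with the WHERE clause limiting which ones.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Salary REAL)")
cur.execute("INSERT INTO Employee VALUES (1, 50000), (2, 60000)")

# UPDATE modifies existing records; only rows matching WHERE are changed.
# (By contrast, ALTER changes the table's structure, and INSERT adds rows.)
cur.execute("UPDATE Employee SET Salary = Salary + 5000 WHERE Id = 1")
rows = cur.execute("SELECT Id, Salary FROM Employee ORDER BY Id").fetchall()
print(rows)  # [(1, 55000.0), (2, 60000.0)]
```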


Question 57 (Multi-Answer)

Which are benefits of normalization? (Choose TWO)

A. Reduced redundancy
B. Improved consistency
C. Increased duplicate storage
D. Reduced relationships

Answers: A and B


Question 58 (Scenario-Based)

A report needs to compare revenue across product categories.

Which visualization is BEST?

A. Line chart
B. Scatter chart
C. Bar chart
D. Gauge chart

Answer: C — Bar chart


Question 59 (Fill in the Blank)

The SQL JOIN that returns only matching rows from both tables is called an __________ JOIN.

Answer: INNER


Question 60 (Scenario-Based)

A company needs:

  • Large-scale analytics
  • Integrated Power BI reporting
  • Data engineering
  • Real-time analytics
  • Unified SaaS experience

Which platform BEST meets these requirements?

A. Microsoft Fabric
B. Azure Files
C. Azure DNS
D. Windows Server Failover Clustering

Answer: A — Microsoft Fabric


Advanced Exam Study Tips

Know the differences between:

  • OLTP vs OLAP
  • Batch vs streaming
  • Structured vs semi-structured vs unstructured
  • Relational vs NoSQL

Memorize Azure service associations:

Service                   Purpose
Azure Blob Storage        Unstructured object storage
Azure Files               SMB file shares
Azure Table Storage       Key-value NoSQL
Azure Cosmos DB           Globally distributed NoSQL
Azure Event Hubs          Streaming ingestion
Azure Stream Analytics    Real-time analytics
Azure Databricks          Spark analytics
Microsoft Fabric          Unified analytics platform

Power BI visualization shortcuts:

Visualization    Best Use
Line chart       Trends
Bar chart        Comparisons
Pie chart        Proportions
Scatter chart    Relationships
Card             Single KPI
Map              Geographic analysis
Gauge            Progress toward target

Go to the DP-900 Exam Prep Hub main page.

DP-900: Azure Data Fundamentals – Practice Exam Questions – 60 questions

Full Practice Exam

This practice exam covers all major skills measured on the DP-900 certification exam, including:

  • Core data concepts
  • Relational data on Azure
  • Non-relational data on Azure
  • Analytics workloads
  • Power BI and visualization
  • Real-time analytics
  • Azure data services

Question formats include:

  • Single-answer multiple choice
  • Multi-answer multiple choice
  • Matching/connect-the-answers
  • Fill-in-the-blank
  • Scenario-based questions

Section 1 — Core Data Concepts


Question 1 (Single Answer)

Which type of data has a predefined schema consisting of rows and columns?

A. Unstructured data
B. Semi-structured data
C. Structured data
D. Streaming data

Answer: C — Structured data

Explanations

A. Incorrect
Unstructured data does not have a predefined schema.

B. Incorrect
Semi-structured data has some organization but not fixed rows/columns.

C. Correct
Structured data uses a defined schema with rows and columns.

D. Incorrect
Streaming data refers to continuously arriving data, not structure type.


Question 2 (Multi-Answer)

Which of the following are examples of semi-structured data? (Choose TWO)

A. JSON
B. CSV
C. XML
D. SQL tables

Answers: A and C

Explanations

A. Correct
JSON carries structure in its tags and keys but allows flexible schemas.

B. Incorrect
CSV is structured tabular data.

C. Correct
XML is semi-structured because it uses tagged hierarchical data.

D. Incorrect
SQL tables are structured relational data.


Question 3 (Fill in the Blank)

A database design technique used to reduce data redundancy is called __________.

Answer: Normalization

Explanation

Normalization organizes data efficiently and minimizes duplication.


Question 4 (Single Answer)

Which SQL statement retrieves data from a table?

A. INSERT
B. UPDATE
C. SELECT
D. DELETE

Answer: C — SELECT

Explanations

A. Incorrect
INSERT adds records.

B. Incorrect
UPDATE modifies records.

C. Correct
SELECT retrieves data.

D. Incorrect
DELETE removes records.


Question 5 (Matching)

Match the workload to its description.

Workload            Description
1. Transactional    A. Historical analysis
2. Analytical       B. Real-time business operations

Answers

  • 1 → B
  • 2 → A

Explanation

Transactional workloads support day-to-day operations; analytical workloads analyze historical data.


Question 6 (Single Answer)

Which role is MOST responsible for maintaining database availability and backups?

A. Data Analyst
B. Data Engineer
C. Database Administrator
D. Business User

Answer: C — Database Administrator

Explanations

A. Incorrect
Data analysts focus on reporting and insights.

B. Incorrect
Data engineers build pipelines and integration systems.

C. Correct
DBAs manage availability, backups, and performance.

D. Incorrect
Business users consume reports.


Question 7 (Multi-Answer)

Which are characteristics of analytical workloads? (Choose TWO)

A. Frequent INSERT operations
B. Historical trend analysis
C. Large-scale aggregations
D. High-volume OLTP transactions

Answers: B and C

Explanations

A. Incorrect
Frequent inserts are more common in transactional systems.

B. Correct
Analytical systems examine historical data.

C. Correct
Aggregations are common in analytics.

D. Incorrect
OLTP workloads are transactional.


Question 8 (Single Answer)

Which file format is commonly used for big data analytics because of columnar storage and compression?

A. TXT
B. CSV
C. Parquet
D. XML

Answer: C — Parquet

Explanations

A. Incorrect
TXT files are plain text.

B. Incorrect
CSV is row-based text data.

C. Correct
Parquet is optimized for analytics workloads.

D. Incorrect
XML is semi-structured but not optimized for analytics.


Question 9 (Single Answer)

Which database object stores data in rows and columns?

A. View
B. Stored procedure
C. Table
D. Index

Answer: C — Table

Explanations

A. Incorrect
Views are virtual query results.

B. Incorrect
Stored procedures contain SQL logic.

C. Correct
Tables store relational data.

D. Incorrect
Indexes improve query performance.


Question 10 (Single Answer)

Which SQL JOIN returns only matching rows from both tables?

A. LEFT JOIN
B. RIGHT JOIN
C. INNER JOIN
D. FULL OUTER JOIN

Answer: C — INNER JOIN

Explanations

A. Incorrect
LEFT JOIN includes unmatched left-side rows.

B. Incorrect
RIGHT JOIN includes unmatched right-side rows.

C. Correct
INNER JOIN returns only matches.

D. Incorrect
FULL OUTER JOIN includes all rows.
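The JOIN behaviors described above can be demonstrated with a few lines of SQL. This is an illustrative sketch using Python's built-in sqlite3 module; the table and column names are made up for the example.

```python
import sqlite3

# In-memory demo of JOIN behavior; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0);
""")

# INNER JOIN: only customers that have at least one matching order.
inner = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY c.name, o.amount
""").fetchall()
print(inner)  # [('Ana', 25.0), ('Ana', 50.0), ('Ben', 40.0)] -- 'Cy' is excluded

# LEFT JOIN: every customer, with NULL where no order matches.
left = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.name, o.amount
""").fetchall()
print(left)   # [('Ana', 25.0), ('Ana', 50.0), ('Ben', 40.0), ('Cy', None)]
```

Note how the customer with no orders ('Cy') disappears from the INNER JOIN result but survives the LEFT JOIN with a NULL amount.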


Section 2 — Relational Data on Azure


Question 11 (Single Answer)

Which Azure SQL option provides the MOST compatibility with on-premises SQL Server?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. Azure Cosmos DB
D. Azure Blob Storage

Answer: B — Azure SQL Managed Instance

Explanations

A. Incorrect
Azure SQL Database is fully managed but has fewer instance-level features.

B. Correct
Managed Instance provides near full SQL Server compatibility.

C. Incorrect
Cosmos DB is NoSQL.

D. Incorrect
Blob Storage is object storage.


Question 12 (Multi-Answer)

Which Azure services support open-source relational databases? (Choose TWO)

A. Azure Database for PostgreSQL
B. Azure Database for MySQL
C. Azure Synapse Analytics
D. Azure Files

Answers: A and B

Explanations

A. Correct
Azure provides managed PostgreSQL.

B. Correct
Azure provides managed MySQL.

C. Incorrect
Synapse is analytics-focused.

D. Incorrect
Azure Files is storage.


Question 13 (Single Answer)

Which Azure SQL option gives customers the MOST operating system control?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. SQL Server on Azure Virtual Machines
D. Azure Cosmos DB

Answer: C — SQL Server on Azure Virtual Machines

Explanations

A. Incorrect
Fully managed platform service.

B. Incorrect
Managed service with limited OS access.

C. Correct
VMs provide full infrastructure control.

D. Incorrect
Cosmos DB is NoSQL.


Question 14 (Fill in the Blank)

A column whose values uniquely identify each row in a table is called a __________ key.

Answer: Primary

Explanation

A primary key uniquely identifies rows.


Question 15 (Single Answer)

Which database normalization form removes repeating groups?

A. 1NF
B. 2NF
C. 3NF
D. 4NF

Answer: A — 1NF

Explanations

A. Correct
1NF eliminates repeating groups.

B. Incorrect
2NF removes partial dependencies.

C. Incorrect
3NF removes transitive dependencies.

D. Incorrect
4NF handles multi-valued dependencies.
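To make 1NF concrete, here is a small sketch of removing a repeating group. The data is hypothetical: one column holds several phone numbers per row, and 1NF splits that into one row per atomic value.

```python
# Illustrative only: the names and data are hypothetical.
# Before 1NF: one row stores a repeating group (multiple phone numbers per cell).
unnormalized = [
    {"customer": "Ana", "phones": "555-0101, 555-0102"},
    {"customer": "Ben", "phones": "555-0201"},
]

# 1NF: each column holds a single atomic value, so the repeating group
# becomes one row per value.
first_nf = [
    {"customer": row["customer"], "phone": phone.strip()}
    for row in unnormalized
    for phone in row["phones"].split(",")
]
print(first_nf)
# [{'customer': 'Ana', 'phone': '555-0101'},
#  {'customer': 'Ana', 'phone': '555-0102'},
#  {'customer': 'Ben', 'phone': '555-0201'}]
```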


Section 3 — Non-Relational Data on Azure


Question 16 (Single Answer)

Which Azure storage service is best for storing large unstructured files?

A. Azure SQL Database
B. Azure Blob Storage
C. Azure Table Storage
D. Azure Cosmos DB

Answer: B — Azure Blob Storage

Explanations

A. Incorrect
SQL Database is relational.

B. Correct
Blob Storage stores unstructured objects like images/videos.

C. Incorrect
Table Storage stores NoSQL key-value data.

D. Incorrect
Cosmos DB is a globally distributed database.


Question 17 (Single Answer)

Which Azure storage service provides SMB file shares?

A. Azure Blob Storage
B. Azure Cosmos DB
C. Azure Files
D. Azure Table Storage

Answer: C — Azure Files

Explanations

A. Incorrect
Blob Storage is object storage.

B. Incorrect
Cosmos DB is NoSQL.

C. Correct
Azure Files supports SMB shares.

D. Incorrect
Table Storage stores structured NoSQL entities.


Question 18 (Multi-Answer)

Which are valid Azure Cosmos DB APIs? (Choose TWO)

A. MongoDB API
B. Cassandra API
C. Oracle API
D. SMB API

Answers: A and B

Explanations

A. Correct
Cosmos DB supports MongoDB API.

B. Correct
Cosmos DB supports Cassandra API.

C. Incorrect
Oracle API is not supported.

D. Incorrect
SMB is a file-sharing protocol.


Question 19 (Single Answer)

Which characteristic is a major feature of Azure Cosmos DB?

A. Single-region architecture
B. Global distribution
C. Relational-only schema
D. File-share management

Answer: B — Global distribution

Explanations

A. Incorrect
Cosmos DB supports multiple regions.

B. Correct
Global distribution is a key feature.

C. Incorrect
Cosmos DB is NoSQL.

D. Incorrect
Not a file-sharing service.


Question 20 (Matching)

Match the storage service to its use case.

Service | Use Case
1. Blob Storage | A. SMB file shares
2. Azure Files | B. Unstructured objects

Answers

  • 1 → B
  • 2 → A

Section 4 — Analytics Workloads


Question 21 (Single Answer)

Which process involves collecting data from multiple sources into an analytics system?

A. Visualization
B. Data ingestion
C. Data modeling
D. Backup

Answer: B — Data ingestion

Explanations

A. Incorrect
Visualization displays data.

B. Correct
Ingestion collects and imports data.

C. Incorrect
Modeling defines relationships/calculations.

D. Incorrect
Backup protects data copies.


Question 22 (Single Answer)

Which analytical store is optimized for historical analytics and reporting?

A. OLTP database
B. Data warehouse
C. Azure Files
D. DNS server

Answer: B — Data warehouse

Explanations

A. Incorrect
OLTP supports transactions.

B. Correct
Warehouses support analytics.

C. Incorrect
Files are storage shares.

D. Incorrect
DNS resolves names.


Question 23 (Multi-Answer)

Which Microsoft services support large-scale analytics? (Choose TWO)

A. Azure Databricks
B. Microsoft Fabric
C. Azure DNS
D. Azure Firewall

Answers: A and B

Explanations

A. Correct
Databricks supports big data analytics.

B. Correct
Fabric is an end-to-end analytics platform.

C. Incorrect
DNS is networking.

D. Incorrect
Firewall is security infrastructure.


Question 24 (Single Answer)

What is the primary difference between batch processing and streaming processing?

A. Batch processing handles data continuously
B. Streaming processes data as it arrives
C. Streaming stores only historical data
D. Batch requires IoT devices

Answer: B — Streaming processes data as it arrives

Explanations

A. Incorrect
Continuous processing is streaming.

B. Correct
Streaming handles near real-time data.

C. Incorrect
Streaming is not limited to historical data.

D. Incorrect
Batch does not require IoT.


Question 25 (Single Answer)

Which Azure service is commonly used for streaming event ingestion?

A. Azure Event Hubs
B. Azure Files
C. Azure SQL Database
D. Azure DNS

Answer: A — Azure Event Hubs

Explanations

A. Correct
Event Hubs ingests streaming events.

B. Incorrect
Azure Files is storage.

C. Incorrect
SQL Database is relational.

D. Incorrect
DNS is networking.


Question 26 (Single Answer)

Which service uses SQL-like queries for real-time stream processing?

A. Azure Stream Analytics
B. Azure Firewall
C. Azure DNS
D. Azure Virtual Machines

Answer: A — Azure Stream Analytics

Explanations

A. Correct
Stream Analytics uses SQL-like syntax.

B. Incorrect
Firewall is security.

C. Incorrect
DNS resolves names.

D. Incorrect
VMs are infrastructure.


Question 27 (Fill in the Blank)

The design commonly used in analytical data models, in which fact tables are linked to dimension tables, is called a __________ schema.

Answer: Star
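A star schema links a central fact table (measures such as quantity and revenue) to surrounding dimension tables (descriptive attributes). The minimal sketch below, using Python's sqlite3 module with made-up table names, shows the typical analytical query pattern: join fact to dimension, then aggregate.

```python
import sqlite3

# A minimal star-schema sketch: one fact table keyed to one dimension table.
# Table and column names are illustrative, not from any specific dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales (product_key INTEGER, quantity INTEGER, revenue REAL);
    INSERT INTO dim_product VALUES (1, 'Bikes'), (2, 'Helmets');
    INSERT INTO fact_sales VALUES (1, 2, 1200.0), (1, 1, 650.0), (2, 5, 250.0);
""")

# Typical analytical query: aggregate fact measures by a dimension attribute.
rows = conn.execute("""
    SELECT p.category, SUM(f.revenue) AS total_revenue
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(rows)  # [('Bikes', 1850.0), ('Helmets', 250.0)]
```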


Question 28 (Single Answer)

Which Power BI object is a single-page collection of visualizations?

A. Report
B. Dashboard
C. Dataset
D. Workspace

Answer: B — Dashboard

Explanations

A. Incorrect
Reports are usually multi-page.

B. Correct
Dashboards are single-page summaries.

C. Incorrect
Datasets store data models.

D. Incorrect
Workspaces organize content.


Question 29 (Single Answer)

Which Power BI feature is used for data transformation?

A. DAX
B. Power Query
C. Power Automate
D. Azure Functions

Answer: B — Power Query

Explanations

A. Incorrect
DAX creates calculations.

B. Correct
Power Query cleans and transforms data.

C. Incorrect
Power Automate automates workflows.

D. Incorrect
Azure Functions run code.


Question 30 (Single Answer)

Which Power BI language is used for measures and calculations?

A. Python
B. JavaScript
C. DAX
D. XML

Answer: C — DAX


Section 5 — Power BI Visualization


Question 31 (Single Answer)

Which chart type is BEST for showing trends over time?

A. Pie chart
B. Scatter chart
C. Line chart
D. Gauge chart

Answer: C — Line chart


Question 32 (Single Answer)

Which visualization is BEST for showing proportions of a whole?

A. Pie chart
B. Table
C. Scatter chart
D. Card

Answer: A — Pie chart


Question 33 (Single Answer)

Which visualization is BEST for geographic analysis?

A. Matrix
B. Map
C. Gauge
D. Card

Answer: B — Map


Question 34 (Single Answer)

Which visualization is BEST for displaying a single KPI?

A. Scatter chart
B. Card
C. Matrix
D. Pie chart

Answer: B — Card


Question 35 (Single Answer)

Which visualization is BEST for comparing categories?

A. Line chart
B. Map
C. Bar chart
D. Gauge chart

Answer: C — Bar chart


Question 36 (Multi-Answer)

Which visuals support detailed tabular reporting? (Choose TWO)

A. Table
B. Matrix
C. Gauge
D. Pie chart

Answers: A and B


Question 37 (Single Answer)

Which Power BI feature enables interactive filtering?

A. DAX
B. Slicer
C. Gauge
D. Workspace

Answer: B — Slicer


Question 38 (Single Answer)

Which visualization is BEST for identifying relationships between two numeric variables?

A. Pie chart
B. Scatter chart
C. Card
D. Gauge chart

Answer: B — Scatter chart


Question 39 (Fill in the Blank)

A Power BI object containing multiple pages of visualizations is called a __________.

Answer: Report


Question 40 (Single Answer)

Which Power BI component is cloud-based and used for sharing reports?

A. Power BI Desktop
B. Power BI Service
C. Power Query
D. Power Pivot

Answer: B — Power BI Service


Section 6 — Advanced Scenarios


Question 41 (Scenario)

A company needs a globally distributed NoSQL database with low latency worldwide.

Which Azure service should they use?

A. Azure SQL Database
B. Azure Cosmos DB
C. Azure Files
D. Azure Blob Storage

Answer: B — Azure Cosmos DB


Question 42 (Scenario)

A company needs to store millions of images and videos cost-effectively.

Which Azure service is MOST appropriate?

A. Azure SQL Database
B. Azure Blob Storage
C. Azure Files
D. Azure Synapse Analytics

Answer: B — Azure Blob Storage


Question 43 (Scenario)

A company needs fully managed relational databases with automatic patching and backups.

Which service is BEST?

A. SQL Server on Azure VMs
B. Azure SQL Database
C. Azure Files
D. Azure Event Hubs

Answer: B — Azure SQL Database


Question 44 (Scenario)

A retail company wants real-time fraud detection from transaction streams.

Which Azure service is MOST appropriate for processing?

A. Azure Stream Analytics
B. Azure DNS
C. Azure Files
D. Azure Backup

Answer: A — Azure Stream Analytics


Question 45 (Multi-Answer)

Which are characteristics of transactional systems? (Choose TWO)

A. Low-latency transactions
B. Historical trend analysis
C. High concurrency
D. Large aggregations

Answers: A and C


Question 46 (Single Answer)

Which SQL statement modifies existing rows?

A. INSERT
B. UPDATE
C. SELECT
D. CREATE

Answer: B — UPDATE


Question 47 (Single Answer)

Which SQL JOIN returns all rows from the left table and matching rows from the right table?

A. INNER JOIN
B. LEFT JOIN
C. RIGHT JOIN
D. CROSS JOIN

Answer: B — LEFT JOIN


Question 48 (Matching)

Match the visualization to the purpose.

Visualization | Purpose
1. Line chart | A. Show relationships
2. Scatter chart | B. Show trends

Answers

  • 1 → B
  • 2 → A

Question 49 (Single Answer)

Which Azure service supports Apache Spark analytics?

A. Azure Databricks
B. Azure Files
C. Azure DNS
D. Azure Firewall

Answer: A — Azure Databricks


Question 50 (Single Answer)

Which storage type is MOST appropriate for key-value NoSQL storage?

A. Azure Table Storage
B. Azure SQL Database
C. Azure Files
D. Azure Synapse Analytics

Answer: A — Azure Table Storage


Section 7 — Mixed Difficulty Review


Question 51 (Single Answer)

What is the primary purpose of normalization?

A. Increase redundancy
B. Improve graphics rendering
C. Reduce duplicate data
D. Increase storage costs

Answer: C — Reduce duplicate data


Question 52 (Single Answer)

Which data type stores audio and video files?

A. Structured
B. Semi-structured
C. Unstructured
D. Relational

Answer: C — Unstructured


Question 53 (Multi-Answer)

Which are benefits of Power BI dashboards? (Choose TWO)

A. Real-time monitoring
B. Single-page summary
C. Operating system management
D. Virtual machine provisioning

Answers: A and B


Question 54 (Single Answer)

Which service is MOST associated with IoT device ingestion?

A. Azure IoT Hub
B. Azure SQL Database
C. Azure Files
D. Azure Backup

Answer: A — Azure IoT Hub


Question 55 (Single Answer)

Which Azure service provides a unified analytics platform with BI integration?

A. Microsoft Fabric
B. Azure Firewall
C. Azure DNS
D. Azure Backup

Answer: A — Microsoft Fabric


Question 56 (Single Answer)

Which object improves database query performance?

A. Table
B. View
C. Index
D. Trigger

Answer: C — Index


Question 57 (Single Answer)

Which workload typically uses OLTP systems?

A. Analytical
B. Transactional
C. Archival
D. Reporting-only

Answer: B — Transactional


Question 58 (Fill in the Blank)

The SQL statement used to remove rows from a table is __________.

Answer: DELETE
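The three data-modification statements tested above (INSERT to add rows, UPDATE to modify rows, DELETE to remove rows) can be seen together in one short sketch. Table and data are illustrative; the example again uses Python's sqlite3 module.

```python
import sqlite3

# A quick tour of the DML statements these questions cover.
# Table name and values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL)")

conn.execute("INSERT INTO products VALUES (1, 10.0), (2, 20.0)")  # add rows
conn.execute("UPDATE products SET price = 25.0 WHERE id = 2")     # modify existing rows
conn.execute("DELETE FROM products WHERE id = 1")                 # remove rows

rows = conn.execute("SELECT id, price FROM products").fetchall()
print(rows)  # [(2, 25.0)]
```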


Question 59 (Single Answer)

Which Azure SQL offering is a Platform as a Service (PaaS) solution?

A. SQL Server on Azure Virtual Machines
B. Azure SQL Database
C. Windows Server
D. Hyper-V

Answer: B — Azure SQL Database


Question 60 (Single Answer)

Which Power BI visualization is MOST appropriate for showing progress toward a goal?

A. Scatter chart
B. Gauge chart
C. Table
D. Pie chart

Answer: B — Gauge chart


Final Exam Tips

Focus heavily on:

  • Relational vs non-relational data
  • Azure storage services
  • Azure SQL family
  • Cosmos DB features
  • Power BI basics
  • Analytics workloads
  • Batch vs streaming concepts

Frequently tested associations:

  • Blob Storage → unstructured files
  • Event Hubs → streaming ingestion
  • Stream Analytics → real-time processing
  • Cosmos DB → globally distributed NoSQL
  • Power BI → visualization and reporting
  • DAX → calculations
  • Power Query → transformation

Power BI Visualization Tips

  • Line chart → trends
  • Bar chart → comparisons
  • Pie chart → proportions
  • Scatter chart → relationships
  • Card → single KPI
  • Map → geographic data

Go to the DP-900 Exam Prep Hub main page.

Practice Questions: Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric) (DP-900 Exam Prep)

Practice Questions


Question 1

What is the primary purpose of Azure Databricks?

A. Hosting relational databases
B. Managing file shares
C. Processing large-scale data using Apache Spark
D. Running virtual machines

Answer: C

Explanation:
Azure Databricks is built on Apache Spark for large-scale data processing.


Question 2

Which feature is a key characteristic of Azure Databricks?

A. Fixed schema relational tables
B. Distributed data processing
C. File-based storage only
D. Limited scalability

Answer: B

Explanation:
Databricks uses distributed computing to process large datasets efficiently.


Question 3

Which scenario is BEST suited for Azure Databricks?

A. Hosting a transactional database
B. Running large-scale ETL pipelines and machine learning models
C. Managing shared file storage
D. Serving static web pages

Answer: B

Explanation:
Databricks is ideal for data engineering and machine learning at scale.


Question 4

What is Microsoft Fabric primarily designed for?

A. Running operating systems
B. Providing a unified, end-to-end analytics platform
C. Managing virtual networks
D. Hosting relational databases only

Answer: B

Explanation:
Microsoft Fabric integrates multiple analytics capabilities into one unified platform.


Question 5

Which component of Microsoft Fabric serves as a unified data storage layer?

A. Azure Blob Storage
B. SQL Database
C. OneLake
D. Azure Files

Answer: C

Explanation:
OneLake is the centralized storage layer within Microsoft Fabric.


Question 6

Which service is BEST suited for organizations that want a single platform for data engineering, data warehousing, and BI?

A. Azure Virtual Machines
B. Azure Databricks
C. Microsoft Fabric
D. Azure Table Storage

Answer: C

Explanation:
Fabric provides an end-to-end unified analytics experience.


Question 7

Which of the following best describes the difference between Azure Databricks and Microsoft Fabric?

A. Databricks is for storage, Fabric is for compute
B. Databricks focuses on big data processing, Fabric provides a unified analytics platform
C. Fabric only supports relational data, Databricks does not
D. Databricks cannot scale, Fabric can

Answer: B

Explanation:
Databricks focuses on processing and ML, while Fabric provides end-to-end analytics.


Question 8

Which programming environments are commonly supported in Azure Databricks notebooks?

A. HTML and CSS only
B. Python, SQL, Scala, and R
C. JavaScript only
D. PowerShell only

Answer: B

Explanation:
Databricks notebooks support multiple languages including Python, SQL, Scala, and R.


Question 9

Which scenario is NOT ideal for Azure Databricks?

A. Large-scale data transformation
B. Machine learning model training
C. Managing simple file shares
D. Processing streaming data

Answer: C

Explanation:
Databricks is not designed for file-sharing scenarios.


Question 10

Which statement about Microsoft Fabric is TRUE?

A. It requires manual infrastructure management
B. It is a SaaS-based unified analytics platform
C. It only supports batch processing
D. It replaces all Azure services

Answer: B

Explanation:
Microsoft Fabric is a fully managed SaaS platform that integrates analytics services.


✅ Quick Exam Takeaways

Azure Databricks

  • Apache Spark-based
  • Distributed processing
  • Data engineering & machine learning

Microsoft Fabric

  • Unified analytics platform
  • End-to-end solution (data + analytics + BI)
  • Includes OneLake storage

✔ Key differences:

  • Databricks → processing & ML
  • Fabric → all-in-one analytics platform

✔ Exam tip:
👉 Big data processing → Azure Databricks
👉 Unified analytics platform → Microsoft Fabric



Practice Questions: Describe responsibilities for data engineers (DP-900 Exam Prep)

Practice Questions


Question 1

Which task is a primary responsibility of a data engineer?

A. Creating dashboards for business users
B. Managing database user permissions
C. Building and maintaining data pipelines
D. Training machine learning models

Answer: C

Explanation:
Data engineers are responsible for designing and maintaining data pipelines that move and transform data.


Question 2

A company needs to collect data from multiple systems and prepare it for reporting.

Which role is primarily responsible for this task?

A. Data Analyst
B. Database Administrator
C. Data Engineer
D. Business User

Answer: C

Explanation:
Data engineers handle data ingestion, integration, and preparation for downstream analytics.


Question 3

Which process involves extracting data from sources, transforming it, and loading it into a destination system?

A. OLTP
B. ETL
C. OLAP
D. ACID

Answer: B

Explanation:
ETL (Extract, Transform, Load) is a core responsibility of data engineers.
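The three ETL stages can be sketched end to end in a few lines. This is a toy illustration (the CSV source, column names, and destination table are made up), not a real pipeline: Azure Data Factory or Databricks would play these roles at scale.

```python
import csv
import io
import sqlite3

# A toy ETL sketch: Extract from a CSV source, Transform the values,
# Load into a destination table. Data and names are illustrative.
source_csv = "region,sales\neast,100\nwest,250\n"

# Extract: read raw records from the source.
records = list(csv.DictReader(io.StringIO(source_csv)))

# Transform: cast types and standardize values.
transformed = [(r["region"].upper(), int(r["sales"])) for r in records]

# Load: write the cleaned rows into the destination store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_by_region (region TEXT, sales INTEGER)")
conn.executemany("INSERT INTO sales_by_region VALUES (?, ?)", transformed)

loaded = conn.execute("SELECT * FROM sales_by_region").fetchall()
print(loaded)  # [('EAST', 100), ('WEST', 250)]
```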


Question 4

Which Azure service is commonly used by data engineers to orchestrate data pipelines?

A. Azure SQL Database
B. Azure Data Factory
C. Azure Blob Storage
D. Azure Virtual Machines

Answer: B

Explanation:
Azure Data Factory is used to build, schedule, and manage data pipelines.


Question 5

Which responsibility ensures that data used for analytics is accurate and reliable?

A. Query optimization
B. Data visualization
C. Data quality management
D. User authentication

Answer: C

Explanation:
Data engineers ensure data quality through validation and cleaning processes.


Question 6

A data engineer is working with large-scale data processing using Apache Spark.

Which Azure service are they MOST likely using?

A. Azure SQL Database
B. Azure Cosmos DB
C. Azure Databricks
D. Azure Table Storage

Answer: C

Explanation:
Azure Databricks is a Spark-based platform used for large-scale data processing.


Question 7

Which storage solution is commonly used by data engineers for storing large volumes of raw and processed data?

A. Azure Data Lake Storage
B. Azure Queue Storage
C. Azure SQL Database
D. Azure Cache for Redis

Answer: A

Explanation:
Azure Data Lake Storage is optimized for big data storage and analytics workloads.


Question 8

Which task is LEAST likely to be performed by a data engineer?

A. Transforming raw data into structured formats
B. Monitoring data pipelines
C. Creating Power BI dashboards
D. Integrating multiple data sources

Answer: C

Explanation:
Creating dashboards is typically the responsibility of a data analyst, not a data engineer.


Question 9

Which type of data processing involves handling real-time data streams?

A. Batch processing
B. Streaming processing
C. Relational processing
D. Transactional processing

Answer: B

Explanation:
Data engineers often work with streaming pipelines for real-time data ingestion.


Question 10

A data engineer selects Parquet as a storage format for a dataset.

What is the primary reason for this choice?

A. It is human readable
B. It supports transactional updates
C. It is optimized for analytical performance
D. It enforces a strict schema

Answer: C

Explanation:
Parquet is a columnar format that improves performance for analytical workloads.
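Why a columnar layout helps analytics can be shown without Parquet itself. The sketch below contrasts a row-oriented layout (like CSV) with a column-oriented one (the idea behind Parquet): an aggregate over one column only needs that column's values, and a column of uniform type compresses well. The data is made up.

```python
# Conceptual sketch of row-oriented vs column-oriented storage.
# Values are illustrative.

# Row-oriented layout (like CSV): each record is stored together.
rows = [
    {"id": 1, "region": "east", "sales": 100},
    {"id": 2, "region": "west", "sales": 250},
    {"id": 3, "region": "east", "sales": 175},
]

# Column-oriented layout (the idea behind Parquet): each column is stored
# together, so an aggregate over 'sales' reads only that column.
columns = {
    "id": [1, 2, 3],
    "region": ["east", "west", "east"],
    "sales": [100, 250, 175],
}

print(sum(columns["sales"]))  # 525 -- touches only the 'sales' column
```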


✅ Quick Exam Takeaways

For DP-900, remember data engineers:

✔ Build and manage data pipelines
✔ Handle ETL/ELT processes
✔ Work with batch and streaming data
✔ Ensure data quality and reliability
✔ Manage data storage solutions (Data Lake, Blob)
✔ Use Azure services like:

  • Azure Data Factory
  • Azure Databricks
  • Azure Data Lake Storage
  • Azure Synapse Analytics

✔ Enable analytics and BI by preparing data



Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric) (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
--> Describe common elements of large-scale analytics
--> Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric)


Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Modern analytics workloads often require processing massive volumes of data quickly and efficiently. Microsoft provides powerful cloud services to meet these needs, including Azure Databricks and Microsoft Fabric.

For the DP-900 exam, you should understand what these services are, their key features, and when to use each.


Why Large-Scale Analytics Services Matter

Large-scale analytics involves:

  • Processing big data (TBs to PBs)
  • Supporting batch and real-time workloads
  • Enabling advanced analytics and machine learning

✔ Traditional tools often cannot scale to meet these demands.


Azure Databricks


What Is Azure Databricks?

Azure Databricks is a cloud-based analytics platform built on Apache Spark.

It is designed for:

  • Big data processing
  • Data engineering
  • Machine learning
  • Collaborative analytics

Key Features


1. Apache Spark-Based Processing

  • Distributed computing engine
  • Processes large datasets in parallel

✔ Ideal for big data workloads


2. Collaborative Workspace

  • Notebooks (Python, SQL, Scala, R)
  • Multiple users can collaborate

3. Integration with Azure

  • Works with Azure Data Lake Storage
  • Integrates with Azure Synapse Analytics

4. Machine Learning Support

  • Built-in ML capabilities
  • Supports advanced analytics workflows

Common Use Cases

  • Big data processing (ETL/ELT pipelines)
  • Data science and machine learning
  • Real-time analytics
  • Data transformation at scale

Best for: Data engineers and data scientists working with large datasets


Microsoft Fabric


What Is Microsoft Fabric?

Microsoft Fabric is an end-to-end, unified analytics platform that brings together multiple data services into a single environment.

It integrates:

  • Data engineering
  • Data warehousing
  • Data science
  • Real-time analytics
  • Business intelligence

Key Features


1. Unified Platform

  • Combines multiple services into one
  • Reduces complexity of managing separate tools

2. OneLake (Unified Storage Layer)

  • Centralized data lake for all workloads
  • Eliminates data silos

3. Integrated Analytics Experiences

  • Data Factory (ingestion)
  • Data Warehouse
  • Real-Time Analytics
  • Power BI integration

4. SaaS-Based Model

  • Fully managed platform
  • Minimal infrastructure management

Common Use Cases

  • End-to-end analytics solutions
  • Unified data platform for organizations
  • Business intelligence and reporting
  • Data integration and transformation

Best for: Organizations wanting a single, unified analytics solution


Azure Databricks vs Microsoft Fabric

Feature | Azure Databricks | Microsoft Fabric
Focus | Big data processing & ML | End-to-end analytics platform
Engine | Apache Spark | Multiple integrated engines
Users | Data engineers, data scientists | Broad (engineers, analysts, business users)
Complexity | More flexible, more technical | Simpler, unified experience
Use Case | Advanced analytics & ML | Unified analytics and BI

How They Fit in an Analytics Architecture

Typical roles:

  • Azure Databricks
    • Data processing
    • Advanced transformations
    • Machine learning
  • Microsoft Fabric
    • End-to-end pipeline
    • Storage (OneLake)
    • Reporting (Power BI integration)

✔ They can complement each other in modern architectures.


Key Considerations When Choosing


Choose Azure Databricks when:

  • You need advanced data engineering or machine learning
  • You require Spark-based processing
  • You want full control and flexibility

Choose Microsoft Fabric when:

  • You want a unified analytics platform
  • You prefer simplified, integrated workflows
  • You need end-to-end analytics in one place

Why This Matters for DP-900

On the exam, you may be asked to:

  • Identify the purpose of Azure Databricks
  • Recognize Microsoft Fabric as a unified analytics platform
  • Choose the right service for a scenario
  • Understand how these services support large-scale analytics

Summary — Exam-Relevant Takeaways

Azure Databricks

  • Apache Spark-based
  • Big data processing
  • Machine learning
  • Flexible and powerful

Microsoft Fabric

  • Unified analytics platform
  • End-to-end solution
  • Includes data engineering, warehousing, and BI

✔ Key difference:

  • Databricks → advanced processing & ML
  • Fabric → all-in-one analytics platform

✔ Exam tip:
👉 Spark + big data processing → Azure Databricks
👉 Unified analytics platform → Microsoft Fabric


Go to the Practice Exam Questions for this topic.


Describe the difference between Batch and Streaming data (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
--> Describe considerations for real-time data analytics
--> Describe the difference between Batch and Streaming data



Understanding the difference between batch data and streaming data is fundamental for designing modern analytics solutions. These two approaches define how data is ingested, processed, and analyzed.


What Is Batch Data?

Batch data refers to data that is:

  • Collected over a period of time
  • Processed in large chunks (batches)
  • Handled at scheduled intervals

Key Characteristics of Batch Data

  • High latency (minutes, hours, or days)
  • Processes large volumes at once
  • Typically scheduled (e.g., nightly jobs)
  • Efficient and cost-effective

Common Use Cases

  • Daily sales reports
  • Monthly financial summaries
  • Historical data analysis
  • Data warehousing workloads

Azure Services for Batch Processing

  • Azure Data Factory → batch ingestion and orchestration
  • Azure Synapse Analytics → batch processing and analytics

What Is Streaming Data?

Streaming data refers to data that is:

  • Generated continuously
  • Processed in real time (or near real time)
  • Handled as individual events or small micro-batches

Key Characteristics of Streaming Data

  • Low latency (seconds or milliseconds)
  • Continuous data flow
  • Enables real-time insights
  • Often requires more complex processing

Common Use Cases

  • IoT sensor monitoring
  • Fraud detection
  • Live dashboards
  • Website activity tracking

Azure Services for Streaming

  • Azure Event Hubs → event ingestion
  • Azure Stream Analytics → real-time processing

Batch vs Streaming — Key Differences

Feature | Batch Processing | Streaming Processing
Data Flow | Periodic | Continuous
Latency | High | Low
Data Size | Large chunks | Small events
Complexity | Simpler | More complex
Cost | Lower | Higher
Use Case | Historical analysis | Real-time insights

When to Use Batch Processing

Choose batch when:

  • Real-time data is not required
  • You are working with large historical datasets
  • Cost efficiency is important
  • Processing can occur on a schedule

When to Use Streaming Processing

Choose streaming when:

  • You need real-time or near real-time insights
  • Data is generated continuously
  • Immediate action is required

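The two styles above can be sketched side by side. This is a minimal illustration with made-up sensor events: the batch function waits for the whole collection (as a scheduled job would), while the streaming function reacts to each event as it arrives.

```python
# A minimal sketch of the two processing styles; the event data is made up.
events = [{"sensor": "t1", "value": v} for v in (20, 21, 35, 22)]

# Batch: accumulate everything first, then process the whole chunk at once
# (what a scheduled nightly job would do).
def batch_average(collected):
    return sum(e["value"] for e in collected) / len(collected)

print(batch_average(events))  # 24.5

# Streaming: handle each event as it arrives, reacting immediately
# instead of waiting for a schedule.
def stream_alerts(source, threshold=30):
    alerts = []
    for event in source:          # in production: an open event-hub connection
        if event["value"] > threshold:
            alerts.append(event)  # immediate action on arrival
    return alerts

print(stream_alerts(iter(events)))  # [{'sensor': 't1', 'value': 35}]
```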
Hybrid Approaches (Lambda / Modern Architectures)

Many modern systems use both:

  • Batch layer → historical analysis
  • Streaming layer → real-time insights

✔ Example:

  • Real-time dashboard + nightly aggregated reports

Why This Matters for DP-900

On the exam, you may be asked to:

  • Distinguish between batch and streaming scenarios
  • Choose the appropriate processing method
  • Identify Azure services for each approach
  • Understand trade-offs (latency, cost, complexity)

Summary — Exam-Relevant Takeaways

Batch processing

  • Processes data in chunks
  • Higher latency
  • Lower cost
  • Best for historical analysis

Streaming processing

  • Processes data continuously
  • Low latency
  • Enables real-time insights
  • More complex

✔ Azure services:

  • Batch → Azure Data Factory, Azure Synapse Analytics
  • Streaming → Azure Event Hubs, Azure Stream Analytics

✔ Exam tip:
👉 Real-time requirement → Streaming
👉 Scheduled / historical → Batch




Identify appropriate visualizations for data (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
--> Describe data visualization in Microsoft Power BI
--> Identify appropriate visualizations for data



Data visualization is the process of representing data graphically so users can quickly understand patterns, trends, relationships, and insights. In Microsoft Power BI, choosing the correct visualization is important for effective reporting and decision-making.

For the DP-900 exam, you should understand:

  • Common visualization types
  • When each visualization should be used
  • The strengths and limitations of different visuals

Why Visualization Selection Matters

The correct visualization helps users:

  • Understand data quickly
  • Identify trends and anomalies
  • Compare values
  • Monitor performance
  • Make informed decisions

Using the wrong visualization can make data confusing or misleading.


Common Visualization Types in Power BI


1. Bar Charts and Column Charts

Purpose

Used to compare values across categories.


Best Used For

  • Comparing sales by region
  • Comparing revenue by product
  • Ranking categories

Difference

  • Bar chart → horizontal bars
  • Column chart → vertical bars

Advantages

✔ Easy to read
✔ Good for comparisons
✔ Works well with categorical data


Example

Sales by product category


2. Line Charts

Purpose

Used to show trends over time.


Best Used For

  • Monthly sales trends
  • Website traffic over time
  • Stock price movement

Advantages

✔ Excellent for time-series data
✔ Clearly shows increases/decreases


Example

Revenue by month


3. Pie Charts and Donut Charts

Purpose

Show proportions or percentages of a whole.


Best Used For

  • Market share
  • Percentage of sales by region

Limitations

❌ Difficult with many categories
❌ Hard to compare similar values


Best Practice

Use only with a small number of categories


4. Tables and Matrices


Tables

Purpose

Display detailed data in rows and columns.

Best Used For

  • Exact values
  • Detailed records

Matrices

Purpose

Display grouped and summarized data in rows and columns, similar to a pivot table.

Best Used For

  • Aggregated business reporting
  • Cross-tab analysis

Advantages

✔ Good for detailed analysis
✔ Supports drill-down


5. Maps

Purpose

Visualize geographic data.


Best Used For

  • Sales by country
  • Store locations
  • Regional performance

Requirements

Data should contain at least one geographic field, such as:

  • Country
  • City
  • Coordinates (latitude and longitude)

6. KPI Visuals

Purpose

Display performance against goals.


Best Used For

  • Revenue targets
  • Operational metrics
  • Performance monitoring

Advantages

✔ Easy to monitor status
✔ Quickly highlights success/failure


7. Gauge Charts

Purpose

Show progress toward a target value.


Best Used For

  • Budget usage
  • Performance thresholds

Example

Current sales vs sales target


8. Scatter Charts

Purpose

Show relationships between two numeric variables.


Best Used For

  • Correlation analysis
  • Identifying outliers

Example

Advertising spend vs revenue


9. Cards

Purpose

Display a single key metric.


Best Used For

  • Total revenue
  • Customer count
  • Profit margin

Advantages

✔ Simple and clear
✔ Common in dashboards


10. Slicers

Purpose

Provide interactive filtering.


Best Used For

  • Filtering by date
  • Selecting regions or categories

Advantages

✔ Enhances report interactivity


Choosing the Right Visualization

Goal → Recommended visualization:

  • Compare categories → Bar/Column Chart
  • Show trends over time → Line Chart
  • Show proportions → Pie/Donut Chart
  • Display exact values → Table
  • Summarize grouped data → Matrix
  • Show geographic data → Map
  • Track KPIs → KPI/Gauge
  • Show correlations → Scatter Chart
  • Show a single metric → Card
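The goal-to-visualization mapping above is essentially a lookup table, which a small Python sketch can make concrete. The helper name and dictionary are hypothetical (not a Power BI API); this is just the exam-style decision rule expressed as code:

```python
# Hypothetical lookup table mirroring the goal-to-visual guidance above.
VISUAL_FOR_GOAL = {
    "compare categories": "Bar/Column Chart",
    "show trends over time": "Line Chart",
    "show proportions": "Pie/Donut Chart",
    "display exact values": "Table",
    "summarize grouped data": "Matrix",
    "show geographic data": "Map",
    "track kpis": "KPI/Gauge",
    "show correlations": "Scatter Chart",
    "show a single metric": "Card",
}

def recommend_visual(goal: str) -> str:
    """Return the recommended visual for a business goal (case-insensitive)."""
    return VISUAL_FOR_GOAL.get(goal.lower(), "No recommendation")

print(recommend_visual("Show trends over time"))  # Line Chart
```

On the exam, the same one-step mapping applies: identify the business goal in the scenario, then pick the matching visual.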

Visualization Best Practices


Keep Visuals Simple

Avoid clutter and unnecessary complexity.


Use Appropriate Colors

Colors should improve readability, not distract.


Limit Pie Chart Categories

Too many slices reduce readability.


Use Consistent Formatting

Helps users interpret reports more easily.


Focus on Business Questions

Choose visuals that answer specific questions.


Interactive Features in Power BI

Power BI visuals support:

  • Filtering
  • Drill-down
  • Cross-highlighting
  • Tooltips

These features make reports interactive and user-friendly.


Why This Matters for DP-900

On the exam, you may be asked to:

  • Identify the best visualization for a scenario
  • Match visualization types to business requirements
  • Understand the strengths and weaknesses of visuals

Summary — Exam-Relevant Takeaways

✔ Common visuals:

  • Bar/Column → comparisons
  • Line → trends over time
  • Pie/Donut → proportions
  • Map → geographic data
  • Scatter → relationships
  • Card → single metric

✔ Tables show detailed data

✔ KPIs and gauges track performance

✔ Slicers provide interactivity

✔ Exam tips:
👉 Line chart = trends over time
👉 Bar chart = category comparison
👉 Pie chart = parts of a whole
👉 Scatter chart = relationships/correlation
👉 Card = single value


Go to the Practice Exam Questions for this topic.

Go to the DP-900 Exam Prep Hub main page.

Additional information: Visualization comparison table (DP-900 Exam Prep)

The table below serves as a tool that can be used to quickly compare and contrast the various visualization types available in Power BI.

Visualization Comparison Table

Chart Type | Purpose | Best Used For | Advantages | Differences | Best Practice | Example
Bar Chart | Compare values across categories | Comparing sales by region, revenue by product, ranking categories | Easy to read; excellent for comparisons | Uses horizontal bars | Use for categorical comparisons with many category labels | Sales by product category
Column Chart | Compare values across categories | Comparing monthly revenue, product performance, department comparisons | Clear visual comparisons; familiar layout | Uses vertical bars | Ideal when category names are short | Revenue by department
Line Chart | Show trends over time | Monthly sales trends, stock prices, website traffic | Excellent for time-series analysis; clearly shows increases/decreases | Focuses on continuous progression over time | Use with dates or sequential data | Revenue by month
Pie Chart | Show proportions of a whole | Market share, percentage contribution by region | Easy to understand with small datasets | Circular chart divided into slices | Limit to a small number of categories | Percentage of sales by region
Donut Chart | Show proportions of a whole | Similar use cases as pie charts | Modern appearance; center area can display totals | Similar to pie chart but with a hollow center | Avoid too many slices | Product category contribution percentages
Table | Display detailed data | Transaction records, exact values | Shows precise values; supports detailed analysis | Displays raw row-and-column data | Use when exact figures are important | Customer order list
Matrix | Summarize grouped data | Cross-tab analysis, business summaries | Supports grouping and drill-down | Similar to a pivot table | Use for summarized reporting | Sales by region and product
Map | Visualize geographic data | Sales by country, store locations, regional analysis | Excellent for location-based insights | Uses geographic plotting | Ensure geographic fields are accurate | Revenue by state
KPI Visual | Display performance against goals | Revenue targets, operational metrics | Quickly shows status and performance | Focuses on KPI indicators and trends | Use for executive dashboards | Monthly sales target status
Gauge Chart | Show progress toward a target | Budget usage, performance thresholds | Easy to interpret progress toward goals | Circular meter-style visualization | Use for single-metric target tracking | Current sales vs target
Scatter Chart | Show relationships between variables | Correlation analysis, identifying outliers | Helps identify patterns and relationships | Plots points using two numeric axes | Use with numeric datasets | Advertising spend vs revenue
Card | Display a single key metric | Total revenue, customer count, profit margin | Very simple and clear | Displays one summarized value | Use for important KPIs | Total Sales
Slicer | Provide interactive filtering | Filtering by date, region, category | Enhances report interactivity | Functions as a filter control rather than a chart | Keep slicers simple and intuitive | Region selection filter

Go to the DP-900 Exam Prep Hub main page.