Below are the free Exam Prep Hubs currently available on The Data Community. Bookmark the hubs you are interested in and use them to ensure you are fully prepared for the respective exam.
Each hub contains:
Topic-by-topic coverage of the material (following the official study guide), making it easy to ensure you cover every aspect of the exam.
Practice exam questions for each section.
Bonus material to help you prepare.
Two (2) Practice Exams with 60 questions each, along with answer keys.
Links to useful resources, such as Microsoft Learn content, YouTube video series, and more.
WARNING: AI-900 will retire on June 30, 2026. It will be replaced with AI-901. You can continue to earn this certification after AI-900 retires by passing AI-901.
Welcome to The Data Community, a great online resource for information centered on the broad and important topic of “data”. Thank you for visiting and participating.
Below are some commonly asked questions about the DP-900: Microsoft Azure Data Fundamentals certification exam. Upon successfully passing this exam, you earn the Microsoft Certified: Azure Data Fundamentals certification.
What is the DP-900 certification exam?
The DP-900: Microsoft Azure Data Fundamentals exam validates your foundational knowledge of core data concepts and how data is implemented using Microsoft Azure services.
Candidates who pass the exam demonstrate understanding of:
Core data concepts (relational vs non-relational data, transactional vs analytical workloads)
Relational data workloads in Azure (Azure SQL Database, SQL Server on Azure Virtual Machines, Azure SQL Managed Instance)
Non-relational data workloads in Azure (Azure Cosmos DB)
Analytical workloads in Azure (Azure Synapse Analytics, Azure Data Factory, Azure Data Lake, Power BI)
This certification is designed for individuals who want to build a baseline understanding of data in the cloud. Upon successfully passing this exam, candidates earn the Microsoft Certified: Azure Data Fundamentals certification.
Is the DP-900 certification exam worth it?
The short answer is yes.
DP-900 is an excellent entry point into Microsoft’s data certification ecosystem. Preparing for this exam helps you:
Build a solid foundation in data concepts
Understand how Azure supports different data workloads
Gain confidence working with cloud-based data platforms
Prepare for more advanced certifications such as DP-203 or PL-300, or for related fundamentals exams such as AI-900
For beginners, career switchers, students, and professionals new to Azure or data, DP-900 provides structured learning and practical context that transfers directly to real-world scenarios.
How many questions are on the DP-900 exam?
The DP-900 exam typically contains between 40 and 60 questions.
Question formats may include:
Single-choice and multiple-choice questions
Multi-select questions
Drag-and-drop or matching questions
Short scenario-based questions
The exact number and format can vary from exam to exam.
How hard is the DP-900 exam?
DP-900 is considered a fundamentals-level exam and is generally easier than associate-level certifications such as PL-300 or DP-203.
That said, it still requires preparation.
The challenge comes from:
Understanding when to use relational vs non-relational data
Recognizing Azure services and their purposes
Interpreting scenario-based questions
Learning basic analytics concepts
With focused study and practice, most candidates find the exam very achievable.
Helpful preparation resources include:
Microsoft Learn (official and free)
The official DP-900 study guide
Practice exams
Community resources and blogs
YouTube tutorials and walkthroughs
How much does the DP-900 certification exam cost?
As of early 2026, the standard exam pricing is approximately:
United States: $99 USD
Other countries: Regionally adjusted pricing applies
Microsoft frequently offers student discounts, academic pricing, and exam vouchers, so it’s worth checking the official Microsoft certification site before scheduling.
How do I prepare for the Microsoft DP-900 certification exam?
The most important advice is not to rush.
Recommended preparation steps:
Review the official DP-900 exam skills outline.
Complete the free Microsoft Learn DP-900 learning path.
Study core data concepts (relational vs non-relational, OLTP vs OLAP).
Learn the purpose of key Azure services such as Azure SQL Database, Azure Cosmos DB, Azure Synapse Analytics, and Power BI.
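The OLTP vs OLAP distinction in step 3 can be sketched with a small in-memory SQLite example (table and column names here are purely illustrative, not from any official material):

```python
import sqlite3

# In-memory database standing in for both workload styles (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP-style work: many small, fast writes -- one row per business transaction.
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("East", 25.0), ("West", 40.0), ("East", 35.0)],
)

# OLAP-style work: a read-heavy aggregation across the accumulated history.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('East', 60.0), ('West', 40.0)]
```

The same data supports both patterns; what differs is the access shape — frequent small writes versus occasional large aggregating reads.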
Welcome to the DP-900: Azure Data Fundamentals Exam Prep Hub!
Welcome to the one-stop hub with information for preparing for the DP-900: Microsoft Azure Data Fundamentals certification exam. The content for this exam helps you “Demonstrate foundational knowledge of core data concepts related to Microsoft Azure data services.” Upon successful completion of the exam, you earn the Microsoft Certified: Azure Data Fundamentals certification.
This hub provides information directly here (topic-by-topic as outlined in the official study guide), links to a number of external resources, tips for preparing for the exam, practice tests, and section questions to help you prepare. Bookmark this page and use it as a guide to ensure that you are fully covering all relevant topics for the DP-900 exam and making use of as many of the available resources as possible.
Audience profile (from Microsoft’s site)
This exam is an opportunity to demonstrate your knowledge of core data concepts and related Microsoft Azure data services. As a candidate for this exam, you should have familiarity with Exam DP-900’s self-paced or instructor-led learning material.
This exam is intended for you, if you’re a candidate beginning to work with data in the cloud.
You should be familiar with:
- The concepts of relational and non-relational data.
- Different types of data workloads, such as transactional or analytical.
You can use Azure Data Fundamentals to prepare for other Azure role-based certifications like Azure Database Administrator Associate or Azure Data Engineer Associate, but it is not a prerequisite for any of them.
Skills at a glance (as specified in the official study guide)
Describe core data concepts (25–30%)
Identify considerations for relational data on Azure (20–25%)
Describe considerations for working with non-relational data on Azure (15–20%)
Describe an analytics workload on Azure (25–30%)
Section 1 — Core Data Concepts
Question 1 (Single Answer)
A company stores customer survey responses in JSON format. Each survey can contain different fields depending on the survey type.
How should this data be classified?
A. Structured B. Semi-structured C. Unstructured D. Transactional
✅ Answer: B — Semi-structured
Explanations
A. Incorrect: Structured data requires a rigid schema.
B. Correct: JSON is semi-structured because it contains flexible tagged fields.
C. Incorrect: Unstructured data has little or no organization.
D. Incorrect: Transactional refers to workload type, not structure.
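Why JSON counts as semi-structured can be shown with a minimal Python sketch (the survey documents below are hypothetical): each document carries its own fields, with no table schema declared up front.

```python
import json

# Two survey responses of different types; each document defines its own fields,
# which is what makes JSON semi-structured rather than rigidly schematized.
docs = [
    '{"survey": "nps", "score": 9}',
    '{"survey": "feedback", "comment": "Great service", "contact_ok": true}',
]

parsed = [json.loads(d) for d in docs]

# Field names vary per document; a relational table would reject this shape.
all_keys = sorted({k for doc in parsed for k in doc})
print(all_keys)  # ['comment', 'contact_ok', 'score', 'survey']
```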
Question 2 (Multi-Answer)
Which characteristics are associated with transactional workloads? (Choose TWO)
A. High concurrency B. Historical aggregations C. Fast insert/update operations D. Large-scale reporting queries
✅ Answers: A and C
Explanations
A. Correct: Transactional systems support many simultaneous users.
B. Incorrect: Historical aggregations are analytical.
C. Correct: OLTP systems perform fast write operations.
D. Incorrect: Large reporting queries belong to analytics workloads.
Question 3 (Scenario-Based)
A database contains duplicated customer addresses across multiple tables. The database architect wants to reduce redundancy and improve consistency.
Which process should be used?
A. Partitioning B. Normalization C. Encryption D. Replication
✅ Answer: B — Normalization
Explanations
A. Incorrect: Partitioning improves scalability.
B. Correct: Normalization reduces duplication.
C. Incorrect: Encryption secures data.
D. Incorrect: Replication copies data.
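A minimal sketch of the normalization idea in this scenario, using in-memory SQLite (the schema and data are hypothetical): the duplicated address is moved into one table and referenced by key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Denormalized input: the same address text is repeated on every order row.
orders_flat = [
    ("Alice", "1 Main St"),
    ("Alice", "1 Main St"),
    ("Bob", "2 Oak Ave"),
]

# Normalized design: store each customer/address once, reference it by key.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER REFERENCES customers(id))")

for name, address in orders_flat:
    row = conn.execute(
        "SELECT id FROM customers WHERE name = ? AND address = ?", (name, address)
    ).fetchone()
    cust_id = row[0] if row else conn.execute(
        "INSERT INTO customers (name, address) VALUES (?, ?)", (name, address)
    ).lastrowid
    conn.execute("INSERT INTO orders (customer_id) VALUES (?)", (cust_id,))

# Each address now lives in exactly one place, fixing it once fixes it everywhere.
n_customers = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
n_orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(n_customers, n_orders)  # 2 3
```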
Question 4 (Single Answer)
Which SQL statement removes an existing table and all its data?
A. DELETE B. REMOVE C. DROP D. ERASE
✅ Answer: C — DROP
Explanations
A. Incorrect: DELETE removes rows only.
B. Incorrect: REMOVE is not standard SQL.
C. Correct: DROP deletes the table structure and data.
D. Incorrect: ERASE is not standard SQL.
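The DELETE vs DROP distinction can be verified with an in-memory SQLite sketch (table name illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.execute("INSERT INTO t VALUES (1), (2)")

# DELETE removes rows but leaves the table definition in place.
conn.execute("DELETE FROM t")
remaining = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(remaining)  # 0 -- table still exists, just empty

# DROP removes the table itself; querying it afterwards is an error.
conn.execute("DROP TABLE t")
try:
    conn.execute("SELECT COUNT(*) FROM t")
    dropped = False
except sqlite3.OperationalError:  # "no such table: t"
    dropped = True
print(dropped)  # True
```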
Question 5 (Matching)
Match the role to the responsibility.
Roles:
1. DBA
2. Data Analyst
3. Data Engineer
Responsibilities:
A. Creates dashboards
B. Maintains database performance
C. Builds data pipelines
✅ Answers
1 → B
2 → A
3 → C
Question 6 (Scenario-Based)
A retail company needs a database for processing thousands of purchases per minute with guaranteed consistency.
Which workload type is MOST appropriate?
A. Analytical B. Streaming C. Transactional D. Archival
✅ Answer: C — Transactional
Explanations
A. Incorrect: Analytical systems focus on reporting.
B. Incorrect: Streaming processes event flows.
C. Correct: Transactional systems support operational consistency and speed.
D. Incorrect: Archival systems store inactive data.
Question 7 (Fill in the Blank)
The SQL statement used to add new rows to a table is __________.
✅ Answer: INSERT
Question 8 (Multi-Answer)
Which file formats are commonly used in analytics workloads? (Choose TWO)
A. Parquet B. ORC C. BMP D. EXE
✅ Answers: A and B
Explanations
A. Correct: Parquet is optimized for analytics.
B. Correct: ORC is another columnar analytics format.
C. Incorrect: BMP is an image format.
D. Incorrect: EXE is executable software.
Question 9 (Scenario-Based)
An organization wants to analyze 10 years of sales history for trends and forecasting.
Which workload type is BEST suited?
A. OLTP B. Analytical C. Streaming D. Operational
✅ Answer: B — Analytical
Question 10 (Single Answer)
Which database object contains reusable SQL logic?
A. View B. Index C. Stored Procedure D. Key
✅ Answer: C — Stored Procedure
Section 2 — Relational Data on Azure
Question 11 (Scenario-Based)
A company is migrating an on-premises SQL Server application that relies heavily on SQL Server Agent, cross-database queries, and instance-level features.
Which Azure service is MOST appropriate?
A. Azure SQL Database B. Azure SQL Managed Instance C. Azure Cosmos DB D. Azure Blob Storage
✅ Answer: B — Azure SQL Managed Instance
Explanations
A. Incorrect: Azure SQL Database has fewer instance-level features.
B. Correct: Managed Instance offers near-full SQL Server compatibility.
C. Incorrect: Cosmos DB is NoSQL.
D. Incorrect: Blob Storage stores files.
Question 12 (Single Answer)
Which Azure SQL offering provides the HIGHEST level of infrastructure control?
A. Azure SQL Database B. Azure SQL Managed Instance C. SQL Server on Azure Virtual Machines D. Azure Synapse Analytics
✅ Answer: C — SQL Server on Azure Virtual Machines
Question 13 (Multi-Answer)
Which are advantages of Platform as a Service (PaaS) databases? (Choose TWO)
A. Automatic patching B. Reduced administrative overhead C. Full operating system control D. Manual backups only
✅ Answers: A and B
Question 14 (Scenario-Based)
A company wants automatic scaling, backups, and minimal management overhead for a new cloud-native application.
Which solution is BEST?
A. SQL Server on Azure VMs B. Azure SQL Database C. Windows Server Failover Cluster D. Self-hosted SQL Server
✅ Answer: B — Azure SQL Database
Question 15 (Single Answer)
What is the purpose of a foreign key?
A. Encrypt data B. Create indexes C. Enforce relationships between tables D. Remove duplicates
✅ Answer: C — Enforce relationships between tables
Question 16 (Scenario-Based)
A company needs a managed PostgreSQL service in Azure.
Which service should be used?
A. Azure SQL Database B. Azure Database for PostgreSQL C. Azure Blob Storage D. Azure Cosmos DB
✅ Answer: B — Azure Database for PostgreSQL
Question 17 (Single Answer)
Which normalization form removes transitive dependencies?
A. 1NF B. 2NF C. 3NF D. 4NF
✅ Answer: C — 3NF
Question 18 (Multi-Answer)
Which SQL statements are Data Manipulation Language (DML)? (Choose TWO)
A. SELECT B. INSERT C. CREATE D. DROP
✅ Answers: A and B
Question 19 (Scenario-Based)
A query needs to return ALL customers, including those without orders.
Which JOIN should be used?
A. INNER JOIN B. CROSS JOIN C. LEFT JOIN D. SELF JOIN
✅ Answer: C — LEFT JOIN
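The LEFT JOIN behavior in this scenario — all customers returned, with NULLs where no order matches — can be sketched with in-memory SQLite (table names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1, 50.0);  -- Bob has no orders
""")

# LEFT JOIN keeps every customer; order columns are NULL where no match exists.
rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.name
""").fetchall()
print(rows)  # [('Alice', 50.0), ('Bob', None)]
```

An INNER JOIN on the same data would drop Bob entirely, which is exactly why it is the wrong choice here.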
Question 20 (Single Answer)
Which object improves query performance but does NOT store actual business data?
A. Table B. View C. Index D. Row
✅ Answer: C — Index
Section 3 — Non-Relational Data
Question 21 (Scenario-Based)
A media company needs to store petabytes of video content at low cost.
Which Azure service is MOST appropriate?
A. Azure SQL Database B. Azure Blob Storage C. Azure Table Storage D. Azure Cache for Redis
✅ Answer: B — Azure Blob Storage
Question 22 (Single Answer)
Which Azure Blob Storage tier is optimized for infrequently accessed data?
A. Premium B. Hot C. Cool D. Archive
✅ Answer: C — Cool
Question 23 (Scenario-Based)
An organization needs cloud-hosted SMB file shares accessible by both cloud and on-premises servers.
Which service should be used?
A. Azure Cosmos DB B. Azure Files C. Azure Table Storage D. Azure SQL Database
✅ Answer: B — Azure Files
Question 24 (Multi-Answer)
Which APIs are supported by Azure Cosmos DB? (Choose TWO)
A. MongoDB B. Cassandra C. Oracle D. SMB
✅ Answers: A and B
Question 25 (Scenario-Based)
A gaming company needs globally distributed low-latency data access for player profiles.
Which Azure service is BEST?
A. Azure Cosmos DB B. Azure Files C. Azure SQL Database D. Azure Blob Storage
✅ Answer: A — Azure Cosmos DB
Question 26 (Single Answer)
What is a major benefit of Azure Cosmos DB partitioning?
A. Reduces security B. Enables scalability C. Removes replication D. Prevents indexing
✅ Answer: B — Enables scalability
Question 27 (Fill in the Blank)
Azure Cosmos DB provides multi-region __________ to improve availability and performance.
✅ Answer: replication
Question 28 (Scenario-Based)
A company needs a NoSQL key-value store for massive telemetry ingestion.
Which service is MOST appropriate?
A. Azure Table Storage B. Azure SQL Database C. Azure Files D. Azure DNS
✅ Answer: A — Azure Table Storage
Question 29 (Single Answer)
Which storage service stores data as objects inside containers?
A. Azure Files B. Azure Blob Storage C. Azure SQL Database D. Azure Cosmos DB
✅ Answer: B — Azure Blob Storage
Question 30 (Multi-Answer)
Which are characteristics of non-relational databases? (Choose TWO)
A. Flexible schemas B. Strict relational constraints C. Horizontal scalability D. Mandatory JOIN operations
✅ Answers: A and C
Section 4 — Analytics Workloads
Question 31 (Scenario-Based)
A company collects IoT sensor readings every second and needs near real-time dashboards.
Which processing approach is MOST appropriate?
A. Batch processing B. Streaming processing C. Archival processing D. Offline reporting
✅ Answer: B — Streaming processing
Question 32 (Single Answer)
Which Azure service is designed for high-throughput event ingestion?
A. Azure Event Hubs B. Azure Backup C. Azure Files D. Azure DNS
✅ Answer: A — Azure Event Hubs
Question 33 (Scenario-Based)
An organization needs Apache Spark-based analytics with collaborative notebooks.
Which service is BEST?
A. Azure Databricks B. Azure Files C. Azure DNS D. Azure Firewall
✅ Answer: A — Azure Databricks
Question 34 (Single Answer)
Which architecture commonly includes fact tables and dimension tables?
A. OLTP schema B. Star schema C. Graph schema D. XML schema
✅ Answer: B — Star schema
Question 35 (Multi-Answer)
Which are characteristics of a data warehouse? (Choose TWO)
A. Optimized for analytics B. Stores historical data C. Primarily supports OLTP transactions D. Limited aggregations
✅ Answers: A and B
Question 36 (Scenario-Based)
A company wants a unified analytics platform combining engineering, warehousing, data science, and BI.
Which Microsoft service BEST fits?
A. Microsoft Fabric B. Azure Files C. Azure Firewall D. Azure DNS
✅ Answer: A — Microsoft Fabric
Question 37 (Single Answer)
Which service allows SQL-like queries against streaming data?
A. Azure Stream Analytics B. Azure Files C. Azure Backup D. Azure Monitor
✅ Answer: A — Azure Stream Analytics
Question 38 (Scenario-Based)
An organization processes payroll data once nightly.
Which processing type is MOST appropriate?
A. Streaming B. Batch C. Event-driven only D. Real-time analytics
✅ Answer: B — Batch
Question 39 (Single Answer)
Which process extracts, transforms, and loads data into analytical systems?
A. ETL B. DNS C. RAID D. OLTP
✅ Answer: A — ETL
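A minimal ETL sketch in Python (the CSV data and table name are hypothetical) showing the three stages in order — extract from a source, transform the values, load into an analytical store:

```python
import csv
import io
import sqlite3

# Extract: read raw rows from a source (a CSV string stands in for a real file).
raw = "region,amount\nEast, 25\nWest, 40\nEast, 35\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# Transform: clean up whitespace and convert amounts to numeric types.
cleaned = [(r["region"].strip(), float(r["amount"])) for r in rows]

# Load: write the transformed rows into the analytical store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)

total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)  # 100.0
```

Services such as Azure Data Factory orchestrate this same pattern at scale, with connectors in place of the hand-written extract and load steps.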
Question 40 (Multi-Answer)
Which services are commonly associated with real-time analytics? (Choose TWO)
A. Azure Event Hubs B. Azure Stream Analytics C. Azure Files D. Azure Backup
✅ Answers: A and B
Section 5 — Power BI
Question 41 (Scenario-Based)
An executive wants a single-page overview showing KPIs and summary visuals.
Which Power BI object should be used?
A. Dataset B. Dashboard C. Dataflow D. Semantic model
✅ Answer: B — Dashboard
Question 42 (Single Answer)
Which Power BI component is primarily used for data transformation?
A. DAX B. Power Query C. Azure Functions D. Power Automate
✅ Answer: B — Power Query
Question 43 (Scenario-Based)
A report must show revenue trends over 24 months.
Which visualization is BEST?
A. Pie chart B. Gauge chart C. Line chart D. Scatter chart
✅ Answer: C — Line chart
Question 44 (Single Answer)
Which visualization is BEST for displaying proportions?
A. Scatter chart B. Pie chart C. Card D. Gauge chart
✅ Answer: B — Pie chart
Question 45 (Scenario-Based)
A company wants users to filter reports interactively by region and year.
Which feature should be used?
A. Indexes B. Slicers C. Measures D. Triggers
✅ Answer: B — Slicers
Question 46 (Single Answer)
Which Power BI language creates measures and calculated columns?
A. SQL B. Python C. DAX D. XML
✅ Answer: C — DAX
Question 47 (Scenario-Based)
A business analyst wants to identify the relationship between advertising spend and revenue.
Which visualization is BEST?
A. Pie chart B. Scatter chart C. Gauge chart D. Card
✅ Answer: B — Scatter chart
Question 48 (Single Answer)
Which Power BI visualization is BEST for detailed row-level data?
A. Table B. Gauge C. Pie chart D. Card
✅ Answer: A — Table
Question 49 (Multi-Answer)
Which are benefits of Power BI dashboards? (Choose TWO)
A. Real-time monitoring B. Single-page summaries C. Operating system administration D. SQL indexing
✅ Answers: A and B
Question 50 (Scenario-Based)
A company needs a geographic visualization of sales by country.
Which visualization is BEST?
A. Matrix B. Map C. Gauge D. Card
✅ Answer: B — Map
Section 6 — Comprehensive Scenarios
Question 51 (Scenario-Based)
A healthcare organization requires:
Globally distributed NoSQL storage
Automatic replication
Low latency worldwide
Flexible schema support
Which solution BEST fits?
A. Azure SQL Database B. Azure Cosmos DB C. Azure Files D. Azure Synapse Analytics
✅ Answer: B — Azure Cosmos DB
Question 52 (Scenario-Based)
A manufacturing company collects sensor telemetry every second from thousands of devices.
Which Azure service should ingest the streaming events?
A. Azure Event Hubs B. Azure Files C. Azure SQL Managed Instance D. Azure Backup
✅ Answer: A — Azure Event Hubs
Question 53 (Scenario-Based)
A company wants full control of SQL Server patching, OS configuration, and backups.
Which deployment option should be used?
A. Azure SQL Database B. Azure SQL Managed Instance C. SQL Server on Azure Virtual Machines D. Azure Cosmos DB
✅ Answer: C — SQL Server on Azure Virtual Machines
Question 54 (Single Answer)
Which Azure service is MOST optimized for unstructured object storage?
A. Azure Blob Storage B. Azure SQL Database C. Azure Files D. Azure Synapse Analytics
✅ Answer: A — Azure Blob Storage
Question 55 (Scenario-Based)
An analytics team needs to store historical sales data optimized for aggregation queries.
Which solution is BEST?
A. Transactional database B. Data warehouse C. Azure Files D. DNS server
✅ Answer: B — Data warehouse
Question 56 (Single Answer)
Which SQL statement changes existing records?
A. CREATE B. UPDATE C. INSERT D. ALTER
✅ Answer: B — UPDATE
Question 57 (Multi-Answer)
Which are benefits of normalization? (Choose TWO)
A. Reduced redundancy B. Improved consistency C. Increased duplicate storage D. Reduced relationships
✅ Answers: A and B
Question 58 (Scenario-Based)
A report needs to compare revenue across product categories.
Which visualization is BEST?
A. Line chart B. Scatter chart C. Bar chart D. Gauge chart
✅ Answer: C — Bar chart
Question 59 (Fill in the Blank)
The SQL JOIN that returns only matching rows from both tables is called an __________ JOIN.
✅ Answer: INNER
Question 60 (Scenario-Based)
A company needs:
Large-scale analytics
Integrated Power BI reporting
Data engineering
Real-time analytics
Unified SaaS experience
Which platform BEST meets these requirements?
A. Microsoft Fabric B. Azure Files C. Azure DNS D. Windows Server Failover Clustering
✅ Answer: A — Microsoft Fabric
This practice exam covers all major skills measured on the DP-900 certification exam, including:
Core data concepts
Relational data on Azure
Non-relational data on Azure
Analytics workloads
Power BI and visualization
Real-time analytics
Azure data services
Question formats include:
Single-answer multiple choice
Multi-answer multiple choice
Matching/connect-the-answers
Fill-in-the-blank
Scenario-based questions
Section 1 — Core Data Concepts
Question 1 (Single Answer)
Which type of data has a predefined schema consisting of rows and columns?
A. Unstructured data B. Semi-structured data C. Structured data D. Streaming data
✅ Answer: C — Structured data
Explanations
A. Incorrect: Unstructured data does not have a predefined schema.
B. Incorrect: Semi-structured data has some organization but not fixed rows/columns.
C. Correct: Structured data uses a defined schema with rows and columns.
D. Incorrect: Streaming data refers to continuously arriving data, not a structure type.
Question 2 (Multi-Answer)
Which of the following are examples of semi-structured data? (Choose TWO)
A. JSON B. CSV C. XML D. SQL tables
✅ Answers: A and C
Explanations
A. Correct: JSON carries structure through tags but allows flexible schemas.
B. Incorrect: CSV is structured tabular data.
C. Correct: XML is semi-structured because it uses tagged hierarchical data.
D. Incorrect: SQL tables are structured relational data.
Question 3 (Fill in the Blank)
A database design technique used to reduce data redundancy is called __________.
✅ Answer: Normalization
Explanation
Normalization organizes data efficiently and minimizes duplication.
Question 4 (Single Answer)
Which SQL statement retrieves data from a table?
A. INSERT B. UPDATE C. SELECT D. DELETE
✅ Answer: C — SELECT
Explanations
A. Incorrect: INSERT adds records.
B. Incorrect: UPDATE modifies records.
C. Correct: SELECT retrieves data.
D. Incorrect: DELETE removes records.
Question 5 (Matching)
Match the workload to its description.
Workloads:
1. Transactional
2. Analytical
Descriptions:
A. Historical analysis
B. Real-time business operations
✅ Answers
1 → B
2 → A
Explanation
Transactional workloads support day-to-day operations; analytical workloads analyze historical data.
Question 6 (Single Answer)
Which role is MOST responsible for maintaining database availability and backups?
A. Data Analyst B. Data Engineer C. Database Administrator D. Business User
✅ Answer: C — Database Administrator
Explanations
A. Incorrect: Data analysts focus on reporting and insights.
B. Incorrect: Data engineers build pipelines and integration systems.
C. Correct: DBAs manage availability, backups, and performance.
D. Incorrect: Business users consume reports.
Question 7 (Multi-Answer)
Which are characteristics of analytical workloads? (Choose TWO)
A. Frequent INSERT operations B. Historical trend analysis C. Large-scale aggregations D. High-volume OLTP transactions
✅ Answers: B and C
Explanations
A. Incorrect: Frequent inserts are more common in transactional systems.
B. Correct: Analytical systems examine historical data.
C. Correct: Aggregations are common in analytics.
D. Incorrect: OLTP workloads are transactional.
Question 8 (Single Answer)
Which file format is commonly used for big data analytics because of columnar storage and compression?
A. TXT B. CSV C. Parquet D. XML
✅ Answer: C — Parquet
Explanations
A. Incorrect: TXT files are plain text.
B. Incorrect: CSV is row-based text data.
C. Correct: Parquet is optimized for analytics workloads.
D. Incorrect: XML is semi-structured but not optimized for analytics.
Question 9 (Single Answer)
Which database object stores data in rows and columns?
A. View B. Stored procedure C. Table D. Index
✅ Answer: C — Table
Explanations
A. Incorrect: Views are virtual query results.
B. Incorrect: Stored procedures contain SQL logic.
C. Correct: Tables store relational data.
D. Incorrect: Indexes improve query performance.
Question 10 (Single Answer)
Which SQL JOIN returns only matching rows from both tables?
A. LEFT JOIN B. RIGHT JOIN C. INNER JOIN D. FULL OUTER JOIN
✅ Answer: C — INNER JOIN
Explanations
A. Incorrect: LEFT JOIN includes unmatched left-side rows.
B. Incorrect: RIGHT JOIN includes unmatched right-side rows.
C. Correct: INNER JOIN returns only matches.
D. Incorrect: FULL OUTER JOIN includes all rows.
Section 2 — Relational Data on Azure
Question 11 (Single Answer)
Which Azure SQL option provides the MOST compatibility with on-premises SQL Server?
A. Azure SQL Database B. Azure SQL Managed Instance C. Azure Cosmos DB D. Azure Blob Storage
✅ Answer: B — Azure SQL Managed Instance
Explanations
A. Incorrect: Azure SQL Database is fully managed but has fewer instance-level features.
B. Correct: Managed Instance provides near-full SQL Server compatibility.
C. Incorrect: Cosmos DB is NoSQL.
D. Incorrect: Blob Storage is object storage.
Question 12 (Multi-Answer)
Which Azure services support open-source relational databases? (Choose TWO)
A. Azure Database for PostgreSQL B. Azure Database for MySQL C. Azure Synapse Analytics D. Azure Files
✅ Answers: A and B
Explanations
A. Correct: Azure provides managed PostgreSQL.
B. Correct: Azure provides managed MySQL.
C. Incorrect: Synapse is analytics-focused.
D. Incorrect: Azure Files is storage.
Question 13 (Single Answer)
Which Azure SQL option gives customers the MOST operating system control?
A. Azure SQL Database B. Azure SQL Managed Instance C. SQL Server on Azure Virtual Machines D. Azure Cosmos DB
✅ Answer: C — SQL Server on Azure Virtual Machines
Explanations
A. Incorrect: Azure SQL Database is a fully managed platform service.
B. Incorrect: Managed Instance is a managed service with limited OS access.
C. Correct: VMs provide full infrastructure control.
D. Incorrect: Cosmos DB is NoSQL.
Question 14 (Fill in the Blank)
A column whose values uniquely identify each row in a table is called a __________ key.
✅ Answer: Primary
Explanation
A primary key uniquely identifies rows.
Question 15 (Single Answer)
Which database normalization form removes repeating groups?
A. 1NF B. 2NF C. 3NF D. 4NF
✅ Answer: A — 1NF
Explanations
A. Correct: 1NF eliminates repeating groups.
B. Incorrect: 2NF removes partial dependencies.
C. Incorrect: 3NF removes transitive dependencies.
D. Incorrect: 4NF handles multi-valued dependencies.
Section 3 — Non-Relational Data on Azure
Question 16 (Single Answer)
Which Azure storage service is best for storing large unstructured files?
A. Azure SQL Database B. Azure Blob Storage C. Azure Table Storage D. Azure Cosmos DB
✅ Answer: B — Azure Blob Storage
Explanations
A. Incorrect: SQL Database is relational.
B. Correct: Blob Storage stores unstructured objects like images and videos.
C. Incorrect: Table Storage stores NoSQL key-value data.
D. Incorrect: Cosmos DB is a globally distributed database.
Question 17 (Single Answer)
Which Azure storage service provides SMB file shares?
A. Azure Blob Storage B. Azure Cosmos DB C. Azure Files D. Azure Table Storage
✅ Answer: C — Azure Files
Explanations
A. Incorrect: Blob Storage is object storage.
B. Incorrect: Cosmos DB is NoSQL.
C. Correct: Azure Files supports SMB shares.
D. Incorrect: Table Storage stores structured NoSQL entities.
Question 18 (Multi-Answer)
Which are valid Azure Cosmos DB APIs? (Choose TWO)
A. MongoDB API B. Cassandra API C. Oracle API D. SMB API
✅ Answers: A and B
Explanations
A. Correct: Cosmos DB supports the MongoDB API.
B. Correct: Cosmos DB supports the Cassandra API.
C. Incorrect: An Oracle API is not supported.
D. Incorrect: SMB is a file-sharing protocol.
Question 19 (Single Answer)
Which characteristic is a major feature of Azure Cosmos DB?
A. Single-region architecture B. Global distribution C. Relational-only schema D. File-share management
✅ Answer: B — Global distribution
Explanations
A. Incorrect: Cosmos DB supports multiple regions.
B. Correct: Global distribution is a key feature.
C. Incorrect: Cosmos DB is NoSQL.
D. Incorrect: Cosmos DB is not a file-sharing service.
Question 20 (Matching)
Match the storage service to its use case.
Services:
1. Blob Storage
2. Azure Files
Use cases:
A. SMB file shares
B. Unstructured objects
✅ Answers
1 → B
2 → A
Section 4 — Analytics Workloads
Question 21 (Single Answer)
Which process involves collecting data from multiple sources into an analytics system?
A. Visualization B. Data ingestion C. Data modeling D. Backup
✅ Answer: B — Data ingestion
Explanations
A. Incorrect: Visualization displays data.
B. Correct: Ingestion collects and imports data.
C. Incorrect: Modeling defines relationships and calculations.
D. Incorrect: Backup protects data copies.
Question 22 (Single Answer)
Which analytical store is optimized for historical analytics and reporting?
A. OLTP database B. Data warehouse C. Azure Files D. DNS server
✅ Answer: B — Data warehouse
Explanations
A. Incorrect: OLTP supports transactions.
B. Correct: Warehouses support analytics.
C. Incorrect: Azure Files provides storage shares.
D. Incorrect: DNS resolves names.
Question 23 (Multi-Answer)
Which Microsoft services support large-scale analytics? (Choose TWO)
A. Azure Databricks B. Microsoft Fabric C. Azure DNS D. Azure Firewall
✅ Answers: A and B
Explanations
A. Correct: Databricks supports big data analytics.
B. Correct: Fabric is an end-to-end analytics platform.
C. Incorrect: DNS is networking.
D. Incorrect: Firewall is security infrastructure.
Question 24 (Single Answer)
What is the primary difference between batch processing and streaming processing?
A. Batch processing handles data continuously B. Streaming processes data as it arrives C. Streaming stores only historical data D. Batch requires IoT devices
✅ Answer: B — Streaming processes data as it arrives
Explanations
A. Incorrect: Continuous processing describes streaming, not batch.
B. Correct: Streaming handles near real-time data.
C. Incorrect: Streaming is not limited to historical data.
D. Incorrect: Batch processing does not require IoT devices.
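The batch/streaming difference can be sketched in a few lines of Python (the telemetry events below are made up): both styles process the same data, but streaming maintains results as events arrive while batch waits for a scheduled window.

```python
from datetime import datetime, timedelta

# Hypothetical telemetry: (timestamp, reading) pairs arriving one at a time.
start = datetime(2026, 1, 1)
events = [(start + timedelta(seconds=i), i % 3) for i in range(10)]

# Streaming style: update a running aggregate as each event arrives,
# so an up-to-date total is available at any moment.
running_total = 0
for _, reading in events:
    running_total += reading

# Batch style: wait for the scheduled window, then process everything in one pass.
batch_total = sum(reading for _, reading in events)

# Both styles see the same data; they differ in WHEN results are produced.
print(running_total, batch_total)  # 9 9
```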
Question 25 (Single Answer)
Which Azure service is commonly used for streaming event ingestion?
A. Azure Event Hubs B. Azure Files C. Azure SQL Database D. Azure DNS
✅ Answer: A — Azure Event Hubs
Explanations
A. Correct: Event Hubs ingests streaming events.
B. Incorrect: Azure Files is storage.
C. Incorrect: SQL Database is relational.
D. Incorrect: DNS is networking.
Question 26 (Single Answer)
Which service uses SQL-like queries for real-time stream processing?
A. Azure Stream Analytics B. Azure Firewall C. Azure DNS D. Azure Virtual Machines
✅ Answer: A — Azure Stream Analytics
Explanations
A. Correct: Stream Analytics uses SQL-like syntax.
B. Incorrect: Firewall is security.
C. Incorrect: DNS resolves names.
D. Incorrect: VMs are infrastructure.
Question 27 (Fill in the Blank)
The architecture commonly used in analytics models with fact and dimension tables is called a __________ schema.
✅ Answer: Star
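A minimal star-schema sketch using in-memory SQLite (the fact/dimension tables and data are hypothetical): a central fact table holds the measures, and a dimension table holds the descriptive attributes used for grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension table: descriptive attributes used for slicing and grouping.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);

-- Fact table: numeric measures plus foreign keys pointing at the dimensions.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);

INSERT INTO dim_product VALUES (1, 'Bikes'), (2, 'Helmets');
INSERT INTO fact_sales VALUES (1, 500.0), (1, 700.0), (2, 80.0);
""")

# A typical star-schema query: join facts to a dimension and aggregate.
rows = conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_key = f.product_key
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
print(rows)  # [('Bikes', 1200.0), ('Helmets', 80.0)]
```

Drawn as a diagram, the fact table sits in the middle with dimension tables radiating outward — hence the "star" name.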
Question 28 (Single Answer)
Which Power BI object is a single-page collection of visualizations?
A. Report B. Dashboard C. Dataset D. Workspace
✅ Answer: B — Dashboard
Explanations
A. Incorrect: Reports are usually multi-page.
B. Correct: Dashboards are single-page summaries.
C. Incorrect: Datasets store data models.
D. Incorrect: Workspaces organize content.
Question 29 (Single Answer)
Which Power BI feature is used for data transformation?
A. DAX B. Power Query C. Power Automate D. Azure Functions
✅ Answer: B — Power Query
Explanations
A. Incorrect: DAX creates calculations.
B. Correct: Power Query cleans and transforms data.
C. Incorrect: Power Automate automates workflows.
D. Incorrect: Azure Functions run code.
Question 30 (Single Answer)
Which Power BI language is used for measures and calculations?
A. Python B. JavaScript C. DAX D. XML
✅ Answer: C — DAX
Section 5 — Power BI Visualization
Question 31 (Single Answer)
Which chart type is BEST for showing trends over time?
A. Pie chart B. Scatter chart C. Line chart D. Gauge chart
✅ Answer: C — Line chart
Question 32 (Single Answer)
Which visualization is BEST for showing proportions of a whole?
A. Pie chart B. Table C. Scatter chart D. Card
✅ Answer: A — Pie chart
Question 33 (Single Answer)
Which visualization is BEST for geographic analysis?
A. Matrix B. Map C. Gauge D. Card
✅ Answer: B — Map
Question 34 (Single Answer)
Which visualization is BEST for displaying a single KPI?
A. Scatter chart B. Card C. Matrix D. Pie chart
✅ Answer: B — Card
Question 35 (Single Answer)
Which visualization is BEST for comparing categories?
A. Line chart B. Map C. Bar chart D. Gauge chart
✅ Answer: C — Bar chart
Question 36 (Multi-Answer)
Which visuals support detailed tabular reporting? (Choose TWO)
A. Table B. Matrix C. Gauge D. Pie chart
✅ Answers: A and B
Question 37 (Single Answer)
Which Power BI feature enables interactive filtering?
A. DAX B. Slicer C. Gauge D. Workspace
✅ Answer: B — Slicer
Question 38 (Single Answer)
Which visualization is BEST for identifying relationships between two numeric variables?
A. Pie chart B. Scatter chart C. Card D. Gauge chart
✅ Answer: B — Scatter chart
Question 39 (Fill in the Blank)
A Power BI object containing multiple pages of visualizations is called a __________.
✅ Answer: Report
Question 40 (Single Answer)
Which Power BI component is cloud-based and used for sharing reports?
A. Power BI Desktop B. Power BI Service C. Power Query D. Power Pivot
✅ Answer: B — Power BI Service
Section 6 — Advanced Scenarios
Question 41 (Scenario)
A company needs a globally distributed NoSQL database with low latency worldwide.
Which Azure service should they use?
A. Azure SQL Database B. Azure Cosmos DB C. Azure Files D. Azure Blob Storage
✅ Answer: B — Azure Cosmos DB
Question 42 (Scenario)
A company needs to store millions of images and videos cost-effectively.
Which Azure service is MOST appropriate?
A. Azure SQL Database B. Azure Blob Storage C. Azure Files D. Azure Synapse Analytics
✅ Answer: B — Azure Blob Storage
Question 43 (Scenario)
A company needs fully managed relational databases with automatic patching and backups.
Which service is BEST?
A. SQL Server on Azure VMs B. Azure SQL Database C. Azure Files D. Azure Event Hubs
✅ Answer: B — Azure SQL Database
Question 44 (Scenario)
A retail company wants real-time fraud detection from transaction streams.
Which Azure service is MOST appropriate for processing?
A. Azure Stream Analytics B. Azure DNS C. Azure Files D. Azure Backup
✅ Answer: A — Azure Stream Analytics
Question 45 (Multi-Answer)
Which are characteristics of transactional systems? (Choose TWO)
A. Low-latency transactions B. Historical trend analysis C. High concurrency D. Large aggregations
✅ Answers: A and C
Question 46 (Single Answer)
Which SQL statement modifies existing rows?
A. INSERT B. UPDATE C. SELECT D. CREATE
✅ Answer: B — UPDATE
Question 47 (Single Answer)
Which SQL JOIN returns all rows from the left table and matching rows from the right table?
A. INNER JOIN B. LEFT JOIN C. RIGHT JOIN D. CROSS JOIN
✅ Answer: B — LEFT JOIN
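The LEFT JOIN behavior can be demonstrated with Python's built-in sqlite3 module. The tables and sample rows below are illustrative: Ben has no orders, so a LEFT JOIN keeps him with a NULL total while an INNER JOIN drops him.

```python
import sqlite3

# LEFT JOIN sketch: every row from the left table is kept; right-table
# columns come back NULL (None in Python) where no match exists.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders    (customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO orders VALUES (1, 50.0);  -- Ben has no orders
""")
left_join = cur.execute("""
SELECT c.name, o.total
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
ORDER BY c.name
""").fetchall()
inner_join = cur.execute("""
SELECT c.name, o.total
FROM customers c
INNER JOIN orders o ON o.customer_id = c.id
""").fetchall()
print(left_join)   # Ben appears with a None total
print(inner_join)  # Ben is dropped
```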
Question 48 (Matching)
Match the visualization to the purpose.
Visualization: 1. Line chart, 2. Scatter chart
Purpose: A. Show relationships, B. Show trends
✅ Answers: 1 → B, 2 → A
Question 49 (Single Answer)
Which Azure service supports Apache Spark analytics?
A. Azure Databricks B. Azure Files C. Azure DNS D. Azure Firewall
✅ Answer: A — Azure Databricks
Question 50 (Single Answer)
Which storage type is MOST appropriate for key-value NoSQL storage?
A. Azure Table Storage B. Azure SQL Database C. Azure Files D. Azure Synapse Analytics
✅ Answer: A — Azure Table Storage
Section 7 — Mixed Difficulty Review
Question 51 (Single Answer)
What is the primary purpose of normalization?
A. Increase redundancy B. Improve graphics rendering C. Reduce duplicate data D. Increase storage costs
✅ Answer: C — Reduce duplicate data
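Normalization's effect on duplication can be sketched in plain Python. The customer and order data below are made up for illustration: the denormalized layout repeats the customer's city on every order row, while the normalized layout stores it once and references it by key.

```python
# Denormalized: the customer's city is repeated on every order row.
denormalized = [
    ("Ana", "Seattle", "Order-1"),
    ("Ana", "Seattle", "Order-2"),
    ("Ana", "Seattle", "Order-3"),
]

# Normalized: the customer is stored once; orders reference it by key.
customers = {1: ("Ana", "Seattle")}
orders = [(1, "Order-1"), (1, "Order-2"), (1, "Order-3")]

# Count how many times the city value is physically stored in each design.
dup_before = sum(1 for row in denormalized if row[1] == "Seattle")
dup_after = sum(1 for c in customers.values() if c[1] == "Seattle")
print(dup_before, dup_after)  # 3 1
```

With the normalized design, a change to the city is a single update rather than three, which is the redundancy reduction the question is testing.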
Question 52 (Single Answer)
Which data type stores audio and video files?
A. Structured B. Semi-structured C. Unstructured D. Relational
✅ Answer: C — Unstructured
Question 53 (Multi-Answer)
Which are benefits of Power BI dashboards? (Choose TWO)
A. Real-time monitoring B. Single-page summary C. Operating system management D. Virtual machine provisioning
✅ Answers: A and B
Question 54 (Single Answer)
Which service is MOST associated with IoT device ingestion?
A. Azure IoT Hub B. Azure SQL Database C. Azure Files D. Azure Backup
✅ Answer: A — Azure IoT Hub
Question 55 (Single Answer)
Which Azure service provides a unified analytics platform with BI integration?
A. Microsoft Fabric B. Azure Firewall C. Azure DNS D. Azure Backup
✅ Answer: A — Microsoft Fabric
Question 56 (Single Answer)
Which object improves database query performance?
A. Table B. View C. Index D. Trigger
✅ Answer: C — Index
Question 57 (Single Answer)
Which workload typically uses OLTP systems?
A. Analytical B. Transactional C. Archival D. Reporting-only
✅ Answer: B — Transactional
Question 58 (Fill in the Blank)
The SQL statement used to remove rows from a table is __________.
✅ Answer: DELETE
Question 59 (Single Answer)
Which Azure SQL offering is a Platform as a Service (PaaS) solution?
A. SQL Server on Azure Virtual Machines B. Azure SQL Database C. Windows Server D. Hyper-V
✅ Answer: B — Azure SQL Database
Question 60 (Single Answer)
Which Power BI visualization is MOST appropriate for showing progress toward a goal?
A. Scatter chart B. Gauge chart C. Table D. Pie chart
✅ Answer: B — Gauge chart
Question 1
What is the primary purpose of Azure Databricks?
A. Hosting relational databases B. Managing file shares C. Processing large-scale data using Apache Spark D. Running virtual machines
✅ Answer: C
Explanation: Azure Databricks is built on Apache Spark for large-scale data processing.
Question 2
Which feature is a key characteristic of Azure Databricks?
A. Fixed schema relational tables B. Distributed data processing C. File-based storage only D. Limited scalability
✅ Answer: B
Explanation: Databricks uses distributed computing to process large datasets efficiently.
Question 3
Which scenario is BEST suited for Azure Databricks?
A. Hosting a transactional database B. Running large-scale ETL pipelines and machine learning models C. Managing shared file storage D. Serving static web pages
✅ Answer: B
Explanation: Databricks is ideal for data engineering and machine learning at scale.
Question 4
What is Microsoft Fabric primarily designed for?
A. Running operating systems B. Providing a unified, end-to-end analytics platform C. Managing virtual networks D. Hosting relational databases only
✅ Answer: B
Explanation: Microsoft Fabric integrates multiple analytics capabilities into one unified platform.
Question 5
Which component of Microsoft Fabric serves as a unified data storage layer?
A. Azure Blob Storage B. SQL Database C. OneLake D. Azure Files
✅ Answer: C
Explanation: OneLake is the centralized storage layer within Microsoft Fabric.
Question 6
Which service is BEST suited for organizations that want a single platform for data engineering, data warehousing, and BI?
A. Azure Virtual Machines B. Azure Databricks C. Microsoft Fabric D. Azure Table Storage
✅ Answer: C
Explanation: Fabric provides an end-to-end unified analytics experience.
Question 7
Which of the following best describes the difference between Azure Databricks and Microsoft Fabric?
A. Databricks is for storage, Fabric is for compute B. Databricks focuses on big data processing, Fabric provides a unified analytics platform C. Fabric only supports relational data, Databricks does not D. Databricks cannot scale, Fabric can
✅ Answer: B
Explanation: Databricks focuses on processing and ML, while Fabric provides end-to-end analytics.
Question 8
Which programming environments are commonly supported in Azure Databricks notebooks?
A. HTML and CSS only B. Python, SQL, Scala, and R C. JavaScript only D. PowerShell only
✅ Answer: B
Explanation: Databricks notebooks support multiple languages including Python, SQL, Scala, and R.
Question 9
Which scenario is NOT ideal for Azure Databricks?
A. Large-scale data transformation B. Machine learning model training C. Managing simple file shares D. Processing streaming data
✅ Answer: C
Explanation: Databricks is not designed for file-sharing scenarios.
Question 10
Which statement about Microsoft Fabric is TRUE?
A. It requires manual infrastructure management B. It is a SaaS-based unified analytics platform C. It only supports batch processing D. It replaces all Azure services
✅ Answer: B
Explanation: Microsoft Fabric is a fully managed SaaS platform that integrates analytics services.
✅ Quick Exam Takeaways
✔ Azure Databricks
Apache Spark-based
Distributed processing
Data engineering & machine learning
✔ Microsoft Fabric
Unified analytics platform
End-to-end solution (data + analytics + BI)
Includes OneLake storage
✔ Key differences:
Databricks → processing & ML
Fabric → all-in-one analytics platform
✔ Exam tip: 👉 Big data processing → Azure Databricks 👉 Unified analytics platform → Microsoft Fabric
Question 1
Which task is a primary responsibility of a data engineer?
A. Creating dashboards for business users B. Managing database user permissions C. Building and maintaining data pipelines D. Training machine learning models
✅ Answer: C
Explanation: Data engineers are responsible for designing and maintaining data pipelines that move and transform data.
Question 2
A company needs to collect data from multiple systems and prepare it for reporting.
Which role is primarily responsible for this task?
A. Data Analyst B. Database Administrator C. Data Engineer D. Business User
✅ Answer: C
Explanation: Data engineers handle data ingestion, integration, and preparation for downstream analytics.
Question 3
Which process involves extracting data from sources, transforming it, and loading it into a destination system?
A. OLTP B. ETL C. OLAP D. ACID
✅ Answer: B
Explanation: ETL (Extract, Transform, Load) is a core responsibility of data engineers.
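A minimal ETL pipeline can be sketched with Python and its built-in sqlite3 module. The raw records, field names, and destination table below are illustrative: extract raw rows, transform (clean and validate them), then load the result into a destination store.

```python
import sqlite3

# Extract: raw source rows, e.g. as pulled from a CSV file or an API.
raw = [
    {"name": " Ana ", "amount": "100"},
    {"name": "Ben",   "amount": "not-a-number"},  # bad record
    {"name": "Cora",  "amount": "250"},
]

# Transform: trim names, coerce amounts to numbers, drop invalid rows.
clean = []
for row in raw:
    try:
        clean.append((row["name"].strip(), float(row["amount"])))
    except ValueError:
        pass  # a real pipeline would log or quarantine the bad record

# Load: write the cleaned rows into the destination database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (name TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", clean)
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)  # 350.0
```

In Azure, a service like Azure Data Factory orchestrates these same extract, transform, and load steps at scale.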
Question 4
Which Azure service is commonly used by data engineers to orchestrate data pipelines?
A. Azure SQL Database B. Azure Data Factory C. Azure Blob Storage D. Azure Virtual Machines
✅ Answer: B
Explanation: Azure Data Factory is used to build, schedule, and manage data pipelines.
Question 5
Which responsibility ensures that data used for analytics is accurate and reliable?
A. Query optimization B. Data visualization C. Data quality management D. User authentication
✅ Answer: C
Explanation: Data engineers ensure data quality through validation and cleaning processes.
Question 6
A data engineer is working with large-scale data processing using Apache Spark.
Which Azure service are they MOST likely using?
A. Azure SQL Database B. Azure Cosmos DB C. Azure Databricks D. Azure Table Storage
✅ Answer: C
Explanation: Azure Databricks is a Spark-based platform used for large-scale data processing.
Question 7
Which storage solution is commonly used by data engineers for storing large volumes of raw and processed data?
A. Azure Data Lake Storage B. Azure Queue Storage C. Azure SQL Database D. Azure Cache for Redis
✅ Answer: A
Explanation: Azure Data Lake Storage is optimized for big data storage and analytics workloads.
Question 8
Which task is LEAST likely to be performed by a data engineer?
A. Transforming raw data into structured formats B. Monitoring data pipelines C. Creating Power BI dashboards D. Integrating multiple data sources
✅ Answer: C
Explanation: Creating dashboards is typically the responsibility of a data analyst, not a data engineer.
Question 9
Which type of data processing involves handling real-time data streams?
A. Batch processing B. Streaming processing C. Relational processing D. Transactional processing
✅ Answer: B
Explanation: Data engineers often work with streaming pipelines for real-time data ingestion.
Question 10
A data engineer selects Parquet as a storage format for a dataset.
What is the primary reason for this choice?
A. It is human readable B. It supports transactional updates C. It is optimized for analytical performance D. It enforces a strict schema
✅ Answer: C
Explanation: Parquet is a columnar format that improves performance for analytical workloads.
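The columnar advantage behind that explanation can be illustrated with a toy row-store vs column-store comparison in plain Python. This is only a sketch of the layout idea with made-up data; real Parquet also adds compression, encodings, and column statistics.

```python
# Row-oriented layout: each record is stored together (OLTP-friendly).
rows = [
    {"id": 1, "region": "East", "amount": 100.0},
    {"id": 2, "region": "West", "amount": 250.0},
    {"id": 3, "region": "East", "amount": 175.0},
]

# Column-oriented layout: each column is stored contiguously
# (analytics-friendly, like Parquet).
columns = {
    "id": [1, 2, 3],
    "region": ["East", "West", "East"],
    "amount": [100.0, 250.0, 175.0],
}

# Row layout: an aggregate scan touches every field of every record.
row_total = sum(r["amount"] for r in rows)

# Column layout: the same aggregate reads only the "amount" column.
col_total = sum(columns["amount"])

print(row_total, col_total)  # both 525.0
```

Reading only the needed columns (and compressing similar values stored together) is why columnar formats perform well for analytical queries.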
✅ Quick Exam Takeaways
For DP-900, remember data engineers:
✔ Build and manage data pipelines ✔ Handle ETL/ELT processes ✔ Work with batch and streaming data ✔ Ensure data quality and reliability ✔ Manage data storage solutions (Data Lake, Blob) ✔ Use Azure services such as Azure Data Factory, Azure Databricks, and Azure Data Lake Storage
This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. This topic falls under these sections: Describe an analytics workload (25–30%) --> Describe common elements of large-scale analytics --> Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric)
Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.
Modern analytics workloads often require processing massive volumes of data quickly and efficiently. Microsoft provides powerful cloud services to meet these needs, including Azure Databricks and Microsoft Fabric.
For the DP-900 exam, you should understand what these services are, their key features, and when to use each.
Why Large-Scale Analytics Services Matter
Large-scale analytics involves:
Processing big data (TBs to PBs)
Supporting batch and real-time workloads
Enabling advanced analytics and machine learning
✔ Traditional tools often cannot scale to meet these demands.
Azure Databricks
What Is Azure Databricks?
Azure Databricks is a cloud-based analytics platform built on Apache Spark.
It is designed for:
Big data processing
Data engineering
Machine learning
Collaborative analytics
Key Features
1. Apache Spark-Based Processing
Distributed computing engine
Processes large datasets in parallel
✔ Ideal for big data workloads
2. Collaborative Workspace
Notebooks (Python, SQL, Scala, R)
Multiple users can collaborate
3. Integration with Azure
Works with Azure Data Lake Storage
Integrates with Azure Synapse Analytics
4. Machine Learning Support
Built-in ML capabilities
Supports advanced analytics workflows
Common Use Cases
Big data processing (ETL/ELT pipelines)
Data science and machine learning
Real-time analytics
Data transformation at scale
✔ Best for: Data engineers and data scientists working with large datasets
Microsoft Fabric
What Is Microsoft Fabric?
Microsoft Fabric is an end-to-end, unified analytics platform that brings together multiple data services into a single environment.
It integrates:
Data engineering
Data warehousing
Data science
Real-time analytics
Business intelligence
Key Features
1. Unified Platform
Combines multiple services into one
Reduces complexity of managing separate tools
2. OneLake (Unified Storage Layer)
Centralized data lake for all workloads
Eliminates data silos
3. Integrated Analytics Experiences
Data Factory (ingestion)
Data Warehouse
Real-Time Analytics
Power BI integration
4. SaaS-Based Model
Fully managed platform
Minimal infrastructure management
Common Use Cases
End-to-end analytics solutions
Unified data platform for organizations
Business intelligence and reporting
Data integration and transformation
✔ Best for: Organizations wanting a single, unified analytics solution
Azure Databricks vs Microsoft Fabric
Feature | Azure Databricks | Microsoft Fabric
Focus | Big data processing & ML | End-to-end analytics platform
Engine | Apache Spark | Multiple integrated engines
Users | Data engineers, data scientists | Broad (engineers, analysts, business users)
Complexity | More flexible, more technical | Simpler, unified experience
Use Case | Advanced analytics & ML | Unified analytics and BI
How They Fit in an Analytics Architecture
Typical roles:
Azure Databricks
Data processing
Advanced transformations
Machine learning
Microsoft Fabric
End-to-end pipeline
Storage (OneLake)
Reporting (Power BI integration)
✔ They can complement each other in modern architectures.
Key Considerations When Choosing
Choose Azure Databricks when:
You need advanced data engineering or machine learning
You require Spark-based processing
You want full control and flexibility
Choose Microsoft Fabric when:
You want a unified analytics platform
You prefer simplified, integrated workflows
You need end-to-end analytics in one place
Why This Matters for DP-900
On the exam, you may be asked to:
Identify the purpose of Azure Databricks
Recognize Microsoft Fabric as a unified analytics platform
Choose the right service for a scenario
Understand how these services support large-scale analytics
Summary — Exam-Relevant Takeaways
✔ Azure Databricks
Apache Spark-based
Big data processing
Machine learning
Flexible and powerful
✔ Microsoft Fabric
Unified analytics platform
End-to-end solution
Includes data engineering, warehousing, and BI
✔ Key difference:
Databricks → advanced processing & ML
Fabric → all-in-one analytics platform
✔ Exam tip: 👉 Spark + big data processing → Azure Databricks 👉 Unified analytics platform → Microsoft Fabric
This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. This topic falls under these sections: Describe an analytics workload (25–30%) --> Describe considerations for real-time data analytics --> Describe the difference between Batch and Streaming data
Understanding the difference between batch data and streaming data is fundamental for designing modern analytics solutions. These two approaches define how data is ingested, processed, and analyzed.
What Is Batch Data?
Batch data refers to data that is:
Collected over a period of time
Processed in large chunks (batches)
Handled at scheduled intervals
Key Characteristics of Batch Data
High latency (minutes, hours, or days)
Processes large volumes at once
Typically scheduled (e.g., nightly jobs)
Efficient and cost-effective
Common Use Cases
Daily sales reports
Monthly financial summaries
Historical data analysis
Data warehousing workloads
Azure Services for Batch Processing
Azure Data Factory → batch ingestion and orchestration
Azure Synapse Analytics → batch processing and analytics
What Is Streaming Data?
Streaming data refers to data that is:
Generated continuously
Processed in real time (or near real time)
Handled as individual events or small micro-batches
Key Characteristics of Streaming Data
Low latency (seconds or milliseconds)
Continuous data flow
Enables real-time insights
Often requires more complex processing
Common Use Cases
IoT sensor monitoring
Fraud detection
Live dashboards
Website activity tracking
Azure Services for Streaming
Azure Event Hubs → event ingestion
Azure Stream Analytics → real-time processing
Batch vs Streaming — Key Differences
Feature | Batch Processing | Streaming Processing
Data Flow | Periodic | Continuous
Latency | High | Low
Data Size | Large chunks | Small events
Complexity | Simpler | More complex
Cost | Lower | Higher
Use Case | Historical analysis | Real-time insights
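The contrast between the two models can be sketched in a few lines of Python. This is a toy illustration with made-up event data, not an Azure API: batch waits and processes everything at once, while streaming updates a running result as each event arrives.

```python
events = [5, 3, 8, 2, 7]  # e.g. transaction amounts arriving over time

# Batch: collect all events first, then process them in one scheduled job.
def batch_total(collected):
    return sum(collected)

# Streaming: keep running state and update it per event, so a current
# answer is always available with low latency.
def stream_totals(incoming):
    running, seen = 0, []
    for amount in incoming:
        running += amount
        seen.append(running)  # insight available as each event lands
    return seen

print(batch_total(events))    # known only after the whole batch runs
print(stream_totals(events))  # updated on every event
```

The trade-off in the table above shows up directly: the batch result is cheaper to compute but arrives late; the streaming result is available continuously at the cost of keeping state per event.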
When to Use Batch Processing
Choose batch when:
Real-time data is not required
You are working with large historical datasets
Cost efficiency is important
Processing can occur on a schedule
When to Use Streaming Processing
Choose streaming when:
You need real-time or near real-time insights
Data is generated continuously
Immediate action is required
Hybrid Approaches (Lambda / Modern Architectures)
Many modern systems use both:
Batch layer → historical analysis
Streaming layer → real-time insights
✔ Example:
Real-time dashboard + nightly aggregated reports
Why This Matters for DP-900
On the exam, you may be asked to:
Distinguish between batch and streaming scenarios
Choose the appropriate processing method
Identify Azure services for each approach
Understand trade-offs (latency, cost, complexity)
Summary — Exam-Relevant Takeaways
✔ Batch processing
Processes data in chunks
Higher latency
Lower cost
Best for historical analysis
✔ Streaming processing
Processes data continuously
Low latency
Enables real-time insights
More complex
✔ Azure services:
Batch → Azure Data Factory, Azure Synapse Analytics
Streaming → Azure Event Hubs, Azure Stream Analytics
This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. This topic falls under these sections: Describe an analytics workload (25–30%) --> Describe data visualization in Microsoft Power BI --> Identify appropriate visualizations for data
Data visualization is the process of representing data graphically so users can quickly understand patterns, trends, relationships, and insights. In Microsoft Power BI, choosing the correct visualization is important for effective reporting and decision-making.
For the DP-900 exam, you should understand:
Common visualization types
When each visualization should be used
The strengths and limitations of different visuals
Why Visualization Selection Matters
The correct visualization helps users:
Understand data quickly
Identify trends and anomalies
Compare values
Monitor performance
Make informed decisions
Using the wrong visualization can make data confusing or misleading.
Common Visualization Types in Power BI
1. Bar Charts and Column Charts
Purpose
Used to compare values across categories.
Best Used For
Comparing sales by region
Comparing revenue by product
Ranking categories
Difference
Bar chart → horizontal bars
Column chart → vertical bars
Advantages
✔ Easy to read ✔ Good for comparisons ✔ Works well with categorical data
Example
Sales by product category
2. Line Charts
Purpose
Used to show trends over time.
Best Used For
Monthly sales trends
Website traffic over time
Stock price movement
Advantages
✔ Excellent for time-series data ✔ Clearly shows increases/decreases
Example
Revenue by month
3. Pie Charts and Donut Charts
Purpose
Show proportions or percentages of a whole.
Best Used For
Market share
Percentage of sales by region
Limitations
❌ Difficult with many categories ❌ Hard to compare similar values
Best Practice
Use only with a small number of categories
4. Tables and Matrices
Tables
Purpose
Display detailed data in rows and columns.
Best Used For
Exact values
Detailed records
Matrices
Purpose
Similar to pivot tables with grouped summaries.
Best Used For
Aggregated business reporting
Cross-tab analysis
Advantages
✔ Good for detailed analysis ✔ Supports drill-down
5. Maps
Purpose
Visualize geographic data.
Best Used For
Sales by country
Store locations
Regional performance
Requirements
Data should contain:
Country
City
Coordinates
6. KPI Visuals
Purpose
Display performance against goals.
Best Used For
Revenue targets
Operational metrics
Performance monitoring
Advantages
✔ Easy to monitor status ✔ Quickly highlights success/failure
7. Gauge Charts
Purpose
Show progress toward a target value.
Best Used For
Budget usage
Performance thresholds
Example
Current sales vs sales target
8. Scatter Charts
Purpose
Show relationships between two numeric variables.
Best Used For
Correlation analysis
Identifying outliers
Example
Advertising spend vs revenue
9. Cards
Purpose
Display a single key metric.
Best Used For
Total revenue
Customer count
Profit margin
Advantages
✔ Simple and clear ✔ Common in dashboards
10. Slicers
Purpose
Provide interactive filtering.
Best Used For
Filtering by date
Selecting regions or categories
Advantages
✔ Enhances report interactivity
Choosing the Right Visualization
Goal → Recommended Visualization
Compare categories → Bar/Column Chart
Show trends over time → Line Chart
Show proportions → Pie/Donut Chart
Display exact values → Table
Summarize grouped data → Matrix
Show geographic data → Map
Track KPIs → KPI/Gauge
Show correlations → Scatter Chart
Show a single metric → Card
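The goal-to-visual mapping above can be captured as a small lookup helper, which is a handy way to drill the pairings. The function name and structure are illustrative only, not a Power BI API.

```python
# Recommended visual per reporting goal, mirroring the table above.
RECOMMENDED_VISUAL = {
    "compare categories": "Bar/Column Chart",
    "show trends over time": "Line Chart",
    "show proportions": "Pie/Donut Chart",
    "display exact values": "Table",
    "summarize grouped data": "Matrix",
    "show geographic data": "Map",
    "track kpis": "KPI/Gauge",
    "show correlations": "Scatter Chart",
    "show a single metric": "Card",
}

def recommend_visual(goal: str) -> str:
    """Return the recommended visual for a reporting goal."""
    return RECOMMENDED_VISUAL.get(goal.lower(), "No recommendation")

print(recommend_visual("Show trends over time"))  # Line Chart
```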
Visualization Best Practices
Keep Visuals Simple
Avoid clutter and unnecessary complexity.
Use Appropriate Colors
Colors should improve readability, not distract.
Limit Pie Chart Categories
Too many slices reduce readability.
Use Consistent Formatting
Helps users interpret reports more easily.
Focus on Business Questions
Choose visuals that answer specific questions.
Interactive Features in Power BI
Power BI visuals support:
Filtering
Drill-down
Cross-highlighting
Tooltips
These features make reports interactive and user-friendly.
Why This Matters for DP-900
On the exam, you may be asked to:
Identify the best visualization for a scenario
Match visualization types to business requirements
Understand the strengths and weaknesses of visuals
Summary — Exam-Relevant Takeaways
✔ Common visuals:
Bar/Column → comparisons
Line → trends over time
Pie/Donut → proportions
Map → geographic data
Scatter → relationships
Card → single metric
✔ Tables show detailed data
✔ KPIs and gauges track performance
✔ Slicers provide interactivity
✔ Exam tips: 👉 Line chart = trends over time 👉 Bar chart = category comparison 👉 Pie chart = parts of a whole 👉 Scatter chart = relationships/correlation 👉 Card = single value