Category: azure

Analytics, azure, Data Engineering, DP-900, Microsoft Certification May 10, 2026

Describe the difference between Batch and Streaming data (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
   --> Describe considerations for real-time data analytics
      --> Describe the difference between Batch and Streaming data

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Understanding the difference between batch data and streaming data is fundamental for designing modern analytics solutions. These two approaches define how data is ingested, processed, and analyzed.

What Is Batch Data?

Batch data refers to data that is:

Collected over a period of time
Processed in large chunks (batches)
Handled at scheduled intervals

Key Characteristics of Batch Data

High latency (minutes, hours, or days)
Processes large volumes at once
Typically scheduled (e.g., nightly jobs)
Efficient and cost-effective

Common Use Cases

Daily sales reports
Monthly financial summaries
Historical data analysis
Data warehousing workloads

Azure Services for Batch Processing

Azure Data Factory → batch ingestion and orchestration
Azure Synapse Analytics → batch processing and analytics

What Is Streaming Data?

Streaming data refers to data that is:

Generated continuously
Processed in real time (or near real time)
Handled as individual events or small micro-batches

Key Characteristics of Streaming Data

Low latency (seconds or milliseconds)
Continuous data flow
Enables real-time insights
Often requires more complex processing

Common Use Cases

IoT sensor monitoring
Fraud detection
Live dashboards
Website activity tracking

Azure Services for Streaming

Azure Event Hubs → event ingestion
Azure Stream Analytics → real-time processing
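Stream processors commonly summarize events over short time windows rather than over a whole dataset. Azure Stream Analytics expresses such windows in its own SQL-like query language; the Python below is only a rough illustration of the idea of a tumbling (fixed, non-overlapping) window. The function name and the sample events are hypothetical.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per fixed, non-overlapping time window.

    events: iterable of (timestamp_seconds, payload) tuples.
    Returns a dict mapping each window's start time to its event count.
    """
    counts = Counter()
    for timestamp, _payload in events:
        # Every timestamp falls into exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(counts)


# Hypothetical sensor readings: (seconds since start, reading id).
sensor_events = [(0, "a"), (3, "b"), (9, "c"), (10, "d"), (25, "e")]
counts = tumbling_window_counts(sensor_events, window_seconds=10)
```

With a 10-second window, the first three readings land in the window starting at 0, and the remaining two land in the windows starting at 10 and 20.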

Batch vs Streaming — Key Differences

Feature	Batch Processing	Streaming Processing
Data Flow	Periodic	Continuous
Latency	High	Low
Data Size	Large chunks	Small events
Complexity	Simpler	More complex
Cost	Lower	Higher
Use Case	Historical analysis	Real-time insights
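The contrast in the table can be sketched in a few lines of Python. This is a simplified illustration, not Azure code: the sample events, `batch_totals`, and `StreamingTotals` are all made up for the example. The batch function waits for the full set of events before computing anything, while the streaming class updates its result as each event arrives.

```python
from collections import defaultdict

# Hypothetical sales events: (day, amount). Illustrative data only.
EVENTS = [
    ("2026-05-01", 20.0),
    ("2026-05-01", 5.0),
    ("2026-05-02", 12.5),
]


def batch_totals(events):
    """Batch style: process the whole collected set in one scheduled run."""
    totals = defaultdict(float)
    for day, amount in events:
        totals[day] += amount
    return dict(totals)


class StreamingTotals:
    """Streaming style: keep running state and update it per event."""

    def __init__(self):
        self.totals = defaultdict(float)

    def on_event(self, day, amount):
        self.totals[day] += amount
        # An up-to-date total is available right after each event.
        return self.totals[day]


batch_result = batch_totals(EVENTS)

stream = StreamingTotals()
running = [stream.on_event(day, amount) for day, amount in EVENTS]
```

Both approaches end with the same totals. The difference is when the answer becomes available: after the run completes (batch) or after every event (streaming).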

When to Use Batch Processing

Choose batch when:

Real-time data is not required
You are working with large historical datasets
Cost efficiency is important
Processing can occur on a schedule

When to Use Streaming Processing

Choose streaming when:

You need real-time or near real-time insights
Data is generated continuously
Immediate action is required

Hybrid Approaches (Lambda / Modern Architectures)

Many modern systems use both:

Batch layer → historical analysis
Streaming layer → real-time insights

✔ Example:

Real-time dashboard + nightly aggregated reports
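One way to picture the hybrid pattern is a query that combines a precomputed batch result with the events that arrived since the last batch run. The sketch below assumes simple page-view counts; `merged_view` and both sample dictionaries are hypothetical names used only for illustration.

```python
def merged_view(batch_view, speed_view):
    """Combine nightly batch counts with counts from recent streaming events."""
    merged = dict(batch_view)
    for key, value in speed_view.items():
        merged[key] = merged.get(key, 0) + value
    return merged


# Counts produced by last night's batch job (hypothetical).
batch_view = {"page_a": 100, "page_b": 40}
# Counts accumulated from the stream since that job finished (hypothetical).
speed_view = {"page_a": 3, "page_c": 1}

dashboard = merged_view(batch_view, speed_view)
```

The dashboard sees current numbers, while the batch layer remains the source for the nightly aggregated report.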

Why This Matters for DP-900

On the exam, you may be asked to:

Distinguish between batch and streaming scenarios
Choose the appropriate processing method
Identify Azure services for each approach
Understand trade-offs (latency, cost, complexity)

Summary — Exam-Relevant Takeaways

✔ Batch processing

Processes data in chunks
Higher latency
Lower cost
Best for historical analysis

✔ Streaming processing

Processes data continuously
Low latency
Enables real-time insights
More complex

✔ Azure services:

Batch → Azure Data Factory, Azure Synapse Analytics
Streaming → Azure Event Hubs, Azure Stream Analytics

✔ Exam tip:
👉 Real-time requirement → Streaming
👉 Scheduled / historical → Batch

Go to the Practice Exam Questions for this topic.

Go to the DP-900 Exam Prep Hub main page.

Analytics, azure, Data Engineering, DP-900, Microsoft Certification May 10, 2026

Practice Questions: Describe the difference between Batch and Streaming data (DP-900 Exam Prep)

Practice Questions

Question 1

What is the primary characteristic of batch data processing?

A. Continuous data flow
B. Real-time processing
C. Processing data in scheduled chunks
D. Immediate event handling

✅ Answer: C

Explanation:
Batch processing handles data in groups at scheduled intervals, not continuously.

Question 2

Which type of processing is BEST suited for real-time analytics?

A. Batch processing
B. Stream processing
C. Periodic processing
D. Manual processing

✅ Answer: B

Explanation:
Stream processing enables real-time or near real-time insights.

Question 3

Which Azure service is commonly used for streaming data ingestion?

A. Azure Data Factory
B. Azure Event Hubs
C. Azure Synapse Analytics
D. Azure SQL Database

✅ Answer: B

Explanation:
Azure Event Hubs is designed for high-throughput, real-time data ingestion.

Question 4

Which scenario is BEST suited for batch processing?

A. Monitoring live stock prices
B. Detecting fraud in real time
C. Generating a monthly financial report
D. Tracking website clicks instantly

✅ Answer: C

Explanation:
Batch processing is ideal for scheduled, periodic workloads like reports.

Question 5

What is the typical latency for streaming data processing?

A. Hours
B. Days
C. Seconds or milliseconds
D. Weeks

✅ Answer: C

Explanation:
Streaming processing provides low-latency, near real-time results.

Question 6

Which Azure service is used to process streaming data in real time?

A. Azure Blob Storage
B. Azure Stream Analytics
C. Azure Files
D. Azure Virtual Machines

✅ Answer: B

Explanation:
Azure Stream Analytics processes streaming data in real time.

Question 7

Which statement about batch processing is TRUE?

A. It processes data continuously
B. It always requires real-time data sources
C. It is typically more cost-effective than streaming
D. It has lower latency than streaming

✅ Answer: C

Explanation:
Batch processing is generally more cost-efficient than continuous streaming.

Question 8

Which scenario requires streaming processing?

A. Archiving old data
B. Processing annual tax records
C. Monitoring IoT sensor data in real time
D. Generating quarterly reports

✅ Answer: C

Explanation:
Streaming is needed for continuous, real-time data flows like IoT.

Question 9

What is a key difference between batch and streaming processing?

A. Batch uses structured data, streaming does not
B. Streaming has higher latency than batch
C. Batch processes data in chunks, streaming processes data continuously
D. Streaming is always cheaper than batch

✅ Answer: C

Explanation:
Batch = periodic chunks, Streaming = continuous flow.

Question 10

Which approach would you choose if immediate action is required based on incoming data?

A. Batch processing
B. Stream processing
C. Scheduled processing
D. Offline processing

✅ Answer: B

Explanation:
Streaming is required when real-time decisions are needed.

✅ Quick Exam Takeaways

✔ Batch processing

Scheduled
High latency
Cost-effective
Best for historical analysis

✔ Streaming processing

Continuous
Low latency
Real-time insights
More complex

✔ Azure services:

Batch → Azure Data Factory, Azure Synapse Analytics
Streaming → Azure Event Hubs, Azure Stream Analytics

✔ Exam tip:
👉 Real-time = Streaming
👉 Scheduled/historical = Batch


Analytics, azure, DP-900, Microsoft Certification May 10, 2026

Describe options for analytical data stores (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
   --> Describe common elements of large-scale analytics
      --> Describe options for analytical data stores


Analytical data stores are designed to support reporting, business intelligence, and large-scale data analysis. For the DP-900 exam, you should understand the different types of analytical stores, their characteristics, and when to use each.

What Is an Analytical Data Store?

An analytical data store is optimized for:

Querying large volumes of data
Aggregations and reporting
Historical analysis

✔ Unlike transactional systems, analytical stores focus on read-heavy workloads rather than frequent updates.

Key Characteristics

Optimized for complex queries and aggregations
Stores historical data
Handles large datasets (TBs to PBs)
Typically uses denormalized schemas
Designed for high-performance reads

Main Types of Analytical Data Stores

1. Data Warehouse

Definition

A structured repository designed for relational analytical queries.

Key Features

Uses structured data
Schema-based (often star or snowflake schema)
Supports SQL queries

Azure Example

Azure Synapse Analytics

Use Cases

Business intelligence reporting
Financial analysis
Enterprise dashboards

✔ Best for: Structured data and SQL-based analytics
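A star schema keeps measurements in a central fact table and descriptive attributes in dimension tables. The sketch below uses Python's built-in `sqlite3` module as a small stand-in for a warehouse engine; the table names, column names, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension table: descriptive attributes.
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        category   TEXT NOT NULL
    );
    -- Fact table: numeric measures plus keys that point at dimensions.
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
        amount     REAL NOT NULL
    );
    INSERT INTO dim_product VALUES (1, 'Bikes'), (2, 'Helmets');
    INSERT INTO fact_sales VALUES (1, 1, 500.0), (2, 1, 700.0), (3, 2, 40.0);
    """
)

# A typical analytical query: join fact to dimension, then aggregate.
rows = conn.execute(
    """
    SELECT d.category, SUM(f.amount) AS total_sales
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
    """
).fetchall()
```

The query returns one total per product category, which is the kind of aggregation a BI report would request.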

2. Data Lake

Definition

A storage repository for raw data in its native format.

Key Features

Supports structured, semi-structured, and unstructured data
Schema-on-read (schema applied when querying)
Highly scalable and cost-effective

Azure Example

Azure Data Lake Storage

Use Cases

Big data analytics
Machine learning
Storing raw ingestion data

✔ Best for: Flexible, large-scale data storage
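The difference between schema-on-read and schema-on-write can be shown with plain Python. In this sketch, which uses hypothetical sensor records and function names, the "lake" accepts any raw line and a schema is applied only when the data is read, while the "warehouse" write rejects records that do not match its schema.

```python
import json

# Raw lines stored as-is, the way a data lake would keep them.
RAW_LINES = [
    '{"device": "t1", "temp": "21.5", "unit": "C"}',
    '{"device": "t2", "temp": "19.0"}',
    '{"device": "t3", "humidity": 40}',
]


def read_temperatures(lines):
    """Schema-on-read: shape and type the data at query time."""
    rows = []
    for line in lines:
        record = json.loads(line)
        if "temp" not in record:
            continue  # this reader only cares about temperature records
        rows.append({"device": str(record["device"]), "temp": float(record["temp"])})
    return rows


def write_to_warehouse(table, record):
    """Schema-on-write: validate against the schema before storing."""
    if set(record) != {"device", "temp"} or not isinstance(record["temp"], float):
        raise ValueError(f"record does not match schema: {record}")
    table.append(record)


temperatures = read_temperatures(RAW_LINES)
```

The humidity record stays in the raw store and can be read later with a different schema, whereas `write_to_warehouse` would refuse it at load time.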

3. Data Lakehouse (Conceptual)

Definition

A hybrid approach combining features of data lakes and data warehouses.

Key Features

Stores raw data like a data lake
Supports structured queries like a warehouse
Often uses open formats (e.g., Parquet, Delta)

Azure Context

Often implemented using:
- Azure Data Lake Storage
- Azure Synapse Analytics

✔ Best for: Unified analytics platform

4. Analytical Databases / Big Data Processing Systems

Definition

Systems designed for distributed processing of large datasets.

Azure Example

Azure Synapse Analytics

Key Features

Parallel processing
Handles massive datasets
Supports batch and interactive queries

✔ Best for: Large-scale analytics workloads

Comparison of Analytical Data Stores

Feature	Data Warehouse	Data Lake	Lakehouse
Data Type	Structured	All types	All types
Schema	Schema-on-write	Schema-on-read	Hybrid
Cost	Higher	Lower	Moderate
Flexibility	Low	High	High
Query Performance	High	Variable	High

Key Design Considerations

1. Data Structure

Structured → Data warehouse
Mixed or raw → Data lake

2. Query Requirements

Complex SQL queries → Data warehouse
Exploratory analytics → Data lake

3. Cost

Data lakes are generally more cost-effective
Warehouses provide optimized performance at higher cost

4. Scalability

The Azure analytical stores covered here are all designed to scale
Data lakes excel in massive data storage

5. Performance Needs

Warehouses → optimized for speed
Lakes → optimized for storage and flexibility

Typical Analytics Architecture

Data Ingestion
- Batch or streaming
Storage
- Data lake or data warehouse
Processing
- Transformations and aggregations
Visualization
- BI tools (e.g., Power BI)

Why This Matters for DP-900

On the exam, you may be asked to:

Identify the correct analytical store for a scenario
Compare data lakes vs data warehouses
Understand schema-on-read vs schema-on-write
Recognize Azure services used for analytics

Summary — Exam-Relevant Takeaways

✔ Analytical data stores are used for:

Reporting
Analytics
Historical data analysis

✔ Main types:

Data Warehouse → structured, high-performance queries
Data Lake → raw, flexible storage
Lakehouse → hybrid approach

✔ Key concepts:

Schema-on-write (warehouse)
Schema-on-read (lake)

✔ Azure services to know:

Azure Synapse Analytics → data warehouse & analytics
Azure Data Lake Storage → scalable data lake

✔ Exam tip:
👉 Structured + SQL analytics → Data Warehouse
👉 Raw + flexible + big data → Data Lake

Go to the Practice Exam Questions for this topic.


Analytics, azure, DP-900, Microsoft Certification May 10, 2026

Practice Questions: Describe options for analytical data stores (DP-900 Exam Prep)

Practice Questions

Question 1

What is the primary purpose of an analytical data store?

A. To process high-volume transactions
B. To store temporary application data
C. To support reporting and data analysis
D. To manage user authentication

✅ Answer: C

Explanation:
Analytical data stores are optimized for reporting, querying, and analysis, not transactions.

Question 2

Which type of data store is BEST suited for structured data and complex SQL queries?

A. Data lake
B. Data warehouse
C. File storage
D. Key-value store

✅ Answer: B

Explanation:
Data warehouses are designed for structured data and high-performance SQL queries.

Question 3

Which Azure service is commonly used as a data warehouse?

A. Azure Data Lake Storage
B. Azure Synapse Analytics
C. Azure Files
D. Azure Table Storage

✅ Answer: B

Explanation:
Azure Synapse Analytics provides data warehousing and large-scale analytics capabilities.

Question 4

What is a key characteristic of a data lake?

A. Requires predefined schema before loading data
B. Stores only structured data
C. Stores data in its raw format
D. Optimized for transactional workloads

✅ Answer: C

Explanation:
Data lakes store raw data in native formats, supporting schema-on-read.

Question 5

Which concept describes applying schema when data is read rather than when it is written?

A. Schema-on-write
B. Schema-on-read
C. Data normalization
D. Data partitioning

✅ Answer: B

Explanation:
Schema-on-read is used in data lakes, allowing flexible analysis.

Question 6

Which scenario is BEST suited for a data lake?

A. Financial reporting with strict schema
B. Running complex SQL joins on structured data
C. Storing raw IoT and log data for later analysis
D. Processing online transactions

✅ Answer: C

Explanation:
Data lakes are ideal for large volumes of raw, diverse data.

Question 7

Which analytical data store typically uses schema-on-write?

A. Data lake
B. Data warehouse
C. Object storage
D. Key-value store

✅ Answer: B

Explanation:
Data warehouses require a defined schema before data is loaded.

Question 8

Which of the following best describes a data lakehouse?

A. A transactional database system
B. A file storage system only
C. A hybrid of data lake and data warehouse
D. A key-value storage solution

✅ Answer: C

Explanation:
A lakehouse combines flexibility of data lakes with performance of warehouses.

Question 9

Which factor is MOST important when choosing between a data lake and a data warehouse?

A. Screen resolution
B. Data structure and query requirements
C. Programming language
D. User interface design

✅ Answer: B

Explanation:
The choice depends on data type (structured vs raw) and query needs.

Question 10

Which Azure service is BEST suited for storing large volumes of raw, unstructured data?

A. Azure SQL Database
B. Azure Data Lake Storage
C. Azure Synapse Analytics
D. Azure Table Storage

✅ Answer: B

Explanation:
Azure Data Lake Storage is optimized for large-scale raw data storage.

✅ Quick Exam Takeaways

✔ Analytical data stores support:

Reporting
Business intelligence
Large-scale analytics

✔ Main types:

Data Warehouse → structured, SQL, high performance
Data Lake → raw, flexible, scalable
Lakehouse → hybrid approach

✔ Key concepts:

Schema-on-write → warehouse
Schema-on-read → lake

✔ Azure services:

Azure Synapse Analytics → data warehouse / analytics
Azure Data Lake Storage → data lake

✔ Exam tip:
👉 Structured + SQL → Data Warehouse
👉 Raw + flexible → Data Lake


azure, Data Cleaning, Data Engineering, Data Integration, DP-900, Microsoft Certification May 10, 2026

Describe considerations for data ingestion and processing (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
   --> Describe common elements of large-scale analytics
      --> Describe considerations for data ingestion and processing


In modern data platforms, data ingestion and processing are critical steps that determine how raw data becomes meaningful insights. For the DP-900 exam, you should understand how data enters a system, how it is transformed, and the key design considerations involved.

What Is Data Ingestion?

Data ingestion is the process of collecting and importing data from various sources into a storage or analytics system.

Common Data Sources

Databases (relational and NoSQL)
Files (CSV, JSON, logs)
Streaming data (IoT devices, sensors)
Applications and APIs

Types of Data Ingestion

1. Batch Ingestion

Data is collected and processed at scheduled intervals
Suitable for large volumes of data
Higher latency (not real-time)

✔ Example:

Daily sales data uploads

✔ Common Azure service:
Azure Data Factory

2. Stream (Real-Time) Ingestion

Data is ingested continuously as it is generated
Low latency (near real-time processing)

✔ Example:

IoT sensor data
Live website activity

✔ Common Azure services:

Azure Event Hubs (ingests the event stream)
Azure Stream Analytics (processes the stream after ingestion)

What Is Data Processing?

Data processing involves transforming raw data into a usable format for analysis.

Typical Processing Tasks

Cleaning data (removing errors, duplicates)
Transforming formats (e.g., JSON → tabular)
Aggregating data (summaries, totals)
Enriching data (adding additional context)
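The four tasks above can be sketched as one small Python function. The order records, the store-to-region lookup, and the `process` function are hypothetical; a real pipeline would run equivalent steps in a service such as Azure Data Factory or Azure Synapse Analytics.

```python
import json

# Hypothetical raw input: one JSON document per line.
RAW = [
    '{"order_id": 1, "store": "S1", "amount": 10.0}',
    '{"order_id": 1, "store": "S1", "amount": 10.0}',
    '{"order_id": 2, "store": "S2", "amount": null}',
    '{"order_id": 3, "store": "S1", "amount": 5.5}',
]
# Hypothetical reference data used for enrichment.
STORE_REGIONS = {"S1": "West", "S2": "East"}


def process(raw_lines, store_regions):
    seen_ids = set()
    rows = []
    for line in raw_lines:
        record = json.loads(line)          # transform: JSON text -> row (dict)
        if record["amount"] is None:       # clean: drop records with missing values
            continue
        if record["order_id"] in seen_ids:  # clean: drop duplicates
            continue
        seen_ids.add(record["order_id"])
        # enrich: add context from reference data
        record["region"] = store_regions.get(record["store"], "Unknown")
        rows.append(record)

    totals = {}                            # aggregate: total amount per region
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return rows, totals


rows, totals = process(RAW, STORE_REGIONS)
```

Of the four input lines, the duplicate and the record with a missing amount are discarded, leaving two enriched rows and one regional total.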

Types of Data Processing

1. Batch Processing

Processes large datasets at scheduled intervals
Efficient for historical analysis

✔ Example:

Monthly financial reporting

✔ Common Azure service:

Azure Synapse Analytics

2. Stream Processing

Processes data in real time as it arrives
Enables immediate insights and actions

✔ Example:

Fraud detection
Real-time dashboards

✔ Common Azure service:

Azure Stream Analytics

Key Considerations for Data Ingestion and Processing

1. Latency Requirements

Batch → Higher latency (minutes/hours)
Streaming → Low latency (seconds)

✔ Choose based on how quickly insights are needed.

2. Data Volume and Velocity

Large datasets require scalable solutions
High-velocity data requires streaming platforms

✔ Many Azure data services can scale out as volume grows, and some do so automatically.

3. Data Variety

Structured, semi-structured, and unstructured data
Requires flexible processing tools

4. Data Quality

Ensure accuracy and consistency
Clean and validate data during processing

5. Scalability

Systems must handle increasing data sizes
Cloud platforms provide elastic scaling

6. Cost Optimization

Batch processing is generally more cost-efficient
Streaming may cost more due to continuous processing

7. Reliability and Fault Tolerance

Ensure data is not lost during ingestion
Use checkpointing and retry mechanisms
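A minimal sketch of how checkpointing and retries work together, under simplifying assumptions: events are held in a list, the checkpoint is a dictionary holding the index of the next event to deliver, and the sink signals a transient failure by raising `ConnectionError`. All names are hypothetical, and the sketch omits the delay between retries that a production system would normally add.

```python
def ingest_with_checkpoint(events, sink, checkpoint, max_retries=3):
    """Deliver events in order, resuming from the saved checkpoint.

    checkpoint["offset"] is the index of the next event to deliver.
    It only advances after the sink accepts the event, so a restart
    resumes at the first undelivered event instead of losing data.
    """
    while checkpoint["offset"] < len(events):
        event = events[checkpoint["offset"]]
        for attempt in range(1, max_retries + 1):
            try:
                sink(event)
                break  # delivered, stop retrying
            except ConnectionError:
                if attempt == max_retries:
                    raise  # give up; the checkpoint still points at this event
        checkpoint["offset"] += 1
    return checkpoint


class FlakySink:
    """Test double: every odd-numbered call fails with a transient error."""

    def __init__(self):
        self.delivered = []
        self.calls = 0

    def __call__(self, event):
        self.calls += 1
        if self.calls % 2 == 1:
            raise ConnectionError("transient failure")
        self.delivered.append(event)


events = ["e1", "e2", "e3"]
sink = FlakySink()
checkpoint = ingest_with_checkpoint(events, sink, {"offset": 0})
```

Each event fails once and succeeds on the retry, so all three events arrive in order and the checkpoint ends at offset 3.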

Common Architecture Pattern

A typical analytics pipeline:

Ingestion
- Batch: Azure Data Factory
- Stream: Azure Event Hubs
Storage
- Data lake or storage account
Processing
- Batch: Azure Synapse Analytics
- Stream: Azure Stream Analytics
Visualization
- Reporting tools (e.g., Power BI)

Batch vs Stream — Quick Comparison

Feature	Batch Processing	Stream Processing
Data Flow	Periodic	Continuous
Latency	High	Low
Use Case	Historical analysis	Real-time insights
Cost	Lower	Higher

Why This Matters for DP-900

On the exam, you may be asked to:

Distinguish between batch and stream processing
Identify appropriate ingestion methods
Choose Azure services based on scenarios
Understand trade-offs (latency, cost, scalability)

Summary — Exam-Relevant Takeaways

✔ Data ingestion = bringing data into the system
✔ Data processing = transforming data for analysis

✔ Two main patterns:

Batch → periodic, high latency
Streaming → real-time, low latency

✔ Key considerations:

Latency
Volume and velocity
Data quality
Scalability
Cost

✔ Azure services to know:

Azure Data Factory (batch ingestion)
Azure Event Hubs (stream ingestion)
Azure Stream Analytics (real-time processing)
Azure Synapse Analytics (batch processing)

Go to the Practice Exam Questions for this topic.


azure, Data Development, Data Engineering, Data Integration, DP-900, Microsoft Certification May 10, 2026

Practice Questions: Describe considerations for data ingestion and processing (DP-900 Exam Prep)

Practice Questions

Question 1

What is the primary purpose of data ingestion?

A. To visualize data
B. To store data permanently
C. To collect and import data into a system
D. To delete outdated data

✅ Answer: C

Explanation:
Data ingestion is the process of bringing data into a storage or analytics system.

Question 2

Which type of ingestion processes data at scheduled intervals?

A. Stream ingestion
B. Batch ingestion
C. Real-time ingestion
D. Event-driven ingestion

✅ Answer: B

Explanation:
Batch ingestion processes data periodically, not continuously.

Question 3

Which Azure service is commonly used for batch data ingestion?

A. Azure Event Hubs
B. Azure Data Factory
C. Azure Stream Analytics
D. Azure Virtual Machines

✅ Answer: B

Explanation:
Azure Data Factory is designed for batch ETL/ELT workflows.

Question 4

Which scenario requires stream (real-time) ingestion?

A. Monthly sales reporting
B. Archiving old data
C. Monitoring live sensor data from IoT devices
D. Migrating historical records

✅ Answer: C

Explanation:
Streaming ingestion is used for continuous, real-time data like IoT.

Question 5

What is the primary benefit of stream processing?

A. Lower cost
B. Simpler architecture
C. Real-time insights
D. Reduced storage requirements

✅ Answer: C

Explanation:
Stream processing enables low-latency, real-time analysis.

Question 6

Which Azure service is used for real-time data ingestion at scale?

A. Azure Synapse Analytics
B. Azure Blob Storage
C. Azure Event Hubs
D. Azure Files

✅ Answer: C

Explanation:
Azure Event Hubs is designed for high-throughput streaming ingestion.

Question 7

Which type of processing is BEST suited for historical data analysis?

A. Stream processing
B. Batch processing
C. Real-time processing
D. Event-driven processing

✅ Answer: B

Explanation:
Batch processing is ideal for large, historical datasets.

Question 8

Which factor is MOST important when choosing between batch and stream processing?

A. File format
B. Latency requirements
C. Storage account type
D. Programming language

✅ Answer: B

Explanation:
The key decision is how quickly the data needs to be processed.

Question 9

Which Azure service is used to process streaming data in real time?

A. Azure Data Factory
B. Azure Stream Analytics
C. Azure SQL Database
D. Azure Files

✅ Answer: B

Explanation:
Azure Stream Analytics processes real-time streaming data.

Question 10

Which of the following is a key consideration when designing a data ingestion pipeline?

A. Screen resolution
B. Latency, scalability, and data volume
C. Programming language syntax
D. User interface design

✅ Answer: B

Explanation:
Important considerations include latency, scalability, volume, and data quality.

✅ Quick Exam Takeaways

✔ Data ingestion = bringing data into the system
✔ Data processing = transforming data for analysis

✔ Two main approaches:

Batch → scheduled, high latency
Streaming → continuous, low latency

✔ Key Azure services:

Azure Data Factory → batch ingestion
Azure Event Hubs → streaming ingestion
Azure Stream Analytics → real-time processing
Azure Synapse Analytics → batch processing

✔ Key decision factor:
👉 Do you need real-time insights or not?


azure, DP-900, Microsoft Certification May 10, 2026

Practice Questions: Describe Azure Cosmos DB APIs (DP-900 Exam Prep)

Practice Questions

Question 1

Which API in Azure Cosmos DB uses a SQL-like query language?

A. Gremlin API
B. Cassandra API
C. Core (SQL) API
D. Table API

✅ Answer: C

Explanation:
The Core (SQL) API uses a SQL-like syntax to query JSON documents.

Question 2

Which Azure Cosmos DB API is BEST suited for applications currently using MongoDB?

A. Core (SQL) API
B. MongoDB API
C. Cassandra API
D. Table API

✅ Answer: B

Explanation:
The MongoDB API provides compatibility with MongoDB drivers and queries.

Question 3

Which API should you choose for graph-based data and relationships?

A. Table API
B. Cassandra API
C. Gremlin API
D. MongoDB API

✅ Answer: C

Explanation:
The Gremlin API is designed for graph data models and relationship analysis.

Question 4

Which API in Cosmos DB is most similar to Azure Table Storage?

A. MongoDB API
B. Cassandra API
C. Table API
D. Core (SQL) API

✅ Answer: C

Explanation:
The Table API uses a key-value model similar to Azure Table Storage.

Question 5

Which statement about Azure Cosmos DB APIs is TRUE?

A. You can switch APIs after creating the account
B. Each API uses a different query language and data model
C. All APIs use T-SQL
D. APIs determine storage redundancy

✅ Answer: B

Explanation:
Each API has its own data model and query language.

Question 6

Which API would you choose for a distributed system currently using Apache Cassandra?

A. Core (SQL) API
B. MongoDB API
C. Cassandra API
D. Gremlin API

✅ Answer: C

Explanation:
The Cassandra API supports Cassandra Query Language (CQL) and workloads.

Question 7

Which API is the default and most commonly used in Azure Cosmos DB?

A. Table API
B. Gremlin API
C. Core (SQL) API
D. Cassandra API

✅ Answer: C

Explanation:
The Core (SQL) API is the most commonly used and general-purpose API.

Question 8

Which scenario is BEST suited for the Table API?

A. Complex graph traversal
B. Large-scale relational queries
C. Simple key-value data storage
D. Document-based analytics

✅ Answer: C

Explanation:
The Table API is ideal for simple, scalable key-value storage.

Question 9

What is a key consideration when choosing a Cosmos DB API?

A. The size of the storage account
B. The number of virtual machines
C. The application’s existing data model and query language
D. The type of Azure subscription

✅ Answer: C

Explanation:
API selection depends on existing technologies and data models.

Question 10

Which statement best describes Azure Cosmos DB APIs?

A. Each API uses a different underlying database engine
B. APIs provide different ways to interact with the same service
C. APIs are only used for relational data
D. APIs determine the pricing tier only

✅ Answer: B

Explanation:
All APIs use the same Cosmos DB service but offer different interfaces and models.

✅ Quick Exam Takeaways

✔ Cosmos DB APIs allow different ways to interact with the same service

✔ APIs:

Core (SQL) → SQL-like queries (most common)
MongoDB → MongoDB compatibility
Cassandra → Distributed systems (CQL)
Table → Key-value storage
Gremlin → Graph data

✔ Key concepts:

API choice depends on data model and existing system
API selection is permanent after creation

✔ Exam tip:
👉 Match data model → API type


azure, DP-900, Microsoft Certification May 10, 2026

Describe Azure Cosmos DB APIs (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe considerations for working with non-relational data on Azure (15–20%)
   --> Describe capabilities and features of Azure Cosmos DB
      --> Describe Azure Cosmos DB APIs


Azure Cosmos DB supports multiple APIs that allow developers to interact with the database using different data models and familiar query languages.

For the DP-900 exam, you should understand what these APIs are, how they differ, and when to use each one.

What Are Azure Cosmos DB APIs?

APIs in Azure Cosmos DB define:

How data is structured
How it is queried
Which tools and SDKs are used

✔ Each API provides a different way to interact with the same underlying Cosmos DB service.

Why Multiple APIs?

Azure Cosmos DB supports multiple APIs to:

Allow developers to use familiar tools
Enable easy migration from existing systems
Support different types of applications and data models

💡 Key idea:
👉 Choose the API based on your application’s existing technology or data model

Core Azure Cosmos DB APIs

1. Core (SQL) API

Also known as the SQL API. Current Microsoft documentation calls it Azure Cosmos DB for NoSQL.

Key Features

Uses a SQL-like query language
Stores data as JSON documents
Most commonly used API

Use Cases

New application development
General-purpose NoSQL workloads

✔ Best for: Developers familiar with SQL who want flexibility

2. MongoDB API

Key Features

Compatible with MongoDB drivers and tools
Uses MongoDB query syntax

Use Cases

Migrating existing MongoDB applications
Applications already using MongoDB

✔ Best for: MongoDB workloads moving to Azure

3. Cassandra API

Key Features

Compatible with Apache Cassandra
Supports Cassandra Query Language (CQL)

Use Cases

Large-scale distributed workloads
Applications using Cassandra

✔ Best for: Cassandra-based systems needing cloud scalability

4. Table API

Key Features

Similar to Azure Table Storage
Key-value data model
Uses OData-based queries

Use Cases

Simple key-value workloads
Applications already using Table Storage

✔ Best for: Lightweight, scalable key-value scenarios

5. Gremlin API

Key Features

Supports graph data models
Uses Gremlin query language

Use Cases

Graph-based applications
Relationship-heavy data

✔ Best for: Social networks, recommendation engines, network analysis

Key Differences Between APIs

API	Data Model	Query Language	Best For
Core (SQL)	Document (JSON)	SQL-like	General-purpose apps
MongoDB	Document	MongoDB query	MongoDB migration
Cassandra	Wide-column	CQL	Distributed systems
Table	Key-value	OData	Simple scalable storage
Gremlin	Graph	Gremlin	Relationship-based data
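The data models in the table can be pictured with plain Python structures. This is only a rough illustration of the shapes involved, not Azure Cosmos DB SDK code; every name and value below is hypothetical.

```python
# Document model (Core (SQL) and MongoDB APIs): one nested, JSON-like item.
document = {
    "id": "u1",
    "name": "Ana",
    "orders": [{"orderId": "o1", "total": 25.0}],
}

# Key-value model (Table API): an entity addressed by partition key + row key.
table = {("customers", "u1"): {"name": "Ana"}}

# Wide-column model (Cassandra API): rows identified by a key, where
# not every row has to fill every column.
wide_column = {
    "u1": {"name": "Ana", "city": "Lisbon"},
    "u2": {"name": "Bo"},
}

# Graph model (Gremlin API): vertices connected by labeled edges.
vertices = {
    "u1": {"label": "person", "name": "Ana"},
    "u2": {"label": "person", "name": "Bo"},
}
edges = [("u1", "follows", "u2")]


def neighbors(vertex_id, edge_label):
    """Follow outgoing edges with the given label from one vertex."""
    return [dst for src, label, dst in edges if src == vertex_id and label == edge_label]


followed_by_ana = neighbors("u1", "follows")
```

The same customer appears in each structure, but the way the data is addressed differs: by nested document, by key pair, by row and column, or by traversing relationships.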

Important Concepts for DP-900

1. Same Service, Different Interfaces

All APIs run on Azure Cosmos DB, but:

Each API has its own endpoint
Each uses different query syntax
Each supports different SDKs

2. API Choice Is Permanent

You choose the API when creating a Cosmos DB account
You cannot switch APIs later

3. Performance and Features Are Shared

Global distribution
Low latency
High availability
Scalability

✔ These benefits apply regardless of API choice.

When to Choose Each API

Core (SQL) API → Default choice for most applications
MongoDB API → Existing MongoDB apps
Cassandra API → Distributed, large-scale systems
Table API → Simple key-value workloads
Gremlin API → Graph relationships

Why This Matters for DP-900

On the exam, you may be asked to:

Identify the correct API for a scenario
Match APIs to data models
Understand why multiple APIs exist
Recognize migration scenarios

Summary — Exam-Relevant Takeaways

✔ Azure Cosmos DB supports multiple APIs:

Core (SQL) API
MongoDB API
Cassandra API
Table API
Gremlin API

✔ Each API:

Uses a different data model
Has its own query language

✔ Key concept:
👉 Choose the API based on your application’s needs or existing system

✔ Important:

API choice is fixed at creation
All APIs benefit from Cosmos DB features (scalability, global distribution)

Go to the Practice Exam Questions for this topic.


azure, DP-900, Microsoft Certification May 10, 2026

Identify use cases for Azure Cosmos DB (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe considerations for working with non-relational data on Azure (15–20%)
   --> Describe capabilities and features of Azure Cosmos DB
      --> Identify use cases for Azure Cosmos DB

Azure Cosmos DB is a fully managed, globally distributed database service designed for modern applications that require low latency, massive scalability, and flexible data models.

For the DP-900 exam, you should understand when and why to use Azure Cosmos DB, especially compared to other Azure storage and database services.

What Is Azure Cosmos DB?

Azure Cosmos DB is a NoSQL, multi-model database service that supports:

Global distribution across multiple regions
Low-latency reads and writes
Automatic scaling
Multiple APIs: Core (SQL), MongoDB, Cassandra, Table, and Gremlin

✔ It is designed for high-performance, internet-scale applications.

Key Characteristics That Drive Use Cases

Understanding Cosmos DB use cases starts with its capabilities:

1. Global Distribution

Replicate data across multiple Azure regions
Users access data from the closest region

✔ Enables global applications with low latency
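
The idea of serving each user from the nearest replica can be illustrated with a toy calculation. This is only a sketch: the region names and latency figures are made-up assumptions, `closest_region` is a hypothetical helper, and real routing is handled by the service and SDK configuration rather than by code like this.

```python
# Hypothetical round-trip latencies (ms) from one user to each replica region.
# Region names and numbers are illustrative assumptions, not measurements.
latency_ms = {"West Europe": 18, "East US": 95, "Southeast Asia": 210}


def closest_region(latencies: dict[str, int]) -> str:
    """Pick the region with the lowest latency for this user."""
    return min(latencies, key=latencies.get)


print(closest_region(latency_ms))  # West Europe
```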

2. Low Latency

Single-digit millisecond reads and writes at the 99th percentile
Ideal for real-time applications

3. Massive Scalability

Scales throughput and storage independently
Handles millions of requests per second

4. Flexible Schema

Schema-less (JSON-based data model)
Supports evolving application requirements
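
A small sketch of what "schema-less" means in practice. It uses plain Python dictionaries to stand in for JSON items; the field names and values are hypothetical, and no Azure SDK is involved.

```python
import json

# Two items that could live side by side in one schema-less container.
# Field names and values are hypothetical.
items = [
    {"id": "1", "type": "book", "title": "Data Fundamentals", "pages": 320},
    {"id": "2", "type": "laptop", "brand": "Contoso", "specs": {"ram_gb": 16}},
]

# No shared schema is declared; each item carries only the fields it needs.
for item in items:
    print(json.dumps(item))

# A newly required attribute appears on new items without migrating old ones.
items.append({"id": "3", "type": "book", "title": "Azure Basics", "format": "ebook"})
```

The first book has no `format` field and is left untouched, which is what "supports evolving application requirements" means here.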

5. Multiple APIs

Each API exposes a different data model:
Core (SQL) → documents (JSON)
MongoDB → documents
Cassandra → wide-column
Table → key-value
Gremlin → graph

✔ Allows developers to use familiar tools and frameworks

Common Use Cases for Azure Cosmos DB

1. Global Web and Mobile Applications

Scenario

Applications with users distributed worldwide.

Why Cosmos DB?

Global distribution
Low latency access
High availability

✔ Example:

Social media platforms
E-commerce applications

2. Real-Time Personalization

Scenario

Applications that tailor content to users instantly.

Why Cosmos DB?

Fast read/write performance
Flexible schema

✔ Example:

Product recommendations
Personalized dashboards

3. IoT and Telemetry Data

Scenario

Large volumes of streaming data from devices.

Why Cosmos DB?

High ingestion rates
Scalable storage
Schema flexibility

✔ Example:

Sensor data collection
Smart devices

4. Gaming Applications

Scenario

Online games requiring real-time interactions.

Why Cosmos DB?

Low latency
Global availability
High throughput

✔ Example:

Leaderboards
Player profiles
Game state storage
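
A leaderboard is essentially a "top N by score" read over player items. The sketch below emulates that in plain Python over hypothetical in-memory data. Against the Core (SQL) API, a roughly comparable query would look like `SELECT TOP 3 c.player, c.score FROM c ORDER BY c.score DESC`; the item shape is an assumption.

```python
# Hypothetical player score items, shaped like JSON documents.
players = [
    {"id": "p1", "player": "Ana", "score": 4200},
    {"id": "p2", "player": "Ben", "score": 5100},
    {"id": "p3", "player": "Chi", "score": 3900},
    {"id": "p4", "player": "Dev", "score": 4800},
]


def top_players(items: list[dict], n: int) -> list[str]:
    """Return the names of the n highest-scoring players."""
    ranked = sorted(items, key=lambda item: item["score"], reverse=True)
    return [item["player"] for item in ranked[:n]]


print(top_players(players, 3))  # ['Ben', 'Dev', 'Ana']
```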

5. E-commerce Platforms

Scenario

High-traffic applications with variable workloads.

Why Cosmos DB?

Elastic scalability
Fast performance
Global distribution

✔ Example:

Shopping carts
Product catalogs
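
A shopping cart maps naturally to a single document with its line items embedded. The following sketch uses a plain Python dictionary with hypothetical field names and prices; it illustrates the document shape, not a prescribed Cosmos DB data model.

```python
# A hypothetical shopping cart stored as a single JSON-style document.
cart = {
    "id": "cart-1001",
    "customerId": "cust-42",
    "items": [
        {"sku": "A100", "name": "Keyboard", "qty": 1, "price": 49.99},
        {"sku": "B200", "name": "Mouse", "qty": 2, "price": 19.50},
    ],
}

# Line items are embedded in the cart, so reading the cart needs no join.
total = sum(line["qty"] * line["price"] for line in cart["items"])
print(f"{total:.2f}")  # 88.99
```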

6. Content Management Systems

Scenario

Managing diverse and evolving content.

Why Cosmos DB?

Schema-less design
Flexible data models

✔ Example:

Blogs
Media platforms

7. Event-Driven and Microservices Architectures

Scenario

Modern distributed applications.

Why Cosmos DB?

Scales independently per service
Supports high-throughput operations

✔ Example:

Microservices storing independent datasets

When NOT to Use Azure Cosmos DB

Cosmos DB is not ideal when:

You need complex joins across many entities or other heavily relational queries
You require strict relational consistency across multiple tables
Your workload is small and cost-sensitive

✔ In these cases, a relational database such as Azure SQL Database may be more appropriate.

Cosmos DB vs Other Azure Storage Options

Service	Best For
Blob Storage	Unstructured files (images, videos)
Azure Files	File shares
Table Storage	Simple key-value storage
Cosmos DB	Global, high-performance NoSQL apps

Why This Matters for DP-900

On the exam, you may be asked to:

Identify appropriate Cosmos DB use cases
Choose Cosmos DB for global, low-latency applications
Compare it with other Azure storage services
Recognize scenarios requiring scalability and flexibility

Summary — Exam-Relevant Takeaways

✔ Azure Cosmos DB = globally distributed NoSQL database

✔ Key strengths:

Low latency
Global distribution
Massive scalability
Flexible schema

✔ Common use cases:

Global apps
Real-time personalization
IoT and telemetry
Gaming
E-commerce

✔ Not suitable for:

Complex relational workloads
Heavy join operations

✔ Key decision factor:
👉 High scale + low latency + global users = Cosmos DB
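
The decision factor above, combined with the "not suitable" cases, can be written as a small rule of thumb. This is a hedged sketch for exam practice only: `suggests_cosmos_db` is a hypothetical helper, and real architecture decisions weigh more factors such as cost and consistency needs.

```python
def suggests_cosmos_db(high_scale: bool, low_latency: bool,
                       global_users: bool, needs_complex_joins: bool = False) -> bool:
    """Exam-style rule of thumb, not an architecture decision tool."""
    if needs_complex_joins:
        # Heavy relational workloads point to a relational database instead.
        return False
    return high_scale and low_latency and global_users


print(suggests_cosmos_db(True, True, True))                            # True
print(suggests_cosmos_db(True, True, True, needs_complex_joins=True))  # False
```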

azure, DP-900, Microsoft Certification May 10, 2026

Practice Questions: Identify use cases for Azure Cosmos DB (DP-900 Exam Prep)

Practice Questions

Question 1

Which scenario is BEST suited for Azure Cosmos DB?

A. Running complex SQL joins across multiple tables
B. Storing structured financial transactions with strict relational constraints
C. Supporting a globally distributed mobile application with low latency
D. Hosting a traditional on-premises file share

✅ Answer: C

Explanation:
Cosmos DB is ideal for globally distributed applications requiring low latency.

Question 2

Which type of application benefits MOST from Cosmos DB’s global distribution capabilities?

A. Local desktop application
B. Single-region reporting system
C. Global e-commerce website
D. Batch processing system

✅ Answer: C

Explanation:
Global applications benefit from multi-region replication and low latency.

Question 3

Which use case is BEST suited for Cosmos DB?

A. Data warehouse for historical reporting
B. IoT application collecting real-time sensor data
C. Relational database with complex joins
D. File storage for images

✅ Answer: B

Explanation:
Cosmos DB is optimized for high-ingestion, real-time data scenarios like IoT.

Question 4

Why is Cosmos DB suitable for real-time personalization scenarios?

A. It enforces strict relational schemas
B. It supports high-latency operations
C. It provides low-latency read/write performance
D. It requires predefined schemas

✅ Answer: C

Explanation:
Low latency enables instant updates and responses for personalization.

Question 5

Which application would MOST benefit from Cosmos DB?

A. Payroll system requiring strict ACID compliance across multiple tables
B. Static website hosting images
C. Gaming application storing player state globally
D. Spreadsheet-based reporting system

✅ Answer: C

Explanation:
Gaming apps require low latency, high throughput, and global availability.

Question 6

Which scenario is NOT a good fit for Cosmos DB?

A. Global content management system
B. Real-time analytics dashboard
C. Complex relational reporting with joins
D. Social media application

✅ Answer: C

Explanation:
Cosmos DB is not ideal for complex relational queries or joins.

Question 7

Which feature of Cosmos DB makes it ideal for microservices architectures?

A. Fixed schema design
B. Independent scalability for each service
C. Requirement for relational constraints
D. Limited throughput options

✅ Answer: B

Explanation:
Each microservice can keep its data in its own Cosmos DB container or database and scale its throughput independently.

Question 8

Which use case involves storing flexible, evolving data structures?

A. Financial ledger system
B. Product catalog with changing attributes
C. Relational reporting system
D. Fixed-schema inventory system

✅ Answer: B

Explanation:
Cosmos DB’s schema-less design supports evolving data models.

Question 9

Which scenario best demonstrates Cosmos DB’s high-throughput capabilities?

A. Processing monthly reports
B. Handling millions of real-time user requests
C. Archiving old documents
D. Storing backup files

✅ Answer: B

Explanation:
Cosmos DB is designed for high-throughput, real-time workloads.

Question 10

Which Azure service would you choose for a globally distributed application requiring millisecond response times?

A. Azure Blob Storage
B. Azure Files
C. Azure Cosmos DB
D. Azure Table Storage

✅ Answer: C

Explanation:
Cosmos DB is specifically designed for low-latency, globally distributed applications.

✅ Quick Exam Takeaways

✔ Cosmos DB = global, low-latency NoSQL database

✔ Best for:

Global web/mobile apps
IoT and telemetry
Gaming
Real-time personalization
Microservices

✔ Key strengths:

Global distribution
Massive scalability
Flexible schema
High throughput

✔ Not ideal for:

Complex joins
Strict relational workloads

✔ Exam tip:
👉 If you see “global + real-time + high scale” → think Cosmos DB
