Describe the difference between Batch and Streaming data (DP-900 Exam Prep)

This post is part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub.
This topic falls under these sections:
Describe an analytics workload (25–30%)
→ Describe considerations for real-time data analytics
→ Describe the difference between Batch and Streaming data


Note: each section includes 10 practice questions (with answers and explanations) to help you solidify the material. Two practice tests with 60 questions each are also available on the hub, below the exam topics section.

Understanding the difference between batch data and streaming data is fundamental for designing modern analytics solutions. These two approaches define how data is ingested, processed, and analyzed.


What Is Batch Data?

Batch data refers to data that is:

  • Collected over a period of time
  • Processed in large chunks (batches)
  • Handled at scheduled intervals

Key Characteristics of Batch Data

  • High latency (minutes, hours, or days)
  • Processes large volumes at once
  • Typically scheduled (e.g., nightly jobs)
  • Efficient and cost-effective

Common Use Cases

  • Daily sales reports
  • Monthly financial summaries
  • Historical data analysis
  • Data warehousing workloads

Azure Services for Batch Processing

  • Azure Data Factory → batch ingestion and orchestration
  • Azure Synapse Analytics → batch processing and analytics
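To make the idea concrete, here is a minimal Python sketch of a batch job: data accumulates all day, then a scheduled job processes the whole set in one pass. The data and function names are invented for illustration; this is not Azure-specific code.

```python
from collections import defaultdict

# Hypothetical sales events collected over the course of a day (batch input).
sales = [
    {"product": "widget", "amount": 20},
    {"product": "gadget", "amount": 45},
    {"product": "widget", "amount": 20},
]

def run_nightly_batch(records):
    """Process the entire accumulated dataset in one pass,
    the way a scheduled nightly job would."""
    totals = defaultdict(int)
    for record in records:
        totals[record["product"]] += record["amount"]
    return dict(totals)

print(run_nightly_batch(sales))
# {'widget': 40, 'gadget': 45}
```

Nothing is computed until the job runs, which is why latency is high but throughput per run is large.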

What Is Streaming Data?

Streaming data refers to data that is:

  • Generated continuously
  • Processed in real time (or near real time)
  • Handled as individual events or small micro-batches

Key Characteristics of Streaming Data

  • Low latency (seconds or milliseconds)
  • Continuous data flow
  • Enables real-time insights
  • Often requires more complex processing

Common Use Cases

  • IoT sensor monitoring
  • Fraud detection
  • Live dashboards
  • Website activity tracking

Azure Services for Streaming

  • Azure Event Hubs → event ingestion
  • Azure Stream Analytics → real-time processing
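The contrast with batch shows up clearly in code. The sketch below (invented names, not Azure-specific) handles each event the moment it arrives and can raise an alert immediately, instead of waiting for a scheduled run.

```python
def process_stream(events, threshold=100):
    """Handle each event as it arrives (event-at-a-time),
    keeping running state and emitting alerts immediately."""
    alerts = []
    running_total = 0
    for event in events:  # stands in for a live event source such as a queue
        running_total += event["amount"]
        if event["amount"] > threshold:
            alerts.append(f"ALERT: large transaction {event['amount']}")
    return alerts, running_total

# Simulated incoming transactions.
alerts, total = process_stream([{"amount": 30}, {"amount": 250}, {"amount": 15}])
print(alerts)
# ['ALERT: large transaction 250']
```

This is why streaming suits fraud detection and live dashboards: the insight arrives within milliseconds of the event, not after the next scheduled job.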

Batch vs Streaming — Key Differences

Feature    | Batch Processing     | Streaming Processing
-----------|----------------------|---------------------
Data flow  | Periodic             | Continuous
Latency    | High                 | Low
Data size  | Large chunks         | Small events
Complexity | Simpler              | More complex
Cost       | Lower                | Higher
Use case   | Historical analysis  | Real-time insights

When to Use Batch Processing

Choose batch when:

  • Real-time data is not required
  • You are working with large historical datasets
  • Cost efficiency is important
  • Processing can occur on a schedule

When to Use Streaming Processing

Choose streaming when:

  • You need real-time or near real-time insights
  • Data is generated continuously
  • Immediate action is required

Hybrid Approaches (Lambda / Modern Architectures)

Many modern systems use both:

  • Batch layer → historical analysis
  • Streaming layer → real-time insights

✔ Example:

  • Real-time dashboard + nightly aggregated reports
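A lambda-style hybrid can be sketched in a few lines: the same events feed a streaming (speed) layer that updates live counters incrementally, and a batch layer that periodically recomputes totals from the stored history. All names here are invented for illustration.

```python
# Events flow into both layers of a simple lambda-style setup.
events = [
    {"user": "a", "clicks": 1},
    {"user": "b", "clicks": 2},
    {"user": "a", "clicks": 3},
]

# Speed (streaming) layer: incremental counts for a live dashboard.
live_counts = {}
for e in events:
    live_counts[e["user"]] = live_counts.get(e["user"], 0) + e["clicks"]

# Batch layer: full recomputation over stored history (e.g., nightly),
# which in practice corrects any drift in the incremental view.
def nightly_recompute(history):
    totals = {}
    for e in history:
        totals[e["user"]] = totals.get(e["user"], 0) + e["clicks"]
    return totals

print(live_counts)
# {'a': 4, 'b': 2}
```

In this toy example both layers agree; the value of the batch layer appears at scale, where the authoritative recomputation reconciles late or duplicated events.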

Why This Matters for DP-900

On the exam, you may be asked to:

  • Distinguish between batch and streaming scenarios
  • Choose the appropriate processing method
  • Identify Azure services for each approach
  • Understand trade-offs (latency, cost, complexity)

Summary — Exam-Relevant Takeaways

Batch processing

  • Processes data in chunks
  • Higher latency
  • Lower cost
  • Best for historical analysis

Streaming processing

  • Processes data continuously
  • Low latency
  • Enables real-time insights
  • More complex

✔ Azure services:

  • Batch → Azure Data Factory, Azure Synapse Analytics
  • Streaming → Azure Event Hubs, Azure Stream Analytics

✔ Exam tip:
👉 Real-time requirement → Streaming
👉 Scheduled / historical → Batch


Go to the Practice Exam Questions for this topic.

Go to the DP-900 Exam Prep Hub main page.
