Describe options for analytical data stores (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
--> Describe common elements of large-scale analytics
--> Describe options for analytical data stores


Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Analytical data stores are designed to support reporting, business intelligence, and large-scale data analysis. For the DP-900 exam, you should understand the different types of analytical stores, their characteristics, and when to use each.


What Is an Analytical Data Store?

An analytical data store is optimized for:

  • Querying large volumes of data
  • Aggregations and reporting
  • Historical analysis

✔ Unlike transactional systems, analytical stores focus on read-heavy workloads rather than frequent updates.


Key Characteristics

  • Optimized for complex queries and aggregations
  • Stores historical data
  • Handles large datasets (TBs to PBs)
  • Typically uses denormalized schemas
  • Designed for high-performance reads

Main Types of Analytical Data Stores


1. Data Warehouse

Definition

A structured repository designed for relational analytical queries.

Key Features

  • Uses structured data
  • Schema-based (often star or snowflake schema)
  • Supports SQL queries

Azure Example

Azure Synapse Analytics

Use Cases

  • Business intelligence reporting
  • Financial analysis
  • Enterprise dashboards

Best for: Structured data and SQL-based analytics


2. Data Lake

Definition

A storage repository for raw data in its native format.

Key Features

  • Supports structured, semi-structured, and unstructured data
  • Schema-on-read (schema applied when querying)
  • Highly scalable and cost-effective

Azure Example

Azure Data Lake Storage

Use Cases

  • Big data analytics
  • Machine learning
  • Storing raw ingestion data

Best for: Flexible, large-scale data storage


3. Data Lakehouse (Conceptual)

Definition

A hybrid approach combining features of data lakes and data warehouses.

Key Features

  • Stores raw data like a data lake
  • Supports structured queries like a warehouse
  • Often uses open formats (e.g., Parquet, Delta)

Azure Context

  • Often implemented using:
    • Azure Data Lake Storage
    • Azure Synapse Analytics

Best for: Unified analytics platform


4. Analytical Databases / Big Data Processing Systems

Definition

Systems designed for distributed processing of large datasets.

Azure Example

Azure Synapse Analytics

Key Features

  • Parallel processing
  • Handles massive datasets
  • Supports batch and interactive queries

Best for: Large-scale analytics workloads


Comparison of Analytical Data Stores

FeatureData WarehouseData LakeLakehouse
Data TypeStructuredAll typesAll types
SchemaSchema-on-writeSchema-on-readHybrid
CostHigherLowerModerate
FlexibilityLowHighHigh
Query PerformanceHighVariableHigh

Key Design Considerations


1. Data Structure

  • Structured → Data warehouse
  • Mixed or raw → Data lake

2. Query Requirements

  • Complex SQL queries → Data warehouse
  • Exploratory analytics → Data lake

3. Cost

  • Data lakes are generally more cost-effective
  • Warehouses provide optimized performance at higher cost

4. Scalability

  • All Azure analytical stores scale
  • Data lakes excel in massive data storage

5. Performance Needs

  • Warehouses → optimized for speed
  • Lakes → optimized for storage and flexibility

Typical Analytics Architecture

  1. Data Ingestion
    • Batch or streaming
  2. Storage
    • Data lake or data warehouse
  3. Processing
    • Transformations and aggregations
  4. Visualization
    • BI tools (e.g., Power BI)

Why This Matters for DP-900

On the exam, you may be asked to:

  • Identify the correct analytical store for a scenario
  • Compare data lakes vs data warehouses
  • Understand schema-on-read vs schema-on-write
  • Recognize Azure services used for analytics

Summary — Exam-Relevant Takeaways

✔ Analytical data stores are used for:

  • Reporting
  • Analytics
  • Historical data analysis

✔ Main types:

  • Data Warehouse → structured, high-performance queries
  • Data Lake → raw, flexible storage
  • Lakehouse → hybrid approach

✔ Key concepts:

  • Schema-on-write (warehouse)
  • Schema-on-read (lake)

✔ Azure services to know:

  • Azure Synapse Analytics → data warehouse & analytics
  • Azure Data Lake Storage → scalable data lake

✔ Exam tip:
👉 Structured + SQL analytics → Data Warehouse
👉 Raw + flexible + big data → Data Lake


Go to the Practice Exam Questions for this topic.

Go to the DP-900 Exam Prep Hub main page.

Leave a comment