Tag: Analytical Data Stores

Describe options for analytical data stores (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
--> Describe common elements of large-scale analytics
--> Describe options for analytical data stores


Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Analytical data stores are designed to support reporting, business intelligence, and large-scale data analysis. For the DP-900 exam, you should understand the different types of analytical stores, their characteristics, and when to use each.


What Is an Analytical Data Store?

An analytical data store is optimized for:

  • Querying large volumes of data
  • Aggregations and reporting
  • Historical analysis

✔ Unlike transactional systems, analytical stores focus on read-heavy workloads rather than frequent updates.


Key Characteristics

  • Optimized for complex queries and aggregations
  • Stores historical data
  • Handles large datasets (TBs to PBs)
  • Typically uses denormalized schemas
  • Designed for high-performance reads

Main Types of Analytical Data Stores


1. Data Warehouse

Definition

A structured repository designed for relational analytical queries.

Key Features

  • Uses structured data
  • Schema-based (often star or snowflake schema)
  • Supports SQL queries

Azure Example

Azure Synapse Analytics

Use Cases

  • Business intelligence reporting
  • Financial analysis
  • Enterprise dashboards

Best for: Structured data and SQL-based analytics


2. Data Lake

Definition

A storage repository for raw data in its native format.

Key Features

  • Supports structured, semi-structured, and unstructured data
  • Schema-on-read (schema applied when querying)
  • Highly scalable and cost-effective

Azure Example

Azure Data Lake Storage

Use Cases

  • Big data analytics
  • Machine learning
  • Storing raw ingestion data

Best for: Flexible, large-scale data storage


3. Data Lakehouse (Conceptual)

Definition

A hybrid approach combining features of data lakes and data warehouses.

Key Features

  • Stores raw data like a data lake
  • Supports structured queries like a warehouse
  • Often uses open formats (e.g., Parquet, Delta)

Azure Context

  • Often implemented using:
    • Azure Data Lake Storage
    • Azure Synapse Analytics

Best for: Unified analytics platform


4. Analytical Databases / Big Data Processing Systems

Definition

Systems designed for distributed processing of large datasets.

Azure Example

Azure Synapse Analytics

Key Features

  • Parallel processing
  • Handles massive datasets
  • Supports batch and interactive queries

Best for: Large-scale analytics workloads


Comparison of Analytical Data Stores

FeatureData WarehouseData LakeLakehouse
Data TypeStructuredAll typesAll types
SchemaSchema-on-writeSchema-on-readHybrid
CostHigherLowerModerate
FlexibilityLowHighHigh
Query PerformanceHighVariableHigh

Key Design Considerations


1. Data Structure

  • Structured → Data warehouse
  • Mixed or raw → Data lake

2. Query Requirements

  • Complex SQL queries → Data warehouse
  • Exploratory analytics → Data lake

3. Cost

  • Data lakes are generally more cost-effective
  • Warehouses provide optimized performance at higher cost

4. Scalability

  • All Azure analytical stores scale
  • Data lakes excel in massive data storage

5. Performance Needs

  • Warehouses → optimized for speed
  • Lakes → optimized for storage and flexibility

Typical Analytics Architecture

  1. Data Ingestion
    • Batch or streaming
  2. Storage
    • Data lake or data warehouse
  3. Processing
    • Transformations and aggregations
  4. Visualization
    • BI tools (e.g., Power BI)

Why This Matters for DP-900

On the exam, you may be asked to:

  • Identify the correct analytical store for a scenario
  • Compare data lakes vs data warehouses
  • Understand schema-on-read vs schema-on-write
  • Recognize Azure services used for analytics

Summary — Exam-Relevant Takeaways

✔ Analytical data stores are used for:

  • Reporting
  • Analytics
  • Historical data analysis

✔ Main types:

  • Data Warehouse → structured, high-performance queries
  • Data Lake → raw, flexible storage
  • Lakehouse → hybrid approach

✔ Key concepts:

  • Schema-on-write (warehouse)
  • Schema-on-read (lake)

✔ Azure services to know:

  • Azure Synapse Analytics → data warehouse & analytics
  • Azure Data Lake Storage → scalable data lake

✔ Exam tip:
👉 Structured + SQL analytics → Data Warehouse
👉 Raw + flexible + big data → Data Lake


Go to the Practice Exam Questions for this topic.

Go to the DP-900 Exam Prep Hub main page.

Practice Questions: Describe options for analytical data stores (DP-900 Exam Prep)

Practice Questions


Question 1

What is the primary purpose of an analytical data store?

A. To process high-volume transactions
B. To store temporary application data
C. To support reporting and data analysis
D. To manage user authentication

Answer: C

Explanation:
Analytical data stores are optimized for reporting, querying, and analysis, not transactions.


Question 2

Which type of data store is BEST suited for structured data and complex SQL queries?

A. Data lake
B. Data warehouse
C. File storage
D. Key-value store

Answer: B

Explanation:
Data warehouses are designed for structured data and high-performance SQL queries.


Question 3

Which Azure service is commonly used as a data warehouse?

A. Azure Data Lake Storage
B. Azure Synapse Analytics
C. Azure Files
D. Azure Table Storage

Answer: B

Explanation:
Azure Synapse Analytics provides data warehousing and large-scale analytics capabilities.


Question 4

What is a key characteristic of a data lake?

A. Requires predefined schema before loading data
B. Stores only structured data
C. Stores data in its raw format
D. Optimized for transactional workloads

Answer: C

Explanation:
Data lakes store raw data in native formats, supporting schema-on-read.


Question 5

Which concept describes applying schema when data is read rather than when it is written?

A. Schema-on-write
B. Schema-on-read
C. Data normalization
D. Data partitioning

Answer: B

Explanation:
Schema-on-read is used in data lakes, allowing flexible analysis.


Question 6

Which scenario is BEST suited for a data lake?

A. Financial reporting with strict schema
B. Running complex SQL joins on structured data
C. Storing raw IoT and log data for later analysis
D. Processing online transactions

Answer: C

Explanation:
Data lakes are ideal for large volumes of raw, diverse data.


Question 7

Which analytical data store typically uses schema-on-write?

A. Data lake
B. Data warehouse
C. Object storage
D. Key-value store

Answer: B

Explanation:
Data warehouses require a defined schema before data is loaded.


Question 8

Which of the following best describes a data lakehouse?

A. A transactional database system
B. A file storage system only
C. A hybrid of data lake and data warehouse
D. A key-value storage solution

Answer: C

Explanation:
A lakehouse combines flexibility of data lakes with performance of warehouses.


Question 9

Which factor is MOST important when choosing between a data lake and a data warehouse?

A. Screen resolution
B. Data structure and query requirements
C. Programming language
D. User interface design

Answer: B

Explanation:
The choice depends on data type (structured vs raw) and query needs.


Question 10

Which Azure service is BEST suited for storing large volumes of raw, unstructured data?

A. Azure SQL Database
B. Azure Data Lake Storage
C. Azure Synapse Analytics
D. Azure Table Storage

Answer: B

Explanation:
Azure Data Lake Storage is optimized for large-scale raw data storage.


✅ Quick Exam Takeaways

✔ Analytical data stores support:

  • Reporting
  • Business intelligence
  • Large-scale analytics

✔ Main types:

  • Data Warehouse → structured, SQL, high performance
  • Data Lake → raw, flexible, scalable
  • Lakehouse → hybrid approach

✔ Key concepts:

  • Schema-on-write → warehouse
  • Schema-on-read → lake

✔ Azure services:

  • Azure Synapse Analytics → data warehouse / analytics
  • Azure Data Lake Storage → data lake

✔ Exam tip:
👉 Structured + SQL → Data Warehouse
👉 Raw + flexible → Data Lake


Go to the DP-900 Exam Prep Hub main page.