Describe Data and Compute Services for Data Science and Machine Learning (AI-900 Exam Prep)

This topic focuses on understanding which Azure services are used to store data and provide compute power for data science and machine learning workloads — not on how to configure them in depth. For the AI-900 exam, you should recognize what each service is used for and when you would choose one over another.


Why Data and Compute Matter in Machine Learning

Machine learning solutions require two essential components:

  • Data services → where training and inference data is stored and accessed
  • Compute services → where models are trained and executed

Azure provides scalable, cloud-based services for both, allowing organizations to build, train, and deploy machine learning solutions efficiently.


Data Services for Machine Learning on Azure

Azure offers several data storage services commonly used in machine learning scenarios.

Azure Blob Storage

Azure Blob Storage is one of the most commonly used data stores for machine learning on Azure.

Key characteristics:

  • Stores unstructured data (files, images, videos, CSVs)
  • Highly scalable and cost-effective
  • Frequently used as the data source for Azure Machine Learning experiments

Typical use cases:

  • Training datasets
  • Model artifacts
  • Logs and output files

👉 On AI-900: If the question mentions large datasets, files, or unstructured data, Blob Storage is usually the answer.


Azure Data Lake Storage Gen2

Azure Data Lake Storage is optimized for big data analytics and machine learning.

Key characteristics:

  • Built on Azure Blob Storage
  • Supports hierarchical namespaces
  • Designed for analytics workloads

Typical use cases:

  • Large-scale machine learning projects
  • Advanced analytics and data science pipelines

👉 On AI-900: Think of Data Lake Storage when big data and analytics are mentioned.
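
To make "hierarchical namespace" more concrete, here is a toy Python sketch. It is not Azure code and does not call any Azure API. It assumes a simplified model: a flat store keeps every object under its full name, so a "folder" is only a shared name prefix, while a hierarchical store keeps real directory nodes. The names flat_store, tree_store, and both rename helpers are hypothetical.

```python
# Toy model only: illustrates the idea of a hierarchical namespace,
# not how Azure implements it.

def rename_prefix_flat(store: dict, old: str, new: str) -> int:
    """Rename a 'folder' in a flat namespace; returns objects touched."""
    touched = 0
    for name in list(store):  # copy the keys, since the dict is modified
        if name.startswith(old + "/"):
            store[new + name[len(old):]] = store.pop(name)
            touched += 1
    return touched


def rename_dir_tree(tree: dict, old: str, new: str) -> int:
    """Rename a top-level directory node; returns nodes touched."""
    tree[new] = tree.pop(old)
    return 1


# Flat namespace: every object is stored under its full path-like name.
flat_store = {
    "raw/2024/a.csv": b"...",
    "raw/2024/b.csv": b"...",
    "raw/2025/c.csv": b"...",
}

# Hierarchical namespace: directories are real nodes that contain files.
tree_store = {
    "raw": {
        "2024": {"a.csv": b"...", "b.csv": b"..."},
        "2025": {"c.csv": b"..."},
    }
}

flat_touched = rename_prefix_flat(flat_store, "raw", "archive")
tree_touched = rename_dir_tree(tree_store, "raw", "archive")
print(flat_touched, tree_touched)
```

In this toy model the flat rename rewrites every object under the prefix, while the hierarchical rename touches a single directory node. That difference is one reason directory-heavy analytics workloads tend to favor a hierarchical namespace.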


Azure SQL Database

Azure SQL Database stores structured, relational data.

Key characteristics:

  • Table-based storage
  • Uses SQL for querying
  • Suitable for well-defined schemas

Typical use cases:

  • Business and transactional data
  • Structured datasets used in ML training

👉 On AI-900: If the data is relational and structured, Azure SQL Database is a common choice.
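
As a rough illustration of "table-based storage queried with SQL", the sketch below builds a small table and selects feature and label columns for training. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in so the example runs anywhere. It does not connect to Azure SQL Database, and the customers table and its columns are hypothetical.

```python
import sqlite3

# Stand-in only: an in-memory SQLite database, NOT Azure SQL Database.
# The table name, columns, and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, age INTEGER, monthly_spend REAL, churned INTEGER)"
)
conn.executemany(
    "INSERT INTO customers (age, monthly_spend, churned) VALUES (?, ?, ?)",
    [(34, 52.0, 0), (51, 18.5, 1), (27, 75.25, 0)],
)

# A SQL query selects the feature columns and the label column.
rows = conn.execute(
    "SELECT age, monthly_spend, churned FROM customers ORDER BY id"
).fetchall()
conn.close()

# Split each row into features (inputs) and a label (value to predict).
features = [(age, spend) for age, spend, _ in rows]
labels = [churned for _, _, churned in rows]
print(features, labels)
```

The well-defined schema is what makes this split straightforward: every row has the same typed columns, so features and labels can be selected by name.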


Compute Services for Machine Learning on Azure

Compute services provide the processing power needed to train and run machine learning models.


Azure Machine Learning Compute

Azure Machine Learning provides managed compute resources specifically designed for ML workloads.

Key characteristics:

  • Scalable CPU and GPU compute
  • Used for training and inference
  • Managed through an Azure Machine Learning workspace

Typical use cases:

  • Model training
  • Experimentation
  • Batch inference

👉 On AI-900: This is the primary compute service for machine learning.


Azure Virtual Machines

Azure Virtual Machines (VMs) offer full control over the compute environment.

Key characteristics:

  • Customizable CPU or GPU configurations
  • Supports specialized ML workloads
  • More management responsibility

Typical use cases:

  • Custom machine learning environments
  • Legacy or specialized ML tools

👉 On AI-900: VMs appear when flexibility or custom configuration is required.


Azure Kubernetes Service (AKS)

AKS is used primarily for deploying machine learning models at scale.

Key characteristics:

  • Container orchestration
  • High availability and scalability
  • Often used for real-time inference

Typical use cases:

  • Production ML model deployment
  • Scalable inference endpoints

👉 On AI-900: AKS is associated with deployment, not training.


How These Services Work Together

In a typical Azure machine learning workflow:

  1. Data is stored in Blob Storage, Data Lake, or SQL Database
  2. Models are trained using Azure Machine Learning compute or VMs
  3. Models are deployed using Azure Machine Learning or AKS
  4. Predictions are generated and consumed by applications

Azure provides built-in scalability, security features, and integration across these services.
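
The four workflow steps above can be sketched as a small data structure that maps each stage to the services this guide associates with it. This is a study aid, not an Azure SDK example. The Stage class, the WORKFLOW tuple, the stages_for helper, and the "client applications" label are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    step: int
    activity: str
    services: tuple


# Hypothetical summary of the workflow described in this guide.
WORKFLOW = (
    Stage(1, "store data", (
        "Azure Blob Storage",
        "Azure Data Lake Storage Gen2",
        "Azure SQL Database",
    )),
    Stage(2, "train model", (
        "Azure Machine Learning compute",
        "Azure Virtual Machines",
    )),
    Stage(3, "deploy model", (
        "Azure Machine Learning",
        "Azure Kubernetes Service (AKS)",
    )),
    Stage(4, "consume predictions", ("client applications",)),
)


def stages_for(service: str) -> list:
    """Return the activities in which a given service appears."""
    return [s.activity for s in WORKFLOW if service in s.services]


for stage in WORKFLOW:
    print(f"{stage.step}. {stage.activity}: {', '.join(stage.services)}")
```

Looking up a service with stages_for shows where it fits: AKS appears only under "deploy model", which matches the exam cue that AKS is about deployment rather than training.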


Key Exam Takeaways

For AI-900, remember:

  • Blob Storage → unstructured ML data
  • Data Lake Storage → big data analytics
  • Azure SQL Database → structured data
  • Azure Machine Learning compute → training and experimentation
  • Virtual Machines → custom compute environments
  • AKS → scalable model deployment

You are not expected to configure these services — only recognize their purpose.


Exam Tip 💡

If a question asks:

  • “Where is ML data stored?” → Blob Storage or Data Lake
  • “Where is the model trained?” → Azure Machine Learning compute
  • “How is a model deployed at scale?” → AKS
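
The keyword cues in this guide can also be practiced with a small lookup. The sketch below is a hypothetical study aid: the CUES table and the suggest function are assumptions made for illustration, the keyword lists are a simplification of the tips in this guide, and real exam questions need to be read in full rather than matched on single words.

```python
import re

# Hypothetical cue table: each service paired with keywords from this guide.
CUES = (
    ("Azure Blob Storage", ("unstructured", "files", "images")),
    ("Azure Data Lake Storage Gen2", ("big data", "analytics")),
    ("Azure SQL Database", ("relational", "structured")),
    ("Azure Machine Learning compute", ("trained", "training", "experimentation")),
    ("Azure Virtual Machines", ("custom", "full control")),
    ("Azure Kubernetes Service (AKS)", ("deployed", "deployment", "real time")),
)


def suggest(question: str) -> list:
    """Return services whose cue words appear as whole words in the question."""
    # Normalize to lowercase words separated by single spaces, padded so
    # that "structured" does not match inside "unstructured".
    text = " " + " ".join(re.findall(r"[a-z0-9]+", question.lower())) + " "
    return [svc for svc, cues in CUES if any(f" {c} " in text for c in cues)]


print(suggest("Where is the model trained?"))
print(suggest("How is a model deployed at scale?"))
print(suggest("Where should unstructured images be stored?"))
```

Matching on whole words keeps "unstructured" from triggering the "structured" cue for Azure SQL Database, which mirrors the distinction the exam expects you to make.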
