Category: Machine Learning (ML)

AI, AI-900, Artificial Intelligence (AI), Machine Learning (ML), Microsoft Certification January 31, 2026

Identify Features and Labels in a Dataset for Machine Learning (AI-900 Exam Prep)

This section of the AI-900: Microsoft Azure AI Fundamentals exam focuses on understanding one of the most important foundational concepts in machine learning: features and labels. You are not expected to build models or write code, but you must be able to recognize features and labels in a dataset and understand their role in different machine learning scenarios.

This topic appears under: Describe fundamental principles of machine learning on Azure (15–20%) → Describe core machine learning concepts

What Is a Dataset in Machine Learning?

A dataset is a collection of data used to train, validate, and test machine learning models. In supervised learning scenarios (which are emphasized in AI-900), a dataset typically contains:

Features: The input values used to make predictions
Labels: The output or target values the model learns to predict

Each row in a dataset usually represents a single observation or record, and each column represents either a feature or a label.

What Are Features?

Features are the individual measurable properties or characteristics of the data that are used as inputs to a machine learning model.

Key Characteristics of Features

Features describe what you know about each data point
They are used by the model to identify patterns
Features can be numerical, categorical, or derived

Examples of Features

Scenario	Example Features
House price prediction	Number of bedrooms, square footage, location
Customer churn	Account age, number of support tickets, monthly spend
Email classification	Word frequency, sender domain, message length

In Azure Machine Learning, features are often referred to as input variables.

What Are Labels?

A label is the value that a machine learning model is trained to predict. Labels are only present in supervised learning datasets.

Key Characteristics of Labels

Labels represent the outcome or answer
A dataset usually has one label column
Labels are known during training but unknown during prediction

Examples of Labels

Scenario	Label
House price prediction	Sale price
Customer churn	Churned (Yes/No)
Image classification	Object category

In Azure Machine Learning, labels are often called target variables.

Features vs Labels: Key Differences

Aspect	Features	Labels
Purpose	Input to the model	Output to predict
Quantity	Usually many	Typically one
Known during training	Yes	Yes
Known during prediction	Yes	No

Understanding this distinction is critical for AI-900 exam questions.

Features and Labels in Supervised Learning

Supervised learning relies on labeled datasets. The model learns by comparing its predictions to the known labels and adjusting accordingly.

Common Supervised Learning Types

Regression
- Features: numeric or categorical inputs
- Label: numeric value (e.g., price, temperature)
Classification
- Features: descriptive inputs
- Label: category or class (e.g., spam/not spam)
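
The difference between the two types comes down to the kind of label being predicted. As a rough illustration only, the sketch below guesses the task type from the label values. The helper name `supervised_task_type` is hypothetical (it is not part of Azure Machine Learning), and it assumes that numeric labels always mean regression, which would misjudge classes stored as numeric codes such as 0 and 1.

```python
def supervised_task_type(labels):
    """Guess the supervised task from the label values.

    Assumption for illustration: numeric labels -> regression,
    anything else -> classification.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_numeric = all(
        isinstance(value, (int, float)) and not isinstance(value, bool)
        for value in labels
    )
    return "regression" if is_numeric else "classification"


# Made-up label columns for the two scenarios described above.
house_prices = [250000, 310500.0, 199900]
email_classes = ["spam", "not spam", "spam"]

print(supervised_task_type(house_prices))
print(supervised_task_type(email_classes))
```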

Features and Labels in Unsupervised Learning

Unsupervised learning datasets do not contain labels.

The model identifies patterns or groupings on its own
Common example: clustering

In AI-900, this distinction is important:

If a dataset has no labels, it is not supervised learning.

Real-World Azure Example

Consider a dataset used in Azure Machine Learning to predict whether a customer will cancel a subscription.

Features:
- Number of logins per month
- Subscription length
- Customer support interactions
Label:
- Subscription canceled (Yes or No)

The model learns the relationship between the features and the label to make future predictions.
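
To make the example concrete, here is a minimal sketch of how such a dataset could be separated into features and a label. The records, the column names, and the helper `split_features_and_label` are all made up for illustration; AI-900 does not require writing code like this.

```python
# Hypothetical rows for the subscription example; all values are invented.
records = [
    {"logins_per_month": 22, "subscription_months": 18, "support_interactions": 1, "canceled": "No"},
    {"logins_per_month": 3, "subscription_months": 2, "support_interactions": 5, "canceled": "Yes"},
    {"logins_per_month": 15, "subscription_months": 30, "support_interactions": 0, "canceled": "No"},
]

LABEL_COLUMN = "canceled"  # the column the model should learn to predict


def split_features_and_label(rows, label_column):
    """Return (features, labels): every column except the label, and the label values."""
    features = [
        {name: value for name, value in row.items() if name != label_column}
        for row in rows
    ]
    labels = [row[label_column] for row in rows]
    return features, labels


features, labels = split_features_and_label(records, LABEL_COLUMN)
print(features[0])  # inputs for the first customer
print(labels)       # known outcomes used during training
```

Each row stays a single observation; only the columns are divided into inputs and the one target.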

Exam Tips for AI-900

If the question asks “what the model uses to make predictions”, look for features
If the question asks “what the model predicts”, look for labels
If labels are present, it is supervised learning
AI-900 focuses on conceptual understanding, not data science implementation

Key Takeaways

Features are input variables used to make predictions
Labels are the known outcomes the model learns to predict
Supervised learning requires labeled data
Being able to identify features and labels in a scenario is essential for AI-900

This knowledge forms the foundation for understanding regression, classification, and many Azure AI workloads covered later in the exam.

Practice Questions: Describe How Training and Validation Datasets Are Used in Machine Learning (AI-900 Exam Prep)

Practice Exam Questions

Question 1

What is the primary purpose of a training dataset in machine learning?

A. To evaluate the model’s accuracy on new data
B. To teach the model patterns using known outcomes
C. To store prediction results
D. To deploy the model to production

Correct Answer: B

Explanation:
The training dataset is used to teach the model by learning relationships between features and labels.

Question 2

Which dataset is used to assess how well a machine learning model performs on unseen data?

A. Training dataset
B. Feature dataset
C. Validation dataset
D. Prediction dataset

Correct Answer: C

Explanation:
The validation dataset is separate from training data and is used to evaluate the model’s ability to generalize.

Question 3

Why should the same dataset not be used for both training and validation?

A. It increases storage costs
B. It slows down training
C. It can lead to misleading performance results
D. It prevents model deployment

Correct Answer: C

Explanation:
Using the same data for training and validation can hide overfitting and give an inaccurate measure of model performance.

Question 4

A model performs very well on training data but poorly on validation data. What is this most likely an example of?

A. Underfitting
B. Overfitting
C. Data labeling
D. Feature engineering

Correct Answer: B

Explanation:
Overfitting occurs when a model memorizes training data but fails to generalize to new, unseen data.

Question 5

Which statement about a validation dataset is TRUE?

A. It is used to adjust model parameters
B. It replaces the need for training data
C. It helps evaluate model performance
D. It contains only unlabeled data

Correct Answer: C

Explanation:
Validation data is used to assess how well the model performs but is not used to train or adjust it.

Question 6

In supervised learning, which datasets typically contain both features and labels?

A. Validation only
B. Training only
C. Both training and validation
D. Neither training nor validation

Correct Answer: C

Explanation:
Both datasets contain features and labels, but they are used for different purposes.

Question 7

What is a key benefit of using a validation dataset during model development?

A. Faster training times
B. Automatic feature creation
C. Detection of overfitting
D. Reduced data storage

Correct Answer: C

Explanation:
Validation data helps identify whether the model is overfitting the training data.

Question 8

A dataset is split into 80% training data and 20% validation data.
What is the purpose of the 20% portion?

A. To retrain the model after deployment
B. To evaluate the model’s predictions
C. To generate new features
D. To label the data

Correct Answer: B

Explanation:
The validation portion is used to evaluate how well the model performs on unseen data.

Question 9

Which phrase best describes how a validation dataset is used?

A. Teaching the model
B. Fine-tuning the labels
C. Testing model generalization
D. Storing predictions

Correct Answer: C

Explanation:
Validation data is used to test how well the model generalizes beyond its training data.

Question 10

Which scenario correctly describes the use of training and validation datasets?

A. Training data is used only after deployment
B. Validation data is used to adjust model weights
C. Training data teaches the model; validation data evaluates it
D. Both datasets are identical

Correct Answer: C

Explanation:
Training data is used for learning, while validation data is used for evaluation.

Exam Strategy Tip

On AI-900:

Training dataset → learning and pattern recognition
Validation dataset → evaluation and generalization
Watch for keywords like overfitting, unseen data, and model performance

If you can map those keywords quickly, these questions become easy points.

Describe How Training and Validation Datasets Are Used in Machine Learning (AI-900 Exam Prep)

This section of the AI-900: Microsoft Azure AI Fundamentals exam focuses on understanding how machine learning models learn from data and how their performance is evaluated. Specifically, it covers the role of training datasets and validation datasets, which are core concepts in supervised machine learning.

This topic appears under: Describe fundamental principles of machine learning on Azure (15–20%) → Describe core machine learning concepts

You are not expected to build or tune models for AI-900, but you must be able to describe the purpose of training and validation datasets and how they differ.

Why Datasets Are Split in Machine Learning

In machine learning, using the same data to both train and evaluate a model can lead to misleading results. To avoid this, datasets are commonly split into separate subsets, each with a distinct purpose.

At a minimum, most machine learning workflows use:

A training dataset
A validation dataset

These datasets help ensure that a model can generalize to new, unseen data.

Training Dataset

A training dataset is the portion of data used to teach the machine learning model how to make predictions.

Key Characteristics of Training Data

Contains both features and labels (in supervised learning)
Used to identify patterns and relationships in the data
Typically makes up the largest portion of the dataset

What Happens During Training

The model makes predictions using the features
Predictions are compared to the known labels
The model adjusts its internal parameters to reduce errors

In Azure Machine Learning, this is the phase where the model “learns” from historical data.
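
The three steps above can be sketched with a tiny one-parameter model. This is a simplified illustration using gradient descent on made-up numbers, not a description of how Azure Machine Learning trains any particular algorithm; the data, the learning rate, and the number of iterations are all assumptions.

```python
# Made-up training data: one feature (x) and a numeric label (y), roughly y = 2x.
features = [1.0, 2.0, 3.0, 4.0]
labels = [2.1, 3.9, 6.2, 7.8]

weight = 0.0          # the model's single internal parameter
learning_rate = 0.01  # assumed step size for each adjustment


def mean_squared_error(w):
    """Average squared difference between predictions (w * x) and labels."""
    return sum((w * x - y) ** 2 for x, y in zip(features, labels)) / len(features)


initial_error = mean_squared_error(weight)

for _ in range(200):
    # 1. The model makes predictions using the features.
    predictions = [weight * x for x in features]
    # 2. Predictions are compared to the known labels.
    errors = [p - y for p, y in zip(predictions, labels)]
    # 3. The parameter is adjusted in the direction that reduces the error.
    gradient = 2 * sum(e * x for e, x in zip(errors, features)) / len(features)
    weight -= learning_rate * gradient

final_error = mean_squared_error(weight)
print(f"learned weight: {weight:.2f}")
print(f"error before: {initial_error:.2f}, after: {final_error:.4f}")
```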

Validation Dataset

A validation dataset is used to evaluate how well the model performs on unseen data during the training process.

Key Characteristics of Validation Data

Separate from the training dataset
Contains features and labels
Used to assess model accuracy and generalization

Why Validation Data Is Important

Helps detect overfitting (when a model memorizes training data)
Provides an unbiased evaluation of model performance
Supports decisions about model selection or improvement

For AI-900, the key idea is that validation data is not used to train the model, only to evaluate it.

Training vs Validation: Key Differences

Aspect	Training Dataset	Validation Dataset
Primary purpose	Teach the model	Evaluate the model
Used to adjust model parameters	Yes	No
Seen by the model during learning	Yes	No
Helps detect overfitting	Indirectly	Yes

Understanding this distinction is essential for AI-900 exam questions.

Common Data Split Ratios

While AI-900 does not test exact percentages, common industry practices include:

70% training / 30% validation
80% training / 20% validation

The exact split depends on dataset size and use case, but the concept is what matters for the exam.
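
As an illustration of an 80/20 split, the sketch below shuffles the rows and sets aside a fraction for validation. The function name `train_validation_split`, the fixed seed, and the stand-in data are assumptions made for this example.

```python
import random


def train_validation_split(rows, validation_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and return (training_rows, validation_rows)."""
    shuffled = list(rows)
    # A seeded generator keeps the split repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


rows = list(range(100))  # stand-in for 100 labeled records
train_rows, validation_rows = train_validation_split(rows, validation_fraction=0.2)

print(len(train_rows), len(validation_rows))
```

Shuffling before splitting is a common precaution so that neither subset inherits an ordering present in the original data.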

Example Scenario

A company is building a model to predict whether customers will cancel a subscription.

Training dataset:
- Used to teach the model using historical customer behavior and known outcomes
Validation dataset:
- Used to test how accurately the model predicts cancellations for customers it has not seen before

This approach helps ensure the model performs well in real-world scenarios.

Overfitting and Generalization

One of the main reasons for using a validation dataset is to detect overfitting.

Overfitting occurs when a model performs well on training data but poorly on new data
Validation data helps confirm that the model can generalize beyond the training set

For AI-900, you only need to recognize this relationship, not the mathematical details.
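
A deliberately extreme sketch can make the relationship visible. The `MemorizingModel` below is a hypothetical toy that stores every training example and falls back to a default answer for anything it has not seen; the data is invented. It scores perfectly on training data and worse on validation data, which is the pattern to recognize.

```python
# Made-up labeled examples: (feature tuple, label).
training = [((1, 0), "Yes"), ((2, 1), "No"), ((3, 0), "Yes"), ((4, 1), "No")]
validation = [((5, 0), "Yes"), ((6, 1), "No"), ((7, 0), "Yes")]


class MemorizingModel:
    """Toy model that memorizes training rows instead of learning a pattern."""

    def fit(self, examples):
        self.memory = dict(examples)
        return self

    def predict(self, features):
        # Unseen inputs get a fixed default answer.
        return self.memory.get(features, "No")


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the known label."""
    correct = sum(1 for features, label in examples if model.predict(features) == label)
    return correct / len(examples)


model = MemorizingModel().fit(training)
training_accuracy = accuracy(model, training)
validation_accuracy = accuracy(model, validation)

print(f"training accuracy:   {training_accuracy:.2f}")
print(f"validation accuracy: {validation_accuracy:.2f}")
```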

Azure Context for AI-900

In Azure Machine Learning:

Training data is used to train machine learning models
Validation data is used to evaluate model performance during development
This separation supports reliable and responsible AI solutions

Exam Tips for AI-900

If the question mentions learning or adjusting the model, think training dataset
If the question mentions evaluation or performance on unseen data, think validation dataset
Validation data is not used to teach the model
AI-900 focuses on understanding why datasets are separated

Key Takeaways

Training datasets are used to teach machine learning models
Validation datasets are used to evaluate model performance
Separating datasets helps detect overfitting
Understanding these roles is a core AI-900 exam skill

Practice Questions: Describe Capabilities of Automated Machine Learning (AI-900 Exam Prep)

Practice Exam Questions

Question 1

What is the primary purpose of Automated Machine Learning (AutoML) in Azure?

A. To replace data scientists
B. To automatically label data
C. To select and optimize machine learning models
D. To deploy models without evaluation

Correct Answer: C

Explanation:
AutoML automatically selects algorithms and tunes parameters to identify the best-performing model for a given dataset.

Question 2

Which machine learning scenarios are supported by Azure Automated Machine Learning?

A. Clustering only
B. Regression and classification
C. Reinforcement learning
D. Rule-based automation

Correct Answer: B

Explanation:
AutoML supports supervised learning scenarios such as regression and classification, which are core to AI-900.

Question 3

How does AutoML reduce the need for deep machine learning expertise?

A. By eliminating the need for training data
B. By automatically selecting models and hyperparameters
C. By generating business requirements
D. By replacing human oversight

Correct Answer: B

Explanation:
AutoML handles model selection and hyperparameter tuning automatically, reducing manual effort and expertise requirements.

Question 4

Which task is handled automatically by Azure AutoML?

A. Defining business objectives
B. Cleaning poor-quality data
C. Hyperparameter tuning
D. Approving model deployment

Correct Answer: C

Explanation:
AutoML automatically adjusts hyperparameters to improve model performance.

Question 5

A team wants to quickly build a sales forecasting model with minimal manual configuration.
Which Azure capability should they use?

A. Azure Cognitive Services
B. Azure Bot Service
C. Automated Machine Learning
D. Azure Logic Apps

Correct Answer: C

Explanation:
AutoML is designed to quickly build supervised ML models, including time-series forecasting.

Question 6

Which statement about Automated Machine Learning is TRUE?

A. AutoML guarantees perfect model accuracy
B. AutoML removes the need for human review
C. AutoML compares multiple models automatically
D. AutoML works only with unlabeled data

Correct Answer: C

Explanation:
AutoML evaluates and compares multiple models to identify the best-performing option.

Question 7

Which Azure service provides Automated Machine Learning capabilities?

A. Azure Functions
B. Azure Machine Learning
C. Azure App Service
D. Azure Synapse Analytics

Correct Answer: B

Explanation:
Automated Machine Learning is a feature within Azure Machine Learning.

Question 8

What is a key benefit of using AutoML?

A. Manual feature engineering
B. Faster model development
C. Elimination of data preparation
D. Guaranteed regulatory compliance

Correct Answer: B

Explanation:
AutoML speeds up model development by automating model selection, tuning, and evaluation.

Question 9

Which of the following is NOT a capability of Automated Machine Learning?

A. Automatic model evaluation
B. Automatic algorithm selection
C. Automatic business decision-making
D. Hyperparameter tuning

Correct Answer: C

Explanation:
AutoML supports model creation and evaluation but does not make business decisions.

Question 10

Why is Automated Machine Learning especially useful for beginners?

A. It removes the need for labeled data
B. It eliminates model deployment steps
C. It simplifies model creation and experimentation
D. It replaces Azure Machine Learning

Correct Answer: C

Explanation:
AutoML simplifies experimentation by automating many steps involved in building machine learning models.

Exam Strategy Tip

On AI-900, think of AutoML as a productivity accelerator:

You provide the data and goal
AutoML handles model selection, tuning, and evaluation
Humans still review and deploy the model

If a question mentions automatic selection, minimal configuration, or quick model building, the answer is likely Automated Machine Learning.

Describe Capabilities of Automated Machine Learning (AI-900 Exam Prep)

This section of the AI-900: Microsoft Azure AI Fundamentals exam focuses on understanding what Automated Machine Learning (AutoML) is and what it can do within Azure Machine Learning. The emphasis is on recognizing capabilities and use cases, not on configuring pipelines or writing code.

This topic appears under: Describe fundamental principles of machine learning on Azure (15–20%) → Describe Azure Machine Learning capabilities

What Is Automated Machine Learning?

Automated Machine Learning (AutoML) is a capability in Azure Machine Learning that automatically selects the best machine learning model and tuning settings for a given dataset and problem.

AutoML helps users:

Build machine learning models faster
Reduce the need for deep data science expertise
Focus on business problems rather than algorithms

For AI-900, you only need to understand what AutoML does, not how to implement it.

Problems AutoML Can Solve

Automated Machine Learning in Azure supports common supervised learning scenarios:

Regression – Predicting numeric values (for example, sales forecasts)
Classification – Predicting categories or classes (for example, fraud detection)
Time-series forecasting – Predicting values over time (for example, demand prediction)

For AI-900, treat AutoML as a supervised learning capability; unsupervised scenarios such as clustering are not its focus.

Key Capabilities of Automated Machine Learning

Automatic Model Selection

AutoML automatically:

Tries multiple machine learning algorithms
Compares model performance
Selects the best-performing model based on evaluation metrics

Users do not need to manually choose algorithms.
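
Conceptually, this is a "try several, keep the best" loop. The sketch below imitates that idea with two hand-written candidate models and made-up data. It does not use or represent the Azure AutoML API; the candidate functions, the data, and the choice of mean absolute error as the metric are all assumptions for illustration.

```python
from statistics import mean

# Made-up (feature, label) pairs, split into training and validation sets.
train = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8)]
validation = [(5.0, 11.1), (6.0, 12.9)]


def fit_mean_model(data):
    """Baseline candidate: always predict the average training label."""
    average = mean(y for _, y in data)
    return lambda x: average


def fit_linear_model(data):
    """Second candidate: least-squares straight line through the training data."""
    x_bar = mean(x for x, _ in data)
    y_bar = mean(y for _, y in data)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in data) / sum(
        (x - x_bar) ** 2 for x, _ in data
    )
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def mean_absolute_error(model, data):
    return mean(abs(model(x) - y) for x, y in data)


candidates = {"mean baseline": fit_mean_model, "linear regression": fit_linear_model}

# Train every candidate, score it on validation data, and keep the lowest error.
scores = {
    name: mean_absolute_error(fit(train), validation)
    for name, fit in candidates.items()
}
best_name = min(scores, key=scores.get)

print(scores)
print("selected:", best_name)
```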

Automated Hyperparameter Tuning

AutoML automatically adjusts hyperparameters to improve model performance, such as:

Learning rate
Number of trees
Regularization settings

This removes the need for manual trial-and-error tuning.
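
The trial-and-error that AutoML automates can be pictured as a search over candidate settings. The following sketch tries three values of a regularization setting on a one-feature model and keeps the one with the lowest validation error. The data, the candidate values, and the helper names are assumptions, and the simple grid search shown here is a stand-in, not the search strategy Azure uses.

```python
# Made-up (feature, label) pairs.
train = [(1.0, 2.5), (2.0, 4.4), (3.0, 6.9)]
validation = [(4.0, 8.4), (5.0, 10.6)]


def fit_weight(data, regularization):
    """Closed-form ridge fit for the one-parameter model y = weight * x."""
    return sum(x * y for x, y in data) / (sum(x * x for x, _ in data) + regularization)


def mean_squared_error(weight, data):
    return sum((weight * x - y) ** 2 for x, y in data) / len(data)


candidate_settings = [0.0, 1.0, 10.0]  # hypothetical regularization strengths

results = {}
for regularization in candidate_settings:
    weight = fit_weight(train, regularization)                         # train with this setting
    results[regularization] = mean_squared_error(weight, validation)  # score on validation data

best_setting = min(results, key=results.get)

print(results)
print("best regularization setting:", best_setting)
```

The hyperparameter is chosen before training and is not learned from the data, which is why it has to be searched for rather than fitted.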

Built-in Feature Engineering

AutoML can automatically create and transform features, including:

Normalizing numeric data
Encoding categorical values
Handling missing values

This simplifies data preparation for machine learning.
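
The three transformations listed above can be sketched in a few lines. The column names, the values, and the specific choices (filling gaps with the mean, min-max scaling, one-hot encoding) are assumptions for illustration; AutoML decides on its own which transformations to apply.

```python
from statistics import mean

# Made-up rows with one numeric and one categorical feature; None marks a missing value.
rows = [
    {"monthly_spend": 20.0, "plan": "basic"},
    {"monthly_spend": None, "plan": "premium"},
    {"monthly_spend": 60.0, "plan": "basic"},
]


def impute_missing(values):
    """Handle missing values by filling them with the mean of the known values."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Normalize numeric data to the range 0..1."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def one_hot_encode(values):
    """Encode each category as a list of 0/1 flags, one per distinct category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


spend = min_max_normalize(impute_missing([r["monthly_spend"] for r in rows]))
plan = one_hot_encode([r["plan"] for r in rows])

# Combine the transformed columns back into one numeric feature row per record.
prepared = [[s] + p for s, p in zip(spend, plan)]
print(prepared)
```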

Model Evaluation and Comparison

AutoML evaluates models using validation data and metrics such as:

Accuracy
Precision and recall
Mean absolute error

It then ranks models so users can easily compare results.
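
For reference, the metrics named above can be computed by hand on a small example. The label and prediction lists are invented, and treating "Yes" as the positive class is an assumption for this sketch.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the true labels."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def precision(actual, predicted, positive):
    """Of everything predicted positive, the fraction that really was positive."""
    true_pos = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    predicted_pos = sum(p == positive for p in predicted)
    return true_pos / predicted_pos if predicted_pos else 0.0


def recall(actual, predicted, positive):
    """Of everything that really was positive, the fraction the model found."""
    true_pos = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    actual_pos = sum(a == positive for a in actual)
    return true_pos / actual_pos if actual_pos else 0.0


def mean_absolute_error(actual, predicted):
    """Average size of the error for numeric (regression) predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Classification example (made-up labels and predictions).
actual_classes = ["Yes", "No", "Yes", "Yes", "No"]
predicted_classes = ["Yes", "Yes", "No", "Yes", "No"]

# Regression example (made-up values).
actual_values = [100.0, 150.0, 200.0]
predicted_values = [110.0, 140.0, 205.0]

print("accuracy:", accuracy(actual_classes, predicted_classes))
print("precision:", precision(actual_classes, predicted_classes, "Yes"))
print("recall:", recall(actual_classes, predicted_classes, "Yes"))
print("MAE:", mean_absolute_error(actual_values, predicted_values))
```

Accuracy, precision, and recall apply to classification; mean absolute error applies to regression.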

Integration with Azure Machine Learning

AutoML is fully integrated into Azure Machine Learning, allowing users to:

Track experiments
View model performance
Deploy selected models

This integration supports repeatable and responsible ML workflows.

Example Scenario

A retail company wants to predict monthly product sales but does not have a data science team.

Using Automated Machine Learning:

The company provides historical sales data
AutoML tests multiple regression models
The best-performing model is automatically selected

This allows faster model creation with minimal manual effort.

What AutoML Does NOT Do (Exam-Relevant)

It is important to recognize AutoML limitations for AI-900:

It does not eliminate the need for quality data
It does not automatically define business goals
It does not replace human oversight

AutoML assists model creation but does not remove responsibility from users.

Azure Context for AI-900

In Azure Machine Learning, AutoML:

Simplifies model creation
Supports beginners and non-experts
Accelerates experimentation and deployment

AI-900 questions often focus on why AutoML is useful rather than how it works internally.

Exam Tips for AI-900

If the question mentions automatic model selection or tuning, think AutoML
AutoML is best for quickly building supervised ML models
Remember: AutoML helps choose models, but humans still provide data and goals

Key Takeaways

Automated Machine Learning automates model selection, tuning, and evaluation
It supports regression, classification, and forecasting scenarios
AutoML reduces the need for deep ML expertise
Understanding its capabilities is essential for AI-900

This topic connects directly to Azure Machine Learning services and helps bridge core ML concepts with real-world Azure AI capabilities.

Practice Questions: Describe data and compute services for data science and machine learning (AI-900 Exam Prep)

Practice Exam Questions

Question 1

Which Azure service is most commonly used to store large, unstructured datasets for machine learning training?

A. Azure SQL Database
B. Azure Blob Storage
C. Azure Cosmos DB
D. Azure Virtual Machines

✅ Correct Answer: B. Azure Blob Storage

Explanation:
Azure Blob Storage is designed to store large amounts of unstructured data such as files, images, and CSVs. It is one of the most commonly used data storage services in Azure machine learning workflows.

Question 2

Which Azure service is specifically designed to train, manage, and deploy machine learning models?

A. Azure Kubernetes Service (AKS)
B. Azure Machine Learning
C. Azure Data Factory
D. Azure App Service

✅ Correct Answer: B. Azure Machine Learning

Explanation:
Azure Machine Learning provides managed tools and compute for training, evaluating, and deploying machine learning models. It is the core ML service in Azure.

Question 3

You need to store structured, relational data that will be used to train a machine learning model. Which Azure service is most appropriate?

A. Azure Blob Storage
B. Azure Data Lake Storage
C. Azure SQL Database
D. Azure File Storage

✅ Correct Answer: C. Azure SQL Database

Explanation:
Azure SQL Database is used for structured data stored in tables with defined schemas, making it suitable for relational datasets used in machine learning.

Question 4

Which Azure service is primarily used to deploy machine learning models for scalable, real-time predictions?

A. Azure Virtual Machines
B. Azure Machine Learning compute
C. Azure Kubernetes Service (AKS)
D. Azure Blob Storage

✅ Correct Answer: C. Azure Kubernetes Service (AKS)

Explanation:
AKS is commonly used to deploy machine learning models in production environments where scalability and high availability are required.

Question 5

What is the primary purpose of compute resources in machine learning?

A. To store training data
B. To visualize data
C. To train and run machine learning models
D. To manage user access

✅ Correct Answer: C. To train and run machine learning models

Explanation:
Compute resources provide the processing power required to train models and perform inference.

Question 6

Which Azure service provides customizable compute environments, including GPU-based machines, for machine learning workloads?

A. Azure Functions
B. Azure Virtual Machines
C. Azure Logic Apps
D. Azure SQL Database

✅ Correct Answer: B. Azure Virtual Machines

Explanation:
Azure Virtual Machines allow users to fully control the operating system, software, and hardware configuration, making them ideal for specialized ML workloads.

Question 7

Which data service is best suited for big data analytics and large-scale machine learning workloads?

A. Azure Blob Storage
B. Azure SQL Database
C. Azure Data Lake Storage Gen2
D. Azure Table Storage

✅ Correct Answer: C. Azure Data Lake Storage Gen2

Explanation:
Azure Data Lake Storage Gen2 is optimized for analytics and big data workloads, making it ideal for large-scale machine learning scenarios.

Question 8

In a typical Azure machine learning workflow, where are trained models and output artifacts often stored?

A. Azure Virtual Machines
B. Azure Blob Storage
C. Azure SQL Database
D. Azure Active Directory

✅ Correct Answer: B. Azure Blob Storage

Explanation:
Blob Storage is commonly used to store trained models, logs, and experiment outputs due to its scalability and cost efficiency.

Question 9

Which Azure service combines data storage and analytics capabilities for machine learning and data science?

A. Azure Data Lake Storage
B. Azure File Storage
C. Azure App Service
D. Azure Functions

✅ Correct Answer: A. Azure Data Lake Storage

Explanation:
Azure Data Lake Storage is built for analytics and integrates well with data science and machine learning workloads.

Question 10

Which statement best describes Azure Machine Learning compute?

A. It is used only for storing machine learning data
B. It provides managed compute resources for training and inference
C. It replaces Azure Virtual Machines
D. It is used only for model deployment

✅ Correct Answer: B. It provides managed compute resources for training and inference

Explanation:
Azure Machine Learning compute offers scalable, managed CPU and GPU resources specifically designed for training and running machine learning models.

Final Exam Tips 🔑

For AI-900, remember these high-yield associations:

Blob Storage → unstructured ML data
Data Lake Storage → big data & analytics
Azure SQL Database → structured data
Azure Machine Learning → training & managing models
Virtual Machines → custom ML environments
AKS → scalable deployment

Describe Data and Compute Services for Data Science and Machine Learning (AI-900 Exam Prep)

This topic focuses on understanding which Azure services are used to store data and provide compute power for data science and machine learning workloads — not on how to configure them in depth. For the AI-900 exam, you should recognize what each service is used for and when you would choose one over another.

Why Data and Compute Matter in Machine Learning

Machine learning solutions require two essential components:

Data services → where training and inference data is stored and accessed
Compute services → where models are trained and executed

Azure provides scalable, cloud-based services for both, allowing organizations to build, train, and deploy machine learning solutions efficiently.

Data Services for Machine Learning on Azure

Azure offers several data storage services commonly used in machine learning scenarios.

Azure Blob Storage

Azure Blob Storage is one of the most commonly used data stores for machine learning on Azure.

Key characteristics:

Stores unstructured data (files, images, videos, CSVs)
Highly scalable and cost-effective
Frequently used as the data source for Azure Machine Learning experiments

Typical use cases:

Training datasets
Model artifacts
Logs and output files

👉 On AI-900: If the question mentions large datasets, files, or unstructured data, Blob Storage is usually the answer.

Azure Data Lake Storage Gen2

Azure Data Lake Storage is optimized for big data analytics and machine learning.

Key characteristics:

Built on Azure Blob Storage
Supports hierarchical namespaces
Designed for analytics workloads

Typical use cases:

Large-scale machine learning projects
Advanced analytics and data science pipelines

👉 On AI-900: Think of Data Lake Storage when big data and analytics are mentioned.

Azure SQL Database

Azure SQL Database stores structured, relational data.

Key characteristics:

Table-based storage
Uses SQL for querying
Suitable for well-defined schemas

Typical use cases:

Business and transactional data
Structured datasets used in ML training

👉 On AI-900: If the data is relational and structured, Azure SQL Database is a common choice.

Compute Services for Machine Learning on Azure

Compute services provide the processing power needed to train and run machine learning models.

Azure Machine Learning Compute

Azure Machine Learning provides managed compute resources specifically designed for ML workloads.

Key characteristics:

Scalable CPU and GPU compute
Used for training and inference
Managed through Azure Machine Learning workspace

Typical use cases:

Model training
Experimentation
Batch inference

👉 On AI-900: This is the primary compute service for machine learning.

Azure Virtual Machines

Azure Virtual Machines (VMs) offer full control over the compute environment.

Key characteristics:

Customizable CPU or GPU configurations
Supports specialized ML workloads
More management responsibility

Typical use cases:

Custom machine learning environments
Legacy or specialized ML tools

👉 On AI-900: VMs appear when flexibility or custom configuration is required.

Azure Kubernetes Service (AKS)

AKS is used primarily for deploying machine learning models at scale.

Key characteristics:

Container orchestration
High availability and scalability
Often used for real-time inference

Typical use cases:

Production ML model deployment
Scalable inference endpoints

👉 On AI-900: AKS is associated with deployment, not training.

How These Services Work Together

In a typical Azure machine learning workflow:

Data is stored in Blob Storage, Data Lake, or SQL Database
Models are trained using Azure Machine Learning compute or VMs
Models are deployed using Azure Machine Learning or AKS
Predictions are generated and consumed by applications

Azure handles scalability, security, and integration across these services.

Key Exam Takeaways

For AI-900, remember:

Blob Storage → unstructured ML data
Data Lake Storage → big data analytics
Azure SQL Database → structured data
Azure Machine Learning compute → training and experimentation
Virtual Machines → custom compute environments
AKS → scalable model deployment

You are not expected to configure these services — only recognize their purpose.

Exam Tip 💡

If a question asks:

“Where is ML data stored?” → Blob Storage or Data Lake
“Where is the model trained?” → Azure Machine Learning compute
“How is a model deployed at scale?” → AKS

Practice Questions: Describe Model Management and Deployment Capabilities in Azure Machine Learning (AI-900 Exam Prep)

Practice Questions

Question 1

You train multiple machine learning models using different algorithms and want to store them in a central location with version tracking. Which Azure Machine Learning capability should you use?

A. Azure Kubernetes Service
B. Model registration
C. Batch endpoints
D. Automated machine learning

Correct Answer: B

Explanation:
Model registration stores trained models in Azure Machine Learning, enabling centralized management, versioning, and reuse.

Question 2

Why is model versioning important in Azure Machine Learning?

A. To reduce compute costs
B. To allow rollback to previous model versions
C. To encrypt model files
D. To improve model accuracy automatically

Correct Answer: B

Explanation:
Model versioning allows teams to track changes over time and revert to earlier versions if newer models perform poorly.

Question 3

Which deployment option should you use when predictions must be returned immediately to a web application?

A. Batch endpoint
B. Training pipeline
C. Real-time endpoint
D. Experiment run

Correct Answer: C

Explanation:
Real-time endpoints provide low-latency predictions through REST APIs, making them suitable for applications that need immediate responses.

Question 4

A data science team wants to score millions of records overnight without requiring instant responses. Which deployment approach is most appropriate?

A. Real-time endpoint
B. Batch endpoint
C. Azure Functions
D. Model registration

Correct Answer: B

Explanation:
Batch endpoints are designed for large-scale, offline predictions and do not require low-latency responses.

Question 5

Which Azure Machine Learning feature tracks metrics, parameters, and outputs from training runs?

A. Model deployment
B. Experiment tracking
C. Model endpoint
D. Azure Blob Storage

Correct Answer: B

Explanation:
Experiment tracking captures training details such as metrics and parameters, enabling comparison and reproducibility.

Question 6

After training a model, what is the primary purpose of registering it in Azure Machine Learning?

A. To retrain the model automatically
B. To expose the model as an API
C. To store and manage the model for future use
D. To encrypt the dataset

Correct Answer: C

Explanation:
Registering a model allows it to be stored, versioned, and managed, making it available for deployment and reuse.

Question 7

Which Azure Machine Learning capability simplifies scaling and infrastructure management when deploying models?

A. Model versioning
B. Containerized deployment
C. Experiment tracking
D. Data labeling

Correct Answer: B

Explanation:
Azure Machine Learning packages models into containers, simplifying deployment, scaling, and infrastructure management.

Question 8

What is a key difference between real-time endpoints and batch endpoints?

A. Real-time endpoints do not require models
B. Batch endpoints are used only for training
C. Real-time endpoints provide immediate predictions
D. Batch endpoints use more accurate models

Correct Answer: C

Explanation:
Real-time endpoints return predictions immediately, while batch endpoints process large datasets asynchronously.

Question 9

Which task is part of model management rather than model deployment?

A. Exposing a REST API
B. Scaling compute resources
C. Registering and versioning models
D. Handling prediction requests

Correct Answer: C

Explanation:
Registering and versioning models are model management tasks. Deployment focuses on making models available for predictions.

Question 10

Which statement best describes Azure Machine Learning’s role in model deployment?

A. It requires manual server configuration
B. It automates model training only
C. It simplifies deploying models to scalable endpoints
D. It replaces Azure Kubernetes Service

Correct Answer: C

Explanation:
Azure Machine Learning abstracts infrastructure complexity, making it easier to deploy models as scalable endpoints.

Final Exam Tips ✅

Model registration = storage + versioning
Real-time endpoint = immediate predictions
Batch endpoint = large-scale, offline predictions
AI-900 tests concepts, not implementation details



Describe Model Management and Deployment Capabilities in Azure Machine Learning (AI-900 Exam Prep)

Where this fits in the exam

Exam domain: Describe fundamental principles of machine learning on Azure (15–20%)
Sub-area: Describe Azure Machine Learning capabilities
Skill level: Conceptual understanding (no deep implementation details)

For AI-900, Microsoft expects you to understand what Azure Machine Learning can do for managing and deploying models — not how to write code or configure infrastructure in detail.

What Is Model Management in Azure Machine Learning?

Model management refers to how machine learning models are:

Stored
Versioned
Tracked
Prepared for deployment

Azure Machine Learning provides built-in tools to manage the entire model lifecycle, from training to production.

Key Model Management Capabilities

1. Model Registration

After a model is trained, it can be registered in Azure Machine Learning.

What model registration provides:

Centralized model storage
Model versioning
Metadata tracking (name, version, description)
Easy reuse across experiments and deployments

📌 Exam tip:
Registration allows multiple versions of the same model to be stored and compared.

2. Model Versioning

Azure Machine Learning automatically assigns versions to registered models.

Why this matters:

Compare performance between model versions
Roll back to a previous version if a newer model performs poorly
Support continuous improvement and experimentation

📌 AI-900 focus:
You only need to know that versioning exists and why it’s useful, not how to configure it.
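To make registration and versioning concrete, here is a minimal sketch of a registry that assigns version numbers automatically and supports rolling back. It is an assumption-laden toy: `ToyModelRegistry`, its methods, the model name, and the accuracy values are all hypothetical and are not part of the Azure Machine Learning SDK.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyModelRegistry:
    """Illustrative in-memory registry. This is NOT the Azure Machine Learning SDK."""
    models: dict = field(default_factory=dict)  # model name -> list of version records

    def register(self, name: str, description: str, accuracy: float) -> int:
        """Store a model record and return its automatically assigned version."""
        records = self.models.setdefault(name, [])
        version = len(records) + 1
        records.append({"name": name, "version": version,
                        "description": description, "accuracy": accuracy})
        return version

    def get(self, name: str, version: Optional[int] = None) -> dict:
        """Return a specific version, or the latest one when no version is given."""
        records = self.models[name]
        return records[-1] if version is None else records[version - 1]


registry = ToyModelRegistry()
v1 = registry.register("churn-model", "baseline", accuracy=0.84)
v2 = registry.register("churn-model", "new algorithm", accuracy=0.79)

# The newer version performs worse, so fall back to the earlier one.
latest = registry.get("churn-model")
previous = registry.get("churn-model", version=1)
chosen = previous if previous["accuracy"] > latest["accuracy"] else latest
print(v1, v2, chosen["version"])
```

Registering the same name twice produces versions 1 and 2, and keeping both records is what makes the comparison and rollback possible.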

3. Experiment Tracking

Azure Machine Learning tracks:

Training runs
Parameters
Metrics (accuracy, error, etc.)
Output artifacts

This helps data scientists:

Compare models
Reproduce results
Understand how a model was created
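The idea behind experiment tracking can be sketched as a simple log of runs that is later queried for the best result. The `log_run` helper, the run IDs, the `max_depth` parameter, and the metric values below are hypothetical; Azure Machine Learning records this information for you through its own tooling.

```python
runs = []


def log_run(run_id, params, metrics):
    """Record one training run with its parameters and resulting metrics."""
    runs.append({"run_id": run_id, "params": params, "metrics": metrics})


# Three made-up training runs that differ in a single parameter.
log_run("run-1", {"max_depth": 3}, {"accuracy": 0.81})
log_run("run-2", {"max_depth": 5}, {"accuracy": 0.86})
log_run("run-3", {"max_depth": 8}, {"accuracy": 0.83})

# Compare runs by metric, then read back the parameters that produced the best one.
best_run = max(runs, key=lambda run: run["metrics"]["accuracy"])
print(best_run["run_id"], best_run["params"])
```

Because each run stores its parameters next to its metrics, the best result can be reproduced by reusing the logged settings.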

Model Deployment in Azure Machine Learning

Once a model is trained and registered, it can be deployed so applications can use it to make predictions.

Deployment Options in Azure Machine Learning

1. Real-Time Endpoints

Used for on-demand predictions.

Key characteristics:

Low-latency responses
Exposed via a REST API
Commonly used for web or application integrations

Typical compute targets:

Azure Kubernetes Service (AKS)
Azure Container Instances (ACI)

📌 Exam tip:
Real-time endpoints are used when predictions are needed immediately.

2. Batch Endpoints

Used for large-scale, offline predictions.

Key characteristics:

Processes large datasets at once
Not time-sensitive
Often scheduled or run periodically

Example use cases:

Scoring customer records overnight
Generating predictions for reports
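The difference between the two endpoint types comes down to how requests arrive, which the following sketch tries to show. The `predict` rule, the `monthly_spend` field, and the threshold are invented placeholders for a trained model, and neither function is an Azure API.

```python
def predict(record):
    """Hypothetical stand-in for a trained model (a made-up threshold rule)."""
    return "likely to churn" if record["monthly_spend"] > 100 else "likely to stay"


def score_realtime(record):
    """Real-time style: one record in, one prediction returned immediately."""
    return predict(record)


def score_batch(records):
    """Batch style: a whole dataset is scored in one job and the results collected."""
    return [predict(record) for record in records]


# A web application asks about a single customer and waits for the answer.
single_result = score_realtime({"monthly_spend": 120})

# An overnight job scores many customer records at once.
nightly_results = score_batch([
    {"monthly_spend": 40},
    {"monthly_spend": 150},
    {"monthly_spend": 90},
])
print(single_result, nightly_results)
```

The same model serves both cases; only the calling pattern changes, which is why the exam frames the choice in terms of whether predictions are needed immediately.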

Managed Deployment Features

Azure Machine Learning simplifies deployment by providing:

Containerized deployments: Models are packaged into containers for consistency.
Scaling support: Automatically handles increasing or decreasing load.
Monitoring and logging: Tracks performance and usage after deployment.

📌 AI-900 emphasis:
You should understand that Azure ML manages infrastructure complexity for you. You do not need to know the low-level details.

Model Management vs Deployment (At a Glance)

Capability	Purpose
Model registration	Store and organize trained models
Versioning	Track changes and improvements
Experiment tracking	Compare training runs and metrics
Real-time deployment	Immediate predictions via API
Batch deployment	Large-scale, offline predictions

Why This Matters for AI-900

For the AI-900 exam, Microsoft wants you to recognize that:

Azure Machine Learning supports the full ML lifecycle
Models can be managed, versioned, and deployed without custom infrastructure
Deployment can be real-time or batch, depending on the scenario

You are not expected to:

Write deployment scripts
Configure Kubernetes clusters
Optimize production pipelines

Key Takeaways for the Exam

Azure Machine Learning provides centralized model management
Models can be registered and versioned
Deployment options include real-time endpoints and batch endpoints
Azure ML simplifies scaling, monitoring, and management



Practice Questions: Identify Features and Labels in a Dataset for Machine Learning (AI-900 Exam Prep)

Practice Exam Questions

Question 1

You are training a model to predict house prices. The dataset includes columns for square footage, number of bedrooms, location, and sale price.
Which column is the label?

A. Square footage
B. Number of bedrooms
C. Location
D. Sale price

Correct Answer: D

Explanation:
The label is the value the model is trained to predict. In this scenario, the goal is to predict the sale price.

Question 2

Which statement best describes a feature in a machine learning dataset?

A. The final prediction made by the model
B. An input value used to make predictions
C. A rule written by a developer
D. The accuracy of the model

Correct Answer: B

Explanation:
Features are the input variables that provide information the model uses to make predictions.

Question 3

A dataset contains customer age, subscription length, monthly charges, and whether the customer canceled the service.
What is the label?

A. Customer age
B. Subscription length
C. Monthly charges
D. Whether the customer canceled

Correct Answer: D

Explanation:
The label represents the outcome being predicted—in this case, whether the customer canceled the service.

Question 4

Which type of machine learning requires both features and labels?

A. Unsupervised learning
B. Reinforcement learning
C. Supervised learning
D. Clustering

Correct Answer: C

Explanation:
Supervised learning uses labeled data so the model can learn the relationship between features and known outcomes.

Question 5

A dataset is used to group customers based on purchasing behavior, but it does not contain any target outcome.
What does this dataset contain?

A. Labels only
B. Features only
C. Training results
D. Predictions

Correct Answer: B

Explanation:
Grouping customers by similarity is clustering, an unsupervised learning task. Unsupervised learning datasets contain features but do not include labels.

Question 6

In an email spam detection dataset, which item would most likely be a feature?

A. Spam or not spam
B. Model accuracy score
C. Number of words in the email
D. Final prediction

Correct Answer: C

Explanation:
The number of words is an input characteristic used by the model to make predictions, making it a feature.

Question 7

Which statement about labels is TRUE?

A. Labels are optional in supervised learning
B. Labels are the inputs used by the model
C. Labels represent the value the model predicts
D. Labels are created after predictions are made

Correct Answer: C

Explanation:
Labels are the known outcomes the model is trained to predict in supervised learning scenarios.

Question 8

You are preparing data in Azure Machine Learning to predict product demand.
Which columns should be selected as features?

A. Only the column you want to predict
B. All columns except the target outcome
C. Only numerical columns
D. Only categorical columns

Correct Answer: B

Explanation:
Features are the input columns used to predict the target outcome, which is the label.

Question 9

A dataset includes the following columns: temperature, humidity, wind speed, and weather condition.
If the goal is to predict the weather condition, what are temperature, humidity, and wind speed?

A. Labels
B. Predictions
C. Features
D. Outputs

Correct Answer: C

Explanation:
These values are inputs used to predict the weather condition, making them features.

Question 10

Which scenario best represents a labeled dataset?

A. Customer data grouped by similarity
B. Sensor readings without outcomes
C. Product reviews with sentiment categories
D. Website logs without classifications

Correct Answer: C

Explanation:
Product reviews with sentiment categories include known outcomes, which are labels, making the dataset labeled.

Exam Pattern Tip

On AI-900:

Features = inputs
Labels = outputs
If labels exist → supervised learning
If no labels → unsupervised learning

If you can identify those quickly, you’ll eliminate most wrong answers immediately.
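As a quick illustration of the features-versus-label split, the sketch below separates the house price columns from Question 1 into inputs and the target. The row values and the names `rows`, `LABEL`, `features`, and `labels` are made up for this example.

```python
# Each row is one observation; the column names follow the house price scenario.
rows = [
    {"square_footage": 1400, "bedrooms": 3, "location": "suburb", "sale_price": 250000},
    {"square_footage": 900, "bedrooms": 2, "location": "city", "sale_price": 310000},
    {"square_footage": 2100, "bedrooms": 4, "location": "rural", "sale_price": 280000},
]

LABEL = "sale_price"  # the column the model would be trained to predict

# Features: every column except the label. Labels: the target column on its own.
features = [{key: value for key, value in row.items() if key != LABEL} for row in rows]
labels = [row[LABEL] for row in rows]

print(sorted(features[0]))
print(labels)
```

If the `sale_price` column were missing altogether, only features would remain, which is the unsupervised case described in the tip above.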
