Tag: AI-900 Exam Prep Hub

AI, AI-900, Artificial Intelligence (AI), Machine Learning (ML), Microsoft Certification January 31, 2026

Additional Material: Regression vs Classification vs Clustering (AI-900 Exam Prep)

Here is some additional material to help you prepare for the AI-900 exam, or simply to solidify your understanding of these concepts.

Machine Learning Techniques Comparison Table

Aspect	Regression	Classification	Clustering
Type of Learning	Supervised	Supervised	Unsupervised
Primary Goal	Predict a numeric value	Predict a category or label	Group similar data points
Output Type	Continuous number	Discrete category	Cluster/group assignment
Labeled Training Data	Yes	Yes	No
Key Question Answered	How much? How many? How long?	Which category? Yes or No?	Which items are similar?
Common Keywords	Predict, estimate, forecast	Classify, assign, detect	Group, segment, organize
Typical Output Examples	Price, temperature, revenue, time	Approved/Rejected, Spam/Not spam	Customer segments, usage groups
Example Scenario	Predict house prices	Detect fraudulent transactions	Segment customers by behavior
AI-900 Exam Focus	Identifying numeric predictions	Identifying label predictions	Identifying pattern discovery
Common Exam Trap	Confusing ranges with categories	Treating Yes/No as numeric	Assuming labels exist

Quick Visual Memory Trick

Regression → 📈 Numbers on a line
Classification → 🏷️ Named buckets
Clustering → 🧩 Natural groupings

Side-by-Side Example

Imagine a retail company:

Business Question	Technique
“What will next month’s revenue be?”	Regression
“Will this customer churn?”	Classification
“Which customers behave similarly?”	Clustering
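
The exam does not ask for code, but a small sketch can make the regression row concrete. The example below fits a straight line to made-up monthly revenue figures using ordinary least squares and predicts the next month. The function name `fit_line` and the revenue numbers are assumptions chosen for illustration, not anything from the exam or from Azure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly revenue (in thousands) for months 1 to 5.
months = [1, 2, 3, 4, 5]
revenue = [100.0, 110.0, 120.0, 130.0, 140.0]

slope, intercept = fit_line(months, revenue)
next_month = slope * 6 + intercept  # the regression output is a number
print(round(next_month, 2))  # 150.0
```

The point to notice is the output type: a continuous number, which is what marks the question as regression.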

Common AI-900 Exam Pitfalls to Avoid

❌ High / Medium / Low → Classification, not regression
❌ Yes / No → Classification, not regression
❌ Grouping without predefined labels → Clustering
❌ Predicting quantities → Regression

Exam-Day Decision Shortcut

Ask yourself one question:

“Is the output a number?”

Yes → Regression
No, it’s a label → Classification
No labels, just groups → Clustering
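
The shortcut above can be written as a tiny decision function. This is only a sketch of the rule of thumb for these three techniques; the name `pick_technique` and its two parameters are hypothetical, and real exam questions may also list other options such as anomaly detection.

```python
def pick_technique(has_labels: bool, output_is_numeric: bool) -> str:
    """Map the two exam-day questions onto one of the three techniques."""
    if not has_labels:
        # No labeled training data: the goal is to discover groups.
        return "Clustering"
    # Labeled data: the output type decides between the supervised techniques.
    return "Regression" if output_is_numeric else "Classification"


print(pick_technique(has_labels=True, output_is_numeric=True))    # Regression
print(pick_technique(has_labels=True, output_is_numeric=False))   # Classification
print(pick_technique(has_labels=False, output_is_numeric=False))  # Clustering
```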

Go to the AI-900 Exam Prep Hub main page.

AI, AI-900, Artificial Intelligence (AI), Machine Learning (ML), Microsoft Certification January 31, 2026

Practice Questions: Identify Classification Machine Learning Scenarios (AI-900 Exam Prep)

Practice Exam Questions

Question 1

A bank wants to determine whether a credit card transaction is fraudulent.

Which machine learning technique should be used?

A. Regression
B. Classification
C. Clustering
D. Anomaly detection

Correct Answer: B

Explanation:
The output is Fraud / Not Fraud, which is a category. Predicting categories is a classification task.

Question 2

An organization wants to predict whether a customer will renew their subscription.

Which type of machine learning problem is this?

A. Regression
B. Classification
C. Clustering
D. Recommendation

Correct Answer: B

Explanation:
The outcome is Yes / No, which makes this a binary classification scenario.

Question 3

Which of the following scenarios is best suited for classification?

A. Predicting the price of a product
B. Grouping customers based on behavior
C. Determining if an email is spam
D. Estimating delivery time

Correct Answer: C

Explanation:
Spam detection involves assigning emails to Spam or Not Spam categories, which is classification.

Question 4

An AI system categorizes customer support tickets into predefined issue types.

What type of machine learning technique is being used?

A. Regression
B. Classification
C. Clustering
D. Time-series forecasting

Correct Answer: B

Explanation:
The system assigns each ticket to a known category, which is classification.

Question 5

Which output value most clearly indicates a classification scenario?

A. 128.5
B. 4.2 hours
C. High risk
D. 99.7

Correct Answer: C

Explanation:
High risk is a label, not a numeric value, indicating classification.

Question 6

A model predicts whether a customer will default on a loan.

Which machine learning approach is most appropriate?

A. Regression
B. Classification
C. Clustering
D. Anomaly detection

Correct Answer: B

Explanation:
Default / Not Default is a binary label, making this a classification problem.

Question 7

Which scenario represents multi-class classification?

A. Predicting house prices
B. Detecting unusual network traffic
C. Assigning images to animal types
D. Grouping products by sales patterns

Correct Answer: C

Explanation:
Assigning images to multiple animal types (cat, dog, bird) is multi-class classification.

Question 8

A healthcare system predicts whether a patient is at low, medium, or high risk.

Which type of machine learning is being used?

A. Regression
B. Classification
C. Clustering
D. Forecasting

Correct Answer: B

Explanation:
Low / Medium / High are categories, not numeric values, so this is classification.

Question 9

Which statement best describes classification models?

A. They predict continuous numeric values
B. They group unlabeled data
C. They assign inputs to predefined categories
D. They detect rare anomalies

Correct Answer: C

Explanation:
Classification models assign data points to predefined labels or categories.

Question 10

On the AI-900 exam, which keyword most strongly indicates a classification scenario?

A. Forecast
B. Estimate
C. Categorize
D. Measure

Correct Answer: C

Explanation:
Categorize indicates assigning labels, which is classification.

Exam-Day Tip

If a machine learning question describes …

Yes / No decisions
Named labels
Risk levels or categories

… the correct answer is likely related to Classification.

AI, AI-900, Artificial Intelligence (AI), Machine Learning (ML), Microsoft Certification January 31, 2026

Identify Classification Machine Learning Scenarios (AI-900 Exam Prep)

Where This Fits in the Exam

Exam Domain: Describe fundamental principles of machine learning on Azure (15–20%)
Sub-Domain: Identify common machine learning techniques
Topic: Identify classification machine learning scenarios

On the AI-900 exam, classification questions test your ability to recognize when classification is the appropriate machine learning technique, not how to build models.

What Is Classification in Machine Learning?

Classification is a type of supervised machine learning used to predict a category, class, or label.

The model is trained on labeled data
The output is a discrete category, not a continuous numeric value
The goal is to decide which category something belongs to

Key exam rule:
If the output is a label or category, the scenario is classification.
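
For readers who like to see the idea in code, here is a minimal sketch of classification using a 1-nearest-neighbor rule, chosen only because it is short. The function `predict_label`, the two features, and the transactions are all invented for illustration. The exam does not require you to write or read code like this.

```python
import math


def predict_label(training_rows, new_point):
    """Return the label of the closest labeled example (1-nearest neighbor)."""
    nearest = min(training_rows, key=lambda row: math.dist(row[0], new_point))
    return nearest[1]


# Hypothetical labeled data: (amount in dollars, hour of day) -> known label.
training_rows = [
    ((20.0, 14.0), "Not Fraud"),
    ((35.0, 10.0), "Not Fraud"),
    ((900.0, 3.0), "Fraud"),
    ((1200.0, 2.0), "Fraud"),
]

print(predict_label(training_rows, (1000.0, 4.0)))  # Fraud
print(predict_label(training_rows, (25.0, 12.0)))   # Not Fraud
```

The training rows carry known labels, and the prediction is always one of those labels. Those two properties are what make this classification.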

Characteristics of Classification Scenarios

A classification workload typically includes:

Historical data with known labels
Input features used to make predictions
A finite set of possible outcomes
Binary or multi-class results

Common classification outputs:

Yes / No
True / False
Approved / Rejected
Spam / Not Spam
High Risk / Low Risk

Binary vs Multi-Class Classification

Binary Classification

Only two possible outcomes
Examples:
- Fraud / Not Fraud
- Pass / Fail
- Churn / No Churn

Multi-Class Classification

More than two categories
Examples:
- Product category (electronics, clothing, food)
- Support ticket priority (low, medium, high)
- Image labels (cat, dog, bird)

Both are classification scenarios on the AI-900 exam.
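
The difference between the two can be sketched in a few lines, assuming a model that has already produced a probability or a score per class. The function names, the 0.5 threshold, and the scores are illustrative assumptions, not the output of a real model.

```python
def binary_label(probability, threshold=0.5):
    """Binary classification: one probability, two possible labels."""
    return "Churn" if probability >= threshold else "No Churn"


def multiclass_label(scores):
    """Multi-class classification: pick the category with the highest score."""
    return max(scores, key=scores.get)


print(binary_label(0.8))  # Churn
print(binary_label(0.2))  # No Churn
print(multiclass_label({"cat": 0.1, "dog": 0.7, "bird": 0.2}))  # dog
```

In both cases the final answer is a label drawn from a fixed set, so both count as classification.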

Common Classification Use Cases

Decision-Based Predictions

Loan approval decisions
Insurance claim approval
Credit risk classification

Detection and Filtering

Spam email detection
Fraud detection
Content moderation

Categorization

Customer churn prediction
Sentiment categories (positive, neutral, negative)
Product classification

All of these involve choosing a label, not predicting a number.

Classification vs Other ML Techniques

Understanding how classification differs from regression and clustering is critical for AI-900.

Technique	Output	Example
Regression	Numeric value	Predicting house price
Classification	Category or label	Approving a loan
Clustering	Group assignment	Customer segmentation

Exam tip:
If the output described in the scenario is Yes/No, True/False, or one of a set of predefined named categories, think Classification.

Example Exam Scenarios

Scenario 1

A bank wants to determine whether a transaction is fraudulent.

Output: Fraud / Not Fraud
ML Technique: Classification

Scenario 2

A company wants to predict whether a customer will cancel their subscription.

Output: Cancel / Not Cancel
ML Technique: Classification

Scenario 3

An AI system categorizes customer support tickets into predefined issue types.

Output: Issue category
ML Technique: Classification

Azure Context for AI-900

On the AI-900 exam, classification scenarios are often described using Azure Machine Learning concepts such as:

Training models with labeled datasets
Predicting predefined categories
Evaluating model accuracy

You are not required to:

Select algorithms
Write code
Configure Azure services

Focus on recognizing the technique, not implementing it.

Common Exam Traps and Misconceptions

❌ Predicting a numeric score → Regression
❌ Grouping data without labels → Clustering
❌ Predicting ranges like High / Medium / Low → Classification, not regression
✅ Predicting labels or categories → Classification

Key Takeaways for the Exam

Classification predicts categories or labels
It is a supervised learning technique
Outputs are discrete categories, not continuous numeric values
Binary and multi-class scenarios are both classification
Look for keywords like classify, detect, assign, categorize

Go to the Practice Exam Questions for this topic.

AI, AI-900, Artificial Intelligence (AI), Machine Learning (ML), Microsoft Certification January 31, 2026

Practice Questions: Identify Clustering Machine Learning Scenarios (AI-900 Exam Prep)

Practice Exam Questions

Question 1

A retail company wants to group customers based on purchasing behavior without defining categories in advance.

Which machine learning technique should be used?

A. Regression
B. Classification
C. Clustering
D. Anomaly detection

Correct Answer: C

Explanation:
The goal is to group unlabeled data and discover natural segments, which is clustering.

Question 2

An organization analyzes large volumes of web traffic data to identify patterns in user behavior.

Which machine learning approach is most appropriate?

A. Classification
B. Regression
C. Clustering
D. Forecasting

Correct Answer: C

Explanation:
Identifying patterns and similarities in unlabeled data is a clustering scenario.

Question 3

Which scenario is best suited for clustering?

A. Predicting monthly revenue
B. Determining whether a transaction is fraudulent
C. Segmenting customers into behavior-based groups
D. Estimating delivery time

Correct Answer: C

Explanation:
Customer segmentation without predefined labels is a classic clustering use case.

Question 4

A company wants to organize products into groups based on similarity without predefined categories.

What type of machine learning technique is being used?

A. Regression
B. Classification
C. Clustering
D. Recommendation

Correct Answer: C

Explanation:
Grouping items based on similarity without labels is clustering.

Question 5

Which characteristic most strongly indicates a clustering scenario?

A. Numeric output values
B. Predefined labels
C. Labeled training data
D. Unlabeled data

Correct Answer: D

Explanation:
Clustering uses unlabeled data to discover structure and patterns.

Question 6

An AI system groups support tickets by similarity to identify common issues, without predefined issue types.

Which machine learning approach is being used?

A. Classification
B. Regression
C. Clustering
D. Natural language processing

Correct Answer: C

Explanation:
The system groups tickets without predefined labels, which indicates clustering.

Question 7

Which output best represents a clustering result?

A. Approved / Rejected
B. 4.7 hours
C. Cluster A, Cluster B, Cluster C
D. High risk

Correct Answer: C

Explanation:
Clusters represent group assignments discovered from the data, not numeric values or predefined labels.

Question 8

A data scientist wants to explore a dataset to discover natural groupings before defining categories.

Which technique should be used?

A. Classification
B. Regression
C. Clustering
D. Forecasting

Correct Answer: C

Explanation:
Clustering is used for exploratory analysis to find natural groupings.

Question 9

Which statement best describes clustering?

A. It predicts numeric values
B. It assigns predefined labels
C. It groups similar data points
D. It detects unusual events

Correct Answer: C

Explanation:
Clustering groups data points based on similarity without predefined labels.

Question 10

On the AI-900 exam, which keyword most strongly signals a clustering scenario?

A. Estimate
B. Categorize
C. Group
D. Measure

Correct Answer: C

Explanation:
Group indicates organizing unlabeled data into clusters, which is clustering.

Exam-Day Tip

If a machine learning question mentions …

No labels
Discover patterns
Group or segment data

… the correct answer is likely to be related to Clustering.

AI, AI-900, Artificial Intelligence (AI), Machine Learning (ML), Microsoft Certification January 31, 2026

Identify Clustering Machine Learning Scenarios (AI-900 Exam Prep)

Where This Fits in the Exam

Exam Domain: Describe fundamental principles of machine learning on Azure (15–20%)
Sub-Domain: Identify common machine learning techniques
Topic: Identify clustering machine learning scenarios

On the AI-900 exam, clustering questions test whether you can recognize when grouping unlabeled data is the goal, not how to build or tune clustering models.

What Is Clustering in Machine Learning?

Clustering is a type of unsupervised machine learning used to group similar data points together based on patterns in the data.

No labeled training data is provided
The algorithm discovers structure on its own
The output is a group or cluster, not a predefined label

Key exam rule:
If the data has no labels and the goal is to discover natural groupings, the scenario is clustering.
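
As an optional illustration, the sketch below runs a very small version of k-means, one common clustering algorithm, on one-dimensional data. The function `k_means_1d`, the spending figures, and the starting centers are assumptions made up for this example. Nothing here is required for the exam.

```python
def k_means_1d(values, centers, iterations=10):
    """Tiny k-means on 1-D data: assign to nearest center, then recompute centers."""
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest center for each value.
        assignments = [
            min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            for v in values
        ]
        # Update step: move each center to the mean of its assigned values.
        new_centers = []
        for i in range(len(centers)):
            members = [v for v, a in zip(values, assignments) if a == i]
            new_centers.append(sum(members) / len(members) if members else centers[i])
        centers = new_centers
    return assignments, centers


# Hypothetical monthly spend per customer. Note that there are no labels.
spend = [10.0, 12.0, 11.0, 95.0, 100.0, 105.0]
assignments, centers = k_means_1d(spend, centers=[0.0, 50.0])
print(assignments)  # [0, 0, 0, 1, 1, 1]
print(centers)      # [11.0, 100.0]
```

The input contains only the values themselves. The groups (cluster 0 and cluster 1) come out of the data rather than being defined in advance, and it is up to a person to decide what each group means.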

Characteristics of Clustering Scenarios

A clustering workload typically includes:

Large amounts of unlabeled data
Multiple features describing each data point
No predefined categories
A goal of discovering similarity or structure

Clustering answers questions like:

Which items are similar?
How can this data be segmented?
What patterns exist in this dataset?

Common Clustering Use Cases

Customer Segmentation

Grouping customers by purchasing behavior
Identifying customer personas
Segmenting users for marketing campaigns

Data Exploration

Discovering patterns in large datasets
Identifying natural groupings in usage data
Understanding customer behavior trends

Image and Document Grouping

Grouping images by visual similarity
Organizing documents by topic
Detecting patterns in text collections

All of these involve grouping without predefined labels, which is the hallmark of clustering.

Clustering vs Other Machine Learning Techniques

This distinction is very important for AI-900.

Technique	Labeled Data	Output	Example
Regression	Yes	Numeric value	Predicting house price
Classification	Yes	Category or label	Approving a loan
Clustering	No	Group or cluster	Customer segmentation

Exam tip:
If the scenario mentions no labels, discover, or group, think Clustering.

Example Exam Scenarios

Scenario 1

A retailer wants to group customers based on shopping habits without defining categories in advance.

Labeled data: No
ML Technique: Clustering

Scenario 2

An organization analyzes sensor data to identify natural groupings of usage patterns.

Goal: Discover patterns
ML Technique: Clustering

Scenario 3

A company wants to organize products into groups based on similarity.

Predefined categories: None
ML Technique: Clustering

Azure Context for AI-900

On the AI-900 exam, clustering scenarios are often framed using Azure Machine Learning concepts such as:

Analyzing unlabeled datasets
Discovering patterns in data
Segmenting data for insights

You are not expected to:

Choose clustering algorithms
Configure Azure services
Write code

The focus is on recognizing when clustering is appropriate.

Common Exam Traps and Misconceptions

❌ Predicting a value → Regression
❌ Assigning predefined labels → Classification
❌ Detecting fraud → Classification or anomaly detection
✅ Grouping unlabeled data → Clustering

Key Takeaways for the Exam

Clustering is unsupervised learning
No labeled training data is required
The goal is to group similar data
Outputs are cluster assignments, not numeric values or predefined labels
Keywords: group, segment, organize, discover patterns

AI, AI-900, Artificial Intelligence (AI), Deep Learning, Machine Learning (ML), Microsoft Certification January 31, 2026

Practice Questions: Identify features of deep learning techniques (AI-900 Exam Prep)

Practice Questions

Question 1

Which characteristic best distinguishes deep learning from traditional machine learning techniques?

A. Deep learning always produces more accurate results
B. Deep learning uses rule-based logic
C. Deep learning uses neural networks with multiple layers
D. Deep learning does not require training data

Correct Answer: C

Explanation:
Deep learning is defined by the use of multi-layer (deep) neural networks, which allows the model to learn complex patterns. Accuracy is not guaranteed, and deep learning still requires training data.

Question 2

A data scientist is building a system to identify objects in photographs without manually defining features such as edges or shapes. Which approach best supports this requirement?

A. Linear regression
B. Decision trees
C. Deep learning
D. Rule-based classification

Correct Answer: C

Explanation:
Deep learning models automatically extract features from raw data, making them ideal for image recognition scenarios where manual feature engineering is difficult.

Question 3

Which type of data is deep learning particularly well suited to process?

A. Highly structured tabular data only
B. Unstructured data such as images and text
C. Small datasets with few attributes
D. Pre-aggregated numerical summaries

Correct Answer: B

Explanation:
Deep learning excels with unstructured data like images, audio, video, and natural language text — a key exam concept.

Question 4

Which scenario is the best example of a deep learning workload?

A. Predicting house prices using historical averages
B. Grouping customers by age and income
C. Converting spoken language into text
D. Calculating monthly sales totals

Correct Answer: C

Explanation:
Speech-to-text conversion relies on deep neural networks trained on large datasets and is a classic deep learning use case.

Question 5

Why do deep learning models typically require large amounts of training data?

A. They rely on predefined rules
B. They use many layers with numerous parameters
C. They only work with structured data
D. They do not support feature reuse

Correct Answer: B

Explanation:
Deep learning models contain many parameters across multiple layers, which requires large datasets to train effectively and avoid overfitting.

Question 6

Which statement accurately describes feature engineering in deep learning?

A. Features must always be manually selected
B. Features are randomly generated
C. Features are automatically learned during training
D. Feature engineering is not possible

Correct Answer: C

Explanation:
A defining feature of deep learning is automatic feature extraction, reducing the need for manual feature engineering.

Question 7

Which Azure workload is most likely to use deep learning techniques?

A. Calculating averages in a SQL database
B. Performing rule-based fraud detection
C. Detecting faces in images
D. Sorting records by date

Correct Answer: C

Explanation:
Computer vision tasks such as face detection rely heavily on deep learning models.

Question 8

Compared to traditional machine learning models, deep learning models generally require:

A. Less computational power
B. No training data
C. More computational resources
D. Fewer model parameters

Correct Answer: C

Explanation:
Deep learning models are computationally intensive, often requiring GPUs and longer training times.

Question 9

Which statement is true about deep learning and structured data?

A. Deep learning cannot process structured data
B. Deep learning is always the best choice for structured data
C. Traditional ML is often sufficient for structured data
D. Structured data requires neural networks

Correct Answer: C

Explanation:
For many structured data problems, traditional machine learning techniques may be simpler and more efficient than deep learning.

Question 10

A model uses an input layer, multiple hidden layers, and an output layer. What type of technique does this describe?

A. Clustering
B. Regression
C. Deep learning
D. Rule-based inference

Correct Answer: C

Explanation:
This layered structure is characteristic of deep neural networks, which form the foundation of deep learning techniques.

Exam Tips for This Topic

Look for keywords like images, speech, text, neural networks, and automatic feature extraction
Avoid choosing deep learning for simple, structured, low-data scenarios
Remember: deep learning ≠ better in all cases

AI, AI-900, Artificial Intelligence (AI), Deep Learning, Machine Learning (ML), Microsoft Certification January 31, 2026

Identify Features of Deep Learning Techniques (AI-900 Exam Prep)

Where This Fits in the Exam

Exam Domain: Describe fundamental principles of machine learning on Azure (15–20%)
Sub-Domain: Identify common machine learning techniques
Topic: Identify features of deep learning techniques

On the AI-900 exam, deep learning questions focus on what makes deep learning distinct, when it is used, and what types of problems it solves well.

What Is Deep Learning?

Deep learning is a subset of machine learning that uses artificial neural networks with multiple layers (deep neural networks) to learn complex patterns in data.

Inspired by how the human brain works
Uses many layers to extract increasingly abstract features
Particularly effective with large and complex datasets

Key exam idea:
Deep learning uses multi-layer neural networks to automatically learn features from data.

Key Features of Deep Learning Techniques

Multi-Layer Neural Networks

Deep learning models consist of:

An input layer
Multiple hidden layers
An output layer

Each layer learns progressively more complex representations of the data.

This “depth” is what differentiates deep learning from traditional machine learning models.
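
The layered structure can be sketched as a forward pass through a very small network. This is an illustration only: the weights below are hand-picked rather than learned, the helper names (`relu`, `dense`, `forward`) are hypothetical, and a real deep learning model would learn far more parameters during training.

```python
def relu(values):
    """Common non-linearity: negative values become zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum of the inputs plus a bias per output."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through each (weights, biases) layer in order."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:  # hidden layers apply the non-linearity
            activations = relu(activations)
    return activations


# Hand-picked weights: 2 inputs -> 2 hidden units -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # hidden layer 1
    ([[1.0, 1.0], [-1.0, 1.0]], [0.0, 0.0]),  # hidden layer 2
    ([[1.0, 2.0]], [0.5]),                    # output layer
]
output = forward([2.0, 1.0], layers)
print(output)  # [4.0]
```

Each layer takes the previous layer's output as its input, which is how later layers can build on the representations computed by earlier ones.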

Automatic Feature Extraction

Traditional machine learning often requires manual feature engineering.

Deep learning:

Automatically learns relevant features
Reduces the need for human-designed features
Is well-suited for unstructured data

This is a high-frequency exam point.

Works Well with Unstructured Data

Deep learning excels at handling:

Images
Audio
Video
Natural language text

These data types are difficult for traditional ML models but ideal for deep neural networks.

Requires Large Amounts of Data

Deep learning models typically:

Perform better with large datasets
Require significant training data
Benefit from increased data volume and variety

On the exam, deep learning is often associated with big data scenarios.

High Computational Requirements

Deep learning models:

Require more processing power
Often use GPUs for training
Take longer to train than simpler models

You don’t need hardware details for AI-900 — just recognize that deep learning is computationally intensive.

Common Deep Learning Use Cases

Computer Vision

Image classification
Facial recognition
Object detection

Natural Language Processing

Language translation
Sentiment analysis
Text generation

Speech Recognition

Voice assistants
Speech-to-text systems

These scenarios frequently appear in AI-900 questions tied to deep learning.

Deep Learning vs Traditional Machine Learning

This comparison is commonly tested.

Aspect	Traditional ML	Deep Learning
Feature engineering	Manual	Automatic
Model complexity	Simpler models	Multi-layer neural networks
Data requirements	Smaller datasets	Large datasets
Best for	Structured data	Unstructured data
Compute needs	Lower	Higher

Azure Context for AI-900

In Azure, deep learning is commonly associated with:

Azure Machine Learning
AI services built on deep neural networks
Vision, speech, and language workloads

You are not expected to:

Build neural networks
Choose architectures
Write training code

Focus on identifying features and use cases.

Common Exam Traps and Misconceptions

❌ Deep learning is required for all ML problems
❌ Deep learning works best with small datasets
❌ Deep learning requires manual feature selection
✅ Deep learning excels at complex, unstructured data tasks

Key Takeaways for the Exam

Deep learning uses multi-layer neural networks
It automatically learns features from data
It works best with large datasets
It is ideal for images, text, audio, and video
It requires more computational resources than traditional ML

AI, AI-900, Artificial Intelligence (AI), Deep Learning, Microsoft Certification January 31, 2026

Practice Exam Questions: Identify Features of the Transformer Architecture (AI-900 Exam Prep)

Practice Exam Questions

Question 1

What is the primary purpose of the self-attention mechanism in a Transformer model?

A. To reduce the size of the training dataset
B. To allow the model to focus on relevant parts of the input sequence
C. To replace the need for training data
D. To process words strictly in order

Correct Answer: B

Explanation:
Self-attention enables a Transformer to determine which words in a sentence are most relevant to one another, improving context understanding. It does not enforce strict order or reduce dataset size.

Question 2

Which feature allows Transformers to be trained more efficiently than recurrent neural networks (RNNs)?

A. Sequential word processing
B. Parallel processing of input data
C. Manual feature engineering
D. Rule-based language models

Correct Answer: B

Explanation:
Transformers process entire sequences in parallel, unlike RNNs that process tokens sequentially. This makes Transformers faster and more scalable.

Question 3

Transformers require positional encoding mainly because they:

A. Use convolutional layers
B. Process all input tokens at the same time
C. Rely on labeled data only
D. Perform unsupervised learning

Correct Answer: B

Explanation:
Because Transformers process words in parallel, positional encoding is needed to preserve information about word order in a sentence.

Question 4

Which type of AI workload most commonly uses Transformer-based models?

A. Time-series forecasting
B. Natural language processing
C. Image compression
D. Robotics control systems

Correct Answer: B

Explanation:
Transformers are primarily used for NLP tasks such as translation, summarization, and conversational AI.

Question 5

Which statement best describes the encoder–decoder architecture used in many Transformer models?

A. Both components generate output text
B. The encoder understands input, and the decoder generates output
C. The decoder trains the encoder
D. Both components store training data

Correct Answer: B

Explanation:
The encoder processes and understands the input sequence, while the decoder generates the output sequence based on that understanding.

Question 6

Why are Transformers better at handling long-range dependencies in text compared to earlier models?

A. They use fewer parameters
B. They rely on handcrafted grammar rules
C. They use attention to relate all words in a sequence
D. They process words one at a time

Correct Answer: C

Explanation:
Self-attention allows Transformers to evaluate relationships between all words in a sentence, regardless of distance.

Question 7

Which Azure scenario is most likely to involve a Transformer-based model?

A. Predicting tomorrow’s stock price
B. Detecting network hardware failures
C. Translating text between languages
D. Calculating average sales per region

Correct Answer: C

Explanation:
Language translation is a classic NLP task that relies heavily on Transformer architectures.

Question 8

What is a major advantage of Transformers over traditional sequence models?

A. They require no training data
B. They eliminate bias automatically
C. They improve scalability and performance
D. They work only with structured data

Correct Answer: C

Explanation:
Transformers scale efficiently due to parallel processing and attention mechanisms, improving performance on large datasets.

Question 9

Which statement about Transformers is TRUE?

A. They are rule-based AI systems
B. They process data strictly sequentially
C. They are a type of deep learning model
D. They are limited to image recognition

Correct Answer: C

Explanation:
Transformers are deep learning architectures commonly used for NLP tasks.

Question 10

Which feature enables a Transformer model to understand the context of a word based on surrounding words?

A. Positional encoding
B. Tokenization
C. Self-attention
D. Data labeling

Correct Answer: C

Explanation:
Self-attention allows the model to weigh the importance of surrounding words when interpreting meaning and context.

Quick Exam Tip

If you see keywords like:

attention
context
parallel processing
language understanding
Azure OpenAI

You’re almost certainly dealing with a Transformer-based model.

AI, AI-900, Artificial Intelligence (AI), Deep Learning, Microsoft Certification, Natural Language Processing (NLP) January 31, 2026

Identify Features of the Transformer Architecture (AI-900 Exam Prep)

Where This Topic Fits in the Exam

Exam domain: Describe fundamental principles of machine learning on Azure (15–20%)
Sub-area: Identify common machine learning techniques
Focus: Understanding what Transformers are, why they matter, and what problems they solve — not how to code them

The AI-900 exam tests conceptual understanding, so you should recognize key features, benefits, and common use cases of the Transformer architecture.

What Is the Transformer Architecture?

The Transformer architecture is a type of deep learning model designed primarily for natural language processing (NLP) tasks.
It was introduced in the paper “Attention Is All You Need” and has since become the foundation for modern AI models such as:

Large Language Models (LLMs)
Chatbots
Translation systems
Text summarization tools

Unlike earlier sequence models such as RNNs, Transformers do not process input one token at a time. Instead, they analyze the entire sequence at once, which makes training faster and more scalable.

Key Features of the Transformer Architecture

1. Attention Mechanism (Self-Attention)

The core feature of a Transformer is self-attention.

Self-attention allows the model to:

Evaluate the importance of each word relative to every other word in a sentence
Understand context and relationships, even when words are far apart

Example:
In the sentence “The animal didn’t cross the road because it was tired”, self-attention helps the model work out that “it” refers to “the animal” rather than “the road”.

📌 Exam takeaway: Transformers use attention to understand context more effectively than older models.
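The exam does not require code, but a small sketch can make the idea concrete. The example below is a simplified illustration of self-attention, not a full Transformer layer: it assumes the queries, keys, and values are the raw token vectors themselves (a real model learns separate projections for each), and the two-dimensional embeddings are hypothetical values chosen for readability.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Simplified self-attention: each token attends to every token.

    Assumption: queries, keys, and values are all the raw input vectors.
    A real Transformer first applies learned projections to obtain them.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this token against every token (scaled dot product).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        # The output is the weighted average of all token vectors.
        mixed = [sum(w * vec[i] for w, vec in zip(weights, vectors))
                 for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-dimensional embeddings for three tokens.
tokens = ["animal", "road", "it"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
outputs, weights = self_attention(embeddings)
```

With these made-up vectors, the weights for “it” (`weights[2]`) are larger for “animal” than for “road”, which is the kind of relationship self-attention is meant to capture.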

2. Parallel Processing

Traditional models like RNNs process text one word at a time.
Transformers process all words in parallel.

Benefits:

Faster training
Better performance on large datasets
Improved scalability in cloud environments (like Azure)

📌 Exam takeaway: Transformers are efficient and scalable because they don’t rely on sequential processing.

3. Encoder–Decoder Structure

Many Transformer-based models use an encoder–decoder architecture:

Encoder:
- Reads and understands the input (e.g., a sentence in English)
Decoder:
- Generates the output (e.g., the translated sentence in Spanish)

📌 Exam takeaway: Transformers often use encoders to understand input and decoders to generate output.

4. Positional Encoding

Because Transformers process words in parallel, they need a way to understand word order.

Positional encoding:

Adds information about the position of each word
Allows the model to understand sentence structure and sequence

📌 Exam takeaway: Transformers use positional encoding to retain word order information.
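As an illustration only, the sketch below uses the sinusoidal scheme described in “Attention Is All You Need”. This is one possible approach (many models learn their position vectors instead), and the function names and embedding values are hypothetical.

```python
import math

def positional_encoding(num_positions, dim):
    """Sinusoidal encoding: one dim-length vector per position."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(dim):
            # Dimension pairs (2k, 2k+1) share one frequency.
            angle = pos / (10000 ** ((i - i % 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def add_positions(embeddings):
    """Add position information to each token embedding."""
    table = positional_encoding(len(embeddings), len(embeddings[0]))
    return [[e + p for e, p in zip(emb, pos)]
            for emb, pos in zip(embeddings, table)]

# The same hypothetical embedding placed at two different positions.
same_word = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
encoded = add_positions(same_word)
```

After the position vectors are added, the two identical word embeddings are no longer identical, so the model can tell the first occurrence from the second.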

5. Strong Performance on Natural Language Tasks

Transformers are especially effective for:

Text translation
Text summarization
Question answering
Chatbots and conversational AI
Sentiment analysis

📌 Exam takeaway: Transformers are closely associated with natural language processing workloads.

Why Transformers Are Important in Azure AI

Microsoft Azure AI services rely heavily on Transformer-based models, especially in:

Azure OpenAI Service
Azure AI Language
Conversational AI and copilots
Search and knowledge mining

Understanding Transformers helps explain why modern AI solutions are more accurate, context-aware, and scalable.

Transformers vs Earlier Models (High-Level)

Feature	Earlier Models (e.g., RNNs)	Transformers
Sequence processing	Sequential	Parallel
Context handling	Limited	Strong
Long-range dependencies	Difficult	Effective
Training speed	Slower	Faster
NLP performance	Moderate	State-of-the-art

📌 Exam focus: You don’t need technical depth — just understand why Transformers are better for language tasks.

Common Exam Pitfalls to Avoid

❌ Thinking Transformers replace all ML models
❌ Assuming Transformers are only for images
❌ Confusing Transformers with traditional rule-based NLP

✅ Remember: Transformers are deep learning models optimized for language and sequence understanding.

Key Exam Summary (Must-Know Points)

If you remember nothing else, remember this:

Transformers are deep learning models
They rely on self-attention
They process data in parallel
They are especially effective for natural language processing
They power modern AI services in Azure

AI, AI-900, Artificial Intelligence (AI), Machine Learning (ML), Microsoft Certification January 31, 2026

Identify Features and Labels in a Dataset for Machine Learning (AI-900 Exam Prep)

This section of the AI-900: Microsoft Azure AI Fundamentals exam focuses on understanding one of the most important foundational concepts in machine learning: features and labels. You are not expected to build models or write code, but you must be able to recognize features and labels in a dataset and understand their role in different machine learning scenarios.

This topic appears under: Describe fundamental principles of machine learning on Azure (15–20%) → Describe core machine learning concepts

What Is a Dataset in Machine Learning?

A dataset is a collection of data used to train, validate, and test machine learning models. In supervised learning scenarios (which are emphasized in AI-900), a dataset typically contains:

Features: The input values used to make predictions
Labels: The output or target values the model learns to predict

Each row in a dataset usually represents a single observation or record, and each column represents either a feature or a label.

What Are Features?

Features are the individual measurable properties or characteristics of the data that are used as inputs to a machine learning model.

Key Characteristics of Features

Features describe what you know about each data point
They are used by the model to identify patterns
Features can be numerical or categorical, and can also be derived (calculated from other columns)

Examples of Features

Scenario	Example Features
House price prediction	Number of bedrooms, square footage, location
Customer churn	Account age, number of support tickets, monthly spend
Email classification	Word frequency, sender domain, message length

In Azure Machine Learning, features are often referred to as input variables.

What Are Labels?

A label is the value that a machine learning model is trained to predict. Labels are only present in supervised learning datasets.

Key Characteristics of Labels

Labels represent the outcome or answer
A dataset usually has one label column
Labels are known during training but unknown during prediction

Examples of Labels

Scenario	Label
House price prediction	Sale price
Customer churn	Churned (Yes/No)
Image classification	Object category

In Azure Machine Learning, labels are often called target variables.

Features vs Labels: Key Differences

Aspect	Features	Labels
Purpose	Input to the model	Output to predict
Quantity	Usually many	Typically one
Known during training	Yes	Yes
Known during prediction	Yes	No

Understanding this distinction is critical for AI-900 exam questions.
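The distinction in the table can be sketched in a few lines of Python. This is an illustrative example rather than an Azure Machine Learning workflow: the column names, values, and the helper `split_features_and_label` are all hypothetical.

```python
# Hypothetical house-price records: each dict is one row (observation).
rows = [
    {"bedrooms": 3, "square_feet": 1500, "location": "suburb", "sale_price": 310000},
    {"bedrooms": 2, "square_feet": 900, "location": "city", "sale_price": 275000},
    {"bedrooms": 4, "square_feet": 2200, "location": "rural", "sale_price": 340000},
]

LABEL_COLUMN = "sale_price"  # the value the model should learn to predict

def split_features_and_label(records, label_column):
    """Separate each row into its feature columns and its label value."""
    features = [{k: v for k, v in row.items() if k != label_column}
                for row in records]
    labels = [row[label_column] for row in records]
    return features, labels

X, y = split_features_and_label(rows, LABEL_COLUMN)

# At prediction time a new row arrives with features only; the label is unknown.
new_house = {"bedrooms": 3, "square_feet": 1400, "location": "city"}
```

Here `X` holds the many feature columns and `y` holds the single label column, while `new_house` has the same feature columns but no `sale_price`.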

Features and Labels in Supervised Learning

Supervised learning relies on labeled datasets. The model learns by comparing its predictions to the known labels and adjusting accordingly.
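The “compare and adjust” loop can be sketched as follows. This is a minimal example assuming a straight-line model trained with gradient descent on mean squared error; the data, the function name `train_linear_model`, and the learning rate and epoch count are illustrative choices, not something the exam or Azure prescribes.

```python
def train_linear_model(xs, ys, learning_rate=0.05, epochs=5000):
    """Fit y = w * x + b by repeatedly comparing predictions to labels."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Error = prediction minus the known label, for every row.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Adjust the parameters in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical data: feature = hours studied, label = exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]  # generated as 2 * hours + 50
w, b = train_linear_model(hours, scores)
```

Because the labels were generated from a straight line, the learned `w` and `b` end up very close to 2 and 50. Without the known labels there would be no error to compute and nothing to adjust against.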

Common Supervised Learning Types

Regression
- Features: numeric or categorical inputs
- Label: numeric value (e.g., price, temperature)
Classification
- Features: descriptive inputs
- Label: category or class (e.g., spam/not spam)

Features and Labels in Unsupervised Learning

Unsupervised learning datasets do not contain labels.

The model identifies patterns or groupings on its own
Common example: clustering

In AI-900, this distinction is important:

If a dataset has no labels, it is not supervised learning.
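To show what “no labels” looks like in practice, here is a small clustering sketch. It assumes a simplified one-dimensional k-means with two clusters and a fixed starting point; the function `kmeans_1d` and the spend values are hypothetical.

```python
def kmeans_1d(values, iterations=10):
    """Split numbers into two groups around two moving centers (k = 2)."""
    centers = [min(values), max(values)]  # simple deterministic start
    assignments = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to its nearest center.
        assignments = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                       for v in values]
        # Move each center to the mean of the values assigned to it.
        for c in (0, 1):
            members = [v for v, a in zip(values, assignments) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return assignments, centers

# Hypothetical monthly spend per customer; note there is no label column.
monthly_spend = [10, 12, 11, 95, 102, 98]
groups, centers = kmeans_1d(monthly_spend)
```

The first three customers land in one group and the last three in the other, even though nobody told the algorithm which group any customer belongs to.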

Real-World Azure Example

Consider a dataset used in Azure Machine Learning to predict whether a customer will cancel a subscription.

Features:
- Number of logins per month
- Subscription length
- Customer support interactions
Label:
- Subscription canceled (Yes or No)

The model learns the relationship between the features and the label to make future predictions.
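The scenario above might be sketched like this. The example uses a 1-nearest-neighbor rule purely as a stand-in for “a model”; it is not what Azure Machine Learning uses, and the rows, the function `predict_canceled`, and the numbers are all made up for illustration.

```python
# Hypothetical training rows: (logins per month, subscription months,
# support interactions) plus the known label "canceled".
training_rows = [
    ((25, 24, 0), "No"),
    ((30, 36, 1), "No"),
    ((2, 3, 5), "Yes"),
    ((4, 2, 4), "Yes"),
]

def predict_canceled(features, rows):
    """Return the label of the most similar training row (1-nearest neighbor)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest_features, nearest_label = min(
        rows, key=lambda row: squared_distance(row[0], features))
    return nearest_label

# A new customer: features are known, the label is what we want to predict.
new_customer = (3, 4, 6)
prediction = predict_canceled(new_customer, training_rows)
```

The new customer has few logins, a short subscription, and many support interactions, so the closest training rows are the ones labeled “Yes” and that is the predicted label.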

Exam Tips for AI-900

If the question asks “what the model uses to make predictions”, look for features
If the question asks “what the model predicts”, look for labels
If labels are present, it is supervised learning
AI-900 focuses on conceptual understanding, not data science implementation

Key Takeaways

Features are input variables used to make predictions
Labels are the known outcomes the model learns to predict
Supervised learning requires labeled data
Being able to identify features and labels in a scenario is essential for AI-900

This knowledge forms the foundation for understanding regression, classification, and many Azure AI workloads covered later in the exam.
