Features – The Data Community

Identify Features and Labels in a Dataset for Machine Learning (AI-900 Exam Prep)

This section of the AI-900: Microsoft Azure AI Fundamentals exam focuses on understanding one of the most important foundational concepts in machine learning: features and labels. You are not expected to build models or write code, but you must be able to recognize features and labels in a dataset and understand their role in different machine learning scenarios.

This topic appears under: Describe fundamental principles of machine learning on Azure (15–20%) → Describe core machine learning concepts

What Is a Dataset in Machine Learning?

A dataset is a collection of data used to train, validate, and test machine learning models. In supervised learning scenarios (which are emphasized in AI-900), a dataset typically contains:

- Features: The input values used to make predictions
- Labels: The output or target values the model learns to predict

Each row in a dataset usually represents a single observation or record, and each column represents either a feature or a label.
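
The row-and-column layout described above can be sketched in a few lines of Python. This is only an illustration: the column names and values are made up, and `split_features_and_labels` is a hypothetical helper, not part of any Azure tool.

```python
# Hypothetical house-price rows: each dict is one observation (one row).
rows = [
    {"bedrooms": 3, "square_feet": 1400, "location": "suburb", "sale_price": 250000},
    {"bedrooms": 2, "square_feet": 900, "location": "city", "sale_price": 310000},
    {"bedrooms": 4, "square_feet": 2000, "location": "rural", "sale_price": 220000},
]

LABEL_COLUMN = "sale_price"  # the column the model should learn to predict


def split_features_and_labels(rows, label_column):
    """Return (features, labels): every column except the label is a feature."""
    features = [
        {name: value for name, value in row.items() if name != label_column}
        for row in rows
    ]
    labels = [row[label_column] for row in rows]
    return features, labels


features, labels = split_features_and_labels(rows, LABEL_COLUMN)
print(features[0])  # {'bedrooms': 3, 'square_feet': 1400, 'location': 'suburb'}
print(labels)       # [250000, 310000, 220000]
```

Each entry in `features` lines up with the entry at the same position in `labels`, so the pairing between inputs and known outcomes is preserved.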

What Are Features?

Features are the individual measurable properties or characteristics of the data that are used as inputs to a machine learning model.

Key Characteristics of Features

- Features describe what you know about each data point
- They are used by the model to identify patterns
- Features can be numerical, categorical, or derived
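
To make the three kinds of features concrete, here is a small Python sketch. The customer record, the `total_spend` column, and the list of plans are assumptions for illustration. One-hot encoding is shown as one common way to turn a categorical feature into numbers; it is not something the exam asks you to implement.

```python
# Hypothetical customer record with numerical and categorical features.
customer = {"monthly_spend": 40.0, "months_active": 8, "plan": "premium"}

# Derived feature: computed from existing columns rather than collected directly.
customer["total_spend"] = customer["monthly_spend"] * customer["months_active"]

# Categorical feature turned into numbers with one-hot encoding.
PLANS = ["basic", "standard", "premium"]  # assumed set of categories
plan_one_hot = [1 if customer["plan"] == plan else 0 for plan in PLANS]

print(customer["total_spend"])  # 320.0
print(plan_one_hot)             # [0, 0, 1]
```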

Examples of Features

Scenario	Example Features
House price prediction	Number of bedrooms, square footage, location
Customer churn	Account age, number of support tickets, monthly spend
Email classification	Word frequency, sender domain, message length

In Azure Machine Learning, features are often referred to as input variables.

What Are Labels?

A label is the value that a machine learning model is trained to predict. Labels are only present in supervised learning datasets.

Key Characteristics of Labels

- Labels represent the outcome or answer
- A dataset usually has one label column
- Labels are known during training but unknown during prediction

Examples of Labels

Scenario	Label
House price prediction	Sale price
Customer churn	Churned (Yes/No)
Image classification	Object category

In Azure Machine Learning, labels are often called target variables.

Features vs Labels: Key Differences

Aspect	Features	Labels
Purpose	Input to the model	Output to predict
Quantity	Usually many	Typically one
Known during training	Yes	Yes
Known during prediction	Yes	No

Understanding this distinction is critical for AI-900 exam questions.

Features and Labels in Supervised Learning

Supervised learning relies on labeled datasets. The model learns by comparing its predictions to the known labels and adjusting accordingly.
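
The "compare and adjust" idea can be sketched with a deliberately tiny model. The example below assumes a single feature, a single adjustable weight, and made-up data in which the price is exactly twice the size. It uses a basic gradient-descent style update as one possible way to adjust; real training algorithms are more involved.

```python
# Hypothetical data: one feature (size, in thousands of square feet) and a
# numeric label (price, in hundreds of thousands). Here price = 2 * size.
sizes = [1.0, 2.0, 3.0]
prices = [2.0, 4.0, 6.0]

weight = 0.0          # the model's single adjustable parameter
learning_rate = 0.05  # how strongly each error changes the weight

for _ in range(200):
    for size, known_price in zip(sizes, prices):
        predicted = weight * size
        error = predicted - known_price         # compare prediction to the known label
        weight -= learning_rate * error * size  # adjust to shrink the error

print(round(weight, 3))  # 2.0
```

Without the known labels in `prices` there would be nothing to compare against, which is why supervised learning needs labeled data.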

Common Supervised Learning Types

Regression
- Features: numeric or categorical inputs
- Label: numeric value (e.g., price, temperature)
Classification
- Features: descriptive inputs
- Label: category or class (e.g., spam/not spam)
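
As a rough illustration of how the label decides the task type, the hypothetical helper below looks only at the label values. It is a heuristic sketch under a stated assumption: numeric labels mean quantities, not class codes such as 0 and 1, which this check would wrongly treat as regression.

```python
def task_type(labels):
    """Rough heuristic: numeric labels suggest regression, anything else classification."""
    numeric = all(
        isinstance(label, (int, float)) and not isinstance(label, bool)
        for label in labels
    )
    return "regression" if numeric else "classification"


print(task_type([250000, 310000, 220000]))      # regression
print(task_type(["spam", "not spam", "spam"]))  # classification
```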

Features and Labels in Unsupervised Learning

Unsupervised learning datasets do not contain labels.

- The model identifies patterns or groupings on its own
- Common example: clustering

In AI-900, this distinction is important:

If a dataset has no labels, it is not supervised learning.
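
A minimal clustering sketch shows what "features only" looks like in practice. The spend values are made up, and `two_means` is a simplified one-dimensional k-means with two clusters, written here for illustration rather than taken from any library.

```python
# Hypothetical monthly spend per customer: features only, no label column.
monthly_spend = [10.0, 12.0, 11.0, 50.0, 52.0, 49.0]


def two_means(values, iterations=10):
    """Tiny 1-D k-means with k=2; returns a cluster index (0 or 1) per value."""
    centroids = [min(values), max(values)]  # simple deterministic start
    assignments = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to its nearest centroid.
        assignments = [
            0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            for v in values
        ]
        # Move each centroid to the mean of its assigned values.
        for k in (0, 1):
            members = [v for v, a in zip(values, assignments) if a == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return assignments


print(two_means(monthly_spend))  # [0, 0, 0, 1, 1, 1]
```

The cluster indexes are groupings the algorithm found on its own. They are not labels supplied with the data, and nothing in the dataset says what either group means.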

Real-World Azure Example

Consider a dataset used in Azure Machine Learning to predict whether a customer will cancel a subscription.

Features:
- Number of logins per month
- Subscription length
- Customer support interactions
Label:
- Subscription canceled (Yes or No)

The model learns the relationship between the features and the label to make future predictions.
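
The same scenario can be sketched in Python. The rows below are invented, and the prediction uses a 1-nearest-neighbor rule chosen only because it is short; it is not the algorithm Azure Machine Learning would necessarily use.

```python
# Hypothetical training rows:
# (logins_per_month, subscription_months, support_interactions)
training_features = [
    (30, 24, 1),
    (25, 18, 0),
    (2, 3, 5),
    (4, 2, 6),
]
# Known labels for those rows: was the subscription canceled?
training_labels = ["No", "No", "Yes", "Yes"]


def predict_canceled(new_features):
    """1-nearest-neighbor: copy the label of the most similar training row."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = min(
        range(len(training_features)),
        key=lambda i: squared_distance(training_features[i], new_features),
    )
    return training_labels[nearest]


# New customers arrive with features only; the label is what gets predicted.
print(predict_canceled((3, 4, 4)))    # Yes
print(predict_canceled((28, 20, 1)))  # No
```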

Exam Tips for AI-900

- If the question asks “what the model uses to make predictions”, look for features
- If the question asks “what the model predicts”, look for labels
- If known labels are present and the model is trained to predict them, it is supervised learning
- AI-900 focuses on conceptual understanding, not data science implementation

Key Takeaways

- Features are input variables used to make predictions
- Labels are the known outcomes the model learns to predict
- Supervised learning requires labeled data
- Being able to identify features and labels in a scenario is essential for AI-900

This knowledge forms the foundation for understanding regression, classification, and many Azure AI workloads covered later in the exam.

Practice Exam Questions

Question 1

You are training a model to predict house prices. The dataset includes columns for square footage, number of bedrooms, location, and sale price.
Which column is the label?

A. Square footage
B. Number of bedrooms
C. Location
D. Sale price

Correct Answer: D

Explanation:
The label is the value the model is trained to predict. In this scenario, the goal is to predict the sale price.

Question 2

Which statement best describes a feature in a machine learning dataset?

A. The final prediction made by the model
B. An input value used to make predictions
C. A rule written by a developer
D. The accuracy of the model

Correct Answer: B

Explanation:
Features are the input variables that provide information the model uses to make predictions.

Question 3

A dataset contains customer age, subscription length, monthly charges, and whether the customer canceled the service.
What is the label?

A. Customer age
B. Subscription length
C. Monthly charges
D. Whether the customer canceled

Correct Answer: D

Explanation:
The label represents the outcome being predicted, which in this case is whether the customer canceled the service.

Question 4

Which type of machine learning requires both features and labels?

A. Unsupervised learning
B. Reinforcement learning
C. Supervised learning
D. Clustering

Correct Answer: C

Explanation:
Supervised learning uses labeled data so the model can learn the relationship between features and known outcomes.

Question 5

A dataset is used to group customers based on purchasing behavior, but it does not contain any target outcome.
What does this dataset contain?

A. Labels only
B. Features only
C. Training results
D. Predictions

Correct Answer: B

Explanation:
Unsupervised learning datasets contain features but do not include labels.

Question 6

In an email spam detection dataset, which item would most likely be a feature?

A. Spam or not spam
B. Model accuracy score
C. Number of words in the email
D. Final prediction

Correct Answer: C

Explanation:
The number of words is an input characteristic used by the model to make predictions, making it a feature.

Question 7

Which statement about labels is TRUE?

A. Labels are optional in supervised learning
B. Labels are the inputs used by the model
C. Labels represent the value the model predicts
D. Labels are created after predictions are made

Correct Answer: C

Explanation:
Labels are the known outcomes the model is trained to predict in supervised learning scenarios.

Question 8

You are preparing data in Azure Machine Learning to predict product demand.
Which columns should be selected as features?

A. Only the column you want to predict
B. All columns except the target outcome
C. Only numerical columns
D. Only categorical columns

Correct Answer: B

Explanation:
Features are the input columns used to predict the target outcome, which is the label.

Question 9

A dataset includes the following columns: temperature, humidity, wind speed, and weather condition.
If the goal is to predict the weather condition, what are temperature, humidity, and wind speed?

A. Labels
B. Predictions
C. Features
D. Outputs

Correct Answer: C

Explanation:
These values are inputs used to predict the weather condition, making them features.

Question 10

Which scenario best represents a labeled dataset?

A. Customer data grouped by similarity
B. Sensor readings without outcomes
C. Product reviews with sentiment categories
D. Website logs without classifications

Correct Answer: C

Explanation:
Product reviews with sentiment categories include known outcomes, which are labels, making the dataset labeled.

Exam Pattern Tip

On AI-900:

- Features = inputs
- Labels = outputs
- If labels exist → supervised learning
- If no labels → unsupervised learning

If you can identify those quickly, you’ll eliminate most wrong answers immediately.
