This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Identify AI concepts and capabilities (40–45%)
   --> Identify AI model components and configurations
      --> Describe how generative AI models work

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Generative AI is one of the most important and rapidly growing areas of artificial intelligence and is a major topic for the AI-901 certification exam. Microsoft includes generative AI concepts within the “Identify AI model components and configurations” section of the exam objectives.

Understanding how generative AI models work means understanding how AI systems can create new content such as text, images, audio, code, and video based on patterns learned from large datasets.

What Is Generative AI?

Generative AI refers to AI systems that can generate new content based on patterns learned from training data.

Unlike traditional AI systems that primarily classify or predict, generative AI creates original outputs.

Examples of Generated Content

Text
Images
Music
Speech
Code
Video

Example Applications

AI chatbots
Image generators
Code assistants
Content summarization
Translation systems
Virtual assistants

How Generative AI Differs from Traditional AI

Traditional AI	Generative AI
Classifies or predicts	Creates new content
Detects spam emails	Writes emails
Identifies objects in images	Generates images
Predicts sales trends	Creates reports or summaries

Traditional AI often answers questions like:

“What category does this belong to?”
“What will likely happen next?”

Generative AI answers questions like:

“Create something new.”
“Generate content based on this prompt.”

Foundation Models

Many generative AI systems are built using foundation models.

A foundation model is a very large AI model trained on massive amounts of data that can be adapted for many tasks.

Foundation models learn general patterns in:

Language
Images
Audio
Code
Knowledge relationships

These models can then be specialized or prompted for different use cases.

Large Language Models (LLMs)

Large Language Models (LLMs) are a type of generative AI model focused on understanding and generating human language.

LLMs power systems used for:

Chatbots
Writing assistants
Summarization
Translation
Question answering
Code generation

LLMs are trained using enormous collections of text data from books, articles, websites, and other sources.

How Large Language Models Work

At a high level, LLMs work by estimating how likely each possible next token (a word or piece of a word) is, given the text so far, and then selecting one of the likely candidates.

Example

If the model sees:

“The sky is…”

It may predict:

“blue”

By repeatedly predicting the next token, the model can generate sentences, paragraphs, and conversations.
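
The idea of repeated next-token prediction can be illustrated with a toy sketch. This is not how a real LLM is implemented: it assumes a tiny made-up corpus and simply counts which word follows which, whereas a real model uses a neural network trained on far more data. The names train_bigram, predict_next, and generate are hypothetical helpers chosen for this illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up training text (an assumption for illustration only).
corpus = "the sky is blue . the sea is blue . the grass is green ."

def train_bigram(text):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def generate(counts, start, max_tokens=5):
    """Greedy generation: keep appending the most likely next word."""
    output = [start]
    for _ in range(max_tokens):
        next_word = predict_next(counts, output[-1])
        if next_word is None or next_word == ".":
            break
        output.append(next_word)
    return " ".join(output)

model = train_bigram(corpus)
print(predict_next(model, "is"))  # blue
print(generate(model, "sky"))     # sky is blue
```

In this toy corpus "blue" follows "is" more often than "green" does, so it is the predicted continuation, which mirrors the "The sky is…" example above.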

Tokens in Generative AI

Generative AI models process information as tokens.

Tokens are small units of text, which may represent:

Words
Parts of words
Characters
Punctuation

Example

The sentence:

“AI is powerful”

might be broken into tokens such as:

“AI”
“is”
“powerful”

The model predicts tokens one at a time to generate output.
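
As a rough sketch of tokenization, the snippet below splits text into word and punctuation tokens and maps each distinct token to an integer ID. It is a simplification under stated assumptions: production tokenizers typically split text into subword pieces using a learned vocabulary, and the helper names simple_tokenize and build_vocab are hypothetical.

```python
import re

def simple_tokenize(text):
    """Split text into word tokens and single-character punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

def build_vocab(tokens):
    """Give each distinct token an integer ID, in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab

tokens = simple_tokenize("AI is powerful, and AI is useful.")
vocab = build_vocab(tokens)
ids = [vocab[token] for token in tokens]

print(tokens)  # ['AI', 'is', 'powerful', ',', 'and', 'AI', 'is', 'useful', '.']
print(ids)     # [0, 1, 2, 3, 4, 0, 1, 5, 6]
```

The model itself works with the integer IDs rather than the raw text, which is why repeated tokens such as "AI" and "is" map to the same numbers.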

Neural Networks and Deep Learning

Generative AI models are built using deep learning neural networks.

Neural networks are systems inspired by the structure of the human brain.

These networks contain many layers that learn patterns from data.

Generative AI models often contain millions, billions, or even trillions of parameters.

Parameters are internal values learned during training that help the model recognize relationships and patterns.
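
To make the idea of a parameter count concrete, here is a minimal sketch that counts the weights and biases in a small fully connected network. The layer sizes are hypothetical, and real generative models use more elaborate layer types, so treat this only as an illustration of where the numbers come from.

```python
def dense_layer_params(inputs, outputs):
    """One weight per input-output connection, plus one bias per output."""
    return inputs * outputs + outputs

def count_params(layer_sizes):
    """Total parameters for a stack of fully connected layers."""
    pairs = zip(layer_sizes, layer_sizes[1:])
    return sum(dense_layer_params(a, b) for a, b in pairs)

# Hypothetical tiny network: 4 inputs, 8 hidden units, 2 outputs.
print(count_params([4, 8, 2]))  # (4*8 + 8) + (8*2 + 2) = 58
```

Scaling the same arithmetic up to thousands of units per layer and many layers is how parameter counts reach the billions.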

Transformers

Most modern large language models, and many other generative AI systems, use a neural network architecture called the Transformer.

Transformers are highly effective for processing sequences such as language.

Transformers help models:

Understand context
Recognize relationships between words
Handle long passages of text
Generate coherent responses

The Transformer architecture is a foundational technology behind many modern AI systems.
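
Transformers relate tokens to one another through a mechanism called attention. The sketch below shows scaled dot-product attention in plain Python so the arithmetic is visible. It makes several simplifying assumptions: the token vectors are made up, and they are used directly as queries, keys, and values, whereas a real Transformer derives those from learned parameters and runs many attention layers.

```python
import math

def softmax(values):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(values)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    width = len(values[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [dot(q, k) / scale for k in keys]  # similarity to every token
        weights = softmax(scores)                    # how much to attend to each
        mixed = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(width)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-dimensional vectors for three tokens.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = attention(vectors, vectors, vectors)
print([round(w, 3) for w in weights[0]])
```

The first token gives more weight to the second token, whose vector is similar, than to the third, whose vector is not. This weighting is how context and relationships between words influence each token's representation.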

Training Generative AI Models

Training a generative AI model involves exposing it to massive datasets.

During training, the model learns patterns and relationships by repeatedly predicting missing or next tokens.

Simplified Training Process

Provide training data
Hide or predict portions of the data
Compare predictions to actual results
Adjust model parameters
Repeat many times

This process may require enormous computing power and specialized hardware such as GPUs.
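
The loop structure of the simplified process above can be sketched with a deliberately tiny model. This example is an assumption-heavy stand-in: it trains a single parameter to fit made-up numeric data using gradient descent, while a real generative model adjusts billions of parameters to predict tokens. Only the predict, compare, adjust, repeat cycle carries over.

```python
# Made-up training data: each input is paired with the actual result (here, 3 * x).
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

def train(data, learning_rate=0.05, epochs=200):
    w = 0.0  # the single model parameter, starting from an uninformed value
    for _ in range(epochs):                    # repeat many times
        for x, target in data:                 # provide training data
            prediction = w * x                 # predict
            error = prediction - target        # compare prediction to actual result
            gradient = 2 * error * x           # slope of the squared error with respect to w
            w -= learning_rate * gradient      # adjust the parameter
    return w

w = train(data)
print(round(w, 3))  # close to 3.0
```

Each pass nudges the parameter in the direction that reduces the error, which is the same principle applied at vastly larger scale when training generative models.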

Pretraining and Fine-Tuning

Generative AI training often occurs in two stages.

Pretraining

The model learns general knowledge and patterns from very large datasets.

Example

An LLM may learn grammar, facts, reasoning patterns, and language structure from internet-scale text data.

Fine-Tuning

The pretrained model is then adapted for specific tasks or domains.

Example

A healthcare chatbot may be fine-tuned using medical terminology and healthcare conversations.

Fine-tuning improves performance for specialized use cases.
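
The two-stage idea can be shown with a toy analogy. The sketch below is not neural-network fine-tuning: it uses a word-pair counting model, first trained on made-up general text and then trained further on made-up medical text, to show how additional domain data shifts predictions. The function names and both text samples are hypothetical.

```python
from collections import Counter, defaultdict

def train(text, counts=None):
    """Add word-pair counts from `text`; pass existing counts to continue training."""
    if counts is None:
        counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word)
    return followers.most_common(1)[0][0] if followers else None

general_text = "the patient person waited . the patient person smiled ."
medical_text = (
    "the patient presented with fever . "
    "the patient presented with cough . "
    "the patient presented with pain ."
)

model = train(general_text)                 # stage 1: general training
before = predict_next(model, "patient")
model = train(medical_text, counts=model)   # stage 2: continue on domain text
after = predict_next(model, "patient")

print(before, "->", after)  # person -> presented
```

After the second stage the model treats "patient" the way medical text does, while still retaining what it learned from the general text.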

Prompts and Prompt Engineering

Users interact with generative AI systems using prompts.

A prompt is the input or instruction given to the model.

Examples

“Write a summary of this article.”
“Generate an image of a beach at sunset.”
“Explain machine learning simply.”

Prompt engineering refers to designing prompts that produce better outputs.

Well-structured prompts often improve:

Accuracy
Clarity
Relevance
Consistency
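
One common way to structure a prompt is to assemble it from labeled parts. The sketch below is only an illustration: the labels ("Task", "Context", and so on) and the build_prompt helper are hypothetical conventions, not a required format for any particular model or service.

```python
def build_prompt(task, context="", output_format="", constraints=()):
    """Assemble a structured prompt from labeled parts, skipping empty ones."""
    parts = [f"Task: {task}"]
    if context:
        parts.append(f"Context: {context}")
    if output_format:
        parts.append(f"Output format: {output_format}")
    for rule in constraints:
        parts.append(f"Constraint: {rule}")
    return "\n".join(parts)

vague = build_prompt("Explain machine learning simply.")
specific = build_prompt(
    "Explain machine learning simply.",
    context="The reader is new to AI.",
    output_format="Three short bullet points",
    constraints=["Avoid jargon", "Use one everyday example"],
)
print(specific)
```

The second prompt tells the model who the audience is and what shape the answer should take, which tends to make responses more relevant and consistent than the bare instruction alone.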

Temperature and Randomness

Generative AI systems often include configuration settings such as temperature.

Temperature controls randomness in generated responses.

Temperature	Behavior
Low temperature	More focused and predictable responses
High temperature	More creative and varied responses

Example

A low temperature may be used for factual responses, while a higher temperature may be used for creative writing.
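
A minimal sketch of how temperature is commonly applied: the model's raw scores for candidate tokens are divided by the temperature before being converted to probabilities. The candidate tokens and scores below are made up for illustration, and the function name is hypothetical.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate tokens after "The sky is".
candidates = ["blue", "cloudy", "falling"]
scores = [3.0, 1.5, 0.5]

for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(scores, temperature)
    print(temperature, dict(zip(candidates, [round(p, 3) for p in probs])))
```

At the low temperature nearly all the probability goes to the top-scoring token, so output is predictable. At the high temperature the probabilities are closer together, so less likely tokens are chosen more often and output becomes more varied.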

Hallucinations

Generative AI models can sometimes produce incorrect or fabricated information, known as hallucinations.

Example

An AI chatbot may confidently provide false information or invent references.

Hallucinations occur because models generate likely patterns rather than verifying factual truth.

This is an important AI-901 exam concept.

Context Windows

A context window is the maximum amount of information, measured in tokens, that a generative AI model can process at one time.

The context window includes:

User prompts
Previous conversation history
Uploaded content
Instructions

Larger context windows allow models to handle longer conversations and larger documents.
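
When a conversation grows beyond the context window, an application has to decide what to leave out. The sketch below shows one possible strategy, keeping the instructions and the newest prompt and then as many recent messages as fit. It assumes a crude word-based token count and hypothetical function names; real systems count tokens with the model's own tokenizer and may use other strategies such as summarizing older messages.

```python
def count_tokens(text):
    """Rough token count: whitespace-separated words (a simplifying assumption)."""
    return len(text.split())

def fit_to_context(instructions, history, prompt, max_tokens):
    """Keep instructions and the new prompt, then the most recent messages that fit."""
    budget = max_tokens - count_tokens(instructions) - count_tokens(prompt)
    if budget < 0:
        raise ValueError("instructions and prompt alone exceed the context window")
    kept = []
    for message in reversed(history):  # walk from newest to oldest
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()                     # restore chronological order
    return [instructions] + kept + [prompt]

history = ["Hi there", "Hello how can I help", "My order is late", "Sorry to hear that"]
window = fit_to_context("Be polite", history, "Where is it now", max_tokens=14)
print(window)
# ['Be polite', 'My order is late', 'Sorry to hear that', 'Where is it now']
```

With a budget of 14 tokens, the two oldest messages no longer fit and are dropped, which is why a model with a small context window can appear to forget the start of a long conversation.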

Retrieval-Augmented Generation (RAG)

Some AI systems use Retrieval-Augmented Generation (RAG).

RAG combines:

A generative AI model
External knowledge retrieval

Instead of relying only on training data, the system retrieves current or domain-specific information and adds it to the prompt before the model generates a response.

Benefits

More accurate responses
Reduced hallucinations
Access to updated information
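
The retrieve-then-generate flow can be sketched as follows. This is a simplified illustration under stated assumptions: the knowledge base is made up, retrieval uses plain word overlap where production systems typically use a search index or vector similarity, and the final call to a generative model is left out. The function names are hypothetical.

```python
import re

# Made-up knowledge base standing in for company documents.
documents = [
    "Refunds are issued within 14 days of a return request.",
    "Standard shipping takes 3 to 5 business days.",
    "Support is available by chat from 9am to 5pm on weekdays.",
]

def words(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, docs, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = words(question)
    ranked = sorted(docs, key=lambda d: len(question_words & words(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, docs):
    """Place the retrieved text in the prompt so the model can ground its answer."""
    context = "\n".join(retrieve(question, docs))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

print(build_prompt("How long does shipping take?", documents))
```

The resulting prompt contains the shipping policy text, so the model can answer from current company information rather than from whatever it learned during training.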

Generative AI Modalities

Generative AI is not limited to text.

Different model types generate different content formats.

Model Type	Output
Text models	Articles, conversations, summaries
Image models	Pictures and artwork
Audio models	Speech and music
Video models	Video clips
Code models	Programming code

Responsible AI Considerations

Generative AI systems introduce Responsible AI concerns such as:

Bias
Hallucinations
Harmful content generation
Privacy risks
Copyright concerns
Security risks

Organizations should implement:

Human oversight
Content filtering
Monitoring
Transparency
Governance policies

Azure and Generative AI

Microsoft Azure AI Services and related Azure AI offerings provide tools for building and deploying generative AI applications.

Microsoft also provides Responsible AI guidance and safety controls for generative AI systems.

Real-World Example

Scenario: AI Customer Support Assistant

A company deploys a generative AI chatbot for customer support.

How It Works

Users enter prompts
The language model processes tokens
The transformer architecture analyzes context
The model predicts likely responses
The chatbot generates natural language answers

Additional Features

Fine-tuned on company documentation
Uses RAG to retrieve current policy information
Applies content filtering
Escalates uncertain cases to humans

This type of scenario aligns well with AI-901 exam questions.

Microsoft Responsible AI and Generative AI

Microsoft emphasizes Responsible AI practices for generative AI systems, including:

Fairness
Reliability and safety
Privacy and security
Inclusiveness
Transparency
Accountability

Generative AI systems should be designed responsibly and monitored carefully.

Important AI-901 Exam Tips

For the exam, remember these key points:

Generative AI creates new content rather than only classifying or predicting.
Large Language Models (LLMs) generate text by predicting tokens.
Tokens are small pieces of text processed by the model.
Transformers are the core architecture behind many modern generative AI systems.
Foundation models are large pretrained models adaptable to many tasks.
Fine-tuning customizes models for specific use cases.
Prompts guide model behavior.
Temperature controls response randomness.
Hallucinations are incorrect or fabricated outputs.
RAG combines retrieval systems with generative AI models.

Quick Knowledge Check

Question 1

What is the primary function of generative AI?

Answer

To create new content such as text, images, audio, or code.

Question 2

What is a token in a language model?

Answer

A small unit of text processed by the model.

Question 3

What does temperature control in generative AI?

Answer

The randomness and creativity of generated outputs.

Question 4

What is a hallucination in generative AI?

Answer

An incorrect or fabricated response generated by the model.

Practice Exam Questions

Question 1

What is the PRIMARY purpose of a generative AI model?

A. To classify data into categories only
B. To create new content based on learned patterns
C. To replace all human decision-making
D. To store database records

Correct Answer

B. To create new content based on learned patterns

Explanation

Generative AI models are designed to generate new content such as text, images, audio, code, or video using patterns learned from training data.

Why the Other Answers Are Incorrect

A. To classify data into categories only

Classification is more commonly associated with traditional predictive AI models.

C. To replace all human decision-making

AI should support, not fully replace, human decision-making.

D. To store database records

Databases store data but are not generative AI systems.

Question 2

How do Large Language Models (LLMs) primarily generate text?

A. By copying entire documents from the internet
B. By predicting the next likely token in a sequence
C. By manually selecting words from a dictionary
D. By using spreadsheet formulas

Correct Answer

B. By predicting the next likely token in a sequence

Explanation

LLMs generate text by predicting the most probable next token repeatedly until a full response is created.

Why the Other Answers Are Incorrect

A. By copying entire documents from the internet

LLMs generate responses based on learned patterns rather than simply copying content.

C. By manually selecting words from a dictionary

The process is automated using neural networks.

D. By using spreadsheet formulas

Spreadsheet formulas are unrelated to language generation.

Question 3

What is a token in a generative AI language model?

A. A hardware device used for training
B. A small unit of text processed by the model
C. A cloud storage container
D. A type of encryption key

Correct Answer

B. A small unit of text processed by the model

Explanation

Tokens are pieces of text such as words, parts of words, punctuation, or characters that language models process during training and generation.

Why the Other Answers Are Incorrect

A. A hardware device used for training

Tokens are not physical hardware.

C. A cloud storage container

Storage containers hold data and are unrelated to how a model processes text.

D. A type of encryption key

Encryption keys are used in security systems.

Question 4

Which neural network architecture powers many modern generative AI systems?

A. Decision trees
B. Transformers
C. Linear regression
D. Rule-based engines

Correct Answer

B. Transformers

Explanation

Transformers are the core architecture behind many modern generative AI systems because they handle context and sequential data effectively.

Why the Other Answers Are Incorrect

A. Decision trees

Decision trees are traditional machine learning models, not neural network architectures.

C. Linear regression

Linear regression is a statistical technique for predicting numeric values, not a neural network architecture.

D. Rule-based engines

Rule-based systems do not use transformer architectures.

Question 5

What is the purpose of fine-tuning a generative AI model?

A. To physically repair damaged hardware
B. To adapt a pretrained model for a specialized task or domain
C. To permanently disable model updates
D. To reduce network bandwidth usage

Correct Answer

B. To adapt a pretrained model for a specialized task or domain

Explanation

Fine-tuning customizes a pretrained foundation model using additional domain-specific data to improve performance for particular use cases.

Why the Other Answers Are Incorrect

A. To physically repair damaged hardware

Fine-tuning is a training process, not hardware maintenance.

C. To permanently disable model updates

Fine-tuning modifies model behavior rather than disabling updates.

D. To reduce network bandwidth usage

Bandwidth optimization is a networking concern and is unrelated to fine-tuning.

Question 6

What does the temperature setting control in many generative AI models?

A. The physical temperature of the server hardware
B. The randomness and creativity of generated responses
C. The amount of training data stored
D. The encryption strength of the model

Correct Answer

B. The randomness and creativity of generated responses

Explanation

Higher temperature values generally produce more creative and varied responses, while lower values produce more predictable outputs.

Why the Other Answers Are Incorrect

A. The physical temperature of the server hardware

Temperature is a model configuration setting, not a hardware measurement.

C. The amount of training data stored

Temperature does not affect stored data size.

D. The encryption strength of the model

Temperature is unrelated to encryption.

Question 7

What is a hallucination in generative AI?

A. A hardware malfunction during training
B. A correct response with high confidence
C. An incorrect or fabricated output generated by the model
D. A type of data encryption

Correct Answer

C. An incorrect or fabricated output generated by the model

Explanation

Hallucinations occur when a generative AI model produces false or misleading information that appears convincing.

Why the Other Answers Are Incorrect

A. A hardware malfunction during training

Hallucinations are output issues, not hardware failures.

B. A correct response with high confidence

Hallucinations are inaccurate responses.

D. A type of data encryption

Hallucinations are unrelated to encryption.

Question 8

What is the PRIMARY purpose of a prompt in generative AI?

A. To physically start a computer server
B. To provide instructions or input to guide model output
C. To encrypt training data
D. To replace model training

Correct Answer

B. To provide instructions or input to guide model output

Explanation

Prompts tell the model what task to perform or what type of response to generate.

Why the Other Answers Are Incorrect

A. To physically start a computer server

Prompts are text inputs, not hardware controls.

C. To encrypt training data

Prompts are unrelated to encryption.

D. To replace model training

Prompts guide trained models but do not replace training.

Question 9

What is Retrieval-Augmented Generation (RAG)?

A. A hardware acceleration technique
B. A method that combines generative AI with external information retrieval
C. A database backup process
D. A data compression algorithm

Correct Answer

B. A method that combines generative AI with external information retrieval

Explanation

RAG improves AI responses by retrieving relevant external information before generating outputs.

Why the Other Answers Are Incorrect

A. A hardware acceleration technique

RAG is not a hardware feature.

C. A database backup process

RAG is unrelated to backups.

D. A data compression algorithm

Compression reduces data size and is unrelated to RAG.

Question 10

Which statement BEST describes a foundation model?

A. A small model designed for a single narrow task
B. A large pretrained model adaptable to many AI tasks
C. A hardware device used for AI training
D. A database management system

Correct Answer

B. A large pretrained model adaptable to many AI tasks

Explanation

Foundation models are large AI models trained on massive datasets that can be adapted for many applications, including chatbots, summarization, and image generation.

Why the Other Answers Are Incorrect

A. A small model designed for a single narrow task

Foundation models are broad and highly adaptable.

C. A hardware device used for AI training

Foundation models are software models, not hardware.

D. A database management system

Databases manage data but are not AI models.

Final Thoughts

Generative AI is a major area of modern artificial intelligence and an important topic for the AI-901 certification exam. Microsoft expects candidates to understand the foundational concepts behind how generative AI models work, including tokens, transformers, prompts, training, and model behavior.

Understanding these concepts provides a strong foundation for working with modern AI systems and Azure AI technologies.
