This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub.
This topic falls under these sections:
Identify AI concepts and capabilities (40–45%)
--> Identify AI model components and configurations
--> Describe how generative AI models work
Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.
Generative AI is one of the most important and rapidly growing areas of artificial intelligence and is a major topic for the AI-901 certification exam. Microsoft includes generative AI concepts within the “Identify AI model components and configurations” section of the exam objectives.
Understanding how generative AI models work means understanding how AI systems can create new content such as text, images, audio, code, and video based on patterns learned from large datasets.
What Is Generative AI?
Generative AI refers to AI systems that can generate new content based on patterns learned from training data.
Unlike traditional AI systems that primarily classify or predict, generative AI creates original outputs.
Examples of Generated Content
- Text
- Images
- Music
- Speech
- Code
- Video
Example Applications
- AI chatbots
- Image generators
- Code assistants
- Content summarization
- Translation systems
- Virtual assistants
How Generative AI Differs from Traditional AI
| Traditional AI | Generative AI |
|---|---|
| Classifies or predicts | Creates new content |
| Detects spam emails | Writes emails |
| Identifies objects in images | Generates images |
| Predicts sales trends | Creates reports or summaries |
Traditional AI often answers questions like:
- “What category does this belong to?”
- “What will likely happen next?”
Generative AI responds to requests like:
- “Create something new.”
- “Generate content based on this prompt.”
Foundation Models
Many generative AI systems are built using foundation models.
A foundation model is a very large AI model trained on massive amounts of data that can be adapted for many tasks.
Foundation models learn general patterns in:
- Language
- Images
- Audio
- Code
- Knowledge relationships
These models can then be specialized or prompted for different use cases.
Large Language Models (LLMs)
Large Language Models (LLMs) are a type of generative AI model focused on understanding and generating human language.
Common uses of LLMs include:
- Chatbots
- Writing assistants
- Summarization
- Translation
- Question answering
- Code generation
LLMs are trained using enormous collections of text data from books, articles, websites, and other sources.
How Large Language Models Work
At a high level, LLMs work by predicting a likely next token (a word or part of a word) based on the text that came before it.
Example
If the model sees:
“The sky is…”
It may predict:
“blue”
By repeatedly predicting the next token, the model can generate sentences, paragraphs, and conversations.
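The loop below is a minimal sketch of that repeated prediction, not how a real LLM is implemented. It assumes a small hand-written table, `NEXT_TOKEN_PROBS` (hypothetical), in place of a trained neural network, looks only at the previous token rather than the full context, and always picks the single most likely candidate.

```python
# Toy next-token predictor. The probability table is hand-written for
# illustration; a real LLM computes these probabilities with a neural network.
NEXT_TOKEN_PROBS = {
    "The": {"sky": 0.6, "sea": 0.4},
    "sky": {"is": 0.9, "was": 0.1},
    "is": {"blue": 0.7, "grey": 0.3},
    "blue": {"<end>": 1.0},
}


def predict_next(token):
    """Return the most likely next token after `token` (greedy choice)."""
    candidates = NEXT_TOKEN_PROBS[token]
    return max(candidates, key=candidates.get)


def generate(start, max_tokens=10):
    """Repeatedly predict the next token until <end> or the length limit."""
    tokens = [start]
    while len(tokens) < max_tokens:
        next_token = predict_next(tokens[-1])
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens


print(" ".join(generate("The")))  # The sky is blue
```

Each pass through the loop adds one token, which is the same pattern a chatbot follows when it builds a reply piece by piece.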
Tokens in Generative AI
Generative AI models process information as tokens.
Tokens are small units of text, which may represent:
- Words
- Parts of words
- Characters
- Punctuation
Example
The sentence:
“AI is powerful”
might be broken into tokens such as:
- “AI”
- “is”
- “powerful”
The model predicts tokens one at a time to generate output.
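As a rough illustration of tokenization, the sketch below splits text into word and punctuation tokens and then maps each token to a number. This is a simplification: production tokenizers typically use subword schemes and may split a word such as "powerful" into several pieces. The function name `toy_tokenize` and the on-the-fly vocabulary are assumptions made for this example.

```python
import re


def toy_tokenize(text):
    """Split text into word and punctuation tokens (illustrative only)."""
    # \w+ matches runs of letters/digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)


tokens = toy_tokenize("AI is powerful!")
print(tokens)  # ['AI', 'is', 'powerful', '!']

# Models work with integer IDs rather than raw strings, so each distinct
# token is mapped to a number. This vocabulary is built on the fly here.
vocab = {token: idx for idx, token in enumerate(sorted(set(tokens)))}
print([vocab[t] for t in tokens])  # [1, 2, 3, 0]
```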
Neural Networks and Deep Learning
Generative AI models are built using deep learning neural networks.
Neural networks are systems inspired by the structure of the human brain.
These networks contain many layers that learn patterns from data.
Generative AI models often contain millions, billions, or even trillions of parameters.
Parameters are internal values learned during training that help the model recognize relationships and patterns.
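To make the idea of a parameter count concrete, here is a small sketch that counts the weights and biases in a simple fully connected network. The layer sizes are made up for illustration, and real generative models use more complex layers, but the principle that every connection contributes a learned value is the same.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of units per layer, input first.
    """
    total = 0
    for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
        weights = inputs * outputs  # one weight per input-output connection
        biases = outputs            # one bias per output unit
        total += weights + biases
    return total


# A hypothetical tiny network: 4 inputs, 8 hidden units, 2 outputs.
print(count_parameters([4, 8, 2]))  # 58
```

Scaling the layer sizes up by a few orders of magnitude is how parameter counts reach the millions and billions.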
Transformers
Most modern generative AI systems use a neural network architecture called the Transformer.
Transformers are highly effective for processing sequences such as language.
Transformers help models:
- Understand context
- Recognize relationships between words
- Handle long passages of text
- Generate coherent responses
The Transformer architecture is a foundational technology behind many modern AI systems.
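The way a Transformer relates words to one another is usually described as attention: each token scores every other token and blends their information according to those scores. The sketch below shows a bare-bones version of scaled dot-product attention in plain Python. It is an assumption-heavy simplification: the token vectors are made up, and real Transformers apply learned projections, multiple attention heads, and many stacked layers.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Score how strongly this token should attend to every token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        # Blend the value vectors using those attention weights.
        blended = [sum(w * v[i] for w, v in zip(weights, values))
                   for i in range(len(values[0]))]
        outputs.append(blended)
    return outputs


# Three made-up 2-dimensional token vectors, used as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in attention(tokens, tokens, tokens):
    print([round(x, 3) for x in row])
```

Each output row is a weighted mix of all three input vectors, which is how a token's representation comes to reflect its context.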
Training Generative AI Models
Training a generative AI model involves exposing it to massive datasets.
During training, the model learns patterns and relationships by repeatedly predicting missing or next tokens.
Simplified Training Process
- Provide training data
- Hide or predict portions of the data
- Compare predictions to actual results
- Adjust model parameters
- Repeat many times
This process may require enormous computing power and specialized hardware such as GPUs.
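The same predict, compare, adjust, repeat cycle can be shown at toy scale. The sketch below trains a model with a single parameter on three made-up data points. It is only an analogy for the process above: a generative model adjusts billions of parameters and predicts tokens rather than numbers.

```python
# Toy training loop: learn a single parameter w so that w * x matches y.
# The data is made up and follows y = 2 * x.
training_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]


def train(data, learning_rate=0.05, epochs=200):
    w = 0.0  # the model's only parameter, starting from an arbitrary value
    for _ in range(epochs):                   # repeat many times
        for x, actual in data:                # provide training data
            prediction = w * x                # predict
            error = prediction - actual       # compare to the actual result
            w -= learning_rate * error * x    # adjust the parameter
    return w


print(round(train(training_data), 3))  # 2.0
```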
Pretraining and Fine-Tuning
Generative AI training often occurs in two stages.
Pretraining
The model learns general knowledge and patterns from very large datasets.
Example
An LLM may learn grammar, facts, reasoning patterns, and language structure from internet-scale text data.
Fine-Tuning
The pretrained model is then adapted for specific tasks or domains.
Example
A healthcare chatbot may be fine-tuned using medical terminology and healthcare conversations.
Fine-tuning improves performance for specialized use cases.
Prompts and Prompt Engineering
Users interact with generative AI systems using prompts.
A prompt is the input or instruction given to the model.
Examples
- “Write a summary of this article.”
- “Generate an image of a beach at sunset.”
- “Explain machine learning simply.”
Prompt engineering refers to designing prompts that produce better outputs.
Well-structured prompts often improve:
- Accuracy
- Clarity
- Relevance
- Consistency
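One common way to structure a prompt is to state the task, the audience, and the desired output format explicitly. The helper below is a hypothetical sketch of that idea; the function name and field labels are assumptions for illustration, not part of any Azure API.

```python
def build_structured_prompt(task, audience, output_format):
    """Assemble a prompt with an explicit task, audience, and format."""
    return (
        f"Task: {task}\n"
        f"Audience: {audience}\n"
        f"Format: {output_format}"
    )


vague_prompt = "Explain machine learning."
structured_prompt = build_structured_prompt(
    task="Explain machine learning simply.",
    audience="Someone with no technical background",
    output_format="Three short bullet points",
)
print(structured_prompt)
```

Compared with the vague prompt, the structured version tells the model who the answer is for and what shape it should take.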
Temperature and Randomness
Generative AI systems often include configuration settings such as temperature.
Temperature controls randomness in generated responses.
| Temperature | Behavior |
|---|---|
| Low temperature | More focused and predictable responses |
| High temperature | More creative and varied responses |
Example
A low temperature may be used for factual responses, while a higher temperature may be used for creative writing.
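A common way to apply temperature is to divide the model's raw token scores by the temperature before converting them to probabilities. The sketch below assumes made-up scores for three candidate tokens and requires a temperature greater than zero.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw token scores into probabilities, scaled by temperature."""
    # temperature must be greater than 0 in this sketch
    scaled = [score / temperature for score in logits]
    highest = max(scaled)
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate next tokens.
logits = {"blue": 2.0, "grey": 1.0, "falling": 0.1}

for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(list(logits.values()), temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At the lowest temperature almost all of the probability lands on "blue", so the output is predictable. At the highest temperature the probabilities are closer together, so less likely tokens are chosen more often.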
Hallucinations
Generative AI models can sometimes produce incorrect or fabricated information called hallucinations.
Example
An AI chatbot may confidently provide false information or invent references.
Hallucinations occur because models generate statistically likely text rather than checking whether a statement is true.
This is an important AI-901 exam concept.
Context Windows
A context window is the maximum amount of information, measured in tokens, that a generative AI model can process at one time.
The context window includes:
- User prompts
- Previous conversation history
- Uploaded content
- Instructions
Larger context windows allow models to handle longer conversations and larger documents.
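When a conversation grows beyond the context window, applications often drop the oldest messages. The sketch below shows one simple way to do that. It approximates token counts by counting whitespace-separated words, which is an assumption for illustration; real systems count tokens with the model's own tokenizer.

```python
def fit_to_context_window(messages, max_tokens):
    """Keep the most recent messages that fit within max_tokens.

    Token counts are approximated by whitespace-separated words, which is
    an assumption for this sketch; real systems count model tokens.
    """
    kept = []
    used = 0
    for message in reversed(messages):      # newest first
        cost = len(message.split())
        if used + cost > max_tokens:
            break                           # older messages no longer fit
        kept.append(message)
        used += cost
    return list(reversed(kept))             # restore chronological order


history = [
    "Hello, I need help with my order.",
    "Sure, what is the order number?",
    "It is 12345.",
    "Thanks, let me check that for you.",
]
print(fit_to_context_window(history, max_tokens=12))
# ['It is 12345.', 'Thanks, let me check that for you.']
```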
Retrieval-Augmented Generation (RAG)
Some AI systems use Retrieval-Augmented Generation (RAG).
RAG combines:
- A generative AI model
- External knowledge retrieval
Instead of relying only on training data, the system retrieves current or domain-specific information and supplies it to the model before the model generates a response.
Benefits
- More accurate responses
- Reduced hallucinations
- Access to updated information
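The retrieve-then-generate flow can be sketched in a few lines. This example assumes a hypothetical in-memory list of documents and uses simple word overlap to pick the most relevant one; real RAG systems usually rely on a search index or vector similarity, and the resulting prompt would then be sent to a generative model.

```python
import re

# Hypothetical knowledge base; in practice this would be a search index.
DOCUMENTS = [
    "Refunds are processed within 5 business days.",
    "Standard shipping takes 3 to 7 days.",
    "Support is available Monday to Friday.",
]


def words(text):
    """Lowercase a string and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question, documents):
    """Return the document sharing the most words with the question."""
    return max(documents, key=lambda doc: len(words(doc) & words(question)))


def build_prompt(question, documents):
    """Place the retrieved text in the prompt so the model can ground its answer."""
    context = retrieve(question, documents)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


print(build_prompt("How long do refunds take?", DOCUMENTS))
```

Because the retrieved text travels with the question, the model can answer from current information it was never trained on.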
Generative AI Modalities
Generative AI is not limited to text.
Different model types generate different content formats.
| Model Type | Output |
|---|---|
| Text models | Articles, conversations, summaries |
| Image models | Pictures and artwork |
| Audio models | Speech and music |
| Video models | Video clips |
| Code models | Programming code |
Responsible AI Considerations
Generative AI systems introduce Responsible AI concerns such as:
- Bias
- Hallucinations
- Harmful content generation
- Privacy risks
- Copyright concerns
- Security risks
Organizations should implement:
- Human oversight
- Content filtering
- Monitoring
- Transparency
- Governance policies
Azure and Generative AI
Microsoft Azure AI Services and related Azure AI offerings provide tools for building and deploying generative AI applications.
Microsoft also provides Responsible AI guidance and safety controls for generative AI systems.
Real-World Example
Scenario: AI Customer Support Assistant
A company deploys a generative AI chatbot for customer support.
How It Works
- Users enter prompts
- The language model processes tokens
- The transformer architecture analyzes context
- The model predicts likely responses
- The chatbot generates natural language answers
Additional Features
- Fine-tuned on company documentation
- Uses RAG to retrieve current policy information
- Applies content filtering
- Escalates uncertain cases to humans
This type of scenario aligns well with AI-901 exam questions.
Microsoft Responsible AI and Generative AI
Microsoft emphasizes Responsible AI practices for generative AI systems, including:
- Fairness
- Reliability and safety
- Privacy and security
- Inclusiveness
- Transparency
- Accountability
Generative AI systems should be designed responsibly and monitored carefully.
Important AI-901 Exam Tips
For the exam, remember these key points:
- Generative AI creates new content rather than only classifying or predicting.
- Large Language Models (LLMs) generate text by predicting tokens.
- Tokens are small pieces of text processed by the model.
- Transformers are the core architecture behind many modern generative AI systems.
- Foundation models are large pretrained models adaptable to many tasks.
- Fine-tuning customizes models for specific use cases.
- Prompts guide model behavior.
- Temperature controls response randomness.
- Hallucinations are incorrect or fabricated outputs.
- RAG combines retrieval systems with generative AI models.
Quick Knowledge Check
Question 1
What is the primary function of generative AI?
Answer
To create new content such as text, images, audio, or code.
Question 2
What is a token in a language model?
Answer
A small unit of text processed by the model.
Question 3
What does temperature control in generative AI?
Answer
The randomness and creativity of generated outputs.
Question 4
What is a hallucination in generative AI?
Answer
An incorrect or fabricated response generated by the model.
Practice Exam Questions
Question 1
What is the PRIMARY purpose of a generative AI model?
A. To classify data into categories only
B. To create new content based on learned patterns
C. To replace all human decision-making
D. To store database records
Correct Answer
B. To create new content based on learned patterns
Explanation
Generative AI models are designed to generate new content such as text, images, audio, code, or video using patterns learned from training data.
Why the Other Answers Are Incorrect
A. To classify data into categories only
Classification is more commonly associated with traditional predictive AI models.
C. To replace all human decision-making
AI should support, not fully replace, human decision-making.
D. To store database records
Databases store data but are not generative AI systems.
Question 2
How do Large Language Models (LLMs) primarily generate text?
A. By copying entire documents from the internet
B. By predicting the next likely token in a sequence
C. By manually selecting words from a dictionary
D. By using spreadsheet formulas
Correct Answer
B. By predicting the next likely token in a sequence
Explanation
LLMs generate text by predicting the most probable next token repeatedly until a full response is created.
Why the Other Answers Are Incorrect
A. By copying entire documents from the internet
LLMs generate responses based on learned patterns rather than simply copying content.
C. By manually selecting words from a dictionary
The process is automated using neural networks.
D. By using spreadsheet formulas
Spreadsheet formulas are unrelated to language generation.
Question 3
What is a token in a generative AI language model?
A. A hardware device used for training
B. A small unit of text processed by the model
C. A cloud storage container
D. A type of encryption key
Correct Answer
B. A small unit of text processed by the model
Explanation
Tokens are pieces of text such as words, parts of words, punctuation, or characters that language models process during training and generation.
Why the Other Answers Are Incorrect
A. A hardware device used for training
Tokens are not physical hardware.
C. A cloud storage container
Storage containers are unrelated.
D. A type of encryption key
Encryption keys are used in security systems.
Question 4
Which neural network architecture powers many modern generative AI systems?
A. Decision trees
B. Transformers
C. Linear regression
D. Rule-based engines
Correct Answer
B. Transformers
Explanation
Transformers are the core architecture behind many modern generative AI systems because they handle context and sequential data effectively.
Why the Other Answers Are Incorrect
A. Decision trees
Decision trees are traditional machine learning models, not neural network architectures.
C. Linear regression
Linear regression is used for predicting numeric values.
D. Rule-based engines
Rule-based engines follow hand-written rules and are not neural network architectures.
Question 5
What is the purpose of fine-tuning a generative AI model?
A. To physically repair damaged hardware
B. To adapt a pretrained model for a specialized task or domain
C. To permanently disable model updates
D. To reduce network bandwidth usage
Correct Answer
B. To adapt a pretrained model for a specialized task or domain
Explanation
Fine-tuning customizes a pretrained foundation model using additional domain-specific data to improve performance for particular use cases.
Why the Other Answers Are Incorrect
A. To physically repair damaged hardware
Fine-tuning is a training process, not hardware maintenance.
C. To permanently disable model updates
Fine-tuning modifies model behavior rather than disabling updates.
D. To reduce network bandwidth usage
Bandwidth optimization is unrelated.
Question 6
What does the temperature setting control in many generative AI models?
A. The physical temperature of the server hardware
B. The randomness and creativity of generated responses
C. The amount of training data stored
D. The encryption strength of the model
Correct Answer
B. The randomness and creativity of generated responses
Explanation
Higher temperature values generally produce more creative and varied responses, while lower values produce more predictable outputs.
Why the Other Answers Are Incorrect
A. The physical temperature of the server hardware
Temperature is a model configuration setting, not a hardware measurement.
C. The amount of training data stored
Temperature does not affect stored data size.
D. The encryption strength of the model
Temperature is unrelated to encryption.
Question 7
What is a hallucination in generative AI?
A. A hardware malfunction during training
B. A correct response with high confidence
C. An incorrect or fabricated output generated by the model
D. A type of data encryption
Correct Answer
C. An incorrect or fabricated output generated by the model
Explanation
Hallucinations occur when a generative AI model produces false or misleading information that appears convincing.
Why the Other Answers Are Incorrect
A. A hardware malfunction during training
Hallucinations are output issues, not hardware failures.
B. A correct response with high confidence
Hallucinations are inaccurate responses.
D. A type of data encryption
Hallucinations are unrelated to encryption.
Question 8
What is the PRIMARY purpose of a prompt in generative AI?
A. To physically start a computer server
B. To provide instructions or input to guide model output
C. To encrypt training data
D. To replace model training
Correct Answer
B. To provide instructions or input to guide model output
Explanation
Prompts tell the model what task to perform or what type of response to generate.
Why the Other Answers Are Incorrect
A. To physically start a computer server
Prompts are text inputs, not hardware controls.
C. To encrypt training data
Prompts are unrelated to encryption.
D. To replace model training
Prompts guide trained models but do not replace training.
Question 9
What is Retrieval-Augmented Generation (RAG)?
A. A hardware acceleration technique
B. A method that combines generative AI with external information retrieval
C. A database backup process
D. A data compression algorithm
Correct Answer
B. A method that combines generative AI with external information retrieval
Explanation
RAG improves AI responses by retrieving relevant external information before generating outputs.
Why the Other Answers Are Incorrect
A. A hardware acceleration technique
RAG is not a hardware feature.
C. A database backup process
RAG is unrelated to backups.
D. A data compression algorithm
Compression is unrelated.
Question 10
Which statement BEST describes a foundation model?
A. A small model designed for a single narrow task
B. A large pretrained model adaptable to many AI tasks
C. A hardware device used for AI training
D. A database management system
Correct Answer
B. A large pretrained model adaptable to many AI tasks
Explanation
Foundation models are large AI models trained on massive datasets that can be adapted for many applications, including chatbots, summarization, and image generation.
Why the Other Answers Are Incorrect
A. A small model designed for a single narrow task
Foundation models are broad and highly adaptable.
C. A hardware device used for AI training
Foundation models are software models, not hardware.
D. A database management system
Databases manage data but are not AI models.
Final Thoughts
Generative AI is a major area of modern artificial intelligence and an important topic for the AI-901 certification exam. Microsoft expects candidates to understand the foundational concepts behind how generative AI models work, including tokens, transformers, prompts, training, and model behavior.
Understanding these concepts provides a strong foundation for working with modern AI systems and Azure AI technologies.
