Tag: AI-103 Exam Prep

Agentic AI, AI, AI-103, Artificial Intelligence (AI), Azure AI, Generative AI, Microsoft Certification May 25, 2026

Integrate agent tools, including APIs, knowledge stores, search, Content Understanding, and custom functions (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
   --> Build agents by using Foundry
      --> Integrate agent tools, including APIs, knowledge stores, search, Content Understanding, and custom functions

Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Modern AI agents are capable of far more than generating text.

Enterprise AI agents can:

Access business systems
Retrieve enterprise knowledge
Search documents
Understand multimodal content
Execute workflows
Interact with APIs
Use custom functions

These capabilities are possible because modern agentic systems integrate external tools.

Azure AI Foundry provides orchestration and integration capabilities for building tool-augmented AI agents.

For the AI-103: Develop AI Apps and Agents on Azure certification exam, understanding how agents integrate with:

APIs
Knowledge stores
Search systems
Content understanding services
Custom functions

is a major exam objective.

What Are Agent Tools?

Agent tools are external capabilities that agents can invoke to:

Retrieve information
Perform actions
Execute workflows
Interact with systems

Why Tool Integration Matters

LLMs alone cannot:

Access real-time business data
Execute transactions
Query live systems
Retrieve private enterprise information

Tool integration enables these capabilities.

Types of Agent Tools

Common agent tools include:

APIs
Databases
Search services
Vector stores
Content understanding systems
Workflow engines
Custom functions
External applications

Tool-Augmented Agents

Tool-augmented agents combine:

Language reasoning
Retrieval systems
External actions
Workflow orchestration

APIs in Agent Systems

APIs are among the most common tools used by AI agents.

APIs allow agents to:

Retrieve data
Update systems
Trigger workflows
Access cloud services

Common API Integration Scenarios

Examples include:

CRM systems
ERP systems
Ticketing systems
Email services
Calendar systems
Inventory systems
Financial platforms

REST APIs

Many agent integrations use REST APIs.

REST APIs commonly support:

GET operations
POST operations
PUT operations
DELETE operations

API Authentication

Agent systems may authenticate using:

API keys
OAuth tokens
Managed identities
Microsoft Entra ID

Managed Identity Integration

Managed identities allow applications to:

Authenticate securely
Avoid storing secrets
Access Azure resources safely

Function-Calling

Function-calling allows models to:

Invoke tools dynamically
Generate structured requests
Execute external operations

Tool Schemas

Tool schemas define:

Tool names
Input parameters
Data types
Required fields
Expected outputs

Structured Tool Invocation

Structured invocation improves:

Reliability
Validation
Automation
Predictability

Knowledge Stores

Knowledge stores provide persistent enterprise information for retrieval.

Knowledge stores may contain:

Documents
Policies
Product manuals
Research data
Historical records

Why Knowledge Stores Matter

Knowledge stores allow agents to:

Access enterprise-specific information
Ground responses
Improve factual accuracy

Knowledge Sources

Agents may connect to:

Azure AI Search
SharePoint
SQL databases
Blob storage
Cosmos DB
Data Lake storage
Vector databases

Retrieval-Augmented Generation (RAG)

RAG combines:

Retrieval systems
Generative models

Retrieved data is added to prompts to improve grounded responses.

Search Systems in Agent Architectures

Search systems allow agents to:

Retrieve relevant content
Find documents
Search enterprise knowledge
Improve response quality

Azure AI Search

Azure AI Search is commonly used for:

Keyword search
Vector search
Hybrid search
Semantic ranking

Semantic Search

Semantic search focuses on:

Meaning
Context
Intent

rather than exact keyword matches.

Vector Search

Vector search uses embeddings to:

Identify semantic similarity
Retrieve related content
Improve retrieval quality

Hybrid Search

Hybrid search combines:

Keyword search
Vector search

This improves search relevance.

Embeddings

Embeddings are vector representations of data.

Embeddings support:

Semantic retrieval
Similarity comparison
Vector indexing

Retrieval Pipelines

Retrieval pipelines commonly include:

Data ingestion
Chunking
Embedding generation
Indexing
Retrieval
Reranking

Grounded Responses

Grounded responses are generated using retrieved evidence.

Grounding improves:

Accuracy
Explainability
Trustworthiness

Content Understanding

Content understanding systems allow agents to analyze:

Images
Documents
Audio
Video
Forms
Structured and unstructured content

Multimodal Processing

Multimodal systems process multiple content types simultaneously.

Examples include:

Text + images
Text + audio
Documents + tables

Azure AI Content Understanding Capabilities

Agents may integrate with services for:

OCR
Image analysis
Speech recognition
Document intelligence
Form extraction
Video analysis

OCR Integration

Optical Character Recognition (OCR) extracts text from:

Images
PDFs
Scanned documents

Document Intelligence

Document intelligence systems can extract:

Key-value pairs
Tables
Forms
Structured business data

Image Understanding

Agents may analyze images for:

Object detection
Caption generation
Classification
Scene understanding

Speech Integration

Speech systems enable:

Speech-to-text
Text-to-speech
Voice assistants
Audio analysis

Custom Functions

Custom functions extend agent capabilities beyond built-in tools.

Custom functions may:

Execute business logic
Integrate proprietary systems
Trigger workflows
Process specialized data

Examples of Custom Functions

Examples include:

Risk scoring
Inventory forecasting
Pricing calculations
Compliance validation
Workflow automation

Designing Custom Functions

Good custom functions should:

Be narrowly scoped
Use structured parameters
Return predictable outputs
Support validation

Error Handling for Tools

Agent systems should handle:

API failures
Timeouts
Invalid responses
Authentication errors
Missing data

Retry Logic

Retry mechanisms improve resilience when:

APIs temporarily fail
Services throttle requests
Network issues occur

Tool Selection Logic

Agents may decide:

Whether a tool is needed
Which tool to invoke
When to retrieve information
How to sequence actions

Multi-Tool Orchestration

Advanced agents may coordinate:

Search systems
APIs
Memory systems
Custom functions
Workflow engines

Workflow Coordination

Agent workflows may include:

Retrieve enterprise data
Analyze content
Call APIs
Generate summaries
Execute actions

Conversation Memory Integration

Agents may combine tools with:

Short-term memory
Long-term memory
Context tracking
Session persistence

Security Considerations

Secure tool integration requires:

Authentication
Authorization
RBAC
Managed identities
Secret management
Network controls

Least Privilege Principle

Agents should receive:

Minimal required permissions
Restricted tool access
Scoped credentials

Monitoring Tool Usage

Organizations should monitor:

Tool invocation frequency
API failures
Unauthorized actions
Retrieval quality
Workflow success rates

Logging and Auditing

Logs may capture:

Tool calls
API requests
Workflow execution
Retrieved sources
User interactions

Responsible AI Considerations

Organizations should implement:

Safety filters
Guardrails
Human oversight
Approval workflows
Content moderation

Human-in-the-Loop Workflows

Sensitive operations may require:

Human review
Approval checkpoints
Escalation processes

Performance Optimization

Optimization strategies include:

Caching
Query optimization
Efficient chunking
Parallel tool execution
Response streaming

Real-World Scenario

Scenario: Enterprise Legal Assistant

Requirements:

Search legal documents
Retrieve contract clauses
Analyze uploaded PDFs
Query compliance systems
Generate summaries

Recommended Design:

Azure AI Search for retrieval
OCR and document intelligence
Function-calling for compliance APIs
Conversation memory for continuity
Approval workflows for legal actions

Common AI-103 Exam Tips

Understand Tool Integration

Know:

APIs
Function-calling
Tool schemas
Tool orchestration

Learn Retrieval Concepts

Understand:

RAG
Vector search
Embeddings
Hybrid search
Grounding

Understand Content Understanding

Know:

OCR
Document intelligence
Image analysis
Speech services
Multimodal processing

Learn Security Concepts

Understand:

Managed identities
RBAC
Least privilege
Authentication methods

Summary

Modern AI agents integrate:

APIs
Search systems
Knowledge stores
Content understanding services
Custom functions
Workflow orchestration

For the AI-103 exam, you should understand:

Tool integration
Function-calling
Tool schemas
Retrieval systems
Azure AI Search
Embeddings
Grounding
OCR and document intelligence
Multimodal processing
Custom business functions
Workflow orchestration
Monitoring and governance

These capabilities are foundational for enterprise AI agent systems built with Azure AI Foundry.

Practice Exam Questions

Question 1

Why do AI agents integrate external tools?

A. To eliminate workflows
B. To access live systems and execute actions
C. To remove retrieval systems
D. To disable APIs

Answer

B. To access live systems and execute actions

Explanation

External tools allow agents to retrieve data and perform operations.

Question 2

What is the purpose of function-calling?

A. Replace search systems
B. Allow models to invoke external tools dynamically
C. Remove authentication requirements
D. Eliminate embeddings

Answer

B. Allow models to invoke external tools dynamically

Explanation

Function-calling enables structured interaction with external systems.

Question 3

What information is typically defined in a tool schema?

A. GPU temperatures
B. Input parameters and expected outputs
C. Firewall rules only
D. VM configurations only

Answer

B. Input parameters and expected outputs

Explanation

Tool schemas standardize tool interactions.

Question 4

Which Azure service is commonly used for vector and hybrid search?

A. Azure Virtual WAN
B. Azure AI Search
C. Azure Batch
D. Azure Policy

Answer

B. Azure AI Search

Explanation

Azure AI Search supports semantic, vector, and hybrid search.

Question 5

What is the purpose of embeddings?

A. Replace APIs entirely
B. Represent data semantically for similarity comparison
C. Eliminate vector indexes
D. Remove retrieval systems

Answer

B. Represent data semantically for similarity comparison

Explanation

Embeddings support semantic retrieval.

Question 6

What is a key benefit of grounded responses?

A. Reduced monitoring needs
B. Improved factual accuracy and trustworthiness
C. Elimination of search systems
D. Removal of citations

Answer

B. Improved factual accuracy and trustworthiness

Explanation

Grounded systems use retrieved evidence to improve reliability.

Question 7

Which capability extracts text from scanned documents?

A. Vector indexing
B. OCR
C. Hybrid search
D. Tokenization

Answer

B. OCR

Explanation

OCR extracts text from images and scanned files.

Question 8

Why are managed identities important in agent systems?

A. They increase hallucinations
B. They allow secure authentication without stored secrets
C. They eliminate RBAC
D. They disable APIs

Answer

B. They allow secure authentication without stored secrets

Explanation

Managed identities improve security and credential management.

Question 9

What is an example of a custom function?

A. A GPU driver update
B. A proprietary pricing calculation workflow
C. A firewall appliance
D. A VM snapshot

Answer

B. A proprietary pricing calculation workflow

Explanation

Custom functions implement specialized business logic.

Question 10

What should organizations monitor in tool-augmented agents?

A. Only CPU temperatures
B. Tool usage, API failures, retrieval quality, and workflow success
C. Only vector dimensions
D. Only prompt length

Answer

B. Tool usage, API failures, retrieval quality, and workflow success

Explanation

Monitoring improves reliability, governance, and operational visibility.

Go to the AI-103 Exam Prep Hub main page

Agentic AI, AI, AI-103, Azure AI, Microsoft Certification May 25, 2026

Build agents that integrate retrieval, function-calling, and conversation memory (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
   --> Build agents by using Foundry
      --> Build agents that integrate retrieval, function-calling, and conversation memory

Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Modern AI agents are far more capable than traditional chatbots.

Today’s enterprise AI agents can:

Retrieve enterprise knowledge
Call APIs and tools
Maintain memory across conversations
Perform multistep workflows
Coordinate reasoning and actions

Azure AI Foundry provides the infrastructure and orchestration capabilities needed to build these advanced agentic systems.

For the AI-103: Develop AI Apps and Agents on Azure certification exam, understanding how to build agents that integrate:

Retrieval
Function-calling
Conversation memory

is extremely important.

These capabilities are foundational to enterprise generative AI systems.

What Is an AI Agent?

An AI agent is an AI-powered system capable of:

Understanding goals
Maintaining context
Using tools
Retrieving information
Performing actions
Adapting to new inputs

Agents extend beyond simple prompt-response interactions.

Core Components of Modern Agents

Modern agents commonly include:

Large language models (LLMs)
Retrieval systems
Tool integrations
Function-calling frameworks
Memory systems
Workflow orchestration
Safety controls

Retrieval in Agent Systems

Retrieval allows agents to:

Access external knowledge
Ground responses in enterprise data
Improve factual accuracy
Reduce hallucinations

Why Retrieval Matters

LLMs are trained on static datasets.

Without retrieval:

Models may lack current information
Enterprise-specific knowledge may be unavailable
Hallucinations become more likely

Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) combines:

Search and retrieval systems
LLM reasoning and generation

RAG allows agents to generate responses using retrieved content.

Typical RAG Workflow

A common RAG workflow includes:

User submits a query
Query is converted to embeddings
Search retrieves relevant documents
Documents are added to prompts
LLM generates grounded responses

Knowledge Sources for Retrieval

Agents may retrieve data from:

Azure AI Search
Vector databases
SQL databases
Document repositories
SharePoint
Blob storage
Knowledge bases

Vector Search

Vector search enables semantic retrieval.

Instead of keyword matching only, vector search finds:

Meaning
Similarity
Contextual relationships

Embeddings

Embeddings are numerical vector representations of text or data.

Embeddings help systems:

Measure semantic similarity
Perform vector search
Improve retrieval relevance

Chunking Strategies

Documents are often split into smaller chunks before indexing.

Chunking improves:

Retrieval precision
Context quality
Token efficiency

Retrieval Pipelines

Retrieval pipelines commonly include:

Data ingestion
Chunking
Embedding generation
Indexing
Query retrieval
Reranking

Hybrid Search

Hybrid search combines:

Keyword search
Vector search

This improves search quality.

Grounding Responses

Grounding means generating responses using retrieved evidence.

Grounded systems are:

More accurate
More explainable
More reliable

Citation and Source Attribution

Agents may include:

Source links
Document citations
Retrieved evidence

This improves transparency.

Function-Calling in Agent Systems

Function-calling allows models to invoke:

APIs
Services
Workflows
Databases
External tools

Why Function-Calling Matters

LLMs alone cannot:

Access live systems
Execute actions
Retrieve dynamic business data

Function-calling bridges this gap.

Examples of Functions

Common functions include:

Get weather data
Retrieve customer records
Create support tickets
Query inventory systems
Send emails
Schedule meetings

Tool Schemas

Function-calling relies on structured tool schemas.

Schemas define:

Tool names
Parameters
Data types
Required fields
Expected outputs

Example Function Schema

Example:

Function: GetOrderStatus

Inputs:

OrderID
CustomerID

Outputs:

Shipping status
Estimated delivery date

Structured Tool Invocation

Structured tool invocation improves:

Reliability
Validation
Automation
Error handling

Function Selection Logic

Agents may decide:

Whether tools are needed
Which tools to invoke
When to call functions
How to sequence operations

Multi-Tool Workflows

Advanced agents may orchestrate:

Multiple tools
Sequential workflows
Conditional logic
Parallel execution

Example Multi-Tool Workflow

Example:

Retrieve customer data
Query billing system
Generate summary
Create support ticket
Send notification

Tool Safety Controls

Organizations should control:

Which tools agents can access
Which users may trigger actions
Which workflows require approval

Human-in-the-Loop Approvals

High-risk operations may require:

Human review
Approval checkpoints
Escalation workflows

Conversation Memory

Conversation memory allows agents to:

Maintain context
Track interactions
Remember prior information
Continue workflows

Why Memory Matters

Without memory:

Conversations become disconnected
Users repeat information
Workflow continuity breaks

Types of Memory

Common memory types include:

Short-term memory
Long-term memory
Episodic memory
Semantic memory

Short-Term Memory

Short-term memory stores:

Recent prompts
Recent responses
Current task state

Long-Term Memory

Long-term memory stores:

User preferences
Historical interactions
Persistent context

Stateful vs Stateless Agents

Stateless Agents

Do not retain memory between sessions.

Benefits:

Simpler architecture
Lower storage requirements

Stateful Agents

Maintain context and conversation history.

Benefits:

Better user experiences
Improved multistep reasoning

Context Window Limitations

LLMs have limited context windows.

Applications must manage:

Token usage
Conversation length
Historical context

Memory Management Strategies

Common strategies include:

Rolling conversation windows
Summarized history
Vector memory retrieval
Persistent storage systems

Vector Memory

Conversation history may be stored as embeddings.

This enables:

Semantic memory retrieval
Long-term contextual recall
Personalized interactions

Retrieval-Based Memory

Agents may retrieve:

Prior conversations
Historical workflow data
Previous decisions

Persistent Memory Storage

Persistent memory may use:

Databases
Search indexes
Vector stores
Cloud storage

Agent Orchestration

Orchestration coordinates:

Retrieval systems
Function-calling
Memory systems
Workflow execution

Agent Reasoning Loops

Agents may perform iterative reasoning:

Analyze request
Retrieve information
Call tools
Evaluate outputs
Continue reasoning
Generate response

Workflow State Management

Agents may track:

Active tasks
Tool outputs
Pending actions
Workflow progress

Azure AI Foundry and Agent Development

Azure AI Foundry supports:

Model deployment
Retrieval integration
Agent orchestration
Prompt flows
Evaluation pipelines
Monitoring and governance

Azure AI Search in Agent Systems

Azure AI Search commonly provides:

Vector indexing
Semantic ranking
Hybrid search
Enterprise retrieval

Prompt Engineering for Agents

Effective prompts define:

Agent role
Behavioral expectations
Tool usage rules
Safety constraints

Grounded Prompt Construction

Grounded prompts may include:

Retrieved documents
Citations
Tool outputs
Prior conversation context

Monitoring Agent Systems

Organizations should monitor:

Retrieval relevance
Tool-call accuracy
Memory quality
Latency
Hallucinations
Safety events

Evaluating RAG Systems

RAG systems should be evaluated for:

Retrieval quality
Relevance
Faithfulness
Grounding accuracy
Citation quality

Evaluating Function-Calling

Organizations should validate:

Correct tool selection
Parameter accuracy
Workflow reliability
Error recovery

Evaluating Conversation Memory

Memory systems should be evaluated for:

Context retention
Consistency
Recall accuracy
Session continuity

Security Considerations

Secure agent systems should implement:

Authentication
Authorization
Managed identities
RBAC
Private networking
Audit logging

Responsible AI Considerations

Organizations should apply:

Safety filters
Guardrails
Human oversight
Content moderation
Usage monitoring

Real-World Scenario

Scenario: Enterprise HR Assistant

Requirements:

Retrieve HR policies
Answer employee questions
Access scheduling systems
Remember user preferences
Escalate sensitive requests

Recommended Design:

RAG using Azure AI Search
Function-calling for HR systems
Stateful conversation memory
Approval workflows for sensitive actions
Grounded response generation

Common AI-103 Exam Tips

Understand Retrieval Concepts

Know:

RAG
Embeddings
Vector search
Hybrid search
Grounding

Learn Function-Calling Concepts

Understand:

Tool schemas
Structured invocation
Tool orchestration
Workflow execution

Understand Memory Systems

Know:

Stateful vs stateless agents
Short-term vs long-term memory
Context management
Vector memory

Understand Agent Orchestration

Know how agents combine:

Retrieval
Tool usage
Memory
Reasoning

Summary

Modern enterprise agents combine:

Retrieval systems
Function-calling
Conversation memory
Workflow orchestration

For the AI-103 exam, you should understand:

RAG architectures
Vector search
Embeddings
Grounding
Function-calling
Tool schemas
Tool orchestration
Stateful memory
Context management
Agent reasoning loops
Monitoring and governance

These concepts are foundational to building scalable and intelligent AI agents with Azure AI Foundry.

Practice Exam Questions

Question 1

What is the primary purpose of Retrieval-Augmented Generation (RAG)?

A. Reduce GPU temperatures
B. Combine retrieval systems with LLM generation
C. Eliminate vector search
D. Replace APIs completely

Answer

B. Combine retrieval systems with LLM generation

Explanation

RAG combines retrieval and generation to improve grounded responses.

Question 2

Why are embeddings important in retrieval systems?

A. They increase firewall security
B. They enable semantic similarity comparisons
C. They replace orchestration engines
D. They remove token limits

Answer

B. They enable semantic similarity comparisons

Explanation

Embeddings support semantic vector search.

Question 3

What is a key advantage of hybrid search?

A. It disables semantic ranking
B. It combines keyword and vector search
C. It removes indexing requirements
D. It eliminates embeddings

Answer

B. It combines keyword and vector search

Explanation

Hybrid search improves retrieval quality by combining approaches.

Question 4

What is the purpose of function-calling in agent systems?

A. Reduce network traffic only
B. Allow models to invoke external tools and services
C. Eliminate APIs
D. Disable workflows

Answer

B. Allow models to invoke external tools and services

Explanation

Function-calling enables interaction with external systems.

Question 5

What information is typically included in a tool schema?

A. GPU temperature metrics
B. Parameters, data types, and outputs
C. Only firewall settings
D. Only vector dimensions

Answer

B. Parameters, data types, and outputs

Explanation

Schemas define structured tool interfaces.

Question 6

Why is conversation memory important?

A. It reduces all storage costs
B. It maintains continuity and context across interactions
C. It removes orchestration needs
D. It disables tool invocation

Answer

B. It maintains continuity and context across interactions

Explanation

Memory improves user experiences and multistep workflows.

Question 7

What is a characteristic of stateful agents?

A. They never store context
B. They maintain conversation history and state
C. They disable retrieval systems
D. They remove prompt engineering

Answer

B. They maintain conversation history and state

Explanation

Stateful agents retain memory across interactions.

Question 8

What is a common challenge when using LLM conversation memory?

A. Unlimited context windows
B. Context window limitations and token constraints
C. Elimination of embeddings
D. Removal of grounding

Answer

B. Context window limitations and token constraints

Explanation

LLMs can process only limited amounts of context.

Question 9

Which Azure service is commonly used for enterprise retrieval in RAG architectures?

A. Azure DevOps
B. Azure AI Search
C. Azure Virtual Desktop
D. Azure Batch

Answer

B. Azure AI Search

Explanation

Azure AI Search supports vector and hybrid search for RAG systems.

Question 10

What should organizations monitor in agent systems?

A. Only GPU fan speeds
B. Retrieval quality, tool usage, memory accuracy, and safety
C. Only prompt lengths
D. Only authentication failures

Answer

B. Retrieval quality, tool usage, memory accuracy, and safety

Explanation

Comprehensive monitoring improves reliability, governance, and user trust.

Go to the AI-103 Exam Prep Hub main page

Agentic AI, AI, AI-103, Artificial Intelligence (AI), Azure AI, Microsoft Certification May 25, 2026

Define agent roles, goals, conversation-tracking approach, and tool schemas (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
   --> Build agents by using Foundry
      --> Define agent roles, goals, conversation-tracking approach, and tool schemas

Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

AI agents are rapidly becoming one of the most important components of modern AI systems.

Unlike basic chatbots, agents can:

Reason through tasks
Maintain context
Use tools
Execute workflows
Coordinate multistep actions
Interact with external systems

Azure AI Foundry provides tools and frameworks for building agentic systems.

For the AI-103: Develop AI Apps and Agents on Azure certification exam, understanding agent design principles is critical.

This topic focuses on:

Agent roles
Agent goals
Conversation tracking
Tool schemas
Tool orchestration
State management
Memory design
Workflow coordination

What Is an AI Agent?

An AI agent is an AI system capable of:

Understanding objectives
Making decisions
Using tools
Maintaining context
Performing actions
Adapting to changing inputs

Agents are more autonomous than standard prompt-response systems.

Characteristics of AI Agents

Agents commonly include:

Reasoning
Planning
Memory
Tool usage
Workflow orchestration
Goal-oriented behavior

Agent Roles

An agent role defines:

The agent’s responsibilities
Behavioral expectations
Scope of operation
Allowed actions

Why Agent Roles Matter

Clearly defined roles help:

Improve consistency
Reduce unsafe behavior
Prevent scope creep
Improve reliability

Examples of Agent Roles

Examples include:

Customer support assistant
Financial analyst
Research assistant
Scheduling coordinator
Coding assistant
IT operations assistant

Specialized vs General-Purpose Agents

Specialized Agents

Focused on narrow tasks.

Benefits:

Higher reliability
Better governance
Easier evaluation

General-Purpose Agents

Handle broad tasks.

Benefits:

Greater flexibility
Wider applicability

Tradeoff:

Increased complexity and risk

Defining Agent Goals

Goals define:

Desired outcomes
Success criteria
Task objectives

Goal-Oriented Design

Good goals are:

Clear
Measurable
Constrained
Actionable

Examples of Agent Goals

Examples include:

Resolve customer tickets
Retrieve accurate company policies
Generate code suggestions
Schedule meetings
Summarize documents

Constraints in Goal Design

Goals should include:

Safety boundaries
Compliance rules
Tool restrictions
Escalation conditions

Agent Instructions and System Prompts

Agents typically receive:

System instructions
Behavioral guidance
Operational constraints

These instructions influence agent behavior.

Conversation Tracking

Conversation tracking maintains:

Dialogue history
User context
Workflow state
Interaction continuity

Why Conversation Tracking Matters

Without conversation tracking:

Agents lose context
Responses become inconsistent
Multistep workflows fail

Short-Term Conversation Memory

Short-term memory may store:

Recent prompts
Recent responses
Current workflow state

Long-Term Memory

Long-term memory may store:

User preferences
Historical interactions
Persistent knowledge

Session State Management

State management tracks:

Current tasks
Workflow progress
Tool outputs
Active context

Stateless vs Stateful Agents

Stateless Agents

Do not retain context between interactions.

Benefits:

Simpler design
Lower storage requirements

Stateful Agents

Maintain conversation history and workflow state.

Benefits:

Better continuity
Improved multistep reasoning

Context Window Management

LLMs have limited context windows.

Applications may need to:

Trim conversation history
Summarize prior interactions
Retrieve external memory

Memory Strategies

Common memory strategies include:

Rolling conversation windows
Summarization memory
Vector memory
Persistent storage

Retrieval-Augmented Memory

Agents may retrieve:

Historical conversations
Knowledge documents
Workflow data

This improves continuity.

Conversation Persistence

Persistent conversation storage may use:

Databases
Search indexes
Vector stores

Tool Usage in Agent Systems

Agents often interact with:

APIs
Databases
Search systems
External applications
Workflow services

What Is a Tool Schema?

A tool schema defines:

Tool name
Purpose
Input parameters
Output structure
Validation rules

Purpose of Tool Schemas

Tool schemas help:

Standardize interactions
Reduce ambiguity
Improve reliability
Enable function calling

Tool Schema Components

Tool schemas commonly include:

Function name
Description
Parameters
Data types
Required fields

Example Tool Schema

Example:

Tool: GetWeather
Inputs:
- City name
- Date
Output:
- Temperature
- Forecast

Structured Tool Invocation

Structured tool schemas allow agents to:

Generate valid requests
Interact predictably with systems
Reduce execution failures

Function Calling

Function calling enables models to:

Invoke external tools
Execute structured operations
Retrieve external data

Tool Selection Logic

Agents may decide:

Whether a tool is needed
Which tool to invoke
How to sequence tool calls

Multi-Tool Workflows

Complex agents may use:

Multiple tools
Sequential workflows
Conditional branching

Tool Access Controls

Organizations may restrict:

Which tools agents can use
When tools can be invoked
Which users may trigger actions

Safety Considerations for Tool Usage

Improper tool usage can:

Leak data
Execute unsafe actions
Cause workflow failures

Human Approval Workflows

Some actions may require:

Human review
Approval checkpoints
Escalation workflows

Agent Planning

Agents may perform:

Task decomposition
Sequential planning
Goal prioritization

Multistep Reasoning

Agents may:

Gather information
Use tools
Analyze results
Generate conclusions

Orchestration Frameworks

Orchestration frameworks coordinate:

Agent logic
Tool execution
Workflow progression
State transitions

Error Handling in Agents

Agents should handle:

Invalid tool outputs
API failures
Missing data
Ambiguous user requests

Monitoring Agent Behavior

Organizations should monitor:

Tool usage
Conversation quality
Safety violations
Goal completion rates

Trace Logging

Trace logs may capture:

Prompt sequences
Tool calls
Workflow decisions
Agent reasoning steps

Evaluation of Agent Systems

Organizations should evaluate:

Goal completion
Accuracy
Relevance
Safety
Tool reliability

Governance and Compliance

Enterprise agent systems may require:

Access controls
Audit logging
Compliance policies
Responsible AI governance

Real-World Scenario

Scenario: Enterprise IT Support Agent

Requirements:

Resolve common IT requests
Access ticketing systems
Maintain user context
Escalate high-risk actions

Recommended Design:

Specialized support role
Defined goals
Stateful conversation tracking
Structured tool schemas
Human approval workflows

Common AI-103 Exam Tips

Understand Agent Roles

Know:

Specialized vs general-purpose agents
Role boundaries
Behavioral constraints

Learn Conversation Tracking Concepts

Understand:

Stateful vs stateless agents
Memory approaches
Context management

Understand Tool Schemas

Know:

Function definitions
Parameters
Structured tool invocation
Function calling

Learn Governance Concepts

Understand:

Tool access controls
Human approvals
Audit logging
Safety constraints

Summary

Agent design is a core part of modern AI systems.

For the AI-103 exam, you should understand:

Agent roles
Goal-oriented behavior
Conversation tracking
Memory management
Stateful workflows
Tool schemas
Function calling
Tool orchestration
Workflow planning
Safety controls
Human approvals
Monitoring and governance

These concepts are foundational for building secure, scalable, and reliable agentic systems using Azure AI Foundry.

Practice Exam Questions

Question 1

What is the primary purpose of an agent role?

A. Increase GPU utilization
B. Define responsibilities and behavioral boundaries
C. Eliminate tool usage
D. Remove workflow orchestration

Answer

B. Define responsibilities and behavioral boundaries

Explanation

Agent roles establish scope, expectations, and operational constraints.

Question 2

Why are clearly defined agent goals important?

A. They eliminate monitoring
B. They provide measurable objectives and task direction
C. They reduce storage requirements only
D. They remove authentication needs

Answer

B. They provide measurable objectives and task direction

Explanation

Goals help agents focus on desired outcomes.

Question 3

What is the purpose of conversation tracking?

A. Increase vector dimensions
B. Maintain context and workflow continuity
C. Disable memory systems
D. Remove APIs

Answer

B. Maintain context and workflow continuity

Explanation

Conversation tracking preserves interaction history and state.

Question 4

What is a key benefit of stateful agents?

A. They avoid all storage requirements
B. They maintain continuity across interactions
C. They eliminate workflows
D. They remove tool schemas

Answer

B. They maintain continuity across interactions

Explanation

Stateful agents retain memory and conversation context.

Question 5

What is a tool schema?

A. A GPU optimization technique
B. A structured definition of tool inputs and outputs
C. A firewall policy
D. A token compression method

Answer

B. A structured definition of tool inputs and outputs

Explanation

Tool schemas standardize external tool interactions.

Question 6

What is the purpose of function calling?

A. Eliminate orchestration
B. Allow models to invoke external tools dynamically
C. Replace APIs entirely
D. Remove authentication

Answer

B. Allow models to invoke external tools dynamically

Explanation

Function calling enables structured tool execution.

Question 7

Why are tool access controls important?

A. They reduce GPU memory usage
B. They restrict unsafe or unauthorized tool usage
C. They eliminate monitoring
D. They disable workflows

Answer

B. They restrict unsafe or unauthorized tool usage

Explanation

Access controls improve safety and governance.

Question 8

What is a common challenge with large conversation histories?

A. Unlimited context windows
B. Context window limitations in LLMs
C. Elimination of memory usage
D. Reduced orchestration complexity

Answer

B. Context window limitations in LLMs

Explanation

LLMs can only process limited amounts of context.

Question 9

What is the purpose of human approval workflows?

A. Increase hallucinations
B. Provide oversight for sensitive or high-risk actions
C. Remove governance requirements
D. Disable trace logging

Answer

B. Provide oversight for sensitive or high-risk actions

Explanation

Human review reduces operational risk.

Question 10

What should organizations monitor in agent systems?

A. Only GPU temperatures
B. Tool usage, safety, conversation quality, and task completion
C. Only token counts
D. Only API latency

Answer

B. Tool usage, safety, conversation quality, and task completion

Explanation

Comprehensive monitoring improves reliability and governance.

Go to the AI-103 Exam Prep Hub main page

Agentic AI, AI, AI-103, Artificial Intelligence (AI), Azure AI, Generative AI, Microsoft Certification May 25, 2026

Configure an application to connect to a Foundry project (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
   --> Build generative applications by using Foundry
      --> Configure an application to connect to a Foundry project

Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Azure AI Foundry provides a centralized environment for developing, deploying, and managing AI applications and agentic solutions.

Applications that use generative AI models, agents, retrieval systems, or multimodal capabilities must connect securely and reliably to Foundry projects.

This topic is important for the AI-103: Develop AI Apps and Agents on Azure certification exam.

For the AI-103 exam, you should understand:

Azure AI Foundry projects
Application connectivity
Authentication methods
SDK configuration
Endpoint configuration
Deployment configuration
Managed identities
API keys
Environment variables
Network security
Role-based access control (RBAC)
Connecting to deployed models and agents
Configuration management
Monitoring and troubleshooting

What Is an Azure AI Foundry Project?

An Azure AI Foundry project is a centralized workspace used to:

Manage AI resources
Deploy models
Configure agents
Build workflows
Store evaluation assets
Monitor AI systems

Projects help organize AI development and operations.

Components of a Foundry Project

A Foundry project may include:

Model deployments
Agent configurations
Prompt flows
Evaluation datasets
Connections
Search resources
Storage resources
Monitoring tools

Why Applications Need Project Connectivity

Applications connect to Foundry projects to:

Access deployed models
Invoke agents
Perform retrieval operations
Execute workflows
Use AI services securely

Common Connection Scenarios

Applications commonly connect to:

Chat models
Embedding models
Multimodal models
Agent services
Prompt flow endpoints
Azure AI Search resources

Connection Architecture

Typical connectivity includes:

Application
Authentication layer
Foundry project endpoint
Model or agent deployment

SDK-Based Connectivity

Applications often use SDKs to:

Authenticate
Send prompts
Receive responses
Stream outputs
Manage workflows

SDKs simplify development.

API-Based Connectivity

Applications may also use:

REST APIs
HTTP endpoints
Direct service requests

Authentication Methods

Applications must authenticate securely.

Common methods include:

API keys
Managed identities
Azure Active Directory (Azure AD)
Keyless authentication

API Key Authentication

API keys are:

Simple to configure
Easy for development and testing

However, they require secure storage.

Managed Identity Authentication

Managed identities provide:

Secretless authentication
Improved security
Automatic credential management

Managed identity is recommended for production workloads.

Azure AD Authentication

Azure AD enables:

Enterprise identity management
Role-based access
Secure authentication workflows

Keyless Authentication

Keyless authentication reduces:

Credential exposure
Secret management overhead

Secure Credential Storage

Applications should avoid:

Hardcoded secrets
Plain-text credentials

Credentials should be stored securely.

Environment Variables

Environment variables commonly store:

API endpoints
Deployment names
Keys
Configuration settings

Configuration Files

Applications may use:

JSON configuration files
YAML files
Application settings

Endpoint Configuration

Applications must connect to the correct:

Foundry endpoint
Model deployment endpoint
Agent endpoint

Deployment Names

Applications typically reference:

Specific deployment names
Model identifiers
Agent identifiers

Connecting to Model Deployments

Applications may connect to:

Chat completion models
Embedding models
Code models
Multimodal models

Connecting to Agent Workflows

Applications may invoke agents that:

Use tools
Access memory
Execute workflows
Coordinate tasks

Connecting to Prompt Flows

Applications can invoke:

Prompt flow endpoints
Orchestrated workflows
Multi-step pipelines

Connecting to Azure AI Search

RAG applications often connect to:

Azure AI Search
Vector indexes
Semantic search pipelines

Role-Based Access Control (RBAC)

RBAC controls:

Resource permissions
Service access
Administrative privileges

Least Privilege Principle

Applications should receive:

Only required permissions
Minimal access rights

Private Networking

Organizations may secure connectivity using:

Private endpoints
Virtual networks
Network isolation

Firewall Configuration

Firewall rules may restrict:

Public access
Unauthorized IP ranges

Secure Communication

Applications should use:

HTTPS
Encrypted communication
Secure APIs

SDK Initialization

Applications typically initialize:

Client objects
Authentication providers
Connection settings

Client Configuration

Client configuration may include:

Endpoint URLs
API versions
Deployment names
Authentication credentials

Streaming Configuration

Applications may enable:

Streaming responses
Incremental output rendering

Retry Policies

Applications should implement:

Retry logic
Exponential backoff
Timeout handling

Error Handling

Applications should handle:

Authentication failures
Network issues
Rate limits
Invalid requests

Logging and Monitoring

Applications should log:

Requests
Responses
Failures
Latency metrics

Observability

Observability helps organizations:

Monitor usage
Diagnose issues
Improve reliability

Application Scalability

Applications should support:

High concurrency
Distributed workloads
Elastic scaling

Cost Considerations

Connection design impacts:

Token usage
API consumption
Search operations
Infrastructure costs

CI/CD Integration

Connection settings may be managed through:

Deployment pipelines
Infrastructure as code
Environment promotion

Development vs Production Environments

Organizations often separate:

Development
Testing
Staging
Production

Each environment may use different:

Endpoints
Credentials
Policies

Multi-Region Connectivity

Global applications may connect to:

Multiple regional deployments
Regional failover systems

High Availability

Applications should support:

Redundant deployments
Failover strategies
Resilient architecture

Governance Considerations

Organizations may enforce:

Access policies
Security baselines
Audit logging
Compliance requirements

Troubleshooting Connectivity Issues

Common issues include:

Invalid credentials
Incorrect endpoints
Missing RBAC permissions
Network restrictions
Deployment mismatches

Performance Optimization

Organizations should optimize:

Connection reuse
Latency
Request batching
Streaming efficiency

Real-World Scenario

Scenario: Enterprise AI Assistant

Requirements:

Secure authentication
RAG integration
Agent orchestration
Enterprise access control

Recommended Approach:

Managed identity
RBAC
Private networking
Azure AI Search integration
SDK-based connectivity

Common AI-103 Exam Tips

Understand Authentication Options

Know when to use:

API keys
Managed identities
Azure AD

Understand Endpoint Configuration

Know:

Deployment names
Service endpoints
Agent endpoints

Learn RBAC Concepts

Understand:

Least privilege
Role assignments
Secure access management

Understand Networking Concepts

Know:

Private endpoints
Firewalls
Secure connectivity

Learn Application Integration Concepts

Understand:

SDK initialization
Client configuration
Retry logic
Monitoring

Summary

Connecting applications to Azure AI Foundry projects is a foundational skill for AI-103.

For the exam, you should understand:

Foundry projects
Application connectivity
SDK integration
API integration
Authentication methods
Managed identities
RBAC
Deployment configuration
Endpoint management
Networking security
Logging and monitoring
Scalability and reliability

These skills are essential for building secure, scalable enterprise AI applications on Azure.

Practice Exam Questions

Question 1

What is the purpose of an Azure AI Foundry project?

A. Replace Azure subscriptions
B. Centrally manage AI resources, deployments, and workflows
C. Eliminate authentication
D. Replace APIs entirely

Answer

B. Centrally manage AI resources, deployments, and workflows

Explanation

Foundry projects organize AI development and operational assets.

Question 2

Which authentication method is recommended for production Azure workloads?

A. Hardcoded credentials
B. Managed identity
C. Shared public keys
D. Anonymous access

Answer

B. Managed identity

Explanation

Managed identities improve security by avoiding embedded secrets.

Question 3

What is a primary advantage of SDKs?

A. They eliminate APIs completely
B. They simplify application development and integration
C. They remove all authentication requirements
D. They prevent monitoring

Answer

B. They simplify application development and integration

Explanation

SDKs provide abstractions that simplify connectivity and workflow development.

Question 4

Why should applications use environment variables?

A. To increase GPU performance
B. To securely manage configuration values
C. To eliminate authentication
D. To disable RBAC

Answer

B. To securely manage configuration values

Explanation

Environment variables help manage endpoints and credentials securely.

Question 5

What does RBAC primarily control?

A. Token compression
B. Permissions and access to resources
C. Model quantization
D. Network bandwidth

Answer

B. Permissions and access to resources

Explanation

RBAC enforces authorization policies.

Question 6

Why are private endpoints used?

A. To increase hallucinations
B. To improve network security and isolate traffic
C. To disable monitoring
D. To reduce embedding dimensions

Answer

B. To improve network security and isolate traffic

Explanation

Private endpoints help secure enterprise AI workloads.

Question 7

What is commonly required when connecting to a deployed model?

A. Deployment name
B. Firewall removal
C. Disabling authentication
D. Public anonymous access

Answer

A. Deployment name

Explanation

Applications typically reference deployment identifiers.

Question 8

Why should applications implement retry policies?

A. To increase hallucinations
B. To recover from transient failures and improve reliability
C. To disable APIs
D. To remove authentication

Answer

B. To recover from transient failures and improve reliability

Explanation

Retry logic improves resiliency.

Question 9

Which service is commonly integrated for RAG search functionality?

A. Azure AI Search
B. Azure DNS
C. Azure Backup
D. Azure Batch

Answer

A. Azure AI Search

Explanation

Azure AI Search supports vector and semantic retrieval.

Question 10

What is the least privilege principle?

A. Give all users full access
B. Grant only the permissions necessary to perform required tasks
C. Disable RBAC
D. Allow anonymous authentication

Answer

B. Grant only the permissions necessary to perform required tasks

Explanation

Least privilege reduces security risk by minimizing unnecessary permissions.

Go to the AI-103 Exam Prep Hub main page

Agentic AI, AI, AI-103, Azure AI, Generative AI, Microsoft Certification May 25, 2026

Integrate generative workflows into applications by using Foundry SDKs and connectors (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
   --> Build generative applications by using Foundry
      --> Integrate generative workflows into applications by using Foundry SDKs and connectors

Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Modern AI applications rarely operate in isolation.

Enterprise generative AI solutions typically integrate with:

Web applications
APIs
Databases
Search systems
Business applications
Workflow engines
External tools

Azure AI Foundry provides:

SDKs
APIs
Connectors
Agent frameworks
Workflow orchestration capabilities

These services help developers integrate generative AI into enterprise applications.

The AI-103: Develop AI Apps and Agents on Azure certification exam tests your understanding of integrating generative workflows into applications.

For the AI-103 exam, you should understand:

Foundry SDKs
APIs
Connectors
Workflow orchestration
Tool integration
Agent integration
RAG integration
Authentication
Deployment integration
Event-driven workflows
Monitoring and governance

What Are Foundry SDKs?

SDKs (Software Development Kits) provide:

Libraries
APIs
Helper functions
Authentication support
Workflow integration tools

SDKs simplify application development.

Benefits of SDKs

SDKs help developers:

Reduce development complexity
Standardize integration
Accelerate deployment
Improve reliability

Common SDK Capabilities

SDKs commonly support:

Model invocation
Agent orchestration
Function calling
Authentication
Streaming responses
Workflow management
Monitoring integration

APIs vs SDKs

APIs

Provide direct service access.

SDKs

Provide higher-level development abstractions.

SDKs often simplify API usage.

What Are Connectors?

Connectors integrate AI systems with:

External services
Enterprise applications
Data sources
Workflow systems

Common Connector Scenarios

Examples include:

CRM integration
ERP integration
SharePoint access
Database connectivity
Messaging systems
Search services

Workflow Integration

Generative workflows may integrate with:

Web applications
Mobile applications
Enterprise platforms
Automation systems

Web Application Integration

Generative AI commonly integrates into:

Chat interfaces
Copilots
Knowledge assistants
Recommendation systems

API-Based Integration

Applications often communicate with AI systems through:

REST APIs
HTTP endpoints
SDK abstractions

Authentication and Authorization

Secure integration requires:

Authentication
Authorization
Identity management

Managed Identity

Managed identities allow Azure services to:

Authenticate securely
Avoid hardcoded secrets
Access resources safely

Keyless Authentication

Keyless authentication improves security by reducing:

API key exposure
Credential management complexity

Secure Credential Storage

Applications should protect:

API keys
Tokens
Connection strings

Role-Based Access Control (RBAC)

RBAC helps control:

Resource permissions
Service access
Administrative privileges

Event-Driven Workflows

Event-driven systems react to:

User actions
File uploads
Database changes
External events

Asynchronous Workflows

Asynchronous workflows:

Improve scalability
Reduce blocking operations
Support long-running tasks

Streaming Responses

Streaming enables applications to:

Display responses incrementally
Improve user experience
Reduce perceived latency

Conversational Application Integration

Conversational systems often integrate:

Memory
Retrieval
Tool usage
User context

Integrating Retrieval-Augmented Generation (RAG)

RAG integration typically includes:

Vector search
Embedding generation
Retrieval pipelines
Prompt grounding

Azure AI Search Integration

Applications commonly integrate Azure AI Search for:

Vector search
Semantic search
Hybrid retrieval

Tool-Augmented Integration

Applications may integrate tools such as:

Databases
Search APIs
Business systems
External APIs

Function Calling Integration

Function calling enables:

Dynamic tool invocation
Structured interactions
Workflow orchestration

Agent Integration

Agent-based systems may:

Coordinate tools
Perform multistep reasoning
Execute workflows
Manage task state

Workflow Orchestration

Workflow orchestration coordinates:

AI reasoning
Tool execution
Retrieval
Human approvals

State Management

Integrated systems often maintain:

Session state
Workflow progress
User context

Memory Integration

Applications may integrate:

Short-term memory
Long-term memory
User preferences

Human-in-the-Loop Integration

Enterprise applications may require:

Human approvals
Review workflows
Escalation paths

Monitoring Integration

Applications should integrate monitoring for:

Errors
Latency
Tool usage
Costs
Safety violations

Logging and Traceability

Logging supports:

Troubleshooting
Auditing
Workflow analysis
Compliance

Trace Logging

Trace logs may capture:

Prompt flows
Tool calls
Retrieval steps
Workflow execution

Error Handling

Applications should handle:

API failures
Timeout errors
Invalid responses
Authentication failures

Retry Mechanisms

Retry strategies improve reliability by:

Recovering from transient failures
Reducing workflow interruptions

Scalability Considerations

Integrated AI systems should support:

High concurrency
Dynamic scaling
Distributed workloads

Latency Considerations

Developers should optimize:

Retrieval speed
Tool invocation times
Model response times

Cost Optimization

Organizations should optimize:

Token usage
API calls
Search operations
Infrastructure costs

CI/CD Integration

Generative AI applications may integrate with:

Automated deployment pipelines
Testing frameworks
Infrastructure automation

Testing Integrated Workflows

Organizations should test:

Workflow correctness
Tool integration
Retrieval quality
Safety compliance

Safety Integration

Applications should integrate:

Content filtering
Safety policies
Guardrails
Approval workflows

Governance and Compliance

Enterprise systems may require:

Audit logging
Data protection
Regulatory compliance
Access controls

Azure AI Foundry Integration Features

Azure AI Foundry supports:

SDK-based development
Workflow orchestration
Model deployment
Agent development
Evaluation pipelines
Monitoring

Real-World Integration Scenarios

Scenario 1: Enterprise Knowledge Assistant

Requirements:

Document retrieval
Conversational AI
Enterprise search integration

Recommended Integration:

Foundry SDK + Azure AI Search

Scenario 2: Customer Support Copilot

Requirements:

CRM integration
Ticket lookup
Escalation workflows

Recommended Integration:

Tool-augmented agent workflows

Scenario 3: Financial Workflow Automation

Requirements:

Human approvals
Audit logging
Secure authentication

Recommended Integration:

HITL workflow + RBAC + trace logging

Scenario 4: AI Research Assistant

Requirements:

Multistep reasoning
Web search integration
Citation generation

Recommended Integration:

RAG + orchestration workflows

Common AI-103 Exam Tips

Understand SDK vs API Differences

Know:

SDK abstractions
API integrations
Authentication approaches

Learn Connector Concepts

Understand:

External integrations
Enterprise systems
Workflow connectors

Understand Workflow Integration

Know:

Tool orchestration
Agent integration
Event-driven workflows
Streaming responses

Learn Security Concepts

Understand:

Managed identity
Keyless credentials
RBAC
Secure secret handling

Summary

Modern generative AI systems depend heavily on integration.

For the AI-103 exam, you should understand:

Foundry SDKs
APIs
Connectors
Workflow orchestration
Function calling
Agent integration
RAG integration
Authentication and RBAC
Event-driven workflows
Monitoring and logging
CI/CD integration
Governance and compliance

These concepts are foundational for building scalable enterprise AI applications and agentic systems on Azure.

Practice Exam Questions

Question 1

What is the primary purpose of an SDK?

A. Replace APIs entirely
B. Simplify application development using libraries and abstractions
C. Eliminate authentication requirements
D. Disable workflow orchestration

Answer

B. Simplify application development using libraries and abstractions

Explanation

SDKs provide tools and abstractions that simplify development.

Question 2

What is a connector in a generative AI solution?

A. A GPU optimization engine
B. A mechanism for integrating external systems and services
C. A vector compression method
D. A storage replication service

Answer

B. A mechanism for integrating external systems and services

Explanation

Connectors enable integration with business applications and data sources.

Question 3

Why are managed identities important?

A. They increase token limits
B. They provide secure authentication without hardcoded credentials
C. They replace vector search
D. They eliminate RBAC

Answer

B. They provide secure authentication without hardcoded credentials

Explanation

Managed identities improve security by avoiding embedded secrets.

Question 4

What is the benefit of streaming responses?

A. Eliminates all latency
B. Improves user experience by displaying incremental output
C. Disables monitoring
D. Prevents tool invocation

Answer

B. Improves user experience by displaying incremental output

Explanation

Streaming responses reduce perceived latency.

Question 5

What is the purpose of function calling?

A. Compress prompts
B. Allow models to invoke external tools dynamically
C. Replace orchestration
D. Eliminate APIs

Answer

B. Allow models to invoke external tools dynamically

Explanation

Function calling enables structured tool interactions.

Question 6

Which Azure service is commonly integrated for vector and semantic search?

A. Azure AI Search
B. Azure DNS
C. Azure Backup
D. Azure Batch

Answer

A. Azure AI Search

Explanation

Azure AI Search supports vector and semantic retrieval.

Question 7

What is a key advantage of asynchronous workflows?

A. Increased blocking operations
B. Improved scalability and support for long-running tasks
C. Removal of authentication
D. Elimination of APIs

Answer

B. Improved scalability and support for long-running tasks

Explanation

Asynchronous workflows support efficient distributed execution.

Question 8

Why is trace logging important?

A. It removes monitoring requirements
B. It provides visibility into workflow execution and troubleshooting
C. It disables retrieval pipelines
D. It eliminates RBAC

Answer

B. It provides visibility into workflow execution and troubleshooting

Explanation

Trace logs help monitor workflows and investigate issues.

Question 9

What is the purpose of RBAC?

A. Increase vector dimensions
B. Control permissions and access to resources
C. Replace authentication
D. Reduce prompt sizes

Answer

B. Control permissions and access to resources

Explanation

RBAC enforces authorization policies.

Question 10

What is a major challenge when integrating complex generative workflows?

A. Eliminating all costs
B. Managing latency, scalability, and reliability
C. Removing all monitoring
D. Disabling orchestration

Answer

B. Managing latency, scalability, and reliability

Explanation

Integrated workflows often involve multiple services and asynchronous operations.

Go to the AI-103 Exam Prep Hub main page

Agentic AI, AI, AI-103, Artificial Intelligence (AI), Azure AI, Microsoft Certification May 25, 2026

Evaluate models and apps, including detecting fabrications, relevance, quality, and safety (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
   --> Build generative applications by using Foundry
      --> Evaluate models and apps, including detecting fabrications, relevance, quality, and safety

Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Building generative AI applications is only part of the development process.

Organizations must also evaluate whether AI systems are:

Accurate
Reliable
Relevant
Safe
Grounded
Trustworthy

AI systems can generate:

Hallucinations
Unsafe content
Biased responses
Irrelevant answers
Inconsistent outputs

The AI-103: Develop AI Apps and Agents on Azure certification exam tests your understanding of evaluating models and applications.

For the AI-103 exam, you should understand:

Model evaluation
Application evaluation
Fabrication detection
Groundedness
Relevance evaluation
Quality evaluation
Safety evaluation
Responsible AI testing
Automated evaluators
Human evaluation
Benchmarking
Monitoring and continuous evaluation

Why AI Evaluation Matters

Evaluation is essential because generative AI systems are probabilistic.

This means:

Responses may vary
Outputs may be incorrect
Safety risks may occur
Hallucinations may appear

Without evaluation, organizations cannot reliably trust AI systems.

What Is AI Evaluation?

AI evaluation is the process of measuring:

Accuracy
Safety
Reliability
Relevance
Groundedness
User satisfaction

Types of AI Evaluation

Common evaluation categories include:

Model evaluation
Prompt evaluation
Retrieval evaluation
Application evaluation
Safety evaluation
Human evaluation

Model Evaluation

Model evaluation focuses on:

Model quality
Accuracy
Performance
Reasoning ability

Application Evaluation

Application evaluation measures:

End-to-end user experience
Workflow success
Tool orchestration quality
Groundedness

What Are Fabrications?

Fabrications are generated outputs that:

Are incorrect
Are unsupported
Contain invented facts
Misrepresent information

Fabrications are commonly called hallucinations.

Causes of Fabrications

Fabrications may occur because:

The model lacks relevant knowledge
Prompts are ambiguous
Retrieval quality is poor
Context is insufficient
Safety constraints are weak

Fabrication Detection

Organizations should evaluate whether outputs:

Match trusted sources
Remain grounded
Avoid unsupported claims

Groundedness Evaluation

Groundedness measures whether responses are supported by:

Retrieved documents
Enterprise data
Trusted sources

Importance of Groundedness

Grounded responses:

Improve trust
Reduce hallucinations
Increase explainability

Retrieval Quality Evaluation

RAG systems should evaluate:

Search relevance
Retrieved chunk quality
Citation accuracy
Context completeness

Relevance Evaluation

Relevance measures whether responses:

Answer the user’s question
Stay on-topic
Match user intent

Quality Evaluation

Quality evaluations may assess:

Clarity
Completeness
Coherence
Fluency
Professionalism

Consistency Evaluation

Consistency measures whether models:

Produce stable responses
Avoid contradictory outputs
Maintain predictable behavior

Safety Evaluation

Safety evaluations identify:

Harmful outputs
Toxic content
Unsafe instructions
Policy violations

Responsible AI Evaluation

Responsible AI testing focuses on:

Fairness
Safety
Transparency
Accountability
Privacy

Bias Evaluation

Organizations should evaluate whether models:

Produce biased outputs
Treat groups unfairly
Reinforce stereotypes

Toxicity Detection

Toxicity evaluations identify:

Offensive language
Hate speech
Harassment
Abusive content

Jailbreak Testing

Jailbreak testing evaluates whether users can bypass:

Safety controls
Content filters
Guardrails

Adversarial Testing

Adversarial testing intentionally challenges models using:

Malicious prompts
Edge cases
Prompt injection attacks

Prompt Injection Testing

Prompt injection testing evaluates whether:

External content manipulates model behavior
Instructions override safety policies

Automated Evaluators

Automated evaluators use:

Rules
Scoring systems
AI-based evaluators

To assess model outputs.

AI-Assisted Evaluation

Some systems use LLMs to evaluate:

Relevance
Groundedness
Quality
Safety

Human Evaluation

Human reviewers may evaluate:

Accuracy
Tone
Helpfulness
Safety
Business alignment

Human-in-the-Loop Evaluation

Human-in-the-loop evaluation combines:

Automated evaluation
Human oversight
Expert validation

Benchmarking Models

Benchmarking compares models using:

Standard datasets
Consistent prompts
Defined metrics

A/B Testing

A/B testing compares:

Different prompts
Different models
Different workflows

Evaluation Metrics

Common metrics include:

Precision
Recall
Accuracy
Relevance
Groundedness
Toxicity scores
Latency
User satisfaction

Precision and Recall

Precision

Measures how many retrieved results are relevant.

Recall

Measures how many relevant results were successfully retrieved.

Latency Evaluation

Organizations should measure:

Response times
Retrieval delays
Tool execution times

Cost Evaluation

Cost evaluation considers:

Token usage
API calls
Infrastructure consumption

User Satisfaction Evaluation

Organizations may measure:

User feedback
Completion success
Satisfaction ratings

Continuous Evaluation

AI systems should be evaluated continuously because:

User behavior changes
Data evolves
Model drift may occur

Model Drift

Model drift occurs when:

Performance changes over time
Inputs evolve
User expectations shift

Monitoring Production Systems

Organizations should monitor:

Safety violations
Hallucination rates
Retrieval failures
Latency spikes
Cost increases

Evaluation Pipelines

Evaluation pipelines automate:

Testing
Scoring
Reporting
Regression analysis

Regression Testing

Regression testing ensures updates do not:

Reduce quality
Break workflows
Increase hallucinations

Azure AI Foundry Evaluation Capabilities

Azure AI Foundry supports:

Evaluation workflows
Automated evaluators
Safety monitoring
Groundedness evaluation
Prompt testing
Trace analysis

Trace Analysis

Trace analysis helps inspect:

Tool calls
Retrieval steps
Agent decisions
Workflow execution

Evaluation Datasets

Organizations should create datasets containing:

Expected outputs
Edge cases
Adversarial prompts
Real-world scenarios

Synthetic Test Data

Synthetic data may help test:

Rare scenarios
Adversarial prompts
Safety boundaries

Real-World Evaluation Scenarios

Scenario 1: Enterprise Chatbot

Requirements:

Accurate responses
Citation support
Low hallucination rate

Recommended Evaluation:

Groundedness testing
Retrieval quality evaluation

Scenario 2: Financial Assistant

Requirements:

High accuracy
Safety compliance
Low fabrication risk

Recommended Evaluation:

Human review
Adversarial testing
Approval workflows

Scenario 3: Customer Support Copilot

Requirements:

Relevant responses
Fast response times
Consistent tone

Recommended Evaluation:

Latency evaluation
Quality scoring
A/B testing

Scenario 4: Agentic Workflow System

Requirements:

Tool accuracy
Safe tool execution
Workflow traceability

Recommended Evaluation:

Trace analysis
Tool execution monitoring
HITL evaluation

Common AI-103 Exam Tips

Understand Evaluation Categories

Know the differences between:

Relevance
Quality
Groundedness
Safety
Consistency

Learn Fabrication Detection Concepts

Understand:

Hallucinations
Unsupported claims
Grounding validation

Understand Safety Testing

Know:

Toxicity testing
Jailbreak testing
Prompt injection evaluation
Adversarial testing

Learn Monitoring Concepts

Understand:

Continuous evaluation
Drift detection
Trace analysis
Regression testing

Summary

Evaluating generative AI systems is critical for building:

Reliable
Safe
Grounded
Trustworthy applications

For the AI-103 exam, you should understand:

Fabrication detection
Groundedness evaluation
Retrieval quality
Relevance testing
Quality evaluation
Safety evaluation
Toxicity detection
Adversarial testing
Human evaluation
Automated evaluators
Monitoring and drift detection
Evaluation pipelines

These concepts are foundational for developing enterprise-grade AI applications and agentic systems on Azure.

Practice Exam Questions

Question 1

What is a fabrication in generative AI?

A. A storage replication process
B. An unsupported or invented response
C. A vector indexing method
D. A deployment strategy

Answer

B. An unsupported or invented response

Explanation

Fabrications, also called hallucinations, are incorrect or invented outputs.

Question 2

What does groundedness measure?

A. GPU performance
B. Whether outputs are supported by trusted sources
C. Network bandwidth
D. Token compression efficiency

Answer

B. Whether outputs are supported by trusted sources

Explanation

Groundedness evaluates factual support from retrieved or trusted data.

Question 3

Which evaluation type focuses on harmful or unsafe outputs?

A. Latency evaluation
B. Safety evaluation
C. Compression evaluation
D. Replication evaluation

Answer

B. Safety evaluation

Explanation

Safety evaluations detect harmful, toxic, or policy-violating outputs.

Question 4

What is the purpose of retrieval quality evaluation in RAG systems?

A. Measure GPU speed
B. Assess search relevance and retrieved context quality
C. Reduce storage redundancy
D. Disable embeddings

Answer

B. Assess search relevance and retrieved context quality

Explanation

Retrieval quality measures how useful and relevant retrieved information is.

Question 5

What is jailbreak testing?

A. Testing storage failures
B. Evaluating attempts to bypass safety controls
C. Measuring retrieval latency
D. Compressing prompts

Answer

B. Evaluating attempts to bypass safety controls

Explanation

Jailbreak testing checks whether users can circumvent AI safety mechanisms.

Question 6

Which metric measures whether responses answer the user’s question appropriately?

A. Relevance
B. Replication
C. Throughput
D. Compression

Answer

A. Relevance

Explanation

Relevance evaluates how well outputs match user intent.

Question 7

Why is continuous evaluation important?

A. To eliminate all infrastructure costs
B. Because models and data can change over time
C. To remove all safety policies
D. To disable monitoring

Answer

B. Because models and data can change over time

Explanation

Continuous evaluation helps detect drift and performance degradation.

Question 8

What is adversarial testing?

A. Testing network redundancy
B. Challenging AI systems with malicious or difficult prompts
C. Increasing vector dimensions
D. Optimizing GPU allocation

Answer

B. Challenging AI systems with malicious or difficult prompts

Explanation

Adversarial testing identifies vulnerabilities and unsafe behaviors.

Question 9

What is a benefit of A/B testing in AI systems?

A. Eliminates monitoring requirements
B. Compares prompts or models to identify better performance
C. Removes the need for evaluation datasets
D. Disables retrieval pipelines

Answer

B. Compares prompts or models to identify better performance

Explanation

A/B testing helps optimize prompts, workflows, and models.

Question 10

Which Azure capability helps inspect workflow execution and tool calls?

A. Trace analysis
B. DNS failover
C. Storage mirroring
D. GPU partitioning

Answer

A. Trace analysis

Explanation

Trace analysis provides visibility into workflow execution and reasoning steps.

Go to the AI-103 Exam Prep Hub main page

Agentic AI, AI, AI-103, Artificial Intelligence (AI), Azure AI, Generative AI, Microsoft Certification May 25, 2026

Design workflows, tool-augmented flows, and multistep reasoning pipelines (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
   --> Build generative applications by using Foundry
      --> Design workflows, tool-augmented flows, and multistep reasoning pipelines

Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Modern AI systems are evolving beyond simple prompt-response interactions.

Today’s generative AI applications often:

Use external tools
Perform multistep reasoning
Orchestrate workflows
Retrieve enterprise data
Execute actions autonomously
Coordinate across services

These systems are commonly called:

Agentic systems
Tool-augmented AI systems
AI workflow pipelines

The AI-103: Develop AI Apps and Agents on Azure certification exam tests your understanding of designing intelligent workflows and reasoning pipelines.

For the AI-103 exam, you should understand:

AI workflows
Agent orchestration
Tool augmentation
Function calling
Multistep reasoning
Workflow pipelines
Retrieval integration
Memory integration
Planning and execution
Human-in-the-loop workflows
Monitoring and governance

What Are AI Workflows?

AI workflows are structured sequences of operations that combine:

AI reasoning
Data retrieval
Tool execution
Decision-making
Automation

Workflows coordinate multiple steps to complete complex tasks.

Why AI Workflows Matter

Simple prompts are often insufficient for:

Enterprise automation
Complex reasoning
Dynamic decision-making
Multi-system integration

Workflows allow AI systems to:

Break problems into steps
Use external tools
Validate outputs
Iterate toward solutions

What Is Tool Augmentation?

Tool augmentation allows AI systems to use external capabilities.

Examples include:

APIs
Databases
Search engines
Calculators
Business systems
Code interpreters

Why Tool Augmentation Is Important

Language models alone:

Cannot access real-time data
Cannot execute business actions directly
Cannot reliably perform all calculations

Tools extend AI capabilities.

Common Tool-Augmented Scenarios

Examples include:

Checking inventory
Booking appointments
Querying databases
Sending emails
Executing workflows
Calling REST APIs

What Is Function Calling?

Function calling enables models to:

Detect when a tool is needed
Generate structured tool requests
Invoke external services
Process returned results

Function Calling Workflow

Typical flow:

User submits request
Model determines tool requirement
Model generates function call
External tool executes
Results return to model
Model generates final response

Structured Tool Inputs

Function calling typically uses:

JSON schemas
Structured parameters
Validated inputs

This improves reliability.

Tool Selection

Agentic systems may dynamically choose:

Which tools to use
Which workflows to invoke
Which retrieval strategies to apply

Tool Orchestration

Tool orchestration coordinates multiple tools within a workflow.

Examples include:

Retrieval + summarization
Search + booking systems
Database queries + reporting

Sequential Workflows

Sequential workflows execute steps in order.

Example:

Retrieve customer data
Analyze account status
Generate recommendations
Send response

Parallel Workflows

Parallel workflows execute multiple tasks simultaneously.

Benefits include:

Faster execution
Better scalability
Reduced latency

Conditional Workflows

Conditional workflows branch based on:

User intent
Retrieved data
Safety evaluations
Confidence scores

What Is Multistep Reasoning?

Multistep reasoning breaks complex problems into smaller steps.

This improves:

Accuracy
Planning
Decision quality

Examples of Multistep Reasoning

Examples include:

Research workflows
Financial analysis
Travel planning
Technical troubleshooting

Chain-of-Thought Reasoning

Chain-of-thought reasoning encourages models to:

Reason step-by-step
Decompose problems
Validate intermediate steps

Planning and Execution Models

Agentic systems often separate:

Planning
Execution

The planner decides:

What steps are needed
Which tools to use

The executor performs actions.

Planner-Executor Architectures

Planner-executor architectures support:

Dynamic workflows
Adaptive reasoning
Task decomposition

ReAct Pattern

The ReAct (Reason + Act) pattern combines:

Reasoning
Tool usage
Observation
Iterative decision-making

Reflection and Self-Correction

Some systems support:

Self-evaluation
Output refinement
Error correction

Retrieval-Augmented Workflows

Workflows often integrate:

Vector search
RAG pipelines
Enterprise grounding

Memory in Agentic Systems

AI systems may use memory for:

Conversation history
User preferences
Workflow state
Long-running tasks

Short-Term Memory

Short-term memory stores:

Current conversation context
Immediate workflow information

Long-Term Memory

Long-term memory stores:

Persistent preferences
Historical interactions
Learned context

Workflow State Management

State management tracks:

Current task progress
Intermediate outputs
Pending actions

Human-in-the-Loop (HITL) Workflows

High-risk workflows may require:

Human approvals
Validation checkpoints
Escalation paths

Approval Gates

Approval gates can prevent:

Unsafe actions
Unauthorized tool usage
Harmful outputs

Safety and Governance

Organizations should enforce:

Tool restrictions
Permission boundaries
Safety filters
Approval workflows

Autonomous vs Semi-Autonomous Agents

Autonomous Agents

Can:

Make decisions independently
Execute workflows automatically

Semi-Autonomous Agents

Require:

Human review
Approval checkpoints

Workflow Monitoring

Organizations should monitor:

Tool usage
Failures
Safety violations
Latency
Costs

Trace Logging

Trace logging helps track:

Workflow execution
Tool calls
Reasoning steps
Agent decisions

Error Handling in Workflows

Workflow pipelines should handle:

API failures
Missing data
Timeout errors
Invalid outputs

Retry Strategies

Common retry strategies include:

Automatic retries
Fallback workflows
Alternative tool selection

Fallback Models

Applications may use fallback models when:

Primary models fail
Costs exceed thresholds
Latency becomes excessive

Workflow Optimization

Optimization strategies include:

Parallel processing
Caching
Smaller models
Efficient retrieval

Latency Considerations

Complex workflows may increase latency due to:

Multiple model calls
Tool invocations
Retrieval operations

Cost Considerations

Tool-augmented systems may increase:

Token usage
API calls
Infrastructure costs

Azure AI Foundry Workflow Capabilities

Azure AI Foundry supports:

Model orchestration
Tool integration
Agent workflows
Evaluation pipelines
Monitoring

Common AI-103 Workflow Scenarios

Scenario 1: Enterprise Research Assistant

Requirements:

Multi-document retrieval
Summarization
Citation generation

Recommended Workflow:

RAG + multistep reasoning

Scenario 2: Customer Service Agent

Requirements:

CRM access
Ticket management
Escalation workflows

Recommended Workflow:

Tool-augmented agent

Scenario 3: Financial Approval System

Requirements:

Risk evaluation
Human approvals
Audit logging

Recommended Workflow:

HITL approval pipeline

Scenario 4: AI Coding Assistant

Requirements:

Code generation
Code execution
Documentation retrieval

Recommended Workflow:

Code model + tool orchestration

Common AI-103 Exam Tips

Understand Workflow Patterns

Know:

Sequential workflows
Parallel workflows
Conditional workflows

Learn Tool-Augmented AI Concepts

Understand:

Function calling
Tool orchestration
Dynamic tool selection

Understand Multistep Reasoning

Know:

Chain-of-thought reasoning
Planner-executor patterns
ReAct workflows

Learn Governance Concepts

Understand:

HITL workflows
Approval gates
Monitoring
Trace logging

Summary

Modern AI applications increasingly rely on:

Workflow orchestration
Tool augmentation
Multistep reasoning
Agentic architectures

For the AI-103 exam, you should understand:

AI workflow design
Function calling
Tool orchestration
Sequential and parallel workflows
Multistep reasoning
Planner-executor architectures
ReAct patterns
Memory integration
HITL workflows
Monitoring and governance

These concepts enable organizations to build:

Intelligent
Autonomous
Scalable
Governed AI systems

They are foundational for modern generative AI and agentic solutions on Azure.

Practice Exam Questions

Question 1

What is the primary purpose of tool augmentation in AI systems?

A. Reduce storage costs
B. Extend model capabilities using external tools
C. Eliminate prompts
D. Replace vector search

Answer

B. Extend model capabilities using external tools

Explanation

Tool augmentation enables AI systems to interact with APIs, databases, and other services.

Question 2

What does function calling enable a model to do?

A. Generate only static responses
B. Invoke external tools using structured inputs
C. Eliminate workflows
D. Replace embeddings

Answer

B. Invoke external tools using structured inputs

Explanation

Function calling allows models to interact with external services.

Question 3

Which workflow type executes tasks simultaneously?

A. Sequential workflow
B. Parallel workflow
C. Manual workflow
D. Static workflow

Answer

B. Parallel workflow

Explanation

Parallel workflows improve speed by running tasks concurrently.

Question 4

What is multistep reasoning?

A. Compressing vector indexes
B. Breaking complex tasks into smaller reasoning steps
C. Increasing GPU memory
D. Reducing prompt size only

Answer

B. Breaking complex tasks into smaller reasoning steps

Explanation

Multistep reasoning improves problem-solving accuracy.

Question 5

What does the ReAct pattern combine?

A. Compression and storage
B. Reasoning and acting
C. Replication and scaling
D. Encryption and backup

Answer

B. Reasoning and acting

Explanation

ReAct combines reasoning steps with tool usage.

Question 6

What is the purpose of workflow state management?

A. Monitor GPU temperature
B. Track task progress and intermediate outputs
C. Disable logging
D. Replace semantic search

Answer

B. Track task progress and intermediate outputs

Explanation

State management helps maintain workflow continuity.

Question 7

Which architecture separates planning from execution?

A. Static inference architecture
B. Planner-executor architecture
C. Batch storage architecture
D. Compression architecture

Answer

B. Planner-executor architecture

Explanation

Planner-executor systems divide reasoning and execution responsibilities.

Question 8

Why are approval gates important in AI workflows?

A. They increase vector dimensions
B. They prevent unsafe or unauthorized actions
C. They reduce indexing speed
D. They eliminate monitoring requirements

Answer

B. They prevent unsafe or unauthorized actions

Explanation

Approval gates enforce governance and human oversight.

Question 9

Which concept allows AI systems to remember previous interactions?

A. Semantic ranking
B. Memory integration
C. Static chunking
D. GPU partitioning

Answer

B. Memory integration

Explanation

Memory enables contextual continuity and long-running workflows.

Question 10

What is a major challenge of complex AI workflows?

A. Eliminating all costs
B. Increased latency from multiple operations
C. Removing all need for monitoring
D. Preventing all hallucinations automatically

Answer

B. Increased latency from multiple operations

Explanation

Complex workflows may require multiple model calls and tool executions.

Go to the AI-103 Exam Prep Hub main page

AI, AI-103, Azure AI, Microsoft Certification May 25, 2026

Implement Retrieval-Augmented Generation (RAG) in an application (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
   --> Build generative applications by using Foundry
      --> Implement Retrieval-Augmented Generation (RAG) in an application

Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Large language models (LLMs) are powerful, but they have limitations.

LLMs may:

Hallucinate information
Generate outdated responses
Lack organization-specific knowledge
Produce unverifiable answers

Retrieval-Augmented Generation (RAG) addresses these issues by combining:

Information retrieval
Vector search
Enterprise knowledge grounding
Generative AI

The AI-103: Develop AI Apps and Agents on Azure certification exam tests your understanding of how to implement RAG-based applications.

For the AI-103 exam, you should understand:

RAG architecture
Vector search
Embeddings
Chunking strategies
Indexing
Semantic search
Grounding techniques
Prompt augmentation
Retrieval pipelines
RAG optimization
Monitoring and evaluation
Security considerations

What Is Retrieval-Augmented Generation (RAG)?

RAG is an AI architecture that combines:

Information retrieval
Context augmentation
Generative AI

Instead of relying only on model training data, RAG retrieves relevant information from external sources and injects it into prompts.

Why RAG Matters

RAG improves:

Accuracy
Grounding
Freshness of information
Enterprise knowledge integration
Explainability

Common RAG Use Cases

Typical RAG applications include:

Enterprise chatbots
Knowledge assistants
Internal documentation search
Customer support systems
Research assistants
AI copilots

Core Components of a RAG System

A RAG solution typically includes:

Data sources
Chunking pipeline
Embedding model
Vector database or search index
Retrieval engine
Large language model
Prompt orchestration layer

RAG Workflow Overview

The general workflow is:

Ingest data
Split data into chunks
Generate embeddings
Store embeddings in an index
Receive user query
Convert query to embeddings
Retrieve relevant chunks
Add retrieved context to prompt
Generate grounded response

What Are Embeddings?

Embeddings are numerical vector representations of data.

Embeddings capture:

Semantic meaning
Contextual similarity
Relationships between concepts

Embedding Models

Embedding models convert:

Text
Documents
Queries

Into vectors for similarity comparison.

Vector Similarity Search

Vector search identifies content that is semantically similar.

Unlike keyword search, vector search understands:

Meaning
Intent
Context

What Is Chunking?

Chunking divides documents into smaller sections.

Chunking is essential because:

Models have token limits
Smaller chunks improve retrieval precision
Large documents are difficult to process efficiently

Chunking Strategies

Common chunking methods include:

Fixed-size chunking
Sliding window chunking
Semantic chunking
Paragraph-based chunking

Fixed-Size Chunking

Documents are split into equal-sized chunks.

Advantages:

Simple
Predictable

Disadvantages:

May break context unexpectedly

Sliding Window Chunking

Chunks overlap partially.

Benefits include:

Better context preservation
Improved retrieval continuity

Semantic Chunking

Semantic chunking groups logically related content.

Advantages:

Better contextual integrity
Higher retrieval quality

Metadata in RAG Systems

Metadata may include:

Document title
Author
Date
Category
Security labels

Metadata improves filtering and retrieval.

Indexing in RAG Systems

Indexes store:

Embeddings
Metadata
Searchable content

Indexes enable efficient retrieval.

Vector Databases and Search Indexes

RAG systems commonly use:

Azure AI Search
Vector indexes
Hybrid search systems

Semantic Search

Semantic search improves relevance using:

Meaning
Intent
Natural language understanding

Hybrid Search

Hybrid search combines:

Keyword search
Semantic ranking
Vector similarity search

This often improves retrieval quality.

Retrieval Pipelines

Retrieval pipelines:

Process user queries
Retrieve relevant information
Rank search results
Filter irrelevant content

Query Embeddings

User queries are converted into embeddings.

The query vector is compared against stored vectors.

Similarity Metrics

Common similarity calculations include:

Cosine similarity
Euclidean distance
Dot product similarity

Top-K Retrieval

Top-K retrieval returns the most relevant results.

Choosing the right K value is important:

Too few results may miss context
Too many results may add noise

Prompt Augmentation

Retrieved content is inserted into prompts.

This process is called:

Prompt grounding
Context injection
Prompt augmentation

Grounded Responses

Grounded responses:

Reference trusted data
Reduce hallucinations
Improve reliability

System Prompts in RAG

System prompts may instruct the model to:

Use only retrieved sources
Cite references
Avoid unsupported claims

Citation Generation

Many RAG applications provide:

Source references
Citations
Linked documents

This improves transparency.

Hallucination Reduction

RAG reduces hallucinations by:

Providing factual context
Using enterprise knowledge
Restricting unsupported generation

RAG Architecture Patterns

Common patterns include:

Basic RAG
Hybrid RAG
Multi-stage retrieval
Agentic RAG

Basic RAG

Basic RAG:

Retrieves documents
Injects them into prompts
Generates responses

Hybrid RAG

Hybrid RAG combines:

Vector search
Keyword search
Semantic ranking

Multi-Stage Retrieval

Multi-stage retrieval uses:

Initial retrieval
Re-ranking
Filtering
Secondary refinement

Agentic RAG

Agentic RAG systems may:

Choose retrieval tools dynamically
Perform iterative searches
Validate retrieved data
Orchestrate workflows

Azure AI Search in RAG

Azure AI Search commonly provides:

Vector search
Semantic ranking
Hybrid search
Index management

Data Ingestion Pipelines

RAG ingestion pipelines may process:

PDFs
Web pages
Databases
Office documents
Structured data

Data Freshness

Organizations should ensure indexes remain current.

Strategies include:

Scheduled reindexing
Incremental ingestion
Event-driven updates

Access Control in RAG

Enterprise RAG systems should enforce:

Role-based access
Document-level security
Identity-aware retrieval

Security Considerations

Organizations should secure:

Data ingestion pipelines
Search indexes
Embedding endpoints
Model endpoints

Monitoring RAG Systems

Organizations should monitor:

Retrieval quality
Grounding quality
Latency
Hallucinations
Search relevance

Evaluating RAG Performance

Key evaluation metrics include:

Precision
Recall
Relevance
Groundedness
Citation accuracy

Groundedness Evaluation

Groundedness measures whether responses are supported by retrieved content.

Retrieval Quality Evaluation

Organizations should evaluate:

Search result relevance
Ranking effectiveness
Missing context

Latency Optimization

RAG pipelines can introduce additional latency.

Optimization strategies include:

Caching
Smaller embeddings
Efficient indexing
Query optimization

Cost Optimization

Cost reduction strategies include:

Limiting retrieved chunks
Smaller embedding models
Efficient indexing
Intelligent caching

Responsible AI Considerations

Developers should:

Validate sources
Prevent data leakage
Monitor hallucinations
Enforce safety policies

Common AI-103 RAG Scenarios

Scenario 1: Enterprise Knowledge Chatbot

Requirements:

Internal document access
Accurate answers
Source citations

Recommended Solution:

RAG with Azure AI Search

Scenario 2: Legal Document Assistant

Requirements:

High factual accuracy
Traceability
Large document support

Recommended Solution:

Semantic chunking
Hybrid search
Citation generation

Scenario 3: Customer Support Copilot

Requirements:

Fast retrieval
Grounded answers
Updated knowledge

Recommended Solution:

Incremental indexing
Real-time retrieval

Scenario 4: Agentic AI Workflow

Requirements:

Dynamic retrieval
Multi-step reasoning
Tool orchestration

Recommended Solution:

Agentic RAG architecture

Common AI-103 Exam Tips

Understand the RAG Workflow

Know all stages:

Ingestion
Chunking
Embeddings
Indexing
Retrieval
Prompt augmentation
Generation

Learn Embedding Concepts

Understand:

Semantic vectors
Similarity search
Embedding models

Understand Search Types

Know the differences between:

Keyword search
Vector search
Semantic search
Hybrid search

Understand Grounding

Know how grounding:

Reduces hallucinations
Improves factual accuracy
Supports explainability

Summary

Retrieval-Augmented Generation (RAG) is one of the most important generative AI architectures.

For the AI-103 exam, you should understand:

RAG architecture
Embeddings
Chunking
Indexing
Vector search
Semantic search
Hybrid search
Prompt grounding
Retrieval pipelines
Groundedness evaluation
Security considerations
Monitoring and optimization

RAG enables organizations to build:

Accurate
Explainable
Grounded
Enterprise-aware AI applications

These concepts are foundational for modern AI systems on Azure.

Practice Exam Questions

Question 1

What is the primary goal of Retrieval-Augmented Generation (RAG)?

A. Reduce storage replication
B. Improve factual grounding using retrieved data
C. Eliminate vector search
D. Replace all language models

Answer

B. Improve factual grounding using retrieved data

Explanation

RAG improves accuracy by injecting retrieved information into prompts.

Question 2

What are embeddings?

A. GPU drivers
B. Numerical vector representations of data
C. Network security policies
D. Storage replication methods

Answer

B. Numerical vector representations of data

Explanation

Embeddings represent semantic meaning as vectors.

Question 3

Why is chunking important in RAG systems?

A. To increase network latency
B. To divide documents into manageable sections
C. To disable semantic search
D. To eliminate embeddings

Answer

B. To divide documents into manageable sections

Explanation

Chunking improves retrieval efficiency and contextual relevance.

Question 4

Which search method understands semantic meaning instead of exact keywords?

A. Static indexing
B. Vector search
C. Archive retrieval
D. Compression balancing

Answer

B. Vector search

Explanation

Vector search retrieves semantically similar content.

Question 5

What does hybrid search combine?

A. GPU clusters and storage accounts
B. Keyword search and vector search
C. Virtual machines and containers
D. Authentication and authorization

Answer

B. Keyword search and vector search

Explanation

Hybrid search combines lexical and semantic retrieval methods.

Question 6

What is prompt augmentation?

A. Increasing storage capacity
B. Adding retrieved context to prompts
C. Compressing vectors
D. Removing metadata

Answer

B. Adding retrieved context to prompts

Explanation

Prompt augmentation injects retrieved content into model prompts.

Question 7

What is groundedness?

A. GPU allocation efficiency
B. Whether responses are supported by retrieved sources
C. Network bandwidth usage
D. Storage replication speed

Answer

B. Whether responses are supported by retrieved sources

Explanation

Groundedness measures factual support from retrieved content.

Question 8

Which Azure service is commonly used for vector and semantic search in RAG systems?

A. Azure AI Search
B. Azure DNS
C. Azure Backup
D. Azure Batch

Answer

A. Azure AI Search

Explanation

Azure AI Search supports vector, semantic, and hybrid search.

Question 9

What is a major advantage of semantic chunking?

A. It eliminates embeddings
B. It preserves contextual meaning better
C. It disables retrieval
D. It reduces authentication requirements

Answer

B. It preserves contextual meaning better

Explanation

Semantic chunking groups logically related content.

Question 10

Which metric evaluates whether retrieved results are relevant?

A. Groundedness
B. Retrieval quality
C. GPU utilization
D. Storage redundancy

Answer

B. Retrieval quality

Explanation

Retrieval quality measures the relevance of retrieved documents.

Go to the AI-103 Exam Prep Hub main page

AI, AI-103, Artificial Intelligence (AI), Generative AI, Microsoft Certification May 25, 2026

Deploy and consume LLMs, small models, code models, and multimodal models (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
   --> Build generative applications by using Foundry
      --> Deploy and consume LLMs, small models, code models, and multimodal models

Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Modern AI applications rely on a wide variety of AI models.

Different models are optimized for different workloads, including:

Conversational AI
Code generation
Text summarization
Image understanding
Audio processing
Reasoning tasks
Agentic workflows

The AI-103: Develop AI Apps and Agents on Azure certification exam tests your understanding of how to deploy and consume AI models in Azure AI Foundry.

For the AI-103 exam, you should understand:

Large language models (LLMs)
Small language models (SLMs)
Code models
Multimodal models
Model deployment concepts
Model consumption patterns
API-based model access
Endpoint configuration
Performance and cost tradeoffs
Model selection strategies
Responsible AI considerations

What Are Large Language Models (LLMs)?

Large language models are advanced AI systems trained on massive datasets.

LLMs can:

Generate text
Summarize documents
Answer questions
Translate languages
Reason across prompts
Support conversational AI

Common LLM Use Cases

Typical use cases include:

AI assistants
Enterprise chatbots
Content generation
Knowledge retrieval
Agent orchestration
Workflow automation

Characteristics of LLMs

LLMs typically provide:

Strong reasoning
Broad general knowledge
Advanced conversational abilities
Complex instruction following

However, they also:

Require more compute
Cost more to run
May introduce higher latency

What Are Small Language Models (SLMs)?

Small language models are lightweight models optimized for:

Faster inference
Lower cost
Lower latency
Edge deployment
Specialized tasks

Common SLM Use Cases

SLMs are often used for:

Classification
Simple chatbots
Mobile applications
Embedded AI
Lightweight assistants

Benefits of Small Models

Advantages include:

Reduced infrastructure cost
Faster response times
Lower resource requirements
Easier deployment at scale

LLM vs SLM Tradeoffs

LLMs

Best for:

Complex reasoning
Broad knowledge
Multi-step tasks

Tradeoffs:

Higher cost
Higher latency
Larger infrastructure requirements

SLMs

Best for:

Lightweight inference
Narrow tasks
Cost-sensitive workloads

Tradeoffs:

Reduced reasoning capability
Smaller context windows
Less flexibility

What Are Code Models?

Code models are specialized AI models trained for software development tasks.

These models can:

Generate code
Explain code
Complete functions
Debug issues
Convert between languages

Common Code Model Use Cases

Typical scenarios include:

Developer copilots
Code generation
Documentation generation
Test generation
Refactoring assistance

Code Model Capabilities

Code models often support:

Multiple programming languages
Natural language prompts
Code reasoning
Syntax understanding

What Are Multimodal Models?

Multimodal models process multiple types of input.

Examples include:

Text and images
Text and audio
Video and text

Multimodal AI Capabilities

Multimodal models may support:

Image understanding
OCR
Visual question answering
Audio transcription
Speech interaction
Video analysis

Common Multimodal Use Cases

Examples include:

AI vision assistants
Document understanding
Medical imaging analysis
Voice assistants
Image captioning

Model Deployment in Azure AI Foundry

Azure AI Foundry enables developers to:

Discover models
Deploy models
Test models
Monitor deployments
Consume models through APIs

Model Catalogs

Azure AI Foundry provides access to:

Foundation models
Open-source models
Specialized models
Multimodal models

Deployment Concepts

A deployment makes a model available through:

APIs
Endpoints
Applications
Agent workflows

Deployment Types

Common deployment options include:

Managed online deployments
Serverless deployments
Real-time inference endpoints
Batch inference deployments

Real-Time Inference

Real-time inference is used for:

Interactive chat
AI assistants
Live applications
Agent workflows

Batch Inference

Batch inference is used for:

Large-scale document processing
Offline analysis
Scheduled workloads
Bulk content generation

Endpoint Configuration

Deployments expose endpoints for application access.

Endpoints may include:

Authentication
Rate limits
Scaling policies
Monitoring settings

Authentication and Authorization

Applications may access models using:

API keys
Managed identities
Microsoft Entra ID
Role-based access control (RBAC)

Consuming Models Through APIs

Applications consume deployed models using:

REST APIs
SDKs
Client libraries

Prompt-Based Interactions

Generative AI applications commonly interact with models through prompts.

Prompts may include:

Instructions
Context
Examples
Retrieved documents

System Prompts

System prompts define:

AI behavior
Tone
Constraints
Safety policies

Model Parameters

Common inference parameters include:

Temperature
Top-p
Max tokens
Frequency penalty
Presence penalty

Temperature

Temperature controls output randomness.

Lower temperature:

More deterministic
More predictable

Higher temperature:

More creative
More variable

Context Windows

Context windows determine how much information a model can process in a request.

Larger context windows support:

Long conversations
Large documents
Multi-document grounding

Streaming Responses

Streaming enables applications to receive responses incrementally.

Benefits include:

Improved user experience
Faster perceived response times

Grounding Models

Grounding improves factual accuracy by providing trusted data.

Grounded applications commonly use:

Vector search
Retrieval-Augmented Generation (RAG)
Enterprise knowledge sources

Model Selection Considerations

Developers should evaluate:

Accuracy
Cost
Latency
Context size
Reasoning ability
Multimodal support
Scalability

Choosing Between Models

Use LLMs When:

Complex reasoning is required
Broad knowledge is needed
Multi-step workflows are involved

Use SLMs When:

Low latency matters
Cost optimization is critical
Tasks are narrow or repetitive

Use Code Models When:

Building developer tools
Generating code
Supporting programming workflows

Use Multimodal Models When:

Images or audio are required
Visual understanding is needed
Mixed media inputs are processed

Scaling Model Deployments

Scaling strategies may include:

Autoscaling
Regional deployments
Load balancing
Rate limiting

Monitoring Deployments

Organizations should monitor:

Latency
Throughput
Token usage
Errors
Safety events
Cost

Cost Optimization

Cost optimization strategies include:

Choosing smaller models
Limiting token usage
Caching responses
Using batch processing

Responsible AI Considerations

Developers should implement:

Safety filters
Guardrails
Content moderation
Monitoring
Human oversight

Multimodal Safety Concerns

Multimodal systems may require:

Image moderation
OCR filtering
Audio moderation
Content safety evaluation

Agentic AI and Model Consumption

AI agents may use:

LLMs for reasoning
SLMs for lightweight tasks
Code models for automation
Multimodal models for perception

Common AI-103 Deployment Scenarios

Scenario 1: Enterprise Chatbot

Requirements:

Strong reasoning
Long conversations
Grounded responses

Recommended Model:

LLM with RAG

Scenario 2: Mobile AI Assistant

Requirements:

Fast responses
Low cost
Lightweight inference

Recommended Model:

Small language model

Scenario 3: Developer Copilot

Requirements:

Code generation
Programming assistance
Syntax awareness

Recommended Model:

Code model

Scenario 4: Image-Aware AI Assistant

Requirements:

Image analysis
OCR
Text generation

Recommended Model:

Multimodal model

Common AI-103 Exam Tips

Understand Model Categories

Know the differences between:

LLMs
SLMs
Code models
Multimodal models

Learn Deployment Concepts

Understand:

Endpoints
Real-time inference
Batch inference
Scaling

Learn Consumption Patterns

Know:

REST APIs
SDKs
Prompt engineering
System prompts

Understand Cost and Performance Tradeoffs

Know how:

Model size affects cost
Context size affects latency
Scaling impacts performance

Summary

Azure AI Foundry enables developers to deploy and consume a wide range of AI models.

For the AI-103 exam, you should understand:

LLMs
Small language models
Code models
Multimodal models
Deployment options
Model consumption patterns
Prompt engineering
Scaling strategies
Cost optimization
Responsible AI controls

Choosing the right model and deployment strategy is essential for building:

Scalable
Reliable
Efficient
Responsible AI solutions

These concepts are foundational for generative AI and agentic systems on Azure.

Practice Exam Questions

Question 1

What is a primary strength of large language models (LLMs)?

A. Minimal compute usage
B. Complex reasoning and broad knowledge
C. Guaranteed factual accuracy
D. Extremely low latency

Answer

B. Complex reasoning and broad knowledge

Explanation

LLMs excel at reasoning, conversation, and broad knowledge tasks.

Question 2

Which model type is best suited for lightweight, low-cost inference?

A. Large language model
B. Small language model
C. Multimodal model
D. Vision transformer only

Answer

B. Small language model

Explanation

SLMs are optimized for lower latency and reduced cost.

Question 3

Which model type is specifically optimized for programming tasks?

A. Vision model
B. Code model
C. Embedding model
D. Speech model

Answer

B. Code model

Explanation

Code models are trained for software development workflows.

Question 4

What is a defining feature of multimodal models?

A. They only process text
B. They process multiple input types
C. They eliminate inference costs
D. They require no prompting

Answer

B. They process multiple input types

Explanation

Multimodal models handle text, images, audio, and other media.

Question 5

Which deployment type is best for interactive AI chat applications?

A. Batch inference
B. Real-time inference
C. Archive deployment
D. Offline storage deployment

Answer

B. Real-time inference

Explanation

Interactive applications require low-latency real-time inference.

Question 6

What does the temperature parameter control?

A. Network throughput
B. Output randomness and creativity
C. Storage replication
D. GPU memory allocation

Answer

B. Output randomness and creativity

Explanation

Temperature affects how deterministic or creative outputs become.

Question 7

Which technique improves factual accuracy by using trusted data sources?

A. GPU scaling
B. Retrieval-Augmented Generation (RAG)
C. Semantic caching
D. Compression indexing

Answer

B. Retrieval-Augmented Generation (RAG)

Explanation

RAG grounds model outputs using retrieved enterprise data.

Question 8

What is a major benefit of streaming responses?

A. Reduced storage costs
B. Faster perceived response times
C. Elimination of monitoring
D. Improved vector indexing

Answer

B. Faster perceived response times

Explanation

Streaming improves user experience during response generation.

Question 9

Which authentication method supports passwordless access to Azure AI services?

A. Static credentials only
B. Managed identities
C. Anonymous access
D. Embedded API secrets in code

Answer

B. Managed identities

Explanation

Managed identities support secure, keyless authentication.

Question 10

Which model type is most appropriate for image understanding and OCR tasks?

A. Small language model
B. Multimodal model
C. Traditional relational database
D. Static rules engine

Answer

B. Multimodal model

Explanation

Multimodal models process images and text together.

Go to the AI-103 Exam Prep Hub main page

AI, AI-103, Microsoft Certification May 25, 2026

Integrate Foundry projects with Continuous Integration and Continuous Deployment (CI/CD) pipelines (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Plan and manage an Azure AI solution (25–30%)
   --> Set up AI solutions in Foundry
      --> Integrate Foundry projects with Continuous Integration and Continuous Deployment (CI/CD) pipelines

Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Modern AI applications and agent-based systems are continuously evolving.

Organizations frequently update:

AI models
Prompts
Agent workflows
APIs
Retrieval systems
Infrastructure
Security configurations

Manual deployment processes are slow, error-prone, and difficult to scale.

To solve these challenges, organizations use:

Continuous Integration (CI)
Continuous Deployment (CD)
Automated testing
Infrastructure-as-Code (IaC)
Automated validation pipelines

The AI-103: Develop AI Apps and Agents on Azure certification exam tests your understanding of how to integrate Azure AI Foundry projects into CI/CD pipelines.

For the AI-103 exam, you should understand:

CI/CD concepts
Azure DevOps pipelines
GitHub Actions workflows
Infrastructure-as-Code
Automated AI deployment workflows
Model versioning
Deployment automation
Testing and validation
Environment management
Rollback strategies
Monitoring deployment health

What Is CI/CD?

CI/CD stands for:

Continuous Integration
Continuous Deployment (or Continuous Delivery)

CI/CD automates software and AI deployment processes.

Continuous Integration (CI)

Continuous Integration focuses on:

Automatically building code
Running automated tests
Validating changes
Detecting issues early

Developers frequently merge changes into shared repositories.

Continuous Deployment (CD)

Continuous Deployment automates:

Application releases
Model deployments
Infrastructure updates
Environment promotion

CD ensures new versions are deployed safely and consistently.

Why CI/CD Matters for AI Solutions

AI systems are more complex than traditional applications because they include:

Models
Prompts
Retrieval pipelines
Vector indexes
Agent workflows
Tool integrations

CI/CD helps ensure:

Reliable deployments
Repeatable processes
Faster releases
Reduced downtime
Safer experimentation

Azure AI Foundry and CI/CD

Azure AI Foundry integrates with:

Azure DevOps
GitHub Actions
Infrastructure-as-Code tools
Azure CLI
SDKs
REST APIs

This enables automated AI workflows.

Source Control for AI Projects

AI projects should use source control systems.

Common repositories include:

GitHub
Azure Repos

What Should Be Stored in Source Control?

Common AI assets include:

Application code
Prompt templates
Agent configurations
Infrastructure definitions
Deployment scripts
Evaluation workflows
Test cases
CI/CD pipeline definitions

What Should NOT Be Stored in Source Control?

Never store:

Secrets
API keys
Passwords
Certificates
Sensitive credentials

Use Azure Key Vault instead.

Azure DevOps

Azure DevOps provides:

Repositories
Build pipelines
Release pipelines
Work tracking
Artifact management

Azure DevOps is commonly used for enterprise AI deployments.

GitHub Actions

GitHub Actions supports:

Automated workflows
Build automation
Testing pipelines
Deployment automation
CI/CD orchestration

GitHub Actions is widely used for AI applications hosted in GitHub repositories.

Infrastructure-as-Code (IaC)

Infrastructure-as-Code automates infrastructure provisioning.

Instead of manually creating resources, infrastructure is defined in code.

Benefits of IaC

IaC provides:

Repeatability
Version control
Consistency
Automation
Reduced configuration drift

Common IaC Tools in Azure

Common Azure IaC tools include:

ARM templates
Bicep
Terraform

Bicep

Bicep is a declarative language for Azure infrastructure.

Used to deploy:

Azure OpenAI resources
Azure AI Search
Storage accounts
Networking resources
Key Vault
App Services

Terraform

Terraform is a multi-cloud Infrastructure-as-Code tool.

Useful for:

Hybrid environments
Multi-cloud deployments
Large enterprise automation

Automating Azure AI Resource Deployment

CI/CD pipelines can automatically provision:

Azure OpenAI
Azure AI Search
Cosmos DB
Azure Functions
App Service
Networking
Monitoring services

Automating Model Deployments

Model deployment pipelines may automate:

Model version selection
Deployment creation
Endpoint configuration
Scaling configuration
Rollback management

Model Versioning

Versioning is critical for AI deployments.

Benefits include:

Safer updates
Rollback support
Testing new versions
Comparing performance

Environment Management

AI solutions commonly use multiple environments.

Typical environments include:

Development
Testing
Staging
Production

Development Environment

Used for:

Experimentation
Initial testing
Prompt development
Rapid iteration

Testing Environment

Used for:

Automated testing
Integration testing
Validation workflows

Staging Environment

Used for:

Final validation
Production-like testing
User acceptance testing

Production Environment

Used for:

Live workloads
Enterprise applications
Customer-facing systems

Production environments require:

Strong monitoring
Security controls
Scalability
High availability

Automated Testing in AI Pipelines

Testing AI systems is more complex than traditional software testing.

AI pipelines should validate:

Functional behavior
Prompt quality
Retrieval quality
Latency
Safety
Reliability

Unit Testing

Unit testing validates:

Individual functions
APIs
Tool integrations
Components

Integration Testing

Integration testing validates interactions between:

Models
APIs
Search systems
Databases
Agents

Prompt Evaluation

Prompt evaluation helps assess:

Response quality
Groundedness
Hallucinations
Relevance
Consistency

Automated Evaluation Pipelines

Evaluation pipelines may measure:

Accuracy
Latency
Token usage
Toxicity
Retrieval precision

Prompt Flow and CI/CD

Prompt Flow can integrate into CI/CD pipelines.

Prompt Flow supports:

Workflow orchestration
Evaluation pipelines
Prompt testing
Tool integration

Deployment Strategies

Safe deployment strategies reduce risk.

Blue-Green Deployments

Blue-green deployments use two environments:

Current production environment
New deployment environment

Traffic switches after validation.

Benefits:

Reduced downtime
Easy rollback
Safer deployments

Canary Deployments

Canary deployments release updates gradually.

Benefits:

Reduced deployment risk
Easier issue detection
Controlled rollout

Rolling Deployments

Rolling deployments update systems incrementally.

Benefits:

Minimal downtime
Gradual infrastructure replacement

Rollback Strategies

Rollback mechanisms are critical.

Rollbacks may restore:

Previous model versions
Prior prompts
Earlier infrastructure states

Deployment Approval Gates

Approval gates help control production releases.

Approvals may be required before:

Production deployment
Model upgrades
Infrastructure changes

Security in CI/CD Pipelines

Security is a major AI-103 topic.

Azure Key Vault Integration

Pipelines should retrieve secrets securely from:

Azure Key Vault

Examples include:

API keys
Connection strings
Certificates

Managed Identities

Managed identities reduce the need for stored credentials.

Benefits:

Improved security
Simplified authentication
Reduced secret exposure

Role-Based Access Control (RBAC)

RBAC limits access to:

Deployments
Resources
Pipelines
Secrets

Monitoring CI/CD Pipelines

Pipelines should monitor:

Build failures
Deployment failures
Performance regressions
AI quality degradation

Azure Monitor

Azure Monitor supports:

Metrics
Alerts
Logging
Diagnostics

Application Insights

Application Insights helps monitor:

API latency
Failures
Dependency performance
User behavior

AI-Specific Monitoring

AI systems should monitor:

Token usage
Hallucination rates
Retrieval quality
Tool execution failures
Prompt performance

Common AI-103 CI/CD Scenarios

Scenario 1: Enterprise AI Copilot

Requirements:

Frequent prompt updates
Safe production releases
Automated testing

Recommended Approach:

GitHub Actions
Prompt Flow evaluations
Canary deployments

Scenario 2: Large-Scale AI Platform

Requirements:

Infrastructure automation
Multi-environment deployment
Enterprise governance

Recommended Approach:

Azure DevOps
Bicep or Terraform
Approval gates

Scenario 3: AI Agent Workflow System

Requirements:

Frequent workflow updates
Tool integration testing
Prompt validation

Recommended Approach:

Automated evaluation pipelines
Integration testing
Blue-green deployment strategy

Cost Optimization in CI/CD

CI/CD pipelines can increase operational costs.

Cost Optimization Strategies

Use Automated Cleanup

Remove:

Temporary environments
Test resources
Unused deployments

Optimize Test Frequency

Run expensive evaluations only when necessary.

Use Smaller Models for Testing

Smaller models reduce:

Token usage
Compute costs
Evaluation expenses

Common AI-103 Exam Tips

Understand CI/CD Fundamentals

Know:

Continuous Integration
Continuous Deployment
Automated testing
Deployment automation

Learn Deployment Strategies

Understand:

Blue-green deployments
Canary deployments
Rolling deployments
Rollback strategies

Know Infrastructure-as-Code Concepts

Understand:

Bicep
Terraform
ARM templates

Understand AI-Specific Testing

AI systems require testing for:

Prompt quality
Groundedness
Safety
Retrieval accuracy
Latency

Summary

Integrating Azure AI Foundry projects with CI/CD pipelines enables organizations to:

Automate deployments
Improve reliability
Increase scalability
Reduce operational risk
Accelerate AI delivery

For the AI-103 exam, you should understand:

CI/CD fundamentals
Azure DevOps pipelines
GitHub Actions workflows
Infrastructure-as-Code
Automated AI deployment strategies
Environment management
AI testing pipelines
Monitoring and observability
Secure deployment practices
Rollback and release strategies

Strong CI/CD practices are essential for building production-grade AI applications and agent-based systems on Azure.

Practice Exam Questions

Question 1

What does CI/CD stand for?

A. Continuous Integration and Continuous Deployment
B. Centralized Integration and Continuous Diagnostics
C. Continuous Inspection and Cloud Deployment
D. Centralized Infrastructure and Cloud Distribution

Answer

A. Continuous Integration and Continuous Deployment

Explanation

CI/CD automates software and AI deployment workflows.

Question 2

Which Azure service is commonly used for enterprise CI/CD pipelines?

A. Azure DevOps
B. Azure Backup
C. Azure DNS
D. Azure Files

Answer

A. Azure DevOps

Explanation

Azure DevOps provides build, release, and deployment pipeline capabilities.

Question 3

Which GitHub feature supports automated workflow execution for deployments?

A. GitHub Actions
B. GitHub Storage
C. GitHub Search
D. GitHub Monitor

Answer

A. GitHub Actions

Explanation

GitHub Actions automates workflows, testing, and deployments.

Question 4

Which deployment strategy uses two environments and switches traffic after validation?

A. Rolling deployment
B. Blue-green deployment
C. Canary deployment
D. Manual deployment

Answer

B. Blue-green deployment

Explanation

Blue-green deployments reduce downtime and simplify rollback.

Question 5

Which Azure service securely stores secrets for CI/CD pipelines?

A. Azure Key Vault
B. Azure Monitor
C. Azure Firewall
D. Azure CDN

Answer

A. Azure Key Vault

Explanation

Azure Key Vault securely stores secrets and credentials.

Question 6

Which Infrastructure-as-Code language is specifically designed for Azure?

A. Bicep
B. SQL
C. JavaScript
D. HTML

Answer

A. Bicep

Explanation

Bicep is a declarative Infrastructure-as-Code language for Azure.

Question 7

What is the primary purpose of canary deployments?

A. Eliminate monitoring
B. Gradually release updates to reduce risk
C. Replace version control
D. Encrypt model endpoints

Answer

B. Gradually release updates to reduce risk

Explanation

Canary deployments expose updates to a subset of users first.

Question 8

Which type of testing validates interactions between models, APIs, and databases?

A. Unit testing
B. Integration testing
C. Syntax testing
D. Deployment testing

Answer

B. Integration testing

Explanation

Integration testing validates component interactions.

Question 9

Which Azure service helps monitor application telemetry and diagnostics?

A. Application Insights
B. Azure DNS
C. Azure Backup
D. Azure Files

Answer

A. Application Insights

Explanation

Application Insights provides telemetry and monitoring capabilities.

Question 10

Which Azure feature reduces the need to store credentials directly in pipelines?

A. Managed identities
B. Public IP addresses
C. Azure CDN
D. Static tokens

Answer

A. Managed identities

Explanation

Managed identities provide secure authentication without storing credentials.

Go to the AI-103 Exam Prep Hub main page