This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub.
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
--> Build agents by using Foundry
--> Build agents that integrate retrieval, function-calling, and conversation memory
Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.
Introduction
Modern AI agents are far more capable than traditional chatbots.
Today’s enterprise AI agents can:
- Retrieve enterprise knowledge
- Call APIs and tools
- Maintain memory across conversations
- Perform multistep workflows
- Coordinate reasoning and actions
Azure AI Foundry provides the infrastructure and orchestration capabilities needed to build these advanced agentic systems.
For the AI-103: Develop AI Apps and Agents on Azure certification exam, understanding how to build agents that integrate:
- Retrieval
- Function-calling
- Conversation memory
is extremely important.
These capabilities are foundational to enterprise generative AI systems.
What Is an AI Agent?
An AI agent is an AI-powered system capable of:
- Understanding goals
- Maintaining context
- Using tools
- Retrieving information
- Performing actions
- Adapting to new inputs
Agents extend beyond simple prompt-response interactions.
Core Components of Modern Agents
Modern agents commonly include:
- Large language models (LLMs)
- Retrieval systems
- Tool integrations
- Function-calling frameworks
- Memory systems
- Workflow orchestration
- Safety controls
Retrieval in Agent Systems
Retrieval allows agents to:
- Access external knowledge
- Ground responses in enterprise data
- Improve factual accuracy
- Reduce hallucinations
Why Retrieval Matters
LLMs are trained on static datasets.
Without retrieval:
- Models may lack current information
- Enterprise-specific knowledge may be unavailable
- Hallucinations become more likely
Retrieval-Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) combines:
- Search and retrieval systems
- LLM reasoning and generation
RAG allows agents to generate responses using retrieved content.
Typical RAG Workflow
A common RAG workflow includes:
- User submits a query
- Query is converted to embeddings
- Search retrieves relevant documents
- Documents are added to prompts
- LLM generates grounded responses
Knowledge Sources for Retrieval
Agents may retrieve data from:
- Azure AI Search
- Vector databases
- SQL databases
- Document repositories
- SharePoint
- Blob storage
- Knowledge bases
Vector Search
Vector search enables semantic retrieval.
Instead of keyword matching only, vector search finds:
- Meaning
- Similarity
- Contextual relationships
Embeddings
Embeddings are numerical vector representations of text or data.
Embeddings help systems:
- Measure semantic similarity
- Perform vector search
- Improve retrieval relevance
Chunking Strategies
Documents are often split into smaller chunks before indexing.
Chunking improves:
- Retrieval precision
- Context quality
- Token efficiency
Retrieval Pipelines
Retrieval pipelines commonly include:
- Data ingestion
- Chunking
- Embedding generation
- Indexing
- Query retrieval
- Reranking
Hybrid Search
Hybrid search combines:
- Keyword search
- Vector search
This improves search quality.
Grounding Responses
Grounding means generating responses using retrieved evidence.
Grounded systems are:
- More accurate
- More explainable
- More reliable
Citation and Source Attribution
Agents may include:
- Source links
- Document citations
- Retrieved evidence
This improves transparency.
Function-Calling in Agent Systems
Function-calling allows models to invoke:
- APIs
- Services
- Workflows
- Databases
- External tools
Why Function-Calling Matters
LLMs alone cannot:
- Access live systems
- Execute actions
- Retrieve dynamic business data
Function-calling bridges this gap.
Examples of Functions
Common functions include:
- Get weather data
- Retrieve customer records
- Create support tickets
- Query inventory systems
- Send emails
- Schedule meetings
Tool Schemas
Function-calling relies on structured tool schemas.
Schemas define:
- Tool names
- Parameters
- Data types
- Required fields
- Expected outputs
Example Function Schema
Example:
Function: GetOrderStatus
Inputs:
- OrderID
- CustomerID
Outputs:
- Shipping status
- Estimated delivery date
Structured Tool Invocation
Structured tool invocation improves:
- Reliability
- Validation
- Automation
- Error handling
Function Selection Logic
Agents may decide:
- Whether tools are needed
- Which tools to invoke
- When to call functions
- How to sequence operations
Multi-Tool Workflows
Advanced agents may orchestrate:
- Multiple tools
- Sequential workflows
- Conditional logic
- Parallel execution
Example Multi-Tool Workflow
Example:
- Retrieve customer data
- Query billing system
- Generate summary
- Create support ticket
- Send notification
Tool Safety Controls
Organizations should control:
- Which tools agents can access
- Which users may trigger actions
- Which workflows require approval
Human-in-the-Loop Approvals
High-risk operations may require:
- Human review
- Approval checkpoints
- Escalation workflows
Conversation Memory
Conversation memory allows agents to:
- Maintain context
- Track interactions
- Remember prior information
- Continue workflows
Why Memory Matters
Without memory:
- Conversations become disconnected
- Users repeat information
- Workflow continuity breaks
Types of Memory
Common memory types include:
- Short-term memory
- Long-term memory
- Episodic memory
- Semantic memory
Short-Term Memory
Short-term memory stores:
- Recent prompts
- Recent responses
- Current task state
Long-Term Memory
Long-term memory stores:
- User preferences
- Historical interactions
- Persistent context
Stateful vs Stateless Agents
Stateless Agents
Do not retain memory between sessions.
Benefits:
- Simpler architecture
- Lower storage requirements
Stateful Agents
Maintain context and conversation history.
Benefits:
- Better user experiences
- Improved multistep reasoning
Context Window Limitations
LLMs have limited context windows.
Applications must manage:
- Token usage
- Conversation length
- Historical context
Memory Management Strategies
Common strategies include:
- Rolling conversation windows
- Summarized history
- Vector memory retrieval
- Persistent storage systems
Vector Memory
Conversation history may be stored as embeddings.
This enables:
- Semantic memory retrieval
- Long-term contextual recall
- Personalized interactions
Retrieval-Based Memory
Agents may retrieve:
- Prior conversations
- Historical workflow data
- Previous decisions
Persistent Memory Storage
Persistent memory may use:
- Databases
- Search indexes
- Vector stores
- Cloud storage
Agent Orchestration
Orchestration coordinates:
- Retrieval systems
- Function-calling
- Memory systems
- Workflow execution
Agent Reasoning Loops
Agents may perform iterative reasoning:
- Analyze request
- Retrieve information
- Call tools
- Evaluate outputs
- Continue reasoning
- Generate response
Workflow State Management
Agents may track:
- Active tasks
- Tool outputs
- Pending actions
- Workflow progress
Azure AI Foundry and Agent Development
Azure AI Foundry supports:
- Model deployment
- Retrieval integration
- Agent orchestration
- Prompt flows
- Evaluation pipelines
- Monitoring and governance
Azure AI Search in Agent Systems
Azure AI Search commonly provides:
- Vector indexing
- Semantic ranking
- Hybrid search
- Enterprise retrieval
Prompt Engineering for Agents
Effective prompts define:
- Agent role
- Behavioral expectations
- Tool usage rules
- Safety constraints
Grounded Prompt Construction
Grounded prompts may include:
- Retrieved documents
- Citations
- Tool outputs
- Prior conversation context
Monitoring Agent Systems
Organizations should monitor:
- Retrieval relevance
- Tool-call accuracy
- Memory quality
- Latency
- Hallucinations
- Safety events
Evaluating RAG Systems
RAG systems should be evaluated for:
- Retrieval quality
- Relevance
- Faithfulness
- Grounding accuracy
- Citation quality
Evaluating Function-Calling
Organizations should validate:
- Correct tool selection
- Parameter accuracy
- Workflow reliability
- Error recovery
Evaluating Conversation Memory
Memory systems should be evaluated for:
- Context retention
- Consistency
- Recall accuracy
- Session continuity
Security Considerations
Secure agent systems should implement:
- Authentication
- Authorization
- Managed identities
- RBAC
- Private networking
- Audit logging
Responsible AI Considerations
Organizations should apply:
- Safety filters
- Guardrails
- Human oversight
- Content moderation
- Usage monitoring
Real-World Scenario
Scenario: Enterprise HR Assistant
Requirements:
- Retrieve HR policies
- Answer employee questions
- Access scheduling systems
- Remember user preferences
- Escalate sensitive requests
Recommended Design:
- RAG using Azure AI Search
- Function-calling for HR systems
- Stateful conversation memory
- Approval workflows for sensitive actions
- Grounded response generation
Common AI-103 Exam Tips
Understand Retrieval Concepts
Know:
- RAG
- Embeddings
- Vector search
- Hybrid search
- Grounding
Learn Function-Calling Concepts
Understand:
- Tool schemas
- Structured invocation
- Tool orchestration
- Workflow execution
Understand Memory Systems
Know:
- Stateful vs stateless agents
- Short-term vs long-term memory
- Context management
- Vector memory
Understand Agent Orchestration
Know how agents combine:
- Retrieval
- Tool usage
- Memory
- Reasoning
Summary
Modern enterprise agents combine:
- Retrieval systems
- Function-calling
- Conversation memory
- Workflow orchestration
For the AI-103 exam, you should understand:
- RAG architectures
- Vector search
- Embeddings
- Grounding
- Function-calling
- Tool schemas
- Tool orchestration
- Stateful memory
- Context management
- Agent reasoning loops
- Monitoring and governance
These concepts are foundational to building scalable and intelligent AI agents with Azure AI Foundry.
Practice Exam Questions
Question 1
What is the primary purpose of Retrieval-Augmented Generation (RAG)?
A. Reduce GPU temperatures
B. Combine retrieval systems with LLM generation
C. Eliminate vector search
D. Replace APIs completely
Answer
B. Combine retrieval systems with LLM generation
Explanation
RAG combines retrieval and generation to improve grounded responses.
Question 2
Why are embeddings important in retrieval systems?
A. They increase firewall security
B. They enable semantic similarity comparisons
C. They replace orchestration engines
D. They remove token limits
Answer
B. They enable semantic similarity comparisons
Explanation
Embeddings support semantic vector search.
Question 3
What is a key advantage of hybrid search?
A. It disables semantic ranking
B. It combines keyword and vector search
C. It removes indexing requirements
D. It eliminates embeddings
Answer
B. It combines keyword and vector search
Explanation
Hybrid search improves retrieval quality by combining approaches.
Question 4
What is the purpose of function-calling in agent systems?
A. Reduce network traffic only
B. Allow models to invoke external tools and services
C. Eliminate APIs
D. Disable workflows
Answer
B. Allow models to invoke external tools and services
Explanation
Function-calling enables interaction with external systems.
Question 5
What information is typically included in a tool schema?
A. GPU temperature metrics
B. Parameters, data types, and outputs
C. Only firewall settings
D. Only vector dimensions
Answer
B. Parameters, data types, and outputs
Explanation
Schemas define structured tool interfaces.
Question 6
Why is conversation memory important?
A. It reduces all storage costs
B. It maintains continuity and context across interactions
C. It removes orchestration needs
D. It disables tool invocation
Answer
B. It maintains continuity and context across interactions
Explanation
Memory improves user experiences and multistep workflows.
Question 7
What is a characteristic of stateful agents?
A. They never store context
B. They maintain conversation history and state
C. They disable retrieval systems
D. They remove prompt engineering
Answer
B. They maintain conversation history and state
Explanation
Stateful agents retain memory across interactions.
Question 8
What is a common challenge when using LLM conversation memory?
A. Unlimited context windows
B. Context window limitations and token constraints
C. Elimination of embeddings
D. Removal of grounding
Answer
B. Context window limitations and token constraints
Explanation
LLMs can process only limited amounts of context.
Question 9
Which Azure service is commonly used for enterprise retrieval in RAG architectures?
A. Azure DevOps
B. Azure AI Search
C. Azure Virtual Desktop
D. Azure Batch
Answer
B. Azure AI Search
Explanation
Azure AI Search supports vector and hybrid search for RAG systems.
Question 10
What should organizations monitor in agent systems?
A. Only GPU fan speeds
B. Retrieval quality, tool usage, memory accuracy, and safety
C. Only prompt lengths
D. Only authentication failures
Answer
B. Retrieval quality, tool usage, memory accuracy, and safety
Explanation
Comprehensive monitoring improves reliability, governance, and user trust.
Go to the AI-103 Exam Prep Hub main page
