Category: Artificial Intelligence (AI)

Build autonomous or semi-autonomous workflows with safeguards and approval flow controls (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
--> Build agents by using Foundry
--> Build autonomous or semi-autonomous workflows with safeguards and approval flow controls


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Modern AI agents are increasingly capable of:

  • Making decisions
  • Executing workflows
  • Calling tools
  • Accessing enterprise systems
  • Performing multistep reasoning

As agents become more autonomous, organizations must ensure these systems operate safely, securely, and within governance boundaries.

Azure AI Foundry supports the development of autonomous and semiautonomous AI workflows with:

  • Guardrails
  • Approval workflows
  • Human oversight
  • Tool restrictions
  • Safety controls
  • Audit logging

For the AI-103: Develop AI Apps and Agents on Azure certification exam, understanding safeguards and approval mechanisms is an important topic.


What Are Autonomous AI Workflows?

Autonomous workflows are systems in which AI agents can:

  • Make decisions independently
  • Invoke tools automatically
  • Execute multistep processes
  • Complete tasks without continuous human intervention

Examples of Autonomous Workflows

Examples include:

  • Automated ticket routing
  • Financial reconciliation
  • Inventory management
  • Scheduling assistants
  • IT remediation workflows
  • Document processing pipelines

What Are Semiautonomous Workflows?

Semiautonomous workflows combine:

  • AI-driven automation
  • Human oversight
  • Approval checkpoints

These systems automate low-risk tasks while escalating higher-risk decisions.


Human-in-the-Loop Systems

Human-in-the-loop (HITL) systems require human review for:

  • Sensitive actions
  • Compliance decisions
  • Financial operations
  • External communications
  • Policy exceptions

Why Safeguards Matter

Without safeguards, AI agents may:

  • Execute unsafe actions
  • Generate inaccurate outputs
  • Access unauthorized systems
  • Trigger harmful workflows
  • Violate compliance requirements

Types of Safeguards

Common safeguards include:

  • Approval workflows
  • Tool restrictions
  • Role-based access control (RBAC)
  • Safety filters
  • Content moderation
  • Policy enforcement
  • Rate limiting
  • Audit logging

Approval Flow Controls

Approval flow controls require authorization before:

  • Executing actions
  • Sending communications
  • Modifying systems
  • Accessing sensitive data

Common Approval Scenarios

Examples include:

  • Approving payments
  • Deploying infrastructure
  • Publishing external communications
  • Updating customer records
  • Triggering high-impact workflows

Workflow States

Approval workflows commonly include states such as:

  • Pending
  • Approved
  • Rejected
  • Escalated
  • Completed

Escalation Workflows

Escalation mechanisms route requests to:

  • Supervisors
  • Compliance teams
  • Security reviewers
  • Human operators

when confidence or risk thresholds are exceeded.


Confidence Thresholds

Agents may use confidence scores to determine:

  • Whether to continue autonomously
  • Whether to escalate to humans
  • Whether additional validation is required

Risk-Based Decisioning

Organizations may classify actions by risk level:

  • Low-risk actions may execute automatically
  • Medium-risk actions may require validation
  • High-risk actions may require approval

Tool Access Controls

Agents should only access:

  • Approved APIs
  • Authorized databases
  • Permitted workflows
  • Scoped enterprise systems

Least Privilege Principle

Agents should receive:

  • Minimal required permissions
  • Restricted credentials
  • Scoped tool access

Managed Identities

Managed identities improve security by:

  • Eliminating embedded secrets
  • Providing secure Azure authentication
  • Supporting RBAC enforcement

Role-Based Access Control (RBAC)

RBAC ensures:

  • Agents only access authorized resources
  • Users receive appropriate permissions
  • Workflows follow governance rules

Guardrails

Guardrails are controls that constrain agent behavior.

Guardrails help:

  • Prevent unsafe outputs
  • Restrict tool usage
  • Enforce policies
  • Reduce hallucinations

Examples of Guardrails

Examples include:

  • Blocking unsafe prompts
  • Restricting financial transactions
  • Limiting external communications
  • Preventing access to sensitive data

Content Moderation

Content moderation systems detect:

  • Harmful content
  • Offensive language
  • Sensitive material
  • Unsafe requests

Safety Filters

Safety filters help block:

  • Violence
  • Hate speech
  • Self-harm content
  • Prompt injection attacks

Prompt Injection Risks

Prompt injection attacks attempt to:

  • Override instructions
  • Bypass safeguards
  • Manipulate agent behavior
  • Access restricted tools

Defending Against Prompt Injection

Defenses include:

  • Tool restrictions
  • Input validation
  • Output filtering
  • Instruction hierarchy
  • Retrieval validation

Validation Agents

Validation agents can:

  • Review outputs
  • Verify citations
  • Check policy compliance
  • Detect hallucinations

before actions are executed.


Approval Chains

Complex workflows may require:

  • Multiple approvers
  • Sequential approvals
  • Department-level authorization

Autonomous vs Semiautonomous Systems

Autonomous Systems

Advantages:

  • Faster execution
  • Reduced manual effort
  • Increased automation

Risks:

  • Reduced oversight
  • Higher operational risk
  • Greater need for safeguards

Semiautonomous Systems

Advantages:

  • Human oversight
  • Better governance
  • Reduced risk

Tradeoffs:

  • Slower workflows
  • Increased operational involvement

Agent Orchestration

Orchestration coordinates:

  • Agent interactions
  • Workflow progression
  • Approval stages
  • Tool invocation

Conditional Workflow Logic

Conditional workflows may:

  • Branch based on confidence
  • Escalate high-risk tasks
  • Retry failed actions
  • Invoke specialized agents

Workflow State Tracking

State tracking records:

  • Current workflow stage
  • Agent outputs
  • Approval status
  • Tool usage history

Audit Logging

Audit logs may capture:

  • Agent decisions
  • Tool invocations
  • Approval actions
  • User interactions
  • Workflow changes

Traceability

Traceability improves:

  • Governance
  • Compliance
  • Debugging
  • Operational transparency

Observability

Observability helps teams:

  • Diagnose failures
  • Monitor workflows
  • Analyze agent behavior
  • Improve orchestration

Monitoring Autonomous Workflows

Organizations should monitor:

  • Workflow success rates
  • Escalation frequency
  • Tool failures
  • Safety events
  • Approval bottlenecks

Safety Evaluations

Safety evaluations assess:

  • Harmful outputs
  • Hallucination rates
  • Compliance violations
  • Prompt injection resistance

Testing Agent Workflows

Organizations should test:

  • Edge cases
  • Failure scenarios
  • Prompt attacks
  • Escalation logic
  • Approval workflows

Failure Recovery

Recovery strategies include:

  • Retries
  • Rollbacks
  • Human intervention
  • Fallback workflows
  • Secondary validation

Rate Limiting

Rate limiting helps:

  • Prevent abuse
  • Reduce accidental loops
  • Protect backend systems
  • Control operational costs

Timeouts and Execution Limits

Agents should have:

  • Maximum execution times
  • Retry thresholds
  • Resource limits
  • Tool usage limits

Sandboxing

Sandboxing isolates:

  • Tool execution
  • Code execution
  • Experimental workflows

from production systems.


Retrieval-Augmented Workflows

Grounded workflows use:

  • Retrieval systems
  • Vector search
  • Enterprise knowledge stores

to improve response accuracy.


Azure AI Search Integration

Azure AI Search supports:

  • Semantic search
  • Hybrid search
  • Vector search
  • Retrieval pipelines

for grounded workflows.


Responsible AI Principles

Responsible AI systems should prioritize:

  • Fairness
  • Reliability
  • Safety
  • Privacy
  • Transparency
  • Accountability

Transparency in Agent Systems

Users should understand:

  • When AI is making decisions
  • When approvals are required
  • What actions are being executed
  • What data is being used

Real-World Scenario

Scenario: Financial Approval Agent

Requirements:

  • Process expense reimbursements
  • Approve low-risk transactions automatically
  • Escalate high-value transactions
  • Log all actions
  • Enforce compliance rules

Recommended Design:

  • Approval workflows
  • Confidence thresholds
  • Validation agents
  • RBAC controls
  • Managed identities
  • Audit logging
  • Human approval for high-risk actions

Common AI-103 Exam Tips

Understand Workflow Types

Know:

  • Autonomous workflows
  • Semiautonomous workflows
  • Human-in-the-loop systems

Learn Safeguard Mechanisms

Understand:

  • Guardrails
  • Approval workflows
  • Tool restrictions
  • Safety filters
  • Content moderation

Learn Security Concepts

Know:

  • RBAC
  • Managed identities
  • Least privilege
  • Tool authorization

Understand Monitoring and Auditing

Know:

  • Trace logging
  • Audit logging
  • Workflow monitoring
  • Safety evaluations

Summary

Autonomous and semiautonomous AI workflows enable:

  • Enterprise automation
  • Coordinated agent execution
  • Tool-driven workflows
  • Intelligent orchestration

For the AI-103 exam, you should understand:

  • Autonomous workflows
  • Semiautonomous workflows
  • Human-in-the-loop systems
  • Approval flow controls
  • Guardrails
  • Safety filters
  • Content moderation
  • Prompt injection defenses
  • Tool restrictions
  • RBAC
  • Managed identities
  • Audit logging
  • Workflow monitoring
  • Validation agents
  • Escalation logic
  • Responsible AI controls

These capabilities are critical for building safe enterprise AI systems with Azure AI Foundry.


Practice Exam Questions

Question 1

What is a semiautonomous workflow?

A. A workflow with no automation
B. A workflow combining AI automation with human oversight
C. A workflow that disables approvals
D. A workflow without safeguards

Answer

B. A workflow combining AI automation with human oversight

Explanation

Semiautonomous systems automate tasks while incorporating human review.


Question 2

What is the purpose of approval flow controls?

A. Increase hallucinations
B. Require authorization before sensitive actions execute
C. Eliminate governance
D. Remove monitoring

Answer

B. Require authorization before sensitive actions execute

Explanation

Approval workflows improve governance and safety.


Question 3

Which principle ensures agents receive minimal required permissions?

A. Semantic ranking
B. Least privilege
C. Parallel orchestration
D. Tokenization

Answer

B. Least privilege

Explanation

Least privilege reduces security exposure.


Question 4

What is a common use case for human-in-the-loop workflows?

A. GPU driver management
B. Financial approvals
C. DNS routing
D. Operating system updates

Answer

B. Financial approvals

Explanation

Sensitive decisions often require human review.


Question 5

What are guardrails used for?

A. Increasing unrestricted tool access
B. Constraining agent behavior and enforcing policies
C. Eliminating RBAC
D. Removing workflow monitoring

Answer

B. Constraining agent behavior and enforcing policies

Explanation

Guardrails help maintain safe and compliant behavior.


Question 6

What is a prompt injection attack?

A. A GPU hardware issue
B. An attempt to manipulate agent instructions or bypass safeguards
C. A storage configuration error
D. A network routing protocol

Answer

B. An attempt to manipulate agent instructions or bypass safeguards

Explanation

Prompt injection attacks target AI workflow controls.


Question 7

Why are managed identities important in autonomous systems?

A. They eliminate logging
B. They provide secure authentication without embedded secrets
C. They disable RBAC
D. They reduce vector search quality

Answer

B. They provide secure authentication without embedded secrets

Explanation

Managed identities improve credential security.


Question 8

What should audit logs capture in agent workflows?

A. Only VM temperatures
B. Agent actions, approvals, and tool invocations
C. Only DNS requests
D. Only prompt length

Answer

B. Agent actions, approvals, and tool invocations

Explanation

Audit logs improve governance and traceability.


Question 9

What is a benefit of confidence thresholds?

A. They remove monitoring requirements
B. They help determine when escalation is needed
C. They disable approval workflows
D. They eliminate retrieval systems

Answer

B. They help determine when escalation is needed

Explanation

Confidence thresholds support risk-based workflow decisions.


Question 10

Which Azure service commonly supports grounded retrieval workflows?

A. Azure AI Search
B. Azure Firewall Manager
C. Azure DNS
D. Azure Bastion

Answer

A. Azure AI Search

Explanation

Azure AI Search supports retrieval and grounding pipelines.


Go to the AI-103 Exam Prep Hub main page

Integrate agent tools, including APIs, knowledge stores, search, Content Understanding, and custom functions (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
--> Build agents by using Foundry
--> Integrate agent tools, including APIs, knowledge stores, search, Content Understanding, and custom functions


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Modern AI agents are capable of far more than generating text.

Enterprise AI agents can:

  • Access business systems
  • Retrieve enterprise knowledge
  • Search documents
  • Understand multimodal content
  • Execute workflows
  • Interact with APIs
  • Use custom functions

These capabilities are possible because modern agentic systems integrate external tools.

Azure AI Foundry provides orchestration and integration capabilities for building tool-augmented AI agents.

For the AI-103: Develop AI Apps and Agents on Azure certification exam, understanding how agents integrate with:

  • APIs
  • Knowledge stores
  • Search systems
  • Content understanding services
  • Custom functions

is a major exam objective.


What Are Agent Tools?

Agent tools are external capabilities that agents can invoke to:

  • Retrieve information
  • Perform actions
  • Execute workflows
  • Interact with systems

Why Tool Integration Matters

LLMs alone cannot:

  • Access real-time business data
  • Execute transactions
  • Query live systems
  • Retrieve private enterprise information

Tool integration enables these capabilities.


Types of Agent Tools

Common agent tools include:

  • APIs
  • Databases
  • Search services
  • Vector stores
  • Content understanding systems
  • Workflow engines
  • Custom functions
  • External applications

Tool-Augmented Agents

Tool-augmented agents combine:

  • Language reasoning
  • Retrieval systems
  • External actions
  • Workflow orchestration

APIs in Agent Systems

APIs are among the most common tools used by AI agents.

APIs allow agents to:

  • Retrieve data
  • Update systems
  • Trigger workflows
  • Access cloud services

Common API Integration Scenarios

Examples include:

  • CRM systems
  • ERP systems
  • Ticketing systems
  • Email services
  • Calendar systems
  • Inventory systems
  • Financial platforms

REST APIs

Many agent integrations use REST APIs.

REST APIs commonly support:

  • GET operations
  • POST operations
  • PUT operations
  • DELETE operations

API Authentication

Agent systems may authenticate using:

  • API keys
  • OAuth tokens
  • Managed identities
  • Microsoft Entra ID

Managed Identity Integration

Managed identities allow applications to:

  • Authenticate securely
  • Avoid storing secrets
  • Access Azure resources safely

Function-Calling

Function-calling allows models to:

  • Invoke tools dynamically
  • Generate structured requests
  • Execute external operations

Tool Schemas

Tool schemas define:

  • Tool names
  • Input parameters
  • Data types
  • Required fields
  • Expected outputs

Structured Tool Invocation

Structured invocation improves:

  • Reliability
  • Validation
  • Automation
  • Predictability

Knowledge Stores

Knowledge stores provide persistent enterprise information for retrieval.

Knowledge stores may contain:

  • Documents
  • Policies
  • Product manuals
  • Research data
  • Historical records

Why Knowledge Stores Matter

Knowledge stores allow agents to:

  • Access enterprise-specific information
  • Ground responses
  • Improve factual accuracy

Knowledge Sources

Agents may connect to:

  • Azure AI Search
  • SharePoint
  • SQL databases
  • Blob storage
  • Cosmos DB
  • Data Lake storage
  • Vector databases

Retrieval-Augmented Generation (RAG)

RAG combines:

  • Retrieval systems
  • Generative models

Retrieved data is added to prompts to improve grounded responses.


Search Systems in Agent Architectures

Search systems allow agents to:

  • Retrieve relevant content
  • Find documents
  • Search enterprise knowledge
  • Improve response quality

Azure AI Search

Azure AI Search is commonly used for:

  • Keyword search
  • Vector search
  • Hybrid search
  • Semantic ranking

Semantic Search

Semantic search focuses on:

  • Meaning
  • Context
  • Intent

rather than exact keyword matches.


Vector Search

Vector search uses embeddings to:

  • Identify semantic similarity
  • Retrieve related content
  • Improve retrieval quality

Hybrid Search

Hybrid search combines:

  • Keyword search
  • Vector search

This improves search relevance.


Embeddings

Embeddings are vector representations of data.

Embeddings support:

  • Semantic retrieval
  • Similarity comparison
  • Vector indexing

Retrieval Pipelines

Retrieval pipelines commonly include:

  1. Data ingestion
  2. Chunking
  3. Embedding generation
  4. Indexing
  5. Retrieval
  6. Reranking

Grounded Responses

Grounded responses are generated using retrieved evidence.

Grounding improves:

  • Accuracy
  • Explainability
  • Trustworthiness

Content Understanding

Content understanding systems allow agents to analyze:

  • Images
  • Documents
  • Audio
  • Video
  • Forms
  • Structured and unstructured content

Multimodal Processing

Multimodal systems process multiple content types simultaneously.

Examples include:

  • Text + images
  • Text + audio
  • Documents + tables

Azure AI Content Understanding Capabilities

Agents may integrate with services for:

  • OCR
  • Image analysis
  • Speech recognition
  • Document intelligence
  • Form extraction
  • Video analysis

OCR Integration

Optical Character Recognition (OCR) extracts text from:

  • Images
  • PDFs
  • Scanned documents

Document Intelligence

Document intelligence systems can extract:

  • Key-value pairs
  • Tables
  • Forms
  • Structured business data

Image Understanding

Agents may analyze images for:

  • Object detection
  • Caption generation
  • Classification
  • Scene understanding

Speech Integration

Speech systems enable:

  • Speech-to-text
  • Text-to-speech
  • Voice assistants
  • Audio analysis

Custom Functions

Custom functions extend agent capabilities beyond built-in tools.

Custom functions may:

  • Execute business logic
  • Integrate proprietary systems
  • Trigger workflows
  • Process specialized data

Examples of Custom Functions

Examples include:

  • Risk scoring
  • Inventory forecasting
  • Pricing calculations
  • Compliance validation
  • Workflow automation

Designing Custom Functions

Good custom functions should:

  • Be narrowly scoped
  • Use structured parameters
  • Return predictable outputs
  • Support validation

Error Handling for Tools

Agent systems should handle:

  • API failures
  • Timeouts
  • Invalid responses
  • Authentication errors
  • Missing data

Retry Logic

Retry mechanisms improve resilience when:

  • APIs temporarily fail
  • Services throttle requests
  • Network issues occur

Tool Selection Logic

Agents may decide:

  • Whether a tool is needed
  • Which tool to invoke
  • When to retrieve information
  • How to sequence actions

Multi-Tool Orchestration

Advanced agents may coordinate:

  • Search systems
  • APIs
  • Memory systems
  • Custom functions
  • Workflow engines

Workflow Coordination

Agent workflows may include:

  1. Retrieve enterprise data
  2. Analyze content
  3. Call APIs
  4. Generate summaries
  5. Execute actions

Conversation Memory Integration

Agents may combine tools with:

  • Short-term memory
  • Long-term memory
  • Context tracking
  • Session persistence

Security Considerations

Secure tool integration requires:

  • Authentication
  • Authorization
  • RBAC
  • Managed identities
  • Secret management
  • Network controls

Least Privilege Principle

Agents should receive:

  • Minimal required permissions
  • Restricted tool access
  • Scoped credentials

Monitoring Tool Usage

Organizations should monitor:

  • Tool invocation frequency
  • API failures
  • Unauthorized actions
  • Retrieval quality
  • Workflow success rates

Logging and Auditing

Logs may capture:

  • Tool calls
  • API requests
  • Workflow execution
  • Retrieved sources
  • User interactions

Responsible AI Considerations

Organizations should implement:

  • Safety filters
  • Guardrails
  • Human oversight
  • Approval workflows
  • Content moderation

Human-in-the-Loop Workflows

Sensitive operations may require:

  • Human review
  • Approval checkpoints
  • Escalation processes

Performance Optimization

Optimization strategies include:

  • Caching
  • Query optimization
  • Efficient chunking
  • Parallel tool execution
  • Response streaming

Real-World Scenario

Scenario: Enterprise Legal Assistant

Requirements:

  • Search legal documents
  • Retrieve contract clauses
  • Analyze uploaded PDFs
  • Query compliance systems
  • Generate summaries

Recommended Design:

  • Azure AI Search for retrieval
  • OCR and document intelligence
  • Function-calling for compliance APIs
  • Conversation memory for continuity
  • Approval workflows for legal actions

Common AI-103 Exam Tips

Understand Tool Integration

Know:

  • APIs
  • Function-calling
  • Tool schemas
  • Tool orchestration

Learn Retrieval Concepts

Understand:

  • RAG
  • Vector search
  • Embeddings
  • Hybrid search
  • Grounding

Understand Content Understanding

Know:

  • OCR
  • Document intelligence
  • Image analysis
  • Speech services
  • Multimodal processing

Learn Security Concepts

Understand:

  • Managed identities
  • RBAC
  • Least privilege
  • Authentication methods

Summary

Modern AI agents integrate:

  • APIs
  • Search systems
  • Knowledge stores
  • Content understanding services
  • Custom functions
  • Workflow orchestration

For the AI-103 exam, you should understand:

  • Tool integration
  • Function-calling
  • Tool schemas
  • Retrieval systems
  • Azure AI Search
  • Embeddings
  • Grounding
  • OCR and document intelligence
  • Multimodal processing
  • Custom business functions
  • Workflow orchestration
  • Monitoring and governance

These capabilities are foundational for enterprise AI agent systems built with Azure AI Foundry.


Practice Exam Questions

Question 1

Why do AI agents integrate external tools?

A. To eliminate workflows
B. To access live systems and execute actions
C. To remove retrieval systems
D. To disable APIs

Answer

B. To access live systems and execute actions

Explanation

External tools allow agents to retrieve data and perform operations.


Question 2

What is the purpose of function-calling?

A. Replace search systems
B. Allow models to invoke external tools dynamically
C. Remove authentication requirements
D. Eliminate embeddings

Answer

B. Allow models to invoke external tools dynamically

Explanation

Function-calling enables structured interaction with external systems.


Question 3

What information is typically defined in a tool schema?

A. GPU temperatures
B. Input parameters and expected outputs
C. Firewall rules only
D. VM configurations only

Answer

B. Input parameters and expected outputs

Explanation

Tool schemas standardize tool interactions.


Question 4

Which Azure service is commonly used for vector and hybrid search?

A. Azure Virtual WAN
B. Azure AI Search
C. Azure Batch
D. Azure Policy

Answer

B. Azure AI Search

Explanation

Azure AI Search supports semantic, vector, and hybrid search.


Question 5

What is the purpose of embeddings?

A. Replace APIs entirely
B. Represent data semantically for similarity comparison
C. Eliminate vector indexes
D. Remove retrieval systems

Answer

B. Represent data semantically for similarity comparison

Explanation

Embeddings support semantic retrieval.


Question 6

What is a key benefit of grounded responses?

A. Reduced monitoring needs
B. Improved factual accuracy and trustworthiness
C. Elimination of search systems
D. Removal of citations

Answer

B. Improved factual accuracy and trustworthiness

Explanation

Grounded systems use retrieved evidence to improve reliability.


Question 7

Which capability extracts text from scanned documents?

A. Vector indexing
B. OCR
C. Hybrid search
D. Tokenization

Answer

B. OCR

Explanation

OCR extracts text from images and scanned files.


Question 8

Why are managed identities important in agent systems?

A. They increase hallucinations
B. They allow secure authentication without stored secrets
C. They eliminate RBAC
D. They disable APIs

Answer

B. They allow secure authentication without stored secrets

Explanation

Managed identities improve security and credential management.


Question 9

What is an example of a custom function?

A. A GPU driver update
B. A proprietary pricing calculation workflow
C. A firewall appliance
D. A VM snapshot

Answer

B. A proprietary pricing calculation workflow

Explanation

Custom functions implement specialized business logic.


Question 10

What should organizations monitor in tool-augmented agents?

A. Only CPU temperatures
B. Tool usage, API failures, retrieval quality, and workflow success
C. Only vector dimensions
D. Only prompt length

Answer

B. Tool usage, API failures, retrieval quality, and workflow success

Explanation

Monitoring improves reliability, governance, and operational visibility.


Go to the AI-103 Exam Prep Hub main page

Define agent roles, goals, conversation-tracking approach, and tool schemas (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
--> Build agents by using Foundry
--> Define agent roles, goals, conversation-tracking approach, and tool schemas


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

AI agents are rapidly becoming one of the most important components of modern AI systems.

Unlike basic chatbots, agents can:

  • Reason through tasks
  • Maintain context
  • Use tools
  • Execute workflows
  • Coordinate multistep actions
  • Interact with external systems

Azure AI Foundry provides tools and frameworks for building agentic systems.

For the AI-103: Develop AI Apps and Agents on Azure certification exam, understanding agent design principles is critical.

This topic focuses on:

  • Agent roles
  • Agent goals
  • Conversation tracking
  • Tool schemas
  • Tool orchestration
  • State management
  • Memory design
  • Workflow coordination

What Is an AI Agent?

An AI agent is an AI system capable of:

  • Understanding objectives
  • Making decisions
  • Using tools
  • Maintaining context
  • Performing actions
  • Adapting to changing inputs

Agents are more autonomous than standard prompt-response systems.


Characteristics of AI Agents

Agents commonly include:

  • Reasoning
  • Planning
  • Memory
  • Tool usage
  • Workflow orchestration
  • Goal-oriented behavior

Agent Roles

An agent role defines:

  • The agent’s responsibilities
  • Behavioral expectations
  • Scope of operation
  • Allowed actions

Why Agent Roles Matter

Clearly defined roles help:

  • Improve consistency
  • Reduce unsafe behavior
  • Prevent scope creep
  • Improve reliability

Examples of Agent Roles

Examples include:

  • Customer support assistant
  • Financial analyst
  • Research assistant
  • Scheduling coordinator
  • Coding assistant
  • IT operations assistant

Specialized vs General-Purpose Agents

Specialized Agents

Focused on narrow tasks.

Benefits:

  • Higher reliability
  • Better governance
  • Easier evaluation

General-Purpose Agents

Handle broad tasks.

Benefits:

  • Greater flexibility
  • Wider applicability

Tradeoff:

  • Increased complexity and risk

Defining Agent Goals

Goals define:

  • Desired outcomes
  • Success criteria
  • Task objectives

Goal-Oriented Design

Good goals are:

  • Clear
  • Measurable
  • Constrained
  • Actionable

Examples of Agent Goals

Examples include:

  • Resolve customer tickets
  • Retrieve accurate company policies
  • Generate code suggestions
  • Schedule meetings
  • Summarize documents

Constraints in Goal Design

Goals should include:

  • Safety boundaries
  • Compliance rules
  • Tool restrictions
  • Escalation conditions

Agent Instructions and System Prompts

Agents typically receive:

  • System instructions
  • Behavioral guidance
  • Operational constraints

These instructions influence agent behavior.


Conversation Tracking

Conversation tracking maintains:

  • Dialogue history
  • User context
  • Workflow state
  • Interaction continuity

Why Conversation Tracking Matters

Without conversation tracking:

  • Agents lose context
  • Responses become inconsistent
  • Multistep workflows fail

Short-Term Conversation Memory

Short-term memory may store:

  • Recent prompts
  • Recent responses
  • Current workflow state

Long-Term Memory

Long-term memory may store:

  • User preferences
  • Historical interactions
  • Persistent knowledge

Session State Management

State management tracks:

  • Current tasks
  • Workflow progress
  • Tool outputs
  • Active context

Stateless vs Stateful Agents

Stateless Agents

Do not retain context between interactions.

Benefits:

  • Simpler design
  • Lower storage requirements

Stateful Agents

Maintain conversation history and workflow state.

Benefits:

  • Better continuity
  • Improved multistep reasoning

Context Window Management

LLMs have limited context windows.

Applications may need to:

  • Trim conversation history
  • Summarize prior interactions
  • Retrieve external memory

Memory Strategies

Common memory strategies include:

  • Rolling conversation windows
  • Summarization memory
  • Vector memory
  • Persistent storage

Retrieval-Augmented Memory

Agents may retrieve:

  • Historical conversations
  • Knowledge documents
  • Workflow data

This improves continuity.


Conversation Persistence

Persistent conversation storage may use:

  • Databases
  • Search indexes
  • Vector stores

Tool Usage in Agent Systems

Agents often interact with:

  • APIs
  • Databases
  • Search systems
  • External applications
  • Workflow services

What Is a Tool Schema?

A tool schema defines:

  • Tool name
  • Purpose
  • Input parameters
  • Output structure
  • Validation rules

Purpose of Tool Schemas

Tool schemas help:

  • Standardize interactions
  • Reduce ambiguity
  • Improve reliability
  • Enable function calling

Tool Schema Components

Tool schemas commonly include:

  • Function name
  • Description
  • Parameters
  • Data types
  • Required fields

Example Tool Schema

Example:

  • Tool: GetWeather
  • Inputs:
    • City name
    • Date
  • Output:
    • Temperature
    • Forecast

Structured Tool Invocation

Structured tool schemas allow agents to:

  • Generate valid requests
  • Interact predictably with systems
  • Reduce execution failures

Function Calling

Function calling enables models to:

  • Invoke external tools
  • Execute structured operations
  • Retrieve external data

Tool Selection Logic

Agents may decide:

  • Whether a tool is needed
  • Which tool to invoke
  • How to sequence tool calls

Multi-Tool Workflows

Complex agents may use:

  • Multiple tools
  • Sequential workflows
  • Conditional branching

Tool Access Controls

Organizations may restrict:

  • Which tools agents can use
  • When tools can be invoked
  • Which users may trigger actions

Safety Considerations for Tool Usage

Improper tool usage can:

  • Leak data
  • Execute unsafe actions
  • Cause workflow failures

Human Approval Workflows

Some actions may require:

  • Human review
  • Approval checkpoints
  • Escalation workflows

Agent Planning

Agents may perform:

  • Task decomposition
  • Sequential planning
  • Goal prioritization

Multistep Reasoning

Agents may:

  • Gather information
  • Use tools
  • Analyze results
  • Generate conclusions

Orchestration Frameworks

Orchestration frameworks coordinate:

  • Agent logic
  • Tool execution
  • Workflow progression
  • State transitions

Error Handling in Agents

Agents should handle:

  • Invalid tool outputs
  • API failures
  • Missing data
  • Ambiguous user requests

Monitoring Agent Behavior

Organizations should monitor:

  • Tool usage
  • Conversation quality
  • Safety violations
  • Goal completion rates

Trace Logging

Trace logs may capture:

  • Prompt sequences
  • Tool calls
  • Workflow decisions
  • Agent reasoning steps

Evaluation of Agent Systems

Organizations should evaluate:

  • Goal completion
  • Accuracy
  • Relevance
  • Safety
  • Tool reliability

Governance and Compliance

Enterprise agent systems may require:

  • Access controls
  • Audit logging
  • Compliance policies
  • Responsible AI governance

Real-World Scenario

Scenario: Enterprise IT Support Agent

Requirements:

  • Resolve common IT requests
  • Access ticketing systems
  • Maintain user context
  • Escalate high-risk actions

Recommended Design:

  • Specialized support role
  • Defined goals
  • Stateful conversation tracking
  • Structured tool schemas
  • Human approval workflows

Common AI-103 Exam Tips

Understand Agent Roles

Know:

  • Specialized vs general-purpose agents
  • Role boundaries
  • Behavioral constraints

Learn Conversation Tracking Concepts

Understand:

  • Stateful vs stateless agents
  • Memory approaches
  • Context management

Understand Tool Schemas

Know:

  • Function definitions
  • Parameters
  • Structured tool invocation
  • Function calling

Learn Governance Concepts

Understand:

  • Tool access controls
  • Human approvals
  • Audit logging
  • Safety constraints

Summary

Agent design is a core part of modern AI systems.

For the AI-103 exam, you should understand:

  • Agent roles
  • Goal-oriented behavior
  • Conversation tracking
  • Memory management
  • Stateful workflows
  • Tool schemas
  • Function calling
  • Tool orchestration
  • Workflow planning
  • Safety controls
  • Human approvals
  • Monitoring and governance

These concepts are foundational for building secure, scalable, and reliable agentic systems using Azure AI Foundry.


Practice Exam Questions

Question 1

What is the primary purpose of an agent role?

A. Increase GPU utilization
B. Define responsibilities and behavioral boundaries
C. Eliminate tool usage
D. Remove workflow orchestration

Answer

B. Define responsibilities and behavioral boundaries

Explanation

Agent roles establish scope, expectations, and operational constraints.


Question 2

Why are clearly defined agent goals important?

A. They eliminate monitoring
B. They provide measurable objectives and task direction
C. They reduce storage requirements only
D. They remove authentication needs

Answer

B. They provide measurable objectives and task direction

Explanation

Goals help agents focus on desired outcomes.


Question 3

What is the purpose of conversation tracking?

A. Increase vector dimensions
B. Maintain context and workflow continuity
C. Disable memory systems
D. Remove APIs

Answer

B. Maintain context and workflow continuity

Explanation

Conversation tracking preserves interaction history and state.


Question 4

What is a key benefit of stateful agents?

A. They avoid all storage requirements
B. They maintain continuity across interactions
C. They eliminate workflows
D. They remove tool schemas

Answer

B. They maintain continuity across interactions

Explanation

Stateful agents retain memory and conversation context.


Question 5

What is a tool schema?

A. A GPU optimization technique
B. A structured definition of tool inputs and outputs
C. A firewall policy
D. A token compression method

Answer

B. A structured definition of tool inputs and outputs

Explanation

Tool schemas standardize external tool interactions.


Question 6

What is the purpose of function calling?

A. Eliminate orchestration
B. Allow models to invoke external tools dynamically
C. Replace APIs entirely
D. Remove authentication

Answer

B. Allow models to invoke external tools dynamically

Explanation

Function calling enables structured tool execution.


Question 7

Why are tool access controls important?

A. They reduce GPU memory usage
B. They restrict unsafe or unauthorized tool usage
C. They eliminate monitoring
D. They disable workflows

Answer

B. They restrict unsafe or unauthorized tool usage

Explanation

Access controls improve safety and governance.


Question 8

What is a common challenge with large conversation histories?

A. Unlimited context windows
B. Context window limitations in LLMs
C. Elimination of memory usage
D. Reduced orchestration complexity

Answer

B. Context window limitations in LLMs

Explanation

LLMs can only process limited amounts of context.


Question 9

What is the purpose of human approval workflows?

A. Increase hallucinations
B. Provide oversight for sensitive or high-risk actions
C. Remove governance requirements
D. Disable trace logging

Answer

B. Provide oversight for sensitive or high-risk actions

Explanation

Human review reduces operational risk.


Question 10

What should organizations monitor in agent systems?

A. Only GPU temperatures
B. Tool usage, safety, conversation quality, and task completion
C. Only token counts
D. Only API latency

Answer

B. Tool usage, safety, conversation quality, and task completion

Explanation

Comprehensive monitoring improves reliability and governance.


Go to the AI-103 Exam Prep Hub main page

Configure an application to connect to a Foundry project (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
--> Build generative applications by using Foundry
--> Configure an application to connect to a Foundry project


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Azure AI Foundry provides a centralized environment for developing, deploying, and managing AI applications and agentic solutions.

Applications that use generative AI models, agents, retrieval systems, or multimodal capabilities must connect securely and reliably to Foundry projects.

This topic is important for the AI-103: Develop AI Apps and Agents on Azure certification exam.

For the AI-103 exam, you should understand:

  • Azure AI Foundry projects
  • Application connectivity
  • Authentication methods
  • SDK configuration
  • Endpoint configuration
  • Deployment configuration
  • Managed identities
  • API keys
  • Environment variables
  • Network security
  • Role-based access control (RBAC)
  • Connecting to deployed models and agents
  • Configuration management
  • Monitoring and troubleshooting

What Is an Azure AI Foundry Project?

An Azure AI Foundry project is a centralized workspace used to:

  • Manage AI resources
  • Deploy models
  • Configure agents
  • Build workflows
  • Store evaluation assets
  • Monitor AI systems

Projects help organize AI development and operations.


Components of a Foundry Project

A Foundry project may include:

  • Model deployments
  • Agent configurations
  • Prompt flows
  • Evaluation datasets
  • Connections
  • Search resources
  • Storage resources
  • Monitoring tools

Why Applications Need Project Connectivity

Applications connect to Foundry projects to:

  • Access deployed models
  • Invoke agents
  • Perform retrieval operations
  • Execute workflows
  • Use AI services securely

Common Connection Scenarios

Applications commonly connect to:

  • Chat models
  • Embedding models
  • Multimodal models
  • Agent services
  • Prompt flow endpoints
  • Azure AI Search resources

Connection Architecture

Typical connectivity includes:

  1. Application
  2. Authentication layer
  3. Foundry project endpoint
  4. Model or agent deployment

SDK-Based Connectivity

Applications often use SDKs to:

  • Authenticate
  • Send prompts
  • Receive responses
  • Stream outputs
  • Manage workflows

SDKs simplify development.


API-Based Connectivity

Applications may also use:

  • REST APIs
  • HTTP endpoints
  • Direct service requests

Authentication Methods

Applications must authenticate securely.

Common methods include:

  • API keys
  • Managed identities
  • Azure Active Directory (Azure AD)
  • Keyless authentication

API Key Authentication

API keys are:

  • Simple to configure
  • Easy for development and testing

However, they require secure storage.


Managed Identity Authentication

Managed identities provide:

  • Secretless authentication
  • Improved security
  • Automatic credential management

Managed identity is recommended for production workloads.


Azure AD Authentication

Azure AD enables:

  • Enterprise identity management
  • Role-based access
  • Secure authentication workflows

Keyless Authentication

Keyless authentication reduces:

  • Credential exposure
  • Secret management overhead

Secure Credential Storage

Applications should avoid:

  • Hardcoded secrets
  • Plain-text credentials

Credentials should be stored securely.


Environment Variables

Environment variables commonly store:

  • API endpoints
  • Deployment names
  • Keys
  • Configuration settings

Configuration Files

Applications may use:

  • JSON configuration files
  • YAML files
  • Application settings

Endpoint Configuration

Applications must connect to the correct:

  • Foundry endpoint
  • Model deployment endpoint
  • Agent endpoint

Deployment Names

Applications typically reference:

  • Specific deployment names
  • Model identifiers
  • Agent identifiers

Connecting to Model Deployments

Applications may connect to:

  • Chat completion models
  • Embedding models
  • Code models
  • Multimodal models

Connecting to Agent Workflows

Applications may invoke agents that:

  • Use tools
  • Access memory
  • Execute workflows
  • Coordinate tasks

Connecting to Prompt Flows

Applications can invoke:

  • Prompt flow endpoints
  • Orchestrated workflows
  • Multi-step pipelines

Connecting to Azure AI Search

RAG applications often connect to:

  • Azure AI Search
  • Vector indexes
  • Semantic search pipelines

Role-Based Access Control (RBAC)

RBAC controls:

  • Resource permissions
  • Service access
  • Administrative privileges

Least Privilege Principle

Applications should receive:

  • Only required permissions
  • Minimal access rights

Private Networking

Organizations may secure connectivity using:

  • Private endpoints
  • Virtual networks
  • Network isolation

Firewall Configuration

Firewall rules may restrict:

  • Public access
  • Unauthorized IP ranges

Secure Communication

Applications should use:

  • HTTPS
  • Encrypted communication
  • Secure APIs

SDK Initialization

Applications typically initialize:

  • Client objects
  • Authentication providers
  • Connection settings

Client Configuration

Client configuration may include:

  • Endpoint URLs
  • API versions
  • Deployment names
  • Authentication credentials

Streaming Configuration

Applications may enable:

  • Streaming responses
  • Incremental output rendering

Retry Policies

Applications should implement:

  • Retry logic
  • Exponential backoff
  • Timeout handling

Error Handling

Applications should handle:

  • Authentication failures
  • Network issues
  • Rate limits
  • Invalid requests

Logging and Monitoring

Applications should log:

  • Requests
  • Responses
  • Failures
  • Latency metrics

Observability

Observability helps organizations:

  • Monitor usage
  • Diagnose issues
  • Improve reliability

Application Scalability

Applications should support:

  • High concurrency
  • Distributed workloads
  • Elastic scaling

Cost Considerations

Connection design impacts:

  • Token usage
  • API consumption
  • Search operations
  • Infrastructure costs

CI/CD Integration

Connection settings may be managed through:

  • Deployment pipelines
  • Infrastructure as code
  • Environment promotion

Development vs Production Environments

Organizations often separate:

  • Development
  • Testing
  • Staging
  • Production

Each environment may use different:

  • Endpoints
  • Credentials
  • Policies

Multi-Region Connectivity

Global applications may connect to:

  • Multiple regional deployments
  • Regional failover systems

High Availability

Applications should support:

  • Redundant deployments
  • Failover strategies
  • Resilient architecture

Governance Considerations

Organizations may enforce:

  • Access policies
  • Security baselines
  • Audit logging
  • Compliance requirements

Troubleshooting Connectivity Issues

Common issues include:

  • Invalid credentials
  • Incorrect endpoints
  • Missing RBAC permissions
  • Network restrictions
  • Deployment mismatches

Performance Optimization

Organizations should optimize:

  • Connection reuse
  • Latency
  • Request batching
  • Streaming efficiency

Real-World Scenario

Scenario: Enterprise AI Assistant

Requirements:

  • Secure authentication
  • RAG integration
  • Agent orchestration
  • Enterprise access control

Recommended Approach:

  • Managed identity
  • RBAC
  • Private networking
  • Azure AI Search integration
  • SDK-based connectivity

Common AI-103 Exam Tips

Understand Authentication Options

Know when to use:

  • API keys
  • Managed identities
  • Azure AD

Understand Endpoint Configuration

Know:

  • Deployment names
  • Service endpoints
  • Agent endpoints

Learn RBAC Concepts

Understand:

  • Least privilege
  • Role assignments
  • Secure access management

Understand Networking Concepts

Know:

  • Private endpoints
  • Firewalls
  • Secure connectivity

Learn Application Integration Concepts

Understand:

  • SDK initialization
  • Client configuration
  • Retry logic
  • Monitoring

Summary

Connecting applications to Azure AI Foundry projects is a foundational skill for AI-103.

For the exam, you should understand:

  • Foundry projects
  • Application connectivity
  • SDK integration
  • API integration
  • Authentication methods
  • Managed identities
  • RBAC
  • Deployment configuration
  • Endpoint management
  • Networking security
  • Logging and monitoring
  • Scalability and reliability

These skills are essential for building secure, scalable enterprise AI applications on Azure.


Practice Exam Questions

Question 1

What is the purpose of an Azure AI Foundry project?

A. Replace Azure subscriptions
B. Centrally manage AI resources, deployments, and workflows
C. Eliminate authentication
D. Replace APIs entirely

Answer

B. Centrally manage AI resources, deployments, and workflows

Explanation

Foundry projects organize AI development and operational assets.


Question 2

Which authentication method is recommended for production Azure workloads?

A. Hardcoded credentials
B. Managed identity
C. Shared public keys
D. Anonymous access

Answer

B. Managed identity

Explanation

Managed identities improve security by avoiding embedded secrets.


Question 3

What is a primary advantage of SDKs?

A. They eliminate APIs completely
B. They simplify application development and integration
C. They remove all authentication requirements
D. They prevent monitoring

Answer

B. They simplify application development and integration

Explanation

SDKs provide abstractions that simplify connectivity and workflow development.


Question 4

Why should applications use environment variables?

A. To increase GPU performance
B. To securely manage configuration values
C. To eliminate authentication
D. To disable RBAC

Answer

B. To securely manage configuration values

Explanation

Environment variables help manage endpoints and credentials securely.


Question 5

What does RBAC primarily control?

A. Token compression
B. Permissions and access to resources
C. Model quantization
D. Network bandwidth

Answer

B. Permissions and access to resources

Explanation

RBAC enforces authorization policies.


Question 6

Why are private endpoints used?

A. To increase hallucinations
B. To improve network security and isolate traffic
C. To disable monitoring
D. To reduce embedding dimensions

Answer

B. To improve network security and isolate traffic

Explanation

Private endpoints help secure enterprise AI workloads.


Question 7

What is commonly required when connecting to a deployed model?

A. Deployment name
B. Firewall removal
C. Disabling authentication
D. Public anonymous access

Answer

A. Deployment name

Explanation

Applications typically reference deployment identifiers.


Question 8

Why should applications implement retry policies?

A. To increase hallucinations
B. To recover from transient failures and improve reliability
C. To disable APIs
D. To remove authentication

Answer

B. To recover from transient failures and improve reliability

Explanation

Retry logic improves resiliency.


Question 9

Which service is commonly integrated for RAG search functionality?

A. Azure AI Search
B. Azure DNS
C. Azure Backup
D. Azure Batch

Answer

A. Azure AI Search

Explanation

Azure AI Search supports vector and semantic retrieval.


Question 10

What is the least privilege principle?

A. Give all users full access
B. Grant only the permissions necessary to perform required tasks
C. Disable RBAC
D. Allow anonymous authentication

Answer

B. Grant only the permissions necessary to perform required tasks

Explanation

Least privilege reduces security risk by minimizing unnecessary permissions.


Go to the AI-103 Exam Prep Hub main page

Evaluate models and apps, including detecting fabrications, relevance, quality, and safety (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
--> Build generative applications by using Foundry
--> Evaluate models and apps, including detecting fabrications, relevance, quality, and safety


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Building generative AI applications is only part of the development process.

Organizations must also evaluate whether AI systems are:

  • Accurate
  • Reliable
  • Relevant
  • Safe
  • Grounded
  • Trustworthy

AI systems can generate:

  • Hallucinations
  • Unsafe content
  • Biased responses
  • Irrelevant answers
  • Inconsistent outputs

The AI-103: Develop AI Apps and Agents on Azure certification exam tests your understanding of evaluating models and applications.

For the AI-103 exam, you should understand:

  • Model evaluation
  • Application evaluation
  • Fabrication detection
  • Groundedness
  • Relevance evaluation
  • Quality evaluation
  • Safety evaluation
  • Responsible AI testing
  • Automated evaluators
  • Human evaluation
  • Benchmarking
  • Monitoring and continuous evaluation

Why AI Evaluation Matters

Evaluation is essential because generative AI systems are probabilistic.

This means:

  • Responses may vary
  • Outputs may be incorrect
  • Safety risks may occur
  • Hallucinations may appear

Without evaluation, organizations cannot reliably trust AI systems.


What Is AI Evaluation?

AI evaluation is the process of measuring:

  • Accuracy
  • Safety
  • Reliability
  • Relevance
  • Groundedness
  • User satisfaction

Types of AI Evaluation

Common evaluation categories include:

  • Model evaluation
  • Prompt evaluation
  • Retrieval evaluation
  • Application evaluation
  • Safety evaluation
  • Human evaluation

Model Evaluation

Model evaluation focuses on:

  • Model quality
  • Accuracy
  • Performance
  • Reasoning ability

Application Evaluation

Application evaluation measures:

  • End-to-end user experience
  • Workflow success
  • Tool orchestration quality
  • Groundedness

What Are Fabrications?

Fabrications are generated outputs that:

  • Are incorrect
  • Are unsupported
  • Contain invented facts
  • Misrepresent information

Fabrications are commonly called hallucinations.


Causes of Fabrications

Fabrications may occur because:

  • The model lacks relevant knowledge
  • Prompts are ambiguous
  • Retrieval quality is poor
  • Context is insufficient
  • Safety constraints are weak

Fabrication Detection

Organizations should evaluate whether outputs:

  • Match trusted sources
  • Remain grounded
  • Avoid unsupported claims

Groundedness Evaluation

Groundedness measures whether responses are supported by:

  • Retrieved documents
  • Enterprise data
  • Trusted sources

Importance of Groundedness

Grounded responses:

  • Improve trust
  • Reduce hallucinations
  • Increase explainability

Retrieval Quality Evaluation

RAG systems should evaluate:

  • Search relevance
  • Retrieved chunk quality
  • Citation accuracy
  • Context completeness

Relevance Evaluation

Relevance measures whether responses:

  • Answer the user’s question
  • Stay on-topic
  • Match user intent

Quality Evaluation

Quality evaluations may assess:

  • Clarity
  • Completeness
  • Coherence
  • Fluency
  • Professionalism

Consistency Evaluation

Consistency measures whether models:

  • Produce stable responses
  • Avoid contradictory outputs
  • Maintain predictable behavior

Safety Evaluation

Safety evaluations identify:

  • Harmful outputs
  • Toxic content
  • Unsafe instructions
  • Policy violations

Responsible AI Evaluation

Responsible AI testing focuses on:

  • Fairness
  • Safety
  • Transparency
  • Accountability
  • Privacy

Bias Evaluation

Organizations should evaluate whether models:

  • Produce biased outputs
  • Treat groups unfairly
  • Reinforce stereotypes

Toxicity Detection

Toxicity evaluations identify:

  • Offensive language
  • Hate speech
  • Harassment
  • Abusive content

Jailbreak Testing

Jailbreak testing evaluates whether users can bypass:

  • Safety controls
  • Content filters
  • Guardrails

Adversarial Testing

Adversarial testing intentionally challenges models using:

  • Malicious prompts
  • Edge cases
  • Prompt injection attacks

Prompt Injection Testing

Prompt injection testing evaluates whether:

  • External content manipulates model behavior
  • Instructions override safety policies

Automated Evaluators

Automated evaluators use:

  • Rules
  • Scoring systems
  • AI-based evaluators

To assess model outputs.


AI-Assisted Evaluation

Some systems use LLMs to evaluate:

  • Relevance
  • Groundedness
  • Quality
  • Safety

Human Evaluation

Human reviewers may evaluate:

  • Accuracy
  • Tone
  • Helpfulness
  • Safety
  • Business alignment

Human-in-the-Loop Evaluation

Human-in-the-loop evaluation combines:

  • Automated evaluation
  • Human oversight
  • Expert validation

Benchmarking Models

Benchmarking compares models using:

  • Standard datasets
  • Consistent prompts
  • Defined metrics

A/B Testing

A/B testing compares:

  • Different prompts
  • Different models
  • Different workflows

Evaluation Metrics

Common metrics include:

  • Precision
  • Recall
  • Accuracy
  • Relevance
  • Groundedness
  • Toxicity scores
  • Latency
  • User satisfaction

Precision and Recall

Precision

Measures how many retrieved results are relevant.

Recall

Measures how many relevant results were successfully retrieved.


Latency Evaluation

Organizations should measure:

  • Response times
  • Retrieval delays
  • Tool execution times

Cost Evaluation

Cost evaluation considers:

  • Token usage
  • API calls
  • Infrastructure consumption

User Satisfaction Evaluation

Organizations may measure:

  • User feedback
  • Completion success
  • Satisfaction ratings

Continuous Evaluation

AI systems should be evaluated continuously because:

  • User behavior changes
  • Data evolves
  • Model drift may occur

Model Drift

Model drift occurs when:

  • Performance changes over time
  • Inputs evolve
  • User expectations shift

Monitoring Production Systems

Organizations should monitor:

  • Safety violations
  • Hallucination rates
  • Retrieval failures
  • Latency spikes
  • Cost increases

Evaluation Pipelines

Evaluation pipelines automate:

  • Testing
  • Scoring
  • Reporting
  • Regression analysis

Regression Testing

Regression testing ensures updates do not:

  • Reduce quality
  • Break workflows
  • Increase hallucinations

Azure AI Foundry Evaluation Capabilities

Azure AI Foundry supports:

  • Evaluation workflows
  • Automated evaluators
  • Safety monitoring
  • Groundedness evaluation
  • Prompt testing
  • Trace analysis

Trace Analysis

Trace analysis helps inspect:

  • Tool calls
  • Retrieval steps
  • Agent decisions
  • Workflow execution

Evaluation Datasets

Organizations should create datasets containing:

  • Expected outputs
  • Edge cases
  • Adversarial prompts
  • Real-world scenarios

Synthetic Test Data

Synthetic data may help test:

  • Rare scenarios
  • Adversarial prompts
  • Safety boundaries

Real-World Evaluation Scenarios

Scenario 1: Enterprise Chatbot

Requirements:

  • Accurate responses
  • Citation support
  • Low hallucination rate

Recommended Evaluation:

  • Groundedness testing
  • Retrieval quality evaluation

Scenario 2: Financial Assistant

Requirements:

  • High accuracy
  • Safety compliance
  • Low fabrication risk

Recommended Evaluation:

  • Human review
  • Adversarial testing
  • Approval workflows

Scenario 3: Customer Support Copilot

Requirements:

  • Relevant responses
  • Fast response times
  • Consistent tone

Recommended Evaluation:

  • Latency evaluation
  • Quality scoring
  • A/B testing

Scenario 4: Agentic Workflow System

Requirements:

  • Tool accuracy
  • Safe tool execution
  • Workflow traceability

Recommended Evaluation:

  • Trace analysis
  • Tool execution monitoring
  • HITL evaluation

Common AI-103 Exam Tips

Understand Evaluation Categories

Know the differences between:

  • Relevance
  • Quality
  • Groundedness
  • Safety
  • Consistency

Learn Fabrication Detection Concepts

Understand:

  • Hallucinations
  • Unsupported claims
  • Grounding validation

Understand Safety Testing

Know:

  • Toxicity testing
  • Jailbreak testing
  • Prompt injection evaluation
  • Adversarial testing

Learn Monitoring Concepts

Understand:

  • Continuous evaluation
  • Drift detection
  • Trace analysis
  • Regression testing

Summary

Evaluating generative AI systems is critical for building:

  • Reliable
  • Safe
  • Grounded
  • Trustworthy applications

For the AI-103 exam, you should understand:

  • Fabrication detection
  • Groundedness evaluation
  • Retrieval quality
  • Relevance testing
  • Quality evaluation
  • Safety evaluation
  • Toxicity detection
  • Adversarial testing
  • Human evaluation
  • Automated evaluators
  • Monitoring and drift detection
  • Evaluation pipelines

These concepts are foundational for developing enterprise-grade AI applications and agentic systems on Azure.


Practice Exam Questions

Question 1

What is a fabrication in generative AI?

A. A storage replication process
B. An unsupported or invented response
C. A vector indexing method
D. A deployment strategy

Answer

B. An unsupported or invented response

Explanation

Fabrications, also called hallucinations, are incorrect or invented outputs.


Question 2

What does groundedness measure?

A. GPU performance
B. Whether outputs are supported by trusted sources
C. Network bandwidth
D. Token compression efficiency

Answer

B. Whether outputs are supported by trusted sources

Explanation

Groundedness evaluates factual support from retrieved or trusted data.


Question 3

Which evaluation type focuses on harmful or unsafe outputs?

A. Latency evaluation
B. Safety evaluation
C. Compression evaluation
D. Replication evaluation

Answer

B. Safety evaluation

Explanation

Safety evaluations detect harmful, toxic, or policy-violating outputs.


Question 4

What is the purpose of retrieval quality evaluation in RAG systems?

A. Measure GPU speed
B. Assess search relevance and retrieved context quality
C. Reduce storage redundancy
D. Disable embeddings

Answer

B. Assess search relevance and retrieved context quality

Explanation

Retrieval quality measures how useful and relevant retrieved information is.


Question 5

What is jailbreak testing?

A. Testing storage failures
B. Evaluating attempts to bypass safety controls
C. Measuring retrieval latency
D. Compressing prompts

Answer

B. Evaluating attempts to bypass safety controls

Explanation

Jailbreak testing checks whether users can circumvent AI safety mechanisms.


Question 6

Which metric measures whether responses answer the user’s question appropriately?

A. Relevance
B. Replication
C. Throughput
D. Compression

Answer

A. Relevance

Explanation

Relevance evaluates how well outputs match user intent.


Question 7

Why is continuous evaluation important?

A. To eliminate all infrastructure costs
B. Because models and data can change over time
C. To remove all safety policies
D. To disable monitoring

Answer

B. Because models and data can change over time

Explanation

Continuous evaluation helps detect drift and performance degradation.


Question 8

What is adversarial testing?

A. Testing network redundancy
B. Challenging AI systems with malicious or difficult prompts
C. Increasing vector dimensions
D. Optimizing GPU allocation

Answer

B. Challenging AI systems with malicious or difficult prompts

Explanation

Adversarial testing identifies vulnerabilities and unsafe behaviors.


Question 9

What is a benefit of A/B testing in AI systems?

A. Eliminates monitoring requirements
B. Compares prompts or models to identify better performance
C. Removes the need for evaluation datasets
D. Disables retrieval pipelines

Answer

B. Compares prompts or models to identify better performance

Explanation

A/B testing helps optimize prompts, workflows, and models.


Question 10

Which Azure capability helps inspect workflow execution and tool calls?

A. Trace analysis
B. DNS failover
C. Storage mirroring
D. GPU partitioning

Answer

A. Trace analysis

Explanation

Trace analysis provides visibility into workflow execution and reasoning steps.


Go to the AI-103 Exam Prep Hub main page

Design workflows, tool-augmented flows, and multistep reasoning pipelines (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
--> Build generative applications by using Foundry
--> Design workflows, tool-augmented flows, and multistep reasoning pipelines


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Modern AI systems are evolving beyond simple prompt-response interactions.

Today’s generative AI applications often:

  • Use external tools
  • Perform multistep reasoning
  • Orchestrate workflows
  • Retrieve enterprise data
  • Execute actions autonomously
  • Coordinate across services

These systems are commonly called:

  • Agentic systems
  • Tool-augmented AI systems
  • AI workflow pipelines

The AI-103: Develop AI Apps and Agents on Azure certification exam tests your understanding of designing intelligent workflows and reasoning pipelines.

For the AI-103 exam, you should understand:

  • AI workflows
  • Agent orchestration
  • Tool augmentation
  • Function calling
  • Multistep reasoning
  • Workflow pipelines
  • Retrieval integration
  • Memory integration
  • Planning and execution
  • Human-in-the-loop workflows
  • Monitoring and governance

What Are AI Workflows?

AI workflows are structured sequences of operations that combine:

  • AI reasoning
  • Data retrieval
  • Tool execution
  • Decision-making
  • Automation

Workflows coordinate multiple steps to complete complex tasks.


Why AI Workflows Matter

Simple prompts are often insufficient for:

  • Enterprise automation
  • Complex reasoning
  • Dynamic decision-making
  • Multi-system integration

Workflows allow AI systems to:

  • Break problems into steps
  • Use external tools
  • Validate outputs
  • Iterate toward solutions

What Is Tool Augmentation?

Tool augmentation allows AI systems to use external capabilities.

Examples include:

  • APIs
  • Databases
  • Search engines
  • Calculators
  • Business systems
  • Code interpreters

Why Tool Augmentation Is Important

Language models alone:

  • Cannot access real-time data
  • Cannot execute business actions directly
  • Cannot reliably perform all calculations

Tools extend AI capabilities.


Common Tool-Augmented Scenarios

Examples include:

  • Checking inventory
  • Booking appointments
  • Querying databases
  • Sending emails
  • Executing workflows
  • Calling REST APIs

What Is Function Calling?

Function calling enables models to:

  • Detect when a tool is needed
  • Generate structured tool requests
  • Invoke external services
  • Process returned results

Function Calling Workflow

Typical flow:

  1. User submits request
  2. Model determines tool requirement
  3. Model generates function call
  4. External tool executes
  5. Results return to model
  6. Model generates final response

Structured Tool Inputs

Function calling typically uses:

  • JSON schemas
  • Structured parameters
  • Validated inputs

This improves reliability.


Tool Selection

Agentic systems may dynamically choose:

  • Which tools to use
  • Which workflows to invoke
  • Which retrieval strategies to apply

Tool Orchestration

Tool orchestration coordinates multiple tools within a workflow.

Examples include:

  • Retrieval + summarization
  • Search + booking systems
  • Database queries + reporting

Sequential Workflows

Sequential workflows execute steps in order.

Example:

  1. Retrieve customer data
  2. Analyze account status
  3. Generate recommendations
  4. Send response

Parallel Workflows

Parallel workflows execute multiple tasks simultaneously.

Benefits include:

  • Faster execution
  • Better scalability
  • Reduced latency

Conditional Workflows

Conditional workflows branch based on:

  • User intent
  • Retrieved data
  • Safety evaluations
  • Confidence scores

What Is Multistep Reasoning?

Multistep reasoning breaks complex problems into smaller steps.

This improves:

  • Accuracy
  • Planning
  • Decision quality

Examples of Multistep Reasoning

Examples include:

  • Research workflows
  • Financial analysis
  • Travel planning
  • Technical troubleshooting

Chain-of-Thought Reasoning

Chain-of-thought reasoning encourages models to:

  • Reason step-by-step
  • Decompose problems
  • Validate intermediate steps

Planning and Execution Models

Agentic systems often separate:

  • Planning
  • Execution

The planner decides:

  • What steps are needed
  • Which tools to use

The executor performs actions.


Planner-Executor Architectures

Planner-executor architectures support:

  • Dynamic workflows
  • Adaptive reasoning
  • Task decomposition

ReAct Pattern

The ReAct (Reason + Act) pattern combines:

  • Reasoning
  • Tool usage
  • Observation
  • Iterative decision-making

Reflection and Self-Correction

Some systems support:

  • Self-evaluation
  • Output refinement
  • Error correction

Retrieval-Augmented Workflows

Workflows often integrate:

  • Vector search
  • RAG pipelines
  • Enterprise grounding

Memory in Agentic Systems

AI systems may use memory for:

  • Conversation history
  • User preferences
  • Workflow state
  • Long-running tasks

Short-Term Memory

Short-term memory stores:

  • Current conversation context
  • Immediate workflow information

Long-Term Memory

Long-term memory stores:

  • Persistent preferences
  • Historical interactions
  • Learned context

Workflow State Management

State management tracks:

  • Current task progress
  • Intermediate outputs
  • Pending actions

Human-in-the-Loop (HITL) Workflows

High-risk workflows may require:

  • Human approvals
  • Validation checkpoints
  • Escalation paths

Approval Gates

Approval gates can prevent:

  • Unsafe actions
  • Unauthorized tool usage
  • Harmful outputs

Safety and Governance

Organizations should enforce:

  • Tool restrictions
  • Permission boundaries
  • Safety filters
  • Approval workflows

Autonomous vs Semi-Autonomous Agents

Autonomous Agents

Can:

  • Make decisions independently
  • Execute workflows automatically

Semi-Autonomous Agents

Require:

  • Human review
  • Approval checkpoints

Workflow Monitoring

Organizations should monitor:

  • Tool usage
  • Failures
  • Safety violations
  • Latency
  • Costs

Trace Logging

Trace logging helps track:

  • Workflow execution
  • Tool calls
  • Reasoning steps
  • Agent decisions

Error Handling in Workflows

Workflow pipelines should handle:

  • API failures
  • Missing data
  • Timeout errors
  • Invalid outputs

Retry Strategies

Common retry strategies include:

  • Automatic retries
  • Fallback workflows
  • Alternative tool selection

Fallback Models

Applications may use fallback models when:

  • Primary models fail
  • Costs exceed thresholds
  • Latency becomes excessive

Workflow Optimization

Optimization strategies include:

  • Parallel processing
  • Caching
  • Smaller models
  • Efficient retrieval

Latency Considerations

Complex workflows may increase latency due to:

  • Multiple model calls
  • Tool invocations
  • Retrieval operations

Cost Considerations

Tool-augmented systems may increase:

  • Token usage
  • API calls
  • Infrastructure costs

Azure AI Foundry Workflow Capabilities

Azure AI Foundry supports:

  • Model orchestration
  • Tool integration
  • Agent workflows
  • Evaluation pipelines
  • Monitoring

Common AI-103 Workflow Scenarios

Scenario 1: Enterprise Research Assistant

Requirements:

  • Multi-document retrieval
  • Summarization
  • Citation generation

Recommended Workflow:

  • RAG + multistep reasoning

Scenario 2: Customer Service Agent

Requirements:

  • CRM access
  • Ticket management
  • Escalation workflows

Recommended Workflow:

  • Tool-augmented agent

Scenario 3: Financial Approval System

Requirements:

  • Risk evaluation
  • Human approvals
  • Audit logging

Recommended Workflow:

  • HITL approval pipeline

Scenario 4: AI Coding Assistant

Requirements:

  • Code generation
  • Code execution
  • Documentation retrieval

Recommended Workflow:

  • Code model + tool orchestration

Common AI-103 Exam Tips

Understand Workflow Patterns

Know:

  • Sequential workflows
  • Parallel workflows
  • Conditional workflows

Learn Tool-Augmented AI Concepts

Understand:

  • Function calling
  • Tool orchestration
  • Dynamic tool selection

Understand Multistep Reasoning

Know:

  • Chain-of-thought reasoning
  • Planner-executor patterns
  • ReAct workflows

Learn Governance Concepts

Understand:

  • HITL workflows
  • Approval gates
  • Monitoring
  • Trace logging

Summary

Modern AI applications increasingly rely on:

  • Workflow orchestration
  • Tool augmentation
  • Multistep reasoning
  • Agentic architectures

For the AI-103 exam, you should understand:

  • AI workflow design
  • Function calling
  • Tool orchestration
  • Sequential and parallel workflows
  • Multistep reasoning
  • Planner-executor architectures
  • ReAct patterns
  • Memory integration
  • HITL workflows
  • Monitoring and governance

These concepts enable organizations to build:

  • Intelligent
  • Autonomous
  • Scalable
  • Governed AI systems

They are foundational for modern generative AI and agentic solutions on Azure.


Practice Exam Questions

Question 1

What is the primary purpose of tool augmentation in AI systems?

A. Reduce storage costs
B. Extend model capabilities using external tools
C. Eliminate prompts
D. Replace vector search

Answer

B. Extend model capabilities using external tools

Explanation

Tool augmentation enables AI systems to interact with APIs, databases, and other services.


Question 2

What does function calling enable a model to do?

A. Generate only static responses
B. Invoke external tools using structured inputs
C. Eliminate workflows
D. Replace embeddings

Answer

B. Invoke external tools using structured inputs

Explanation

Function calling allows models to interact with external services.


Question 3

Which workflow type executes tasks simultaneously?

A. Sequential workflow
B. Parallel workflow
C. Manual workflow
D. Static workflow

Answer

B. Parallel workflow

Explanation

Parallel workflows improve speed by running tasks concurrently.


Question 4

What is multistep reasoning?

A. Compressing vector indexes
B. Breaking complex tasks into smaller reasoning steps
C. Increasing GPU memory
D. Reducing prompt size only

Answer

B. Breaking complex tasks into smaller reasoning steps

Explanation

Multistep reasoning improves problem-solving accuracy.


Question 5

What does the ReAct pattern combine?

A. Compression and storage
B. Reasoning and acting
C. Replication and scaling
D. Encryption and backup

Answer

B. Reasoning and acting

Explanation

ReAct combines reasoning steps with tool usage.


Question 6

What is the purpose of workflow state management?

A. Monitor GPU temperature
B. Track task progress and intermediate outputs
C. Disable logging
D. Replace semantic search

Answer

B. Track task progress and intermediate outputs

Explanation

State management helps maintain workflow continuity.


Question 7

Which architecture separates planning from execution?

A. Static inference architecture
B. Planner-executor architecture
C. Batch storage architecture
D. Compression architecture

Answer

B. Planner-executor architecture

Explanation

Planner-executor systems divide reasoning and execution responsibilities.


Question 8

Why are approval gates important in AI workflows?

A. They increase vector dimensions
B. They prevent unsafe or unauthorized actions
C. They reduce indexing speed
D. They eliminate monitoring requirements

Answer

B. They prevent unsafe or unauthorized actions

Explanation

Approval gates enforce governance and human oversight.


Question 9

Which concept allows AI systems to remember previous interactions?

A. Semantic ranking
B. Memory integration
C. Static chunking
D. GPU partitioning

Answer

B. Memory integration

Explanation

Memory enables contextual continuity and long-running workflows.


Question 10

What is a major challenge of complex AI workflows?

A. Eliminating all costs
B. Increased latency from multiple operations
C. Removing all need for monitoring
D. Preventing all hallucinations automatically

Answer

B. Increased latency from multiple operations

Explanation

Complex workflows may require multiple model calls and tool executions.


Go to the AI-103 Exam Prep Hub main page

Deploy and consume LLMs, small models, code models, and multimodal models (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement generative AI and agentic solutions (30–35%)
--> Build generative applications by using Foundry
--> Deploy and consume LLMs, small models, code models, and multimodal models


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Modern AI applications rely on a wide variety of AI models.

Different models are optimized for different workloads, including:

  • Conversational AI
  • Code generation
  • Text summarization
  • Image understanding
  • Audio processing
  • Reasoning tasks
  • Agentic workflows

The AI-103: Develop AI Apps and Agents on Azure certification exam tests your understanding of how to deploy and consume AI models in Azure AI Foundry.

For the AI-103 exam, you should understand:

  • Large language models (LLMs)
  • Small language models (SLMs)
  • Code models
  • Multimodal models
  • Model deployment concepts
  • Model consumption patterns
  • API-based model access
  • Endpoint configuration
  • Performance and cost tradeoffs
  • Model selection strategies
  • Responsible AI considerations

What Are Large Language Models (LLMs)?

Large language models are advanced AI systems trained on massive datasets.

LLMs can:

  • Generate text
  • Summarize documents
  • Answer questions
  • Translate languages
  • Reason across prompts
  • Support conversational AI

Common LLM Use Cases

Typical use cases include:

  • AI assistants
  • Enterprise chatbots
  • Content generation
  • Knowledge retrieval
  • Agent orchestration
  • Workflow automation

Characteristics of LLMs

LLMs typically provide:

  • Strong reasoning
  • Broad general knowledge
  • Advanced conversational abilities
  • Complex instruction following

However, they also:

  • Require more compute
  • Cost more to run
  • May introduce higher latency

What Are Small Language Models (SLMs)?

Small language models are lightweight models optimized for:

  • Faster inference
  • Lower cost
  • Lower latency
  • Edge deployment
  • Specialized tasks

Common SLM Use Cases

SLMs are often used for:

  • Classification
  • Simple chatbots
  • Mobile applications
  • Embedded AI
  • Lightweight assistants

Benefits of Small Models

Advantages include:

  • Reduced infrastructure cost
  • Faster response times
  • Lower resource requirements
  • Easier deployment at scale

LLM vs SLM Tradeoffs

LLMs

Best for:

  • Complex reasoning
  • Broad knowledge
  • Multi-step tasks

Tradeoffs:

  • Higher cost
  • Higher latency
  • Larger infrastructure requirements

SLMs

Best for:

  • Lightweight inference
  • Narrow tasks
  • Cost-sensitive workloads

Tradeoffs:

  • Reduced reasoning capability
  • Smaller context windows
  • Less flexibility

What Are Code Models?

Code models are specialized AI models trained for software development tasks.

These models can:

  • Generate code
  • Explain code
  • Complete functions
  • Debug issues
  • Convert between languages

Common Code Model Use Cases

Typical scenarios include:

  • Developer copilots
  • Code generation
  • Documentation generation
  • Test generation
  • Refactoring assistance

Code Model Capabilities

Code models often support:

  • Multiple programming languages
  • Natural language prompts
  • Code reasoning
  • Syntax understanding

What Are Multimodal Models?

Multimodal models process multiple types of input.

Examples include:

  • Text and images
  • Text and audio
  • Video and text

Multimodal AI Capabilities

Multimodal models may support:

  • Image understanding
  • OCR
  • Visual question answering
  • Audio transcription
  • Speech interaction
  • Video analysis

Common Multimodal Use Cases

Examples include:

  • AI vision assistants
  • Document understanding
  • Medical imaging analysis
  • Voice assistants
  • Image captioning

Model Deployment in Azure AI Foundry

Azure AI Foundry enables developers to:

  • Discover models
  • Deploy models
  • Test models
  • Monitor deployments
  • Consume models through APIs

Model Catalogs

Azure AI Foundry provides access to:

  • Foundation models
  • Open-source models
  • Specialized models
  • Multimodal models

Deployment Concepts

A deployment makes a model available through:

  • APIs
  • Endpoints
  • Applications
  • Agent workflows

Deployment Types

Common deployment options include:

  • Managed online deployments
  • Serverless deployments
  • Real-time inference endpoints
  • Batch inference deployments

Real-Time Inference

Real-time inference is used for:

  • Interactive chat
  • AI assistants
  • Live applications
  • Agent workflows

Batch Inference

Batch inference is used for:

  • Large-scale document processing
  • Offline analysis
  • Scheduled workloads
  • Bulk content generation

Endpoint Configuration

Deployments expose endpoints for application access.

Endpoints may include:

  • Authentication
  • Rate limits
  • Scaling policies
  • Monitoring settings

Authentication and Authorization

Applications may access models using:

  • API keys
  • Managed identities
  • Microsoft Entra ID
  • Role-based access control (RBAC)

Consuming Models Through APIs

Applications consume deployed models using:

  • REST APIs
  • SDKs
  • Client libraries

Prompt-Based Interactions

Generative AI applications commonly interact with models through prompts.

Prompts may include:

  • Instructions
  • Context
  • Examples
  • Retrieved documents

System Prompts

System prompts define:

  • AI behavior
  • Tone
  • Constraints
  • Safety policies

Model Parameters

Common inference parameters include:

  • Temperature
  • Top-p
  • Max tokens
  • Frequency penalty
  • Presence penalty

Temperature

Temperature controls output randomness.

Lower temperature:

  • More deterministic
  • More predictable

Higher temperature:

  • More creative
  • More variable

Context Windows

Context windows determine how much information a model can process in a request.

Larger context windows support:

  • Long conversations
  • Large documents
  • Multi-document grounding

Streaming Responses

Streaming enables applications to receive responses incrementally.

Benefits include:

  • Improved user experience
  • Faster perceived response times

Grounding Models

Grounding improves factual accuracy by providing trusted data.

Grounded applications commonly use:

  • Vector search
  • Retrieval-Augmented Generation (RAG)
  • Enterprise knowledge sources

Model Selection Considerations

Developers should evaluate:

  • Accuracy
  • Cost
  • Latency
  • Context size
  • Reasoning ability
  • Multimodal support
  • Scalability

Choosing Between Models

Use LLMs When:

  • Complex reasoning is required
  • Broad knowledge is needed
  • Multi-step workflows are involved

Use SLMs When:

  • Low latency matters
  • Cost optimization is critical
  • Tasks are narrow or repetitive

Use Code Models When:

  • Building developer tools
  • Generating code
  • Supporting programming workflows

Use Multimodal Models When:

  • Images or audio are required
  • Visual understanding is needed
  • Mixed media inputs are processed

Scaling Model Deployments

Scaling strategies may include:

  • Autoscaling
  • Regional deployments
  • Load balancing
  • Rate limiting

Monitoring Deployments

Organizations should monitor:

  • Latency
  • Throughput
  • Token usage
  • Errors
  • Safety events
  • Cost

Cost Optimization

Cost optimization strategies include:

  • Choosing smaller models
  • Limiting token usage
  • Caching responses
  • Using batch processing

Responsible AI Considerations

Developers should implement:

  • Safety filters
  • Guardrails
  • Content moderation
  • Monitoring
  • Human oversight

Multimodal Safety Concerns

Multimodal systems may require:

  • Image moderation
  • OCR filtering
  • Audio moderation
  • Content safety evaluation

Agentic AI and Model Consumption

AI agents may use:

  • LLMs for reasoning
  • SLMs for lightweight tasks
  • Code models for automation
  • Multimodal models for perception

Common AI-103 Deployment Scenarios

Scenario 1: Enterprise Chatbot

Requirements:

  • Strong reasoning
  • Long conversations
  • Grounded responses

Recommended Model:

  • LLM with RAG

Scenario 2: Mobile AI Assistant

Requirements:

  • Fast responses
  • Low cost
  • Lightweight inference

Recommended Model:

  • Small language model

Scenario 3: Developer Copilot

Requirements:

  • Code generation
  • Programming assistance
  • Syntax awareness

Recommended Model:

  • Code model

Scenario 4: Image-Aware AI Assistant

Requirements:

  • Image analysis
  • OCR
  • Text generation

Recommended Model:

  • Multimodal model

Common AI-103 Exam Tips

Understand Model Categories

Know the differences between:

  • LLMs
  • SLMs
  • Code models
  • Multimodal models

Learn Deployment Concepts

Understand:

  • Endpoints
  • Real-time inference
  • Batch inference
  • Scaling

Learn Consumption Patterns

Know:

  • REST APIs
  • SDKs
  • Prompt engineering
  • System prompts

Understand Cost and Performance Tradeoffs

Know how:

  • Model size affects cost
  • Context size affects latency
  • Scaling impacts performance

Summary

Azure AI Foundry enables developers to deploy and consume a wide range of AI models.

For the AI-103 exam, you should understand:

  • LLMs
  • Small language models
  • Code models
  • Multimodal models
  • Deployment options
  • Model consumption patterns
  • Prompt engineering
  • Scaling strategies
  • Cost optimization
  • Responsible AI controls

Choosing the right model and deployment strategy is essential for building:

  • Scalable
  • Reliable
  • Efficient
  • Responsible AI solutions

These concepts are foundational for generative AI and agentic systems on Azure.


Practice Exam Questions

Question 1

What is a primary strength of large language models (LLMs)?

A. Minimal compute usage
B. Complex reasoning and broad knowledge
C. Guaranteed factual accuracy
D. Extremely low latency

Answer

B. Complex reasoning and broad knowledge

Explanation

LLMs excel at reasoning, conversation, and broad knowledge tasks.


Question 2

Which model type is best suited for lightweight, low-cost inference?

A. Large language model
B. Small language model
C. Multimodal model
D. Vision transformer only

Answer

B. Small language model

Explanation

SLMs are optimized for lower latency and reduced cost.


Question 3

Which model type is specifically optimized for programming tasks?

A. Vision model
B. Code model
C. Embedding model
D. Speech model

Answer

B. Code model

Explanation

Code models are trained for software development workflows.


Question 4

What is a defining feature of multimodal models?

A. They only process text
B. They process multiple input types
C. They eliminate inference costs
D. They require no prompting

Answer

B. They process multiple input types

Explanation

Multimodal models handle text, images, audio, and other media.


Question 5

Which deployment type is best for interactive AI chat applications?

A. Batch inference
B. Real-time inference
C. Archive deployment
D. Offline storage deployment

Answer

B. Real-time inference

Explanation

Interactive applications require low-latency real-time inference.


Question 6

What does the temperature parameter control?

A. Network throughput
B. Output randomness and creativity
C. Storage replication
D. GPU memory allocation

Answer

B. Output randomness and creativity

Explanation

Temperature affects how deterministic or creative outputs become.


Question 7

Which technique improves factual accuracy by using trusted data sources?

A. GPU scaling
B. Retrieval-Augmented Generation (RAG)
C. Semantic caching
D. Compression indexing

Answer

B. Retrieval-Augmented Generation (RAG)

Explanation

RAG grounds model outputs using retrieved enterprise data.


Question 8

What is a major benefit of streaming responses?

A. Reduced storage costs
B. Faster perceived response times
C. Elimination of monitoring
D. Improved vector indexing

Answer

B. Faster perceived response times

Explanation

Streaming improves user experience during response generation.


Question 9

Which authentication method supports passwordless access to Azure AI services?

A. Static credentials only
B. Managed identities
C. Anonymous access
D. Embedded API secrets in code

Answer

B. Managed identities

Explanation

Managed identities support secure, keyless authentication.


Question 10

Which model type is most appropriate for image understanding and OCR tasks?

A. Small language model
B. Multimodal model
C. Traditional relational database
D. Static rules engine

Answer

B. Multimodal model

Explanation

Multimodal models process images and text together.


Go to the AI-103 Exam Prep Hub main page

Configure model and agent deployments (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Plan and manage an Azure AI solution (25–30%)
--> Set up AI solutions in Foundry
--> Configure model and agent deployments


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

One of the most important responsibilities for Azure AI developers is configuring and managing model and agent deployments.

Modern AI applications depend on properly configured:

  • Large Language Models (LLMs)
  • Embedding models
  • Multimodal models
  • AI agents
  • Retrieval systems
  • Tool integrations
  • Orchestration workflows

The AI-103: Develop AI Apps and Agents on Azure certification exam tests your ability to configure AI solutions in Azure AI Foundry and related Azure services.

For the AI-103 exam, you should understand:

  • Azure OpenAI model deployments
  • Deployment types
  • Provisioned throughput
  • Model versioning
  • Deployment scaling
  • Agent configuration
  • Tool and function integration
  • Retrieval integration
  • Security configuration
  • Monitoring and evaluation
  • Deployment lifecycle management

What Is a Model Deployment?

A model deployment is a configured instance of an AI model that applications can access through APIs.

Deployments allow developers to:

  • Choose models
  • Configure capacity
  • Control scaling
  • Manage versions
  • Apply security controls
  • Monitor usage

A deployment acts as the operational endpoint for AI inference.


Azure AI Foundry

Azure AI Foundry provides tools and services for:

  • Deploying AI models
  • Configuring AI agents
  • Managing workflows
  • Evaluating AI systems
  • Monitoring AI applications

It integrates with:

  • Azure OpenAI
  • Azure AI Search
  • Prompt Flow
  • Azure AI Content Safety
  • Azure Functions

Types of Models in Azure AI

Common model types include:

  • Large Language Models (LLMs)
  • Small Language Models (SLMs)
  • Embedding models
  • Multimodal models
  • Vision models
  • Speech models

Large Language Models (LLMs)

LLMs are used for:

  • Chatbots
  • AI copilots
  • Summarization
  • Reasoning
  • Tool calling
  • Content generation

Examples include GPT-based models.


Embedding Models

Embedding models convert content into vector representations.

Used for:

  • Vector search
  • Semantic retrieval
  • Similarity matching
  • RAG systems

Multimodal Models

Multimodal models process multiple input types such as:

  • Text
  • Images
  • Audio
  • Documents

Used for:

  • Image analysis
  • Visual reasoning
  • OCR workflows
  • Multimodal agents

Azure OpenAI Deployments

Azure OpenAI deployments expose models through API endpoints.

Deployment configuration includes:

  • Model selection
  • Deployment name
  • Capacity allocation
  • Version selection
  • Region selection
  • Content filtering settings

Deployment Names

Each deployment has a unique deployment name.

Applications use the deployment name when making API requests.

Example:

  • gpt4-copilot-prod
  • embeddings-search-dev

Model Versioning

Models evolve over time.

Versioning helps:

  • Maintain stability
  • Test upgrades
  • Support rollback strategies
  • Compare model behavior

Why Model Versioning Matters

Different versions may:

  • Behave differently
  • Produce different outputs
  • Affect latency
  • Affect costs
  • Impact prompt performance

Deployment Types

Azure AI commonly supports:

  • Standard deployments
  • Provisioned throughput deployments

Standard Deployments

Standard deployments use shared infrastructure.

Advantages:

  • Simpler setup
  • Lower upfront costs
  • Flexible usage

Limitations:

  • Shared capacity
  • Variable latency under heavy load

Provisioned Throughput Deployments

Provisioned throughput reserves dedicated model capacity.

Advantages:

  • Predictable performance
  • Consistent latency
  • Enterprise-grade scaling

Limitations:

  • Higher cost
  • Capacity planning required

When to Use Standard Deployments

Use standard deployments when:

  • Workloads are moderate
  • Usage is variable
  • Cost optimization matters
  • Development/testing environments are used

When to Use Provisioned Throughput

Use provisioned throughput when:

  • High traffic is expected
  • Predictable latency is required
  • Enterprise SLAs exist
  • Production copilots are deployed

Scaling Model Deployments

AI deployments must support varying workloads.


Autoscaling

Autoscaling adjusts resources dynamically based on demand.

Benefits:

  • Improved performance
  • Better cost efficiency
  • Reduced manual intervention

Horizontal Scaling

Horizontal scaling adds additional instances or capacity.

Useful for:

  • High concurrency
  • Enterprise AI systems
  • Large-scale chatbots

Latency Considerations

Latency refers to response time.

Factors affecting latency:

  • Model size
  • Throughput load
  • Geographic distance
  • Retrieval pipelines
  • Tool execution

Choosing the Correct Model

Choosing the correct model is critical.


Use Larger Models When:

  • Advanced reasoning is required
  • Complex workflows exist
  • High-quality generation matters

Use Smaller Models When:

  • Cost efficiency matters
  • Low latency is important
  • Simpler tasks are performed

Agent Deployments

AI agents combine:

  • Models
  • Memory
  • Retrieval
  • Tool calling
  • Workflow orchestration

Agent deployment involves configuring all these components together.


Agent Configuration Components

Common agent configuration elements include:

  • System prompts
  • Tool definitions
  • Function calling
  • Knowledge sources
  • Retrieval settings
  • Memory configuration
  • Safety settings

System Prompts

System prompts define:

  • Agent behavior
  • Role instructions
  • Response style
  • Operational constraints

Well-designed system prompts improve:

  • Reliability
  • Consistency
  • Safety

Tool and Function Integration

Agents may use tools such as:

  • APIs
  • Databases
  • Search services
  • External systems

Function calling enables agents to invoke these tools dynamically.


Retrieval Integration

Many AI agents use Retrieval-Augmented Generation (RAG).

RAG systems commonly integrate:

  • Azure AI Search
  • Embedding models
  • Vector search
  • Knowledge indexes

Knowledge Sources

Agents may connect to:

  • Enterprise documents
  • Databases
  • APIs
  • SharePoint
  • Blob Storage
  • Internal knowledge bases

Memory Configuration

Agents may use:

  • Short-term memory
  • Long-term memory
  • Semantic memory

Common storage systems include:

  • Azure Cosmos DB
  • Azure SQL Database
  • Azure AI Search

Security Configuration

Security is a major AI-103 exam topic.


Microsoft Entra ID

Microsoft Entra ID supports:

  • Authentication
  • Authorization
  • RBAC
  • Identity management

Azure Key Vault

Azure Key Vault securely stores:

  • API keys
  • Secrets
  • Certificates
  • Connection strings

Content Safety Configuration

Azure AI Content Safety helps:

  • Detect harmful content
  • Filter unsafe outputs
  • Apply safety policies

Network Security

Enterprise AI deployments may use:

  • VNets
  • Private Endpoints
  • Firewalls
  • API gateways

Monitoring Deployments

AI deployments require operational monitoring.


Azure Monitor

Azure Monitor provides:

  • Metrics
  • Logging
  • Alerts
  • Diagnostics

Application Insights

Application Insights supports:

  • Telemetry
  • Request tracing
  • Error diagnostics
  • Performance monitoring

Metrics to Monitor

Common metrics include:

  • Latency
  • Token usage
  • Error rates
  • Throughput
  • Tool call failures
  • Retrieval quality

Evaluating AI Deployments

AI systems should be evaluated for:

  • Accuracy
  • Groundedness
  • Safety
  • Relevance
  • Reliability

Prompt Flow

Prompt Flow supports:

  • Workflow orchestration
  • Prompt chaining
  • Tool integration
  • Evaluation pipelines

Prompt Flow is an important AI-103 topic.


CI/CD for AI Deployments

AI deployment pipelines should support:

  • Automated testing
  • Version control
  • Safe releases
  • Rollbacks

Blue-Green Deployments

Blue-green deployments:

  • Reduce downtime
  • Support safer releases
  • Simplify rollback

Canary Deployments

Canary deployments:

  • Roll out changes gradually
  • Reduce deployment risk
  • Support controlled testing

Common AI-103 Deployment Scenarios

Scenario 1: Enterprise AI Copilot

Requirements:

  • High concurrency
  • Secure retrieval
  • Enterprise search
  • Low latency

Recommended Configuration:

  • Provisioned throughput
  • Azure AI Search
  • Entra ID
  • Autoscaling

Scenario 2: Development Chatbot

Requirements:

  • Low cost
  • Rapid experimentation
  • Flexible scaling

Recommended Configuration:

  • Standard deployment
  • App Service
  • Basic monitoring

Scenario 3: AI Agent with Tool Calling

Requirements:

  • API integrations
  • Workflow execution
  • Multi-step reasoning

Recommended Configuration:

  • Azure OpenAI
  • Azure Functions
  • Prompt Flow
  • Tool definitions

Scenario 4: Enterprise Knowledge Assistant

Requirements:

  • Grounded responses
  • Semantic retrieval
  • Document search

Recommended Configuration:

  • Embedding models
  • Azure AI Search
  • Hybrid search
  • RAG pipelines

Cost Optimization Considerations

AI deployments can become expensive.


Common Cost Drivers

  • Token usage
  • Provisioned throughput
  • Search indexing
  • Embedding generation
  • Large models
  • High concurrency

Cost Optimization Strategies

Use Smaller Models When Possible

Smaller models reduce:

  • Latency
  • Compute costs
  • Token usage

Optimize Retrieval

Efficient retrieval reduces:

  • Prompt size
  • Token costs
  • Latency

Use Autoscaling

Autoscaling prevents overprovisioning.


Common AI-103 Exam Tips

Understand Deployment Types

Know the differences between:

  • Standard deployments
  • Provisioned throughput deployments

Learn Agent Configuration Components

Understand:

  • System prompts
  • Tool integration
  • Retrieval settings
  • Memory configuration

Know Security Best Practices

Use:

  • Entra ID
  • RBAC
  • Key Vault
  • Private networking

Understand Monitoring Concepts

Know how to monitor:

  • Latency
  • Token usage
  • Throughput
  • Errors
  • AI quality

Summary

Configuring model and agent deployments is a critical skill for Azure AI developers.

For the AI-103 exam, you should understand:

  • Azure OpenAI deployment configuration
  • Model versioning
  • Deployment scaling
  • Agent architecture
  • Tool integration
  • Retrieval integration
  • Memory configuration
  • Security controls
  • Monitoring and evaluation
  • Deployment lifecycle management

Well-configured deployments improve:

  • Reliability
  • Performance
  • Scalability
  • Security
  • Cost efficiency
  • User experience

These concepts are foundational for building enterprise-grade AI applications and agent-based systems on Azure.


Practice Exam Questions

Question 1

Which deployment type provides dedicated capacity for Azure OpenAI workloads?

A. Shared deployment
B. Provisioned throughput deployment
C. Batch deployment
D. Basic deployment

Answer

B. Provisioned throughput deployment

Explanation

Provisioned throughput reserves dedicated processing capacity.


Question 2

What is the primary purpose of model versioning?

A. Increase storage size
B. Manage model updates and rollback strategies
C. Reduce API authentication
D. Eliminate monitoring

Answer

B. Manage model updates and rollback strategies

Explanation

Versioning helps maintain stability and supports rollback.


Question 3

Which Azure service is MOST commonly used for semantic retrieval in RAG systems?

A. Azure AI Search
B. Azure Backup
C. Azure CDN
D. Azure DNS

Answer

A. Azure AI Search

Explanation

Azure AI Search supports vector and semantic retrieval.


Question 4

What is the purpose of a system prompt in an AI agent?

A. Encrypt embeddings
B. Define agent behavior and instructions
C. Replace APIs
D. Configure storage replication

Answer

B. Define agent behavior and instructions

Explanation

System prompts guide the agent’s role, constraints, and response style.


Question 5

Which Azure service securely stores API keys and secrets?

A. Azure Key Vault
B. Azure Monitor
C. Azure Backup
D. Azure CDN

Answer

A. Azure Key Vault

Explanation

Azure Key Vault securely stores sensitive credentials.


Question 6

Which deployment strategy gradually rolls out updates to a small percentage of users first?

A. Full deployment
B. Canary deployment
C. Offline deployment
D. Batch deployment

Answer

B. Canary deployment

Explanation

Canary deployments reduce deployment risk through gradual rollout.


Question 7

Which type of model is specifically designed for vector generation and semantic similarity?

A. Vision model
B. Embedding model
C. Speech model
D. OCR model

Answer

B. Embedding model

Explanation

Embedding models generate vector representations for semantic retrieval.


Question 8

Which Azure service provides telemetry and request tracing for AI applications?

A. Application Insights
B. Azure DNS
C. Azure Files
D. Azure Firewall

Answer

A. Application Insights

Explanation

Application Insights provides application telemetry and diagnostics.


Question 9

Which feature dynamically adjusts resources based on workload demand?

A. Static allocation
B. Autoscaling
C. Encryption scaling
D. Semantic routing

Answer

B. Autoscaling

Explanation

Autoscaling automatically adjusts capacity based on traffic.


Question 10

Which Azure service is commonly used for workflow orchestration and prompt chaining in AI solutions?

A. Prompt Flow
B. Azure CDN
C. Azure Backup
D. Azure Front Door

Answer

A. Prompt Flow

Explanation

Prompt Flow orchestrates prompts, tools, and AI workflows.


Go to the AI-103 Exam Prep Hub main page

Choose appropriate deployment options (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Plan and manage an Azure AI solution (25–30%)
--> Set up AI solutions in Foundry
--> Choose appropriate deployment options


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

One of the most important responsibilities for Azure AI developers is selecting the correct deployment option for AI applications and agent-based solutions.

Modern AI systems can be deployed in many different ways depending on:

  • Scalability requirements
  • Cost constraints
  • Security requirements
  • Latency expectations
  • Geographic distribution
  • Operational complexity
  • AI workload patterns
  • Enterprise governance needs

The AI-103: Develop AI Apps and Agents on Azure certification exam tests your understanding of how to choose appropriate deployment options for:

  • Generative AI applications
  • AI agents
  • APIs
  • RAG systems
  • Vector search solutions
  • Multimodal applications
  • Enterprise AI systems

For the AI-103 exam, you should understand:

  • Azure deployment models
  • Hosting options
  • Serverless deployments
  • Containerized deployments
  • Kubernetes deployments
  • Regional deployments
  • High availability strategies
  • Scaling approaches
  • CI/CD deployment pipelines
  • Model deployment considerations
  • Infrastructure tradeoffs

What Is a Deployment Option?

A deployment option refers to the method used to host and run an AI application or service.

Deployment choices affect:

  • Performance
  • Reliability
  • Cost
  • Security
  • Scalability
  • Maintainability

Choosing the wrong deployment strategy can:

  • Increase costs
  • Reduce performance
  • Complicate maintenance
  • Create scaling problems

Common Azure AI Deployment Components

AI solutions commonly include:

  • AI models
  • APIs
  • Search systems
  • Databases
  • Agent orchestration
  • Storage systems
  • Monitoring tools
  • Security services

Each component may use different deployment approaches.


Azure OpenAI Deployment Options

Azure OpenAI allows developers to deploy:

  • GPT models
  • Embedding models
  • Multimodal models
  • Fine-tuned models

Deployment considerations include:

  • Region availability
  • Throughput requirements
  • Latency requirements
  • Cost optimization
  • Capacity planning

Standard Deployments

What Are Standard Deployments?

Standard deployments provide shared model hosting infrastructure.

Advantages:

  • Lower operational complexity
  • Managed infrastructure
  • Easier setup

Disadvantages:

  • Shared capacity
  • Potential throughput limitations

Provisioned Throughput Deployments

What Is Provisioned Throughput?

Provisioned throughput reserves dedicated processing capacity.

Advantages:

  • Predictable performance
  • Dedicated throughput
  • Lower latency consistency

Disadvantages:

  • Higher cost
  • Capacity planning required

When to Use Provisioned Throughput

Use provisioned throughput when:

  • Workloads are high volume
  • Predictable latency is critical
  • Enterprise SLAs are required
  • Large-scale copilots are deployed

Serverless Deployment Options

What Is Serverless?

Serverless computing automatically manages infrastructure.

Developers focus on code instead of servers.


Azure Functions

Azure Functions provides event-driven serverless compute.

Common AI use cases:

  • Tool calling
  • Workflow execution
  • API processing
  • Lightweight orchestration
  • Event-triggered AI actions

Advantages of Azure Functions

  • Automatic scaling
  • Pay-per-use pricing
  • Rapid deployment
  • Minimal infrastructure management

Limitations of Azure Functions

  • Execution duration limits
  • Cold starts
  • Less suitable for large persistent workloads

When to Use Azure Functions

Use Azure Functions when:

  • Workloads are event-driven
  • Execution is lightweight
  • Cost optimization is important
  • Rapid scaling is required

Azure Container Apps

Azure Container Apps provides serverless container hosting.

Useful for:

  • AI middleware
  • APIs
  • Agent orchestration
  • Background workers
  • Lightweight microservices

Advantages of Container Apps

  • Simplified container deployment
  • Autoscaling support
  • Event-driven scaling
  • Lower operational overhead than Kubernetes

Kubernetes Deployments

Azure Kubernetes Service (AKS)

AKS provides enterprise-grade container orchestration.

Common AI uses:

  • Multi-agent systems
  • Large-scale AI platforms
  • Distributed AI services
  • Complex orchestration
  • High-volume APIs

Advantages of AKS

  • High scalability
  • Advanced orchestration
  • Fine-grained control
  • Container portability
  • Enterprise-grade deployments

Limitations of AKS

  • Higher operational complexity
  • More infrastructure management
  • Requires Kubernetes expertise

When to Use AKS

Use AKS when:

  • Large-scale deployments exist
  • Multiple microservices interact
  • High traffic is expected
  • Advanced orchestration is needed

Platform-as-a-Service (PaaS) Deployments

Azure App Service

Azure App Service hosts:

  • Web apps
  • APIs
  • AI front ends
  • Lightweight enterprise applications

Advantages of App Service

  • Managed platform
  • Easy deployment
  • Autoscaling support
  • Simplified maintenance

When to Use App Service

Use App Service when:

  • Hosting AI web applications
  • Managing APIs
  • Rapid development is needed
  • Full Kubernetes orchestration is unnecessary

Edge and Hybrid Deployments

Some AI workloads require local or hybrid deployments.

Reasons include:

  • Low latency
  • Regulatory requirements
  • Limited connectivity
  • On-premises data processing

Azure Arc

Azure Arc extends Azure management to:

  • On-premises systems
  • Multi-cloud environments
  • Edge deployments

Useful for hybrid AI environments.


Deployment Considerations for AI Agents

AI agents often require multiple deployment layers.

Examples include:

  • LLM hosting
  • Retrieval systems
  • Tool execution services
  • Workflow orchestration
  • Persistent memory systems

Multi-Service Architectures

AI agents commonly use:

  • Azure OpenAI
  • Azure AI Search
  • Azure Functions
  • Cosmos DB
  • APIs
  • Orchestration workflows

Different components may use different deployment options.


Geographic Deployment Considerations

AI systems may require global deployment strategies.


Regional Deployments

Deploying resources in a specific region helps:

  • Reduce latency
  • Meet compliance requirements
  • Improve user experience

Multi-Region Deployments

Multi-region deployments improve:

  • Availability
  • Disaster recovery
  • Global performance

Availability Zones

Availability Zones provide redundancy across isolated datacenters.

Benefits include:

  • Higher uptime
  • Fault tolerance
  • Improved resilience

High Availability Design

Enterprise AI applications often require:

  • Redundant infrastructure
  • Automatic failover
  • Load balancing
  • Disaster recovery

Load Balancing

Azure Load Balancer and Azure Application Gateway distribute traffic across services.

Benefits:

  • Scalability
  • High availability
  • Traffic optimization

Autoscaling

Autoscaling dynamically adjusts infrastructure based on demand.

Supported by:

  • AKS
  • Azure Functions
  • App Service
  • Container Apps

Deployment Security Considerations

Security is a major AI-103 exam topic.


Microsoft Entra ID

Microsoft Entra ID supports:

  • Authentication
  • Authorization
  • Identity management
  • RBAC

Azure Key Vault

Azure Key Vault securely stores:

  • Secrets
  • API keys
  • Certificates
  • Connection strings

Private Endpoints

Private Endpoints provide secure private connectivity between Azure services.

Useful for:

  • Enterprise AI systems
  • Sensitive data workloads
  • Compliance-driven deployments

CI/CD for AI Deployments

What Is CI/CD?

CI/CD stands for:

  • Continuous Integration
  • Continuous Deployment

CI/CD automates:

  • Testing
  • Deployment
  • Validation
  • Release management

Azure DevOps

Azure DevOps supports:

  • Build pipelines
  • Release pipelines
  • Source control
  • Automated deployments

GitHub Actions

GitHub Actions supports:

  • Workflow automation
  • CI/CD pipelines
  • Deployment automation

Commonly used for AI application deployments.


Blue-Green Deployments

Blue-green deployments reduce downtime during releases.

How it works:

  • One environment remains active
  • A second environment receives updates
  • Traffic shifts after validation

Benefits:

  • Safer releases
  • Reduced downtime
  • Easier rollback

Canary Deployments

Canary deployments release updates gradually to a small percentage of users.

Benefits:

  • Reduced deployment risk
  • Easier issue detection
  • Safer experimentation

Monitoring Deployment Health

AI deployments should monitor:

  • Latency
  • Throughput
  • Token usage
  • Errors
  • Model failures
  • Tool call failures
  • Retrieval quality

Azure Monitor

Azure Monitor provides:

  • Metrics
  • Logging
  • Alerts
  • Diagnostics

Application Insights

Application Insights supports:

  • Telemetry
  • Request tracing
  • Dependency tracking
  • Error diagnostics

Cost Optimization Considerations

AI deployments can become expensive.


Common Cost Drivers

  • Token consumption
  • GPU usage
  • High-scale orchestration
  • Search indexing
  • Storage
  • Data transfer

Cost Optimization Strategies

Use Smaller Models When Appropriate

Smaller models reduce:

  • Compute costs
  • Token usage
  • Latency

Use Serverless When Appropriate

Serverless deployments reduce idle infrastructure costs.


Use Autoscaling

Autoscaling prevents overprovisioning.


Common AI-103 Deployment Scenarios

Scenario 1: Enterprise AI Chatbot

Requirements:

  • High availability
  • Secure authentication
  • Enterprise search

Recommended Deployment:

  • Azure OpenAI
  • App Service
  • Azure AI Search
  • Entra ID

Scenario 2: Large-Scale AI Agent Platform

Requirements:

  • Multiple AI agents
  • Heavy orchestration
  • High concurrency

Recommended Deployment:

  • AKS
  • Azure Functions
  • Cosmos DB
  • Prompt Flow

Scenario 3: Lightweight AI API

Requirements:

  • Rapid deployment
  • Cost optimization
  • Moderate scale

Recommended Deployment:

  • Azure Functions
  • Container Apps

Scenario 4: Global AI Application

Requirements:

  • Global users
  • Low latency
  • Disaster recovery

Recommended Deployment:

  • Multi-region deployment
  • Availability Zones
  • Load balancing

Common AI-103 Exam Tips

Understand Deployment Tradeoffs

Know when to use:

  • App Service vs AKS
  • Functions vs Containers
  • Standard vs Provisioned Throughput

Know High Availability Concepts

Understand:

  • Availability Zones
  • Multi-region deployments
  • Load balancing
  • Failover strategies

Learn Security Best Practices

Know how to use:

  • Entra ID
  • RBAC
  • Key Vault
  • Private Endpoints

Understand Agent Deployment Needs

AI agents commonly require:

  • Tool orchestration
  • Retrieval systems
  • Persistent memory
  • API integrations

Summary

Choosing the correct deployment option is critical for successful AI applications and agent-based systems.

For the AI-103 exam, you should understand:

  • Azure deployment models
  • Serverless deployment options
  • Kubernetes deployments
  • PaaS hosting options
  • Multi-region architectures
  • High availability design
  • Security considerations
  • CI/CD pipelines
  • Scaling strategies
  • AI deployment tradeoffs

Strong deployment architecture skills help ensure AI systems are:

  • Reliable
  • Scalable
  • Secure
  • Cost-effective
  • Maintainable

Practice Exam Questions

Question 1

Which Azure service is BEST suited for enterprise-scale container orchestration for AI applications?

A. Azure App Service
B. Azure Kubernetes Service (AKS)
C. Azure DNS
D. Azure Backup

Answer

B. Azure Kubernetes Service (AKS)

Explanation

AKS provides enterprise-grade container orchestration and scalability.


Question 2

Which deployment option provides dedicated throughput capacity for Azure OpenAI models?

A. Shared deployment
B. Provisioned throughput deployment
C. Consumption deployment
D. Basic deployment

Answer

B. Provisioned throughput deployment

Explanation

Provisioned throughput reserves dedicated model processing capacity.


Question 3

Which Azure service is MOST appropriate for lightweight event-driven AI workflows?

A. Azure Functions
B. Azure Firewall
C. Azure Backup
D. Azure CDN

Answer

A. Azure Functions

Explanation

Azure Functions supports serverless event-driven execution.


Question 4

What is the primary benefit of Availability Zones?

A. Lower token usage
B. Increased embedding size
C. Improved fault tolerance
D. Reduced API authentication

Answer

C. Improved fault tolerance

Explanation

Availability Zones provide redundancy across isolated datacenters.


Question 5

Which Azure service is commonly used to host AI web applications and APIs with minimal infrastructure management?

A. Azure App Service
B. Azure Load Balancer
C. Azure DNS
D. Azure Monitor

Answer

A. Azure App Service

Explanation

Azure App Service is a managed PaaS platform for hosting web applications and APIs.


Question 6

Which deployment strategy gradually releases updates to a subset of users first?

A. Blue-green deployment
B. Canary deployment
C. Full rollback deployment
D. Batch deployment

Answer

B. Canary deployment

Explanation

Canary deployments release updates incrementally to reduce risk.


Question 7

Which Azure service securely stores API keys and secrets for AI applications?

A. Azure Key Vault
B. Azure CDN
C. Azure Firewall
D. Azure Backup

Answer

A. Azure Key Vault

Explanation

Azure Key Vault securely manages secrets and credentials.


Question 8

Which Azure deployment option is MOST appropriate for serverless container hosting?

A. Azure Container Apps
B. Azure Backup
C. Azure DNS
D. Azure Files

Answer

A. Azure Container Apps

Explanation

Azure Container Apps provides simplified serverless container deployment.


Question 9

Which deployment architecture improves global application availability and disaster recovery?

A. Single-region deployment
B. Multi-region deployment
C. Local-only deployment
D. Single-container deployment

Answer

B. Multi-region deployment

Explanation

Multi-region deployments improve resilience and geographic performance.


Question 10

Which Azure monitoring service provides application telemetry and request diagnostics?

A. Application Insights
B. Azure CDN
C. Azure DNS
D. Azure Policy

Answer

A. Application Insights

Explanation

Application Insights provides monitoring and telemetry for applications.


Go to the AI-103 Exam Prep Hub main page

Design Azure infrastructure for AI Apps and agent-based solutions (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Plan and manage an Azure AI solution (25–30%)
--> Set up AI solutions in Foundry
--> Design Azure infrastructure for AI Apps and agent-based solutions


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Designing infrastructure for AI applications and agent-based systems is one of the most important responsibilities for Azure AI developers.

Modern AI solutions are not simply standalone models. They are distributed cloud systems that combine:

  • AI services
  • APIs
  • Databases
  • Search systems
  • Storage
  • Networking
  • Security controls
  • Monitoring systems
  • Agent orchestration components

The AI-103: Develop AI Apps and Agents on Azure certification exam tests your ability to design Azure infrastructure that supports:

  • Generative AI applications
  • AI agents
  • Retrieval-Augmented Generation (RAG)
  • Vector search
  • Multimodal AI systems
  • Scalable AI architectures
  • Secure enterprise AI deployments

For the AI-103 exam, you should understand:

  • Core Azure infrastructure services
  • AI architecture patterns
  • Scalability and performance design
  • Networking and security
  • Identity and access management
  • Storage and databases
  • Monitoring and observability
  • Cost optimization
  • High availability and disaster recovery
  • Infrastructure choices for AI agents

Core Components of AI Infrastructure

AI applications commonly require multiple infrastructure layers.

Typical components include:

  1. AI model services
  2. Compute resources
  3. Storage systems
  4. Search and retrieval systems
  5. Networking components
  6. Security services
  7. Monitoring systems
  8. Workflow orchestration
  9. API management
  10. Identity management

Azure AI Services Layer

Azure OpenAI

Azure OpenAI provides:

  • Large Language Models (LLMs)
  • Embedding models
  • Multimodal models
  • Conversational AI capabilities

Azure OpenAI is commonly used for:

  • AI copilots
  • Chatbots
  • AI agents
  • Summarization
  • Content generation
  • Tool calling

Azure AI Search

Azure AI Search supports:

  • Vector search
  • Semantic search
  • Hybrid search
  • Enterprise retrieval
  • RAG architectures

It is commonly used for:

  • Knowledge grounding
  • Enterprise search
  • AI assistant retrieval

Azure AI Vision

Azure AI Vision provides:

  • OCR
  • Image analysis
  • Object detection
  • Caption generation
  • Visual understanding

Azure AI Document Intelligence

Azure AI Document Intelligence supports:

  • Invoice extraction
  • Form processing
  • Layout analysis
  • OCR workflows
  • Structured document extraction

Compute Infrastructure for AI Applications

Azure App Service

Azure App Service is commonly used to host:

  • Web applications
  • AI front ends
  • APIs
  • Lightweight AI services

Advantages:

  • Managed platform
  • Easy scaling
  • Simplified deployment

Azure Kubernetes Service (AKS)

AKS provides container orchestration for:

  • Large-scale AI applications
  • Microservices
  • Agent orchestration systems
  • Distributed AI workloads

Advantages:

  • High scalability
  • Container management
  • Advanced orchestration
  • Enterprise-grade deployments

When to Use AKS

Use AKS when:

  • Complex orchestration is required
  • Multiple services interact
  • High scalability is needed
  • Microservice architectures are used

Azure Functions

Azure Functions provides serverless compute.

Common AI use cases:

  • Tool execution
  • Event-driven workflows
  • API integrations
  • Lightweight processing
  • Agent tool calling

Advantages:

  • Pay-per-use pricing
  • Automatic scaling
  • Fast development

Azure Container Apps

Azure Container Apps provides simplified container hosting.

Useful for:

  • API services
  • AI middleware
  • Lightweight agent services
  • Event-driven AI components

Choosing the Correct Compute Service

Use Azure App Service When:

  • Hosting simple AI web apps
  • Managing APIs
  • Rapid deployment is needed

Use AKS When:

  • Large-scale orchestration is required
  • Complex microservices exist
  • Advanced scalability is necessary

Use Azure Functions When:

  • Event-driven execution is needed
  • Tool calling is required
  • Lightweight compute is sufficient

Use Azure Container Apps When:

  • Container simplicity is preferred
  • Serverless containers are desired

Storage Infrastructure

AI systems often require multiple storage solutions.


Azure Blob Storage

Azure Blob Storage supports:

  • Document storage
  • Training data
  • Images
  • Videos
  • Logs
  • AI datasets

Common AI uses:

  • RAG document storage
  • Knowledge repositories
  • Media storage

Azure Cosmos DB

Azure Cosmos DB provides:

  • Globally distributed NoSQL storage
  • Low-latency access
  • High scalability

Common AI uses:

  • Agent memory
  • Session storage
  • User profiles
  • Conversation history

Azure SQL Database

Azure SQL Database supports:

  • Structured enterprise data
  • Relational workloads
  • Transactional systems

Common AI uses:

  • Enterprise integration
  • Business systems
  • Structured metadata

Vector Storage

Vector-enabled storage supports:

  • Embedding storage
  • Similarity search
  • Semantic retrieval

Common services include:

  • Azure AI Search
  • Azure Cosmos DB
  • Azure SQL Database

Networking Infrastructure

AI solutions require secure and scalable networking.


Virtual Networks (VNets)

VNets provide:

  • Network isolation
  • Secure communication
  • Private connectivity

Use VNets when:

  • Enterprise security is required
  • Private networking is necessary
  • Sensitive data is involved

Private Endpoints

Private Endpoints allow Azure services to be accessed privately through VNets.

Benefits:

  • Improved security
  • Reduced public exposure
  • Enterprise compliance support

API Management

Azure API Management helps:

  • Secure APIs
  • Throttle requests
  • Monitor API usage
  • Apply policies
  • Manage agent APIs

This is important for:

  • AI agents
  • Tool integrations
  • Enterprise API governance

Load Balancing

Azure Load Balancer and Application Gateway help:

  • Distribute traffic
  • Improve availability
  • Scale AI applications

Identity and Security

Security is a major AI-103 exam topic.


Microsoft Entra ID

Microsoft Entra ID provides:

  • Authentication
  • Authorization
  • Identity management
  • Role-based access control (RBAC)

AI applications use Entra ID for:

  • User authentication
  • API access control
  • Secure enterprise integration

Role-Based Access Control (RBAC)

RBAC ensures users and services only access authorized resources.

Examples:

  • Restricting AI model access
  • Controlling storage access
  • Securing search indexes

Azure Key Vault

Azure Key Vault stores:

  • Secrets
  • API keys
  • Certificates
  • Connection strings

Never hardcode secrets in AI applications.


Azure AI Content Safety

Azure AI Content Safety helps:

  • Detect harmful content
  • Filter unsafe outputs
  • Support responsible AI practices

Monitoring and Observability

AI systems require monitoring for:

  • Reliability
  • Performance
  • Cost
  • Failures
  • Hallucinations
  • API latency

Azure Monitor

Azure Monitor collects:

  • Metrics
  • Logs
  • Alerts
  • Performance data

Application Insights

Application Insights supports:

  • Application telemetry
  • Request tracing
  • Error tracking
  • Dependency monitoring

Useful for:

  • AI apps
  • APIs
  • Agent workflows

Logging AI Systems

AI systems should log:

  • Prompts
  • Responses
  • Errors
  • Tool calls
  • Latency
  • Retrieval quality

Logging helps:

  • Troubleshooting
  • Auditing
  • Evaluation
  • Compliance

Scalability Design

AI applications may experience:

  • High traffic
  • Large token volumes
  • Heavy retrieval workloads
  • Concurrent agent operations

Infrastructure must scale effectively.


Horizontal Scaling

Horizontal scaling adds more instances.

Examples:

  • Additional API servers
  • More containers
  • More worker nodes

Vertical Scaling

Vertical scaling increases resource capacity.

Examples:

  • More CPU
  • More memory
  • Larger VM sizes

Autoscaling

Autoscaling dynamically adjusts resources based on demand.

Common services supporting autoscaling:

  • AKS
  • Azure Functions
  • App Service
  • Container Apps

High Availability and Disaster Recovery

Enterprise AI systems require resilience.


Availability Zones

Availability Zones improve fault tolerance.

Benefits:

  • Redundancy
  • Improved uptime
  • Reduced outage risk

Geo-Redundancy

Geo-redundancy replicates data across regions.

Useful for:

  • Disaster recovery
  • Business continuity
  • Global applications

Backup and Recovery

AI systems should back up:

  • Knowledge indexes
  • Databases
  • Configuration data
  • Logs
  • Agent memory

Infrastructure for AI Agents

AI agents often require additional infrastructure components.


Agent Orchestration

AI agents may require orchestration services such as:

  • Prompt Flow
  • Azure Functions
  • Logic Apps
  • AKS workflows

Retrieval Infrastructure

Agent systems commonly use:

  • Azure AI Search
  • Embeddings
  • Vector indexes
  • RAG pipelines

Persistent Memory Infrastructure

Persistent memory may use:

  • Azure Cosmos DB
  • Azure SQL Database
  • Blob Storage

Tool Integration Infrastructure

Agents often integrate with:

  • REST APIs
  • Databases
  • External SaaS systems
  • Enterprise workflows

Common AI-103 Architecture Scenarios

Scenario 1: Enterprise AI Copilot

Requirements:

  • Conversational AI
  • Enterprise search
  • Secure authentication
  • Document retrieval

Recommended Infrastructure:

  • Azure OpenAI
  • Azure AI Search
  • Entra ID
  • Blob Storage
  • App Service

Scenario 2: Large-Scale Multi-Agent System

Requirements:

  • Multiple AI agents
  • High scalability
  • Distributed orchestration

Recommended Infrastructure:

  • AKS
  • Azure Functions
  • Prompt Flow
  • Cosmos DB

Scenario 3: AI Invoice Processing Solution

Requirements:

  • OCR
  • Document extraction
  • Workflow automation

Recommended Infrastructure:

  • Azure AI Document Intelligence
  • Blob Storage
  • Logic Apps
  • Azure Functions

Scenario 4: Global AI Chat Platform

Requirements:

  • Global availability
  • High concurrency
  • Disaster recovery

Recommended Infrastructure:

  • Geo-redundant storage
  • Availability Zones
  • Load balancing
  • Autoscaling

Cost Optimization Considerations

AI infrastructure can become expensive.


Common Cost Drivers

  • Token usage
  • Vector storage
  • GPU workloads
  • Data transfer
  • Search indexing
  • High-scale orchestration

Cost Optimization Strategies

Use Smaller Models When Appropriate

Smaller models reduce:

  • Compute usage
  • Token costs
  • Latency

Use Autoscaling

Autoscaling reduces idle resource costs.


Optimize Retrieval Pipelines

Efficient chunking and indexing reduce:

  • Search costs
  • Storage requirements
  • Retrieval latency

Common AI-103 Exam Tips

Understand Infrastructure Tradeoffs

Know when to use:

  • AKS vs App Service
  • Functions vs Containers
  • Cosmos DB vs SQL Database

Learn Security Best Practices

Know how to use:

  • Entra ID
  • RBAC
  • Key Vault
  • Private Endpoints

Understand RAG Infrastructure

RAG commonly uses:

  • Azure OpenAI
  • Azure AI Search
  • Embeddings
  • Storage systems

Know Agent Infrastructure Patterns

AI agents commonly require:

  • Workflow orchestration
  • Tool integration
  • Persistent memory
  • Retrieval systems

Summary

Designing Azure infrastructure for AI applications requires balancing:

  • Scalability
  • Security
  • Performance
  • Cost
  • Reliability
  • Maintainability

For the AI-103 exam, you should understand:

  • Azure AI service architecture
  • Compute options
  • Storage design
  • Networking and security
  • Monitoring and observability
  • High availability
  • Agent infrastructure patterns
  • RAG infrastructure
  • Infrastructure scaling strategies

Strong infrastructure design skills are essential for deploying production-grade AI apps and agent-based systems on Azure.


Practice Exam Questions

Question 1

Which Azure service is MOST appropriate for enterprise vector search and RAG retrieval?

A. Azure AI Search
B. Azure Backup
C. Azure CDN
D. Azure DNS

Answer

A. Azure AI Search

Explanation

Azure AI Search supports vector search, semantic search, and retrieval for RAG systems.


Question 2

Which Azure compute service is BEST suited for large-scale containerized AI microservices?

A. Azure App Service
B. Azure Kubernetes Service (AKS)
C. Azure Files
D. Azure CDN

Answer

B. Azure Kubernetes Service (AKS)

Explanation

AKS provides advanced container orchestration and scalability.


Question 3

Which Azure service is MOST appropriate for storing API keys and secrets securely?

A. Azure Key Vault
B. Azure Monitor
C. Azure DNS
D. Azure Load Balancer

Answer

A. Azure Key Vault

Explanation

Azure Key Vault securely stores secrets, certificates, and keys.


Question 4

Which Azure service provides serverless execution for lightweight AI workflows and tool calling?

A. Azure Functions
B. Azure Backup
C. Azure CDN
D. Azure Firewall

Answer

A. Azure Functions

Explanation

Azure Functions supports event-driven serverless compute.


Question 5

What is the primary purpose of Availability Zones?

A. Reduce token usage
B. Improve fault tolerance and uptime
C. Replace backups
D. Encrypt embeddings

Answer

B. Improve fault tolerance and uptime

Explanation

Availability Zones provide redundancy across isolated datacenter locations.


Question 6

Which Azure service is MOST commonly used for globally distributed NoSQL storage in AI applications?

A. Azure Cosmos DB
B. Azure DNS
C. Azure Files
D. Azure CDN

Answer

A. Azure Cosmos DB

Explanation

Azure Cosmos DB provides scalable globally distributed NoSQL storage.


Question 7

Which Azure networking feature enables private access to Azure services from a VNet?

A. Private Endpoint
B. Public IP
C. Load Balancer
D. Traffic Manager

Answer

A. Private Endpoint

Explanation

Private Endpoints provide secure private connectivity.


Question 8

Which Azure monitoring service provides application telemetry and request tracing?

A. Application Insights
B. Azure CDN
C. Azure Policy
D. Azure ExpressRoute

Answer

A. Application Insights

Explanation

Application Insights provides telemetry and diagnostics for applications.


Question 9

Which Azure identity service provides authentication and RBAC support for AI applications?

A. Microsoft Entra ID
B. Azure CDN
C. Azure Firewall
D. Azure Front Door

Answer

A. Microsoft Entra ID

Explanation

Microsoft Entra ID provides identity and access management.


Question 10

Which scaling strategy adds additional instances to support increased AI workload demand?

A. Vertical scaling
B. Horizontal scaling
C. Encryption scaling
D. Semantic scaling

Answer

B. Horizontal scaling

Explanation

Horizontal scaling adds more instances to distribute workloads.


Go to the AI-103 Exam Prep Hub main page