This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub.
This topic falls under these sections:
Plan and manage an Azure AI solution (25–30%)
--> Set up AI solutions in Foundry
--> Design Azure infrastructure for AI Apps and agent-based solutions
Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.
Introduction
Designing infrastructure for AI applications and agent-based systems is one of the most important responsibilities for Azure AI developers.
Modern AI solutions are not simply standalone models. They are distributed cloud systems that combine:
- AI services
- APIs
- Databases
- Search systems
- Storage
- Networking
- Security controls
- Monitoring systems
- Agent orchestration components
The AI-103: Develop AI Apps and Agents on Azure certification exam tests your ability to design Azure infrastructure that supports:
- Generative AI applications
- AI agents
- Retrieval-Augmented Generation (RAG)
- Vector search
- Multimodal AI systems
- Scalable AI architectures
- Secure enterprise AI deployments
For the AI-103 exam, you should understand:
- Core Azure infrastructure services
- AI architecture patterns
- Scalability and performance design
- Networking and security
- Identity and access management
- Storage and databases
- Monitoring and observability
- Cost optimization
- High availability and disaster recovery
- Infrastructure choices for AI agents
Core Components of AI Infrastructure
AI applications commonly require multiple infrastructure layers.
Typical components include:
- AI model services
- Compute resources
- Storage systems
- Search and retrieval systems
- Networking components
- Security services
- Monitoring systems
- Workflow orchestration
- API management
- Identity management
Azure AI Services Layer
Azure OpenAI
Azure OpenAI provides:
- Large Language Models (LLMs)
- Embedding models
- Multimodal models
- Conversational AI capabilities
Azure OpenAI is commonly used for:
- AI copilots
- Chatbots
- AI agents
- Summarization
- Content generation
- Tool calling
Azure AI Search
Azure AI Search supports:
- Vector search
- Semantic search
- Hybrid search
- Enterprise retrieval
- RAG architectures
It is commonly used for:
- Knowledge grounding
- Enterprise search
- AI assistant retrieval
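Hybrid search has to merge a keyword ranking and a vector ranking into a single result list. Azure AI Search documents Reciprocal Rank Fusion (RRF) as its method for this step. The sketch below shows the general idea in plain Python; the document IDs, the function name, and the constant k=60 are illustrative assumptions, not the service's internal implementation.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists from a keyword query and a vector query.
keyword_hits = ["doc-a", "doc-b", "doc-c"]
vector_hits = ["doc-b", "doc-d", "doc-a"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear near the top of both lists rise to the top of the fused list, which is why hybrid search often grounds answers better than either method alone.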
Azure AI Vision
Azure AI Vision provides:
- OCR
- Image analysis
- Object detection
- Caption generation
- Visual understanding
Azure AI Document Intelligence
Azure AI Document Intelligence supports:
- Invoice extraction
- Form processing
- Layout analysis
- OCR workflows
- Structured document extraction
Compute Infrastructure for AI Applications
Azure App Service
Azure App Service is commonly used to host:
- Web applications
- AI front ends
- APIs
- Lightweight AI services
Advantages:
- Managed platform
- Easy scaling
- Simplified deployment
Azure Kubernetes Service (AKS)
AKS provides container orchestration for:
- Large-scale AI applications
- Microservices
- Agent orchestration systems
- Distributed AI workloads
Advantages:
- High scalability
- Container management
- Advanced orchestration
- Enterprise-grade deployments
When to Use AKS
Use AKS when:
- Complex orchestration is required
- Multiple services interact
- High scalability is needed
- Microservice architectures are used
Azure Functions
Azure Functions provides serverless, event-driven compute: code runs in response to triggers such as HTTP requests, queue messages, or timers.
Common AI use cases:
- Tool execution
- Event-driven workflows
- API integrations
- Lightweight processing
- Agent tool calling
Advantages:
- Pay-per-use pricing (on the Consumption plan)
- Automatic scaling
- Fast development
Azure Container Apps
Azure Container Apps provides serverless container hosting, so you can run containers without managing a Kubernetes cluster yourself.
Useful for:
- API services
- AI middleware
- Lightweight agent services
- Event-driven AI components
Choosing the Correct Compute Service
Use Azure App Service When:
- Hosting simple AI web apps
- Hosting APIs
- Rapid deployment is needed
Use AKS When:
- Large-scale orchestration is required
- Complex microservices exist
- Advanced scalability is necessary
Use Azure Functions When:
- Event-driven execution is needed
- Tool calling is required
- Lightweight compute is sufficient
Use Azure Container Apps When:
- Container simplicity is preferred
- Serverless containers are desired
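The guidance above can be condensed into a small decision helper. This is only a study aid under simplifying assumptions: the function name, the three boolean flags, and the order of the checks are hypothetical, and real designs weigh many more factors (team skills, cost, networking, existing platform).

```python
def choose_compute(complex_orchestration=False, containerized=False,
                   event_driven=False):
    """Return a suggested Azure compute service for a workload.

    The rule order is an assumption made for illustration only.
    """
    if complex_orchestration:
        # Many interacting services that need advanced orchestration.
        return "Azure Kubernetes Service (AKS)"
    if containerized:
        # Containers without the overhead of running a cluster.
        return "Azure Container Apps"
    if event_driven:
        # Short, trigger-based work such as agent tool calls.
        return "Azure Functions"
    # Default: a simple web app or API that needs rapid deployment.
    return "Azure App Service"

print(choose_compute(event_driven=True))
print(choose_compute(containerized=True, complex_orchestration=True))
```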
Storage Infrastructure
AI systems often require multiple storage solutions.
Azure Blob Storage
Azure Blob Storage supports:
- Document storage
- Training data
- Images
- Videos
- Logs
- AI datasets
Common AI uses:
- RAG document storage
- Knowledge repositories
- Media storage
Azure Cosmos DB
Azure Cosmos DB provides:
- Globally distributed NoSQL storage
- Low-latency access
- High scalability
Common AI uses:
- Agent memory
- Session storage
- User profiles
- Conversation history
Azure SQL Database
Azure SQL Database supports:
- Structured enterprise data
- Relational workloads
- Transactional systems
Common AI uses:
- Enterprise integration
- Business systems
- Structured metadata
Vector Storage
Vector-enabled storage supports:
- Embedding storage
- Similarity search
- Semantic retrieval
Common services include:
- Azure AI Search
- Azure Cosmos DB
- Azure SQL Database
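To make similarity search concrete, here is a minimal in-memory sketch that ranks stored embeddings by cosine similarity to a query embedding. The three-dimensional vectors and document names are made up for illustration; real embeddings have hundreds or thousands of dimensions, and a managed vector store would use an approximate index rather than this brute-force scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, index, k=2):
    """Return the IDs of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical embeddings keyed by document ID.
index = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.0],
    "office-hours": [0.0, 0.1, 0.9],
}
results = top_k([0.8, 0.2, 0.0], index, k=2)
print(results)
```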
Networking Infrastructure
AI solutions require secure and scalable networking.
Virtual Networks (VNets)
VNets provide:
- Network isolation
- Secure communication
- Private connectivity
Use VNets when:
- Enterprise security is required
- Private networking is necessary
- Sensitive data is involved
Private Endpoints
A Private Endpoint gives an Azure service a private IP address inside your VNet, so traffic to that service travels over Azure Private Link instead of the public internet.
Benefits:
- Improved security
- Reduced public exposure
- Enterprise compliance support
API Management
Azure API Management helps:
- Secure APIs
- Throttle requests
- Monitor API usage
- Apply policies
- Manage agent APIs
This is important for:
- AI agents
- Tool integrations
- Enterprise API governance
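Request throttling is usually explained with a token bucket: each caller has a bucket of tokens that refills at a fixed rate, and a request is rejected when the bucket is empty. The sketch below illustrates that concept only. API Management applies throttling through configured policies rather than code like this, and the class name and parameters here are hypothetical.

```python
class TokenBucket:
    """Allow up to `capacity` burst requests, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.last = 0.0

    def allow(self, now):
        """Return True if a request arriving at time `now` is accepted."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never beyond the bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Timestamps are passed in explicitly so the example is deterministic.
bucket = TokenBucket(capacity=2, refill_per_second=1.0)
decisions = [bucket.allow(0.0), bucket.allow(0.0),
             bucket.allow(0.0), bucket.allow(1.0)]
print(decisions)
```

The third call is rejected because the burst allowance is used up; one second later a token has been refilled and the next call succeeds.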
Load Balancing
Azure Load Balancer (layer 4, TCP/UDP) and Application Gateway (layer 7, HTTP/HTTPS) help:
- Distribute traffic
- Improve availability
- Scale AI applications
Identity and Security
Security is a major AI-103 exam topic.
Microsoft Entra ID
Microsoft Entra ID provides:
- Authentication
- Authorization
- Identity management
- The identity foundation for role-based access control (RBAC)
AI applications use Entra ID for:
- User authentication
- API access control
- Secure enterprise integration
Role-Based Access Control (RBAC)
RBAC grants users and services only the permissions their assigned roles allow, which supports least-privilege access.
Examples:
- Restricting AI model access
- Controlling storage access
- Securing search indexes
Azure Key Vault
Azure Key Vault stores:
- Secrets
- API keys
- Certificates
- Connection strings
Never hardcode secrets in AI applications.
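One way to avoid hardcoded secrets is to resolve them at runtime and fail loudly when one is missing. The sketch below reads from environment variables, which keeps it runnable anywhere; in Azure the value would typically be supplied from Key Vault, for example through an app setting that references a Key Vault secret or through the Key Vault client library. The setting name SEARCH_API_KEY and the helper names are hypothetical.

```python
import os

class MissingSecretError(RuntimeError):
    """Raised when a required secret has not been configured."""

def get_secret(name, env=None):
    """Look up a secret by name instead of embedding it in source code."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise MissingSecretError(f"Secret {name!r} is not configured")
    return value

# A dictionary stands in for the process environment in this example.
api_key = get_secret("SEARCH_API_KEY", env={"SEARCH_API_KEY": "example-value"})
print(len(api_key))
```

Failing at startup when a secret is absent is a deliberate choice here: it surfaces configuration mistakes before the application serves traffic.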
Azure AI Content Safety
Azure AI Content Safety helps:
- Detect harmful content
- Filter unsafe outputs
- Support responsible AI practices
Monitoring and Observability
AI systems require monitoring for:
- Reliability
- Performance
- Cost
- Failures
- Hallucinations
- API latency
Azure Monitor
Azure Monitor collects:
- Metrics
- Logs
- Alerts
- Performance data
Application Insights
Application Insights supports:
- Application telemetry
- Request tracing
- Error tracking
- Dependency monitoring
Useful for:
- AI apps
- APIs
- Agent workflows
Logging AI Systems
AI systems should log:
- Prompts
- Responses
- Errors
- Tool calls
- Latency
- Retrieval quality
Logging supports:
- Troubleshooting
- Auditing
- Evaluation
- Compliance
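A structured log entry covering the fields listed above might look like the following sketch. The field names and sample values are assumptions chosen for illustration; a real application would send such records to its telemetry pipeline rather than print them.

```python
import json
import time
import uuid

def build_log_record(prompt, response, tool_calls, latency_ms,
                     retrieval_scores=None, error=None):
    """Build one structured log entry for a single model interaction."""
    return {
        "trace_id": str(uuid.uuid4()),   # correlates related log lines
        "timestamp": time.time(),
        "prompt": prompt,
        "response": response,
        "tool_calls": tool_calls,
        "latency_ms": latency_ms,
        "retrieval_scores": retrieval_scores or [],
        "error": error,
    }

record = build_log_record(
    prompt="What is our refund policy?",
    response="Refunds are accepted within 30 days.",
    tool_calls=["search_knowledge_base"],
    latency_ms=840,
    retrieval_scores=[0.91, 0.77],
)
log_line = json.dumps(record)
print(log_line)
```

Emitting JSON keeps every field queryable later, which helps with the troubleshooting, auditing, and evaluation goals.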
Scalability Design
AI applications may experience:
- High traffic
- Large token volumes
- Heavy retrieval workloads
- Concurrent agent operations
Infrastructure must scale effectively.
Horizontal Scaling
Horizontal scaling adds more instances.
Examples:
- Additional API servers
- More containers
- More worker nodes
Vertical Scaling
Vertical scaling increases resource capacity.
Examples:
- More CPU
- More memory
- Larger VM sizes
Autoscaling
Autoscaling dynamically adjusts resources based on demand.
Common services supporting autoscaling:
- AKS
- Azure Functions
- App Service
- Container Apps
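As a rough illustration of how an autoscaler decides on instance counts, the sketch below applies the proportional rule used by the Kubernetes Horizontal Pod Autoscaler, which is available on AKS: scale the current replica count by the ratio of the observed metric to its target. Other services use their own scaling rules, and the function name, bounds, and sample numbers here are assumptions.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Proportional scaling: replicas grow with metric / target."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds so scaling cannot run away.
    return max(min_replicas, min(max_replicas, raw))

# 4 replicas at 90% average CPU with a 60% target.
scale_out = desired_replicas(4, current_metric=90, target_metric=60)
# 4 replicas at 30% average CPU with a 60% target.
scale_in = desired_replicas(4, current_metric=30, target_metric=60)
print(scale_out, scale_in)
```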
High Availability and Disaster Recovery
Enterprise AI systems require resilience.
Availability Zones
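Availability Zones are physically separate datacenter locations within an Azure region. Spreading resources across zones improves fault tolerance.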
Benefits:
- Redundancy
- Improved uptime
- Reduced outage risk
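The value of redundancy can be estimated with a simple probability calculation. The sketch below assumes that each copy fails independently and that the system stays up as long as one copy is healthy; both are simplifying assumptions, and the 99% figure is a made-up input rather than a published service level.

```python
def combined_availability(single_availability, copies):
    """Availability of `copies` redundant instances that fail independently."""
    # The system is down only when every copy is down at the same time.
    return 1 - (1 - single_availability) ** copies

one_zone = combined_availability(0.99, 1)
two_zones = combined_availability(0.99, 2)
print(round(one_zone, 6), round(two_zones, 6))
```

Under these assumptions, adding a second independent copy turns a 1% chance of downtime into roughly a 0.01% chance.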
Geo-Redundancy
Geo-redundancy replicates data to a second Azure region so that it survives a regional outage.
Useful for:
- Disaster recovery
- Business continuity
- Global applications
Backup and Recovery
AI systems should back up:
- Knowledge indexes
- Databases
- Configuration data
- Logs
- Agent memory
Infrastructure for AI Agents
AI agents often require additional infrastructure components.
Agent Orchestration
AI agents may require orchestration services such as:
- Prompt Flow
- Azure Functions
- Logic Apps
- AKS workflows
Retrieval Infrastructure
Agent systems commonly use:
- Azure AI Search
- Embeddings
- Vector indexes
- RAG pipelines
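In a RAG pipeline, the retrieved passages are placed into the prompt so the model answers from them instead of from memory alone. The sketch below shows only that prompt-assembly step. The instruction wording, function name, and sample passages are hypothetical, and the retrieval and model calls are deliberately left out.

```python
def build_grounded_prompt(question, retrieved_chunks, max_chunks=3):
    """Combine a user question with numbered source passages."""
    sources = retrieved_chunks[:max_chunks]
    context = "\n".join(f"[{i}] {chunk}"
                        for i, chunk in enumerate(sources, start=1))
    return (
        "Answer using only the sources below. Cite sources by number.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )

# Passages as they might come back from a search index, best match first.
chunks = [
    "Refunds are accepted within 30 days of purchase.",
    "Refunds are issued to the original payment method.",
    "Gift cards are non-refundable.",
    "Our office is closed on public holidays.",
]
prompt = build_grounded_prompt("Can I get a refund after two weeks?", chunks)
print(prompt)
```

Capping the number of passages keeps token usage predictable, which ties retrieval design directly to cost.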
Persistent Memory Infrastructure
Persistent memory may use:
- Azure Cosmos DB
- Azure SQL Database
- Blob Storage
Tool Integration Infrastructure
Agents often integrate with:
- REST APIs
- Databases
- External SaaS systems
- Enterprise workflows
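Tool integration usually comes down to mapping a tool name chosen by the model to a function the application is willing to run. The sketch below shows a minimal registry and dispatcher. The tool names and their stubbed behavior are hypothetical; a real tool would call an API, database, or workflow.

```python
def lookup_order(order_id):
    """Hypothetical tool: a real version would call an order system."""
    return {"order_id": order_id, "status": "shipped"}

def convert_currency(amount, rate):
    """Hypothetical tool: multiply an amount by an exchange rate."""
    return round(amount * rate, 2)

# Only functions registered here can be invoked by the agent.
TOOLS = {
    "lookup_order": lookup_order,
    "convert_currency": convert_currency,
}

def execute_tool_call(name, arguments):
    """Run a registered tool with keyword arguments supplied by the model."""
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**arguments)

result = execute_tool_call("lookup_order", {"order_id": "A-100"})
print(result)
```

An explicit allow-list like this keeps the agent from invoking anything the application has not registered.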
Common AI-103 Architecture Scenarios
Scenario 1: Enterprise AI Copilot
Requirements:
- Conversational AI
- Enterprise search
- Secure authentication
- Document retrieval
Recommended Infrastructure:
- Azure OpenAI
- Azure AI Search
- Entra ID
- Blob Storage
- App Service
Scenario 2: Large-Scale Multi-Agent System
Requirements:
- Multiple AI agents
- High scalability
- Distributed orchestration
Recommended Infrastructure:
- AKS
- Azure Functions
- Prompt Flow
- Cosmos DB
Scenario 3: AI Invoice Processing Solution
Requirements:
- OCR
- Document extraction
- Workflow automation
Recommended Infrastructure:
- Azure AI Document Intelligence
- Blob Storage
- Logic Apps
- Azure Functions
Scenario 4: Global AI Chat Platform
Requirements:
- Global availability
- High concurrency
- Disaster recovery
Recommended Infrastructure:
- Geo-redundant storage
- Availability Zones
- Load balancing
- Autoscaling
Cost Optimization Considerations
AI infrastructure costs grow with usage, so identify the main cost drivers early in the design.
Common Cost Drivers
- Token usage
- Vector storage
- GPU workloads
- Data transfer
- Search indexing
- High-scale orchestration
Cost Optimization Strategies
Use Smaller Models When Appropriate
Smaller models reduce:
- Compute usage
- Token costs
- Latency
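A quick estimate shows how model choice affects token costs. The per-1,000-token prices below are placeholder numbers invented for this example, not actual Azure pricing; substitute current prices from the pricing page before drawing conclusions.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_1k, output_price_per_1k):
    """Estimate the cost of one request from its token counts."""
    return (input_tokens / 1000 * input_price_per_1k
            + output_tokens / 1000 * output_price_per_1k)

# Same request (2,000 input tokens, 500 output tokens) on two
# hypothetical models with made-up prices.
large_model = estimate_cost(2000, 500, input_price_per_1k=0.01,
                            output_price_per_1k=0.03)
small_model = estimate_cost(2000, 500, input_price_per_1k=0.001,
                            output_price_per_1k=0.002)
print(large_model > small_model)
```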
Use Autoscaling
Autoscaling reduces idle resource costs.
Optimize Retrieval Pipelines
Efficient chunking and indexing reduce:
- Search costs
- Storage requirements
- Retrieval latency
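Chunking splits documents into pieces small enough to embed and retrieve. The sketch below uses fixed-size character windows with overlap so that a sentence cut at a boundary still appears whole in a neighboring chunk. The sizes are arbitrary assumptions, and production pipelines often split on tokens or document structure instead of raw characters.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping fixed-size character chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks

pieces = chunk_text("abcdefghij", chunk_size=4, overlap=2)
print(pieces)
```

Larger overlap improves recall at chunk boundaries but stores more duplicate text, so it trades retrieval quality against index size.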
Common AI-103 Exam Tips
Understand Infrastructure Tradeoffs
Know when to use:
- AKS vs App Service
- Functions vs Containers
- Cosmos DB vs SQL Database
Learn Security Best Practices
Know how to use:
- Entra ID
- RBAC
- Key Vault
- Private Endpoints
Understand RAG Infrastructure
RAG commonly uses:
- Azure OpenAI
- Azure AI Search
- Embeddings
- Storage systems
Know Agent Infrastructure Patterns
AI agents commonly require:
- Workflow orchestration
- Tool integration
- Persistent memory
- Retrieval systems
Summary
Designing Azure infrastructure for AI applications requires balancing:
- Scalability
- Security
- Performance
- Cost
- Reliability
- Maintainability
For the AI-103 exam, you should understand:
- Azure AI service architecture
- Compute options
- Storage design
- Networking and security
- Monitoring and observability
- High availability
- Agent infrastructure patterns
- RAG infrastructure
- Infrastructure scaling strategies
Strong infrastructure design skills are essential for deploying production-grade AI apps and agent-based systems on Azure.
Practice Exam Questions
Question 1
Which Azure service is MOST appropriate for enterprise vector search and RAG retrieval?
A. Azure AI Search
B. Azure Backup
C. Azure CDN
D. Azure DNS
Answer
A. Azure AI Search
Explanation
Azure AI Search supports vector search, semantic search, and retrieval for RAG systems.
Question 2
Which Azure compute service is BEST suited for large-scale containerized AI microservices?
A. Azure App Service
B. Azure Kubernetes Service (AKS)
C. Azure Files
D. Azure CDN
Answer
B. Azure Kubernetes Service (AKS)
Explanation
AKS provides advanced container orchestration and scalability.
Question 3
Which Azure service is MOST appropriate for storing API keys and secrets securely?
A. Azure Key Vault
B. Azure Monitor
C. Azure DNS
D. Azure Load Balancer
Answer
A. Azure Key Vault
Explanation
Azure Key Vault securely stores secrets, certificates, and keys.
Question 4
Which Azure service provides serverless execution for lightweight AI workflows and tool calling?
A. Azure Functions
B. Azure Backup
C. Azure CDN
D. Azure Firewall
Answer
A. Azure Functions
Explanation
Azure Functions supports event-driven serverless compute.
Question 5
What is the primary purpose of Availability Zones?
A. Reduce token usage
B. Improve fault tolerance and uptime
C. Replace backups
D. Encrypt embeddings
Answer
B. Improve fault tolerance and uptime
Explanation
Availability Zones provide redundancy across isolated datacenter locations.
Question 6
Which Azure service is MOST commonly used for globally distributed NoSQL storage in AI applications?
A. Azure Cosmos DB
B. Azure DNS
C. Azure Files
D. Azure CDN
Answer
A. Azure Cosmos DB
Explanation
Azure Cosmos DB provides scalable globally distributed NoSQL storage.
Question 7
Which Azure networking feature enables private access to Azure services from a VNet?
A. Private Endpoint
B. Public IP
C. Load Balancer
D. Traffic Manager
Answer
A. Private Endpoint
Explanation
Private Endpoints provide secure private connectivity.
Question 8
Which Azure monitoring service provides application telemetry and request tracing?
A. Application Insights
B. Azure CDN
C. Azure Policy
D. Azure ExpressRoute
Answer
A. Application Insights
Explanation
Application Insights provides telemetry and diagnostics for applications.
Question 9
Which Azure identity service provides authentication and RBAC support for AI applications?
A. Microsoft Entra ID
B. Azure CDN
C. Azure Firewall
D. Azure Front Door
Answer
A. Microsoft Entra ID
Explanation
Microsoft Entra ID provides identity and access management.
Question 10
Which scaling strategy adds additional instances to support increased AI workload demand?
A. Vertical scaling
B. Horizontal scaling
C. Encryption scaling
D. Semantic scaling
Answer
B. Horizontal scaling
Explanation
Horizontal scaling adds more instances to distribute workloads.
