Choose an appropriate model for each task, including large language models (LLMs), small language models, multimodal models, and Foundry Tools (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Plan and manage an Azure AI solution (25–30%)
--> Choose the appropriate Foundry services for generative AI and agents
--> Choose an appropriate model for each task, including large language models (LLMs), small language models, multimodal models, and Foundry Tools


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

One of the most important skills for the AI-103: Develop AI Apps and Agents on Azure certification exam is understanding how to choose the correct AI model and supporting Azure AI Foundry tools for a given business or technical scenario.

Modern AI development is no longer about simply selecting “an AI model.” Instead, developers must evaluate:

  • The type of task being performed
  • Cost constraints
  • Latency requirements
  • Accuracy expectations
  • Reasoning complexity
  • Context window needs
  • Multimodal capabilities
  • Deployment environment
  • Security and governance requirements
  • Agent orchestration requirements

Azure AI Foundry provides access to multiple categories of models and tools that help developers build generative AI applications and AI agents efficiently.

For the AI-103 exam, you should understand:

  • When to use Large Language Models (LLMs)
  • When Small Language Models (SLMs) are preferable
  • When multimodal models are required
  • How Azure AI Foundry tools support model selection and orchestration
  • Tradeoffs between performance, cost, speed, and capability
  • Common real-world scenarios for each model category

Azure AI Foundry Overview

Azure AI Foundry is Microsoft’s unified platform for building, evaluating, deploying, and managing AI applications and agents.

Azure AI Foundry provides:

  • Access to foundation models
  • Agent development capabilities
  • Prompt engineering tools
  • Evaluation tools
  • Safety and content filtering
  • Retrieval-augmented generation (RAG) support
  • Fine-tuning capabilities
  • Monitoring and observability
  • Integration with Azure AI services

Azure AI Foundry enables developers to:

  • Compare multiple models
  • Test prompts
  • Evaluate outputs
  • Build AI agents
  • Connect enterprise data
  • Deploy scalable AI applications

For the AI-103 exam, understanding the relationship between model capabilities and Azure AI Foundry tools is extremely important.


Understanding Model Categories

The exam focuses heavily on selecting the correct model type for specific tasks.

The major categories include:

  1. Large Language Models (LLMs)
  2. Small Language Models (SLMs)
  3. Multimodal Models
  4. Embedding Models
  5. Specialized Models

Each category serves different purposes.


Large Language Models (LLMs)

What Are Large Language Models?

Large Language Models are advanced AI models trained on massive datasets containing text, code, and other information.

LLMs are designed for:

  • Natural language understanding
  • Natural language generation
  • Complex reasoning
  • Summarization
  • Coding assistance
  • Question answering
  • Conversational AI
  • Agent workflows
  • Content creation

Examples include:

  • GPT-4 family models
  • GPT-4o models
  • GPT-4 Turbo
  • Phi large models
  • Other frontier foundation models available in Azure AI Foundry

Characteristics of LLMs

Strengths

LLMs are excellent at:

Complex Reasoning

Examples:

  • Multi-step problem solving
  • Data interpretation
  • Logical analysis
  • Decision support

Advanced Content Generation

Examples:

  • Marketing content
  • Technical documentation
  • Email drafting
  • Knowledge-base generation

Conversational Experiences

Examples:

  • AI chatbots
  • AI copilots
  • Virtual assistants
  • Interactive tutoring systems

Agentic Workflows

LLMs are commonly used as the “reasoning engine” behind AI agents.

They can:

  • Plan tasks
  • Determine next actions
  • Call tools
  • Use memory
  • Chain workflows
  • Interact with APIs

Limitations of LLMs

Although powerful, LLMs have tradeoffs.

Higher Cost

LLMs generally:

  • Require more compute
  • Cost more per token
  • Increase infrastructure expenses

Increased Latency

Larger models may:

  • Respond more slowly
  • Increase application response times
  • Affect real-time user experiences

Resource Requirements

LLMs require:

  • More GPU resources
  • More memory
  • Larger deployments

Overkill for Simple Tasks

Using GPT-4-level reasoning for basic classification or short summarization tasks may be unnecessary and expensive.


When to Use LLMs

Choose an LLM when tasks require:

  • Advanced reasoning
  • Long-context understanding
  • High-quality content generation
  • Complex conversational behavior
  • Tool calling and agent orchestration
  • Coding assistance
  • Sophisticated summarization
  • Enterprise copilots

Example LLM Scenarios

Scenario 1: Enterprise AI Copilot

A company wants an AI assistant that:

  • Reads internal documentation
  • Answers employee questions
  • Generates summaries
  • Explains policies
  • Uses tools and APIs

Best choice:

  • Large Language Model with RAG integration

Reason:

  • Requires reasoning and conversational understanding.

Scenario 2: AI Coding Assistant

A development team needs:

  • Code generation
  • Debugging suggestions
  • Refactoring support
  • Documentation generation

Best choice:

  • Advanced LLM

Reason:

  • Coding tasks require complex contextual reasoning.

Small Language Models (SLMs)

What Are Small Language Models?

Small Language Models are more lightweight AI models optimized for:

  • Faster responses
  • Lower costs
  • Lower resource consumption
  • Edge deployments
  • Narrower tasks

Examples include:

  • Smaller Phi models
  • Compact transformer-based models
  • Task-specific lightweight models

Characteristics of SLMs

Strengths

Lower Cost

SLMs:

  • Consume fewer resources
  • Cost less to run
  • Reduce token usage costs

Faster Inference

SLMs typically:

  • Respond more quickly
  • Improve responsiveness
  • Support near real-time interactions

Edge and Mobile Suitability

SLMs may run:

  • On edge devices
  • On mobile hardware
  • In constrained environments

Efficient for Narrow Tasks

SLMs work well for:

  • Classification
  • Basic summarization
  • Intent detection
  • Simple chat interactions
  • Lightweight automation

Limitations of SLMs

Reduced Reasoning Ability

Compared to LLMs, SLMs may struggle with:

  • Complex logic
  • Long context handling
  • Multi-step reasoning
  • Sophisticated conversations

Lower Output Quality

Outputs may:

  • Be less nuanced
  • Contain reduced detail
  • Provide weaker contextual understanding

When to Use SLMs

Choose an SLM when:

  • Speed is critical
  • Cost optimization matters
  • Tasks are relatively simple
  • Edge deployment is needed
  • High throughput is required
  • Lightweight AI experiences are sufficient

Example SLM Scenarios

Scenario 1: Customer Intent Classification

An application classifies support tickets into categories such as:

  • Billing
  • Technical support
  • Returns
  • Sales

Best choice:

  • Small Language Model

Reason:

  • Classification is relatively simple and does not require advanced reasoning.

Scenario 2: Edge Device Assistant

A manufacturing company deploys an AI assistant on factory equipment with limited compute.

Best choice:

  • Small Language Model

Reason:

  • Edge environments benefit from lightweight models.

Multimodal Models

What Are Multimodal Models?

Multimodal models can process multiple data types simultaneously.

Examples include:

  • Text
  • Images
  • Audio
  • Video
  • Documents

These models combine information across modalities to produce richer outputs.


Capabilities of Multimodal Models

Multimodal models can:

  • Analyze images and answer questions about them
  • Generate captions from images
  • Extract information from documents
  • Process speech and text together
  • Understand charts and diagrams
  • Support visual reasoning

Common Multimodal Tasks

Image Understanding

Examples:

  • Object detection
  • Scene analysis
  • Image captioning
  • Visual question answering

Document Intelligence

Examples:

  • Invoice extraction
  • Receipt processing
  • Form analysis
  • OCR workflows

Audio + Text Experiences

Examples:

  • Voice assistants
  • Meeting summarization
  • Speech transcription
  • Audio analysis

When to Use Multimodal Models

Choose multimodal models when applications involve:

  • Images and text together
  • Document processing
  • Speech interactions
  • Visual understanding
  • Cross-modal reasoning

Example Multimodal Scenarios

Scenario 1: Invoice Processing

A company needs to:

  • Read invoices
  • Extract totals
  • Identify vendors
  • Validate line items

Best choice:

  • Multimodal document processing model

Reason:

  • The solution must interpret both layout and text.

Scenario 2: Retail Image Assistant

Users upload photos of products and ask questions about them.

Best choice:

  • Multimodal model

Reason:

  • Requires simultaneous image and text understanding.

Embedding Models

What Are Embedding Models?

Embedding models convert text or other content into vector representations.

These vectors capture semantic meaning.

Embedding models are essential for:

  • Semantic search
  • Retrieval-Augmented Generation (RAG)
  • Similarity matching
  • Recommendation systems
  • Knowledge retrieval

Retrieval-Augmented Generation (RAG)

RAG combines:

  • Embedding models
  • Vector databases
  • LLMs

Workflow:

  1. Convert documents into embeddings
  2. Store embeddings in a vector index
  3. Convert user query into embeddings
  4. Retrieve relevant content
  5. Send retrieved data to the LLM

RAG improves:

  • Accuracy
  • Freshness of information
  • Enterprise grounding
  • Hallucination reduction

Specialized Models

Some tasks are better handled by specialized AI models instead of general-purpose LLMs.

Examples:

  • Translation models
  • Speech models
  • OCR models
  • Vision models
  • Classification models

Why Specialized Models Matter

Specialized models may provide:

  • Better accuracy
  • Lower cost
  • Faster performance
  • Simpler deployment

Example:

Using a dedicated OCR service is often more efficient than asking an LLM to read text from images.


Model Selection Factors

The AI-103 exam heavily tests your ability to select the correct model based on requirements.


Factor 1: Task Complexity

Use LLMs For:

  • Advanced reasoning
  • Multi-step workflows
  • Complex conversations

Use SLMs For:

  • Simple classification
  • Lightweight interactions
  • Fast automation

Factor 2: Cost

LLMs

  • Higher operational cost
  • More expensive inference

SLMs

  • Lower operational cost
  • Better for high-volume workloads

Factor 3: Latency

Low-Latency Requirements

Prefer:

  • SLMs
  • Lightweight models

Complex Processing

Prefer:

  • LLMs

Even if response time increases.


Factor 4: Context Window

Some tasks require processing:

  • Long documents
  • Large conversations
  • Extensive histories

Choose models with larger context windows for:

  • Legal analysis
  • Knowledge assistants
  • Long-form summarization

Factor 5: Multimodal Requirements

If the application involves:

  • Images
  • Audio
  • Video
  • Documents

Choose multimodal-capable models.


Factor 6: Deployment Environment

Cloud-Hosted Applications

May use:

  • Large frontier models
  • GPU-intensive deployments

Edge or Mobile Deployments

Prefer:

  • Small models
  • Quantized models
  • Lightweight inference

Azure AI Foundry Tools

Azure AI Foundry includes numerous tools that support model selection and AI application development.


Model Catalog

The Model Catalog allows developers to:

  • Browse available models
  • Compare capabilities
  • Review benchmarks
  • Deploy models
  • Evaluate pricing

The catalog includes:

  • Microsoft-hosted models
  • Open-source models
  • Partner models
  • Frontier models

Prompt Flow

Prompt Flow helps developers:

  • Build AI workflows
  • Chain prompts together
  • Integrate tools
  • Evaluate prompts
  • Test model behavior

Prompt Flow is useful for:

  • Agent orchestration
  • RAG pipelines
  • Multi-step AI workflows

AI Agent Development Tools

Azure AI Foundry supports AI agents that can:

  • Use tools
  • Access data
  • Maintain memory
  • Perform actions
  • Execute workflows

Agent frameworks may include:

  • Tool calling
  • Function calling
  • Retrieval integration
  • Multi-agent orchestration

Evaluation Tools

Evaluation tools help developers assess:

  • Accuracy
  • Groundedness
  • Safety
  • Relevance
  • Latency
  • Cost

Evaluation is critical because model quality varies by task.


Content Safety Tools

Azure AI Foundry includes safety features such as:

  • Content filtering
  • Harm detection
  • Prompt injection detection
  • Responsible AI controls

These tools help ensure safe AI deployments.


Fine-Tuning Tools

Fine-tuning allows developers to customize models using:

  • Domain-specific data
  • Proprietary terminology
  • Specialized workflows

Fine-tuning may improve:

  • Accuracy
  • Consistency
  • Industry-specific responses

However, fine-tuning also:

  • Increases cost
  • Requires data preparation
  • Adds operational complexity

Choosing Between Prompt Engineering, RAG, and Fine-Tuning

This is a very important AI-103 exam topic.


Prompt Engineering

Use when:

  • You need quick customization
  • Tasks are general-purpose
  • No private data integration is needed

Advantages:

  • Fast
  • Cheap
  • Easy to maintain

RAG

Use when:

  • You need current or proprietary data
  • You want grounding in enterprise content
  • You need dynamic knowledge retrieval

Advantages:

  • Reduces hallucinations
  • Keeps knowledge current
  • Avoids retraining

Fine-Tuning

Use when:

  • Consistent specialized outputs are required
  • Domain language is highly unique
  • Behavioral customization is necessary

Advantages:

  • Tailored responses
  • Better domain alignment

Real-World Model Selection Examples

Example 1: FAQ Chatbot

Requirements:

  • Low cost
  • Fast responses
  • Basic conversational support

Best Choice:

  • Small Language Model + RAG

Example 2: Legal Document Assistant

Requirements:

  • Long-context understanding
  • Detailed summarization
  • Advanced reasoning

Best Choice:

  • Large Language Model with large context window

Example 3: Mobile AI App

Requirements:

  • Offline capability
  • Fast performance
  • Low resource usage

Best Choice:

  • Small Language Model

Example 4: Image-Based Customer Support

Requirements:

  • Analyze uploaded photos
  • Understand text and images
  • Generate responses

Best Choice:

  • Multimodal model

Key AI-103 Exam Tips

Understand Tradeoffs

You should know:

  • Bigger models are not always better
  • Simpler tasks may not require advanced LLMs
  • Cost and latency matter
  • Specialized models may outperform general models

Know Common Pairings

LLM + RAG

Used for:

  • Enterprise chatbots
  • Knowledge assistants
  • AI copilots

Embeddings + Vector Search

Used for:

  • Semantic search
  • Knowledge retrieval
  • Similarity matching

Multimodal Models

Used for:

  • Vision AI
  • Document processing
  • Audio interactions

Learn the Azure AI Foundry Ecosystem

Know the purpose of:

  • Model Catalog
  • Prompt Flow
  • Evaluation tools
  • Agent tools
  • Safety systems
  • Fine-tuning workflows

Summary

Selecting the correct AI model is one of the most important responsibilities for an Azure AI developer.

For the AI-103 exam, you should understand:

  • The differences between LLMs and SLMs
  • When multimodal models are required
  • How embedding models support RAG
  • When specialized models outperform general-purpose models
  • The tradeoffs between cost, speed, and reasoning capability
  • How Azure AI Foundry tools support AI development and orchestration

In real-world AI systems, choosing the correct model can dramatically improve:

  • Performance
  • User experience
  • Scalability
  • Operational cost
  • Reliability
  • Maintainability

A strong understanding of model selection is essential for designing effective Azure AI applications and AI agents.


Practice Exam Questions

Question 1

A company is building an enterprise AI assistant that must answer complex employee questions using internal documentation and perform multi-step reasoning. Which model type is MOST appropriate?

A. Small Language Model (SLM)
B. Embedding model only
C. Large Language Model (LLM)
D. OCR model

Answer

C. Large Language Model (LLM)

Explanation

Complex reasoning and conversational understanding are best handled by LLMs.


Question 2

Which model type is generally BEST for low-cost, low-latency classification tasks?

A. Large multimodal model
B. Small Language Model (SLM)
C. GPT-4-class reasoning model
D. Vision foundation model

Answer

B. Small Language Model (SLM)

Explanation

SLMs are optimized for lightweight and cost-efficient tasks.


Question 3

A solution must process uploaded invoices and extract totals, vendor names, and line items. Which model type is MOST appropriate?

A. Embedding model
B. Small Language Model
C. Multimodal model
D. Translation model

Answer

C. Multimodal model

Explanation

Invoice extraction requires understanding both layout and text.


Question 4

What is the primary purpose of embedding models?

A. Image generation
B. Semantic vector representation
C. Audio transcription
D. Tool orchestration

Answer

B. Semantic vector representation

Explanation

Embedding models convert content into vectors for semantic search and retrieval.


Question 5

Which Azure AI Foundry tool helps developers chain prompts, integrate tools, and build AI workflows?

A. Azure Monitor
B. Prompt Flow
C. Azure Policy
D. Azure Functions

Answer

B. Prompt Flow

Explanation

Prompt Flow is designed for workflow orchestration and prompt pipelines.


Question 6

A mobile AI application must operate with minimal compute resources and very fast response times. Which model type is MOST appropriate?

A. Large Language Model
B. Small Language Model
C. Large multimodal model
D. High-context reasoning model

Answer

B. Small Language Model

Explanation

SLMs are optimized for lightweight and edge deployments.


Question 7

Which approach is BEST when an AI chatbot must use current enterprise data without retraining the model?

A. Fine-tuning only
B. Prompt engineering only
C. Retrieval-Augmented Generation (RAG)
D. Quantization

Answer

C. Retrieval-Augmented Generation (RAG)

Explanation

RAG retrieves current information dynamically without retraining.


Question 8

Which factor MOST strongly indicates that a multimodal model is required?

A. Need for vector embeddings
B. Need for faster response times
C. Need to process images and text together
D. Need for lower cost

Answer

C. Need to process images and text together

Explanation

Multimodal models handle multiple input modalities simultaneously.


Question 9

What is a major tradeoff of using larger language models?

A. Reduced reasoning capability
B. Lower context windows
C. Increased operational cost
D. Inability to support agents

Answer

C. Increased operational cost

Explanation

Larger models typically require more compute resources and cost more.


Question 10

Which Azure AI Foundry capability helps evaluate model quality, safety, and groundedness?

A. Azure Load Balancer
B. Evaluation tools
C. Azure Backup
D. Traffic Manager

Answer

B. Evaluation tools

Explanation

Evaluation tools assess output quality, safety, and performance metrics.


Go to the AI-103 Exam Prep Hub main page

Leave a comment