This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub.
This topic falls under these sections:
Implement computer vision solutions (10–15%)
--> Design and implement multimodal understanding workflows
--> Configure generation of alt-text and extended image descriptions aligned to accessibility guidelines
Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.
Introduction
Accessibility is a critical requirement in modern AI applications. Multimodal AI systems can automatically generate:
- Alt-text
- Image captions
- Extended image descriptions
- Contextual accessibility summaries
These capabilities improve usability for individuals who rely on:
- Screen readers
- Assistive technologies
- Audio narration
- Alternative interfaces
For the AI-103 certification exam, you should understand how to configure systems that generate accessible image descriptions aligned with accessibility standards and Responsible AI principles.
This includes:
- Alt-text generation
- Extended descriptions
- Accessibility-focused prompting
- Multimodal understanding workflows
- Caption quality validation
- Accessibility compliance
- Responsible AI considerations
You should also understand:
- WCAG accessibility concepts
- Concise vs detailed descriptions
- OCR-enhanced accessibility workflows
- Human review processes
- Azure services used for accessibility-focused AI solutions
This topic falls under:
“Design and implement multimodal understanding workflows”
What Is Alt-Text?
Definition
Alt-text (alternative text) is a textual description of an image used by assistive technologies such as screen readers.
Alt-text helps users who cannot see images understand visual content.
Example of Alt-Text
Image:
- A woman reading a book in a park
Alt-text:
A woman sitting on a park bench reading a book beneath a large tree
Purpose of Alt-Text
Alt-text improves:
- Accessibility
- Inclusion
- Search indexing
- Content usability
It is especially important for:
- Websites
- Mobile apps
- Educational platforms
- E-commerce systems
What Are Extended Image Descriptions?
Definition
Extended image descriptions provide more detailed explanations than standard alt-text.
These are useful for:
- Complex charts
- Infographics
- Educational diagrams
- Scientific imagery
- Data visualizations
Example of Extended Description
Image:
- Sales dashboard
Extended description:
A dashboard displaying quarterly sales trends from January through December. Sales rise steadily from Q1 to Q3 before declining slightly in Q4. The highest-performing category is electronics.
Concise vs Extended Descriptions
Concise Alt-Text
Short and focused.
Example:
A red sports car parked beside a city street
Best for:
- Simple images
- Fast accessibility reading
Extended Descriptions
Detailed and contextual.
Example:
A red convertible sports car is parked beside a busy downtown street lined with office buildings and pedestrians during the evening rush hour
Best for:
- Complex scenes
- Educational content
- Accessibility enhancement
Accessibility Standards
WCAG Overview
Accessibility systems often align with:
World Wide Web Consortium
Web Content Accessibility Guidelines (WCAG).
WCAG focuses on:
- Perceivable content
- Operable interfaces
- Understandable information
- Robust accessibility support
Importance of Accessibility Compliance
Organizations may need accessibility compliance for:
- Legal requirements
- Public sector systems
- Educational platforms
- Enterprise accessibility policies
Characteristics of Good Alt-Text
Effective alt-text should:
- Be concise
- Be meaningful
- Focus on important content
- Avoid unnecessary details
- Reflect image purpose
Weak Alt-Text Example
Image of a thing
Problems:
- Too vague
- Provides little value
Strong Alt-Text Example
A firefighter carrying a child away from a smoke-filled building
Advantages:
- Clear
- Specific
- Contextual
When to Use Extended Descriptions
Extended descriptions are useful when images contain:
- Charts
- Tables
- Infographics
- Scientific diagrams
- Dense visual information
Decorative Images
Decorative images may require:
- Empty alt-text
- No narration
This prevents unnecessary screen reader noise.
Multimodal Models for Accessibility
Modern multimodal AI systems can:
- Analyze images
- Detect objects
- Identify relationships
- Extract visible text
- Generate natural-language descriptions
Accessibility-Focused Captioning
Accessibility captioning differs from general captioning because it prioritizes:
- Clarity
- Inclusiveness
- Contextual usefulness
- Screen-reader compatibility
OCR-Enhanced Accessibility
OCR (Optical Character Recognition) improves accessibility by extracting visible text from:
- Signs
- Labels
- Screenshots
- Infographics
- Documents
Example OCR Workflow
Image:
- Conference slide
OCR extracts:
Quarterly Revenue Growth
The system incorporates this text into the description.
Prompt Engineering for Accessibility
Accessibility-Focused Prompts
Prompt engineering helps guide multimodal models to produce accessibility-friendly descriptions.
Example Prompt
Generate concise alt-text suitable for a screen reader
Extended Description Prompt
Generate a detailed accessibility description including visible text, relationships, and environmental context
Prompt Engineering Best Practices
Focus on Important Information
Describe:
- Key actions
- Important objects
- Meaningful context
Avoid:
- Irrelevant background details
Match Description Length to Use Case
Use:
- Concise descriptions for simple images
- Extended descriptions for complex visuals
Avoid Assumptions
Do not infer:
- Emotions
- Intentions
- Identities
unless visually clear.
Structured Accessibility Outputs
Applications may request:
- JSON output
- Categorized descriptions
- Metadata tags
Example:
Return alt-text and extracted text as JSON
Multi-Image Accessibility Workflows
Applications may generate:
- Individual alt-text
- Album summaries
- Comparative descriptions
Example Multi-Image Summary
A family vacation featuring beach activities, hiking trails, and outdoor dining experiences
Accessibility for Charts and Diagrams
Complex visuals require:
- Trend descriptions
- Key data insights
- Structural explanations
Example Chart Description
The chart shows revenue increasing steadily from January through September before declining slightly in October and November
Responsible AI Considerations
Accessibility systems introduce important Responsible AI concerns.
Bias and Fairness
Models may:
- Misidentify individuals
- Reinforce stereotypes
- Produce biased descriptions
Privacy Concerns
Images may contain:
- Faces
- Sensitive documents
- Personal information
Organizations must protect user privacy.
Hallucinations
What Are Hallucinations?
Hallucinations occur when models describe nonexistent content.
Example:
- Mentioning a laptop that does not appear in the image
Reducing Hallucinations
Strategies include:
- Grounded prompting
- OCR validation
- Confidence scoring
- Human review
Human-in-the-Loop Review
Manual review is often required for:
- Public-facing systems
- Educational materials
- Government applications
- Sensitive accessibility content
Azure AI Content Safety
Microsoft provides:
Azure AI Content Safety
to help detect:
- Harmful content
- Unsafe imagery
- Policy violations
Performance Considerations
Accessibility workflows may process:
- Large image libraries
- High-resolution assets
- Batch uploads
Factors affecting performance include:
- Model complexity
- OCR processing
- Batch size
- GPU availability
Optimization Techniques
Image Resizing
Reduce unnecessary resolution.
Batch Processing
Process multiple images simultaneously.
Asynchronous Workflows
Improve application responsiveness.
Caching
Reuse existing image descriptions when appropriate.
Azure Services for Accessibility Workflows
Azure OpenAI Service
Azure OpenAI Service
Supports:
- Multimodal reasoning
- Accessibility-focused prompting
- Natural-language description generation
Azure AI Vision
Azure AI Vision
Supports:
- Image analysis
- OCR
- Caption generation
- Object detection
Azure AI Document Intelligence
Azure AI Document Intelligence
Supports:
- Layout understanding
- OCR extraction
- Document accessibility workflows
Azure AI Foundry
Azure AI Foundry
Supports:
- Workflow orchestration
- Prompt flows
- AI evaluation pipelines
Azure Blob Storage
Azure Blob Storage
Frequently used for:
- Image storage
- Accessibility metadata storage
- Workflow integration
Azure Functions
Azure Functions
Often used for:
- Event-driven workflows
- Accessibility processing pipelines
- Batch orchestration
Observability and Monitoring
Production accessibility systems should monitor:
- Caption latency
- OCR accuracy
- Hallucination frequency
- Accessibility quality metrics
- Failed requests
- Safety violations
- Operational costs
Best Practices for Accessibility-Focused AI
Prioritize Clarity
Descriptions should be understandable and useful.
Match Description Depth to Content Complexity
Use concise or extended descriptions appropriately.
Include Visible Text When Relevant
OCR improves accessibility quality.
Avoid Biased Language
Use neutral, factual descriptions.
Validate Outputs
Check for hallucinations and inaccuracies.
Support Human Review
Especially important for high-impact content.
Maintain Accessibility Compliance
Align with WCAG principles and organizational policies.
Real-World Example
An educational platform may:
- Upload classroom diagrams
- Use OCR to extract visible labels
- Generate concise alt-text for thumbnails
- Generate extended descriptions for complex diagrams
- Validate outputs with accessibility reviewers
- Store descriptions for screen-reader access
This demonstrates:
- Accessibility-focused prompting
- OCR integration
- Extended descriptions
- Human-in-the-loop review
Exam Tips for AI-103
For the AI-103 exam, remember these important concepts:
- Alt-text provides accessible image descriptions for screen readers.
- Extended descriptions support complex visuals such as charts and diagrams.
- Accessibility workflows often align with WCAG principles.
- OCR improves accessibility by extracting visible text.
- Concise descriptions are best for simple visuals.
- Extended descriptions are best for complex content.
- Hallucinations occur when models describe nonexistent content.
- Accessibility-focused prompting improves output quality.
- Azure AI Vision supports OCR and image analysis.
- Azure AI Content Safety helps moderate unsafe imagery.
- Human review may be required for sensitive or public-facing systems.
Practice Exam Questions
Question 1
What is the primary purpose of alt-text?
A. Compressing image files
B. Providing accessible image descriptions for assistive technologies
C. Encrypting image metadata
D. Accelerating GPU rendering
Answer
B. Providing accessible image descriptions for assistive technologies
Explanation
Alt-text enables screen readers to describe images to visually impaired users.
Question 2
When are extended image descriptions most useful?
A. For decorative images only
B. For complex visuals such as charts and diagrams
C. For reducing GPU utilization
D. For encrypting media assets
Answer
B. For complex visuals such as charts and diagrams
Explanation
Extended descriptions provide detailed explanations for visually dense content.
Question 3
What is a characteristic of good alt-text?
A. Excessive technical jargon
B. Clear and meaningful descriptions
C. Random artistic interpretation
D. Extremely long paragraphs for every image
Answer
B. Clear and meaningful descriptions
Explanation
Good alt-text should concisely communicate important image content.
Question 4
What does OCR contribute to accessibility workflows?
A. Automatic image compression
B. Extraction of visible text from images and documents
C. Elimination of GPU usage
D. Encryption of screen-reader output
Answer
B. Extraction of visible text from images and documents
Explanation
OCR improves accessibility by incorporating visible text into descriptions.
Question 5
What is a hallucination in an accessibility-focused AI system?
A. Generating unsupported or nonexistent details
B. Compressing images automatically
C. Encrypting image metadata
D. Scaling GPU clusters
Answer
A. Generating unsupported or nonexistent details
Explanation
Hallucinations occur when the model describes content not actually present.
Question 6
Which Azure service supports OCR and image analysis?
A. Azure AI Vision
B. Azure DNS
C. Azure Firewall
D. Azure Virtual WAN
Answer
A. Azure AI Vision
Explanation
Azure AI Vision supports OCR, captioning, and image understanding.
Question 7
Why should accessibility-focused prompts be specific?
A. To reduce storage requirements
B. To improve relevance and clarity of generated descriptions
C. To disable OCR functionality
D. To eliminate all hallucinations automatically
Answer
B. To improve relevance and clarity of generated descriptions
Explanation
Specific prompts guide multimodal models toward better accessibility outputs.
Question 8
What is a best practice for accessibility-focused image descriptions?
A. Avoid describing important context
B. Match description detail to image complexity
C. Always generate the longest possible description
D. Ignore visible text in diagrams
Answer
B. Match description detail to image complexity
Explanation
Simple images may need concise descriptions, while complex visuals require more detail.
Question 9
Which organization publishes WCAG accessibility guidelines?
A. World Wide Web Consortium (W3C)
B. Linux Foundation
C. IEEE
D. Apache Software Foundation
Answer
A. World Wide Web Consortium (W3C)
Explanation
The W3C publishes the Web Content Accessibility Guidelines (WCAG).
Question 10
Why might human review be required in accessibility workflows?
A. To validate accuracy and inclusiveness of generated descriptions
B. To reduce internet bandwidth usage
C. To disable multimodal prompting
D. To eliminate OCR processing
Answer
A. To validate accuracy and inclusiveness of generated descriptions
Explanation
Human review helps ensure accessibility descriptions are accurate, fair, and useful.
Go to the AI-103 Exam Prep Hub main page
