Configure generation of alt-text and extended image descriptions aligned to accessibility guidelines (AI-103 Exam Prep)

This post is a part of the AI-103: Develop AI Apps and Agents on Azure Exam Prep Hub. 
This topic falls under these sections:
Implement computer vision solutions (10–15%)
--> Design and implement multimodal understanding workflows
--> Configure generation of alt-text and extended image descriptions aligned to accessibility guidelines


Note that there are 10 practice questions (with answers and explanations) at the end of each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available from the hub's main page below the exam topics section.

Introduction

Accessibility is a critical requirement in modern AI applications. Multimodal AI systems can automatically generate:

  • Alt-text
  • Image captions
  • Extended image descriptions
  • Contextual accessibility summaries

These capabilities improve usability for individuals who rely on:

  • Screen readers
  • Assistive technologies
  • Audio narration
  • Alternative interfaces

For the AI-103 certification exam, you should understand how to configure systems that generate accessible image descriptions aligned with accessibility standards and Responsible AI principles.

This includes:

  • Alt-text generation
  • Extended descriptions
  • Accessibility-focused prompting
  • Multimodal understanding workflows
  • Caption quality validation
  • Accessibility compliance
  • Responsible AI considerations

You should also understand:

  • WCAG accessibility concepts
  • Concise vs detailed descriptions
  • OCR-enhanced accessibility workflows
  • Human review processes
  • Azure services used for accessibility-focused AI solutions

This topic falls under:

“Design and implement multimodal understanding workflows”


What Is Alt-Text?

Definition

Alt-text (alternative text) is a textual description of an image used by assistive technologies such as screen readers.

Alt-text helps users who cannot see images understand visual content.


Example of Alt-Text

Image:

  • A woman reading a book in a park

Alt-text:

A woman sitting on a park bench reading a book beneath a large tree

Purpose of Alt-Text

Alt-text improves:

  • Accessibility
  • Inclusion
  • Search indexing
  • Content usability

It is especially important for:

  • Websites
  • Mobile apps
  • Educational platforms
  • E-commerce systems

What Are Extended Image Descriptions?

Definition

Extended image descriptions provide more detailed explanations than standard alt-text.

These are useful for:

  • Complex charts
  • Infographics
  • Educational diagrams
  • Scientific imagery
  • Data visualizations

Example of Extended Description

Image:

  • Sales dashboard

Extended description:

A dashboard displaying quarterly sales trends from January through December. Sales rise steadily from Q1 to Q3 before declining slightly in Q4. The highest-performing category is electronics.

Concise vs Extended Descriptions

Concise Alt-Text

Short and focused.

Example:

A red sports car parked beside a city street

Best for:

  • Simple images
  • Fast accessibility reading

Extended Descriptions

Detailed and contextual.

Example:

A red convertible sports car is parked beside a busy downtown street lined with office buildings and pedestrians during the evening rush hour

Best for:

  • Complex scenes
  • Educational content
  • Accessibility enhancement

Accessibility Standards

WCAG Overview

Accessibility systems often align with:
World Wide Web Consortium
Web Content Accessibility Guidelines (WCAG).

WCAG focuses on:

  • Perceivable content
  • Operable interfaces
  • Understandable information
  • Robust accessibility support

Importance of Accessibility Compliance

Organizations may need accessibility compliance for:

  • Legal requirements
  • Public sector systems
  • Educational platforms
  • Enterprise accessibility policies

Characteristics of Good Alt-Text

Effective alt-text should:

  • Be concise
  • Be meaningful
  • Focus on important content
  • Avoid unnecessary details
  • Reflect image purpose

Weak Alt-Text Example

Image of a thing

Problems:

  • Too vague
  • Provides little value

Strong Alt-Text Example

A firefighter carrying a child away from a smoke-filled building

Advantages:

  • Clear
  • Specific
  • Contextual

When to Use Extended Descriptions

Extended descriptions are useful when images contain:

  • Charts
  • Tables
  • Infographics
  • Scientific diagrams
  • Dense visual information

Decorative Images

Decorative images may require:

  • Empty alt-text
  • No narration

This prevents unnecessary screen reader noise.


Multimodal Models for Accessibility

Modern multimodal AI systems can:

  • Analyze images
  • Detect objects
  • Identify relationships
  • Extract visible text
  • Generate natural-language descriptions

Accessibility-Focused Captioning

Accessibility captioning differs from general captioning because it prioritizes:

  • Clarity
  • Inclusiveness
  • Contextual usefulness
  • Screen-reader compatibility

OCR-Enhanced Accessibility

OCR (Optical Character Recognition) improves accessibility by extracting visible text from:

  • Signs
  • Labels
  • Screenshots
  • Infographics
  • Documents

Example OCR Workflow

Image:

  • Conference slide

OCR extracts:

Quarterly Revenue Growth

The system incorporates this text into the description.


Prompt Engineering for Accessibility

Accessibility-Focused Prompts

Prompt engineering helps guide multimodal models to produce accessibility-friendly descriptions.


Example Prompt

Generate concise alt-text suitable for a screen reader

Extended Description Prompt

Generate a detailed accessibility description including visible text, relationships, and environmental context

Prompt Engineering Best Practices

Focus on Important Information

Describe:

  • Key actions
  • Important objects
  • Meaningful context

Avoid:

  • Irrelevant background details

Match Description Length to Use Case

Use:

  • Concise descriptions for simple images
  • Extended descriptions for complex visuals

Avoid Assumptions

Do not infer:

  • Emotions
  • Intentions
  • Identities
    unless visually clear.

Structured Accessibility Outputs

Applications may request:

  • JSON output
  • Categorized descriptions
  • Metadata tags

Example:

Return alt-text and extracted text as JSON

Multi-Image Accessibility Workflows

Applications may generate:

  • Individual alt-text
  • Album summaries
  • Comparative descriptions

Example Multi-Image Summary

A family vacation featuring beach activities, hiking trails, and outdoor dining experiences

Accessibility for Charts and Diagrams

Complex visuals require:

  • Trend descriptions
  • Key data insights
  • Structural explanations

Example Chart Description

The chart shows revenue increasing steadily from January through September before declining slightly in October and November

Responsible AI Considerations

Accessibility systems introduce important Responsible AI concerns.


Bias and Fairness

Models may:

  • Misidentify individuals
  • Reinforce stereotypes
  • Produce biased descriptions

Privacy Concerns

Images may contain:

  • Faces
  • Sensitive documents
  • Personal information

Organizations must protect user privacy.


Hallucinations

What Are Hallucinations?

Hallucinations occur when models describe nonexistent content.

Example:

  • Mentioning a laptop that does not appear in the image

Reducing Hallucinations

Strategies include:

  • Grounded prompting
  • OCR validation
  • Confidence scoring
  • Human review

Human-in-the-Loop Review

Manual review is often required for:

  • Public-facing systems
  • Educational materials
  • Government applications
  • Sensitive accessibility content

Azure AI Content Safety

Microsoft provides:
Azure AI Content Safety

to help detect:

  • Harmful content
  • Unsafe imagery
  • Policy violations

Performance Considerations

Accessibility workflows may process:

  • Large image libraries
  • High-resolution assets
  • Batch uploads

Factors affecting performance include:

  • Model complexity
  • OCR processing
  • Batch size
  • GPU availability

Optimization Techniques

Image Resizing

Reduce unnecessary resolution.


Batch Processing

Process multiple images simultaneously.


Asynchronous Workflows

Improve application responsiveness.


Caching

Reuse existing image descriptions when appropriate.


Azure Services for Accessibility Workflows

Azure OpenAI Service

Azure OpenAI Service

Supports:

  • Multimodal reasoning
  • Accessibility-focused prompting
  • Natural-language description generation

Azure AI Vision

Azure AI Vision

Supports:

  • Image analysis
  • OCR
  • Caption generation
  • Object detection

Azure AI Document Intelligence

Azure AI Document Intelligence

Supports:

  • Layout understanding
  • OCR extraction
  • Document accessibility workflows

Azure AI Foundry

Azure AI Foundry

Supports:

  • Workflow orchestration
  • Prompt flows
  • AI evaluation pipelines

Azure Blob Storage

Azure Blob Storage

Frequently used for:

  • Image storage
  • Accessibility metadata storage
  • Workflow integration

Azure Functions

Azure Functions

Often used for:

  • Event-driven workflows
  • Accessibility processing pipelines
  • Batch orchestration

Observability and Monitoring

Production accessibility systems should monitor:

  • Caption latency
  • OCR accuracy
  • Hallucination frequency
  • Accessibility quality metrics
  • Failed requests
  • Safety violations
  • Operational costs

Best Practices for Accessibility-Focused AI

Prioritize Clarity

Descriptions should be understandable and useful.


Match Description Depth to Content Complexity

Use concise or extended descriptions appropriately.


Include Visible Text When Relevant

OCR improves accessibility quality.


Avoid Biased Language

Use neutral, factual descriptions.


Validate Outputs

Check for hallucinations and inaccuracies.


Support Human Review

Especially important for high-impact content.


Maintain Accessibility Compliance

Align with WCAG principles and organizational policies.


Real-World Example

An educational platform may:

  1. Upload classroom diagrams
  2. Use OCR to extract visible labels
  3. Generate concise alt-text for thumbnails
  4. Generate extended descriptions for complex diagrams
  5. Validate outputs with accessibility reviewers
  6. Store descriptions for screen-reader access

This demonstrates:

  • Accessibility-focused prompting
  • OCR integration
  • Extended descriptions
  • Human-in-the-loop review

Exam Tips for AI-103

For the AI-103 exam, remember these important concepts:

  • Alt-text provides accessible image descriptions for screen readers.
  • Extended descriptions support complex visuals such as charts and diagrams.
  • Accessibility workflows often align with WCAG principles.
  • OCR improves accessibility by extracting visible text.
  • Concise descriptions are best for simple visuals.
  • Extended descriptions are best for complex content.
  • Hallucinations occur when models describe nonexistent content.
  • Accessibility-focused prompting improves output quality.
  • Azure AI Vision supports OCR and image analysis.
  • Azure AI Content Safety helps moderate unsafe imagery.
  • Human review may be required for sensitive or public-facing systems.

Practice Exam Questions

Question 1

What is the primary purpose of alt-text?

A. Compressing image files
B. Providing accessible image descriptions for assistive technologies
C. Encrypting image metadata
D. Accelerating GPU rendering

Answer

B. Providing accessible image descriptions for assistive technologies

Explanation

Alt-text enables screen readers to describe images to visually impaired users.


Question 2

When are extended image descriptions most useful?

A. For decorative images only
B. For complex visuals such as charts and diagrams
C. For reducing GPU utilization
D. For encrypting media assets

Answer

B. For complex visuals such as charts and diagrams

Explanation

Extended descriptions provide detailed explanations for visually dense content.


Question 3

What is a characteristic of good alt-text?

A. Excessive technical jargon
B. Clear and meaningful descriptions
C. Random artistic interpretation
D. Extremely long paragraphs for every image

Answer

B. Clear and meaningful descriptions

Explanation

Good alt-text should concisely communicate important image content.


Question 4

What does OCR contribute to accessibility workflows?

A. Automatic image compression
B. Extraction of visible text from images and documents
C. Elimination of GPU usage
D. Encryption of screen-reader output

Answer

B. Extraction of visible text from images and documents

Explanation

OCR improves accessibility by incorporating visible text into descriptions.


Question 5

What is a hallucination in an accessibility-focused AI system?

A. Generating unsupported or nonexistent details
B. Compressing images automatically
C. Encrypting image metadata
D. Scaling GPU clusters

Answer

A. Generating unsupported or nonexistent details

Explanation

Hallucinations occur when the model describes content not actually present.


Question 6

Which Azure service supports OCR and image analysis?

A. Azure AI Vision
B. Azure DNS
C. Azure Firewall
D. Azure Virtual WAN

Answer

A. Azure AI Vision

Explanation

Azure AI Vision supports OCR, captioning, and image understanding.


Question 7

Why should accessibility-focused prompts be specific?

A. To reduce storage requirements
B. To improve relevance and clarity of generated descriptions
C. To disable OCR functionality
D. To eliminate all hallucinations automatically

Answer

B. To improve relevance and clarity of generated descriptions

Explanation

Specific prompts guide multimodal models toward better accessibility outputs.


Question 8

What is a best practice for accessibility-focused image descriptions?

A. Avoid describing important context
B. Match description detail to image complexity
C. Always generate the longest possible description
D. Ignore visible text in diagrams

Answer

B. Match description detail to image complexity

Explanation

Simple images may need concise descriptions, while complex visuals require more detail.


Question 9

Which organization publishes WCAG accessibility guidelines?

A. World Wide Web Consortium (W3C)
B. Linux Foundation
C. IEEE
D. Apache Software Foundation

Answer

A. World Wide Web Consortium (W3C)

Explanation

The W3C publishes the Web Content Accessibility Guidelines (WCAG).


Question 10

Why might human review be required in accessibility workflows?

A. To validate accuracy and inclusiveness of generated descriptions
B. To reduce internet bandwidth usage
C. To disable multimodal prompting
D. To eliminate OCR processing

Answer

A. To validate accuracy and inclusiveness of generated descriptions

Explanation

Human review helps ensure accessibility descriptions are accurate, fair, and useful.


Go to the AI-103 Exam Prep Hub main page

Leave a comment