This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Implement AI solutions by using Microsoft Foundry (55–60%)
   --> Implement AI solutions with computer vision and image-generation capabilities by using Foundry
      --> Build a lightweight application that includes vision capabilities

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Computer vision enables AI systems to interpret and analyze visual information such as images and videos. Organizations use computer vision solutions for automation, accessibility, security, analytics, and customer experiences.

For the AI-901 certification exam, candidates should understand the foundational concepts behind building lightweight applications that include vision capabilities by using Microsoft Azure AI services and Azure AI Foundry.

What Is Computer Vision?

Computer vision is a field of AI that enables systems to analyze and understand visual information.

Visual data may include:

Images
Videos
Scanned documents
Camera feeds

Common Computer Vision Tasks

Computer vision systems commonly perform:

Image classification
Object detection
Optical character recognition (OCR)
Facial analysis
Image captioning
Content moderation

Azure AI Vision

Azure AI Vision provides computer vision capabilities through cloud-based AI services.

Features include:

Image analysis
OCR
Object detection
Image captioning
Facial attribute analysis

What Is a Lightweight Application?

A lightweight application is a simple application designed to perform focused tasks with minimal complexity and infrastructure.

Characteristics include:

Simple user interface
Fast deployment
Minimal resource usage
Easy maintenance

Examples of Lightweight Vision Applications

Examples include:

Image analysis tools
Receipt scanning apps
Accessibility assistants
Product recognition apps
Photo-tagging systems

Azure AI Foundry

Azure AI Foundry provides tools for building, testing, and managing AI-powered applications.

Developers can:

Access AI models
Deploy services
Test prompts
Build AI workflows

Image Classification

Image classification identifies the main category or subject of an image.

Example

Image

Photo of a bicycle

Classification

“Bicycle”

Object Detection

Object detection identifies multiple objects and their locations within an image.

Example

Image

Street scene

Detected Objects

Car
Traffic light
Pedestrian
Bicycle
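
To make the contrast with classification concrete, the sketch below models each detection as a label plus a bounding box and a confidence score, then keeps only the confident ones. The `Detection` class, its field names, and the sample values are illustrative assumptions, not the Azure AI Vision response format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object: what it is, where it is, and how sure the model is."""
    label: str
    x: int             # left edge of the bounding box, in pixels
    y: int             # top edge of the bounding box, in pixels
    width: int
    height: int
    confidence: float  # 0.0 to 1.0


def confident_labels(detections, threshold=0.5):
    """Return labels of detections at or above the confidence threshold."""
    return [d.label for d in detections if d.confidence >= threshold]


# Made-up values for the street scene example.
street_scene = [
    Detection("car", 40, 120, 200, 90, 0.94),
    Detection("traffic light", 310, 20, 30, 80, 0.88),
    Detection("pedestrian", 260, 110, 40, 110, 0.79),
    Detection("bicycle", 150, 150, 70, 50, 0.41),
]

print(confident_labels(street_scene))
```

A classifier would return a single label for the whole photo; a detector returns one entry like this per object, which is why each entry carries its own location.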

Optical Character Recognition (OCR)

OCR extracts text from images and scanned documents.

Example

Image

Photo of a restaurant menu

Extracted Text

Menu items and prices

Image Captioning

Image captioning generates natural-language descriptions of images.

Example

Image

A dog playing in a park

Caption

“A brown dog running through a grassy park.”

Facial Analysis

Computer vision systems can analyze facial features.

Possible capabilities include:

Face detection
Emotion analysis
Age estimation

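For Responsible AI reasons, facial recognition and identification require careful consideration. Microsoft restricts these features in its Face service under a Limited Access policy and has retired capabilities that infer emotion and attributes such as age.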

APIs and Endpoints

Applications communicate with Azure AI services using:

APIs
Endpoints

These allow images to be analyzed programmatically.
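
As a rough illustration of what "calling an endpoint" means, the sketch below only assembles the URL and headers for an image analysis request; it sends nothing. The URL path, the `api-version` value, and the header names follow the Image Analysis 4.0 REST API as I understand it and should be checked against the current documentation. The resource name and key are placeholders, and `build_analyze_request` is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_analyze_request(endpoint, key, features):
    """Return the URL and headers for an image analysis call (nothing is sent)."""
    query = urlencode(
        {
            "api-version": "2024-02-01",     # assumed version; check the docs
            "features": ",".join(features),  # e.g. caption, read, objects
        },
        safe=",",
    )
    url = f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,            # API key authentication
        "Content-Type": "application/octet-stream",  # raw image bytes in the body
    }
    return url, headers


url, headers = build_analyze_request(
    "https://example-resource.cognitiveservices.azure.com/",  # placeholder
    "<your-key>",                                             # placeholder
    ["caption", "read"],
)
print(url)
```

The application would then POST the image bytes to that URL with those headers and read the JSON response.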

Authentication

Applications must securely authenticate before accessing Azure AI services.

Common authentication methods include:

API keys
Azure credentials
Managed identities
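
When an API key is used, a common practice is to keep it out of the source code. The sketch below reads the endpoint and key from environment variables; the variable names `VISION_ENDPOINT` and `VISION_KEY` and the helper `load_vision_settings` are assumptions chosen for illustration.

```python
import os


def load_vision_settings(env=os.environ):
    """Read endpoint and key from environment variables, failing loudly if absent."""
    missing = [name for name in ("VISION_ENDPOINT", "VISION_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing settings: {', '.join(missing)}")
    return env["VISION_ENDPOINT"], env["VISION_KEY"]


# Demonstration with a plain dict standing in for the real environment.
demo_env = {"VISION_ENDPOINT": "https://example.invalid/", "VISION_KEY": "not-a-real-key"}
endpoint, key = load_vision_settings(demo_env)
print(endpoint)
```

Managed identities go one step further by removing the stored key altogether, which is why they are generally preferred for applications hosted in Azure.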

User Interface Components

A lightweight vision application may include:

Image upload area
Camera capture button
Results display
Image preview

Real-Time Image Processing

Some applications process images in near real time.

Examples include:

Security monitoring
Live object detection
Accessibility tools

Example Workflow

A common workflow includes:

User uploads image
Application sends image to Azure AI Vision
AI service analyzes image
Results are returned
Application displays findings

Example High-Level Pseudocode

image = upload_image()            # user supplies an image
results = analyze_image(image)    # send it to Azure AI Vision for analysis
display_results(results)          # show the findings to the user

For AI-901, understanding the workflow is more important than memorizing exact syntax.
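
For readers who want to see the same workflow run, here is a minimal Python sketch. The service call is passed in as a function so the example works offline; `run_vision_workflow` and `fake_analyze` are hypothetical names, and the stub result is made up rather than real service output.

```python
def run_vision_workflow(image_bytes, analyze, display):
    """Upload -> analyze -> display, with the service call passed in as `analyze`."""
    if not image_bytes:
        raise ValueError("No image was provided")
    results = analyze(image_bytes)  # in a real app, this calls the vision service
    display(results)
    return results


def fake_analyze(image_bytes):
    """Stand-in for the cloud call so the sketch runs offline."""
    return {"caption": "placeholder caption", "size_bytes": len(image_bytes)}


shown = []  # collects whatever the workflow "displays"
run_vision_workflow(b"\x89PNG fake image data", fake_analyze, shown.append)
print(shown)
```

Passing `analyze` in as a parameter keeps the user interface code independent of any particular service, which also makes the application easier to test.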

Common Real-World Scenarios

Scenario 1: Receipt Scanner

Goal

Extract purchase information from receipts.

Features

OCR
Text extraction
Data organization
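
The "data organization" step can be sketched as follows, assuming OCR has already returned the receipt as a list of text lines. The line format, the regular expression, and the sample receipt are assumptions for illustration; real receipts vary widely and usually need more robust parsing.

```python
import re
from decimal import Decimal

# Matches lines such as "Coffee 3.50" or "Bagel  $2.25": a name, then a price.
LINE_PATTERN = re.compile(r"^(?P<name>.+?)\s+\$?(?P<price>\d+\.\d{2})$")


def parse_receipt_lines(lines):
    """Turn OCR text lines into (item, price) pairs, skipping lines that don't match."""
    items = []
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match:
            items.append((match["name"], Decimal(match["price"])))
    return items


ocr_lines = ["CORNER CAFE", "Coffee 3.50", "Bagel  $2.25", "Thank you!"]
items = parse_receipt_lines(ocr_lines)
total = sum(price for _, price in items)
print(items, total)
```

`Decimal` is used instead of `float` so that currency amounts add up exactly.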

Scenario 2: Accessibility Assistant

Goal

Describe images for visually impaired users.

Features

Image captioning
OCR
Spoken descriptions

Scenario 3: Product Recognition

Goal

Identify products from photos.

Features

Object detection
Classification
Product lookup

Scenario 4: Content Moderation

Goal

Identify harmful or inappropriate images.

Features

Image analysis
Safety detection
Automated filtering
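
Automated filtering often comes down to comparing per-category scores against a threshold. The sketch below assumes the analysis step has already produced a severity number per category; the category names, the severity scale, and the `block_at` threshold are hypothetical values, not a specific service's output.

```python
def moderate(category_severity, block_at=4):
    """Return ('block', reasons) if any category reaches the threshold, else ('allow', [])."""
    reasons = sorted(
        category
        for category, severity in category_severity.items()
        if severity >= block_at
    )
    return ("block", reasons) if reasons else ("allow", [])


# Hypothetical severity scores (0 = safe, higher = more severe).
safe_image = {"violence": 0, "sexual": 0, "hate": 0}
flagged_image = {"violence": 6, "sexual": 0, "hate": 4}

print(moderate(safe_image))
print(moderate(flagged_image))
```

Choosing the threshold is a policy decision: a lower value blocks more harmful content but also more harmless images.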

Responsible AI Considerations

Vision-enabled applications should follow Responsible AI principles.

Key considerations include:

Fairness
Privacy
Transparency
Inclusiveness
Accountability
Security

Privacy Concerns

Images may contain:

Personal data
Faces
Sensitive documents
Location information

Organizations should protect visual data appropriately.

Bias and Fairness

Computer vision systems may perform unevenly across:

Skin tones
Lighting conditions
Demographics
Environmental conditions

Testing and evaluation are important for fairness.
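
One simple way to test for uneven performance is to compute accuracy separately for each condition and compare the results. The sketch below does this for made-up evaluation records grouped by lighting; the record layout and the data are assumptions for illustration only.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute accuracy per group from (group, predicted, actual) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        correct[group] += int(predicted == actual)
    return {group: correct[group] / total[group] for group in total}


# Made-up evaluation records, grouped by lighting condition.
records = [
    ("bright", "cat", "cat"),
    ("bright", "dog", "dog"),
    ("bright", "cat", "cat"),
    ("bright", "dog", "cat"),
    ("dim", "cat", "dog"),
    ("dim", "dog", "dog"),
]
print(accuracy_by_group(records))
```

A large gap between groups, such as the one between bright and dim images here, signals that the system needs more representative test data or further tuning before release.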

Transparency

Users should understand:

AI is analyzing images
AI-generated results may contain errors
Images may be processed in the cloud

Hallucinations and Errors

Vision systems may occasionally generate:

Incorrect captions
False detections
Inaccurate classifications

Such incorrect outputs, particularly fabricated details in generated captions or descriptions, are often called hallucinations.

Error Handling

Applications should handle:

Invalid image formats
Poor image quality
Authentication failures
Network interruptions
Rate limits
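
Two of these cases can be sketched briefly: rejecting unsupported files before calling the service, and retrying with increasing delays when the service reports a rate limit. The extension list, the `RateLimited` exception, and the helper names are assumptions; a real application would map the service's HTTP 429 response to such an exception.

```python
import time

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}  # assumed list


class RateLimited(Exception):
    """Raised by the (hypothetical) service call when the service answers HTTP 429."""


def is_supported_image(filename):
    """Cheap pre-check on the file extension before spending a service call."""
    dot = filename.rfind(".")
    return dot != -1 and filename[dot:].lower() in ALLOWED_EXTENSIONS


def call_with_retry(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Retry `call` on RateLimited, doubling the wait each time."""
    for attempt in range(attempts):
        try:
            return call()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller report the failure
            sleep(base_delay * 2 ** attempt)


# Demo: a fake call that is rate limited twice, then succeeds.
outcomes = iter([RateLimited(), RateLimited(), "ok"])


def flaky_call():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


waits = []  # records the delays instead of actually sleeping
result = call_with_retry(flaky_call, sleep=waits.append)
print(result, waits)
```

Authentication failures and network interruptions would be handled in the same spirit: catch the specific error, show the user a clear message, and retry only when a retry can help.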

Image Quality Challenges

Computer vision accuracy can decrease with:

Blurry images
Poor lighting
Low resolution
Obstructed objects

Advantages of Vision Applications

Benefits include:

Automation
Faster analysis
Accessibility improvements
Improved customer experiences
Scalable image processing

Limitations of Vision Applications

Challenges include:

Recognition inaccuracies
Bias
Privacy concerns
Variable image quality
Ethical considerations

High-Level Architecture

A simplified architecture often includes:

User interface
Image upload/capture
Azure AI Vision service
AI analysis
Results display

Generative Vision Capabilities

Some modern systems combine:

Computer vision
Generative AI

These multimodal systems can:

Analyze images
Generate descriptions
Answer visual questions
Create new images

Important AI-901 Exam Tips

For the exam, remember these key points:

Computer vision analyzes visual information.
Azure AI Vision provides computer vision capabilities.
OCR extracts text from images.
Object detection identifies multiple objects in images.
Image captioning generates natural-language image descriptions.
APIs and endpoints connect applications to Azure AI services.
Authentication secures service access.
Responsible AI principles apply to computer vision systems.
Image quality affects AI accuracy.
Hallucinations are inaccurate AI-generated outputs.

Quick Knowledge Check

Question 1

What does OCR do?

Answer

Extracts text from images and scanned documents.

Question 2

What is object detection?

Answer

Identifying and locating objects within an image.

Question 3

Why is authentication important?

Answer

It secures access to Azure AI services.

Question 4

What can reduce computer vision accuracy?

Answer

Poor image quality such as blur or low lighting.

Practice Exam Questions

Question 1

What is the PRIMARY purpose of computer vision?

A. To enable AI systems to analyze and understand visual information
B. To increase internet bandwidth
C. To manage database backups
D. To improve keyboard performance

Correct Answer

A. To enable AI systems to analyze and understand visual information

Explanation

Computer vision allows AI systems to process and interpret images, videos, and other visual data.

Why the Other Answers Are Incorrect

B. To increase internet bandwidth

Computer vision does not affect networking speed.

C. To manage database backups

This is unrelated to computer vision.

D. To improve keyboard performance

This is unrelated to AI vision systems.

Question 2

Which Azure service provides computer vision capabilities such as OCR and image analysis?

A. Azure AI Vision
B. Azure Backup
C. Azure Virtual Machines
D. Azure DNS

Correct Answer

A. Azure AI Vision

Explanation

Azure AI Vision provides cloud-based computer vision capabilities including OCR, object detection, and image captioning.

Why the Other Answers Are Incorrect

B. Azure Backup

This is a backup service.

C. Azure Virtual Machines

This provides compute infrastructure.

D. Azure DNS

This is a networking service.

Question 3

What does OCR stand for?

A. Optical Character Recognition
B. Open Cloud Rendering
C. Object Classification Registry
D. Operational Compute Routing

Correct Answer

A. Optical Character Recognition

Explanation

OCR extracts text from images or scanned documents.

Why the Other Answers Are Incorrect

B. Open Cloud Rendering

This is not the meaning of OCR.

C. Object Classification Registry

This is unrelated to OCR.

D. Operational Compute Routing

This is not a computer vision term.

Question 4

What is the PRIMARY purpose of object detection?

A. To identify and locate objects within an image
B. To translate spoken language
C. To summarize long documents
D. To compress image files

Correct Answer

A. To identify and locate objects within an image

Explanation

Object detection identifies multiple objects and their locations inside an image.

Why the Other Answers Are Incorrect

B. To translate spoken language

This is a speech AI task.

C. To summarize long documents

This is a text analysis task.

D. To compress image files

Object detection does not compress files.

Question 5

What does image captioning do?

A. Generates natural-language descriptions of images
B. Converts speech into text
C. Encrypts image files
D. Creates database tables

Correct Answer

A. Generates natural-language descriptions of images

Explanation

Image captioning creates human-readable descriptions of visual content.

Why the Other Answers Are Incorrect

B. Converts speech into text

This is speech recognition.

C. Encrypts image files

Encryption is unrelated to captioning.

D. Creates database tables

This is unrelated to computer vision.

Question 6

How do lightweight vision applications typically communicate with Azure AI services?

A. Through APIs and endpoints
B. Through printer drivers
C. Through monitor settings
D. Through USB-only connections

Correct Answer

A. Through APIs and endpoints

Explanation

Applications use APIs and cloud endpoints to send images and receive AI-generated analysis results.

Why the Other Answers Are Incorrect

B. Through printer drivers

Printers are unrelated to AI communication.

C. Through monitor settings

This is unrelated to cloud AI services.

D. Through USB-only connections

Cloud services use network communication.

Question 7

Why is authentication important when accessing Azure AI Vision services?

A. To secure access to AI resources
B. To increase image brightness
C. To improve keyboard response time
D. To accelerate internet speeds

Correct Answer

A. To secure access to AI resources

Explanation

Authentication helps ensure that only authorized users and applications can access Azure AI services.

Why the Other Answers Are Incorrect

B. To increase image brightness

Authentication does not affect image quality.

C. To improve keyboard response time

This is unrelated to authentication.

D. To accelerate internet speeds

Authentication does not improve network performance.

Question 8

Which Responsible AI concern is especially important in computer vision systems?

A. Protecting personal and sensitive visual information
B. Increasing monitor resolution
C. Improving printer speed
D. Reducing spreadsheet file sizes

Correct Answer

A. Protecting personal and sensitive visual information

Explanation

Images may contain faces, documents, or other sensitive information that must be protected.

Why the Other Answers Are Incorrect

B. Increasing monitor resolution

This is unrelated to Responsible AI.

C. Improving printer speed

Printers are unrelated to computer vision ethics.

D. Reducing spreadsheet file sizes

This is unrelated to image analysis.

Question 9

What challenge can reduce computer vision accuracy?

A. Poor image quality
B. Spreadsheet formatting
C. Keyboard layout changes
D. Audio playback speed

Correct Answer

A. Poor image quality

Explanation

Blur, low lighting, and low resolution can negatively affect image analysis accuracy.

Why the Other Answers Are Incorrect

B. Spreadsheet formatting

This does not affect vision systems.

C. Keyboard layout changes

This is unrelated to image processing.

D. Audio playback speed

This is unrelated to computer vision.

Question 10

What are hallucinations in AI vision systems?

A. Incorrect or fabricated AI-generated outputs
B. Hardware installation failures
C. Network outages
D. Printer connection problems

Correct Answer

A. Incorrect or fabricated AI-generated outputs

Explanation

Hallucinations occur when AI systems generate inaccurate descriptions or detections.

Why the Other Answers Are Incorrect

B. Hardware installation failures

This is unrelated to AI-generated outputs.

C. Network outages

This is a connectivity issue.

D. Printer connection problems

This is unrelated to AI vision systems.

Final Thoughts

Building lightweight applications with vision capabilities is an important topic for the AI-901 certification exam. Microsoft expects candidates to understand the foundational concepts behind computer vision applications, including image classification, object detection, OCR, APIs, authentication, Responsible AI principles, and real-world implementation workflows.

Azure AI Vision and Azure AI Foundry provide powerful cloud-based tools that make it easier to build intelligent applications capable of analyzing and understanding visual information.
