Build a lightweight application that includes vision capabilities (AI-901 Exam Prep)

This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Implement AI solutions by using Microsoft Foundry (55–60%)
--> Implement AI solutions with computer vision and image-generation capabilities by using Foundry
--> Build a lightweight application that includes vision capabilities


Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Computer vision enables AI systems to interpret and analyze visual information such as images and videos. Organizations use computer vision solutions for automation, accessibility, security, analytics, and customer experiences.

For the AI-901 certification exam, candidates should understand the foundational concepts behind building lightweight applications that include vision capabilities by using Microsoft Azure AI services and Azure AI Foundry.

This topic falls under the “Implement AI solutions with computer vision and image-generation capabilities by using Foundry” section of the AI-901 exam objectives.


What Is Computer Vision?

Computer vision is a field of AI that enables systems to analyze and understand visual information.

Visual data may include:

  • Images
  • Videos
  • Scanned documents
  • Camera feeds

Common Computer Vision Tasks

Computer vision systems commonly perform:

  • Image classification
  • Object detection
  • Optical character recognition (OCR)
  • Facial analysis
  • Image captioning
  • Content moderation

Azure AI Vision

Azure AI Vision provides computer vision capabilities through cloud-based AI services.

Features include:

  • Image analysis
  • OCR
  • Object detection
  • Image captioning
  • Facial attribute analysis

What Is a Lightweight Application?

A lightweight application is a simple application designed to perform focused tasks with minimal complexity and infrastructure.

Characteristics include:

  • Simple user interface
  • Fast deployment
  • Minimal resource usage
  • Easy maintenance

Examples of Lightweight Vision Applications

Examples include:

  • Image analysis tools
  • Receipt scanning apps
  • Accessibility assistants
  • Product recognition apps
  • Photo-tagging systems

Azure AI Foundry

Azure AI Foundry provides tools for building, testing, and managing AI-powered applications.

Developers can:

  • Access AI models
  • Deploy services
  • Test prompts
  • Build AI workflows

Image Classification

Image classification identifies the main category or subject of an image.


Example

Image

Photo of a bicycle

Classification

“Bicycle”


Object Detection

Object detection identifies multiple objects and their locations within an image.


Example

Image

Street scene

Detected Objects

  • Car
  • Traffic light
  • Pedestrian
  • Bicycle

Optical Character Recognition (OCR)

OCR extracts text from images and scanned documents.


Example

Image

Photo of a restaurant menu

Extracted Text

Menu items and prices


Image Captioning

Image captioning generates natural-language descriptions of images.


Example

Image

A dog playing in a park

Caption

“A brown dog running through a grassy park.”


Facial Analysis

Computer vision systems can analyze facial features.

Possible capabilities include:

  • Face detection
  • Emotion analysis
  • Age estimation

For Responsible AI reasons, facial recognition and identification systems require careful consideration.


APIs and Endpoints

Applications communicate with Azure AI services using:

  • APIs
  • Endpoints

These allow images to be analyzed programmatically.


Authentication

Applications must securely authenticate before accessing Azure AI services.

Common authentication methods include:

  • API keys
  • Azure credentials
  • Managed identities

User Interface Components

A lightweight vision application may include:

  • Image upload area
  • Camera capture button
  • Results display
  • Image preview

Real-Time Image Processing

Some applications process images in near real time.

Examples include:

  • Security monitoring
  • Live object detection
  • Accessibility tools

Example Workflow

A common workflow includes:

  1. User uploads image
  2. Application sends image to Azure AI Vision
  3. AI service analyzes image
  4. Results are returned
  5. Application displays findings

Example High-Level Pseudocode

image = upload_image()
results = analyze_image(image)
display_results(results)

For AI-901, understanding the workflow is more important than memorizing exact syntax.


Common Real-World Scenarios


Scenario 1: Receipt Scanner

Goal

Extract purchase information from receipts.

Features

  • OCR
  • Text extraction
  • Data organization

Scenario 2: Accessibility Assistant

Goal

Describe images for visually impaired users.

Features

  • Image captioning
  • OCR
  • Spoken descriptions

Scenario 3: Product Recognition

Goal

Identify products from photos.

Features

  • Object detection
  • Classification
  • Product lookup

Scenario 4: Content Moderation

Goal

Identify harmful or inappropriate images.

Features

  • Image analysis
  • Safety detection
  • Automated filtering

Responsible AI Considerations

Vision-enabled applications should follow Responsible AI principles.

Key considerations include:

  • Fairness
  • Privacy
  • Transparency
  • Inclusiveness
  • Accountability
  • Security

Privacy Concerns

Images may contain:

  • Personal data
  • Faces
  • Sensitive documents
  • Location information

Organizations should protect visual data appropriately.


Bias and Fairness

Computer vision systems may perform unevenly across:

  • Skin tones
  • Lighting conditions
  • Demographics
  • Environmental conditions

Testing and evaluation are important for fairness.


Transparency

Users should understand:

  • AI is analyzing images
  • AI-generated results may contain errors
  • Images may be processed in the cloud

Hallucinations and Errors

Vision systems may occasionally generate:

  • Incorrect captions
  • False detections
  • Inaccurate classifications

These incorrect outputs are sometimes called hallucinations.


Error Handling

Applications should handle:

  • Invalid image formats
  • Poor image quality
  • Authentication failures
  • Network interruptions
  • Rate limits

Image Quality Challenges

Computer vision accuracy can decrease with:

  • Blurry images
  • Poor lighting
  • Low resolution
  • Obstructed objects

Advantages of Vision Applications

Benefits include:

  • Automation
  • Faster analysis
  • Accessibility improvements
  • Improved customer experiences
  • Scalable image processing

Limitations of Vision Applications

Challenges include:

  • Recognition inaccuracies
  • Bias
  • Privacy concerns
  • Variable image quality
  • Ethical considerations

High-Level Architecture

A simplified architecture often includes:

  1. User interface
  2. Image upload/capture
  3. Azure AI Vision service
  4. AI analysis
  5. Results display

Generative Vision Capabilities

Some modern systems combine:

  • Computer vision
  • Generative AI

These multimodal systems can:

  • Analyze images
  • Generate descriptions
  • Answer visual questions
  • Create new images

Important AI-901 Exam Tips

For the exam, remember these key points:

  • Computer vision analyzes visual information.
  • Azure AI Vision provides computer vision capabilities.
  • OCR extracts text from images.
  • Object detection identifies multiple objects in images.
  • Image captioning generates natural-language image descriptions.
  • APIs and endpoints connect applications to Azure AI services.
  • Authentication secures service access.
  • Responsible AI principles apply to computer vision systems.
  • Image quality affects AI accuracy.
  • Hallucinations are inaccurate AI-generated outputs.

Quick Knowledge Check

Question 1

What does OCR do?

Answer

Extracts text from images and scanned documents.


Question 2

What is object detection?

Answer

Identifying and locating objects within an image.


Question 3

Why is authentication important?

Answer

It secures access to Azure AI services.


Question 4

What can reduce computer vision accuracy?

Answer

Poor image quality such as blur or low lighting.


Practice Exam Questions

Question 1

What is the PRIMARY purpose of computer vision?

A. To enable AI systems to analyze and understand visual information
B. To increase internet bandwidth
C. To manage database backups
D. To improve keyboard performance


Correct Answer

A. To enable AI systems to analyze and understand visual information


Explanation

Computer vision allows AI systems to process and interpret images, videos, and other visual data.


Why the Other Answers Are Incorrect

B. To increase internet bandwidth

Computer vision does not affect networking speed.

C. To manage database backups

This is unrelated to computer vision.

D. To improve keyboard performance

This is unrelated to AI vision systems.


Question 2

Which Azure service provides computer vision capabilities such as OCR and image analysis?

A. Azure AI Vision
B. Azure Backup
C. Azure Virtual Machines
D. Azure DNS


Correct Answer

A. Azure AI Vision


Explanation

Azure AI Vision provides cloud-based computer vision capabilities including OCR, object detection, and image captioning.


Why the Other Answers Are Incorrect

B. Azure Backup

This is a backup service.

C. Azure Virtual Machines

This provides compute infrastructure.

D. Azure DNS

This is a networking service.


Question 3

What does OCR stand for?

A. Optical Character Recognition
B. Open Cloud Rendering
C. Object Classification Registry
D. Operational Compute Routing


Correct Answer

A. Optical Character Recognition


Explanation

OCR extracts text from images or scanned documents.


Why the Other Answers Are Incorrect

B. Open Cloud Rendering

This is not the meaning of OCR.

C. Object Classification Registry

This is unrelated to OCR.

D. Operational Compute Routing

This is not a computer vision term.


Question 4

What is the PRIMARY purpose of object detection?

A. To identify and locate objects within an image
B. To translate spoken language
C. To summarize long documents
D. To compress image files


Correct Answer

A. To identify and locate objects within an image


Explanation

Object detection identifies multiple objects and their locations inside an image.


Why the Other Answers Are Incorrect

B. To translate spoken language

This is a speech AI task.

C. To summarize long documents

This is a text analysis task.

D. To compress image files

Object detection does not compress files.


Question 5

What does image captioning do?

A. Generates natural-language descriptions of images
B. Converts speech into text
C. Encrypts image files
D. Creates database tables


Correct Answer

A. Generates natural-language descriptions of images


Explanation

Image captioning creates human-readable descriptions of visual content.


Why the Other Answers Are Incorrect

B. Converts speech into text

This is speech recognition.

C. Encrypts image files

Encryption is unrelated to captioning.

D. Creates database tables

This is unrelated to computer vision.


Question 6

How do lightweight vision applications typically communicate with Azure AI services?

A. Through APIs and endpoints
B. Through printer drivers
C. Through monitor settings
D. Through USB-only connections


Correct Answer

A. Through APIs and endpoints


Explanation

Applications use APIs and cloud endpoints to send images and receive AI-generated analysis results.


Why the Other Answers Are Incorrect

B. Through printer drivers

Printers are unrelated to AI communication.

C. Through monitor settings

This is unrelated to cloud AI services.

D. Through USB-only connections

Cloud services use network communication.


Question 7

Why is authentication important when accessing Azure AI Vision services?

A. To secure access to AI resources
B. To increase image brightness
C. To improve keyboard response time
D. To accelerate internet speeds


Correct Answer

A. To secure access to AI resources


Explanation

Authentication helps ensure that only authorized users and applications can access Azure AI services.


Why the Other Answers Are Incorrect

B. To increase image brightness

Authentication does not affect image quality.

C. To improve keyboard response time

This is unrelated to authentication.

D. To accelerate internet speeds

Authentication does not improve network performance.


Question 8

Which Responsible AI concern is especially important in computer vision systems?

A. Protecting personal and sensitive visual information
B. Increasing monitor resolution
C. Improving printer speed
D. Reducing spreadsheet file sizes


Correct Answer

A. Protecting personal and sensitive visual information


Explanation

Images may contain faces, documents, or other sensitive information that must be protected.


Why the Other Answers Are Incorrect

B. Increasing monitor resolution

This is unrelated to Responsible AI.

C. Improving printer speed

Printers are unrelated to computer vision ethics.

D. Reducing spreadsheet file sizes

This is unrelated to image analysis.


Question 9

What challenge can reduce computer vision accuracy?

A. Poor image quality
B. Spreadsheet formatting
C. Keyboard layout changes
D. Audio playback speed


Correct Answer

A. Poor image quality


Explanation

Blur, low lighting, and low resolution can negatively affect image analysis accuracy.


Why the Other Answers Are Incorrect

B. Spreadsheet formatting

This does not affect vision systems.

C. Keyboard layout changes

This is unrelated to image processing.

D. Audio playback speed

This is unrelated to computer vision.


Question 10

What are hallucinations in AI vision systems?

A. Incorrect or fabricated AI-generated outputs
B. Hardware installation failures
C. Network outages
D. Printer connection problems


Correct Answer

A. Incorrect or fabricated AI-generated outputs


Explanation

Hallucinations occur when AI systems generate inaccurate descriptions or detections.


Why the Other Answers Are Incorrect

B. Hardware installation failures

This is unrelated to AI-generated outputs.

C. Network outages

This is a connectivity issue.

D. Printer connection problems

This is unrelated to AI vision systems.


Final Thoughts

Building lightweight applications with vision capabilities is an important topic for the AI-901 certification exam. Microsoft expects candidates to understand the foundational concepts behind computer vision applications, including image classification, object detection, OCR, APIs, authentication, Responsible AI principles, and real-world implementation workflows.

Azure AI Vision and Azure AI Foundry provide powerful cloud-based tools that make it easier to build intelligent applications capable of analyzing and understanding visual information.


Go to the AI-901 Exam Prep Hub main page

One thought on “Build a lightweight application that includes vision capabilities (AI-901 Exam Prep)”

Leave a comment