Overview
Computer vision is a branch of Artificial Intelligence (AI) that enables machines to interpret, analyze, and understand visual information such as images and videos. In the context of the AI-900: Microsoft Azure AI Fundamentals exam, you are not expected to build complex models or write code. Instead, the focus is on recognizing computer vision workloads, understanding what problems they solve, and knowing which Azure AI services are appropriate for each scenario.
This topic falls under:
- Describe Artificial Intelligence workloads and considerations (15–20%)
- Identify features of common AI workloads
A strong conceptual understanding here will help you confidently answer many scenario-based exam questions.
What Is a Computer Vision Workload?
A computer vision workload involves extracting meaningful insights from visual data. These workloads allow systems to:
- Identify objects, people, or text in images
- Analyze facial features or emotions
- Understand the content of photos or videos
- Detect changes, anomalies, or motion
Common inputs include:
- Images (JPEG, PNG, etc.)
- Video streams (live or recorded)
Common outputs include:
- Labels or tags
- Bounding boxes around detected objects
- Extracted text
- Descriptions of image content
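The exam never asks you to write code, but a quick sketch can make these inputs and outputs concrete. The structure below is purely illustrative; the field names are made up and do not correspond to any specific Azure service response.

```python
# Hypothetical analysis result for a single photo; field names are
# illustrative only and do not match any specific Azure service schema.
analysis_result = {
    "caption": "a dog running on a beach",
    "tags": [
        {"name": "dog", "confidence": 0.98},
        {"name": "beach", "confidence": 0.95},
        {"name": "outdoor", "confidence": 0.91},
    ],
    "objects": [
        # Bounding box: x/y of the top-left corner plus width and height, in pixels.
        {"label": "dog", "confidence": 0.97, "box": {"x": 120, "y": 80, "w": 230, "h": 190}},
    ],
    "text": ["NO DOGS AFTER 6 PM"],  # any text the service read from the image
}

# Keep only tags the model is reasonably sure about.
confident_tags = [t["name"] for t in analysis_result["tags"] if t["confidence"] >= 0.9]
print("Caption:", analysis_result["caption"])
print("Tags:", ", ".join(confident_tags))
```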
Common Computer Vision Use Cases
On the AI-900 exam, computer vision workloads are usually presented as real-world scenarios. Below are the most common ones you should recognize.
Image Classification
What it does: Assigns a category or label to an image.
Example scenarios:
- Determining whether an image contains a cat, dog, or bird
- Classifying products in an online store
- Identifying whether a photo shows food, people, or scenery
Key idea: The entire image is classified as one or more categories.
Object Detection
What it does: Detects and locates multiple objects within an image.
Example scenarios:
- Detecting cars, pedestrians, and traffic signs in street images
- Counting people in a room
- Identifying damaged items in a warehouse
Key idea: Unlike classification, object detection identifies where objects appear using bounding boxes.
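AI-900 does not require coding, but a short sketch can help you remember the difference: detection output carries a location and a confidence score for each object. The detection list below is hypothetical, and the confidence cutoff is an arbitrary choice for illustration.

```python
# Hypothetical object-detection output: one entry per detected object.
# The structure is illustrative; real services use their own schemas.
detections = [
    {"label": "person", "confidence": 0.94, "box": (34, 20, 110, 260)},   # (x, y, width, height)
    {"label": "person", "confidence": 0.88, "box": (310, 45, 98, 250)},
    {"label": "chair",  "confidence": 0.81, "box": (200, 150, 120, 140)},
    {"label": "person", "confidence": 0.42, "box": (500, 60, 90, 240)},   # low confidence, likely noise
]

# Count people, ignoring low-confidence detections.
MIN_CONFIDENCE = 0.5
people = [d for d in detections if d["label"] == "person" and d["confidence"] >= MIN_CONFIDENCE]
print(f"People detected: {len(people)}")
for d in people:
    print(f"  person at {d['box']} (confidence {d['confidence']:.2f})")
```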
Face Detection and Facial Analysis
What it does: Detects human faces and analyzes facial attributes.
Example scenarios:
- Detecting whether a face is present in an image
- Estimating age or emotion
- Identifying facial landmarks (eyes, nose, mouth)
Important exam note:
- AI-900 focuses on face detection and analysis, not facial recognition for identity verification.
- Be aware of ethical and privacy considerations when working with facial data.
Optical Character Recognition (OCR)
What it does: Extracts printed or handwritten text from images and documents.
Example scenarios:
- Reading text from scanned documents
- Extracting information from receipts or invoices
- Recognizing license plate numbers
Key idea: OCR turns unstructured visual text into machine-readable text.
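As a purely illustrative sketch (the receipt lines and the regular expression are made up), here is how OCR output becomes ordinary text you can process like any other string:

```python
import re

# Hypothetical line-by-line output from an OCR step on a scanned receipt.
# The lines themselves are made up for illustration.
ocr_lines = [
    "CONTOSO COFFEE",
    "Latte            4.50",
    "Blueberry muffin 3.25",
    "TOTAL            7.75",
]

# Once the text is machine-readable, ordinary string processing applies:
full_text = "\n".join(ocr_lines)
match = re.search(r"TOTAL\s+(\d+\.\d{2})", full_text)
if match:
    print("Receipt total:", match.group(1))
```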
Image Description and Tagging
What it does: Generates descriptive text or tags that summarize image content.
Example scenarios:
- Automatically tagging photos in a digital library
- Creating alt text for accessibility
- Generating captions for images
Key idea: This workload focuses on understanding the overall context of an image rather than specific objects.
Video Analysis
What it does: Analyzes video content frame by frame.
Example scenarios:
- Detecting motion or anomalies in security footage
- Tracking objects over time
- Summarizing video content
Key idea: Video analysis extends image analysis across time, not just a single frame.
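Nothing like this is required for the exam, but to make the "analysis across time" idea concrete, here is a minimal motion-detection sketch using the open-source OpenCV library rather than an Azure service. The file name and thresholds are placeholders.

```python
import cv2  # pip install opencv-python

# Compare each frame of a recorded clip with the previous one and flag
# frames where a large number of pixels changed ("security.mp4" is a placeholder path).
cap = cv2.VideoCapture("security.mp4")
previous_gray = None
frame_index = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break  # end of the video
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if previous_gray is not None:
        diff = cv2.absdiff(gray, previous_gray)               # pixel-wise change since the last frame
        _, changed = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
        if cv2.countNonZero(changed) > 5000:                  # arbitrary "lots of pixels changed" cutoff
            print(f"Motion detected around frame {frame_index}")
    previous_gray = gray
    frame_index += 1

cap.release()
```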
Azure Services Commonly Associated with Computer Vision
For the AI-900 exam, you should recognize which Azure AI services support computer vision workloads at a high level.
Azure AI Vision
Supports:
- Image analysis
- Object detection
- OCR
- Face detection
- Image tagging and description
This is the most commonly referenced service for computer vision scenarios on the exam.
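You will not be asked to write code on the exam, but seeing roughly how the service is called can help its capabilities stick. The sketch below assumes the azure-ai-vision-imageanalysis Python package and uses placeholder values for the endpoint, key, and image URL.

```python
# pip install azure-ai-vision-imageanalysis
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential

# Placeholders: use your own Azure AI Vision resource endpoint and key.
client = ImageAnalysisClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

# Ask for a caption, tags, and any readable text (OCR) in a single call.
result = client.analyze_from_url(
    image_url="https://example.com/sample-photo.jpg",  # placeholder image URL
    visual_features=[VisualFeatures.CAPTION, VisualFeatures.TAGS, VisualFeatures.READ],
)

if result.caption is not None:
    print(f"Caption: '{result.caption.text}' (confidence {result.caption.confidence:.2f})")
if result.tags is not None:
    print("Tags:", ", ".join(tag.name for tag in result.tags.list))
if result.read is not None:
    for block in result.read.blocks:
        for line in block.lines:
            print("Text:", line.text)
```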
Azure AI Custom Vision
Supports:
- Custom image classification
- Custom object detection
Used when prebuilt models are not sufficient and you need to train a model using your own images.
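As with Azure AI Vision, code is optional knowledge, but here is a minimal sketch of calling a model you have already trained and published in Custom Vision. It assumes the azure-cognitiveservices-vision-customvision Python package; the endpoint, key, project ID, published model name, and image path are all placeholders.

```python
# pip install azure-cognitiveservices-vision-customvision
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from msrest.authentication import ApiKeyCredentials

# Placeholders: these values come from your own Custom Vision resource and trained project.
credentials = ApiKeyCredentials(in_headers={"Prediction-key": "<your-prediction-key>"})
predictor = CustomVisionPredictionClient(
    "https://<your-resource>.cognitiveservices.azure.com/", credentials
)

# Classify a local image with the published custom model.
with open("product-photo.jpg", "rb") as image_file:  # placeholder local image
    results = predictor.classify_image("<project-id>", "<published-model-name>", image_file.read())

# Each prediction is a tag you defined during training plus a probability.
for prediction in results.predictions:
    print(f"{prediction.tag_name}: {prediction.probability:.2%}")
```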
Azure AI Video Indexer
Supports:
- Video analysis
- Object, face, and scene detection in videos
Typically appears in scenarios involving video content.
How Computer Vision Differs from Other AI Workloads
Understanding what is not computer vision is just as important on the exam.
| AI Workload Type | Focus Area |
|---|---|
| Computer Vision | Images and videos |
| Natural Language Processing | Text and speech |
| Speech AI | Audio and voice |
| Anomaly Detection | Patterns in numerical or time-series data |
Exam tip: If the input data is visual (images or video), you are almost certainly dealing with a computer vision workload.
Responsible AI Considerations
Microsoft emphasizes responsible AI, and AI-900 includes high-level awareness of these principles.
For computer vision workloads, key considerations include:
- Privacy and consent when capturing images or video
- Avoiding bias in facial analysis
- Transparency in how visual data is collected and used
You will not be tested on implementation details, but you may see conceptual questions about ethical use.
Exam Tips for Identifying Computer Vision Workloads
- Focus on keywords like image, photo, video, camera, scanned document
- Look for actions such as detect, recognize, classify, extract text
- Match the scenario to the simplest appropriate workload
- Remember: AI-900 tests understanding, not coding
Summary
To succeed on the AI-900 exam, you should be able to:
- Recognize when a problem is a computer vision workload
- Identify common use cases such as image classification, object detection, and OCR
- Understand which Azure AI services are commonly used
- Distinguish computer vision from other AI workloads
Mastering this topic will give you a strong foundation for many questions in the Describe Artificial Intelligence workloads and considerations domain.
Go to the Practice Exam Questions for this topic.
Go to the AI-900 Exam Prep Hub main page.
