This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub.
This topic falls under these sections:
Implement AI solutions by using Microsoft Foundry (55–60%)
--> Implement AI solutions with computer vision and image-generation capabilities by using Foundry
--> Build a lightweight application that includes vision capabilities
Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.
Computer vision enables AI systems to interpret and analyze visual information such as images and videos. Organizations use computer vision solutions for automation, accessibility, security, analytics, and customer experiences.
For the AI-901 certification exam, candidates should understand the foundational concepts behind building lightweight applications that include vision capabilities by using Microsoft Azure AI services and Azure AI Foundry.
What Is Computer Vision?
Computer vision is a field of AI that enables systems to analyze and understand visual information.
Visual data may include:
- Images
- Videos
- Scanned documents
- Camera feeds
Common Computer Vision Tasks
Computer vision systems commonly perform:
- Image classification
- Object detection
- Optical character recognition (OCR)
- Facial analysis
- Image captioning
- Content moderation
Azure AI Vision
Azure AI Vision provides computer vision capabilities through cloud-based AI services.
Features include:
- Image analysis
- OCR
- Object detection
- Image captioning
- Facial attribute analysis
What Is a Lightweight Application?
A lightweight application is a simple application designed to perform focused tasks with minimal complexity and infrastructure.
Characteristics include:
- Simple user interface
- Fast deployment
- Minimal resource usage
- Easy maintenance
Examples of Lightweight Vision Applications
Examples include:
- Image analysis tools
- Receipt scanning apps
- Accessibility assistants
- Product recognition apps
- Photo-tagging systems
Azure AI Foundry
Azure AI Foundry provides tools for building, testing, and managing AI-powered applications.
Developers can:
- Access AI models
- Deploy services
- Test prompts
- Build AI workflows
Image Classification
Image classification identifies the main category or subject of an image.
Example
Image
Photo of a bicycle
Classification
“Bicycle”
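A minimal sketch of how an application might reduce a classifier's output to that single label, assuming the service returns candidate labels with confidence scores. The `scores` dictionary and the `top_label` helper are hypothetical and not part of any Azure SDK.

```python
from typing import Optional


def top_label(scores: dict, min_confidence: float = 0.5) -> Optional[str]:
    """Return the highest-scoring label, or None if nothing is confident enough."""
    if not scores:
        return None
    label, confidence = max(scores.items(), key=lambda item: item[1])
    return label if confidence >= min_confidence else None


# Hypothetical scores for the bicycle photo above.
scores = {"bicycle": 0.94, "motorcycle": 0.04, "scooter": 0.02}
print(top_label(scores))
```

Returning `None` below the threshold lets the user interface say "not sure" instead of showing a weak guess.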
Object Detection
Object detection identifies multiple objects and their locations within an image.
Example
Image
Street scene
Detected Objects
- Car
- Traffic light
- Pedestrian
- Bicycle
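Unlike classification, each detected object comes with a location. The sketch below assumes every detection carries a label, a confidence score, and a pixel bounding box; the `Detection` class, the sample values, and the 0.6 threshold are illustrative assumptions rather than a real service response.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    confidence: float
    x: int  # left edge of the bounding box, in pixels
    y: int  # top edge of the bounding box, in pixels
    w: int  # box width
    h: int  # box height


def confident_detections(detections, threshold=0.6):
    """Keep only detections whose confidence meets the threshold."""
    return [d for d in detections if d.confidence >= threshold]


# Hypothetical results for the street scene above.
detections = [
    Detection("car", 0.91, 40, 220, 310, 160),
    Detection("traffic light", 0.78, 500, 30, 40, 110),
    Detection("pedestrian", 0.66, 380, 200, 60, 170),
    Detection("bicycle", 0.41, 450, 260, 90, 80),
]

for d in confident_detections(detections):
    print(f"{d.label}: box=({d.x}, {d.y}, {d.w}, {d.h}) confidence={d.confidence:.2f}")
```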
Optical Character Recognition (OCR)
OCR extracts text from images and scanned documents.
Example
Image
Photo of a restaurant menu
Extracted Text
Menu items and prices
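OCR results usually arrive as nested structures rather than one string. The sketch below flattens a result into plain lines of text, assuming a layout of blocks that contain lines; the field names (`blocks`, `lines`, `text`) and the sample menu are assumptions for illustration, so check the response format of the service you actually call.

```python
def extract_lines(read_result: dict) -> list:
    """Collect the non-empty text lines from a nested OCR result."""
    lines = []
    for block in read_result.get("blocks", []):
        for line in block.get("lines", []):
            text = line.get("text", "").strip()
            if text:
                lines.append(text)
    return lines


# Hypothetical OCR result for the restaurant menu photo.
sample = {
    "blocks": [
        {
            "lines": [
                {"text": "Margherita Pizza  12.50"},
                {"text": "Garden Salad  8.00"},
                {"text": "   "},  # whitespace-only lines are skipped
            ]
        }
    ]
}

print("\n".join(extract_lines(sample)))
```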
Image Captioning
Image captioning generates natural-language descriptions of images.
Example
Image
A dog playing in a park
Caption
“A brown dog running through a grassy park.”
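Captions are generated text, so an application may want to check the confidence before showing one. A minimal sketch, assuming the caption arrives as a dictionary with `text` and `confidence` fields; the field names, the threshold, and the fallback message are assumptions.

```python
def describe(caption: dict, min_confidence: float = 0.5) -> str:
    """Return a display-ready caption, or a fallback when confidence is low."""
    text = caption.get("text", "").strip()
    confidence = caption.get("confidence", 0.0)
    if not text or confidence < min_confidence:
        return "No reliable description available."
    # Capitalize the first letter so the caption reads as a sentence.
    return text[0].upper() + text[1:]


# Hypothetical caption results.
print(describe({"text": "a brown dog running through a grassy park", "confidence": 0.82}))
print(describe({"text": "a blurry shape", "confidence": 0.21}))
```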
Facial Analysis
Computer vision systems can analyze facial features.
Possible capabilities include:
- Face detection
- Emotion analysis
- Age estimation
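For Responsible AI reasons, facial recognition and identification require careful consideration. Microsoft restricts these capabilities under a Limited Access policy and has retired Azure face features that infer emotional state or attributes such as age and gender.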
APIs and Endpoints
Applications communicate with Azure AI services using:
- APIs
- Endpoints
These allow images to be analyzed programmatically.
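In practice, the endpoint is the base URL of your resource and the API is the path and parameters added to it. The sketch below only builds the request URL and does not send anything. The path and `api-version` value are modeled on the Azure AI Vision Image Analysis REST API and should be treated as assumptions to verify against the current documentation; the resource name is a placeholder.

```python
from urllib.parse import urlencode


def build_analyze_url(endpoint: str, features: list, api_version: str = "2023-10-01") -> str:
    """Combine a resource endpoint with the analyze path and query string."""
    base = endpoint.rstrip("/") + "/computervision/imageanalysis:analyze"
    query = urlencode(
        {"api-version": api_version, "features": ",".join(features)},
        safe=",",  # keep the comma-separated feature list readable
    )
    return f"{base}?{query}"


# Placeholder endpoint; a real one comes from your Azure resource.
url = build_analyze_url("https://my-vision-resource.cognitiveservices.azure.com/", ["caption", "read"])
print(url)
```

The application would then send the image bytes to this URL in an HTTP POST request and read the JSON response.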
Authentication
Applications must securely authenticate before accessing Azure AI services.
Common authentication methods include:
- API keys
- Azure credentials
- Managed identities
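A minimal sketch of how these methods show up in a request, assuming the application builds HTTP headers itself. Key-based access to Azure AI services uses the `Ocp-Apim-Subscription-Key` header, while Azure credentials and managed identities produce a token that is sent as a bearer token. The `auth_headers` helper and the `VISION_KEY` environment variable name are hypothetical.

```python
import os


def auth_headers(api_key=None, bearer_token=None) -> dict:
    """Build the authentication header, preferring a token over a key."""
    if bearer_token:
        return {"Authorization": f"Bearer {bearer_token}"}
    if api_key:
        return {"Ocp-Apim-Subscription-Key": api_key}
    raise ValueError("Provide an API key or a bearer token.")


# Read the key from the environment instead of hard-coding it in source code.
# VISION_KEY is a hypothetical variable name; the default is a placeholder.
key = os.environ.get("VISION_KEY", "placeholder-key")
print(sorted(auth_headers(api_key=key)))
```

Keeping secrets out of source code matters more than which method is used; managed identities go one step further by removing the stored key entirely.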
User Interface Components
A lightweight vision application may include:
- Image upload area
- Camera capture button
- Results display
- Image preview
Real-Time Image Processing
Some applications process images in near real time.
Examples include:
- Security monitoring
- Live object detection
- Accessibility tools
Example Workflow
A common workflow includes:
- User uploads image
- Application sends image to Azure AI Vision
- AI service analyzes image
- Results are returned
- Application displays findings
Example High-Level Pseudocode
image = upload_image()
results = analyze_image(image)
display_results(results)
For AI-901, understanding the workflow is more important than memorizing exact syntax.
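To make the flow concrete, here is a runnable sketch of the same three steps. The `analyze_image` function is a stand-in that returns canned results instead of calling a real service, and every name and value in it is hypothetical.

```python
def analyze_image(image_bytes: bytes) -> dict:
    """Stand-in for the call to the vision service; returns canned results."""
    if not image_bytes:
        raise ValueError("No image data supplied.")
    return {
        "caption": "a brown dog running through a grassy park",
        "objects": ["dog", "grass"],
    }


def format_results(results: dict) -> str:
    """Turn the analysis results into text for the results display."""
    objects = ", ".join(results["objects"]) or "none"
    return f"Caption: {results['caption']}\nObjects: {objects}"


def run(image_bytes: bytes) -> str:
    results = analyze_image(image_bytes)  # send the image for analysis
    return format_results(results)        # prepare the findings for display


# Placeholder bytes standing in for an uploaded image.
print(run(b"\x89PNG placeholder bytes"))
```

Swapping the stand-in for a real service call would leave the rest of the application unchanged, which is what keeps this kind of app lightweight.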
Common Real-World Scenarios
Scenario 1: Receipt Scanner
Goal
Extract purchase information from receipts.
Features
- OCR
- Text extraction
- Data organization
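The data organization step can be sketched as turning raw OCR lines into structured items. This assumes each purchase appears on one line that ends with a price; real receipts vary widely, and the regular expression and sample lines below are illustrative only.

```python
import re

# An item name, whitespace, an optional dollar sign, then a price such as 4.50.
PRICE_LINE = re.compile(r"^(?P<item>.+?)\s+\$?(?P<price>\d+\.\d{2})$")


def parse_receipt(lines: list) -> list:
    """Return (item, price) pairs for every line that ends with a price."""
    items = []
    for line in lines:
        match = PRICE_LINE.match(line.strip())
        if match:
            items.append((match.group("item"), float(match.group("price"))))
    return items


# Hypothetical OCR output for a receipt.
ocr_lines = ["CORNER CAFE", "Latte  $4.50", "Bagel 3.25", "Thank you!"]
items = parse_receipt(ocr_lines)
total = round(sum(price for _, price in items), 2)
print(items, total)
```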
Scenario 2: Accessibility Assistant
Goal
Describe images for visually impaired users.
Features
- Image captioning
- OCR
- Spoken descriptions
Scenario 3: Product Recognition
Goal
Identify products from photos.
Features
- Object detection
- Classification
- Product lookup
Scenario 4: Content Moderation
Goal
Identify harmful or inappropriate images.
Features
- Image analysis
- Safety detection
- Automated filtering
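Automated filtering often comes down to comparing severity scores against a threshold. The sketch below assumes the safety service returns one severity level per harm category; the category names, the scale, and the threshold of 4 are assumptions for illustration.

```python
def flagged_categories(severities: dict, threshold: int = 4) -> list:
    """List the categories whose severity meets or exceeds the threshold."""
    return sorted(name for name, level in severities.items() if level >= threshold)


def should_block(severities: dict, threshold: int = 4) -> bool:
    """Block the image if any category is flagged."""
    return bool(flagged_categories(severities, threshold))


# Hypothetical severity scores for one uploaded image.
severities = {"violence": 6, "hate": 0, "sexual": 2, "self_harm": 0}
print(should_block(severities), flagged_categories(severities))
```

Lowering the threshold makes the filter stricter, at the cost of blocking more harmless images.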
Responsible AI Considerations
Vision-enabled applications should follow Responsible AI principles.
Key considerations include:
- Fairness
- Privacy
- Transparency
- Inclusiveness
- Accountability
- Security
Privacy Concerns
Images may contain:
- Personal data
- Faces
- Sensitive documents
- Location information
Organizations should protect visual data appropriately.
Bias and Fairness
Computer vision systems may perform unevenly across:
- Skin tones
- Lighting conditions
- Demographics
- Environmental conditions
Testing and evaluation are important for fairness.
Transparency
Users should understand:
- AI is analyzing images
- AI-generated results may contain errors
- Images may be processed in the cloud
Hallucinations and Errors
Vision systems may occasionally generate:
- Incorrect captions
- False detections
- Inaccurate classifications
When a generative model produces such outputs, for example a caption that mentions something not present in the image, they are often called hallucinations.
Error Handling
Applications should handle:
- Invalid image formats
- Poor image quality
- Authentication failures
- Network interruptions
- Rate limits
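Two of these cases can be sketched with the standard library alone: rejecting unsupported files before upload, and retrying with exponential backoff when the service is busy or the network drops. `RateLimitError` is a hypothetical exception standing in for whatever your client raises on a rate-limit response, and only PNG and JPEG are checked here for brevity.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error for a 'too many requests' response."""


def detect_image_format(data: bytes):
    """Identify PNG or JPEG data by its leading bytes; return None otherwise."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return None


def call_with_retry(operation, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run operation, waiting 1x, 2x, 4x... base_delay between failed attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except (RateLimitError, ConnectionError):
            if attempt == max_attempts:
                raise  # give up and let the caller show an error message
            sleep(base_delay * 2 ** (attempt - 1))


# Simulated service call that fails twice before succeeding.
calls = []

def flaky_analysis():
    calls.append(1)
    if len(calls) < 3:
        raise RateLimitError("slow down")
    return "analysis complete"

print(call_with_retry(flaky_analysis, sleep=lambda seconds: None))
```

Authentication failures are deliberately not retried here, because repeating a request with bad credentials will not succeed.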
Image Quality Challenges
Computer vision accuracy can decrease with:
- Blurry images
- Poor lighting
- Low resolution
- Obstructed objects
Advantages of Vision Applications
Benefits include:
- Automation
- Faster analysis
- Accessibility improvements
- Improved customer experiences
- Scalable image processing
Limitations of Vision Applications
Challenges include:
- Recognition inaccuracies
- Bias
- Privacy concerns
- Variable image quality
- Ethical considerations
High-Level Architecture
A simplified architecture often includes:
- User interface
- Image upload/capture
- Azure AI Vision service
- AI analysis
- Results display
Generative Vision Capabilities
Some modern systems combine:
- Computer vision
- Generative AI
These multimodal systems can:
- Analyze images
- Generate descriptions
- Answer visual questions
- Create new images
Important AI-901 Exam Tips
For the exam, remember these key points:
- Computer vision analyzes visual information.
- Azure AI Vision provides computer vision capabilities.
- OCR extracts text from images.
- Object detection identifies multiple objects in images.
- Image captioning generates natural-language image descriptions.
- APIs and endpoints connect applications to Azure AI services.
- Authentication secures service access.
- Responsible AI principles apply to computer vision systems.
- Image quality affects AI accuracy.
- Hallucinations are inaccurate AI-generated outputs.
Quick Knowledge Check
Question 1
What does OCR do?
Answer
Extracts text from images and scanned documents.
Question 2
What is object detection?
Answer
Identifying and locating objects within an image.
Question 3
Why is authentication important?
Answer
It secures access to Azure AI services.
Question 4
What can reduce computer vision accuracy?
Answer
Poor image quality, such as blur or low lighting.
Practice Exam Questions
Question 1
What is the PRIMARY purpose of computer vision?
A. To enable AI systems to analyze and understand visual information
B. To increase internet bandwidth
C. To manage database backups
D. To improve keyboard performance
Correct Answer
A. To enable AI systems to analyze and understand visual information
Explanation
Computer vision allows AI systems to process and interpret images, videos, and other visual data.
Why the Other Answers Are Incorrect
B. To increase internet bandwidth
Computer vision does not affect networking speed.
C. To manage database backups
This is unrelated to computer vision.
D. To improve keyboard performance
This is unrelated to AI vision systems.
Question 2
Which Azure service provides computer vision capabilities such as OCR and image analysis?
A. Azure AI Vision
B. Azure Backup
C. Azure Virtual Machines
D. Azure DNS
Correct Answer
A. Azure AI Vision
Explanation
Azure AI Vision provides cloud-based computer vision capabilities including OCR, object detection, and image captioning.
Why the Other Answers Are Incorrect
B. Azure Backup
This is a backup service.
C. Azure Virtual Machines
This provides compute infrastructure.
D. Azure DNS
This is a networking service.
Question 3
What does OCR stand for?
A. Optical Character Recognition
B. Open Cloud Rendering
C. Object Classification Registry
D. Operational Compute Routing
Correct Answer
A. Optical Character Recognition
Explanation
OCR extracts text from images or scanned documents.
Why the Other Answers Are Incorrect
B. Open Cloud Rendering
This is not the meaning of OCR.
C. Object Classification Registry
This is unrelated to OCR.
D. Operational Compute Routing
This is not a computer vision term.
Question 4
What is the PRIMARY purpose of object detection?
A. To identify and locate objects within an image
B. To translate spoken language
C. To summarize long documents
D. To compress image files
Correct Answer
A. To identify and locate objects within an image
Explanation
Object detection identifies multiple objects and their locations inside an image.
Why the Other Answers Are Incorrect
B. To translate spoken language
This is a speech AI task.
C. To summarize long documents
This is a text analysis task.
D. To compress image files
Object detection does not compress files.
Question 5
What does image captioning do?
A. Generates natural-language descriptions of images
B. Converts speech into text
C. Encrypts image files
D. Creates database tables
Correct Answer
A. Generates natural-language descriptions of images
Explanation
Image captioning creates human-readable descriptions of visual content.
Why the Other Answers Are Incorrect
B. Converts speech into text
This is speech recognition.
C. Encrypts image files
Encryption is unrelated to captioning.
D. Creates database tables
This is unrelated to computer vision.
Question 6
How do lightweight vision applications typically communicate with Azure AI services?
A. Through APIs and endpoints
B. Through printer drivers
C. Through monitor settings
D. Through USB-only connections
Correct Answer
A. Through APIs and endpoints
Explanation
Applications use APIs and cloud endpoints to send images and receive AI-generated analysis results.
Why the Other Answers Are Incorrect
B. Through printer drivers
Printers are unrelated to AI communication.
C. Through monitor settings
This is unrelated to cloud AI services.
D. Through USB-only connections
Cloud services use network communication.
Question 7
Why is authentication important when accessing Azure AI Vision services?
A. To secure access to AI resources
B. To increase image brightness
C. To improve keyboard response time
D. To accelerate internet speeds
Correct Answer
A. To secure access to AI resources
Explanation
Authentication helps ensure that only authorized users and applications can access Azure AI services.
Why the Other Answers Are Incorrect
B. To increase image brightness
Authentication does not affect image quality.
C. To improve keyboard response time
This is unrelated to authentication.
D. To accelerate internet speeds
Authentication does not improve network performance.
Question 8
Which Responsible AI concern is especially important in computer vision systems?
A. Protecting personal and sensitive visual information
B. Increasing monitor resolution
C. Improving printer speed
D. Reducing spreadsheet file sizes
Correct Answer
A. Protecting personal and sensitive visual information
Explanation
Images may contain faces, documents, or other sensitive information that must be protected.
Why the Other Answers Are Incorrect
B. Increasing monitor resolution
This is unrelated to Responsible AI.
C. Improving printer speed
Printers are unrelated to computer vision ethics.
D. Reducing spreadsheet file sizes
This is unrelated to image analysis.
Question 9
What challenge can reduce computer vision accuracy?
A. Poor image quality
B. Spreadsheet formatting
C. Keyboard layout changes
D. Audio playback speed
Correct Answer
A. Poor image quality
Explanation
Blur, low lighting, and low resolution can negatively affect image analysis accuracy.
Why the Other Answers Are Incorrect
B. Spreadsheet formatting
This does not affect vision systems.
C. Keyboard layout changes
This is unrelated to image processing.
D. Audio playback speed
This is unrelated to computer vision.
Question 10
What are hallucinations in AI vision systems?
A. Incorrect or fabricated AI-generated outputs
B. Hardware installation failures
C. Network outages
D. Printer connection problems
Correct Answer
A. Incorrect or fabricated AI-generated outputs
Explanation
Hallucinations occur when AI systems generate inaccurate descriptions or detections.
Why the Other Answers Are Incorrect
B. Hardware installation failures
This is unrelated to AI-generated outputs.
C. Network outages
This is a connectivity issue.
D. Printer connection problems
This is unrelated to AI vision systems.
Final Thoughts
Building lightweight applications with vision capabilities is an important topic for the AI-901 certification exam. Microsoft expects candidates to understand the foundational concepts behind computer vision applications, including image classification, object detection, OCR, APIs, authentication, Responsible AI principles, and real-world implementation workflows.
Azure AI Vision and Azure AI Foundry provide powerful cloud-based tools that make it easier to build intelligent applications capable of analyzing and understanding visual information.