Tag: AI-901: Azure AI Fundamentals

AI, AI Governance, AI Strategy, AI-901, Artificial Intelligence (AI), azure, Microsoft Certification May 18, 2026May 24, 2026

Exam Prep Hub for AI-901: Azure AI Fundamentals

Welcome to the AI-901: Azure AI Fundamentals Exam Prep Hub!

Welcome to the one-stop hub with information for preparing for the AI-901: Azure AI Fundamentals certification exam. The content for this exam helps you to demonstrate that “you have conceptual knowledge of AI solutions in Azure and the foundational technical skills to work with them”. You will also need “knowledge of Python coding syntax and programming techniques, and you should be familiar with Azure resources”.
Upon successful completion of the exam, you earn the Microsoft Certified: Azure AI Fundamentals certification.

This hub provides information directly here (topic-by-topic as outlined in the official study guide), links to a number of external resources, tips for preparing for the exam, practice tests, and section questions to help you prepare. Bookmark this page and use it as a guide to ensure that you are fully covering all relevant topics for the AI-901 exam and making use of as many of the resources available as possible.

Audience profile (from Microsoft’s site)



As a candidate for this Microsoft Certification, you’re at the beginning of your career in AI solution development. These Microsoft certifications offer opportunities to demonstrate your understanding of machine learning, AI concepts, and Azure services, whether you are starting your career or advancing your skills in AI solution development. Both certifications are designed for candidates from technical and non-technical backgrounds—prior experience in data science or software engineering is not required, though familiarity with basic cloud concepts and client-server applications will be helpful.

For the AI-901, you should have foundational knowledge of AI workloads and understand the basic principles of AI and machine learning. And also, you should have foundational technical skills for working with AI solutions in Azure, conceptual knowledge of Azure-based AI solutions, and familiarity with Python coding syntax and programming techniques, as well as Azure resources.

You may be eligible for ACE college credit if you pass this certification. See ACE college credit for certification exams for details.

Skills at a glance (as specified in the official study guide)

Identify AI concepts and responsibilities (40–45%)
Implement AI solutions by using Microsoft Foundry (55–60%)

Topic-by-Topic Exam Content

[click a topic link to access the content and practice questions for that topic]

Identify AI concepts and capabilities (40–45%)

Describe principles of responsible AI

Identify AI model components and configurations

Identify AI workloads

Implement AI solutions by using Microsoft Foundry (55–60%)

Implement generative AI apps and agents by using Foundry

Implement AI solutions for text and speech by using Foundry

Implement AI solutions with computer vision and image-generation capabilities by using Foundry

Implement AI solutions for information extraction by using Foundry

AI-901 Practice Exams

Important AI-901 Resources

Link to the free, comprehensive, self-paced course on Microsoft Learn – Introduction to AI in Azure
Link to the certification page: Microsoft Certified: Azure AI Fundamentals certification page
Link to the study guide: Study guide for Exam AI-901: Microsoft Azure AI Fundamentals

An overview of the AI-901 and how it compares to AI-900: AI-901 – Azure AI Fundamentals Exam Review by Tim Warner

Multiple AI-901 related courses/videos: AI-901 Azure AI Fundamentals course and channel by Skilltech Club

Good luck to you on your data journey!

AI, AI-901, Microsoft Certification May 18, 2026

AI-901: Microsoft Azure AI Fundamentals – Practice Exam #4 (30 Questions)

Question 1

Which AI workload is BEST suited for forecasting future inventory demand?

A. Regression
B. OCR
C. Object detection
D. Speech synthesis

Correct Answer

A. Regression

Explanation

Regression predicts continuous numeric values, such as future inventory levels or sales demand.

Question 2

A company wants to automatically categorize support tickets into departments such as Billing, Technical Support, and Sales.

Which machine learning technique should be used?

A. Classification
B. Clustering
C. Regression
D. Forecasting

Correct Answer

A. Classification

Question 3

Which Responsible AI principle emphasizes designing systems that can be used by people with disabilities?

A. Inclusiveness
B. Reliability
C. Accountability
D. Transparency

Correct Answer

A. Inclusiveness

Question 4

What is tokenization in generative AI?

A. Breaking text into smaller pieces for processing
B. Encrypting storage devices
C. Compressing images
D. Improving network bandwidth

Correct Answer

A. Breaking text into smaller pieces for processing

Question 5

HOTSPOT / MATCHING

Match each AI capability with the appropriate use case.

Use Case	Capability
Creating realistic AI-generated artwork	?
Identifying the language of a sentence	?
Extracting text from a passport image	?

Options:

OCR
Language detection
Image generation

Correct Answers

Use Case	Capability
Creating realistic AI-generated artwork	Image generation
Identifying the language of a sentence	Language detection
Extracting text from a passport image	OCR

Question 6

Which Azure AI capability allows applications to answer questions conversationally?

A. Generative AI
B. Regression
C. Clustering
D. Forecasting

Correct Answer

A. Generative AI

Question 7

A retailer wants to detect when shelves in stores are empty by analyzing camera images.

Which AI capability should be used?

A. Object detection
B. Speech recognition
C. Translation
D. Regression

Correct Answer

A. Object detection

Question 8

MULTIPLE ANSWER

Which are common examples of generative AI applications?

Select ALL that apply.

A. Chatbots
B. AI image creation
C. Automated code generation
D. Spreadsheet printing
E. Content summarization

Correct Answers

A. Chatbots
B. AI image creation
C. Automated code generation
E. Content summarization

Question 9

What is the purpose of an API key in Azure AI services?

A. To authenticate access to the service
B. To increase monitor brightness
C. To compress image files
D. To replace endpoints

Correct Answer

A. To authenticate access to the service

Question 10

Which AI capability identifies names, locations, or dates in text?

A. Entity extraction
B. Object detection
C. OCR
D. Speech synthesis

Correct Answer

A. Entity extraction

Question 11

FILL IN THE BLANK

__________ AI can generate new content such as text, images, and code.

Correct Answer

Generative

Generative AI

Question 12

Which statement about cloud-based AI services is TRUE?

A. They reduce the need to manage physical infrastructure
B. They eliminate all cybersecurity risks
C. They only work offline
D. They cannot scale automatically

Correct Answer

A. They reduce the need to manage physical infrastructure

Question 13

You need an AI solution that can convert customer emails into concise summaries.

Which capability is MOST appropriate?

A. Text summarization
B. Regression
C. Object detection
D. Forecasting

Correct Answer

A. Text summarization

Question 14

Which machine learning technique is used when training data already contains labeled outcomes?

A. Supervised learning
B. Unsupervised learning
C. Reinforcement learning
D. OCR

Correct Answer

A. Supervised learning

Question 15

MULTIPLE ANSWER

Which capabilities are commonly associated with Azure AI Vision services?

Select ALL that apply.

A. OCR
B. Image captioning
C. Object detection
D. Sentiment analysis
E. Facial analysis

Correct Answers

A. OCR
B. Image captioning
C. Object detection
E. Facial analysis

Question 16

Which Responsible AI principle focuses on ensuring AI systems operate dependably?

A. Reliability and safety
B. Transparency
C. Inclusiveness
D. Fairness

Correct Answer

A. Reliability and safety

Question 17

You want an AI assistant to always respond in a formal tone.

What is the BEST way to accomplish this?

A. Use a system prompt
B. Use OCR
C. Increase screen resolution
D. Disable APIs

Correct Answer

A. Use a system prompt

Question 18

HOTSPOT / MATCHING

Match the AI workload to the corresponding example.

Workload	Example
Classification	?
Speech synthesis	?
Clustering	?

Options:

Grouping similar customers
Converting text into spoken audio
Determining whether a loan is approved

Correct Answers

Workload	Example
Classification	Determining whether a loan is approved
Speech synthesis	Converting text into spoken audio
Clustering	Grouping similar customers

Question 19

Which capability allows AI systems to create captions that describe images?

A. Image captioning
B. OCR
C. Translation
D. Regression

Correct Answer

A. Image captioning

Question 20

Which statement about Responsible AI is TRUE?

A. AI systems should be monitored and governed responsibly
B. AI systems never require human oversight
C. Responsible AI eliminates all risks
D. Responsible AI only applies to generative AI

Correct Answer

A. AI systems should be monitored and governed responsibly

Question 21

A transportation company wants to analyze dashcam footage to detect traffic signs and pedestrians.

Which workload is MOST appropriate?

A. Computer vision
B. Regression
C. Forecasting
D. Translation

Correct Answer

A. Computer vision

Question 22

MULTIPLE ANSWER

Which are common capabilities of natural language processing solutions?

Select ALL that apply.

A. Translation
B. Sentiment analysis
C. Entity recognition
D. Object detection
E. Summarization

Correct Answers

A. Translation
B. Sentiment analysis
C. Entity recognition
E. Summarization

Question 23

Which Responsible AI principle emphasizes understanding and explaining AI decisions?

A. Transparency
B. Reliability
C. Privacy
D. Inclusiveness

Correct Answer

A. Transparency

Question 24

FILL IN THE BLANK

__________ recognition converts spoken words into text.

Correct Answer

Speech

Speech recognition

Question 25

A business wants AI-generated voice responses for a customer-service phone system.

Which capability should they use?

A. Speech synthesis
B. OCR
C. Classification
D. Clustering

Correct Answer

A. Speech synthesis

Question 26

Which AI capability would BEST help identify duplicate customer records?

A. Entity matching
B. Object detection
C. Speech synthesis
D. Forecasting

Correct Answer

A. Entity matching

Question 27

MULTIPLE ANSWER

Which are benefits of using pretrained generative AI models?

Select ALL that apply.

A. Faster implementation
B. Reduced training costs
C. Access to advanced capabilities
D. Guaranteed perfect outputs
E. Simplified development

Correct Answers

A. Faster implementation
B. Reduced training costs
C. Access to advanced capabilities
E. Simplified development

Question 28

You are developing a lightweight application that sends prompts to an Azure AI model and receives responses.

What is the MOST likely communication method?

A. REST API calls
B. HDMI connections
C. USB transfers
D. Printer drivers

Correct Answer

A. REST API calls

Question 29

Which AI capability can identify the main topics discussed in a document?

A. Keyword extraction
B. Speech synthesis
C. Object detection
D. Regression

Correct Answer

A. Keyword extraction

Question 30

SCENARIO-BASED QUESTION

A media company wants an AI solution that:

Generates promotional images
Converts podcast audio into transcripts
Detects company logos in uploaded photos
Summarizes interview articles

Which AI capabilities are required?

A. Image generation, speech recognition, object detection, and text summarization
B. Regression and clustering only
C. Forecasting and classification only
D. OCR only

Correct Answer

A. Image generation, speech recognition, object detection, and text summarization

Explanation

The scenario requires multiple AI capabilities:

Image generation for promotional graphics
Speech recognition for podcast transcription
Object detection for logo identification
Text summarization for interview summaries

AI, AI-901, Microsoft Certification May 18, 2026

AI-901: Microsoft Azure AI Fundamentals – Practice Exam #3 (30 Questions)

Question 1

Which machine learning technique is commonly used to predict whether an email is spam?

A. Regression
B. Classification
C. Clustering
D. OCR

Correct Answer

B. Classification

Explanation

Classification predicts categories or labels such as spam/not spam, approved/denied, or fraud/not fraud.

Question 2

A company wants an AI system that can automatically generate product descriptions for an online store.

Which AI capability should be used?

A. Generative AI
B. Regression
C. OCR
D. Clustering

Correct Answer

A. Generative AI

Question 3

Which Responsible AI principle focuses on making AI systems accessible to people with varying abilities?

A. Fairness
B. Inclusiveness
C. Accountability
D. Transparency

Correct Answer

B. Inclusiveness

Question 4

Which Azure AI capability is MOST useful for converting scanned paper documents into searchable text?

A. OCR
B. Object detection
C. Speech synthesis
D. Regression

Correct Answer

A. OCR

Question 5

HOTSPOT / MATCHING

Match the AI scenario with the correct capability.

Scenario	Capability
Predicting next month’s sales	?
Grouping similar customers	?
Determining if a review is positive or negative	?

Options:

Clustering
Regression
Sentiment analysis

Correct Answers

Scenario	Capability
Predicting next month’s sales	Regression
Grouping similar customers	Clustering
Determining if a review is positive or negative	Sentiment analysis

Question 6

What is the PRIMARY benefit of pretrained AI models?

A. They eliminate all errors
B. They reduce the need to build models from scratch
C. They require no internet connectivity
D. They replace APIs

Correct Answer

B. They reduce the need to build models from scratch

Question 7

Which AI capability identifies people, products, or objects within images?

A. Object detection
B. Sentiment analysis
C. Speech recognition
D. Translation

Correct Answer

A. Object detection

Question 8

MULTIPLE ANSWER

Which are examples of generative AI outputs?

Select ALL that apply.

A. AI-generated images
B. AI-written summaries
C. AI-created code
D. Printed paper forms
E. AI-generated marketing emails

Correct Answers

A. AI-generated images
B. AI-written summaries
C. AI-created code
E. AI-generated marketing emails

Question 9

What is temperature commonly used for in generative AI models?

A. To control response randomness and creativity
B. To measure server heat
C. To encrypt AI outputs
D. To improve monitor resolution

Correct Answer

A. To control response randomness and creativity

Question 10

A business wants an AI system that can translate customer support chats into multiple languages.

Which workload should they use?

A. Natural language processing
B. Object detection
C. Regression
D. Clustering

Correct Answer

A. Natural language processing

Question 11

FILL IN THE BLANK

__________ AI models can process multiple input types such as text, images, and audio.

Correct Answer

Multimodal

Question 12

Which statement about Azure AI Foundry is TRUE?

A. It can be used to deploy and manage AI models
B. It only supports spreadsheet applications
C. It replaces operating systems
D. It only works for computer vision models

Correct Answer

A. It can be used to deploy and manage AI models

Question 13

You need an AI application that can identify important phrases within documents.

Which capability should you use?

A. Keyword extraction
B. Regression
C. Clustering
D. Object detection

Correct Answer

A. Keyword extraction

Question 14

Which type of machine learning groups data based on similarities without predefined labels?

A. Regression
B. Classification
C. Clustering
D. OCR

Correct Answer

C. Clustering

Question 15

MULTIPLE ANSWER

Which are common uses of speech AI services?

Select ALL that apply.

A. Speech-to-text conversion
B. Text-to-speech conversion
C. Real-time translation
D. Predicting stock prices
E. Speaker recognition

Correct Answers

A. Speech-to-text conversion
B. Text-to-speech conversion
C. Real-time translation
E. Speaker recognition

Question 16

Which Responsible AI principle emphasizes understanding how AI systems operate?

A. Transparency
B. Reliability
C. Inclusiveness
D. Privacy

Correct Answer

A. Transparency

Question 17

You are creating a chatbot that should answer politely and professionally.

Which prompt type is BEST for defining the chatbot’s behavior?

A. System prompt
B. User prompt
C. Regression prompt
D. OCR prompt

Correct Answer

A. System prompt

Question 18

HOTSPOT / MATCHING

Match the capability to the appropriate scenario.

Capability	Scenario
OCR	?
Speech synthesis	?
Image classification	?

Options:

Categorizing animal photos
Reading text from receipts
Generating spoken responses

Correct Answers

Capability	Scenario
OCR	Reading text from receipts
Speech synthesis	Generating spoken responses
Image classification	Categorizing animal photos

Question 19

Which AI workload would BEST identify the language used in a sentence?

A. Natural language processing
B. Regression
C. Forecasting
D. Clustering

Correct Answer

A. Natural language processing

Question 20

What is one limitation of generative AI models?

A. They may produce inaccurate information
B. They cannot process text
C. They only work offline
D. They cannot generate images

Correct Answer

A. They may produce inaccurate information

Question 21

A company wants to monitor customer opinions on social media posts.

Which AI capability should they use?

A. Sentiment analysis
B. Object detection
C. OCR
D. Regression

Correct Answer

A. Sentiment analysis

Question 22

MULTIPLE ANSWER

Which tasks are examples of computer vision workloads?

Select ALL that apply.

A. Detecting objects in images
B. Recognizing handwritten text
C. Identifying image categories
D. Translating speech
E. Facial analysis

Correct Answers

A. Detecting objects in images
B. Recognizing handwritten text
C. Identifying image categories
E. Facial analysis

Question 23

Which Responsible AI principle ensures organizations remain responsible for AI decisions and outcomes?

A. Accountability
B. Transparency
C. Inclusiveness
D. Fairness

Correct Answer

A. Accountability

Question 24

FILL IN THE BLANK

__________ analysis evaluates whether text expresses positive, negative, or neutral emotions.

Correct Answer

Sentiment

Sentiment analysis

Question 25

Which AI capability would BEST help summarize a recorded business meeting?

A. Speech recognition combined with text summarization
B. Object detection only
C. Regression only
D. Clustering only

Correct Answer

A. Speech recognition combined with text summarization

Question 26

A retailer wants an AI model that automatically creates advertising images from written prompts.

Which capability is MOST appropriate?

A. Image-generation models
B. Forecasting models
C. Regression models
D. OCR models

Correct Answer

A. Image-generation models

Question 27

MULTIPLE ANSWER

Which are advantages of lightweight AI applications?

Select ALL that apply.

A. Faster development
B. Reduced infrastructure complexity
C. Easier cloud integration
D. Guaranteed perfect AI accuracy
E. Simplified deployment

Correct Answers

A. Faster development
B. Reduced infrastructure complexity
C. Easier cloud integration
E. Simplified deployment

Question 28

You want an AI application to analyze uploaded images and answer questions about them.

Which type of model is MOST appropriate?

A. Multimodal model
B. Regression model
C. Clustering model
D. Forecasting model

Correct Answer

A. Multimodal model

Question 29

Which statement about APIs in AI solutions is TRUE?

A. APIs allow applications to exchange data with AI services
B. APIs eliminate the need for authentication
C. APIs physically store microphones
D. APIs replace cloud services

Correct Answer

A. APIs allow applications to exchange data with AI services

Question 30

SCENARIO-BASED QUESTION

A financial company wants an AI solution that:

Reads text from loan applications
Detects signatures in uploaded forms
Converts recorded customer calls into text
Generates automated email responses

Which AI capabilities are required?

A. OCR, object detection, speech recognition, and generative AI
B. Regression and clustering only
C. Forecasting and translation only
D. Speech synthesis only

Correct Answer

A. OCR, object detection, speech recognition, and generative AI

Explanation

The scenario requires multiple AI capabilities:

OCR for reading application text
Object detection for identifying signatures
Speech recognition for transcribing calls
Generative AI for automated email creation

AI, AI-901, Microsoft Certification May 18, 2026

AI-901: Microsoft Azure AI Fundamentals – Practice Exam #2 (30 Questions)

Question 1

Which machine learning technique is BEST suited for predicting house prices?

A. Clustering
B. Regression
C. Object detection
D. Translation

Correct Answer

B. Regression

Explanation

Regression predicts continuous numeric values such as prices, temperatures, or sales forecasts.

Question 2

A company wants to automatically detect fraudulent credit-card transactions.

Which type of AI workload is MOST appropriate?

A. Classification
B. OCR
C. Image generation
D. Speech synthesis

Correct Answer

A. Classification

Explanation

Fraud detection commonly uses classification models to determine whether transactions are fraudulent or legitimate.

Question 3

Which Responsible AI principle focuses on protecting sensitive user data?

A. Transparency
B. Fairness
C. Privacy and security
D. Inclusiveness

Correct Answer

C. Privacy and security

Question 4

What is the PRIMARY purpose of a user prompt in generative AI?

A. To provide instructions or requests to the model
B. To replace APIs
C. To install operating systems
D. To secure databases

Correct Answer

A. To provide instructions or requests to the model

Question 5

HOTSPOT / MATCHING

Match each AI capability with its correct output.

Capability	Output
Speech synthesis	?
OCR	?
Sentiment analysis	?

Options:

Emotional tone
Spoken audio
Extracted text

Correct Answers

Capability	Output
Speech synthesis	Spoken audio
OCR	Extracted text
Sentiment analysis	Emotional tone

Question 6

Which type of AI model can generate entirely new images from text prompts?

A. Generative AI model
B. Regression model
C. Clustering model
D. Time-series model

Correct Answer

A. Generative AI model

Question 7

You need an AI solution that converts spoken customer calls into searchable transcripts.

Which capability should you use?

A. Speech recognition
B. Speech synthesis
C. OCR
D. Object detection

Correct Answer

A. Speech recognition

Question 8

MULTIPLE ANSWER

Which are common capabilities of computer vision solutions?

Select ALL that apply.

A. Object detection
B. Image classification
C. OCR
D. Language translation
E. Facial analysis

Correct Answers

A. Object detection
B. Image classification
C. OCR
E. Facial analysis

Question 9

What does an Azure AI endpoint provide?

A. A network-accessible location for interacting with an AI service
B. A physical monitor connection
C. A database backup
D. A printer configuration

Correct Answer

A. A network-accessible location for interacting with an AI service

Question 10

Which AI workload is MOST associated with language translation?

A. Natural language processing
B. Regression
C. Forecasting
D. Clustering

Correct Answer

A. Natural language processing

Question 11

FILL IN THE BLANK

__________ identifies and locates objects within an image or video.

Correct Answer

Object detection

Question 12

A company wants an AI solution that can generate summaries of long documents.

Which AI capability should they use?

A. Text summarization
B. OCR
C. Regression
D. Forecasting

Correct Answer

A. Text summarization

Question 13

Which statement about multimodal AI models is TRUE?

A. They can process multiple content types such as text and images
B. They only process spreadsheets
C. They cannot analyze images
D. They only work with speech input

Correct Answer

A. They can process multiple content types such as text and images

Question 14

You are building an AI solution that extracts invoice numbers and due dates from scanned invoices.

Which technologies are MOST useful?

A. OCR and entity extraction
B. Forecasting and regression
C. Clustering and translation
D. Speech synthesis and object detection

Correct Answer

A. OCR and entity extraction

Question 15

MULTIPLE ANSWER

Which factors can reduce the accuracy of AI vision systems?

Select ALL that apply.

A. Poor lighting
B. Low-resolution images
C. Blurry images
D. Clear high-quality images
E. Obstructed objects

Correct Answers

A. Poor lighting
B. Low-resolution images
C. Blurry images
E. Obstructed objects

Question 16

Which Responsible AI principle focuses on ensuring AI systems work consistently and safely?

A. Reliability and safety
B. Transparency
C. Inclusiveness
D. Fairness

Correct Answer

A. Reliability and safety

Question 17

You deploy a model in Azure AI Foundry.

What is commonly required for applications to securely access the model?

A. Authentication credentials
B. A USB cable
C. A local printer
D. Spreadsheet macros

Correct Answer

A. Authentication credentials

Question 18

HOTSPOT / MATCHING

Match the workload to the correct scenario.

Scenario	Workload
Predicting future sales revenue	?
Detecting emotions in reviews	?
Identifying products in store images	?

Options:

Sentiment analysis
Regression
Object detection

Correct Answers

Scenario	Workload
Predicting future sales revenue	Regression
Detecting emotions in reviews	Sentiment analysis
Identifying products in store images	Object detection

Question 19

Which AI capability generates written descriptions of images?

A. Image captioning
B. OCR
C. Regression
D. Translation

Correct Answer

A. Image captioning

Question 20

Which statement about hallucinations in generative AI is TRUE?

A. Hallucinations are always intentional
B. Hallucinations are fabricated or inaccurate outputs
C. Hallucinations improve model accuracy
D. Hallucinations only occur in image models

Correct Answer

B. Hallucinations are fabricated or inaccurate outputs

Question 21

A retailer wants to group shoppers based on purchasing patterns without predefined categories.

Which machine learning technique should be used?

A. Clustering
B. Classification
C. OCR
D. Regression

Correct Answer

A. Clustering

Question 22

MULTIPLE ANSWER

Which tasks are examples of information extraction?

Select ALL that apply.

A. Extracting names from documents
B. Reading text from images
C. Detecting keywords in audio
D. Predicting stock prices
E. Identifying invoice totals

Correct Answers

A. Extracting names from documents
B. Reading text from images
C. Detecting keywords in audio
E. Identifying invoice totals

Question 23

Which Responsible AI principle emphasizes that humans remain responsible for AI outcomes?

A. Accountability
B. Fairness
C. Inclusiveness
D. Reliability

Correct Answer

A. Accountability

Question 24

FILL IN THE BLANK

__________ converts written text into spoken audio.

Correct Answer

Speech synthesis

Question 25

Which AI capability would BEST help visually impaired users understand photos?

A. Image captioning
B. Regression
C. Clustering
D. Forecasting

Correct Answer

A. Image captioning

Question 26

A customer-service solution automatically identifies whether callers are angry or satisfied.

Which AI capability is being used?

A. Sentiment analysis
B. OCR
C. Image classification
D. Forecasting

Correct Answer

A. Sentiment analysis

Question 27

MULTIPLE ANSWER

Which are advantages of using cloud-based Azure AI services?

Select ALL that apply.

A. Scalability
B. Reduced infrastructure management
C. Access to pretrained models
D. Elimination of all AI errors
E. Faster deployment

Correct Answers

A. Scalability
B. Reduced infrastructure management
C. Access to pretrained models
E. Faster deployment

Question 28

You need an AI solution that can analyze both spoken words and visual content from videos.

Which type of AI system is MOST appropriate?

A. Multimodal AI
B. Regression-only AI
C. Clustering-only AI
D. Spreadsheet automation AI

Correct Answer

A. Multimodal AI

Question 29

Which statement about APIs in Azure AI solutions is TRUE?

A. APIs allow applications to communicate with AI services
B. APIs physically store images
C. APIs replace authentication
D. APIs only work offline

Correct Answer

A. APIs allow applications to communicate with AI services

Question 30

SCENARIO-BASED QUESTION

A healthcare organization wants an AI application that:

Extracts text from medical forms
Converts doctor dictation into text
Identifies medical equipment in images
Summarizes patient notes

Which AI capabilities are required?

A. OCR, speech recognition, object detection, and text summarization
B. Forecasting and clustering only
C. Regression and translation only
D. Speech synthesis only

Correct Answer

A. OCR, speech recognition, object detection, and text summarization

Explanation

The scenario requires multiple AI workloads:

OCR for extracting text from forms
Speech recognition for doctor dictation
Object detection for medical equipment images
Text summarization for patient notes

Go to the AI-901 Exam Prep Hub main page

AI-901, Microsoft Certification May 18, 2026

AI-901: Microsoft Azure AI Fundamentals – Practice Exam #1 (30 Questions)

Question 1

Which type of AI workload is primarily used to predict future numeric values?

A. Computer vision
B. Regression
C. Classification
D. Natural language processing

Correct Answer

B. Regression

Explanation

Regression predicts continuous numeric values such as sales forecasts, temperatures, or stock prices.

Why the Other Answers Are Incorrect

A. Computer vision analyzes images and video.
C. Classification predicts categories rather than numeric values.
D. Natural language processing focuses on text and language.

Question 2

You need to determine whether customer feedback is positive, negative, or neutral.

Which AI capability should you use?

A. OCR
B. Object detection
C. Sentiment analysis
D. Speech synthesis

Correct Answer

C. Sentiment analysis

Explanation

Sentiment analysis evaluates emotional tone in text.

Question 3

Which Responsible AI principle focuses on ensuring AI systems treat people equitably?

A. Transparency
B. Fairness
C. Accountability
D. Reliability

Correct Answer

B. Fairness

Question 4

You are building a chatbot that answers customer questions.

Which type of AI workload is MOST appropriate?

A. Generative AI
B. Regression
C. Clustering
D. Forecasting

Correct Answer

A. Generative AI

Explanation

Generative AI models can generate human-like conversational responses.

Question 5

HOTSPOT / MATCHING

Match the AI capability to the correct scenario.

Scenario	Capability
Detecting handwritten text in scanned forms	?
Identifying objects in an image	?
Converting speech into text	?

Options:

OCR
Speech recognition
Object detection

Correct Answers

Scenario	Capability
Detecting handwritten text in scanned forms	OCR
Identifying objects in an image	Object detection
Converting speech into text	Speech recognition

Question 6

Which Azure AI capability generates spoken audio from text?

A. Speech recognition
B. Speech synthesis
C. OCR
D. Translation

Correct Answer

B. Speech synthesis

Question 7

You want to create an AI application that analyzes invoices and extracts totals and dates.

Which capability should you use?

A. Object detection
B. OCR and entity extraction
C. Speech synthesis
D. Classification only

Correct Answer

B. OCR and entity extraction

Explanation

Invoices contain text and structured information that can be extracted using OCR and entity extraction.

Question 8

MULTIPLE ANSWER

Which are common Responsible AI principles promoted by Microsoft?

Select ALL that apply.

A. Fairness
B. Transparency
C. Accountability
D. Exclusiveness
E. Reliability and safety

Correct Answers

A. Fairness
B. Transparency
C. Accountability
E. Reliability and safety

Explanation

Microsoft’s Responsible AI principles include:

Fairness
Reliability and safety
Privacy and security
Inclusiveness
Transparency
Accountability

Question 9

What is the PRIMARY purpose of a system prompt in generative AI?

A. To define the behavior and rules for the AI model
B. To increase internet speed
C. To encrypt databases
D. To replace APIs

Correct Answer

A. To define the behavior and rules for the AI model

Question 10

You need to identify cars, bicycles, and pedestrians in traffic-camera footage.

Which AI capability should you use?

A. OCR
B. Object detection
C. Sentiment analysis
D. Translation

Correct Answer

B. Object detection

Question 11

FILL IN THE BLANK

__________ converts spoken language into machine-readable text.

Correct Answer

Speech recognition

Question 12

Which statement about generative AI models is TRUE?

A. They only analyze spreadsheets
B. They can generate new content such as text and images
C. They cannot process natural language
D. They only work offline

Correct Answer

B. They can generate new content such as text and images

Question 13

You are designing an AI solution for visually impaired users that describes images aloud.

Which capability is MOST appropriate?

A. Image captioning
B. Forecasting
C. Regression
D. Clustering

Correct Answer

A. Image captioning

Question 14

Which authentication method helps secure access to Azure AI services?

A. API keys
B. Printer drivers
C. HDMI cables
D. Browser bookmarks

Correct Answer

A. API keys

Question 15

MULTIPLE ANSWER

Which tasks are examples of natural language processing (NLP)?

Select ALL that apply.

A. Language translation
B. Sentiment analysis
C. Image classification
D. Text summarization
E. Entity extraction

Correct Answers

A. Language translation
B. Sentiment analysis
D. Text summarization
E. Entity extraction

Question 16

Which AI workload predicts categories such as “approved” or “denied”?

A. Regression
B. Classification
C. Clustering
D. Computer vision

Correct Answer

B. Classification

Question 17

You are using Azure AI Foundry to deploy a generative AI model.

What must happen before applications can interact with the model?

A. The model must be deployed to an endpoint
B. The model must be printed
C. The operating system must be replaced
D. The database must be deleted

Correct Answer

A. The model must be deployed to an endpoint

Question 18

HOTSPOT / MATCHING

Match each workload with the correct example.

Workload	Example
Speech AI	?
Computer Vision	?
Generative AI	?

Options:

Detecting objects in images
Generating marketing text
Transcribing audio recordings

Correct Answers

Workload	Example
Speech AI	Transcribing audio recordings
Computer Vision	Detecting objects in images
Generative AI	Generating marketing text

Question 19

What is a hallucination in generative AI?

A. A hardware failure
B. A networking issue
C. An incorrect or fabricated AI-generated response
D. A database backup

Correct Answer

C. An incorrect or fabricated AI-generated response

Question 20

Which factor can reduce speech-recognition accuracy?

A. Background noise
B. High-quality microphones
C. Clear pronunciation
D. Stable internet connections

Correct Answer

A. Background noise

Question 21

You need to group customers into segments based on purchasing behavior without predefined labels.

Which machine learning technique should you use?

A. Classification
B. Regression
C. Clustering
D. OCR

Correct Answer

C. Clustering

Question 22

MULTIPLE ANSWER

Which capabilities are associated with Azure AI Speech services?

Select ALL that apply.

A. Speech recognition
B. Speech synthesis
C. Translation
D. Object detection
E. Speaker identification

Correct Answers

A. Speech recognition
B. Speech synthesis
C. Translation
E. Speaker identification

Question 23

Which Responsible AI principle emphasizes explaining how AI systems make decisions?

A. Transparency
B. Privacy
C. Inclusiveness
D. Reliability

Correct Answer

A. Transparency

Question 24

FILL IN THE BLANK

__________ extracts machine-readable text from images and scanned documents.

Correct Answer

OCR

Optical Character Recognition

Question 25

A company wants to automatically summarize long customer-support conversations.

Which AI capability should they use?

A. Text summarization
B. Object detection
C. Forecasting
D. Regression

Correct Answer

A. Text summarization

Question 26

You need an AI system that can understand both images and text prompts.

Which type of model should you use?

A. Multimodal model
B. Regression model
C. Clustering model
D. Time-series model

Correct Answer

A. Multimodal model

Question 27

MULTIPLE ANSWER

Which are benefits of cloud-based AI services?

Select ALL that apply.

A. Scalability
B. Reduced infrastructure management
C. Automatic access to pretrained models
D. Elimination of all security concerns
E. Faster deployment

Correct Answers

A. Scalability
B. Reduced infrastructure management
C. Automatic access to pretrained models
E. Faster deployment

Question 28

You are creating a lightweight application that sends images to Azure AI services for analysis.

How does the application typically communicate with the service?

A. Through APIs and endpoints
B. Through printer drivers
C. Through USB storage devices
D. Through monitor settings

Correct Answer

A. Through APIs and endpoints

Question 29

Which AI capability is MOST useful for detecting the emotional tone of customer reviews?

A. OCR
B. Sentiment analysis
C. Image classification
D. Speech synthesis

Correct Answer

B. Sentiment analysis

Question 30

SCENARIO-BASED QUESTION

A retail company wants an AI solution that:

Extracts text from receipts
Detects products in shelf images
Analyzes customer-service calls
Generates chatbot responses

Which AI workloads are required?

A. OCR, object detection, speech AI, and generative AI
B. Regression only
C. Classification only
D. Forecasting and clustering only

Correct Answer

A. OCR, object detection, speech AI, and generative AI

Explanation

The scenario requires multiple AI capabilities:

OCR for receipt text extraction
Object detection for shelf-image analysis
Speech AI for customer-call analysis
Generative AI for chatbot responses

AI, AI-901, azure, Microsoft Certification May 18, 2026

Build a lightweight application with Information Extraction capabilities by using Content Understanding (AI-901 Exam Prep)

This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Implement AI solutions by using Microsoft Foundry (55–60%)
   --> Implement AI solutions for information extraction by using Foundry
      --> Build a lightweight application with Information Extraction capabilities by using Content Understanding

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Modern organizations often need applications that can automatically extract information from documents, images, audio, and video. Azure AI services and Microsoft Foundry tools make it possible to create lightweight applications that use AI-powered content understanding without requiring advanced machine learning expertise.

For the AI-901 certification exam, candidates should understand the foundational concepts involved in building lightweight applications with information extraction capabilities by using Azure Content Understanding and Microsoft Foundry.

This topic falls under the “Implement AI solutions for information extraction by using Foundry” section of the AI-901 exam objectives.

What Is Information Extraction?

Information extraction is the process of automatically identifying and retrieving useful data from content.

AI systems can extract information from:

Documents
Images
Audio
Video
Text

Examples include:

Names
Dates
Invoice totals
Keywords
Objects
Spoken words

What Is Azure Content Understanding?

Azure Content Understanding enables AI-powered analysis of different types of content.

Capabilities include:

OCR (Optical Character Recognition)
Speech recognition
Entity extraction
Image analysis
Video analysis
Classification
Caption generation

What Is a Lightweight Application?

A lightweight application is a simple application that performs focused tasks using cloud-based AI services.

Characteristics include:

Minimal infrastructure
API-based communication
Rapid development
Simple user interface
Cloud-hosted AI processing

For AI-901, candidates should understand concepts and workflows rather than advanced coding details.

Azure AI Foundry

Azure AI Foundry provides tools for building and testing AI applications.

Developers can:

Access AI models
Configure services
Test prompts
Analyze content
Build AI-powered workflows

Common Information Extraction Capabilities

OCR (Optical Character Recognition)

OCR extracts text from images and scanned documents.

Example

Input

Photo of a receipt

Output

Store name
Total amount
Purchase date

Entity Extraction

AI systems can identify important entities within content.

Examples of Entities

Names
Locations
Organizations
Phone numbers
Dates

Speech Recognition

Speech recognition converts spoken language into text.

Example

Input

Customer support call recording

Output

Searchable transcript

Object Detection

Object detection identifies objects within images or video.

Example

A warehouse-monitoring application may detect:

Boxes
Forklifts
Employees

Sentiment Analysis

Sentiment analysis determines emotional tone.

Example

Customer feedback classified as:

Positive
Neutral
Negative

Typical Lightweight Application Workflow

A lightweight information-extraction application often follows these steps:

User uploads content
Application sends content to Azure AI service
AI analyzes content
Structured results are returned
Application displays extracted information

Example Workflow

User uploads:

Image
PDF
Audio file
Video file

AI extracts:

Text
Keywords
Objects
Entities
Captions

APIs and Endpoints

Applications communicate with Azure AI services through:

APIs
Endpoints

The application sends content to the AI service and receives structured results.

Authentication

Applications must authenticate securely before using Azure AI services.

Common authentication methods include:

API keys
Azure credentials
Managed identities

Example High-Level Pseudocode

			
content = upload_file()
results = analyze_content(content)
display_results(results)

For AI-901, understanding the workflow is more important than memorizing exact syntax.

Structured Outputs

AI systems often return structured data formats such as:

JSON
Tables
Lists
Metadata

Structured outputs make integration easier.

Example JSON-Like Output

			
{
  "invoiceNumber": "INV-1001",
  "date": "2026-05-15",
  "total": "$245.99"
}

		

Common Real-World Scenarios

Scenario 1: Invoice Processing

Goal

Automatically extract invoice data.

Extracted Information

Vendor name
Invoice number
Total amount
Due date

Scenario 2: Customer Service Analytics

Goal

Analyze customer interactions.

Extracted Information

Topics
Sentiment
Keywords
Transcripts

Scenario 3: Healthcare Document Analysis

Goal

Extract information from medical documents.

Extracted Information

Patient names
Dates
Medical terms

Scenario 4: Media Monitoring

Goal

Analyze audio and video content.

Extracted Information

Captions
Objects
Speakers
Keywords

Responsible AI Considerations

Information-extraction applications should follow Responsible AI principles.

Key considerations include:

Privacy
Fairness
Transparency
Inclusiveness
Accountability
Security

Privacy Concerns

Content may contain:

Personal information
Financial records
Medical data
Private conversations

Organizations should secure sensitive data appropriately.

Fairness and Bias

AI systems may perform differently across:

Languages
Accents
Demographics
Image quality
Environmental conditions

Testing and evaluation are important.

Transparency

Users should understand:

AI is analyzing their content
AI-generated outputs may contain errors
Human review may still be needed

Accuracy Limitations

Information-extraction systems may struggle with:

Blurry images
Poor audio quality
Handwritten text
Background noise
Low-resolution files

Hallucinations and Errors

AI systems may occasionally:

Extract incorrect information
Misidentify objects
Misinterpret speech
Generate inaccurate summaries

Applications should validate important outputs.

Error Handling

Applications should handle:

Unsupported file formats
Corrupted files
Authentication failures
Network interruptions
Rate limits

Advantages of Lightweight AI Applications

Benefits include:

Rapid deployment
Reduced development complexity
Scalability
Automation
Faster information processing

Limitations of Lightweight AI Applications

Challenges include:

Dependence on cloud services
Accuracy limitations
Privacy concerns
Potential bias
Environmental variability

Multimodal AI

Modern AI systems can combine:

Text
Speech
Vision
Generative AI

These systems can process multiple content types together.

High-Level Architecture

A simplified architecture often includes:

User uploads content
Application sends content to Azure AI service
AI analyzes content
Structured results are returned
Application displays extracted information

Important AI-901 Exam Tips

For the exam, remember these key points:

Information extraction retrieves useful data from content.
OCR extracts text from images and documents.
Speech recognition converts speech into text.
Object detection identifies objects within images or video.
APIs and endpoints connect applications to Azure AI services.
Authentication secures access to AI resources.
Structured outputs often use JSON-like formats.
Responsible AI principles apply to information extraction systems.
Poor-quality content can reduce accuracy.
Hallucinations are inaccurate AI-generated outputs.
Azure AI Foundry supports AI application development.

Quick Knowledge Check

Question 1

What does OCR do?

Answer

Extracts text from images and scanned documents.

Question 2

What does speech recognition do?

Answer

Converts spoken language into text.

Question 3

Why is authentication important?

Answer

It secures access to Azure AI services.

Question 4

What can reduce information-extraction accuracy?

Answer

Poor-quality images, background noise, and blurry documents.

Practice Exam Questions

Exam: AI-901

Topic: Build a Lightweight Application with Information Extraction Capabilities by Using Content Understanding

Question 1

What is the PRIMARY purpose of information extraction in AI applications?

A. To automatically retrieve useful data from content
B. To increase internet speed
C. To replace operating systems
D. To improve monitor resolution

Correct Answer

A. To automatically retrieve useful data from content

Explanation

Information extraction uses AI to identify and retrieve meaningful data from documents, images, audio, video, and text.

Why the Other Answers Are Incorrect

B. To increase internet speed

Information extraction does not improve networking performance.

C. To replace operating systems

AI extraction tools do not replace operating systems.

D. To improve monitor resolution

This is unrelated to AI information extraction.

Question 2

What does OCR stand for?

A. Optical Character Recognition
B. Open Cloud Routing
C. Operational Content Reporting
D. Object Classification Retrieval

Correct Answer

A. Optical Character Recognition

Explanation

OCR extracts machine-readable text from images and scanned documents.

Why the Other Answers Are Incorrect

B. Open Cloud Routing

This is not an OCR term.

C. Operational Content Reporting

This is unrelated to text extraction.

D. Object Classification Retrieval

This is not the meaning of OCR.

Question 3

Which AI capability converts spoken language into text?

A. Speech recognition
B. Image classification
C. Speech synthesis
D. Object detection

Correct Answer

A. Speech recognition

Explanation

Speech recognition transcribes spoken words into text.

Why the Other Answers Are Incorrect

B. Image classification

This categorizes images.

C. Speech synthesis

This converts text into spoken audio.

D. Object detection

This identifies objects within images or video.

Question 4

What is a lightweight AI application?

A. A simple application that uses cloud AI services for focused tasks
B. A hardware-only system
C. A networking device
D. A spreadsheet management tool

Correct Answer

A. A simple application that uses cloud AI services for focused tasks

Explanation

Lightweight applications typically use APIs and cloud services to provide AI capabilities without requiring complex infrastructure.

Why the Other Answers Are Incorrect

B. A hardware-only system

Lightweight AI apps commonly use cloud services.

C. A networking device

Networking devices are unrelated.

D. A spreadsheet management tool

This is unrelated to AI application design.

Question 5

How do lightweight AI applications commonly communicate with Azure AI services?

A. Through APIs and endpoints
B. Through printer drivers
C. Through monitor settings
D. Through USB-only connections

Correct Answer

A. Through APIs and endpoints

Explanation

Applications use APIs and endpoints to send content to Azure AI services and receive analysis results.

Why the Other Answers Are Incorrect

B. Through printer drivers

Printers are unrelated to Azure AI communication.

C. Through monitor settings

This is unrelated to cloud AI services.

D. Through USB-only connections

Cloud AI services use network communication.

Question 6

Why is authentication important in Azure AI applications?

A. To secure access to AI resources
B. To improve image brightness
C. To increase network speed
D. To improve speaker volume

Correct Answer

A. To secure access to AI resources

Explanation

Authentication ensures that only authorized users and applications can access Azure AI services.

Why the Other Answers Are Incorrect

B. To improve image brightness

Authentication does not affect image quality.

C. To increase network speed

Authentication does not improve networking.

D. To improve speaker volume

Authentication does not affect audio playback.

Question 7

Which format is commonly used for structured AI output data?

A. JSON
B. JPEG
C. MP3
D. ZIP

Correct Answer

A. JSON

Explanation

AI systems often return structured data in JSON-like formats for easy application integration.

Why the Other Answers Are Incorrect

B. JPEG

JPEG is an image format.

C. MP3

MP3 is an audio format.

D. ZIP

ZIP is a compressed archive format.

Question 8

Which factor can reduce information-extraction accuracy?

A. Poor-quality input content
B. Spreadsheet formatting
C. Keyboard layout changes
D. Screen brightness settings

Correct Answer

A. Poor-quality input content

Explanation

Blurry images, poor audio quality, and noisy environments can negatively affect AI extraction accuracy.

Why the Other Answers Are Incorrect

B. Spreadsheet formatting

This does not affect AI extraction services.

C. Keyboard layout changes

This is unrelated to AI analysis.

D. Screen brightness settings

This does not affect AI processing accuracy.

Question 9

Which Responsible AI concern is especially important for information extraction applications?

A. Protecting sensitive personal data
B. Increasing printer performance
C. Improving spreadsheet formulas
D. Reducing monitor power usage

Correct Answer

A. Protecting sensitive personal data

Explanation

Extracted content may contain financial, medical, or personal information that must be protected securely.

Why the Other Answers Are Incorrect

B. Increasing printer performance

This is unrelated to Responsible AI.

C. Improving spreadsheet formulas

This is unrelated to information extraction.

D. Reducing monitor power usage

This is unrelated to AI ethics.

Question 10

What are hallucinations in AI information-extraction systems?

A. Incorrect or fabricated AI-generated outputs
B. Hardware installation failures
C. Network outages
D. Operating system crashes

Correct Answer

A. Incorrect or fabricated AI-generated outputs

Explanation

Hallucinations occur when AI systems generate inaccurate extracted information, captions, summaries, or identifications.

Why the Other Answers Are Incorrect

B. Hardware installation failures

This is unrelated to AI-generated outputs.

C. Network outages

This is a connectivity issue.

D. Operating system crashes

This is unrelated to AI hallucinations.

Final Thoughts

Building lightweight applications with information extraction capabilities is an important topic for the AI-901 certification exam. Microsoft expects candidates to understand foundational concepts such as OCR, speech recognition, APIs, authentication, structured outputs, Responsible AI principles, and lightweight AI workflows.

Azure AI services and Azure AI Foundry provide powerful tools for creating scalable applications capable of extracting valuable information from text, images, audio, video, and documents.

Go to the AI-901 Exam Prep Hub main page

AI, AI-901, azure, Microsoft Certification May 18, 2026

Extract information from audio and video by using Content Understanding (AI-901 Exam Prep)

This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Implement AI solutions by using Microsoft Foundry (55–60%)
   --> Implement AI solutions for information extraction by using Foundry
      --> Extract information from audio and video by using Content Understanding

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Organizations increasingly rely on AI systems to analyze audio and video content for automation, accessibility, security, analytics, and customer experiences. AI-powered content understanding solutions can extract valuable information from spoken language, sounds, images, and moving video streams.

For the AI-901 certification exam, candidates should understand the foundational concepts behind extracting information from audio and video by using Azure Content Understanding and Microsoft Foundry tools.

This topic falls under the “Implement AI solutions for information extraction by using Foundry” section of the AI-901 exam objectives.

What Is Content Understanding?

Content understanding refers to AI systems analyzing and interpreting different forms of content, including:

Audio
Video
Images
Documents
Text

AI systems can identify patterns, extract information, and generate useful insights.

Azure Content Understanding

Azure Content Understanding enables AI-powered analysis of multimedia content.

Capabilities include:

Speech recognition
Video analysis
Speaker identification
Caption generation
Object detection
Keyword extraction

Azure AI Foundry

Azure AI Foundry provides tools for building, testing, and managing AI applications.

Developers can:

Deploy AI services
Process multimedia content
Build lightweight applications
Test AI workflows

Audio Information Extraction

AI systems can analyze audio files to extract useful information.

Examples include:

Spoken words
Speaker identity
Keywords
Emotions
Language detection

Speech Recognition

Speech recognition converts spoken language into text.

Example

Input

Audio recording of a meeting

Output

Meeting transcript

Speaker Identification

AI systems can distinguish between different speakers.

Example

A meeting transcription may identify:

Speaker 1
Speaker 2
Speaker 3

Language Detection

AI systems can identify the spoken language within audio content.

Example

An AI system determines whether audio is:

English
Spanish
French
Japanese

Keyword Extraction

AI systems can identify important terms within conversations.

Example

A customer support call may extract:

Product names
Complaint topics
Order numbers

Sentiment Analysis

AI systems can analyze emotional tone in speech.

Example

A customer call may be classified as:

Positive
Neutral
Negative

Video Information Extraction

Video analysis combines:

Audio analysis
Image analysis
Motion analysis

Common Video Analysis Capabilities

AI systems may perform:

Object detection
Facial analysis
Activity recognition
Scene description
Text extraction
Caption generation

Object Detection in Video

AI systems can identify objects appearing in video frames.

Example

A traffic-monitoring system may detect:

Cars
Trucks
Pedestrians
Traffic lights

Scene Detection

AI systems can identify scene changes within videos.

Example

A sports video may identify:

Game start
Replay segments
Commercial breaks

Video Captioning

AI systems can generate descriptions or subtitles for videos.

Example

A training video may automatically generate captions for accessibility.

Optical Character Recognition (OCR) in Video

AI systems can extract text appearing in video frames.

Example

A video may contain:

Street signs
License plates
Product labels

APIs and Endpoints

Applications communicate with Azure AI services using:

APIs
Endpoints

Audio and video content is submitted programmatically for analysis.

Authentication

Applications must securely authenticate before accessing Azure AI services.

Common authentication methods include:

API keys
Azure credentials
Managed identities

Lightweight Application Workflow

A typical workflow includes:

User uploads audio or video
Application sends content to AI service
AI analyzes multimedia content
Results are returned
Application displays extracted information

Example High-Level Pseudocode

			
media = upload_media()
results = analyze_media(media)
display_results(results)

For AI-901, understanding the workflow is more important than memorizing exact syntax.

Common Real-World Scenarios

Scenario 1: Meeting Transcription

Goal

Convert meeting audio into searchable text.

Features

Speech recognition
Speaker identification
Keyword extraction

Scenario 2: Call Center Analytics

Goal

Analyze customer service calls.

Features

Sentiment analysis
Topic extraction
Call summarization

Scenario 3: Security Monitoring

Goal

Analyze surveillance video.

Features

Object detection
Activity recognition
Facial analysis

Scenario 4: Video Accessibility

Goal

Improve accessibility for multimedia content.

Features

Caption generation
Speech transcription
Scene descriptions

Responsible AI Considerations

Audio and video AI systems should follow Responsible AI principles.

Key considerations include:

Privacy
Fairness
Transparency
Inclusiveness
Accountability
Security

Privacy Concerns

Audio and video may contain:

Personal conversations
Faces
Biometric data
Sensitive information

Organizations should protect multimedia data appropriately.

Fairness and Bias

Speech and video systems may perform differently across:

Languages
Accents
Dialects
Lighting conditions
Demographics

Testing and evaluation are important.

Transparency

Users should understand:

AI is analyzing multimedia content
AI-generated outputs may contain errors
Human review may still be needed

Accuracy Limitations

Audio and video analysis systems may struggle with:

Background noise
Poor audio quality
Low-resolution video
Obstructed visuals
Multiple overlapping speakers

Hallucinations and Errors

AI systems may occasionally:

Misidentify speakers
Generate inaccurate captions
Misinterpret speech
Detect nonexistent objects

Applications should validate important outputs.

Error Handling

Applications should handle:

Unsupported file formats
Corrupted media files
Authentication failures
Network interruptions
Rate limits

Advantages of Multimedia Information Extraction

Benefits include:

Automation
Faster analysis
Improved accessibility
Searchable content
Scalable processing

Limitations of Multimedia Information Extraction

Challenges include:

Privacy concerns
Accuracy limitations
Bias
Environmental variability
Ethical considerations

Multimodal AI

Modern AI systems may combine:

Speech
Vision
Text
Generative AI

These systems can:

Analyze multimedia content
Answer questions
Generate summaries
Create captions and descriptions

High-Level Architecture

A simplified architecture often includes:

User uploads audio/video
Application sends media to Azure AI service
AI processes multimedia content
Structured results are returned
Application displays extracted information

Important AI-901 Exam Tips

For the exam, remember these key points:

Speech recognition converts speech to text.
Speaker identification distinguishes speakers.
Sentiment analysis detects emotional tone.
OCR can extract text from video frames.
Object detection identifies objects in video.
APIs and endpoints connect applications to AI services.
Authentication secures AI resources.
Responsible AI principles apply to multimedia AI systems.
Poor audio or video quality can reduce accuracy.
Hallucinations are inaccurate AI-generated outputs.
Azure AI Foundry supports multimedia AI application development.

Quick Knowledge Check

Question 1

What does speech recognition do?

Answer

Converts spoken language into text.

Question 2

What is speaker identification?

Answer

Distinguishing between different speakers in audio content.

Question 3

Why is authentication important?

Answer

It secures access to Azure AI services.

Question 4

What can reduce multimedia-analysis accuracy?

Answer

Background noise, low-quality audio, and poor video quality.

Practice Exam Questions

Exam: AI-901

Topic: Extract Information from Audio and Video by Using Content Understanding

Question 1

What is the PRIMARY purpose of content understanding in AI systems?

A. To analyze and interpret multimedia content such as audio and video
B. To increase internet bandwidth
C. To replace operating systems
D. To improve keyboard performance

Correct Answer

A. To analyze and interpret multimedia content such as audio and video

Explanation

Content understanding enables AI systems to analyze audio, video, images, and other forms of content to extract useful information.

Why the Other Answers Are Incorrect

B. To increase internet bandwidth

Content understanding does not improve networking speed.

C. To replace operating systems

AI multimedia analysis does not replace operating systems.

D. To improve keyboard performance

This is unrelated to AI content understanding.

Question 2

What does speech recognition do?

A. Converts spoken language into text
B. Converts images into audio
C. Encrypts media files
D. Repairs damaged videos

Correct Answer

A. Converts spoken language into text

Explanation

Speech recognition transcribes spoken words into machine-readable text.

Why the Other Answers Are Incorrect

B. Converts images into audio

This is unrelated to speech recognition.

C. Encrypts media files

Encryption is unrelated to speech transcription.

D. Repairs damaged videos

Speech recognition does not repair media files.

Question 3

Which AI capability identifies different speakers in an audio recording?

A. Speaker identification
B. OCR
C. Image classification
D. Object compression

Correct Answer

A. Speaker identification

Explanation

Speaker identification distinguishes between different speakers within audio content.

Why the Other Answers Are Incorrect

B. OCR

OCR extracts text from images.

C. Image classification

This categorizes images.

D. Object compression

This is not a multimedia AI capability.

Question 4

What is sentiment analysis used for in audio processing?

A. Detecting emotional tone in speech
B. Increasing audio volume
C. Compressing audio files
D. Repairing broken microphones

Correct Answer

A. Detecting emotional tone in speech

Explanation

Sentiment analysis identifies whether speech content is positive, negative, or neutral.

Why the Other Answers Are Incorrect

B. Increasing audio volume

This is unrelated to AI analysis.

C. Compressing audio files

Compression is unrelated to sentiment detection.

D. Repairing broken microphones

This is a hardware issue.

Question 5

Which AI capability can extract text from video frames?

A. OCR
B. Speech synthesis
C. Audio normalization
D. File compression

Correct Answer

A. OCR

Explanation

OCR can identify and extract text that appears visually within video frames.

Why the Other Answers Are Incorrect

B. Speech synthesis

This converts text into speech.

C. Audio normalization

This adjusts sound levels.

D. File compression

This reduces file size.

Question 6

How do lightweight multimedia-analysis applications typically communicate with Azure AI services?

A. Through APIs and endpoints
B. Through printer drivers
C. Through monitor settings
D. Through USB-only connections

Correct Answer

A. Through APIs and endpoints

Explanation

Applications use APIs and endpoints to send audio and video content to Azure AI services for analysis.

Why the Other Answers Are Incorrect

B. Through printer drivers

Printers are unrelated to multimedia AI communication.

C. Through monitor settings

This is unrelated to cloud AI services.

D. Through USB-only connections

Cloud AI services use network communication.

Question 7

Why is authentication important when using Azure AI multimedia services?

A. To secure access to AI resources
B. To improve speaker volume
C. To increase internet speed
D. To improve video resolution

Correct Answer

A. To secure access to AI resources

Explanation

Authentication ensures that only authorized users and applications can access Azure AI services.

Why the Other Answers Are Incorrect

B. To improve speaker volume

Authentication does not affect sound levels.

C. To increase internet speed

Authentication does not improve networking.

D. To improve video resolution

Authentication does not affect video quality.

Question 8

Which factor can reduce speech-recognition accuracy?

A. Background noise
B. Spreadsheet formatting
C. Keyboard layout changes
D. Monitor brightness

Correct Answer

A. Background noise

Explanation

Noise and poor audio quality can make it difficult for AI systems to correctly recognize speech.

Why the Other Answers Are Incorrect

B. Spreadsheet formatting

This does not affect audio AI systems.

C. Keyboard layout changes

This is unrelated to speech recognition.

D. Monitor brightness

This does not affect audio analysis.

Question 9

Which Responsible AI concern is especially important for audio and video analysis systems?

A. Protecting sensitive personal information
B. Increasing printer speed
C. Improving spreadsheet formulas
D. Reducing file storage costs

Correct Answer

A. Protecting sensitive personal information

Explanation

Audio and video files may contain faces, voices, and personal conversations that require privacy protection.

Why the Other Answers Are Incorrect

B. Increasing printer speed

This is unrelated to Responsible AI.

C. Improving spreadsheet formulas

This is unrelated to multimedia analysis.

D. Reducing file storage costs

This is not a Responsible AI principle.

Question 10

What are hallucinations in multimedia AI systems?

A. Incorrect or fabricated AI-generated outputs
B. Hardware installation failures
C. Network outages
D. Speaker hardware malfunctions

Correct Answer

A. Incorrect or fabricated AI-generated outputs

Explanation

Hallucinations occur when AI systems produce inaccurate captions, object detections, speaker identifications, or transcriptions.

Why the Other Answers Are Incorrect

B. Hardware installation failures

This is unrelated to AI-generated outputs.

C. Network outages

This is a connectivity issue.

D. Speaker hardware malfunctions

This is a hardware problem, not an AI hallucination.

Final Thoughts

Extracting information from audio and video by using Content Understanding is an important topic for the AI-901 certification exam. Microsoft expects candidates to understand foundational concepts such as speech recognition, video analysis, OCR, APIs, authentication, Responsible AI principles, and lightweight multimedia-analysis workflows.

Azure AI services and Azure AI Foundry provide powerful tools for building intelligent multimedia applications capable of understanding spoken language, video content, and visual information at scale.

Go to the AI-901 Exam Prep Hub main page

AI, AI-901, Artificial Intelligence (AI), Microsoft Certification May 18, 2026May 18, 2026

Extract information from images by using Content Understanding (AI-901 Exam Prep)

This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Implement AI solutions by using Microsoft Foundry (55–60%)
   --> Implement AI solutions for information extraction by using Foundry
      --> Extract information from images by using Content Understanding

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Modern AI systems can analyze images and extract meaningful information automatically. Organizations use image analysis solutions for automation, accessibility, security, healthcare, retail, and business intelligence.

For the AI-901 certification exam, candidates should understand the foundational concepts behind extracting information from images by using Azure Content Understanding and Microsoft Foundry tools.

This topic falls under the “Implement AI solutions for information extraction by using Foundry” section of the AI-901 exam objectives.

What Is Image Information Extraction?

Image information extraction is the process of analyzing images to identify and retrieve useful information.

AI systems can detect:

Text
Objects
Faces
Colors
Products
Landmarks
Visual patterns

What Is Azure Content Understanding?

Azure Content Understanding enables AI systems to interpret and analyze content such as:

Images
Documents
Audio
Video

Capabilities include:

OCR
Object detection
Classification
Caption generation
Metadata extraction

Azure AI Foundry

Azure AI Foundry provides tools for building, testing, and managing AI-powered applications.

Developers can:

Access AI models
Analyze images
Build lightweight applications
Test AI workflows

Common Image Extraction Techniques

Optical Character Recognition (OCR)

OCR extracts text from images.

Example

Image

Photo of a street sign

OCR Output

“Main Street”

Object Detection

Object detection identifies objects and their locations within images.

Example

Detected Objects

Car
Bicycle
Traffic light
Person

Image Classification

Image classification determines the overall category of an image.

Example

Image

Photo of a cat

Classification

“Cat”

Facial Analysis

AI systems can analyze facial characteristics.

Capabilities may include:

Face detection
Emotion analysis
Age estimation

Responsible AI considerations are especially important for facial-analysis systems.

Image Captioning

Image captioning generates natural-language descriptions of images.

Example

Image

A dog running on a beach

Caption

“A brown dog running along a sandy beach.”

Metadata Extraction

AI systems can extract metadata and contextual information from images.

Examples include:

Time
Location
Camera details
Image dimensions

Barcode and QR Code Detection

AI systems can identify and decode:

Barcodes
QR codes

Example

Retail applications may scan product barcodes for inventory management.

APIs and Endpoints

Applications communicate with Azure AI services using:

APIs
Endpoints

Images are submitted programmatically for analysis.

Authentication

Applications must securely authenticate before accessing AI services.

Common methods include:

API keys
Azure credentials
Managed identities

Lightweight Application Workflow

A typical workflow includes:

User uploads image
Application sends image to AI service
AI analyzes image
Results are returned
Application displays extracted information

Example High-Level Pseudocode

			
image = upload_image()
results = analyze_image(image)
display_results(results)

For AI-901, understanding the workflow is more important than memorizing exact syntax.

Common Real-World Scenarios

Scenario 1: Receipt Scanner

Goal

Extract purchase details from receipt images.

Features

OCR
Table extraction
Total amount detection

Scenario 2: Accessibility Assistant

Goal

Describe images for visually impaired users.

Features

Image captioning
OCR
Object detection

Scenario 3: Retail Inventory

Goal

Identify products from shelf images.

Features

Barcode scanning
Object detection
Classification

Scenario 4: Traffic Monitoring

Goal

Analyze roadway images.

Features

Vehicle detection
Traffic analysis
License plate reading

Responsible AI Considerations

Image-analysis applications should follow Responsible AI principles.

Key considerations include:

Privacy
Fairness
Transparency
Inclusiveness
Accountability
Security

Privacy Concerns

Images may contain:

Faces
Personal information
License plates
Sensitive documents

Organizations should protect image data appropriately.

Fairness and Bias

Vision systems may perform differently across:

Lighting conditions
Skin tones
Environmental conditions
Camera quality

Testing and evaluation are important.

Transparency

Users should understand:

AI is analyzing images
AI-generated outputs may contain errors
Images may be processed in the cloud

Accuracy Limitations

Image extraction systems may struggle with:

Blurry images
Poor lighting
Obstructed objects
Low-resolution images

Hallucinations and Errors

AI systems may occasionally:

Misidentify objects
Generate incorrect captions
Extract inaccurate text

Applications should validate important outputs.

Error Handling

Applications should handle:

Unsupported image formats
Corrupted files
Authentication failures
Network interruptions
Rate limits

Advantages of Image Extraction AI

Benefits include:

Faster processing
Automation
Scalability
Accessibility improvements
Reduced manual work

Limitations of Image Extraction AI

Challenges include:

Accuracy limitations
Bias
Privacy concerns
Environmental variability
Ethical considerations

Multimodal AI

Some modern AI systems combine:

Vision
Text
Speech
Generative AI

These systems can:

Analyze images
Answer visual questions
Generate descriptions
Create new content

High-Level Architecture

A simplified architecture often includes:

User uploads image
Application sends image to Azure AI service
AI processes image
Structured results are returned
Application displays information

Important AI-901 Exam Tips

For the exam, remember these key points:

OCR extracts text from images.
Object detection identifies objects and locations.
Image classification categorizes images.
Image captioning generates natural-language descriptions.
APIs and endpoints connect applications to AI services.
Authentication secures access to AI resources.
Responsible AI principles apply to image-analysis systems.
Poor image quality can reduce accuracy.
Hallucinations are inaccurate AI-generated outputs.
Azure AI Foundry supports AI application development.

Quick Knowledge Check

Question 1

What does OCR do?

Answer

Extracts machine-readable text from images.

Question 2

What is object detection?

Answer

Identifying and locating objects within an image.

Question 3

Why is authentication important?

Answer

It secures access to Azure AI services.

Question 4

What can reduce image-analysis accuracy?

Answer

Poor lighting, blur, and low-resolution images.

Practice Exam Questions

Exam: AI-901

Topic: Extract Information from Images by Using Content Understanding

Question 1

What is the PRIMARY purpose of image information extraction?

A. To analyze images and retrieve useful information
B. To increase internet bandwidth
C. To manage operating systems
D. To improve printer performance

Correct Answer

A. To analyze images and retrieve useful information

Explanation

Image information extraction uses AI to identify and retrieve meaningful data from images, such as text, objects, and visual patterns.

Why the Other Answers Are Incorrect

B. To increase internet bandwidth

Image analysis does not affect networking speed.

C. To manage operating systems

This is unrelated to computer vision.

D. To improve printer performance

Printers are unrelated to AI image extraction.

Question 2

What does OCR stand for?

A. Optical Character Recognition
B. Open Content Routing
C. Object Classification Reporting
D. Operational Cloud Rendering

Correct Answer

A. Optical Character Recognition

Explanation

OCR extracts machine-readable text from images and scanned documents.

Why the Other Answers Are Incorrect

B. Open Content Routing

This is not the meaning of OCR.

C. Object Classification Reporting

This is unrelated to text extraction.

D. Operational Cloud Rendering

This is not an OCR term.

Question 3

Which computer vision capability identifies multiple objects and their locations within an image?

A. Object detection
B. Speech synthesis
C. Text summarization
D. Audio transcription

Correct Answer

A. Object detection

Explanation

Object detection identifies objects and determines where they appear within an image.

Why the Other Answers Are Incorrect

B. Speech synthesis

This converts text into speech.

C. Text summarization

This is a text-analysis task.

D. Audio transcription

This converts speech into text.

Question 4

What is image classification?

A. Categorizing an image based on its contents
B. Compressing image file sizes
C. Encrypting image data
D. Converting images into spreadsheets

Correct Answer

A. Categorizing an image based on its contents

Explanation

Image classification determines the overall category or subject represented in an image.

Why the Other Answers Are Incorrect

B. Compressing image file sizes

Compression is unrelated to classification.

C. Encrypting image data

Encryption is unrelated to image categorization.

D. Converting images into spreadsheets

This is unrelated to computer vision.

Question 5

What does image captioning do?

A. Generates natural-language descriptions of images
B. Repairs corrupted image files
C. Converts speech into text
D. Improves internet speeds

Correct Answer

A. Generates natural-language descriptions of images

Explanation

Image captioning creates descriptive text that explains the contents of an image.

Why the Other Answers Are Incorrect

B. Repairs corrupted image files

This is unrelated to caption generation.

C. Converts speech into text

This is speech recognition.

D. Improves internet speeds

This is unrelated to AI image analysis.

Question 6

How do lightweight image-analysis applications typically communicate with Azure AI services?

A. Through APIs and endpoints
B. Through printer drivers
C. Through monitor settings
D. Through USB-only connections

Correct Answer

A. Through APIs and endpoints

Explanation

Applications send images to cloud AI services through APIs and service endpoints.

Why the Other Answers Are Incorrect

B. Through printer drivers

Printers are unrelated to AI communication.

C. Through monitor settings

This is unrelated to cloud AI services.

D. Through USB-only connections

Cloud services use network communication.

Question 7

Why is authentication important when using Azure AI services?

A. To secure access to AI resources
B. To improve image brightness
C. To reduce image resolution
D. To increase network speed

Correct Answer

A. To secure access to AI resources

Explanation

Authentication ensures that only authorized users and applications can access Azure AI services.

Why the Other Answers Are Incorrect

B. To improve image brightness

Authentication does not affect image quality.

C. To reduce image resolution

Authentication is unrelated to image resolution.

D. To increase network speed

Authentication does not improve internet performance.

Question 8

Which Responsible AI concern is especially important for image-analysis systems?

A. Protecting personal and sensitive visual information
B. Increasing printer speed
C. Improving spreadsheet formulas
D. Reducing monitor power usage

Correct Answer

A. Protecting personal and sensitive visual information

Explanation

Images may contain sensitive information such as faces, license plates, and documents that must be protected.

Why the Other Answers Are Incorrect

B. Increasing printer speed

This is unrelated to Responsible AI.

C. Improving spreadsheet formulas

This is unrelated to image analysis.

D. Reducing monitor power usage

This is unrelated to AI ethics.

Question 9

Which factor can reduce image-analysis accuracy?

A. Poor image quality
B. Spreadsheet formatting
C. Keyboard layout changes
D. Audio playback speed

Correct Answer

A. Poor image quality

Explanation

Blur, poor lighting, and low-resolution images can negatively affect AI analysis accuracy.

Why the Other Answers Are Incorrect

B. Spreadsheet formatting

This does not affect image AI systems.

C. Keyboard layout changes

This is unrelated to computer vision.

D. Audio playback speed

This is unrelated to image processing.

Question 10

What are hallucinations in AI image-analysis systems?

A. Incorrect or fabricated AI-generated outputs
B. Hardware installation failures
C. Network outages
D. Audio recording problems

Correct Answer

A. Incorrect or fabricated AI-generated outputs

Explanation

Hallucinations occur when AI systems generate inaccurate captions, object identifications, or extracted information.

Why the Other Answers Are Incorrect

B. Hardware installation failures

This is unrelated to AI-generated outputs.

C. Network outages

This is a connectivity issue.

D. Audio recording problems

This is unrelated to image-analysis systems.

Final Thoughts

Extracting information from images by using Content Understanding is an important topic for the AI-901 certification exam. Microsoft expects candidates to understand foundational concepts such as OCR, object detection, image classification, APIs, authentication, Responsible AI principles, and lightweight image-analysis workflows.

Azure AI services and Azure AI Foundry provide powerful tools for building scalable AI applications capable of understanding and extracting valuable information from visual content.

Go to the AI-901 Exam Prep Hub main page

AI, AI-901, Computer Vision, Microsoft Certification May 18, 2026

Build a lightweight application that includes vision capabilities (AI-901 Exam Prep)

This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Implement AI solutions by using Microsoft Foundry (55–60%)
   --> Implement AI solutions with computer vision and image-generation capabilities by using Foundry
      --> Build a lightweight application that includes vision capabilities

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Computer vision enables AI systems to interpret and analyze visual information such as images and videos. Organizations use computer vision solutions for automation, accessibility, security, analytics, and customer experiences.

For the AI-901 certification exam, candidates should understand the foundational concepts behind building lightweight applications that include vision capabilities by using Microsoft Azure AI services and Azure AI Foundry.

This topic falls under the “Implement AI solutions with computer vision and image-generation capabilities by using Foundry” section of the AI-901 exam objectives.

What Is Computer Vision?

Computer vision is a field of AI that enables systems to analyze and understand visual information.

Visual data may include:

Images
Videos
Scanned documents
Camera feeds

Common Computer Vision Tasks

Computer vision systems commonly perform:

Image classification
Object detection
Optical character recognition (OCR)
Facial analysis
Image captioning
Content moderation

Azure AI Vision

Azure AI Vision provides computer vision capabilities through cloud-based AI services.

Features include:

Image analysis
OCR
Object detection
Image captioning
Facial attribute analysis

What Is a Lightweight Application?

A lightweight application is a simple application designed to perform focused tasks with minimal complexity and infrastructure.

Characteristics include:

Simple user interface
Fast deployment
Minimal resource usage
Easy maintenance

Examples of Lightweight Vision Applications

Examples include:

Image analysis tools
Receipt scanning apps
Accessibility assistants
Product recognition apps
Photo-tagging systems

Azure AI Foundry

Azure AI Foundry provides tools for building, testing, and managing AI-powered applications.

Developers can:

Access AI models
Deploy services
Test prompts
Build AI workflows

Image Classification

Image classification identifies the main category or subject of an image.

Example

Image

Photo of a bicycle

Classification

“Bicycle”

Object Detection

Object detection identifies multiple objects and their locations within an image.

Example

Image

Street scene

Detected Objects

Car
Traffic light
Pedestrian
Bicycle

Optical Character Recognition (OCR)

OCR extracts text from images and scanned documents.

Example

Image

Photo of a restaurant menu

Extracted Text

Menu items and prices

Image Captioning

Image captioning generates natural-language descriptions of images.

Example

Image

A dog playing in a park

Caption

“A brown dog running through a grassy park.”

Facial Analysis

Computer vision systems can analyze facial features.

Possible capabilities include:

Face detection
Emotion analysis
Age estimation

For Responsible AI reasons, facial recognition and identification systems require careful consideration.

APIs and Endpoints

Applications communicate with Azure AI services using:

APIs
Endpoints

These allow images to be analyzed programmatically.

Authentication

Applications must securely authenticate before accessing Azure AI services.

Common authentication methods include:

API keys
Azure credentials
Managed identities

User Interface Components

A lightweight vision application may include:

Image upload area
Camera capture button
Results display
Image preview

Real-Time Image Processing

Some applications process images in near real time.

Examples include:

Security monitoring
Live object detection
Accessibility tools

Example Workflow

A common workflow includes:

User uploads image
Application sends image to Azure AI Vision
AI service analyzes image
Results are returned
Application displays findings

Example High-Level Pseudocode

			
image = upload_image()
results = analyze_image(image)
display_results(results)

For AI-901, understanding the workflow is more important than memorizing exact syntax.

Common Real-World Scenarios

Scenario 1: Receipt Scanner

Goal

Extract purchase information from receipts.

Features

OCR
Text extraction
Data organization

Scenario 2: Accessibility Assistant

Goal

Describe images for visually impaired users.

Features

Image captioning
OCR
Spoken descriptions

Scenario 3: Product Recognition

Goal

Identify products from photos.

Features

Object detection
Classification
Product lookup

Scenario 4: Content Moderation

Goal

Identify harmful or inappropriate images.

Features

Image analysis
Safety detection
Automated filtering

Responsible AI Considerations

Vision-enabled applications should follow Responsible AI principles.

Key considerations include:

Fairness
Privacy
Transparency
Inclusiveness
Accountability
Security

Privacy Concerns

Images may contain:

Personal data
Faces
Sensitive documents
Location information

Organizations should protect visual data appropriately.

Bias and Fairness

Computer vision systems may perform unevenly across:

Skin tones
Lighting conditions
Demographics
Environmental conditions

Testing and evaluation are important for fairness.

Transparency

Users should understand:

AI is analyzing images
AI-generated results may contain errors
Images may be processed in the cloud

Hallucinations and Errors

Vision systems may occasionally generate:

Incorrect captions
False detections
Inaccurate classifications

These incorrect outputs are sometimes called hallucinations.

Error Handling

Applications should handle:

Invalid image formats
Poor image quality
Authentication failures
Network interruptions
Rate limits

Image Quality Challenges

Computer vision accuracy can decrease with:

Blurry images
Poor lighting
Low resolution
Obstructed objects

Advantages of Vision Applications

Benefits include:

Automation
Faster analysis
Accessibility improvements
Improved customer experiences
Scalable image processing

Limitations of Vision Applications

Challenges include:

Recognition inaccuracies
Bias
Privacy concerns
Variable image quality
Ethical considerations

High-Level Architecture

A simplified architecture often includes:

User interface
Image upload/capture
Azure AI Vision service
AI analysis
Results display

Generative Vision Capabilities

Some modern systems combine:

Computer vision
Generative AI

These multimodal systems can:

Analyze images
Generate descriptions
Answer visual questions
Create new images

Important AI-901 Exam Tips

For the exam, remember these key points:

Computer vision analyzes visual information.
Azure AI Vision provides computer vision capabilities.
OCR extracts text from images.
Object detection identifies multiple objects in images.
Image captioning generates natural-language image descriptions.
APIs and endpoints connect applications to Azure AI services.
Authentication secures service access.
Responsible AI principles apply to computer vision systems.
Image quality affects AI accuracy.
Hallucinations are inaccurate AI-generated outputs.

Quick Knowledge Check

Question 1

What does OCR do?

Answer

Extracts text from images and scanned documents.

Question 2

What is object detection?

Answer

Identifying and locating objects within an image.

Question 3

Why is authentication important?

Answer

It secures access to Azure AI services.

Question 4

What can reduce computer vision accuracy?

Answer

Poor image quality such as blur or low lighting.

Practice Exam Questions

Question 1

What is the PRIMARY purpose of computer vision?

A. To enable AI systems to analyze and understand visual information
B. To increase internet bandwidth
C. To manage database backups
D. To improve keyboard performance

Correct Answer

A. To enable AI systems to analyze and understand visual information

Explanation

Computer vision allows AI systems to process and interpret images, videos, and other visual data.

Why the Other Answers Are Incorrect

B. To increase internet bandwidth

Computer vision does not affect networking speed.

C. To manage database backups

This is unrelated to computer vision.

D. To improve keyboard performance

This is unrelated to AI vision systems.

Question 2

Which Azure service provides computer vision capabilities such as OCR and image analysis?

A. Azure AI Vision
B. Azure Backup
C. Azure Virtual Machines
D. Azure DNS

Correct Answer

A. Azure AI Vision

Explanation

Azure AI Vision provides cloud-based computer vision capabilities including OCR, object detection, and image captioning.

Why the Other Answers Are Incorrect

B. Azure Backup

This is a backup service.

C. Azure Virtual Machines

This provides compute infrastructure.

D. Azure DNS

This is a networking service.

Question 3

What does OCR stand for?

A. Optical Character Recognition
B. Open Cloud Rendering
C. Object Classification Registry
D. Operational Compute Routing

Correct Answer

A. Optical Character Recognition

Explanation

OCR extracts text from images or scanned documents.

Why the Other Answers Are Incorrect

B. Open Cloud Rendering

This is not the meaning of OCR.

C. Object Classification Registry

This is unrelated to OCR.

D. Operational Compute Routing

This is not a computer vision term.

Question 4

What is the PRIMARY purpose of object detection?

A. To identify and locate objects within an image
B. To translate spoken language
C. To summarize long documents
D. To compress image files

Correct Answer

A. To identify and locate objects within an image

Explanation

Object detection identifies multiple objects and their locations inside an image.

Why the Other Answers Are Incorrect

B. To translate spoken language

This is a speech AI task.

C. To summarize long documents

This is a text analysis task.

D. To compress image files

Object detection does not compress files.

Question 5

What does image captioning do?

A. Generates natural-language descriptions of images
B. Converts speech into text
C. Encrypts image files
D. Creates database tables

Correct Answer

A. Generates natural-language descriptions of images

Explanation

Image captioning creates human-readable descriptions of visual content.

Why the Other Answers Are Incorrect

B. Converts speech into text

This is speech recognition.

C. Encrypts image files

Encryption is unrelated to captioning.

D. Creates database tables

This is unrelated to computer vision.

Question 6

How do lightweight vision applications typically communicate with Azure AI services?

A. Through APIs and endpoints
B. Through printer drivers
C. Through monitor settings
D. Through USB-only connections

Correct Answer

A. Through APIs and endpoints

Explanation

Applications use APIs and cloud endpoints to send images and receive AI-generated analysis results.

Why the Other Answers Are Incorrect

B. Through printer drivers

Printers are unrelated to AI communication.

C. Through monitor settings

This is unrelated to cloud AI services.

D. Through USB-only connections

Cloud services use network communication.

Question 7

Why is authentication important when accessing Azure AI Vision services?

A. To secure access to AI resources
B. To increase image brightness
C. To improve keyboard response time
D. To accelerate internet speeds

Correct Answer

A. To secure access to AI resources

Explanation

Authentication helps ensure that only authorized users and applications can access Azure AI services.

Why the Other Answers Are Incorrect

B. To increase image brightness

Authentication does not affect image quality.

C. To improve keyboard response time

This is unrelated to authentication.

D. To accelerate internet speeds

Authentication does not improve network performance.

Question 8

Which Responsible AI concern is especially important in computer vision systems?

A. Protecting personal and sensitive visual information
B. Increasing monitor resolution
C. Improving printer speed
D. Reducing spreadsheet file sizes

Correct Answer

A. Protecting personal and sensitive visual information

Explanation

Images may contain faces, documents, or other sensitive information that must be protected.

Why the Other Answers Are Incorrect

B. Increasing monitor resolution

This is unrelated to Responsible AI.

C. Improving printer speed

Printers are unrelated to computer vision ethics.

D. Reducing spreadsheet file sizes

This is unrelated to image analysis.

Question 9

What challenge can reduce computer vision accuracy?

A. Poor image quality
B. Spreadsheet formatting
C. Keyboard layout changes
D. Audio playback speed

Correct Answer

A. Poor image quality

Explanation

Blur, low lighting, and low resolution can negatively affect image analysis accuracy.

Why the Other Answers Are Incorrect

B. Spreadsheet formatting

This does not affect vision systems.

C. Keyboard layout changes

This is unrelated to image processing.

D. Audio playback speed

This is unrelated to computer vision.

Question 10

What are hallucinations in AI vision systems?

A. Incorrect or fabricated AI-generated outputs
B. Hardware installation failures
C. Network outages
D. Printer connection problems

Correct Answer

A. Incorrect or fabricated AI-generated outputs

Explanation

Hallucinations occur when AI systems generate inaccurate descriptions or detections.

Why the Other Answers Are Incorrect

B. Hardware installation failures

This is unrelated to AI-generated outputs.

C. Network outages

This is a connectivity issue.

D. Printer connection problems

This is unrelated to AI vision systems.

Final Thoughts

Building lightweight applications with vision capabilities is an important topic for the AI-901 certification exam. Microsoft expects candidates to understand the foundational concepts behind computer vision applications, including image classification, object detection, OCR, APIs, authentication, Responsible AI principles, and real-world implementation workflows.

Azure AI Vision and Azure AI Foundry provide powerful cloud-based tools that make it easier to build intelligent applications capable of analyzing and understanding visual information.

Go to the AI-901 Exam Prep Hub main page

AI, AI-901, Artificial Intelligence (AI), azure, Microsoft Certification May 18, 2026May 18, 2026

Extract information from documents and forms by using Azure Content Understanding in Foundry Tools (AI-901 Exam Prep)

This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Implement AI solutions by using Microsoft Foundry (55–60%)
   --> Implement AI solutions for information extraction by using Foundry
      --> Extract information from documents and forms by using Azure Content Understanding in Foundry Tools

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Organizations process enormous amounts of documents every day, including invoices, receipts, forms, contracts, and identification documents. AI-powered information extraction solutions help automate the process of reading, understanding, and organizing document data.

For the AI-901 certification exam, candidates should understand the foundational concepts behind extracting information from documents and forms by using Azure Content Understanding and Microsoft Foundry tools.

This topic falls under the “Implement AI solutions for information extraction by using Foundry” section of the AI-901 exam objectives.

What Is Information Extraction?

Information extraction is the process of identifying and retrieving useful data from documents, images, forms, audio, or other content.

Examples include extracting:

Names
Dates
Invoice totals
Addresses
Phone numbers
Product information

What Is Azure Content Understanding?

Azure Content Understanding helps AI systems analyze and interpret structured and unstructured documents.

Capabilities include:

Text extraction
Form recognition
Document analysis
Information classification
Key-value pair extraction

Azure AI Foundry

Azure AI Foundry provides tools for building, testing, and managing AI-powered applications.

Developers can:

Configure AI services
Process documents
Test extraction workflows
Build lightweight AI applications

Structured vs. Unstructured Documents

Structured Documents

Structured documents follow a consistent layout.

Examples include:

Tax forms
Invoices
Receipts
Application forms

Unstructured Documents

Unstructured documents have less predictable layouts.

Examples include:

Emails
Letters
Articles
Contracts

Optical Character Recognition (OCR)

OCR converts text within images or scanned documents into machine-readable text.

Example

Input

Scanned receipt image

OCR Output

Store name
Date
Total amount

Form Recognition

Form recognition identifies fields and values within forms.

Example

Form

Insurance application

Extracted Data

Customer name
Policy number
Address
Claim amount

Key-Value Pair Extraction

AI systems can identify relationships between labels and values.

Example

Key	Value
Invoice Number	INV-1045
Total	$250.00
Due Date	05/30/2026

Table Extraction

AI can identify and extract tables from documents.

Example

A receipt table may contain:

Item names
Quantities
Prices

Classification

Document classification identifies the type of document being processed.

Example

The system determines whether a file is:

Invoice
Contract
Receipt
Resume

Named Entity Recognition (NER)

NER identifies important entities within text.

Entities may include:

People
Organizations
Locations
Dates

Example

Text

“John Smith works for Contoso in Seattle.”

Extracted Entities

John Smith (Person)
Contoso (Organization)
Seattle (Location)

APIs and Endpoints

Applications communicate with Azure AI services through:

APIs
Endpoints

Documents are submitted for analysis programmatically.

Authentication

Applications must securely authenticate before accessing Azure AI services.

Common authentication methods include:

API keys
Azure credentials
Managed identities

Lightweight Application Workflow

A typical workflow includes:

User uploads document
Application sends file to AI service
AI extracts information
Results are returned
Application displays or stores extracted data

Example Workflow

Input

Scanned invoice

AI Processing

OCR
Key-value extraction
Table analysis

Output

Structured invoice data

Example High-Level Pseudocode

			
document = upload_document()
results = analyze_document(document)
display_results(results)

For AI-901, understanding the workflow is more important than memorizing exact syntax.

Common Real-World Scenarios

Scenario 1: Invoice Processing

Goal

Automate invoice data extraction.

Features

OCR
Table extraction
Total amount detection

Scenario 2: Receipt Scanning

Goal

Extract purchase information from receipts.

Features

Text extraction
Merchant identification
Expense categorization

Scenario 3: Resume Processing

Goal

Extract candidate information from resumes.

Features

Name extraction
Skill identification
Contact information detection

Scenario 4: Healthcare Forms

Goal

Digitize patient records.

Features

Form recognition
Key-value extraction
Classification

Responsible AI Considerations

Document-processing applications should follow Responsible AI principles.

Key considerations include:

Privacy
Security
Fairness
Transparency
Accountability
Inclusiveness

Privacy Concerns

Documents may contain:

Personal information
Financial data
Medical information
Legal records

Organizations should protect sensitive data appropriately.

Security Considerations

Applications should secure:

Uploaded files
Stored documents
API credentials
Extracted data

Transparency

Users should understand:

AI is analyzing documents
Extracted data may contain errors
Human review may still be needed

Accuracy Limitations

AI extraction systems may struggle with:

Poor scan quality
Handwritten text
Complex layouts
Damaged documents

Hallucinations and Errors

AI systems may occasionally:

Extract incorrect values
Miss fields
Misclassify documents

Applications should validate important information.

Error Handling

Applications should handle:

Unsupported file formats
Corrupted documents
Authentication failures
Network interruptions
Rate limits

Advantages of Information Extraction AI

Benefits include:

Faster document processing
Reduced manual entry
Improved scalability
Increased automation
Better searchability

Limitations of Information Extraction AI

Challenges include:

Variable document quality
Handwriting recognition difficulties
Inconsistent layouts
Privacy concerns
Extraction inaccuracies

Generative AI and Information Extraction

Some modern systems combine:

OCR
Document intelligence
Generative AI

This enables:

Summarization
Question answering
Conversational document analysis

High-Level Architecture

A simplified architecture often includes:

User uploads document
Application sends document to Azure AI service
AI analyzes content
Structured data is returned
Application displays or stores results

Important AI-901 Exam Tips

For the exam, remember these key points:

OCR extracts text from documents and images.
Form recognition identifies fields and values.
Key-value extraction identifies label-value relationships.
Table extraction retrieves structured table data.
Classification identifies document types.
APIs and endpoints connect applications to Azure AI services.
Authentication secures access to AI resources.
Responsible AI principles apply to document-processing systems.
Poor document quality can reduce extraction accuracy.
AI-generated outputs may still require validation.

Quick Knowledge Check

Question 1

What does OCR do?

Answer

Extracts machine-readable text from images or scanned documents.

Question 2

What is form recognition?

Answer

Identifying and extracting fields and values from forms.

Question 3

Why is authentication important?

Answer

It secures access to Azure AI services and protects resources.

Question 4

What can reduce extraction accuracy?

Answer

Poor scan quality, handwriting, and inconsistent document layouts.

Practice Exam Questions

Exam: AI-901

Topic: Extract Information from Documents and Forms by Using Azure Content Understanding in Foundry Tools

Question 1

What is the PRIMARY purpose of information extraction AI solutions?

A. To retrieve useful data from documents and content
B. To increase internet bandwidth
C. To replace operating systems
D. To improve monitor resolution

Correct Answer

A. To retrieve useful data from documents and content

Explanation

Information extraction AI systems identify and retrieve meaningful information such as names, dates, totals, and addresses from documents and forms.

Why the Other Answers Are Incorrect

B. To increase internet bandwidth

Information extraction does not affect network speed.

C. To replace operating systems

AI document processing does not replace operating systems.

D. To improve monitor resolution

This is unrelated to AI information extraction.

Question 2

What does OCR stand for?

A. Optical Character Recognition
B. Open Content Retrieval
C. Object Classification Routing
D. Operational Compute Reporting

Correct Answer

A. Optical Character Recognition

Explanation

OCR converts printed or handwritten text within images and scanned documents into machine-readable text.

Why the Other Answers Are Incorrect

B. Open Content Retrieval

This is not the meaning of OCR.

C. Object Classification Routing

This is unrelated to document analysis.

D. Operational Compute Reporting

This is not an OCR term.

Question 3

Which AI capability identifies fields and values within forms?

A. Form recognition
B. Speech synthesis
C. Image compression
D. Network monitoring

Correct Answer

A. Form recognition

Explanation

Form recognition extracts structured information such as names, dates, totals, and addresses from forms and documents.

Why the Other Answers Are Incorrect

B. Speech synthesis

This converts text into speech.

C. Image compression

This reduces file size and is unrelated to field extraction.

D. Network monitoring

This is unrelated to document AI.

Question 4

Which Azure platform provides tools for building and managing AI-powered applications?

A. Azure AI Foundry
B. Microsoft Paint
C. Windows Task Manager
D. Azure DNS

Correct Answer

A. Azure AI Foundry

Explanation

Azure AI Foundry provides tools for deploying, testing, and managing AI applications and services.

Why the Other Answers Are Incorrect

B. Microsoft Paint

Paint is a graphics editor.

C. Windows Task Manager

This is a system monitoring tool.

D. Azure DNS

This is a networking service.

Question 5

What is key-value pair extraction?

A. Identifying labels and their associated values in documents
B. Encrypting document files
C. Compressing image sizes
D. Converting audio into text

Correct Answer

A. Identifying labels and their associated values in documents

Explanation

Key-value extraction identifies relationships such as:

Invoice Number → INV-1045
Total → $250.00

Why the Other Answers Are Incorrect

B. Encrypting document files

Encryption is unrelated to data extraction.

C. Compressing image sizes

Compression is unrelated to document intelligence.

D. Converting audio into text

This is speech recognition.

Question 6

What is the purpose of document classification?

A. To identify the type of document being processed
B. To increase network performance
C. To generate music files
D. To repair damaged documents physically

Correct Answer

A. To identify the type of document being processed

Explanation

Document classification determines whether a file is an invoice, contract, receipt, resume, or another document type.

Why the Other Answers Are Incorrect

B. To increase network performance

Classification does not improve networking.

C. To generate music files

This is unrelated to document AI.

D. To repair damaged documents physically

AI classification does not physically repair documents.

Question 7

How do lightweight document-processing applications typically communicate with Azure AI services?

A. Through APIs and endpoints
B. Through USB-only connections
C. Through monitor calibration tools
D. Through printer drivers

Correct Answer

A. Through APIs and endpoints

Explanation

Applications send documents to Azure AI services using APIs and endpoints and receive structured analysis results.

Why the Other Answers Are Incorrect

B. Through USB-only connections

Cloud services use network communication.

C. Through monitor calibration tools

This is unrelated to AI services.

D. Through printer drivers

Printers are unrelated to cloud AI communication.

Question 8

Which factor can reduce the accuracy of document extraction systems?

A. Poor document quality
B. Spreadsheet color themes
C. Keyboard layout changes
D. Audio playback speed

Correct Answer

A. Poor document quality

Explanation

Blurry scans, damaged pages, handwriting, and poor lighting can negatively affect extraction accuracy.

Why the Other Answers Are Incorrect

B. Spreadsheet color themes

This does not affect document extraction AI.

C. Keyboard layout changes

This is unrelated to AI document analysis.

D. Audio playback speed

This is unrelated to document processing.

Question 9

Why is authentication important when using Azure AI services?

A. To secure access to AI resources
B. To improve image resolution
C. To increase internet speed
D. To compress document files

Correct Answer

A. To secure access to AI resources

Explanation

Authentication ensures that only authorized users and applications can access AI services.

Why the Other Answers Are Incorrect

B. To improve image resolution

Authentication does not affect image quality.

C. To increase internet speed

Authentication does not improve networking.

D. To compress document files

Authentication is unrelated to file compression.

Question 10

Which Responsible AI concern is especially important when processing documents?

A. Protecting sensitive personal information
B. Increasing monitor brightness
C. Improving printer speed
D. Reducing spreadsheet file size

Correct Answer

A. Protecting sensitive personal information

Explanation

Documents may contain financial, medical, legal, or personal information that must be protected appropriately.

Why the Other Answers Are Incorrect

B. Increasing monitor brightness

This is unrelated to Responsible AI.

C. Improving printer speed

This is unrelated to document intelligence.

D. Reducing spreadsheet file size

This is unrelated to AI ethics or privacy.

Final Thoughts

Extracting information from documents and forms using Azure Content Understanding and Foundry tools is an important topic for the AI-901 certification exam. Microsoft expects candidates to understand foundational concepts such as OCR, form recognition, document analysis, APIs, authentication, Responsible AI principles, and lightweight document-processing workflows.

Azure AI services and Azure AI Foundry provide powerful tools for automating information extraction and improving efficiency across business, healthcare, finance, and administrative scenarios.

Go to the AI-901 Exam Prep Hub main page