
Identify Features and Uses for Speech Recognition and Synthesis (AI-900 Exam Prep)

Where This Fits in the Exam

  • Exam area: Describe features of Natural Language Processing (NLP) workloads on Azure (15–20%)
  • Sub-area: Identify features of common NLP workload scenarios
  • Key focus: Understanding what speech recognition and synthesis do, when to use them, and which Azure services support them

This topic is highly scenario-driven on the exam.


Overview: Speech in NLP Workloads

Speech-related NLP workloads allow AI systems to:

  • Understand spoken language (speech recognition)
  • Generate spoken language (speech synthesis)

Together, these capabilities enable voice-based interactions such as virtual assistants, voice bots, dictation tools, and accessibility solutions.


Speech Recognition

What Is Speech Recognition?

Speech recognition (also called speech-to-text) is the process of converting spoken audio into written text.

The AI system analyzes:

  • Audio signals
  • Phonemes and pronunciation
  • Language patterns
  • Context

It then produces text that represents what was spoken.


Key Features of Speech Recognition

Speech recognition solutions can:

  • Convert live or recorded audio into text
  • Support real-time transcription
  • Handle multiple languages and accents
  • Apply noise reduction
  • Recognize custom vocabulary (e.g., medical or technical terms)
  • Provide timestamps for spoken words or phrases

Common Uses of Speech Recognition

Speech recognition is used when users speak instead of typing.

Common scenarios include:

  • Voice commands (e.g., “Turn on the lights”)
  • Call center transcription
  • Meeting and lecture transcription
  • Voice-controlled applications
  • Accessibility tools for users with limited mobility
  • Voice input for chatbots and virtual assistants

Azure Services for Speech Recognition

In Azure, speech recognition is provided by:

Azure AI Speech (Speech service)

Capabilities include:

  • Speech-to-text
  • Real-time and batch transcription
  • Language detection
  • Custom speech models

Speech Synthesis

What Is Speech Synthesis?

Speech synthesis (also called text-to-speech) is the process of converting written text into spoken audio.

The goal is to produce natural, human-like speech that sounds fluent and expressive.


Key Features of Speech Synthesis

Speech synthesis solutions can:

  • Convert text into spoken audio
  • Use natural-sounding neural voices
  • Support multiple languages and accents
  • Adjust:
    • Pitch
    • Speed
    • Tone
  • Apply SSML (Speech Synthesis Markup Language) for fine control
  • Generate speech for audio files or real-time playback
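To make the SSML bullet concrete, here is a minimal sketch that builds an SSML document with Python's standard library. The helper name `build_ssml` and the default values are assumptions for illustration; `en-US-JennyNeural` is used only as an example neural voice name, and the sketch does not call any speech service.

```python
import xml.etree.ElementTree as ET


def build_ssml(text, voice="en-US-JennyNeural", rate="+10%", pitch="-2st", lang="en-US"):
    """Return an SSML string that wraps `text` in voice and prosody settings.

    Hypothetical helper for illustration; it only builds markup.
    """
    # <speak> is the SSML root element; the namespace is the W3C SSML namespace.
    speak = ET.Element(
        "speak",
        {
            "version": "1.0",
            "xmlns": "http://www.w3.org/2001/10/synthesis",
            "xml:lang": lang,
        },
    )
    # <voice> selects which voice reads the text.
    voice_el = ET.SubElement(speak, "voice", {"name": voice})
    # <prosody> adjusts speed (rate) and pitch for the enclosed text.
    prosody = ET.SubElement(voice_el, "prosody", {"rate": rate, "pitch": pitch})
    prosody.text = text  # special characters such as & are escaped on output
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Your order has shipped."))
```

A text-to-speech service would receive markup like this in place of plain text, which is how pitch and speed are controlled per phrase. Writing SSML is not expected on AI-900; the sketch only shows what "fine control" means in practice.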

Common Uses of Speech Synthesis

Speech synthesis is used when systems need to speak to users.

Common scenarios include:

  • Virtual assistants and chatbots
  • Navigation and GPS systems
  • Accessibility tools for visually impaired users
  • Audiobooks and e-learning content
  • Automated announcements
  • Customer service voice bots

Azure Services for Speech Synthesis

In Azure, speech synthesis is also provided by:

Azure AI Speech (Speech service)

Capabilities include:

  • Text-to-speech
  • Neural voices
  • Voice customization
  • Multilingual speech output

Speech Recognition vs Speech Synthesis

Capability   | Speech Recognition  | Speech Synthesis
Direction    | Speech → Text       | Text → Speech
Input        | Audio               | Text
Output       | Text                | Audio
Common Name  | Speech-to-text      | Text-to-speech
Example      | Transcribing a call | Reading text aloud

Combined Speech Workloads

Many real-world solutions use both capabilities together.

Example:

  1. User speaks a question (speech recognition)
  2. System processes the text using NLP or AI logic
  3. System responds verbally (speech synthesis)

This is the foundation of:

  • Voice assistants
  • Conversational AI
  • Interactive voice response (IVR) systems
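The three-step flow above can be sketched as a small pipeline. Every function here is a hypothetical stand-in: the "audio" is just UTF-8 bytes so the example runs without any speech service, and a real solution would replace the first and last steps with calls to a speech service.

```python
def recognize_speech(audio: bytes) -> str:
    """Stand-in for speech recognition (speech-to-text).

    Assumption: the 'audio' bytes already contain the spoken words as UTF-8 text.
    """
    return audio.decode("utf-8")


def answer_question(text: str) -> str:
    """Stand-in for the NLP or AI logic that decides what to say."""
    faq = {"what time do you open?": "We open at 9 AM."}
    return faq.get(text.strip().lower(), "Sorry, I don't know that yet.")


def synthesize_speech(text: str) -> bytes:
    """Stand-in for speech synthesis (text-to-speech)."""
    return text.encode("utf-8")


def handle_voice_turn(audio: bytes) -> bytes:
    """Run one spoken exchange: recognize, process, then respond."""
    transcript = recognize_speech(audio)   # step 1: speech -> text
    reply = answer_question(transcript)    # step 2: text -> text
    return synthesize_speech(reply)        # step 3: text -> speech


if __name__ == "__main__":
    print(handle_voice_turn(b"What time do you open?"))
```

The point of the sketch is the ordering: recognition and synthesis sit at the edges, and everything in between works on plain text.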

Exam-Focused Clues to Watch For 👀

On the AI-900 exam, speech workloads are usually described using phrases like:

  • “Convert spoken audio into text” → Speech recognition
  • “Generate spoken responses from text” → Speech synthesis
  • “Voice-enabled application” → Azure AI Speech
  • “Real-time transcription” → Speech recognition
  • “Reads text aloud” → Speech synthesis
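As a study aid, the clue phrases above can be kept in a simple lookup table. This is only a sketch for self-quizzing; the names `EXAM_CLUES` and `match_clue` are made up for this example, and plain substring matching is an assumption that will miss reworded scenarios.

```python
from typing import Optional

# Clue phrase -> answer, taken from the list above.
EXAM_CLUES = {
    "convert spoken audio into text": "Speech recognition",
    "generate spoken responses from text": "Speech synthesis",
    "voice-enabled application": "Azure AI Speech",
    "real-time transcription": "Speech recognition",
    "reads text aloud": "Speech synthesis",
}


def match_clue(scenario: str) -> Optional[str]:
    """Return the answer for the first clue phrase found in the scenario, if any."""
    lowered = scenario.lower()
    for phrase, answer in EXAM_CLUES.items():
        if phrase in lowered:
            return answer
    return None


if __name__ == "__main__":
    print(match_clue("The kiosk reads text aloud to visitors."))
```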

Key Takeaways for AI-900

  • Speech recognition converts speech to text
  • Speech synthesis converts text to speech
  • Both are part of NLP workloads
  • Azure AI Speech is the primary Azure service for both
  • Common exam scenarios involve:
    • Voice assistants
    • Transcription
    • Accessibility
    • Customer service automation

Go to the Practice Exam Questions for this topic.

Go to the AI-900 Exam Prep Hub main page.

Practice Questions: Identify Features and Uses for Translation (AI-900 Exam Prep)

Practice Questions


Question 1

Which Azure service is primarily used to translate text between languages?

A. Azure Speech Service
B. Azure Language Service
C. Azure Translator
D. Azure OpenAI Service

Correct Answer: C. Azure Translator

Explanation:
Azure Translator (part of Azure AI Services) is specifically designed for text translation across multiple languages. While other services handle NLP or speech, Translator focuses on multilingual text conversion.


Question 2

A company wants to translate product descriptions on a website in real time for international users. Which feature of Azure Translator best supports this scenario?

A. Batch transcription
B. Real-time REST API translation
C. Sentiment analysis
D. Custom question answering

Correct Answer: B. Real-time REST API translation

Explanation:
Azure Translator provides REST APIs that allow applications and websites to translate text dynamically as users access content.


Question 3

Which scenario is the best example of using machine translation?

A. Detecting the emotional tone of customer feedback
B. Extracting key phrases from documents
C. Translating an email from English to French
D. Identifying people and locations in text

Correct Answer: C. Translating an email from English to French

Explanation:
Machine translation focuses on converting text from one language to another, which is exactly what this scenario describes.


Question 4

What type of translation does Azure Translator perform by default?

A. Rule-based translation
B. Human-assisted translation
C. Statistical translation
D. Neural machine translation

Correct Answer: D. Neural machine translation

Explanation:
Azure Translator uses Neural Machine Translation (NMT) models, which rely on deep learning to produce more natural and accurate translations.


Question 5

A travel application needs to detect the source language of user input before translating it. Can Azure Translator support this requirement?

A. No, language detection requires Azure Language Service
B. Yes, language detection is built into Azure Translator
C. Only if custom models are trained
D. Only for speech input

Correct Answer: B. Yes, language detection is built into Azure Translator

Explanation:
Azure Translator can automatically detect the source language of text before translating it, which is a common real-world scenario.


Question 6

Which of the following is a common use case for translation in Azure?

A. Voice-controlled virtual assistants
B. Multilingual customer support chatbots
C. Facial recognition systems
D. Predictive maintenance systems

Correct Answer: B. Multilingual customer support chatbots

Explanation:
Translation enables chatbots and support systems to communicate with users in multiple languages, improving global accessibility.


Question 7

A company needs consistent translation for industry-specific terminology (for example, legal or medical terms). What Azure Translator feature helps with this?

A. Language detection
B. Speech synthesis
C. Custom Translator
D. Sentiment scoring

Correct Answer: C. Custom Translator

Explanation:
Custom Translator allows organizations to train translation models using their own terminology, improving accuracy for specialized domains.


Question 8

Which input format is supported by Azure Translator?

A. Images only
B. Audio only
C. Text and images
D. Text only (speech requires another service)

Correct Answer: D. Text only (speech requires another service)

Explanation:
Azure Translator works with text input. For speech-to-speech translation, Azure Speech Service is used in combination with translation.


Question 9

Which Azure service would you combine with Azure Translator to build a speech-to-speech translation application?

A. Azure Vision Service
B. Azure Speech Service
C. Azure Language Service
D. Azure Bot Service only

Correct Answer: B. Azure Speech Service

Explanation:
Speech-to-speech translation requires speech recognition (speech-to-text) and speech synthesis (text-to-speech), which are handled by Azure Speech Service, alongside translation.


Question 10

Why is translation considered a core Natural Language Processing (NLP) workload?

A. It analyzes numerical data patterns
B. It processes and understands human language
C. It detects objects in images
D. It forecasts future values

Correct Answer: B. It processes and understands human language

Explanation:
Translation involves understanding and generating human language, making it a foundational NLP workload alongside sentiment analysis, entity recognition, and language modeling.



Identify Features and Uses for Translation (AI-900 Exam Prep)

Where This Topic Fits in the Exam

  • Exam area: Describe features of Natural Language Processing (NLP) workloads on Azure (15–20%)
  • Sub-area: Identify features of common NLP workload scenarios
  • Skill focus: Recognizing when translation is the appropriate NLP workload, and understanding Azure services that support it

Translation is a core NLP workload on the AI-900 exam and often appears in short, scenario-based questions.


What Is Translation in NLP?

Translation is the process of converting text (or speech) from one language into another while preserving the original meaning.

Modern AI-powered translation systems use machine learning and deep learning models to understand context, grammar, and semantics rather than performing word-for-word substitutions.


Key Features of Translation Workloads

Translation solutions typically provide the following features:

  • Text-to-text translation between languages
  • Support for dozens of languages and dialects
  • Context-aware translation (not literal word replacement)
  • Detection of source language
  • Batch or real-time translation
  • Integration with applications, websites, and chatbots
  • Optional customization for domain-specific terminology

Common Uses of Translation

Translation workloads are used whenever language differences create a communication barrier.

Typical scenarios include:

  • Translating websites or product documentation
  • Supporting multilingual customer service
  • Translating chat messages in real time
  • Localizing applications for global users
  • Translating social media posts or reviews
  • Enabling communication across international teams

Azure Services for Translation

In Azure, translation capabilities are provided by:

Azure AI Translator

Azure AI Translator is part of Azure AI Services and offers:

  • Text translation between supported languages
  • Language detection
  • Transliteration (converting text between scripts)
  • Dictionary lookup and examples
  • Real-time and batch translation via APIs

This service uses prebuilt models, so no training is required for standard translation. Customization for domain-specific terminology is optional.
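For readers curious what "translation via APIs" looks like, here is a minimal sketch that assembles a text translation request without sending it. The endpoint, the `api-version=3.0` parameter, and the header names reflect my understanding of the Translator v3 REST API and should be checked against the official reference; `build_translate_request` and the placeholder key and region are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed global endpoint for the Translator v3 REST API.
TRANSLATOR_ENDPOINT = "https://api.cognitive.microsofttranslator.com"


def build_translate_request(text, to_langs, key, region, from_lang=None):
    """Return (url, headers, body) for a text translation request.

    Hypothetical helper: it only assembles the request and sends nothing.
    """
    params = [("api-version", "3.0")]
    if from_lang:
        # Leaving out 'from' asks the service to detect the source language.
        params.append(("from", from_lang))
    params += [("to", lang) for lang in to_langs]  # one 'to' per target language
    url = f"{TRANSLATOR_ENDPOINT}/translate?{urlencode(params)}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json",
    }
    body = json.dumps([{"Text": text}])  # the body is a JSON array of text items
    return url, headers, body


if __name__ == "__main__":
    url, headers, body = build_translate_request(
        "Hello", ["fr", "de"], key="<your-key>", region="<your-region>"
    )
    print(url)
    print(body)
```

Nothing in the request names a model or training data, which matches the prebuilt nature of the service. AI-900 does not test request syntax.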


Translation vs Other NLP Workloads

It is important to distinguish translation from similar NLP tasks:

NLP Task            | Purpose
Translation         | Convert text from one language to another
Language detection  | Identify which language text is written in
Speech recognition  | Convert spoken audio into text
Speech synthesis    | Convert text into spoken audio
Sentiment analysis  | Identify emotional tone of text

Translation and Speech

Translation workloads may involve:

  • Text-to-text translation (most common on AI-900)
  • Speech translation, which combines:
    1. Speech recognition
    2. Translation
    3. Speech synthesis

On the exam, focus primarily on text translation scenarios, unless speech is explicitly mentioned.


Responsible AI Considerations

Translation systems should be designed with responsible AI principles in mind:

  • Fairness: Avoid biased or culturally inappropriate translations
  • Reliability: Handle idioms and context accurately
  • Transparency: Clearly indicate when content is machine-translated
  • Privacy: Protect sensitive or personal information in translated text

Exam Clues to Watch For

On AI-900, translation workloads are commonly signaled by phrases such as:

  • “Convert content from one language to another”
  • “Support multilingual users”
  • “Translate customer messages”
  • “Localize an application”

When these appear, translation is the correct NLP workload.


Key Takeaways for AI-900

  • Translation is an NLP workload that converts text between languages
  • Azure AI Translator is the primary Azure service for translation
  • No model training is required
  • Translation is different from sentiment analysis, entity recognition, and speech workloads
  • Exam questions are typically scenario-based and concise


Practice Questions: Identify Features and Labels in a Dataset for Machine Learning (AI-900 Exam Prep)

Practice Exam Questions


Question 1

You are training a model to predict house prices. The dataset includes columns for square footage, number of bedrooms, location, and sale price.
Which column is the label?

A. Square footage
B. Number of bedrooms
C. Location
D. Sale price

Correct Answer: D

Explanation:
The label is the value the model is trained to predict. In this scenario, the goal is to predict the sale price.


Question 2

Which statement best describes a feature in a machine learning dataset?

A. The final prediction made by the model
B. An input value used to make predictions
C. A rule written by a developer
D. The accuracy of the model

Correct Answer: B

Explanation:
Features are the input variables that provide information the model uses to make predictions.


Question 3

A dataset contains customer age, subscription length, monthly charges, and whether the customer canceled the service.
What is the label?

A. Customer age
B. Subscription length
C. Monthly charges
D. Whether the customer canceled

Correct Answer: D

Explanation:
The label represents the outcome being predicted—in this case, whether the customer canceled the service.


Question 4

Which type of machine learning requires both features and labels?

A. Unsupervised learning
B. Reinforcement learning
C. Supervised learning
D. Clustering

Correct Answer: C

Explanation:
Supervised learning uses labeled data so the model can learn the relationship between features and known outcomes.


Question 5

A dataset is used to group customers based on purchasing behavior, but it does not contain any target outcome.
What does this dataset contain?

A. Labels only
B. Features only
C. Training results
D. Predictions

Correct Answer: B

Explanation:
Unsupervised learning datasets contain features but do not include labels.


Question 6

In an email spam detection dataset, which item would most likely be a feature?

A. Spam or not spam
B. Model accuracy score
C. Number of words in the email
D. Final prediction

Correct Answer: C

Explanation:
The number of words is an input characteristic used by the model to make predictions, making it a feature.


Question 7

Which statement about labels is TRUE?

A. Labels are optional in supervised learning
B. Labels are the inputs used by the model
C. Labels represent the value the model predicts
D. Labels are created after predictions are made

Correct Answer: C

Explanation:
Labels are the known outcomes the model is trained to predict in supervised learning scenarios.


Question 8

You are preparing data in Azure Machine Learning to predict product demand.
Which columns should be selected as features?

A. Only the column you want to predict
B. All columns except the target outcome
C. Only numerical columns
D. Only categorical columns

Correct Answer: B

Explanation:
Features are the input columns used to predict the target outcome, which is the label.


Question 9

A dataset includes the following columns: temperature, humidity, wind speed, and weather condition.
If the goal is to predict the weather condition, what are temperature, humidity, and wind speed?

A. Labels
B. Predictions
C. Features
D. Outputs

Correct Answer: C

Explanation:
These values are inputs used to predict the weather condition, making them features.


Question 10

Which scenario best represents a labeled dataset?

A. Customer data grouped by similarity
B. Sensor readings without outcomes
C. Product reviews with sentiment categories
D. Website logs without classifications

Correct Answer: C

Explanation:
Product reviews with sentiment categories include known outcomes, which are labels, making the dataset labeled.


Exam Pattern Tip

On AI-900:

  • Features = inputs
  • Labels = outputs
  • If labels exist → supervised learning
  • If no labels → unsupervised learning

If you can identify those quickly, you’ll eliminate most wrong answers immediately.
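The features-versus-labels split can be shown with a tiny dataset modeled on Question 1. The rows and the helper `split_features_label` are invented for illustration; no machine learning library is involved.

```python
# Made-up rows modeled on the house price scenario in Question 1.
rows = [
    {"square_feet": 1400, "bedrooms": 3, "location": "suburb", "sale_price": 310000},
    {"square_feet": 900, "bedrooms": 2, "location": "city", "sale_price": 275000},
    {"square_feet": 2100, "bedrooms": 4, "location": "rural", "sale_price": 340000},
]


def split_features_label(rows, label_column):
    """Separate each row into its input features and its label value."""
    # Features: every column except the one the model should predict.
    features = [
        {name: value for name, value in row.items() if name != label_column}
        for row in rows
    ]
    # Labels: the known outcome for each row.
    labels = [row[label_column] for row in rows]
    return features, labels


if __name__ == "__main__":
    features, labels = split_features_label(rows, "sale_price")
    print(features[0])
    print(labels)
```

If the dataset had no `sale_price` column at all, there would be nothing to use as a label, and only unsupervised techniques such as clustering would apply.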



Describe Capabilities of the Azure AI Vision Service (AI-900 Exam Prep)

Overview

Azure AI Vision is Microsoft’s prebuilt computer vision service that enables applications to analyze images and videos without requiring machine learning expertise or custom model training. It provides REST APIs and SDKs that allow developers to easily extract visual insights such as objects, text, faces, and image descriptions.

For the AI-900 exam, you are expected to understand what Azure AI Vision can do, which problems it solves, and how it differs from custom vision solutions—not how to build or tune models.


What Is Azure AI Vision?

Azure AI Vision is part of Azure AI Services and offers ready-to-use computer vision capabilities, including:

  • Image analysis
  • Optical Character Recognition (OCR)
  • Facial detection and analysis
  • Object detection
  • Image tagging and categorization

These capabilities are powered by Microsoft-trained deep learning models and are accessed via APIs.


Core Capabilities of Azure AI Vision

1. Image Analysis

Azure AI Vision can analyze images to extract high-level insights, such as:

  • Objects present in an image (for example, car, building, person)
  • Scene descriptions in natural language
  • Image tags and categories
  • Visual features such as color distribution

Example use cases:

  • Auto-generating image captions
  • Content moderation
  • Organizing image libraries

👉 Exam tip: Image analysis describes what is in an image, not where every object is located with precision.


2. Object Detection

Object detection identifies specific objects in an image and returns:

  • Object names
  • Bounding box coordinates
  • Confidence scores

Example use cases:

  • Detecting vehicles in traffic images
  • Identifying products on store shelves

👉 Exam tip: Object detection includes location + object type, unlike image classification which only labels the image as a whole.
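To illustrate what "object name, bounding box, and confidence score" means, here is a small sketch over a made-up detection result. The dictionary layout and the function `confident_detections` are assumptions for illustration and are not the actual response format of Azure AI Vision.

```python
# Hypothetical detection results: name, confidence score, and a bounding box
# given as the top-left corner (x, y) plus width and height in pixels.
sample_detections = [
    {"name": "car", "confidence": 0.92, "box": {"x": 34, "y": 50, "w": 120, "h": 80}},
    {"name": "person", "confidence": 0.41, "box": {"x": 200, "y": 40, "w": 45, "h": 110}},
    {"name": "bus", "confidence": 0.77, "box": {"x": 300, "y": 20, "w": 180, "h": 140}},
]


def confident_detections(detections, min_confidence=0.5):
    """Keep only detections whose confidence meets the threshold."""
    return [d for d in detections if d["confidence"] >= min_confidence]


if __name__ == "__main__":
    for d in confident_detections(sample_detections):
        box = d["box"]
        print(f'{d["name"]} at ({box["x"]}, {box["y"]}), size {box["w"]}x{box["h"]}')
```

An image classification result, by contrast, would carry a single label for the whole image and no box.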


3. Optical Character Recognition (OCR)

OCR extracts printed and handwritten text from images and documents.

Azure AI Vision OCR supports:

  • Multiple languages
  • Structured and unstructured text
  • Images, screenshots, and scanned documents

Example use cases:

  • Digitizing receipts
  • Reading license plates
  • Extracting text from scanned forms

👉 Exam tip: OCR is about reading text, not understanding its meaning.


4. Facial Detection and Facial Analysis

Azure AI Vision can detect human faces in images and analyze non-identifying facial attributes, such as:

  • Face location (bounding boxes)
  • Facial landmarks
  • Estimated age range
  • Facial expressions
  • Accessories (glasses, masks)

⚠️ It does NOT identify individuals.

Example use cases:

  • Blurring faces for privacy
  • Counting people in images
  • Analyzing expressions in photos

👉 Exam tip:

  • Facial detection = where faces are
  • Facial analysis = attributes of faces
  • Facial recognition = identity (not required for AI-900)

5. Image Tagging and Categorization

Azure AI Vision automatically assigns tags and categories to images, such as:

  • “outdoor”
  • “food”
  • “animal”

These tags help with searchability and organization.

Example use cases:

  • Image indexing
  • Content filtering
  • Metadata enrichment

👉 Exam tip: Tagging helps describe images at a high level, not detect precise objects.


Azure AI Vision vs Custom Vision

Feature               | Azure AI Vision | Azure Custom Vision
Prebuilt models       | ✅ Yes          | ❌ No
Requires training     | ❌ No           | ✅ Yes
Quick setup           | ✅ Yes          | ❌ No
Specialized scenarios | ❌ Limited      | ✅ Strong
AI-900 focus          | ✅ Yes          | ⚠️ Limited

👉 Exam takeaway:
If the question mentions no training, quick setup, or prebuilt models, Azure AI Vision is usually the right answer.


Responsible AI Considerations

Because Azure AI Vision can analyze images of people, Microsoft emphasizes:

  • Privacy and security of image data
  • Transparency in how visual data is processed
  • Fairness and bias mitigation
  • Appropriate use of facial analysis

👉 Exam tip: Facial capabilities often pair with Responsible AI principles in exam questions.


Common AI-900 Exam Scenarios

You should recognize Azure AI Vision when the scenario involves:

  • Analyzing images without training a model
  • Extracting text from images
  • Detecting faces but not identifying people
  • Automatically tagging or describing images

Key Exam Takeaways

  • Azure AI Vision is a prebuilt computer vision service
  • No machine learning expertise required
  • Supports image analysis, OCR, object detection, and facial analysis
  • Focuses on insight extraction, not identity
  • Frequently tested in scenario-based questions


Practice Questions: Describe Capabilities of the Azure AI Language Service (AI-900 Exam Prep)

Practice Exam Questions


Question 1

Which Azure service should you use to analyze customer reviews and determine whether the feedback is positive or negative?

A. Azure Translator
B. Azure AI Vision
C. Azure AI Language
D. Azure Speech

Correct Answer: C

Explanation:
Sentiment analysis is a text analytics capability, which is provided by the Azure AI Language service. Translator is for language conversion, Vision is for images, and Speech is for audio.


Question 2

You want to extract people, organizations, and locations from text documents. Which Azure AI Language capability should you use?

A. Key phrase extraction
B. Named entity recognition
C. Text classification
D. Language detection

Correct Answer: B

Explanation:
Named Entity Recognition (NER) identifies and categorizes entities such as people, organizations, and locations within text.


Question 3

A company wants to automatically identify the main topics discussed in customer feedback emails. Which Azure AI Language feature should be used?

A. Sentiment analysis
B. Entity recognition
C. Key phrase extraction
D. Language detection

Correct Answer: C

Explanation:
Key phrase extraction identifies the most important concepts or talking points in text, making it ideal for summarizing feedback.


Question 4

Which scenario is best suited for the Azure AI Language service?

A. Converting spoken audio into text
B. Translating documents from English to French
C. Analyzing written text for sentiment and meaning
D. Detecting objects in images

Correct Answer: C

Explanation:
The Azure AI Language service specializes in understanding and analyzing text, including sentiment, entities, and key phrases.


Question 5

Which capability of Azure AI Language allows an application to answer natural language questions based on provided documents or FAQs?

A. Sentiment analysis
B. Question answering
C. Key phrase extraction
D. Language detection

Correct Answer: B

Explanation:
Question answering enables applications and chatbots to respond to user questions using structured knowledge sources.


Question 6

A multilingual application needs to determine the language of user-submitted text before processing it further. Which Azure AI Language feature should be used?

A. Translation
B. Language detection
C. Text classification
D. Entity recognition

Correct Answer: B

Explanation:
Language detection identifies the language of input text and is often used before other NLP operations.


Question 7

Which Azure service combines sentiment analysis, entity recognition, and key phrase extraction into a single offering?

A. Azure Translator
B. Azure Speech
C. Azure AI Vision
D. Azure AI Language

Correct Answer: D

Explanation:
The Azure AI Language service provides multiple NLP capabilities under one unified service.


Question 8

You want to categorize incoming support tickets into “Billing,” “Technical,” or “General.” Which Azure AI Language capability should you use?

A. Sentiment analysis
B. Key phrase extraction
C. Text classification
D. Language detection

Correct Answer: C

Explanation:
Text classification assigns predefined categories or labels to text content.


Question 9

Which statement best describes the Azure AI Language service?

A. It focuses only on speech-to-text conversion
B. It translates text between languages
C. It extracts meaning and insights from text
D. It analyzes images and video streams

Correct Answer: C

Explanation:
Azure AI Language is designed to analyze text and extract meaning, including sentiment, entities, and key concepts.


Question 10

Which task would NOT typically be handled by the Azure AI Language service?

A. Identifying sentiment in customer reviews
B. Extracting organizations from news articles
C. Translating text from Spanish to English
D. Detecting key phrases in feedback forms

Correct Answer: C

Explanation:
Text translation is handled by Azure Translator, not the Azure AI Language service. The other tasks are core Language service capabilities.


Final Exam Tips for This Topic

  • If the question involves understanding text, think Azure AI Language
  • If the question involves translation, think Azure Translator
  • If the question involves speech, think Azure Speech
  • AI-900 questions focus on what the service does, not how to code it


Describe Capabilities of the Azure AI Language Service (AI-900 Exam Prep)

Where This Fits in the Exam

  • Exam: AI-900 – Microsoft Azure AI Fundamentals
  • Domain: Describe features of Natural Language Processing (NLP) workloads on Azure (15–20%)
  • Sub-area: Identify Azure tools and services for NLP workloads

At this level, the exam focuses on what the service does, when to use it, and how it differs from other Azure AI services—not on implementation or coding.


What Is the Azure AI Language Service?

The Azure AI Language service is a cloud-based NLP service that enables applications to understand, analyze, and extract meaning from text.

It brings together several NLP capabilities under a single unified service, making it easier to build text-based AI solutions such as:

  • Customer feedback analysis
  • Chatbots
  • Document processing
  • Knowledge mining

For AI-900, think of it as “the main Azure service for understanding text.”


Key Capabilities of the Azure AI Language Service

1. Text Analytics

Text Analytics allows applications to analyze raw text and extract insights.

Main features include:

  • Sentiment analysis
  • Key phrase extraction
  • Named entity recognition
  • Language detection

These features are widely tested on the exam.


2. Sentiment Analysis

What it does:
Determines whether text expresses a positive, negative, neutral, or mixed sentiment.

Example use cases:

  • Analyzing customer reviews
  • Measuring brand perception on social media
  • Evaluating survey responses

Exam tip:
Sentiment analysis answers “How does the text feel?”
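The "mixed" label is easier to remember with a small sketch of how a document-level label could be derived from sentence-level labels. The rule below is my understanding of the behavior described in the service documentation and should be treated as an assumption; `document_sentiment` is a hypothetical helper and does not call the service.

```python
def document_sentiment(sentence_labels):
    """Combine per-sentence labels into one document-level label.

    Assumed rule: positive and negative together give 'mixed'; otherwise any
    positive gives 'positive', any negative gives 'negative', else 'neutral'.
    """
    has_positive = "positive" in sentence_labels
    has_negative = "negative" in sentence_labels
    if has_positive and has_negative:
        return "mixed"
    if has_positive:
        return "positive"
    if has_negative:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # e.g. "The service was great." followed by "The Wi-Fi was poor."
    print(document_sentiment(["positive", "negative"]))
```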


3. Key Phrase Extraction

What it does:
Identifies the main talking points in a block of text.

Example:

“The hotel had great service but poor Wi-Fi.”

Key phrases might include:

  • great service
  • poor Wi-Fi

Common exam scenario:
Summarizing long documents or feedback automatically.


4. Named Entity Recognition (NER)

What it does:
Detects and categorizes entities mentioned in text.

Common entity types:

  • People
  • Organizations
  • Locations
  • Dates
  • Products

Example:

“Satya Nadella is the CEO of Microsoft.”

Entities detected:

  • Person: Satya Nadella
  • Organization: Microsoft

5. Language Detection

What it does:
Identifies the language a piece of text is written in.

Why it matters:

  • Enables multilingual applications
  • Often used before translation or sentiment analysis

Exam tip:
Azure AI Language can detect language without being told what it is.


6. Question Answering

What it does:
Allows applications to answer natural language questions based on provided content.

Key points:

  • Replaces the older QnA Maker
  • Uses FAQs, documents, or URLs as knowledge sources
  • Commonly used in chatbots and helpdesk systems

Example:

User: “What is your return policy?”
Bot responds using stored knowledge.


7. Text Classification

What it does:
Assigns predefined categories or labels to text.

Examples:

  • Classifying emails as billing, technical support, or general inquiry
  • Tagging support tickets automatically

Important distinction:
This is about categorizing content, not detecting sentiment.


8. Custom Language Models

What it does:
Allows organizations to train custom NLP models using their own data.

Used for:

  • Domain-specific terminology
  • Industry-specific language (legal, healthcare, finance)

AI-900 focus:
Know that customization is possible, not how to train models.


Azure AI Language Service vs Other Azure AI Services

This distinction is frequently tested.

Service           | Primary Purpose
Azure AI Language | Understand and analyze text
Azure Translator  | Translate text between languages
Azure Speech      | Speech-to-text and text-to-speech
Azure Vision      | Analyze images and video

Exam shortcut:
If the scenario is about meaning, sentiment, or structure of text, the answer is usually Azure AI Language service.


Common Exam Scenarios to Watch For

You’ll often see questions like:

  • “Which Azure service should you use to analyze customer reviews?”
  • “Which service extracts people and locations from documents?”
  • “Which NLP service powers chatbots with question answering?”

If it involves text understanding, not translation or speech → Azure AI Language.


Key Takeaways for AI-900

  • Azure AI Language service is the primary NLP service for text analysis
  • It supports:
    • Sentiment analysis
    • Key phrase extraction
    • Entity recognition
    • Language detection
    • Question answering
    • Text classification
  • It is different from Translator and Speech
  • AI-900 focuses on capabilities and use cases, not APIs or code


Practice Questions: Describe Capabilities of the Azure AI Speech Service (AI-900 Exam Prep)

Practice Exam Questions


Question 1

A company wants to automatically convert recorded customer support calls into written transcripts for analysis.
Which Azure service should they use?

A. Azure AI Language
B. Azure AI Vision
C. Azure AI Speech
D. Azure Translator

Correct Answer: C

Explanation:
Azure AI Speech provides Speech to Text, which converts spoken audio into written text. Azure AI Language analyzes existing text but does not process audio.


Question 2

An application needs to read written instructions aloud to users using natural-sounding voices.
Which Azure AI Speech capability is required?

A. Speech to Text
B. Text to Speech
C. Speaker Recognition
D. Speech Translation

Correct Answer: B

Explanation:
Text to Speech converts written text into spoken audio. This is commonly used for accessibility and voice assistants.


Question 3

A global company wants users to speak in Spanish and hear an English audio response in real time.
Which Azure AI Speech feature supports this scenario?

A. Text Analytics
B. Azure Translator
C. Speech Translation
D. Speaker Identification

Correct Answer: C

Explanation:
Speech Translation enables real-time translation of spoken language and can output translated speech or text.


Question 4

Which scenario is best suited for Azure AI Speech instead of Azure AI Language?

A. Extracting key phrases from emails
B. Detecting sentiment in product reviews
C. Transcribing audio from meetings
D. Identifying entities in documents

Correct Answer: C

Explanation:
Azure AI Speech handles audio-based workloads such as transcribing meetings. Azure AI Language is used for written text analysis.


Question 5

A banking app needs to verify a user’s identity based on their voice.
Which Azure AI Speech capability should be used?

A. Speech to Text
B. Speaker Recognition
C. Text to Speech
D. Language Detection

Correct Answer: B

Explanation:
Speaker Recognition is used to verify or identify individuals based on voice characteristics.


Question 6

Which Azure AI Speech capability converts spoken language into written text in real time?

A. Speech Translation
B. Text to Speech
C. Speech to Text
D. Speaker Identification

Correct Answer: C

Explanation:
Speech to Text converts audio input into text and supports real-time transcription.


Question 7

A developer wants to generate lifelike, human-sounding voices for a virtual assistant.
Which feature of Azure AI Speech makes this possible?

A. Optical character recognition
B. Neural voices
C. Language modeling
D. Sentiment analysis

Correct Answer: B

Explanation:
Azure AI Speech uses neural voices to produce natural-sounding speech output.


Question 8

Which input type is primarily required when using the Azure AI Speech service?

A. Images
B. Video streams
C. Audio data
D. Structured tables

Correct Answer: C

Explanation:
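Most Azure AI Speech capabilities (Speech to Text, Speech Translation, and Speaker Recognition) take audio input, such as live speech or sound recordings. Text to Speech is the exception: it takes text as input and produces audio as output.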


Question 9

Which scenario would require combining multiple Azure AI Speech capabilities?

A. Detecting faces in images
B. Translating written documents
C. Speaking in one language and hearing a translated spoken response
D. Analyzing sentiment in customer feedback

Correct Answer: C

Explanation:
This scenario combines Speech to Text, Translation, and Text to Speech to deliver a speech-to-speech experience.


Question 10

Which statement best describes Azure AI Speech?

A. It analyzes written documents for meaning
B. It processes images and videos
C. It enables spoken language understanding and generation
D. It is used only for chatbots

Correct Answer: C

Explanation:
Azure AI Speech focuses on spoken language, including recognition, synthesis, translation, and speaker identification.


Final Exam Tips 🧠

  • If the question mentions audio, voice, or speech, think Azure AI Speech
  • Know the difference between:
    • Speech to Text
    • Text to Speech
    • Speech Translation
    • Speaker Recognition
  • AI-900 questions are conceptual and scenario-based, not technical


Describe Capabilities of the Azure AI Speech Service (AI-900 Exam Prep)

Where This Fits in the Exam

  • Exam: AI-900 – Microsoft Azure AI Fundamentals
  • Domain: Describe features of Natural Language Processing (NLP) workloads on Azure (15–20%)
  • Sub-area: Identify Azure tools and services for NLP workloads

For AI-900, Microsoft expects you to understand what the Azure AI Speech service does, when to use it, and how it differs from other AI services — not how to code it.


What Is the Azure AI Speech Service?

The Azure AI Speech service is a cloud-based service that enables applications to process spoken language. It allows systems to:

  • Convert speech into text
  • Convert text into natural-sounding speech
  • Translate spoken language
  • Recognize speakers and voices

It is part of Azure AI Services and focuses on audio and voice-based NLP workloads.


Core Capabilities of Azure AI Speech

1. Speech to Text

Speech to Text converts spoken audio into written text.

Key features:

  • Real-time transcription
  • Batch transcription of audio files
  • Support for multiple languages
  • Automatic punctuation and formatting

Common use cases:

  • Transcribing meetings or calls
  • Voice-controlled applications
  • Call center analytics
  • Accessibility tools (captions and subtitles)

📌 AI-900 exam tip:
If the question mentions converting spoken words into text, the answer is Azure AI Speech (Speech to Text).


2. Text to Speech

Text to Speech converts written text into natural-sounding spoken audio.

Key features:

  • Neural voices that sound human-like
  • Multiple languages and accents
  • Adjustable pitch, speed, and tone
  • Support for voice styles (e.g., cheerful, calm)
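
Although AI-900 does not test code, it can help to see how these settings are usually expressed. Azure AI Speech accepts Speech Synthesis Markup Language (SSML), where a prosody element carries rate and pitch and an express-as element carries a speaking style. The sketch below only builds the SSML string; it does not call the service. The helper name build_ssml, the voice name, and the style value are assumptions for illustration, and not every voice supports every style.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

def build_ssml(text, voice="en-US-JennyNeural", rate="medium", pitch="medium", style=None):
    """Build an SSML document (hypothetical helper; voice and style are assumed values)."""
    # <prosody> carries the adjustable speed (rate) and pitch.
    body = f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>{escape(text)}</prosody>"
    if style:
        # <mstts:express-as> requests a speaking style such as "cheerful".
        body = f"<mstts:express-as style={quoteattr(style)}>{body}</mstts:express-as>"
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">'
        f"<voice name={quoteattr(voice)}>{body}</voice></speak>"
    )

ssml = build_ssml("Your order has shipped.", rate="slow", pitch="high", style="cheerful")
ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed
print(ssml)
```

The resulting string is what an application would typically hand to the Text to Speech capability instead of plain text when it needs control over how the voice sounds.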

Common use cases:

  • Voice assistants
  • Read-aloud applications
  • Accessibility for visually impaired users
  • Automated announcements

📌 AI-900 exam tip:
If the scenario describes reading text out loud, think Text to Speech.


3. Speech Translation

Speech Translation converts spoken language into another language, either as text or synthesized speech.

Key features:

  • Real-time speech translation
  • Multi-language support
  • Can output translated speech or text

Common use cases:

  • Multilingual meetings
  • Travel and tourism apps
  • International customer support

📌 AI-900 exam tip:
Speech translation handles spoken language, while Azure Translator handles written text.


4. Speaker Recognition

Speaker Recognition identifies or verifies who is speaking based on their voice.

Capabilities include:

  • Speaker verification (confirming identity)
  • Speaker identification (determining who is speaking)
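
The difference between the two can be pictured as a one-to-one check versus a one-to-many search. The toy sketch below is only an illustration of that idea, not how Azure AI Speech is implemented: the three-number "voiceprints", the names, and the 0.9 threshold are all made up.

```python
import math

def cosine_similarity(a, b):
    """Similarity between two vectors: close to 1.0 means very alike."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical enrolled voice profiles (toy values, for illustration only).
ENROLLED = {
    "alice": [0.9, 0.1, 0.2],
    "bob": [0.1, 0.8, 0.3],
}

def verify_speaker(claimed_name, sample, threshold=0.9):
    """Verification (1:1): does the sample match the claimed speaker?"""
    return cosine_similarity(ENROLLED[claimed_name], sample) >= threshold

def identify_speaker(sample):
    """Identification (1:N): which enrolled speaker is the closest match?"""
    return max(ENROLLED, key=lambda name: cosine_similarity(ENROLLED[name], sample))

sample = [0.85, 0.15, 0.25]
print(verify_speaker("alice", sample))  # checks one claimed identity
print(identify_speaker(sample))         # searches all enrolled speakers
```

Verification answers "is this the person they claim to be?", while identification answers "who, among the known speakers, is this?".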

Common use cases:

  • Secure voice authentication
  • Call center speaker tracking
  • Personalized voice experiences

📌 AI-900 note:
You only need to understand what it does, not how voice models are trained.


5. Speech-to-Speech Scenarios

By combining Speech to Text, Translation, and Text to Speech, Azure AI Speech supports end-to-end voice experiences, such as:

  • Speaking in one language and hearing a response in another
  • Voice-based chatbots
  • Smart devices and assistants
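
The chaining described above can be sketched as three stages run in sequence. Every function below is a hypothetical stand-in that works on plain strings and bytes so the example runs anywhere; none of them is an Azure SDK call, and the phrase table is invented for the demonstration.

```python
# Toy phrase table standing in for a translation step (assumed data).
PHRASES = {"hola": "hello", "gracias": "thank you"}

def speech_to_text(audio: bytes) -> str:
    # Stand-in: pretend the audio bytes already contain the spoken words.
    return audio.decode("utf-8")

def translate(text: str) -> str:
    # Stand-in: look the phrase up, or pass it through unchanged.
    return PHRASES.get(text.lower(), text)

def text_to_speech(text: str) -> bytes:
    # Stand-in: pretend the returned bytes are synthesized audio.
    return text.encode("utf-8")

def speech_to_speech(audio: bytes) -> bytes:
    """Audio in one language -> text -> translated text -> audio in another."""
    return text_to_speech(translate(speech_to_text(audio)))

print(speech_to_speech(b"hola"))
```

The point for the exam is the order of the stages: recognition first, translation second, synthesis last.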

How Azure AI Speech Differs from Other Azure AI Services

Service | Primary Purpose
Azure AI Speech | Spoken language (audio)
Azure AI Language | Written text analysis
Azure Translator | Text translation
Azure AI Vision | Images and video

📌 Exam pattern to watch for:
Microsoft often tests whether you can choose the right service based on the input type (audio vs text vs image).
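
As a study aid, the input-type rule can be written down as a small lookup. This is a simplification of the table above, not an official decision procedure; the function name pick_service and the needs_translation flag are invented for the sketch.

```python
# Default service for each input type, following the table above.
SERVICE_BY_INPUT = {
    "audio": "Azure AI Speech",
    "text": "Azure AI Language",
    "image": "Azure AI Vision",
}

def pick_service(input_type: str, needs_translation: bool = False) -> str:
    """Return the service an exam scenario most likely points to."""
    # Written text that must be translated goes to Translator;
    # spoken translation stays with Azure AI Speech (Speech Translation).
    if input_type == "text" and needs_translation:
        return "Azure Translator"
    return SERVICE_BY_INPUT[input_type]

print(pick_service("audio"))
print(pick_service("text", needs_translation=True))
```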


Typical AI-900 Scenarios Involving Azure AI Speech

You should choose Azure AI Speech when a scenario involves:

  • Audio recordings
  • Live speech
  • Voice input or output
  • Real-time transcription
  • Spoken translation

Key Takeaways for the AI-900 Exam

  • Azure AI Speech focuses on spoken language, not written text
  • Core capabilities:
    • Speech to Text
    • Text to Speech
    • Speech Translation
    • Speaker Recognition
  • Exam questions are scenario-based, not technical
  • If the question mentions audio, voice, or speech, Azure AI Speech is usually the answer


Practice Questions: Identify Features of Generative AI Models (AI-900 Exam Prep)

Practice Questions


Question 1

Which scenario is the best example of a generative AI workload?

A. Predicting tomorrow’s temperature based on historical data
B. Classifying emails as spam or not spam
C. Generating a product description from a short prompt
D. Detecting anomalies in server performance metrics

Correct Answer: C

Explanation:
Generative AI models are designed to create new content, such as text, images, or code. Generating a product description is a content creation task, which is a core feature of generative AI.


Question 2

What is a key characteristic that distinguishes generative AI models from traditional machine learning models?

A. They require labeled training data
B. They produce deterministic outputs
C. They generate new data similar to training data
D. They can only be used for classification tasks

Correct Answer: C

Explanation:
Generative AI models learn patterns from data and generate new outputs that resemble the data they were trained on, rather than only predicting labels or numeric values.


Question 3

What role does a prompt play when working with a generative AI model?

A. It retrains the model with new data
B. It defines how the model should generate a response
C. It validates the accuracy of the model
D. It encrypts the generated output

Correct Answer: B

Explanation:
A prompt provides instructions or context that guide the model’s output. It does not retrain the model or affect its underlying parameters.


Question 4

Why can the same prompt sometimes produce different responses from a generative AI model?

A. The model uses rule-based logic
B. The model is deterministic
C. The model generates probabilistic outputs
D. The training data changes after each request

Correct Answer: C

Explanation:
Generative AI models use probabilistic methods: at each step they choose the next piece of output from a set of likely candidates rather than returning one fixed response, so the same prompt can produce different results.
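
A toy illustration of this idea is sketched below. The word list and probabilities are invented, and real models work over far larger vocabularies, but the contrast holds: sampling from the probabilities can vary between runs, while always taking the most likely word cannot.

```python
import random

# Hypothetical next-word probabilities for the prompt "The weather today is".
NEXT_WORD_PROBS = {"sunny": 0.5, "cloudy": 0.3, "rainy": 0.2}

def sample_next_word(rng):
    """Probabilistic choice: likelier words are picked more often, not always."""
    words = list(NEXT_WORD_PROBS)
    weights = list(NEXT_WORD_PROBS.values())
    return rng.choices(words, weights=weights, k=1)[0]

def most_likely_word():
    """Deterministic choice: always the single most probable word."""
    return max(NEXT_WORD_PROBS, key=NEXT_WORD_PROBS.get)

rng = random.Random()
samples = [sample_next_word(rng) for _ in range(10)]
print(samples)             # varies from run to run
print(most_likely_word())  # the same on every run
```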


Question 5

Which feature enables a generative AI model to produce human-like text responses?

A. Feature engineering
B. Context awareness and large-scale pretraining
C. Manual rule definition
D. Binary classification

Correct Answer: B

Explanation:
Generative AI models are pretrained on massive datasets and use context to generate fluent, coherent, human-like responses.


Question 6

Which statement best describes the training approach used by most generative AI models?

A. They are trained only on small, task-specific datasets
B. They are pretrained on large datasets and adapted for many tasks
C. They require real-time retraining for each request
D. They are trained exclusively using reinforcement learning

Correct Answer: B

Explanation:
Generative AI models are typically large pretrained models that can perform multiple tasks without retraining.


Question 7

Which scenario would most likely require the use of a generative AI model?

A. Predicting customer churn
B. Assigning product categories
C. Writing a summary of a long document
D. Detecting fraudulent transactions

Correct Answer: C

Explanation:
Summarization involves creating new text, which is a hallmark of generative AI workloads.


Question 8

What is a common risk associated with generative AI models that requires responsible AI controls?

A. Overfitting to training data
B. Hallucinations and biased outputs
C. Low model accuracy
D. Inability to scale

Correct Answer: B

Explanation:
Generative AI models can produce confident but incorrect information or biased content, making responsible AI safeguards essential.


Question 9

Which feature allows a generative AI model to continue a conversation in a meaningful way?

A. Feature scaling
B. Context retention
C. Label encoding
D. Data normalization

Correct Answer: B

Explanation:
Context retention lets a generative AI model take earlier turns of the conversation into account, so its replies stay coherent across a multi-turn conversation.
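
One common way chat applications provide this context is to send the earlier turns along with each new message. The sketch below shows only that bookkeeping; the build_prompt helper and the "role: text" layout are assumptions for illustration, and no model is called.

```python
def build_prompt(history, new_message):
    """Combine earlier turns with the new user message into one prompt."""
    lines = [f"{role}: {text}" for role, text in history]
    lines.append(f"user: {new_message}")
    return "\n".join(lines)

# Earlier turns of a hypothetical conversation.
history = [
    ("user", "My name is Sam."),
    ("assistant", "Nice to meet you, Sam."),
]

prompt = build_prompt(history, "What is my name?")
print(prompt)
```

Because the earlier turns are part of the prompt, the model has what it needs to answer the follow-up question.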


Question 10

Which statement best describes the scope of tasks generative AI models can perform?

A. They are limited to a single predefined task
B. They can perform multiple tasks using the same model
C. They must be retrained for each task
D. They only work with numerical data

Correct Answer: B

Explanation:
Generative AI models are general-purpose, capable of handling a wide variety of tasks such as summarization, translation, content generation, and question answering.


Final Exam Tip 💡

If an AI-900 question mentions:

  • Creating text, images, or code
  • Prompts
  • Conversations
  • Human-like responses

👉 Think: Generative AI model

