Practice Exam Questions
Question 1
What is the primary purpose of the self-attention mechanism in a Transformer model?
A. To reduce the size of the training dataset
B. To allow the model to focus on relevant parts of the input sequence
C. To replace the need for training data
D. To process words strictly in order
Correct Answer: B
Explanation:
Self-attention enables a Transformer to determine which words in a sentence are most relevant to one another, improving context understanding. It does not enforce strict order or reduce dataset size.
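To make this concrete, here is a minimal NumPy sketch of scaled dot-product self-attention. It is a simplification for study purposes (real Transformers use learned query/key/value projections; here the token embeddings `X` stand in for all three), not exam content.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    # Toy version: X serves as queries, keys, and values at once
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)        # similarity of every token to every other token
    weights = softmax(scores, axis=-1)   # each row is a probability distribution
    return weights @ X, weights          # output: attention-weighted mix of token vectors

# Three "tokens" with 2-dimensional embeddings
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
out, w = self_attention(X)
```

Each row of `w` shows how strongly one token attends to every token in the sequence, which is exactly the "relevance" the explanation above describes.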
Question 2
Which feature allows Transformers to be trained more efficiently than recurrent neural networks (RNNs)?
A. Sequential word processing
B. Parallel processing of input data
C. Manual feature engineering
D. Rule-based language models
Correct Answer: B
Explanation:
Transformers process entire sequences in parallel, unlike RNNs that process tokens sequentially. This makes Transformers faster and more scalable.
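The difference can be sketched in a few lines of NumPy. This is an illustrative toy (the tanh recurrence below is a stand-in for an RNN cell, not a real trained model): the RNN-style loop must wait for each step's hidden state, while the Transformer-style computation handles all tokens in a single matrix operation.

```python
import numpy as np

np.random.seed(0)
X = np.random.randn(5, 4)   # 5 tokens, 4-dimensional embeddings
W = np.random.randn(4, 4)   # shared weight matrix

# RNN-style: each step depends on the previous hidden state, so it is sequential
h = np.zeros(4)
rnn_states = []
for x in X:
    h = np.tanh(x + h @ W)
    rnn_states.append(h)

# Transformer-style: one matrix multiply covers every token at once (parallelizable)
transformed = np.tanh(X @ W)
```

On real hardware the second form maps directly onto batched GPU matrix multiplies, which is why Transformers train faster at scale.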
Question 3
A key reason Transformers require positional encoding is that they:
A. Use convolutional layers
B. Process all input tokens at the same time
C. Rely on labeled data only
D. Perform unsupervised learning
Correct Answer: B
Explanation:
Because Transformers process words in parallel, positional encoding is needed to preserve information about word order in a sentence.
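The sinusoidal positional encoding from the original Transformer paper can be sketched as follows. This is a study aid, not exam content: each position gets a unique pattern of sine and cosine values that is added to the token embeddings so the model can recover word order.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]          # positions 0..seq_len-1, as a column
    i = np.arange(d_model)[None, :]            # embedding dimensions, as a row
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])      # even dimensions use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])      # odd dimensions use cosine
    return pe

pe = positional_encoding(8, 4)   # encodings for 8 positions, 4-dim embeddings
```

Because every position produces a distinct vector, adding `pe` to the token embeddings lets the parallel attention layers distinguish "dog bites man" from "man bites dog".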
Question 4
Which type of AI workload most commonly uses Transformer-based models?
A. Time-series forecasting
B. Natural language processing
C. Image compression
D. Robotics control systems
Correct Answer: B
Explanation:
Transformers are primarily used for NLP tasks such as translation, summarization, and conversational AI.
Question 5
Which statement best describes the encoder–decoder architecture used in many Transformer models?
A. Both components generate output text
B. The encoder understands input, and the decoder generates output
C. The decoder trains the encoder
D. Both components store training data
Correct Answer: B
Explanation:
The encoder processes and understands the input sequence, while the decoder generates the output sequence based on that understanding.
Question 6
Why are Transformers better at handling long-range dependencies in text compared to earlier models?
A. They use fewer parameters
B. They rely on handcrafted grammar rules
C. They use attention to relate all words in a sequence
D. They process words one at a time
Correct Answer: C
Explanation:
Self-attention allows Transformers to evaluate relationships between all words in a sentence, regardless of distance.
Question 7
Which Azure scenario is most likely to involve a Transformer-based model?
A. Predicting tomorrow’s stock price
B. Detecting network hardware failures
C. Translating text between languages
D. Calculating average sales per region
Correct Answer: C
Explanation:
Language translation is a classic NLP task that relies heavily on Transformer architectures.
Question 8
What is a major advantage of Transformers over traditional sequence models?
A. They require no training data
B. They eliminate bias automatically
C. They improve scalability and performance
D. They work only with structured data
Correct Answer: C
Explanation:
Transformers scale efficiently due to parallel processing and attention mechanisms, improving performance on large datasets.
Question 9
Which statement about Transformers is TRUE?
A. They are rule-based AI systems
B. They process data strictly sequentially
C. They are a type of deep learning model
D. They are limited to image recognition
Correct Answer: C
Explanation:
Transformers are deep learning architectures commonly used for NLP tasks.
Question 10
Which feature enables a Transformer model to understand the context of a word based on surrounding words?
A. Positional encoding
B. Tokenization
C. Self-attention
D. Data labeling
Correct Answer: C
Explanation:
Self-attention allows the model to weigh the importance of surrounding words when interpreting meaning and context.
Quick Exam Tip
If you see keywords like:
- attention
- context
- parallel processing
- language understanding
- Azure OpenAI
You’re almost certainly dealing with a Transformer-based model.
