Month: May 2026

AI, AI Governance, AI-901, Microsoft Certification May 18, 2026

Describe considerations for privacy and security in an AI Solution (AI-901 Exam Prep)

This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Identify AI concepts and capabilities (40–45%)
   --> Describe principles of responsible AI
      --> Describe considerations for privacy and security in an AI Solution

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Privacy and security are essential principles of Responsible AI and important topics for the AI-901 certification exam. Microsoft emphasizes that AI systems must protect sensitive information, respect user privacy, and defend against unauthorized access or malicious attacks.

As AI systems increasingly process personal, financial, medical, and business data, organizations must ensure that their AI solutions are secure and trustworthy.

What Are Privacy and Security in AI?

Although related, privacy and security are different concepts.

Concept	Meaning
Privacy	Protecting personal and sensitive information and ensuring proper data usage
Security	Protecting systems, models, and data from unauthorized access, attacks, or misuse

Both principles are critical when developing and deploying AI systems.

Why Privacy and Security Matter

AI systems often process large amounts of sensitive information, including:

Personal data
Financial records
Medical information
Images and videos
Voice recordings
Customer behavior data
Business intelligence data

If privacy or security is compromised, organizations may face:

Data breaches
Identity theft
Financial loss
Legal penalties
Loss of customer trust
Regulatory violations

Responsible AI requires organizations to safeguard both the data and the systems that use it.

Privacy Considerations in AI

Collect Only Necessary Data

Organizations should collect only the data required for the AI solution to function properly.

This concept is often called data minimization.

Example

A movie recommendation system may need viewing preferences but may not need a user’s medical history.

Collecting unnecessary data increases privacy risks.

User Consent and Transparency

Users should understand:

What data is being collected
Why the data is being collected
How the data will be used
Who can access the data

Organizations should obtain appropriate user consent before collecting or processing personal information.

Example

A voice assistant application should clearly inform users that voice recordings are being stored and analyzed.

Protect Sensitive Information

Sensitive data should be carefully protected during:

Collection
Storage
Processing
Transmission

Examples of sensitive information include:

Social Security numbers
Credit card data
Medical records
Biometric data

Organizations often use encryption and access controls to protect sensitive data.

Anonymization and Masking

Organizations can reduce privacy risks by removing or hiding personally identifiable information (PII).

Techniques include:

Anonymization
Data masking
Tokenization

Example

A healthcare AI system may replace patient names with anonymous identifiers before training a model.

Compliance with Regulations

Organizations must comply with privacy laws and regulations.

Examples include:

GDPR (General Data Protection Regulation)
HIPAA (Health Insurance Portability and Accountability Act)
CCPA (California Consumer Privacy Act)

AI systems should be designed with regulatory compliance in mind.

Security Considerations in AI

Protecting AI Systems from Unauthorized Access

AI systems should include strong authentication and authorization controls.

Examples

Multi-factor authentication (MFA)
Role-based access control (RBAC)
Identity management systems

Only authorized users should be able to access sensitive models or data.

Securing Data

Data should be protected both:

At rest (stored data)
In transit (moving across networks)

Encryption is commonly used to secure data in both situations.

Protecting Models from Attacks

AI systems can be targets for malicious attacks.

Examples include:

Adversarial attacks
Data poisoning
Model theft
Prompt injection attacks in generative AI systems

Organizations should monitor for suspicious activity and secure AI infrastructure.

Adversarial Attacks

An adversarial attack occurs when someone intentionally manipulates input data to fool an AI model.

Example

Small changes to an image may cause an AI vision system to incorrectly identify an object.

These attacks can reduce reliability and create safety risks.

Data Poisoning

Data poisoning occurs when attackers intentionally insert misleading or malicious data into training datasets.

Example

An attacker adds fraudulent examples into a spam detection dataset so spam messages are classified as safe.

This can compromise model accuracy and trustworthiness.

Generative AI Security Risks

Generative AI introduces additional privacy and security challenges.

Examples include:

Prompt injection attacks
Exposure of confidential data
Harmful content generation
Leakage of sensitive training data

Organizations should implement safeguards such as:

Content filtering
Access restrictions
Human review
Monitoring and logging

Shared Responsibility in Cloud AI

When using cloud-based AI services such as Microsoft Azure AI Services, security responsibilities are shared.

Microsoft Responsibilities	Customer Responsibilities
Physical infrastructure security	User access management
Network security	Proper configuration
Cloud platform protection	Data governance
Service availability	Compliance and policy management

Understanding the shared responsibility model is important for cloud security.

Real-World Example

Scenario: AI Banking Chatbot

A bank deploys an AI chatbot that helps customers manage accounts.

Privacy Considerations

Protect customer financial data
Obtain consent for data collection
Limit access to sensitive records
Mask account numbers in logs

Security Considerations

Use encryption
Require authentication
Prevent unauthorized access
Monitor for suspicious activity
Protect against prompt injection attacks

Risk Mitigation Strategies

Access controls
Security monitoring
Data anonymization
Regular audits
Employee security training

This type of scenario aligns well with AI-901 exam questions.

Privacy vs. Security

A common exam concept is understanding the difference between privacy and security.

Privacy Focuses On:

Proper use of personal data
User consent
Data collection practices
Data sharing limitations

Security Focuses On:

Protecting systems and data
Preventing attacks
Access control
Encryption
Threat detection

Privacy and security work together but are not the same thing.

Microsoft Responsible AI Principles

Microsoft identifies privacy and security as one of six core Responsible AI principles:

Fairness
Reliability and safety
Privacy and security
Inclusiveness
Transparency
Accountability

For AI-901, understand that privacy and security focus on protecting both users and AI systems.

Best Practices for Privacy and Security in AI

Organizations commonly use the following practices:

Encryption

Protect data by encrypting it:

At rest
In transit

Access Controls

Restrict system access using:

RBAC
MFA
Identity management

Data Governance

Establish policies for:

Data handling
Data retention
Data sharing
Compliance

Monitoring and Logging

Track suspicious behavior and system activity to detect threats early.

Regular Security Testing

Perform:

Vulnerability scans
Penetration testing
Security reviews

Human Oversight

Humans should monitor high-risk AI systems and review sensitive outputs.

Important AI-901 Exam Tips

For the exam, remember these key points:

Privacy protects personal and sensitive information.
Security protects systems, models, and data from attacks or unauthorized access.
Data minimization reduces privacy risk.
Encryption protects data at rest and in transit.
AI systems can face adversarial attacks and data poisoning.
Generative AI introduces additional security concerns.
User consent and transparency are important privacy considerations.
Privacy and security are one of Microsoft’s six Responsible AI principles.

Quick Knowledge Check

Question 1

What is the difference between privacy and security?

Answer

Privacy focuses on proper handling of personal data, while security focuses on protecting systems and data from threats and unauthorized access.

Question 2

What is data minimization?

Answer

Collecting only the data necessary for an AI solution to function.

Question 3

What is an adversarial attack?

Answer

An attempt to intentionally manipulate AI inputs to fool the model into producing incorrect results.

Question 4

Why is encryption important in AI systems?

Answer

It helps protect sensitive data from unauthorized access during storage and transmission.

Practice Exam Questions

Question 1

A company develops an AI-powered healthcare application that stores patient medical records.

Which practice BEST helps protect sensitive patient data?

A. Publicly sharing all training data
B. Encrypting stored and transmitted data
C. Removing all authentication requirements
D. Allowing unrestricted administrator access

Correct Answer

B. Encrypting stored and transmitted data

Explanation

Encryption protects sensitive information both while stored (at rest) and while moving across networks (in transit). This is a key privacy and security practice for AI systems handling confidential data.

Why the Other Answers Are Incorrect

A. Publicly sharing all training data

This would create major privacy risks.

C. Removing all authentication requirements

Authentication is necessary for security.

D. Allowing unrestricted administrator access

Access should be limited and controlled.

Question 2

What is the PRIMARY focus of privacy in an AI solution?

A. Preventing hardware failures
B. Protecting personal and sensitive information
C. Increasing processing speed
D. Improving graphics performance

Correct Answer

B. Protecting personal and sensitive information

Explanation

Privacy focuses on ensuring personal data is collected, stored, shared, and used responsibly and lawfully.

Why the Other Answers Are Incorrect

A. Preventing hardware failures

This relates to infrastructure reliability.

C. Increasing processing speed

Performance optimization is unrelated to privacy.

D. Improving graphics performance

Graphics performance is unrelated to Responsible AI privacy principles.

Question 3

Which scenario BEST demonstrates data minimization?

A. Collecting all available user data regardless of need
B. Collecting only the information necessary for the AI solution to function
C. Sharing customer data with external organizations
D. Storing user data indefinitely

Correct Answer

B. Collecting only the information necessary for the AI solution to function

Explanation

Data minimization means limiting data collection to only what is necessary for a specific purpose, reducing privacy risks.

Why the Other Answers Are Incorrect

A. Collecting all available user data regardless of need

This increases privacy risk.

C. Sharing customer data with external organizations

This may create additional privacy concerns.

D. Storing user data indefinitely

Long-term storage may increase compliance and security risks.

Question 4

An attacker slightly modifies an image so that an AI vision system incorrectly identifies an object.

What type of attack is this?

A. Data normalization
B. Adversarial attack
C. Batch processing
D. Role-based access control

Correct Answer

B. Adversarial attack

Explanation

Adversarial attacks intentionally manipulate inputs to fool AI systems into making incorrect predictions or classifications.

Why the Other Answers Are Incorrect

A. Data normalization

Normalization prepares data for analysis.

C. Batch processing

Batch processing refers to grouped data operations.

D. Role-based access control

RBAC is a security access management method.

Question 5

Which security measure helps ensure only authorized users can access an AI system?

A. Increasing training data size
B. Role-based access control (RBAC)
C. Removing encryption
D. Disabling audit logs

Correct Answer

B. Role-based access control (RBAC)

Explanation

RBAC restricts access based on user roles and permissions, helping secure AI systems and sensitive data.

Why the Other Answers Are Incorrect

A. Increasing training data size

Training data size does not control access.

C. Removing encryption

Removing encryption weakens security.

D. Disabling audit logs

Audit logs help monitor and investigate security events.

Question 6

What is the PRIMARY purpose of encryption in AI systems?

A. To increase model accuracy
B. To protect data from unauthorized access
C. To reduce cloud costs
D. To eliminate the need for passwords

Correct Answer

B. To protect data from unauthorized access

Explanation

Encryption converts data into a protected format that unauthorized users cannot easily read.

It is commonly used to secure sensitive information.

Why the Other Answers Are Incorrect

A. To increase model accuracy

Encryption does not improve prediction quality.

C. To reduce cloud costs

Encryption is a security measure, not a cost optimization tool.

D. To eliminate the need for passwords

Authentication may still be required.

Question 7

A company clearly informs users about what personal information is being collected and how it will be used before collecting the data.

What privacy concept does this BEST represent?

A. User consent and transparency
B. Adversarial testing
C. Model drift
D. Data poisoning

Correct Answer

A. User consent and transparency

Explanation

Responsible AI systems should inform users about data collection practices and obtain appropriate consent before using personal data.

Why the Other Answers Are Incorrect

B. Adversarial testing

Adversarial testing evaluates resistance to attacks.

C. Model drift

Model drift refers to performance changes over time.

D. Data poisoning

Data poisoning involves malicious manipulation of training data.

Question 8

An attacker intentionally inserts misleading examples into a training dataset to reduce model accuracy.

What is this called?

A. Encryption
B. Data masking
C. Data poisoning
D. Data normalization

Correct Answer

C. Data poisoning

Explanation

Data poisoning occurs when attackers deliberately manipulate training data to negatively affect AI model behavior.

Why the Other Answers Are Incorrect

A. Encryption

Encryption protects data confidentiality.

B. Data masking

Data masking hides sensitive information.

D. Data normalization

Normalization standardizes data values.

Question 9

Which statement BEST describes the difference between privacy and security?

A. Privacy and security are identical concepts
B. Privacy focuses on proper data usage, while security focuses on protecting systems and data from threats
C. Privacy focuses only on hardware devices
D. Security applies only to cloud computing

Correct Answer

B. Privacy focuses on proper data usage, while security focuses on protecting systems and data from threats

Explanation

Privacy concerns how personal data is collected and used, while security focuses on preventing unauthorized access, attacks, and data breaches.

Why the Other Answers Are Incorrect

A. Privacy and security are identical concepts

They are related but distinct principles.

C. Privacy focuses only on hardware devices

Privacy primarily concerns information handling.

D. Security applies only to cloud computing

Security applies to all computing environments.

Question 10

Which Microsoft Responsible AI principle focuses on protecting sensitive information and securing AI systems?

A. Fairness
B. Inclusiveness
C. Privacy and security
D. Transparency

Correct Answer

C. Privacy and security

Explanation

The Privacy and Security principle focuses on safeguarding personal data and protecting AI systems from threats, misuse, and unauthorized access.

Why the Other Answers Are Incorrect

A. Fairness

Fairness focuses on avoiding unjust bias and discrimination.

B. Inclusiveness

Inclusiveness focuses on designing systems accessible to diverse users.

D. Transparency

Transparency focuses on explainability and understanding AI decisions.

Final Thoughts

Privacy and security are foundational Responsible AI principles and key topics for the AI-901 certification exam. Microsoft expects candidates to understand how AI systems handle sensitive data, how security threats can affect AI solutions, and how organizations can protect both users and systems.

Strong privacy and security practices help organizations build trustworthy AI solutions while reducing legal, operational, and reputational risks.

Go to the AI-901 Exam Prep Hub main page

AI, AI Governance, AI-901, Microsoft Certification May 18, 2026

Describe considerations for reliability and safety in an AI Solution (AI-901 Exam Prep)

This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Identify AI concepts and capabilities (40–45%)
   --> Describe principles of responsible AI
      --> Describe considerations for reliability and safety in an AI Solution

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Reliability and safety are essential principles of Responsible AI and are important topics for the AI-901 certification exam. Microsoft emphasizes that AI systems should operate consistently, safely, and predictably, especially when used in environments that impact people’s lives, finances, health, or security.

Understanding reliability and safety means understanding how AI systems can fail, the risks associated with those failures, and the methods organizations use to reduce those risks.

What Is Reliability and Safety in AI?

Reliability and safety refer to ensuring that AI systems:

Operate consistently
Produce dependable results
Minimize harmful outcomes
Perform safely under expected and unexpected conditions

A reliable AI system should continue functioning properly even when:

Data changes
Conditions vary
Users behave unexpectedly
Inputs are incomplete or unusual

A safe AI system should avoid causing physical, emotional, financial, or operational harm.

Why Reliability and Safety Matter

AI systems are increasingly used in high-impact scenarios such as:

Healthcare diagnostics
Autonomous vehicles
Financial fraud detection
Industrial automation
Security monitoring
Customer service
Smart home devices

Failures in these systems can lead to:

Incorrect medical recommendations
Financial losses
Physical injury
Security vulnerabilities
Loss of trust
Legal and compliance issues

Because of these risks, organizations must carefully design, test, and monitor AI solutions.

Reliability vs. Safety

Although closely related, reliability and safety are slightly different concepts.

Concept	Meaning
Reliability	The AI system consistently performs as expected
Safety	The AI system avoids causing harm

Example

A self-driving car that correctly detects road signs most of the time may be considered reliable.

However, if it occasionally fails in dangerous situations and causes accidents, it is not safe enough.

Both principles must work together.

Key Reliability Considerations

Consistent Performance

AI systems should deliver stable and dependable outputs over time.

Example

A fraud detection model should consistently identify suspicious transactions accurately, not fluctuate unpredictably from day to day.

Inconsistent behavior reduces user trust and may create operational problems.

Handling Unexpected Inputs

AI systems should manage unusual or incomplete inputs gracefully.

Example

A chatbot should respond appropriately when receiving misspelled text, slang, or unsupported questions rather than producing harmful or nonsensical responses.

This is sometimes called robustness.

Testing Across Different Conditions

AI systems should be tested under a wide variety of conditions before deployment.

Examples

Different user groups
Varying lighting conditions for image recognition
Different accents in speech recognition
Heavy workloads and traffic spikes
Missing or corrupted data

Comprehensive testing helps identify weaknesses before users are affected.

Monitoring After Deployment

AI reliability can degrade over time because:

User behavior changes
New data patterns emerge
Business environments evolve

This is often called model drift or data drift.

Organizations should continuously monitor AI systems to ensure they continue performing correctly.

Fail-Safe Mechanisms

AI systems should include safeguards in case something goes wrong.

Example

If an AI-powered medical system is uncertain about a diagnosis, it could escalate the case to a human doctor rather than making an unsafe recommendation.

Fail-safe mechanisms reduce the risk of harmful outcomes.

Key Safety Considerations

Preventing Harmful Outcomes

AI systems should minimize the possibility of causing harm.

Potential harms include:

Physical harm
Emotional harm
Financial harm
Reputational harm
Security risks

Example

A content moderation AI should avoid exposing users to dangerous or abusive material.

Human Oversight

Humans should remain involved in high-risk or sensitive AI decisions.

Examples

Doctors reviewing AI-assisted diagnoses
Loan officers reviewing loan denials
Security analysts reviewing threat alerts

Human oversight helps catch errors and improve accountability.

Security Against Attacks

AI systems can become targets for malicious attacks.

Examples include:

Feeding misleading data into models
Attempting to manipulate outputs
Extracting sensitive information
Prompt injection attacks in generative AI systems

Organizations must secure AI systems just like any other software system.

Reliability in Generative AI

Generative AI systems introduce additional reliability and safety challenges.

These systems may:

Generate incorrect information
Produce harmful content
Hallucinate facts
Create biased responses
Misinterpret prompts

Example

A generative AI chatbot may confidently provide inaccurate medical advice.

Because of this, generative AI systems often require:

Content filtering
Human review
Safety policies
Usage restrictions
Grounding with trusted data sources

Real-World Example

Scenario: AI Medical Assistant

A hospital deploys an AI solution that helps doctors identify diseases from medical images.

Reliability Requirements

Accurate image analysis
Consistent performance across different equipment
Reliable operation during heavy usage

Safety Requirements

Avoid dangerous misdiagnoses
Escalate uncertain cases to physicians
Protect patient data
Prevent harmful recommendations

Risk Mitigation Strategies

Extensive testing
Human oversight
Continuous monitoring
Security protections
Regular retraining

This type of scenario aligns well with AI-901 exam questions.

Common Causes of Reliability Problems

AI systems can become unreliable for many reasons.

Poor Quality Data

Incorrect or incomplete data can reduce model performance.

Example

A weather prediction system trained on inaccurate historical data may produce unreliable forecasts.

Insufficient Testing

Limited testing may fail to expose weaknesses.

Example

A facial recognition model tested only in bright lighting may fail in darker environments.

Data Drift

Real-world conditions may change over time.

Example

Customer purchasing behavior may evolve, reducing the accuracy of recommendation systems.

Adversarial Attacks

Malicious actors may intentionally manipulate AI systems.

Example

Small image modifications may fool computer vision systems into making incorrect classifications.

Microsoft Responsible AI Principles

Microsoft identifies reliability and safety as one of six core Responsible AI principles:

Fairness
Reliability and safety
Privacy and security
Inclusiveness
Transparency
Accountability

For AI-901, understand that reliability and safety focus on ensuring AI systems function dependably and minimize harmful outcomes.

Methods for Improving Reliability and Safety

Organizations use several strategies to improve AI reliability and safety.

Robust Testing

Test systems using:

Edge cases
Rare scenarios
Large workloads
Diverse user conditions
Adversarial testing

Monitoring and Logging

Track system behavior after deployment to identify:

Accuracy degradation
Failures
Unexpected outputs
Security concerns

Human-in-the-Loop Systems

Allow humans to review sensitive decisions before action is taken.

Safety Constraints

Limit what an AI system can do.

Example

A chatbot may block harmful or unsafe responses using content moderation filters.

Backup and Recovery Plans

Organizations should prepare for failures by implementing:

Rollback procedures
Redundant systems
Emergency shutdown controls

Azure and Responsible AI

Microsoft Azure AI Services and related AI platforms include features that help organizations improve reliability and safety, such as:

Monitoring tools
Security controls
Content filtering
Responsible AI guidance
Human review workflows
Governance frameworks

Microsoft encourages organizations to incorporate these principles throughout the AI lifecycle.

Important AI-901 Exam Tips

For the exam, remember these key points:

Reliability means AI systems perform consistently and dependably.
Safety means AI systems minimize harmful outcomes.
AI systems should be tested under many conditions.
Human oversight is important in sensitive scenarios.
Monitoring after deployment is essential.
Generative AI introduces additional safety risks.
Fail-safe mechanisms help reduce harm.
Reliability and safety are one of Microsoft’s six Responsible AI principles.

Quick Knowledge Check

Question 1

What is the primary goal of reliability in AI?

Answer

To ensure the AI system consistently performs as expected.

Question 2

Why is monitoring AI systems after deployment important?

Answer

Because data and user behavior can change over time, potentially reducing model performance.

Question 3

What is an example of a fail-safe mechanism?

Answer

Escalating uncertain AI decisions to a human reviewer.

Question 4

Why can generative AI systems create safety concerns?

Answer

Because they may generate inaccurate, harmful, or misleading content.

Practice Exam Questions

Question 1

A company deploys an AI-powered medical imaging system. The system automatically flags uncertain diagnoses for review by a physician before final decisions are made.

What Responsible AI practice does this BEST represent?

A. Data minimization
B. Human oversight
C. Data labeling
D. Batch processing

Correct Answer

B. Human oversight

Explanation

Human oversight involves allowing people to review, validate, or override AI decisions, especially in high-risk scenarios such as healthcare.

This helps reduce the risk of harmful outcomes.

Why the Other Answers Are Incorrect

A. Data minimization

Data minimization relates to collecting only necessary data.

C. Data labeling

Data labeling is the process of tagging training data.

D. Batch processing

Batch processing refers to processing data in groups.

Question 2

What is the PRIMARY goal of reliability in an AI solution?

A. Increasing advertising revenue
B. Ensuring the AI system performs consistently as expected
C. Eliminating all operational costs
D. Replacing all human workers

Correct Answer

B. Ensuring the AI system performs consistently as expected

Explanation

Reliability means an AI system consistently produces dependable and stable results under expected and unexpected conditions.

Why the Other Answers Are Incorrect

A. Increasing advertising revenue

Revenue generation is unrelated to Responsible AI reliability principles.

C. Eliminating all operational costs

Reliability focuses on system performance, not cost elimination.

D. Replacing all human workers

Responsible AI does not require complete automation.

Question 3

An AI chatbot receives unexpected user input containing spelling mistakes and slang. The chatbot still responds appropriately without crashing or producing harmful output.

What characteristic is the chatbot demonstrating?

A. Transparency
B. Robustness
C. Data encryption
D. Scalability

Correct Answer

B. Robustness

Explanation

Robustness refers to an AI system’s ability to handle unexpected, incomplete, or unusual inputs safely and reliably.

Why the Other Answers Are Incorrect

A. Transparency

Transparency relates to understanding how AI decisions are made.

C. Data encryption

Encryption protects data security.

D. Scalability

Scalability refers to handling increased workloads.

Question 4

Why should AI systems be continuously monitored after deployment?

A. AI systems never change once deployed
B. Data patterns and user behavior may change over time
C. Monitoring guarantees perfect model accuracy
D. Monitoring removes the need for testing

Correct Answer

B. Data patterns and user behavior may change over time

Explanation

Changes in real-world conditions can reduce model accuracy and reliability over time. Continuous monitoring helps identify these issues early.

This is often related to data drift or model drift.

Why the Other Answers Are Incorrect

A. AI systems never change once deployed

AI performance can change as conditions evolve.

C. Monitoring guarantees perfect model accuracy

No monitoring system can guarantee perfection.

D. Monitoring removes the need for testing

Testing before deployment remains essential.

Question 5

Which scenario BEST demonstrates a safety concern in AI?

A. A report loads slowly in a dashboard
B. A chatbot uses too much memory
C. An autonomous vehicle fails to recognize a pedestrian
D. A database backup takes longer than expected

Correct Answer

C. An autonomous vehicle fails to recognize a pedestrian

Explanation

This scenario could lead to physical harm, making it a major AI safety concern.

Safety focuses on minimizing harmful outcomes.

Why the Other Answers Are Incorrect

A. A report loads slowly in a dashboard

This is a performance issue.

B. A chatbot uses too much memory

This is a resource management issue.

D. A database backup takes longer than expected

This is an infrastructure or operational issue.

Question 6

What is a fail-safe mechanism in AI?

A. A process that guarantees 100% model accuracy
B. A backup plan that reduces harm when the AI system encounters problems
C. A method for increasing advertising performance
D. A process that removes all security requirements

Correct Answer

B. A backup plan that reduces harm when the AI system encounters problems

Explanation

Fail-safe mechanisms help prevent harmful outcomes if the AI system becomes uncertain or fails unexpectedly.

Example: Escalating uncertain medical diagnoses to human experts.

Why the Other Answers Are Incorrect

A. A process that guarantees 100% model accuracy

No AI system can guarantee perfect accuracy.

C. A method for increasing advertising performance

Advertising optimization is unrelated to fail-safe mechanisms.

D. A process that removes all security requirements

Security remains critically important.

Question 7

Which statement BEST describes the difference between reliability and safety?

A. Reliability focuses on consistent performance, while safety focuses on minimizing harm
B. Reliability and safety are identical concepts
C. Reliability applies only to hardware systems
D. Safety focuses only on data storage

Correct Answer

A. Reliability focuses on consistent performance, while safety focuses on minimizing harm

Explanation

Reliability ensures dependable system behavior, while safety ensures the AI system avoids causing harm.

Both are key Responsible AI principles.

Why the Other Answers Are Incorrect

B. Reliability and safety are identical concepts

They are closely related but distinct principles.

C. Reliability applies only to hardware systems

Reliability applies to AI software systems as well.

D. Safety focuses only on data storage

Safety includes preventing harmful outcomes.

Question 8

A generative AI system confidently provides incorrect medical advice.

What Responsible AI concern does this BEST represent?

A. Scalability
B. Hallucination and safety risk
C. Database normalization
D. Data compression

Correct Answer

B. Hallucination and safety risk

Explanation

Generative AI systems can sometimes generate inaccurate or fabricated information, known as hallucinations.

In healthcare scenarios, this creates significant safety concerns.

Why the Other Answers Are Incorrect

A. Scalability

Scalability concerns handling workload increases.

C. Database normalization

Normalization relates to database design.

D. Data compression

Compression reduces storage size.

Question 9

Why is extensive testing important before deploying an AI solution?

A. To identify weaknesses and unsafe behavior under different conditions
B. To guarantee the AI will never fail
C. To eliminate the need for monitoring after deployment
D. To reduce the amount of training data required

Correct Answer

A. To identify weaknesses and unsafe behavior under different conditions

Explanation

Testing across many conditions helps organizations discover problems before users are affected.

Testing improves reliability and safety.

Why the Other Answers Are Incorrect

B. To guarantee the AI will never fail

No testing process can guarantee zero failures.

C. To eliminate the need for monitoring after deployment

Monitoring remains necessary after deployment.

D. To reduce the amount of training data required

Testing does not reduce training data needs.

Question 10

Which Microsoft Responsible AI principle focuses on ensuring AI systems operate dependably and minimize harmful outcomes?

A. Inclusiveness
B. Accountability
C. Reliability and safety
D. Transparency

Correct Answer

C. Reliability and safety

Explanation

The Reliability and Safety principle focuses on ensuring AI systems operate consistently, safely, and predictably while reducing the risk of harmful outcomes.

Why the Other Answers Are Incorrect

A. Inclusiveness

Inclusiveness focuses on designing AI systems for diverse populations.

B. Accountability

Accountability concerns responsibility for AI systems and decisions.

D. Transparency

Transparency focuses on explainability and understanding AI behavior.

Final Thoughts

Reliability and safety are foundational concepts in Responsible AI and key topics for the AI-901 certification exam. Microsoft expects candidates to understand how AI systems can fail, how those failures can affect people and organizations, and how responsible design practices can reduce risks.

Reliable and safe AI systems help organizations build trust, reduce harm, and create more dependable AI-powered solutions.

Go to the AI-901 Exam Prep Hub main page

AI, AI-901, Microsoft Certification May 18, 2026

Describe considerations for fairness in an AI solution (AI-901 Exam Prep)

This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Identify AI concepts and capabilities (40–45%)
   --> Describe principles of responsible AI
      --> Describe considerations for fairness in an AI solution

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Fairness is one of the core principles of Responsible AI and is an important topic for the AI-901 certification exam. Microsoft emphasizes that AI systems should treat all people fairly and avoid producing biased or discriminatory outcomes.

Understanding fairness in AI means understanding how bias can enter an AI system, how unfair outcomes can affect people, and what organizations can do to reduce those risks.

What Is Fairness in AI?

Fairness in AI means that an AI system should make decisions or recommendations without unjustly favoring or disadvantaging individuals or groups.

An AI solution is considered unfair if it produces biased outcomes based on characteristics such as:

Gender
Race or ethnicity
Age
Religion
Disability status
Nationality
Socioeconomic background

The goal is not simply technical accuracy. An AI model can be highly accurate overall while still treating certain groups unfairly.

Why Fairness Matters

AI systems increasingly influence important real-world decisions, including:

Hiring and recruiting
Loan approvals
Healthcare recommendations
Insurance pricing
Criminal justice assessments
School admissions
Customer service prioritization

If these systems are unfair, they can reinforce or amplify existing social inequalities.

For example:

A hiring AI might prefer resumes from men because historical company data reflects mostly male hires.
A facial recognition system may perform poorly for people with darker skin tones if training data lacked diversity.
A loan approval model may unfairly deny applications from certain neighborhoods because of biased historical lending patterns.

These outcomes can damage trust, create legal risks, and harm individuals.

How Bias Enters an AI System

Fairness problems usually originate from bias in data, design, or implementation.

1. Biased Training Data

AI models learn patterns from historical data. If the historical data reflects human bias, the AI may learn and repeat that bias.

Example

If a company historically hired mostly men for engineering roles, an AI recruiting tool trained on that data may incorrectly learn that male candidates are preferable.

This is one of the most common causes of unfair AI systems.

2. Underrepresentation in Data

Some groups may not be sufficiently represented in the training dataset.

Example

A speech recognition model trained mostly on American English speakers may perform poorly for people with different accents.

When data lacks diversity, the AI system may not generalize well to all users.

3. Labeling Bias

Humans often label training data. Human assumptions and prejudices can influence those labels.

Example

If reviewers consistently rate certain groups more negatively during data labeling, the AI model may inherit those patterns.

4. Feature Selection Bias

Sometimes developers unintentionally include features that correlate with protected characteristics.

Example

Using ZIP codes in a lending model could indirectly reflect race or income levels.

Even if race is not explicitly included, proxy variables can still create unfair outcomes.

5. Algorithmic Bias

Some algorithms may optimize for overall accuracy while ignoring fairness across groups.

Example

An AI model may achieve 95% accuracy overall but perform significantly worse for a minority population.

This demonstrates why fairness metrics matter alongside accuracy metrics.

Key Fairness Considerations

When evaluating fairness in an AI solution, organizations should consider several important areas.

Equal Treatment

AI systems should provide similar quality of service and outcomes across different demographic groups.

Example

A facial recognition system should work equally well for all skin tones and genders.

Avoiding Discrimination

AI should not unfairly disadvantage protected groups.

Example

A hiring system should evaluate applicants based on qualifications rather than demographic patterns found in historical data.

Inclusive Design

AI systems should be designed for diverse populations from the beginning.

This includes:

Diverse datasets
Diverse testing groups
Accessibility considerations
Multiple languages and accents
Cultural differences

Transparency and Explainability

Organizations should understand how AI systems make decisions and be able to explain those decisions when needed.

Example

If a loan application is denied, the organization should be able to explain the factors involved.

Explainability helps identify unfair behavior and improves accountability.

Continuous Monitoring

Fairness is not a one-time task.

AI systems should be continuously monitored because:

Data changes over time
User populations evolve
Biases may emerge after deployment

Organizations should regularly review model outputs and retrain models when necessary.

Trade-Offs in Fairness

Fairness in AI is complex because different definitions of fairness can conflict.

For example:

Maximizing overall accuracy may reduce fairness for smaller groups.
Equal outcomes across groups may require adjusting decision thresholds.
Removing sensitive attributes does not always eliminate bias.

There is often no perfect fairness solution, which is why ethical judgment and governance are important.

Microsoft’s Responsible AI Principles

Microsoft identifies fairness as one of six core Responsible AI principles.

The six principles are:

Fairness
Reliability and safety
Privacy and security
Inclusiveness
Transparency
Accountability

For the AI-901 exam, you should understand that fairness focuses on ensuring AI systems do not create unjust bias or discrimination.

Tools and Techniques for Improving Fairness

Organizations can reduce unfairness using several approaches.

Improve Data Quality

Use diverse and representative datasets
Remove biased or low-quality data
Balance underrepresented groups

Evaluate Fairness Metrics

Measure model performance across different groups instead of relying only on overall accuracy.

Example Metrics

False positive rates
False negative rates
Accuracy by demographic group

Human Oversight

Humans should remain involved in reviewing sensitive AI decisions.

Example

An AI hiring recommendation system might assist recruiters, but humans should make final hiring decisions.

Explainable AI

Explainability tools help organizations understand why models make certain decisions.

This can help detect hidden bias.

Responsible AI Governance

Organizations should establish policies, reviews, and ethical guidelines for AI development and deployment.

Real-World Example of Fairness

Scenario: AI-Based Hiring System

A company creates an AI model to screen resumes.

Potential Fairness Problem

Historical hiring data shows the company hired mostly men for technical roles.

The AI learns patterns associated with male candidates and begins ranking female candidates lower.

Possible Solutions

Use more diverse training data
Remove biased features
Audit model outputs regularly
Include human review
Test performance across demographic groups

This is a classic AI fairness scenario and aligns well with AI-901 exam objectives.

Azure and Responsible AI

Microsoft Azure AI Services and related AI platforms include Responsible AI guidance and tools to help developers:

Detect bias
Improve transparency
Monitor model behavior
Evaluate fairness metrics
Implement human oversight

Microsoft encourages organizations to adopt Responsible AI practices throughout the AI lifecycle.

Important AI-901 Exam Tips

For the exam, remember these key points:

Fairness means AI systems should avoid unjust bias and discrimination.
Bias often originates from training data.
High model accuracy does not guarantee fairness.
Diverse datasets help improve fairness.
Human oversight remains important.
Fairness is one of Microsoft’s six Responsible AI principles.
AI systems should be monitored continuously after deployment.
Transparency and explainability support fairness efforts.

Practice Exam Questions

Question 1

A company develops an AI system to screen job applicants. The system consistently ranks male applicants higher because historical hiring data mostly contains successful male candidates.

What is the MOST likely cause of this fairness issue?

A. Insufficient computing power
B. Biased training data
C. Excessive model transparency
D. Lack of cloud storage

Correct Answer

B. Biased training data

Explanation

The AI system learned patterns from historical hiring data that reflected past hiring bias. Because the training data was biased toward male candidates, the model inherited those unfair patterns.

This is one of the most common fairness problems in AI systems.

Why the Other Answers Are Incorrect

A. Insufficient computing power

Computing power affects performance and speed, not fairness.

C. Excessive model transparency

Transparency helps identify fairness problems rather than causing them.

D. Lack of cloud storage

Storage capacity does not create demographic bias in AI models.

Question 2

Which statement BEST describes fairness in AI?

A. AI systems should maximize profit for organizations
B. AI systems should make decisions without unjust bias
C. AI systems should eliminate all human involvement
D. AI systems should always make identical decisions for everyone

Correct Answer

B. AI systems should make decisions without unjust bias

Explanation

Fairness in AI focuses on preventing unjust discrimination and ensuring equitable treatment across different groups of people.

Fairness does not necessarily mean identical outcomes for everyone, but rather avoiding harmful or biased treatment.

Why the Other Answers Are Incorrect

A. AI systems should maximize profit for organizations

Profitability is unrelated to the Responsible AI principle of fairness.

C. AI systems should eliminate all human involvement

Human oversight is often important for maintaining fairness.

D. AI systems should always make identical decisions for everyone

Different circumstances may justify different outcomes. Fairness is about avoiding unjust bias.

Question 3

A speech recognition system performs poorly for users with certain accents because most training samples came from a single geographic region.

What fairness issue does this demonstrate?

A. Overfitting
B. Underrepresentation in training data
C. Excessive transparency
D. Encryption failure

Correct Answer

B. Underrepresentation in training data

Explanation

The training data lacked sufficient diversity, causing the model to perform poorly for underrepresented user groups.

Inclusive and representative datasets help improve fairness.

Why the Other Answers Are Incorrect

A. Overfitting

Overfitting occurs when a model memorizes training data rather than generalizing properly.

C. Excessive transparency

Transparency does not cause poor recognition accuracy for accents.

D. Encryption failure

Encryption relates to security, not fairness.

Question 4

Which Microsoft Responsible AI principle focuses on reducing bias and discrimination?

A. Accountability
B. Transparency
C. Fairness
D. Reliability and safety

Correct Answer

C. Fairness

Explanation

The Fairness principle focuses on ensuring AI systems do not unfairly disadvantage individuals or groups.

Why the Other Answers Are Incorrect

A. Accountability

Accountability concerns responsibility for AI systems and their outcomes.

B. Transparency

Transparency focuses on explainability and understanding AI decisions.

D. Reliability and safety

Reliability and safety focus on dependable and safe system operation.

Question 5

An organization removes race from a loan approval model, but the model still produces biased outcomes because ZIP code data indirectly reflects demographic patterns.

What does ZIP code represent in this scenario?

A. A fairness metric
B. A proxy variable
C. A transparency feature
D. A security control

Correct Answer

B. A proxy variable

Explanation

A proxy variable is a feature that indirectly correlates with sensitive attributes such as race, gender, or income level.

Even when protected attributes are removed, proxy variables can still introduce unfairness.

Why the Other Answers Are Incorrect

A. A fairness metric

Fairness metrics are measurements used to evaluate fairness.

C. A transparency feature

Transparency features help explain decisions, not indirectly encode demographic data.

D. A security control

Security controls protect systems and data.

Question 6

Why is human oversight important in AI systems that make sensitive decisions?

A. Humans can completely eliminate all bias
B. Humans can review and challenge potentially unfair outcomes
C. Humans increase automation speed
D. Humans reduce cloud costs

Correct Answer

B. Humans can review and challenge potentially unfair outcomes

Explanation

Human oversight helps organizations identify questionable or unfair AI decisions, especially in high-impact areas like hiring, healthcare, and finance.

AI systems should assist humans rather than fully replace judgment in sensitive scenarios.

Why the Other Answers Are Incorrect

A. Humans can completely eliminate all bias

Humans can reduce bias, but not completely eliminate it.

C. Humans increase automation speed

Human review usually slows processes rather than speeds them up.

D. Humans reduce cloud costs

Human oversight is unrelated to cloud pricing.

Question 7

An AI model achieves 98% accuracy overall but performs significantly worse for older adults than younger adults.

What does this scenario illustrate?

A. High accuracy guarantees fairness
B. Fairness and accuracy are always identical
C. An AI system can be accurate overall while still unfair
D. Transparency automatically prevents bias

Correct Answer

C. An AI system can be accurate overall while still unfair

Explanation

Overall accuracy can hide unequal performance across demographic groups. Fairness evaluations should measure outcomes for different populations separately.

Why the Other Answers Are Incorrect

A. High accuracy guarantees fairness

High accuracy does not guarantee equitable treatment.

B. Fairness and accuracy are always identical

These are different concepts and can conflict.

D. Transparency automatically prevents bias

Transparency helps identify issues but does not automatically eliminate them.

Question 8

Which action would BEST help improve fairness in an AI solution?

A. Limiting testing to a single user group
B. Using more diverse and representative training data
C. Hiding model outputs from reviewers
D. Reducing the amount of training data

Correct Answer

B. Using more diverse and representative training data

Explanation

Representative datasets improve an AI system’s ability to perform fairly across different populations and reduce bias caused by underrepresentation.

Why the Other Answers Are Incorrect

A. Limiting testing to a single user group

This increases the risk of bias and poor generalization.

C. Hiding model outputs from reviewers

Review and transparency help identify fairness issues.

D. Reducing the amount of training data

Less data often reduces model quality and fairness.

Question 9

Which of the following is an example of an unfair AI outcome?

A. A chatbot responding slowly during peak usage
B. A recommendation engine displaying duplicate products
C. A facial recognition system performing poorly for certain skin tones
D. A virtual machine running out of memory

Correct Answer

C. A facial recognition system performing poorly for certain skin tones

Explanation

Unequal performance across demographic groups is a classic fairness problem in AI systems.

This often results from insufficiently diverse training data.

Why the Other Answers Are Incorrect

A. A chatbot responding slowly during peak usage

This is a performance issue.

B. A recommendation engine displaying duplicate products

This is a recommendation quality issue.

D. A virtual machine running out of memory

This is an infrastructure issue.

Question 10

Why should AI systems be continuously monitored after deployment?

A. Fairness issues can emerge as data and user behavior change over time
B. AI systems never require updates after deployment
C. Monitoring removes the need for testing before deployment
D. Monitoring guarantees perfect fairness

Correct Answer

A. Fairness issues can emerge as data and user behavior change over time

Explanation

AI systems operate in changing environments. Data distributions, populations, and behaviors may evolve, creating new fairness risks after deployment.

Continuous monitoring is an important Responsible AI practice.

Why the Other Answers Are Incorrect

B. AI systems never require updates after deployment

AI systems often require retraining and adjustment.

C. Monitoring removes the need for testing before deployment

Pre-deployment testing remains essential.

D. Monitoring guarantees perfect fairness

No approach can guarantee perfect fairness in all situations.

Final Thoughts

Fairness is a foundational concept in Responsible AI and a critical topic for the AI-901 certification exam. Microsoft expects candidates to understand not only what fairness means, but also how bias enters AI systems and what organizations can do to reduce unfair outcomes.

As AI becomes more integrated into business and society, fairness is no longer optional—it is essential for building trustworthy and ethical AI solutions.

Go to the AI-901 Exam Prep Hub main page

DP-900, Microsoft Certification May 11, 2026

DP-900: Microsoft Azure Data Fundamentals certification exam – Frequently Asked Questions (FAQs)

Below are some commonly asked questions about the DP-900: Microsoft Azure Data Fundamentals certification exam. Upon successfully passing this exam, you earn the Microsoft Certified: Azure Data Fundamentals certification.

What is the DP-900 certification exam?

The DP-900: Microsoft Azure Data Fundamentals exam validates your foundational knowledge of core data concepts and how data is implemented using Microsoft Azure services.

Candidates who pass the exam demonstrate understanding of:

Core data concepts (relational vs non-relational data, transactional vs analytical workloads)
Relational data workloads in Azure (Azure SQL Database, SQL Server on Azure Virtual Machines, Azure SQL Managed Instance)
Non-relational data workloads in Azure (Azure Cosmos DB)
Analytical workloads in Azure (Azure Synapse Analytics, Azure Data Factory, Azure Data Lake, Power BI)

This certification is designed for individuals who want to build a baseline understanding of data in the cloud. Upon successfully passing this exam, candidates earn the Microsoft Certified: Azure Data Fundamentals certification.

Is the DP-900 certification exam worth it?

The short answer is yes.

DP-900 is an excellent entry point into Microsoft’s data certification ecosystem. Preparing for this exam helps you:

Build a solid foundation in data concepts
Understand how Azure supports different data workloads
Gain confidence working with cloud-based data platforms
Prepare for more advanced certifications such as DP-203, PL-300, or AI-900

For beginners, career switchers, students, and professionals new to Azure or data, DP-900 provides structured learning and practical context that transfers directly to real-world scenarios.

How many questions are on the DP-900 exam?

The DP-900 exam typically contains between 40 and 60 questions.

Question formats may include:

Single-choice and multiple-choice questions
Multi-select questions
Drag-and-drop or matching questions
Short scenario-based questions

The exact number and format can vary from exam to exam.

How hard is the DP-900 exam?

DP-900 is considered a fundamentals-level exam and is generally easier than associate-level certifications such as PL-300 or DP-203.

That said, it still requires preparation.

The challenge comes from:

Understanding when to use relational vs non-relational data
Recognizing Azure services and their purposes
Interpreting scenario-based questions
Learning basic analytics concepts

With focused study and practice, most candidates find the exam very achievable.

Helpful preparation resources include:

Microsoft Learn (official and free)
The official DP-900 study guide
Practice exams
Community resources and blogs
YouTube tutorials and walkthroughs

How much does the DP-900 certification exam cost?

As of early 2026, the standard exam pricing is approximately:

United States: $99 USD
Other countries: Regionally adjusted pricing applies

Microsoft frequently offers student discounts, academic pricing, and exam vouchers, so it’s worth checking the official Microsoft certification site before scheduling.

How do I prepare for the Microsoft DP-900 certification exam?

The most important advice is not to rush.

Recommended preparation steps:

Review the official DP-900 exam skills outline.
Complete the free Microsoft Learn DP-900 learning path.
Study core data concepts (relational vs non-relational, OLTP vs OLAP).
Learn the purpose of key Azure services such as Azure SQL Database, Azure Cosmos DB, Azure Synapse Analytics, and Power BI.
Take practice exams to confirm your readiness.
- The DP-900 exam prep hub on The Data Community has 2 practice exams: Exam Prep Hub for DP-900: Azure Data Fundamentals

Additional learning resources include:

Courses on platforms like Udemy and Coursera
Community-created study hubs, such as the on this site: Exam Prep Hub for DP-900: Azure Data Fundamentals
YouTube playlists focused on DP-900 topics. See list of other resources here: Exam Prep Hub for DP-900: Azure Data Fundamentals

Hands-on labs are helpful but not strictly required for DP-900. Conceptual understanding is the primary focus.

How do I pass the DP-900 exam?

To maximize your chances of passing:

Focus on understanding concepts rather than memorization
Learn what each Azure data service is designed for
Carefully read scenario questions before answering
Eliminate obviously incorrect choices
Manage your time effectively

Consistently performing well on reputable practice exams is usually a good indicator that you’re ready.

What is the best site for DP-900 certification dumps?

Using exam dumps is not recommended and may violate Microsoft’s exam policies.

Instead, rely on legitimate preparation resources such as:

Microsoft’s official practice exam
High-quality community-created practice tests
Scenario-based questions that reinforce understanding

Legitimate preparation builds real skills that extend beyond passing the exam.

How long should I study for the DP-900 exam?

Study time varies based on background.

General guidelines:

Prior data or Azure experience: 2–4 weeks
Some technical background: 3–5 weeks
Beginners or career changers: 4–8 weeks

Rather than focusing strictly on time, aim to understand all exam topics and perform well on practice tests before scheduling.

Where can I find training or a course for the DP-900 exam?

Training options include:

Microsoft Learn: Free, official learning path
Online platforms: Udemy, Coursera, Exam Prep Hub for DP-900: Azure Data Fundamentals, and similar providers
YouTube: Free DP-900 playlists and walkthroughs
Subscription platforms: Datacamp and others offering Azure or data fundamentals
Microsoft partners: Instructor-led courses

A mix of structured learning and light hands-on exploration works well.

What skills should I have before taking the DP-900 exam?

Before attempting the exam, it helps to understand:

Basic data concepts (tables, rows, columns)
Differences between relational and non-relational data
Basic analytics terminology
General cloud computing concepts

No coding experience is required.

DP-900 is designed specifically for beginners.

What score do I need to pass the DP-900 exam?

Microsoft exams are scored on a scale of 1–1000, and a score of 700 or higher is required to pass.

Scores are scaled based on question difficulty, not simply percentage correct.

How long is the DP-900 exam?

You are given approximately 60 minutes to complete the exam, not including onboarding and instructions.

Time pressure is generally lower than associate-level exams.

How long is the DP-900 certification valid?

The Microsoft Certified: Azure Data Fundamentals certification does not expire.

Unlike associate-level certifications, DP-900 currently does not require renewal.

Is DP-900 suitable for beginners?

Yes — DP-900 is specifically designed for beginners.

It’s ideal for:

Students
Career switchers
Business professionals entering data or analytics
Technical professionals new to Azure

No prior Azure or database experience is required.

What roles benefit most from the DP-900 certification?

DP-900 is especially valuable for:

Aspiring Data Analysts or Data Engineers
Business Analysts
Students and graduates
Cloud beginners
Professionals exploring data careers

It also serves as a strong foundation before pursuing PL-300, DP-203, or AI-900.

What languages is the DP-900 exam offered in?

The DP-900 certification exam is commonly offered in:

English, Japanese, Chinese (Simplified), Korean, German, French, Spanish, Portuguese (Brazil), Chinese (Traditional), Italian

Availability may vary by region.

Have additional questions? Post them in the comments.

Thanks for reading and good luck on your data journey!

DP-900, Microsoft Certification May 11, 2026May 14, 2026

Exam Prep Hub for DP-900: Azure Data Fundamentals

Welcome to the DP-900: Azure Data Fundamentals Exam Prep Hub!

Welcome to the one-stop hub with information for preparing for the DP-900: Microsoft Azure Data Fundamentals certification exam. The content for this exam helps you to “Demonstrate foundational knowledge of core data concepts related to Microsoft Azure data services.”. Upon successful completion of the exam, you earn the Microsoft Certified: Azure Data Fundamentals certification.

This hub provides information directly here (topic-by-topic as outlined in the official study guide), links to a number of external resources, tips for preparing for the exam, practice tests, and section questions to help you prepare. Bookmark this page and use it as a guide to ensure that you are fully covering all relevant topics for the AI-900 exam and making use of as many of the resources available as possible.

Audience profile (from Microsoft’s site)

This exam is an opportunity to demonstrate your knowledge of core data concepts and related Microsoft Azure data services. As a candidate for this exam, you should have familiarity with Exam DP-900’s self-paced or instructor-led learning material.

This exam is intended for you, if you’re a candidate beginning to work with data in the cloud.

You should be familiar with:
- The concepts of relational and non-relational data.
- Different types of data workloads such as transactional or analytical.

You can use Azure Data Fundamentals to prepare for other Azure role-based certifications like Azure Database Administrator Associate or Azure Data Engineer Associate, but it is not a prerequisite for any of them.

Skills at a glance (as specified in the official study guide)

Describe core data concepts (25–30%)
Identify considerations for relational data on Azure (20–25%)
Describe considerations for working with non-relational data on Azure (15–20%)
Describe an analytics workload on Azure (25–30%)

Topic-by-Topic Exam Content

Describe core data concepts (25–30%)

Describe ways to represent data

Identify options for data storage

Describe common data workloads

Identify roles and responsibilities for data workloads

Identify considerations for relational data on Azure (20–25%)

Describe relational concepts

Describe relational Azure data services

Describe considerations for working with non-relational data on Azure (15–20%)

Describe capabilities of Azure storage

Describe capabilities and features of Azure Cosmos DB

Describe an analytics workload (25–30%)

Describe common elements of large-scale analytics

Describe consideration for real-time data analytics

Describe data visualization in Microsoft Power BI

DP-900 Practice Exams

DP-900 Practice Exam 1 (60 questions with answers)

DP-900 Practice Exam 2 (60 questions with answers)

Important DP-900 Resources

Link to the free, comprehensive, self-paced course on Microsoft Learn – Introduction to Microsoft Azure Data
Link to the certification page: Microsoft Certified: Azure Data Fundamentals certification page
Link to the study guide: Study guide for Exam DP-900: Microsoft Azure Data Fundamentals

YouTube video series: Microsoft Learn DP-900 Azure Data Fundamentals YouTube series

A book you may find useful (on Amazon): Exam Ref DP-900 Microsoft Azure Data Fundamentals 2nd Edition

Good luck to you on your data journey!

DP-900, Microsoft Certification May 11, 2026

DP-900: Azure Data Fundamentals – Advanced Practice Exam – 60 questions

Advanced Practice Exam (60 Questions)

This advanced practice exam contains:

Higher-difficulty questions
More scenario-based questions
Multi-answer questions
Matching questions
Fill-in-the-blank questions
SQL and architecture concepts
Azure service selection scenarios

Section 1 — Core Data Concepts

Question 1 (Scenario-Based)

A company stores customer survey responses in JSON format. Each survey can contain different fields depending on the survey type.

How should this data be classified?

A. Structured
B. Semi-structured
C. Unstructured
D. Transactional

✅ Answer: B — Semi-structured

Explanations

A. Incorrect
Structured data requires a rigid schema.

B. Correct
JSON is semi-structured because it contains flexible tagged fields.

C. Incorrect
Unstructured data has little or no organization.

D. Incorrect
Transactional refers to workload type, not structure.

Question 2 (Multi-Answer)

Which characteristics are associated with transactional workloads? (Choose TWO)

A. High concurrency
B. Historical aggregations
C. Fast insert/update operations
D. Large-scale reporting queries

✅ Answers: A and C

Explanations

A. Correct
Transactional systems support many simultaneous users.

B. Incorrect
Historical aggregations are analytical.

C. Correct
OLTP systems perform fast write operations.

D. Incorrect
Large reporting queries belong to analytics workloads.

Question 3 (Scenario-Based)

A database contains duplicated customer addresses across multiple tables. The database architect wants to reduce redundancy and improve consistency.

Which process should be used?

A. Partitioning
B. Normalization
C. Encryption
D. Replication

✅ Answer: B — Normalization

Explanations

A. Incorrect
Partitioning improves scalability.

B. Correct
Normalization reduces duplication.

C. Incorrect
Encryption secures data.

D. Incorrect
Replication copies data.

Question 4 (Single Answer)

Which SQL statement removes an existing table and all its data?

A. DELETE
B. REMOVE
C. DROP
D. ERASE

✅ Answer: C — DROP

Explanations

A. Incorrect
DELETE removes rows only.

B. Incorrect
REMOVE is not standard SQL.

C. Correct
DROP deletes the table structure and data.

D. Incorrect
ERASE is not standard SQL.

Question 5 (Matching)

Match the role to the responsibility.

Role	Responsibility
1. DBA	A. Creates dashboards
2. Data Analyst	B. Maintains database performance
3. Data Engineer	C. Builds data pipelines

✅ Answers

1 → B
2 → A
3 → C

Question 6 (Scenario-Based)

A retail company needs a database for processing thousands of purchases per minute with guaranteed consistency.

Which workload type is MOST appropriate?

A. Analytical
B. Streaming
C. Transactional
D. Archival

✅ Answer: C — Transactional

Explanations

A. Incorrect
Analytical systems focus on reporting.

B. Incorrect
Streaming processes event flows.

C. Correct
Transactional systems support operational consistency and speed.

D. Incorrect
Archival systems store inactive data.

Question 7 (Fill in the Blank)

The SQL statement used to add new rows to a table is __________.

✅ Answer: INSERT

Question 8 (Multi-Answer)

Which file formats are commonly used in analytics workloads? (Choose TWO)

A. Parquet
B. ORC
C. BMP
D. EXE

✅ Answers: A and B

Explanations

A. Correct
Parquet is optimized for analytics.

B. Correct
ORC is another columnar analytics format.

C. Incorrect
BMP is an image format.

D. Incorrect
EXE is executable software.

Question 9 (Scenario-Based)

An organization wants to analyze 10 years of sales history for trends and forecasting.

Which workload type is BEST suited?

A. OLTP
B. Analytical
C. Streaming
D. Operational

✅ Answer: B — Analytical

Question 10 (Single Answer)

Which database object contains reusable SQL logic?

A. View
B. Index
C. Stored Procedure
D. Key

✅ Answer: C — Stored Procedure

Section 2 — Relational Data on Azure

Question 11 (Scenario-Based)

A company is migrating an on-premises SQL Server application that relies heavily on SQL Server Agent, cross-database queries, and instance-level features.

Which Azure service is MOST appropriate?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. Azure Cosmos DB
D. Azure Blob Storage

✅ Answer: B — Azure SQL Managed Instance

Explanations

A. Incorrect
Azure SQL Database has fewer instance-level features.

B. Correct
Managed Instance offers near full SQL Server compatibility.

C. Incorrect
Cosmos DB is NoSQL.

D. Incorrect
Blob Storage stores files.

Question 12 (Single Answer)

Which Azure SQL offering provides the HIGHEST level of infrastructure control?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. SQL Server on Azure Virtual Machines
D. Azure Synapse Analytics

✅ Answer: C — SQL Server on Azure Virtual Machines

Question 13 (Multi-Answer)

Which are advantages of Platform as a Service (PaaS) databases? (Choose TWO)

A. Automatic patching
B. Reduced administrative overhead
C. Full operating system control
D. Manual backups only

✅ Answers: A and B

Question 14 (Scenario-Based)

A company wants automatic scaling, backups, and minimal management overhead for a new cloud-native application.

Which solution is BEST?

A. SQL Server on Azure VMs
B. Azure SQL Database
C. Windows Server Failover Cluster
D. Self-hosted SQL Server

✅ Answer: B — Azure SQL Database

Question 15 (Single Answer)

What is the purpose of a foreign key?

A. Encrypt data
B. Create indexes
C. Enforce relationships between tables
D. Remove duplicates

✅ Answer: C — Enforce relationships between tables

Question 16 (Scenario-Based)

A company needs a managed PostgreSQL service in Azure.

Which service should be used?

A. Azure SQL Database
B. Azure Database for PostgreSQL
C. Azure Blob Storage
D. Azure Cosmos DB

✅ Answer: B — Azure Database for PostgreSQL

Question 17 (Single Answer)

Which normalization form removes transitive dependencies?

A. 1NF
B. 2NF
C. 3NF
D. 4NF

✅ Answer: C — 3NF

Question 18 (Multi-Answer)

Which SQL statements are Data Manipulation Language (DML)? (Choose TWO)

A. SELECT
B. INSERT
C. CREATE
D. DROP

✅ Answers: A and B

Question 19 (Scenario-Based)

A query needs to return ALL customers, including those without orders.

Which JOIN should be used?

A. INNER JOIN
B. CROSS JOIN
C. LEFT JOIN
D. SELF JOIN

✅ Answer: C — LEFT JOIN

Question 20 (Single Answer)

Which object improves query performance but does NOT store actual business data?

A. Table
B. View
C. Index
D. Row

✅ Answer: C — Index

Section 3 — Non-Relational Data

Question 21 (Scenario-Based)

A media company needs to store petabytes of video content at low cost.

Which Azure service is MOST appropriate?

A. Azure SQL Database
B. Azure Blob Storage
C. Azure Table Storage
D. Azure Cache for Redis

✅ Answer: B — Azure Blob Storage

Question 22 (Single Answer)

Which Azure Blob Storage tier is optimized for infrequently accessed data?

A. Premium
B. Hot
C. Cool
D. Archive

✅ Answer: C — Cool

Question 23 (Scenario-Based)

An organization needs cloud-hosted SMB file shares accessible by both cloud and on-premises servers.

Which service should be used?

A. Azure Cosmos DB
B. Azure Files
C. Azure Table Storage
D. Azure SQL Database

✅ Answer: B — Azure Files

Question 24 (Multi-Answer)

Which APIs are supported by Azure Cosmos DB? (Choose TWO)

A. MongoDB
B. Cassandra
C. Oracle
D. SMB

✅ Answers: A and B

Question 25 (Scenario-Based)

A gaming company needs globally distributed low-latency data access for player profiles.

Which Azure service is BEST?

A. Azure Cosmos DB
B. Azure Files
C. Azure SQL Database
D. Azure Blob Storage

✅ Answer: A — Azure Cosmos DB

Question 26 (Single Answer)

What is a major benefit of Azure Cosmos DB partitioning?

A. Reduces security
B. Enables scalability
C. Removes replication
D. Prevents indexing

✅ Answer: B — Enables scalability

Question 27 (Fill in the Blank)

Azure Cosmos DB provides multi-region __________ to improve availability and performance.

✅ Answer: replication

Question 28 (Scenario-Based)

A company needs a NoSQL key-value store for massive telemetry ingestion.

Which service is MOST appropriate?

A. Azure Table Storage
B. Azure SQL Database
C. Azure Files
D. Azure DNS

✅ Answer: A — Azure Table Storage

Question 29 (Single Answer)

Which storage service stores data as objects inside containers?

A. Azure Files
B. Azure Blob Storage
C. Azure SQL Database
D. Azure Cosmos DB

✅ Answer: B — Azure Blob Storage

Question 30 (Multi-Answer)

Which are characteristics of non-relational databases? (Choose TWO)

A. Flexible schemas
B. Strict relational constraints
C. Horizontal scalability
D. Mandatory JOIN operations

✅ Answers: A and C

Section 4 — Analytics Workloads

Question 31 (Scenario-Based)

A company collects IoT sensor readings every second and needs near real-time dashboards.

Which processing approach is MOST appropriate?

A. Batch processing
B. Streaming processing
C. Archival processing
D. Offline reporting

✅ Answer: B — Streaming processing

Question 32 (Single Answer)

Which Azure service is designed for high-throughput event ingestion?

A. Azure Event Hubs
B. Azure Backup
C. Azure Files
D. Azure DNS

✅ Answer: A — Azure Event Hubs

Question 33 (Scenario-Based)

An organization needs Apache Spark-based analytics with collaborative notebooks.

Which service is BEST?

A. Azure Databricks
B. Azure Files
C. Azure DNS
D. Azure Firewall

✅ Answer: A — Azure Databricks

Question 34 (Single Answer)

Which architecture commonly includes fact tables and dimension tables?

A. OLTP schema
B. Star schema
C. Graph schema
D. XML schema

✅ Answer: B — Star schema

Question 35 (Multi-Answer)

Which are characteristics of a data warehouse? (Choose TWO)

A. Optimized for analytics
B. Stores historical data
C. Primarily supports OLTP transactions
D. Limited aggregations

✅ Answers: A and B

Question 36 (Scenario-Based)

A company wants a unified analytics platform combining engineering, warehousing, data science, and BI.

Which Microsoft service BEST fits?

A. Microsoft Fabric
B. Azure Files
C. Azure Firewall
D. Azure DNS

✅ Answer: A — Microsoft Fabric

Question 37 (Single Answer)

Which service allows SQL-like queries against streaming data?

A. Azure Stream Analytics
B. Azure Files
C. Azure Backup
D. Azure Monitor

✅ Answer: A — Azure Stream Analytics

Question 38 (Scenario-Based)

An organization processes payroll data once nightly.

Which processing type is MOST appropriate?

A. Streaming
B. Batch
C. Event-driven only
D. Real-time analytics

✅ Answer: B — Batch

Question 39 (Single Answer)

Which process extracts, transforms, and loads data into analytical systems?

A. ETL
B. DNS
C. RAID
D. OLTP

✅ Answer: A — ETL

Question 40 (Multi-Answer)

Which services are commonly associated with real-time analytics? (Choose TWO)

A. Azure Event Hubs
B. Azure Stream Analytics
C. Azure Files
D. Azure Backup

✅ Answers: A and B

Section 5 — Power BI

Question 41 (Scenario-Based)

An executive wants a single-page overview showing KPIs and summary visuals.

Which Power BI object should be used?

A. Dataset
B. Dashboard
C. Dataflow
D. Semantic model

✅ Answer: B — Dashboard

Question 42 (Single Answer)

Which Power BI component is primarily used for data transformation?

A. DAX
B. Power Query
C. Azure Functions
D. Power Automate

✅ Answer: B — Power Query

Question 43 (Scenario-Based)

A report must show revenue trends over 24 months.

Which visualization is BEST?

A. Pie chart
B. Gauge chart
C. Line chart
D. Scatter chart

✅ Answer: C — Line chart

Question 44 (Single Answer)

Which visualization is BEST for displaying proportions?

A. Scatter chart
B. Pie chart
C. Card
D. Gauge chart

✅ Answer: B — Pie chart

Question 45 (Scenario-Based)

A company wants users to filter reports interactively by region and year.

Which feature should be used?

A. Indexes
B. Slicers
C. Measures
D. Triggers

✅ Answer: B — Slicers

Question 46 (Single Answer)

Which Power BI language creates measures and calculated columns?

A. SQL
B. Python
C. DAX
D. XML

✅ Answer: C — DAX

Question 47 (Scenario-Based)

A business analyst wants to identify the relationship between advertising spend and revenue.

Which visualization is BEST?

A. Pie chart
B. Scatter chart
C. Gauge chart
D. Card

✅ Answer: B — Scatter chart

Question 48 (Single Answer)

Which Power BI visualization is BEST for detailed row-level data?

A. Table
B. Gauge
C. Pie chart
D. Card

✅ Answer: A — Table

Question 49 (Multi-Answer)

Which are benefits of Power BI dashboards? (Choose TWO)

A. Real-time monitoring
B. Single-page summaries
C. Operating system administration
D. SQL indexing

✅ Answers: A and B

Question 50 (Scenario-Based)

A company needs a geographic visualization of sales by country.

Which visualization is BEST?

A. Matrix
B. Map
C. Gauge
D. Card

✅ Answer: B — Map

Section 6 — Comprehensive Scenarios

Question 51 (Scenario-Based)

A healthcare organization requires:

Globally distributed NoSQL storage
Automatic replication
Low latency worldwide
Flexible schema support

Which solution BEST fits?

A. Azure SQL Database
B. Azure Cosmos DB
C. Azure Files
D. Azure Synapse Analytics

✅ Answer: B — Azure Cosmos DB

Question 52 (Scenario-Based)

A manufacturing company collects sensor telemetry every second from thousands of devices.

Which Azure service should ingest the streaming events?

A. Azure Event Hubs
B. Azure Files
C. Azure SQL Managed Instance
D. Azure Backup

✅ Answer: A — Azure Event Hubs

Question 53 (Scenario-Based)

A company wants full control of SQL Server patching, OS configuration, and backups.

Which deployment option should be used?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. SQL Server on Azure Virtual Machines
D. Azure Cosmos DB

✅ Answer: C — SQL Server on Azure Virtual Machines

Question 54 (Single Answer)

Which Azure service is MOST optimized for unstructured object storage?

A. Azure Blob Storage
B. Azure SQL Database
C. Azure Files
D. Azure Synapse Analytics

✅ Answer: A — Azure Blob Storage

Question 55 (Scenario-Based)

An analytics team needs to store historical sales data optimized for aggregation queries.

Which solution is BEST?

A. Transactional database
B. Data warehouse
C. Azure Files
D. DNS server

✅ Answer: B — Data warehouse

Question 56 (Single Answer)

Which SQL statement changes existing records?

A. CREATE
B. UPDATE
C. INSERT
D. ALTER

✅ Answer: B — UPDATE

Question 57 (Multi-Answer)

Which are benefits of normalization? (Choose TWO)

A. Reduced redundancy
B. Improved consistency
C. Increased duplicate storage
D. Reduced relationships

✅ Answers: A and B

Question 58 (Scenario-Based)

A report needs to compare revenue across product categories.

Which visualization is BEST?

A. Line chart
B. Scatter chart
C. Bar chart
D. Gauge chart

✅ Answer: C — Bar chart

Question 59 (Fill in the Blank)

The SQL JOIN that returns only matching rows from both tables is called an __________ JOIN.

✅ Answer: INNER

Question 60 (Scenario-Based)

A company needs:

Large-scale analytics
Integrated Power BI reporting
Data engineering
Real-time analytics
Unified SaaS experience

Which platform BEST meets these requirements?

A. Microsoft Fabric
B. Azure Files
C. Azure DNS
D. Windows Server Failover Clustering

✅ Answer: A — Microsoft Fabric

Advanced Exam Study Tips

Know the differences between:

OLTP vs OLAP
Batch vs streaming
Structured vs semi-structured vs unstructured
Relational vs NoSQL

Memorize Azure service associations:

Service	Purpose
Azure Blob Storage	Unstructured object storage
Azure Files	SMB file shares
Azure Table Storage	Key-value NoSQL
Azure Cosmos DB	Globally distributed NoSQL
Azure Event Hubs	Streaming ingestion
Azure Stream Analytics	Real-time analytics
Azure Databricks	Spark analytics
Microsoft Fabric	Unified analytics platform

Power BI visualization shortcuts:

Visualization	Best Use
Line chart	Trends
Bar chart	Comparisons
Pie chart	Proportions
Scatter chart	Relationships
Card	Single KPI
Map	Geographic analysis
Gauge	Progress toward target

Go to the DP-900 Exam Prep Hub main page.

DP-900, Microsoft Certification May 11, 2026

DP-900: Azure Data Fundamentals – Practice Exam Questions – 60 questions

Full Practice Exam

This practice exam covers all major skills measured on the DP-900 certification exam, including:

Core data concepts
Relational data on Azure
Non-relational data on Azure
Analytics workloads
Power BI and visualization
Real-time analytics
Azure data services

Question formats include:

Single-answer multiple choice
Multi-answer multiple choice
Matching/connect-the-answers
Fill-in-the-blank
Scenario-based questions

Section 1 — Core Data Concepts

Question 1 (Single Answer)

Which type of data has a predefined schema consisting of rows and columns?

A. Unstructured data
B. Semi-structured data
C. Structured data
D. Streaming data

✅ Answer: C — Structured data

Explanations

A. Incorrect
Unstructured data does not have a predefined schema.

B. Incorrect
Semi-structured data has some organization but not fixed rows/columns.

C. Correct
Structured data uses a defined schema with rows and columns.

D. Incorrect
Streaming data refers to continuously arriving data, not structure type.

Question 2 (Multi-Answer)

Which of the following are examples of semi-structured data? (Choose TWO)

A. JSON
B. CSV
C. XML
D. SQL tables

✅ Answers: A and C

Explanations

A. Correct
JSON contains tags/structure but flexible schemas.

B. Incorrect
CSV is structured tabular data.

C. Correct
XML is semi-structured because it uses tagged hierarchical data.

D. Incorrect
SQL tables are structured relational data.

Question 3 (Fill in the Blank)

A database design technique used to reduce data redundancy is called __________.

✅ Answer: Normalization

Explanation

Normalization organizes data efficiently and minimizes duplication.

Question 4 (Single Answer)

Which SQL statement retrieves data from a table?

A. INSERT
B. UPDATE
C. SELECT
D. DELETE

✅ Answer: C — SELECT

Explanations

A. Incorrect
INSERT adds records.

B. Incorrect
UPDATE modifies records.

C. Correct
SELECT retrieves data.

D. Incorrect
DELETE removes records.

Question 5 (Matching)

Match the workload to its description.

Workload	Description
1. Transactional	A. Historical analysis
2. Analytical	B. Real-time business operations

✅ Answers

1 → B
2 → A

Explanation

Transactional workloads support day-to-day operations; analytical workloads analyze historical data.

Question 6 (Single Answer)

Which role is MOST responsible for maintaining database availability and backups?

A. Data Analyst
B. Data Engineer
C. Database Administrator
D. Business User

✅ Answer: C — Database Administrator

Explanations

A. Incorrect
Data analysts focus on reporting and insights.

B. Incorrect
Data engineers build pipelines and integration systems.

C. Correct
DBAs manage availability, backups, and performance.

D. Incorrect
Business users consume reports.

Question 7 (Multi-Answer)

Which are characteristics of analytical workloads? (Choose TWO)

A. Frequent INSERT operations
B. Historical trend analysis
C. Large-scale aggregations
D. High-volume OLTP transactions

✅ Answers: B and C

Explanations

A. Incorrect
Frequent inserts are more common in transactional systems.

B. Correct
Analytical systems examine historical data.

C. Correct
Aggregations are common in analytics.

D. Incorrect
OLTP workloads are transactional.

Question 8 (Single Answer)

Which file format is commonly used for big data analytics because of columnar storage and compression?

A. TXT
B. CSV
C. Parquet
D. XML

✅ Answer: C — Parquet

Explanations

A. Incorrect
TXT files are plain text.

B. Incorrect
CSV is row-based text data.

C. Correct
Parquet is optimized for analytics workloads.

D. Incorrect
XML is semi-structured but not optimized for analytics.

Question 9 (Single Answer)

Which database object stores data in rows and columns?

A. View
B. Stored procedure
C. Table
D. Index

✅ Answer: C — Table

Explanations

A. Incorrect
Views are virtual query results.

B. Incorrect
Stored procedures contain SQL logic.

C. Correct
Tables store relational data.

D. Incorrect
Indexes improve query performance.

Question 10 (Single Answer)

Which SQL JOIN returns only matching rows from both tables?

A. LEFT JOIN
B. RIGHT JOIN
C. INNER JOIN
D. FULL OUTER JOIN

✅ Answer: C — INNER JOIN

Explanations

A. Incorrect
LEFT JOIN includes unmatched left-side rows.

B. Incorrect
RIGHT JOIN includes unmatched right-side rows.

C. Correct
INNER JOIN returns only matches.

D. Incorrect
FULL OUTER JOIN includes all rows.

Section 2 — Relational Data on Azure

Question 11 (Single Answer)

Which Azure SQL option provides the MOST compatibility with on-premises SQL Server?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. Azure Cosmos DB
D. Azure Blob Storage

✅ Answer: B — Azure SQL Managed Instance

Explanations

A. Incorrect
Azure SQL Database is fully managed but has fewer instance-level features.

B. Correct
Managed Instance provides near full SQL Server compatibility.

C. Incorrect
Cosmos DB is NoSQL.

D. Incorrect
Blob Storage is object storage.

Question 12 (Multi-Answer)

Which Azure services support open-source relational databases? (Choose TWO)

A. Azure Database for PostgreSQL
B. Azure Database for MySQL
C. Azure Synapse Analytics
D. Azure Files

✅ Answers: A and B

Explanations

A. Correct
Azure provides managed PostgreSQL.

B. Correct
Azure provides managed MySQL.

C. Incorrect
Synapse is analytics-focused.

D. Incorrect
Azure Files is storage.

Question 13 (Single Answer)

Which Azure SQL option gives customers the MOST operating system control?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. SQL Server on Azure Virtual Machines
D. Azure Cosmos DB

✅ Answer: C — SQL Server on Azure Virtual Machines

Explanations

A. Incorrect
Fully managed platform service.

B. Incorrect
Managed service with limited OS access.

C. Correct
VMs provide full infrastructure control.

D. Incorrect
Cosmos DB is NoSQL.

Question 14 (Fill in the Blank)

A column whose values uniquely identify each row in a table is called a __________ key.

✅ Answer: Primary

Explanation

A primary key uniquely identifies rows.

Question 15 (Single Answer)

Which database normalization form removes repeating groups?

A. 1NF
B. 2NF
C. 3NF
D. 4NF

✅ Answer: A — 1NF

Explanations

A. Correct
1NF eliminates repeating groups.

B. Incorrect
2NF removes partial dependencies.

C. Incorrect
3NF removes transitive dependencies.

D. Incorrect
4NF handles multi-valued dependencies.

Section 3 — Non-Relational Data on Azure

Question 16 (Single Answer)

Which Azure storage service is best for storing large unstructured files?

A. Azure SQL Database
B. Azure Blob Storage
C. Azure Table Storage
D. Azure Cosmos DB

✅ Answer: B — Azure Blob Storage

Explanations

A. Incorrect
SQL Database is relational.

B. Correct
Blob Storage stores unstructured objects like images/videos.

C. Incorrect
Table Storage stores NoSQL key-value data.

D. Incorrect
Cosmos DB is a globally distributed database.

Question 17 (Single Answer)

Which Azure storage service provides SMB file shares?

A. Azure Blob Storage
B. Azure Cosmos DB
C. Azure Files
D. Azure Table Storage

✅ Answer: C — Azure Files

Explanations

A. Incorrect
Blob Storage is object storage.

B. Incorrect
Cosmos DB is NoSQL.

C. Correct
Azure Files supports SMB shares.

D. Incorrect
Table Storage stores structured NoSQL entities.

Question 18 (Multi-Answer)

Which are valid Azure Cosmos DB APIs? (Choose TWO)

A. MongoDB API
B. Cassandra API
C. Oracle API
D. SMB API

✅ Answers: A and B

Explanations

A. Correct
Cosmos DB supports MongoDB API.

B. Correct
Cosmos DB supports Cassandra API.

C. Incorrect
Oracle API is not supported.

D. Incorrect
SMB is a file-sharing protocol.

Question 19 (Single Answer)

Which characteristic is a major feature of Azure Cosmos DB?

A. Single-region architecture
B. Global distribution
C. Relational-only schema
D. File-share management

✅ Answer: B — Global distribution

Explanations

A. Incorrect
Cosmos DB supports multiple regions.

B. Correct
Global distribution is a key feature.

C. Incorrect
Cosmos DB is NoSQL.

D. Incorrect
Not a file-sharing service.

Question 20 (Matching)

Match the storage service to its use case.

Service	Use Case
1. Blob Storage	A. SMB file shares
2. Azure Files	B. Unstructured objects

✅ Answers

1 → B
2 → A

Section 4 — Analytics Workloads

Question 21 (Single Answer)

Which process involves collecting data from multiple sources into an analytics system?

A. Visualization
B. Data ingestion
C. Data modeling
D. Backup

✅ Answer: B — Data ingestion

Explanations

A. Incorrect
Visualization displays data.

B. Correct
Ingestion collects and imports data.

C. Incorrect
Modeling defines relationships/calculations.

D. Incorrect
Backup protects data copies.

Question 22 (Single Answer)

Which analytical store is optimized for historical analytics and reporting?

A. OLTP database
B. Data warehouse
C. Azure Files
D. DNS server

✅ Answer: B — Data warehouse

Explanations

A. Incorrect
OLTP supports transactions.

B. Correct
Warehouses support analytics.

C. Incorrect
Files are storage shares.

D. Incorrect
DNS resolves names.

Question 23 (Multi-Answer)

Which Microsoft services support large-scale analytics? (Choose TWO)

A. Azure Databricks
B. Microsoft Fabric
C. Azure DNS
D. Azure Firewall

✅ Answers: A and B

Explanations

A. Correct
Databricks supports big data analytics.

B. Correct
Fabric is an end-to-end analytics platform.

C. Incorrect
DNS is networking.

D. Incorrect
Firewall is security infrastructure.

Question 24 (Single Answer)

What is the primary difference between batch processing and streaming processing?

A. Batch processing handles data continuously
B. Streaming processes data as it arrives
C. Streaming stores only historical data
D. Batch requires IoT devices

✅ Answer: B — Streaming processes data as it arrives

Explanations

A. Incorrect
Continuous processing is streaming.

B. Correct
Streaming handles near real-time data.

C. Incorrect
Streaming is not limited to historical data.

D. Incorrect
Batch does not require IoT.

Question 25 (Single Answer)

Which Azure service is commonly used for streaming event ingestion?

A. Azure Event Hubs
B. Azure Files
C. Azure SQL Database
D. Azure DNS

✅ Answer: A — Azure Event Hubs

Explanations

A. Correct
Event Hubs ingests streaming events.

B. Incorrect
Azure Files is storage.

C. Incorrect
SQL Database is relational.

D. Incorrect
DNS is networking.

Question 26 (Single Answer)

Which service uses SQL-like queries for real-time stream processing?

A. Azure Stream Analytics
B. Azure Firewall
C. Azure DNS
D. Azure Virtual Machines

✅ Answer: A — Azure Stream Analytics

Explanations

A. Correct
Stream Analytics uses SQL-like syntax.

B. Incorrect
Firewall is security.

C. Incorrect
DNS resolves names.

D. Incorrect
VMs are infrastructure.

Question 27 (Fill in the Blank)

The architecture commonly used in analytics models with fact and dimension tables is called a __________ schema.

✅ Answer: Star

Question 28 (Single Answer)

Which Power BI object is a single-page collection of visualizations?

A. Report
B. Dashboard
C. Dataset
D. Workspace

✅ Answer: B — Dashboard

Explanations

A. Incorrect
Reports are usually multi-page.

B. Correct
Dashboards are single-page summaries.

C. Incorrect
Datasets store data models.

D. Incorrect
Workspaces organize content.

Question 29 (Single Answer)

Which Power BI feature is used for data transformation?

A. DAX
B. Power Query
C. Power Automate
D. Azure Functions

✅ Answer: B — Power Query

Explanations

A. Incorrect
DAX creates calculations.

B. Correct
Power Query cleans and transforms data.

C. Incorrect
Power Automate automates workflows.

D. Incorrect
Azure Functions run code.

Question 30 (Single Answer)

Which Power BI language is used for measures and calculations?

A. Python
B. JavaScript
C. DAX
D. XML

✅ Answer: C — DAX

Section 5 — Power BI Visualization

Question 31 (Single Answer)

Which chart type is BEST for showing trends over time?

A. Pie chart
B. Scatter chart
C. Line chart
D. Gauge chart

✅ Answer: C — Line chart

Question 32 (Single Answer)

Which visualization is BEST for showing proportions of a whole?

A. Pie chart
B. Table
C. Scatter chart
D. Card

✅ Answer: A — Pie chart

Question 33 (Single Answer)

Which visualization is BEST for geographic analysis?

A. Matrix
B. Map
C. Gauge
D. Card

✅ Answer: B — Map

Question 34 (Single Answer)

Which visualization is BEST for displaying a single KPI?

A. Scatter chart
B. Card
C. Matrix
D. Pie chart

✅ Answer: B — Card

Question 35 (Single Answer)

Which visualization is BEST for comparing categories?

A. Line chart
B. Map
C. Bar chart
D. Gauge chart

✅ Answer: C — Bar chart

Question 36 (Multi-Answer)

Which visuals support detailed tabular reporting? (Choose TWO)

A. Table
B. Matrix
C. Gauge
D. Pie chart

✅ Answers: A and B

Question 37 (Single Answer)

Which Power BI feature enables interactive filtering?

A. DAX
B. Slicer
C. Gauge
D. Workspace

✅ Answer: B — Slicer

Question 38 (Single Answer)

Which visualization is BEST for identifying relationships between two numeric variables?

A. Pie chart
B. Scatter chart
C. Card
D. Gauge chart

✅ Answer: B — Scatter chart

Question 39 (Fill in the Blank)

A Power BI object containing multiple pages of visualizations is called a __________.

✅ Answer: Report

Question 40 (Single Answer)

Which Power BI component is cloud-based and used for sharing reports?

A. Power BI Desktop
B. Power BI Service
C. Power Query
D. Power Pivot

✅ Answer: B — Power BI Service

Section 6 — Advanced Scenarios

Question 41 (Scenario)

A company needs a globally distributed NoSQL database with low latency worldwide.

Which Azure service should they use?

A. Azure SQL Database
B. Azure Cosmos DB
C. Azure Files
D. Azure Blob Storage

✅ Answer: B — Azure Cosmos DB

Question 42 (Scenario)

A company needs to store millions of images and videos cost-effectively.

Which Azure service is MOST appropriate?

A. Azure SQL Database
B. Azure Blob Storage
C. Azure Files
D. Azure Synapse Analytics

✅ Answer: B — Azure Blob Storage

Question 43 (Scenario)

A company needs fully managed relational databases with automatic patching and backups.

Which service is BEST?

A. SQL Server on Azure VMs
B. Azure SQL Database
C. Azure Files
D. Azure Event Hubs

✅ Answer: B — Azure SQL Database

Question 44 (Scenario)

A retail company wants real-time fraud detection from transaction streams.

Which Azure service is MOST appropriate for processing?

A. Azure Stream Analytics
B. Azure DNS
C. Azure Files
D. Azure Backup

✅ Answer: A — Azure Stream Analytics

Question 45 (Multi-Answer)

Which are characteristics of transactional systems? (Choose TWO)

A. Low-latency transactions
B. Historical trend analysis
C. High concurrency
D. Large aggregations

✅ Answers: A and C

Question 46 (Single Answer)

Which SQL statement modifies existing rows?

A. INSERT
B. UPDATE
C. SELECT
D. CREATE

✅ Answer: B — UPDATE

Question 47 (Single Answer)

Which SQL JOIN returns all rows from the left table and matching rows from the right table?

A. INNER JOIN
B. LEFT JOIN
C. RIGHT JOIN
D. CROSS JOIN

✅ Answer: B — LEFT JOIN

Question 48 (Matching)

Match the visualization to the purpose.

Visualization	Purpose
1. Line chart	A. Show relationships
2. Scatter chart	B. Show trends

✅ Answers

1 → B
2 → A

Question 49 (Single Answer)

Which Azure service supports Apache Spark analytics?

A. Azure Databricks
B. Azure Files
C. Azure DNS
D. Azure Firewall

✅ Answer: A — Azure Databricks

Question 50 (Single Answer)

Which storage type is MOST appropriate for key-value NoSQL storage?

A. Azure Table Storage
B. Azure SQL Database
C. Azure Files
D. Azure Synapse Analytics

✅ Answer: A — Azure Table Storage

Section 7 — Mixed Difficulty Review

Question 51 (Single Answer)

What is the primary purpose of normalization?

A. Increase redundancy
B. Improve graphics rendering
C. Reduce duplicate data
D. Increase storage costs

✅ Answer: C — Reduce duplicate data

Question 52 (Single Answer)

Which data type stores audio and video files?

A. Structured
B. Semi-structured
C. Unstructured
D. Relational

✅ Answer: C — Unstructured

Question 53 (Multi-Answer)

Which are benefits of Power BI dashboards? (Choose TWO)

A. Real-time monitoring
B. Single-page summary
C. Operating system management
D. Virtual machine provisioning

✅ Answers: A and B

Question 54 (Single Answer)

Which service is MOST associated with IoT device ingestion?

A. Azure IoT Hub
B. Azure SQL Database
C. Azure Files
D. Azure Backup

✅ Answer: A — Azure IoT Hub

Question 55 (Single Answer)

Which Azure service provides a unified analytics platform with BI integration?

A. Microsoft Fabric
B. Azure Firewall
C. Azure DNS
D. Azure Backup

✅ Answer: A — Microsoft Fabric

Question 56 (Single Answer)

Which object improves database query performance?

A. Table
B. View
C. Index
D. Trigger

✅ Answer: C — Index

Question 57 (Single Answer)

Which workload typically uses OLTP systems?

A. Analytical
B. Transactional
C. Archival
D. Reporting-only

✅ Answer: B — Transactional

Question 58 (Fill in the Blank)

The SQL statement used to remove rows from a table is __________.

✅ Answer: DELETE

Question 59 (Single Answer)

Which Azure SQL offering is a Platform as a Service (PaaS) solution?

A. SQL Server on Azure Virtual Machines
B. Azure SQL Database
C. Windows Server
D. Hyper-V

✅ Answer: B — Azure SQL Database

Question 60 (Single Answer)

Which Power BI visualization is MOST appropriate for showing progress toward a goal?

A. Scatter chart
B. Gauge chart
C. Table
D. Pie chart

✅ Answer: B — Gauge chart

Final Exam Tips

Focus heavily on:

Relational vs non-relational data
Azure storage services
Azure SQL family
Cosmos DB features
Power BI basics
Analytics workloads
Batch vs streaming concepts

Frequently tested associations:

Blob Storage → unstructured files
Event Hubs → streaming ingestion
Stream Analytics → real-time processing
Cosmos DB → globally distributed NoSQL
Power BI → visualization and reporting
DAX → calculations
Power Query → transformation

Power BI Visualization Tips

Line chart → trends
Bar chart → comparisons
Pie chart → proportions
Scatter chart → relationships
Card → single KPI
Map → geographic data

Go to the DP-900 Exam Prep Hub main page.

Analytics, Cloud computing, DP-900, Microsoft Certification May 11, 2026

Practice Questions: Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric) (DP-900 Exam Prep)

Practice Questions

Question 1

What is the primary purpose of Azure Databricks?

A. Hosting relational databases
B. Managing file shares
C. Processing large-scale data using Apache Spark
D. Running virtual machines

✅ Answer: C

Explanation:
Azure Databricks is built on Apache Spark for large-scale data processing.

Question 2

Which feature is a key characteristic of Azure Databricks?

A. Fixed schema relational tables
B. Distributed data processing
C. File-based storage only
D. Limited scalability

✅ Answer: B

Explanation:
Databricks uses distributed computing to process large datasets efficiently.

Question 3

Which scenario is BEST suited for Azure Databricks?

A. Hosting a transactional database
B. Running large-scale ETL pipelines and machine learning models
C. Managing shared file storage
D. Serving static web pages

✅ Answer: B

Explanation:
Databricks is ideal for data engineering and machine learning at scale.

Question 4

What is Microsoft Fabric primarily designed for?

A. Running operating systems
B. Providing a unified, end-to-end analytics platform
C. Managing virtual networks
D. Hosting relational databases only

✅ Answer: B

Explanation:
Microsoft Fabric integrates multiple analytics capabilities into one unified platform.

Question 5

Which component of Microsoft Fabric serves as a unified data storage layer?

A. Azure Blob Storage
B. SQL Database
C. OneLake
D. Azure Files

✅ Answer: C

Explanation:
OneLake is the centralized storage layer within Microsoft Fabric.

Question 6

Which service is BEST suited for organizations that want a single platform for data engineering, data warehousing, and BI?

A. Azure Virtual Machines
B. Azure Databricks
C. Microsoft Fabric
D. Azure Table Storage

✅ Answer: C

Explanation:
Fabric provides an end-to-end unified analytics experience.

Question 7

Which of the following best describes the difference between Azure Databricks and Microsoft Fabric?

A. Databricks is for storage, Fabric is for compute
B. Databricks focuses on big data processing, Fabric provides a unified analytics platform
C. Fabric only supports relational data, Databricks does not
D. Databricks cannot scale, Fabric can

✅ Answer: B

Explanation:
Databricks focuses on processing and ML, while Fabric provides end-to-end analytics.

Question 8

Which programming environments are commonly supported in Azure Databricks notebooks?

A. HTML and CSS only
B. Python, SQL, Scala, and R
C. JavaScript only
D. PowerShell only

✅ Answer: B

Explanation:
Databricks notebooks support multiple languages including Python, SQL, Scala, and R.

Question 9

Which scenario is NOT ideal for Azure Databricks?

A. Large-scale data transformation
B. Machine learning model training
C. Managing simple file shares
D. Processing streaming data

✅ Answer: C

Explanation:
Databricks is not designed for file-sharing scenarios.

Question 10

Which statement about Microsoft Fabric is TRUE?

A. It requires manual infrastructure management
B. It is a SaaS-based unified analytics platform
C. It only supports batch processing
D. It replaces all Azure services

✅ Answer: B

Explanation:
Microsoft Fabric is a fully managed SaaS platform that integrates analytics services.

✅ Quick Exam Takeaways

✔ Azure Databricks

Apache Spark-based
Distributed processing
Data engineering & machine learning

✔ Microsoft Fabric

Unified analytics platform
End-to-end solution (data + analytics + BI)
Includes OneLake storage

✔ Key differences:

Databricks → processing & ML
Fabric → all-in-one analytics platform

✔ Exam tip:
👉 Big data processing → Azure Databricks
👉 Unified analytics platform → Microsoft Fabric

Go to the DP-900 Exam Prep Hub main page.

Data Engineering, DP-900, Microsoft Certification May 10, 2026

Practice Questions: Describe responsibilities for data engineers (DP-900 Exam Prep)

Practice Questions

Question 1

Which task is a primary responsibility of a data engineer?

A. Creating dashboards for business users
B. Managing database user permissions
C. Building and maintaining data pipelines
D. Training machine learning models

✅ Answer: C

Explanation:
Data engineers are responsible for designing and maintaining data pipelines that move and transform data.

Question 2

A company needs to collect data from multiple systems and prepare it for reporting.

Which role is primarily responsible for this task?

A. Data Analyst
B. Database Administrator
C. Data Engineer
D. Business User

✅ Answer: C

Explanation:
Data engineers handle data ingestion, integration, and preparation for downstream analytics.

Question 3

Which process involves extracting data from sources, transforming it, and loading it into a destination system?

A. OLTP
B. ETL
C. OLAP
D. ACID

✅ Answer: B

Explanation:
ETL (Extract, Transform, Load) is a core responsibility of data engineers.

Question 4

Which Azure service is commonly used by data engineers to orchestrate data pipelines?

A. Azure SQL Database
B. Azure Data Factory
C. Azure Blob Storage
D. Azure Virtual Machines

✅ Answer: B

Explanation:
Azure Data Factory is used to build, schedule, and manage data pipelines.

Question 5

Which responsibility ensures that data used for analytics is accurate and reliable?

A. Query optimization
B. Data visualization
C. Data quality management
D. User authentication

✅ Answer: C

Explanation:
Data engineers ensure data quality through validation and cleaning processes.

Question 6

A data engineer is working with large-scale data processing using Apache Spark.

Which Azure service are they MOST likely using?

A. Azure SQL Database
B. Azure Cosmos DB
C. Azure Databricks
D. Azure Table Storage

✅ Answer: C

Explanation:
Azure Databricks is a Spark-based platform used for large-scale data processing.

Question 7

Which storage solution is commonly used by data engineers for storing large volumes of raw and processed data?

A. Azure Data Lake Storage
B. Azure Queue Storage
C. Azure SQL Database
D. Azure Cache for Redis

✅ Answer: A

Explanation:
Azure Data Lake Storage is optimized for big data storage and analytics workloads.

Question 8

Which task is LEAST likely to be performed by a data engineer?

A. Transforming raw data into structured formats
B. Monitoring data pipelines
C. Creating Power BI dashboards
D. Integrating multiple data sources

✅ Answer: C

Explanation:
Creating dashboards is typically the responsibility of a data analyst, not a data engineer.

Question 9

Which type of data processing involves handling real-time data streams?

A. Batch processing
B. Streaming processing
C. Relational processing
D. Transactional processing

✅ Answer: B

Explanation:
Data engineers often work with streaming pipelines for real-time data ingestion.

Question 10

A data engineer selects Parquet as a storage format for a dataset.

What is the primary reason for this choice?

A. It is human readable
B. It supports transactional updates
C. It is optimized for analytical performance
D. It enforces a strict schema

✅ Answer: C

Explanation:
Parquet is a columnar format that improves performance for analytical workloads.

✅ Quick Exam Takeaways

For DP-900, remember data engineers:

✔ Build and manage data pipelines
✔ Handle ETL/ELT processes
✔ Work with batch and streaming data
✔ Ensure data quality and reliability
✔ Manage data storage solutions (Data Lake, Blob)
✔ Use Azure services like:

Azure Data Factory
Azure Databricks
Azure Data Lake Storage
Azure Synapse Analytics

✔ Enable analytics and BI by preparing data

Go to the DP-900 Exam Prep Hub main page.

Analytics, Cloud computing, DP-900, Microsoft Certification May 10, 2026May 10, 2026

Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric) (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
   --> Describe common elements of large-scale analytics
      --> Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric)

Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Modern analytics workloads often require processing massive volumes of data quickly and efficiently. Microsoft provides powerful cloud services to meet these needs, including Azure Databricks and Microsoft Fabric.

For the DP-900 exam, you should understand what these services are, their key features, and when to use each.

Why Large-Scale Analytics Services Matter

Large-scale analytics involves:

Processing big data (TBs to PBs)
Supporting batch and real-time workloads
Enabling advanced analytics and machine learning

✔ Traditional tools often cannot scale to meet these demands.

Azure Databricks

What Is Azure Databricks?

Azure Databricks is a cloud-based analytics platform built on Apache Spark.

It is designed for:

Big data processing
Data engineering
Machine learning
Collaborative analytics

Key Features

1. Apache Spark-Based Processing

Distributed computing engine
Processes large datasets in parallel

✔ Ideal for big data workloads

2. Collaborative Workspace

Notebooks (Python, SQL, Scala, R)
Multiple users can collaborate

3. Integration with Azure

Works with Azure Data Lake Storage
Integrates with Azure Synapse Analytics

4. Machine Learning Support

Built-in ML capabilities
Supports advanced analytics workflows

Common Use Cases

Big data processing (ETL/ELT pipelines)
Data science and machine learning
Real-time analytics
Data transformation at scale

✔ Best for: Data engineers and data scientists working with large datasets

Microsoft Fabric

What Is Microsoft Fabric?

Microsoft Fabric is an end-to-end, unified analytics platform that brings together multiple data services into a single environment.

It integrates:

Data engineering
Data warehousing
Data science
Real-time analytics
Business intelligence

Key Features

1. Unified Platform

Combines multiple services into one
Reduces complexity of managing separate tools

2. OneLake (Unified Storage Layer)

Centralized data lake for all workloads
Eliminates data silos

3. Integrated Analytics Experiences

Data Factory (ingestion)
Data Warehouse
Real-Time Analytics
Power BI integration

4. SaaS-Based Model

Fully managed platform
Minimal infrastructure management

Common Use Cases

End-to-end analytics solutions
Unified data platform for organizations
Business intelligence and reporting
Data integration and transformation

✔ Best for: Organizations wanting a single, unified analytics solution

Azure Databricks vs Microsoft Fabric

Feature	Azure Databricks	Microsoft Fabric
Focus	Big data processing & ML	End-to-end analytics platform
Engine	Apache Spark	Multiple integrated engines
Users	Data engineers, data scientists	Broad (engineers, analysts, business users)
Complexity	More flexible, more technical	Simpler, unified experience
Use Case	Advanced analytics & ML	Unified analytics and BI

How They Fit in an Analytics Architecture

Typical roles:

Azure Databricks
- Data processing
- Advanced transformations
- Machine learning
Microsoft Fabric
- End-to-end pipeline
- Storage (OneLake)
- Reporting (Power BI integration)

✔ They can complement each other in modern architectures.

Key Considerations When Choosing

Choose Azure Databricks when:

You need advanced data engineering or machine learning
You require Spark-based processing
You want full control and flexibility

Choose Microsoft Fabric when:

You want a unified analytics platform
You prefer simplified, integrated workflows
You need end-to-end analytics in one place

Why This Matters for DP-900

On the exam, you may be asked to:

Identify the purpose of Azure Databricks
Recognize Microsoft Fabric as a unified analytics platform
Choose the right service for a scenario
Understand how these services support large-scale analytics

Summary — Exam-Relevant Takeaways

✔ Azure Databricks

Apache Spark-based
Big data processing
Machine learning
Flexible and powerful

✔ Microsoft Fabric

Unified analytics platform
End-to-end solution
Includes data engineering, warehousing, and BI

✔ Key difference:

Databricks → advanced processing & ML
Fabric → all-in-one analytics platform

✔ Exam tip:
👉 Spark + big data processing → Azure Databricks
👉 Unified analytics platform → Microsoft Fabric

Go to the Practice Exam Questions for this topic.

Go to the DP-900 Exam Prep Hub main page.