Month: May 2026

Describe considerations for privacy and security in an AI Solution (AI-901 Exam Prep)

This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Identify AI concepts and capabilities (40–45%)
--> Describe principles of responsible AI
--> Describe considerations for privacy and security in an AI Solution


Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Privacy and security are essential principles of Responsible AI and important topics for the AI-901 certification exam. Microsoft emphasizes that AI systems must protect sensitive information, respect user privacy, and defend against unauthorized access or malicious attacks.

As AI systems increasingly process personal, financial, medical, and business data, organizations must ensure that their AI solutions are secure and trustworthy.


What Are Privacy and Security in AI?

Although related, privacy and security are different concepts.

ConceptMeaning
PrivacyProtecting personal and sensitive information and ensuring proper data usage
SecurityProtecting systems, models, and data from unauthorized access, attacks, or misuse

Both principles are critical when developing and deploying AI systems.


Why Privacy and Security Matter

AI systems often process large amounts of sensitive information, including:

  • Personal data
  • Financial records
  • Medical information
  • Images and videos
  • Voice recordings
  • Customer behavior data
  • Business intelligence data

If privacy or security is compromised, organizations may face:

  • Data breaches
  • Identity theft
  • Financial loss
  • Legal penalties
  • Loss of customer trust
  • Regulatory violations

Responsible AI requires organizations to safeguard both the data and the systems that use it.


Privacy Considerations in AI


Collect Only Necessary Data

Organizations should collect only the data required for the AI solution to function properly.

This concept is often called data minimization.

Example

A movie recommendation system may need viewing preferences but may not need a user’s medical history.

Collecting unnecessary data increases privacy risks.


User Consent and Transparency

Users should understand:

  • What data is being collected
  • Why the data is being collected
  • How the data will be used
  • Who can access the data

Organizations should obtain appropriate user consent before collecting or processing personal information.

Example

A voice assistant application should clearly inform users that voice recordings are being stored and analyzed.


Protect Sensitive Information

Sensitive data should be carefully protected during:

  • Collection
  • Storage
  • Processing
  • Transmission

Examples of sensitive information include:

  • Social Security numbers
  • Credit card data
  • Medical records
  • Biometric data

Organizations often use encryption and access controls to protect sensitive data.


Anonymization and Masking

Organizations can reduce privacy risks by removing or hiding personally identifiable information (PII).

Techniques include:

  • Anonymization
  • Data masking
  • Tokenization

Example

A healthcare AI system may replace patient names with anonymous identifiers before training a model.


Compliance with Regulations

Organizations must comply with privacy laws and regulations.

Examples include:

  • GDPR (General Data Protection Regulation)
  • HIPAA (Health Insurance Portability and Accountability Act)
  • CCPA (California Consumer Privacy Act)

AI systems should be designed with regulatory compliance in mind.


Security Considerations in AI


Protecting AI Systems from Unauthorized Access

AI systems should include strong authentication and authorization controls.

Examples

  • Multi-factor authentication (MFA)
  • Role-based access control (RBAC)
  • Identity management systems

Only authorized users should be able to access sensitive models or data.


Securing Data

Data should be protected both:

  • At rest (stored data)
  • In transit (moving across networks)

Encryption is commonly used to secure data in both situations.


Protecting Models from Attacks

AI systems can be targets for malicious attacks.

Examples include:

  • Adversarial attacks
  • Data poisoning
  • Model theft
  • Prompt injection attacks in generative AI systems

Organizations should monitor for suspicious activity and secure AI infrastructure.


Adversarial Attacks

An adversarial attack occurs when someone intentionally manipulates input data to fool an AI model.

Example

Small changes to an image may cause an AI vision system to incorrectly identify an object.

These attacks can reduce reliability and create safety risks.


Data Poisoning

Data poisoning occurs when attackers intentionally insert misleading or malicious data into training datasets.

Example

An attacker adds fraudulent examples into a spam detection dataset so spam messages are classified as safe.

This can compromise model accuracy and trustworthiness.


Generative AI Security Risks

Generative AI introduces additional privacy and security challenges.

Examples include:

  • Prompt injection attacks
  • Exposure of confidential data
  • Harmful content generation
  • Leakage of sensitive training data

Organizations should implement safeguards such as:

  • Content filtering
  • Access restrictions
  • Human review
  • Monitoring and logging

Shared Responsibility in Cloud AI

When using cloud-based AI services such as Microsoft Azure AI Services, security responsibilities are shared.

Microsoft ResponsibilitiesCustomer Responsibilities
Physical infrastructure securityUser access management
Network securityProper configuration
Cloud platform protectionData governance
Service availabilityCompliance and policy management

Understanding the shared responsibility model is important for cloud security.


Real-World Example

Scenario: AI Banking Chatbot

A bank deploys an AI chatbot that helps customers manage accounts.

Privacy Considerations

  • Protect customer financial data
  • Obtain consent for data collection
  • Limit access to sensitive records
  • Mask account numbers in logs

Security Considerations

  • Use encryption
  • Require authentication
  • Prevent unauthorized access
  • Monitor for suspicious activity
  • Protect against prompt injection attacks

Risk Mitigation Strategies

  • Access controls
  • Security monitoring
  • Data anonymization
  • Regular audits
  • Employee security training

This type of scenario aligns well with AI-901 exam questions.


Privacy vs. Security

A common exam concept is understanding the difference between privacy and security.

Privacy Focuses On:

  • Proper use of personal data
  • User consent
  • Data collection practices
  • Data sharing limitations

Security Focuses On:

  • Protecting systems and data
  • Preventing attacks
  • Access control
  • Encryption
  • Threat detection

Privacy and security work together but are not the same thing.


Microsoft Responsible AI Principles

Microsoft identifies privacy and security as one of six core Responsible AI principles:

  1. Fairness
  2. Reliability and safety
  3. Privacy and security
  4. Inclusiveness
  5. Transparency
  6. Accountability

For AI-901, understand that privacy and security focus on protecting both users and AI systems.


Best Practices for Privacy and Security in AI

Organizations commonly use the following practices:


Encryption

Protect data by encrypting it:

  • At rest
  • In transit

Access Controls

Restrict system access using:

  • RBAC
  • MFA
  • Identity management

Data Governance

Establish policies for:

  • Data handling
  • Data retention
  • Data sharing
  • Compliance

Monitoring and Logging

Track suspicious behavior and system activity to detect threats early.


Regular Security Testing

Perform:

  • Vulnerability scans
  • Penetration testing
  • Security reviews

Human Oversight

Humans should monitor high-risk AI systems and review sensitive outputs.


Important AI-901 Exam Tips

For the exam, remember these key points:

  • Privacy protects personal and sensitive information.
  • Security protects systems, models, and data from attacks or unauthorized access.
  • Data minimization reduces privacy risk.
  • Encryption protects data at rest and in transit.
  • AI systems can face adversarial attacks and data poisoning.
  • Generative AI introduces additional security concerns.
  • User consent and transparency are important privacy considerations.
  • Privacy and security are one of Microsoft’s six Responsible AI principles.

Quick Knowledge Check

Question 1

What is the difference between privacy and security?

Answer

Privacy focuses on proper handling of personal data, while security focuses on protecting systems and data from threats and unauthorized access.


Question 2

What is data minimization?

Answer

Collecting only the data necessary for an AI solution to function.


Question 3

What is an adversarial attack?

Answer

An attempt to intentionally manipulate AI inputs to fool the model into producing incorrect results.


Question 4

Why is encryption important in AI systems?

Answer

It helps protect sensitive data from unauthorized access during storage and transmission.


Practice Exam Questions


Question 1

A company develops an AI-powered healthcare application that stores patient medical records.

Which practice BEST helps protect sensitive patient data?

A. Publicly sharing all training data
B. Encrypting stored and transmitted data
C. Removing all authentication requirements
D. Allowing unrestricted administrator access


Correct Answer

B. Encrypting stored and transmitted data


Explanation

Encryption protects sensitive information both while stored (at rest) and while moving across networks (in transit). This is a key privacy and security practice for AI systems handling confidential data.


Why the Other Answers Are Incorrect

A. Publicly sharing all training data

This would create major privacy risks.

C. Removing all authentication requirements

Authentication is necessary for security.

D. Allowing unrestricted administrator access

Access should be limited and controlled.


Question 2

What is the PRIMARY focus of privacy in an AI solution?

A. Preventing hardware failures
B. Protecting personal and sensitive information
C. Increasing processing speed
D. Improving graphics performance


Correct Answer

B. Protecting personal and sensitive information


Explanation

Privacy focuses on ensuring personal data is collected, stored, shared, and used responsibly and lawfully.


Why the Other Answers Are Incorrect

A. Preventing hardware failures

This relates to infrastructure reliability.

C. Increasing processing speed

Performance optimization is unrelated to privacy.

D. Improving graphics performance

Graphics performance is unrelated to Responsible AI privacy principles.


Question 3

Which scenario BEST demonstrates data minimization?

A. Collecting all available user data regardless of need
B. Collecting only the information necessary for the AI solution to function
C. Sharing customer data with external organizations
D. Storing user data indefinitely


Correct Answer

B. Collecting only the information necessary for the AI solution to function


Explanation

Data minimization means limiting data collection to only what is necessary for a specific purpose, reducing privacy risks.


Why the Other Answers Are Incorrect

A. Collecting all available user data regardless of need

This increases privacy risk.

C. Sharing customer data with external organizations

This may create additional privacy concerns.

D. Storing user data indefinitely

Long-term storage may increase compliance and security risks.


Question 4

An attacker slightly modifies an image so that an AI vision system incorrectly identifies an object.

What type of attack is this?

A. Data normalization
B. Adversarial attack
C. Batch processing
D. Role-based access control


Correct Answer

B. Adversarial attack


Explanation

Adversarial attacks intentionally manipulate inputs to fool AI systems into making incorrect predictions or classifications.


Why the Other Answers Are Incorrect

A. Data normalization

Normalization prepares data for analysis.

C. Batch processing

Batch processing refers to grouped data operations.

D. Role-based access control

RBAC is a security access management method.


Question 5

Which security measure helps ensure only authorized users can access an AI system?

A. Increasing training data size
B. Role-based access control (RBAC)
C. Removing encryption
D. Disabling audit logs


Correct Answer

B. Role-based access control (RBAC)


Explanation

RBAC restricts access based on user roles and permissions, helping secure AI systems and sensitive data.


Why the Other Answers Are Incorrect

A. Increasing training data size

Training data size does not control access.

C. Removing encryption

Removing encryption weakens security.

D. Disabling audit logs

Audit logs help monitor and investigate security events.


Question 6

What is the PRIMARY purpose of encryption in AI systems?

A. To increase model accuracy
B. To protect data from unauthorized access
C. To reduce cloud costs
D. To eliminate the need for passwords


Correct Answer

B. To protect data from unauthorized access


Explanation

Encryption converts data into a protected format that unauthorized users cannot easily read.

It is commonly used to secure sensitive information.


Why the Other Answers Are Incorrect

A. To increase model accuracy

Encryption does not improve prediction quality.

C. To reduce cloud costs

Encryption is a security measure, not a cost optimization tool.

D. To eliminate the need for passwords

Authentication may still be required.


Question 7

A company clearly informs users about what personal information is being collected and how it will be used before collecting the data.

What privacy concept does this BEST represent?

A. User consent and transparency
B. Adversarial testing
C. Model drift
D. Data poisoning


Correct Answer

A. User consent and transparency


Explanation

Responsible AI systems should inform users about data collection practices and obtain appropriate consent before using personal data.


Why the Other Answers Are Incorrect

B. Adversarial testing

Adversarial testing evaluates resistance to attacks.

C. Model drift

Model drift refers to performance changes over time.

D. Data poisoning

Data poisoning involves malicious manipulation of training data.


Question 8

An attacker intentionally inserts misleading examples into a training dataset to reduce model accuracy.

What is this called?

A. Encryption
B. Data masking
C. Data poisoning
D. Data normalization


Correct Answer

C. Data poisoning


Explanation

Data poisoning occurs when attackers deliberately manipulate training data to negatively affect AI model behavior.


Why the Other Answers Are Incorrect

A. Encryption

Encryption protects data confidentiality.

B. Data masking

Data masking hides sensitive information.

D. Data normalization

Normalization standardizes data values.


Question 9

Which statement BEST describes the difference between privacy and security?

A. Privacy and security are identical concepts
B. Privacy focuses on proper data usage, while security focuses on protecting systems and data from threats
C. Privacy focuses only on hardware devices
D. Security applies only to cloud computing


Correct Answer

B. Privacy focuses on proper data usage, while security focuses on protecting systems and data from threats


Explanation

Privacy concerns how personal data is collected and used, while security focuses on preventing unauthorized access, attacks, and data breaches.


Why the Other Answers Are Incorrect

A. Privacy and security are identical concepts

They are related but distinct principles.

C. Privacy focuses only on hardware devices

Privacy primarily concerns information handling.

D. Security applies only to cloud computing

Security applies to all computing environments.


Question 10

Which Microsoft Responsible AI principle focuses on protecting sensitive information and securing AI systems?

A. Fairness
B. Inclusiveness
C. Privacy and security
D. Transparency


Correct Answer

C. Privacy and security


Explanation

The Privacy and Security principle focuses on safeguarding personal data and protecting AI systems from threats, misuse, and unauthorized access.


Why the Other Answers Are Incorrect

A. Fairness

Fairness focuses on avoiding unjust bias and discrimination.

B. Inclusiveness

Inclusiveness focuses on designing systems accessible to diverse users.

D. Transparency

Transparency focuses on explainability and understanding AI decisions.


Final Thoughts

Privacy and security are foundational Responsible AI principles and key topics for the AI-901 certification exam. Microsoft expects candidates to understand how AI systems handle sensitive data, how security threats can affect AI solutions, and how organizations can protect both users and systems.

Strong privacy and security practices help organizations build trustworthy AI solutions while reducing legal, operational, and reputational risks.


Go to the AI-901 Exam Prep Hub main page

Describe considerations for reliability and safety in an AI Solution (AI-901 Exam Prep)

This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Identify AI concepts and capabilities (40–45%)
--> Describe principles of responsible AI
--> Describe considerations for reliability and safety in an AI Solution


Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Reliability and safety are essential principles of Responsible AI and are important topics for the AI-901 certification exam. Microsoft emphasizes that AI systems should operate consistently, safely, and predictably, especially when used in environments that impact people’s lives, finances, health, or security.

Understanding reliability and safety means understanding how AI systems can fail, the risks associated with those failures, and the methods organizations use to reduce those risks.


What Is Reliability and Safety in AI?

Reliability and safety refer to ensuring that AI systems:

  • Operate consistently
  • Produce dependable results
  • Minimize harmful outcomes
  • Perform safely under expected and unexpected conditions

A reliable AI system should continue functioning properly even when:

  • Data changes
  • Conditions vary
  • Users behave unexpectedly
  • Inputs are incomplete or unusual

A safe AI system should avoid causing physical, emotional, financial, or operational harm.


Why Reliability and Safety Matter

AI systems are increasingly used in high-impact scenarios such as:

  • Healthcare diagnostics
  • Autonomous vehicles
  • Financial fraud detection
  • Industrial automation
  • Security monitoring
  • Customer service
  • Smart home devices

Failures in these systems can lead to:

  • Incorrect medical recommendations
  • Financial losses
  • Physical injury
  • Security vulnerabilities
  • Loss of trust
  • Legal and compliance issues

Because of these risks, organizations must carefully design, test, and monitor AI solutions.


Reliability vs. Safety

Although closely related, reliability and safety are slightly different concepts.

ConceptMeaning
ReliabilityThe AI system consistently performs as expected
SafetyThe AI system avoids causing harm

Example

A self-driving car that correctly detects road signs most of the time may be considered reliable.

However, if it occasionally fails in dangerous situations and causes accidents, it is not safe enough.

Both principles must work together.


Key Reliability Considerations


Consistent Performance

AI systems should deliver stable and dependable outputs over time.

Example

A fraud detection model should consistently identify suspicious transactions accurately, not fluctuate unpredictably from day to day.

Inconsistent behavior reduces user trust and may create operational problems.


Handling Unexpected Inputs

AI systems should manage unusual or incomplete inputs gracefully.

Example

A chatbot should respond appropriately when receiving misspelled text, slang, or unsupported questions rather than producing harmful or nonsensical responses.

This is sometimes called robustness.


Testing Across Different Conditions

AI systems should be tested under a wide variety of conditions before deployment.

Examples

  • Different user groups
  • Varying lighting conditions for image recognition
  • Different accents in speech recognition
  • Heavy workloads and traffic spikes
  • Missing or corrupted data

Comprehensive testing helps identify weaknesses before users are affected.


Monitoring After Deployment

AI reliability can degrade over time because:

  • User behavior changes
  • New data patterns emerge
  • Business environments evolve

This is often called model drift or data drift.

Organizations should continuously monitor AI systems to ensure they continue performing correctly.


Fail-Safe Mechanisms

AI systems should include safeguards in case something goes wrong.

Example

If an AI-powered medical system is uncertain about a diagnosis, it could escalate the case to a human doctor rather than making an unsafe recommendation.

Fail-safe mechanisms reduce the risk of harmful outcomes.


Key Safety Considerations


Preventing Harmful Outcomes

AI systems should minimize the possibility of causing harm.

Potential harms include:

  • Physical harm
  • Emotional harm
  • Financial harm
  • Reputational harm
  • Security risks

Example

A content moderation AI should avoid exposing users to dangerous or abusive material.


Human Oversight

Humans should remain involved in high-risk or sensitive AI decisions.

Examples

  • Doctors reviewing AI-assisted diagnoses
  • Loan officers reviewing loan denials
  • Security analysts reviewing threat alerts

Human oversight helps catch errors and improve accountability.


Security Against Attacks

AI systems can become targets for malicious attacks.

Examples include:

  • Feeding misleading data into models
  • Attempting to manipulate outputs
  • Extracting sensitive information
  • Prompt injection attacks in generative AI systems

Organizations must secure AI systems just like any other software system.


Reliability in Generative AI

Generative AI systems introduce additional reliability and safety challenges.

These systems may:

  • Generate incorrect information
  • Produce harmful content
  • Hallucinate facts
  • Create biased responses
  • Misinterpret prompts

Example

A generative AI chatbot may confidently provide inaccurate medical advice.

Because of this, generative AI systems often require:

  • Content filtering
  • Human review
  • Safety policies
  • Usage restrictions
  • Grounding with trusted data sources

Real-World Example

Scenario: AI Medical Assistant

A hospital deploys an AI solution that helps doctors identify diseases from medical images.

Reliability Requirements

  • Accurate image analysis
  • Consistent performance across different equipment
  • Reliable operation during heavy usage

Safety Requirements

  • Avoid dangerous misdiagnoses
  • Escalate uncertain cases to physicians
  • Protect patient data
  • Prevent harmful recommendations

Risk Mitigation Strategies

  • Extensive testing
  • Human oversight
  • Continuous monitoring
  • Security protections
  • Regular retraining

This type of scenario aligns well with AI-901 exam questions.


Common Causes of Reliability Problems

AI systems can become unreliable for many reasons.

Poor Quality Data

Incorrect or incomplete data can reduce model performance.

Example

A weather prediction system trained on inaccurate historical data may produce unreliable forecasts.


Insufficient Testing

Limited testing may fail to expose weaknesses.

Example

A facial recognition model tested only in bright lighting may fail in darker environments.


Data Drift

Real-world conditions may change over time.

Example

Customer purchasing behavior may evolve, reducing the accuracy of recommendation systems.


Adversarial Attacks

Malicious actors may intentionally manipulate AI systems.

Example

Small image modifications may fool computer vision systems into making incorrect classifications.


Microsoft Responsible AI Principles

Microsoft identifies reliability and safety as one of six core Responsible AI principles:

  1. Fairness
  2. Reliability and safety
  3. Privacy and security
  4. Inclusiveness
  5. Transparency
  6. Accountability

For AI-901, understand that reliability and safety focus on ensuring AI systems function dependably and minimize harmful outcomes.


Methods for Improving Reliability and Safety

Organizations use several strategies to improve AI reliability and safety.


Robust Testing

Test systems using:

  • Edge cases
  • Rare scenarios
  • Large workloads
  • Diverse user conditions
  • Adversarial testing

Monitoring and Logging

Track system behavior after deployment to identify:

  • Accuracy degradation
  • Failures
  • Unexpected outputs
  • Security concerns

Human-in-the-Loop Systems

Allow humans to review sensitive decisions before action is taken.


Safety Constraints

Limit what an AI system can do.

Example

A chatbot may block harmful or unsafe responses using content moderation filters.


Backup and Recovery Plans

Organizations should prepare for failures by implementing:

  • Rollback procedures
  • Redundant systems
  • Emergency shutdown controls

Azure and Responsible AI

Microsoft Azure AI Services and related AI platforms include features that help organizations improve reliability and safety, such as:

  • Monitoring tools
  • Security controls
  • Content filtering
  • Responsible AI guidance
  • Human review workflows
  • Governance frameworks

Microsoft encourages organizations to incorporate these principles throughout the AI lifecycle.


Important AI-901 Exam Tips

For the exam, remember these key points:

  • Reliability means AI systems perform consistently and dependably.
  • Safety means AI systems minimize harmful outcomes.
  • AI systems should be tested under many conditions.
  • Human oversight is important in sensitive scenarios.
  • Monitoring after deployment is essential.
  • Generative AI introduces additional safety risks.
  • Fail-safe mechanisms help reduce harm.
  • Reliability and safety are one of Microsoft’s six Responsible AI principles.

Quick Knowledge Check

Question 1

What is the primary goal of reliability in AI?

Answer

To ensure the AI system consistently performs as expected.


Question 2

Why is monitoring AI systems after deployment important?

Answer

Because data and user behavior can change over time, potentially reducing model performance.


Question 3

What is an example of a fail-safe mechanism?

Answer

Escalating uncertain AI decisions to a human reviewer.


Question 4

Why can generative AI systems create safety concerns?

Answer

Because they may generate inaccurate, harmful, or misleading content.


Practice Exam Questions


Question 1

A company deploys an AI-powered medical imaging system. The system automatically flags uncertain diagnoses for review by a physician before final decisions are made.

What Responsible AI practice does this BEST represent?

A. Data minimization
B. Human oversight
C. Data labeling
D. Batch processing


Correct Answer

B. Human oversight


Explanation

Human oversight involves allowing people to review, validate, or override AI decisions, especially in high-risk scenarios such as healthcare.

This helps reduce the risk of harmful outcomes.


Why the Other Answers Are Incorrect

A. Data minimization

Data minimization relates to collecting only necessary data.

C. Data labeling

Data labeling is the process of tagging training data.

D. Batch processing

Batch processing refers to processing data in groups.


Question 2

What is the PRIMARY goal of reliability in an AI solution?

A. Increasing advertising revenue
B. Ensuring the AI system performs consistently as expected
C. Eliminating all operational costs
D. Replacing all human workers


Correct Answer

B. Ensuring the AI system performs consistently as expected


Explanation

Reliability means an AI system consistently produces dependable and stable results under expected and unexpected conditions.


Why the Other Answers Are Incorrect

A. Increasing advertising revenue

Revenue generation is unrelated to Responsible AI reliability principles.

C. Eliminating all operational costs

Reliability focuses on system performance, not cost elimination.

D. Replacing all human workers

Responsible AI does not require complete automation.


Question 3

An AI chatbot receives unexpected user input containing spelling mistakes and slang. The chatbot still responds appropriately without crashing or producing harmful output.

What characteristic is the chatbot demonstrating?

A. Transparency
B. Robustness
C. Data encryption
D. Scalability


Correct Answer

B. Robustness


Explanation

Robustness refers to an AI system’s ability to handle unexpected, incomplete, or unusual inputs safely and reliably.


Why the Other Answers Are Incorrect

A. Transparency

Transparency relates to understanding how AI decisions are made.

C. Data encryption

Encryption protects data security.

D. Scalability

Scalability refers to handling increased workloads.


Question 4

Why should AI systems be continuously monitored after deployment?

A. AI systems never change once deployed
B. Data patterns and user behavior may change over time
C. Monitoring guarantees perfect model accuracy
D. Monitoring removes the need for testing


Correct Answer

B. Data patterns and user behavior may change over time


Explanation

Changes in real-world conditions can reduce model accuracy and reliability over time. Continuous monitoring helps identify these issues early.

This is often related to data drift or model drift.


Why the Other Answers Are Incorrect

A. AI systems never change once deployed

AI performance can change as conditions evolve.

C. Monitoring guarantees perfect model accuracy

No monitoring system can guarantee perfection.

D. Monitoring removes the need for testing

Testing before deployment remains essential.


Question 5

Which scenario BEST demonstrates a safety concern in AI?

A. A report loads slowly in a dashboard
B. A chatbot uses too much memory
C. An autonomous vehicle fails to recognize a pedestrian
D. A database backup takes longer than expected


Correct Answer

C. An autonomous vehicle fails to recognize a pedestrian


Explanation

This scenario could lead to physical harm, making it a major AI safety concern.

Safety focuses on minimizing harmful outcomes.


Why the Other Answers Are Incorrect

A. A report loads slowly in a dashboard

This is a performance issue.

B. A chatbot uses too much memory

This is a resource management issue.

D. A database backup takes longer than expected

This is an infrastructure or operational issue.


Question 6

What is a fail-safe mechanism in AI?

A. A process that guarantees 100% model accuracy
B. A backup plan that reduces harm when the AI system encounters problems
C. A method for increasing advertising performance
D. A process that removes all security requirements


Correct Answer

B. A backup plan that reduces harm when the AI system encounters problems


Explanation

Fail-safe mechanisms help prevent harmful outcomes if the AI system becomes uncertain or fails unexpectedly.

Example: Escalating uncertain medical diagnoses to human experts.


Why the Other Answers Are Incorrect

A. A process that guarantees 100% model accuracy

No AI system can guarantee perfect accuracy.

C. A method for increasing advertising performance

Advertising optimization is unrelated to fail-safe mechanisms.

D. A process that removes all security requirements

Security remains critically important.


Question 7

Which statement BEST describes the difference between reliability and safety?

A. Reliability focuses on consistent performance, while safety focuses on minimizing harm
B. Reliability and safety are identical concepts
C. Reliability applies only to hardware systems
D. Safety focuses only on data storage


Correct Answer

A. Reliability focuses on consistent performance, while safety focuses on minimizing harm


Explanation

Reliability ensures dependable system behavior, while safety ensures the AI system avoids causing harm.

Both are key Responsible AI principles.


Why the Other Answers Are Incorrect

B. Reliability and safety are identical concepts

They are closely related but distinct principles.

C. Reliability applies only to hardware systems

Reliability applies to AI software systems as well.

D. Safety focuses only on data storage

Safety includes preventing harmful outcomes.


Question 8

A generative AI system confidently provides incorrect medical advice.

What Responsible AI concern does this BEST represent?

A. Scalability
B. Hallucination and safety risk
C. Database normalization
D. Data compression


Correct Answer

B. Hallucination and safety risk


Explanation

Generative AI systems can sometimes generate inaccurate or fabricated information, known as hallucinations.

In healthcare scenarios, this creates significant safety concerns.


Why the Other Answers Are Incorrect

A. Scalability

Scalability concerns handling workload increases.

C. Database normalization

Normalization relates to database design.

D. Data compression

Compression reduces storage size.


Question 9

Why is extensive testing important before deploying an AI solution?

A. To identify weaknesses and unsafe behavior under different conditions
B. To guarantee the AI will never fail
C. To eliminate the need for monitoring after deployment
D. To reduce the amount of training data required


Correct Answer

A. To identify weaknesses and unsafe behavior under different conditions


Explanation

Testing across many conditions helps organizations discover problems before users are affected.

Testing improves reliability and safety.


Why the Other Answers Are Incorrect

B. To guarantee the AI will never fail

No testing process can guarantee zero failures.

C. To eliminate the need for monitoring after deployment

Monitoring remains necessary after deployment.

D. To reduce the amount of training data required

Testing does not reduce training data needs.


Question 10

Which Microsoft Responsible AI principle focuses on ensuring AI systems operate dependably and minimize harmful outcomes?

A. Inclusiveness
B. Accountability
C. Reliability and safety
D. Transparency


Correct Answer

C. Reliability and safety


Explanation

The Reliability and Safety principle focuses on ensuring AI systems operate consistently, safely, and predictably while reducing the risk of harmful outcomes.


Why the Other Answers Are Incorrect

A. Inclusiveness

Inclusiveness focuses on designing AI systems for diverse populations.

B. Accountability

Accountability concerns responsibility for AI systems and decisions.

D. Transparency

Transparency focuses on explainability and understanding AI behavior.


Final Thoughts

Reliability and safety are foundational concepts in Responsible AI and key topics for the AI-901 certification exam. Microsoft expects candidates to understand how AI systems can fail, how those failures can affect people and organizations, and how responsible design practices can reduce risks.

Reliable and safe AI systems help organizations build trust, reduce harm, and create more dependable AI-powered solutions.


Go to the AI-901 Exam Prep Hub main page

Describe considerations for fairness in an AI solution (AI-901 Exam Prep)

This post is a part of the AI-901: Microsoft Azure AI Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Identify AI concepts and capabilities (40–45%)
--> Describe principles of responsible AI
--> Describe considerations for fairness in an AI solution


Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Fairness is one of the core principles of Responsible AI and is an important topic for the AI-901 certification exam. Microsoft emphasizes that AI systems should treat all people fairly and avoid producing biased or discriminatory outcomes.

Understanding fairness in AI means understanding how bias can enter an AI system, how unfair outcomes can affect people, and what organizations can do to reduce those risks.


What Is Fairness in AI?

Fairness in AI means that an AI system should make decisions or recommendations without unjustly favoring or disadvantaging individuals or groups.

An AI solution is considered unfair if it produces biased outcomes based on characteristics such as:

  • Gender
  • Race or ethnicity
  • Age
  • Religion
  • Disability status
  • Nationality
  • Socioeconomic background

The goal is not simply technical accuracy. An AI model can be highly accurate overall while still treating certain groups unfairly.


Why Fairness Matters

AI systems increasingly influence important real-world decisions, including:

  • Hiring and recruiting
  • Loan approvals
  • Healthcare recommendations
  • Insurance pricing
  • Criminal justice assessments
  • School admissions
  • Customer service prioritization

If these systems are unfair, they can reinforce or amplify existing social inequalities.

For example:

  • A hiring AI might prefer resumes from men because historical company data reflects mostly male hires.
  • A facial recognition system may perform poorly for people with darker skin tones if training data lacked diversity.
  • A loan approval model may unfairly deny applications from certain neighborhoods because of biased historical lending patterns.

These outcomes can damage trust, create legal risks, and harm individuals.


How Bias Enters an AI System

Fairness problems usually originate from bias in data, design, or implementation.

1. Biased Training Data

AI models learn patterns from historical data. If the historical data reflects human bias, the AI may learn and repeat that bias.

Example

If a company historically hired mostly men for engineering roles, an AI recruiting tool trained on that data may incorrectly learn that male candidates are preferable.

This is one of the most common causes of unfair AI systems.


2. Underrepresentation in Data

Some groups may not be sufficiently represented in the training dataset.

Example

A speech recognition model trained mostly on American English speakers may perform poorly for people with different accents.

When data lacks diversity, the AI system may not generalize well to all users.


3. Labeling Bias

Humans often label training data. Human assumptions and prejudices can influence those labels.

Example

If reviewers consistently rate certain groups more negatively during data labeling, the AI model may inherit those patterns.


4. Feature Selection Bias

Sometimes developers unintentionally include features that correlate with protected characteristics.

Example

Using ZIP codes in a lending model could indirectly reflect race or income levels.

Even if race is not explicitly included, proxy variables can still create unfair outcomes.


5. Algorithmic Bias

Some algorithms may optimize for overall accuracy while ignoring fairness across groups.

Example

An AI model may achieve 95% accuracy overall but perform significantly worse for a minority population.

This demonstrates why fairness metrics matter alongside accuracy metrics.


Key Fairness Considerations

When evaluating fairness in an AI solution, organizations should consider several important areas.


Equal Treatment

AI systems should provide similar quality of service and outcomes across different demographic groups.

Example

A facial recognition system should work equally well for all skin tones and genders.


Avoiding Discrimination

AI should not unfairly disadvantage protected groups.

Example

A hiring system should evaluate applicants based on qualifications rather than demographic patterns found in historical data.


Inclusive Design

AI systems should be designed for diverse populations from the beginning.

This includes:

  • Diverse datasets
  • Diverse testing groups
  • Accessibility considerations
  • Multiple languages and accents
  • Cultural differences

Transparency and Explainability

Organizations should understand how AI systems make decisions and be able to explain those decisions when needed.

Example

If a loan application is denied, the organization should be able to explain the factors involved.

Explainability helps identify unfair behavior and improves accountability.


Continuous Monitoring

Fairness is not a one-time task.

AI systems should be continuously monitored because:

  • Data changes over time
  • User populations evolve
  • Biases may emerge after deployment

Organizations should regularly review model outputs and retrain models when necessary.


Trade-Offs in Fairness

Fairness in AI is complex because different definitions of fairness can conflict.

For example:

  • Maximizing overall accuracy may reduce fairness for smaller groups.
  • Equal outcomes across groups may require adjusting decision thresholds.
  • Removing sensitive attributes does not always eliminate bias.

There is often no perfect fairness solution, which is why ethical judgment and governance are important.


Microsoft’s Responsible AI Principles

Microsoft identifies fairness as one of six core Responsible AI principles.

The six principles are:

  1. Fairness
  2. Reliability and safety
  3. Privacy and security
  4. Inclusiveness
  5. Transparency
  6. Accountability

For the AI-901 exam, you should understand that fairness focuses on ensuring AI systems do not create unjust bias or discrimination.


Tools and Techniques for Improving Fairness

Organizations can reduce unfairness using several approaches.

Improve Data Quality

  • Use diverse and representative datasets
  • Remove biased or low-quality data
  • Balance underrepresented groups

Evaluate Fairness Metrics

Measure model performance across different groups instead of relying only on overall accuracy.

Example Metrics

  • False positive rates
  • False negative rates
  • Accuracy by demographic group

Human Oversight

Humans should remain involved in reviewing sensitive AI decisions.

Example

An AI hiring recommendation system might assist recruiters, but humans should make final hiring decisions.


Explainable AI

Explainability tools help organizations understand why models make certain decisions.

This can help detect hidden bias.


Responsible AI Governance

Organizations should establish policies, reviews, and ethical guidelines for AI development and deployment.


Real-World Example of Fairness

Scenario: AI-Based Hiring System

A company creates an AI model to screen resumes.

Potential Fairness Problem

Historical hiring data shows the company hired mostly men for technical roles.

The AI learns patterns associated with male candidates and begins ranking female candidates lower.

Possible Solutions

  • Use more diverse training data
  • Remove biased features
  • Audit model outputs regularly
  • Include human review
  • Test performance across demographic groups

This is a classic AI fairness scenario and aligns well with AI-901 exam objectives.


Azure and Responsible AI

Microsoft Azure AI Services and related AI platforms include Responsible AI guidance and tools to help developers:

  • Detect bias
  • Improve transparency
  • Monitor model behavior
  • Evaluate fairness metrics
  • Implement human oversight

Microsoft encourages organizations to adopt Responsible AI practices throughout the AI lifecycle.


Important AI-901 Exam Tips

For the exam, remember these key points:

  • Fairness means AI systems should avoid unjust bias and discrimination.
  • Bias often originates from training data.
  • High model accuracy does not guarantee fairness.
  • Diverse datasets help improve fairness.
  • Human oversight remains important.
  • Fairness is one of Microsoft’s six Responsible AI principles.
  • AI systems should be monitored continuously after deployment.
  • Transparency and explainability support fairness efforts.

Practice Exam Questions

Question 1

A company develops an AI system to screen job applicants. The system consistently ranks male applicants higher because historical hiring data mostly contains successful male candidates.

What is the MOST likely cause of this fairness issue?

A. Insufficient computing power
B. Biased training data
C. Excessive model transparency
D. Lack of cloud storage


Correct Answer

B. Biased training data


Explanation

The AI system learned patterns from historical hiring data that reflected past hiring bias. Because the training data was biased toward male candidates, the model inherited those unfair patterns.

This is one of the most common fairness problems in AI systems.


Why the Other Answers Are Incorrect

A. Insufficient computing power

Computing power affects performance and speed, not fairness.

C. Excessive model transparency

Transparency helps identify fairness problems rather than causing them.

D. Lack of cloud storage

Storage capacity does not create demographic bias in AI models.


Question 2

Which statement BEST describes fairness in AI?

A. AI systems should maximize profit for organizations
B. AI systems should make decisions without unjust bias
C. AI systems should eliminate all human involvement
D. AI systems should always make identical decisions for everyone


Correct Answer

B. AI systems should make decisions without unjust bias


Explanation

Fairness in AI focuses on preventing unjust discrimination and ensuring equitable treatment across different groups of people.

Fairness does not necessarily mean identical outcomes for everyone, but rather avoiding harmful or biased treatment.


Why the Other Answers Are Incorrect

A. AI systems should maximize profit for organizations

Profitability is unrelated to the Responsible AI principle of fairness.

C. AI systems should eliminate all human involvement

Human oversight is often important for maintaining fairness.

D. AI systems should always make identical decisions for everyone

Different circumstances may justify different outcomes. Fairness is about avoiding unjust bias.


Question 3

A speech recognition system performs poorly for users with certain accents because most training samples came from a single geographic region.

What fairness issue does this demonstrate?

A. Overfitting
B. Underrepresentation in training data
C. Excessive transparency
D. Encryption failure


Correct Answer

B. Underrepresentation in training data


Explanation

The training data lacked sufficient diversity, causing the model to perform poorly for underrepresented user groups.

Inclusive and representative datasets help improve fairness.


Why the Other Answers Are Incorrect

A. Overfitting

Overfitting occurs when a model memorizes training data rather than generalizing properly.

C. Excessive transparency

Transparency does not cause poor recognition accuracy for accents.

D. Encryption failure

Encryption relates to security, not fairness.


Question 4

Which Microsoft Responsible AI principle focuses on reducing bias and discrimination?

A. Accountability
B. Transparency
C. Fairness
D. Reliability and safety


Correct Answer

C. Fairness


Explanation

The Fairness principle focuses on ensuring AI systems do not unfairly disadvantage individuals or groups.


Why the Other Answers Are Incorrect

A. Accountability

Accountability concerns responsibility for AI systems and their outcomes.

B. Transparency

Transparency focuses on explainability and understanding AI decisions.

D. Reliability and safety

Reliability and safety focus on dependable and safe system operation.


Question 5

An organization removes race from a loan approval model, but the model still produces biased outcomes because ZIP code data indirectly reflects demographic patterns.

What does ZIP code represent in this scenario?

A. A fairness metric
B. A proxy variable
C. A transparency feature
D. A security control


Correct Answer

B. A proxy variable


Explanation

A proxy variable is a feature that indirectly correlates with sensitive attributes such as race, gender, or income level.

Even when protected attributes are removed, proxy variables can still introduce unfairness.


Why the Other Answers Are Incorrect

A. A fairness metric

Fairness metrics are measurements used to evaluate fairness.

C. A transparency feature

Transparency features help explain decisions, not indirectly encode demographic data.

D. A security control

Security controls protect systems and data.


Question 6

Why is human oversight important in AI systems that make sensitive decisions?

A. Humans can completely eliminate all bias
B. Humans can review and challenge potentially unfair outcomes
C. Humans increase automation speed
D. Humans reduce cloud costs


Correct Answer

B. Humans can review and challenge potentially unfair outcomes


Explanation

Human oversight helps organizations identify questionable or unfair AI decisions, especially in high-impact areas like hiring, healthcare, and finance.

AI systems should assist humans rather than fully replace judgment in sensitive scenarios.


Why the Other Answers Are Incorrect

A. Humans can completely eliminate all bias

Humans can reduce bias, but not completely eliminate it.

C. Humans increase automation speed

Human review usually slows processes rather than speeds them up.

D. Humans reduce cloud costs

Human oversight is unrelated to cloud pricing.


Question 7

An AI model achieves 98% accuracy overall but performs significantly worse for older adults than younger adults.

What does this scenario illustrate?

A. High accuracy guarantees fairness
B. Fairness and accuracy are always identical
C. An AI system can be accurate overall while still unfair
D. Transparency automatically prevents bias


Correct Answer

C. An AI system can be accurate overall while still unfair


Explanation

Overall accuracy can hide unequal performance across demographic groups. Fairness evaluations should measure outcomes for different populations separately.


Why the Other Answers Are Incorrect

A. High accuracy guarantees fairness

High accuracy does not guarantee equitable treatment.

B. Fairness and accuracy are always identical

These are different concepts and can conflict.

D. Transparency automatically prevents bias

Transparency helps identify issues but does not automatically eliminate them.


Question 8

Which action would BEST help improve fairness in an AI solution?

A. Limiting testing to a single user group
B. Using more diverse and representative training data
C. Hiding model outputs from reviewers
D. Reducing the amount of training data


Correct Answer

B. Using more diverse and representative training data


Explanation

Representative datasets improve an AI system’s ability to perform fairly across different populations and reduce bias caused by underrepresentation.


Why the Other Answers Are Incorrect

A. Limiting testing to a single user group

This increases the risk of bias and poor generalization.

C. Hiding model outputs from reviewers

Review and transparency help identify fairness issues.

D. Reducing the amount of training data

Less data often reduces model quality and fairness.


Question 9

Which of the following is an example of an unfair AI outcome?

A. A chatbot responding slowly during peak usage
B. A recommendation engine displaying duplicate products
C. A facial recognition system performing poorly for certain skin tones
D. A virtual machine running out of memory


Correct Answer

C. A facial recognition system performing poorly for certain skin tones


Explanation

Unequal performance across demographic groups is a classic fairness problem in AI systems.

This often results from insufficiently diverse training data.


Why the Other Answers Are Incorrect

A. A chatbot responding slowly during peak usage

This is a performance issue.

B. A recommendation engine displaying duplicate products

This is a recommendation quality issue.

D. A virtual machine running out of memory

This is an infrastructure issue.


Question 10

Why should AI systems be continuously monitored after deployment?

A. Fairness issues can emerge as data and user behavior change over time
B. AI systems never require updates after deployment
C. Monitoring removes the need for testing before deployment
D. Monitoring guarantees perfect fairness


Correct Answer

A. Fairness issues can emerge as data and user behavior change over time


Explanation

AI systems operate in changing environments. Data distributions, populations, and behaviors may evolve, creating new fairness risks after deployment.

Continuous monitoring is an important Responsible AI practice.


Why the Other Answers Are Incorrect

B. AI systems never require updates after deployment

AI systems often require retraining and adjustment.

C. Monitoring removes the need for testing before deployment

Pre-deployment testing remains essential.

D. Monitoring guarantees perfect fairness

No approach can guarantee perfect fairness in all situations.


Final Thoughts

Fairness is a foundational concept in Responsible AI and a critical topic for the AI-901 certification exam. Microsoft expects candidates to understand not only what fairness means, but also how bias enters AI systems and what organizations can do to reduce unfair outcomes.

As AI becomes more integrated into business and society, fairness is no longer optional—it is essential for building trustworthy and ethical AI solutions.


Go to the AI-901 Exam Prep Hub main page

DP-900: Microsoft Azure Data Fundamentals certification exam – Frequently Asked Questions (FAQs)

Below are some commonly asked questions about the DP-900: Microsoft Azure Data Fundamentals certification exam. Upon successfully passing this exam, you earn the Microsoft Certified: Azure Data Fundamentals certification.


What is the DP-900 certification exam?

The DP-900: Microsoft Azure Data Fundamentals exam validates your foundational knowledge of core data concepts and how data is implemented using Microsoft Azure services.

Candidates who pass the exam demonstrate understanding of:

  • Core data concepts (relational vs non-relational data, transactional vs analytical workloads)
  • Relational data workloads in Azure (Azure SQL Database, SQL Server on Azure Virtual Machines, Azure SQL Managed Instance)
  • Non-relational data workloads in Azure (Azure Cosmos DB)
  • Analytical workloads in Azure (Azure Synapse Analytics, Azure Data Factory, Azure Data Lake, Power BI)

This certification is designed for individuals who want to build a baseline understanding of data in the cloud. Upon successfully passing this exam, candidates earn the Microsoft Certified: Azure Data Fundamentals certification.


Is the DP-900 certification exam worth it?

The short answer is yes.

DP-900 is an excellent entry point into Microsoft’s data certification ecosystem. Preparing for this exam helps you:

  • Build a solid foundation in data concepts
  • Understand how Azure supports different data workloads
  • Gain confidence working with cloud-based data platforms
  • Prepare for more advanced certifications such as DP-203, PL-300, or AI-900

For beginners, career switchers, students, and professionals new to Azure or data, DP-900 provides structured learning and practical context that transfers directly to real-world scenarios.


How many questions are on the DP-900 exam?

The DP-900 exam typically contains between 40 and 60 questions.

Question formats may include:

  • Single-choice and multiple-choice questions
  • Multi-select questions
  • Drag-and-drop or matching questions
  • Short scenario-based questions

The exact number and format can vary from exam to exam.


How hard is the DP-900 exam?

DP-900 is considered a fundamentals-level exam and is generally easier than associate-level certifications such as PL-300 or DP-203.

That said, it still requires preparation.

The challenge comes from:

  • Understanding when to use relational vs non-relational data
  • Recognizing Azure services and their purposes
  • Interpreting scenario-based questions
  • Learning basic analytics concepts

With focused study and practice, most candidates find the exam very achievable.

Helpful preparation resources include:

  • Microsoft Learn (official and free)
  • The official DP-900 study guide
  • Practice exams
  • Community resources and blogs
  • YouTube tutorials and walkthroughs

How much does the DP-900 certification exam cost?

As of early 2026, the standard exam pricing is approximately:

  • United States: $99 USD
  • Other countries: Regionally adjusted pricing applies

Microsoft frequently offers student discounts, academic pricing, and exam vouchers, so it’s worth checking the official Microsoft certification site before scheduling.


How do I prepare for the Microsoft DP-900 certification exam?

The most important advice is not to rush.

Recommended preparation steps:

  1. Review the official DP-900 exam skills outline.
  2. Complete the free Microsoft Learn DP-900 learning path.
  3. Study core data concepts (relational vs non-relational, OLTP vs OLAP).
  4. Learn the purpose of key Azure services such as Azure SQL Database, Azure Cosmos DB, Azure Synapse Analytics, and Power BI.
  5. Take practice exams to confirm your readiness.

Additional learning resources include:

Hands-on labs are helpful but not strictly required for DP-900. Conceptual understanding is the primary focus.


How do I pass the DP-900 exam?

To maximize your chances of passing:

  • Focus on understanding concepts rather than memorization
  • Learn what each Azure data service is designed for
  • Carefully read scenario questions before answering
  • Eliminate obviously incorrect choices
  • Manage your time effectively

Consistently performing well on reputable practice exams is usually a good indicator that you’re ready.


What is the best site for DP-900 certification dumps?

Using exam dumps is not recommended and may violate Microsoft’s exam policies.

Instead, rely on legitimate preparation resources such as:

  • Microsoft’s official practice exam
  • High-quality community-created practice tests
  • Scenario-based questions that reinforce understanding

Legitimate preparation builds real skills that extend beyond passing the exam.


How long should I study for the DP-900 exam?

Study time varies based on background.

General guidelines:

  • Prior data or Azure experience: 2–4 weeks
  • Some technical background: 3–5 weeks
  • Beginners or career changers: 4–8 weeks

Rather than focusing strictly on time, aim to understand all exam topics and perform well on practice tests before scheduling.


Where can I find training or a course for the DP-900 exam?

Training options include:

  • Microsoft Learn: Free, official learning path
  • Online platforms: Udemy, Coursera, Exam Prep Hub for DP-900: Azure Data Fundamentals, and similar providers
  • YouTube: Free DP-900 playlists and walkthroughs
  • Subscription platforms: Datacamp and others offering Azure or data fundamentals
  • Microsoft partners: Instructor-led courses

A mix of structured learning and light hands-on exploration works well.


What skills should I have before taking the DP-900 exam?

Before attempting the exam, it helps to understand:

  • Basic data concepts (tables, rows, columns)
  • Differences between relational and non-relational data
  • Basic analytics terminology
  • General cloud computing concepts

No coding experience is required.

DP-900 is designed specifically for beginners.


What score do I need to pass the DP-900 exam?

Microsoft exams are scored on a scale of 1–1000, and a score of 700 or higher is required to pass.

Scores are scaled based on question difficulty, not simply percentage correct.


How long is the DP-900 exam?

You are given approximately 60 minutes to complete the exam, not including onboarding and instructions.

Time pressure is generally lower than associate-level exams.


How long is the DP-900 certification valid?

The Microsoft Certified: Azure Data Fundamentals certification does not expire.

Unlike associate-level certifications, DP-900 currently does not require renewal.


Is DP-900 suitable for beginners?

Yes — DP-900 is specifically designed for beginners.

It’s ideal for:

  • Students
  • Career switchers
  • Business professionals entering data or analytics
  • Technical professionals new to Azure

No prior Azure or database experience is required.


What roles benefit most from the DP-900 certification?

DP-900 is especially valuable for:

  • Aspiring Data Analysts or Data Engineers
  • Business Analysts
  • Students and graduates
  • Cloud beginners
  • Professionals exploring data careers

It also serves as a strong foundation before pursuing PL-300, DP-203, or AI-900.


What languages is the DP-900 exam offered in?

The DP-900 certification exam is commonly offered in:

English, Japanese, Chinese (Simplified), Korean, German, French, Spanish, Portuguese (Brazil), Chinese (Traditional), Italian

Availability may vary by region.


Have additional questions? Post them in the comments.

Thanks for reading and good luck on your data journey!

Exam Prep Hub for DP-900: Azure Data Fundamentals

Welcome to the DP-900: Azure Data Fundamentals Exam Prep Hub!

Welcome to the one-stop hub with information for preparing for the DP-900: Microsoft Azure Data Fundamentals certification exam. The content for this exam helps you to “Demonstrate foundational knowledge of core data concepts related to Microsoft Azure data services.”. Upon successful completion of the exam, you earn the Microsoft Certified: Azure Data Fundamentals certification.

This hub provides information directly here (topic-by-topic as outlined in the official study guide), links to a number of external resources, tips for preparing for the exam, practice tests, and section questions to help you prepare. Bookmark this page and use it as a guide to ensure that you are fully covering all relevant topics for the AI-900 exam and making use of as many of the resources available as possible.


Audience profile (from Microsoft’s site)

This exam is an opportunity to demonstrate your knowledge of core data concepts and related Microsoft Azure data services. As a candidate for this exam, you should have familiarity with Exam DP-900’s self-paced or instructor-led learning material.
This exam is intended for you, if you’re a candidate beginning to work with data in the cloud.
You should be familiar with:
- The concepts of relational and non-relational data.
- Different types of data workloads such as transactional or analytical.
You can use Azure Data Fundamentals to prepare for other Azure role-based certifications like Azure Database Administrator Associate or Azure Data Engineer Associate, but it is not a prerequisite for any of them.

Skills at a glance (as specified in the official study guide)

  • Describe core data concepts (25–30%)
  • Identify considerations for relational data on Azure (20–25%)
  • Describe considerations for working with non-relational data on Azure (15–20%)
  • Describe an analytics workload on Azure (25–30%)

Topic-by-Topic Exam Content

Describe core data concepts (25–30%)

Describe ways to represent data

Identify options for data storage

Describe common data workloads

Identify roles and responsibilities for data workloads

Identify considerations for relational data on Azure (20–25%)

Describe relational concepts

Describe relational Azure data services

Describe considerations for working with non-relational data on Azure (15–20%)

Describe capabilities of Azure storage

Describe capabilities and features of Azure Cosmos DB

Describe an analytics workload (25–30%)

Describe common elements of large-scale analytics

Describe consideration for real-time data analytics

Describe data visualization in Microsoft Power BI


DP-900 Practice Exams

DP-900 Practice Exam 1 (60 questions with answers)

DP-900 Practice Exam 2 (60 questions with answers)


Important DP-900 Resources

YouTube video series: Microsoft Learn DP-900 Azure Data Fundamentals YouTube series

A book you may find useful (on Amazon): Exam Ref DP-900 Microsoft Azure Data Fundamentals 2nd Edition


Good luck to you on your data journey!

DP-900: Azure Data Fundamentals – Advanced Practice Exam – 60 questions

Advanced Practice Exam (60 Questions)

This advanced practice exam contains:

  • Higher-difficulty questions
  • More scenario-based questions
  • Multi-answer questions
  • Matching questions
  • Fill-in-the-blank questions
  • SQL and architecture concepts
  • Azure service selection scenarios

Section 1 — Core Data Concepts


Question 1 (Scenario-Based)

A company stores customer survey responses in JSON format. Each survey can contain different fields depending on the survey type.

How should this data be classified?

A. Structured
B. Semi-structured
C. Unstructured
D. Transactional

Answer: B — Semi-structured

Explanations

A. Incorrect
Structured data requires a rigid schema.

B. Correct
JSON is semi-structured because it contains flexible tagged fields.

C. Incorrect
Unstructured data has little or no organization.

D. Incorrect
Transactional refers to workload type, not structure.


Question 2 (Multi-Answer)

Which characteristics are associated with transactional workloads? (Choose TWO)

A. High concurrency
B. Historical aggregations
C. Fast insert/update operations
D. Large-scale reporting queries

Answers: A and C

Explanations

A. Correct
Transactional systems support many simultaneous users.

B. Incorrect
Historical aggregations are analytical.

C. Correct
OLTP systems perform fast write operations.

D. Incorrect
Large reporting queries belong to analytics workloads.


Question 3 (Scenario-Based)

A database contains duplicated customer addresses across multiple tables. The database architect wants to reduce redundancy and improve consistency.

Which process should be used?

A. Partitioning
B. Normalization
C. Encryption
D. Replication

Answer: B — Normalization

Explanations

A. Incorrect
Partitioning improves scalability.

B. Correct
Normalization reduces duplication.

C. Incorrect
Encryption secures data.

D. Incorrect
Replication copies data.


Question 4 (Single Answer)

Which SQL statement removes an existing table and all its data?

A. DELETE
B. REMOVE
C. DROP
D. ERASE

Answer: C — DROP

Explanations

A. Incorrect
DELETE removes rows only.

B. Incorrect
REMOVE is not standard SQL.

C. Correct
DROP deletes the table structure and data.

D. Incorrect
ERASE is not standard SQL.


Question 5 (Matching)

Match the role to the responsibility.

RoleResponsibility
1. DBAA. Creates dashboards
2. Data AnalystB. Maintains database performance
3. Data EngineerC. Builds data pipelines

Answers

  • 1 → B
  • 2 → A
  • 3 → C

Question 6 (Scenario-Based)

A retail company needs a database for processing thousands of purchases per minute with guaranteed consistency.

Which workload type is MOST appropriate?

A. Analytical
B. Streaming
C. Transactional
D. Archival

Answer: C — Transactional

Explanations

A. Incorrect
Analytical systems focus on reporting.

B. Incorrect
Streaming processes event flows.

C. Correct
Transactional systems support operational consistency and speed.

D. Incorrect
Archival systems store inactive data.


Question 7 (Fill in the Blank)

The SQL statement used to add new rows to a table is __________.

Answer: INSERT


Question 8 (Multi-Answer)

Which file formats are commonly used in analytics workloads? (Choose TWO)

A. Parquet
B. ORC
C. BMP
D. EXE

Answers: A and B

Explanations

A. Correct
Parquet is optimized for analytics.

B. Correct
ORC is another columnar analytics format.

C. Incorrect
BMP is an image format.

D. Incorrect
EXE is executable software.


Question 9 (Scenario-Based)

An organization wants to analyze 10 years of sales history for trends and forecasting.

Which workload type is BEST suited?

A. OLTP
B. Analytical
C. Streaming
D. Operational

Answer: B — Analytical


Question 10 (Single Answer)

Which database object contains reusable SQL logic?

A. View
B. Index
C. Stored Procedure
D. Key

Answer: C — Stored Procedure


Section 2 — Relational Data on Azure


Question 11 (Scenario-Based)

A company is migrating an on-premises SQL Server application that relies heavily on SQL Server Agent, cross-database queries, and instance-level features.

Which Azure service is MOST appropriate?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. Azure Cosmos DB
D. Azure Blob Storage

Answer: B — Azure SQL Managed Instance

Explanations

A. Incorrect
Azure SQL Database has fewer instance-level features.

B. Correct
Managed Instance offers near full SQL Server compatibility.

C. Incorrect
Cosmos DB is NoSQL.

D. Incorrect
Blob Storage stores files.


Question 12 (Single Answer)

Which Azure SQL offering provides the HIGHEST level of infrastructure control?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. SQL Server on Azure Virtual Machines
D. Azure Synapse Analytics

Answer: C — SQL Server on Azure Virtual Machines


Question 13 (Multi-Answer)

Which are advantages of Platform as a Service (PaaS) databases? (Choose TWO)

A. Automatic patching
B. Reduced administrative overhead
C. Full operating system control
D. Manual backups only

Answers: A and B


Question 14 (Scenario-Based)

A company wants automatic scaling, backups, and minimal management overhead for a new cloud-native application.

Which solution is BEST?

A. SQL Server on Azure VMs
B. Azure SQL Database
C. Windows Server Failover Cluster
D. Self-hosted SQL Server

Answer: B — Azure SQL Database


Question 15 (Single Answer)

What is the purpose of a foreign key?

A. Encrypt data
B. Create indexes
C. Enforce relationships between tables
D. Remove duplicates

Answer: C — Enforce relationships between tables


Question 16 (Scenario-Based)

A company needs a managed PostgreSQL service in Azure.

Which service should be used?

A. Azure SQL Database
B. Azure Database for PostgreSQL
C. Azure Blob Storage
D. Azure Cosmos DB

Answer: B — Azure Database for PostgreSQL


Question 17 (Single Answer)

Which normalization form removes transitive dependencies?

A. 1NF
B. 2NF
C. 3NF
D. 4NF

Answer: C — 3NF


Question 18 (Multi-Answer)

Which SQL statements are Data Manipulation Language (DML)? (Choose TWO)

A. SELECT
B. INSERT
C. CREATE
D. DROP

Answers: A and B


Question 19 (Scenario-Based)

A query needs to return ALL customers, including those without orders.

Which JOIN should be used?

A. INNER JOIN
B. CROSS JOIN
C. LEFT JOIN
D. SELF JOIN

Answer: C — LEFT JOIN


Question 20 (Single Answer)

Which object improves query performance but does NOT store actual business data?

A. Table
B. View
C. Index
D. Row

Answer: C — Index


Section 3 — Non-Relational Data


Question 21 (Scenario-Based)

A media company needs to store petabytes of video content at low cost.

Which Azure service is MOST appropriate?

A. Azure SQL Database
B. Azure Blob Storage
C. Azure Table Storage
D. Azure Cache for Redis

Answer: B — Azure Blob Storage


Question 22 (Single Answer)

Which Azure Blob Storage tier is optimized for infrequently accessed data?

A. Premium
B. Hot
C. Cool
D. Archive

Answer: C — Cool


Question 23 (Scenario-Based)

An organization needs cloud-hosted SMB file shares accessible by both cloud and on-premises servers.

Which service should be used?

A. Azure Cosmos DB
B. Azure Files
C. Azure Table Storage
D. Azure SQL Database

Answer: B — Azure Files


Question 24 (Multi-Answer)

Which APIs are supported by Azure Cosmos DB? (Choose TWO)

A. MongoDB
B. Cassandra
C. Oracle
D. SMB

Answers: A and B


Question 25 (Scenario-Based)

A gaming company needs globally distributed low-latency data access for player profiles.

Which Azure service is BEST?

A. Azure Cosmos DB
B. Azure Files
C. Azure SQL Database
D. Azure Blob Storage

Answer: A — Azure Cosmos DB


Question 26 (Single Answer)

What is a major benefit of Azure Cosmos DB partitioning?

A. Reduces security
B. Enables scalability
C. Removes replication
D. Prevents indexing

Answer: B — Enables scalability


Question 27 (Fill in the Blank)

Azure Cosmos DB provides multi-region __________ to improve availability and performance.

Answer: replication


Question 28 (Scenario-Based)

A company needs a NoSQL key-value store for massive telemetry ingestion.

Which service is MOST appropriate?

A. Azure Table Storage
B. Azure SQL Database
C. Azure Files
D. Azure DNS

Answer: A — Azure Table Storage


Question 29 (Single Answer)

Which storage service stores data as objects inside containers?

A. Azure Files
B. Azure Blob Storage
C. Azure SQL Database
D. Azure Cosmos DB

Answer: B — Azure Blob Storage


Question 30 (Multi-Answer)

Which are characteristics of non-relational databases? (Choose TWO)

A. Flexible schemas
B. Strict relational constraints
C. Horizontal scalability
D. Mandatory JOIN operations

Answers: A and C


Section 4 — Analytics Workloads


Question 31 (Scenario-Based)

A company collects IoT sensor readings every second and needs near real-time dashboards.

Which processing approach is MOST appropriate?

A. Batch processing
B. Streaming processing
C. Archival processing
D. Offline reporting

Answer: B — Streaming processing


Question 32 (Single Answer)

Which Azure service is designed for high-throughput event ingestion?

A. Azure Event Hubs
B. Azure Backup
C. Azure Files
D. Azure DNS

Answer: A — Azure Event Hubs


Question 33 (Scenario-Based)

An organization needs Apache Spark-based analytics with collaborative notebooks.

Which service is BEST?

A. Azure Databricks
B. Azure Files
C. Azure DNS
D. Azure Firewall

Answer: A — Azure Databricks


Question 34 (Single Answer)

Which architecture commonly includes fact tables and dimension tables?

A. OLTP schema
B. Star schema
C. Graph schema
D. XML schema

Answer: B — Star schema


Question 35 (Multi-Answer)

Which are characteristics of a data warehouse? (Choose TWO)

A. Optimized for analytics
B. Stores historical data
C. Primarily supports OLTP transactions
D. Limited aggregations

Answers: A and B


Question 36 (Scenario-Based)

A company wants a unified analytics platform combining engineering, warehousing, data science, and BI.

Which Microsoft service BEST fits?

A. Microsoft Fabric
B. Azure Files
C. Azure Firewall
D. Azure DNS

Answer: A — Microsoft Fabric


Question 37 (Single Answer)

Which service allows SQL-like queries against streaming data?

A. Azure Stream Analytics
B. Azure Files
C. Azure Backup
D. Azure Monitor

Answer: A — Azure Stream Analytics


Question 38 (Scenario-Based)

An organization processes payroll data once nightly.

Which processing type is MOST appropriate?

A. Streaming
B. Batch
C. Event-driven only
D. Real-time analytics

Answer: B — Batch


Question 39 (Single Answer)

Which process extracts, transforms, and loads data into analytical systems?

A. ETL
B. DNS
C. RAID
D. OLTP

Answer: A — ETL


Question 40 (Multi-Answer)

Which services are commonly associated with real-time analytics? (Choose TWO)

A. Azure Event Hubs
B. Azure Stream Analytics
C. Azure Files
D. Azure Backup

Answers: A and B


Section 5 — Power BI


Question 41 (Scenario-Based)

An executive wants a single-page overview showing KPIs and summary visuals.

Which Power BI object should be used?

A. Dataset
B. Dashboard
C. Dataflow
D. Semantic model

Answer: B — Dashboard


Question 42 (Single Answer)

Which Power BI component is primarily used for data transformation?

A. DAX
B. Power Query
C. Azure Functions
D. Power Automate

Answer: B — Power Query


Question 43 (Scenario-Based)

A report must show revenue trends over 24 months.

Which visualization is BEST?

A. Pie chart
B. Gauge chart
C. Line chart
D. Scatter chart

Answer: C — Line chart


Question 44 (Single Answer)

Which visualization is BEST for displaying proportions?

A. Scatter chart
B. Pie chart
C. Card
D. Gauge chart

Answer: B — Pie chart


Question 45 (Scenario-Based)

A company wants users to filter reports interactively by region and year.

Which feature should be used?

A. Indexes
B. Slicers
C. Measures
D. Triggers

Answer: B — Slicers


Question 46 (Single Answer)

Which Power BI language creates measures and calculated columns?

A. SQL
B. Python
C. DAX
D. XML

Answer: C — DAX


Question 47 (Scenario-Based)

A business analyst wants to identify the relationship between advertising spend and revenue.

Which visualization is BEST?

A. Pie chart
B. Scatter chart
C. Gauge chart
D. Card

Answer: B — Scatter chart


Question 48 (Single Answer)

Which Power BI visualization is BEST for detailed row-level data?

A. Table
B. Gauge
C. Pie chart
D. Card

Answer: A — Table


Question 49 (Multi-Answer)

Which are benefits of Power BI dashboards? (Choose TWO)

A. Real-time monitoring
B. Single-page summaries
C. Operating system administration
D. SQL indexing

Answers: A and B


Question 50 (Scenario-Based)

A company needs a geographic visualization of sales by country.

Which visualization is BEST?

A. Matrix
B. Map
C. Gauge
D. Card

Answer: B — Map


Section 6 — Comprehensive Scenarios


Question 51 (Scenario-Based)

A healthcare organization requires:

  • Globally distributed NoSQL storage
  • Automatic replication
  • Low latency worldwide
  • Flexible schema support

Which solution BEST fits?

A. Azure SQL Database
B. Azure Cosmos DB
C. Azure Files
D. Azure Synapse Analytics

Answer: B — Azure Cosmos DB


Question 52 (Scenario-Based)

A manufacturing company collects sensor telemetry every second from thousands of devices.

Which Azure service should ingest the streaming events?

A. Azure Event Hubs
B. Azure Files
C. Azure SQL Managed Instance
D. Azure Backup

Answer: A — Azure Event Hubs


Question 53 (Scenario-Based)

A company wants full control of SQL Server patching, OS configuration, and backups.

Which deployment option should be used?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. SQL Server on Azure Virtual Machines
D. Azure Cosmos DB

Answer: C — SQL Server on Azure Virtual Machines


Question 54 (Single Answer)

Which Azure service is MOST optimized for unstructured object storage?

A. Azure Blob Storage
B. Azure SQL Database
C. Azure Files
D. Azure Synapse Analytics

Answer: A — Azure Blob Storage


Question 55 (Scenario-Based)

An analytics team needs to store historical sales data optimized for aggregation queries.

Which solution is BEST?

A. Transactional database
B. Data warehouse
C. Azure Files
D. DNS server

Answer: B — Data warehouse


Question 56 (Single Answer)

Which SQL statement changes existing records?

A. CREATE
B. UPDATE
C. INSERT
D. ALTER

Answer: B — UPDATE


Question 57 (Multi-Answer)

Which are benefits of normalization? (Choose TWO)

A. Reduced redundancy
B. Improved consistency
C. Increased duplicate storage
D. Reduced relationships

Answers: A and B


Question 58 (Scenario-Based)

A report needs to compare revenue across product categories.

Which visualization is BEST?

A. Line chart
B. Scatter chart
C. Bar chart
D. Gauge chart

Answer: C — Bar chart


Question 59 (Fill in the Blank)

The SQL JOIN that returns only matching rows from both tables is called an __________ JOIN.

Answer: INNER


Question 60 (Scenario-Based)

A company needs:

  • Large-scale analytics
  • Integrated Power BI reporting
  • Data engineering
  • Real-time analytics
  • Unified SaaS experience

Which platform BEST meets these requirements?

A. Microsoft Fabric
B. Azure Files
C. Azure DNS
D. Windows Server Failover Clustering

Answer: A — Microsoft Fabric


Advanced Exam Study Tips

Know the differences between:

  • OLTP vs OLAP
  • Batch vs streaming
  • Structured vs semi-structured vs unstructured
  • Relational vs NoSQL

Memorize Azure service associations:

ServicePurpose
Azure Blob StorageUnstructured object storage
Azure FilesSMB file shares
Azure Table StorageKey-value NoSQL
Azure Cosmos DBGlobally distributed NoSQL
Azure Event HubsStreaming ingestion
Azure Stream AnalyticsReal-time analytics
Azure DatabricksSpark analytics
Microsoft FabricUnified analytics platform

Power BI visualization shortcuts:

VisualizationBest Use
Line chartTrends
Bar chartComparisons
Pie chartProportions
Scatter chartRelationships
CardSingle KPI
MapGeographic analysis
GaugeProgress toward target

Go to the DP-900 Exam Prep Hub main page.

DP-900: Azure Data Fundamentals – Practice Exam Questions – 60 questions

Full Practice Exam

This practice exam covers all major skills measured on the DP-900 certification exam, including:

  • Core data concepts
  • Relational data on Azure
  • Non-relational data on Azure
  • Analytics workloads
  • Power BI and visualization
  • Real-time analytics
  • Azure data services

Question formats include:

  • Single-answer multiple choice
  • Multi-answer multiple choice
  • Matching/connect-the-answers
  • Fill-in-the-blank
  • Scenario-based questions

Section 1 — Core Data Concepts


Question 1 (Single Answer)

Which type of data has a predefined schema consisting of rows and columns?

A. Unstructured data
B. Semi-structured data
C. Structured data
D. Streaming data

Answer: C — Structured data

Explanations

A. Incorrect
Unstructured data does not have a predefined schema.

B. Incorrect
Semi-structured data has some organization but not fixed rows/columns.

C. Correct
Structured data uses a defined schema with rows and columns.

D. Incorrect
Streaming data refers to continuously arriving data, not structure type.


Question 2 (Multi-Answer)

Which of the following are examples of semi-structured data? (Choose TWO)

A. JSON
B. CSV
C. XML
D. SQL tables

Answers: A and C

Explanations

A. Correct
JSON contains tags/structure but flexible schemas.

B. Incorrect
CSV is structured tabular data.

C. Correct
XML is semi-structured because it uses tagged hierarchical data.

D. Incorrect
SQL tables are structured relational data.


Question 3 (Fill in the Blank)

A database design technique used to reduce data redundancy is called __________.

Answer: Normalization

Explanation

Normalization organizes data efficiently and minimizes duplication.


Question 4 (Single Answer)

Which SQL statement retrieves data from a table?

A. INSERT
B. UPDATE
C. SELECT
D. DELETE

Answer: C — SELECT

Explanations

A. Incorrect
INSERT adds records.

B. Incorrect
UPDATE modifies records.

C. Correct
SELECT retrieves data.

D. Incorrect
DELETE removes records.


Question 5 (Matching)

Match the workload to its description.

WorkloadDescription
1. TransactionalA. Historical analysis
2. AnalyticalB. Real-time business operations

Answers

  • 1 → B
  • 2 → A

Explanation

Transactional workloads support day-to-day operations; analytical workloads analyze historical data.


Question 6 (Single Answer)

Which role is MOST responsible for maintaining database availability and backups?

A. Data Analyst
B. Data Engineer
C. Database Administrator
D. Business User

Answer: C — Database Administrator

Explanations

A. Incorrect
Data analysts focus on reporting and insights.

B. Incorrect
Data engineers build pipelines and integration systems.

C. Correct
DBAs manage availability, backups, and performance.

D. Incorrect
Business users consume reports.


Question 7 (Multi-Answer)

Which are characteristics of analytical workloads? (Choose TWO)

A. Frequent INSERT operations
B. Historical trend analysis
C. Large-scale aggregations
D. High-volume OLTP transactions

Answers: B and C

Explanations

A. Incorrect
Frequent inserts are more common in transactional systems.

B. Correct
Analytical systems examine historical data.

C. Correct
Aggregations are common in analytics.

D. Incorrect
OLTP workloads are transactional.


Question 8 (Single Answer)

Which file format is commonly used for big data analytics because of columnar storage and compression?

A. TXT
B. CSV
C. Parquet
D. XML

Answer: C — Parquet

Explanations

A. Incorrect
TXT files are plain text.

B. Incorrect
CSV is row-based text data.

C. Correct
Parquet is optimized for analytics workloads.

D. Incorrect
XML is semi-structured but not optimized for analytics.


Question 9 (Single Answer)

Which database object stores data in rows and columns?

A. View
B. Stored procedure
C. Table
D. Index

Answer: C — Table

Explanations

A. Incorrect
Views are virtual query results.

B. Incorrect
Stored procedures contain SQL logic.

C. Correct
Tables store relational data.

D. Incorrect
Indexes improve query performance.


Question 10 (Single Answer)

Which SQL JOIN returns only matching rows from both tables?

A. LEFT JOIN
B. RIGHT JOIN
C. INNER JOIN
D. FULL OUTER JOIN

Answer: C — INNER JOIN

Explanations

A. Incorrect
LEFT JOIN includes unmatched left-side rows.

B. Incorrect
RIGHT JOIN includes unmatched right-side rows.

C. Correct
INNER JOIN returns only matches.

D. Incorrect
FULL OUTER JOIN includes all rows.


Section 2 — Relational Data on Azure


Question 11 (Single Answer)

Which Azure SQL option provides the MOST compatibility with on-premises SQL Server?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. Azure Cosmos DB
D. Azure Blob Storage

Answer: B — Azure SQL Managed Instance

Explanations

A. Incorrect
Azure SQL Database is fully managed but has fewer instance-level features.

B. Correct
Managed Instance provides near full SQL Server compatibility.

C. Incorrect
Cosmos DB is NoSQL.

D. Incorrect
Blob Storage is object storage.


Question 12 (Multi-Answer)

Which Azure services support open-source relational databases? (Choose TWO)

A. Azure Database for PostgreSQL
B. Azure Database for MySQL
C. Azure Synapse Analytics
D. Azure Files

Answers: A and B

Explanations

A. Correct
Azure provides managed PostgreSQL.

B. Correct
Azure provides managed MySQL.

C. Incorrect
Synapse is analytics-focused.

D. Incorrect
Azure Files is storage.


Question 13 (Single Answer)

Which Azure SQL option gives customers the MOST operating system control?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. SQL Server on Azure Virtual Machines
D. Azure Cosmos DB

Answer: C — SQL Server on Azure Virtual Machines

Explanations

A. Incorrect
Fully managed platform service.

B. Incorrect
Managed service with limited OS access.

C. Correct
VMs provide full infrastructure control.

D. Incorrect
Cosmos DB is NoSQL.


Question 14 (Fill in the Blank)

A column whose values uniquely identify each row in a table is called a __________ key.

Answer: Primary

Explanation

A primary key uniquely identifies rows.


Question 15 (Single Answer)

Which database normalization form removes repeating groups?

A. 1NF
B. 2NF
C. 3NF
D. 4NF

Answer: A — 1NF

Explanations

A. Correct
1NF eliminates repeating groups.

B. Incorrect
2NF removes partial dependencies.

C. Incorrect
3NF removes transitive dependencies.

D. Incorrect
4NF handles multi-valued dependencies.


Section 3 — Non-Relational Data on Azure


Question 16 (Single Answer)

Which Azure storage service is best for storing large unstructured files?

A. Azure SQL Database
B. Azure Blob Storage
C. Azure Table Storage
D. Azure Cosmos DB

Answer: B — Azure Blob Storage

Explanations

A. Incorrect
SQL Database is relational.

B. Correct
Blob Storage stores unstructured objects like images/videos.

C. Incorrect
Table Storage stores NoSQL key-value data.

D. Incorrect
Cosmos DB is a globally distributed database.


Question 17 (Single Answer)

Which Azure storage service provides SMB file shares?

A. Azure Blob Storage
B. Azure Cosmos DB
C. Azure Files
D. Azure Table Storage

Answer: C — Azure Files

Explanations

A. Incorrect
Blob Storage is object storage.

B. Incorrect
Cosmos DB is NoSQL.

C. Correct
Azure Files supports SMB shares.

D. Incorrect
Table Storage stores structured NoSQL entities.


Question 18 (Multi-Answer)

Which are valid Azure Cosmos DB APIs? (Choose TWO)

A. MongoDB API
B. Cassandra API
C. Oracle API
D. SMB API

Answers: A and B

Explanations

A. Correct
Cosmos DB supports MongoDB API.

B. Correct
Cosmos DB supports Cassandra API.

C. Incorrect
Oracle API is not supported.

D. Incorrect
SMB is a file-sharing protocol.


Question 19 (Single Answer)

Which characteristic is a major feature of Azure Cosmos DB?

A. Single-region architecture
B. Global distribution
C. Relational-only schema
D. File-share management

Answer: B — Global distribution

Explanations

A. Incorrect
Cosmos DB supports multiple regions.

B. Correct
Global distribution is a key feature.

C. Incorrect
Cosmos DB is NoSQL.

D. Incorrect
Not a file-sharing service.


Question 20 (Matching)

Match the storage service to its use case.

ServiceUse Case
1. Blob StorageA. SMB file shares
2. Azure FilesB. Unstructured objects

Answers

  • 1 → B
  • 2 → A

Section 4 — Analytics Workloads


Question 21 (Single Answer)

Which process involves collecting data from multiple sources into an analytics system?

A. Visualization
B. Data ingestion
C. Data modeling
D. Backup

Answer: B — Data ingestion

Explanations

A. Incorrect
Visualization displays data.

B. Correct
Ingestion collects and imports data.

C. Incorrect
Modeling defines relationships/calculations.

D. Incorrect
Backup protects data copies.


Question 22 (Single Answer)

Which analytical store is optimized for historical analytics and reporting?

A. OLTP database
B. Data warehouse
C. Azure Files
D. DNS server

Answer: B — Data warehouse

Explanations

A. Incorrect
OLTP supports transactions.

B. Correct
Warehouses support analytics.

C. Incorrect
Files are storage shares.

D. Incorrect
DNS resolves names.


Question 23 (Multi-Answer)

Which Microsoft services support large-scale analytics? (Choose TWO)

A. Azure Databricks
B. Microsoft Fabric
C. Azure DNS
D. Azure Firewall

Answers: A and B

Explanations

A. Correct
Databricks supports big data analytics.

B. Correct
Fabric is an end-to-end analytics platform.

C. Incorrect
DNS is networking.

D. Incorrect
Firewall is security infrastructure.


Question 24 (Single Answer)

What is the primary difference between batch processing and streaming processing?

A. Batch processing handles data continuously
B. Streaming processes data as it arrives
C. Streaming stores only historical data
D. Batch requires IoT devices

Answer: B — Streaming processes data as it arrives

Explanations

A. Incorrect
Continuous processing is streaming.

B. Correct
Streaming handles near real-time data.

C. Incorrect
Streaming is not limited to historical data.

D. Incorrect
Batch does not require IoT.


Question 25 (Single Answer)

Which Azure service is commonly used for streaming event ingestion?

A. Azure Event Hubs
B. Azure Files
C. Azure SQL Database
D. Azure DNS

Answer: A — Azure Event Hubs

Explanations

A. Correct
Event Hubs ingests streaming events.

B. Incorrect
Azure Files is storage.

C. Incorrect
SQL Database is relational.

D. Incorrect
DNS is networking.


Question 26 (Single Answer)

Which service uses SQL-like queries for real-time stream processing?

A. Azure Stream Analytics
B. Azure Firewall
C. Azure DNS
D. Azure Virtual Machines

Answer: A — Azure Stream Analytics

Explanations

A. Correct
Stream Analytics uses SQL-like syntax.

B. Incorrect
Firewall is security.

C. Incorrect
DNS resolves names.

D. Incorrect
VMs are infrastructure.


Question 27 (Fill in the Blank)

The architecture commonly used in analytics models with fact and dimension tables is called a __________ schema.

Answer: Star


Question 28 (Single Answer)

Which Power BI object is a single-page collection of visualizations?

A. Report
B. Dashboard
C. Dataset
D. Workspace

Answer: B — Dashboard

Explanations

A. Incorrect
Reports are usually multi-page.

B. Correct
Dashboards are single-page summaries.

C. Incorrect
Datasets store data models.

D. Incorrect
Workspaces organize content.


Question 29 (Single Answer)

Which Power BI feature is used for data transformation?

A. DAX
B. Power Query
C. Power Automate
D. Azure Functions

Answer: B — Power Query

Explanations

A. Incorrect
DAX creates calculations.

B. Correct
Power Query cleans and transforms data.

C. Incorrect
Power Automate automates workflows.

D. Incorrect
Azure Functions run code.


Question 30 (Single Answer)

Which Power BI language is used for measures and calculations?

A. Python
B. JavaScript
C. DAX
D. XML

Answer: C — DAX


Section 5 — Power BI Visualization


Question 31 (Single Answer)

Which chart type is BEST for showing trends over time?

A. Pie chart
B. Scatter chart
C. Line chart
D. Gauge chart

Answer: C — Line chart


Question 32 (Single Answer)

Which visualization is BEST for showing proportions of a whole?

A. Pie chart
B. Table
C. Scatter chart
D. Card

Answer: A — Pie chart


Question 33 (Single Answer)

Which visualization is BEST for geographic analysis?

A. Matrix
B. Map
C. Gauge
D. Card

Answer: B — Map


Question 34 (Single Answer)

Which visualization is BEST for displaying a single KPI?

A. Scatter chart
B. Card
C. Matrix
D. Pie chart

Answer: B — Card


Question 35 (Single Answer)

Which visualization is BEST for comparing categories?

A. Line chart
B. Map
C. Bar chart
D. Gauge chart

Answer: C — Bar chart


Question 36 (Multi-Answer)

Which visuals support detailed tabular reporting? (Choose TWO)

A. Table
B. Matrix
C. Gauge
D. Pie chart

Answers: A and B


Question 37 (Single Answer)

Which Power BI feature enables interactive filtering?

A. DAX
B. Slicer
C. Gauge
D. Workspace

Answer: B — Slicer


Question 38 (Single Answer)

Which visualization is BEST for identifying relationships between two numeric variables?

A. Pie chart
B. Scatter chart
C. Card
D. Gauge chart

Answer: B — Scatter chart


Question 39 (Fill in the Blank)

A Power BI object containing multiple pages of visualizations is called a __________.

Answer: Report


Question 40 (Single Answer)

Which Power BI component is cloud-based and used for sharing reports?

A. Power BI Desktop
B. Power BI Service
C. Power Query
D. Power Pivot

Answer: B — Power BI Service


Section 6 — Advanced Scenarios


Question 41 (Scenario)

A company needs a globally distributed NoSQL database with low latency worldwide.

Which Azure service should they use?

A. Azure SQL Database
B. Azure Cosmos DB
C. Azure Files
D. Azure Blob Storage

Answer: B — Azure Cosmos DB


Question 42 (Scenario)

A company needs to store millions of images and videos cost-effectively.

Which Azure service is MOST appropriate?

A. Azure SQL Database
B. Azure Blob Storage
C. Azure Files
D. Azure Synapse Analytics

Answer: B — Azure Blob Storage


Question 43 (Scenario)

A company needs fully managed relational databases with automatic patching and backups.

Which service is BEST?

A. SQL Server on Azure VMs
B. Azure SQL Database
C. Azure Files
D. Azure Event Hubs

Answer: B — Azure SQL Database


Question 44 (Scenario)

A retail company wants real-time fraud detection from transaction streams.

Which Azure service is MOST appropriate for processing?

A. Azure Stream Analytics
B. Azure DNS
C. Azure Files
D. Azure Backup

Answer: A — Azure Stream Analytics


Question 45 (Multi-Answer)

Which are characteristics of transactional systems? (Choose TWO)

A. Low-latency transactions
B. Historical trend analysis
C. High concurrency
D. Large aggregations

Answers: A and C


Question 46 (Single Answer)

Which SQL statement modifies existing rows?

A. INSERT
B. UPDATE
C. SELECT
D. CREATE

Answer: B — UPDATE


Question 47 (Single Answer)

Which SQL JOIN returns all rows from the left table and matching rows from the right table?

A. INNER JOIN
B. LEFT JOIN
C. RIGHT JOIN
D. CROSS JOIN

Answer: B — LEFT JOIN


Question 48 (Matching)

Match the visualization to the purpose.

VisualizationPurpose
1. Line chartA. Show relationships
2. Scatter chartB. Show trends

Answers

  • 1 → B
  • 2 → A

Question 49 (Single Answer)

Which Azure service supports Apache Spark analytics?

A. Azure Databricks
B. Azure Files
C. Azure DNS
D. Azure Firewall

Answer: A — Azure Databricks


Question 50 (Single Answer)

Which storage type is MOST appropriate for key-value NoSQL storage?

A. Azure Table Storage
B. Azure SQL Database
C. Azure Files
D. Azure Synapse Analytics

Answer: A — Azure Table Storage


Section 7 — Mixed Difficulty Review


Question 51 (Single Answer)

What is the primary purpose of normalization?

A. Increase redundancy
B. Improve graphics rendering
C. Reduce duplicate data
D. Increase storage costs

Answer: C — Reduce duplicate data


Question 52 (Single Answer)

Which data type stores audio and video files?

A. Structured
B. Semi-structured
C. Unstructured
D. Relational

Answer: C — Unstructured


Question 53 (Multi-Answer)

Which are benefits of Power BI dashboards? (Choose TWO)

A. Real-time monitoring
B. Single-page summary
C. Operating system management
D. Virtual machine provisioning

Answers: A and B


Question 54 (Single Answer)

Which service is MOST associated with IoT device ingestion?

A. Azure IoT Hub
B. Azure SQL Database
C. Azure Files
D. Azure Backup

Answer: A — Azure IoT Hub


Question 55 (Single Answer)

Which Azure service provides a unified analytics platform with BI integration?

A. Microsoft Fabric
B. Azure Firewall
C. Azure DNS
D. Azure Backup

Answer: A — Microsoft Fabric


Question 56 (Single Answer)

Which object improves database query performance?

A. Table
B. View
C. Index
D. Trigger

Answer: C — Index


Question 57 (Single Answer)

Which workload typically uses OLTP systems?

A. Analytical
B. Transactional
C. Archival
D. Reporting-only

Answer: B — Transactional


Question 58 (Fill in the Blank)

The SQL statement used to remove rows from a table is __________.

Answer: DELETE


Question 59 (Single Answer)

Which Azure SQL offering is a Platform as a Service (PaaS) solution?

A. SQL Server on Azure Virtual Machines
B. Azure SQL Database
C. Windows Server
D. Hyper-V

Answer: B — Azure SQL Database


Question 60 (Single Answer)

Which Power BI visualization is MOST appropriate for showing progress toward a goal?

A. Scatter chart
B. Gauge chart
C. Table
D. Pie chart

Answer: B — Gauge chart


Final Exam Tips

Focus heavily on:

  • Relational vs non-relational data
  • Azure storage services
  • Azure SQL family
  • Cosmos DB features
  • Power BI basics
  • Analytics workloads
  • Batch vs streaming concepts

Frequently tested associations:

  • Blob Storage → unstructured files
  • Event Hubs → streaming ingestion
  • Stream Analytics → real-time processing
  • Cosmos DB → globally distributed NoSQL
  • Power BI → visualization and reporting
  • DAX → calculations
  • Power Query → transformation

Power BI Visualization Tips

  • Line chart → trends
  • Bar chart → comparisons
  • Pie chart → proportions
  • Scatter chart → relationships
  • Card → single KPI
  • Map → geographic data

Go to the DP-900 Exam Prep Hub main page.

Practice Questions: Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric) (DP-900 Exam Prep)

Practice Questions


Question 1

What is the primary purpose of Azure Databricks?

A. Hosting relational databases
B. Managing file shares
C. Processing large-scale data using Apache Spark
D. Running virtual machines

Answer: C

Explanation:
Azure Databricks is built on Apache Spark for large-scale data processing.


Question 2

Which feature is a key characteristic of Azure Databricks?

A. Fixed schema relational tables
B. Distributed data processing
C. File-based storage only
D. Limited scalability

Answer: B

Explanation:
Databricks uses distributed computing to process large datasets efficiently.


Question 3

Which scenario is BEST suited for Azure Databricks?

A. Hosting a transactional database
B. Running large-scale ETL pipelines and machine learning models
C. Managing shared file storage
D. Serving static web pages

Answer: B

Explanation:
Databricks is ideal for data engineering and machine learning at scale.


Question 4

What is Microsoft Fabric primarily designed for?

A. Running operating systems
B. Providing a unified, end-to-end analytics platform
C. Managing virtual networks
D. Hosting relational databases only

Answer: B

Explanation:
Microsoft Fabric integrates multiple analytics capabilities into one unified platform.


Question 5

Which component of Microsoft Fabric serves as a unified data storage layer?

A. Azure Blob Storage
B. SQL Database
C. OneLake
D. Azure Files

Answer: C

Explanation:
OneLake is the centralized storage layer within Microsoft Fabric.


Question 6

Which service is BEST suited for organizations that want a single platform for data engineering, data warehousing, and BI?

A. Azure Virtual Machines
B. Azure Databricks
C. Microsoft Fabric
D. Azure Table Storage

Answer: C

Explanation:
Fabric provides an end-to-end unified analytics experience.


Question 7

Which of the following best describes the difference between Azure Databricks and Microsoft Fabric?

A. Databricks is for storage, Fabric is for compute
B. Databricks focuses on big data processing, Fabric provides a unified analytics platform
C. Fabric only supports relational data, Databricks does not
D. Databricks cannot scale, Fabric can

Answer: B

Explanation:
Databricks focuses on processing and ML, while Fabric provides end-to-end analytics.


Question 8

Which programming environments are commonly supported in Azure Databricks notebooks?

A. HTML and CSS only
B. Python, SQL, Scala, and R
C. JavaScript only
D. PowerShell only

Answer: B

Explanation:
Databricks notebooks support multiple languages including Python, SQL, Scala, and R.


Question 9

Which scenario is NOT ideal for Azure Databricks?

A. Large-scale data transformation
B. Machine learning model training
C. Managing simple file shares
D. Processing streaming data

Answer: C

Explanation:
Databricks is not designed for file-sharing scenarios.


Question 10

Which statement about Microsoft Fabric is TRUE?

A. It requires manual infrastructure management
B. It is a SaaS-based unified analytics platform
C. It only supports batch processing
D. It replaces all Azure services

Answer: B

Explanation:
Microsoft Fabric is a fully managed SaaS platform that integrates analytics services.


✅ Quick Exam Takeaways

Azure Databricks

  • Apache Spark-based
  • Distributed processing
  • Data engineering & machine learning

Microsoft Fabric

  • Unified analytics platform
  • End-to-end solution (data + analytics + BI)
  • Includes OneLake storage

✔ Key differences:

  • Databricks → processing & ML
  • Fabric → all-in-one analytics platform

✔ Exam tip:
👉 Big data processing → Azure Databricks
👉 Unified analytics platform → Microsoft Fabric


Go to the DP-900 Exam Prep Hub main page.

Practice Questions: Describe responsibilities for data engineers (DP-900 Exam Prep)

Practice Questions


Question 1

Which task is a primary responsibility of a data engineer?

A. Creating dashboards for business users
B. Managing database user permissions
C. Building and maintaining data pipelines
D. Training machine learning models

Answer: C

Explanation:
Data engineers are responsible for designing and maintaining data pipelines that move and transform data.


Question 2

A company needs to collect data from multiple systems and prepare it for reporting.

Which role is primarily responsible for this task?

A. Data Analyst
B. Database Administrator
C. Data Engineer
D. Business User

Answer: C

Explanation:
Data engineers handle data ingestion, integration, and preparation for downstream analytics.


Question 3

Which process involves extracting data from sources, transforming it, and loading it into a destination system?

A. OLTP
B. ETL
C. OLAP
D. ACID

Answer: B

Explanation:
ETL (Extract, Transform, Load) is a core responsibility of data engineers.


Question 4

Which Azure service is commonly used by data engineers to orchestrate data pipelines?

A. Azure SQL Database
B. Azure Data Factory
C. Azure Blob Storage
D. Azure Virtual Machines

Answer: B

Explanation:
Azure Data Factory is used to build, schedule, and manage data pipelines.


Question 5

Which responsibility ensures that data used for analytics is accurate and reliable?

A. Query optimization
B. Data visualization
C. Data quality management
D. User authentication

Answer: C

Explanation:
Data engineers ensure data quality through validation and cleaning processes.


Question 6

A data engineer is working with large-scale data processing using Apache Spark.

Which Azure service are they MOST likely using?

A. Azure SQL Database
B. Azure Cosmos DB
C. Azure Databricks
D. Azure Table Storage

Answer: C

Explanation:
Azure Databricks is a Spark-based platform used for large-scale data processing.


Question 7

Which storage solution is commonly used by data engineers for storing large volumes of raw and processed data?

A. Azure Data Lake Storage
B. Azure Queue Storage
C. Azure SQL Database
D. Azure Cache for Redis

Answer: A

Explanation:
Azure Data Lake Storage is optimized for big data storage and analytics workloads.


Question 8

Which task is LEAST likely to be performed by a data engineer?

A. Transforming raw data into structured formats
B. Monitoring data pipelines
C. Creating Power BI dashboards
D. Integrating multiple data sources

Answer: C

Explanation:
Creating dashboards is typically the responsibility of a data analyst, not a data engineer.


Question 9

Which type of data processing involves handling real-time data streams?

A. Batch processing
B. Streaming processing
C. Relational processing
D. Transactional processing

Answer: B

Explanation:
Data engineers often work with streaming pipelines for real-time data ingestion.


Question 10

A data engineer selects Parquet as a storage format for a dataset.

What is the primary reason for this choice?

A. It is human readable
B. It supports transactional updates
C. It is optimized for analytical performance
D. It enforces a strict schema

Answer: C

Explanation:
Parquet is a columnar format that improves performance for analytical workloads.


✅ Quick Exam Takeaways

For DP-900, remember data engineers:

✔ Build and manage data pipelines
✔ Handle ETL/ELT processes
✔ Work with batch and streaming data
✔ Ensure data quality and reliability
✔ Manage data storage solutions (Data Lake, Blob)
✔ Use Azure services like:

  • Azure Data Factory
  • Azure Databricks
  • Azure Data Lake Storage
  • Azure Synapse Analytics

✔ Enable analytics and BI by preparing data


Go to the DP-900 Exam Prep Hub main page.

Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric) (DP-900 Exam Prep)

This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. 
This topic falls under these sections:
Describe an analytics workload (25–30%)
--> Describe common elements of large-scale analytics
--> Describe Microsoft Cloud Services for large-scale analytics (Azure Databricks & Microsoft Fabric)


Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.

Modern analytics workloads often require processing massive volumes of data quickly and efficiently. Microsoft provides powerful cloud services to meet these needs, including Azure Databricks and Microsoft Fabric.

For the DP-900 exam, you should understand what these services are, their key features, and when to use each.


Why Large-Scale Analytics Services Matter

Large-scale analytics involves:

  • Processing big data (TBs to PBs)
  • Supporting batch and real-time workloads
  • Enabling advanced analytics and machine learning

✔ Traditional tools often cannot scale to meet these demands.


Azure Databricks


What Is Azure Databricks?

Azure Databricks is a cloud-based analytics platform built on Apache Spark.

It is designed for:

  • Big data processing
  • Data engineering
  • Machine learning
  • Collaborative analytics

Key Features


1. Apache Spark-Based Processing

  • Distributed computing engine
  • Processes large datasets in parallel

✔ Ideal for big data workloads


2. Collaborative Workspace

  • Notebooks (Python, SQL, Scala, R)
  • Multiple users can collaborate

3. Integration with Azure

  • Works with Azure Data Lake Storage
  • Integrates with Azure Synapse Analytics

4. Machine Learning Support

  • Built-in ML capabilities
  • Supports advanced analytics workflows

Common Use Cases

  • Big data processing (ETL/ELT pipelines)
  • Data science and machine learning
  • Real-time analytics
  • Data transformation at scale

Best for: Data engineers and data scientists working with large datasets


Microsoft Fabric


What Is Microsoft Fabric?

Microsoft Fabric is an end-to-end, unified analytics platform that brings together multiple data services into a single environment.

It integrates:

  • Data engineering
  • Data warehousing
  • Data science
  • Real-time analytics
  • Business intelligence

Key Features


1. Unified Platform

  • Combines multiple services into one
  • Reduces complexity of managing separate tools

2. OneLake (Unified Storage Layer)

  • Centralized data lake for all workloads
  • Eliminates data silos

3. Integrated Analytics Experiences

  • Data Factory (ingestion)
  • Data Warehouse
  • Real-Time Analytics
  • Power BI integration

4. SaaS-Based Model

  • Fully managed platform
  • Minimal infrastructure management

Common Use Cases

  • End-to-end analytics solutions
  • Unified data platform for organizations
  • Business intelligence and reporting
  • Data integration and transformation

Best for: Organizations wanting a single, unified analytics solution


Azure Databricks vs Microsoft Fabric

FeatureAzure DatabricksMicrosoft Fabric
FocusBig data processing & MLEnd-to-end analytics platform
EngineApache SparkMultiple integrated engines
UsersData engineers, data scientistsBroad (engineers, analysts, business users)
ComplexityMore flexible, more technicalSimpler, unified experience
Use CaseAdvanced analytics & MLUnified analytics and BI

How They Fit in an Analytics Architecture

Typical roles:

  • Azure Databricks
    • Data processing
    • Advanced transformations
    • Machine learning
  • Microsoft Fabric
    • End-to-end pipeline
    • Storage (OneLake)
    • Reporting (Power BI integration)

✔ They can complement each other in modern architectures.


Key Considerations When Choosing


Choose Azure Databricks when:

  • You need advanced data engineering or machine learning
  • You require Spark-based processing
  • You want full control and flexibility

Choose Microsoft Fabric when:

  • You want a unified analytics platform
  • You prefer simplified, integrated workflows
  • You need end-to-end analytics in one place

Why This Matters for DP-900

On the exam, you may be asked to:

  • Identify the purpose of Azure Databricks
  • Recognize Microsoft Fabric as a unified analytics platform
  • Choose the right service for a scenario
  • Understand how these services support large-scale analytics

Summary — Exam-Relevant Takeaways

Azure Databricks

  • Apache Spark-based
  • Big data processing
  • Machine learning
  • Flexible and powerful

Microsoft Fabric

  • Unified analytics platform
  • End-to-end solution
  • Includes data engineering, warehousing, and BI

✔ Key difference:

  • Databricks → advanced processing & ML
  • Fabric → all-in-one analytics platform

✔ Exam tip:
👉 Spark + big data processing → Azure Databricks
👉 Unified analytics platform → Microsoft Fabric


Go to the Practice Exam Questions for this topic.

Go to the DP-900 Exam Prep Hub main page.