This post is a part of the DP-900: Microsoft Azure Data Fundamentals Exam Prep Hub. This topic falls under these sections: Describe considerations for working with non-relational data on Azure (15–20%) --> Describe Capabilities and Features of Azure Cosmos DB --> Describe Azure Cosmos DB APIs
Note that there are 10 practice questions (with answers and explanations) for each section to help you solidify your knowledge of the material. Also, there are 2 practice tests with 60 questions each available on the hub below the exam topics section.
Azure Cosmos DB supports multiple APIs that allow developers to interact with the database using different data models and familiar query languages.
For the DP-900 exam, you should understand what these APIs are, how they differ, and when to use each one.
What Are Azure Cosmos DB APIs?
APIs in Azure Cosmos DB define:
How data is structured
How it is queried
Which tools and SDKs are used
✔ Each API provides a different way to interact with the same underlying Cosmos DB service.
Why Multiple APIs?
Azure Cosmos DB supports multiple APIs to:
Allow developers to use familiar tools
Enable easy migration from existing systems
Support different types of applications and data models
💡 Key idea: 👉 Choose the API based on your application’s existing technology or data model
Core Azure Cosmos DB APIs
1. Core (SQL) API
Also known as the SQL API.
Key Features
Uses a SQL-like query language
Stores data as JSON documents
Most commonly used API
Use Cases
New application development
General-purpose NoSQL workloads
✔ Best for: Developers familiar with SQL who want flexibility
2. MongoDB API
Key Features
Compatible with MongoDB drivers and tools
Uses MongoDB query syntax
Use Cases
Migrating existing MongoDB applications
Applications already using MongoDB
✔ Best for: MongoDB workloads moving to Azure
3. Cassandra API
Key Features
Compatible with Apache Cassandra
Supports Cassandra Query Language (CQL)
Use Cases
Large-scale distributed workloads
Applications using Cassandra
✔ Best for: Cassandra-based systems needing cloud scalability
4. Table API
Key Features
Similar to Azure Table Storage
Key-value data model
Uses OData-based queries
Use Cases
Simple key-value workloads
Applications already using Table Storage
✔ Best for: Lightweight, scalable key-value scenarios
5. Gremlin API
Key Features
Supports graph data models
Uses Gremlin query language
Use Cases
Graph-based applications
Relationship-heavy data
✔ Best for: Social networks, recommendation engines, network analysis
Key Differences Between APIs
| API | Data Model | Query Language | Best For |
| --- | --- | --- | --- |
| Core (SQL) | Document (JSON) | SQL-like | General-purpose apps |
| MongoDB | Document | MongoDB query | MongoDB migration |
| Cassandra | Wide-column | CQL | Distributed systems |
| Table | Key-value | OData | Simple scalable storage |
| Gremlin | Graph | Gremlin | Relationship-based data |
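To make the query-language differences concrete, here is a small sketch that collects one illustrative lookup per API as plain strings. The container, keyspace, and property names (`c`, `customers`, `shop`, the id `42`) are made up for illustration, and nothing here connects to Azure. It only shows roughly how the same "fetch one customer" request looks in each syntax.

```python
# Illustrative only: the same "fetch customer 42" lookup written in each
# API's query syntax. All names (customers, shop, id 42) are hypothetical.
EXAMPLE_QUERIES = {
    "Core (SQL)": 'SELECT * FROM c WHERE c.id = "42"',
    "MongoDB": 'db.customers.find({"_id": "42"})',
    "Cassandra": "SELECT * FROM shop.customers WHERE id = '42';",
    # For the Table API this is an OData filter expression.
    "Table": "PartitionKey eq 'customers' and RowKey eq '42'",
    "Gremlin": "g.V().hasLabel('customer').has('id', '42')",
}


def show_queries() -> None:
    """Print each API name next to its example query."""
    for api, query in EXAMPLE_QUERIES.items():
        print(f"{api:<12} {query}")


show_queries()
```

For the exam, the exact syntax matters less than recognizing which language belongs to which API.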
Important Concepts for DP-900
1. Same Service, Different Interfaces
All APIs run on Azure Cosmos DB, but:
Each API has its own endpoint
Each uses different query syntax
Each supports different SDKs
2. API Choice Is Permanent
You choose the API when creating a Cosmos DB account
You cannot switch APIs later
3. Performance and Features Are Shared
Global distribution
Low latency
High availability
Scalability
✔ These benefits apply regardless of API choice.
When to Choose Each API
Core (SQL) API → Default choice for most applications
MongoDB API → Existing MongoDB apps
Cassandra API → Distributed, large-scale systems
Table API → Simple key-value workloads
Gremlin API → Graph relationships
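The guidance above can be sketched as a small lookup helper. This is a study aid, not an Azure API: the function name `choose_api` and the two dictionaries are hypothetical, and the rules simply restate the list (an existing system wins, otherwise the data model decides, and Core (SQL) is the fallback).

```python
from typing import Optional

# Hypothetical study-aid tables that restate the guidance in this post.
API_BY_EXISTING_SYSTEM = {
    "mongodb": "MongoDB API",
    "cassandra": "Cassandra API",
    "table storage": "Table API",
}

API_BY_DATA_MODEL = {
    "document": "Core (SQL) API",
    "wide-column": "Cassandra API",
    "key-value": "Table API",
    "graph": "Gremlin API",
}


def choose_api(data_model: str, existing_system: Optional[str] = None) -> str:
    """Prefer the API that matches an existing system, else use the data model."""
    if existing_system:
        match = API_BY_EXISTING_SYSTEM.get(existing_system.strip().lower())
        if match:
            return match
    # Core (SQL) is the default choice for new applications.
    return API_BY_DATA_MODEL.get(data_model.strip().lower(), "Core (SQL) API")


print(choose_api("document", existing_system="MongoDB"))
print(choose_api("graph"))
```

Scenario questions usually give away either the existing technology or the data model, which is exactly the pair of inputs this helper takes.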
Why This Matters for DP-900
On the exam, you may be asked to:
Identify the correct API for a scenario
Match APIs to data models
Understand why multiple APIs exist
Recognize migration scenarios
Summary — Exam-Relevant Takeaways
✔ Azure Cosmos DB supports multiple APIs:
Core (SQL) API
MongoDB API
Cassandra API
Table API
Gremlin API
✔ Each API:
Uses a different data model
Has its own query language
✔ Key concept: 👉 Choose the API based on your application’s needs or existing system
✔ Important:
API choice is fixed at creation
All APIs benefit from Cosmos DB features (scalability, global distribution)
Question 1
Which Azure SQL offering is fully managed and requires the least administrative effort?
A. SQL Server on Azure Virtual Machines B. Azure SQL Managed Instance C. Azure SQL Database D. Azure Synapse Analytics
✅ Answer: C
Explanation: Azure SQL Database is a fully managed PaaS service with minimal administration.
Question 2
Which Azure SQL service provides the highest level of compatibility with on-premises SQL Server while still being a PaaS solution?
A. Azure SQL Database B. Azure SQL Managed Instance C. SQL Server on Azure Virtual Machines D. Azure Cosmos DB
✅ Answer: B
Explanation: Azure SQL Managed Instance offers near 100% compatibility with SQL Server.
Question 3
Which Azure SQL option allows full control over the operating system?
A. Azure SQL Database B. Azure SQL Managed Instance C. SQL Server on Azure Virtual Machines D. Azure SQL Elastic Pool
✅ Answer: C
Explanation: SQL Server on Azure VM is an IaaS offering, giving full OS-level control.
Question 4
Which service is BEST suited for a cloud-native application with minimal management overhead?
A. SQL Server on Azure Virtual Machines B. Azure SQL Managed Instance C. Azure SQL Database D. Azure Data Lake
✅ Answer: C
Explanation: Azure SQL Database is optimized for modern cloud applications.
Question 5
Which Azure SQL service supports instance-level features such as SQL Agent?
A. Azure SQL Database B. Azure SQL Managed Instance C. SQL Server on Azure Virtual Machines D. Azure Blob Storage
✅ Answer: B
Explanation: Managed Instance supports many instance-level features not available in Azure SQL Database.
Question 6
A company wants to migrate an existing SQL Server database with minimal changes. Which service should they choose?
A. Azure SQL Database B. Azure SQL Managed Instance C. SQL Server on Azure Virtual Machines D. Azure Synapse Analytics
✅ Answer: B
Explanation: Managed Instance is designed for lift-and-shift migrations with high compatibility.
Question 7
Which Azure SQL option requires you to manage backups, updates, and patching?
A. Azure SQL Database B. Azure SQL Managed Instance C. SQL Server on Azure Virtual Machines D. Azure SQL Elastic Pool
✅ Answer: C
Explanation: In IaaS (Azure VM), the customer is responsible for management tasks.
Question 8
Which of the following best describes Platform as a Service (PaaS) in the Azure SQL family?
A. Full control over hardware and OS B. No database management required at all C. Azure manages infrastructure and database maintenance D. Only supports non-relational data
✅ Answer: C
Explanation: PaaS handles infrastructure, patching, backups, and high availability.
Question 9
Which Azure SQL service is MOST appropriate when you need maximum control and customization?
A. Azure SQL Database B. Azure SQL Managed Instance C. SQL Server on Azure Virtual Machines D. Azure SQL Elastic Pool
✅ Answer: C
Explanation: SQL Server on Azure VM provides full control over configuration and environment.
Question 10
Which statement best describes the relationship between the Azure SQL family products?
A. They use completely different database engines B. They all use the SQL Server engine with different management levels C. Only Azure SQL Database supports SQL D. Only SQL Server on Azure VM supports relational data
✅ Answer: B
Explanation: All Azure SQL offerings are based on the SQL Server engine, differing mainly in management and control.
✅ Quick Exam Takeaways
✔ Azure SQL Database
Fully managed (PaaS)
Best for cloud-native apps
✔ Azure SQL Managed Instance
Near full SQL Server compatibility
Best for migrations
✔ SQL Server on Azure VM
Full control (IaaS)
You manage everything
✔ Key concept: 👉 More control = more responsibility 👉 More automation = less control
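The control-versus-responsibility trade-off can be sketched as a tiny decision helper. The `Offering` class and `recommend` function are hypothetical names for illustration, and the two flags are a simplification of the points covered in the questions above (OS-level control and instance-level features such as SQL Agent).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offering:
    name: str
    service_model: str       # "PaaS" or "IaaS"
    os_access: bool          # full control over the operating system
    instance_features: bool  # instance-level features such as SQL Agent


# Ordered from most managed to least managed.
OFFERINGS = [
    Offering("Azure SQL Database", "PaaS", False, False),
    Offering("Azure SQL Managed Instance", "PaaS", False, True),
    Offering("SQL Server on Azure Virtual Machines", "IaaS", True, True),
]


def recommend(needs_os_control: bool, needs_instance_features: bool) -> str:
    """Return the most managed offering that still meets the stated needs."""
    for offering in OFFERINGS:
        if needs_os_control and not offering.os_access:
            continue
        if needs_instance_features and not offering.instance_features:
            continue
        return offering.name
    raise ValueError("no offering matches the stated needs")


print(recommend(needs_os_control=False, needs_instance_features=True))
```

Walking the list from most managed to least managed mirrors the exam heuristic: pick the option with the least administrative effort that still satisfies the requirement.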
Computer vision is a branch of Artificial Intelligence (AI) that enables machines to interpret, analyze, and understand visual information such as images and videos. In the context of the AI-900: Microsoft Azure AI Fundamentals exam, you are not expected to build complex models or write code. Instead, the focus is on recognizing computer vision workloads, understanding what problems they solve, and knowing which Azure AI services are appropriate for each scenario.
This topic falls under:
Describe Artificial Intelligence workloads and considerations (15–20%)
Identify features of common AI workloads
A strong conceptual understanding here will help you confidently answer many scenario-based exam questions.
What Is a Computer Vision Workload?
A computer vision workload involves extracting meaningful insights from visual data. These workloads allow systems to:
Identify objects, people, or text in images
Analyze facial features or emotions
Understand the content of photos or videos
Detect changes, anomalies, or motion
Common inputs include:
Images (JPEG, PNG, etc.)
Video streams (live or recorded)
Common outputs include:
Labels or tags
Bounding boxes around detected objects
Extracted text
Descriptions of image content
Common Computer Vision Use Cases
On the AI-900 exam, computer vision workloads are usually presented as real-world scenarios. Below are the most common ones you should recognize.
Image Classification
What it does: Assigns a category or label to an image.
Example scenarios:
Determining whether an image contains a cat, dog, or bird
Classifying products in an online store
Identifying whether a photo shows food, people, or scenery
Key idea: The entire image is assigned one or more categories.
Object Detection
What it does: Detects and locates multiple objects within an image.
Example scenarios:
Detecting cars, pedestrians, and traffic signs in street images
Counting people in a room
Identifying damaged items in a warehouse
Key idea: Unlike classification, object detection identifies where objects appear using bounding boxes.
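The difference between the two outputs can be sketched with plain data structures. This is not the response format of any Azure service: the `Classification` and `Detection` classes, the sample values, and the `count_objects` helper are all made up to show that detection results carry a location while classification results do not.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Classification:
    label: str          # describes the whole image
    confidence: float


@dataclass
class Detection:
    label: str          # describes one object in the image
    confidence: float
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def count_objects(detections: List[Detection], label: str,
                  min_confidence: float = 0.5) -> int:
    """Count detected objects with the given label above a confidence threshold."""
    return sum(
        1 for d in detections
        if d.label == label and d.confidence >= min_confidence
    )


# Made-up results for a single street photo.
classification_result = [Classification("street scene", 0.93)]
detection_result = [
    Detection("person", 0.91, (34, 120, 60, 180)),
    Detection("person", 0.42, (300, 110, 55, 170)),
    Detection("car", 0.88, (150, 200, 220, 140)),
]

print(count_objects(detection_result, "person"))
```

A scenario such as "counting people in a room" needs per-object results like these, which is why it maps to object detection rather than classification.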
Face Detection and Facial Analysis
What it does: Detects human faces and analyzes facial attributes.
Example scenarios:
Detecting whether a face is present in an image
Estimating age or emotion
Identifying facial landmarks (eyes, nose, mouth)
Important exam note:
AI-900 focuses on face detection and analysis, not facial recognition for identity verification.
Be aware of ethical and privacy considerations when working with facial data.
Optical Character Recognition (OCR)
What it does: Extracts printed or handwritten text from images and documents.
Example scenarios:
Reading text from scanned documents
Extracting information from receipts or invoices
Recognizing license plate numbers
Key idea: OCR turns unstructured visual text into machine-readable text.
Image Description and Tagging
What it does: Generates descriptive text or tags that summarize image content.
Example scenarios:
Automatically tagging photos in a digital library
Creating alt text for accessibility
Generating captions for images
Key idea: This workload focuses on understanding the overall context of an image rather than specific objects.
Video Analysis
What it does: Analyzes video content frame by frame.
Example scenarios:
Detecting motion or anomalies in security footage
Tracking objects over time
Summarizing video content
Key idea: Video analysis extends image analysis across time, not just a single frame.
Azure Services Commonly Associated with Computer Vision
For the AI-900 exam, you should recognize which Azure AI services support computer vision workloads at a high level.
Azure AI Vision
Supports:
Image analysis
Object detection
OCR
Face detection
Image tagging and description
This is the most commonly referenced service for computer vision scenarios on the exam.
Azure AI Custom Vision
Supports:
Custom image classification
Custom object detection
Used when prebuilt models are not sufficient and you need to train a model using your own images.
Azure AI Video Indexer
Supports:
Video analysis
Object, face, and scene detection in videos
Typically appears in scenarios involving video content.
How Computer Vision Differs from Other AI Workloads
Understanding what is not computer vision is just as important on the exam.
| AI Workload Type | Focus Area |
| --- | --- |
| Computer Vision | Images and videos |
| Natural Language Processing | Text and speech |
| Speech AI | Audio and voice |
| Anomaly Detection | Patterns in numerical or time-series data |
Exam tip: If the input data is visual (images or video), you are almost certainly dealing with a computer vision workload.
Responsible AI Considerations
Microsoft emphasizes responsible AI, and AI-900 includes high-level awareness of these principles.
For computer vision workloads, key considerations include:
Privacy and consent when capturing images or video
Avoiding bias in facial analysis
Transparency in how visual data is collected and used
You will not be tested on implementation details, but you may see conceptual questions about ethical use.
Exam Tips for Identifying Computer Vision Workloads
Focus on keywords like image, photo, video, camera, scanned document
Look for actions such as detect, recognize, classify, extract text
Match the scenario to the simplest appropriate workload
Remember: AI-900 tests understanding, not coding
Summary
To succeed on the AI-900 exam, you should be able to:
Recognize when a problem is a computer vision workload
Identify common use cases such as image classification, object detection, and OCR
Understand which Azure AI services are commonly used
Distinguish computer vision from other AI workloads
Mastering this topic will give you a strong foundation for many questions in the Describe Artificial Intelligence workloads and considerations domain.
Incremental refresh is a key optimization technique for enterprise-scale semantic models in Microsoft Fabric and Power BI. Instead of fully refreshing all data during each refresh cycle, incremental refresh allows you to refresh only new or changed data, significantly improving refresh performance, reducing resource consumption, and enabling scalability for large datasets.
In the DP-600 exam, this topic appears under Optimize enterprise-scale semantic models and focuses on when, why, and how to configure incremental refresh correctly.
What Is Incremental Refresh?
Incremental refresh is a feature for Import mode and Hybrid (Import + DirectQuery) semantic models that:
Partitions data based on date/time columns
Refreshes only a recent portion of data
Retains historical data without reprocessing it
Optionally supports real-time data using DirectQuery
Incremental refresh is not applicable to:
Direct Lake–only semantic models
Pure DirectQuery models
Key Benefits
Incremental refresh provides several enterprise-level advantages:
Faster refresh times for large datasets
Reduced memory and CPU usage
Improved reliability of scheduled refreshes
Better scalability for growing fact tables
Enables near-real-time analytics when combined with DirectQuery
Core Configuration Components
1. Date/Time Column Requirement
Incremental refresh requires a column that:
Is of type Date, DateTime, or DateTimeZone
Identifies the point in time each row belongs to (for example, OrderDate or TransactionDate)
This column is used to define data partitions.
2. RangeStart and RangeEnd Parameters
Incremental refresh relies on two Power Query parameters:
RangeStart – Beginning of the refresh window
RangeEnd – End of the refresh window
These parameters:
Must be of type Date/Time
Are used in a filter step in Power Query
Are evaluated dynamically during refresh
Exam tip: These parameters are required, not optional.
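The filter step can be sketched as follows. In Power Query itself this is typically a `Table.SelectRows` step that compares the date column against `RangeStart` and `RangeEnd`; the Python below only mimics that logic on made-up rows. The function name `filter_range` and the sample data are assumptions. The point it illustrates is the half-open comparison: only one side includes equality, so a row sitting exactly on a boundary lands in one partition, not two.

```python
from datetime import datetime
from typing import Dict, List


def filter_range(rows: List[Dict], range_start: datetime,
                 range_end: datetime) -> List[Dict]:
    """Keep rows where range_start <= OrderDate < range_end (half-open)."""
    return [r for r in rows if range_start <= r["OrderDate"] < range_end]


# Made-up fact rows; id 2 sits exactly on the January/February boundary.
orders = [
    {"id": 1, "OrderDate": datetime(2024, 1, 31, 23, 59)},
    {"id": 2, "OrderDate": datetime(2024, 2, 1, 0, 0)},
    {"id": 3, "OrderDate": datetime(2024, 2, 15, 12, 0)},
]

january = filter_range(orders, datetime(2024, 1, 1), datetime(2024, 2, 1))
february = filter_range(orders, datetime(2024, 2, 1), datetime(2024, 3, 1))

print([r["id"] for r in january])
print([r["id"] for r in february])
```

During a refresh, the service substitutes different values for the two parameters for each partition it processes, which is why the same filter step can serve every partition.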
3. Refresh and Storage Policies
When configuring incremental refresh, you define two key time windows:
| Policy | Purpose |
| --- | --- |
| Store rows from the past | Defines how much historical data is retained |
| Refresh rows from the past | Defines how much recent data is refreshed |
Example:
Store data for 5 years
Refresh data from the last 7 days
Only the refresh window is reprocessed during each refresh.
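A rough way to picture the two windows is to classify a row by its date. The sketch below assumes the example policy above (store 5 years, refresh 7 days) and uses exact dates for simplicity, whereas the service works with whole partition periods. The names `years_before` and `classify_row` are hypothetical.

```python
from datetime import date, timedelta


def years_before(d: date, years: int) -> date:
    """Return the same calendar day `years` earlier (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def classify_row(row_date: date, refresh_date: date,
                 store_years: int = 5, refresh_days: int = 7) -> str:
    """Say what happens to a row under the storage and refresh windows."""
    store_start = years_before(refresh_date, store_years)
    refresh_start = refresh_date - timedelta(days=refresh_days)
    if row_date < store_start:
        return "dropped"      # older than the storage window
    if row_date < refresh_start:
        return "kept"         # stored, but not reprocessed
    return "refreshed"        # inside the refresh window


today = date(2024, 6, 15)
for sample in (date(2018, 1, 1), date(2022, 3, 1), date(2024, 6, 10)):
    print(sample, classify_row(sample, today))
```

The storage window is always the larger of the two, and the refresh window sits at its most recent end.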
4. Optional: Detect Data Changes
Incremental refresh can optionally use a change detection column (for example, LastModifiedDate):
Only refreshes partitions where data has changed
Reduces unnecessary refresh operations
Column must be reliably updated when records change
This is especially useful for slowly changing dimensions.
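The idea behind change detection can be sketched like this: keep the latest value of the change column for each partition, and refresh a partition only when that value differs from what was seen at the previous refresh. This is a simplified model under stated assumptions (one partition per day, made-up rows, a hypothetical `partitions_to_refresh` helper), not the service's actual implementation.

```python
from datetime import datetime
from typing import Dict, List


def partitions_to_refresh(rows: List[dict],
                          previous_max: Dict[str, datetime]) -> List[str]:
    """Return partition keys whose latest LastModifiedDate has changed."""
    current_max: Dict[str, datetime] = {}
    for row in rows:
        key = row["OrderDate"].strftime("%Y-%m-%d")  # one partition per day
        stamp = row["LastModifiedDate"]
        if key not in current_max or stamp > current_max[key]:
            current_max[key] = stamp
    return sorted(
        key for key, stamp in current_max.items()
        if previous_max.get(key) != stamp
    )


# Made-up rows: the 2 June order was edited after the previous refresh.
rows = [
    {"OrderDate": datetime(2024, 6, 1), "LastModifiedDate": datetime(2024, 6, 1, 9)},
    {"OrderDate": datetime(2024, 6, 2), "LastModifiedDate": datetime(2024, 6, 3, 8)},
]
seen_last_time = {
    "2024-06-01": datetime(2024, 6, 1, 9),
    "2024-06-02": datetime(2024, 6, 2, 10),
}

print(partitions_to_refresh(rows, seen_last_time))
```

This also shows why the column must be reliably updated: if an edit does not touch `LastModifiedDate`, the partition looks unchanged and is skipped.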
Incremental Refresh with Real-Time Data (Hybrid Tables)
Incremental refresh can be combined with DirectQuery to support real-time data:
Historical data → Import mode
Recent data → DirectQuery
This configuration:
Uses the “Get the latest data in real time” option
Is commonly referred to as a Hybrid table
Balances performance with freshness
Deployment and Execution Behavior
Incremental refresh is defined in Power BI Desktop
Partitions are created only after publishing
Refresh execution happens in the Fabric service
Desktop refresh does not create partitions
Exam tip: Many questions test the difference between design-time configuration and service-side execution.
Limitations and Considerations
Requires Import or Hybrid mode
Date column must exist in the fact table
Cannot be configured directly in Fabric service
Schema changes may require full refresh
Partition count should be managed to avoid excessive overhead
Common DP-600 Exam Scenarios
You may be asked to:
Choose incremental refresh to solve long refresh times
Decide between full refresh vs incremental refresh
Configure refresh windows for historical vs recent data
Combine incremental refresh with real-time analytics
When to Use Incremental Refresh (Exam Heuristic)
Choose incremental refresh when:
Fact tables are large and growing
Only recent data changes
Full refresh times are too long
Import mode is required for performance
Avoid it when:
Data volume is small
Real-time access is required for all data
Using Direct Lake–only models
Exam Tips
For DP-600, remember:
RangeStart / RangeEnd are mandatory
Incremental refresh = Import or Hybrid
Partitions are service-side
Refresh window ≠ storage window
Hybrid tables enable real-time + performance
Summary
Incremental refresh is a foundational optimization technique for large semantic models in Microsoft Fabric. For the DP-600 exam, focus on:
Required parameters (RangeStart, RangeEnd)
Refresh vs storage windows
Import and Hybrid model compatibility
Real-time and change detection scenarios
Service-side execution behavior
Practice Questions:
Here are 10 questions to test and help solidify your learning and knowledge. As you review these and other questions in your preparation, make sure to:
Identify and understand why an option is correct (or incorrect), not just which one
Look for keywords in exam questions and understand the usage scenarios they point to
Expect scenario-based questions rather than direct definitions
Question 1
You have a large fact table with 5 years of historical data. Only the most recent data changes daily. Which feature should you implement to reduce refresh time?
A. DirectQuery mode B. Incremental refresh C. Calculated tables D. Composite models
✅ Correct Answer: B
Explanation: Incremental refresh is designed to refresh only recent data while retaining historical partitions, significantly improving refresh performance for large datasets.
Question 2
Which two Power Query parameters are required to configure incremental refresh?
A. StartDate and EndDate B. MinDate and MaxDate C. RangeStart and RangeEnd D. RefreshStart and RefreshEnd
✅ Correct Answer: C
Explanation: Incremental refresh requires RangeStart and RangeEnd parameters of type Date/Time to define partition boundaries.
Question 3
Where are incremental refresh partitions actually created?
A. Power BI Desktop during data load B. Fabric Data Factory C. Microsoft Fabric service after publishing D. SQL endpoint
✅ Correct Answer: C
Explanation: Partitions are created and managed only in the Fabric service after the model is published. Desktop refresh does not create partitions.
Question 4
Which storage mode is required to use incremental refresh?
A. DirectQuery only B. Direct Lake only C. Import or Hybrid D. Dual only
✅ Correct Answer: C
Explanation: Incremental refresh works with Import mode and Hybrid tables. It is not supported for DirectQuery-only or Direct Lake–only models.
Question 5
You configure incremental refresh to store 5 years of data and refresh the last 7 days. What happens during a scheduled refresh?
A. All data is fully refreshed B. Only the last 7 days are refreshed C. Only the last year is refreshed D. Only new rows are loaded
✅ Correct Answer: B
Explanation: The refresh window defines how much data is reprocessed. Historical partitions outside that window are retained without refresh.
Question 6
Which column type is required for incremental refresh filtering?
A. Text B. Integer C. Boolean D. Date/DateTime
✅ Correct Answer: D
Explanation: Incremental refresh requires a Date, DateTime, or DateTimeZone column to define time-based partitions.
Question 7
What is the purpose of the Detect data changes option?
A. To refresh all partitions automatically B. To detect schema changes C. To refresh only partitions where data has changed D. To enable real-time DirectQuery
✅ Correct Answer: C
Explanation: Detect data changes uses a change-tracking column (e.g., LastModifiedDate) to avoid refreshing partitions when no data has changed.
Question 8
Which scenario best fits a Hybrid incremental refresh configuration?
A. All data must be queried in real time B. Small dataset refreshed once per day C. Historical data rarely changes, but recent data must be real time D. Streaming data only
✅ Correct Answer: C
Explanation: Hybrid tables combine Import for historical data and DirectQuery for recent data, providing real-time access where needed.
Question 9
What happens if the date column used for incremental refresh contains null values?
A. Incremental refresh is automatically disabled B. Only historical partitions fail C. Refresh may fail or produce incorrect partitions D. Null values are ignored safely
✅ Correct Answer: C
Explanation: The date column must be reliable. Null or invalid values can break partition logic and cause refresh failures.
Question 10
When should you avoid using incremental refresh?
A. When the dataset is large B. When only recent data changes C. When using Direct Lake–only semantic models D. When refresh duration is long
✅ Correct Answer: C
Explanation: Incremental refresh is not supported for Direct Lake–only models, as Direct Lake handles freshness differently through OneLake access.
Microsoft Fabric is a central platform for data and analytics, and Shortcuts are one of the features that make it work as an all-in-one platform. Shortcuts provide a simple way to unify data across multiple locations without duplicating or moving it. This is a big deal because it saves much of the time and effort usually spent moving data around.
What Are Shortcuts?
Shortcuts are references (or “pointers”) to data that resides in another storage location. Instead of copying the data into Fabric, a shortcut lets you access and query it as if it were stored locally.
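The pointer idea can be sketched with a toy in-memory model. This is not how OneLake is implemented and it uses no Fabric API: the `Lake` class and the paths are hypothetical. It only shows that a shortcut stores a reference, so reads go to the source and no second copy of the data exists.

```python
class Lake:
    """Toy storage: real data lives in `files`, shortcuts hold only references."""

    def __init__(self):
        self.files = {}      # path -> data
        self.shortcuts = {}  # shortcut path -> target path

    def write(self, path, data):
        self.files[path] = data

    def create_shortcut(self, path, target):
        # Store the reference only; the data itself is not copied.
        self.shortcuts[path] = target

    def read(self, path):
        # Resolve a shortcut to its target, or read the path directly.
        target = self.shortcuts.get(path, path)
        return self.files[target]


lake = Lake()
source = "s3://sales-bucket/orders.parquet"  # hypothetical external location
lake.write(source, [1, 2, 3])
lake.create_shortcut("lakehouse/Files/orders", source)

print(lake.read("lakehouse/Files/orders"))

# A change at the source is visible through the shortcut immediately.
lake.write(source, [1, 2, 3, 4])
print(lake.read("lakehouse/Files/orders"))
```

Because only the reference is stored, the source stays the single copy of the data, which is what the "no duplication" and "single source of truth" benefits rely on.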
This is especially valuable in today’s data landscape, where data often spans OneLake, Azure Data Lake Storage (ADLS), Amazon S3, or other environments.
Types of Shortcuts
There are two types of shortcuts: table shortcuts and file shortcuts.
Table Shortcuts
Point to existing tables in other Fabric workspaces or external sources.
Allow you to query and analyze the table without physically moving it.
File Shortcuts
Point to files (e.g., Parquet, CSV, Delta Lake) stored in OneLake or other supported storage systems.
Useful for scenarios where files are your system of record, but you want to use them in Fabric experiences like Power BI, Data Engineering, or Data Science.
Benefits of Shortcuts
Shortcuts are a really useful feature. Here are some of the benefits:
No Data Duplication: Saves storage costs and avoids data sprawl.
Single Source of Truth: Data stays in its original location while being usable across Fabric.
Speed and Efficiency: Query and analyze external data in place, without lengthy ETL processes.
Flexibility: Works across different storage platforms and Fabric workspaces.
How and Where Shortcuts Can Be Created
In OneLake: You can create shortcuts directly in OneLake to link to data from ADLS Gen2, Amazon S3, or other OneLake workspaces.
In Fabric Experiences: Whether you work in Data Engineering, Data Science, Real-Time Analytics, or Power BI, you can create shortcuts in lakehouses or KQL (Kusto Query Language) databases, and they then appear as data in OneLake. Any Fabric service can use them without copying data from the data source.
In Workspaces: Shortcuts make it possible to connect across lakehouses stored in different workspaces, breaking down silos within an organization. A shortcut can point to data in a lakehouse, warehouse, or KQL database.
Note that shortcuts cannot be created inside a warehouse. However, a warehouse can still query data stored in other warehouses and lakehouses.
How Shortcuts Can Be Used
Cross-Workspace Data Access: Analysts can query data in another team’s workspace without requesting a copy.
Data Virtualization: Data scientists can work with files stored in ADLS without having to move them into Fabric.
BI and Reporting: Power BI models can use shortcuts to reference external files or tables, enabling consistent reporting without duplication.
ETL Simplification: Instead of moving raw files into Fabric, engineers can create shortcuts and build transformations directly on the source.
Common Scenarios
A finance team wants to build Power BI reports on data stored by the operations team without moving the data.
A data scientist needs access to parquet files in Amazon S3 but prefers to analyze them within Fabric.
A company with multiple Fabric workspaces wants to centralize access to shared reference data (like customer or product master data) without replication.
In summary: Microsoft Fabric Shortcuts simplify data access across locations and workspaces. Whether table-based or file-based, they allow organizations to unify data without duplication, streamline analytics, and improve collaboration.