Practice Questions
Question 1
Which file format is most commonly used to store simple structured data in a plain-text, tabular form?
A. JSON
B. Parquet
C. CSV
D. Avro
✅ Answer: C
Explanation:
CSV (Comma-Separated Values) stores structured data as rows and columns in plain text and is widely used for data exchange.
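As a minimal sketch of what "rows and columns in plain text" means in practice, the snippet below reads a tiny CSV with Python's standard `csv` module. The sample data and column names are made up for illustration.

```python
import csv
import io

# Hypothetical sample data: a header row followed by two records.
csv_text = "id,name,city\n1,Ana,Lisbon\n2,Ben,Oslo\n"

# DictReader uses the first row as the column names for every later row.
rows = list(csv.DictReader(io.StringIO(csv_text)))

for row in rows:
    print(row["id"], row["name"], row["city"])
```

Every value comes back as a string, because CSV carries no type information of its own.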
Question 2
Which format is most associated with semi-structured data and commonly used by web APIs?
A. CSV
B. JSON
C. TXT
D. JPEG
✅ Answer: B
Explanation:
JSON uses key–value pairs and nested objects, making it ideal for semi-structured application data and APIs.
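To make the key-value and nesting idea concrete, here is a short sketch using Python's standard `json` module. The payload is a hypothetical API response, not taken from any real service.

```python
import json

# Hypothetical API response: an object nested inside an object, plus an array.
payload = '{"user": {"id": 7, "name": "Ana"}, "tags": ["admin", "beta"], "active": true}'

doc = json.loads(payload)

# Nested values are reached by chaining keys and list indexes.
user_name = doc["user"]["name"]
first_tag = doc["tags"][0]

print(user_name, first_tag, doc["active"])
```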
Question 3
A data engineering team needs a highly compressed, column-based file format optimized for analytics queries in Azure Synapse. Which format should they use?
A. XML
B. CSV
C. Parquet
D. TXT
✅ Answer: C
Explanation:
Parquet is a columnar, binary format designed for high-performance analytics and efficient storage.
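The following is only a conceptual sketch of row-oriented versus column-oriented layout in plain Python. It does not read or write actual Parquet files, and the order data is invented for illustration.

```python
# Row-oriented layout (the way CSV stores data): one complete record after another.
rows = [
    {"order_id": 1, "region": "EU", "amount": 20.0},
    {"order_id": 2, "region": "US", "amount": 35.5},
    {"order_id": 3, "region": "EU", "amount": 12.5},
]

# Column-oriented layout (the idea behind Parquet): one list of values per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate such as SUM(amount) only needs to touch a single column.
total_amount = sum(columns["amount"])

print(columns["region"])
print(total_amount)
```

Keeping each column's values together is what lets an analytics engine skip columns a query does not need, and similar values stored side by side tend to compress well.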
Question 4
Which file format is tag-based, verbose, and commonly seen in legacy systems?
A. JSON
B. XML
C. Avro
D. CSV
✅ Answer: B
Explanation:
XML is a semi-structured, tag-based format often used in older enterprise systems and integrations.
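For comparison with JSON, here is a minimal sketch that parses a small tag-based document with Python's standard `xml.etree.ElementTree`. The element and attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document: every value is wrapped in opening and closing tags.
xml_text = (
    "<orders>"
    '<order id="1"><customer>Ana</customer><total>20.00</total></order>'
    '<order id="2"><customer>Ben</customer><total>35.50</total></order>'
    "</orders>"
)

root = ET.fromstring(xml_text)

# Collect the id attribute and two child elements from each <order>.
orders = [
    (order.get("id"), order.findtext("customer"), order.findtext("total"))
    for order in root.findall("order")
]

print(orders)
```

The repeated opening and closing tags around each value are what make XML verbose compared with the same data in JSON or CSV.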
Question 5
Which format is binary, includes schema information, and is commonly used in streaming or ingestion pipelines?
A. CSV
B. JSON
C. Avro
D. TXT
✅ Answer: C
Explanation:
Avro is a compact binary format that embeds its schema with the data and supports schema evolution, making it suitable for pipelines and streaming.
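Avro schemas are themselves written in JSON, so a rough sketch of what "schema included" looks like can be given with the standard `json` module alone. The record and field names below are hypothetical, and the snippet only inspects the schema. Encoding real Avro data would need an Avro library, which is not shown here.

```python
import json

# Hypothetical Avro record schema, written in JSON as Avro schemas are.
schema = json.loads("""
{
  "type": "record",
  "name": "SensorReading",
  "fields": [
    {"name": "device_id", "type": "string"},
    {"name": "temperature", "type": "double"},
    {"name": "unit", "type": "string", "default": "C"}
  ]
}
""")

field_names = [field["name"] for field in schema["fields"]]

# A field that declares a default can be filled in when reading older records
# written before that field existed; this is the basis of schema evolution.
fields_with_defaults = [f["name"] for f in schema["fields"] if "default" in f]

print(field_names)
print(fields_with_defaults)
```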
Question 6
A company stores application logs as JSON files in Azure Data Lake Storage. What type of data is this?
A. Structured
B. Semi-structured
C. Unstructured
D. Relational
✅ Answer: B
Explanation:
JSON represents semi-structured data because it uses keys and nested structures but does not enforce a fixed schema.
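The "no fixed schema" point can be illustrated with two hypothetical log lines whose keys differ. This is a sketch with invented log content, not a real application's format.

```python
import json

# Hypothetical log lines: the second record carries an extra nested "error" key.
log_lines = [
    '{"ts": "2024-01-01T10:00:00Z", "level": "INFO", "msg": "started"}',
    '{"ts": "2024-01-01T10:00:05Z", "level": "ERROR", "msg": "failed", "error": {"code": 500}}',
]

records = [json.loads(line) for line in log_lines]

# Each record is self-describing, so the set of keys can vary between records.
key_sets = [sorted(record) for record in records]

print(key_sets)
```

Both lines are valid JSON even though their shapes differ, which is why such logs count as semi-structured rather than structured.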
Question 7
Which format is most appropriate for exchanging small datasets between systems and opening directly in Excel?
A. Parquet
B. Avro
C. CSV
D. XML
✅ Answer: C
Explanation:
CSV is lightweight, human-readable, and opens directly in spreadsheet tools like Excel.
Question 8
Which Azure service is most commonly used to store files such as CSV, JSON, Parquet, images, and videos?
A. Azure SQL Database
B. Azure Cosmos DB
C. Azure Blob Storage
D. Azure Table Storage
✅ Answer: C
Explanation:
Azure Blob Storage is Azure’s object storage service for files of any format, whether the data they hold is structured, semi-structured, or unstructured.
Question 9
Which format is not human-readable and is primarily optimized for analytics workloads?
A. CSV
B. JSON
C. Parquet
D. XML
✅ Answer: C
Explanation:
Parquet is a binary format optimized for performance and compression, not human readability.
Question 10
Which pairing correctly matches a file format to its most appropriate data type?
A. CSV → Unstructured
B. JSON → Structured
C. TXT → Semi-structured
D. Parquet → Structured / Analytics
✅ Answer: D
Explanation:
Parquet is commonly used for structured analytical datasets in big data and Azure analytics workloads.
✅ Quick Exam Takeaways
For DP-900, remember:
- CSV → Structured, plain text
- JSON / XML → Semi-structured
- Parquet → Columnar, analytics-optimized
- Avro → Binary, schema included, pipeline-friendly
- TXT → Usually unstructured
And:
- These formats typically live in Azure Blob Storage or Azure Data Lake Storage
- Parquet and Avro are common in analytics and data engineering pipelines
