Below is a glossary that includes 100 common “Data Quality & Data Validation” terms and phrases in alphabetical order. Enjoy!
| Term | Definition & Example |
| Business Rule | Business-defined constraint on data. Example: Credit limit approval rules. |
| Check Constraint | SQL rule enforcing condition. Example: Age > 0. |
| Constraint | Rule enforced at database level. Example: NOT NULL constraint. |
| Continuous Validation | Ongoing automated validation. Example: Streaming pipelines. |
| Corrective Control | Fixes identified errors. Example: Data reload. |
| Data Accuracy | Degree to which data correctly represents reality. Example: Correct customer addresses. |
| Data Accuracy Rate | Percentage of correct values. Example: 99.5% accurate. |
| Data Anomaly | Unexpected or suspicious data value. Example: Sudden traffic spike. |
| Data Bias | Systematic data distortion. Example: Sampling bias. |
| Data Certification | Marking trusted datasets. Example: Certified gold tables. |
| Data Cleansing | Correcting or removing invalid data. Example: Fixing malformed phone numbers. |
| Data Completeness | Presence of all required data elements. Example: No missing customer IDs. |
| Data Completeness Rate | Percentage of populated fields. Example: 97% filled. |
| Data Confidence | Trust users have in data. Example: Executive reporting trust. |
| Data Conformance | Adherence to standards or schemas. Example: ISO country codes. |
| Data Consistency | Uniformity of data across systems. Example: Same currency code everywhere. |
| Data Deduplication | Removing duplicate records. Example: Merge customer profiles. |
| Data Defect | Specific instance of poor quality. Example: Invalid customer record. |
| Data Drift | Gradual change in data patterns. Example: Customer behavior shifts. |
| Data Enrichment | Enhancing data with additional attributes. Example: Adding demographic data. |
| Data Error | Incorrect or invalid data value. Example: Misspelled city name. |
| Data Exception | Approved rule deviation. Example: Legacy records. |
| Data Exception Handling | Process for managing violations. Example: Manual review. |
| Data Freshness | How current the data is. Example: Last updated timestamp. |
| Data Governance | Framework overseeing data quality. Example: Stewardship model. |
| Data Imputation | Filling missing values. Example: Replacing null with average. |
| Data Integrity | Accuracy and consistency over the lifecycle. Example: Foreign key relationships enforced. |
| Data Issue | Identified quality problem. Example: Missing values. |
| Data Latency | Delay between event and availability. Example: 2-hour ingestion lag. |
| Data Lineage | Tracking data flow and transformations. Example: Source to dashboard. |
| Data Matching | Identifying records referring to same entity. Example: Customer record linkage. |
| Data Noise | Irrelevant or misleading data. Example: Test records in prod. |
| Data Observability | Visibility into data health and behavior. Example: Pipeline monitoring. |
| Data Ownership | Accountability for data quality. Example: Business owner. |
| Data Precision | Level of detail in data. Example: Decimal places. |
| Data Profiling | Analyzing data to understand structure and quality. Example: Null percentage analysis. |
| Data Quality | Measure of how fit data is for its intended use. Example: Accurate sales totals in reports. |
| Data Quality Alert | Notification of quality issue. Example: Slack alert. |
| Data Quality Audit | Formal assessment of data quality. Example: Quarterly review. |
| Data Quality Automation | Automated quality processes. Example: CI/CD checks. |
| Data Quality Backlog | Tracked list of quality issues. Example: Jira tickets. |
| Data Quality Benchmark | Comparison standard. Example: Industry averages. |
| Data Quality Dashboard | Visual view of quality metrics. Example: Completeness trends. |
| Data Quality Dimension | Category used to measure quality. Example: Accuracy, completeness. |
| Data Quality Framework | Structured quality approach. Example: DAMA dimensions. |
| Data Quality Incident | Major quality failure. Example: Incorrect financial report. |
| Data Quality KPI | Metric tracking quality performance. Example: Duplicate rate. |
| Data Quality Maturity | Level of quality capability. Example: Reactive vs proactive. |
| Data Quality Monitoring | Ongoing quality measurement. Example: Daily freshness checks. |
| Data Quality Ownership Matrix | Mapping quality responsibility. Example: RACI chart. |
| Data Quality Program | Organization-wide quality initiative. Example: Enterprise DQ strategy. |
| Data Quality Regression | Reintroduced quality issue. Example: After schema change. |
| Data Quality Rule Engine | System executing validation rules. Example: Automated checks. |
| Data Quality Rule Violation | Failure to meet a rule. Example: Negative balance. |
| Data Quality Score | Numeric representation of data quality. Example: 98% completeness. |
| Data Quality SLA | Quality expectations agreement. Example: 99% accuracy target. |
| Data Quality SLA Breach | Failure to meet quality targets. Example: Accuracy below SLA. |
| Data Quality Trend | Quality performance over time. Example: Monthly improvement. |
| Data Reconciliation | Comparing datasets for consistency. Example: Finance system vs warehouse. |
| Data Reliability | Consistent data performance over time. Example: Stable metrics. |
| Data Remediation | Fixing data quality issues. Example: Reprocessing failed loads. |
| Data Sampling | Checking subset of data. Example: Random record review. |
| Data Standardization | Transforming data into a common format. Example: Converting dates to ISO format. |
| Data Steward | Role responsible for data quality. Example: Customer data steward. |
| Data Threshold | Acceptable quality limit. Example: ≤ 1% nulls. |
| Data Timeliness | Data availability within required timeframes. Example: Daily data refresh by 6 AM. |
| Data Trust Score | Composite measure of reliability. Example: Internal trust index. |
| Data Uniqueness | No unintended duplicates exist. Example: One row per customer. |
| Data Validation | Process of checking data against rules. Example: Rejecting invalid dates. |
| Data Validation Pipeline | Automated validation process. Example: Ingestion checks. |
| Data Validity | Data conforms to defined formats and rules. Example: Email follows standard pattern. |
| Data Verification | Confirming data accuracy. Example: Source system comparison. |
| Detective Control | Finds errors after entry. Example: Quality audits. |
| Domain Validation | Restricting values to a set. Example: Status = Active/Inactive. |
| Downstream Validation | Validating analytical outputs. Example: Dashboard totals. |
| Duplicate Detection | Identifying duplicate records. Example: Same email address twice. |
| Error Rate | Proportion of invalid records. Example: 2% failures. |
| Foreign Key | Reference to another table. Example: Order → Customer. |
| Format Validation | Ensuring correct data format. Example: YYYY-MM-DD dates. |
| Golden Dataset | Highest-quality dataset version. Example: Curated finance data. |
| Hard Validation | Blocking invalid data. Example: Reject invalid IDs. |
| Null Check | Ensuring required fields are populated. Example: Order ID not null. |
| Outlier Detection | Identifying abnormal values. Example: Negative revenue amounts. |
| Pattern Matching | Validating via regex patterns. Example: Postal code validation. |
| Post-Load Validation | Checks after data load. Example: Row count comparisons. |
| Pre-Load Validation | Checks before data ingestion. Example: File schema validation. |
| Preventive Control | Stops errors before entry. Example: Input validation. |
| Primary Key | Unique record identifier. Example: CustomerID. |
| Quality Gate | Mandatory validation checkpoint. Example: Before publishing data. |
| Range Validation | Checking values fall within limits. Example: Age between 0 and 120. |
| Referential Integrity | Valid relationships between tables. Example: Orders reference valid customers. |
| Root Cause Analysis | Identifying source of data issues. Example: ETL failure investigation. |
| Schema Validation | Checking data structure against schema. Example: Column data types. |
| Soft Validation | Warning without rejecting data. Example: Flag unusual values. |
| Source System Validation | Checking upstream data. Example: CRM record checks. |
| Statistical Validation | Using statistics to validate data. Example: Distribution checks. |
| Trusted Dataset | Data approved for consumption. Example: Executive KPIs. |
| Validation Coverage | Proportion of data checked. Example: 100% of critical fields. |
| Validation Rule | Condition data must satisfy. Example: Quantity must be ≥ 0. |
| Validation Threshold | Limit triggering failure. Example: >5% nulls. |
