Glossary – 100 “Data Quality & Data Validation” terms

Below is a glossary that includes 100 common “Data Quality & Data Validation” terms and phrases in alphabetical order. Enjoy!

| Term | Definition & Example |
| --- | --- |
| Business Rule | Business-defined constraint on data. Example: Credit limit approval rules. |
| Check Constraint | SQL rule enforcing condition. Example: Age > 0. |
| Constraint | Rule enforced at database level. Example: NOT NULL constraint. |
| Continuous Validation | Ongoing automated validation. Example: Streaming pipelines. |
| Corrective Control | Fixes identified errors. Example: Data reload. |
| Data Accuracy | Degree to which data correctly represents reality. Example: Correct customer addresses. |
| Data Accuracy Rate | Percentage of correct values. Example: 99.5% accurate. |
| Data Anomaly | Unexpected or suspicious data value. Example: Sudden traffic spike. |
| Data Bias | Systematic data distortion. Example: Sampling bias. |
| Data Certification | Marking trusted datasets. Example: Certified gold tables. |
| Data Cleansing | Correcting or removing invalid data. Example: Fixing malformed phone numbers. |
| Data Completeness | Presence of all required data elements. Example: No missing customer IDs. |
| Data Completeness Rate | Percentage of populated fields. Example: 97% filled. |
| Data Confidence | Trust users have in data. Example: Executive reporting trust. |
| Data Conformance | Adherence to standards or schemas. Example: ISO country codes. |
| Data Consistency | Uniformity of data across systems. Example: Same currency code everywhere. |
| Data Deduplication | Removing duplicate records. Example: Merge customer profiles. |
| Data Defect | Specific instance of poor quality. Example: Invalid customer record. |
| Data Drift | Gradual change in data patterns. Example: Customer behavior shifts. |
| Data Enrichment | Enhancing data with additional attributes. Example: Adding demographic data. |
| Data Error | Incorrect or invalid data value. Example: Misspelled city name. |
| Data Exception | Approved rule deviation. Example: Legacy records. |
| Data Exception Handling | Process for managing violations. Example: Manual review. |
| Data Freshness | How current the data is. Example: Last updated timestamp. |
| Data Governance | Framework overseeing data quality. Example: Stewardship model. |
| Data Imputation | Filling missing values. Example: Replacing null with average. |
| Data Integrity | Accuracy and consistency over the lifecycle. Example: Foreign key relationships enforced. |
| Data Issue | Identified quality problem. Example: Missing values. |
| Data Latency | Delay between event and availability. Example: 2-hour ingestion lag. |
| Data Lineage | Tracking data flow and transformations. Example: Source to dashboard. |
| Data Matching | Identifying records referring to same entity. Example: Customer record linkage. |
| Data Noise | Irrelevant or misleading data. Example: Test records in prod. |
| Data Observability | Visibility into data health and behavior. Example: Pipeline monitoring. |
| Data Ownership | Accountability for data quality. Example: Business owner. |
| Data Precision | Level of detail in data. Example: Decimal places. |
| Data Profiling | Analyzing data to understand structure and quality. Example: Null percentage analysis. |
| Data Quality | Measure of how fit data is for its intended use. Example: Accurate sales totals in reports. |
| Data Quality Alert | Notification of quality issue. Example: Slack alert. |
| Data Quality Audit | Formal assessment of data quality. Example: Quarterly review. |
| Data Quality Automation | Automated quality processes. Example: CI/CD checks. |
| Data Quality Backlog | Tracked list of quality issues. Example: Jira tickets. |
| Data Quality Benchmark | Comparison standard. Example: Industry averages. |
| Data Quality Dashboard | Visual view of quality metrics. Example: Completeness trends. |
| Data Quality Dimension | Category used to measure quality. Example: Accuracy, completeness. |
| Data Quality Framework | Structured quality approach. Example: DAMA dimensions. |
| Data Quality Incident | Major quality failure. Example: Incorrect financial report. |
| Data Quality KPI | Metric tracking quality performance. Example: Duplicate rate. |
| Data Quality Maturity | Level of quality capability. Example: Reactive vs proactive. |
| Data Quality Monitoring | Ongoing quality measurement. Example: Daily freshness checks. |
| Data Quality Ownership Matrix | Mapping quality responsibility. Example: RACI chart. |
| Data Quality Program | Organization-wide quality initiative. Example: Enterprise DQ strategy. |
| Data Quality Regression | Reintroduced quality issue. Example: After schema change. |
| Data Quality Rule Engine | System executing validation rules. Example: Automated checks. |
| Data Quality Rule Violation | Failure to meet a rule. Example: Negative balance. |
| Data Quality Score | Numeric representation of data quality. Example: 98% completeness. |
| Data Quality SLA | Quality expectations agreement. Example: 99% accuracy target. |
| Data Quality SLA Breach | Failure to meet quality targets. Example: Accuracy below SLA. |
| Data Quality Trend | Quality performance over time. Example: Monthly improvement. |
| Data Reconciliation | Comparing datasets for consistency. Example: Finance system vs warehouse. |
| Data Reliability | Consistent data performance over time. Example: Stable metrics. |
| Data Remediation | Fixing data quality issues. Example: Reprocessing failed loads. |
| Data Sampling | Checking subset of data. Example: Random record review. |
| Data Standardization | Transforming data into a common format. Example: Converting dates to ISO format. |
| Data Steward | Role responsible for data quality. Example: Customer data steward. |
| Data Threshold | Acceptable quality limit. Example: ≤ 1% nulls. |
| Data Timeliness | Data availability within required timeframes. Example: Daily data refresh by 6 AM. |
| Data Trust Score | Composite measure of reliability. Example: Internal trust index. |
| Data Uniqueness | No unintended duplicates exist. Example: One row per customer. |
| Data Validation | Process of checking data against rules. Example: Rejecting invalid dates. |
| Data Validation Pipeline | Automated validation process. Example: Ingestion checks. |
| Data Validity | Data conforms to defined formats and rules. Example: Email follows standard pattern. |
| Data Verification | Confirming data accuracy. Example: Source system comparison. |
| Detective Control | Finds errors after entry. Example: Quality audits. |
| Domain Validation | Restricting values to a set. Example: Status = Active/Inactive. |
| Downstream Validation | Validating analytical outputs. Example: Dashboard totals. |
| Duplicate Detection | Identifying duplicate records. Example: Same email address twice. |
| Error Rate | Proportion of invalid records. Example: 2% failures. |
| Foreign Key | Reference to another table. Example: Order → Customer. |
| Format Validation | Ensuring correct data format. Example: YYYY-MM-DD dates. |
| Golden Dataset | Highest-quality dataset version. Example: Curated finance data. |
| Hard Validation | Blocking invalid data. Example: Reject invalid IDs. |
| Null Check | Ensuring required fields are populated. Example: Order ID not null. |
| Outlier Detection | Identifying abnormal values. Example: Negative revenue amounts. |
| Pattern Matching | Validating via regex patterns. Example: Postal code validation. |
| Post-Load Validation | Checks after data load. Example: Row count comparisons. |
| Pre-Load Validation | Checks before data ingestion. Example: File schema validation. |
| Preventive Control | Stops errors before entry. Example: Input validation. |
| Primary Key | Unique record identifier. Example: CustomerID. |
| Quality Gate | Mandatory validation checkpoint. Example: Before publishing data. |
| Range Validation | Checking values fall within limits. Example: Age between 0 and 120. |
| Referential Integrity | Valid relationships between tables. Example: Orders reference valid customers. |
| Root Cause Analysis | Identifying source of data issues. Example: ETL failure investigation. |
| Schema Validation | Checking data structure against schema. Example: Column data types. |
| Soft Validation | Warning without rejecting data. Example: Flag unusual values. |
| Source System Validation | Checking upstream data. Example: CRM record checks. |
| Statistical Validation | Using statistics to validate data. Example: Distribution checks. |
| Trusted Dataset | Data approved for consumption. Example: Executive KPIs. |
| Validation Coverage | Proportion of data checked. Example: 100% of critical fields. |
| Validation Rule | Condition data must satisfy. Example: Quantity must be ≥ 0. |
| Validation Threshold | Limit triggering failure. Example: >5% nulls. |
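
To make a few of these terms concrete, here is a minimal Python sketch that applies a null check, range validation, domain validation, format validation, and duplicate detection to a handful of records, then derives an error rate, a completeness rate, and a validation threshold check. The sample records, field names, the `ALLOWED_STATUS` set, and the 5% `NULL_THRESHOLD` are illustrative assumptions borrowed from the glossary examples, not part of any particular tool.

```python
import re
from datetime import datetime

# Hypothetical sample records; field names are illustrative only.
records = [
    {"order_id": "A1", "age": 34, "status": "Active",
     "order_date": "2024-03-01", "email": "ann@example.com"},
    {"order_id": None, "age": 27, "status": "Inactive",
     "order_date": "2024-03-02", "email": "bob@example.com"},
    {"order_id": "A3", "age": 150, "status": "Pending",
     "order_date": "03/04/2024", "email": "ann@example.com"},
]

ALLOWED_STATUS = {"Active", "Inactive"}  # domain validation: allowed value set
NULL_THRESHOLD = 0.05                    # validation threshold: fail above 5% nulls


def is_iso_date(value):
    """Format validation: True only for YYYY-MM-DD strings that are real dates."""
    if not isinstance(value, str) or not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def validate(record):
    """Return the list of rule violations found in one record."""
    violations = []
    if record.get("order_id") is None:                # null check
        violations.append("null_check:order_id")
    age = record.get("age")
    if age is None or not 0 <= age <= 120:            # range validation
        violations.append("range:age")
    if record.get("status") not in ALLOWED_STATUS:    # domain validation
        violations.append("domain:status")
    if not is_iso_date(record.get("order_date")):     # format validation
        violations.append("format:order_date")
    return violations


def find_duplicates(rows, key):
    """Duplicate detection: values of `key` that appear more than once."""
    seen, dupes = set(), set()
    for row in rows:
        value = row.get(key)
        if value is None:
            continue
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return dupes


def completeness_rate(rows, field):
    """Completeness rate: share of rows where `field` is populated."""
    return sum(1 for r in rows if r.get(field) is not None) / len(rows)


results = [validate(r) for r in records]
error_rate = sum(1 for v in results if v) / len(records)
duplicate_emails = find_duplicates(records, "email")
order_id_completeness = completeness_rate(records, "order_id")
threshold_breached = (1 - order_id_completeness) > NULL_THRESHOLD

for record, violations in zip(records, results):
    print(record["order_id"], violations or "ok")
```

In this sketch every rule only reports violations, which corresponds to soft validation; a hard validation would instead reject or drop any record whose violation list is not empty before it is loaded.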