Glossary – 100 “Data Analysis” Terms

Below is a glossary of 100 common “Data Analysis” terms and phrases, listed in alphabetical order. Enjoy!

| Term | Definition & Example |
| --- | --- |
| A/B Test | Comparing two variations to measure impact. Example: Two webpage layouts. |
| Actionable Insight | An insight that leads to a clear decision. Example: Improve onboarding experience. |
| Ad Hoc Analysis | One-off analysis for a specific question. Example: Investigating a sudden sales dip. |
| Aggregation | Summarizing data using functions like sum or average. Example: Total revenue by region. |
| Analytical Maturity | An organization’s capability to use data effectively. Example: Moving from descriptive to predictive analytics. |
| Bar Chart | A chart comparing categories. Example: Sales by region. |
| Baseline | A reference point for comparison. Example: Last year’s sales used as baseline. |
| Benchmark | A standard used to compare performance. Example: Industry average churn rate. |
| Bias | Systematic error in data or analysis. Example: Surveying only active users. |
| Business Question | A decision-focused question data aims to answer. Example: Which products drive profit? |
| Causation | A relationship where one variable causes another. Example: Price cuts causing sales growth. |
| Confidence Interval | A range that is likely to contain the true value at a stated confidence level. Example: 95% CI for average sales. |
| Correlation | A statistical relationship between variables. Example: Sales and marketing spend. |
| Cumulative Total | A running total over time. Example: Year-to-date revenue. |
| Dashboard | A visual collection of key metrics. Example: Executive sales dashboard. |
| Data | Raw facts or measurements collected for analysis. Example: Sales transactions, sensor readings, survey responses. |
| Data Anomaly | Unexpected or unusual data pattern. Example: Sudden spike in user signups. |
| Data Cleaning | Correcting or removing inaccurate data. Example: Fixing misspelled country names. |
| Data Consistency | Uniform representation across datasets. Example: Same currency used everywhere. |
| Data Governance | Policies ensuring data quality, security, and usage. Example: Defined data ownership roles. |
| Data Imputation | Replacing missing values with estimated ones. Example: Filling null ages with the median. |
| Data Lineage | Tracking data origin and transformations. Example: Tracing metrics back to source systems. |
| Data Literacy | Ability to read, understand, and use data. Example: Interpreting charts correctly. |
| Data Model | The structure defining how data tables relate. Example: Star schema. |
| Data Pipeline | Automated flow of data from source to destination. Example: Daily ingestion job. |
| Data Profiling | Analyzing data characteristics. Example: Checking null percentages. |
| Data Quality | The accuracy, completeness, and reliability of data. Example: Valid dates and consistent formats. |
| Data Refresh | Updating data with the latest values. Example: Nightly refresh. |
| Data Refresh Frequency | How often data is updated. Example: Hourly vs. daily refresh. |
| Data Skewness | Degree of asymmetry in data distribution. Example: Income data skewed to the right. |
| Data Source | The origin of data. Example: SQL database, API. |
| Data Storytelling | Communicating insights using narrative and visuals. Example: Executive-ready presentation. |
| Data Transformation | Modifying data to improve usability or consistency. Example: Converting text dates to date data types. |
| Data Validation | Ensuring data meets rules and expectations. Example: No negative quantities. |
| Data Wrangling | Transforming raw data into a usable format. Example: Reshaping columns for analysis. |
| Dataset | A structured collection of related data. Example: A table of customer orders with dates, amounts, and regions. |
| Derived Metric | A metric calculated from other metrics. Example: Profit margin = Profit / Revenue. |
| Descriptive Analytics | Analysis that explains what happened. Example: Last quarter’s sales summary. |
| Diagnostic Analytics | Analysis that explains why something happened. Example: Revenue drop due to fewer customers. |
| Dice | Filtering data by multiple dimensions. Example: Sales for 2025 in the West region. |
| Dimension | A descriptive attribute used to slice data. Example: Date, region, product. |
| Dimension Table | A table containing descriptive attributes. Example: Product details. |
| Dimensionality | Number of features or variables in data. Example: High-dimensional customer data. |
| Distribution | How values are spread across a range. Example: Income distribution. |
| Drill Down | Navigating from summary to detail. Example: Yearly sales → monthly sales. |
| Drill Through | Jumping to a detailed view for a specific value. Example: Clicking a region to see store data. |
| ELT | Extract, Load, Transform approach. Example: Transforming data inside a warehouse. |
| ETL | Extract, Transform, Load process. Example: Loading CRM data into a warehouse. |
| Exploratory Data Analysis (EDA) | Initial investigation to understand data. Example: Visualizing distributions. |
| Fact Table | A table containing quantitative data. Example: Sales transactions. |
| Feature | An individual measurable property used in analysis. Example: Customer age used in churn analysis. |
| Feature Engineering | Creating new features from existing data. Example: Calculating customer tenure from signup date. |
| Filtering | Limiting data to a subset of interest. Example: Only orders from 2025. |
| Granularity | The level of detail in the data. Example: Daily sales vs. monthly sales. |
| Grouping | Organizing data into categories before aggregation. Example: Sales grouped by product category. |
| Histogram | A chart showing data distribution. Example: Frequency of order sizes. |
| Hypothesis | A testable assumption. Example: Discounts increase sales. |
| Incremental Load | Loading only new or changed data. Example: Yesterday’s transactions. |
| Insight | A meaningful finding that informs action. Example: High churn among new users. |
| KPI (Key Performance Indicator) | A critical metric tied to business objectives. Example: Monthly churn rate. |
| Kurtosis | Measure of how heavy the tails of a distribution are. Example: Detecting extreme outliers. |
| Latency | Delay between data generation and availability. Example: Real-time vs. daily data. |
| Line Chart | A chart showing trends over time. Example: Monthly revenue trend. |
| Mean | The arithmetic average. Example: Average order value. |
| Measure | A calculated numeric value, often aggregated. Example: SUM(Sales). |
| Median | The middle value in ordered data. Example: Median household income. |
| Metric | A quantifiable measure used to track performance. Example: Total sales, average order value. |
| Missing Values | Data points that are absent or null. Example: Blank customer age values. |
| Mode | The most frequent value. Example: Most common product category. |
| Multivariate Analysis | Analyzing multiple variables simultaneously. Example: Studying price, demand, and seasonality. |
| Normalization | Scaling data to a common range. Example: Normalizing values between 0 and 1. |
| Observation | A single record or row in a dataset. Example: One customer’s purchase history. |
| Outlier | A data point significantly different from others. Example: An unusually large transaction amount. |
| Percentile | Value below which a given percentage of data falls. Example: 90th percentile response time. |
| Population | The full set of interest. Example: All customers. |
| Predictive Analytics | Analysis that forecasts future outcomes. Example: Predicting next month’s demand. |
| Prescriptive Analytics | Analysis that suggests actions. Example: Recommending price changes. |
| Quartile | Values dividing ordered data into four equal parts. Example: Q1, Q2, Q3. |
| Report | A structured presentation of analysis results. Example: Monthly performance report. |
| Reproducibility | Ability to recreate analysis results consistently. Example: Using versioned datasets. |
| Rolling Average | An average calculated over a moving window. Example: 7-day rolling average of sales. |
| Root Cause Analysis | Identifying the underlying cause of an issue. Example: Revenue loss due to inventory shortages. |
| Sample | A subset of a population. Example: Survey respondents. |
| Sampling Bias | Bias introduced by non-random samples. Example: Feedback collected only from power users. |
| Scatter Plot | A chart showing relationships between two variables. Example: Ad spend vs. revenue. |
| Seasonality | Repeating patterns tied to time cycles. Example: Holiday sales spikes. |
| Semi-Structured Data | Data with flexible structure. Example: JSON files. |
| Sensitivity Analysis | Evaluating how outcomes change with inputs. Example: Impact of price changes on profit. |
| Slice | Filtering data by a single dimension. Example: Sales for 2025 only. |
| Snapshot | Data captured at a specific point in time. Example: End-of-month balances. |
| Snowflake Schema | A variant of the star schema in which dimension tables are normalized into sub-tables. Example: Product broken into sub-tables. |
| Standard Deviation | Typical distance of values from the mean, calculated as the square root of the variance. Example: Consistency of sales performance. |
| Standardization | Rescaling data to have mean 0 and standard deviation 1. Example: Preparing data for regression analysis. |
| Star Schema | A data model with facts surrounded by dimensions. Example: Sales fact with product and date dimensions. |
| Structured Data | Data with a fixed schema. Example: Relational tables. |
| Time Series | Data indexed by time. Example: Daily stock prices. |
| Trend | A general direction in data over time. Example: Increasing monthly revenue. |
| Unstructured Data | Data without a predefined schema. Example: Emails, images. |
| Variable | A characteristic or attribute that can take different values. Example: Age, revenue, product category. |
| Variance | Measure of data spread, calculated as the average squared deviation from the mean. Example: Variance in delivery times. |
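
Several of the statistical terms in the glossary (mean, median, mode, variance, standard deviation, normalization, standardization, cumulative total, rolling average) are easier to tell apart when computed side by side. The following is a minimal sketch using only Python's standard library. The `daily_sales` figures are invented for illustration, the population formulas (`pvariance`, `pstdev`) are an assumption about which variant is wanted, and a 3-day window stands in for the 7-day window mentioned in the glossary to keep the example short.

```python
from itertools import accumulate
from statistics import mean, median, mode, pstdev, pvariance

# Hypothetical daily sales figures, invented for this illustration.
daily_sales = [10, 12, 12, 14, 16, 18, 30]

# Mean, median, mode: three ways to describe a "typical" value.
avg = mean(daily_sales)          # arithmetic average -> 16
mid = median(daily_sales)        # middle value of the ordered data -> 14
most_common = mode(daily_sales)  # most frequent value -> 12

# Variance and standard deviation (population versions, by assumption).
var = pvariance(daily_sales)     # average squared deviation from the mean
std = pstdev(daily_sales)        # square root of the variance

# Normalization: min-max scaling into the range [0, 1].
lo, hi = min(daily_sales), max(daily_sales)
normalized = [(x - lo) / (hi - lo) for x in daily_sales]

# Standardization: z-scores with mean 0 and standard deviation 1.
standardized = [(x - avg) / std for x in daily_sales]

# Cumulative total: a running sum over time.
cumulative = list(accumulate(daily_sales))

# Rolling average: the mean over a moving window of the last 3 days.
window = 3
rolling = [
    mean(daily_sales[i - window + 1 : i + 1])
    for i in range(window - 1, len(daily_sales))
]

print(avg, mid, most_common)
print(cumulative)
```

Note that the single large value (30) pulls the mean above the median, which is the right-skew pattern described under Data Skewness. It also shows the difference between the two rescaling terms: normalization pins the smallest and largest values to 0 and 1, while standardization centers the data on 0 without bounding it.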

Please share your suggestions for any terms that should be added.