How to use this calculator

Guide: Descriptive Statistics

Descriptive statistics summarise and describe the main features of a dataset. Unlike inferential statistics, they make no claims beyond the data you have — they simply describe what you observe.

They answer questions like: What is the typical value? How spread out is the data? Is the distribution symmetric or skewed?

Mean: best for symmetric, roughly normal data without extreme outliers. It is sensitive to outliers (one very high salary can make the average misleading).

Median: best for skewed data or when outliers are present. It is usually the better summary for income, house prices, or reaction times, which tend to be right-skewed.

Mode: best for categorical data or when identifying the most common value. A dataset can have multiple modes.
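As a rough illustration of how the three measures react to an outlier, here is a minimal Python sketch using the standard library's statistics module. The salary figures are made up for the example and do not come from the calculator.

```python
from statistics import mean, median, multimode

# Hypothetical salaries in thousands; the last value is an extreme outlier.
salaries = [30, 32, 35, 35, 38, 40, 250]

avg = mean(salaries)         # pulled upward by the 250 (about 65.7)
mid = median(salaries)       # 35, unaffected by how large the outlier is
modes = multimode(salaries)  # [35], the most frequent value

# A dataset can have more than one mode.
two_modes = multimode([1, 1, 2, 2, 3])  # [1, 2]

print(avg, mid, modes, two_modes)
```

Here the mean lands well above six of the seven salaries, while the median stays at a value most of the group would recognise as typical.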

Variance is the average squared deviation from the mean. It is in squared units (e.g. kg²), which makes it hard to interpret directly.

Standard deviation is the square root of variance — it is in the same units as the original data (e.g. kg) and is almost always preferred for reporting.

The sample versions divide by n−1 instead of n (Bessel's correction). Deviations are measured from the sample mean rather than the true population mean, so dividing by n would tend to underestimate the population variance.
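The difference between the population and sample formulas can be sketched in Python as follows. The weights are hypothetical, and the manual calculation is included only to show where n and n−1 enter; the standard library functions are assumed to match the usual textbook definitions.

```python
from math import sqrt
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical weights in kg.
weights_kg = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(weights_kg)
m = sum(weights_kg) / n
sum_sq_dev = sum((x - m) ** 2 for x in weights_kg)  # sum of squared deviations

pop_var = sum_sq_dev / n          # population variance (divide by n), in kg²
samp_var = sum_sq_dev / (n - 1)   # sample variance (divide by n−1), in kg²
pop_sd = sqrt(pop_var)            # population standard deviation, in kg
samp_sd = sqrt(samp_var)          # sample standard deviation, in kg

# The standard library gives the same results.
lib_pop_var = pvariance(weights_kg)
lib_samp_var = variance(weights_kg)
lib_pop_sd = pstdev(weights_kg)
lib_samp_sd = stdev(weights_kg)

print(pop_var, samp_var, pop_sd, samp_sd)
```

For this data the population variance is 4.0 kg² and the sample variance is 32/7 (about 4.57 kg²); the sample figure is always the larger of the two, and the gap shrinks as n grows.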

Skewness measures asymmetry. A perfectly symmetric distribution has a skewness of 0. Positive skew means a long right tail (common in income data). Negative skew means a long left tail. Values between −0.5 and +0.5 are often considered approximately symmetric.
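One common way to compute skewness is the moment-based coefficient, the third central moment divided by the 1.5 power of the second. The sketch below assumes that definition; the calculator may use a bias-adjusted sample formula instead, so its output could differ slightly for small datasets. The function name and the example data are illustrative only.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (no small-sample adjustment)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


symmetric = skewness([1, 2, 3, 4, 5])     # 0: deviations cancel out
right_tail = skewness([1, 2, 3, 4, 20])   # positive: long right tail
left_tail = skewness([-20, -4, -3, -2, -1])  # negative: mirror image

print(symmetric, right_tail, left_tail)
```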

Kurtosis (excess kurtosis) measures the weight of the tails compared to a normal distribution. A value of 0 is normal-like. Positive values indicate heavier tails and more extreme values (leptokurtic); negative values indicate lighter tails (platykurtic).
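Excess kurtosis can be sketched in the same moment-based way: the fourth central moment divided by the squared second moment, minus 3 so that a normal distribution scores 0. This is an assumption about the formula; the calculator may apply a small-sample correction that this sketch omits. The function name and data are hypothetical.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (no small-sample adjustment)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - m) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return m4 / m2 ** 2 - 3


# Evenly spread values: lighter tails than a normal distribution (about -1.3).
flat = excess_kurtosis([1, 2, 3, 4, 5])

# Mostly identical values with two extremes: heavier tails (2.0).
heavy = excess_kurtosis([-10, 0, 0, 0, 0, 0, 0, 0, 0, 10])

print(flat, heavy)
```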

The interquartile range (IQR) = Q3 − Q1 covers the middle 50% of the data. Because it ignores the top and bottom 25%, it is resistant to outliers and is the preferred measure of spread for skewed distributions.

A common rule for outlier detection: a value is a suspected outlier if it lies more than 1.5 × IQR below Q1 or above Q3.
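The IQR and the 1.5 × IQR rule can be sketched in Python as below. Quartiles can be computed by several conventions that give slightly different answers on small datasets; this example uses the standard library's "inclusive" method, which is an assumption and not necessarily the method this calculator uses. The data is made up.

```python
from statistics import quantiles

# Hypothetical data with one obviously extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences for the 1.5 × IQR rule.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr, outliers)  # outliers is [100]
```

Values flagged this way are only suspected outliers; whether to keep, correct, or exclude them depends on what the data represents.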

Enter any real numbers separated by commas, spaces, tabs, or new lines. Up to 10 000 values are supported.
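For readers who want to pre-process data in the same shape before pasting it in, here is a minimal sketch of how input like this could be parsed. It is not the calculator's actual code; the function name parse_values and its error messages are hypothetical, and only the separators and the 10 000 limit come from the description above.

```python
import math
import re

MAX_VALUES = 10_000  # limit stated for the calculator


def parse_values(text):
    """Split on commas and whitespace (spaces, tabs, new lines) and convert to floats."""
    tokens = [t for t in re.split(r"[,\s]+", text) if t]
    if len(tokens) > MAX_VALUES:
        raise ValueError(f"too many values: {len(tokens)} (limit {MAX_VALUES})")
    values = []
    for token in tokens:
        number = float(token)  # raises ValueError for non-numeric tokens
        if not math.isfinite(number):
            raise ValueError(f"not a finite real number: {token!r}")
        values.append(number)
    return values


parsed = parse_values("1, 2\t3\n4.5")
print(parsed)  # [1.0, 2.0, 3.0, 4.5]
```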

Use sample std dev (s) when your data is a sample drawn from a larger population — this is the default in most research. Use population std dev (σ) when you have data for every member of the group.

Skewness measures the asymmetry of the distribution. A value near 0 means roughly symmetric. Positive = right-tailed; negative = left-tailed.

Excess kurtosis (Fisher's definition, used here) measures tail heaviness relative to a normal distribution. Values near 0 = normal-like; >0 = heavier tails; <0 = lighter tails.