Why does the same shape show up everywhere?
Heights. Test scores. Measurement errors. The size of snowflakes. The weight of apples in a bag. All of them produce roughly the same histogram: a symmetric, bell-shaped mound that peaks in the middle and tapers off toward the extremes.
This post is about why that happens, what the shape is actually telling you, and how to read it.
The height example
Imagine you pick a random person off the street. How tall are they?
You could try to guess, but here's the more interesting question: why is it so rare to meet someone who's 7 feet tall?
It's not just "randomly unlikely." Being very tall requires dozens of independent things to go right. You need the tall version of this gene, and that gene, and that other one. You need the right hormone levels, the right nutrition at the right ages. Each factor individually nudges you a bit taller or a bit shorter.
Being 7 feet tall means all of those factors had to push in the tall direction at once. The probability of that happening drops off very fast as you pile up more coincidences.
Being 5'10", right near average, only requires those factors to roughly balance out. There are many ways to end up near average. There are very few ways to end up at an extreme.
That's the bell curve. The more coincidences something requires, the rarer it becomes, and the falloff follows a precise mathematical shape.
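One way to make the counting argument concrete is a toy model. This is an assumption for illustration, not a model of real genetics: suppose height is decided by 20 independent factors, each of which pushes "taller" or "shorter" with equal probability. The number of ways to get exactly k tall pushes is then the binomial coefficient C(20, k).

```python
from math import comb

N_FACTORS = 20  # hypothetical number of independent factors


def ways(k: int, n: int = N_FACTORS) -> int:
    """Number of distinct ways exactly k of n factors can push 'taller'."""
    return comb(n, k)


def probability(k: int, n: int = N_FACTORS) -> float:
    """Probability of exactly k 'taller' pushes when each push is a fair coin flip."""
    return comb(n, k) / 2 ** n


for k in (10, 15, 20):
    print(f"{k:2d} of {N_FACTORS} tall pushes: {ways(k):6d} ways, p = {probability(k):.7f}")
```

In this toy model there are 184,756 ways to land exactly in the middle and only one way to land at the top extreme.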
Two numbers control everything
The bell curve has a specific formula, but before we look at it, we need to understand what it takes as input.
Just two things:
- μ (mu), the mean. This is where the peak sits. If the average height is 5'9", the peak of the bell is at 5'9". Move the average, and the whole bell shifts.
- σ (sigma), the standard deviation. This measures the spread. A small σ means people cluster tightly around the average. A large σ means heights are all over the place.
And here's the key constraint: the bell curve is a probability distribution. The total area under it must equal exactly 1, because all probabilities add up to 100%.
This means that if you stretch the bell wider (bigger σ), it automatically gets shorter. And if you squeeze it narrower, it gets taller. The shape adjusts to keep the area constant.
Wider bell = shorter bell — the total area is always 1. μ shifts the center; σ controls the spread.
Drag the σ slider all the way to the left. The bell becomes a tall, narrow spike. Drag it right, and it flattens into a wide, shallow mound. The peak height in the stat box updates live, and you'll notice it scales exactly as $1/\sigma$.
Drag μ and the whole bell slides along the axis. The shape doesn't change, just the position.
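The trade-off between width and height can also be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist` (the language and the example values of σ are my choices, not the page's) to print the peak height for a few spreads. Multiplying the peak by σ gives the same constant every time, and moving μ leaves the peak height unchanged.

```python
from statistics import NormalDist


def peak_height(mu: float, sigma: float) -> float:
    """Height of the bell at its center, x = mu."""
    return NormalDist(mu, sigma).pdf(mu)


for sigma in (0.5, 1.0, 2.0, 4.0):
    peak = peak_height(0.0, sigma)
    print(f"sigma = {sigma:3.1f}  peak = {peak:.4f}  peak * sigma = {peak * sigma:.4f}")

# Shifting mu moves the bell without changing its height.
print(abs(peak_height(0.0, 2.0) - peak_height(10.0, 2.0)) < 1e-12)
```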
The formula
Once you have the intuition, the formula is just a translation of what we already know.
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$
Let's read it piece by piece.
The right side has two parts. First, $\frac{1}{\sigma\sqrt{2\pi}}$: that's just the normalization constant. It's the factor that makes the total area equal 1. It is exactly proportional to $1/\sigma$, which confirms what we just saw: double the spread, halve the peak height.
The interesting part is the exponent: $-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2$.
Work from the inside out. $x - \mu$ is the distance from the mean, how far you are from average. Dividing by $\sigma$ scales it relative to the typical spread. Squaring it makes it never negative and symmetric: being 2 units above average looks the same as being 2 units below. The $-\frac{1}{2}$ makes the exponent negative, so $e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$ is always between 0 and 1 (reaching 1 only at the mean), and it gets smaller the further you go from the mean.
The formula says: height at x depends only on how far x is from μ, measured in units of σ. That distance-from-mean idea has a name, and it's the most useful tool in statistics.
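To see the pieces working together, here is a minimal sketch that implements the standard normal density directly and checks two of the claims above: the curve is symmetric around μ, and the area under it is 1. The function names, the example parameters, and the trapezoid-rule integration are illustrative choices, not anything taken from the page's own code.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma  # distance from the mean, in units of sigma
    normalization = 1.0 / (sigma * math.sqrt(2 * math.pi))
    return normalization * math.exp(-0.5 * z * z)


def area_under_curve(mu: float, sigma: float, steps: int = 20_000) -> float:
    """Trapezoid-rule estimate of the area between mu - 8*sigma and mu + 8*sigma."""
    lo, hi = mu - 8 * sigma, mu + 8 * sigma
    h = (hi - lo) / steps
    total = 0.5 * (normal_pdf(lo, mu, sigma) + normal_pdf(hi, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lo + i * h, mu, sigma)
    return total * h


mu, sigma = 69.0, 3.0  # hypothetical example: heights in inches

# Symmetry: two units below the mean has the same density as two units above.
print(math.isclose(normal_pdf(mu - 2, mu, sigma), normal_pdf(mu + 2, mu, sigma)))

# Area: should be 1 up to numerical error, whatever mu and sigma are.
print(round(area_under_curve(mu, sigma), 6))
```

The integration stops at 8σ on each side because the density beyond that point is far too small to affect the result at this precision.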
The 68–95–99.7 rule
Because the bell curve has a fixed shape, we can nail down exactly what fraction of the area falls within 1, 2, and 3 standard deviations of the mean, and these numbers turn out to be remarkably clean.
Watch how much area lands within each standard deviation band — these percentages hold for every normal distribution.
Press Play and watch the bands fill in.
Within 1σ of the mean: 68.27% of the total area. About 2 out of 3 people.
Within 2σ: 95.45%. If you're within 2 standard deviations of the mean, you're in the vast majority.
Within 3σ: 99.73%. Nearly everyone. Only 0.27% of people fall outside this range.
This is sometimes called the empirical rule, and it's useful because it holds for any normal distribution. If you know the mean and standard deviation, you immediately know where most of the data will be, no matter what you're measuring.
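These percentages can be reproduced from the error function. For a normal distribution, the fraction of area within k standard deviations of the mean is erf(k/√2), a standard identity. The sketch below evaluates it with Python's `math.erf`; the function name is mine.

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sigma: {fraction_within(k) * 100:.2f}%")
```

Neither μ nor σ appears anywhere in the function, which is the point: the answer depends only on how many standard deviations you go out.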
Translating: the z-score
Here's a problem. Suppose two students compare their test scores. Alice scored 82 on a history test. Bob scored 91 on a physics test. Who did better?
You can't compare directly: the two classes might have completely different difficulty levels. Maybe history scores had a mean of 80 and a standard deviation of 10, while physics had a mean of 85 and a standard deviation of 20.
To compare fairly, we need to ask: how far is each score from its own mean, measured in standard deviation units?
That's the z-score:
$$z = \frac{x - \mu}{\sigma}$$
Alice: $z = \frac{82 - 80}{10} = 0.2$, just slightly above average.
Bob: $z = \frac{91 - 85}{20} = 0.3$, also slightly above average, but by a bit more.
The z-score collapses any normal distribution into the same standard scale. A z-score of 0 is exactly average. A z-score of +2 means you're in the top 2.3%. A z-score of −1 means you're exactly one standard deviation below average.
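The comparison is easy to script. The sketch below computes both z-scores from the numbers in the example and converts each to a percentile with the normal CDF, written here in terms of `math.erf`. It assumes both classes' scores really are normally distributed, which the example implies but real classes may not satisfy. The function names are mine.

```python
import math


def z_score(x: float, mean: float, sd: float) -> float:
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def percentile(z: float) -> float:
    """Fraction of a normal distribution that falls below the given z-score."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))


alice = z_score(82, mean=80, sd=10)  # history test
bob = z_score(91, mean=85, sd=20)    # physics test

print(f"Alice: z = {alice:+.2f}, percentile = {percentile(alice):.1%}")
print(f"Bob:   z = {bob:+.2f}, percentile = {percentile(bob):.1%}")
```

Bob's z-score of 0.3 edges out Alice's 0.2, which puts him at roughly the 62nd percentile of his class against her 58th.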
Drag the handle to any score — the z-score tells you how many standard deviations you are from the mean, and the shaded area is your percentile.
Drag the handle across the distribution. Notice how the z-score updates continuously, and the shaded area to the left is your percentile: the fraction of scores you outperform.
Drag to the exact mean (score 70) and you'll see z = 0 and percentile = 50%. Drag to 100 and you're at z = +2.00, the 97.7th percentile. Push to the extremes and watch the percentile approach 0% or 100%, but never quite reach them. The tails are infinite, just vanishingly small.
Why the bell appears everywhere
We've established that heights form a bell. But why do so many other things (test scores, measurement errors, the weight of production parts, the height of ocean waves) all follow the same shape?
Here's the reason.
Whenever a random quantity is the sum of many independent, small contributions, the sum's distribution approaches a bell shape, regardless of what distribution the individual contributions follow (as long as each has finite variance).
That's the Central Limit Theorem, one of the most remarkable facts in all of mathematics.
Measuring a rod with a ruler? Your measurement error is the sum of dozens of small, independent inaccuracies: hand position, angle of view, temperature effects, imperfections in the ruler. Sum enough small independent errors and you get a bell.
A test score? The sum of performance on many individual questions, each influenced by independent knowledge, attention, sleep, and luck. Bell.
Even dice: roll one die, you get a uniform distribution. Roll two dice and average them, you get a triangular shape. Roll ten dice and average them, it starts looking remarkably like a bell.
No matter which source distribution you choose, the histogram of sample means always converges to the same bell shape.
Pick any source distribution: Uniform (perfectly flat), Exponential (heavily right-skewed), or Bimodal (two humps, nothing like a bell). Hit Auto-play and watch what happens to the histogram of sample means.
The dashed curve is the bell shape the Central Limit Theorem predicts. The histogram chases it as you add more samples. By the time you have a few hundred, they match almost exactly, regardless of which source distribution you chose.
This is why the normal distribution is everywhere. It's not a coincidence. It's what happens when many small independent effects add up. Which is what most measured quantities in the world actually are.
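You can run the same experiment outside the browser. The sketch below is not the page's simulator, just a minimal stand-in using Python's standard library: it draws samples of n = 30 from a heavily skewed exponential source, averages each sample, and checks that the averages behave like a bell. The number of trials, the seed, and the function names are arbitrary choices of mine.

```python
import random
import statistics


def draw_exponential(rng: random.Random) -> float:
    """One draw from an exponential distribution with mean 1 and standard deviation 1."""
    return rng.expovariate(1.0)


def sample_means(draw, n: int, trials: int, rng: random.Random) -> list:
    """Return `trials` sample means, each the average of n independent draws."""
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 30
means = sample_means(draw_exponential, n, trials=20_000, rng=rng)

center = statistics.fmean(means)
spread = statistics.stdev(means)
within_one = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"mean of sample means: {center:.3f}  (source mean is 1)")
print(f"sd of sample means:   {spread:.3f}  (theory: 1/sqrt(30), about 0.183)")
print(f"fraction within 1 sd: {within_one:.3f}  (a perfect bell gives about 0.683)")
```

The source distribution looks nothing like a bell, yet the fraction of sample means within one standard deviation should land close to the 68% the empirical rule predicts. Swapping `draw_exponential` for another source is a one-line change.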
The short version
The normal distribution is a bell-shaped curve controlled by two parameters: μ (where the peak is) and σ (how spread out the bell is). A wider bell is a shorter bell, because the total area is always 1.
About 68% of values fall within 1σ of the mean. About 95% within 2σ. Nearly all within 3σ.
The z-score converts any value into a standard unit: how many standard deviations from the mean. It lets you compare values from completely different distributions on the same scale.
And the reason the bell appears everywhere? The Central Limit Theorem: when you add together many independent random quantities, the sum converges to a bell shape no matter what distribution each piece follows. Most measurements in the world are exactly that: sums of many small independent effects. So they form bells.
The formula is just a description. The bell is what randomness looks like when it comes from many sources at once.
All visualizations on this page compute distributions numerically in the browser. The normal CDF uses the Abramowitz & Stegun rational approximation (error < 7.5 × 10⁻⁸). The CLT simulator uses samples of n = 30 from each source distribution.
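For readers curious about that approximation, here is a sketch of what it plausibly looks like. The quoted error bound matches formula 26.2.17 in Abramowitz & Stegun, so that is the formula assumed here. The page's actual source code isn't shown, and the function names are mine.

```python
import math

# Constants of Abramowitz & Stegun formula 26.2.17
# (assumed to be the approximation the page refers to).
P = 0.2316419
B = (0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429)


def normal_cdf_approx(x: float) -> float:
    """Approximate standard normal CDF; the published error bound is 7.5e-8."""
    if x < 0:
        return 1.0 - normal_cdf_approx(-x)  # use symmetry for negative inputs
    t = 1.0 / (1.0 + P * x)
    poly = 0.0
    for b in reversed(B):  # Horner's rule for b1*t + b2*t^2 + ... + b5*t^5
        poly = (poly + b) * t
    density = math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)
    return 1.0 - density * poly


def normal_cdf_exact(x: float) -> float:
    """Reference value computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))


worst = max(abs(normal_cdf_approx(i / 10) - normal_cdf_exact(i / 10))
            for i in range(-40, 41))
print(f"largest error on [-4, 4]: {worst:.1e}")
```

To get a percentile for a raw score, convert it to a z-score first and pass that to `normal_cdf_approx`.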