Probability & Statistics

Probability distributions, Bayes' theorem, hypothesis testing, and regression - the science of uncertainty.


Normal Distribution Z-Score Calculator

Formula
z = (x − μ) / σ
x = value, μ = mean, σ = standard deviation, z = z-score
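The z-score formula can be sketched in a few lines of Python. The function name `z_score` and the example numbers (a score of 85 against a mean of 70 and a standard deviation of 10) are illustrative assumptions, not values from this page.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical example: a score of 85 when the mean is 70 and the std deviation is 10.
print(z_score(85, 70, 10))  # 1.5
```

A positive result means the value lies above the mean, and a negative result means it lies below.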

1 Probability Fundamentals

Probability measures the likelihood of an event: P(A) ranges from 0 (impossible) to 1 (certain). Core rules govern how probabilities combine.

P(A or B) = P(A) + P(B) − P(A and B)
P(A and B) = P(A) × P(B|A)
Independent: P(A and B) = P(A) × P(B)
Complement: P(not A) = 1 − P(A)
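One way to see these rules at work is to enumerate a small sample space and check each identity exactly. The sketch below assumes a single roll of a fair six-sided die, with A = "roll is even" and B = "roll is greater than 3". The die, the events, and the helper `prob` are illustrative choices, not part of the original material.

```python
from fractions import Fraction

SAMPLE_SPACE = frozenset(range(1, 7))  # one roll of a fair six-sided die


def prob(event, given=SAMPLE_SPACE):
    """P(event | given) for equally likely outcomes; given defaults to the whole space."""
    return Fraction(len(event & given), len(given))


A = frozenset({2, 4, 6})  # roll is even
B = frozenset({4, 5, 6})  # roll is greater than 3

p_a, p_b = prob(A), prob(B)
p_a_and_b = prob(A & B)
p_a_or_b = prob(A | B)
p_b_given_a = prob(B, given=A)

assert p_a_or_b == p_a + p_b - p_a_and_b   # addition rule
assert p_a_and_b == p_a * p_b_given_a      # multiplication rule
assert prob(SAMPLE_SPACE - A) == 1 - p_a   # complement rule

# A and B are not independent here, so P(A and B) differs from P(A) x P(B).
print(p_a_or_b, p_a_and_b, p_a_and_b == p_a * p_b)  # 2/3 1/3 False
```

Using `Fraction` keeps the arithmetic exact, so the identities can be checked with `==` rather than a floating-point tolerance.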
Worked Example 1

Combined Probability - Medical Test

Problem: Two independent tests for a disease each have 95% sensitivity. If both are used, what is the probability of detecting the disease (at least one positive)?
Complement method
P(both miss) = 0.05 × 0.05 = 0.0025
P(at least one detects) = 1 − 0.0025 = 0.9975 = 99.75%
Answer: 99.75%. Counting a case as detected when at least one of two independent 95%-sensitive tests is positive gives a combined sensitivity of 99.75%. This is one reason critical diagnoses often use multiple independent tests.
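The complement method generalizes to any number of tests. The following is a minimal sketch, assuming the tests miss independently of one another; the function name `combined_sensitivity` is a hypothetical helper.

```python
def combined_sensitivity(sensitivities):
    """Probability that at least one of several independent tests detects the disease."""
    p_all_miss = 1.0
    for s in sensitivities:
        p_all_miss *= 1.0 - s  # each test misses with probability 1 - sensitivity
    return 1.0 - p_all_miss


print(round(combined_sensitivity([0.95, 0.95]), 4))  # 0.9975
```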

2 Bayes' Theorem

Bayes' theorem updates our beliefs when new evidence arrives. It answers: "Given that we observed B, what is the probability of A?"

P(A|B) = P(B|A) × P(A) / P(B)
P(A|B) = posterior (updated belief)
P(A) = prior (initial belief)
P(B|A) = likelihood (how likely the evidence is if A is true)
Worked Example 2

Bayes' Theorem - False Positives

Problem: A disease affects 1% of the population. A test is 99% sensitive (positive when disease present) and 95% specific (negative when disease absent). If you test positive, what is the actual probability you have the disease?
Apply Bayes
P(D) = 0.01, P(+|D) = 0.99, P(+|no D) = 0.05
P(+) = P(+|D)P(D) + P(+|no D)P(no D)
= 0.99(0.01) + 0.05(0.99) = 0.0099 + 0.0495 = 0.0594
P(D|+) = 0.0099 / 0.0594
P(D|+) = 0.167 = 16.7%
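Answer: Only 16.7%. Even though the test is 99% sensitive, most positives are false when the disease is rare: in a population of 10,000, the 100 people with the disease produce 99 true positives, while the 9,900 without it produce 495 false positives. Overlooking the low prevalence (1%) when interpreting a positive result is called the base rate fallacy.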
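The calculation in this example can be sketched as a small function that expands P(+) with the law of total probability. The name `posterior_given_positive` and its parameter names are illustrative assumptions.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    true_positive = sensitivity * prevalence                   # P(+|D) * P(D)
    false_positive = (1.0 - specificity) * (1.0 - prevalence)  # P(+|no D) * P(no D)
    return true_positive / (true_positive + false_positive)


print(round(posterior_given_positive(0.01, 0.99, 0.95), 3))  # 0.167
```

Trying other prevalence values with the same test shows how strongly the posterior depends on the base rate.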

3 The Normal Distribution

The bell curve is the most widely used distribution in statistics. Many natural phenomena follow it at least approximately, such as heights, test scores, and measurement errors. It is fully characterized by its mean (μ) and standard deviation (σ).

68-95-99.7 Rule
68% of data within μ ± 1σ
95% within μ ± 2σ
99.7% within μ ± 3σ
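The percentages in the rule are rounded values of the normal cumulative distribution function. As a quick check, the sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `coverage` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def coverage(k: float) -> float:
    """Probability that a normal variable falls within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(coverage(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```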

4 Hypothesis Testing

A structured framework for making decisions from data. You start with a null hypothesis (H₀, usually "no effect"), collect data, and determine whether the evidence is strong enough to reject H₀.

H₀: Null hypothesis (no effect/difference)
H₁: Alternative hypothesis (there IS an effect)
p-value: Probability of data at least as extreme as what was observed, if H₀ is true
α = 0.05: Common significance level (reject H₀ if p < α)
Common mistake: A p-value of 0.03 does NOT mean "3% chance H₀ is true." It means: "If H₀ were true, there's a 3% chance of seeing data this extreme." The distinction matters!
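To make the p-value concrete, here is a minimal sketch of one specific test: a two-sided z-test for a mean with known σ. The choice of test, the function name `two_sided_z_test`, and the numbers (sample mean 103, null mean 100, σ = 15, n = 100) are all assumptions for illustration and do not come from the text above.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, null_mean, sigma, n):
    """Return (z, p_value) for H0: population mean == null_mean, with known sigma."""
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    # Probability, under H0, of a test statistic at least this far from zero.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p = two_sided_z_test(sample_mean=103, null_mean=100, sigma=15, n=100)
print(round(z, 2), round(p, 4))  # 2.0 0.0455
print(p < 0.05)                  # True, so H0 is rejected at alpha = 0.05
```

The p-value here is computed assuming H₀ is true; it is not the probability that H₀ is true.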

5 Confidence Intervals

A confidence interval gives a range of plausible values for a parameter. A 95% CI means: if we repeated the study many times and built an interval each time, about 95% of those intervals would contain the true parameter.

CI = x̄ ± z* × (σ/√n)
x̄ = sample mean, σ = std deviation
n = sample size, z* = critical value
95% CI: z* = 1.96, 99% CI: z* = 2.576
Worked Example 3

Confidence Interval - Survey

Problem: A survey of 400 students finds average study time = 4.2 hours/day with σ = 1.5 hours. Construct a 95% confidence interval for the population mean.
95% CI
Margin = 1.96 × (1.5/√400) = 1.96 × 0.075 = 0.147
CI = 4.2 ± 0.147 = [4.053, 4.347] hours
Answer: We are 95% confident the true average is between 4.05 and 4.35 hours/day. Quadrupling the sample size to 1600 would halve the margin to ±0.074, because the margin shrinks in proportion to 1/√n.
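The survey calculation can be reproduced with the standard library. In this sketch the critical value z* comes from `NormalDist().inv_cdf` rather than being hard-coded as 1.96; the function name `z_confidence_interval` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean, sigma, n, confidence=0.95):
    """Return (low, high) for the z-based interval mean ± z* × sigma/√n."""
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    margin = z_star * sigma / sqrt(n)
    return mean - margin, mean + margin


low, high = z_confidence_interval(mean=4.2, sigma=1.5, n=400)
print(round(low, 3), round(high, 3))  # 4.053 4.347
```

Calling the same function with `n=1600` gives a margin roughly half as wide.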

6 Linear Regression

Regression finds the line of best fit through data points, allowing prediction and quantifying relationships between variables.

ŷ = a + bx, where b = Σ(xᵢ−x̄)(yᵢ−ȳ) / Σ(xᵢ−x̄)²
b = slope, a = ȳ − bx̄ (intercept)
R² = fraction of variance explained (0 to 1)
R² = 0.85 means the model explains 85% of the variation
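The slope, intercept, and R² formulas translate directly into code. The following is a minimal sketch in plain Python; the function name `fit_line` and the five data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = s_xy / s_xx        # slope
    a = y_bar - b * x_bar  # intercept
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
    ss_tot = sum((y - y_bar) ** 2 for y in ys)                    # total variation
    return a, b, 1.0 - ss_res / ss_tot


# Hypothetical data points.
a, b, r2 = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(a, 2), round(b, 2), round(r2, 2))  # 2.2 0.6 0.6
```

Here R² is computed as 1 − SS_res/SS_tot, which for a simple least-squares line equals the fraction of variance explained.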
Figure: Normal (Gaussian) distribution centered at μ, with 68% of the area within μ ± σ and 95% within μ ± 2σ.

Bayes' Theorem Calculator

Formula
P(A|B) = P(B|A)·P(A) / P(B)
P(A) - Prior probability
P(B|A) - Likelihood
P(B) - Marginal probability
P(A|B) - Posterior probability

Binomial Probability Calculator

Formula
P(X=k) = C(n,k) · pᵏ · (1−p)ⁿ⁻ᵏ
n - Number of trials
k - Number of successes
p - Success probability on each trial
P(X=k) - Probability of exactly k successes
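The binomial formula can be evaluated with `math.comb` from the standard library (Python 3.8 or later). This is a minimal sketch; the function name `binomial_pmf` and the coin-flip example are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k): probability of exactly k successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


# Hypothetical example: exactly 3 heads in 10 flips of a fair coin.
print(round(binomial_pmf(3, 10, 0.5), 4))  # 0.1172
```

Summing `binomial_pmf(k, n, p)` over k from 0 to n should give 1, which is a useful sanity check.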