Statistics Complete Cheatsheet
Descriptive stats, probability, distributions, hypothesis testing, and regression — complete statistics reference.
01 Descriptive Statistics
Mean
x-bar=sum(x)/n
Median
Middle value when sorted
Mode
Most frequent
IQR
Q3-Q1
Resistant to outliers.
Variance
s^2=sum((x - x-bar)^2)/(n-1)
Sample variance.
Std dev
s=sqrt(variance)
68-95-99.7 rule: in a normal distribution, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 std devs of the mean.
Symbol | Meaning
x-bar | Sample mean
mu | Population mean
s | Sample std dev
sigma | Population std dev
n | Sample size
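The measures above can be checked on a small sample. This is a minimal sketch using only Python's standard `statistics` module; the `data` list is a made-up sample, and the quartiles come from the module's default method, which may differ slightly from other textbook conventions.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)          # x-bar = sum(x)/n
median = statistics.median(data)      # middle value when sorted
mode = statistics.mode(data)          # most frequent value

# Quartiles: quantiles(n=4) returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                         # IQR = Q3 - Q1

variance = statistics.variance(data)  # sample variance, divides by n-1
std_dev = statistics.stdev(data)      # s = sqrt(variance)

print(mean, median, mode, iqr, variance, std_dev)
```

For population values (mu, sigma) the same module offers `pvariance` and `pstdev`, which divide by n instead of n-1.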
02 Probability
P(A)
Always between 0 and 1. P(certain)=1, P(impossible)=0
Complement
P(A')=1-P(A)
Addition
P(A ∪ B)=P(A)+P(B)-P(A ∩ B)
Multiplication
P(A ∩ B)=P(A|B)*P(B)
Independent
P(A ∩ B)=P(A)*P(B)
Holds only when A and B are independent.
Conditional
P(A|B)=P(A ∩ B)/P(B)
Defined when P(B)>0.
Probability examples
# Deck of 52 cards
P(Ace)=4/52=1/13
P(Red or Ace)=26/52+4/52-2/52=28/52=7/13

# Independent (two fair coin flips)
P(Head AND Head)=0.5*0.5=0.25

# Conditional (two cards drawn without replacement)
P(2nd Ace | 1st was Ace)=3/51=1/17
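The card examples can be reproduced by counting outcomes directly. This sketch builds a 52-card deck and uses exact fractions; the helper names (`prob`, `is_ace`, `is_red`) are illustrative, not from any library.

```python
from fractions import Fraction
from itertools import permutations, product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs


def is_ace(card):
    return card[0] == "A"


def is_red(card):
    return card[1] in ("hearts", "diamonds")


def prob(event):
    """Probability of an event for one card drawn at random."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


p_ace = prob(is_ace)
p_red_or_ace = prob(lambda c: is_red(c) or is_ace(c))

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_addition = prob(is_red) + prob(is_ace) - prob(lambda c: is_red(c) and is_ace(c))

# Conditional: P(2nd Ace | 1st Ace) = P(both Aces) / P(1st Ace),
# counted over all ordered two-card draws without replacement.
draws = list(permutations(deck, 2))
both_aces = sum(1 for first, second in draws if is_ace(first) and is_ace(second))
first_ace = sum(1 for first, _ in draws if is_ace(first))
p_second_ace_given_first = Fraction(both_aces, first_ace)

print(p_ace, p_red_or_ace, p_second_ace_given_first)
```

Using `Fraction` keeps the results exact, so they can be compared directly with the hand calculations above.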
03 Distributions
Distribution | Mean | Variance | Use when
Binomial(n,p) | np | np(1-p) | Fixed n independent trials, constant p
Poisson(lambda) | lambda | lambda | Count of rare events in a fixed time interval
Normal(mu,sigma^2) | mu | sigma^2 | Continuous, bell-shaped
Uniform(a,b) | (a+b)/2 | (b-a)^2/12 | Continuous, all values in [a,b] equally likely
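Two rows of the table can be checked numerically. The sketch below sums the Poisson probabilities to recover the mean and variance, and uses `statistics.NormalDist` to recover the 68-95-99.7 figures; `lam = 3.0` and the cutoff of 60 terms are arbitrary choices for illustration.

```python
import math
from statistics import NormalDist

# Poisson(lambda): mean and variance should both equal lambda.
lam = 3.0        # hypothetical rate
ks = range(60)   # the tail beyond k=60 is negligible for lambda=3
pmf = [math.exp(-lam) * lam**k / math.factorial(k) for k in ks]

poisson_mean = sum(k * p for k, p in zip(ks, pmf))
poisson_var = sum((k - poisson_mean) ** 2 * p for k, p in zip(ks, pmf))

# Normal: share of values within 1, 2, and 3 standard deviations of the mean.
standard_normal = NormalDist(mu=0, sigma=1)
within = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

print(poisson_mean, poisson_var, within)
```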
Binomial
P(X=k)=C(n,k)*p^k*(1-p)^(n-k)

Example: 10 flips, p=0.5, P(X=6)
=C(10,6)*0.5^6*0.5^4
=210*0.015625*0.0625≈0.205
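The same calculation as a small function, for readers who want to try other values of n, p, and k. This is a sketch using `math.comb` from the standard library; the function name `binomial_pmf` is chosen here for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """P(X=k) = C(n,k) * p^k * (1-p)^(n-k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# The worked example: 10 flips, p=0.5, exactly 6 heads.
p_six_heads = binomial_pmf(6, 10, 0.5)

# Sanity checks: probabilities sum to 1 and the mean equals n*p.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
mean = sum(k * binomial_pmf(k, 10, 0.5) for k in range(11))

print(round(p_six_heads, 3), total, mean)
```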
04 Hypothesis Testing
H0
Null hypothesis: no effect or no difference. We look for evidence against it.
H1
Alternative hypothesis: our claim.
p-value
Probability of a result at least as extreme as the one observed, assuming H0 is true. If p < alpha, reject H0.
alpha
Significance level, usually 0.05.
Type I error
Reject H0 when true (false positive). Rate=alpha.
Type II error
Fail to reject H0 when false (false negative). Rate=beta.
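The claim that the Type I error rate equals alpha can be illustrated by simulation. The sketch below repeatedly draws samples for which H0 is actually true and counts how often a two-sided z-test rejects it. The setup (known sigma, n=30, 2000 trials, fixed seed) is assumed for illustration only.

```python
import random
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

alpha = 0.05
mu0, sigma, n = 0.0, 1.0, 30                  # H0: mu = mu0, sigma assumed known
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value


def rejects_h0(sample):
    """Two-sided z-test: reject when |z| exceeds the critical value."""
    z = (mean(sample) - mu0) / (sigma / n**0.5)
    return abs(z) > z_crit


trials = 2000
# Every sample is drawn with H0 true, so each rejection is a false positive.
false_positives = sum(
    rejects_h0([random.gauss(mu0, sigma) for _ in range(n)])
    for _ in range(trials)
)
type_i_rate = false_positives / trials

print(type_i_rate)
```

The observed rate should land near alpha, with some run-to-run noise that shrinks as `trials` grows.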
Test steps
1. State H0 and H1
2. Set alpha (0.05)
3. Choose a test statistic (here, the one-sample t statistic)
4. Calculate: t=(x-bar - mu0)/(s/sqrt(n))
5. Find p-value
6. If p < alpha, reject H0; otherwise fail to reject H0
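The steps can be walked through for a one-sample t-test. This sketch uses only the standard library, so instead of computing an exact p-value it compares |t| with the two-sided critical value for df=9 from a standard t-table (2.262), which is equivalent to checking p < alpha. The sample values and mu0 are made up for illustration.

```python
import math
import statistics

# Steps 1-2: H0: mu = 50, H1: mu != 50, alpha = 0.05
sample = [52, 55, 49, 53, 51, 54, 50, 56, 52, 53]  # hypothetical measurements
mu0 = 50
alpha = 0.05

# Steps 3-4: one-sample t statistic, t = (x-bar - mu0) / (s / sqrt(n))
n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample std dev (n-1 in the denominator)
t = (x_bar - mu0) / (s / math.sqrt(n))

# Steps 5-6: two-sided test with df = n-1 = 9.
# 2.262 is the tabulated critical value for alpha = 0.05, df = 9.
t_critical = 2.262
reject_h0 = abs(t) > t_critical

print(round(t, 3), reject_h0)
```

With a statistics package available, the exact p-value would come from the t distribution's tail probability rather than from a table lookup.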
❓ Quiz
What does p < 0.05 mean in hypothesis testing?
With alpha = 0.05, a p-value below alpha means the result is statistically significant: reject the null hypothesis H0.
05 Regression
Pearson r
-1 to 1
Rule of thumb: |r|>0.7 is strong, |r|<0.3 is weak.
R-squared
0 to 1
Proportion of the variance in y explained by the model.
Regression line
y-hat=a+bx
b=slope, a=intercept.
Slope
b=r*(sy/sx)
Intercept
a=y-bar - b*x-bar
The line always passes through the point (x-bar, y-bar).
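The regression formulas can be checked on a small data set. This is a minimal sketch with hand-rolled sums so each formula is visible; the `x` and `y` values are invented, and `predict` is a helper name chosen here.

```python
import math

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squared deviations and cross-products.
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

s_x = math.sqrt(sxx / (n - 1))  # sample std dev of x
s_y = math.sqrt(syy / (n - 1))  # sample std dev of y

r = sxy / math.sqrt(sxx * syy)  # Pearson r
b = r * (s_y / s_x)             # slope: b = r*(sy/sx)
a = y_bar - b * x_bar           # intercept: a = y-bar - b*x-bar
r_squared = r**2                # for simple linear regression, R-squared = r^2


def predict(x_new):
    """y-hat = a + b*x"""
    return a + b * x_new


print(round(r, 4), round(b, 4), round(a, 4), round(r_squared, 4))
```

Evaluating `predict(x_bar)` returns `y_bar`, which confirms that the fitted line passes through the point of means.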
⚠️
Correlation does NOT imply causation! Even r=1 does not prove one variable causes the other.