Central Tendency
- Mean
- Median
- Mode
Range
- Max
- Min
- Quantiles
- Outlier
Dispersion
- Variance
- Standard Deviation
Skew

Mean
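- Population mean: $\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$
- Sample mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
- Weighted mean: $\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$
- Trimmed mean: the mean after removing extreme values (outliers) from both ends of the sorted data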
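As a rough illustration of the mean variants listed above, here is a minimal Python sketch using only the standard library. The helper names `weighted_mean` and `trimmed_mean`, the `proportion` parameter, and the `salaries` values are hypothetical and chosen for illustration; they are not part of these notes.

```python
from statistics import fmean


def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # values cut from each tail
    return fmean(ordered[k:len(ordered) - k])


# Hypothetical data: one large value (110) pulls the plain mean upward.
salaries = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]

plain = fmean(salaries)                 # arithmetic mean of all 12 values
trimmed = trimmed_mean(salaries, 0.1)   # drops 30 and 110 before averaging
weighted = weighted_mean([1, 2, 3], [1, 1, 2])
```

The trimmed mean comes out lower than the plain mean here because the single extreme value no longer contributes.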
Median
Suitable for skewed data
Expensive to compute for large datasets; a common workaround is to approximate it from grouped (binned) data
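One common way to approximate the median from grouped data is linear interpolation inside the interval that contains the middle observation: $\text{median} \approx L + \frac{N/2 - F}{f} \times w$, where $L$ is the lower bound of that interval, $F$ the total frequency of all lower intervals, $f$ its own frequency, and $w$ its width. The sketch below assumes that formula; the function name `grouped_median` and the example intervals are hypothetical.

```python
def grouped_median(intervals):
    """Approximate the median from (lower, upper, frequency) intervals.

    Assumes the intervals are sorted by lower bound and do not overlap.
    """
    n = sum(freq for _, _, freq in intervals)
    if n == 0:
        raise ValueError("no observations")
    half = n / 2
    cumulative = 0  # total frequency of the intervals below the current one
    for lower, upper, freq in intervals:
        if freq > 0 and cumulative + freq >= half:
            # Interpolate linearly inside the median interval.
            return lower + (half - cumulative) / freq * (upper - lower)
        cumulative += freq


# Hypothetical grouped data: 20 observations in three equal-width intervals.
approx = grouped_median([(0, 10, 5), (10, 20, 10), (20, 30, 5)])
```

Only the interval counts are needed, so the full dataset never has to be sorted or held in memory.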

Mode
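Most frequently occurring value in the dataset
Unimodal, moderately skewed data: $\text{mean} - \text{mode} \approx 3 \times (\text{mean} - \text{median})$
Under a normal distribution: $\text{mean} = \text{median} = \text{mode}$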
Multimodal
- Bimodal: two peaks, often the result of combining two different processes, e.g. body mass of males and females
- Trimodal: three peaks
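For discrete data, the modes can be found by counting exact repeats. The sketch below uses `statistics.multimode` from the Python standard library; the wrapper `describe_modality` and its labels are hypothetical. It does not detect peaks in continuous data, which would first need binning or density estimation.

```python
from statistics import multimode


def describe_modality(values):
    """Return the most frequent value(s) and a label for how many there are."""
    if not values:
        raise ValueError("no observations")
    modes = multimode(values)  # every value tied for the highest count
    labels = {1: "unimodal", 2: "bimodal", 3: "trimodal"}
    return modes, labels.get(len(modes), "multimodal")


one_peak = describe_modality([1, 1, 2])
two_peaks = describe_modality([1, 2, 2, 3, 3, 4])
```

Note that if every value occurs exactly once, all of them tie for the highest count, so the label is only meaningful when some values actually repeat.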
Dispersion
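Variance ($\sigma^2$ | $s^2$)
Population: $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$
Sample: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
Once rearranged: $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} x_i^2 - \mu^2$
Note:
Rearranging the formula lets you add data points incrementally
- This allows for incremental and efficient computation of variance, as the count ($N$), the sum ($\sum x_i$), and the sum of squares ($\sum x_i^2$) can each be updated with each new data point.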
Note: This video explains why we divide by $n - 1$ rather than $n$ when computing the sample variance
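The incremental computation described above can be sketched as a small accumulator that keeps only the count, the running sum, and the running sum of squares. The class name `RunningVariance` and the example data are hypothetical. This is a minimal sketch of the rearranged formula, not a numerically robust implementation: subtracting two large, nearly equal numbers can lose precision, and Welford's algorithm is a commonly used, more stable alternative.

```python
class RunningVariance:
    """Accumulates n, sum(x) and sum(x^2) so variance can be updated per point."""

    def __init__(self):
        self.n = 0
        self.total = 0.0     # running sum of x_i
        self.total_sq = 0.0  # running sum of x_i^2

    def add(self, x):
        self.n += 1
        self.total += x
        self.total_sq += x * x

    def population_variance(self):
        if self.n == 0:
            raise ValueError("no observations")
        mean = self.total / self.n
        return self.total_sq / self.n - mean * mean

    def sample_variance(self):
        if self.n < 2:
            raise ValueError("need at least two observations")
        # Same rearrangement, but dividing by n - 1 instead of n.
        return (self.total_sq - self.total * self.total / self.n) / (self.n - 1)


# Hypothetical data stream, fed one value at a time.
acc = RunningVariance()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    acc.add(value)

pop_var = acc.population_variance()
samp_var = acc.sample_variance()
```

Each new data point costs a constant amount of work, and none of the earlier points need to be stored.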
Graphical Displays
- Boxplot: five-number summary (min, $Q_1$, median, $Q_3$, max)
- Histogram: the x-axis shows values, the y-axis shows frequencies
- Quantile plot: each value $x_i$ is paired with $f_i$, indicating that approximately $100 f_i\%$ of the data are $\le x_i$
- Quantile-quantile plot: graphs the quantiles of one univariate distribution against the corresponding quantiles of another
- Scatter plot: each pair of values is plotted as a point on a plane
Boxplot
$Q_1$ = 25th percentile
$Q_3$ = 75th percentile
IQR = $Q_3 - Q_1$
Outlier: usually a value more than $1.5 \times \text{IQR}$ above $Q_3$ or below $Q_1$
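The quantities a boxplot draws can be computed directly. The sketch below uses `statistics.quantiles` with `method="inclusive"`; quartile conventions differ between tools, so other software may report slightly different $Q_1$ and $Q_3$ for the same data. The function names, the factor `k=1.5`, and the example data are assumptions for illustration.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the 'inclusive' quartile method."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < low or x > high]


# Hypothetical data with one unusually large value.
data = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]

summary = five_number_summary(data)
outliers = iqr_outliers(data)
```

For this data the upper fence is $Q_3 + 1.5 \times \text{IQR} = 88$, so only the value 110 is flagged.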
Bar Chart
Plots a quantitative value (e.g. a count) for each category of categorical data

Histogram
Shows the distribution of a variable, with frequency represented by the area of each bar
Plots binned quantitative data
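Binning is the step that turns raw quantitative data into histogram bars. A minimal sketch with equal-width bins follows; the function name `histogram_counts`, the choice to key each bin by its lower edge, and the example values are assumptions. With equal-width bins, bar height and bar area are proportional, so the counts can be plotted directly.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count values per equal-width bin; keys are the lower edge of each bin."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    # Floor division maps each value to the lower edge of its bin.
    counts = Counter((x // bin_width) * bin_width for x in values)
    return dict(sorted(counts.items()))


# Hypothetical data, binned into [0, 5), [5, 10), [10, 15), [15, 20).
bins = histogram_counts([1, 4, 5, 9, 10, 14, 15], bin_width=5)
```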

Quantile Plot
Plots quantile information for all data
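To build the points of a quantile plot, each sorted value $x_i$ is paired with a fraction $f_i$. The sketch below assumes the convention $f_i = (i - 0.5) / N$, which is one common choice; other conventions exist. The function name `quantile_plot_points` and the example data are hypothetical.

```python
def quantile_plot_points(values):
    """Pair each sorted value x_i with f_i = (i - 0.5) / N, for i = 1..N."""
    ordered = sorted(values)
    n = len(ordered)
    return [((i - 0.5) / n, x) for i, x in enumerate(ordered, start=1)]


# Hypothetical data: roughly 100 * f_i percent of the values are <= x_i.
points = quantile_plot_points([40, 10, 30, 20])
```

Plotting $f_i$ on the x-axis against $x_i$ on the y-axis gives the quantile plot, with every observation shown.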

Quantile-quantile plot
Graphs quantiles of one univariate distribution against the corresponding quantiles of another
If the data distribution is close to normal, the plotted points will lie close to a sloped straight line
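For the comparison against a normal distribution, the points of a Q-Q plot can be sketched by pairing each sorted observation with the standard normal quantile at the same fraction. The code uses `statistics.NormalDist.inv_cdf` from the Python standard library; the fraction $f_i = (i - 0.5) / N$, the function name `normal_qq_points`, and the sample values are assumptions for illustration.

```python
from statistics import NormalDist


def normal_qq_points(values):
    """Pair standard normal quantiles with the sorted sample values."""
    ordered = sorted(values)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (standard_normal.inv_cdf((i - 0.5) / n), x)
        for i, x in enumerate(ordered, start=1)
    ]


# Hypothetical sample; plot theoretical quantiles (x) against sample values (y).
sample = [4.9, 5.1, 5.0, 4.8, 5.2]
qq_points = normal_qq_points(sample)
```

If the plotted points fall close to a straight line, the sample is roughly consistent with a normal distribution; the slope and intercept of that line reflect the sample's standard deviation and mean.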

Scatter Plot
