The Cure for the Curse of Dimensionality: PCA
What Do We Mean by “Curse of Dimensionality”?
In statistics, dimensionality is the number of features a dataset has. [1]
Three features (e.g., height, age, favorite number) → a 3D space; each person is a point. Visualizing is fine in 2D/3D, but quickly becomes impractical in 10D+. As dimensionality grows:
- Distances concentrate and neighborhoods become sparse.
- You need far more samples to cover the space.
- Models get slower and more brittle.
 
In genomics or other ultra-wide datasets, this “curse” is real. We often fight it with dimensionality reduction—compressing data into fewer, more informative dimensions while preserving structure. Classic tools include PCA, LDA, t-SNE, and autoencoders. [2–4]
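To make the first point above concrete, here is a minimal sketch (not from the original article; it uses 100 random Gaussian points per dimension count) showing how the gap between the nearest and farthest pairwise distances shrinks as the number of dimensions grows:
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    P = rng.standard_normal((100, d))            # 100 random points in d dimensions
    diffs = P[:, None, :] - P[None, :, :]        # pairwise difference vectors
    dists = np.linalg.norm(diffs, axis=-1)       # pairwise Euclidean distances
    dists = dists[np.triu_indices(100, k=1)]     # keep each pair only once
    print(f"d={d:4d}  farthest/nearest distance ratio = {dists.max() / dists.min():.2f}")
As d grows, the printed ratio drops toward 1: every point becomes roughly equidistant from every other point, which is exactly why nearest-neighbor reasoning degrades in high dimensions.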
PCA in One Paragraph
Principal Component Analysis (PCA) finds a new orthogonal basis for your data such that the first axis captures the largest variance, the second the next largest (subject to orthogonality), and so on. You can then project your data onto the first k components and work in k dimensions with minimal information loss (variance-wise). [5]
Intuition: if height barely varies across your samples, or just mirrors another feature, PCA can down-weight or remove that direction. Note that PCA is unsupervised: it looks only at variance, not at whatever you ultimately want to predict.
A Minimal scikit-learn Example
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
import numpy as np
# X: shape (n_samples, n_features)
X = np.array([
    [1.8, 25, 70],
    [1.6, 30, 60],
    [1.9, 22, 75],
    [1.7, 28, 68],
    [1.8, 26, 72],
], dtype=float)
# 1) Standardize (important for PCA)
X_std = StandardScaler().fit_transform(X)
# 2) Fit PCA to keep top-2 components
pca = PCA(n_components=2, random_state=0)
Z = pca.fit_transform(X_std)
print("Explained variance ratio:", pca.explained_variance_ratio_)  # how much each PC explains
print("Components (rows are PCs):\n", pca.components_)             # eigenvectors in feature space
print("Projected data shape:", Z.shape)                            # (n_samples, 2)
Tip: Always scale features first (e.g., with StandardScaler); otherwise features with large units dominate the covariance. [6, 7]
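As a quick illustration (reusing the toy X array and imports from the snippet above): in raw units the height column has a tiny variance and contributes almost nothing to the leading component, while after standardization all three features sit on an equal footing, so the explained-variance ratios shift noticeably.
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

pca_raw = PCA(n_components=2).fit(X)                                  # raw, unscaled features
pca_std = PCA(n_components=2).fit(StandardScaler().fit_transform(X))  # standardized features
print("Explained variance ratio (raw):         ", pca_raw.explained_variance_ratio_)
print("Explained variance ratio (standardized):", pca_std.explained_variance_ratio_)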
How PCA Works — Step by Step
1) Standardize & Center the Data
Let X be an n x d matrix (rows = samples, columns = features).
Subtract the mean of each column so the data is zero-centered (and, when features sit on very different scales, also divide each column by its standard deviation).
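A minimal NumPy sketch of this step, using the same made-up X as the scikit-learn example above:
import numpy as np

X = np.array([[1.8, 25, 70],
              [1.6, 30, 60],
              [1.9, 22, 75],
              [1.7, 28, 68],
              [1.8, 26, 72]], dtype=float)   # rows = samples, columns = features

Xc = X - X.mean(axis=0)                       # subtract each column's mean
print(Xc.mean(axis=0))                        # ~[0, 0, 0]: data is now centered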
2) Compute the Covariance Matrix
Use: Σ = (1 / (n - 1)) * X^T * X (with X already centered, as in step 1).
Σ is d x d:
- Diagonals = feature variances
- Off-diagonals = covariances [6, 7]
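Continuing the sketch from step 1 (this assumes X and Xc are still defined), the covariance matrix can be built directly and checked against NumPy's np.cov:
n = Xc.shape[0]
Sigma = (Xc.T @ Xc) / (n - 1)                       # d x d covariance matrix
print(Sigma)
print(np.allclose(Sigma, np.cov(X, rowvar=False)))  # True: matches np.cov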
 
3) Eigenvalues & Eigenvectors
Solve the eigenproblem for Σ to obtain:
- Eigenvalues (λ): how much variance is captured along each direction
- Eigenvectors (v): the principal directions
Normalize eigenvectors to unit length. [8–10] 
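Continuing the sketch (this assumes Sigma from step 2): since the covariance matrix is symmetric, np.linalg.eigh is the appropriate solver, and it already returns unit-length eigenvectors, so no extra normalization is needed.
eigvals, eigvecs = np.linalg.eigh(Sigma)        # eigh: eigendecomposition for symmetric matrices
print("Eigenvalues:", eigvals)                  # variance captured by each direction
print("Eigenvectors (as columns):\n", eigvecs)  # principal directions, unit length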
4) Rank & Select Components
- Sort eigenpairs by descending eigenvalue (λ).
- Keep the top k eigenvectors.
- Stack them (as rows) into a projection matrix W_k.
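Continuing the sketch (this assumes eigvals and eigvecs from step 3): np.linalg.eigh returns eigenvalues in ascending order, so sort them in descending order first, then keep the top k = 2 eigenvectors as the rows of W_k.
k = 2
order = np.argsort(eigvals)[::-1]               # eigenvalue indices, largest first
W_k = eigvecs[:, order[:k]].T                   # shape (k, d): rows are the top-k PCs
print("W_k shape:", W_k.shape)                  # (2, 3)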
5) Project the Data
Transform centered data:
Z = X * W_k^T
Z is the new low-dimensional representation.
- The first k PCs retain the maximum variance achievable by any k-dimensional linear projection.
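Finishing the sketch (this assumes Xc from step 1 and W_k from step 4): project the centered data and, as a sanity check, compare against scikit-learn's PCA on the same unscaled data; the two agree up to the signs of the components.
from sklearn.decomposition import PCA

Z = Xc @ W_k.T                                     # shape (n_samples, k)
Z_sklearn = PCA(n_components=2).fit_transform(X)   # sklearn centers internally
print("Z shape:", Z.shape)                         # (5, 2)
print(np.allclose(np.abs(Z), np.abs(Z_sklearn)))   # True (component signs may flip)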
References
[1] Dimensionality, StatisticsHowTo
[2] Application of PCA to medical data, Indian Journal of Science and Technology
[3] Gülsan Öğündür, PCA (Turkish overview)
[4] Jolliffe/Jackson, A User’s Guide to Principal Components (excerpt)
[5] Wikipedia — Principal Component Analysis (overview)
[6] 5 Things You Should Know About Covariance, Towards Data Science
[7] R Statistics Cookbook (O’Reilly) — covariance section
[8] Wikipedia — Eigenvalues and Eigenvectors
[9] Math StackExchange: Importance of eigenvalues/eigenvectors
[10] MathsIsFun — Eigenvalues and Eigenvectors