There are dozens of ML algorithms, each with strengths and weaknesses. You don't need to understand the maths — but knowing what each algorithm does conceptually, and when to use it, is genuinely useful.
There's no single best algorithm. The right choice depends on your data size, the type of problem, interpretability needs, and how much training time you can afford.
Linear & Logistic Regression
Linear regression predicts a continuous number by fitting a straight line through data points. Simple, fast, interpretable. Great when relationships are roughly linear. Used for: price prediction, demand forecasting.
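To make the "fitting a straight line" idea concrete, here is a minimal sketch of one-feature least squares in plain Python. The numbers are made up for illustration:

```python
# Minimal sketch: fit a line y = a*x + b by least squares
# on a tiny made-up dataset (illustrative numbers, not real prices).

def fit_line(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

xs = [1, 2, 3, 4]          # e.g. house size, in 100 m^2
ys = [150, 200, 250, 300]  # e.g. price, in thousands
a, b = fit_line(xs, ys)
print(a, b)  # perfectly linear toy data: slope 50.0, intercept 100.0
```

In practice you would call a library (e.g. scikit-learn's `LinearRegression`) rather than hand-roll this, but the idea is the same.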
Logistic regression (despite the name) is a classification algorithm. It predicts the probability of a binary outcome. Still fast and interpretable. Used for: credit scoring, spam detection, medical diagnosis.
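A sketch of the same idea for classification: one-feature logistic regression trained by plain gradient descent on a made-up "spam score" dataset. The feature and numbers are assumptions for illustration:

```python
import math

# Minimal sketch: logistic regression with one feature, trained by
# stochastic gradient descent on a toy spam dataset (made-up numbers).

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(xs, ys, lr=0.1, steps=2000):
    w, b = 0.0, 0.0
    for _ in range(steps):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)   # predicted probability of class 1
            w -= lr * (p - y) * x    # gradient of log-loss w.r.t. w
            b -= lr * (p - y)        # ... and w.r.t. b
    return w, b

xs = [0.2, 0.5, 1.8, 2.5]   # e.g. fraction of "spammy" words in an email
ys = [0, 0, 1, 1]           # 1 = spam
w, b = train(xs, ys)
print(sigmoid(w * 0.3 + b) < 0.5)   # low score -> predicted not spam
print(sigmoid(w * 2.0 + b) > 0.5)   # high score -> predicted spam
```

Note the output is a probability, which is exactly why logistic regression suits credit scoring: you can set the decision threshold to match your risk appetite.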
Decision Trees
A decision tree asks a series of yes/no questions to classify or predict. Easy to visualise and explain to non-technical stakeholders ("if income > 50k AND age < 35, predict churn"). Prone to overfitting on its own, but very powerful in ensembles.
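The rule quoted above is literally just nested ifs. Here it is written out by hand; a real tree learns the features and thresholds from data, so these particular numbers are illustrative:

```python
# Minimal sketch: a hand-written two-level "tree" mirroring the rule in
# the text. Real trees learn these splits from data; the thresholds and
# feature names here are illustrative.

def predict_churn(income, age):
    if income > 50_000:
        if age < 35:
            return "churn"
        return "stay"
    return "stay"

print(predict_churn(60_000, 28))  # churn
print(predict_churn(60_000, 40))  # stay
print(predict_churn(40_000, 28))  # stay
```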
Random Forests
A random forest builds hundreds of decision trees, each trained on a random bootstrap sample of the data (and considering a random subset of features at each split), and takes the majority vote (for classification) or average (for regression). The randomness makes the ensemble much more robust than any single tree. One of the most reliable "out of the box" algorithms for tabular data.
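The two forest ingredients, bootstrap sampling and majority voting, can be sketched with one-split "stumps" standing in for full trees. The 1-D dataset is made up, and this is nowhere near a real CART implementation:

```python
import random
from collections import Counter

# Minimal sketch of a forest: bootstrap samples + majority vote.
# The "trees" are one-threshold stumps on a toy 1-D dataset.

random.seed(0)
data = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]

def fit_stump(sample):
    # Pick the threshold (midpoint between consecutive xs) with fewest errors.
    xs = sorted(x for x, _ in sample)
    best_t, best_err = xs[0], len(sample)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        err = sum((x > t) != (y == 1) for x, y in sample)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

stumps = []
for _ in range(25):                      # 25 trees; real forests use hundreds
    sample = [random.choice(data) for _ in data]  # bootstrap: with replacement
    stumps.append(fit_stump(sample))

def predict(x):
    votes = Counter(int(x > t) for t in stumps)   # each stump votes 0 or 1
    return votes.most_common(1)[0][0]

print(predict(2.0), predict(7.0))  # well-separated toy data: expect 0 1
```

Individual stumps trained on odd bootstrap samples can be wrong; the vote averages those mistakes away, which is the whole point of the ensemble.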
Gradient Boosting (XGBoost, LightGBM)
Builds trees sequentially — each new tree focuses on correcting the errors of the previous one. Extremely powerful for structured/tabular data. XGBoost and LightGBM dominate Kaggle competitions and industry ML for business problems.
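The "each tree corrects the previous one" idea can be sketched on a toy regression problem, with a fixed one-split stump standing in for a tree. This is only the core loop; XGBoost and LightGBM add regularisation, clever split-finding, and much more:

```python
# Minimal sketch of boosting on a toy regression problem: each new
# "tree" (here, a stump that always splits at x = 4) is fit to the
# residuals left by the model so far. Illustrative only.

data = [(1.0, 10.0), (2.0, 12.0), (6.0, 30.0), (7.0, 34.0)]

def fit_stump(residuals):
    # Predict the mean residual on each side of the (fixed) split.
    left = [r for x, r in residuals if x < 4] or [0.0]
    right = [r for x, r in residuals if x >= 4] or [0.0]
    lmean, rmean = sum(left) / len(left), sum(right) / len(right)
    return lambda x: lmean if x < 4 else rmean

prediction = {x: 0.0 for x, _ in data}
stumps = []
lr = 0.5                                   # learning rate (shrinkage)
for _ in range(20):
    residuals = [(x, y - prediction[x]) for x, y in data]
    stump = fit_stump(residuals)
    stumps.append(stump)
    for x, _ in data:
        prediction[x] += lr * stump(x)     # each stump corrects what's left

def predict(x):
    return sum(lr * s(x) for s in stumps)

print(round(predict(1.5)), round(predict(6.5)))  # ~mean of each side: 11 32
```

Because later stumps only see what earlier ones got wrong, the ensemble keeps shrinking the error, which is why boosting typically edges out random forests on accuracy (at the cost of being more sensitive to tuning).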
| Algorithm | Best for | Interpretable? |
|---|---|---|
| Linear/Logistic Regression | Simple relationships, when you need speed | Yes |
| Decision Tree | Explainable decisions, small datasets | Yes |
| Random Forest | Tabular data, good default choice | Somewhat |
| XGBoost/LightGBM | Best accuracy on tabular data | No |
| Neural Networks | Images, text, audio, large datasets | No |
| k-Nearest Neighbours | Simple baselines, recommendation | Yes |
Neural Networks
Inspired loosely by biological neurons, neural networks consist of layers of connected nodes. Each layer learns progressively more abstract features. Shallow networks (1–2 hidden layers) work for simple problems. Deep networks (many layers) power image recognition, language models, and more.
Neural networks are the most flexible algorithms available, and given enough data they are often the most accurate — but they need large amounts of data, are computationally expensive, and are largely uninterpretable (black boxes).
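To show layers composing features, here is the forward pass of a tiny two-input network with one hidden layer. The weights are hand-picked (not trained) so that the network computes XOR, a function no single linear model can represent:

```python
# Minimal sketch of a neural-network forward pass: two inputs, one
# hidden layer of two ReLU neurons, hand-picked weights implementing
# XOR. Illustrates layered feature composition, not a trained model.

def relu(z):
    return max(0.0, z)

def forward(x1, x2):
    # Hidden layer: h1 roughly detects "x1 OR x2", h2 detects "x1 AND x2".
    h1 = relu(x1 + x2)        # weights (1, 1), bias 0
    h2 = relu(x1 + x2 - 1)    # weights (1, 1), bias -1
    # Output layer combines the learned features: OR minus 2*AND = XOR.
    return h1 - 2 * h2

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, forward(a, b))   # 0, 1, 1, 0
```

Real networks learn such weights from data via backpropagation, and deep networks stack many of these layers so that each builds on the features of the last.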
k-Nearest Neighbours (kNN)
The simplest possible idea: to classify a new point, find the k most similar examples in the training data and take a majority vote. Intuitive, with no training phase (it just stores the data), but slow at prediction time on large datasets because every query compares against every stored example. Useful as a baseline and in recommendation systems.
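The whole algorithm fits in a few lines. A sketch on a made-up 2-D dataset, with Euclidean distance and k = 3:

```python
from collections import Counter

# Minimal sketch of kNN classification on a toy 2-D dataset
# (made-up points; Euclidean distance, k = 3).

train = [((1, 1), "red"), ((2, 1), "red"), ((1, 2), "red"),
         ((6, 5), "blue"), ((7, 6), "blue"), ((6, 6), "blue")]

def knn_predict(point, k=3):
    # Sort training examples by squared distance to the query point
    # (squared distance preserves the ordering, so no sqrt needed).
    by_dist = sorted(train, key=lambda ex: (ex[0][0] - point[0]) ** 2 +
                                           (ex[0][1] - point[1]) ** 2)
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

print(knn_predict((1.5, 1.5)))  # red
print(knn_predict((6.5, 5.5)))  # blue
```

The sort over the whole training set on every query is exactly the prediction-time cost mentioned above; production systems use approximate nearest-neighbour indexes to avoid it.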
Key takeaways
- No single best algorithm — choose based on data size, problem type, and interpretability needs
- For tabular data: start with Random Forest or XGBoost for strong baseline performance
- For images/text/audio: neural networks (deep learning) are the standard approach
- When explainability matters: linear regression, logistic regression, or decision trees
- Neural networks are powerful but need large data, compute, and are black boxes