AI systems learn from human-generated data — and human data is full of historical bias, structural inequality, and cultural assumptions. When AI learns from this data, it doesn't just inherit these patterns. It can amplify them, automate them at scale, and make them harder to challenge because they appear to come from an objective algorithm. Understanding AI bias is one of the most important aspects of understanding AI's societal impact.
There is no such thing as a perfectly neutral AI. Every AI system reflects the choices made in its design: what data was used, what was optimised for, who was in the room when decisions were made. Neutrality is not the absence of bias — it is a particular choice that can itself be biased.
Where bias comes from
AI bias enters systems at multiple points in the development process, not just in the training data. Understanding the sources helps in knowing where to intervene.
Training data bias
The most common source. If the historical data used to train a model reflects past discrimination, the model will learn to replicate that discrimination. A hiring algorithm trained on historical hiring decisions will learn who was hired historically — and if those decisions were biased, the model encodes the bias.
Amazon built an AI hiring tool trained on a decade of resumes submitted to the company. The problem: most of those resumes came from men, reflecting the male-dominated tech industry. The AI learned to penalise resumes that included the word "women's" (as in women's chess club) and downgraded graduates of all-women's colleges. Amazon quietly scrapped the tool in 2018 after discovering these patterns. The algorithm wasn't told to discriminate; it learned discrimination from the data.
Label bias
In supervised learning, humans label the training data. If the labellers bring their own biases, those biases enter the model. Medical labelling studies have found that the same symptom descriptions are rated differently depending on the perceived demographic of the patient. AI models trained on these labels inherit the rater biases.
Measurement bias
What you measure shapes what the AI optimises for. If you use arrest rates as a proxy for crime rates, you are building in the bias of over-policing in certain communities — since arrest rates reflect policing patterns as much as actual crime. The COMPAS recidivism tool used in US courts was shown in a 2016 ProPublica analysis to falsely flag Black defendants as future reoffenders at roughly twice the rate of white defendants, partly for this reason.
Aggregation bias
When a single model is trained on data from multiple different groups and applied uniformly, it may perform well on average but poorly for specific subgroups. Facial recognition systems trained predominantly on lighter-skinned faces have significantly higher error rates for darker-skinned faces — documented extensively by Joy Buolamwini at MIT.
Feedback loop bias
Deployed AI systems create new data that is then used to retrain future models. If the system's outputs are biased, and those outputs become training data, the bias compounds over time. A content recommendation algorithm that shows certain types of content to certain groups will reinforce those associations with every interaction.
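The compounding effect can be sketched in a few lines. This is a toy simulation with invented numbers, not a model of any real policing system: two areas have identical true crime rates, but patrols start slightly skewed, observed arrests reflect patrol presence, and each round nudges patrols toward wherever more arrests were recorded.

```python
# Toy feedback-loop simulation (illustrative numbers only).
# Two areas with IDENTICAL true crime rates; patrols start slightly skewed.
true_crime = [0.10, 0.10]   # same underlying rate in both areas
patrol = [0.55, 0.45]       # initial patrol allocation, skewed toward area 0

for _ in range(20):
    # Observed arrests reflect patrol presence as much as actual crime.
    arrests = [p * c for p, c in zip(patrol, true_crime)]
    # Next round nudges patrols toward wherever more arrests were recorded,
    # so the system retrains on its own biased outputs.
    patrol = [p * (1 + a) for p, a in zip(patrol, arrests)]
    total = sum(patrol)
    patrol = [p / total for p in patrol]

# The initial 55/45 skew has grown, despite identical true crime rates.
print(round(patrol[0], 3))
```

The point of the sketch is that nothing in the loop refers to the true crime rates after the first step: the system only ever sees its own outputs, so the initial skew can only grow.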
Fairness is not one thing
One of the deepest challenges in AI fairness is that "fairness" has multiple mathematical definitions — and, except in special cases (such as when base rates are identical across groups), they are mathematically incompatible: no system can satisfy all of them at once. This forces hard value judgements about which kind of fairness matters most in a given context.
| Fairness concept | What it means | When it matters most |
|---|---|---|
| Demographic parity | Equal outcomes across groups (same approval rate regardless of race, gender, etc.) | When historical exclusion needs to be actively corrected |
| Equal opportunity | Equal true positive rates — qualified candidates from all groups have equal chances | Hiring, university admissions, lending |
| Calibration | Predicted probabilities are equally accurate across groups | Medical diagnosis, risk scoring |
| Individual fairness | Similar individuals get similar outcomes, regardless of group membership | When group-level remedies feel unjust |
Choosing between these is a values question, not a technical one. That is why fairness cannot be delegated entirely to engineers — it requires input from ethicists, affected communities, and policymakers.
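The tension can be made concrete with a toy example. All numbers below are invented for illustration: the hypothetical classifier satisfies equal opportunity exactly (both groups have the same true positive rate) while failing demographic parity badly (very different approval rates).

```python
# Toy example: two groups with hypothetical true labels and predictions,
# designed so equal opportunity holds while demographic parity fails.
group_a = {"true": [1, 1, 0, 0, 0, 0, 0, 0],
           "pred": [1, 0, 1, 1, 1, 0, 0, 0]}
group_b = {"true": [1, 1, 0, 0, 0, 0, 0, 0],
           "pred": [1, 0, 0, 0, 0, 0, 0, 0]}

def selection_rate(g):
    """Share of the group receiving a positive decision (demographic parity)."""
    return sum(g["pred"]) / len(g["pred"])

def true_positive_rate(g):
    """Share of truly qualified members who were approved (equal opportunity)."""
    hits = [p for p, t in zip(g["pred"], g["true"]) if t == 1]
    return sum(hits) / len(hits)

# Equal opportunity holds: both groups have a true positive rate of 0.5 ...
assert true_positive_rate(group_a) == true_positive_rate(group_b) == 0.5
# ... but demographic parity fails: 50% approval vs 12.5%.
print(selection_rate(group_a), selection_rate(group_b))
```

Which of the two disparities matters more is exactly the values question the table above cannot settle on its own.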
Real-world consequences
AI bias is not a theoretical concern. It has caused documented harm in high-stakes domains:
- Criminal justice — risk assessment tools used in sentencing and bail decisions have been shown to have racially disparate outcomes in multiple jurisdictions
- Healthcare — algorithms that used health care costs as a proxy for health needs systematically underestimated the needs of Black patients, who historically had less access to care
- Facial recognition — multiple documented cases of misidentification leading to wrongful arrests, with the errors concentrated in people of colour
- Advertising — studies have shown housing and employment ads being shown to demographically skewed audiences, even when advertisers did not intend this
- Credit — algorithmic credit decisions have been found to charge higher rates to women and minorities than equally qualified white male applicants in some studies
What is being done about it
The AI fairness field has grown significantly, and there are now real tools and practices, though no complete solutions:
- Fairness audits — systematically testing model performance across demographic groups before deployment, looking for disparate impact
- Diverse training data — deliberately ensuring training datasets represent the full diversity of the population the model will serve
- Algorithmic debiasing — techniques for adjusting model outputs or training processes to reduce measured disparities
- Diverse development teams — research consistently shows that more diverse teams catch more potential harms in the design phase, before deployment
- Community involvement — involving affected communities in the design and evaluation of AI systems that affect them
- Transparency and explainability — making it possible to understand why a model made a particular decision, enabling challenge and appeal
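A fairness audit can start very simply. The sketch below is illustrative: the function name and data are invented, and the 0.8 threshold echoes the US EEOC "four-fifths" rule of thumb for disparate impact, not a universal legal standard. It computes per-group selection rates and flags a large gap between the lowest and highest.

```python
# Minimal disparate-impact check over per-group model decisions.
# Group names and decisions are hypothetical; the 0.8 threshold echoes
# the US EEOC "four-fifths" rule of thumb, not a universal standard.
def audit_selection_rates(predictions_by_group, threshold=0.8):
    rates = {g: sum(p) / len(p) for g, p in predictions_by_group.items()}
    ratio = min(rates.values()) / max(rates.values())
    return rates, ratio, ratio < threshold  # flagged if the ratio is too low

rates, ratio, flagged = audit_selection_rates({
    "group_a": [1, 1, 0, 1, 0, 1, 1, 0],   # 5 of 8 approved
    "group_b": [1, 0, 0, 1, 0, 0, 0, 0],   # 2 of 8 approved
})
print(rates, ratio, flagged)
```

In practice an audit would also compare error rates (false positives, false negatives) per group, not just approval rates, but the selection-rate ratio is the usual first screen.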
Technical fairness interventions can reduce disparities but cannot eliminate the underlying social inequalities that generate biased data in the first place. AI fairness is ultimately inseparable from broader questions of social justice — questions that go well beyond what any algorithm can solve.
Key takeaways
- AI bias enters through training data, human labelling, what is measured, aggregation, and feedback loops
- Bias is not just a technical error — it reflects the historical inequalities embedded in data
- Fairness has multiple mathematical definitions that are mutually incompatible — choosing between them is a values decision
- Real-world harm from biased AI has been documented in criminal justice, healthcare, advertising, and credit
- Mitigation requires fairness audits, diverse data, diverse teams, and community involvement — not just better algorithms
- Technical fixes cannot resolve the underlying social inequalities that generate biased data