How to Perform Bias Detection in AI Systems
Bias detection is one of the most important parts of AI quality assurance, and it’s much more systematic than most people expect. You’re not just “checking outputs” — you’re evaluating the entire pipeline: data, model behaviour, and real‑world impact. Let me give you a clear, practical framework you can actually use in real projects.
1. Start With a Clear Definition of “Fairness”
Bias detection only works if you know what type of fairness you’re testing for. Common definitions include:
- Demographic parity: the rate of positive predictions should be equal across groups
- Equal opportunity: true positive rates should be equal across groups
- Equalized odds: both true positive and false positive rates should be equal across groups
- Predictive parity: precision (the accuracy of positive predictions) should be equal across groups
Different industries require different fairness metrics. For example, finance often uses equal opportunity, while hiring systems may require demographic parity.
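To make these definitions concrete, here is a minimal sketch in plain NumPy (the labels, predictions, and group names are made up) that computes the per-group rates each metric compares. The fairness question is then how far those rates diverge between groups.

```python
import numpy as np

def group_rates(y_true, y_pred, group):
    """Per-group rates behind the four fairness definitions above."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    rates = {}
    for g in np.unique(group):
        m = group == g
        t, p = y_true[m], y_pred[m]
        rates[g] = {
            # demographic parity compares this rate across groups
            "selection_rate": p.mean(),
            # equal opportunity compares TPR across groups
            "tpr": p[t == 1].mean() if (t == 1).any() else float("nan"),
            # equalized odds compares TPR and FPR across groups
            "fpr": p[t == 0].mean() if (t == 0).any() else float("nan"),
            # predictive parity compares precision across groups
            "precision": t[p == 1].mean() if (p == 1).any() else float("nan"),
        }
    return rates

# Tiny made-up example with two groups
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
group  = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(group_rates(y_true, y_pred, group))
```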
2. Audit the Data for Bias Before Training
This is where most bias originates.
Check for:
- Representation imbalance: for example, 80% of your dataset comes from one demographic group.
- Label bias: human annotators may have encoded their own biases.
- Feature bias: some features may correlate with protected attributes (e.g., postcode → race).
- Historical bias: the world itself may be biased, and the data reflects it.
Tools & Techniques
- Distribution plots
- Correlation matrices
- Mutual information tests
- Clustering to detect hidden groupings
- Synthetic minority oversampling (SMOTE) for imbalance
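Here is a minimal sketch of the first three checks with pandas and scikit-learn. The table and its column names ("gender", "postcode", "income") are placeholders for your own data, and the mutual-information step is one quick way to screen for proxy features.

```python
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

# Tiny made-up table; "gender" stands in for whatever protected attribute applies
df = pd.DataFrame({
    "gender":   ["F", "M", "M", "F", "M", "F", "M", "M", "F", "M", "F", "M"],
    "postcode": [2000, 2940, 2941, 2001, 2940, 2000, 2941, 2940, 2001, 2940, 2000, 2941],
    "income":   [52, 87, 91, 49, 78, 55, 85, 90, 51, 88, 50, 86],
    "label":    [0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1],
})

# 1) Representation imbalance: how skewed are the groups?
print(df["gender"].value_counts(normalize=True))

# 2) Label bias: does the positive-label rate differ sharply by group?
print(df.groupby("gender")["label"].mean())

# 3) Feature bias: how much information do other features carry about the
#    protected attribute? High values flag likely proxies.
X = df[["postcode", "income"]]
mi = mutual_info_classif(X, df["gender"], discrete_features=[True, False], random_state=0)
print(dict(zip(X.columns, mi)))
```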
3. Test the Model’s Behaviour Across Groups
This is the core of bias detection.
Evaluate performance metrics by subgroup:
- Accuracy
- Precision
- Recall
- F1
- False positive rate
- False negative rate
If one group consistently gets worse results on these metrics, you have found bias.
Example
If a fraud detection model flags 12% of transactions from Group A as fraud but only 3% from Group B, you need to investigate.
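Here is a minimal sketch of that per-subgroup breakdown using scikit-learn metrics, assuming binary labels and a group array you supply; a disparity like the 12% vs 3% flag rate above would surface as a gap in rows such as the false positive rate.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

def subgroup_report(y_true, y_pred, group):
    """Compute the metrics above separately for each subgroup."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    report = {}
    for g in np.unique(group):
        m = group == g
        tn, fp, fn, tp = confusion_matrix(y_true[m], y_pred[m], labels=[0, 1]).ravel()
        report[g] = {
            "accuracy":  accuracy_score(y_true[m], y_pred[m]),
            "precision": precision_score(y_true[m], y_pred[m], zero_division=0),
            "recall":    recall_score(y_true[m], y_pred[m], zero_division=0),
            "f1":        f1_score(y_true[m], y_pred[m], zero_division=0),
            "fpr":       fp / (fp + tn) if (fp + tn) else float("nan"),
            "fnr":       fn / (fn + tp) if (fn + tp) else float("nan"),
        }
    return report

# Hypothetical usage with a fitted classifier and a held-out test set:
# report = subgroup_report(y_test, clf.predict(X_test), X_test["group"])
```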
4. Use Counterfactual Testing
This is one of the most powerful techniques.
How it works
You take the same input and change only the protected attribute:
- Change gender: “John” → “Jane”
- Change ethnicity-coded names
- Change age
- Change postcode
If the prediction changes only because of that attribute, the model is biased.
This is called counterfactual fairness testing.
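A minimal sketch of the idea, assuming a fitted model with a .predict method and a pandas DataFrame of features; the attribute name and value mapping in the usage comment are placeholders.

```python
import numpy as np

def counterfactual_flip_rate(model, X, attribute, swap):
    """
    Flip a single protected attribute and measure how often the prediction
    changes. `model` is any fitted estimator with .predict, `X` is a pandas
    DataFrame, and `swap` maps each attribute value to its counterfactual.
    """
    X_cf = X.copy()
    X_cf[attribute] = X_cf[attribute].map(swap)
    original = model.predict(X)
    flipped = model.predict(X_cf)
    return np.mean(original != flipped)

# Hypothetical usage (the column name and value mapping are placeholders):
# rate = counterfactual_flip_rate(clf, X_test, "gender", {"M": "F", "F": "M"})
# print(f"{rate:.1%} of predictions change when only gender is flipped")
```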
5. Use Adversarial Bias Probing
You intentionally try to break the model.
Examples
- Generate edge cases
- Use adversarial examples
- Use synthetic data to stress-test fairness
- Try to infer protected attributes from model outputs
If the model leaks or encodes protected attributes, that’s a red flag.
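As one example of the last probe, here is a minimal sketch of an "attacker" model that tries to recover the protected attribute from the model's scores alone (scikit-learn, placeholder inputs).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def attribute_leakage(model_scores, protected_attr, cv=5):
    """
    Probe whether the protected attribute can be inferred from the model's
    output scores alone. Returns (attacker accuracy, majority-class baseline);
    accuracy well above the baseline suggests the outputs encode the attribute.
    """
    X = np.asarray(model_scores, dtype=float).reshape(-1, 1)
    y = np.asarray(protected_attr)
    attacker = LogisticRegression(max_iter=1000)
    attacker_acc = cross_val_score(attacker, X, y, cv=cv).mean()
    baseline = max((y == value).mean() for value in np.unique(y))
    return attacker_acc, baseline

# Hypothetical usage with held-out scores and the true protected attribute:
# acc, base = attribute_leakage(clf.predict_proba(X_test)[:, 1], X_test["gender"])
# print(f"attacker accuracy {acc:.2f} vs baseline {base:.2f}")
```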
6. Apply Explainability Tools
Explainability helps you see why the model behaves the way it does.
Useful tools
- SHAP values
- LIME
- Integrated gradients
- Feature importance heatmaps
If protected attributes (or proxies) have high influence, you’ve found bias.
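A minimal SHAP-based sketch, assuming the shap package plus a fitted tree-based classifier and its feature DataFrame (clf and X_test are placeholder names); it ranks features by mean absolute SHAP value so a dominant proxy like postcode stands out.

```python
import numpy as np
import shap  # third-party package: pip install shap

def shap_influence_ranking(clf, X):
    """Rank features by mean absolute SHAP value for a fitted tree-based model."""
    explainer = shap.Explainer(clf, X)   # auto-selects a suitable explainer
    shap_values = explainer(X)
    influence = np.abs(shap_values.values).mean(axis=0)
    if influence.ndim > 1:               # some explainers add a class dimension
        influence = influence.mean(axis=-1)
    return sorted(zip(X.columns, influence), key=lambda item: -item[1])

# Hypothetical usage with a fitted random forest and its feature DataFrame:
# for name, score in shap_influence_ranking(clf, X_test):
#     print(f"{name:>15}: {score:.4f}")
```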
7. Run Real‑World Impact Simulations
Bias isn’t just mathematical — it’s social.
Simulate:
- Who gets approved or denied
- Who gets flagged
- Who gets recommended
- Who gets excluded
This helps you detect impact bias, which may not show up in metrics.
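A minimal sketch of such a simulation with synthetic scores (the group skew is injected purely to illustrate the output): turn scores into decisions at your deployment threshold and compare who actually gets approved.

```python
import numpy as np
import pandas as pd

def simulate_decisions(scores, group, threshold):
    """Turn model scores into approve/deny decisions and summarise by group."""
    df = pd.DataFrame({"score": scores, "group": group})
    df["approved"] = df["score"] >= threshold
    summary = df.groupby("group")["approved"].agg(["mean", "sum", "count"])
    summary.columns = ["approval_rate", "approved", "total"]
    return summary

# Synthetic scores where group B skews lower, purely to illustrate the output
rng = np.random.default_rng(0)
group = rng.choice(["A", "B"], size=1000, p=[0.7, 0.3])
scores = rng.uniform(size=1000) - np.where(group == "B", 0.15, 0.0)

summary = simulate_decisions(scores, group, threshold=0.6)
print(summary)
print("impact ratio:", round(summary["approval_rate"].min() /
                             summary["approval_rate"].max(), 2))
```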
8. Document Everything
Bias detection must be transparent.
Include:
- What fairness metrics you used
- What groups you tested
- What disparities you found
- How you mitigated them
- What limitations remain
This becomes part of your model card or AI governance documentation.
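One way to keep this machine-readable is a small structured record stored alongside the model card; the field names and values below are an illustrative assumption, not a formal standard.

```python
import json
from datetime import date

# Illustrative only: the model name, groups, and numbers below are made up,
# and the field names are an assumption rather than a formal standard.
fairness_record = {
    "model": "fraud-detector-v3",
    "date": date.today().isoformat(),
    "fairness_metrics": ["equal opportunity", "equalized odds"],
    "groups_tested": ["gender", "age_band", "postcode_region"],
    "disparities_found": [
        {"group": "age_band", "metric": "false positive rate", "gap": 0.04},
    ],
    "mitigations": ["re-weighted training data", "per-group threshold review"],
    "known_limitations": ["small sample size for the oldest age band"],
}

with open("model_card_fairness.json", "w") as f:
    json.dump(fairness_record, f, indent=2)
```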
9. Continuous Monitoring After Deployment
Bias can appear over time due to:
- Data drift
- Population changes
- Feedback loops
- Model retraining
Set up automated monitoring for:
- Unexpected disparities between groups
- Group-level performance
- Drift in protected attributes
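A minimal sketch of such a check, comparing the current window's per-group rates against a baseline captured at deployment (the names and numbers are made up).

```python
def check_group_drift(baseline_rates, current_rates, tolerance=0.05):
    """
    Compare per-group rates (e.g. selection rate or TPR) from the current
    monitoring window against a baseline captured at deployment, and return
    the groups whose rate has shifted by more than `tolerance`.
    """
    alerts = []
    for group, base in baseline_rates.items():
        current = current_rates.get(group)
        if current is None or abs(current - base) > tolerance:
            alerts.append((group, base, current))
    return alerts

# Made-up weekly check: group B's flag rate has drifted upwards
baseline = {"A": 0.11, "B": 0.10}
this_week = {"A": 0.12, "B": 0.19}
for group, base, current in check_group_drift(baseline, this_week):
    print(f"ALERT: group {group} moved from {base:.2f} to {current:.2f}")
```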