Abstract
This thesis examines the development of explainable machine learning systems for contextual safeguarding in UK secondary education, asking whether AI can provide effective, interpretable, and fair support for identifying at-risk students. Through experiments with six algorithms on 2,031 student records, the research investigates both the potential and the challenges of applying AI to high-stakes educational decisions. The Explainable Boosting Machine (EBM) achieved 80.8% accuracy and 68% recall when incorporating a Contextual Safeguarding Risk Indicator (CSRI), a domain-specific feature designed in collaboration with educational practitioners that proved substantially more influential than traditional behavioural indicators. Fairness analysis uncovered substantial discrimination: the baseline model exhibited a 47.4 percentage point gap in true positive rates between Pupil Premium and non-Pupil Premium students, in effect functioning as a poverty detector. Standard mitigation approaches (threshold optimisation, sample reweighting, and removing protected attributes) failed to address these disparities adequately.
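To make the fairness metric concrete, the following is a minimal sketch of how a true positive rate gap between two groups could be computed. It is an illustration only: the function names and the toy labels are hypothetical and are not taken from the thesis data or code.

```python
from typing import Sequence


def true_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of actual positives (label 1) that were predicted positive."""
    hits = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not hits:
        raise ValueError("no positive cases in this group")
    return sum(hits) / len(hits)


def tpr_gap_points(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    in_group: Sequence[bool],
) -> float:
    """Absolute TPR difference between two groups, in percentage points.

    `in_group` is a hypothetical flag, e.g. True for Pupil Premium students.
    """
    true_a = [t for t, g in zip(y_true, in_group) if g]
    pred_a = [p for p, g in zip(y_pred, in_group) if g]
    true_b = [t for t, g in zip(y_true, in_group) if not g]
    pred_b = [p for p, g in zip(y_pred, in_group) if not g]
    gap = true_positive_rate(true_a, pred_a) - true_positive_rate(true_b, pred_b)
    return abs(gap) * 100


# Toy example (invented values): every student here is truly at risk.
toy_true = [1, 1, 1, 1, 1, 1, 1, 1]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0]
toy_group = [True, True, True, True, False, False, False, False]
toy_gap = tpr_gap_points(toy_true, toy_pred, toy_group)
```

In the toy data the model detects three of four at-risk students in one group and one of four in the other, so the gap is 50 percentage points. A gap of this kind means that equally at-risk students have unequal chances of being identified.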
In response, this research developed a PyGol-EBM hybrid ensemble that routes decisions based on confidence levels, improving detection rates for underserved students from 21.0% to 42.0% whilst maintaining 80.2% overall accuracy and preserving explainability through logical rules for critical decisions. Investigation of explainability revealed trade-offs between transparency and performance, suggesting that systems should provide contextually appropriate transparency rather than pursue maximum explainability.
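The idea of confidence-based routing can be sketched as follows. This is a hedged illustration, not the thesis implementation: it assumes that cases where the probabilistic model is confident are decided by that model and that the remaining cases are passed to a rule-based model. The direction of routing, the 0.8 threshold, and the two stand-in models are all assumptions made for the example, and no real PyGol or EBM API is used.

```python
from typing import Any, Callable, Tuple


def route_prediction(
    features: Any,
    proba_model: Callable[[Any], float],
    rule_model: Callable[[Any], bool],
    threshold: float = 0.8,  # hypothetical confidence cut-off
) -> Tuple[int, str]:
    """Return (prediction, source), where source names the model that decided.

    proba_model stands in for an EBM-style model returning P(at risk).
    rule_model stands in for a logical-rule model returning True or False.
    """
    p = proba_model(features)
    confidence = max(p, 1.0 - p)  # confidence in the more likely class
    if confidence >= threshold:
        return int(p >= 0.5), "proba_model"
    # Low confidence: defer to the rule-based model (assumed routing direction).
    return int(rule_model(features)), "rule_model"


# Hypothetical stand-ins with invented feature names.
def toy_proba(x: dict) -> float:
    return x["score"]


def toy_rules(x: dict) -> bool:
    return x["absences"] > 10


confident_case = route_prediction({"score": 0.95, "absences": 0}, toy_proba, toy_rules)
uncertain_case = route_prediction({"score": 0.55, "absences": 12}, toy_proba, toy_rules)
```

Returning the name of the deciding model alongside the prediction keeps the routing step itself auditable, which matters when the explanation offered to a practitioner depends on which model made the decision.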
The research contributes to computer science through a confidence-based routing architecture for ensemble learning, a systematic evaluation of fairness-accuracy trade-offs in educational ML, and a contextual explainability framework proposing that interpretability should match the decision context rather than be maximised universally. For AI in Education, the thesis demonstrates that standard fairness interventions fail in educational contexts where structural inequality correlates with legitimate risk, proposes an algorithmic equity framework that prioritises proportional support over statistical parity, and provides a critical analysis of the ethical tensions between predictive safeguarding and student surveillance, privacy, and agency.