Abstract
Addressing bias and unfairness in machine learning models across application domains is a multifaceted challenge. Despite the variety of fairness metrics available, identifying an optimal set for evaluating a model's fairness remains an open question, owing to the diverse nature of these metrics and the lack of a comprehensive approach to ensuring fairness across applications. This study proposes a method for selecting the most representative metrics for post-processing bias and fairness assessment of machine learning models in different contexts. We examine a correlation-based strategy as a heuristic for fairness metric selection, applying bootstrap sampling with the Markov chain Monte Carlo technique and introducing three improvements: stratified sampling, a stopping criterion, and Kendall correlation, which address biased data representation, computational cost, and robustness, respectively. The method achieved an average reduction of 64.37% in the number of models and of 20.00% in processing time. Moreover, it effectively paired metrics with similar behaviour, highlighting the presence of a shared term as a strong indicator of a direct relationship. While no single metric stands out across all contexts, certain metrics consistently stand out within specific models or datasets. In a complex scenario using a large language model for sexism detection, the method achieved a 71.93% reduction in execution time while forming more comprehensive metric groups. Overall, the proposed method selects representative metrics at a considerably lower computational cost, demonstrating its practicality for real-world applications.
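As a rough illustration of the correlation-based grouping idea summarised above, the sketch below pairs fairness metrics whose bootstrap estimates move together under Kendall correlation and keeps one representative per group. The metric names, the 0.8 threshold, and the greedy grouping rule are illustrative assumptions, not the paper's exact procedure.

```python
import random

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs.
    Ties are ignored, which is adequate for continuous metric values."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def group_metrics(samples, threshold=0.8):
    """Greedily merge metrics whose |tau| against a group's first
    member (its representative) meets the threshold."""
    groups = []
    for name, values in samples.items():
        for g in groups:
            if abs(kendall_tau(samples[g[0]], values)) >= threshold:
                g.append(name)
                break
        else:
            groups.append([name])
    return groups

# Toy bootstrap estimates: 50 resamples per metric (names are illustrative).
random.seed(0)
base = [random.random() for _ in range(50)]
samples = {
    "statistical_parity": base,
    "disparate_impact": [2 * v + 0.1 for v in base],  # monotone in base -> tau = 1
    "equalized_odds": [random.random() for _ in range(50)],  # unrelated noise
}
groups = group_metrics(samples)
print(groups)  # the two monotonically related metrics share a group
```

The first member of each group serves as its representative, so strongly correlated metrics need not all be computed in later evaluations.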