Fairness and Robustness in Machine Learning

Author:
  1. Khandpur Singh, Ashneet
Supervised by:
  1. Josep Domingo Ferrer (Director)
  2. Alberto Blanco Justicia (Director)

Defence university: Universitat Rovira i Virgili

Date of defence: 18 April 2023

Committee:
  1. Pino Caballero Gil (Chair)
  2. Maria Bras Amorós (Secretary)
  3. Javier Parra Arnau (Committee member)

Type: Thesis

Teseo: 807226 · DIALNET · TDX

Abstract

The rise of the IoT and other distributed environments is causing an increase in the number of devices that constantly collect and exchange data. Machine learning (ML) models learn from these data to model concrete environments and problems and to predict future events, but if the data are biased, they may reach biased conclusions. Such models can be used to make essential and life-changing decisions in a variety of sensitive contexts. It is therefore critical to ensure that their predictions are fair and not based on discrimination against specific groups or communities, such as those of a particular race, gender, or sexual orientation.

Federated learning (FL), a type of distributed machine learning, has become one of the foundations of next-generation AI in distributed settings and needs to be equipped with techniques to tackle this grand and interdisciplinary challenge. Even though FL provides stronger privacy guarantees to the participating clients than centralized learning, in which the clients' raw data are collected in a central server, it is vulnerable to attacks whereby malicious clients submit bad updates in order to prevent the model from converging or, more subtly, to introduce artificial biases in the model's predictions or decisions (poisoning). Poisoning detection techniques compute statistics on the updates sent by participants in order to identify malicious clients.

A downside of anti-poisoning techniques is that they may discriminate against minority groups whose data are significantly and legitimately different from those of the majority of clients. This would not only be unfair but would also yield poorer models that fail to capture the knowledge in the training data, especially when data are not independent and identically distributed (non-i.i.d.). In this work, we strive to strike a balance between fighting poisoning and accommodating diversity, in order to learn fairer and less discriminatory federated learning models.
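The tension described above can be made concrete with a minimal sketch of statistics-based poisoning detection. This is not the detection rule used in the thesis; the MAD-based norm test, the threshold, and all names below are illustrative assumptions. The point is that any such outlier test flags whatever deviates from the majority, whether that is an attacker or a legitimate minority client:

```python
# Illustrative sketch only (not the thesis's detection mechanism):
# a simple robust-statistics rule that flags clients whose update norm
# deviates strongly from the median norm across all submitted updates.
import numpy as np

def flag_suspicious_updates(updates, threshold=3.5):
    """Return indices of updates whose L2 norm is a robust outlier.

    `updates` is a list of 1-D numpy arrays (flattened model updates);
    `threshold` is expressed in median-absolute-deviation (MAD) units.
    """
    norms = np.array([np.linalg.norm(u) for u in updates])
    median = np.median(norms)
    mad = np.median(np.abs(norms - median)) + 1e-12  # avoid division by zero
    scores = np.abs(norms - median) / mad
    return [i for i, s in enumerate(scores) if s > threshold]

# Honest clients send similar-sized updates; one client sends a
# scaled-up (poisoned) update that dominates the aggregate.
honest = [np.random.default_rng(i).normal(0, 1, 100) for i in range(9)]
attacker = 50 * np.random.default_rng(99).normal(0, 1, 100)
flagged = flag_suspicious_updates(honest + [attacker])
print(flagged)  # index 9 (the attacker) is among the flagged clients
```

A legitimate minority client with genuinely different data would produce an update just as far from the median, and would be flagged by the same rule, which is exactly the fairness problem this work addresses.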
In this way, we forestall the exclusion of diverse clients while still ensuring the detection of poisoning attacks. Additionally, we explore the impact of our proposal on the performance of models trained on non-i.i.d. local data. Furthermore, to develop fair models and verify their fairness, we propose a method based on counterfactual examples that detects bias in an ML model regardless of the data type used.

Objectives: Our contributions are mechanisms to reconcile security with fairness in FL on non-i.i.d. data, and a method based on counterfactual examples that detects bias in ML models.
- We propose three methods to distinguish members of minority groups from attackers: a first method based on microaggregation, a second one that uses Gaussian mixture models (GMMs), and a third one based on DBSCAN.
- We propose a method based on counterfactual examples (CEs) that detects bias regardless of the data type, in particular for image and tabular data.

Material and methods: This work was implemented in Python, in two interactive environments: Jupyter Notebook and Google Colaboratory. Python libraries and modules such as pandas, NumPy, Matplotlib, and scikit-learn were used at different stages. For the experimental part, standard machine learning datasets were used. We conducted experiments to examine the effectiveness of our proposed mechanisms in FL with minority groups and non-i.i.d. data. To that end, we chose three publicly available datasets: (i) the Adult Income dataset, (ii) the Athletes dataset, and (iii) the Bank Marketing dataset. We evaluated the performance of the proposed bias-detection approach on two ML tasks: tabular data classification (on the Adult dataset) and image classification (on the CelebA dataset).
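The clustering intuition behind the GMM- and DBSCAN-based methods can be sketched as follows: legitimate minority clients form their own small but dense cluster of updates, whereas an isolated poisoned update does not belong to any cluster. The synthetic updates, `eps`, and `min_samples` values below are assumptions for illustration, not the thesis's actual configuration:

```python
# Hypothetical sketch of the clustering idea (not the thesis's code):
# cluster flattened client updates with DBSCAN. A legitimate minority
# group forms its own dense cluster, while a lone poisoned update is
# left unclustered (DBSCAN noise label -1).
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
majority = rng.normal(0.0, 0.1, size=(20, 5))   # 20 majority clients
minority = rng.normal(3.0, 0.1, size=(5, 5))    # 5 legitimate minority clients
attacker = rng.normal(10.0, 0.1, size=(1, 5))   # 1 isolated poisoned update

updates = np.vstack([majority, minority, attacker])
labels = DBSCAN(eps=1.0, min_samples=3).fit_predict(updates)
print(labels)  # two cluster labels plus -1 (noise) for the attacker
```

Treating small dense clusters as legitimate (rather than discarding everything outside the largest cluster) is what lets an aggregation rule keep minority updates while still rejecting the isolated attacker.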
For each task, we trained a baseline model on the original dataset and a biased model on a deliberately altered version of the dataset. For both datasets, the baseline and biased models had the same architecture.

Conclusions: In this work we have developed methods for fair and robust ML. We want to ensure that no minority in the dataset is unfairly impacted by the model's predictions, and we propose methods that tackle this situation. We have dealt with the problem of distinguishing abnormal/malicious behaviors from legitimate ones in federated learning, focusing on scenarios where clients hold legitimate minority data whose updates are likely to be classified as outlying/malicious by the standard attack detection mechanisms proposed in the literature. To make progress towards fair attack detection, we propose three different methods: one based on microaggregation, another based on the Gaussian mixture model, and a third based on DBSCAN.

To measure fairness in general ML models, we propose a method based on generating counterfactual examples. For tabular data, we create these counterfactual examples by means of adversarial examples: scenarios that are similar to real-life situations but slightly altered so as to force our models to make an incorrect prediction. They are also useful for testing the robustness of our models. For image data, we generate counterfactual examples with GANs, which likewise contribute to the robustness of our models. The results show that the biased models were indeed biased against the targeted individuals in the datasets.
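The adversarial route to counterfactuals for tabular data can be illustrated with a minimal sketch. This is not the thesis's generation procedure: the synthetic dataset, the logistic-regression surrogate, and the signed-gradient step (FGSM-style) are all stand-in assumptions chosen so the example stays self-contained:

```python
# Hedged illustration: perturb a tabular point along the sign of the
# model's score gradient until the predicted label flips, yielding a
# nearby "counterfactual" version of the input.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)      # synthetic ground truth
clf = LogisticRegression().fit(X, y)

def counterfactual(x, clf, step=0.05, max_iter=200):
    """Nudge x across the decision boundary via signed-gradient steps."""
    x_cf = x.copy()
    target = 1 - clf.predict(x.reshape(1, -1))[0]
    for _ in range(max_iter):
        if clf.predict(x_cf.reshape(1, -1))[0] == target:
            break
        # For logistic regression the score gradient w.r.t. x is coef_,
        # so the fastest direction toward class 1 is +sign(coef_).
        direction = np.sign(clf.coef_[0]) if target == 1 else -np.sign(clf.coef_[0])
        x_cf = x_cf + step * direction
    return x_cf

x = np.array([-1.0, -1.0])                   # confidently predicted class 0
x_cf = counterfactual(x, clf)
print(clf.predict(x.reshape(1, -1))[0], clf.predict(x_cf.reshape(1, -1))[0])
```

Comparing the original point with its counterfactual shows how small a change flips the decision; if that change concentrates on a sensitive attribute, the model's reliance on it becomes visible, which is the bias-detection use made of counterfactuals in this work.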