In search of algorithmic fairness

Artificial intelligence (AI) has found its way into many work routines, be it the development of hiring procedures, the granting of loans, or even law enforcement. However, the machine learning (ML) systems behind these procedures repeatedly attract attention for distorting results or even discriminating against people on the basis of gender or race. “Accuracy is one essential factor of machine learning models, but fairness and robustness are at least as important,” says Felix Neutatz, a BIFOLD doctoral student in the group of Prof. Dr. Ziawasch Abedjan, BIFOLD researcher and former professor at TU Berlin who recently moved to Leibniz Universität Hannover. Together with Ricardo Salazar Diaz, they published “Automated Feature Engineering for Algorithmic Fairness”, a paper on the fairness of machine learning models, in the Proceedings of the VLDB Endowment.

Algorithms might reinforce biases against groups of people that have been historically discriminated against. Examples include gender bias in machine learning applications on online advertising or recruitment procedures.

The paper presented at VLDB 2021 specifically considers algorithmic fairness. “Previous machine learning models for hiring procedures usually discriminate systematically against women,” says Felix Neutatz. “Why? Because they learn from old datasets derived from times when fewer women were employed.” Currently, there are several ways to improve the fairness of such algorithmic decisions. One is to specify that attributes such as gender, race or age are not to be considered in the decision. However, it turns out that other attributes also allow conclusions to be drawn about these sensitive characteristics, so removing the sensitive attributes alone does not remove the bias.
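To make the point about indirect conclusions concrete, the following is a minimal sketch, not taken from the paper. It assumes a tiny invented dataset in which a hypothetical attribute `part_time` happens to track gender while `years_exp` does not, and uses a plain Pearson correlation as a rough indicator of how much an attribute reveals about the sensitive one.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented toy data for illustration only (assumption: 1 = woman, 0 = man).
gender = [1, 1, 1, 0, 0, 0]
part_time = [1, 1, 0, 0, 0, 0]  # hypothetical attribute that tracks gender
years_exp = [3, 5, 4, 4, 3, 5]  # hypothetical attribute unrelated to gender

proxy_strength = pearson(gender, part_time)
neutral_strength = pearson(gender, years_exp)
print(f"part_time vs. gender: {proxy_strength:.2f}")
print(f"years_exp vs. gender: {neutral_strength:.2f}")
```

In this toy data, a model that never sees `gender` could still pick up much of it through `part_time`. Correlation is only one simple way to look for such proxies; it is used here because it needs no external libraries.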

The state-of-the-art bias reduction algorithms simply drop sensitive features and create new artificial non-sensitive instances to counterbalance the loss in the dataset. In the case of recruiting procedures, this would mean adding large amounts of artificially generated data from hypothetical female employees to the training dataset. While this approach successfully removes bias, it might lead to fairness overfitting and is likely to affect classification accuracy because of potential information loss.

“There are several important metrics that determine the quality of machine learning models,” says Felix Neutatz. “These include, for example, privacy, robustness to external attacks, interpretability, and also fairness. The goal of our research is to automatically influence and balance these metrics.”

The researchers developed a new approach that addresses the problem with a feature-wise strategy. “To achieve both high accuracy and fairness, we propose to extract as much unbiased information as possible from all features using feature construction (FC) methods that apply non-linear transformations. We use FC first to generate more possible candidate features and then drop sensitive features and optimize for fairness and accuracy,” explains Felix Neutatz. “If we stick to the example of the hiring process, each employee has different attributes depending on the dataset, such as gender, age, experience, education level, hobbies, etc. We generate many new attributes from these real attributes by a large number of transformations. For example, such a new attribute is generated by dividing age by gender or multiplying experience by education level. We show that we can extract unbiased information from biased features by applying human-understandable transformations.”
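The feature construction step described in the quote can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `construct_features`, the numeric coding of gender, and the restriction to pairwise products and ratios are assumptions made for this example. The sketch generates candidate features from all attributes, including the sensitive one, and then drops the raw sensitive attribute.

```python
from itertools import combinations


def construct_features(rows, sensitive):
    """Add pairwise product and ratio features, then drop raw sensitive ones.

    rows: list of dicts that map attribute names to numbers.
    sensitive: set of attribute names that must not remain as raw features.
    """
    names = list(rows[0].keys())
    constructed = []
    for row in rows:
        # Keep only the non-sensitive original attributes.
        new_row = {k: v for k, v in row.items() if k not in sensitive}
        for a, b in combinations(names, 2):
            new_row[f"{a}*{b}"] = row[a] * row[b]
            # A ratio is undefined for a zero denominator; mark it as missing.
            new_row[f"{a}/{b}"] = row[a] / row[b] if row[b] != 0 else None
        constructed.append(new_row)
    return constructed


# Invented applicants; the coding of gender as 1 or 2 is an assumption.
applicants = [
    {"age": 30, "gender": 1, "experience": 5, "education": 2},
    {"age": 45, "gender": 2, "experience": 20, "education": 3},
]
candidates = construct_features(applicants, sensitive={"gender"})
print(sorted(candidates[0]))
```

The constructed attributes `age/gender` and `experience*education` correspond to the two examples named in the quote. Whether a given constructed attribute is actually unbiased is not decided here; that is the job of the subsequent selection step, which evaluates candidates for both fairness and accuracy.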

Finding a unique feature set that optimizes the trade-off between fairness and accuracy is challenging. In their paper, the researchers not only demonstrate a way to extract unbiased information from biased features; they also propose an approach in which the ML system and the user collaborate to balance the trade-off between accuracy and fairness, and they validate this approach in a series of experiments on known datasets.
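One common way to think about such a trade-off is a Pareto front: the set of candidates for which no other candidate is at least as good on both objectives and strictly better on one. The sketch below is an illustration under assumptions, not the procedure from the paper. The feature-set names and all scores are invented, and the helper `choose` stands in for the user's part of the collaboration by picking the most accurate candidate above a fairness threshold the user sets.

```python
def pareto_front(candidates):
    """Return the candidates that no other candidate dominates."""
    front = []
    for c in candidates:
        dominated = any(
            o["accuracy"] >= c["accuracy"]
            and o["fairness"] >= c["fairness"]
            and (o["accuracy"] > c["accuracy"] or o["fairness"] > c["fairness"])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front


def choose(front, min_fairness):
    """Pick the most accurate candidate that meets the fairness threshold."""
    eligible = [c for c in front if c["fairness"] >= min_fairness]
    return max(eligible, key=lambda c: c["accuracy"]) if eligible else None


# Invented scores for hypothetical feature sets; higher is better for both.
feature_sets = [
    {"name": "A", "accuracy": 0.85, "fairness": 0.60},
    {"name": "B", "accuracy": 0.80, "fairness": 0.90},
    {"name": "C", "accuracy": 0.78, "fairness": 0.85},  # dominated by B
    {"name": "D", "accuracy": 0.70, "fairness": 0.95},
]
front = pareto_front(feature_sets)
picked = choose(front, min_fairness=0.88)
print([c["name"] for c in front], picked["name"])
```

Here the system narrows the options to the non-dominated feature sets, and the user's threshold decides among them. Raising or lowering `min_fairness` moves the choice along the front.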

The publication in detail:

Ricardo Salazar, Felix Neutatz, Ziawasch Abedjan: Automated Feature Engineering for Algorithmic Fairness. PVLDB 14(9): 1694-1702 (2021).

One of the fundamental problems of machine ethics is to avoid the perpetuation and amplification of discrimination through machine learning applications. In particular, it is desired to exclude the influence of attributes with sensitive information, such as gender or race, and other causally related attributes on the machine learning task. The state-of-the-art bias reduction algorithm Capuchin breaks the causality chain of such attributes by adding and removing tuples. However, this horizontal approach can be considered invasive because it changes the data distribution. A vertical approach would be to prune sensitive features entirely. While this would ensure fairness without tampering with the data, it could also hurt the machine learning accuracy. Therefore, we propose a novel multi-objective feature selection strategy that leverages feature construction to generate more features that lead to both high accuracy and fairness. On three well-known datasets, our system achieves higher accuracy than other fairness-aware approaches while maintaining similar or higher fairness.