This agility project investigates whether the Clever Hans effect (a model's reliance on artifacts for prediction) occurs beyond the well-studied supervised learning setting. While supervised Clever Hans effects arise mainly from spurious correlations between data and labels, the mechanisms by which the effect arises in unsupervised learning are poorly understood. This project aims to develop theory and analysis tools based on Explainable AI to systematically uncover, diagnose, and overcome unsupervised Clever Hans effects.
Director
Research Junior Group Lead
Fellow