BIFOLD Colloquium 03/2025

May 13, 2025, 16:00 - 17:00

Technische Universität Berlin, MAR Building, Marchstraße 23, 10587 Berlin, Room 2.013

Prof. Dr. Felix Naumann, Hasso-Plattner-Institut

Data Quality in the Age of AI

Abstract: Data quality comprises a large set of dimensions, covering many facets including simple statistics, syntactic problems, factual errors, and organizational and business aspects. With the current trend in data-oriented sciences and the increasing reliance on machine learning methods and AI systems, the challenges of poor data quality are ever more apparent.
Even recent legislation, such as the EU AI Act, mentions data quality requirements for training data. With it, the notion of data quality extends to novel dimensions, such as fairness, diversity, or explainability. In the talk we shall highlight research in this field and point out current challenges and research opportunities.

Bio: Felix Naumann studied mathematics, economics, and computer science at TU Berlin and completed his PhD thesis in the area of data quality at Humboldt University of Berlin in 2000. After a postdoc position at the IBM Almaden Research Center working on data integration topics, he became assistant professor for information integration, again at Humboldt University of Berlin, in 2003. Since 2006 he has held the chair for Information Systems at the Hasso Plattner Institute (HPI) at the University of Potsdam in Germany. He has been a visiting researcher at QCRI, AT&T Research, IBM Research, and SAP. His research interests include data profiling, data quality and cleansing, and data integration, documented in over 200 scientific publications. In addition to numerous PC memberships for international conferences, he has organized several conferences in various roles, including VLDB 2021 as PC co-chair, and he is the Editor-in-Chief of the ACM Journal of Data and Information Quality (JDIQ). More details are at https://hpi.de/naumann/people/felix-naumann.html.