Data Discovery in Data Lakes: Operations, Indexes, Systems

Ziawasch Abedjan
Mahdi Esmailoghli
Sainyam Galhotra

September 01, 2025

Data discovery has gained significant traction in the database community, resulting in various discovery operations, index schemes, and discovery systems. This tutorial explores the architecture and components of data discovery systems, focusing on indexing structures and scalable algorithms for typical operations, such as join and union discovery. While giving insights into individual algorithms, we point out open challenges for holistic systems, data discovery evaluation, and discovery in federated setups.