Special issue on “Machine learning and databases”

Matthias Boehm
Nesime Tatbul

April 17, 2024

Recent advances in machine learning (ML) techniques have led to an explosion in their adoption across all fields of computer science, including database (DB) systems and data management. Meanwhile, end-to-end ML pipelines built for diverse data-driven applications are becoming increasingly data-centric, presenting new challenges and opportunities in data science and engineering. From data collection and preparation to model training and deployment, efficient access to high-quality data and models forms a critical component of the iterative lifecycle of these pipelines. Furthermore, new synergies arise in applying ML techniques to improve DB system internals and to specialize their functionality and performance for new data and query workload characteristics, a critical need in increasingly complex deployment environments such as disaggregated cloud data centers or heterogeneous hardware settings.