From foundational research to applied systems: BIFOLD presents diverse and high-impact contributions at the forefront of data management science
At VLDB 2025, held from September 1 to 5, 2025, in London, United Kingdom, BIFOLD researchers will contribute 15 presentations in various forms: 3 research papers, 5 demonstrations, 1 tutorial, 5 workshops, and 1 workshop keynote.
Beyond their scientific achievements, BIFOLD researchers help coordinate the conference. Matthias Boehm serves as Associate Editor and received the Distinguished Associate Editor Award. Ziawasch Abedjan, Patrick Damme, and Steffen Zeuch hold review board positions. Patrick Damme, from the DAMS research group, also earned the VLDB 2025 Distinguished Reviewer Award.
The contributions involve several BIFOLD research groups: DIMA – Database Systems and Information Management (Volker Markl), DEEM – Management of Data Science Processes (Sebastian Schelter), DAMS – Big Data Engineering (Matthias Boehm), D2IP – Data Integration and Data Preparation (Ziawasch Abedjan), and MLSEC – Machine Learning and Security (Konrad Rieck).
The VLDB conference is a premier global forum for innovations in data management. It brings together researchers, developers, and practitioners from academia and industry to discuss topics such as database systems, scalable analytics, data privacy, distributed systems, machine learning for data management, and field-specific data architectures.
BIFOLD is honored to contribute to the international data management community through its active participation in VLDB 2025.
Below is an overview of BIFOLD’s contributions:
- Unraveling the Impact of Window Semantics: Optimizing Join Order for Efficient Stream Processing
- Ariane Ziehn (Technische Universität Berlin); Jan Szlang (Snowflake); Steffen Zeuch (TU Berlin); Volker Markl (Technische Universität Berlin)
- [Link]
- CatDB: Data-catalog-guided, LLM-based Generation of Data-centric ML Pipelines
- Saeed Fathollahzadeh (Concordia University); Essam Mansour (Concordia University); Matthias Boehm (Technische Universität Berlin)
- [Link]
- Deduplicated Sampling On-Demand
- RadlER: Deduplicated Sampling On-Demand
- Luca Zecchini (BIFOLD & TU Berlin); Ziawasch Abedjan (BIFOLD & TU Berlin); Vasilis Efthymiou (Harokopio University of Athens); Giovanni Simonini (University of Modena and Reggio Emilia)
- [Code]
- Demonstrating Matelda for Multi-Table Error Detection
- Fatemeh Ahmadi (Technische Universität Berlin - BIFOLD); Julian Paulußen (Technische Universität Berlin - BIFOLD); Ziawasch Abedjan (Technische Universität Berlin - BIFOLD)
- [Link]
- APEX-DAG: Library and Language independent Pipeline EXtraction
- Sebastian Eggers (BIFOLD/TU Berlin); Nina Żukowska (BIFOLD/TU Berlin); Ziawasch Abedjan (BIFOLD/TU Berlin)
- [Code]
- Enter the Warp: Fast and Adaptive Data Transfer with XDBC
- Haralampos Gavriilidis (Technische Universität Berlin); Joel Ziegler (Technische Universität Berlin); Midhun Kaippillil Venugopalan (Technische Universität Berlin); Benedikt Didrich (Technische Universität Berlin); Matthias Boehm (Technische Universität Berlin); Volker Markl (Technische Universität Berlin)
- [Link]
- mlidea: Interactively Improving ML Data Preparation Code via 'Shadow Pipelines'
- Stefan Grafberger (BIFOLD & TU Berlin), Paul Groth (University of Amsterdam), Sebastian Schelter (BIFOLD & TU Berlin)
- [Link]
- Data Discovery in Data Lakes: Operations, Indexes, Systems
- Ziawasch Abedjan (BIFOLD/TU Berlin); Mahdi Esmailoghli (HU Berlin); Sainyam Galhotra (Cornell University)
- 16th International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures
- GPU-Accelerated Stochastic Gradient Descent for Scalable Operator Placement in Geo-Distributed Streaming Systems
- Tristan Joel Terhaag (Technische Universität Berlin); Xenofon Chatziliadis (Technische Universität Berlin); Eleni Tzirita Zacharatou (Hasso Plattner Institute, University of Potsdam); Volker Markl (Technische Universität Berlin)
- [Link]
- 6th International Workshop on Applied AI for Database Systems and Applications
- Learning to Accelerate: Tuning Data Transfer Parameters
- Benedikt Didrich, Haralampos Gavriilidis, Vasilis Gkolemis, Matthias Boehm, Volker Markl
- [Link]
- 3rd International Workshop on Composable Data Management Systems
- Composability and Interoperability for Federated Data Systems
- Haralampos Gavriilidis, Leonhard Rose, Joel Ziegler, Jonathan Gerloff, Benedikt Didrich, Midhun Kaippillil Venugopalan, Kaustubh Beedkar, Matthias Boehm, Volker Markl
- [Link]
- GuideAI
- Towards Identifying Intent of Data Errors
- Mohamed Ahmed Abdelmaksoud Mohamed, Konrad Rieck, Ziawasch Abedjan
- [Link]
- 1st Workshop on New Ideas for Large-Scale Neurosymbolic Learning Systems
- Modular Neuro-Symbolic Knowledge Graph Completion
- Abelardo Carlos Martinez Lorenzo, Alexander Perfilyev, Volker Markl, Martha Clokie, Thomas Sicheritz-Pontén, Zoi Kaoudi
- DATAI Workshop - 2nd Workshop on Data driven AI
- Keynote: Ziawasch Abedjan
- Navigating Disruption: The Impact of AI Technologies on Data Integration Research
- [Link]