Providing novel insights from foundational research on data management and data engineering

The annual ACM SIGMOD/PODS Conference is a leading international forum for database researchers, practitioners, developers, and users to explore cutting-edge ideas and results, and to exchange techniques, tools, and experiences. Eight members of the BIFOLD team showcased their recent work at SIGMOD 2023 in Seattle through a diverse array of presentations, including research papers, workshop papers, and a demo paper, all of them underscoring the institute's commitment to advancing research in the field of data management.

The BIFOLD highlight of the conference was the acceptance of the ACM SIGMOD Systems Award by BIFOLD director Prof. Dr. Volker Markl. This prestigious award was conferred in recognition of Volker Markl's pivotal contributions to the open-source big data stream analytics platform Apache Flink, a transformative technology in the world of data processing. In another recognition, Prof. Dr. Matthias Böhm was honored as SIGMOD Distinguished Associate Editor (AE) and Program Committee (PC) member.

At the Data Management for End-to-End Machine Learning (DEEM) workshop, Haralampos Gavriilidis presented the paper "P2D: A Transpiler Framework for Optimizing Data Science Pipelines." The paper addresses the inefficiency of pre-processing operations, a crucial step in data science pipelines, which currently do not fully leverage the capabilities of database management systems (DBMSes) as backends. To optimize the pre-processing step, the authors propose a transpilation-based approach that uses static code analysis to detect operations and push them down to DBMS backends.
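To make the general idea of push-down concrete, here is a minimal sketch. It is not the P2D transpiler and does not use its API; SQLite stands in for the DBMS backend, and the table name `trips` and all function names are hypothetical. It only contrasts filtering rows in the client with letting the database evaluate the same filter.

```python
import sqlite3


def build_demo_db():
    # Hypothetical in-memory table standing in for a DBMS backend.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trips (id INTEGER, distance_km REAL)")
    conn.executemany(
        "INSERT INTO trips VALUES (?, ?)",
        [(1, 2.5), (2, 12.0), (3, 30.5), (4, 0.8)],
    )
    return conn


def filter_in_client(conn, min_km):
    # Without push-down: fetch every row, then filter in Python.
    rows = conn.execute(
        "SELECT id, distance_km FROM trips ORDER BY id"
    ).fetchall()
    return [row for row in rows if row[1] >= min_km]


def filter_pushed_down(conn, min_km):
    # With push-down: the filter becomes a WHERE clause, so only
    # matching rows leave the database.
    return conn.execute(
        "SELECT id, distance_km FROM trips WHERE distance_km >= ? ORDER BY id",
        (min_km,),
    ).fetchall()
```

Both functions return the same rows; the difference is where the filter runs and how much data is transferred to the client. A transpiler would, roughly speaking, aim to derive the second form from code written in the first style.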

Nils L. Schubert, Philipp M. Grulich, Steffen Zeuch, and Volker Markl shared their insights in the DIMA paper "Exploiting Access Pattern Characteristics for Join Reordering," presented at the Data Management on New Hardware (DaMoN) Workshop. The paper examines the memory access pattern of intermediate join state, an often neglected performance factor. Based on this analysis, the authors propose a novel join reordering algorithm that detects the memory access pattern and adapts the join order accordingly at runtime.

Kajetan Maliszewski introduced "TeeBench: Seamless Benchmarking in Trusted Execution Environments." The framework enables researchers to seamlessly benchmark and evaluate custom implementations of relational operators in trusted execution environments (TEEs). TeeBench comes with a user-friendly graphical user interface and a novel TEE-Analyzer that points the user to performance bottlenecks and suggests possible code improvements. Video: https://dl.acm.org/doi/10.1145/3555041.3589726 (15:46 min) Poster: https://kai-chi.github.io/assets/2023-sigmod-poster.jpg

Sebastian Baunsgaard presented "AWARE: Workload-aware, Redundancy-exploiting Linear Algebra" at a SIGMOD research session. The paper addresses the limitations of compressed linear algebra (CLA) with a workload-aware compression framework that includes a wide range of new compression schemes and kernels. Instead of using a data-centric approach that optimizes compression ratios, the workload-aware compression summarizes the workload of an ML pipeline and optimizes the compression and execution schedule to minimize execution time. Video: https://dl.acm.org/doi/10.1145/3588682 (12:39 min).

In addition, BIFOLD researchers presented the following papers at SIGMOD 2023: 

  • Research paper: Yancan Mao, Jianjun Zhao, Shuhao Zhang, Haikun Liu, and Volker Markl. 2023. MorphStream: Adaptive Scheduling for Scalable Transactional Stream Processing on Multicores. Proc. ACM Manag. Data 1, 1, Article 59 (May 2023), 26 pages. https://doi.org/10.1145/3588913.
  • Research paper: Saeed Fathollahzadeh and Matthias Böhm. 2023. GIO: Generating Efficient Matrix and Frame Readers for Custom Data Formats by Example. Proc. ACM Manag. Data 1, 2, Article 120 (June 2023), 26 pages. https://doi.org/10.1145/3589265.
  • Tutorial: Matthias Böhm, Matteo Interlandi, and Chris Jermaine. 2023. Optimizing Tensor Computations: From Applications to Compilation and Runtime Techniques. In Companion of the 2023 International Conference on Management of Data (SIGMOD-Companion ’23), June 18–23, 2023, Seattle, WA, USA. ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/3555041.3589407.