October 16, 2020

Prof. Dr. Volker Markl

BIFOLD database systems research papers were accepted at CIDR 2021

Researchers at the Database Systems and Information Management (DIMA) group at TU Berlin and the Intelligent Analytics for Massive Data (IAM) group at DFKI have been informed that their papers were accepted for presentation at the 11th Annual Conference on Innovative Data Systems Research (CIDR ’21), which will be held as a virtual event on January 11-15, 2021.

The vision paper “The Case for Distance-Bounded Spatial Approximations” by Eleni Zacharatou, Andreas Kipf, Ibrahim Sabek, Varun Pandey, Harish Doraiswamy and Volker Markl advocates for approximate spatial data processing techniques that omit exact geometric tests and provide final answers solely on the basis of (fine-grained) approximations. Thanks to recent hardware advances, this vision can be realized today. Furthermore, these approximate techniques employ a distance-based error bound, i.e., a bound on the maximum spatial distance between false (or missing) and exact results, which is crucial for meaningful analyses. This bound makes it possible to control the precision of the approximation and to trade accuracy for performance.
A preprint version is available here.
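
To make the idea of a distance-based error bound concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method from the paper: the uniform grid, the unit-disk region, and the names build_grid_approximation and approx_contains are all hypothetical. The region is rasterized once into cells whose diagonal equals a chosen bound epsilon, and point queries are then answered by cell lookup alone, without an exact geometric test.

```python
import math


def build_grid_approximation(contains, bbox, epsilon):
    """Rasterize a region into grid cells whose diagonal equals `epsilon`.

    `contains(x, y)` is the exact test; it is evaluated once per cell
    center at build time and never again at query time.
    """
    min_x, min_y, max_x, max_y = bbox
    side = epsilon / math.sqrt(2)  # cell diagonal == epsilon
    cols = math.ceil((max_x - min_x) / side)
    rows = math.ceil((max_y - min_y) / side)
    cells = set()
    for i in range(cols):
        for j in range(rows):
            cx = min_x + (i + 0.5) * side
            cy = min_y + (j + 0.5) * side
            if contains(cx, cy):
                cells.add((i, j))
    return cells, side


def approx_contains(cells, side, bbox, x, y):
    """Answer a point query from the grid only (no exact geometry)."""
    i = math.floor((x - bbox[0]) / side)
    j = math.floor((y - bbox[1]) / side)
    return (i, j) in cells


def in_unit_disk(x, y):
    """Exact test for the illustrative region: the unit disk."""
    return x * x + y * y <= 1.0


# Illustrative setup: approximate the unit disk with epsilon = 0.1.
epsilon = 0.1
bbox = (-1.0, -1.0, 1.0, 1.0)
cells, side = build_grid_approximation(in_unit_disk, bbox, epsilon)

# A query point far from the boundary is answered correctly.
print(approx_contains(cells, side, bbox, 0.0, 0.0))
```

In this sketch, a point can only be misclassified if it shares a cell with a center on the other side of the region's boundary, so every false or missing result lies within epsilon of that boundary. A smaller epsilon gives a finer grid and more precise answers at the cost of more cells, which is the accuracy-for-performance trade-off described above.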

The demo paper “Semi-Supervised Data Cleaning with Raha and Baran” by Mohammad Mahdavi and Ziawash Abedjan demonstrates how two previously developed systems, Raha and Baran, can be used within an end-to-end data cleaning pipeline. In practice, with a small set of just 20 user-annotated tuples, it is possible to effectively identify and fix data quality problems inside a dataset. Furthermore, both systems benefit from knowledge gained in prior cleaning tasks. Using transfer learning, both systems can optimize the data cleaning task at hand in terms of error detection runtime and error correction effectiveness.
A preprint version is available here.

To learn more about CIDR 2021, please visit