
Prof. Dr. Matthias Böhm
Research Group Lead | BIFOLD
Full Professor and Chair: Big Data Engineering
Before joining BIFOLD, Matthias Boehm was a BMK-endowed professor for data management at Graz University of Technology, Austria, and a research area manager for data management at the co-located Know-Center GmbH. His cross-organizational research group focuses on high-level, data science-centric abstractions as well as systems and tools to execute data science tasks in an efficient and scalable manner. Prior to joining TU Graz in 2018, he was a research staff member at IBM Research - Almaden, CA, USA, with a major focus on compilation and runtime techniques for declarative, large-scale machine learning in Apache SystemML. Matthias received his Ph.D. from Dresden University of Technology, Germany, in 2011 with a dissertation on cost-based optimization of integration flows. His previous research also includes systems support for time series forecasting as well as in-memory indexing and query processing. Matthias is a recipient of the 2016 VLDB Best Paper Award, a 2016 SIGMOD Research Highlight Award, a 2016 IBM Pat Goldberg Memorial Best Paper Award, and the 2021 SIGMOD DS&E Best Paper Award.
Current Projects: Apache SystemDS (an open source ML system for the end-to-end data science lifecycle), ExDRa (exploratory data science and federated ML over raw data, w/ Siemens, DFKI, and TU Berlin), DAPHNE (an open and extensible system infrastructure for integrated data analysis pipelines, w/ AVL, DLR, ETH Zurich, HPI Potsdam, ICCS, Infineon, Intel, ITU Copenhagen, KAI, TU Dresden, Uni Maribor, Uni Basel), and ReWaste F (recycling and recovery of waste for future, w/ 4 scientific and 14 industrial partners).
2021 | SIGMOD DS&E Best Paper Award
2016 | IBM Pat Goldberg Memorial Best Paper Award
2016 | SIGMOD Research Highlight Award
2016 | VLDB Best Paper Award
- System-oriented research for the end-to-end data science lifecycle, from data integration, preparation, and cleaning, through efficient ML training, to model debugging and deployment,
- Large-scale, distributed machine learning and data management,
- Query optimization (in ML systems, integration systems, database systems), and
- In-memory indexing, query processing, and high-performance computing.