Student Assistant – Data Integration & Machine Learning Research at BIFOLD / TU Berlin (m/f/d)
Support ML & data lake research
About us
The Data Integration and Data Preparation Group (D2IP) at TU Berlin conducts research at the intersection of data management, data engineering, and machine learning. The goal of our work is to develop efficient and easy-to-use data management systems for data science and machine learning applications, while ensuring high data quality and scalability when working with large and heterogeneous data. We are seeking support for the BIFOLD-associated research project "DFG TDC 2". The goal of the project is the development of methods for curating and improving large data lakes to instrument data science pipelines.
Your responsibility
- Support with implementing features and optimizations (50%)
- Support in the development of experimental benchmarks (20%)
- Support with the editorial processing of scientific texts (30%)
Your profile
Mandatory criteria:
- Solid knowledge of machine learning (concepts, models, and workflows)
- Solid knowledge of data integration concepts
- Very good knowledge of Python
- Experience using libraries (scikit-learn, pandas, polars, or PyTorch)
- Good knowledge of scalable data processing
- Experience contributing to data systems or machine learning frameworks
- Good knowledge of German and/or English required; willingness to acquire the respective missing language skills
Optional criteria:
- Prior exposure to open-source development practices or contributions to open-source projects
How to apply
Party responsible for specialist area / point of contact for job posting: Prof. Dr. Ziawasch Abedjan
Period of employment: immediately, until 30.04.2027
Apply to: sekr@d2ip.tu-berlin.de
Employer: TU Berlin / BIFOLD
Starting date: as soon as possible (until April 30, 2027)
Apply by: March 16, 2026
Full job posting: IV-SB-0008-2026