This paper outlines our contribution to the SustainEval 2025 shared task (Prange et al., 2025), which focuses on classifying German sustainability report snippets into one of 20 predefined content categories defined by the German Sustainability Code (DNK). We fine-tuned a transformer-based model, deepset/gbert-base, and explored multiple methods to improve classification performance, including hyperparameter tuning, data augmentation through back-translation, and model ensembling. While our ensemble model achieved an accuracy of 0.74 on our internal validation set, its performance dropped to 0.58 on the final test set evaluated by the organizers, highlighting challenges in adapting to new data. We compare our results to several baselines and conduct an error analysis to identify common misclassification patterns, such as overlapping categories and ambiguous language. Our findings demonstrate both the potential and the limitations of NLP approaches for structured content analysis in German sustainability reports.