Tutorial Day “Foundation models”

BIFOLD Tutorial Day

The first BIFOLD Tutorial Day focuses on Foundation Models

Date: April 30, 2024

Time: 10:00 am - 06:30 pm

Location: Max Delbrück Center | MDC-BIMSB, Hannoversche Str. 28, 10115 Berlin

The Tutorial Day is tailored specifically to enable exchange between BIFOLD researchers. The presentations focus on Foundation Models - offering insights into their core principles and practical applications and highlighting their role in BIFOLD research. Throughout the event, the participants will have the opportunity to listen to keynotes and engage in discussions and networking.

Agenda

Conference Schedule

Time	Title	Speaker(s)
10:00-10:15	Welcome & Introduction	Volker Markl
10:15-11:00	Representing Patients and Predicting Clinical Outcomes with Large Language Models	Alexander Löser
11:00-11:30	Explaining, Analyzing and Debugging Foundation Models	Wojciech Samek, Grégoire Montavon
11:30-12:00	Generative AI in Security	Konrad Rieck
12:00-13:00	Lunch Break	
13:00-13:30	The Helmholtz Foundation Model Initiative	Dagmar Kainmüller
13:30-14:00	Explanation Dialogues for Understanding Foundation Model Behavior	Nils Feldhus
14:00-14:30	Exploring Foundation Models in Low-Resource Settings: Opportunities and Challenges	Oliver Eberle, Hassan El-Hajj, Jochen Büttner
14:30-15:10	Coffee break & World Cafés	Nils Feldhus, Oliver Eberle & Jochen Büttner & Hassan El-Hajj, Alexander Löser
15:10-15:40	Large Language Models for medical decision making	Tobias Röschl
15:40-16:10	RudolfV: A Foundation Model by Pathologists for Pathologists	Maximilian Alber
16:10-16:40	Genome foundation models: how…. and why?	Uwe Ohler
16:40-16:55	Wrap-Up	Volker Markl
17:00-18:30	Networking and get together	

Representing Patients and Predicting Clinical Outcomes with Large Language Models
Prof. Dr. Alexander Löser
Abstract: Medical professionals are faced with a large amount of textual patient information every day. Clinical decision support systems (CDSS) aim to help clinicians in the process of decision-making based on such data. We specifically look at a sub-task of CDSS, namely the prediction of clinical diagnoses from patient admission notes. When clinicians approach the task of diagnosis prediction, they usually take similar patients into account (from their own experience, clinic databases or by talking to their colleagues) who presented with typical or atypical signs of a disease. They then compare the patient at hand with these previous encounters and determine the patient’s risk of having the same condition. I will review our work on this set of tasks over the last three years and will briefly introduce Large Language Model architectures, data sets such as our Medbert.de or MedAlpaca, and lessons learned from papers published at EACL 2021, COLING 2022, IJCNLP 2022, LREC 2022 and in the journal Expert Systems with Applications (2023).

Explaining, Analyzing and Debugging Foundation Models
Prof. Dr. Wojciech Samek, Prof. Dr. Grégoire Montavon
Abstract: Foundation models have been a major development in machine learning, aiming at decoupling the general and computationally intensive task of representing data from the myriad of possible downstream applications. To ensure broad applicability of foundation models, it is, however, crucial that these models retain some level of transparency and trustworthiness. In this talk, we present our recent contributions to Explainable AI in the context of foundation models. We first review the LRP explanation technique and present recent developments of LRP that bring explainability to Transformers, a key technology underlying foundation models. We then use LRP to analyze the prediction strategies of foundation models, finding that these models are susceptible to developing ‘Clever Hans’ strategies. These strategies differ from their supervised counterparts and therefore require distinct mitigation approaches.

Generative AI in Security
Prof. Dr. Konrad Rieck
Abstract: Foundation models have the potential to bring about groundbreaking changes in several areas of research. While they have already enabled remarkable results in vision and audio tasks, their capabilities in applications with discrete data are still under debate. In this talk, we explore the application of generative AI in computer security. We highlight promising applications currently being investigated by our group, but also point out limitations that are a stumbling block in cases where decisions need to be accurate and free of hallucinations.

The Helmholtz Foundation Model Initiative
Dr. Dagmar Kainmüller
Abstract: The Helmholtz Association with its 18 centers and large-scale facilities is a world leader in the generation of cutting-edge research data. The challenge of leveraging these vast amounts of data for scientific progress requires synergistic data analysis solutions that can be universally applied across various analysis tasks and data domains. Recent advancements in AI research have given rise to a transformative paradigm designed precisely for this challenge: foundation models. These models are trained on vast and diverse datasets at scale, making them highly adaptable to a wide range of tasks. To leverage the unparalleled data repositories of the Helmholtz Association across six Research Fields, we have launched the Helmholtz Foundation Model Initiative (HFMI), a systematic approach to developing generalist AI models in a synergistic manner. The resulting open-source models are anticipated to significantly expedite data analysis and are designed to be shared globally, ensuring that HFMI’s benefits extend to the worldwide research community.

Explanation Dialogues for Understanding Foundation Model Behavior and Teaching Concepts
Nils Feldhus
Abstract: Framing explanation processes as a dialogue between the human and the model has been motivated in many recent works from the areas of HCI and ML explainability. With the growing popularity of LLMs, the research community has started to present dialogue-based interpretability frameworks for ML problems that are both capable of conveying faithful explanations in human-understandable terms and generalizable to different datasets, use cases and models. In this talk, I will give an overview of existing "dialogue XAI" systems and evaluation paradigms in performance assessment and user studies. Finally, I will illustrate how explanation dialogues can be used for teaching concepts to explainees of different expertise levels and what role LLMs can play in estimating teaching quality and detecting explanatory patterns in classroom settings.

Exploring Foundation Models in Low-Resource Settings: Opportunities and Challenges
Dr. Oliver Eberle, Dr. Hassan El-Hajj, Dr. Jochen Büttner
Abstract: Foundation models (FMs) could be a pivotal advancement for providing ML solutions in low-resource environments, enabling the extraction of task-relevant representations with minimal or no additional annotations required. In this talk, we explore a use case of FMs applied to digital history, where the automated transcription and representation of historical data pose a key methodological hurdle. Such data is characterized by high complexity and heterogeneity, with limited or partial label information available, challenging the abilities of FMs while also cautioning against model over-reliance in humanities research. This opens a broader discussion on the use of FMs for various low-resource domain applications, and highlights general challenges for the discovery of domain insights.

Large Language Models for medical decision making
Dr. Tobias Röschl
Abstract: Large Language Models (LLMs) could provide a solution to several current challenges in healthcare, such as operational inefficiencies and lack of interoperability. Our working group is investigating the extent to which LLMs could indeed serve as a viable solution, with a particular focus on the application of LLMs in medical decision making and the extraction and processing of clinical data.

RudolfV: A Foundation Model by Pathologists for Pathologists
Dr. Maximilian Alber
Abstract: Histopathology plays a central role in clinical medicine and biomedical research. While artificial intelligence shows promising results on many pathological tasks, generalization and dealing with rare diseases, where training data is scarce, remain a challenge. Distilling knowledge from unlabeled data into a foundation model before learning from potentially limited labeled data provides a viable path to address these challenges. In this work, we extend the state of the art of foundation models for digital pathology whole slide images by semi-automated data curation and by incorporating pathologist domain knowledge. Specifically, we combine computational and pathologist domain knowledge (1) to curate a diverse dataset of 103k slides corresponding to 750 million image patches covering data from different fixation, staining, and scanning protocols as well as data from different indications and labs across the EU and US, (2) to group semantically similar slides and tissue patches, and (3) to augment the input images during training. We evaluate the resulting model on a set of public and internal benchmarks and show that although our foundation model is trained with an order of magnitude fewer slides, it performs on par with or better than competing models. We expect that scaling our approach to more data and larger models will further increase its performance and capacity to deal with increasingly complex real-world tasks in diagnostics and biomedical research.

Genome foundation models: how…. and why?
Prof. Dr. Uwe Ohler
Abstract: The human genome can, to a first approximation, be regarded as linear text, and it is not surprising that self-supervised language models have been adapted to the genomics domain. I’ll provide an overview of the models that have been described so far, and of how much we know about how well they actually work. I’ll discuss issues arising when treating the genome as linear text, which is too simplistic, and what inherent limitations that implies for models directly lifted from NLP.
