Towards Efficient Foundation Models in Remote Sensing
Lunch Talk Series: The BIFOLD Lunch Talk series gives BIFOLD members and external partners the opportunity to engage in dialogue about their research in Machine Learning and Big Data. Each Lunch Talk offers BIFOLD members, fellows and colleagues from other research institutes the chance to present their research and to network with each other.
Abstract: Self-supervised learning through masked autoencoders has emerged as a promising approach for remote sensing foundation models, yet existing methods often face trade-offs between computational efficiency and representational capacity. This talk presents an adaptation that integrates a soft mixture-of-experts mechanism into foundation model architectures, enabling specialized processing while maintaining cross-modality learning. We demonstrate how this approach reduces computational requirements during training and inference while preserving or enhancing performance across diverse downstream tasks, including classification, segmentation, and retrieval. The proposed method is more than twice as computationally efficient as existing models while maintaining competitive accuracy.
Beyond the technical implementation, we discuss strategic considerations for training set construction and examine broader implications for scalable foundation model development and deployment in remote sensing applications.
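For readers unfamiliar with the soft mixture-of-experts idea mentioned in the abstract, the following is a minimal sketch of a generic soft mixture-of-experts layer in plain Python. It is not the speaker's implementation: the function names (`soft_moe`, `matmul`, `softmax`), the parameter `phi`, and the choice of experts as simple callables are all assumptions made for illustration. Each slot receives a weighted average of all input tokens, each expert processes only its own slots, and each output token is a weighted average of all slot outputs.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def soft_moe(tokens, phi, experts, slots_per_expert=1):
    """Generic soft mixture-of-experts layer (illustrative sketch).

    tokens:  n x d list of token vectors
    phi:     d x m slot parameters, with m = len(experts) * slots_per_expert
    experts: list of callables, each mapping a d-vector to a d-vector
    """
    logits = matmul(tokens, phi)  # n x m token-to-slot affinities

    # Dispatch weights: softmax over tokens, computed per slot (per column).
    dispatch = transpose([softmax(col) for col in transpose(logits)])
    # Combine weights: softmax over slots, computed per token (per row).
    combine = [softmax(row) for row in logits]

    # Each slot input is a convex combination of all tokens.
    slot_inputs = matmul(transpose(dispatch), tokens)  # m x d
    # Each expert only processes the slots assigned to it.
    slot_outputs = [
        experts[i // slots_per_expert](slot) for i, slot in enumerate(slot_inputs)
    ]
    # Each output token is a convex combination of all slot outputs.
    return matmul(combine, slot_outputs)  # n x d
```

In this sketch the experts operate on m slots rather than on all n tokens, so the expert cost depends on the number of slots, which is one way such a layer can trade compute against capacity. How the talk's method configures slots and experts is not stated in the abstract.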
The Lunch Talk takes place at BIFOLD and online. For further information on the Lunch Talks and registration, contact Dr. Laura Wollenweber via email.
Leonard Hackel from BIFOLD will talk about "Towards Efficient Foundation Models in Remote Sensing". He introduces a masked autoencoder with a soft mixture-of-experts design for remote sensing foundation models that is more than twice as computationally efficient as existing models while preserving competitive performance and scalable cross-modal learning.