A Benchmark Dataset for Sentinel-2 Based Forest Type Classification in the Siberian Summergreen-Evergreen Forest Transition Zone

F. van Geffen
R. Hänsch
B. Demir
S. Kruse
U. Herzschuh
B. Heim

April 21, 2025

Circumboreal forests, which cover about 30% of the global forested area, are undergoing significant changes. In Siberia, global warming may reduce the dominance of summergreen larch forest and induce shifts towards evergreen forest types, particularly in the Eastern Siberian summergreen-evergreen forest transition zone. We create a remote sensing training dataset for summergreen and evergreen forest types from the SiDroForest Sentinel-2 image dataset. This new training dataset, informed by expert field knowledge, includes nearly two million Sentinel-2 pixels across the early summer, peak summer, and late summer phenophases. We also create the equivalent seasonal SiDroTest dataset, linked to in-situ forest plots, for benchmarking the seasonal training dataset. To optimize satellite-based monitoring, we train a Random Forest classifier on the training dataset to map summergreen and evergreen forest, resulting in accuracies of 63% for early summer, 89% for peak summer, and 99% for late summer, with an average accuracy of 84% across all seasons. Feature importance analysis highlights the Sentinel-2 shortwave infrared as crucial for distinguishing forest types in all seasons. Additional key features include the normalized difference vegetation index (NDVI) and the red wavelength region for early summer, shortwave infrared and the visible wavelength region for peak summer, and shortwave infrared, near-infrared, and NDVI for late summer. This study provides a benchmarked training dataset for mapping boreal forest types in the Siberian summergreen-evergreen transition zone. The Random Forest classifier performs best in late summer, when it can exploit the distinct spectral differences between the greenness of evergreen forests and the seasonal coloring of summergreen larch forests.
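As an illustration of the NDVI feature named in the abstract, the following is a minimal sketch of the standard index, (NIR - red) / (NIR + red), computed per pixel. It assumes the Sentinel-2 near-infrared band B8 and red band B4 as inputs; the function name, the zero-denominator handling, and the reflectance values are hypothetical and are not taken from the SiDroForest data.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        # Assumption: report 0.0 for pixels with no signal instead of raising.
        return 0.0
    return (nir - red) / total


# Hypothetical surface reflectance values, for illustration only.
# B8 = near-infrared, B4 = red (Sentinel-2 band names).
pixels = {
    "dense_canopy": {"B8": 0.40, "B4": 0.04},
    "sparse_canopy": {"B8": 0.25, "B4": 0.10},
}

# Compute NDVI for each example pixel.
ndvi_by_pixel = {name: ndvi(b["B8"], b["B4"]) for name, b in pixels.items()}

for name, value in ndvi_by_pixel.items():
    print(f"{name}: NDVI = {value:.3f}")
```

In a classification workflow, an index like this would typically be appended to the per-pixel band values as one more input feature for the classifier.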