Returning the Favor - Leveraging Quality Insights of OpenStreetMap-Based Land-Use/Land-Cover Multi-Label Modeling to the Community

Moritz Schott
Adina Zell
Sven Lautenbach
Begüm Demir
Alexander Zipf

August 17, 2022

Land-use and land-cover (LULC) information in OSM is a challenging topic. On the one hand, this information provides the background for all other data rendered on the central map and is used by a range of applications. It has high potential as a data source for tackling major current challenges such as the climate crisis. On the other hand, this information has a difficult position within the OSM ecosystem. LULC information can be quite cumbersome or even difficult to map, e.g., due to natural ambiguity.
Like most other OSM tagging schemes, the current LULC tagging scheme is the result of bottom-up growth, which has produced a collection of sometimes ambiguous, unstable or overlapping tag definitions that are not fully compatible with any official LULC legend definition [1]. Furthermore, the data is highly shaped by local or national preferences and imports. This diversity of the LULC data in OSM is a fundamental principle of OSM that enabled the success of the project. Yet, it can create considerable usage barriers or at least caveats for data users unfamiliar with the project's ecosystem. The remote sensing community, for instance, has started to use OSM LULC information as labels in its classification models. Frequently, OSM LULC data has thereby been taken at face value without critical reflection. And, while the quality and fitness for purpose of OSM data have been proven in many cases (e.g., [2,3]), these analyses have also unveiled quality variations, e.g., between rural and urban regions. The quality of OSM can therefore be assumed to be generally high, but remains unknown for a specific use-case