
Benchmarking Spatial Operations over Heterogeneous Data: The Case of Zonal Statistics

Gereon Dusella
Haralampos Gavriilidis
Volker Markl
Eleni Tzirita Zacharatou

June 05, 2026

Zonal Statistics (ZS) is a fundamental operation in Earth Observation workflows. It aggregates raster pixel values within regions defined by vector geometries, such as computing average vegetation indices across farmland parcels. Unlike traditional database operations, which operate within a single data model, ZS requires joining two fundamentally different spatial data models: raster and vector. This data heterogeneity introduces unique benchmarking challenges that existing spatial benchmarks, which focus on one data model in isolation, do not address. In this paper, we present an experimental study of spatial operations over raster and vector data, using ZS as a representative case. Evaluating three architecturally diverse systems, PostGIS (relational), Beast (dataflow), and RasDaMan (array-based), across 25 queries over 17 real-world datasets, we find that the interaction between data characteristics (geometry type, raster-to-vector size ratio, coordinate reference systems) and system internals causes performance differences of up to 22× between competitive systems, while parameter tuning alone can yield over 55× speedup within a single system.
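To make the operation concrete, the following is a minimal sketch of a zonal mean in plain Python. It is illustrative only and does not reflect how any of the evaluated systems implement ZS. It assumes that the raster and the zones already share one coordinate system (pixel `(row, col)` covers the unit square starting at `x = col`, `y = row`), that a pixel belongs to a zone when its center lies inside the polygon, and that polygons are simple rings without holes. The names `point_in_polygon`, `zonal_mean`, `parcel_a`, and `parcel_b` are hypothetical.

```python
from statistics import mean


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices of a simple ring (assumption).
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def zonal_mean(raster, zones):
    """Return {zone_id: mean of pixels whose centers fall inside the zone}.

    A zone that covers no pixel center maps to None.
    """
    result = {}
    for zone_id, polygon in zones.items():
        values = [
            value
            for row, line in enumerate(raster)
            for col, value in enumerate(line)
            if point_in_polygon(col + 0.5, row + 0.5, polygon)
        ]
        result[zone_id] = mean(values) if values else None
    return result


# Toy 3x4 raster (e.g. a vegetation index) and two rectangular parcels.
raster = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
zones = {
    "parcel_a": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "parcel_b": [(2, 1), (4, 1), (4, 3), (2, 3)],
}

print(zonal_mean(raster, zones))
```

This brute-force version tests every pixel against every zone, so its cost grows with the product of raster size and zone count. That is one way to see why the raster-to-vector size ratio and the geometry type matter for performance.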