Banner Banner

Efficient Placement of Decomposable Aggregation Functions for Stream Processing over Large Geo-Distributed Topologies

Xenofon Chatziliadis
Eleni Tzirita Zacharatou
Alphan Eracar
Steffen Zeuch
Volker Markl

August 25, 2024

A recent trend in stream processing is offloading the computation of decomposable aggregation functions (DAF) from cloud nodes to geo-distributed fog/edge devices to decrease latency and improve energy efficiency. However, deploying DAFs on low-end devices is challenging due to their volatility and limited resources. Additionally, in geo-distributed fog/edge environments, creating new operator instances on demand and replicating operators ubiquitously is restricted, posing challenges for achieving load balancing without overloading devices. Existing work predominantly focuses on cloud environments, overlooking DAF operator placement in resource-constrained and unreliable geo-distributed settings. This paper presents NEMO, a resource-aware optimization approach that determines the replication factor and placement of DAF operators in resource-constrained geo-distributed topologies. Leveraging Euclidean embeddings of network topologies and a set of heuristics, NEMO scales to millions of nodes and handles topological changes through adaptive re-placement and re-replication decisions. Compared to existing solutions, NEMO achieves up to 6× lower latency and up to 15× reduction in communication cost, while preventing overloaded nodes. Moreover, NEMO re-optimizes placements in constant time, regardless of the topology size. As a result, it lays the foundation to efficiently process continuous data streams on large, heterogeneous, and geo-distributed topologies.