Imperative or Functional Control Flow Handling: Why not the Best of Both Worlds?

Gábor E. Gévay
Tilmann Rabl
Sebastian Breß
Lorand Madai-Tahy
Jorge-Arnulfo Quiané-Ruiz
Volker Markl

June 01, 2022

Modern data analysis tasks often involve control flow statements, such as the iterations in PageRank and K-means. To achieve scalability, developers usually implement these tasks in distributed dataflow systems, such as Spark and Flink. Designers of such systems have to choose between providing imperative or functional control flow constructs to users. Imperative constructs are easier to use, but functional constructs are easier to compile to an efficient dataflow job. We propose Mitos, a system where control flow is both easy to use and efficient. Mitos relies on an intermediate representation based on the static single assignment form. This allows us to abstract away from specific control flow constructs and treat any imperative control flow uniformly both when building the dataflow job and when coordinating the distributed execution.
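To make the role of static single assignment (SSA) form more concrete, the following minimal sketch shows an ordinary imperative loop next to a hand-written SSA-style version of the same loop, in which every variable is assigned exactly once per block and the merge at the loop header plays the role of a phi function. This is only an illustration of the general SSA idea, not Mitos's actual intermediate representation or API: the function names (`imperative_sum`, `ssa_style_sum`) and the numbered variable names are hypothetical, and the scalar computation stands in for what would be distributed collections in a dataflow system.

```python
def imperative_sum(n: int) -> int:
    """Plain imperative loop: variables are reassigned in place."""
    total = 0
    i = 0
    while i < n:
        total = total + i
        i = i + 1
    return total


def ssa_style_sum(n: int) -> int:
    """Same computation, written so that each name has one defining
    assignment per basic block (an SSA-style sketch, not real compiler IR)."""
    # Block "entry": initial definitions.
    total_0 = 0
    i_0 = 0

    # Loop header phi: on the first visit the values come from "entry".
    total_1, i_1 = total_0, i_0
    while True:
        # Block "header": evaluate the loop condition.
        cond_0 = i_1 < n
        if not cond_0:
            break
        # Block "body": new names instead of in-place reassignment.
        total_2 = total_1 + i_1
        i_2 = i_1 + 1
        # Back edge: the header phi now selects the values from "body".
        total_1, i_1 = total_2, i_2

    # Block "exit": the result is the value last selected by the phi.
    return total_1


if __name__ == "__main__":
    print(imperative_sum(5), ssa_style_sum(5))
```

Because both an imperative `while` loop and other constructs (for example, `for` loops or conditionals) reduce to the same ingredients here, namely blocks, single assignments, and merges at block entry, a representation of this kind does not need to special-case each control flow construct.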