TU Berlin Datalog research paper accepted for presentation at LSGDA 2020

The paper “Distributed Graph Analytics with Datalog Queries in Flink” by TU Berlin Database Systems researchers Muhammad Imran, Gábor Gévay, and Volker Markl will be presented at the 2nd International Workshop on Large Scale Graph Data Analytics (LSGDA 2020), held in conjunction with the 2020 VLDB Conference in Tokyo, Japan, on September 4, 2020.


Muhammad Imran, Gábor Gévay, Volker Markl

Large-scale, parallel graph processing has been in demand over the past decade. Succinct program structure and efficient execution are among the essential requirements of graph processing frameworks. In this paper, we present Cog, which executes Datalog programs on the Apache Flink distributed dataflow system. We chose Datalog for its compact program structure and Flink for its efficiency. We implemented a parallel semi-naive evaluation algorithm exploiting Flink’s delta iteration to propagate only the tuples that need to be further processed to the subsequent iterations. Flink’s delta iteration feature reduces the overhead present in acyclic dataflow systems, such as Spark, when evaluating recursive queries, hence making it more efficient. We demonstrated in our experiments that Cog outperformed BigDatalog, the state-of-the-art distributed Datalog evaluation system, in most of the tests.
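
For readers unfamiliar with semi-naive evaluation, the following is a minimal single-machine sketch in Python of the idea the abstract describes: in each iteration, only the newly derived tuples (the "delta") are joined again, rather than the full result. It is an illustration only, not Cog's implementation and not Flink code. The function name `transitive_closure` and the example Datalog program in the docstring are assumptions chosen for this sketch.

```python
def transitive_closure(edges):
    """Semi-naive evaluation of the (illustrative) Datalog program:

        tc(X, Y) :- edge(X, Y).
        tc(X, Y) :- tc(X, Z), edge(Z, Y).
    """
    # Index edges by source vertex so the join on Z is a dictionary lookup.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    result = set(edges)  # all tc tuples derived so far
    delta = set(edges)   # tuples that were new in the previous iteration

    while delta:
        # Join only the delta with edge, not the whole result.
        derived = {
            (x, y)
            for (x, z) in delta
            for y in successors.get(z, ())
        }
        # Keep only tuples that have not been seen before.
        delta = derived - result
        result |= delta

    return result


closure = transitive_closure({(1, 2), (2, 3), (3, 4)})
cyclic = transitive_closure({(1, 2), (2, 1)})
```

The loop stops once an iteration derives no new tuples, which is why it also terminates on cyclic graphs. In a distributed setting, the `delta` set corresponds to the tuples passed on to the next iteration, while `result` plays the role of the state that is kept and updated across iterations.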