Motivation: Scalable distributed in-memory databases are at the core of data-intensive computation. Although scale-out solutions help to handle large amounts of data, adding nodes does not necessarily improve query performance. In fact, recent papers have shown that performance can even degrade when scaling out because of higher communication overhead (e.g., shuffling data across nodes) and limited bandwidth [Rö15]. Thus, current distributed database systems are built on the assumption that the network is the major bottleneck [BH13] and should be avoided at all costs. In recent years, high-speed networks (e.g., InfiniBand (IB)) with a bandwidth close to that of the local memory bus [Bi16] have become economically viable. These network technologies provide Remote Direct Memory Access (RDMA), which allows direct access to the memory of a remote host and reduces data-transfer latency by bypassing the remote host's CPU [In17, Gr10]. Therefore, the assumption that the network is the bottleneck no longer holds. Consequently, recent research has focused on integrating RDMA-enabled high-speed networks into existing database systems designed around a Shared-Nothing Architecture (SN) [Rö16, LYB17]. This architecture co-locates computation and data to reduce the communication overhead in a cluster. Although combining an SN with IB's higher network bandwidth enables scalability to a certain extent, this approach fails if the data or workload is skewed and cannot be evenly partitioned. The root cause is that classical query execution schemes assume that each partition is processed by exactly one node. Since nodes with larger partitions must process more data, they may become a bottleneck and limit overall scalability. Consequently, merely exploiting the higher bandwidth without adapting the database architecture and query execution does not automatically lead to improved scalability [Bi16].
Contributions: In this paper, we present a new approach to executing distributed queries on fast networks with RDMA. Our main contribution is a novel execution strategy that enables collaborative query processing through remote work stealing to mitigate skew, a common issue that hinders scalable query execution [WDJ91, Ly88]. Moreover, we implement this execution strategy in our prototype engine I-Store and show that it introduces almost no overhead to handle skew.
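To make the effect of skew and of work stealing concrete, the following is a toy, single-process simulation. It is only a sketch and not the I-Store implementation: it involves no RDMA, assumes every work unit costs one time step, and assumes stealing is free. The function names `makespan_static` and `makespan_with_stealing` are hypothetical and chosen for illustration.

```python
from collections import deque


def makespan_static(partitions):
    """Time steps needed when each partition is processed only by its owner node."""
    return max(len(p) for p in partitions)


def makespan_with_stealing(partitions):
    """Time steps needed when idle nodes steal work from the most loaded node.

    Assumption: one work unit per node per step, and stealing has zero cost.
    """
    queues = [deque(p) for p in partitions]
    steps = 0
    while any(queues):
        for q in queues:
            if q:
                q.popleft()  # process one unit of local work
            else:
                victim = max(queues, key=len)  # most loaded node
                if victim:
                    victim.pop()  # steal one unit from the tail and process it
        steps += 1
    return steps


# Skewed input: node 0 owns 9 work units, the other three nodes own 1 each.
skewed = [list(range(9)), [0], [0], [0]]
print(makespan_static(skewed))         # 9
print(makespan_with_stealing(skewed))  # 3
```

In this idealized setting, the static scheme is bound by the largest partition (9 steps), while stealing reaches the balanced optimum of 12 units over 4 nodes (3 steps). A real system would also have to account for the cost of the remote accesses themselves.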
Tobias Ziegler
Technical University of Darmstadt
Carsten Binnig
Technical University of Darmstadt
Uwe Röhm
The University of Sydney
Ziegler et al. (Tue,) studied this question.
synapsesocial.com/papers/6a210d4023521dddf4c3b41a — DOI: https://doi.org/10.18420/btw2019-ws-06