
Classification Framework for the Parallel Hash Join with a Performance Analysis on the GPU


The hash join operator is one of the most important relational operators in database applications and a prominent research topic in the domain of parallel processing. However, to date, no consistent algorithm design guidelines for high-performance implementations on parallel platforms have been derived from the available experimental results. In this work we define a taxonomy of the parallel hash join operator landscape and categorize state-of-the-art research accordingly. Moreover, we implement and benchmark three taxonomy types: a sequential implementation on the CPU, a hybrid CPU-GPU implementation, and a fully parallel version on the GPU. The results show that (1) the hybrid CPU-GPU type outperforms the other two, showcasing the benefits of a good fit between algorithm type and hardware platform choice; (2) the poor end-to-end performance of the GPU-only type highlights the impact of GPU-specific synchronization and contention issues that appear with an unfit design choice; and (3) parallelization improves runtime by a factor of 2.2X in the end-to-end algorithm and by a factor of 83X in the join phase, and shows good scaling behavior with an increasing number of threads. This demonstrates that the GPU is a valuable co-processor option for computation offloading in database applications. We anticipate that this classification framework will be a starting point for design decisions for parallel big data hash join operators on other heterogeneous systems.

Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title
15th IEEE International Symposium on Parallel and Distributed Processing with Applications ISPA 2017
Workflow Systems and Technology
Parallel Data Processing
Event Location
Guangzhou, China
Event Type
Event Dates
December 12-15, 2017
December 2017