Extreme-scale Direct Numerical Simulation of Incompressible Turbulence on the Heterogeneous Many-core System
Direct numerical simulation (DNS) solves the Navier-Stokes equations of fluid flow directly, without additional turbulence modeling. Its high spatial and temporal resolution makes DNS one of the most effective tools for advancing turbulence research. That same resolution, however, hinders the application of DNS to the high Reynolds number (Re) turbulence of particular interest, because the memory and computation requirements scale rapidly with Re^4. Recent studies have shown that efficient parallel methods for heterogeneous many-core systems are a promising way to address this computational challenge. We propose four parallel methods and optimizations and use them to develop PowerLLEL, a high-performance and scalable implicit finite difference solver that aims to accelerate extreme-scale DNS of incompressible turbulence on heterogeneous many-core systems. First, an adaptive multi-level parallelization strategy is proposed to fully exploit the multi-level parallelism and computing power of heterogeneous many-core systems. Second, a hierarchical-memory-adapted data reuse/tiling strategy and kernel fusion are adopted to improve the performance of memory-bound stencil-like operations. Third, a parallel tridiagonal solver based on the parallel diagonal dominant (PDD) algorithm is developed to minimize the number of global data transposes. Fourth, three effective communication optimizations are implemented with Remote Direct Memory Access (RDMA) to maximize the performance of the remaining global transposes and halo exchanges. Results show that the solver exploits the heterogeneous computing power of the new Tianhe supercomputer and achieves a speedup of up to 10.6x over CPU-only performance. Linear strong scaling is obtained for problem sizes up to 25.8 billion.
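To make the idea behind the PDD-based tridiagonal solver concrete, the following is a minimal serial sketch in Python. It is not PowerLLEL code: the function names `thomas` and `pdd_solve` are hypothetical, and the partitions are emulated in a loop rather than distributed across processes. It assumes the matrix is diagonally dominant and that the system size is divisible by the number of partitions. Each partition solves its local system independently, and neighboring partitions then exchange only the two unknowns at their shared interface, which is why this style of solver can avoid a global data transpose.

```python
def thomas(a, b, c, d):
    """Serial tridiagonal solve (Thomas algorithm).

    a, b, c are the sub-, main and super-diagonals, d the right-hand side.
    a[0] and c[-1] never influence the result.
    """
    n = len(b)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def pdd_solve(a, b, c, d, parts):
    """Approximate solve in the style of the PDD algorithm (serial emulation).

    Assumes len(b) is divisible by `parts` and the matrix is diagonally
    dominant, so the dropped coupling terms are negligible.
    """
    n = len(b)
    m = n // parts
    xt, v, w = [], [], []
    # Step 1: independent local solves (one per partition, no communication).
    for k in range(parts):
        s = slice(k * m, (k + 1) * m)
        ak, bk, ck, dk = a[s], b[s], c[s], d[s]
        e_first = [0.0] * m
        e_first[0] = ak[0]    # coupling to the left neighbor's last unknown
        e_last = [0.0] * m
        e_last[-1] = ck[-1]   # coupling to the right neighbor's first unknown
        xt.append(thomas(ak, bk, ck, dk))
        v.append(thomas(ak, bk, ck, e_first))
        w.append(thomas(ak, bk, ck, e_last))
    # Step 2: one independent 2x2 system per interface (neighbor exchange only).
    left = [0.0] * parts    # last unknown of partition k-1
    right = [0.0] * parts   # first unknown of partition k+1
    for k in range(parts - 1):
        wl = w[k][-1]
        vf = v[k + 1][0]
        det = 1.0 - wl * vf
        y = (xt[k][-1] - wl * xt[k + 1][0]) / det      # last unknown of k
        z = (xt[k + 1][0] - vf * xt[k][-1]) / det      # first unknown of k+1
        left[k + 1] = y
        right[k] = z
    # Step 3: local correction using the interface values.
    x = []
    for k in range(parts):
        x.extend(xt[k][i] - v[k][i] * left[k] - w[k][i] * right[k]
                 for i in range(m))
    return x
```

The result is approximate by design: the sketch drops the far-end entries of the coupling vectors, which decay quickly with partition size when the matrix is diagonally dominant. Comparing `pdd_solve(a, b, c, d, parts)` against `thomas(a, b, c, d)` on the same system is a simple way to check that the approximation error is acceptable for a given partition size.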
Mon 4 Mar
14:20 - 15:40 | High Performance Computing | Main Conference at Moorfoot | Chair(s): Helen Xu (Lawrence Berkeley National Laboratory)
14:20 | 20m | Talk | OsirisBFT: Say No to Task Replication for Scalable Byzantine Fault Tolerant Analytics | Main Conference
14:40 | 20m | Talk | Towards Scalable Unstructured Mesh Computations on Shared Memory Many-Cores | Main Conference | Haozhong Qiu; xuchuanfu (National University of Defense Technology); Jianbin Fang (National University of Defense Technology); Liang Deng (China Aerodynamic Research and Development Center); Jian Zhang (China Aerodynamic Research and Development Center); Qingsong Wang (National University of Defense Technology); Yue Ding (affiliation not provided); Zhe Dai (China Aerodynamic Research and Development Center); Yonggang Che (National University of Defense Technology)
15:00 | 20m | Talk | Extreme-scale Direct Numerical Simulation of Incompressible Turbulence on the Heterogeneous Many-core System | Main Conference | Jiabin Xie (Sun Yat-sen University); Guangnan Feng (Sun Yat-sen University); Han Huang (Sun Yat-sen University); Junxuan Feng (Sun Yat-sen University); Yutong Lu (Sun Yat-sen University)
15:20 | 20m | Talk | Pure: Evolving Message Passing To Better Leverage Shared Memory Within Nodes | Main Conference | James Psota (Massachusetts Institute of Technology); Armando Solar-Lezama (Massachusetts Institute of Technology)