Posters published in the ACM Digital Library
- Poster 1: FFT data distribution in plane-waves DFT codes. A case study from Quantum ESPRESSO
Fabio Affinito and Carlo Cavazzoni
DOI
- Poster 2: Optimizing PARSEC for Knights Landing
Alexey Malhanov, Ariel J Biller and Michael Chuvelev
DOI
- Poster 3: Effective Calculation with Halo communication using Halo Functions
Keiichiro Fukazawa, Toshiya Takami, Takeshi Soga, Yoshiyuki Morie and Takeshi Nanri
DOI
- Poster 4: MPI usage at NERSC: Present and Future
Alice Koniges, Brandon Cook, Jack Deslippe, Thorston Kurth, Hongzhang Shan
DOI
- Poster 5: Performance comparison of Eulerian kinetic Vlasov code between flat-MPI parallelism and hybrid parallelism on Fujitsu FX100 supercomputer
Takayuki Umeda and Keiichiro Fukazawa
DOI
Other posters
- Poster 6: Leveraging MPI RMA for an efficient directory/cache transport layer
Nick Brown
- Poster 7: Producing robust, sustainable and high performance software for complex potential energy models
Weronika Filinger, Mario Antonioletti, Lorna Smith, Arno Proeme, Neil Chue Hong and Omar Demerdash
- Poster 8: Benchmarking Wee Archie - ARCHER’s Little Brother
Gordon Gibb, Alistair Grant and Nick Brown
- Poster 9: BeatBox - HPC Simulation Environment for Biophysically and Anatomically Realistic Cardiac Electrophysiology
Adrian Jackson, Mario Antonioletti, Vadim Biktashev, Irina Biktasheva, Sanjay Kharche and Tomas Stary
- Poster 10: Prototyping An Early Implementation of Nonblocking Persistent Collective Communication
Bradley Morgan, Anthony Skjellum, Shane Farmer and Daniel Holmes
- Poster 11: Petal Tool: From Blocking Collectives to Non-Blocking Collectives MPI
Anthony Skjellum, Hadia Ahmed and Peter Pirkelbauer
Poster 1: FFT data distribution in plane-waves DFT codes. A case study from Quantum ESPRESSO
Fabio Affinito and Carlo Cavazzoni
In this work we describe the FFT data distribution adopted in Quantum ESPRESSO, one of the most widely used codes for plane-wave density-functional calculations. We describe the hierarchical structure of MPI communicators, which makes it possible to tune the workload and the impact of collective communications, one of the most critical factors affecting performance. We present the FFTXlib miniapp and discuss profiling data and performance figures.
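As a rough illustration of what a hierarchy of communicators means in practice, the sketch below computes the `color` and `key` values one might pass to `MPI_Comm_split` to form groups of ranks, plus a second, orthogonal level linking ranks that hold the same position in each group. It is plain Python with no MPI dependency; the function names and the block layout are assumptions made for this example and are not taken from Quantum ESPRESSO or FFTXlib.

```python
def split_layout(world_size, group_size):
    """Return a (color, key) pair for every world rank.

    Hypothetical block layout: consecutive ranks are packed into groups
    of `group_size`. Ranks sharing a color would end up in the same
    sub-communicator; key gives the rank order inside it.
    """
    if group_size <= 0 or world_size % group_size != 0:
        raise ValueError("world_size must be a positive multiple of group_size")
    return [(rank // group_size, rank % group_size) for rank in range(world_size)]


def group_members(layout, color):
    """World ranks that share one first-level group (same color)."""
    return [rank for rank, (c, _) in enumerate(layout) if c == color]


def cross_group_members(layout, key):
    """World ranks holding the same position in every group (same key).

    Splitting a second time with color=key would give this orthogonal
    level of the hierarchy.
    """
    return [rank for rank, (_, k) in enumerate(layout) if k == key]


if __name__ == "__main__":
    layout = split_layout(world_size=8, group_size=4)
    for color in range(2):
        print("group", color, "->", group_members(layout, color))
    for key in range(4):
        print("position", key, "->", cross_group_members(layout, key))
```

Changing `group_size` changes how many ranks take part in each collective call, which is the kind of tuning knob the abstract refers to.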
_______________________________________________
Poster 2: Optimizing PARSEC for Knights Landing
Alexey Malhanov, Ariel J Biller and Michael Chuvelev
PARSEC is a massively parallel Density-Functional-Theory code. Within the modernization effort towards the new Intel Knights Landing platform (KNL), we adapted the main computational kernel to use hybrid MPI and OpenMP runtimes. We employed MPI-3 non-blocking neighborhood collectives for the halo exchange. We present performance data on the KNL including MPI traces portraying our exploration of communication-computation overlap in a hybrid MPI and OpenMP application.
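The idea of overlapping halo communication with computation can be sketched without MPI: interior points are updated first because they need no remote data, and only the boundary points wait for the halo values. The example below is a serial simulation on a periodic 1D grid with a three-point averaging stencil; the stencil, the function names and the list-of-chunks layout are assumptions for illustration and do not reproduce the PARSEC kernel or its use of neighborhood collectives.

```python
def stencil(left, center, right):
    """Three-point average used as a stand-in for a real kernel."""
    return (left + center + right) / 3.0


def global_step(u):
    """Reference update on the undivided periodic grid."""
    n = len(u)
    return [stencil(u[(i - 1) % n], u[i], u[(i + 1) % n]) for i in range(n)]


def decomposed_step(chunks):
    """Same update on per-'rank' chunks (each chunk needs >= 2 cells)."""
    p = len(chunks)
    new = [[None] * len(c) for c in chunks]

    # Phase 1: interior points depend only on local data. In a real code
    # this work would run while the non-blocking exchange is in flight.
    for r, c in enumerate(chunks):
        for i in range(1, len(c) - 1):
            new[r][i] = stencil(c[i - 1], c[i], c[i + 1])

    # Phase 2: simulated halo exchange with the periodic neighbours.
    halos = [(chunks[(r - 1) % p][-1], chunks[(r + 1) % p][0]) for r in range(p)]

    # Phase 3: boundary points, which need the received halo values.
    for r, c in enumerate(chunks):
        left_halo, right_halo = halos[r]
        new[r][0] = stencil(left_halo, c[0], c[1])
        new[r][-1] = stencil(c[-2], c[-1], right_halo)
    return new


if __name__ == "__main__":
    u = [float(i * i) for i in range(12)]
    chunks = [u[0:4], u[4:8], u[8:12]]
    flat = [x for chunk in decomposed_step(chunks) for x in chunk]
    print(flat == global_step(u))
```

Checking the decomposed update against the undivided one is a cheap way to confirm that splitting the work into interior and boundary phases has not changed the result.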
_______________________________________________
Poster 3: Effective Calculation with Halo communication using Halo Functions
Keiichiro Fukazawa, Toshiya Takami, Takeshi Soga, Yoshiyuki Morie and Takeshi Nanri
Halo communication limits scalability. To address this, we previously introduced a "Halo thread" into our simulation code. However, the communication in the Halo thread, and the calculation that depends on it, had not yet been optimized. In this study we have developed Halo functions that perform the halo communication efficiently. With them, calculation and communication can be performed in a pipeline, and we obtained good performance.
_______________________________________________
Poster 4: MPI usage at NERSC: Present and Future
Alice Koniges, Brandon Cook, Jack Deslippe, Thorston Kurth, Hongzhang Shan
We describe how MPI is used at NERSC, the production HPC center for the US DOE, with more than 5000 users. Through a variety of tools, we determine how MPI is used on our new Intel Knights Landing system, one of the first to be deployed. We compare usage of MPI to developmental exascale programming models such as UPC++ and HPX. In addition to a broad survey, we follow the evolution of a few key application codes that were highly optimized for the Knights Corner architecture using advanced OpenMP techniques.
_______________________________________________
Poster 5: Performance comparison of Eulerian kinetic Vlasov code between flat-MPI parallelism and hybrid parallelism on Fujitsu FX100 supercomputer
Takayuki Umeda and Keiichiro Fukazawa
The present study deals with the Vlasov simulation, which solves the first-principles kinetic equation for space plasma on an Eulerian grid. We compare the performance of flat-MPI parallelism with that of two-level hybrid parallelism using MPI and OpenMP. The parallel Vlasov code is benchmarked on the Fujitsu FX100 supercomputer system, which uses the second-generation successor to the K computer architecture.