SUBJECT: Ph.D. Proposal Presentation
BY: Paul Burke
TIME: Thursday, April 30, 2020, 11:30 a.m.
PLACE: Online
TITLE: COMET-GPU: A GPGPU-Enabled Deterministic Solver for the Continuous-Energy Coarse Mesh Transport Method (COMET)
COMMITTEE: Dr. Farzad Rahnema, Co-Chair (NRE)
Dr. Umit Catalyurek, Co-Chair (CSE)
Dr. Dan Gill (NNL)
Dr. Bojan Petrovic (NRE)
Dr. Dingkang Zhang (NRE)


The Continuous-Energy Coarse Mesh Transport (COMET) method is a neutron transport solver that uses a hybrid stochastic-deterministic solution method to obtain high-fidelity whole-core solutions to reactor physics problems at high speed. The method pre-computes solutions to the individual coarse meshes within the global problem, then uses a deterministic transport sweep to construct a whole-core solution from these local solutions. The deterministic portion of the solver is well suited to parallelization because of the nature of the algorithm, which loosely corresponds to converging to the eigenvector of inter-mesh partial neutron currents via a power iteration method. Preliminary work demonstrated speedup by adapting the serial implementation of the method to parallel systems using distributed-memory parallelism. This work intends to further accelerate the method by exploiting Graphics Processing Unit (GPU) architectures. GPUs are devices that excel at parallel operations on large datasets, such as large matrix operations. They have been shown to accelerate neutron transport methods significantly, with a single device providing, in some cases, performance equivalent to 150 CPU cores. The novel application of GPU technology to the continuous-energy COMET method promises considerable speedup. Developing this capability will involve a ground-up rewrite of the serial deterministic solver to prepare its memory layout and operations for GPUs. With this new implementation in place, General-Purpose GPU (GPGPU) computation capability will be added using a suitable API (e.g., CUDA, Kokkos, OpenMP). The resulting performance will be demonstrated on a set of benchmark problems, as well as on new applications that it makes feasible.
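
As a minimal illustration of the power iteration idea mentioned in the abstract, the Python sketch below repeatedly applies a small matrix to a vector until the vector stops changing. It is not the COMET solver: the 2x2 matrix is a made-up stand-in for the operator that maps inter-mesh partial currents to updated currents, and every name in it (mat_vec, power_iteration, toy_matrix) is hypothetical.

```python
# Illustrative power iteration on a tiny, made-up matrix.
# Nothing here is taken from COMET; the matrix and all names are assumptions.


def mat_vec(matrix, vector):
    """Multiply a dense square matrix (a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def power_iteration(matrix, tol=1e-12, max_iters=10_000):
    """Estimate the dominant eigenvalue and its eigenvector.

    The vector is kept normalized so that its entries sum to 1, which lets
    the sum of (matrix @ vector) serve as the eigenvalue estimate.
    Assumes a non-negative matrix with a single dominant eigenvalue.
    """
    n = len(matrix)
    vector = [1.0 / n] * n  # flat initial guess
    eigenvalue = 0.0
    for _ in range(max_iters):
        product = mat_vec(matrix, vector)
        new_eigenvalue = sum(product)
        new_vector = [p / new_eigenvalue for p in product]
        change = max(abs(a - b) for a, b in zip(new_vector, vector))
        vector, eigenvalue = new_vector, new_eigenvalue
        if change < tol:
            break
    return eigenvalue, vector


# Hypothetical stand-in for a current-to-current operator.
toy_matrix = [
    [1.0, 4.0],
    [1.0, 1.0],
]

k, currents = power_iteration(toy_matrix)
```

The matrix-vector product inside the loop dominates the cost and each output entry is computed independently of the others, which is the kind of operation the abstract identifies as a good fit for GPUs.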