CUDA-based Scientific Computing
Tools and Selected Applications
by Stephan Christoph Kramer
Date of oral examination: 2012-11-22
Published: 2013-02-12
Advisor: Prof. Dr. Gert Lube
Referee: Prof. Dr. Gert Lube
Referee: Prof. Dr. Reiner Kree
Files
Name: sc_with_cuda_main_ediss_Stephan_Kramer.pdf
Size: 10.6 MB
Format: PDF
Abstract
English
The first goal of this thesis is the design of SciPAL (Scientific and Parallel Algorithms Library), a C++-based and operating-system-independent library. The core of SciPAL is a domain-specific embedded language for dense and sparse linear algebra, which is presented in Chapter 1. Chapter 2 shows that, by using SciPAL, algorithms can be stated in a mathematically intuitive, unified way in terms of matrix and vector operations. The resulting rapid prototyping capabilities are demonstrated using LU factorization and principal component analysis as examples. SciPAL ports the most frequently used linear algebra classes of the widely used finite element library deal.II to CUDA (NVIDIA’s extension of the programming language C for programming their GPUs). Simulation frameworks based on deal.II can thereby be adapted easily to GPU-based computing. Besides adding a user-friendly API to any BLAS, SciPAL particularly aims at simplifying the use of NVIDIA’s CUBLAS, especially the data-transfer issues arising from CUDA’s distributed-memory programming model. Chapter 3 closes the first part with a brief discussion of the steps necessary to solve sparse, unstructured linear systems. The second part of this thesis is a collection of examples that started as projects based on deal.II and at some point profited from being moved to the GPU. They are drawn from the fields of neuroscience, engineering of indoor airflow, quantum transport in semiconductor heterostructures, and structure-function interaction of proteins. Chapter 4 contributes to the interactive stimulation of light-sensitive neurons by evaluating the efficiency of existing algorithms and the speedup they gain from CUDA. The accurate numerical prediction of indoor airflows for building configurations of practical relevance is of paramount importance for the energy-efficient design of modern buildings.
Chapter 5 discusses CUDA-based preconditioning for indoor airflow and the technical implications for existing codes written for the numerical solution of the underlying Reynolds-averaged Navier-Stokes equations. Chapter 6 focuses on the main issues of designing proper boundary conditions for finite element simulations of quantum transport in the Landauer-Büttiker picture. Simulating a two-dimensional electron gas for a given set of model parameters is comparatively simple. The difficult part is an accurate formulation of the transparent boundary conditions needed to truncate the source and drain leads to a finite length. The formulation of a GPU-based framework for quantum wave-packet dynamics in Chapter 7 is straightforward when expressed in terms of matrix-vector products, and it is an example of how to combine the topics of this thesis to solve a new problem class. The accurate simulation of single-molecule impedance spectroscopy of globular proteins must be done in 3D. Current theoretical models ignore boundary conditions and the electrochemical properties of ions completely. Chapter 8 introduces an improved model based on the Poisson-Nernst-Planck equations. The physical properties are dominated by boundary layers and the discontinuity of the dielectric permittivity at the protein-solvent interface. The goal of the simulation is to determine the electro-diffusive fluxes in the solvent. To avoid including the protein interior in the simulation, the Poisson problem in the protein interior is replaced by a boundary value problem, leading to a FEM-BEM coupling. Since this is basically an elliptic problem, multigrid methods are particularly well suited for its efficient solution.
Keywords: CUDA; C++; Expression Templates; Preconditioning; Sparse Matrix; Indoor Airflow; Quantum Hall Effect; Magnetic Focussing; Hardy Space Infinite Elements; Phase Retrieval; Optogenetics; Object-oriented Design; FFT; Dielectric Relaxation Spectroscopy; Poisson-Nernst-Planck Equations; BEM; FEM-BEM Coupling; Curvilinear Boundaries; Boundary Integral Equation; Transparent Boundary Conditions; Schrödinger Equation; Krylov Methods; Parallel Computing; GPU Computing; Sparse Approximate Inverse; Faber Polynomials; Polynomial Preconditioner; Conformational Sampling of Proteins; Protein Folding; Higher Order Finite Elements