WaveToyOpenCL

Erik Schnetter <eschnetter@perimeterinstitute.ca>

May 15, 2012

Abstract

This thorn implements WaveToy, solving the scalar wave equation (in a Euclidean, i.e. trivial geometry). The thorn is implemented in OpenCL, with some wrapper code in C++.

1 Introduction

Thorn WaveToyOpenCL solves the scalar wave equation, the same equation solved by thorn WaveToy and its companions written in other languages. Its main purpose is to serve as a high-level example of using OpenCL in Cactus. It is purposefully written to be simple and easy to understand; for example, there are no parameters to choose different types of initial or boundary conditions.
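For reference, the scalar wave equation in a flat (Euclidean) geometry can be written as below. The field name u and the choice of units in which the wave speed is one are assumptions made here for illustration; the text above does not fix a notation.

```latex
\[
  \partial_t^2 u \;=\; \partial_x^2 u + \partial_y^2 u + \partial_z^2 u
\]
```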

2 Thorn Structure

We assume the reader is familiar with the structure of a Cactus thorn written e.g. in C or C++. An OpenCL thorn is slightly more complex because (1) it has to describe when and what data are moved between host and device, and (2) Cactus does not (yet?) support calling OpenCL code directly, so some boilerplate code is necessary.

2.1 Schedule Declarations

Thorn WaveToyOpenCL relies on thorn Accelerator to handle data movement between host and device. This does not need to be managed explicitly; instead, the file schedule.ccl describes which routines are executed where (host or device), and which variables or groups are read or written.

The location where a scheduled routine is ultimately executed is specified in a Device= schedule tag. The set of variables that are read and/or written is declared in READS and WRITES schedule statements. For example, this is the schedule item for the evolution routine of thorn WaveToyOpenCL:

SCHEDULE WaveToyOpenCL_Evol AT evol  
{  
  LANG:   C  
  TAGS:   Device=1  
  WRITES: WaveToyOpenCL::Scalar  
} "Evolve scalar wave"

The Device=1 tag indicates that this routine executes on the device, i.e. its kernel is implemented in OpenCL. Note that, in OpenCL, both CPU and GPU count as devices (thus every routine written in OpenCL counts as executing on a device, even if the device happens to be the CPU).

The WRITES statement indicates that this routine writes (i.e. defines) the grid function group Scalar, without reading (the current timelevel of) this group.
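The write-only access to the current timelevel can be illustrated with a minimal host-side sketch in C++. This is a hypothetical one-dimensional stand-in, not the thorn's actual OpenCL kernel: the function name evolve_step, the three-timelevel leapfrog scheme, and the variable names u, u_p and u_p_p are all assumptions chosen for illustration. The point is only that the current timelevel u appears on the left-hand side and is never read.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 1D leapfrog step for the wave equation (illustration only).
//   u     : current timelevel, written but never read
//   u_p   : previous timelevel, read only
//   u_p_p : timelevel before the previous one, read only
// Only interior points are updated; boundary points are assumed to be
// handled by a separate boundary routine.
void evolve_step(std::vector<double>& u,
                 const std::vector<double>& u_p,
                 const std::vector<double>& u_p_p,
                 double dt, double dx)
{
  const std::size_t n = u.size();
  const double factor = (dt * dt) / (dx * dx);
  for (std::size_t i = 1; i + 1 < n; ++i) {
    // Second-order centred finite difference of the past timelevel
    const double laplacian = u_p[i - 1] - 2.0 * u_p[i] + u_p[i + 1];
    u[i] = 2.0 * u_p[i] - u_p_p[i] + factor * laplacian;
  }
}
```

Because u is only ever assigned, whatever values it held before the call are irrelevant to the result at interior points. This is the property that a WRITES declaration without a matching READS expresses, and it means the old contents of the current timelevel need not be valid before the routine runs.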

2.2 Schedule Routines

Executing OpenCL code requires some boilerplate: one needs to choose an OpenCL platform and device, compile the code (from a C string), pass in arguments, and finally execute the actual kernel code. Thorn OpenCLRunTime provides a simple helper routine for these tasks that can be used e.g. as follows: