CarpetX is a Cactus driver thorn based on AMReX, a software framework for block-structured AMR (adaptive mesh refinement). CarpetX is intended for the Einstein Toolkit.
Driver thorns are special modules that provide distributed data structures, mesh refinement, memory allocation, and interfaces to parallel computing hardware and software. In short, they provide all the low-level infrastructure needed for any scientific simulation.
TODO: We should have some words explaining what CarpetX is.
There are a few standard images for CarpetX.
A set of them lives in the docker directory of CarpetX's main GitHub repository. These build the various packages that CarpetX depends on mostly by hand, and they are probably the more official set of images.
Another is built from this Dockerfile. This image is based on Spack, a flexible, Python-based system for installing packages. The image contains option list files for Cactus in /usr/cactus/local-cpu.cfg and /usr/local-gpu.cfg, as well as two versions of the hpctoolkit tool, one CUDA-enabled and one that is not. This rather beefy image is nearly 40GB in size, but it provides you with a complete set of tools for building and running CarpetX, Cactus, and the various development projects taking place within that framework. The image is maintained at docker.io/stevenrbrandt/etworkshop.
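If you simply want to try the image with plain Docker on a workstation, a minimal sketch looks like the following; the mount path and the --gpus flag are placeholders and assumptions that depend on your machine:

docker pull stevenrbrandt/etworkshop
docker run --rm -it --gpus all \
    -v $HOME/simulations:/work \
    stevenrbrandt/etworkshop /bin/bash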
If you are running on a cluster with Singularity installed, you can build a Singularity image from it as follows:
singularity build -F /work/sbrandt/images/etworkshop.simg \
    docker://stevenrbrandt/etworkshop
You can run the image using something like the following:
srun singularity exec --nv \
    --bind /var/spool --bind /work \
    --bind /etc/ssh/ssh_known_hosts \
    /work/sbrandt/images/etworkshop.simg cactus_sim my_parfile.par
Whether you choose to use one of the above images, or create an installation based on what you find in the dockerfiles for the above images, we wish you luck in your compiling and running efforts.
CCL files are used to define Cactus thorns and customize their behavior. In this section, we discuss the additions that CarpetX brings to these files, but we do not teach users how to write them from scratch. To learn more, refer to Sec. C of the Cactus user guide provided in the doc folder of all distributions.
Users may need to require certain thorns explicitly in configuration.ccl files when using CarpetX APIs or when scheduling functions in specific bins provided by thorns in the CarpetX arrangement. These thorn-specific additions are documented in each thorn's section of this text.
An important point to note is that CarpetX strictly enforces, at both compile time and runtime, the read, write, and location correctness declared in schedule.ccl files. This means that one cannot write to a variable declared as READ. CarpetX also marks points in grid regions as valid or invalid and prevents users from reading invalid data points. These extra measures help users avoid common memory and region access bugs.
In their C++ code, users should always use the DECLARE_CCTK_ARGUMENTSX_<scheduled function name> macro to ensure that only the objects declared in the function's schedule block in schedule.ccl are brought into scope, together with the other objects necessary for working with CarpetX.
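For illustration, a minimal scheduled function (the thorn and function names here are hypothetical) would look like this:

extern "C" void MyThorn_Update(CCTK_ARGUMENTS) {
  // Brings into scope exactly the grid functions listed in the READS/WRITES
  // statements of MyThorn_Update's schedule block, plus CarpetX helpers such
  // as the grid object used for looping.
  DECLARE_CCTK_ARGUMENTSX_MyThorn_Update;
  DECLARE_CCTK_PARAMETERS;

  // ... operate on the declared grid functions here ...
}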
The centering of a grid function is declared in interface.ccl files via CENTERING=<centering-code>, where <centering-code> is a three-letter code in which each letter is either c or v, representing cell centering or vertex centering in the corresponding direction. For example, to declare a group of variables as cell centered, one would use
CCTK_<type> <group-name> CENTERING={ccc}
{
  ...
} "A group of variables"
Similarly, for a vertex centered group one would use
CCTK_<type> <group-name> CENTERING={vvv}
{
  ...
} "A group of variables"
Note that it is possible to mix centering indexes. For example, to have a quantity centered at faces in the x direction, one would write
CCTK_<type> <group-name> CENTERING={cvv}
{
  ...
} "A group of variables"
Note also that if no CENTERING is specified, CarpetX assumes that the variable group is vertex centered in all directions.
Finally, it is very important to match the centering description in interface files with the description passed to loop API template arguments. See Sec. 5 for details.
Tags are declared in interface statements (in interface.ccl) with the TAGS=<tags> syntax. The <tags> declaration consists of a single-quoted string (marked by ') with space-separated key-value pairs of the form key="value". For example, a tagged interface declaration looks similar to the following
CCTK_<type> <group-name> TAGS='key_1="value_1" key_2="value_2" ...'
{
  ...
} "A group of variables"
We will now list the tags supported by CarpetX and describe their usage.
The rhs tag indicates which group of grid variables holds the RHS (right-hand side) of the group being declared. ODESolvers uses this information to perform time steps and update the state vector (see Sec. 6 for more information). For example, in
CCTK_<type> state_vector TAGS='rhs="right_hand_side"'
{
  ...
} "A group of variables representing the PDE system state"

CCTK_<type> right_hand_side
{
  ...
} "A group of variables representing the PDE RHS"
the rhs="right_hand_side" tag indicates that the group named state_vector has a corresponding RHS group named right_hand_side.
The dependents tag lists the groups of variables that must be marked as invalid whenever there are changes to the group being declared. For example, in
CCTK_<type> parent_group TAGS='dependents="child1 child2"'
{
  ...
} "A group of real variables"
the tag dependents="child1 child2" indicates that parent_group has two dependent groups, child1 and child2. Whenever CarpetX writes to variables in parent_group, variables in child1 and child2 are marked as invalid and cannot be read from unless written to again.
The checkpoint tag indicates whether a variable group must be saved in simulation checkpoints. Cactus and CarpetX can later recover the saved checkpoint variables and restore the simulation state to continue the evolution from there. Usually, only state vectors are checkpointed (see Chap. A3 of the Cactus user manual for further details). For example, in
CCTK_<type> state_vector TAGS='checkpoint="yes"'
{
  ...
} "A group of real variables describing the simulation state"

CCTK_<type> right_hand_side TAGS='checkpoint="no"'
{
  ...
} "A group of real variables describing the simulation RHS"
the simulation state, represented by the state_vector group, is checkpointed, while the right_hand_side group is not.
The parameters CarpetX::checkpoint_method and CarpetX::recover_method control CarpetX's behavior during variable checkpointing and recovery, respectively. Both parameters accept the same keyword values, described in Tab. 1.
Setting value | Effect |
error | Abort with error instead of checkpointing/recovering (default) |
openpmd | Uses openpmd formatted files for writing checkpoints |
silo | Uses silo files for writing checkpoints |
TODO: Using TerminationTrigger to control checkpointing does not work yet.
To control when a checkpoint is produced, users first need to set the flesh parameter Cactus::terminate, which controls under which conditions a simulation is terminated (see Sec. D5.2 of the Cactus user guide for further details on the possible values of this parameter). After that, they can control various aspects of the checkpoint files produced, such as file names and checkpoint production on termination, by setting parameters in the IO thorn. This is an infrastructure thorn that is always active and communicates its settings to CarpetX, which is effectively responsible for writing data to disk. To see all available parameters that can be set in IO, read lines 137-205 of the file Cactus/repos/cactusbase/IOUtil/param.ccl, where these parameters are well documented.
Additionally, it is imperative to set the flesh's presync mode to "presync-only" by adding Cactus::presync_mode = "presync-only" to the simulation's parameter file. Failure to do so will cause the recovered checkpoints to be valid only in the interior and not on ghosts and boundaries (see Sec. D5.2 of the Cactus user guide for more information on presync modes).
As an example, we present an excerpt of a parameter file for a simulation that ends after the simulation time reaches \(t=100\), but where Cactus is allowed to run for only 1 minute of wall time. For the physical system being simulated, this execution time is not enough to reach \(t=100\), so Cactus will terminate before the system is completely evolved. We would like to be able to recover from the last iteration performed in subsequent executions of Cactus. To do that, we configure the IO thorn to produce checkpoint files upon Cactus termination.
# Required in order to recover successfully
Cactus::presync_mode = "presync-only"

# The type of checkpoint and recovery files that should be used
CarpetX::checkpoint_method = "openpmd"
CarpetX::recover_method = "openpmd"

# Schedule termination of the simulation for 1 minute of wall time
Cactus::terminate = "runtime"
Cactus::cctk_final_time = 100.0
Cactus::max_runtime = 1.0

# Configure IO to produce checkpoint files before terminating
IO::checkpoint_ID = no
IO::recover = "autoprobe"
IO::out_proc_every = 2
IO::checkpoint_on_terminate = yes
IO::checkpoint_dir = "../checkpoints"
IO::recover_dir = "../checkpoints"
IO::abort_on_io_errors = yes
The parity tag controls a tensor’s reflection symmetries about the x, y and z axes. There are 3 parities for each tensor component, representing the symmetries about the 3 spatial axes.
For example, a scalar has parities={+1 +1 +1}, which indicates that no sign changes take place when the quantity is reflected about the axes. If no parities tag is present in a variable group declaration, the group is assumed to be a scalar and parities={+1 +1 +1} is implied. For a vector, the tag parities={-1 +1 +1 +1 -1 +1 +1 +1 -1} indicates, in the first triad of numbers, that the vector's x component changes sign when reflected in the x direction; the second triad indicates that the y component changes sign when reflected in the y direction; and the third triad indicates that the z component changes sign when reflected in the z direction. As an example, let us look at the parities of the energy-momentum tensor in general relativity as declared in the interface.ccl file of the TmunuBaseX thorn included in CarpetX:
# The T_{tt} component represents the energy density and
# is stored as a scalar. The parities={+1 +1 +1} tag is implicit.
CCTK_REAL eTtt TAGS='checkpoint="no"' TYPE=gf "T_00"

# The T_{ti} components of the tensor are a vector of three entries.
# Each component changes sign when reflected on their respective axes.
CCTK_REAL eTti TAGS='parities={-1 +1 +1 +1 -1 +1 +1 +1 -1} checkpoint="no"' TYPE=gf
{
  eTtx eTty eTtz
} "T_0i"

# The T_{ij} components form a rank 2 tensor of 3 spatial dimensions.
# Each index triplet represents a component in the tensor.
CCTK_REAL eTij TAGS='parities={+1 +1 +1 -1 -1 +1 -1 +1 -1 +1 +1 +1 +1 -1 -1 +1 +1 +1} checkpoint="no"' TYPE=gf
{
  eTxx eTxy eTxz eTyy eTyz eTzz
} "T_ij"

# The parities tag on this tensor indicates that:
# * The xx component does not change sign.
# * The xy component changes sign while reflecting on either x or y.
# * The xz component changes sign while reflecting on either x or z.
# * The yy component does not change sign.
# * The yz component changes sign while reflecting on either y or z.
# * The zz component does not change sign.
In this section, we describe CarpetX's parameters used for describing the simulation domain. Currently, CarpetX supports only Cartesian grids. To control the grid's extents, users must set the parameters described in Tab. 2. To control the number of cells in each direction, users must set the parameters described in Tab. 3. Finally, to control the number of ghost zones in the grid, users must set the parameters described in Tab. 4. It is important to note that the size of all ghost zones can be set at once via the ghost_size parameter, or via the ghost_size_(xyz) family of parameters if ghost_size is set to \(-1\). A short parameter file excerpt illustrating these settings is given after the tables below.
Parameter | Description | Default value |
xmin | Minimum value of the grid in the \(x\) direction | \(-1.0\) |
xmax | Maximum value of the grid in the \(x\) direction | \(1.0\) |
ymin | Minimum value of the grid in the \(y\) direction | \(-1.0\) |
ymax | Maximum value of the grid in the \(y\) direction | \(1.0\) |
zmin | Minimum value of the grid in the \(z\) direction | \(-1.0\) |
zmax | Maximum value of the grid in the \(z\) direction | \(1.0\) |
Parameter | Description | Default value |
ncells_x | Number of grid cells in the \(x\) direction | \(128\) |
ncells_y | Number of grid cells in the \(y\) direction | \(128\) |
ncells_z | Number of grid cells in the \(z\) direction | \(128\) |
Parameter | Description | Default value |
ghost_size | Number of ghost zones in all directions (set to \(-1\) to use the per-direction parameters below) | \(-1\) |
ghost_size_x | Number of ghost zones in the \(x\) direction | \(1\) |
ghost_size_y | Number of ghost zones in the \(y\) direction | \(1\) |
ghost_size_z | Number of ghost zones in the \(z\) direction | \(1\) |
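The following parameter file excerpt (the numerical values are only illustrative) sets up a cubic domain with uniform resolution and two ghost zones in every direction:

CarpetX::xmin = -10.0
CarpetX::xmax = +10.0
CarpetX::ymin = -10.0
CarpetX::ymax = +10.0
CarpetX::zmin = -10.0
CarpetX::zmax = +10.0

CarpetX::ncells_x = 64
CarpetX::ncells_y = 64
CarpetX::ncells_z = 64

# Set all ghost zones at once; leave it at -1 to use
# ghost_size_x, ghost_size_y and ghost_size_z instead
CarpetX::ghost_size = 2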
In CarpetX, loops over grid elements are not written explicitly. Operations that are to be executed for every grid element (cells, edges or points) are specified via C++ lambda functions, also known as closures or anonymous functions.
These objects behave like regular C++ functions, but they can be defined inline, that is, in the body of a function or as an argument to another function.
An important concept to grasp when working with lambda functions is that of captures. If a lambda (let us call it the child function) is defined in the body of an outer function (the parent function), the child can access variables defined in the parent function, provided that these variables are captured. The two most relevant capture modes when using CarpetX are capture by reference (denoted by the & sign inside the square brackets that start the lambda) and capture by value (denoted by an = sign inside the square brackets of the lambda declaration).
When running on GPUs, captures by value are required and captures by reference are forbidden. This is because the captured data must be copied from host (CPU) memory to device (GPU) memory for the lambda to execute there.
The API for writing loops in CarpetX is provided by the Loop thorn. To use it, one must add
REQUIRES Loop
to the thorn’s configuration.ccl file and
INHERITS: CarpetXRegrid
USES INCLUDE HEADER: loop_device.hxx
to the thorn’s interface.ccl file. Furthermore, one must include the Loop API header file in all source files where the API is needed by adding
#include <loop_device.hxx>
to the beginning of the source file.
Before actually writing any code that iterates over grid elements, one must choose which elements are to be iterated over. We shall refer to the set of points in the grid hierarchy that will be iterated over when a loop is executed as a Loop region. The following regions are defined in the Loop API:
All: This region refers to all points contained in the grid. Denoted in code by the all suffix.
Interior: This region refers to the interior of the grid. Denoted in code by the int suffix.
Outermost interior: This region refers to the outermost "boundary" points in the interior. They correspond to points shifted inwards by cctk_nghostzones from those that CarpetX identifies as boundary points. From the perspective of CarpetX (or AMReX), these points do not belong to the outer boundary, but rather to the interior. The region excludes ghost faces, but includes ghost edges/corners on non-ghost faces. The loop visits faces first, then edges, then corners. Denoted in code by the outermost_int suffix.
TODO: Picture of grid regions
Loop API functions are methods of the GridDescBaseDevice class, which contains functions for looping over grid elements on the CPU or on the GPU. The macro DECLARE_CCTK_ARGUMENTSX_<function name> provides a variable called grid, which is an instance of this class. The name of each looping method is formed according to
loop_ + <loop region> + [_device]
For example, to loop over boundaries using the CPU one would call
grid.loop_bnd<...>(...);
To obtain a GPU equivalent version, one would simply append _device to the function name. Thus, for example, to loop over the interior using a GPU, one would call
grid.loop_int_device<...>(...);
Let us now look at the required parameters of the loop methods. The typical signature is as follows
template <int CI, int CJ, int CK, ..., typename F>
void loop_REG_PU(const vect<int, dim> &group_nghostzones, const F &f);
The meanings of the template parameters are as follows:
CI: Centering index for the first grid direction. Must be set explicitly to either 0 or 1: 0 means that the loop visits grid vertices along this direction, while 1 means that it visits cell centers.
CJ: Centering index for the second grid direction. Must be set explicitly to either 0 or 1: 0 means that the loop visits grid vertices along this direction, while 1 means that it visits cell centers.
CK: Centering index for the third grid direction. Must be set explicitly to either 0 or 1: 0 means that the loop visits grid vertices along this direction, while 1 means that it visits cell centers.
F: The type signature of the lambda function passed to the loop. It is not required to be set explicitly and is automatically deduced by the compiler.
Note that centering indices can be mixed: setting the indices to \((1,0,0)\), for instance, creates loops over faces in the x direction. The meanings of the function parameters are as follows (a minimal call combining them is sketched after this list):
group_nghostzones: The number of ghost zones in each direction of the grid. This can be obtained from grid.nghostzones.
f: The C++ lambda to be executed on each step of the loop.
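Putting the pieces together, a minimal call to a loop method (here over the interior of a vertex-centered grid; the lambda body is elided) has the following shape:

grid.loop_int<0, 0, 0>(
    grid.nghostzones,
    [=] (const Loop::PointDesc &p) {
      // operate on the grid element described by p
    });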
We shall now discuss the syntax and the available elements of the lambda functions that are to be fed to the Loop methods described in Section 5.2.
To start, let us be reminded of the general syntax of a lambda function in C++:
// append ; if assigning to a variable
[capture_parameter] (argument_list) -> return_type { function_body }
When running on GPUs, the capture_parameter field should always be =, indicating capture by value (copy), rather than &, which indicates capture by reference. The argument_list of the lambda should receive only one element, of type PointDesc (described in Sec. 5.4), and the lambda must return no value, which means that return_type can be omitted altogether.
This means that a typical lambda passed to a loop method will have the form
[=] (const Loop::PointDesc &p) {
  // loop body
}
In addition, users can request inlining (a compile-time optimization where the function call is replaced by its body) by using the CCTK_ATTRIBUTE_ALWAYS_INLINE macro and can designate host or device availability via the CCTK_HOST and CCTK_DEVICE macros, respectively. These annotations are optional unless the compiler cannot determine on its own that a function must be available for device usage, for example. When these annotations are used, however, they must be placed in specific locations.
To mark a lambda for host usage, one would use:
[=] CCTK_HOST(const Loop::PointDesc &p) CCTK_ATTRIBUTE_ALWAYS_INLINE {
  // loop body
}
For device usage, one would use:
[=] CCTK_DEVICE(const Loop::PointDesc &p) CCTK_ATTRIBUTE_ALWAYS_INLINE {
  // loop body
}
For both host and device usage, one would use:
[=] CCTK_HOST CCTK_DEVICE(const Loop::PointDesc &p) CCTK_ATTRIBUTE_ALWAYS_INLINE {
  // loop body
}
The PointDesc type provides a complete description of the current grid element in the loop. The following members are the ones expected to be used most often:
I: A 3-element array containing the grid point indices.
DI: A 3-element array containing the direction unit vectors from the current grid point.
X: A 3-element array containing the point’s coordinates.
DX: A 3-element array containing the point’s grid spacing.
iter: The current loop iteration.
BI: A 3-element array containing the outward boundary normal if the current point is an outermost interior point, or zero otherwise.
In the body of a loop lambda, grid functions declared in the thorn's interface.ccl file are available as GF3D2 objects, which are C++ wrappers around native Cactus grid functions. These objects are accessed by calling them as functions taking arrays of grid indices as input. Such indices, in turn, can be obtained directly from PointDesc members.
Let us now combine the elements described thus far into a single example. Suppose that the following system of PDEs is implemented in Cactus:
\begin {align} \partial _t u & = \rho \label {eq:toy_loop_0}\\ \partial _t \rho & = \partial _x^2 u + \partial _y^2 u + \partial _z^2 u \label {eq:toy_loop_1} \end {align}
Suppose that the grid functions u and rho were made available, while the grid functions u_rhs and rho_rhs are their corresponding RHS storage variables. The function that computes the RHS of the system above can be written as
1extern "C" void LoopExample_RHS(CCTK_ARGUMENTS) { 2 DECLARE_CCTK_ARGUMENTS_LoopExample_RHS; 3 DECLARE_CCTK_PARAMETERS; 4 5 // The grid variable is implicitly defined via the CCTK macros 6 // A 0/1 in template parameters indicate that a grid is vertex/cell centered 7 grid.loop_int<0, 0, 0>( 8 grid.nghostzones, 9 10 // The loop lambda 11 [=] (const Loop::PointDesc &p) { 12 using std::pow; 13 14 const CCTK_REAL hx = p.DX[0] * p.dX[0]; 15 const CCTK_REAL hy = p.DX[1] * p.dX[1]; 16 const CCTK_REAL hz = p.DX[2] * p.dX[2]; 17 18 const CCTK_REAL dudx = u(p.I - p.DI[0]) - 2 * u(p.I) 19 + u(p.I + p.DI[0])/hx; 20 21 const CCTK_REAL dudy = u(p.I - p.DI[1]) - 2 * u(p.I) 22 + u(p.I + p.DI[1])/hy; 23 24 const CCTK_REAL dudz = u(p.I - p.DI[2]) - 2 * u(p.I) 25 + u(p.I + p.DI[2])/hz; 26 27 u_rhs(p.I) = rho(p.I); 28 rho_rhs(p.I) = ddu; 29 30 } // Ending of the loop lambda 31 ); // Ending of the loop_int call 32}
If the user's CPU supports SIMD instructions, CarpetX is capable of vectorizing loops via the Arith thorn. To use it, users must add
USES INCLUDE HEADER: simd.hxx
USES INCLUDE HEADER: vect.hxx
to the top of their interface.ccl files, in addition to the other required headers.
When writing SIMD-vectorized code, one has to keep in mind that grid functions are no longer accessed as individual CCTK_REAL values but as Arith::simd<CCTK_REAL> values, each of which is itself a short vector of CCTK_REAL values. This becomes apparent when reading and writing grid functions at a particular grid point. To see how these differences come about, let us study an example of initializing grid data using the SIMD API
1 extern "C" void SIMDExample_Initial(CCTK_ARGUMENTS) { 2 DECLARE_CCTK_ARGUMENTSX_SIMDExample_Initial; 3 DECLARE_CCTK_PARAMETERS; 4 5 using real = CCTK_REAL; 6 using vreal = Arith::simd<real>; 7 8 // This is the compile time determined vector size supported by the underlying CPU architecture 9 constexpr std::size_t vsize = std::tuple_size_v<vreal>; 10 11 // After passing the centering indices, size of the SIMD vectors is passed as template arguments 12 grid.loop_int_device<0, 0, 0, vsize>( 13 grid.nghostzones, 14 15 [=] (const Loop::PointDesc &p) { 16 // p.x is a scalar, but must be turned into a vector 17 const vreal x0 = p.x + Arith::iota<vreal>() * p.dx; 18 const real y0 = p.y; 19 const real z0 = p.z; 20 21 // The initialization function takes its inputs as vectors 22 vreal u0, rho0; 23 initial_data(x0, y0, z0, u0, rho0); 24 25 u.store(p.mask, p.I, u0); 26 rho.store(p.mask, p.I, rho0); 27 } 28 ); 29}
In lines 5 and 6, we define aliases for the real scalar type CCTK_REAL and its associated vector type Arith::simd<real>. Values assigned to grid functions need to be of this vector type. In line 9, the vsize variable stores the size of the SIMD vectors of the target CPU. The constexpr keyword ensures that this expression is evaluated at compile time. In line 12, we call a loop API function with the three centering indices discussed in Sec. 5.2, plus the extra vsize parameter, which informs the loop method that the loop will be vectorized.
At this point, it is very important to realize that loops can only be vectorized along the x direction. This is because the entries of a SIMD vector must be sequential in memory, and internally CarpetX stores 3D grid data as a flattened array in which only the x direction is contiguous. Line 17 is responsible for the vectorization of the x direction. The Arith::iota<vreal>() instruction produces an array of contiguously increasing elements from 0 to vsize (end not included), which then gets multiplied by p.dx and added to p.x. As an illustrative example, let us suppose that \(\texttt {vsize} = 4\). In this case, \begin {equation} \texttt {Arith::iota<vreal>()} = \begin {pmatrix} 0\\ 1\\ 2\\ 3 \end {pmatrix} . \end {equation} The operation on line 17 then becomes \begin {equation} \texttt {x0} = \texttt {p.x} + \begin {pmatrix} 0\\ 1\\ 2\\ 3\\ \end {pmatrix} \texttt {p.dx} = \begin {pmatrix} \texttt {p.x}\\ \texttt {p.x} + \texttt {p.dx}\\ \texttt {p.x} + 2 * \texttt {p.dx}\\ \texttt {p.x} + 3 * \texttt {p.dx}\\ \end {pmatrix} \end {equation}
On lines 22 and 23, the initial_data function gets called and uses the vectorized x0 coordinate and the scalar coordinates y0 and z0 to fill the vectorized initial data variables u0 and rho0, which finally get assigned to their respective grid functions via the store method on lines 25 and 26. The initial_data function is arbitrary and user defined, but note that Arith overloads arithmetic operators and trigonometric functions, so it is straightforward to write code that mixes vectorized and scalar variables.
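For completeness, a possible definition of initial_data is sketched below. The standing-wave profile, the function signature, and the host/device annotations are illustrative assumptions; the point is only how simd and scalar values mix:

// Illustrative sketch of a SIMD-aware initial data function.
CCTK_HOST CCTK_DEVICE void initial_data(const Arith::simd<CCTK_REAL> x,
                                        const CCTK_REAL y, const CCTK_REAL z,
                                        Arith::simd<CCTK_REAL> &u,
                                        Arith::simd<CCTK_REAL> &rho) {
  using std::sin; // scalar overloads; Arith provides the simd overloads
  // sin(x) acts element-wise on the simd vector, while sin(y) and sin(z) are
  // ordinary scalars; multiplying a simd value by scalars broadcasts them.
  u = sin(x) * sin(y) * sin(z);
  // A simd value filled with zeros, written with the same simd-times-scalar pattern
  rho = sin(x) * 0.0;
}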
Let us now look at an example of writing derivatives and RHS functions with SIMD loops.
1 extern "C" void SIMDExample_RHS(CCTK_ARGUMENTS) { 2 DECLARE_CCTK_ARGUMENTSX_SIMDExample_RHS; 3 DECLARE_CCTK_PARAMETERS; 4 5 using vreal = Arith::simd<CCTK_REAL>; 6 constexpr std::size_t vsize = std::tuple_size_v<vreal>; 7 8 grid.loop_int_device<0, 0, 0, vsize>( 9 grid.nghostzones, 10 11 [=] (const Loop::PointDesc &p) { 12 using Arith::pow2; 13 14 const auto d2udx2 = (u(p.mask, p.I - p.DI[0]) - 2 * u(p.mask, p.I) + u(p.mask, p.I + p.DI[0]) ) / pow2(p.DX[0]); 15 16 const auto d2udy2 = (u(p.mask, p.I - p.DI[1]) - 2 * u(p.mask, p.I) + u(p.mask, p.I + p.DI[1]) ) / pow2(p.DX[1]); 17 18 const auto d2udz2 = (u(p.mask, p.I - p.DI[2]) - 2 * u(p.mask, p.I) + u(p.mask, p.I + p.DI[2]) ) / pow2(p.DX[2]); 19 20 const auto udot = rho(p.mask, p.I); 21 const auto rhodot = ddu; 22 23 u_rhs.store(p.mask, p.I, udot); 24 rho_rhs.store(p.mask, p.I, rhodot); 25 }); 26 }
As previously mentioned, Arith overloads arithmetic operators, which makes writing mathematical expressions in vectorized loops no different from their non-vectorized counterparts. This is exemplified in lines 14-18, where second derivatives are taken via finite difference approximations. Note once again, in line 8, the presence of an extra template argument indicating the CPU's vector size, the extra p.mask argument being passed to all invocations of grid functions, and the use of the store method to assign computed values to the RHS grid functions.
TODO: Document fused SIMD operations
In CarpetX, time integration of PDEs via the Method of Lines is provided by the ODESolvers thorn. This makes time integration tightly coupled with the grid driver, allowing for more optimization opportunities and better integration.
From the user's perspective, ODESolvers is very similar to (and sometimes even more straightforward than) the MoL thorn, but a few key differences need to be observed. Firstly, not all integrators available in MoL are available in ODESolvers. The list of all supported methods is displayed in Tab. 5. Method selection happens in the parameter file, by setting
ODESolvers::method = "Method name"
and the default method used if none is specified is "RK2".
Name | Description |
constant | The state vector is kept constant in time |
Euler | Forward Euler method |
RK2 | Explicit midpoint rule |
RK3 | Runge-Kutta’s third-order method |
RK4 | Classic RK4 method |
SSPRK3 | Third-order Strong Stability Preserving Runge-Kutta |
RKF78 | Runge-Kutta-Fehlberg 7(8) |
DP87 | Dormand & Prince 8(7) |
Implicit Euler | Implicit Euler method |
Additionally, users can set verbose output from the time integrator by setting
ODESolvers::verbose = "yes"
By default, this option is set to "no". Finally, to control the step size of the time integrator, it is possible to set the parameter CarpetX::dtfac, which defaults to \(0.5\) and is defined as \begin {equation} \texttt {dtfac} = \texttt {dt}/\min (\texttt {delta\_space}), \end {equation} where \(\min (\texttt {delta\_space})\) refers to the smallest grid spacing defined in the CarpetX grid and dt is the time integrator step.
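For example, a parameter file excerpt selecting the classic RK4 integrator with a smaller time step factor (the values are illustrative) reads:

ODESolvers::method  = "RK4"
ODESolvers::verbose = "no"
CarpetX::dtfac      = 0.25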
To actually perform time evolution, the PDE system of interest needs to be declared to Cactus as a set of left-hand side (LHS, more commonly called state vector) grid functions plus a set of right-hand side (RHS) grid functions. The RHS grid functions correspond exactly to the right-hand side of the evolution equations, while the state vector stores the variables being evolved in time at the current time step. More time levels can be stored internally, depending on the time integrator of choice, but this is an implementation detail supervised automatically by ODESolvers. To make this clear, consider the PDE system introduced in Sec. 5. In that example, the state vector would be the set \((u,\rho )\), while the right-hand side would be all elements to the right of the equal signs. Note that the derivatives appearing on the RHS are derivatives in space only. By discretizing space with a grid and replacing continuous derivatives with finite approximations (using finite differences, for instance), the time- and space-dependent PDE system becomes an ODE system in time, with the state vector holding the sought variables. Given the RHS of the PDE system, ODESolvers can apply the configured time stepping method and compute the next time steps of the state vector.
To see how ODESolvers is used in practice, let us turn once again to the WaveToyX example, bundled with CarpetX. To begin, let us look at an excerpt from this example’s interface.ccl file
 1 CCTK_REAL state TYPE=gf TAGS='rhs="rhs" dependents="energy error"'
 2 {
 3   u
 4   rho
 5 } "Scalar wave state vector"
 6
 7 CCTK_REAL rhs TYPE=gf TAGS='checkpoint="no"'
 8 {
 9   u_rhs
10   rho_rhs
11 } "RHS of scalar wave state vector"
12
13 ...
In lines 1-5, the group of real grid functions called state, consisting of the grid functions u and rho, is declared. The TYPE=gf entry indicates that the variables in this group are grid functions, and the TAGS entry is particularly important here, so readers are encouraged to revisit Sec. 3 for more information. The rhs="rhs" tag indicates that these grid functions have an associated RHS group, that is, a group of grid functions responsible for storing the PDE system's RHS; this group is called "rhs" and is defined later, in lines 7-11. This information is used by ODESolvers while taking a time step and is tightly coupled to the Cactus file parsers. In lines 7-11, the rhs group is declared with two real grid functions, u_rhs and rho_rhs. These variables are responsible for holding the RHS data of the PDE, which will in turn be used by ODESolvers.
The next step is to schedule the execution of functions in their correct schedule groups. The most relevant schedule groups provided by ODESolvers are ODESolvers_RHS and ODESolvers_PostStep. The former is where one evaluates the RHS of the state vector everywhere on the grid, and the latter is where boundary conditions are applied to the state vector, along with projections if necessary. For example, looking at WaveToyX's schedule.ccl file, one sees
 1 SCHEDULE WaveToyX_RHS IN ODESolvers_RHS
 2 {
 3   LANG: C
 4   READS: state(everywhere)
 5   WRITES: rhs(interior)
 6   SYNC: rhs
 7 } "Calculate scalar wave RHS"
 8
 9 SCHEDULE WaveToyX_Energy IN ODESolvers_PostStep
10 {
11   LANG: C
12   READS: state(everywhere)
13   WRITES: energy(interior)
14   SYNC: energy
15 } "Calculate scalar wave energy density"
The schedule statement in lines 1-7 schedules the function that computes the RHS of the wave equation. Note that the function reads the state on the whole grid and writes to the RHS grid variables in the interior. With CarpetX, grid function read and write statements are enforced: you cannot write to a variable that was declared as read-only in the schedule.ccl file. Lines 9-15 exemplify the scheduling of a function in the ODESolvers_PostStep group, which is executed after ODESolvers_RHS during the time stepping loop. In this particular example, the scheduled function computes the energy associated with the scalar wave equation system. These are all the required steps for using ODESolvers to solve a PDE system via the method of lines.
CarpetX supports fixed mesh refinement via the so-called box-in-box paradigm. This capability is provided by the BoxInBox thorn. Using it is very simple and similar to Carpet's CarpetRegrid2 usage.
Name | Type | Possible Values | Default Value | Description |
shape_n | String | "sphere" or "cube" | "sphere" | Shape of refined region |
active_n | Boolean | "yes" or "no" | "yes" | Is this box active? |
num_levels_n | Single integer | \([1,30]\) | \(1\) | Number of refinement levels for this box |
position_x_n | Single real number | Any real | \(0.0\) | x-position of this box |
position_y_n | Single real number | Any real | \(0.0\) | y-position of this box |
position_z_n | Single real number | Any real | \(0.0\) | z-position of this box |
radius_n | 30 element array of reals | \(-1.0\) or positive real | \(-1.0\) (radius ignored) | Radius of refined region for this level |
radius_x_n | 30 element array of reals | \(-1.0\) or positive real | \(-1.0\) (radius ignored) | x-radius of refined region for this level |
radius_y_n | 30 element array of reals | \(-1.0\) or positive real | \(-1.0\) (radius ignored) | y-radius of refined region for this level |
radius_z_n | 30 element array of reals | \(-1.0\) or positive real | \(-1.0\) (radius ignored) | z-radius of refined region for this level |
All configuration of boxes and levels is performed within parameter files. BoxInBox supports adding 3 "boxes" or "centers". Each box can be configured as summarized in Tab. 6, with the n suffix replaced by 1, 2 or 3 for the corresponding box. Each box can be shaped as either a Cartesian-like cube or a sphere and supports up to 30 refinement levels. Level positions and radii can be set independently for each dimension. Note that for each box the active, num_levels and position_(xyz) fields are stored as grid scalars, and the 30 refinement level radii and the individual x, y, z radii of each box are stored as grid arrays. This allows these quantities to be changed during a simulation run, allowing for moving boxes. This is useful, for example, when implementing a puncture tracker.
These settings are subject to (and restricted by) two additional CarpetX parameters: CarpetX::regrid_every, which controls how many iterations pass before checking whether the box grid variables have changed, and CarpetX::max_num_levels, which controls the maximum number of allowed refinement levels.
As an example, we present a parameter file excerpt for creating two refinement boxes with the BoxInBox thorn
BoxInBox::num_regions = 2

BoxInBox::num_levels_1 = 2
BoxInBox::position_x_1 = -0.5
BoxInBox::radius_x_1[1] = 0.25
BoxInBox::radius_y_1[1] = 0.25
BoxInBox::radius_z_1[1] = 0.25

BoxInBox::num_levels_2 = 2
BoxInBox::position_x_2 = +0.5
BoxInBox::radius_x_2[1] = 0.25
BoxInBox::radius_y_2[1] = 0.25
BoxInBox::radius_z_2[1] = 0.25
Let us now suppose that one wishes to make the boxes set up in the above parameter file move in a circle around the origin. This is not very useful in practice, but it illustrates important concepts that can later be applied to more complex tools, such as puncture trackers. The MovingBoxToy thorn bundled with CarpetX provides an example of how to achieve this. We shall now examine this implementation closely, starting with the thorn's interface.ccl file:
1 # Interface definition for thorn MovingBoxToy
2
3 IMPLEMENTS: MovingBoxToy
4
5 INHERITS: BoxInBox
The INHERITS statement in line 5 states that this thorn will write to grid variables provided by BoxInBox, which control the refinement box parameters. Next, in the thorn's param.ccl file we have
1 # Parameter definitions for thorn MovingBoxToy
2
3 shares: BoxInBox
4 USES CCTK_REAL position_x_1
5 USES CCTK_REAL position_x_2
Lines 3-5 declare that MovingBoxToy uses the parameters position_x_1 and position_x_2 from BoxInBox. These declarations are required in order to access the initial positions of the boxes. Similar statements would be used if other parameters from BoxInBox were needed.
Finally, in the schedule.ccl file we schedule the routine that will actually update the box positions, called MovingBoxToy_MoveBoxes:
# Schedule definitions for thorn MovingBoxToy

SCHEDULE MovingBoxToy_MoveBoxes AT postinitial BEFORE EstimateError
{
  LANG: C
  READS: BoxInBox::positions
  WRITES: BoxInBox::positions
} "Update box positions"

SCHEDULE MovingBoxToy_MoveBoxes AT poststep BEFORE EstimateError
{
  LANG: C
  READS: BoxInBox::positions
  WRITES: BoxInBox::positions
} "Update box positions"
Note that the routine is scheduled with AT poststep BEFORE EstimateError. This is important, since it is in the EstimateError bin that CarpetX's AMR error grid function (see Sec. 7.2) is updated, so any changes to box data should be scheduled before that.
Finally, the C++ routine MovingBoxToy_MoveBoxes that actually updates the box positions reads
1 extern "C" void MovingBoxToy_MoveBoxes(CCTK_ARGUMENTS) { 2 DECLARE_CCTK_ARGUMENTSX_MovingBoxToy_MoveBoxes; 3 DECLARE_CCTK_PARAMETERS; 4 5 using std::cos; 6 7 const CCTK_REAL omega{M_PI/4}; 8 9 // Initial positions of box 1 10 const auto x0_1{position_x_1}; 11 12 // Initial positions of box 2 13 const auto x0_2{position_x_2}; 14 15 // Trajectory of box 1 16 position_x[0] = x0_1 * cos(omega * cctk_time); 17 position_y[0] = x0_1 * sin(omega * cctk_time); 18 19 // Trajectory of box 2 20 position_x[1] = x0_2 * cos(omega * cctk_time); 21 position_y[1] = x0_2 * sin(omega * cctk_time); 22 }
In lines 10 and 13, we read the BoxInBox parameters holding the initial positions of the boxes. In lines 16-21, those positions are updated so that the box centers describe a circle around the origin: the boxes move counterclockwise at \(\pi /4\) radians per unit of simulation time. Figure 1 shows 6 still frames of the boxes' motion around the origin. All panels are \(z=0\) slices of the grid hierarchy, and time and iteration values are provided for each panel. These plots were produced with the VisIt visualization software, with CarpetX producing Silo data files as output (see Sec. 12 for more details on how to visualize and post-process CarpetX data). The data can be reproduced by running the circle.par parameter file, provided in the par folder of the MovingBoxToy thorn. An animated version of Fig. 1 can be found in the doc folder of the CarpetX thorn under the name animated_boxes.gif.
CarpetX also supports non-fixed (adaptive) mesh refinement. For cell-level control of AMR, CarpetXRegrid provides users with a cell-centered, non-checkpointed grid function called regrid_error. Users are responsible for filling this grid function with real values however they see fit. Once it is filled, the parameter CarpetX::regrid_error_threshold controls regridding: if a value stored in regrid_error is larger than regrid_error_threshold, the corresponding region gets refined. Additionally, the parameter CarpetX::regrid_every controls how many iterations should pass before checking whether the error threshold has been exceeded, and CarpetX::max_num_levels controls the maximum number of allowed refinement levels.
Note that CarpetX does not provide a "standardized" regrid error routine. This is because refinement criteria are highly specific to the problem being solved via AMR, and thus there is no one-size-fits-all error criterion. This might seem inconvenient, but ultimately it allows users a higher degree of customization in their AMR codes. For demonstration purposes, we now provide a skeleton routine that estimates the regridding error as TODO: what? Provide a good starter example. This implementation could be used as a starting point for codes that wish to use different error criteria in their AMR grids.
1 extern "C" void EstimateError(CCTK_ARGUMENTS) { 2 DECLARE_CCTK_ARGUMENTSX_EstimateError; 3 DECLARE_CCTK_PARAMETERS; 4 5 // The template indices indicate this a loop over cell centers 6 // Remember that regrid_error is a cell centered grid function 7 grid.loop_int_device<1, 1, 1>( 8 grid.nghostzones, 9 [=] (const Loop::PointDesc &p) { 10 // TODO: Give a simple example 11 regrid_error(p.I) = 0.0; 12 }); 13}
Once defined, EstimateError should be scheduled in both the postinitial and poststep bins. The poststep bin gets called right after a new state vector has been calculated, and is thus the proper place to analyze it. The postinitial scheduling is also necessary for computing the initial refinement after initial conditions have been set up. A thorn making use of the regrid_error AMR mechanism should then add the following to its schedule.ccl file:
SCHEDULE EstimateError AT postinitial
{
  LANG: C
  READS: state(everywhere)
  WRITES: CarpetXRegrid::regrid_error(interior)
} "Estimate error for regridding"

SCHEDULE EstimateError AT poststep
{
  LANG: C
  READS: state(everywhere)
  WRITES: CarpetXRegrid::regrid_error(interior)
} "Estimate error for regridding"
Prolongation refers to the interpolation of data from a coarse grid onto a finer grid. CarpetX allows users to control how this transfer of data is performed via the parameter CarpetX::prolongation_type, whose possible values are detailed in Tab. 7. Additionally, other aspects of the prolongation operator can be controlled with the parameters detailed in Tab. 8; an example excerpt is given after the tables.
Method | Description |
"interpolate" | Interpolate between data points |
"conservative" | Interpolate cell averages, ensuring conservation |
"ddf" | Interpolate in vertex centered and conserve in cell centered directions |
"ddf-eno" | Interpolate in vertex centered and ENO-conserve in cell centered directions |
"ddf-hermite" | Hermite-interpolate in vertex centered and conserve in cell centered directions |
"natural" | Interpolate in vertex centered and conserve in cell centered directions, using the same order |
"poly-cons3lfb" | Interpolate polynomially in vertex centered directions and conserve with 3rd order accuracy and a linear fallback in cell centered directions |
Parameter | Type | Default value | Description |
prolongation_order | Integer larger than 0 | 1 | Controls the order of the interpolation in prolongation |
do_restrict | Boolean | yes | Automatically restrict fine to coarse grid functions |
restrict_during_sync | Boolean | yes | Restrict fine to coarse grid functions when syncing |
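For instance, a parameter file excerpt selecting the "ddf" operator while keeping the default order and restriction behavior (the choice of operator here is illustrative) reads:

CarpetX::prolongation_type  = "ddf"
CarpetX::prolongation_order = 1
CarpetX::do_restrict        = yes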
CarpetX provides a way of setting up standard boundary conditions and symmetry conditions (applied on the outer grid boundaries) via parameter files and grid function tags. More complex boundary conditions can be implemented by users, but doing so is often not an easy task due to how AMReX works internally.
The following boundary condition types can be specified:
"none": Don’t apply any boundary condition to the grid. This is the default setting for all boundaries.
"dirichlet": Dirichlet boundary conditions.
"linear extrapolation": Linearly extrapolate interior values to boundary values.
"neumann": Neumann boundary conditions.
"robin": Robin boundary conditions.
Remember that, given a function \(f(x,y,z)\) that satisfies a given PDE in a domain \(\Omega \subset \mathbb {R}^n\) whose boundary is denoted by \(\partial \Omega \), a Dirichlet boundary condition imposes that \begin {equation} f(x,y,z) = g \quad \forall (x,y,z) \in \partial \Omega , \label {eq:dirichlet} \end {equation} a Neumann boundary condition imposes that \begin {equation} \frac {\partial f}{\partial \mathbf {n}} = g \quad \forall (x,y,z) \in \partial \Omega , \label {eq:neumann} \end {equation} where \(\mathbf {n}\) is the normal to the boundary \(\partial \Omega \) and a Robin boundary condition imposes that \begin {equation} a f + b \frac {\partial f}{\partial \mathbf {n}} = g \quad \forall (x,y,z) \in \partial \Omega . \label {eq:robin} \end {equation}
The type of boundary condition in all 6 grid faces can be configured independently via the parameters described in Tab. 9.
Parameter | Description |
boundary_x | Boundary condition at lower \(x\) boundary |
boundary_upper_x | Boundary condition at upper \(x\) boundary |
boundary_y | Boundary condition at lower \(y\) boundary |
boundary_upper_y | Boundary condition at upper \(y\) boundary |
boundary_z | Boundary condition at lower \(z\) boundary |
boundary_upper_z | Boundary condition at upper \(z\) boundary |
Once boundary types are chosen, they are applied to all grid functions. Users may override the global configuration for a specific set of grid functions to which different boundary conditions should be applied. This is done by passing a list of space- (or newline-) separated strings to the appropriate parameter, depending on which boundary condition(s) should be overridden; an example parameter excerpt is given after the lists below. These parameters are of the form
CarpetX::<boundary condition name>_<direction>_vars = "..."
where <boundary condition name> is one of
dirichlet
linear_extrapolation
neumann
robin
and <direction> is one of
x
y
z
upper_x
upper_y
upper_z
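As an example (MyThorn::state is a hypothetical group name), the following excerpt applies Neumann conditions on all faces globally and overrides the upper x face of one group with linear extrapolation:

CarpetX::boundary_x       = "neumann"
CarpetX::boundary_upper_x = "neumann"
CarpetX::boundary_y       = "neumann"
CarpetX::boundary_upper_y = "neumann"
CarpetX::boundary_z       = "neumann"
CarpetX::boundary_upper_z = "neumann"

# Override: this group gets linear extrapolation at the upper x face
CarpetX::linear_extrapolation_upper_x_vars = "MyThorn::state"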
In order to set the actual values of Dirichlet, Neumann and Robin boundary conditions, users can specify values via the TAGS mechanism in interface.ccl, detailed in Sec. 3.3. Note that it is only possible to set the quantity \(g\) in Eqs. (4)-(6). The syntax is as follows
# For Dirichlet BCs
CCTK_<type> <group-name> TAGS='dirichlet_values={1.0}'
{
  ...
} "A group of variables"

# For Neumann BCs
CCTK_<type> <group-name> TAGS='neumann_values={1.0}'
{
  ...
} "A group of variables"

# For Robin BCs
CCTK_<type> <group-name> TAGS='robin_values={1.0}'
{
  ...
} "A group of variables"
If a variable group contains multiple variables or needs multiple boundary conditions, simply append the boundary values to the brace-enclosed list of boundary values and add another boundary condition as a tag. For example, consider the 3-metric variable group declared in the ADMBaseX thorn bundled with CarpetX, which consists of 6 grid functions, each with its own Dirichlet and Robin boundary condition values:
CCTK_REAL metric TYPE=gf TAGS='dirichlet_values={1.0 0.0 0.0 1.0 0.0 1.0} robin_values={1.0 0.0 0.0 1.0 0.0 1.0} ...'
{
  gxx gxy gxz gyy gyz gzz
} "ADM 3-metric g_ij"
CarpetX supports two types of symmetry boundary conditions: reflections and periodicity. Each of these conditions can be set individually for each grid face, as with boundary conditions. To set reflection symmetry conditions, the parameters in Tab. 10 are available. To set periodicity, the parameters in Tab. 11 are provided.
Parameter | Description |
reflection_x | Reflection symmetry at lower \(x\) boundary |
reflection_upper_x | Reflection symmetry at upper \(x\) boundary |
reflection_y | Reflection symmetry at lower \(y\) boundary |
reflection_upper_y | Reflection symmetry at upper \(y\) boundary |
reflection_z | Reflection symmetry at lower \(z\) boundary |
reflection_upper_z | Reflection symmetry at upper \(z\) boundary |
Parameter | Description |
periodic_x | Periodic \(x\) boundary |
periodic_y | Periodic \(y\) boundary |
periodic_z | Periodic \(z\) boundary |
The recommended way of applying custom boundary conditions is to write them into the RHS of the evolution system. Let us once again consider the function that computes the RHS of the example system presented in Sec. 5, now modified to also set boundary conditions:
1extern "C" void LoopExample_RHS(CCTK_ARGUMENTS) { 2 DECLARE_CCTK_ARGUMENTS_LoopExample_RHS; 3 DECLARE_CCTK_PARAMETERS; 4 5 // The grid variable is implicitly defined via the CCTK macros 6 // A 0/1 in template parameters indicate that a grid is vertex/cell centered 7 grid.loop_int<0, 0, 0>( 8 grid.nghostzones, 9 10 // The loop lambda 11 [=] (const Loop::PointDesc &p) { 12 using std::pow; 13 14 // Instead of computing values for each direction expression 15 // explicitly we will use a loop to save us from some typing 16 // ddu will store the partial results for each direction 17 Arith::vect<CCTK_REAL, dim> ddu; 18 19 // Loop over directions 20 for (int d = 0; d < dim; ++d) { 21 22 // Boundary associated with xmin ymin and zmin 23 if (p.BI[d] < 0) { 24 ddu[d] = 0; 25 } 26 // Boundary associated with xmax ymax and zmax 27 else if (p.BI[d] > 0) { 28 ddu[d] = 0; 29 } 30 // interior 31 else { 32 ddu[d] = (u(p.I - p.DI[d]) - 2 * u(p.I) + u(p.I + p.DI[d])) / 33 pow(p.DX[d], 2); 34 } 35 } 36 37 u_rhs(p.I) = rho(p.I); 38 rho_rhs(p.I) = ddu[0] + ddu[1] + ddu[2]; 39 40 } // Ending of the loop lambda 41 ); // Ending of the loop_int call 42}
Instead of computing derivatives and boundaries for each direction explicitly, we use a loop over the directions, written on line 20. The if branches on lines 23-25 and 27-29 set the values of the RHS at the boundaries associated with the xyz_min values and the xyz_max values, respectively. In this case, we are setting the boundary values to \(0\), but users would apply whatever boundary condition they require. The takeaway message of this example is how one accesses boundary points, rather than what one writes to those points.
The recommended way of performing interpolation is to use the flesh interface for interpolation by calling CCTK_InterpGridArrays(), as documented in Sec. C1.7.4 of the Cactus user guide and with a detailed API description in Sec. A149 of the Cactus reference manual.
Internally, this functionality is implemented by the DriverInterpolate function. If users so desire, they can call it directly instead of using CCTK_InterpGridArrays(). In order to do so, users must add the following to their interface.ccl file
CCTK_INT FUNCTION DriverInterpolate(
    CCTK_POINTER_TO_CONST IN cctkGH,
    CCTK_INT IN N_dims,
    CCTK_INT IN local_interp_handle,
    CCTK_INT IN param_table_handle,
    CCTK_INT IN coord_system_handle,
    CCTK_INT IN N_interp_points,
    CCTK_INT IN interp_coords_type,
    CCTK_POINTER_TO_CONST ARRAY IN interp_coords,
    CCTK_INT IN N_input_arrays,
    CCTK_INT ARRAY IN input_array_indices,
    CCTK_INT IN N_output_arrays,
    CCTK_INT ARRAY IN output_array_types,
    CCTK_POINTER ARRAY IN output_arrays)
REQUIRES FUNCTION DriverInterpolate
Note that DriverInterpolate and CCTK_InterpGridArrays() receive exactly the same arguments (again, a complete description is given in Sec. A149 of the Cactus reference manual), but some arguments are unused by CarpetX. These are coord_system_handle, interp_coords_type_code, output_array_type_codes and interp_handle. Here is an example of a DriverInterpolate call where these ignored parameters are set to zero:
// DriverInterpolate arguments that aren't currently used
CCTK_INT const coord_system_handle = 0;
CCTK_INT const interp_coords_type_code = 0;
CCTK_INT const output_array_type_codes[1] = {0};
int interp_handle = 0;

// We assume that the other (non-ignored) parameters exist
// within the scope but are omitted from this snippet for clarity
DriverInterpolate(
    cctkGH, N_dims, interp_handle, param_table_handle, coord_system_handle,
    npoints, interp_coords_type_code, interp_coords, nvars,
    (int const *const)varinds, nvars, output_array_type_codes, output_array);
Lastly, to control the order of the interpolation operators, users can set the CarpetX::interpolation_order parameter, which defaults to 1 and must be greater than 0.
When running on CPUs, AMReX divides the grid into boxes whose sizes are controlled by the blocking_factor and max_grid_size parameter families. These boxes are distributed over MPI ranks, and each rank further subdivides its boxes into tiles. This subdivision into tiles is only logical (not a subdivision in memory). Threads may then process tiles in any order. When running on GPUs, tiling is disabled and AMReX tries to assign one GPU per MPI process. If there are more MPI processes than grids to distribute, some MPI processes may become idle. To prevent this from happening, the CarpetX::refine_grid_layout parameter (see below) can be used to ensure that each MPI rank processes at least one grid.
CarpetX allows users to control AMReX’s underlying load balancing mechanism by exposing AMReX parameters to Cactus parameter files. We shall now describe these parameters and their functionality.
AMReX will divide the domain in every direction so that each grid is no longer than max_grid_size in that direction. It defaults to 32 in each coordinate direction.
Constrain grid creation so that each grid must be divisible by blocking_factor. This implies that both the domain (at each level) and max_grid_size must be divisible by blocking_factor, and that blocking_factor must be either 1 or a power of 2. It defaults to 8 in every direction.
If set to "yes" and the number of grids created is less than the number of processors (\(N_\text {grids} < N_\text {procs}\)), then grids will be further subdivided until \(N_\text {grids} \geq N_\text {procs}\). If subdividing the grids to achieve \(N_\text {grids} \geq N_\text {procs}\) would violate the blocking_factor criterion then additional grids are not created and the number of grids will remain less than the number of processors. It defaults to "yes"
Because mesh refinement in AMReX is done per cell (see Sec. 7.2 for further details), a refinement region may be irregular. AMReX can only deal with logically cubical grids, so a refinement region is replaced by a new set of grids consisting of coarse and finer boxes covering the region. Due to this "discretization" of the refinement region, more cells are refined than strictly necessary. Refining too many points is obviously inefficient; conversely, refining too few points is also non-ideal, as it leads to a grid structure consisting of many small boxes. "Grid efficiency" then refers to the ratio of the points that need to be refined to the points that actually are refined in the new grid structure. This parameter steers AMReX's eagerness to combine smaller boxes into larger ones by refining more points. Its default value is \(0.7\).
Controls tile sizes. Tiling is disabled in the x direction because grid elements are stored contiguously in memory along x, and keeping whole x-rows in a tile is cache efficient. The default values for max_tile_size are given in Tab. 12. These default values were found through experimentation; users are encouraged to change them in order to achieve optimal performance on their machines and use cases. A combined example excerpt is given after the table.
Parameter | Default value |
max_tile_size_x | \(1024000\) (disabled) |
max_tile_size_y | \(16\) |
max_tile_size_z | \(32\) |
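Putting these parameters together, a tuning excerpt could look as follows (the values are only illustrative, and the per-direction naming of the max_grid_size and blocking_factor families is an assumption):

# Ensure that every MPI rank receives at least one grid
CarpetX::refine_grid_layout = yes

# Box sizes and alignment (per-direction names assumed)
CarpetX::max_grid_size_x   = 64
CarpetX::max_grid_size_y   = 64
CarpetX::max_grid_size_z   = 64
CarpetX::blocking_factor_x = 8
CarpetX::blocking_factor_y = 8
CarpetX::blocking_factor_z = 8

# Tile sizes; the large x value effectively disables tiling along x
CarpetX::max_tile_size_x = 1024000
CarpetX::max_tile_size_y = 16
CarpetX::max_tile_size_z = 32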
Profiling refers to the act of collecting runtime and other performance data on the execution of an application in order to improve its performance characteristics. The recommended way of profiling CarpetX applications is to use HPCToolkit. We recommend that users first read that software's documentation thoroughly.
Installing HPCToolkit on an HPC system will probably require the help of the system’s administrators. Once installed, the application should be launched with the hpcrun command, which will automatically collect performance data from the launched application. To profile CarpetX while using an HPC system, the recommended steps are:
Launch an interactive session with the cluster's job system. In this session, the mpirun and hpcrun commands should be available.
Navigate to the directory where the Cactus executable is located. Given a configuration named config-name, the executable is located in Cactus/exe/cactus_config-name.
Select a parameter file to execute and gather performance information on. Let us call it perf.par and suppose that it is located in Cactus/exe/.
Assuming that the system’s MPI launcher is called mpirun, to profile a CPU application, issue
mpirun hpcrun ./cactus_config-name perf.par
To profile a GPU application, issue
mpirun hpcrun -e gpu=<vendor> ./cactus_config-name perf.par
where <vendor> is either nvidia or amd, depending on which GPU chips are available in the system.
After Cactus completes the evolution, a new folder called hpctoolkit-cactus_config-name-measurements will be created in the same directory where hpcrun was executed. HPCToolkit needs to recover program structure in order to create better profiling information. To do that, issue
hpcstruct hpctoolkit-cactus_config-name-measurements
Finally, to have HPCToolkit analyze the measurements and attribute them to source code, issue
hpcprof hpctoolkit-cactus_config-name-measurements
This will generate the hpctoolkit-cactus_config-name-database folder, which can be explored with hpcviewer, HPCToolkit’s profile visualization tool. See chapter 10 of the HPCToolkit manual for usage details.
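For convenience, the steps above can be combined into a single shell sketch. This is only an outline: it assumes a CPU profile, a configuration named config-name, the parameter file perf.par from the steps above, and that mpirun, hpcrun, hpcstruct, hpcprof, and hpcviewer are all available in the PATH (on many clusters hpcviewer is instead run on a local workstation after copying the database there):
cd Cactus/exe                                          # directory containing the executable
mpirun hpcrun ./cactus_config-name perf.par            # run and collect measurements
hpcstruct hpctoolkit-cactus_config-name-measurements   # recover program structure
hpcprof hpctoolkit-cactus_config-name-measurements     # attribute measurements to source code
hpcviewer hpctoolkit-cactus_config-name-database       # inspect the resulting database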
CarpetX outputs data in the following formats: Silo, openPMD (with various backends), ADIOS2, and TSV. For each output format, users can choose which variables are written to disk and how frequently. The following sections provide details on each format.
The TSV acronym stands for Tab Separated Values. This is a plain text format in which data is written to files in rows (lines) and columns, with columns separated by a tab character. The format is very easy to parse, manipulate, and post-process by hand, and many open source tools read it out of the box (see, for instance, the Python framework Pandas). There are, however, two important disadvantages to plain text formats. The first is a loss of precision incurred when converting floating point numbers from binary to string representation and back, due to the way floating point numbers work. The second is the lack of compactness: while a 16-digit double precision floating point number takes 8 bytes in binary form, it requires at least 17 bytes (1 byte per digit plus the decimal point) as a string. For these reasons, this format is not recommended for production runs or for complex data analysis. It is useful, however, for quick and easy visualization or diagnostics of data, as well as when other, more sophisticated formats are not available.
To control TSV output, CarpetX provides the parameters described in Tab. 12.1
Parameter | Type | Default Value | Description |
out_tsv | Boolean | "yes" | Whether to produce TSV output |
out_tsv_vars | String | Empty string | Space or newline separated string of grid functions to output |
out_tsv_every | Integer | \(-1\) | If set to \(-1\), output is produced as often as the setting of IO::out_every. If set to \(0\), output is never produced. If set to \(1\) or larger, output every that many iterations |
As an example, to output the grid functions of WaveToyX as TSV files every 2 iterations, one would write in their parameter file the following excerpt
CarpetX::out_tsv = yes
CarpetX::out_tsv_every = 2
CarpetX::out_tsv_vars = "
  WaveToyX::state
  WaveToyX::rhs
  WaveToyX::energy
  WaveToyX::error
"
The Silo format (see here for more information) is the native file format of the VisIt data visualization tool. Given this native support, Silo files are well suited for direct visualization of simulation data and grids (Fig. 1 of this document, as well as the animated_boxes.gif file in the documentation folder, were both created with VisIt).
To control Silo output, CarpetX provides the parameters described in Tab. 12.2
Parameter | Type | Default Value | Description |
out_silo_vars | String | Empty string | Space or newline separated string of grid functions to output |
out_silo_every | Integer | \(-1\) | If set to \(-1\), output is produced as often as the setting of IO::out_every. If set to \(0\), output is never produced. If set to \(1\) or larger, output every that many iterations |
As an example, to output the grid functions of WaveToyX as Silo files every 2 iterations, one would write in their parameter file the following excerpt
CarpetX::out_silo_every = 2
CarpetX::out_silo_vars = "
  WaveToyX::state
  WaveToyX::rhs
  WaveToyX::energy
  WaveToyX::error
"
openPMD stands for “open standard for particle-mesh data files”. Strictly speaking, it is not a file format, but a standard for metadata, naming schemes, and file organization. openPMD is thus “backed” by common hierarchical binary data formats, such as HDF5 and ADIOS2, both of which are supported by CarpetX. Put simply, openPMD specifies a standardized and interchangeable way of organizing data within HDF5 and ADIOS2 files.
To control openPMD output, CarpetX provides the parameters described in Tab. 12.3
Parameter | Type | Default Value | Description |
out_openpmd_vars | String | Empty string | Space or newline separated string of grid functions to output |
out_openpmd_every | Integer | \(-1\) | If set to \(-1\), output is produced as often as the setting of IO::out_every. If set to \(0\), output is never produced. If set to \(1\) or larger, output every that many iterations |
Additionally, it is possible to control which “backing” format openPMD uses via the CarpetX::openpmd_format parameter, whose possible string values are listed in Tab. 12.3
Value | Description |
"HDF5" | The Hierarchical Data Format, version 5 (see here) |
"ADIOS1" | The Adaptable I/O System version 1 (see here) |
"ADIOS2" | The Adaptable I/O System version 2 (see here). Requires openPMD_api \(< 0.15\) |
"ADIOS2_BP" | ADIOS2 native binary-pack. Requires openPMD_api \(\geq 0.15\) |
"ADIOS2_BP4" | ADIOS2 native binary-pack version 4 (see here). Requires openPMD_api \(\geq 0.15\) |
"ADIOS2_BP5" | ADIOS2 native binary-pack version 5 (see here). Requires openPMD_api \(\geq 0.15\) |
"ADIOS2_SST" | ADIOS2 Sustainable Staging Transport (see hee) |
"ADIOS2_SSC" | ADIOS2 Strong Staging Coupler (see here) |
"JSON" | JavaScript Object Notation (see here) |
When using openPMD output, CarpetX can produce files that can be visualized with VisIt. To do that for WaveToyX, for example, the following excerpt would be added to a parameter file
# It is very important to produce BP4 files, as BP5
# cannot be imported into VisIt at the time of writing
CarpetX::openpmd_format = "ADIOS2_BP4"

CarpetX::out_openpmd_vars = "
  WaveToyX::state
  WaveToyX::rhs
  WaveToyX::energy
  WaveToyX::error
"
When loading data into VisIt, users must open the *.pmd.visit file produced together with the simulation output. It is also imperative to select the file format as ADIOS2 in VisIt’s file selector window, otherwise the files will not be loaded correctly.
In addition to VisIt, users can find in the CarpetX/doc/openpmd-plot folder a sample Python script for extracting and plotting data from CarpetX-produced openPMD files. The best way to interact with the script is through a Python virtual environment. First, navigate to CarpetX/doc/openpmd-plot and issue the following commands:
python -m venv venv              # Creates the virtual environment.
source venv/bin/activate         # Activates the virtual environment.
pip install -r requirements.txt  # Installs required packages.
python plot-zslice.py -h         # See the options available.
deactivate                       # Exits the environment when done.
To output pure ADIOS2 BP5 files (without openPMD formatting), users can use the parameters described in Tab. 12.4. Note that this output mode produces BP5 files, which cannot be read by VisIt (at the time of writing); users may have to rely on their own tools and scripts for data processing and visualization.
Parameter | Type | Default Value | Description |
out_adios2_vars | String | Empty string | Space or newline separated string of grid functions to output |
out_adios2_every | Integer | \(-1\) | If set to \(-1\), output is produced as often as the setting of IO::out_every. If set to \(0\), output is never produced. If set to \(1\) or larger, output every that many iterations |
As an example, to produce ADIOS2 files every \(2\) time steps in WaveToyX, users would use the following excerpt
CarpetX::out_adios2_every = 2
CarpetX::out_adios2_vars = "
  WaveToyX::state
  WaveToyX::rhs
  WaveToyX::energy
  WaveToyX::error
"
CarpetX can output various norms of grid functions, such as the minimum, maximum, L2 norm, and others. This output works similarly to the previously described methods. To control norm output, the parameters in Tab. 12.5 are provided.
Parameter | Type | Default Value | Description |
out_norm_vars | String | Empty string | Space or newline separated string of grid functions to output |
out_norm_every | Integer | \(-1\) | If set to \(-1\), output is produced as often as the setting of IO::out_every. If set to \(0\), output is never produced. If set to \(1\) or larger, output every that many iterations |
For example, to output openPMD and norms for the WaveToyX thorn grid functions, the following excerpt could be added to a parameter file
CarpetX::out_norm_every = 2

# Here we choose to output only norms of the error grid function
CarpetX::out_norm_vars = "
  WaveToyX::error
"

# It is very important to produce BP4 files, as BP5
# cannot be imported into VisIt at the time of writing
CarpetX::openpmd_format = "ADIOS2_BP4"

CarpetX::out_openpmd_vars = "
  WaveToyX::state
  WaveToyX::rhs
  WaveToyX::energy
  WaveToyX::error
"
CarpetX can produce a file called performance.yaml with various performance timers for certain iterations of a simulation. This output mirrors that of stdout and may be useful for thorn developers when benchmarking their codes. Controlling performance output works like the other output types; the parameters for controlling it are detailed in Tab. 12.6
Parameter | Type | Default Value | Description |
out_performance | Boolean | "yes" | Whether to produce performance data in YAML format. |
out_performance_every | Integer | \(-1\) | If set to \(-1\), output is produced as often as the setting of IO::out_every. If set to \(0\), output is never produced. If set to \(1\) or larger, output every that many iterations. |
As an example, to have performance measurements output for a simulation every 2 iterations, the following parameters may be added to a parameter file
CarpetX::out_performance = yes
CarpetX::out_performance_every = 2
To debug CarpetX applications, users have access to the following parameters:
verbose: A boolean parameter that controls whether or not CarpetX produces verbose output. This output details more of what the driver is doing internally, aiding users in finding bugs. Its default value is no.
poison_undefined_values: Sets undefined grid point values to NaN (not a number). By visualizing locations where grid functions contain NaNs, users can determine which grid points are not being filled correctly. Its default value is yes.
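For instance, a debugging run might enable both options explicitly in the parameter file (poison_undefined_values is already on by default, so setting it matters only if it was turned off elsewhere):
CarpetX::verbose = yes
CarpetX::poison_undefined_values = yes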
amrex_parameters | Scope: private | STRING
Description: Additional AMReX parameters
Range | Default: (none)
do nothing
keyword=value
blocking_factor_x | Scope: private | INT
Description: Blocking factor
Range | Default: 8
1:* | must be a power of 2
blocking_factor_y | Scope: private | INT
Description: Blocking factor
Range | Default: 8
1:* | must be a power of 2
blocking_factor_z | Scope: private | INT
Description: Blocking factor
Range | Default: 8
1:* | must be a power of 2
boundary_upper_x | Scope: private | KEYWORD |
Description: Boundary condition at upper x boundary
| ||
Range | Default: none | |
none | don’t apply any boundary
| |
dirichlet | Dirichlet
| |
linear extrapolation | Linear extrapolation
| |
neumann | Neumann
| |
robin | Robin
| |
boundary_upper_y | Scope: private | KEYWORD |
Description: Boundary condition at upper y boundary
| ||
Range | Default: none | |
none | don’t apply any boundary
| |
dirichlet | Dirichlet
| |
linear extrapolation | Linear extrapolation
| |
neumann | Neumann
| |
robin | Robin
| |
boundary_upper_z | Scope: private | KEYWORD |
Description: Boundary condition at upper z boundary
| ||
Range | Default: none | |
none | don’t apply any boundary
| |
dirichlet | Dirichlet
| |
linear extrapolation | Linear extrapolation
| |
neumann | Neumann
| |
robin | Robin
| |
boundary_x | Scope: private | KEYWORD |
Description: Boundary condition at lower x boundary
| ||
Range | Default: none | |
none | don’t apply any boundary
| |
dirichlet | Dirichlet
| |
linear extrapolation | Linear extrapolation
| |
neumann | Neumann
| |
robin | Robin
| |
boundary_y | Scope: private | KEYWORD |
Description: Boundary condition at lower y boundary
| ||
Range | Default: none | |
none | don’t apply any boundary
| |
dirichlet | Dirichlet
| |
linear extrapolation | Linear extrapolation
| |
neumann | Neumann
| |
robin | Robin
| |
boundary_z | Scope: private | KEYWORD |
Description: Boundary condition at lower z boundary
| ||
Range | Default: none | |
none | don’t apply any boundary
| |
dirichlet | Dirichlet
| |
linear extrapolation | Linear extrapolation
| |
neumann | Neumann
| |
robin | Robin
| |
checkpoint_method | Scope: private | KEYWORD |
Description: I/O method for checkpointing
| ||
Range | Default: error | |
error | Abort with error instead of checkpointing
| |
openpmd | ||
silo | ||
dirichlet_upper_x_vars | Scope: private | STRING |
Description: Override boundary condition at upper x boundary
| ||
Range | Default: (none) | |
.* | ||
dirichlet_upper_y_vars | Scope: private | STRING |
Description: Override boundary condition at upper y boundary
| ||
Range | Default: (none) | |
.* | ||
dirichlet_upper_z_vars | Scope: private | STRING |
Description: Override boundary condition at upper z boundary
| ||
Range | Default: (none) | |
.* | ||
dirichlet_x_vars | Scope: private | STRING |
Description: Override boundary condition at lower x boundary
| ||
Range | Default: (none) | |
.* | ||
dirichlet_y_vars | Scope: private | STRING |
Description: Override boundary condition at lower y boundary
| ||
Range | Default: (none) | |
.* | ||
dirichlet_z_vars | Scope: private | STRING |
Description: Override boundary condition at lower z boundary
| ||
Range | Default: (none) | |
.* | ||
do_reflux | Scope: private | BOOLEAN |
Description: Manage flux registers to ensure conservation
| ||
Default: yes | ||
do_restrict | Scope: private | BOOLEAN |
Description: Automatically restrict fine to coarse grid functions
| ||
Default: yes | ||
dtfac | Scope: private | REAL |
Description: The standard timestep condition dt = dtfac*min(delta_space)
| ||
Range | Default: 0.5 | |
*:* | ||
filereader_method | Scope: private | KEYWORD |
Description: I/O method for file reader
| ||
Range | Default: error | |
error | Abort with error when file reader is used
| |
openpmd | ||
silo | ||
ghost_size | Scope: private | INT |
Description: Number of ghost zones
| ||
Range | Default: -1 | |
-1 | use ghost_size_[xyz]
| |
0:* | ||
ghost_size_x | Scope: private | INT |
Description: Number of ghost zones
| ||
Range | Default: 1 | |
0:* | ||
ghost_size_y | Scope: private | INT |
Description: Number of ghost zones
| ||
Range | Default: 1 | |
0:* | ||
ghost_size_z | Scope: private | INT |
Description: Number of ghost zones
| ||
Range | Default: 1 | |
0:* | ||
gpu_sync_after_every_kernel | Scope: private | BOOLEAN |
Description: Call amrex::Gpu::streamSynchronize after every kernel (EXPENSIVE)
| ||
Default: no | ||
grid_efficiency | Scope: private | REAL |
Description: Minimum AMR grid efficiency
| ||
Range | Default: 0.7 | |
0.0:* | ||
interpolation_order | Scope: private | INT |
Description: Interpolation order
| ||
Range | Default: 1 | |
0:* | ||
kernel_launch_method | Scope: private | KEYWORD
Description: Kernel launch method
Range | Default: default
serial | no parallelism
openmp | use OpenMP
cuda | target CUDA
default | use OpenMP for CPU builds and CUDA for GPU builds
linear_extrapolation_upper_x_vars | Scope: private | STRING |
Description: Override boundary condition at upper x boundary
| ||
Range | Default: (none) | |
.* | ||
linear_extrapolation_upper_y_vars | Scope: private | STRING |
Description: Override boundary condition at upper y boundary
| ||
Range | Default: (none) | |
.* | ||
linear_extrapolation_upper_z_vars | Scope: private | STRING |
Description: Override boundary condition at upper z boundary
| ||
Range | Default: (none) | |
.* | ||
linear_extrapolation_x_vars | Scope: private | STRING |
Description: Override boundary condition at lower x boundary
| ||
Range | Default: (none) | |
.* | ||
linear_extrapolation_y_vars | Scope: private | STRING |
Description: Override boundary condition at lower y boundary
| ||
Range | Default: (none) | |
.* | ||
linear_extrapolation_z_vars | Scope: private | STRING |
Description: Override boundary condition at lower z boundary
| ||
Range | Default: (none) | |
.* | ||
max_grid_size_x | Scope: private | INT |
Description: Maximum grid size
| ||
Range | Default: 32 | |
1:* | must be a multiple of the blocking factor
| |
max_grid_size_y | Scope: private | INT |
Description: Maximum grid size
| ||
Range | Default: 32 | |
1:* | must be a multiple of the blocking factor
| |
max_grid_size_z | Scope: private | INT |
Description: Maximum grid size
| ||
Range | Default: 32 | |
1:* | must be a multiple of the blocking factor
| |
max_grid_sizes_x | Scope: private | INT |
Description: Maximum grid size
| ||
Range | Default: -1 | |
-1 | use value from max_grid_size_x
| |
1:* | must be a multiple of the blocking factor
| |
max_grid_sizes_y | Scope: private | INT |
Description: Maximum grid size
| ||
Range | Default: -1 | |
-1 | use value from max_grid_size_y
| |
1:* | must be a multiple of the blocking factor
| |
max_grid_sizes_z | Scope: private | INT |
Description: Maximum grid size
| ||
Range | Default: -1 | |
-1 | use value from max_grid_size_z
| |
1:* | must be a multiple of the blocking factor
| |
max_num_levels | Scope: private | INT |
Description: Maximum number of refinement levels
| ||
Range | Default: 1 | |
1:* | ||
max_tile_size_x | Scope: private | INT |
Description: Maximum tile size
| ||
Range | Default: 1024000 | |
1:* | ||
max_tile_size_y | Scope: private | INT |
Description: Maximum tile size
| ||
Range | Default: 16 | |
1:* | ||
max_tile_size_z | Scope: private | INT |
Description: Maximum tile size
| ||
Range | Default: 32 | |
1:* | ||
ncells_x | Scope: private | INT |
Description: Number of grid cells
| ||
Range | Default: 128 | |
1:* | must be a multiple of the blocking factor
| |
ncells_y | Scope: private | INT |
Description: Number of grid cells
| ||
Range | Default: 128 | |
1:* | must be a multiple of the blocking factor
| |
ncells_z | Scope: private | INT |
Description: Number of grid cells
| ||
Range | Default: 128 | |
1:* | must be a multiple of the blocking factor
| |
neumann_upper_x_vars | Scope: private | STRING |
Description: Override boundary condition at upper x boundary
| ||
Range | Default: (none) | |
.* | ||
neumann_upper_y_vars | Scope: private | STRING |
Description: Override boundary condition at upper y boundary
| ||
Range | Default: (none) | |
.* | ||
neumann_upper_z_vars | Scope: private | STRING |
Description: Override boundary condition at upper z boundary
| ||
Range | Default: (none) | |
.* | ||
neumann_x_vars | Scope: private | STRING |
Description: Override boundary condition at lower x boundary
| ||
Range | Default: (none) | |
.* | ||
neumann_y_vars | Scope: private | STRING |
Description: Override boundary condition at lower y boundary
| ||
Range | Default: (none) | |
.* | ||
neumann_z_vars | Scope: private | STRING |
Description: Override boundary condition at lower z boundary
| ||
Range | Default: (none) | |
.* | ||
openpmd_format | Scope: private | KEYWORD
Description: openPMD file format
Range | Default: ADIOS2_auto
HDF5 |
ADIOS1 | requires openPMD_api <0.16
ADIOS2_auto | choose ADIOS2 format automatically (either ADIOS2_BP5 or ADIOS2_BP4)
ADIOS2 | requires openPMD_api <0.15
ADIOS2_BP | requires openPMD_api >=0.15
ADIOS2_BP4 | requires openPMD_api >=0.15
ADIOS2_BP5 | requires openPMD_api >=0.15
ADIOS2_SST |
ADIOS2_SSC |
JSON |
TOML | requires openPMD_api >=0.16
GENERIC | requires openPMD_api >=0.16
out_adios2_every | Scope: private | INT
Description: Output in ADIOS2 format every that many iterations
Range | Default: -1
-1 | use value of IO::out_every
0 | never output ADIOS2
1:* | every that many iterations
out_adios2_vars | Scope: private | STRING |
Description: Variables to output in ADIOS2 format
| ||
Range | Default: (none) | |
.* | ||
out_metadata | Scope: private | BOOLEAN |
Description: Output metadata in YAML format?
| ||
Default: yes | ||
out_metadata_every | Scope: private | INT
Description: Output metadata every that many iterations
Range | Default: -1
-1 | use value of IO::out_every
0 | never output metadata
1:* | every that many iterations
out_norm_every | Scope: private | INT
Description: Output norms every that many iterations
Range | Default: -1
-1 | use value of IO::out_every
0 | never output norms
1:* | every that many iterations
out_norm_omit_sumloc_for_backward_compatibility | Scope: private | BOOLEAN |
Description: Omit sumloc norms
| ||
Default: no | ||
out_norm_omit_unstable | Scope: private | BOOLEAN |
Description: Omit unstable norms (e.g. maxloc)
| ||
Default: no | ||
out_norm_vars | Scope: private | STRING |
Description: Variables to output as norms
| ||
Range | Default: (none) | |
.* | ||
out_openpmd_every | Scope: private | INT
Description: Output in openPMD format every that many iterations
Range | Default: -1
-1 | use value of IO::out_every
0 | never output openPMD
1:* | every that many iterations
out_openpmd_vars | Scope: private | STRING |
Description: Variables to output in openPMD format
| ||
Range | Default: (none) | |
.* | ||
out_performance | Scope: private | BOOLEAN |
Description: Output performance data in YAML format?
| ||
Default: yes | ||
out_performance_every | Scope: private | INT
Description: Output performance data every that many iterations
Range | Default: -1
-1 | use value of IO::out_every
0 | never output performance data
1:* | every that many iterations
out_plotfile_every | Scope: private | INT
Description: Output in plotfile format every that many iterations
Range | Default: -1
-1 | use value of IO::out_every
0 | never output plotfile
1:* | every that many iterations
out_plotfile_groups | Scope: private | STRING |
Description: Groups to output as AMReX plotfile
| ||
Range | Default: (none) | |
.* | ||
out_silo_every | Scope: private | INT
Description: Output in Silo format every that many iterations
Range | Default: -1
-1 | use value of IO::out_every
0 | never output Silo
1:* | every that many iterations
out_silo_vars | Scope: private | STRING |
Description: Variables to output in Silo format
| ||
Range | Default: (none) | |
.* | ||
out_tsv | Scope: private | BOOLEAN |
Description: 3d Output in TSV (tab-separated values) format?
| ||
Default: no | ||
out_tsv_every | Scope: private | INT
Description: Output in TSV format every that many iterations
Range | Default: -1
-1 | use value of IO::out_every
0 | never output TSV
1:* | every that many iterations
out_tsv_vars | Scope: private | STRING |
Description: Variables to output in 1d TSV format
| ||
Range | Default: (none) | |
.* | ||
poison_undefined_values | Scope: private | BOOLEAN |
Description: Set undefined grid point values to nan
| ||
Default: yes | ||
prolongate_per_group | Scope: private | BOOLEAN |
Description: Prolongate all variables in a group together
| ||
Default: no | ||
prolongation_order | Scope: private | INT |
Description: Prolongation order
| ||
Range | Default: 1 | |
0:* | ||
prolongation_type | Scope: private | KEYWORD
Description: Prolongation type
Range | Default: natural
interpolate | interpolate between data points
conservative | interpolate cell averages, ensuring conservation
ddf | interpolate in vertex centred and conserve (with one order lower) in cell centred directions
eno | interpolate in vertex centred and ENO-conserve in cell centred directions
minmod | interpolate in vertex centred and minmod-conserve in cell centred directions
hermite | Hermite-interpolate in vertex centred and conserve in cell centred directions
natural | interpolate in vertex centred and conserve in cell centred directions, using the same order
poly-cons3lfb | interpolate polynomially in vertex centred directions and conserve with 3rd order accuracy and a linear fallback in cell centred directions
poly-eno3lfb | interpolate polynomially in vertex centred directions and use ENO stencils with 3rd order accuracy and a linear fallback in cell centred directions
recover_method | Scope: private | KEYWORD |
Description: I/O method for recovery
| ||
Range | Default: error | |
error | Abort with error instead of recovering
| |
openpmd | ||
silo | ||
refine_grid_layout | Scope: private | BOOLEAN |
Description: Ensure that each MPI process has at least one grid
| ||
Default: yes | ||
reflection_upper_x | Scope: private | BOOLEAN |
Description: Reflection symmetry at the upper x boundary
| ||
Default: no | ||
reflection_upper_y | Scope: private | BOOLEAN |
Description: Reflection symmetry at the upper y boundary
| ||
Default: no | ||
reflection_upper_z | Scope: private | BOOLEAN |
Description: Reflection symmetry at the upper z boundary
| ||
Default: no | ||
reflection_x | Scope: private | BOOLEAN |
Description: Reflection symmetry at the lower x boundary
| ||
Default: no | ||
reflection_y | Scope: private | BOOLEAN |
Description: Reflection symmetry at the lower y boundary
| ||
Default: no | ||
reflection_z | Scope: private | BOOLEAN |
Description: Reflection symmetry at the lower z boundary
| ||
Default: no | ||
regrid_error_threshold | Scope: private | REAL |
Description: Regridding error threshold
| ||
Range | Default: 1.0 | |
0.0:* | ||
regrid_every | Scope: private | INT
Description: Regridding interval
Range | Default: (none)
0 | never
1:* | every that many iterations
restrict_during_sync | Scope: private | BOOLEAN |
Description: Restrict fine to coarse grid functions when syncing
| ||
Default: yes | ||
robin_upper_x_vars | Scope: private | STRING |
Description: Override boundary condition at upper x boundary
| ||
Range | Default: (none) | |
.* | ||
robin_upper_y_vars | Scope: private | STRING |
Description: Override boundary condition at upper y boundary
| ||
Range | Default: (none) | |
.* | ||
robin_upper_z_vars | Scope: private | STRING |
Description: Override boundary condition at upper z boundary
| ||
Range | Default: (none) | |
.* | ||
robin_x_vars | Scope: private | STRING |
Description: Override boundary condition at lower x boundary
| ||
Range | Default: (none) | |
.* | ||
robin_y_vars | Scope: private | STRING |
Description: Override boundary condition at lower y boundary
| ||
Range | Default: (none) | |
.* | ||
robin_z_vars | Scope: private | STRING |
Description: Override boundary condition at lower z boundary
| ||
Range | Default: (none) | |
.* | ||
use_subcycling_wip | Scope: private | BOOLEAN |
Description: Use subcycling in time
| ||
Default: no | ||
verbose | Scope: private | BOOLEAN |
Description: Verbose output
| ||
Default: no | ||
xmax | Scope: private | REAL |
Description: Domain size (location of vertices)
| ||
Range | Default: 1.0 | |
*:* | ||
xmin | Scope: private | REAL |
Description: Domain size (location of vertices)
| ||
Range | Default: -1.0 | |
*:* | ||
ymax | Scope: private | REAL |
Description: Domain size (location of vertices)
| ||
Range | Default: 1.0 | |
*:* | ||
ymin | Scope: private | REAL |
Description: Domain size (location of vertices)
| ||
Range | Default: -1.0 | |
*:* | ||
zmax | Scope: private | REAL |
Description: Domain size (location of vertices)
| ||
Range | Default: 1.0 | |
*:* | ||
zmin | Scope: private | REAL |
Description: Domain size (location of vertices)
| ||
Range | Default: -1.0 | |
*:* | ||
periodic | Scope: restricted | BOOLEAN |
Description: Periodic boundary conditions
| ||
Range | Default: no | |
yes | must be set for any periodic_x,y,z to be effective
| |
no | no periodicity
| |
periodic_x | Scope: restricted | BOOLEAN |
Description: Periodic boundary conditions in x-direction
| ||
Range | Default: yes | |
yes | periodic in x direction, requires periodic=yes as well
| |
no | not periodic in x direction
| |
periodic_y | Scope: restricted | BOOLEAN |
Description: Periodic boundary conditions in y-direction
| ||
Range | Default: yes | |
yes | periodic in y direction, requires periodic=yes as well
| |
no | not periodic in y direction
| |
periodic_z | Scope: restricted | BOOLEAN |
Description: Periodic boundary conditions in z-direction
| ||
Range | Default: yes | |
yes | periodic in z direction, requires periodic=yes as well
| |
no | not periodic in z direction
| |
checkpoint_dir | Scope: shared from IO | STRING |
checkpoint_every | Scope: shared from IO | INT |
checkpoint_every_walltime_hours | Scope: shared from IO | REAL |
checkpoint_file | Scope: shared from IO | STRING |
checkpoint_id | Scope: shared from IO | BOOLEAN |
checkpoint_on_terminate | Scope: shared from IO | BOOLEAN |
filereader_id_dir | Scope: shared from IO | STRING |
filereader_id_files | Scope: shared from IO | STRING |
filereader_id_vars | Scope: shared from IO | STRING |
out_dir | Scope: shared from IO | STRING |
out_every | Scope: shared from IO | INT |
out_mode | Scope: shared from IO | KEYWORD |
out_proc_every | Scope: shared from IO | INT |
out_xline_y | Scope: shared from IO | REAL |
out_xline_z | Scope: shared from IO | REAL |
out_yline_x | Scope: shared from IO | REAL |
out_yline_z | Scope: shared from IO | REAL |
out_zline_x | Scope: shared from IO | REAL |
out_zline_y | Scope: shared from IO | REAL |
recover | Scope: shared from IO | KEYWORD |
recover_dir | Scope: shared from IO | STRING |
recover_file | Scope: shared from IO | STRING |
Implements: driver
Adds header:
reduction.hxx
Uses header:
silo.hxx
defs.hxx
div.hxx
rational.hxx
simd.hxx
tuple.hxx
vect.hxx
loop.hxx
loop_device.hxx
Provides:
MultiPatch_GetDomainSpecification
MultiPatch_GetBoundarySizesAndTypes
GetCallFunctionCount
GetLoopBoxAll
GetLoopBoxInt
CallScheduleGroup
SolvePoisson
Interpolate
DriverInterpolate
This section lists all the variables which are assigned storage by thorn CarpetX/CarpetX. Storage can either last for the duration of the run (Always means that if this thorn is activated storage will be assigned, Conditional means that if this thorn is activated storage will be assigned for the duration of the run if some condition is met), or can be turned on for the duration of a schedule function.
NONE
CCTK_STARTUP
carpetx_startup
start up the driver
Language: c
Type: function
CCTK_RECOVER_PARAMETERS (conditional)
carpetx_recoverparameters
recover parameters
Language: c
Options: meta
Type: function
CCTK_CPINITIAL
carpetx_checkpointinitial
checkpoint initial conditions
Language: c
Options: meta
Type: function
CCTK_CHECKPOINT
carpetx_checkpoint
checkpoint
Language: c
Options: meta
Type: function
CCTK_TERMINATE
carpetx_checkpointterminate
checkpoint before terminating
Language: c
Options: meta
Type: function
CCTK_SHUTDOWN
carpetx_shutdown
shut down the driver
Language: c
Options: meta
Type: function
Alias Name: | Function Name: |
CarpetX_Shutdown | Driver_Shutdown |
CarpetX_Startup | Driver_Startup |