NaNChecker

Thomas Radke

Date

Abstract

Thorn NaNChecker reports NaN values found in CCTK grid variables.

1 Purpose

The NaNChecker thorn can be used to analyze Cactus grid variables (that is, grid functions, arrays or scalars) of real or complex data type for NaN (Not-a-Number) and, if finite(3) is available, infinite values. Grid variables can be checked periodically, or a call can be inserted into a thorn to check at a specific point.
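To illustrate what such a check looks for, the following C sketch counts NaN and infinite entries in a plain array of doubles. It is not the thorn's implementation: it uses the C99 isnan and isinf macros rather than finite(3), and the type nonfinite_count and the function count_nonfinite are hypothetical names introduced only for this example.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical result record; not part of the NaNChecker thorn. */
typedef struct {
  size_t nans;  /* number of NaN entries */
  size_t infs;  /* number of +Inf or -Inf entries */
} nonfinite_count;

/* Count NaN and infinite entries in a plain array of doubles. */
static nonfinite_count count_nonfinite(const double *data, size_t n)
{
  nonfinite_count c = {0, 0};
  for (size_t i = 0; i < n; i++) {
    if (isnan(data[i])) {
      c.nans++;
    } else if (isinf(data[i])) {
      c.infs++;
    }
  }
  return c;
}
```

A caller interested only in NaNs would look at the nans field, one interested in any non-finite value would add both fields.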

This thorn is a utility thorn, designed to be used for debugging and testing code for uninitialised variables, or for variables which become corrupted during a simulation, for example following a division by zero or illegal memory usage.

On many architectures, uninitialised variables will be given the value zero, and simulations using such variables will seemingly run perfectly well. However, not only is it dubious programming practice to assume such behaviour, but moving to a new machine may well cause pathological problems (for example, with Alpha processors used in Compaq or Cray machines). It is thus recommended to test codes periodically with the NaNChecker, and to fix any problems as soon as they are seen.

2 Periodic Testing

Periodic testing of variables can easily be achieved by adding NaNChecker to the ActiveThorns parameter, and setting the parameters

NaNChecker::check_every, NaNChecker::check_after, and NaNChecker::check_vars

to the required values. (For most testing purposes these can be set to 1, 0, and "all" respectively.)

The NaNChecker then schedules a routine (at CCTK_POSTSTEP, as listed in the Schedule section) which, every NaNChecker::check_every iterations and starting at iteration number NaNChecker::check_after, checks all the variables listed in NaNChecker::check_vars for NaN or infinite values (depending on NaNChecker::check_for). If such a value is found, it performs the action specified in NaNChecker::action_if_found.
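The interplay of check_every and check_after can be sketched as a small predicate. This is only an illustration under stated assumptions, not the thorn's code: the helper name check_is_due is hypothetical, a check_every value of 0 is assumed to mean "never", and checks are assumed to fall on iterations that are multiples of check_every.

```c
/* Hypothetical helper, not part of NaNChecker: returns 1 if a check is due
   at the given iteration, 0 otherwise.
   Assumptions: check_every <= 0 disables checking, checks fall on
   multiples of check_every, and none happen before check_after. */
static int check_is_due(int iteration, int check_every, int check_after)
{
  if (check_every <= 0) {
    return 0;                 /* periodic checking disabled */
  }
  if (iteration < check_after) {
    return 0;                 /* too early */
  }
  return iteration % check_every == 0;
}
```

With check_every = 1 and check_after = 0, as suggested above for most testing purposes, this predicate is true at every iteration.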

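Currently these actions can be to just print a level 1 warning (just warn), to warn and terminate Cactus gracefully as soon as possible (terminate), or to warn and abort Cactus immediately (abort).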

By default, the current timelevel of the variables given in NaNChecker::check_vars will be checked. This can be overridden by an optional string [timelevel=<timelevel>] appended to the variable/group name. For example, to apply the NaNChecker to timelevel 0 of the variable grid::x, timelevel 1 of grid::y and timelevel 2 of grid::z you would use the parameter

NaNChecker::check_vars = "grid::x grid::y[timelevel=1] grid::z[timelevel=2]"
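The structure of a single check_vars entry can be made explicit with a short C sketch that splits an entry such as grid::y[timelevel=1] into its name and timelevel. The function split_timelevel is a hypothetical helper written for this example; it is not the parser used by the thorn, and it handles only the timelevel option shown above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: split "name[timelevel=N]" into name and N.
   Without an option string the timelevel defaults to 0 (the current one).
   Returns 0 on success, -1 if the entry is malformed or name is too small. */
static int split_timelevel(const char *entry, char *name, size_t name_size,
                           int *timelevel)
{
  const char *bracket = strchr(entry, '[');
  size_t len = bracket ? (size_t)(bracket - entry) : strlen(entry);

  if (len == 0 || len >= name_size) {
    return -1;
  }
  memcpy(name, entry, len);
  name[len] = '\0';
  *timelevel = 0;

  if (bracket) {
    int consumed = 0;
    /* %n is only reached if the closing ']' matched */
    if (sscanf(bracket, "[timelevel=%d]%n", timelevel, &consumed) != 1 ||
        consumed == 0 || bracket[consumed] != '\0' || *timelevel < 0) {
      return -1;
    }
  }
  return 0;
}
```

Applied to the three entries of the example parameter above, this yields the timelevels 0, 1 and 2 for grid::x, grid::y and grid::z.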

3 Tracking and Visualizing NaN Positions

The NaNChecker thorn can also mark the positions (in grid index points) of all the NaNs found for a given list of CCTK grid functions in a mask array and save this array to an HDF5 file.

The mask array is declared as a grid function NaNChecker::NaNmask with data type INTEGER. Each bit i in an integer element is used to flag a NaN value found in grid function i at the corresponding grid position (the counting for i starts at 0 and is incremented for each grid function as it appears in NaNChecker::check_vars). Thus the NaN locations of up to 32 individual grid functions can be coded in the NaNmask array.
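The bit encoding described above can be sketched in C as follows. The helpers flag_nan and nan_flagged are hypothetical names used only for illustration, and a mask element is treated here as a 32-bit unsigned value, which is an assumption (the actual width of the integer type depends on the Cactus configuration).

```c
#include <stdint.h>

/* Hypothetical helper: set bit i of a mask element, marking a NaN in the
   i-th grid function (counting from 0 in the order given in check_vars).
   Indices of 32 or more cannot be encoded and leave the mask unchanged. */
static uint32_t flag_nan(uint32_t mask, unsigned i)
{
  return i < 32u ? (mask | (UINT32_C(1) << i)) : mask;
}

/* Hypothetical helper: returns 1 if bit i of the mask element is set. */
static int nan_flagged(uint32_t mask, unsigned i)
{
  return i < 32u && ((mask >> i) & 1u) != 0;
}
```

For example, with check_vars listing grid::x, grid::y and grid::z in that order, a mask value of 5 at some grid point (bits 0 and 2 set) would indicate NaNs in grid::x and grid::z at that point.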

In order to activate the NaNmask output you need to set the parameter NaNChecker::out_NaNmask to "yes" (which is already the default) and have the IOHDF5 thorn activated.

The NaN locations can be visualized with OpenDX. An example DX network VisualizeNaNs.net and a sample NaNmask HDF5 output file NaNmask.h5 are available via anonymous CVS from the NumRel CVS server:

  # this is for (t)csh; use export CVSROOT for bash  
  setenv CVSROOT :pserver:cvs_anon@cvs.aei.mpg.de:/numrelcvs  
 
  # CVS pserver password is 'anon'
  cvs login  
  cvs checkout AEIPhysics/Visualization/OpenDX/Networks/Miscellaneous

4 NaNChecker API

Thorn NaNChecker also provides a function API which can be used by other code to invoke the NaNChecker routines to test for NaN/Inf values or to set NaN values for a list of variables:

C API

  int NaNChecker_CheckVarsForNaN (const cGH *cctkGH,  
                                  int report_max,  
                                  const char *vars,  
                                  const char *check_for,  
                                  const char *action_if_found);  
 
  int NaNChecker_SetVarsToNaN (const cGH *cctkGH,  
                               const char *vars);

Fortran API

  call NaNChecker_CheckVarsForNaN (ierror, cctkGH, report_max,  
                                   vars, check_for, action_if_found)  
 
                                   integer ierror  
                                   CCTK_POINTER cctkGH  
                                   integer report_max  
                                   character*(*) vars  
                                   character*(*) check_for  
                                   character*(*) action_if_found  
 
  call NaNChecker_SetVarsToNaN (ierror, cctkGH, vars)  
 
                                integer ierror  
                                CCTK_POINTER cctkGH  
                                character*(*) vars

The report_max, vars, check_for and action_if_found arguments have the same semantics as their parameter counterparts (vars corresponds to the check_vars parameter).
If action_if_found is given as a NULL pointer (C API) or as an empty string (Fortran API), the routine will be quiet and just return the number of NaN values found.

The C function NaNChecker_CheckVarsForNaN() returns the total number of NaN/Inf values found; NaNChecker_SetVarsToNaN() returns the total number of variables set to NaN. For the corresponding Fortran wrapper routines, this return value is stored in the ierror argument.
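A call from C might look like the following sketch. So that it compiles outside Cactus, it declares a stand-in cGH type and a stub NaNChecker_CheckVarsForNaN() that always reports zero values found; in a real thorn these stand-ins would be dropped in favour of the declarations from cctk.h and the NaNChecker.h header listed in the Interfaces section. The caller count_nans_quietly is a hypothetical name.

```c
#include <stddef.h>

/* Stand-ins so that this sketch compiles outside Cactus. In a real thorn,
   remove them and use the declarations provided by Cactus and NaNChecker. */
typedef struct cGH cGH;

static int NaNChecker_CheckVarsForNaN(const cGH *cctkGH, int report_max,
                                      const char *vars, const char *check_for,
                                      const char *action_if_found)
{
  /* Stub only: pretends that no NaN/Inf values were found. */
  (void)cctkGH; (void)report_max; (void)vars;
  (void)check_for; (void)action_if_found;
  return 0;
}

/* Hypothetical caller: count the NaNs in grid::x without printing warnings. */
static int count_nans_quietly(const cGH *cctkGH)
{
  /* report_max = -1 means "report all"; a NULL action_if_found makes the
     routine quiet, so that it only returns the number of values found. */
  return NaNChecker_CheckVarsForNaN(cctkGH, -1, "grid::x", "NaN", NULL);
}
```

Passing one of the action_if_found keywords (for example "just warn") instead of NULL would request the corresponding action when a value is found.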

5 Parameters




action_if_found
Scope: private  KEYWORD



Description: What to do if a NaN was found



Range        Default: just warn
just warn    Just print a level 1 warning
terminate    Warn and terminate Cactus gracefully as soon as possible
abort        Warn and abort Cactus immediately






check_after
Scope: private  INT



Description: Start checking for NaNs after so many iterations



Range   Default: (none)
0:*     Any valid iteration number






check_every
Scope: private  INT



Description: How often to check for NaNs



Range   Default: (none)
        Never (default)
1:*     Every so many iterations






check_for
Scope: private  KEYWORD



Description: Check for NaNs and/or infinite numbers (only evaluated if finite(3) is available)



Range   Default: both
NaN     Check only for NaNs
Inf     Check only for infinite numbers
both    Check for both NaNs and infinite numbers






check_vars
Scope: private  STRING



Description: Groups and/or variables to check for NaNs



Range   Default: (none)
.*      List of full group and/or variable names, or 'all' for everything






ignore_restricted_points
Scope: private  BOOLEAN



Description: do not check grid points whose values will be restricted away



  Default: no






out_nanmask
Scope: private  BOOLEAN



Description: Dump the NaN grid function mask into an HDF5 file



  Default: yes






report_max
Scope: private  INT



Description: How many NaNs to report for a single variable



Range   Default: -1
-1      Report all (default)
0:*     Do not report more than report_max number of NaNs






restriction_mask
Scope: private  STRING



Description: grid function to use to decide which points are restricted away, points where the mask is zero are ignored



Range                                      Default: CarpetReduce::weight
CarpetReduce[:][:]weight                   Carpet's reduction mask
CarpetEvolutionMask[:][:]evolution_mask    takes prolongation stencil into account
.*[:][:].*                                 any grid function with points masked out set to zero




setup_test
Scope: private  BOOLEAN



Description: set up grid function with NaNs



  Default: no






verbose
Scope: private  KEYWORD



Description: How much information to give



Range       Default: standard
all         All information
standard    Standard information



6 Interfaces

General

Implements:

nanchecker

Inherits:

reduce

Grid Variables

6.0.1 PRIVATE GROUPS




  Group Names    Variable Names    Details

  nanmask        NaNmask           compact          0
                                   description      Grid function mask for NaN locations
                                   dimensions       3
                                   distribution     DEFAULT
                                   group type       GF
                                   tags             Prolongation="None" checkpoint="no"
                                   timelevels       1
                                   variable type    INT

  nansfound      NaNsFound         compact          0
                                   description      Scalar variable counting the number of NaNs found
                                   dimensions       0
                                   distribution     CONSTANT
                                   group type       SCALAR
                                   tags             checkpoint="no"
                                   timelevels       1
                                   variable type    INT

  testgf         TestGF            compact          0
                                   description      Grid function to hold NaNs for testsuite
                                   dimensions       3
                                   distribution     DEFAULT
                                   group type       GF
                                   tags             Prolongation="None" checkpoint="no"
                                   timelevels       1
                                   variable type    REAL




Adds header:

NaNCheck.h to NaNChecker.h

Provides:

CheckVarsForNaN to

SetVarsToNaN to

7 Schedule

This section lists all the variables which are assigned storage by thorn CactusUtils/NaNChecker. Storage can either last for the duration of the run (Always means that if this thorn is activated storage will be assigned, Conditional means that if this thorn is activated storage will be assigned for the duration of the run if some condition is met), or can be turned on for the duration of a schedule function.

Storage

 

Always:              Conditional:
NaNmask NaNsFound    TestGF
   

Scheduled Functions

CCTK_BASEGRID

  nanchecker_resetcounter

  reset the nanchecker::nansfound counter

 

 Language: c
 Options: global
 Type: function

CCTK_PRESTEP

  nanchecker_resetcounter

  reset the nanchecker::nansfound counter

 

 Language: c
 Options: global
 Type: function

NaNChecker_NaNCheck

  nanchecker_nancheck_prepare

  prepare data structures to check for nans

 

 Language: c
 Options: level
 Type: function

CCTK_POSTSTEP

  nanchecker_nancheck

  check for nans and count them in nanchecker::nansfound

 

 Type: group

NaNChecker_NaNCheck

  nanchecker_nancheck_check

  check for nans

 

 After: nanchecker_nancheck_prepare
 Language: c
 Options: local
 Type: function

NaNChecker_NaNCheck

  nanchecker_nancheck_finish

  count nans in nanchecker::nansfound

 

 After: nanchecker_nancheck_check
 Language: c
 Options: level
 Type: function

CCTK_POSTSTEP

  nanchecker_takeaction

  output nanchecker::nanmask and take action according to nanchecker::action_if_found

 

 After: zzz_nanchecker_nancheck
 Language: c
 Options: global loop-level
 Type: function

CCTK_POST_RECOVER_VARIABLES

  nanchecker_nancheck

  check for nans and count them in nanchecker::nansfound

 

 Type: group

CCTK_POST_RECOVER_VARIABLES

  nanchecker_takeaction

  output nanchecker::nanmask and take action according to nanchecker::action_if_found

 

 After: zzz_nanchecker_nancheck
 Language: c
 Options: global loop-level
 Type: function

CCTK_INITIAL (conditional)

  nanchecker_setuptest

  set test grid function to nan

 

 Language: c
 Type: function

Aliased Functions

 

Alias Name:            Function Name:
NaNChecker_NaNCheck    zzz_NaNChecker_NaNCheck