WatchDog

David Radice <dradice@caltech.edu>

March 06 2015

Abstract

1 Introduction

WatchDog is a thorn that terminates jobs that do not make progress over a user-defined time frame.

Internally, WatchDog is made of two parts. The first updates a global timer on every iteration at CCTK_ANALYSIS. The second spawns a watcher thread that periodically checks whether the timer has been updated. If the timer has not been updated for longer than the user-defined time frame, the thread calls “abort()” to terminate the process (and with it the job).
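The two parts described above can be sketched in C as follows. This is a minimal illustration, not the thorn's actual source: all function and variable names are hypothetical, POSIX threads are assumed for the watcher, and the stall threshold is assumed to be the same number of seconds as the check interval.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* All names below are hypothetical; this is not the thorn's code. */

/* Global timer: wall-clock time of the last update, in seconds. */
static atomic_long last_heartbeat;

/* Interval between checks, in seconds (assumed to double as the
   stall threshold). */
static long watch_interval;

/* Part 1: called once per iteration, e.g. from a routine scheduled
   at CCTK_ANALYSIS, to record that the run is still progressing. */
void watchdog_heartbeat(void)
{
  atomic_store(&last_heartbeat, (long)time(NULL));
}

/* Decision rule: has the timer gone without an update for more than
   check_every seconds? Returns 1 if stalled, 0 otherwise. */
int watchdog_is_stalled(long last, long now, long check_every)
{
  return now - last > check_every;
}

/* Part 2: body of the watcher thread. It sleeps, checks the timer,
   and aborts the process if the timer is stale. Never returns. */
static void *watchdog_watch(void *unused)
{
  (void)unused;
  for (;;) {
    sleep((unsigned)watch_interval);
    long last = atomic_load(&last_heartbeat);
    if (watchdog_is_stalled(last, (long)time(NULL), watch_interval)) {
      fprintf(stderr, "watchdog: no progress for more than %ld s\n",
              watch_interval);
      abort();
    }
  }
}

/* Start the watcher thread. Returns 0 on success, or the error code
   from pthread_create otherwise. */
int watchdog_start(long check_every)
{
  pthread_t tid;
  int err;

  assert(check_every >= 1);   /* mirrors the parameter range 1:* */
  watch_interval = check_every;
  watchdog_heartbeat();       /* initialise the timer before watching */

  err = pthread_create(&tid, NULL, watchdog_watch, NULL);
  if (err == 0)
    pthread_detach(tid);      /* the watcher is never joined */
  return err;
}
```

In this sketch a driver would call watchdog_start once at startup and watchdog_heartbeat once per iteration. Keeping the decision rule in its own function makes it possible to check the threshold logic without waiting for a real timeout.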

2 Parameters




check_every
Scope: private
Type: INT
Description: Check that the run is progressing every so many seconds
Range: 1:* (any positive integer)
Default: 3600
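As an illustration, the parameter could be set in a Cactus parameter file as shown below. The value 600 is an arbitrary example and not a recommendation; a real parameter file would normally activate other thorns as well.

```
# Activate the thorn and check for progress every 10 minutes
ActiveThorns = "WatchDog"
WatchDog::check_every = 600
```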



3 Interfaces

General

Implements:

watchdog

4 Schedule

This section lists all the variables that are assigned storage by thorn CactusUtils/WatchDog. Storage can either last for the duration of the run or be turned on only for the duration of a scheduled function. For storage that lasts for the whole run, Always means that storage is assigned whenever this thorn is activated, and Conditional means that storage is assigned only if this thorn is activated and some condition is met.

Storage

NONE

Scheduled Functions

CCTK_ANALYSIS

  watchdog

  Description: make sure that the run is progressing
  Language: c
  Type: function