# Evaluation methods

In the [basics](scl.md#statements) section it has been noted that the order of the statements in the program provides no information for the interpreter and the output will not depend on this order. Rather the *dependencies* between variables and their parameters determine the execution order.
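The idea that dependencies, not statement order, determine the execution order can be sketched in Python. This is only an illustration and not how the interpreters are implemented: the `statements` dictionary and the `evaluate` helper are hypothetical, and the sketch evaluates every variable, whereas the interpreters evaluate only what is needed.

```python
from graphlib import TopologicalSorter

# Hypothetical program: the statements are deliberately listed "backwards".
# Each entry maps a variable name to (dependencies, function of the dependencies).
statements = {
    "c": (("a", "b"), lambda a, b: a + b),
    "b": (("a",), lambda a: 2 * a),
    "a": ((), lambda: 1),
}

def evaluate(statements):
    """Evaluate all variables in an order derived from their dependencies."""
    # Map every variable to the set of variables it depends on.
    graph = {name: set(deps) for name, (deps, _) in statements.items()}
    values = {}
    # static_order() yields each variable after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        deps, func = statements[name]
        values[name] = func(*(values[d] for d in deps))
    return values

result = evaluate(statements)
print(result["c"])
```

Reordering the entries of `statements` does not change `result`, because the evaluation order is derived from the dependency graph alone.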

## Language interpreters and evaluation modes

Currently, three different interpreters can be used to execute programs. These interpreters do not perform *eager evaluation*, i.e. do not evaluate parameters (such as function calls, expressions etc.) as soon as they are interpreted. Rather the evaluation is triggered only when a certain value is actually needed. Thus, the **order of evaluation** is neither the order of statements in the source code nor the order of interpretation.

Another aspect of evaluation is the **time of evaluation**: evaluation may take place while the interpreter is running (let us call it *immediate* evaluation) or after the interpreter has finished (*deferred* evaluation).

A third aspect is the **location of evaluation**: we distinguish between *local* and *remote* evaluation. Remote evaluation is used when additional computing resources are needed that are not available on the local machine (e.g. a local HPC cluster with a batch system and a JupyterHub instance connected to it).

The choice of interpreter determines the *evaluation mode* that is selected with `--mode | -m` command-line flags.

The evaluation of parameters is essentially governed by the [evaluation policies](io.md#evaluation-policies) of [the evaluation modes](io.md#modes-of-evaluation) and by the occurrence of `print` and `view` statements in the code. See the next [section](io.md) for more details.

## Order of evaluation and short-circuiting

Some built-in function calls and some expression types are evaluated in the so-called **normal order**, i.e. only those input parameters that are actually needed are evaluated. The values of all input parameters are cached so that they are evaluated (if needed) only once. This is sometimes referred to as *lazy evaluation*, or *call-by-need*. Expressions including `if`, `or` and `and` that are evaluated in this way are known as *short-circuiting* expressions. In contrast, other parameters may be evaluated in **applicative order** (implemented as *call-by-value*), i.e. their evaluation begins only after *all* input parameters have been evaluated.
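The difference between the two orders can be illustrated with a minimal Python sketch. The names `Thunk`, `traced`, `lazy_if` and `eager_if` are hypothetical and are not part of the language or its interpreters; the sketch only mimics call-by-need (delayed, cached evaluation) and call-by-value for an `if`-like function.

```python
class Thunk:
    """A delayed computation that is evaluated at most once (call-by-need)."""
    _UNSET = object()

    def __init__(self, func):
        self._func = func
        self._value = Thunk._UNSET

    def force(self):
        # Compute on first use, then return the cached value.
        if self._value is Thunk._UNSET:
            self._value = self._func()
        return self._value

evaluated = []  # records which parameters were actually computed

def traced(name, value):
    """Create a thunk that logs its name when it is evaluated."""
    def compute():
        evaluated.append(name)
        return value
    return Thunk(compute)

def lazy_if(cond, then, otherwise):
    """Normal order: only the selected branch is forced."""
    return then.force() if cond.force() else otherwise.force()

def eager_if(cond, then, otherwise):
    """Applicative order: all parameters are forced before the selection."""
    c, t, o = cond.force(), then.force(), otherwise.force()
    return t if c else o

a = traced("a", 10)
b = traced("b", 20)
lazy_result = lazy_if(traced("cond", True), a, b)
lazy_trace = list(evaluated)

evaluated.clear()
eager_result = eager_if(traced("cond", True), traced("a", 10), traced("b", 20))
eager_trace = list(evaluated)

print(lazy_result, lazy_trace)
print(eager_result, eager_trace)
```

Both calls return the same value, but the lazy variant never computes `b`. Forcing `a` a second time does not recompute it, which corresponds to the caching of input parameters described above.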

Obviously, the mode and order of evaluation do not affect the outputs of the model but rather the behavior, i.e. the performance, location, resources used, time and costs of evaluation. In particular, computational resources can be saved by avoiding unnecessary evaluations.

The order of evaluation in various cases is summarized in the table below. In instant (`--mode instant`, default) and deferred (`--mode deferred`) evaluation modes, all parameters are evaluated in normal order without any exceptions and the `or`, `and` and `if` expressions are short-circuiting.

In workflow mode (`--mode workflow`), the behavior is more complex and depends on the selected evaluation policy, the type of statement and the level of nesting, as shown in the table below. Normal order is used only when the `if` or the Boolean expressions (such as those including `or` and `and`) are parameters of variable, print or view statements, i.e. "top-level" and not nested. Variables including the annotation `?` are evaluated in normal order.

Evaluation mode  | Evaluation policy | Evaluation order | Nesting  | Statements | Examples with `if` # behavior
-----------------|-------------------|------------------|----------|------------|-----------------------------------
instant          | on demand         | normal           | any      | any        | `c = 2*if(true, a, b/2)  # b not evaluated`
deferred         | on demand         | normal           | any      | any        | `c = 2*if(not true, a, b)  # a not evaluated`
workflow         | all (-r)          | applicative      | any      | any        | `print(if(true, a, b))  # a and b evaluated`
workflow         | on demand (-rd)   | applicative      | any      | variable   | `c = if(true, a, b)  # a and b evaluated`
workflow         | on demand (-rd)   | normal           | top-level| variable   | `c = if(true, a, b)?  # b not evaluated`
workflow         | on demand (-rd)   | applicative      | nested   | variable   | `c = 2*if(true, a, b)  # a and b evaluated`
workflow         | on demand (-rd)   | normal           | nested<sup>1</sup>   | print/view | `print(2*if(true, a, b))  # b not evaluated`
workflow         | on demand (-rd)   | normal           | top-level| print/view | `print(if(true, a, b))  # b not evaluated`

<sup>1</sup> Applicative order in the expression evaluation (or function call) within [map](scl.md#map-function), [filter](scl.md#filter-function-and-filter-expression) and [reduce](scl.md#reduce-function) functions.

**NOTE:** Statements in COMPLETED state containing the `?` annotation and all their ancestor statements may not be updated using the [`:=` operator](scl.md#dealing-with-failures) and may not be rerun using [`%rerun` magic](tools.md#specific-features).

**NOTE:** The level of nesting of normal evaluation order in workflow mode can become unlimited in future interpreter implementations. To achieve the desired behavior relying on the currently implemented top-level nesting, the user has to decompose the expressions and define a variable for every nested `if` or Boolean expression.

### Examples

In the following example, the input parameter `a` of the `if` function is not evaluated because only the second input parameter, which is a string literal, is needed and evaluated.

```
a = 'abc'
expr = true
b = if(expr, 'xyz', a)
print(b)
```
Swapping `'xyz'` and `a` (or changing the first input to `false`) will cause `a` to be evaluated but not the string literal `'xyz'`. This behavior occurs only in instant or deferred mode. In workflow mode, because the `if` function is in a variable statement, both `'xyz'` and `a` will be evaluated before the evaluation of the `if` function is started, regardless of the chosen evaluation policy. However, in workflow evaluation mode with the on-demand (`--on-demand --autorun` flags) or no-evaluation (no flags) policy, the `if` function is short-circuiting when it is the parameter of a print statement:

```
a = 'abc'
expr = true
print(if(expr, 'xyz', a))
```

In this mode it is also easy to check that `a` is in fact not computed: with the no-evaluation policy (omit the `--autorun` flag), the statement `print(a)` will print `n.c.`.

Example of rewriting an expression to allow deeply nested normal-order evaluation:

```
d = (a or b and not c)?
```

where `a`, `b` and `c` are variables of Boolean type (with `true` or `false` values) that may also have `null` values. In this example, only the top-level `or` will be evaluated in normal order (note that `or` binds least tightly and is therefore the outermost operation). The whole expression can be evaluated in normal order by defining an auxiliary variable `d_`:

```
d_ = (b and not c)?
d = (a or d_)?
```

# Checkpoint and recovery

Evaluation may be interrupted for various reasons. In instant and deferred [evaluation modes](io.md#modes-of-evaluation) there is no persistence of completed evaluations and the model must be started from scratch. In contrast, there are several levels of checkpointing and recovery available in [workflow evaluation mode](io.md#modes-of-evaluation).

## Node level

In [workflow evaluation mode](io.md#modes-of-evaluation), the workflow *nodes* that contain completed evaluations have COMPLETED state and their outputs are saved. This can be viewed as a checkpoint at the node level. This type of checkpointing is performed automatically by default and recovery at this level is performed when the [`%rerun` magic](tools.md#dealing-with-failures-and-lost-launches) is used.


## Task level

Usually, one workflow node includes the evaluation of one statement. However, other [mappings](resources.md#granularity) are also possible, for example nodes including several statements. The evaluation of such statements is mapped to a list of tasks within a single workflow node. The tasks are evaluated sequentially using the same computing resources. In cases where some tasks have been completed but some are not completed due to a failure, a partial recovery of the already completed tasks is possible using the [`%recover` magic](tools.md#specific-features). Thus the recovery evaluation of the node starts by repeating the evaluation of the first failed task.
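The task-level recovery described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the actual implementation: `run_tasks`, `make_task` and the in-memory `checkpoint` dictionary are hypothetical stand-ins for the node's task list and its saved outputs.

```python
def run_tasks(tasks, completed=None):
    """Run tasks sequentially, skipping those already recorded as completed.

    `completed` maps task names to saved outputs (the checkpoint).
    Returns the updated checkpoint and the name of the failed task (or None).
    """
    completed = dict(completed or {})
    for name, func in tasks:
        if name in completed:
            continue  # output recovered from the checkpoint, not recomputed
        try:
            completed[name] = func()
        except Exception:
            return completed, name  # stop at the first failed task
    return completed, None

calls = {"t1": 0, "t2": 0, "t3": 0}  # counts how often each task is run

def make_task(name, fail_first=False):
    """Create a task that optionally fails on its first run."""
    def task():
        calls[name] += 1
        if fail_first and calls[name] == 1:
            raise RuntimeError(f"{name} failed")
        return f"{name} done"
    return task

tasks = [
    ("t1", make_task("t1")),
    ("t2", make_task("t2", fail_first=True)),
    ("t3", make_task("t3")),
]

# First launch: t1 completes, t2 fails, t3 is never started.
checkpoint, failed = run_tasks(tasks)
# Recovery launch: t1 is skipped, evaluation restarts with t2.
checkpoint, failed_again = run_tasks(tasks, checkpoint)
print(failed, failed_again, calls)
```

In the recovery launch the completed task `t1` is not repeated, and the evaluation starts with the first failed task `t2`.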

## Step level

Some *iterable* parameters, such as [`Property`](amml.md#property), are evaluated in steps that are mapped to [Custodian Job](https://materialsproject.github.io/custodian/) objects. At this level the evaluation of every single step is atomic. In the current implementation, the computing resources of the step evaluations are shared and the order of evaluation is sequential, similar to the tasks. A recovery at this level is always activated when the [`%recover` magic](tools.md#specific-features) has been used.

## Application level

Many external applications used as backends for Algorithms and Calculators may provide further levels of checkpointing and restart. A recovery at this level is always activated when the [`%recover` magic](tools.md#specific-features) has been used. The recovery is signalled to the applications through the environment variable `VRE_LANG_RECOVERY_LAUNCH`. If set, its value indicates the number of task-level recoveries. Furthermore, the parameter `restart` is set to `true` when the Algorithm or Calculator used in the relevant iterable step supports this parameter. In addition, the launch directory is restored automatically to the state of the last completed task and the last completed step.
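An application or wrapper script could check for a recovery launch roughly as in the following Python sketch. Only the environment variable name `VRE_LANG_RECOVERY_LAUNCH` comes from the text above; the helpers `recovery_info` and `choose_start` are hypothetical, and the value of the variable is treated as an opaque string because its exact format is not specified here.

```python
import os

def recovery_info(environ=os.environ):
    """Return (is_recovery, raw_value) based on VRE_LANG_RECOVERY_LAUNCH."""
    raw = environ.get("VRE_LANG_RECOVERY_LAUNCH")
    # The variable being set at all is taken as the recovery signal.
    return raw is not None, raw

def choose_start(environ, checkpoint_exists):
    """Decide how a hypothetical application should start."""
    is_recovery, _ = recovery_info(environ)
    if is_recovery and checkpoint_exists:
        return "restart from checkpoint"
    return "start from scratch"

print(choose_start({"VRE_LANG_RECOVERY_LAUNCH": "1"}, checkpoint_exists=True))
print(choose_start({}, checkpoint_exists=True))
```

Passing the environment as a parameter keeps the sketch easy to try out without modifying the real process environment.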


Level   | Node states      | Recover with    | Checkpoint storage | Evaluation starts with
--------|------------------|-----------------|-------------------|------
Node    | FIZZLED, RUNNING | %rerun          | database | READY nodes
Task    | FIZZLED, RUNNING | %recover        | database & launch directory | first failed task
Step    | FIZZLED, RUNNING | %recover        | launch directory | first incomplete step
Application | FIZZLED, RUNNING | %recover    | app-specific | app-specific


## Typical use cases

There may be many reasons for failed evaluations. Some failed evaluations cannot be recovered because the output does not exist, e.g. evaluations including division by zero. The subsections below outline some typical scenarios in which a task/step/app level recovery would be beneficial.

### Node in FIZZLED state

If the node has one task and a `Property` with one step, then running `%recover` is recommended only if app-level recovery is possible, i.e. the Calculator/Algorithm supports the `restart` parameter. If app-level recovery is not possible, then using the `%recover` magic is only recommended for a many-step `Property`.

A reason for a FIZZLED state can be `ConvergenceError`. In this case, recovery is only effective at the application level and requires checkpointing at the same level.

### Node in permanent RUNNING state

If a node is in RUNNING state for longer than a certain timeout, the node is marked as a lost run. In this state, such a node can be restarted in recovery mode with `%recover`.

A reason for lost runs can be that an evaluation in the node has exceeded a wall-time limit from the resource management system (e.g. Slurm).

**NOTE:** If a node in RUNNING state is not marked as a lost run, no restart or recovery action will be performed; an error will be issued instead.