Scientific Computing Language
Hello world
To print a string on the screen the print statement is used:
print('Hello world!')
Statements
The building blocks of the language are the statements. The statements are separated either by semicolons ; or by new lines. A statement can be:
variable
print
view
import from a Python module
export of arbitrary parameter
function definition
variable update
NOTE: The order of evaluation does not depend on the order of statements. Rather the order of evaluation is determined by the dependencies between variables and their parameters.
NOTE: The print and view statements are evaluated in the same order as they are in the program. See more details here.
Variables
Variables are the most used statements. A variable is initialized immediately with its definition with a variable name on the left hand side and a parameter on the right hand side of the = sign. A variable may not be redefined, i.e. the variable name may not be used to define other parameters. The variable’s parameter can be a string, integer, Boolean, a data structure, etc. A parameter can also be a reference to a variable. Parameters are immutable, i.e. they cannot be modified after having been created.
Example:
var_1 = 'Hello world!'
var_2 = var_1
var_3 = 0.5 [meter]
In the first line var_1 is a variable name. However, in the second line, var_1 is a parameter of the variable with name var_2, more specifically a reference to the variable with name var_1. The third statement is a variable with name var_3 and a parameter of numeric floating point type with units meters.
The print statement
The print statement displays the values of one or more parameters on the screen. The syntax is print(par1[, par2[, ...]]) where the optional parameters are displayed in square brackets [].
Example:
print(var_1) # "print the value of variable with name var_1"
var_1 = 'Hello world!' # initialize variable with name var_1 with a string literal
In this example the variable with name var_1 is defined after the print statement. Because the value of variable with name var_1 has to be printed on the screen it has to be evaluated first. Therefore, the variable with name var_1 (i.e. the second statement) is evaluated first and only after that the print statement is evaluated.
NOTE: This is the behavior in the case of instant evaluation and deferred on-demand evaluation. In other forms of deferred evaluation, var_1 may be printed without having been evaluated. In the latter case n.c. (not computed) is printed instead of the value of var_1 that is not available yet.
NOTE: In the case of on-demand evaluation policy, only these variables or parameters are evaluated that are used in the print, view or export statements. In the example above, the variable var_1 would not be evaluated if there were no print statement.
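The dependency-driven, on-demand behavior described in these notes can be modeled outside the language. The following is a minimal Python sketch, not the interpreter's implementation: the `statements` dictionary, the helper names and the `unused` variable are hypothetical, and Python's standard `graphlib` module stands in for whatever ordering mechanism the interpreter actually uses.

```python
from graphlib import TopologicalSorter

# Hypothetical model of a program: name -> (names it depends on, how to compute it).
statements = {
    "var_2": ({"var_1"}, lambda env: env["var_1"]),
    "var_1": (set(), lambda env: "Hello world!"),
    "unused": (set(), lambda env: 42),
}


def required_names(targets, statements):
    """Return the targets plus everything they depend on, transitively."""
    seen = set()
    stack = list(targets)
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(statements[name][0])
    return seen


def evaluate_on_demand(targets, statements):
    """Evaluate only what the targets need, dependencies first."""
    names = required_names(targets, statements)
    graph = {name: statements[name][0] for name in names}
    order = list(TopologicalSorter(graph).static_order())
    env = {}
    for name in order:
        env[name] = statements[name][1](env)
    return env, order


# A print statement that refers to var_2 demands var_2 and, through it, var_1.
env, order = evaluate_on_demand(["var_2"], statements)
print(order)
print(env["var_2"])
```

In this model the textual order of the entries plays no role, and `unused` is never computed because no output statement asks for it.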
A more detailed explanation of the behavior of input/output operations is provided separately.
The view statement
This statement displays its parameters graphically. The syntax is
view <mode> (parameter_1, parameter_2, ...)
Currently the modes lineplot and scatterplot are implemented and these will be explained below.
Plotting datasets in 2D
This is achieved with the modes lineplot and scatterplot, which act as plot types here. Further 2D plotting modes will be implemented in the future, but all of them share common parameter patterns.
Data shape | Parameter 1 | Parameter 2 | Parameter 3 | Parameter 4 |
|---|---|---|---|---|
long-form data | values, Table | column to use as x axis, String | column to use as y axis, String | |
wide-form data | values, Series(1D-Array), shape of values:array: (len(index), len(columns)) or (len(columns), len(index)) | not used | ||
wide-form data | values, 2D-Array, shape: (len(index), len(columns)) or (len(columns), len(index)) | not used | ||
xy data | not used | not used |
Examples:
Long-form data:
tab = ((number: 1, 2, 3, 4, 1, 2, 3, 4) [meter], (type_: 'square', 'square', 'square', 'square', 'cube', 'cube', 'cube', 'cube'), (value: 1, 4, 9, 16, 1, 8, 27, 64), (time: 0., 1., 2., 3., 4., 5., 6., 7.) [hour])
units = (units: 'cm', '', '', 'min')
view lineplot (tab, 'number', 'value', units)
Wide-form data:
ind = (number: 1, 2, 3, 4) [meter]
val_ser = (values: [1, 4, 9, 16], [1, 8, 27, 64])
columns = (columns: 'square', 'cube')
view lineplot (val_ser, ind [cm], columns)
XY-data:
ind = (number: 1, 2, 3, 4) [meter]
sqr = (square: 1, 4, 9, 16)
view lineplot (sqr, ind [mm])
view lineplot (sqr:array, ind:array)
view lineplot (sqr, ind:array)
view lineplot (sqr:array, ind)
Units in print and view parameters
Parameters of numeric types in print and view statements are printed with their units. The parameter’s value can be printed in other than the default units by specifying the units in the print or view parameter. For example, to print a mass in grams we can use this:
mass = 1.5 [kg]
print(mass [g])
The output of this print statement is 1500.0 [gram].
Using the following special units keywords the units can be converted without specifying the target units explicitly:
Keyword | Comment | Example for |
|---|---|---|
| reduced units | |
| root units | |
| plain units | |
| human-readable magnitude | |
Export statement
The export statement allows exporting a value to a file or a URL.
The syntax of the export statement is <export_source> to (file <path> | url <url_string>)
Where <export_source> can be:
a reference to a variable
a = 'Hello world!'
a to file 'hello_world.yaml'
b = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
b[1][1] to file 'b11.json' # exports the array element `2`
operations on iterable objects such as Series slice, Series filter, Table slice, Table filter, Array slice, or Tuple membership.
# Exporting the slice of a series:
lens = (length: 1, 2, 3, 4, 5, 6) [m]
lens[0:4:2] to file 'lens_042.json' # exports the series slice `(length: 1, 3) [meter]`
# Exporting the output of a filter operation:
tabtp = ((temp: 100., 200., 300.) [K], (pressure: 1., 2., 3.) [bar])
tabtp where column:temp > 100 [K] to file 'tabtp_200-300K.json'
# exports `((temp: 200.0, 300.0) [kelvin], (pressure: 2.0, 3.0) [bar])`
The file extension (the portion of the path after the .) indicates the format in which the value will be exported. Currently, YAML (extensions ‘.yml’ or ‘.yaml’), JSON (extension ‘.json’), and HDF5 (extensions: ‘.hdf’, ‘.h4’, ‘.hdf4’, ‘.he2’, ‘.h5’, ‘.hdf5’, ‘.he5’) formats are supported for all variable types. Domain formats are supported for some domain-specific types (see the relevant sections).
NOTE: If a file with the same name as specified already exists, the export statement will not work, i.e. export allows no file overwriting.
NOTE: While relative paths, as in the example above, are supported it is strongly recommended to use absolute paths, especially in the workflow evaluation mode.
Other statements
Further statements are imports from external Python modules and function definitions. They are more advanced and are outlined here and here, respectively.
Type
Variables and parameters have a type. The type is fixed with the definition and checked before evaluation begins, i.e. it is static. The type determines in which operations a parameter or a variable can be used. If a statement contains operations on incompatible types, a Type error is issued. In most cases, type errors are issued before the evaluation begins, as long as the types of parameters and variables can be determined without computing their values.
String type
A string literal is a Unicode string enclosed by single or double quotes. Empty strings are allowed.
hello = "Hello world!"
empty = '' # empty string
print(hello == empty) # string match, result: false
print(hello != empty) # string match, result: true
Currently, no operations on strings, except for string match, are available.
Boolean type
Parameters and variables of Boolean type have values of either true or false. Unlike in other languages, parameters and variables of other types have no Boolean values. Also variables and parameters of Boolean type cannot be interpreted as numeric types, such as integer, and used in numeric expressions. Boolean literals are parameters matching either true or false:
bool_1 = true
bool_2 = false
Boolean expressions
Using the operators and, or and not and any parameters of Boolean type, arbitrary Boolean expressions can be composed. Expressions with and and or are currently not short circuiting.
a = true and (false or true)
print(not a) # result: false
Boolean expressions always have Boolean type.
Numeric types
The parameters of numeric type can be integer (Integer), floating point (Float) or complex (Complex) quantities. Numeric types can be Quantity, Array and Series.
Numeric expressions
Using the operators +, -, *, / and **, and any scalar numeric parameters (of type Quantity), arbitrary numeric expressions can be composed.
a = 2
b = (2.0*a + 1)**2 - 1.5
Numeric expressions always have numeric type.
Physical units
All parameters of numeric type have physical units assigned. Here are some examples:
number = 1 # dimensionless integer type quantity
length = 2.0 [meter] # floating point type quantity with units meter
s = number + length; print(s)
Because number is dimensionless it cannot be added to length and the following evaluation error occurs:
Dimensionality error: None:3:5 --> number + length <--
Cannot convert from 'dimensionless' (dimensionless) to 'meter' ([length])
In contrast to type, physical units are checked only during evaluation. This is why this error message will not be issued if we remove the print(s) statement.
NOTE: Dimensionless quantities also have units. This becomes evident in the error message above. These units are [dimensionless]. These can be optionally specified, for example number = 1 [dimensionless].
Complex numbers
Complex numbers have the format real [+-] imag [jJ] where real and imag are the real and the imaginary part of the complex number, respectively. Complex numbers can be used as scalars, as well as in Series and Arrays. The real and the imaginary part of a complex scalar can be retrieved using the built-in functions real() and imag(), respectively. For example, one can define a function to compute the complex conjugate:
conjg(z) = real(z) - imag(z) * (0 + 1 j)
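For comparison, the same construction can be written in Python, whose complex literals use the same `j` suffix. This is only an analogue under the assumption that `real()` and `imag()` behave like Python's `.real` and `.imag` attributes; it is not code in the language described here.

```python
def conjg(z: complex) -> complex:
    """Complex conjugate assembled from the real and imaginary parts."""
    return z.real - z.imag * (0 + 1j)


z = 3 + 4j
result = conjg(z)
print(result)
print(result == z.conjugate())  # compare with Python's built-in conjugate
```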
Comparison expressions
Comparison expressions are defined for scalar numeric types. They can include one of the operators ==, !=, >, <, >=, <=. In comparisons with String, Boolean and Complex operands only the operators == and != are allowed. String matching using the operators == and != can be regarded as a comparison expression. Comparison expressions always have Boolean type.
b = 2 < 1
print(b) # result: false
Uncertainties
Parameters of scalar floating-point type (i.e. Quantity of Float datatype) can have optional uncertainty specification. The Quantity literal with uncertainty has syntax shown in the following examples:
time = 12.7 +/- 0.1 [seconds] # easy to type
distance = 2.56 ± 0.02 [angstrom] # easy to read
The number after the ± (or +/-) is the standard deviation and therefore it must be a non-negative floating point number. Using 0 as uncertainty is allowed but not recommended as it produces the warning “Using UFloat objects with std_dev==0 may give unexpected results.”
Quantities with uncertainties can be used, for example, in all numeric expressions, comparison expressions, user-defined internal functions and many external functions.
Comparisons of quantities with uncertainties should be performed with caution. For more details see the user guide and the technical guide of the uncertainties package.
Data structures
In data structures several parameters of different or the same types can be combined to express a certain type of interrelation. Types that are not data structures are called scalar types.
Tuple
Parameters of any type can be combined in a fixed order using a Tuple. The syntax is like in this example: t = (a, 1.3, 'abc', false); a = 2. Tuples are most useful if used as parameters of tuples of variables but also to pass bundled heterogeneous data.
A tuple containing one parameter should contain a comma before the closing parenthesis, otherwise it may be parsed as an expression. For example, use (1,) or (true,) but not (1) or (true). Empty tuples are not allowed as input.
Series
The Series contains a list of parameters: elements. The elements must be of the same type unless they are all scalar numerical types. In the latter case the datatype of the series will be the most generic type found in the series. For example, if the elements are floating-point and integer numbers then the series datatype will be Float.
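The promotion rule can be pictured with a short Python sketch. It assumes the ordering Integer < Float < Complex for "most generic", which the text states only for the integer and floating-point case; the function name `series_datatype` and the `RANK` table are hypothetical.

```python
# Assumed ordering from least to most generic numeric type.
RANK = {int: 0, float: 1, complex: 2}


def series_datatype(elements):
    """Return the most generic numeric type found among the elements."""
    types = [type(e) for e in elements]
    for t in types:
        if t not in RANK:
            raise TypeError(f"not a scalar numeric type: {t.__name__}")
    return max(types, key=RANK.get)


mixed = series_datatype([1, 2.5, 3])      # integers and floats
integers = series_datatype([0, 3, -2])    # integers only
print(mixed.__name__, integers.__name__)
```

Booleans are rejected in this sketch because the language does not treat them as numeric types.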
The common syntax of Series is: (name: e1[, e2[, e3[...]]]) [units for numeric type].
The Series data structure must have a name.
The Series elements are the items between the : and the ) separated by commas.
Series literals must have at least one element.
Series of numeric types must have elements of the same units.
Series literals have syntax that is shown in the following example.
a = 3. [s]
s1 = (time: 1. [s], 2. [s], a)
s2 = (lengths: 1., 2., 3.) [nm]
s3 = (booleans: true, bval); bval = false
s4 = (numbers: 0, 3, -2)
The units of Series of numeric type can be specified if all elements are numeric literals, either after every single element, as shown for s1, or after the closing parenthesis ) as shown for s2. If an element is another numeric parameter then units may not be specified as it is shown for the parameter a in s1 (the parameter holds the units itself). If units specification is omitted, as in s4, then the Series still has units but it is dimensionless. One can also specify [dimensionless] but this is optional. Series of non-numeric types may have no units. Empty Series is not allowed as input.
Table
The Table data structure consists of an ordered set of Series parameters of the same length that can be viewed as columns. There are two different syntaxes for Table literals:
t1 = ((numbers: 1, 2, 3), (lengths: 1., 2., 3.) [nm])
t2 = Table ((numbers: 1, 2, 3), s2)
s2 = (lengths: 1., 2., 3.) [nm]
The rows of the Table are Tuples of the Series elements at the same position (see subscripting operations below). Empty tables are not allowed.
Dict
The Dict can be regarded as a tuple of key-value pairs
d = {key1: value1, key2: value2, ...}
where the values can be any parameters. Dict is mostly used to define a Table row within the functions (the first parameters) of map and reduce or to define tags and search queries.
A Dict requires that each key appears only once within the same dictionary. A duplicate key is considered invalid and will result in an error. It is also noted that the input order of key-value pairs has no meaning and may not be preserved.
Arrays
Arrays are data structures with fixed types. Compared to Series an Array has no name, may be multidimensional (whereas Series is one-dimensional) and may not be used as a column in Tables. In addition, an Array may only have Numeric, Boolean or String datatype whereas Series may have any datatype. Array literals have the following syntax:
pbc = [true, true, false] # switching boundary condition
cell = [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]] [angstrom] # cubic unit cell
Empty arrays are not allowed as input.
Internal functions
An internal function, or simply a function, is a named expression in which some variables are bound. The bound variables (also called dummy variables) are provided comma-separated in a list enclosed by parentheses. The function is called by specifying the parameters to be used for each of the dummy variables:
f(x) = x**3 - 2*x**2 + 3*x - 4 # function definition (a statement)
print(f(1)) # function call (a parameter)
b = f(2); print(b) # another call
g(x) = b*x # another function definition where b is unbound
NOTE: The dummy variables are bound to the scope of the relevant functions. This is why the same names can be reused as dummy variables in other functions.
NOTE: The list of dummy variables may not be empty, e.g. f() = 2*a is not valid. In all such cases, simply the expression 2*a should be used instead of the call f(). If the expression has to be used more than once, then a variable b = 2*a can be defined and a reference to b can be used.
The if function and if expression
The value of the if function depends on the value of the first parameter that is always of Boolean type: if it is true then the value is equal to the value of the second parameter. If the value of the first parameter is false then the value of the function equals the value of the third parameter. All three parameters are mandatory.
c = if(true, 1, 2); print(c) # result: 1
...
b = f(...) # function call with boolean type
d = if(b, 'b was true', 'b was false'); print(d)
The if expression has a different syntax but the same meaning (semantics) as the if function. The same examples are shown below using the expression syntax.
c = 1 if true else 2; print(c) # result: 1
...
b = f(...) # function call with boolean type
d = 'b was true' if b else 'b was false'; print(d)
Expression nesting
Expressions of the same type can be nested by using parentheses (). One typical use case is nesting comparison expressions in Boolean expressions.
print((3 > 4) or (-1 <= 0)) # result: true
Map function
The map() function iterates over the tuples from the elements or rows of the second, third, etc. parameters, which must be Series or Tables of equal length, and calls the function defined as first parameter with the tuple per each iteration. The type of map() depends on the type of the first parameter. If the type of the first parameter is Dict, then the type of map() is Table. Otherwise the type is Series with elements of the type of the first parameter. The length of the returned Series or Table is the same as the length of the input data (second, third etc. parameters).
Example with Series as input data:
s = (length: 1, 2, 3) [m]
sqr(x) = x*x
area = map(sqr, s)
print(area) # result (area: 1, 4, 9) [meter ** 2]
The first parameter in map() can also be a so-called lambda function. A lambda function is an internal function with no name.
s = (length: 1, 2, 3) [m]
area = map((x: x*x), s)
print(area) # result (area: 1, 4, 9) [meter ** 2]
A typical use case of map() is to apply operations element-wise to one or more series. In the following example an expression with the elements of two series is computed.
sx = (sx: 0.1, 1.3, -1.2)
sy = (sy: 2.1, -3.7, 4.6)
print(map((x, y: 3*x + 2*y - 1), sx, sy))
In this example, the returned type will be Series with elements of the type of the lambda function (x, y: 3*x + 2*y - 1), i.e. Float type, because the elements of sx and sy are floating-point numbers.
Example with table and with series and table as input data:
t = ((a: 1, 2, 3), (b: 4, 5, 6))
print(map((x: {a: x.a, b: x.b, c: x.a + x.b}), t))
s = (b: true, false, true)
print(map((x, y: {c: x.a + x.b, b: not y}), t, s))
series = map((x: x.a + x.b), t)
print(series)
program output: >>>
((a: 1, 2, 3), (b: 4, 5, 6), (c: 5, 7, 9))
((c: 5, 7, 9), (b: false, true, false))
(series: 5, 7, 9)
<<<
Operations with Series
In the following, operations with parameters of type Series will be outlined.
Slice
A slice of a Series is a new parameter of type Series returning a selection of elements from a parameter of type Series.
Syntax: [start:stop] or [start:stop:step]
In the first syntax the default step is 1.
lens = (length: 1, 2, 3, 4, 5, 6) [m]
print(lens[0:1]) # result: (length: 1) [meter]
print(lens[0:4:2]) # result: (length: 1, 3) [meter]
print(lens[6:0:-1]) # (length: 6, 5, 4, 3, 2) [meter]
print(lens[6::-1]) # invert the order, result: (length: 6, 5, 4, 3, 2, 1) [meter]
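The results listed above are consistent with the slice semantics of Python sequences. Assuming that correspondence holds, the same four slices can be checked with a plain Python list (units are omitted because Python lists carry none):

```python
lens = [1, 2, 3, 4, 5, 6]

first = lens[0:1]            # stop index is exclusive
every_other = lens[0:4:2]    # step of 2 within the first four elements
backwards = lens[6:0:-1]     # stops before index 0, so the first element is excluded
inverted = lens[6::-1]       # an omitted stop runs through to the first element

print(first, every_other, backwards, inverted)
```

The difference between the last two slices is only the omitted stop index, which is what allows the full inversion.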
Subscripting
Individual Series elements can be retrieved by subscripting. The syntax is [index]. The type and, if appropriate, the units of the returned parameter are the same as those of the Series.
lens = (length: 1, 2, 3, 4, 5, 6) [m]
print(lens[0]) # first element: 1 [meter]
print(lens[1]) # second element: 2 [meter]
print(lens[-1]) # last element: 6 [meter]
print(lens[-2]) # second to last element: 5 [meter]
Retrieve the name
The name of a parameter of Series type can be retrieved using :name:
lens = (length: 1, 2, 3, 4, 5, 6) [m]
print(lens:name) # result: 'length'
The type of Series name is string type.
Retrieve the array
The array of a parameter of Series type can be retrieved using :array:
lens = (length: 1, 2, 3, 4, 5, 6) [m]
print(lens:array) # result: [1, 2, 3, 4, 5, 6] [meter]
The returned type is Array type.
Reduce function
The reduce() function calls the function of two arguments provided as first parameter successively and cumulatively with the elements of the Series provided as second parameter.
Example:
s = (n: 1, 2, 3, 4)
prod(x, y) = x*y
print(reduce(prod, s)) # with internal function, result: 24
print(reduce((x, y: x*y), s)) # with lambda function
# with nested function calls:
print(prod(prod(prod(1, 2), 3), 4)) # equivalent
print(prod(prod(prod(s[0], s[1]), s[2]), s[3])) # equivalent
The type of the reduce() function is the same as the type of the first parameter of reduce(). In the example above, this will be the type of the lambda function (x, y: x*y) or of the function prod(x, y).
By combining map() and reduce() various algorithms can be implemented, for example the scalar product of two series:
s1 = (s1: -1., 2., -3.); s2 = (s2: 1., -2., 3.)
print(reduce((x, y: x+y), map((u, v: u*v), s1, s2)))
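The same map-then-reduce pattern can be traced in Python, where `functools.reduce` and the built-in `map` play analogous roles. This is an illustrative analogue with plain lists, not code in the language described here.

```python
from functools import reduce

s1 = [-1.0, 2.0, -3.0]
s2 = [1.0, -2.0, 3.0]

# Element-wise products of the two sequences.
products = list(map(lambda u, v: u * v, s1, s2))

# Cumulative sum of the products, i.e. the scalar product.
scalar_product = reduce(lambda x, y: x + y, products)

print(products)
print(scalar_product)
```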
Functions sum, all and any
The sum() function has two syntaxes:
If sum() has only one parameter, then it must be of type Series of numeric type and computes the sum of the parameter elements. In this case sum(...) is equivalent to reduce((x, y: x+y), ...).
If sum() has more than one parameter then these parameters must be of scalar numeric type and the function computes the sum of all parameters.
The all() function has two syntaxes:
If all() has only one parameter, then it must be of type Series of Boolean type and has true value when all parameter elements have true values; otherwise it has false value. In this case all(...) is equivalent to reduce((x, y: x and y), ...).
If all() has more than one parameter then these parameters must be of scalar Boolean type and has true value when all parameters have true values; otherwise it has false value.
The any() function has two syntaxes:
If any() has only one parameter, then it must be of type Series of Boolean type and has true value if any parameter element has true value; otherwise it has false value. In this case any(...) is equivalent to reduce((x, y: x or y), ...).
If any() has more than one parameter then these parameters must be of scalar Boolean type and has true value when any parameter has true value; otherwise it has false value.
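The stated equivalences with reduce() can be checked in Python, whose built-in `sum`, `all` and `any` behave analogously for lists. The sketch below is an analogue only; the variable names are hypothetical.

```python
from functools import reduce

numbers = [1, 2, 3, 4]
flags = [True, False, True]

# Each built-in is reproduced by folding the corresponding binary operation.
sum_by_reduce = reduce(lambda x, y: x + y, numbers)
all_by_reduce = reduce(lambda x, y: x and y, flags)
any_by_reduce = reduce(lambda x, y: x or y, flags)

print(sum_by_reduce == sum(numbers))
print(all_by_reduce == all(flags))
print(any_by_reduce == any(flags))
```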
Filter function and filter expression
The filter function performs a selection of elements from a Series type parameter that satisfy a condition. The condition is provided as a Boolean type internal or lambda function.
print(filter((x: x > 2), (n: 1, 2, 3, 4)))
The filter expression is semantically equivalent to the filter function but has a different syntax (note particularly the variable reference).
s = (n: 1, 2, 3, 4); print(s where column:n > 2)
More complex conditions are possible. For example:
s = (n: 1, 2, 3, 4)
ffunc = filter((x: (x < 2) or (x > 3)), s)
fexpr = s where column:n < 2 or column:n > 3
print(ffunc, fexpr) # (ffunc: 1, 4) (n: 1, 4)
The type of filter functions and expressions is always the same as the type of the input Series.
Testing membership
The membership test is an expression with two input parameters separated by the in keyword: a parameter of any type and a parameter of Series type.
The test returns true if the first parameter is an element of the Series, false otherwise.
If the series contains a null element and the value of the first parameter is not found in the series, the test will return null only if the data types of both parameters match; otherwise, it will return false.
Example:
s = (s: 'a', 'b', null)
print('a' in s) # true
print('c' in s) # null
print(1 in s) # false
n = (n: 1, 2, 3)
print(4 in n) # false
The membership expression is semantically equivalent to applying map and any in a sequence:
s = (s: 'a', 'b', null)
print(any(map((x: x == 'a'), s))) # true
print(any(map((x: x == 'c'), s))) # null
print(any(map((x: x == 4), (n: 1, 2, 3)))) # false
The advantages of using the in operator are better code readability and brevity, and possibly shorter evaluation time due to short circuiting, which is not available in map.
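The three possible outcomes of the membership test can be modeled in Python with `None` standing in for null. This is a sketch of the rule as stated above, under the assumption that "data types match" means the type of the tested value equals the type of the non-null elements; the function name `member` is hypothetical.

```python
def member(value, series):
    """Three-valued membership test; None plays the role of null."""
    non_null = [x for x in series if x is not None]
    if any(x == value for x in non_null):
        return True
    has_null = len(non_null) < len(series)
    types_match = all(type(x) is type(value) for x in non_null)
    if has_null and types_match:
        return None  # the null element might have been a match
    return False


s = ["a", "b", None]
print(member("a", s))          # value found
print(member("c", s))          # not found, null present, same type
print(member(1, s))            # not found, type differs
print(member(4, [1, 2, 3]))    # not found, no null element
```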
Operations with Table
In the following, operations including parameters of type Table will be outlined.
Slice
Similarly to Series, a slice from a Table is a Table. The syntax and semantics are the same as with Series.
tab = ((bools: true, false, true), (numbers: 1, 2, 3))
print(tab[0:2]) # result: ((bools: true, false), (numbers: 1, 2))
print(tab[3::-1]) # result: ((bools: true, false, true), (numbers: 3, 2, 1))
Subscripting
Individual Table rows can be retrieved by subscripting. The type of this operation is always a Tuple type. The syntax is [index].
tab = ((bools: true, false, true), (numbers: 1, 2, 3))
print(tab[0]) # result: (true, 1)
Retrieve a column
Individual Table columns can be retrieved using the syntax .<name> where <name> is the column name.
tab = ((bools: true, false, true), (numbers: 1, 2, 3))
print(tab.numbers) # result: (numbers: 1, 2, 3)
The type of this operation is always Series type.
Retrieve the list of column names
The list of column names of a Table can be retrieved using :columns. The type of this operation is Series type.
tab = ((bools: true, false, true), (numbers: 1, 2, 3))
print(tab:columns) # result: (columns: 'bools', 'numbers')
Filter expressions applied to Table
Filter expressions can be used with parameters of Table type.
tab = ((temp: 100., 200., 300.) [K], (pressure: 1., 2., 3.) [bar])
print(tab where column:temp > 100 [K])
print(tab select pressure where column:temp > 100 [K])
After evaluation this result is printed:
((temp: 200.0, 300.0) [kelvin], (pressure: 2.0, 3.0) [bar])
((pressure: 2.0, 3.0) [bar])
Filter function applied to Table
The filter function performs a selection of rows from a Table type parameter that satisfy a condition. The condition is provided as a Boolean type internal or lambda function.
Example:
tabl = ((numbers: 1, 2, 3), (strings: 'a', 'b', 'c'))
print(filter((x: x.numbers > 2), tabl))
This will return a new table with the same columns as tabl but only the rows where numbers is greater than 2. This is semantically equivalent to the filter expression tabl where column:numbers > 2.
Reduce function applied to Table
The reduce() function applies a function of two arguments provided as first parameter successively and cumulatively to the rows of the Table provided as second parameter.
Example: Sum the elements of the first column and multiply the elements of the second column.
t = ((a: 1, 2, 3), (b: 4, 5, 6))
print(reduce((x, y: {a: x.a + y.a, b: x.b * y.b}), t))
program output: >>>
((a: 6), (b: 120))
<<<
The type of the reduce() function is a Table with one row and columns the same as the second parameter.
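The row-wise reduction in this example can be followed step by step in Python, with each Table row modeled as a dictionary. This is an analogue under that modeling assumption, not code in the language described here; the names `table`, `rows` and `combined` are hypothetical.

```python
from functools import reduce

# Column-oriented table, as in the example: columns a and b.
table = {"a": [1, 2, 3], "b": [4, 5, 6]}

# View the table as a list of rows, one dictionary per row.
rows = [dict(zip(table, values)) for values in zip(*table.values())]

# Sum column a and multiply column b, combining two rows at a time.
combined = reduce(
    lambda x, y: {"a": x["a"] + y["a"], "b": x["b"] * y["b"]},
    rows,
)

print(rows)
print(combined)
```

The single resulting dictionary corresponds to the one-row Table in the program output.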
Operations with Array
In the following, operations including parameters of type Array will be outlined. The examples are with an array of integer type but the operation can be used with all data types supported in arrays.
Subscripting
The purpose of subscripting is to retrieve individual elements or sub-arrays.
Retrieving individual elements
The following example demonstrates the retrieval of a single array element:
a = [[1, 2], [3, 4]] [m]
print(a[1][0]) # result: 3 [meter]
To retrieve a single array element, the number of subscripts must be the same as the number of dimensions (axes) of the array. The returned type is a scalar type the same as the array data type.
Retrieving sub-arrays
If the number of subscripts is less than the number of array dimensions (axes) then a sub-array is returned:
a = [[1, 2], [3, 4]] [m]
print(a[0]) # result: [1, 2] [meter]
Subscripting errors
If the subscript is larger than the largest index in the relevant axis then an Invalid index error is issued:
a = [[1, 2], [3, 4]] [m]
b = a[2] # Index out of range, index: 2, data length: 2
c = a[1][3] # Index out of range, index: 3, data length: 2
If the number of subscripts exceeds the number of axes in the array then a Type error is issued:
a = [[1, 2], [3, 4]] [m]
b = a[0][0][0] # Invalid use of index in type Quantity
Because a[0][0] is a Quantity (a numerical scalar type) it cannot be subscripted with [0].
Slice
The slice returns selected elements within the same axis of an array. The slice syntax is the same as with Series and Table. The slice can be applied only once per statement after all (optional) subscripts. Example:
a = [[1, 2], [3, 4]] [m]
print(a[1][0:1:1]) # result: [3] [meter]
print(a[::]) # result: [[1, 2], [3, 4]] [meter]
print(a[0:1][0]) # Syntax error, intended result [1, 2] [meter]
print(a[0:1][0:1]) # Syntax error, intended result: [[1, 2]] [meter]
To concatenate multiple slices or to combine slices with subscripts in arbitrary ordering, with the current syntax one has to define auxiliary variables:
a = [[1, 2], [3, 4]] [m]
# print(a[0:1][0]) # Syntax error, intended result [1, 2] [meter]
b0 = a[0:1]; print(b0[0]) # result [1, 2] [meter]
# print(a[0:1][0:1]) # Syntax error, intended result: [[1, 2]] [meter]
e0 = a[0:1]; print(e0[0:1]) # result: [[1, 2]] [meter]
# print(a[1:][0:]) # Syntax error, intended result: [[3, 4]] [meter]
f0 = a[1:]; print(f0[0:]) # result: [[3, 4]] [meter]
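The intended results quoted in the comments match what nested Python lists return for the same subscripts and slices. The sketch below uses plain lists (no units) purely as a cross-check of those results; unlike the language described here, Python accepts chained slices in one expression, so the auxiliary variables are kept only to mirror the example.

```python
a = [[1, 2], [3, 4]]

b0 = a[0:1]          # slice of the first axis: [[1, 2]]
row = b0[0]          # subscript applied to the slice

e0 = a[0:1]
nested = e0[0:1]     # slicing the slice again keeps the outer axis

f0 = a[1:]
tail = f0[0:]

print(row, nested, tail)
print(a[1][0:1:1], a[::])
```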
Operations with Tuple
Subscripting
Individual elements or slices of tuples can be retrieved using subscripting.
Example:
tup = (1, (2, 3), true, 'abc', [4, 5])
print(tup[0]) # result: 1
print(tup[1]) # result: (2, 3)
print(tup[-1]) # result: [4, 5]
print(tup[3:4]) # result: ('abc',)
print(tup[0::2]) # result: (1, true, [4, 5])
Testing membership
The membership expression consists of the operator in and two operands that are
a parameter of any type and a Tuple literal, respectively. It returns true if
the left operand is equal to any of the elements of the tuple, false otherwise.
If there is a null element in the tuple then null is returned in case of no match.
Example:
print(5 in (1, 4, 5)) # true
print(0 in (1, 4, 5)) # false
print(3 in (-1, 5, null)) # null
Range function and range expression
The range function with syntax range(start, stop, step) creates Series in a given numeric range from start to (but not including) stop incrementing by step.
lens = range(1 [m], 6 [m], 1 [m])
print(lens) # result: (lens: 1, 2, 3, 4, 5) [meter]
There is also a range expression with the same semantics as the range function.
lens = range from 1 [m] to 6 [m] step 1 [m]
print(lens) # result: (lens: 1, 2, 3, 4, 5) [meter]
The parameters start, stop and step must be of the same scalar numeric type, i.e. either integers or floating-point numbers.
Imported objects and functions
External objects and functions from arbitrary Python modules can be imported and used in the language. There are three syntaxes:
use <module>.<name>
use <name1>, <name2> from <module[.submodule]*>
from <module[.submodule]*> use <name1>, <name2>
The first syntax is shorter while the second and third allow several imports in the same statement and namespace including modules with arbitrary number of sub-modules separated by periods.
The imported objects and functions can be used as parameters (references) with the same names as in the imports. This means that no variables with these names can be used.
Examples
The example below shows using an imported function len() from the builtins module from the standard Python package using the first syntax:
use builtins.len
s = (numbers: 1, 2, 3)
print(len(s)) # result: 3
The following example demonstrates usage of imported object pi and functions sin() and cos() from the numpy module (package) using the second and third syntaxes.
use pi, sin from numpy
from numpy use cos
print(sin(2.*pi)) # result: -2.4492935982947064e-16
print(cos(2.*pi)) # result: 1.0
The name of an imported object cannot be used any more:
use math.pi
pi = 3.14
Initialization error: None:1:? --> pi = 3.14 <--
Repeated initialization of "pi"
Imported functions
Using external Python functions in textS (with the use keyword) requires some knowledge of Python. While using functions from the builtins and math Python modules, or from the numpy package, can be straightforward, using self-written functions provides more options but also hides more difficulties.
Function signature
If an imported function has a call signature, it is used to perform a static check. In particular, the call is checked for the correct number of required parameters, which are determined by extracting the positional arguments without defaults from the signature.
NOTE: Currently, imported Python functions can be called only with their positional arguments, i.e. calls with pure keyword arguments are not possible.
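A minimal Python sketch of such a check, based on the standard inspect module, is shown below. The helper names required_positional_count and check_call and the function scale are hypothetical; the actual check in the interpreter may be implemented differently.

```python
import inspect


def required_positional_count(func):
    """Count positional parameters without defaults, or return None if no signature exists."""
    try:
        signature = inspect.signature(func)
    except (TypeError, ValueError):
        return None  # no introspectable call signature: the check is skipped
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    return sum(
        1
        for param in signature.parameters.values()
        if param.kind in positional_kinds and param.default is inspect.Parameter.empty
    )


def check_call(func, num_args):
    """Raise TypeError if fewer positional arguments are passed than required."""
    required = required_positional_count(func)
    if required is not None and num_args < required:
        raise TypeError(
            f"{func.__name__} requires at least {required} positional arguments, got {num_args}"
        )


def scale(value, factor, offset=0.0):  # hypothetical self-written function
    return value * factor + offset


check_call(scale, 2)  # passes: value and factor are supplied, offset has a default
```

With this sketch, check_call(scale, 1) raises a TypeError before the function is ever called.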
Type annotations
If type annotations are provided in the call signature for any positional arguments of the function, a type check is performed for these particular arguments. If a return-type annotation is available, it is used to perform a static type check.
Currently, only types processed by the interpreter may be used in the type hints. More information about the valid types is provided in this section.
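The following Python sketch illustrates how annotations can drive such a check. It is a simplified assumption: it inspects concrete argument values at call time, whereas the interpreter works statically on inferred types. The names check_argument_types and area are hypothetical.

```python
import inspect
import typing


def check_argument_types(func, *args):
    """Check annotated positional arguments and return the return-type annotation, if any."""
    hints = typing.get_type_hints(func)
    names = list(inspect.signature(func).parameters)
    for name, value in zip(names, args):
        expected = hints.get(name)
        # Only arguments that carry a plain class annotation are checked.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(
                f"argument {name!r} must be {expected.__name__}, got {type(value).__name__}"
            )
    return hints.get("return")


def area(width: float, height: float, label=None) -> float:  # hypothetical function
    return width * height


print(check_argument_types(area, 2.0, 3.0))
```

The unannotated argument label is not checked, which mirrors the statement that the check applies only to the annotated arguments.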
Built-in module with commonly used functions
The virtmat.functions module provides a set of commonly used functions. Compared to functions directly imported from Python’s math module and the numpy package, the functions from the virtmat.functions module provide better support for typing, physical units and quantities with uncertainties.
Examples:
use exp from virtmat.functions
use boltzmann_constant from virtmat.constants
f(e, T) = exp(-e/(boltzmann_constant*T))
print(f(0.01 [eV], 300 [K])) # result: 0.6792151960927103
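The printed value can be cross-checked in plain Python without unit support. The sketch below assumes the CODATA 2018 value of the Boltzmann constant expressed in eV/K; the constant shipped with virtmat.constants may differ in the last digits.

```python
import math

# Boltzmann constant in eV/K (CODATA 2018 value, an assumption of this sketch)
K_B_EV_PER_K = 8.617333262e-5


def boltzmann_factor(energy_ev, temperature_k):
    """Return exp(-E / (k_B * T)) for an energy in eV and a temperature in K."""
    return math.exp(-energy_ev / (K_B_EV_PER_K * temperature_k))


print(boltzmann_factor(0.01, 300.0))
```

The value agrees with the textS result above to about six decimal places. In textS the conversion between eV and K is handled by the units, so no manual choice of the constant's unit is needed.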
Currently, all numpy functions and universal functions that are supported by the pint package are available in virtmat.functions. A list of names of all supported functions can be retrieved from the same module:
use FUNCTIONS from virtmat.functions
print(FUNCTIONS)
Some of these functions provide support for inputs with uncertainties via the uncertainties.unumpy module. This support is currently limited to unumpy-supported functions with one output (i.e. no functions returning tuple) and to only scalar inputs.
Additionally, some functions from Python’s math module, adapted to textS, are available in the module virtmat.functions.math.
Example:
from virtmat.functions.math use sqrt
print(sqrt(4 [meter**2])) # result: 2.0 [meter]
The built-in info function
The info function takes one argument that can be any parameter. It returns a Table with information about the parameter, such as type, datatype (for Series and Arrays), and dimensionality / units if the parameter is numeric and has been evaluated. In workflow evaluation mode, additional metadata is included in the table when the parameter is a variable.
Example with an expression:
Input:
print(info(prop.energy))
Output:
((name: null),
(type: 'Series'),
(scalar: false),
(numeric: true),
(datatype: 'float'),
(dimensionality: '[mass] * [length] ** 2 / [time] ** 2'),
(units: 'electron_volt'))
Example with a variable:
Input:
print(info(prop))
Output:
((name: 'prop'),
(type: 'Property'),
(scalar: false),
(numeric: false),
(datatype: null),
('group UUID': 'd9fe0968ed6740888750eb534aababa0'),
('model UUID': '181091fc725242258b3788ed797bf932'),
('node UUID': 'c68c476ef82141d29ae5be555722c27a'),
('node ID': 751),
('parent IDs': (752, 753, 754)),
('node state': 'COMPLETED'),
('created on': '2025-02-25T14:23:32+01:00'),
('updated on': '2025-02-25T14:23:44+01:00'),
('grammar version': 32),
('data schema version': 7),
('python version': '3.10.12 (main, Feb 4 2025, 14:57:36) [GCC 11.4.0]'),
(category: 'interactive'),
(fworker: null),
(dupefinder: true),
('reservation ID': null),
('number of launches': 1),
('number of archived launches': 0),
(launch_dir: ('/mnt/data/ubuntu/work/vre-language/examples/launcher_2025-02-25-13-23-33-167195',)),
('archived launch_dir': ()),
(runtime_secs: 3.495358),
('runtime_secs total': 3.495358))
The output of info cannot be used as a parameter of a variable and therefore is not persistent. Currently, info can only be used in print statements, and it does not trigger on-demand evaluation of the referenced variables.
The info function replaces the deprecated type function (grammar version 32 or newer). The type function in older compatible grammar versions produces the same output as info.
Loading parameters from file or URL
Some parameters have an optional syntax that allows them to be loaded from a file or downloaded from a URL. The common syntax is
<var name> = <parameter> from (file <path> | url <url_string>)
Current list of textS parameters supporting this syntax: Quantity, Bool, String, Series, Table, BoolArray, StrArray, IntArray, FloatArray and ComplexArray. Additionally, there are textM parameters that also support this syntax.
The <path> is a string that must contain the path to the input data file. While relative paths are supported it is strongly recommended to use absolute paths, especially in the workflow evaluation mode. By default the internal serialization format (in JSON) is used and the file type may be JSON (filename extension json) or YAML (file name extensions yml or yaml). Domain-specific parameters may support further domain-specific formats.
Dealing with missing data or default values
Sometimes measurements include data gaps, i.e. some data elements may be missing. Furthermore, in modeling some parameters often have default values that should be used without specifying them. For these two use cases, Series allow the placeholders null and default to specify unknown elements. Note that null and default have no type because they are not parameters. The Series type is inferred from the type of all other elements, which must have the same type. If all elements are either null or default then the type of the series is Any (unknown type).
numbers = (numbers: 1, 2, null)
sqrs = map((x: x**2), numbers)
print(sqrs) # result: (sqrs: 1, 4, null)
print(sqrs[2]) # result: null
If some quantity or an element in a data structure critically depends on such elements, it gets the null value. For example, print(any((bools: true, null))) yields true and not null because the missing value is not critical for the value of the any() function. In contrast, print(all((bools: true, null))) yields null, i.e. undetermined between true and false, because the placeholder null can be either true or false.
Implications for Boolean operations
Structures in which Boolean values are missing and denoted by null are processed using a three-valued logic. This means that the non-Boolean value null is returned when a Boolean output is ambiguous. This affects Boolean expressions and the if, filter, all and any built-in functions.
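The three-valued logic for any and all can be sketched in Python with None standing in for null. The function names any3 and all3 are hypothetical. The two cases (true, null) follow the examples in the text; the remaining cases are what the same reasoning implies and are an assumption of this sketch.

```python
def any3(values):
    """Three-valued any: True if some element is True, None if undecided, else False."""
    values = list(values)
    if any(value is True for value in values):
        return True  # a single True decides the result, missing values do not matter
    if any(value is None for value in values):
        return None  # a missing value could still turn out to be True
    return False


def all3(values):
    """Three-valued all: False if some element is False, None if undecided, else True."""
    values = list(values)
    if any(value is False for value in values):
        return False  # a single False decides the result, missing values do not matter
    if any(value is None for value in values):
        return None  # a missing value could still turn out to be False
    return True


print(any3([True, None]))  # True
print(all3([True, None]))  # None
```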
Dealing with failures
Like all other parameters in textS, variables are immutable objects. This means that they cannot be modified (updated) once they have been defined and initialized.
This behavior can lead to the following situation. Let us have this model that we run in an interactive session:
Input > a = 1
Input > b = a / 0
Input > c = 2
Input > f(x) = b * x
Input > %start
Input > print(f(c))
Arithmetic error: None:2:1 --> b = a / 0 <--
float division by zero
Obviously, we will never be able to use function f because of the run-time error in evaluating b. One workaround is to define a new variable b_correct and a new function f_correct that uses b_correct instead of b:
Input > b_correct = a / 2
Input > f_correct(x) = b_correct * x
Input > f_correct(c)
Output > 1.0
Though this is the recommended approach, there are some cases in which it is not desirable or practical. One case is a model that contains a large number of descendants of variable b. In this case all these statements have to be rewritten. Furthermore, the statements that are descendants of b will never be evaluated but also cannot be removed from the model, which is likely to lead to confusion. In another case, the evaluations have not failed but a mistake leading to wrong results is found in a statement. Thus the affected statement and all its descendants have to be invalidated or removed.
Effectively, this can be accomplished by updating the statement with the error / mistake:
Input > b := a / 2
Input > print(f(c))
Output > 1.0
With this approach, all descendants of the updated variable are found and re-evaluated.
Currently, there are some restrictions to this approach:
The set of references in the updated variable parameter must be identical with that in the parameter of the original variable. For example, in the example above, the update b := 1 / 2 is not valid because the reference to a is not used. Also, b := c + 1 is not valid because it includes a reference to c that is not in the original version of b.
A variable can be updated only once per model extension. The update becomes ambiguous otherwise. For example, the update b := 1 / a; b := a / 2 is not valid.
The variable may not be part of a parameter variation across several models (a model group). For example, if variable a is in such a variation that has been added with the statement vary ((a: 1, 2)), then it cannot be updated with a := 3 in further model extensions.
Variables containing if expressions or Boolean expressions with ? annotations as parameters, as well as the ancestors of such variables, cannot be updated.
Variables containing parallel map, filter and reduce cannot be updated.
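The first restriction, that the set of references must stay identical, can be illustrated with a short Python sketch. It uses Python's own ast module as a stand-in parser, which works only because the expressions in this example are also valid Python; textS expressions with units would need the real grammar. The helper names references and update_allowed are hypothetical.

```python
import ast


def references(expression):
    """Return the set of variable names referenced in an arithmetic expression."""
    tree = ast.parse(expression, mode="eval")
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}


def update_allowed(original, updated):
    """An update is allowed only if both parameters reference the same variables."""
    return references(original) == references(updated)


print(update_allowed("a / 0", "a / 2"))  # True: both reference only a
print(update_allowed("a / 0", "1 / 2"))  # False: the reference to a is dropped
print(update_allowed("a / 0", "c + 1"))  # False: c is a new reference
```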
The approach described here is recommended if the evaluation error is caused by the model input. If the error during evaluation is due to failure of computing nodes, network or file system, or other similar failures, then the evaluation can be rerun in an interactive session by using the %rerun magic.
Using physical constants
Physical constants are provided in the module constants in the virtmat namespace.
Example:
use speed_of_light from virtmat.constants
print(speed_of_light) # result: 1.0 [speed_of_light]
print(speed_of_light) [m/s] # result: 299792458.0 [meter / second]
print(speed_of_light [_base]) # result: 299792458.0 [meter / second]
print(speed_of_light [_compact]) # result: 1.0 [speed_of_light]
A list of the names of all currently provided constants can be retrieved from the same module:
use CONSTANTS from virtmat.constants
print(CONSTANTS)
Additionally, the definitions of these constants can be found here.
Using random numbers
Random number generators are automatically enabled in textS. To use a random sampling function, import it from the collection in virtmat.functions.random:
from virtmat.functions.random use <function name>
var = <function name> ([Table|Dict])
The function name is the name of the relevant function for random sampling from numpy.random. All sampling functions from numpy.random are supported except for bytes and shuffle.
The table, or dictionary, contains the parameters to pass to the function, according to the numpy.random specifications. The table (dictionary) is optional, i.e. a function can be called without parameters.
In addition to the parameters specified in the original numpy.random functions, all virtmat.functions.random functions accept two extra parameters: rng and seed.
The rng parameter is the name of one of the numpy bit generators, i.e. MT19937, PCG64, PCG64DXSM, Philox, or SFC64. If not specified, the default PCG64 is chosen. Over time, numpy might select another bit generator by default (see this link); if the generator has not been specified explicitly, the result can then no longer be reproduced.
The seed parameter is a 128-bit integer in hexadecimal format, e.g. 8ed8c93f1db74abebc32d228a25f7628. A random seed will be generated automatically if seed is not specified.
Reproducibility of the results is only guaranteed if rng and seed are explicitly specified, so it is recommended to always set these two parameters.
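The effect of fixing both parameters can be reproduced directly with numpy. The sketch below is an assumption about how rng and seed could map onto numpy's bit generators; the helper make_generator and the conversion of the hexadecimal string to an integer are hypothetical and not necessarily what virtmat.functions.random does internally.

```python
import numpy as np


def make_generator(rng="PCG64", seed=None):
    """Create a numpy Generator from a bit generator name and a hexadecimal seed string."""
    bit_generator_cls = getattr(np.random, rng)  # e.g. np.random.PCG64
    seed_int = int(seed, 16) if isinstance(seed, str) else seed
    return np.random.Generator(bit_generator_cls(seed_int))


SEED = "8ed8c93f1db74abebc32d228a25f7628"

# Two independent generators with the same rng and seed yield identical samples.
first = make_generator("PCG64", SEED).normal(loc=0.0, scale=0.1, size=3)
second = make_generator("PCG64", SEED).normal(loc=0.0, scale=0.1, size=3)
print(np.array_equal(first, second))  # True
```

Changing either the bit generator name or the seed generally produces a different sample, which is why both must be recorded to reproduce a result.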
The type of the returned (generated) value depends on the function (function name) and on the keyword size that may be specified in the table (or dictionary), as explained in the following table. The datatype (Integer or Float) depends on the specific function.
| size | Returned type | Example | Returned value (sample) |
|---|---|---|---|
| not specified | scalar Quantity | | |
| scalar | numeric Series | | |
| Tuple | numeric Array | | |
| Tuple | numeric Array | | |
The physical unit of the returned value depends on the units of some numerical inputs specified in the table. If all numerical inputs are dimensionless or no inputs are specified then the returned value is dimensionless. For example, the generated quantity stored in the distr variable,
from virtmat.functions.random use normal
distr = normal {loc: 0 [nm], scale: 0.2 [nm]}
will be in nanometers.
NOTE: Though it is possible to import the same functions directly from numpy.random, this is not recommended. For example, instead of using this code
from numpy.random use normal
s = normal(0.0, 0.1)
one should use
from virtmat.functions.random use normal
s = normal(((loc: 0.0), (scale: 0.1)))
Comments and white space
Comments are ignored and not interpreted. All input after the hash sign # up to the end of the same line is ignored. All input enclosed by a pair of three double quotes """ is ignored. White space is needed only to separate keywords from other input; otherwise white space is ignored.