utopya package#

The utopya package provides a simulation management and evaluation framework with the following components:

For a real-world example of how utopya can be integrated, have a look at the Utopia modelling framework which uses utopya as its frontend. For model implementations, the utopya_backend package can assist in building Python-based models that use utopya as a frontend.

Also visit the user manual front page for more information.

utopya.__version__ = '1.3.0b3'#

The utopya package version

Subpackages#

Submodules#

utopya._cluster module#

This module holds functions used in the Multiverse’s cluster mode

utopya._cluster.parse_node_list(node_list_str: str, *, mode: str, rcps: dict) List[str][source]#

Parses the node list to a list of node names and checks against the given resolved cluster parameters.

Depending on mode, different forms of the node list are parsable. For condensed mode:

node042
node[002,004-011,016]
m05s[0204,0402,0504]
m05s[0204,0402,0504],m08s[0504,0604,0701],m13s0603,m14s[0501-0502]
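
For illustration, a minimal sketch of invoking the parser in condensed mode; the exact keys expected in the resolved cluster parameters (here num_nodes and node_name) are assumptions:

from utopya._cluster import parse_node_list

# Hypothetical resolved cluster parameters; the actual required keys may differ
rcps = dict(num_nodes=10, node_name="node004")

nodes = parse_node_list("node[002,004-011,016]", mode="condensed", rcps=rcps)
# -> ["node002", "node004", ..., "node011", "node016"]  (10 node names)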

utopya._import_tools module#

Helper module that contains tools useful for module imports or manipulation of the system path. These are not implemented here but in dantro._import_tools.

Deprecated since version 1.0.1: This module is deprecated. Use dantro._import_tools or utopya_backend.tools.import_package_from_dir() instead.

utopya._logging module#

Sets up logging, based on dantro’s logging features

utopya._logging._log = <DantroLogger utopya (REMARK)>#

The utopya root logger

utopya._logging._DEFAULT_LOG_FORMAT = '%(levelname)-8s %(module)-16s  %(message)s'#

The default logging format

utopya._signal module#

Implements signalling-related functionality and globally relevant data

utopya._signal.SIGMAP = {'SIGABRT': 6, 'SIGALRM': 14, 'SIGBUS': 7, 'SIGCHLD': 17, 'SIGCLD': 17, 'SIGCONT': 18, 'SIGFPE': 8, 'SIGHUP': 1, 'SIGILL': 4, 'SIGINT': 2, 'SIGIO': 29, 'SIGIOT': 6, 'SIGKILL': 9, 'SIGPIPE': 13, 'SIGPOLL': 29, 'SIGPROF': 27, 'SIGPWR': 30, 'SIGQUIT': 3, 'SIGRTMAX': 64, 'SIGRTMIN': 34, 'SIGSEGV': 11, 'SIGSTOP': 19, 'SIGSYS': 31, 'SIGTERM': 15, 'SIGTRAP': 5, 'SIGTSTP': 20, 'SIGTTIN': 21, 'SIGTTOU': 22, 'SIGURG': 23, 'SIGUSR1': 10, 'SIGUSR2': 12, 'SIGVTALRM': 26, 'SIGWINCH': 28, 'SIGXCPU': 24, 'SIGXFSZ': 25, 'SIG_BLOCK': 0, 'SIG_DFL': 0, 'SIG_IGN': 1, 'SIG_SETMASK': 2, 'SIG_UNBLOCK': 1}#

A map from signal names to the corresponding integers, as defined in Python's signal module
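
For instance, shells conventionally report termination by signal N as exit code 128 + N; a sketch of using SIGMAP for that translation (the 128 + N convention is POSIX shell behavior, not part of the utopya API):

from utopya._signal import SIGMAP

signum = SIGMAP["SIGTERM"]       # 15
shell_exit_code = 128 + signum   # 143, the conventional shell exit code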

utopya._yaml module#

Supplies basic YAML interface, inherited from yayaml

utopya.batch module#

Implements batch running and evaluation of simulations

utopya.batch._BTM_BASE_CFG_PATH = '/home/docs/checkouts/readthedocs.org/user_builds/utopya/checkouts/latest/utopya/cfg/btm_cfg.yml'#

Base configuration path of the batch task manager

utopya.batch._BTM_BASE_CFG = {'cluster_mode': False, 'debug': False, 'parallelization_level': 'batch', 'paths': {'note': None, 'out_dir': '~/utopya_output/_batch'}, 'reporter': {'report_formats': {'progress_bar': {'info_fstr': '{total_progress:>5.1f}% ({cnt[finished]} / {cnt[total]})', 'min_report_intv': 0.5, 'num_cols': 'adaptive', 'parser': 'progress_bar', 'show_times': True, 'times_fstr': '| {elapsed} elapsed', 'times_fstr_final': '| finished in {elapsed:} ', 'times_kwargs': {'mode': 'from_start'}, 'write_to': 'stdout_noreturn'}, 'report_file': {'min_num': 4, 'min_report_intv': 10, 'parser': 'report', 'show_individual_runtimes': True, 'task_label_plural': 'tasks', 'task_label_singular': 'task', 'write_to': {'file': {'path': '_report.txt'}}}}}, 'run_kwargs': {'timeout': None}, 'task_defaults': {'eval': {'create_symlinks': True, 'data_manager': {'out_dir_kwargs': {'exist_ok': False}}, 'out_dir': '{task_name:}', 'plot_manager': {'raise_exc': True}, 'priority': None}, 'run': {}}, 'tasks': {'eval': {}, 'run': {}}, 'worker_kwargs': {'forward_raw': True, 'forward_streams': False, 'popen_kwargs': {}, 'remove_ansi': True, 'save_raw': True, 'save_streams': True, 'streams_log_lvl': None}, 'worker_manager': {'interrupt_params': {'exit': False, 'grace_period': 5.0, 'send_signal': 'SIGTERM'}, 'lines_per_poll': 20, 'nonzero_exit_handling': 'warn_all', 'num_workers': 'auto', 'periodic_task_callback': 20, 'poll_delay': 0.05, 'rf_spec': {'after_abort': ['progress_bar', 'report_file'], 'after_work': ['progress_bar', 'report_file'], 'before_working': [], 'monitor_updated': ['progress_bar'], 'task_finished': ['progress_bar', 'report_file'], 'task_spawned': ['progress_bar'], 'while_working': ['progress_bar']}, 'save_streams_on': ['periodic_callback']}}#

Actual base configuration of the batch task manager

utopya.batch._BTM_USER_DEFAULTS = {}#

User defaults for the batch task manager

utopya.batch._BTM_DEFAULTS = {'cluster_mode': False, 'debug': False, 'parallelization_level': 'batch', 'paths': {'note': None, 'out_dir': '~/utopya_output/_batch'}, 'reporter': {'report_formats': {'progress_bar': {'info_fstr': '{total_progress:>5.1f}% ({cnt[finished]} / {cnt[total]})', 'min_report_intv': 0.5, 'num_cols': 'adaptive', 'parser': 'progress_bar', 'show_times': True, 'times_fstr': '| {elapsed} elapsed', 'times_fstr_final': '| finished in {elapsed:} ', 'times_kwargs': {'mode': 'from_start'}, 'write_to': 'stdout_noreturn'}, 'report_file': {'min_num': 4, 'min_report_intv': 10, 'parser': 'report', 'show_individual_runtimes': True, 'task_label_plural': 'tasks', 'task_label_singular': 'task', 'write_to': {'file': {'path': '_report.txt'}}}}}, 'run_kwargs': {'timeout': None}, 'task_defaults': {'eval': {'create_symlinks': True, 'data_manager': {'out_dir_kwargs': {'exist_ok': False}}, 'out_dir': '{task_name:}', 'plot_manager': {'raise_exc': True}, 'priority': None}, 'run': {}}, 'tasks': {'eval': {}, 'run': {}}, 'worker_kwargs': {'forward_raw': True, 'forward_streams': False, 'popen_kwargs': {}, 'remove_ansi': True, 'save_raw': True, 'save_streams': True, 'streams_log_lvl': None}, 'worker_manager': {'interrupt_params': {'exit': False, 'grace_period': 5.0, 'send_signal': 'SIGTERM'}, 'lines_per_poll': 20, 'nonzero_exit_handling': 'warn_all', 'num_workers': 'auto', 'periodic_task_callback': 20, 'poll_delay': 0.05, 'rf_spec': {'after_abort': ['progress_bar', 'report_file'], 'after_work': ['progress_bar', 'report_file'], 'before_working': [], 'monitor_updated': ['progress_bar'], 'task_finished': ['progress_bar', 'report_file'], 'task_spawned': ['progress_bar'], 'while_working': ['progress_bar']}, 'save_streams_on': ['periodic_callback']}}#

Aggregated and recursively updated default batch task manager config

utopya.batch.INVALID_TASK_NAME_CHARS = ('/', ':', '.', '?', '*')#

Substrings that may not appear in task names

utopya.batch._eval_task(*, task_name: str, _batch_name: str, _batch_dirs: dict, _task_cfg_path: str, _create_symlinks: bool, model_name: str, model_kwargs: dict = {}, use_data_tree_cache: Optional[bool] = None, print_tree: Union[bool, str] = 'condensed', plot_only: Optional[Sequence[str]] = None, plots_cfg: Optional[str] = None, update_plots_cfg: dict = {}, **frozen_mv_kwargs)[source]#

The evaluation task target for the multiprocessing.Process. It sets up a utopya.model.Model, loads the data, and performs the plotting.

class utopya.batch.BatchTaskManager(*, batch_cfg_path: Optional[str] = None, **update_batch_cfg)[source]#

Bases: object

A manager for batch tasks

RUN_DIR_TIME_FSTR = '%y%m%d-%H%M%S'#

The time format string for the run directory

__init__(*, batch_cfg_path: Optional[str] = None, **update_batch_cfg)[source]#

Sets up a BatchTaskManager.

Parameters:
  • batch_cfg_path (str, optional) – The batch file with all the task definitions.

  • **update_batch_cfg – Additional arguments that are used to update the batch configuration.

Raises:

NotImplementedError – If run_tasks or cluster_mode were set in the batch configuration.

property debug: bool#

Whether debug mode was enabled.

property parallelization_level: str#
property run_defaults: dict#

A deepcopy of the run task defaults

property eval_defaults: dict#

A deepcopy of the eval task defaults

property dirs: dict#

The directories associated with this BatchTaskManager

perform_tasks()[source]#

Perform all run and eval tasks.
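
A minimal usage sketch; the batch configuration file path (and the tasks defined within it) are hypothetical:

from utopya.batch import BatchTaskManager

btm = BatchTaskManager(batch_cfg_path="~/my_batch_tasks.yml")
btm.perform_tasks()   # performs all run and eval tasks defined in the batch config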

static _setup_batch_cfg(batch_cfg_path: str, **update_batch_cfg) dict[source]#

Sets up the BatchTaskManager configuration

_setup_dirs(out_dir: str, note: Optional[str] = None) Tuple[Dict[str, str], str][source]#

Sets up directories

_perform_backup(**parts)[source]#

Stores the given configuration parts in the config directory

_add_tasks(tasks: dict, defaults: dict, add_task: Callable) int[source]#

Adds all configured tasks to the WorkerManager’s task queue

_add_run_task(name: str, **_)[source]#

Adds a single run task to the WorkerManager

_add_eval_task(name: str, *, model_name: str, out_dir: str, enabled: bool = True, priority: Optional[int] = None, create_symlinks: bool = False, **eval_task_kwargs)[source]#

Adds a single evaluation task to the WorkerManager.

Parameters:
  • name (str) – Name of this task

  • model_name (str) – Model name; required by the task, hence already required here.

  • out_dir (str) – The path to the data output directory, i.e. the directory where all plots will end up. This may be a format string containing any of the following keys: task_name, model_name, timestamp, batch_name (a combination of timestamp and the note). Relative paths are evaluated relative to the eval batch run directory.

  • enabled (bool, optional) – If False, will not add this task.

  • priority (int, optional) – Task priority; tasks with smaller value will be picked first.

  • create_symlinks (bool, optional) – Whether to create symlinks that crosslink related directories, e.g. linking from the output directory back to the task configuration, or from the evaluation output directory (alongside the simulation data) to the batch output directory.

  • **eval_task_kwargs – All further evaluation task arguments.

utopya.cfg module#

Module that coordinates utopya’s persistent config directory

utopya.cfg.UTOPYA_CFG_DIR = '/home/docs/.config/utopya'#

Path to the persistent utopya configuration directory

utopya.cfg.UTOPYA_CFG_FILE_NAMES = {'batch': 'batch_cfg.yml', 'user': 'user_cfg.yml', 'utopya': 'utopya_cfg.yml'}#

Names and paths of valid configuration entries

utopya.cfg.UTOPYA_CFG_FILE_PATHS = {'batch': '/home/docs/.config/utopya/batch_cfg.yml', 'user': '/home/docs/.config/utopya/user_cfg.yml', 'utopya': '/home/docs/.config/utopya/utopya_cfg.yml'}#

Absolute configuration file paths

utopya.cfg.UTOPYA_CFG_SUBDIR_NAMES = {'models': 'models', 'projects': 'projects'}#

Names and paths of valid configuration subdirectories

utopya.cfg.UTOPYA_CFG_SUBDIRS = {'models': '/home/docs/.config/utopya/models', 'projects': '/home/docs/.config/utopya/projects'}#

Absolute paths to the configuration subdirectories

utopya.cfg.PROJECT_INFO_FILE_SEARCH_PATHS = ('.utopya_project.yml', '.utopya-project.yml')#

Potential names of project info files, relative to base directory

utopya.cfg.get_cfg_path(cfg_name: str) str[source]#

Returns the absolute path to the specified configuration file

utopya.cfg.load_from_cfg_dir(cfg_name: str) dict[source]#

Load a configuration file; returns empty dict if no file exists.

Parameters:

cfg_name (str) – The name of the configuration to read

Returns:

The configuration as read from the config directory; if no file is available, will return an empty dict.

Return type:

dict

utopya.cfg.write_to_cfg_dir(cfg_name: str, obj: dict)[source]#

Writes a YAML representation of the given object to the configuration directory. Always overwrites a possibly existing file.

Parameters:
  • cfg_name (str) – The configuration name

  • obj (dict) – The yaml-representable object that is to be written; usually a dict.
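
A read-modify-write sketch using the two functions above (the added entry is hypothetical):

from utopya.cfg import load_from_cfg_dir, write_to_cfg_dir

cfg = load_from_cfg_dir("utopya")       # returns {} if no file exists yet
cfg["my_entry"] = dict(enabled=True)    # hypothetical configuration entry
write_to_cfg_dir("utopya", cfg)         # overwrites any existing file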

utopya.exceptions module#

utopya-specific exception types

exception utopya.exceptions.UtopyaException[source]#

Bases: BaseException

Base class for utopya-specific exceptions

exception utopya.exceptions.ValidationError[source]#

Bases: UtopyaException, ValueError

Raised upon failure to validate a parameter

exception utopya.exceptions.WorkerManagerError[source]#

Bases: UtopyaException

The base exception class for WorkerManager errors

exception utopya.exceptions.WorkerManagerTotalTimeout[source]#

Bases: WorkerManagerError

Raised when a total timeout occurred

exception utopya.exceptions.WorkerTaskError[source]#

Bases: WorkerManagerError

Raised when there was an error in a WorkerTask

exception utopya.exceptions.WorkerTaskNonZeroExit(task: utopya.task.WorkerTask, *args, **kwargs)[source]#

Bases: WorkerTaskError

Can be raised when a WorkerTask exited with a non-zero exit code.

__init__(task: utopya.task.WorkerTask, *args, **kwargs)[source]#

Initialize an error handling non-zero exit codes from workers

__str__() str[source]#

Returns information on the error

exception utopya.exceptions.WorkerTaskStopConditionFulfilled(task: utopya.task.WorkerTask, *args, **kwargs)[source]#

Bases: WorkerTaskNonZeroExit

An exception that is raised when a worker-specific stop condition was fulfilled. This allows it to be handled separately from other non-zero exits.

exception utopya.exceptions.YAMLRegistryError[source]#

Bases: UtopyaException, ValueError

Base class for errors in YAMLRegistry

exception utopya.exceptions.EntryExistsError[source]#

Bases: YAMLRegistryError

Raised if an entry already exists

exception utopya.exceptions.MissingEntryError[source]#

Bases: YAMLRegistryError

Raised if an entry is missing

exception utopya.exceptions.MissingRegistryError[source]#

Bases: YAMLRegistryError

Raised if a registry is missing

exception utopya.exceptions.EntryValidationError[source]#

Bases: YAMLRegistryError

Raised upon failed validation of a registry entry

exception utopya.exceptions.SchemaValidationError[source]#

Bases: YAMLRegistryError

If schema validation failed

exception utopya.exceptions.ModelRegistryError[source]#

Bases: UtopyaException, ValueError

Raised on errors with model registry

exception utopya.exceptions.MissingModelError[source]#

Bases: ModelRegistryError

Raised when a model is missing

exception utopya.exceptions.BundleExistsError[source]#

Bases: ModelRegistryError

Raised when a bundle that compared equal already exists

exception utopya.exceptions.MissingBundleError[source]#

Bases: ModelRegistryError

Raised when a bundle is missing

exception utopya.exceptions.BundleValidationError[source]#

Bases: ModelRegistryError

Raised when validation of a bundle failed

exception utopya.exceptions.ProjectRegistryError[source]#

Bases: UtopyaException, ValueError

Raised on errors with project registry

exception utopya.exceptions.MissingProjectError[source]#

Bases: ProjectRegistryError

Raised on a missing project

exception utopya.exceptions.ProjectExistsError[source]#

Bases: ProjectRegistryError

Raised if a project or project file of that name already exists

utopya.model module#

Provides the Model to work interactively with registered utopya models

class utopya.model.Model(*, name: Optional[str] = None, info_bundle: Optional[ModelInfoBundle] = None, bundle_label: Optional[str] = None, base_dir: Optional[str] = None, sim_errors: Optional[str] = None, use_tmpdir: bool = False)[source]#

Bases: object

A class to work with Utopia models interactively.

It attaches to a certain model and makes it easy to load config files, create a Multiverse from them, run it, and work with it further…

CONFIG_SET_MODEL_SOURCE_SUBDIRS = ('cfgs', 'cfg_sets', 'config_sets')#

Directories within the model source directories to search through when looking for configuration sets. These are not used if the utopya config contains an entry overwriting this.

__init__(*, name: Optional[str] = None, info_bundle: Optional[ModelInfoBundle] = None, bundle_label: Optional[str] = None, base_dir: Optional[str] = None, sim_errors: Optional[str] = None, use_tmpdir: bool = False)[source]#

Initialize the Model for the given model name

Parameters:
  • name (str, optional) – Name of the model to attach to. If not given, need to pass info_bundle.

  • info_bundle (ModelInfoBundle, optional) – The required information to work with this model. If not given, will attempt to find the model in the model registry via name or bundle_label.

  • bundle_label (str, optional) – A label to use for identifying the info bundle.

  • base_dir (str, optional) – For convenience, can specify this path which will be seen as the base path for config files; if set, arguments that allow specifying configuration files can specify them relative to this directory.

  • sim_errors (str, optional) – Whether to raise errors from Multiverse

  • use_tmpdir (bool, optional) – Whether to use a temporary directory to write data to. The default value can be set here; the flag can be overwritten in the create_mv and create_run_load methods. If False, the regular model output directory is used.

Raises:

ValueError – Upon bad base_dir

__str__() str[source]#

Returns an informative string for this Model instance

property info_bundle: ModelInfoBundle#

The model info bundle

property name: str#

The name of this Model object, which is at the same time the name of the attached model.

property base_dir: str#

Returns the path to the base directory, if set during init.

This is the path to a directory from which config files can be loaded using relative paths.

property default_model_cfg: dict#

Returns the default model configuration by loading it from the file specified in the info bundle.

property default_config_set_search_dirs: List[str]#

Returns the default config set search directories for this model in the order of precedence:

  • defined on the project-level via cfg_set_abs_search_dirs; these may also be format strings supporting the following set of keys: model_name, project_base_dir, and model_source_dir (if set). If no project is associated, there will be no additional search directories.

  • names of subdirectories relative to the model source directory, defined in cfg_set_model_source_subdirs. If no model source directory is known, no search directories will be added. If no project is associated, a standard set of search directories is used: cfgs, cfg_sets, config_sets, as defined in CONFIG_SET_MODEL_SOURCE_SUBDIRS.

Note

The output may contain relative paths.

property default_config_sets: Dict[str, dict]#

Config sets at the default search locations.

To retrieve an individual config set, consider using get_config_set() instead of this property.

For more information, see Configuration Sets.

create_mv(*, from_cfg: Optional[str] = None, from_cfg_set: Optional[str] = None, run_cfg_path: Optional[str] = None, use_tmpdir: Optional[bool] = None, **update_meta_cfg) Multiverse[source]#

Creates a utopya.multiverse.Multiverse for this model, optionally loading a configuration from a file and updating it with further keys.

Parameters:
  • from_cfg (str, optional) – The name of the config file (relative to the base directory) to be used.

  • from_cfg_set (str, optional) – Name of the config set to retrieve the run config from. Mutually exclusive with from_cfg and run_cfg_path.

  • run_cfg_path (str, optional) – The path of the run config to use. Cannot be passed if the from_cfg or from_cfg_set arguments were given.

  • use_tmpdir (bool, optional) – Whether to use a temporary directory to write the data to. If not given, uses default value set at initialization.

  • **update_meta_cfg – Can be used to update the meta configuration

Returns:

The created Multiverse object

Return type:

Multiverse

Raises:

ValueError – If more than one of the run config selecting arguments (from_cfg, from_cfg_set, run_cfg_path) was given.

create_frozen_mv(**fmv_kwargs) FrozenMultiverse[source]#

Create a utopya.multiverse.FrozenMultiverse, coupling it to a run directory.

Use this method if you want to load an existing simulation run.

Parameters:

**fmv_kwargs – Passed on to utopya.multiverse.FrozenMultiverse.__init__()

create_distributed_mv(**dmv_kwargs) DistributedMultiverse[source]#

Create a utopya.multiverse.DistributedMultiverse, coupling it to a run directory.

Use this method if you want to work on an existing simulation run, e.g. to run individual universes from it.

Parameters:

**dmv_kwargs – Passed on to utopya.multiverse.DistributedMultiverse.__init__()

create_run_load(*, from_cfg: Optional[str] = None, run_cfg_path: Optional[str] = None, from_cfg_set: Optional[str] = None, use_tmpdir: Optional[bool] = None, print_tree: bool = True, **update_meta_cfg) Tuple[Multiverse, DataManager][source]#

Chains the create_mv(), mv.run, and mv.dm.load_from_cfg method calls together and returns a (Multiverse, DataManager) tuple.

Parameters:
  • from_cfg (str, optional) – The name of the config file (relative to the base directory) to be used.

  • from_cfg_set (str, optional) – Name of the config set to retrieve the run config from. Mutually exclusive with from_cfg and run_cfg_path.

  • run_cfg_path (str, optional) – The path of the run config to use. Cannot be passed if the from_cfg or from_cfg_set arguments were given.

  • use_tmpdir (bool, optional) – Whether to use a temporary directory to write the data to. If not given, uses default value set at initialization.

  • print_tree (bool, optional) – Whether to print the loaded data tree

  • **update_meta_cfg – Arguments passed to the create_mv function

Returns:

The created Multiverse and the corresponding DataManager (with data already loaded).

Return type:

Tuple[Multiverse, DataManager]
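
A sketch of the workflow this method condenses (model and config set names are hypothetical):

from utopya.model import Model

model = Model(name="MyModel", use_tmpdir=True)
mv, dm = model.create_run_load(from_cfg_set="my_cfg_set")
# dm now provides access to the loaded data tree of the finished run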

get_config_set(name: Optional[str] = None) Dict[str, str][source]#

Returns a configuration set: a dict containing paths to run and/or eval configuration files. These are accessible via the keys run and eval.

Config sets are retrieved from multiple locations:

  • The cfgs directory in the model’s source directory

  • The user-specified lookup directories, specified in the utopya configuration as config_set_search_dirs

  • If name is an absolute or relative path, and a directory exists at the specified location, the parent directory is interpreted as a search path.

This uses get_config_sets() to retrieve all available configuration sets from the above paths and then selects the one with the given name. Config sets that are found later overwrite those with the same name found in previous searches and log a warning message (which can be controlled with the warn argument); sets are not merged.

For more information, see Configuration Sets.

Parameters:

name (str, optional) – The name of the config set to retrieve. This may also be a local path, which is looked up prior to the default search directories.
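
A retrieval sketch (model and config set names are hypothetical):

from utopya.model import Model

model = Model(name="MyModel")
cfg_set = model.get_config_set("my_cfg_set")
run_cfg_path = cfg_set.get("run")    # path to the run config, if the set provides one
eval_cfg_path = cfg_set.get("eval")  # path to the eval config, if the set provides one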

get_config_sets(*, search_dirs: Optional[List[str]] = None, warn: bool = True, cfg_sets: Optional[dict] = None) Dict[str, dict][source]#

Searches for all available configuration sets in the given search directories, aggregating them into one dict.

The search is done in reverse order of the paths given in search_dirs, i.e. starting from those directories with the lowest precedence. If configuration sets with the same name are encountered, warnings are emitted, but the one with higher precedence (appearing more towards the front of search_dirs, i.e. the later-searched one) will take precedence.

Note

This will not merge configuration sets from different search directories, e.g. if one contained only an eval configuration and the other contained only a run configuration, a warning will be emitted but the entry from the later-searched directory will be used.

Parameters:
  • search_dirs (List[str], optional) – The directories to search sequentially for config sets. If not given, will use the default config set search directories, see default_config_set_search_dirs.

  • warn (bool, optional) – Whether to warn (via log message), if the search yields a config set with a name that already existed.

  • cfg_sets (dict, optional) – If given, aggregate newly found config sets into this dict. Otherwise, start with an empty one.

_store_mv(mv: Multiverse, **kwargs) None[source]#

Stores a created Multiverse object and all the kwargs in a dict

_create_tmpdir() TemporaryDirectory[source]#

Create a TemporaryDirectory

_find_config_sets(search_dir: str, *, cfg_sets: dict, warn: bool = True) Dict[str, dict][source]#

Looks for config sets in the given directory and aggregates them into the given cfg_sets dict, warning if an entry already exists.

Parameters:
  • search_dir (str) – The directory to search for configuration sets. Can be an absolute or relative path; ~ is expanded.

  • cfg_sets (dict) – The dict to populate with the results, each entry being one config set.

  • warn (bool, optional) – Whether to warn (via log message) if an entry already exists.

utopya.multiverse module#

Implementation of the Multiverse class, which sits at the heart of utopya and supplies the main user interface for the frontend. It allows running a simulation and then evaluating it.

class utopya.multiverse.Multiverse(*, model_name: Optional[str] = None, info_bundle: Optional[ModelInfoBundle] = None, run_cfg_path: Optional[str] = None, user_cfg_path: Optional[str] = None, _shared_worker_manager: Optional[WorkerManager] = None, **update_meta_cfg)[source]#

Bases: object

The Multiverse is where a single simulation run is orchestrated from.

It spawns multiple universes, each of which represents a single simulation of the selected model with the parameters specified by the meta configuration. The WorkerManager takes care of performing these simulations in parallel.

The Multiverse then interfaces with the dantro data processing pipeline using classes specialized in utopya.eval: The DataManager loads the created simulation output, making it available in a uniformly accessible and hierarchical data tree. Then, the PlotManager handles plotting of that data.

BASE_META_CFG_PATH = '/home/docs/checkouts/readthedocs.org/user_builds/utopya/checkouts/latest/utopya/cfg/base_cfg.yml'#

Where the default meta configuration can be found

USER_CFG_SEARCH_PATH = '/home/docs/.config/utopya/user_cfg.yml'#

Where to look for the user configuration

RUN_DIR_TIME_FSTR = '%y%m%d-%H%M%S'#

The time format string for the run directory

UTOPYA_BASE_PLOTS_PATH = '/home/docs/checkouts/readthedocs.org/user_builds/utopya/checkouts/latest/utopya/cfg/base_plots.yml'#

Where the utopya base plots configuration can be found; this is passed to the PlotManager.

__init__(*, model_name: Optional[str] = None, info_bundle: Optional[ModelInfoBundle] = None, run_cfg_path: Optional[str] = None, user_cfg_path: Optional[str] = None, _shared_worker_manager: Optional[WorkerManager] = None, **update_meta_cfg)[source]#

Initialize the Multiverse.

Parameters:
  • model_name (str, optional) – The name of the model to run

  • info_bundle (ModelInfoBundle, optional) – The model information bundle that includes information about the executable path etc. If not given, will attempt to read it from the model registry.

  • run_cfg_path (str, optional) – The path to the run configuration.

  • user_cfg_path (str, optional) – If given, this is used to update the base configuration. If None, will look for it in the default path, see Multiverse.USER_CFG_SEARCH_PATH.

  • _shared_worker_manager (WorkerManager, optional) –

    If given, this already existing WorkerManager instance (and its reporter) will be used instead of initializing new instances.

    Warning

    This argument is only exposed for internal purposes. It should not be used for production code and behavior of this argument may change at any time.

  • **update_meta_cfg – Can be used to update the meta configuration generated from the previous configuration levels

property debug_level: int#

The debug level

property info_bundle: ModelInfoBundle#

The model info bundle for this Multiverse

property model_name: str#

The model name associated with this Multiverse

property model_executable: str#

The path to the model executable

property model: utopya.model.Model#

A model instance, created ad-hoc using the associated info bundle

property meta_cfg: dict#

The meta configuration.

property dirs: dict#

Information on managed directories.

property cluster_mode: bool#

Whether the Multiverse should run in cluster mode

property cluster_params: dict#

Returns a copy of the cluster mode configuration parameters

property resolved_cluster_params: dict#

Returns a copy of the cluster configuration with all parameters resolved. This makes some additional keys available on the top level.

property dm: DataManager#

The Multiverse’s DataManager.

property wm: WorkerManager#

The Multiverse’s WorkerManager.

property pm: PlotManager#

The Multiverse’s PlotManager.

run(*, sweep: Optional[bool] = None)[source]#

Starts a simulation run.

Specifically, this method adds simulation tasks to the associated WorkerManager, locks its task list, and then invokes the start_working() method which performs all the simulation tasks.

If cluster mode is enabled, this will split up the parameter space into (ideally) equally sized parts and only run one of these parts, depending on the cluster node this Multiverse is being invoked on.

Note

As this method locks the task list of the WorkerManager, no further tasks can be added henceforth. This means that each Multiverse instance can only perform a single simulation run.

Parameters:

sweep (bool, optional) – Whether to perform a sweep or not. If None, the value will be read from the perform_sweep key of the meta-configuration.

run_single()[source]#

Runs a single simulation using the parameter space’s default value.

See run() for more information.

run_sweep()[source]#

Runs a parameter sweep.

See run() for more information.
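
A sketch of a direct simulation run (model name and config path are hypothetical):

from utopya.multiverse import Multiverse

mv = Multiverse(model_name="MyModel", run_cfg_path="my_run_cfg.yml")
mv.run_sweep()   # or mv.run_single(); plain mv.run() defers to the perform_sweep entry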

renew_plot_manager(**update_kwargs)[source]#

Tries to set up a new PlotManager. If this succeeds, the old one is discarded and the new one is associated with this Multiverse.

Parameters:

**update_kwargs – Passed on to PlotManager.__init__

_create_meta_cfg(*, run_cfg_path: str, user_cfg_path: str, update_meta_cfg: dict) dict[source]#

Create the meta configuration from several parts and store it.

The final configuration dict is built from multiple components, where one is always recursively updating the previous level. The resulting configuration is the meta configuration and is stored to the meta_cfg attribute.

The parts are recorded in the cfg_parts dict and returned such that a backup can be created.

Parameters:
  • run_cfg_path (str) – path to the run configuration

  • user_cfg_path (str) – path to the user configuration file

  • update_meta_cfg (dict) – will be used to update the resulting dict

Returns:

dict of the parts that were needed to create the meta config.

The dict key corresponds to the part name; the value is the payload, which can be either a path to a cfg file or a dict.

Return type:

dict

_apply_debug_level(lvl: Optional[int] = None)[source]#

Depending on the debug level, applies certain settings to the Multiverse and the runtime environment.

Note

This does not (yet) set the corresponding debug flags for the PlotManager, DataManager, or WorkerManager!

_create_run_dir(*, out_dir: str, model_note: Optional[str] = None, dir_permissions: Optional[dict] = None) None[source]#

Create the folder structure for the run output.

For the chosen model name and current timestamp, the run directory will be of the form <timestamp>_<model_note> and will be part of the following directory tree:

utopya_output
    model_a
        180301-125410_my_model_note
            config
            data
                uni000
                uni001
                ...
            eval
    model_b
        180301-125412_my_first_sim
        180301-125413_my_second_sim

If running in cluster mode, the cluster parameters are resolved and used to determine the name of the simulation. The pattern then does not include a timestamp, as each node might not return quite the same value; instead, a value from an environment variable is used. The resulting path can take different forms, depending on which environment variables were present. Required parts are denoted by a * in the following pattern; if the value of another entry is not available, the connecting underscore is omitted:

{timestamp}_{job id*}_{cluster}_{job account}_{job name}_{note}
Parameters:
  • out_dir (str) – The base output directory, where all Utopia output is stored.

  • model_note (str, optional) – The note to add to the run directory of the current run.

  • dir_permissions (Dict[str, str]) – If given, will set directory permissions on the specified managed directories of this Multiverse. The keys of this dict should be entries of the dirs attribute, values should be octal permissions values given as a string.

Raises:

RuntimeError – If the simulation directory already existed. This should not occur, as the timestamp is unique. If it occurs, you either started two simulations very close to each other or something is seriously wrong. Strange time zone perhaps?

_get_run_dir(*, out_dir: str, run_dir: str, **__)[source]#

Helper function to find the run directory from arguments given to __init__(). This is not actually used in Multiverse but in FrozenMultiverse and DistributedMultiverse.

Parameters:
  • out_dir (str) – The output directory

  • run_dir (str) – The run directory to use

  • **__ – ignored

Raises:
  • IOError – No directory found to use as run directory

  • TypeError – When run_dir was not a string

_setup_pm(**update_kwargs) PlotManager[source]#

Helper function to setup a PlotManager instance

_parse_base_cfg_pools(base_cfg_pools: List[Union[str, Tuple[str, Union[dict, str]]]]) List[Tuple[str, Union[dict, str]]][source]#

Prepares the base_cfg_pools argument to be valid input to the PlotManager. This method resolves format strings and thus allows base config pools to be defined more generically.

Possible formats for each entry of base_cfg_pools argument are:

  • A 2-tuple (name, pool dict) which specifies the name of the base config pool alongside with the pool entries.

  • A 2-tuple (name, path to pool config file), which is later loaded by the PlotManager

  • A shortcut key which resolves to the corresponding 2-tuple. Available shortcuts are: utopya_base, framework_base, project_base, and model_base.

Both the pool name and path may be format strings which get resolved with the model_name key and (in the case of the path) the full paths dict of the current model’s info bundle. A format string may look like this:

“{paths[source_dir]}/{model_name}_more_plots.yml”
“~/some/more/plots/{model_name}/plots.yml”

If such a path cannot be resolved, an error is logged and an empty pool is used instead; this allows for more flexibility in defining locations for additional config pools.

Parameters:

base_cfg_pools (List[Union[str, Tuple[str, Union[str, dict]]]]) – The unparsed specification of config pools.
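
Following the formats listed above, a valid specification might look like this (pool names, the path, and the pool entries are hypothetical):

base_cfg_pools = [
    "utopya_base",                                                    # shortcut key
    ("project_plots", "{paths[source_dir]}/{model_name}_plots.yml"),  # (name, path to pool config file)
    ("extra_plots", {"my_plot": {"based_on": "my_base_plot"}}),       # (name, pool dict)
]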

_perform_backup(*, cfg_parts: dict, backup_cfg_files: bool = True, backup_executable: bool = False, include_git_info: bool = True) None[source]#

Performs a backup of the information that can be used to recreate a simulation.

The configuration files are backed up into the config subdirectory of the run directory. All other relevant information is stored in an additionally created backup subdirectory.

Warning

These backups are created prior to the start of the actual simulation run and contain information known at that point. Any changes to the meta configuration made after initialization of the Multiverse will not be reflected in these backups.

In particular, the perform_sweep and parameter_space entries of the meta configuration may not reflect which form of parameter space iteration was actually performed, because the run_single and run_sweep methods overwrite this behavior. To that end, that information is separately stored once the run methods are invoked.

Parameters:
  • cfg_parts (dict) – A dict of either paths to configuration files or dict-like data that is to be dumped into a configuration file.

  • backup_cfg_files (bool, optional) – Whether to backup the individual configuration files (i.e. the cfg_parts information). If false, the meta configuration will still be backed up.

  • backup_executable (bool, optional) – Whether to backup the executable. Note that these files can sometimes be quite large.

  • include_git_info (bool, optional) – If True, will store information about the state of the project’s (and framework’s, if existent) git repository.

_perform_pspace_backup(pspace: ParamSpace, *, filename: str = 'parameter_space', **info_kwargs)[source]#

Stores the given parameter space and its metadata into the config directory. Two files will be produced:

  • config/{filename}.yml: the passed pspace object

  • config/{filename}_info.yml: the passed pspace object’s info dictionary, containing relevant metadata (and the additionally passed info_kwargs)

Note

This method is separated from the regular backup method Multiverse._perform_backup() because the parameter space that is used during a simulation run may be a lower-dimensional version of the one the Multiverse was initialized with. To that end, run() will invoke this backup function again once the relevant information is fully determined. This is important because it is needed to communicate the correct information about the sweep to objects downstream in the pipeline (e.g. MultiversePlotCreator).

Parameters:
  • pspace (paramspace.paramspace.ParamSpace) – The ParamSpace object to save as backup.

  • filename (str, optional) – The filename (without extension!) to use. (This is also used for the log message, with underscores changed to spaces.)

  • **info_kwargs – Additional kwargs that are to be stored in the metadata dict.

_prepare_executable(*, run_from_tmpdir: bool = False) None[source]#

Prepares the model executable, potentially copying it to a temporary location.

Note that run_from_tmpdir requires the executable to be relocatable to another location, i.e. be position-independent.

Parameters:

run_from_tmpdir (bool, optional) – Whether to copy the executable to a temporary directory that goes out of scope once the Multiverse instance goes out of scope.

_resolve_cluster_params() dict[source]#

This resolves the cluster parameters, e.g. by setting parameters depending on certain environment variables. This function is called by the resolved_cluster_params property.

Returns:

The resolved cluster configuration parameters

Return type:

dict

Raises:

ValueError – If a required environment variable was missing or empty

_setup_universe(*, worker_kwargs: dict, model_name: str, model_executable: str, uni_cfg: dict, uni_basename: str) dict[source]#

Setup function for individual universes.

This is called before the worker process starts working on the universe.

Parameters:
  • worker_kwargs (dict) – the current status of the worker_kwargs dictionary; is always passed to a task setup function

  • model_name (str) – The name of the model

  • model_executable (str) – path to the binary to execute

  • uni_cfg (dict) – the configuration from which to create a YAML file, which is then needed by the model

  • uni_basename (str) – Basename of the universe to use for folder creation, i.e.: zero-padded universe number, e.g. uni0042

Returns:

kwargs for the process to be run when the task is grabbed by the Worker.

Return type:

dict

_setup_universe_dir(uni_dir: str, *, uni_basename: str) None[source]#

Determines the universe directory and, if needed, creates it.

This is invoked from _setup_universe() and is carried out directly before work on that universe starts.

Parameters:

uni_basename (str) – The basename of the universe to create the run directory for.

_setup_universe_config(*, uni_cfg: dict, uni_dir: str, uni_cfg_path: str) dict[source]#

Sets up the universe configuration and writes it to a file.

This is invoked from _setup_universe() and is carried out directly before work on that universe starts.

Parameters:
  • uni_cfg (dict) – The given universe configuration

  • uni_dir (str) – The universe directory, added to the configuration

  • uni_cfg_path (str) – Where to store the uni configuration at

Returns:

The (potentially updated) universe configuration

Return type:

dict

_setup_universe_worker_kwargs(*, model_executable: str, uni_cfg_path: str, uni_cfg: dict, uni_dir: str, save_streams: bool = False, **worker_kwargs) dict[source]#

Assembles worker kwargs for a specific universe.

This is invoked from _setup_universe() and is carried out directly before work on that universe starts.

Returns:

the combined worker kwargs, including args for running the model executable.

Return type:

dict

_get_TaskCls(*, perform_task: bool = True, **_) type[source]#
_add_sim_task(*, uni_id_str: str, uni_cfg: dict, is_sweep: bool) None[source]#

Helper function that handles task assignment to the WorkerManager.

This function creates a WorkerTask that will perform the following actions once it is grabbed and worked at:

  • Create a universe (folder) for the task (simulation)

  • Write that universe’s configuration to a yaml file in that folder

  • Create the correct arguments for the call to the model binary

To that end, this method gathers all necessary arguments and registers a WorkerTask with the WorkerManager.

Parameters:
  • uni_id_str (str) – The zero-padded uni id string

  • uni_cfg (dict) – given by ParamSpace. Defines how many simulations should be started

  • is_sweep (bool) – Flag is needed to distinguish between sweeps and single simulations. With this information, the forwarding of a simulation’s output stream can be controlled.

Raises:

RuntimeError – If adding the simulation task failed

_add_sim_tasks(*, sweep: Optional[bool] = None) int[source]#

Adds the simulation tasks needed for a single run or for a sweep.

Parameters:

sweep (bool, optional) – Whether tasks for a parameter sweep should be added or only for a single universe. If None, will read the perform_sweep key from the meta-configuration.

Returns:

The number of added tasks.

Return type:

int

Raises:

ValueError – On sweep == True and zero-volume parameter space.

_validate_meta_cfg() bool[source]#

Goes through the parameters that require validation, validates them, and creates a useful error message if there were invalid parameters.

Returns:

True if all parameters are valid; None if no check was done.

Note that False will never be returned, but a ValidationError will be raised instead.

Return type:

bool

Raises:

ValidationError – If validation failed.

class utopya.multiverse.FrozenMultiverse(*, model_name: Optional[str] = None, info_bundle: Optional[ModelInfoBundle] = None, run_dir: Optional[str] = None, run_cfg_path: Optional[str] = None, user_cfg_path: Optional[str] = None, use_meta_cfg_from_run_dir: bool = False, **update_meta_cfg)[source]#

Bases: Multiverse

A frozen Multiverse is like a Multiverse, but frozen.

It is initialized from a finished Multiverse run and re-creates all the attributes from that data, e.g.: the meta configuration, a DataManager, and a PlotManager.

Note

A frozen multiverse is no longer able to perform any simulations.

__init__(*, model_name: Optional[str] = None, info_bundle: Optional[ModelInfoBundle] = None, run_dir: Optional[str] = None, run_cfg_path: Optional[str] = None, user_cfg_path: Optional[str] = None, use_meta_cfg_from_run_dir: bool = False, **update_meta_cfg)[source]#

Initializes the FrozenMultiverse from a model name and the name of a run directory.

Note that this also takes arguments to specify the run configuration to use.

Parameters:
  • model_name (str) – The name of the model to load. From this, the model output directory is determined and the run_dir will be seen as relative to that directory.

  • info_bundle (ModelInfoBundle, optional) – The model information bundle that includes information about the binary path etc. If not given, will attempt to read it from the model registry.

  • run_dir (str, optional) – The run directory to load. Can be a path relative to the current working directory, an absolute path, or the timestamp of the run directory. If not given, will use the most recent timestamp.

  • run_cfg_path (str, optional) – The path to the run configuration.

  • user_cfg_path (str, optional) – If given, this is used to update the base configuration. If None, will look for it in the default path, see Multiverse.USER_CFG_SEARCH_PATH.

  • use_meta_cfg_from_run_dir (bool, optional) – If True, will load the meta configuration from the given run directory; only works for absolute run directories.

  • **update_meta_cfg – Can be used to update the meta configuration generated from the previous configuration levels

_create_run_dir(*_, **__)[source]#

Overload of parent method, for safety: we should not create a new run directory.

class utopya.multiverse.DistributedMultiverse(*, run_dir: str, model_name: Optional[str] = None, info_bundle: Optional[ModelInfoBundle] = None)[source]#

Bases: FrozenMultiverse

A distributed Multiverse is like a Multiverse, but initialized from an existing meta-configuration.

It re-creates the WorkerManager attributes from that configuration and is not allowed to change the meta-configuration that was loaded.

__init__(*, run_dir: str, model_name: Optional[str] = None, info_bundle: Optional[ModelInfoBundle] = None)[source]#

Initializes a DistributedMultiverse from a model name and the name of an existing run directory.

Parameters:
  • run_dir (str) – The run directory to load. Can be a path relative to the current working directory, an absolute path, or the timestamp of the run directory.

  • model_name (str) – The name of the model to load. From this, the model output directory is determined and the run_dir will be seen as relative to that directory.

  • info_bundle (ModelInfoBundle, optional) – The model information bundle that includes information about the binary path etc. If not given, will attempt to read it from the model registry.

run_selection(*, uni_id_strs: List[str], num_workers=None, clear_existing_output: bool = False)[source]#

Starts a simulation run for specified universes.

Specifically, this method adds simulation tasks to the associated WorkerManager, locks its task list, and then invokes the start_working() method which performs all the simulation tasks.

Note

As this method locks the task list of the WorkerManager, no further tasks can be added henceforth. This means that each Multiverse instance can only perform a single simulation run.

Parameters:
  • uni_id_strs (List[str]) – The list of universe ID strings to run, e.g. ‘00154’. Note that leading zeros are required.

  • num_workers (int, optional) – Specify the number of workers to use, overwriting the setting from the meta-configuration.

  • clear_existing_output (bool, optional) – Whether to remove existing files from the universe output directories. Only the configuration file will not be deleted.
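
A sketch of re-running selected universes from an existing run directory (model name, run directory timestamp, and universe IDs are hypothetical):

from utopya.multiverse import DistributedMultiverse

dmv = DistributedMultiverse(model_name="MyModel", run_dir="250101-120000")
dmv.run_selection(uni_id_strs=["00154", "00155"], clear_existing_output=True)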

run(*, num_workers=None, clear_existing_output: bool = False, skip_existing_output: bool = False)[source]#

Starts a simulation run for all universes.

Overload of parent method that allows for universes to be skipped if output already exists from a previous run.

Parameters:
  • num_workers (int, optional) – Specify the number of workers available.

  • clear_existing_output (bool, optional) – Whether to remove files in the output directory of the universes other than the configuration file.

  • skip_existing_output (bool, optional) – Whether to skip universes if output already exists.

_prepare_executable(*args, **kwargs) None[source]#

Prepares the model executable, potentially copying it to a temporary location.

Note that run_from_tmpdir requires the executable to be relocatable to another location, i.e. be position-independent.

Parameters:

run_from_tmpdir (bool, optional) – Whether to copy the executable to a temporary directory that goes out of scope once the Multiverse instance goes out of scope.

_perform_pspace_backup(**kwargs)[source]#

Stores the given parameter space and its metadata into the config directory. Two files will be produced:

  • config/{filename}.yml: the passed pspace object

  • config/{filename}_info.yml: the passed pspace object’s info dictionary, containing relevant metadata (and the additionally passed info_kwargs)

Note

This method is separated from the regular backup method Multiverse._perform_backup() because the parameter space that is used during a simulation run may be a lower-dimensional version of the one the Multiverse was initialized with. To that end, run() will invoke this backup function again once the relevant information is fully determined. This is important because it is needed to communicate the correct information about the sweep to objects downstream in the pipeline (e.g. MultiversePlotCreator).

Parameters:
  • pspace (paramspace.paramspace.ParamSpace) – The ParamSpace object to save as backup.

  • filename (str, optional) – The filename (without extension!) to use. (This is also used for the log message, with underscores changed to spaces.)

  • **info_kwargs – Additional kwargs that are to be stored in the metadata dict.

_get_TaskCls(**_) type[source]#
_setup_universe_dir(uni_dir: str, *, uni_basename: str)[source]#

Overload of parent method that allows for universe directories to already exist.

_setup_universe_config(*, uni_cfg_path: str, **kwargs) dict[source]#

Overload of parent method that checks if a universe config already exists and, if so, loads that one instead of storing a new one.

utopya.parameter module#

This module implements the Parameter class which is used when validating model and simulation parameters.

class utopya.parameter.Parameter(*, default: Any, name: Optional[str] = None, description: Optional[str] = None, is_any_of: Optional[Sequence[Any]] = None, limits: Optional[Tuple[Optional[float], Optional[float]]] = None, limits_mode: str = '[]', dtype: Optional[Union[str, type]] = None)[source]#

Bases: object

The parameter class is used when a model parameter needs to be validated before commencing the model run. It can hold information on the parameter itself as well as its valid range and type and other meta-data.

By default, the Parameter class should be assumed to handle scalar parameters like numerical values or strings. For validating sequence-like parameters, corresponding specializing classes are to be implemented.

SHORTHAND_MODES: Dict[str, Callable] = {'is-bool': <function Parameter.<lambda>>, 'is-in-unit-interval': <function Parameter.<lambda>>, 'is-int': <function Parameter.<lambda>>, 'is-negative': <function Parameter.<lambda>>, 'is-negative-int': <function Parameter.<lambda>>, 'is-negative-or-zero': <function Parameter.<lambda>>, 'is-positive': <function Parameter.<lambda>>, 'is-positive-int': <function Parameter.<lambda>>, 'is-positive-or-zero': <function Parameter.<lambda>>, 'is-probability': <function Parameter.<lambda>>, 'is-string': <function Parameter.<lambda>>, 'is-unsigned': <function Parameter.<lambda>>}#

Shorthand mode factory functions. These are used in the from_shorthand() class method to generate a Parameter object more easily.

Also, utopya.yaml registers each of these shorthand modes as a YAML constructor for tag !<mode>.

LIMIT_COMPS: Dict[str, Callable] = {'(': <built-in function gt>, ')': <built-in function lt>, '[': <built-in function ge>, ']': <built-in function le>}#

Comparators for the limits check, depending on limits_mode

LIMIT_MODES: Sequence[str] = ('[]', '()', '[)', '(]')#

Possible limit modes

yaml_tag: str = '!param'#

Default YAML tag to use for representing Parameter objects

__init__(*, default: Any, name: Optional[str] = None, description: Optional[str] = None, is_any_of: Optional[Sequence[Any]] = None, limits: Optional[Tuple[Optional[float], Optional[float]]] = None, limits_mode: str = '[]', dtype: Optional[Union[str, type]] = None)[source]#

Creates a new Parameter object, which holds a default value as well as some constraints on the possible values this parameter can assume.

Parameters:
  • default (Any) – the default value of the parameter.

  • name (str, optional) – the name of the parameter.

  • description (str, optional) – a description of this parameter or its effects.

  • is_any_of (Sequence[Any], optional) – a sequence of possible values this parameter can assume. If this parameter is given, limits cannot be used.

  • limits (Tuple[Optional[float], Optional[float]], optional) – the upper and lower bounds of the parameter (only applicable to scalar numerals). If None, the bound is assumed to be negative or positive infinity, respectively. Whether boundary values are included into the interval is controlled by the limits_mode argument. This argument is mutually exclusive with is_any_of!

  • limits_mode (str, optional) – whether to interpret the limits as an open, closed, or semi-closed interval. Possible values: '[]' (closed, default), '()' (open), '[)', and '(]'.

  • dtype (Union[str, type], optional) – expected data type of this parameter. Accepts all strings that are accepted by numpy.dtype, e.g. int, float, uint16, string.

Raises:
  • TypeError – On a limits argument that was not tuple-like or if a limits argument was given but the default was a

  • ValueError – if an invalid limits_mode is passed, if limits and is_any_of are both passed, or if the limits argument did not have length 2.

__eq__(other) bool[source]#

Returns True for parameters with equal behavior.

property default#

The default value for this parameter

property name#

The name of this parameter

property description#

The description of this parameter

property limits: Optional[tuple]#

The limits of this parameter

property limits_mode: str#

The mode used when evaluating the limits

property is_any_of: Tuple[Any]#

Possible values this parameter may assume

property dtype: Optional[dtype]#

The expected data type of this parameter

validate(value: Any, *, raise_exc: bool = True) bool[source]#

Checks whether the given value would be a valid parameter.

The checks for the corresponding arguments are carried out in the following order:

  1. is_any_of

  2. dtype

  3. limits

The data type is checked according to the numpy type hierarchy (see the numpy documentation). To reduce strictness, the following additional compatibilities are taken into account:

  • for unsigned integer dtype, a signed integer-type value is compatible if value >= 0

  • for floating-point dtype, integer-type values are always considered compatible

  • for floating-point dtype, values of all floating-point types are considered compatible, even if they have a lower precision (note the coercion test below, though)

Additionally, it is checked whether value is representable as the target data type. This is done by coercing value to dtype and then checking for equality (using np.isclose).

Parameters:
  • value (Any) – The value to test.

  • raise_exc (bool, optional) – Whether to raise an exception or not.

Returns:

Whether or not the given value is a valid parameter.

Return type:

bool

Raises:

ValidationError – If validation failed or is impossible (for instance due to ambiguous validity parameters). This error message contains further information on why validation failed.
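
A construction-and-validation sketch using the documented arguments:

from utopya.parameter import Parameter

p = Parameter(default=0.5, limits=(0.0, 1.0), limits_mode="[]", dtype=float)
p.validate(0.3)                    # True
p.validate(1.5, raise_exc=False)   # False; with raise_exc=True, a ValidationError is raised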

classmethod from_shorthand(default: Any, *, mode: str, **kwargs)[source]#

Constructs a Parameter object from a given shorthand mode.

Parameters:
  • default (Any) – the default value for the parameter

  • mode (str) – A valid shorthand mode, see SHORTHAND_MODES

  • **kwargs – any further arguments for Parameter initialization, see __init__().

Returns:

a Parameter object
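
A sketch using one of the shorthand modes from SHORTHAND_MODES (assuming is-probability constrains values to the unit interval):

from utopya.parameter import Parameter

p = Parameter.from_shorthand(0.2, mode="is-probability")
p.validate(0.7)                    # True
p.validate(1.2, raise_exc=False)   # False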

classmethod to_yaml(representer, node)[source]#

Represent this Parameter object as a YAML mapping.

Parameters:
  • representer (ruamel.yaml.representer) – The representer module

  • node (Parameter) – The node, i.e. an instance of this class

Returns:

a yaml mapping that is able to recreate this object

classmethod from_yaml(constructor, node)[source]#

The default constructor for Parameter objects, expecting a YAML node that is mapping-like.

utopya.parameter.extract_validation_objects(model_cfg: dict, *, model_name: str) Tuple[dict, dict][source]#

Extracts all Parameter objects from a model configuration (a nested dict), replacing them with their default values. Returns both the modified model configuration and the Parameter objects (keyed by the key sequence necessary to reach them within the model configuration).

Parameters:
  • model_cfg (dict) – the model configuration to inspect

  • model_name (str) – the name of the model

Returns:

a tuple of (model config, parameters to validate).

The model config contains the passed config dict in which all Parameter class elements have been replaced by their default entries. The second entry is a dictionary consisting of the Parameter class objects (requiring validation) with keys being key sequences to those Parameter objects. Note that the key sequence is relative to the level above the model configuration, with model_name as a common entry for all returned values.

Return type:

Tuple[dict, dict]
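
A minimal sketch of how this function transforms a model configuration (all names and values are illustrative):

    from utopya.parameter import Parameter, extract_validation_objects

    model_cfg = {
        "p_spread": Parameter(default=0.1, limits=(0, 1)),
        "opts": {"flag": True},
    }
    cfg, to_validate = extract_validation_objects(model_cfg, model_name="MyModel")

    # cfg is now {"p_spread": 0.1, "opts": {"flag": True}}, while to_validate
    # maps key sequences like ("MyModel", "p_spread") to the Parameter objects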

utopya.plotting module#

DEPRECATED module that provides backwards-compatibility for the old utopya module structure.

Deprecated since version 1.0.0: This module will be removed soon, please use utopya.eval instead.

utopya.project_registry module#

Implementation of the utopya project registry

class utopya.project_registry.ProjectPaths(*, base_dir: Path, project_info: Optional[Path] = None, models_dir: Optional[Path] = None, py_tests_dir: Optional[Path] = None, py_plots_dir: Optional[Path] = None, mv_project_cfg: Optional[Path] = None, project_base_plots: Optional[Path] = None)[source]#

Bases: BaseSchema

Schema to use for a project’s paths field

base_dir: Path#
project_info: Optional[Path]#
models_dir: Optional[Path]#
py_tests_dir: Optional[Path]#
py_plots_dir: Optional[Path]#
mv_project_cfg: Optional[Path]#
project_base_plots: Optional[Path]#
_abc_impl = <_abc._abc_data object>#
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'validate_assignment': True, 'validate_default': True}#

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'base_dir': FieldInfo(annotation=Path, required=True, metadata=[PathType(path_type='dir')]), 'models_dir': FieldInfo(annotation=Union[Annotated[Path, PathType], NoneType], required=False, default=None), 'mv_project_cfg': FieldInfo(annotation=Union[Annotated[Path, PathType], NoneType], required=False, default=None), 'project_base_plots': FieldInfo(annotation=Union[Annotated[Path, PathType], NoneType], required=False, default=None), 'project_info': FieldInfo(annotation=Union[Annotated[Path, PathType], NoneType], required=False, default=None), 'py_plots_dir': FieldInfo(annotation=Union[Annotated[Path, PathType], NoneType], required=False, default=None), 'py_tests_dir': FieldInfo(annotation=Union[Annotated[Path, PathType], NoneType], required=False, default=None)}#

Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class utopya.project_registry.ProjectMetadata(*, version: Optional[str] = None, long_name: Optional[str] = None, description: Optional[str] = None, long_description: Optional[str] = None, license: Optional[str] = None, authors: Optional[List[str]] = None, email: Optional[str] = None, website: Optional[str] = None, utopya_compatibility: Optional[str] = None, language: Optional[str] = None, requirements: Optional[List[str]] = None, misc: Optional[Dict[str, Any]] = None)[source]#

Bases: BaseSchema

Schema to use for a project’s metadata field

version: Optional[str]#
long_name: Optional[str]#
description: Optional[str]#
long_description: Optional[str]#
license: Optional[str]#
authors: Optional[List[str]]#
email: Optional[str]#
website: Optional[str]#
utopya_compatibility: Optional[str]#
language: Optional[str]#
requirements: Optional[List[str]]#
misc: Optional[Dict[str, Any]]#
_abc_impl = <_abc._abc_data object>#
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'validate_assignment': True, 'validate_default': True}#

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'authors': FieldInfo(annotation=Union[List[str], NoneType], required=False, default=None), 'description': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'email': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'language': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'license': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'long_description': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'long_name': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'misc': FieldInfo(annotation=Union[Dict[str, Any], NoneType], required=False, default=None), 'requirements': FieldInfo(annotation=Union[List[str], NoneType], required=False, default=None), 'utopya_compatibility': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'version': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'website': FieldInfo(annotation=Union[str, NoneType], required=False, default=None)}#

Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class utopya.project_registry.ProjectSettings(*, preload_project_py_plots: Optional[bool] = None, preload_framework_py_plots: Optional[bool] = None)[source]#

Bases: BaseSchema

Schema to use for a project’s settings field

preload_project_py_plots: Optional[bool]#

Whether to preload the project-level plot module (py_plots_dir) after initialization of the PlotManager. If not given, will load the module.

preload_framework_py_plots: Optional[bool]#

Whether to preload the framework-level plot module (py_plots_dir) after initialization of the PlotManager. If not given, will load the module.

_abc_impl = <_abc._abc_data object>#
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'validate_assignment': True, 'validate_default': True}#

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'preload_framework_py_plots': FieldInfo(annotation=Union[bool, NoneType], required=False, default=None), 'preload_project_py_plots': FieldInfo(annotation=Union[bool, NoneType], required=False, default=None)}#

Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class utopya.project_registry.ProjectSchema(*, project_name: str, framework_name: Optional[str] = None, paths: ProjectPaths, metadata: ProjectMetadata, settings: ProjectSettings = {}, run_cfg_format: str = 'yaml', cfg_set_abs_search_dirs: Optional[List[str]] = None, cfg_set_model_source_subdirs: Optional[List[str]] = None, custom_py_modules: Optional[Dict[str, Path]] = None, output_files: Optional[dict] = None, debug_level_updates: Optional[Dict[str, dict]] = None)[source]#

Bases: BaseSchema

The data model for a project registry entry

project_name: str#
framework_name: Optional[str]#
paths: ProjectPaths#
metadata: ProjectMetadata#
settings: ProjectSettings#
run_cfg_format: str#
cfg_set_abs_search_dirs: Optional[List[str]]#
cfg_set_model_source_subdirs: Optional[List[str]]#
custom_py_modules: Optional[Dict[str, Path]]#
output_files: Optional[dict]#
debug_level_updates: Optional[Dict[str, dict]]#
_abc_impl = <_abc._abc_data object>#
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'validate_assignment': True, 'validate_default': True}#

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'cfg_set_abs_search_dirs': FieldInfo(annotation=Union[List[str], NoneType], required=False, default=None), 'cfg_set_model_source_subdirs': FieldInfo(annotation=Union[List[str], NoneType], required=False, default=None), 'custom_py_modules': FieldInfo(annotation=Union[Dict[str, Annotated[Path, PathType]], NoneType], required=False, default=None), 'debug_level_updates': FieldInfo(annotation=Union[Dict[str, dict], NoneType], required=False, default=None), 'framework_name': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'metadata': FieldInfo(annotation=ProjectMetadata, required=True), 'output_files': FieldInfo(annotation=Union[dict, NoneType], required=False, default=None), 'paths': FieldInfo(annotation=ProjectPaths, required=True), 'project_name': FieldInfo(annotation=str, required=True), 'run_cfg_format': FieldInfo(annotation=str, required=False, default='yaml'), 'settings': FieldInfo(annotation=ProjectSettings, required=False, default={})}#

Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class utopya.project_registry.Project(name: str, *, registry: YAMLRegistry = None, **data)[source]#

Bases: RegistryEntry

A registry entry that describes a project

SCHEMA#

alias of ProjectSchema

property framework_project: Optional[Project]#

If a framework project is defined, retrieve it from the registry

get_git_info(*, include_patch_info: bool = False) dict[source]#

Returns information about the state of this project’s git repository using the python-git-info package.

If no git information is retrievable, e.g. because the project’s base_dir does not contain a git repository, will still return a dict, but with the have_git_info entry set to False.

Otherwise the git information will be in the latest_commit entry.

Parameters:

include_patch_info (bool, optional) – If True, will attempt a subprocess call to git and store patch information alongside in the diff entry. In that case, the dirty entry will denote whether there were uncommitted changes.

Returns:

A dict containing information about the associated git repo.

Return type:

dict
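
A usage sketch, assuming a project named my_project is registered and that registry entries are accessible via item access:

    from utopya.project_registry import PROJECTS

    project = PROJECTS["my_project"]
    git_info = project.get_git_info(include_patch_info=True)
    if git_info["have_git_info"]:
        print(git_info["latest_commit"])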

class utopya.project_registry.ProjectRegistry(registry_dir: Optional[str] = None)[source]#

Bases: YAMLRegistry

The project registry

__init__(registry_dir: Optional[str] = None)[source]#

Initializes the project registry, loading available entries from the registry directory in the utopya config directory.

This also creates the projects directory, if not created yet.

Parameters:

registry_dir (str, optional) – A custom registry directory to load project entries from; if not given, the registry directory in the utopya config directory is used.

register(*, base_dir: str, info_file: Optional[str] = None, custom_project_name: Optional[str] = None, require_matching_names: Optional[bool] = None, exists_action: str = 'raise') Project[source]#

Register or update information of a project.

Parameters:
  • base_dir (str) – Project base directory

  • info_file (str, optional) – Path to info file which contains further path information and metadata (may be relative to base directory). If not given, will use some defaults to search for it.

  • custom_project_name (str, optional) – Custom project name, overwrites the one given in the info file

  • require_matching_names (bool, optional) – If set, will require that the custom project name is equal to the one given in the project info file. This allows checking that the file content does not diverge from some outside state.

  • exists_action (str, optional) – Action to take upon existing project

Returns:

Project information for the new or validated project

Return type:

Project

utopya.project_registry.PROJECTS = <utopya.project_registry.ProjectRegistry object>#

The package-wide project registry
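
For instance, a project can be registered from its base directory via this package-wide registry (the path is illustrative):

    from utopya.project_registry import PROJECTS

    project = PROJECTS.register(base_dir="~/code/my_project")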

utopya.reporter module#

Implementation of the reporter framework which can be used to report on the progress or result of operations within utopya.

class utopya.reporter.ReportFormat(*, parser: Callable, writers: List[Callable], min_report_intv: Optional[float] = None)[source]#

Bases: object

A report format aggregates callables for a single report parser and potentially multiple report writers. As a whole, it contains all arguments needed to generate a certain kind of report.

It is used in utopya.reporter.Reporter and derived classes, which are the classes that actually implement the parsers and writers.

__init__(*, parser: Callable, writers: List[Callable], min_report_intv: Optional[float] = None)[source]#

Initializes a ReportFormat object, which gathers callables needed to create a report in a certain format.

Parameters:
  • parser (Callable) – The parser method to use

  • writers (List[Callable]) – The writer method(s) to use

  • min_report_intv (float, optional) – The minimum report interval of reports in this format. Determines the time (in seconds) that needs to have passed before the next report will be emitted.

property min_report_intv: Optional[timedelta]#

Returns the minimum report interval, i.e. the time that needs to have passed between two reports.

property reporting_blocked: bool#

Determines whether this ReportFormat may be blocked from emission, e.g. because of the minimum report interval not having passed yet.

If no minimum report interval is given, will always return False. Otherwise checks if at least that interval has passed since the last report.

report(*, force: bool = False, parser_kwargs: Optional[dict] = None) bool[source]#

Parses and writes a report corresponding to the callables defined in this report format.

Parameters:
  • force (bool, optional) – If True, will ignore the minimum report interval and always perform a report.

  • parser_kwargs (dict, optional) – Keyword arguments passed on to the parser

Returns:

Whether a report was generated or not

Return type:

bool

class utopya.reporter.Reporter(*, report_formats: Optional[Union[List[str], Dict[str, dict]]] = None, default_format: Optional[str] = None, report_dir: Optional[str] = None, suppress_cr: bool = False)[source]#

Bases: object

The Reporter class holds general reporting capabilities.

It needs to be subclassed in order to specialize its reporting functions.

__init__(*, report_formats: Optional[Union[List[str], Dict[str, dict]]] = None, default_format: Optional[str] = None, report_dir: Optional[str] = None, suppress_cr: bool = False)[source]#

Initialize the Reporter base class.

Parameters:
  • report_formats (Union[List[str], Dict[str, dict]], optional) – The report formats to use with this reporter. If given as list of strings, the strings are the names of the report formats as well as those of the parsers; all other parameters are the defaults. If given as dict of dicts, the keys are the names of the formats and the inner dicts are the parameters to create report formats from.

  • default_format (str, optional) – The name of the default report format; if None is given, the .report method requires the name of a report format.

  • report_dir (str, optional) – If reporting to a file: the base directory that reports are written to.

  • suppress_cr (bool, optional) – Whether to suppress carriage return characters in writers. This option is useful when the reporter is not the only class that writes to a stream.

property report_formats: dict#

Returns the dict of ReportFormat objects.

property default_format: Union[None, ReportFormat]#

Returns the default report format or None, if not set.

property suppress_cr: bool#

Whether to suppress a carriage return. Objects using the reporter can set this property to communicate that they will be putting content into the stdout stream as well. The writers can check this property and adjust their behaviour accordingly.

add_report_format(name: str, *, parser: Optional[str] = None, write_to: Union[str, Dict[str, dict]] = 'stdout', min_report_intv: Optional[float] = None, rf_kwargs: Optional[dict] = None, **parser_kwargs)[source]#

Add a report format to this reporter.

Parameters:
  • name (str) – The name of this format

  • parser (str, optional) – The name of the parser; if not given, the name of the report format is assumed

  • write_to (Union[str, Dict[str, dict]], optional) – The name of the writer. If this is a dict of dicts, the keys will be interpreted as the names of the writers and the nested dicts as the **kwargs to the writer function.

  • min_report_intv (float, optional) – The minimum report interval (in seconds) for this report format

  • rf_kwargs (dict, optional) – Further kwargs to ReportFormat.__init__

  • **parser_kwargs – The kwargs to the parser function

Raises:

ValueError – A report format with this name already exists
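
A sketch of adding and invoking a custom report format, assuming reporter is an instance of a Reporter subclass that implements a _parse_progress parser method (as WorkerManagerReporter does):

    # Register a format named "my_progress" that uses the "progress" parser,
    # writes to stdout, and reports at most once per second
    reporter.add_report_format(
        "my_progress",
        parser="progress",
        write_to="stdout",
        min_report_intv=1.0,
    )
    reporter.report("my_progress")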

report(report_format: Optional[str] = None, **kwargs) bool[source]#

Create a report with the given format; if none is given, the default format is used.

Parameters:
  • report_format (str, optional) – The report format to use

  • **kwargs – Passed on to the ReportFormat.report() call

Returns:

Whether there was a report

Return type:

bool

Raises:

ValueError – If no default format was set and no report format name was given

parse_and_write(*, parser: Union[str, Callable], write_to: Union[str, Callable], **parser_kwargs)[source]#

This method allows selecting a parser and writer explicitly.

Parameters:
  • parser (Union[str, Callable]) – The parser method to use.

  • write_to (Union[str, Callable]) – The write method to use. Can also be a sequence of names and/or callables or a Dict. For allowed specification formats, see the ._resolve_writers method.

  • **parser_kwargs – Passed to the parser, if given

_resolve_parser(parser: Union[str, Callable], **parser_kwargs) Callable[source]#

Given a string or a callable, returns the corresponding callable.

Parameters:
  • parser (Union[str, Callable]) – If a callable is already given, returns that; otherwise looks for a parser method with the given name in the attributes of this class.

  • **parser_kwargs – Arguments that should be passed to the parser. If given, a new function is created where these arguments are already included.

Returns:

The desired parser function

Return type:

Callable

Raises:

ValueError – If no parser with the given name is available

_resolve_writers(write_to) Dict[str, Callable][source]#

Resolves the given argument to a list of callable writer functions.

Parameters:

write_to

a specification of the writers to use. Allows many different ways of specifying the writer functions, depending on the type of the argument:

  • str: the name of the writer method of this reporter

  • Callable: the writer function to use

  • sequence of str and/or Callable: the names and/or functions to use

  • Dict[str, dict]: the names of the writer functions and additional keyword arguments.

If the type is wrong, will raise.

Returns:

the writers (key: name, value: writer method)

Return type:

Dict[str, Callable]

Raises:
  • TypeError – Invalid write_to argument

  • ValueError – A writer with that name was already added or a writer with the given name is not available.

_write_to_stdout(s: str, *, flush: bool = True, **print_kws)[source]#

Writes the given string to stdout using the print function.

Parameters:
  • s (str) – The string to write

  • flush (bool, optional) – Whether to flush directly; default: True

  • **print_kws – Other print function keyword arguments

_write_to_stdout_noreturn(s: str, *, prepend='  ')[source]#

Writes to stdout without ending the line. Always flushes.

Parameters:
  • s (str) – The string to write

  • prepend (str, optional) – Is prepended to the string; useful because the cursor might block this point of the terminal

  • report_no (int, optional) – Accepted from the ReportFormat call

_write_to_log(s: str, *, lvl: int = 10, skip_if_empty: bool = False)[source]#

Writes the given string via the logging module.

Parameters:
  • s (str) – The string to log

  • lvl (int, optional) – The level at which to log at; default is 10, corresponding to the DEBUG level

  • skip_if_empty (bool, optional) – Whether to skip writing if s is empty.

_write_to_file(s: str, *, path: str = '_report.txt', mode: str = 'w', skip_if_empty: bool = False)[source]#

Writes the given string to a file

Parameters:
  • s (str) – The string to write

  • path (str, optional) – The path to write it to; will be assumed relative to the report_dir attribute; if that is not given, path needs to be absolute. By default, assumes that there is a report_dir given.

  • mode (str, optional) – Writing mode of that file

  • skip_if_empty (bool, optional) – Whether to skip writing if s is empty.

Raises:

ValueError – If report_dir was not set and path is relative.

class utopya.reporter.WorkerManagerReporter(wm: utopya.workermanager.WorkerManager, *, mv: utopya.multiverse.Multiverse = None, **reporter_kwargs)[source]#

Bases: Reporter

This class specializes the base Reporter to report on the WorkerManager state and its progress.

TTY_MARGIN = 4#

Margin to use when writing to terminal

PROGRESS_BAR_SYMBOLS = {'active': '░', 'active_progress': '▒', 'finished': '▓', 'space': ' '}#

Symbols to use in progress bar parser

__init__(wm: utopya.workermanager.WorkerManager, *, mv: utopya.multiverse.Multiverse = None, **reporter_kwargs)[source]#

Initialize the specialized reporter for the WorkerManager.

It is aware of the WorkerManager and may additionally have access to the Multiverse it is embedded in, which provides additional information to report parsers.

Parameters:
  • wm (utopya.workermanager.WorkerManager) – The associated WorkerManager instance

  • mv (utopya.multiverse.Multiverse, optional) – The Multiverse this reporter is used in. If this is provided, it can be used in report parsers, e.g. to provide additional information on simulations.

  • **reporter_kwargs – Passed on to parent method

property wm: utopya.workermanager.WorkerManager#

Returns the associated WorkerManager

property task_counters: OrderedDict#

Returns a dict of task counters containing the following entries:

  • total: total number of registered WorkerManager tasks

  • active: number of currently active tasks

  • finished: number of finished tasks, including tasks that were stopped via a stop condition

  • stopped: number of tasks for which stop conditions were fulfilled, see Stop Conditions

property wm_progress: float#

The WorkerManager’s progress, between 0 and 1.

property wm_active_tasks_progress: float#

The active tasks’ progress. If there are no active tasks in the worker manager, returns 0.

property wm_elapsed: Optional[timedelta]#

Time elapsed since the start of working, or None if work has not yet started

property wm_times: dict#

Return the characteristics of WorkerManager times. Calls get_progress_info() without any additional arguments.

register_task(task: utopya.task.WorkerTask)[source]#

Given the task object, extracts and stores some information like its run time or its exit code. Exit codes are aggregated over multiple registrations.

This can be used as a callback function from a WorkerTask object.

Parameters:

task (utopya.task.WorkerTask) – The WorkerTask to extract information from.

calc_runtime_statistics(min_num: int = 10) Optional[OrderedDict][source]#

Calculates the current runtime statistics.

Parameters:

min_num (int, optional) – Minimum number of runtimes that need to be registered for these statistics to actually be computed. If below this number, will return None.

Returns:

The runtime statistics or None, if there were too few entries.

Return type:

Union[OrderedDict, None]

get_progress_info(**eta_options) Dict[str, float][source]#

Compiles a dict containing progress information for the current work session.

Parameters:

**eta_options – Passed on to method calculating est_left, _compute_est_left().

Returns:

Progress information. Guaranteed to contain the keys start, now, elapsed, est_left, est_end, and end.

Return type:

Dict[str, float]

_compute_est_left(*, progress: float, elapsed: timedelta, mode: str = 'from_start', progress_buffer_size: int = 60) timedelta[source]#

Computes the estimated time left until the end of the work session (ETA) using the current progress value and the elapsed time. Depending on mode, additional information may be included in the calculation.

Parameters:
  • progress (float) – The current progress value, in (0, 1]

  • elapsed (datetime.timedelta) – The elapsed time since start

  • mode (str, optional) –

    By which mode to calculate the ETA. Available modes are:

    • from_start, where the ETA is computed from the start of the work session.

    • from_buffer, where the ETA is computed from a more recent point during the work session. This uses a buffer to keep track of recent progress and computes the ETA against the oldest record (controlled by the progress_buffer_size argument), giving more accurate estimates for long-running work sessions.

  • progress_buffer_size (int, optional) – The size of the ring buffer used in from_buffer mode.

Returns:

Estimate for how much time is left until the end of the work session.

Return type:

datetime.timedelta
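
In from_start mode, the estimate follows from assuming a constant rate of progress since the start of the work session; a minimal sketch of that arithmetic (not the actual implementation):

    from datetime import timedelta

    def est_left_from_start(progress: float, elapsed: timedelta) -> timedelta:
        # At a constant rate, the total duration is roughly elapsed / progress,
        # leaving elapsed * (1 - progress) / progress until the end
        return elapsed * ((1.0 - progress) / progress)

    est_left_from_start(0.25, timedelta(minutes=10))  # -> 30 minutes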

_parse_task_counters(*, report_no: Optional[int] = None) str[source]#

Return a string that shows the task counters of the WorkerManager

Parameters:

report_no (int, optional) – A counter variable passed by the ReportFormat call, indicating how often this parser was called so far.

Returns:

A str representation of the task counters of the WorkerManager

Return type:

str

_parse_progress(*, report_no: Optional[int] = None) str[source]#

Returns a progress string

Parameters:

report_no (int, optional) – A counter variable passed by the ReportFormat call, indicating how often this parser was called so far.

Returns:

A simple progress indicator

Return type:

str

_parse_progress_bar(*, num_cols: Union[str, int] = 'fixed', fstr: str = '  ╠{ticks[0]:}{ticks[1]:}{ticks[2]:}{ticks[3]:}╣ {info:}{times:}', info_fstr: str = '{total_progress:>5.1f}% ', show_times: bool = False, times_fstr: str = '| {elapsed:} elapsed | ~{est_left:} left ', times_fstr_final: str = '| finished in {elapsed:} ', times_kwargs: dict = {}, report_no: Optional[int] = None) str[source]#

Returns a progress bar.

It shows the number of finished tasks, the number of active tasks, and a progress percentage.

Parameters:
  • num_cols (Union[str, int], optional) – The number of columns available for creating the progress bar. Can also be a string adaptive to poll terminal size upon each call, or fixed to use the number of columns determined at import time.

  • fstr (str, optional) – The format string for the final output. Should contain the ticks 4-tuple, which makes up the progress bar, and can optionally contain the info and times segments, formatted using the respective format string arguments.

  • info_fstr (str, optional) –

    The format string for the info section of the final output. Available keys:

    • total_progress

    • active_progress

    • cnt, the task counters dictionary, see task_counters()

  • show_times (bool, optional) – Whether to show a short version of the results of the times parser

  • times_fstr (str, optional) – Format string for times information

  • times_fstr_final (str, optional) – Format string for times information once the work session has ended

  • times_kwargs (dict, optional) – Passed on to times parser. Only used if show_times is set.

  • report_no (int, optional) – A counter variable passed by the ReportFormat call, indicating how often this parser was called so far.

Returns:

The one-line progress bar

Return type:

str

_parse_times(*, fstr: str = 'Elapsed:  {elapsed:<8s}  |  Est. left:  {est_left:<8s}  |  Est. end:  {est_end:<10s}', timefstr_short: str = '%H:%M:%S', timefstr_full: str = '%d.%m., %H:%M:%S', use_relative: bool = True, times: Optional[dict] = None, report_no: Optional[int] = None, **progress_info_kwargs) str[source]#

Parses the WorkerManager’s time information, including estimated time left or others.

Parameters:
  • fstr (str, optional) – The main format string; gets as keys the results of the WorkerManager time information. Available keys: elapsed, est_left, est_end, start, now, end.

  • timefstr_short (str, optional) – A time format string for absolute dates; short version.

  • timefstr_full (str, optional) – A time format string for absolute dates; long (ideally: full) version.

  • use_relative (bool, optional) – Whether to use relative dates (e.g. Today, 13:37) when the date difference is a single day.

  • times (dict, optional) – A dict of times to use; this is mainly for testing purposes!

  • report_no (int, optional) – The report number passed by ReportFormat

  • **progress_info_kwargs – Passed on to method calculating progress get_progress_info()

Returns:

A string representation of the time information

Return type:

str

_parse_runtime_stats(*, fstr: str = '  {k:<13s} {v:}', join_char='\n', ms_precision: int = 1, report_no: Optional[int] = None) str[source]#

Parses the runtime statistics dict into a multiline string

Parameters:
  • fstr (str, optional) – The format string to use. Gets passed the keys k and v where k is the name of the entry and v its value. Note that v is a non-numeric value.

  • join_char (str, optional) – The join character / string to join the elements together.

  • ms_precision (int, optional) – Number of digits to represent the milliseconds part of the runtimes.

  • report_no (int, optional) – A counter variable passed by the ReportFormat call, indicating how often this parser was called so far.

Returns:

The multi-line runtime statistics

Return type:

str

_parse_report(*, fstr: str = '  {k:<{w:}s}  {v:}', min_num: int = 4, report_no: Optional[int] = None, show_individual_runtimes: bool = True, task_label_singular: str = 'task', task_label_plural: str = 'tasks') str[source]#

Parses a report for all tasks that were being worked on into a multiline string. The headings can be adjusted by keyword arguments.

Parameters:
  • fstr (str, optional) – The format string to use. Gets passed the keys k and v where k is the name of the entry and v its value. Note that this format string is also used with v being a non-numeric value. Also, w can be used to have a key column of constant width.

  • min_num (int, optional) – The minimum number of universes needed to calculate runtime statistics.

  • report_no (int, optional) – A counter variable passed by the ReportFormat call, indicating how often this parser was called so far.

  • show_individual_runtimes (bool, optional) – Whether to report individual universe runtimes; default: True. This should be disabled if there are a huge number of universes.

  • task_label_singular (str, optional) – The label to use in the report when referring to a single task.

  • task_label_plural (str, optional) – The label to use in the report when referring to multiple tasks.

Returns:

The multi-line simulation report string

Return type:

str

_parse_pspace_info(*, fstr: str, only_for_sweep: bool = True, report_no: Optional[int] = None) str[source]#

Provides information about the parameter space.

Extracts the parameter_space from the associated Multiverse’s meta configuration and provides information on that.

If there are multiple tasks specified, it is assumed that a sweep is or was being carried out and an information string is retrieved from the paramspace.paramspace.ParamSpace object, which is then returned. If only a single task was defined, returns an empty string.

Parameters:

report_no (int, optional) – A counter variable passed by the ReportFormat call, indicating how often this parser was called so far.

Returns:

If there is more than one task, returns the result of paramspace.paramspace.ParamSpace.get_info_str(). If not, returns a string denoting that there was only one task.

Return type:

str

_write_to_file(*args, path: str = '_report.txt', cluster_mode_path: str = '{0:}_{node_name:}{ext:}', **kwargs)[source]#

Overloads the parent method with capabilities needed in cluster mode

All args and kwargs are passed through. If in cluster mode, the path is changed such that it includes the name of the node.

Parameters:
  • *args – Passed on to parent method

  • path (str, optional) – The path to save to

  • cluster_mode_path (str, optional) – The format string to use for the path in cluster mode. It is required to contain the format key {0:}, which retains the given path with its extension split off; the extension is available via ext (already including the dot). Additional format keys: node_name, job_id.

  • **kwargs – Passed on to parent method

utopya.stop_conditions module#

This module implements the StopCondition class, which is used by the WorkerManager to stop a worker process in certain situations.

In addition, it implements a set of basic stop condition functions and provides the stop_condition_function() decorator which is required to make them accessible by name.

utopya.stop_conditions.SIG_STOPCOND = 'SIGUSR1'#

Signal to use for stopping workers with fulfilled stop conditions

utopya.stop_conditions.STOP_CONDITION_FUNCS: Dict[str, Callable] = {'check_monitor_entry': <function check_monitor_entry>, 'timeout_wall': <function timeout_wall>}#

Registered stop condition functions are stored in this dictionary. These functions evaluate whether a certain stop condition is actually fulfilled.

To that end, a WorkerTask object is passed to these functions, the information in which can be used to determine whether the condition is fulfilled. The signature of these functions is: (task: WorkerTask, **kws) -> bool

utopya.stop_conditions._FAILED_MONITOR_ENTRY_CHECKS = []#

Keeps track of failed monitor entry checks in the check_monitor_entry() stop condition function in order to avoid repetitive warnings.

class utopya.stop_conditions.StopCondition(*, to_check: Optional[List[dict]] = None, name: Optional[str] = None, description: Optional[str] = None, enabled: bool = True, func: Optional[Union[Callable, str]] = None, **func_kwargs)[source]#

Bases: object

A StopCondition object holds information on the conditions in which a worker process should be stopped.

__init__(*, to_check: Optional[List[dict]] = None, name: Optional[str] = None, description: Optional[str] = None, enabled: bool = True, func: Optional[Union[Callable, str]] = None, **func_kwargs)[source]#

Create a new stop condition object.

Parameters:
  • to_check (List[dict], optional) – A list of dicts that hold the functions to call and the arguments to call them with. The only requirement for each dict is that the func key is available. All other keys are unpacked and passed as kwargs to the given function. The func key can be either a callable or a string corresponding to a name in the utopya.stopcond_funcs module.

  • name (str, optional) – The name of this stop condition

  • description (str, optional) – A short description of this stop condition

  • enabled (bool, optional) – Whether this stop condition should be checked; if False, it will be created but will always be unfulfilled when checked.

  • func (Union[Callable, str], optional) – (For the short syntax only!) If no to_check argument is given, a function can be given here that will be the only one that is checked. If this argument is a string, it is also resolved from the utopya stopcond_funcs module.

  • **func_kwargs – (For the short syntax) The kwargs that are passed to the single stop condition function

property fulfilled_for: Set[utopya.task.Task]#

The set of tasks this stop condition was fulfilled for

static _resolve_sc_funcs(to_check: List[dict], func: Union[str, Callable], func_kwargs: dict) List[tuple][source]#

Resolves the functions and kwargs that are to be checked.

The callable is either retrieved from the module-level stop condition functions registry or, if the given func is already a callable, that one will be used.

__str__() str[source]#

A string representation for this StopCondition, including the name and, if given, the description.

fulfilled(task: utopya.task.Task) bool[source]#

Checks if the stop condition is fulfilled for the given worker, using the information from the dict.

All given stop condition functions are evaluated; if all of them return True, this method will also return True.

Furthermore, if the stop condition is fulfilled, it is added to the task’s set of fulfilled stop conditions.

Parameters:

task (utopya.task.Task) – Task object that is to be checked

Returns:

If all stop condition functions returned true for the given worker and its current information

Return type:

bool

yaml_tag = '!stop-condition'#
classmethod to_yaml(representer, node)[source]#

Creates a yaml representation of the StopCondition object by storing the initialization kwargs as a yaml mapping.

Parameters:
  • representer (ruamel.yaml.representer) – The representer module

  • node (StopCondition) – The node, i.e. an instance of this class

Returns:

a yaml mapping that is able to recreate this object

classmethod from_yaml(constructor, node)[source]#

Creates a StopCondition object by unpacking the given mapping such that all stored arguments are available to __init__.

utopya.stop_conditions.stop_condition_function(f: Callable)[source]#

A decorator that registers the decorated callable in the module-level stop condition function registry. The callable’s __name__ attribute will be used as the key.

Parameters:

f (Callable) – A callable that is to be added to the function registry.

Raises:

AttributeError – If the name already exists in the registry
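
A minimal sketch of registering a custom stop condition function:

    from utopya.stop_conditions import (
        STOP_CONDITION_FUNCS,
        stop_condition_function,
    )
    from utopya.task import WorkerTask

    @stop_condition_function
    def never_stop(task: WorkerTask, **_) -> bool:
        """A custom stop condition that is never fulfilled."""
        return False

    # The callable is now registered under its __name__:
    assert "never_stop" in STOP_CONDITION_FUNCS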

utopya.stop_conditions.timeout_wall(task: utopya.task.WorkerTask, *, seconds: float) bool[source]#

Checks the wall timeout of the given worker

Parameters:
  • task (utopya.task.WorkerTask) – The WorkerTask object to check

  • seconds (float) – After how many seconds to trigger the wall timeout

Returns:

Whether the timeout is fulfilled

Return type:

bool

utopya.stop_conditions.check_monitor_entry(task: utopya.task.WorkerTask, *, entry_name: str, operator: str, value: float) bool[source]#

Checks if a monitor entry compares in a certain way to a given value

Parameters:
  • task (utopya.task.WorkerTask) – The WorkerTask object to check

  • entry_name (str) – The name of the monitor entry, leading to the value to the left-hand side of the operator

  • operator (str) – The binary operator to use

  • value (float) – The right-hand side value to compare to

Returns:

Result of op(entry, value)

Return type:

bool
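
Using the short syntax of StopCondition, this function can be selected by name; the entry name, operator, and threshold below are illustrative:

    from utopya.stop_conditions import StopCondition

    sc = StopCondition(
        name="density too high",
        func="check_monitor_entry",
        entry_name="density",
        operator=">=",
        value=0.9,
    )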

utopya.task module#

The Task class supplies a container for all information needed for a task.

The WorkerTask and ProcessTask classes specialize on tasks for the WorkerManager that work on subprocesses or multiprocessing processes.

utopya.task._ANSI_ESCAPE = re.compile('\\x1B(?:[@-Z\\\\-_]|\\[[0-?]*[ -/]*[@-~])')#

A regex pattern to remove ANSI escape characters, needed for stream saving

From: https://stackoverflow.com/a/14693789/1827608

utopya.task._follow(f: io.TextIOWrapper, delay: float = 0.05, should_stop: Callable = <function <lambda>>) Generator[str, None, None][source]#

Generator that follows the output written to the given stream object and yields each new line written to it. If no output is retrieved, there will be a delay to reduce processor load.

The should_stop argument may be a callable that will lead to breaking out of the waiting loop. If it is not given, the loop will only break if reading from the stream f is no longer possible, e.g. because it was closed.

utopya.task.enqueue_lines(*, queue: Queue, stream: TextIO, follow: bool = False, parse_func: Optional[Callable] = None) None[source]#

From the given text stream, read line-buffered lines and add them to the provided queue as 2-tuples, (line, parsed object).

This function is meant to be passed to an individual thread in which it can read individual lines separately from the main thread. Before exiting this function, the stream is closed.

Parameters:
  • queue (queue.Queue) – The queue object to put the read line and parsed objects into.

  • stream (TextIO) – The stream identifier. If this is not a text stream, be aware that the elements added to the queue might need decoding.

  • follow (bool, optional) – Whether to use the _follow() function instead of iter(stream.readline). This should be selected if the stream is file-like instead of sys.stdout-like.

  • parse_func (Callable, optional) – A parse function that the read line is passed through. This should be a unary function that either returns a successfully parsed line or None.

utopya.task.parse_yaml_dict(line: str, *, start_str: str = '!!map') Union[None, dict][source]#

A yaml parse function that can be passed to enqueue_lines. It only tries parsing the line if it starts with the provided start string.

It tries to decode the line and parse it as YAML. If that fails, it will still try to decode the string. If that fails as well, the unchanged line will be returned.

Parameters:
  • line (str) – The line to decode, assumed byte-string, utf8-encoded

  • start_str (str, optional) – The string a line needs to start with in order for parsing to be attempted

Returns:

either the parsed dict or, if parsing failed, the decoded or unchanged line

Return type:

Union[None, dict]
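
A sketch of a custom parse function that could be passed to enqueue_lines(); it follows the unary interface described above, returning a parsed object or None (the MARKER prefix is hypothetical):

    import queue

    def parse_marker(line: str):
        """Parses lines carrying a hypothetical MARKER prefix."""
        if line.startswith("MARKER "):
            return {"payload": line[len("MARKER "):]}
        return None  # line not parseable

    q = queue.Queue()
    # enqueue_lines(queue=q, stream=some_stream, parse_func=parse_marker)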

class utopya.task.Task(*, name: Optional[str] = None, priority: Optional[float] = None, callbacks: Optional[Dict[str, Callable]] = None, progress_func: Optional[Callable] = None)[source]#

Bases: object

The Task is a container for a task handled by the WorkerManager.

It aims to provide the necessary interfaces for the WorkerManager to easily associate tasks with the corresponding workers and vice versa.

__init__(*, name: Optional[str] = None, priority: Optional[float] = None, callbacks: Optional[Dict[str, Callable]] = None, progress_func: Optional[Callable] = None)[source]#

Initialize a Task object.

Parameters:
  • name (str, optional) – The task’s name. If none is given, the generated uuid will be used.

  • priority (float, optional) – The priority of this task; if None, default is +np.inf, i.e. the lowest priority. If two priority values are the same, the task created earlier has a higher priority.

  • callbacks (Dict[str, Callable], optional) – A dict of callback funcs that are called at different points of the life of this task. The function gets passed as only argument this task object.

  • progress_func (Callable, optional) – Invoked by the progress property and used to calculate the progress given the current task object as argument

_name#
_priority#
_uid#
callbacks#
_progress_func#
_stop_conditions#
property name: str#

The task’s name, if given; else the uid.

property uid: int#

The task’s unique ID

property priority: float#

The task’s priority. Default is +inf, which is the lowest priority

property order_tuple: tuple#

Returns the ordering tuple (priority, uid.time)

property progress: float#

If a progress function is given, invokes it; otherwise returns 0

This also performs checks that the progress is in [0, 1]

property fulfilled_stop_conditions: Set[StopCondition]#

The set of fulfilled stop conditions for this task. Typically, this is set by the StopCondition itself as part of its evaluation in its fulfilled() method.

__eq__(other) bool[source]#

Evaluates equality of two tasks: returns true only if identical.

Note

We trust that the unique ID of each task (generated with uuid) is really unique, therefore different tasks can never be fully equivalent.

_invoke_callback(name: str)[source]#

If given, invokes the callback function with the name name.

Note

In order to have higher flexibility, this will not raise errors or warnings if there was no callback function specified with the given name.

class utopya.task.WorkerTask(*, setup_func: Optional[Callable] = None, setup_kwargs: Optional[dict] = None, worker_kwargs: Optional[dict] = None, **task_kwargs)[source]#

Bases: Task

A specialisation of Task for use in the WorkerManager.

It is able to spawn a worker process using subprocess.Popen, executing the task in a non-blocking manner. At the same time, the worker’s stream can be read in via another non-blocking thread and stream information can be parsed. Furthermore, this class provides most of the interface for signalling the spawned process.

For an equivalent class that uses multiprocessing instead of subprocess, see the derived MPProcessTask.

STREAM_PARSE_FUNCS = {'default': None, 'yaml_dict': <function parse_yaml_dict>}#
__init__(*, setup_func: Optional[Callable] = None, setup_kwargs: Optional[dict] = None, worker_kwargs: Optional[dict] = None, **task_kwargs)[source]#

Initialize a WorkerTask.

This is a specialization of Task for use in the WorkerManager.

Parameters:
  • setup_func (Callable, optional) – The setup function to use before this task is spawned; this allows dynamically handling the worker arguments. It is called with the worker_kwargs keyword argument, containing the dict passed here. Additionally, setup_kwargs are unpacked into the function call. The function should return a dict that is then used as worker_kwargs for the individual task.

  • setup_kwargs (dict, optional) – The keyword arguments unpacked into the setup_func call.

  • worker_kwargs (dict, optional) – The keyword arguments needed to spawn the worker. Note that these are also passed to setup_func and, if a setup_func is given, the return value of that function will be used for the worker_kwargs.

  • **task_kwargs – Arguments to be passed to __init__(), including the callbacks dictionary among other things.

Raises:

ValueError – If neither setup_func nor worker_kwargs were given, thus lacking information on how to spawn the worker.
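
A sketch of how a setup function and the worker arguments interact (the command and argument names are illustrative):

    from utopya.task import WorkerTask

    def setup(*, worker_kwargs: dict, greeting: str) -> dict:
        """Dynamically assembles the worker arguments."""
        worker_kwargs["args"] = ("echo", greeting)
        return worker_kwargs  # used as the task's worker_kwargs

    task = WorkerTask(
        name="greeter",
        setup_func=setup,
        setup_kwargs=dict(greeting="hello"),
        worker_kwargs=dict(read_stdout=True),
    )
    # task.spawn_worker() would first call `setup`, then spawn the process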

setup_func#
setup_kwargs#
worker_kwargs#
_worker#
_worker_pid#
_worker_status#
streams#
profiling#
property worker: Popen#

The associated worker process object or None, if not yet created.

property worker_pid: int#

The process ID of the associated worker process

property worker_status: Optional[int]#

The worker process’s current status, or False if no worker was spawned yet.

Note that this invokes a poll to the worker process if one was spawned.

Returns:

Current worker status. False, if there was no worker associated yet.

Return type:

Union[int, None]

property outstream_objs: list#

Returns the list of objects parsed from the ‘out’ stream

__str__() str[source]#

Return basic WorkerTask information.

spawn_worker() Popen[source]#

Spawn a worker process using subprocess.Popen and manage the corresponding queue and thread for reading the stdout stream.

If there is a setup_func, this function will be called first.

Afterwards, from the worker_kwargs returned by that function or from the ones given during initialisation (if no setup_func was given), the worker process is spawned and associated with this task.

Returns:

The created process object

Return type:

subprocess.Popen

Raises:
  • RuntimeError – If a worker was already spawned for this task.

  • TypeError – For invalid args argument

read_streams(stream_names: list = 'all', *, max_num_reads: int = 10, forward_directly: bool = False) None[source]#

Read the streams associated with this task’s worker.

Parameters:
  • stream_names (list, optional) – The list of stream names to read. If all (default), will read all streams.

  • max_num_reads (int, optional) –

    How many lines should be read from the buffer. For -1, reads the whole buffer.

    Warning

    Do not make this value too large as it could block the whole reader thread of this worker.

  • forward_directly (bool, optional) – Whether to call the forward_streams() method; this is done before the callback and can be useful if the callback should not happen before the streams are forwarded.

Returns:

None

save_streams(stream_names: list = 'all', *, final: bool = False)[source]#

For each stream, checks if it is to be saved, and if yes: saves it.

The saving location is stored in the streams dict. The relevant keys are the save flag and the save_path string.

Note that this function does not save the whole stream log, but only those part of the stream log that have not already been saved. The position up to which the stream was saved is stored under the lines_saved key in the stream dict.

Parameters:
  • stream_names (list, optional) – The list of stream names to check. If all (default), will check for all streams whether the save flag is set.

  • save_raw (bool, optional) – If True, stores the raw log; otherwise stores the regular log, i.e. excluding the lines that could be parsed.

  • final (bool, optional) – If True, this is regarded as the final save operation for the stream, which will lead to additional information being saved to the end of the log.

  • remove_ansi (bool, optional) – If True, will remove ANSI escape characters (e.g. from colored logging) from the log before saving to file.

Returns:

None

forward_streams(stream_names: list = 'all', forward_raw: bool = False) bool[source]#

Forwards the streams to stdout, either via the logging module or the print function

This function can be periodically called to forward the part of the stream logs that was not already forwarded to stdout.

The information for that is stored in the stream dict. The log_level entry is used to determine whether the logging module should be used or (in case of None) the print method.

Parameters:

stream_names (list, optional) – The list of streams to print

Returns:

whether there was any output

Return type:

bool

signal_worker(signal: str) tuple[source]#

Sends a signal to this WorkerTask’s worker.

Parameters:

signal (str) – The signal to send. Needs to be a valid signal name, i.e. one available in Python’s signal module.

Raises:

ValueError – When an invalid signal argument was given

Returns:

(signal: str, signum: int) sent to the worker

Return type:

tuple

_prepare_process_args(*, args: tuple, read_stdout: bool, **kwargs) Tuple[tuple, dict][source]#

Prepares the arguments that will be passed to subprocess.Popen

_spawn_process(args, **popen_kwargs)[source]#

This helper takes care only of spawning the actual process and potential error handling.

It can be subclassed to spawn a different kind of process

_spawn_worker(*, args: tuple, popen_kwargs: Optional[dict] = None, read_stdout: bool = True, **_) Popen[source]#

Helper function to spawn the worker subprocess

_setup_stream_reader(stream_name: str, *, stream, parser: str = 'default', follow: bool = False, save_streams: bool = False, save_streams_to: Optional[str] = None, save_raw: bool = True, remove_ansi: bool = False, forward_streams: bool = False, forward_raw: bool = True, streams_log_lvl: Optional[int] = None, **_)[source]#

Sets up the stream reader thread

_stop_stream_reader(name: str)[source]#

Stops the stream reader with the given name by closing the associated stream’s file handle.

_finished() None[source]#

Is called once the worker has finished working on this task.

It takes care that a profiling time is saved and that the remaining stream information is logged.

class utopya.task.NoWorkTask(*, setup_func: Optional[Callable] = None, setup_kwargs: Optional[dict] = None, worker_kwargs: Optional[dict] = None, **task_kwargs)[source]#

Bases: WorkerTask

A WorkerTask specialization that does not spawn the worker.

It is mostly equivalent to WorkerTask but adjusts the private methods that take care of spawning the actual process and skips the actual work.

property worker: Popen#

The associated worker process object or None, if not yet created.

property worker_pid: int#

The process ID of the associated worker process

property worker_status: Optional[int]#

The worker process’s current status, or False if no worker was spawned yet.

Note that the worker is inactive after it was spawned.

Returns:

Current worker status. False, if there was no worker associated yet.

Return type:

Union[int, None]

property outstream_objs: list#

Returns the list of objects parsed from the ‘out’ stream

spawn_worker() None[source]#

Spawn a void process.

Returns:

None

Raises:

RuntimeError – If a worker was already spawned for this task.

_setup_stream_reader(stream_name: str, **_)[source]#

Sets up the stream reader thread

signal_worker(signal: str) tuple[source]#

Overwrites signal_worker from WorkerTask

Raises:

RuntimeError – It is not possible to signal a NoWorkTask.

setup_func#
setup_kwargs#
worker_kwargs#
_worker#
_worker_pid#
_worker_status#
streams#
profiling#
utopya.task._target_wrapper(target, streams: dict, *args, **kwargs)[source]#

A wrapper around the multiprocessing.Process target function which takes care of stream handling.

class utopya.task.PopenMPProcess(args: tuple, kwargs: dict = {}, stdin=None, stdout=None, stderr=None, bufsize: int = -1, encoding: str = 'utf8')[source]#

Bases: object

A wrapper around multiprocessing.Process that replicates (wide parts of) the interface of subprocess.Popen.

__init__(args: tuple, kwargs: dict = {}, stdin=None, stdout=None, stderr=None, bufsize: int = -1, encoding: str = 'utf8')[source]#

Creates a multiprocessing.Process and starts it.

The interface here is a subset of subprocess.Popen, making available those features that make sense for a multiprocessing.Process, mainly: stream reading.

Consequently, the interface differs quite a bit from that of multiprocessing.Process. The most important arguments of that interface are target, args, and kwargs, which can be set as follows:

  • target will be args[0]

  • args will be args[1:]

  • kwargs is an additional keyword argument that is not typically part of the subprocess.Popen interface.

Regarding the stream arguments, the following steps are done to attach custom pipes: If any argument is a subprocess.PIPE or another stream specifier that is not subprocess.DEVNULL, a new multiprocessing.Pipe and a reader thread will be established.

Warning

This will always use spawn as a start method for the process!

Parameters:
  • args (tuple) – The target callable (args[0]) and subsequent positional arguments.

  • kwargs (dict, optional) – Keyword arguments for the target.

  • stdin (None, optional) – The stdin stream

  • stdout (None, optional) – The stdout stream

  • stderr (None, optional) – The stderr stream

  • bufsize (int, optional) – The buffersize to use.

  • encoding (str, optional) – The encoding to use for the streams; should typically remain utf8, using other values is not encouraged!

_prepare_target_args(args: tuple, *, stdin, stdout, stderr) Tuple[Callable, tuple][source]#

Prepares the target callable and stream objects

__del__()[source]#

Custom destructor that closes the process and file descriptors

poll() Optional[int][source]#

Check if the child process has terminated. If it has, sets and returns the returncode attribute; otherwise returns None.

With the underlying process being a multiprocessing.Process, this method is equivalent to the returncode property.

wait(timeout=None)[source]#

Wait for the process to finish; blocking call.

This method is not yet implemented, but will be!

communicate(input=None, timeout=None)[source]#

Communicate with the process.

This method is not yet implemented! Not sure if it will be …

send_signal(signal: int)[source]#

Send a signal to the process. Only works for SIGKILL and SIGTERM.

terminate()[source]#

Sends SIGTERM to the process

kill()[source]#

Sends SIGKILL to the process

property args: tuple#

The args argument to this process. Note that the returned tuple includes the target callable as its first entry.

Note that these have already been passed to the process; changing them has no effect.

property kwargs#

Keyword arguments passed to the target callable.

Note that these have already been passed to the process; changing them has no effect.

property stdin#

The attached stdin stream

property stdout#

The attached stdout stream

property stderr#

The attached stderr stream

property pid#

Process ID of the child process

property returncode: Optional[int]#

The child return code, set by poll() and wait() (and indirectly by communicate()). A None value indicates that the process hasn’t terminated yet.

A negative value -N indicates that the child was terminated by signal N (POSIX only).

class utopya.task.MPProcessTask(*, setup_func: Optional[Callable] = None, setup_kwargs: Optional[dict] = None, worker_kwargs: Optional[dict] = None, **task_kwargs)[source]#

Bases: WorkerTask

A WorkerTask specialization that uses multiprocessing.Process instead of subprocess.Popen.

It is mostly equivalent to WorkerTask but adjusts the private methods that take care of spawning the actual process and setting up the stream readers, such that the particularities of the PopenMPProcess wrapper are accounted for.

_spawn_process(args, **popen_kwargs) PopenMPProcess[source]#

This helper takes care only of spawning the actual process and potential error handling. It returns a PopenMPProcess instance, which has the same interface as subprocess.Popen.

_setup_stream_reader(*args, **kwargs)[source]#

Sets up the stream reader with follow=True, such that the file-like streams that PopenMPProcess uses can be read properly.

_stop_stream_reader(name: str)[source]#

Stops the stream reader thread with the given name by telling its follow function to stop, thus ending iteration.

setup_func#
setup_kwargs#
worker_kwargs#
_worker#
_worker_pid#
_worker_status#
streams#
profiling#
class utopya.task.TaskList[source]#

Bases: object

The TaskList stores Task objects, ensuring that no task is contained twice, and can be locked to prevent adding further tasks.

__init__()[source]#

Initialize an empty TaskList.

__len__() int[source]#

The length of the TaskList.

__contains__(val: Task) bool[source]#

Checks if the given object is contained in this TaskList.

__getitem__(idx: int) Task[source]#

Returns the item at the given index in the TaskList.

__iter__()[source]#

Iterate over the TaskList

__eq__(other) bool[source]#

Tests for equality of the task list by forwarding to _l attribute

lock()[source]#

If called, the TaskList becomes locked and allows no further calls to the append method.

append(val: Task)[source]#

Append a Task object to this TaskList

Parameters:

val (Task) – The task to add

__add__(tasks: Sequence[Task])[source]#

Appends all the tasks in the given iterable to the task list
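
A brief usage sketch; the name argument to Task is illustrative, see utopya.task.Task for the actual initialization arguments:

    from utopya.task import Task, TaskList

    tasks = TaskList()
    t0 = Task(name="task0")
    tasks.append(t0)

    assert t0 in tasks          # __contains__
    assert len(tasks) == 1      # __len__
    assert tasks[0] is t0       # __getitem__

    # Once locked, no further tasks can be appended:
    tasks.lock()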

utopya.testtools module#

Tools that help testing models.

This mainly supplies the ModelTest class, which is a specialization of the Model for usage in tests.

class utopya.testtools.ModelTest(model_name: str, *, test_file: Optional[str] = None, use_tmpdir: bool = True, **kwargs)[source]#

Bases: Model

A class to use for testing Utopia models.

It attaches to a certain model and makes it easy to load config files with which tests should be carried out.

__init__(model_name: str, *, test_file: Optional[str] = None, use_tmpdir: bool = True, **kwargs)[source]#

Initialize the ModelTest class for the given model name.

This is basically the base class __init__, except that it sets the default value of use_tmpdir to True.

Parameters:
  • model_name (str) – Name of the model to test

  • test_file (str, optional) – The file this ModelTest is used in. If given, will look for config files relative to the folder this file is located in.

  • use_tmpdir (bool, optional) – Whether to use a temporary directory to write data to. The default value can be set here, but the flag can be overwritten in the create_mv and create_run_load methods. If False, the regular model output directory is used.

Raises:

ValueError – If the directory extracted from test_file is invalid
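
For example, in a test file; this sketch assumes a registered model named MyModel, a config file path relative to the test file, and that create_run_load returns the Multiverse and DataManager:

    from utopya.testtools import ModelTest

    # Attach to the (hypothetical) registered model "MyModel"; config
    # files are looked up relative to this test file's directory:
    mtc = ModelTest("MyModel", test_file=__file__)

    # Run the model and load its data, writing output to a temporary
    # directory (use_tmpdir=True is the default):
    mv, dm = mtc.create_run_load(from_cfg="cfg/some_test_cfg.yml")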

utopya.tools module#

Implements generally useful functions, partly by importing from dantro.tools

utopya.tools.load_selected_keys(src: dict, *, add_to: dict, keys: Sequence[Tuple[str, type, bool]], err_msg_prefix: Optional[str] = None, prohibit_unexpected: bool = True) None[source]#

Loads (only) selected keys from dict src into dict add_to.

Parameters:
  • src (dict) – The dict to load values from

  • add_to (dict) – The dict to load values into

  • keys (Sequence[Tuple[str, type, bool]]) – Which keys to load, given as sequence of (key, allowed types, [required=False]) tuples.

  • err_msg_prefix (str) – A description string, used in error messages

  • prohibit_unexpected (bool, optional) – Whether to raise on keys that were unexpected, i.e. not given in keys argument.

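A minimal sketch of how this can be used, based on the signature above:

    from utopya.tools import load_selected_keys

    src = dict(name="some_model", num_steps=100, extra="ignored")
    settings = dict()

    # Load `name` (required) and `num_steps` (optional, int) into
    # `settings`; with prohibit_unexpected=True, the `extra` key
    # would lead to an error.
    load_selected_keys(
        src,
        add_to=settings,
        keys=[("name", str, True), ("num_steps", int)],
        err_msg_prefix="Failed loading model settings:",
        prohibit_unexpected=False,
    )
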
utopya.tools.add_item(value, *, add_to: dict, key_path: Sequence[str], value_func: Optional[Callable] = None, is_valid: Optional[Callable] = None, ErrorMsg: Optional[Callable] = None) None[source]#

Adds the given value to the add_to dict, traversing the given key path. This operation happens in-place.

Parameters:
  • value – The value of what is to be stored. If this is a callable, the result of the call is stored.

  • add_to (dict) – The dict to add the entry to

  • key_path (Sequence[str]) – The path at which to add it

  • value_func (Callable, optional) – If given, calls it with value as argument and uses the return value to add to the dict

  • is_valid (Callable, optional) – Used to determine whether value is valid or not; should take a single positional argument and return a bool

  • ErrorMsg (Callable, optional) – A raisable object that prints an error message; gets passed value as positional argument.

Raises:

Exception – type depends on specified ErrorMsg callable
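
A minimal sketch, assuming that intermediate dicts are created while traversing the key path:

    from utopya.tools import add_item

    cfg = dict()
    add_item(
        "~/utopya_output",
        add_to=cfg,
        key_path=["paths", "out_dir"],
        is_valid=lambda v: isinstance(v, str),
        ErrorMsg=ValueError,  # raised with `value` as positional argument
    )
    # cfg is now {"paths": {"out_dir": "~/utopya_output"}}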

utopya.tools.pprint(obj: Any, **kwargs)[source]#

Prints a “pretty” string representation of the given object.

Parameters:
  • obj (Any) – The object to print

  • **kwargs – Passed to print

utopya.tools.pformat(obj: Any) str[source]#

Creates a “pretty” string representation of the given object.

This is achieved by creating a yaml representation.

Todo

Improve parsing of leaf-level mappings
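
For example:

    from utopya.tools import pformat, pprint

    cfg = dict(model=dict(num_steps=100, seed=42))
    pprint(cfg)       # prints the YAML-based pretty representation
    s = pformat(cfg)  # ... or retrieve it as a string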

utopya.tools.ensure_not_None(d: ~typing.Optional[~typing.Any], fallback: ~typing.Union[type, ~typing.Callable] = <class 'dict'>) Any[source]#

Returns d if it is not None, otherwise creates a new object by calling fallback without any arguments.
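
For example:

    from utopya.tools import ensure_not_None

    ensure_not_None(None)                 # -> {} (calls the dict fallback)
    ensure_not_None("foo")                # -> "foo"
    ensure_not_None(None, fallback=list)  # -> []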

utopya.tools.ensure_dict(d: Optional[dict]) dict[source]#
utopya.tools.parse_si_multiplier(s: str) int[source]#

Parses a string like 1.23M or -2.34 k into an integer.

Parses the SI multiplier and returns the appropriate integer, e.g. for use as the number of simulation steps. Supported multipliers are k, M, G and T; these need to be used as the suffix of the string.

Note

This is only intended to be used with integer values and does not support float values like 1u.


Parameters:

s (str) – A string representing an integer number, potentially including a supported SI multiplier as suffix.

Returns:

The parsed number of steps as integer. If the value has decimal places, integer rounding is applied.

Return type:

int

Raises:

ValueError – Upon string that does not match the expected pattern
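
For example:

    from utopya.tools import parse_si_multiplier

    parse_si_multiplier("1.23M")    # -> 1230000
    parse_si_multiplier("-2.34 k")  # -> -2340
    parse_si_multiplier("42")       # -> 42 (suffix is optional)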

utopya.tools.parse_num_steps(N: Union[str, int], *, raise_if_negative: bool = True) int[source]#

Given a string like 1.23M or an integer, prepares the num_steps argument for a single universe simulation.

For string arguments, uses parse_si_multiplier() for string parsing. If that fails, attempts to read it in float notation by calling int(float(N)).

Note

This function always applies integer rounding.

Parameters:
  • N (Union[str, int]) – The num_steps argument as a string or integer.

  • raise_if_negative (bool, optional) – Whether to raise an error if the value is negative.

Returns:

The parsed value for num_steps

Return type:

int

Raises:

ValueError – Result invalid, i.e. not parseable or of negative value.
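
For example:

    from utopya.tools import parse_num_steps

    parse_num_steps(1000)     # -> 1000 (integers pass through)
    parse_num_steps("1.5k")   # -> 1500 (via parse_si_multiplier)
    parse_num_steps("1e3")    # -> 1000 (fallback: int(float("1e3")))
    parse_num_steps("-10", raise_if_negative=False)  # -> -10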

utopya.workermanager module#

The WorkerManager is a central part of utopya in that it spawns and controls the tasks (WorkerTask) that are to be worked on.

utopya.workermanager.STOPCOND_EXIT_CODES: Sequence[int] = (-10, 10, 138)#

Exit codes of a WorkerTask that will be interpreted as stemming from a stop condition. These depend on the signal used for stop conditions (utopya.stop_conditions.SIG_STOPCOND). This sequence of possible exit codes takes into account that the sign may be switched (depending on whether a signed or unsigned integer convention is used) and that a convention may be in place by which a handled signal is turned into an exit code of 128 + abs(signum).
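
To illustrate the convention, assuming the stop condition signal has the integer value 10 (e.g. SIGUSR1 on Linux), which matches the codes above:

    signum = 10  # e.g. SIGUSR1 on Linux; an assumption for illustration

    # sign-switched and 128 + abs(signum) conventions:
    assert (-signum, signum, 128 + abs(signum)) == (-10, 10, 138)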

class utopya.workermanager.WorkerManager(num_workers: ~typing.Union[int, str] = 'auto', poll_delay: float = 0.05, spawn_rate: int = -1, lines_per_poll: int = 50, periodic_task_callback: ~typing.Optional[int] = None, QueueCls: type = <class 'queue.Queue'>, reporter: ~typing.Optional[~utopya.reporter.WorkerManagerReporter] = None, rf_spec: ~typing.Optional[~typing.Dict[str, ~typing.Union[str, ~typing.List[str]]]] = None, save_streams_on: ~typing.Sequence[str] = (), nonzero_exit_handling: str = 'ignore', interrupt_params: ~typing.Optional[dict] = None, cluster_mode: bool = False, resolved_cluster_params: ~typing.Optional[dict] = None)[source]#

Bases: object

The WorkerManager class orchestrates WorkerTask objects: setting them up, invoking them, tracking their progress, and starting new workers if previous workers finished.

Attrs:
rf_spec (dict): The report format specifications that are used throughout the WorkerManager. These are invoked at different points of its operation: while_working, after_work, after_abort, task_spawn, task_finished.

__init__(num_workers: ~typing.Union[int, str] = 'auto', poll_delay: float = 0.05, spawn_rate: int = -1, lines_per_poll: int = 50, periodic_task_callback: ~typing.Optional[int] = None, QueueCls: type = <class 'queue.Queue'>, reporter: ~typing.Optional[~utopya.reporter.WorkerManagerReporter] = None, rf_spec: ~typing.Optional[~typing.Dict[str, ~typing.Union[str, ~typing.List[str]]]] = None, save_streams_on: ~typing.Sequence[str] = (), nonzero_exit_handling: str = 'ignore', interrupt_params: ~typing.Optional[dict] = None, cluster_mode: bool = False, resolved_cluster_params: ~typing.Optional[dict] = None)[source]#

Initialize the worker manager.

Parameters:
  • num_workers (Union[int, str], optional) – The number of workers that can work in parallel. If ‘auto’ (default), uses os.cpu_count(). If below zero, deducts abs(num_workers) from the CPU count.

  • poll_delay (float, optional) – How long (in seconds) the delay between worker polls should be. For too small delays (<0.01), the CPU load will become significant.

  • spawn_rate (int, optional) – How many workers to spawn in each iteration of the working loop. If -1, new tasks will be assigned to all free workers.

  • lines_per_poll (int, optional) – How many lines to read from each stream during polling of the tasks. This value should not be too large, otherwise the polling is delayed by too much. By setting it to -1, all available lines are read.

  • periodic_task_callback (int, optional) – If given, an additional task callback will be invoked after every periodic_task_callback poll events.

  • QueueCls (type, optional) – Which class to use for the queue. Defaults to the FiFo queue.Queue.

  • reporter (WorkerManagerReporter, optional) – The reporter associated with this WorkerManager, reporting on the progress.

  • rf_spec (Dict[str, Union[str, List[str]]], optional) –

    The names of report formats that should be invoked at different points of the WorkerManager’s operation. Possible keys: before_working, while_working, after_work, after_abort, task_spawn, task_finished. All other keys are ignored.

    The values of the dict can be either strings or lists of strings, where the strings always refer to report formats registered with the WorkerManagerReporter. This argument updates the default report format specifications.

  • save_streams_on (Sequence[str], optional) – On which events to invoke save_streams() during work. Should be a sequence containing one or both of the keys on_monitor_update, periodic_callback.

  • nonzero_exit_handling (str, optional) – How to react if a WorkerTask exits with a non-zero exit code. For ‘ignore’, nothing happens. For ‘warn’, a warning is printed and the last 5 lines of the log are shown. For ‘raise’, the last 20 lines of the log are shown, all other tasks are terminated, and the WorkerManager exits with the same exit code as the WorkerTask. Note that ‘warn’ will not lead to any messages if the worker died by SIGTERM, which presumably originated from a fulfilled stop condition. Use ‘warn_all’ to also receive warnings in this case.

  • interrupt_params (dict, optional) –

    Parameters that determine how the WorkerManager behaves when receiving KeyboardInterrupts during working. Possible keys:

    • send_signal: Which signal to send to the workers. Can be SIGINT (default), SIGTERM, SIGKILL, or any valid signal as integer.

    • grace_period: How long (in seconds) to wait for the workers to gracefully shut down. After this period, the workers will be killed via SIGKILL. Default is 5s.

    • exit: Whether to sys.exit at the end of start_working. Default is True.

  • cluster_mode (bool, optional) – Whether tasks similar to those managed by this WorkerManager are, at the same time, being worked on by other WorkerManager instances. This is relevant because file output might be affected by whether another WorkerManager instance is currently working on the same output directory. In the future, this argument might also be used to communicate between nodes.

  • resolved_cluster_params (dict, optional) – The corresponding cluster parameters.

Raises:

ValueError – For a num_workers argument that is too negative, i.e. one that would leave no workers available
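
A minimal usage sketch; the task arguments shown here are illustrative, see utopya.task.WorkerTask for the actual set of valid arguments:

    from utopya.workermanager import WorkerManager

    wm = WorkerManager(num_workers="auto", nonzero_exit_handling="warn")

    # Enqueue a task that invokes a simple shell command:
    wm.add_task(name="example", worker_kwargs=dict(args=("echo", "hello")))

    # Block until all enqueued tasks have been worked on:
    wm.start_working(timeout=60)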

pending_exceptions: Queue = None#

A (FiFo) queue of Exception objects that will be handled by the WorkerManager during working. This is the interface that allows other threads that have access to the WorkerManager to add an exception and have it handled in the main thread.

rf_spec: dict = None#

The report format specifications that are used throughout the WorkerManager. These are invoked at different points of the operation:

  • before_working

  • while_working

  • after_work

  • after_abort

  • task_spawn

  • task_finished
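
For example, such a specification could look like this; the report format names are assumed to be registered with the associated WorkerManagerReporter:

    rf_spec = dict(
        while_working="progress_bar",
        task_finished=["progress_bar", "report_file"],
        after_work=["progress_bar", "report_file"],
    )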

times: dict = None#

Holds profiling information

property tasks: TaskList#

The list of all tasks.

property task_queue: Queue#

The task queue.

property task_count: int#

Returns the number of tasks that this manager ever took care of. Careful: This is NOT the current number of tasks in the queue!

property num_workers: int#

The number of workers that may work in parallel

property active_tasks: List[WorkerTask]#

The list of currently active tasks.

Note that this information might not be up-to-date; a process might quit just after the list has been updated.

property num_finished_tasks: int#

The number of finished tasks. Incremented whenever a task leaves the active_tasks list, regardless of its exit status.

property num_free_workers: int#

Returns the number of free workers.

property poll_delay: float#

Returns the delay between two polls

property spawn_rate: int#

Returns the spawn rate

property stop_conditions: Set[StopCondition]#

All stop conditions that were ever passed to start_working() during the lifetime of this WorkerManager.

property nonzero_exit_handling: str#

Behavior upon a worker exiting with a non-zero exit code.

  • with ignore, nothing happens

  • with warn, a warning is printed

  • with raise, the log is shown and the WorkerManager exits with the same exit code as the corresponding WorkerTask exited with.

property reporter: Optional[WorkerManagerReporter]#

The associated WorkerManagerReporter or None, if no reporter is set.

property cluster_mode: bool#

Returns whether the WorkerManager is in cluster mode

property resolved_cluster_params: dict#

Returns a copy of the cluster configuration with all parameters resolved (thus making some additional keys available on the top level). It is returned as a deep copy to avoid mutability issues.

add_task(*, TaskCls: type = <class 'utopya.task.WorkerTask'>, **task_kwargs) WorkerTask[source]#

Adds a task to the WorkerManager.

Parameters:
  • TaskCls (type, optional) – The WorkerTask-like type to use

  • **task_kwargs – All arguments needed for WorkerTask initialization. See utopya.task.WorkerTask for all valid arguments.

Returns:

The created WorkerTask object

Return type:

WorkerTask

start_working(*, timeout: Optional[float] = None, stop_conditions: Optional[Sequence[StopCondition]] = None, post_poll_func: Optional[Callable] = None) None[source]#

Upon call, all enqueued tasks will be worked on sequentially.

Parameters:
  • detach (bool, optional) – If False (default), the WorkerManager will block here, as it continuously polls the workers and distributes tasks.

  • timeout (float, optional) – If given, the number of seconds this work session is allowed to take. Workers will be aborted if this is exceeded. Note that this is not measured in CPU time, but in the host system’s wall time.

  • stop_conditions (Sequence[StopCondition], optional) – During the run these StopCondition objects will be checked

  • post_poll_func (Callable, optional) – If given, this is called after all workers have been polled. It can be used to perform custom actions during the polling loop.

_invoke_report(rf_spec_name: str, *args, **kwargs)[source]#

Helper function to invoke the reporter’s report function

_parse_timeout_args(*, timeout: Optional[float]) Optional[float][source]#

Parses timeout-related arguments

_parse_stop_conditions(stop_conditions: Optional[list]) Optional[List[StopCondition]][source]#

Prepare stop conditions, creating the corresponding objects if needed.

_grab_task() WorkerTask[source]#

Gets a task from the queue and initiates the spawning of its worker process.

Returns:

The WorkerTask grabbed from the queue.

Return type:

WorkerTask

Raises:

queue.Empty – If the task queue was empty

_assign_tasks(num_free_workers: int) int[source]#

Assigns tasks to (at most) the given number of free workers.

_poll_workers() None[source]#

Will poll all workers that are in the working list and remove them from that list if they are no longer alive.

_check_stop_conds(stop_conds: Sequence[StopCondition]) Set[WorkerTask][source]#

Checks the given stop conditions for the active tasks and compiles a list of tasks that need to be terminated.

Parameters:

stop_conds (Sequence[StopCondition]) – The stop conditions that are to be checked.

Returns:

The WorkerTasks whose workers need to be terminated

Return type:

Set[WorkerTask]

_invoke_periodic_callbacks()[source]#

Invokes the periodic callback function of each active task.

_signal_workers(tasks: Union[str, List[WorkerTask]], *, signal: Union[str, int]) None[source]#

Send signals to a list of WorkerTasks.

Parameters:
  • tasks (Union[str, List[WorkerTask]]) – strings ‘all’ or ‘active’ or a list of WorkerTasks to signal

  • signal (Union[str, int]) – The signal to send

_handle_pending_exceptions() None[source]#

This method handles the list of pending exceptions during working, starting from the one added first.

As the WorkerManager occupies the main thread, it is difficult for other threads to signal to the WorkerManager that an exception occurred. The pending_exceptions attribute allows such handling: child threads can simply add an exception object to it, and it is then handled in the main thread while the WorkerManager is working.

This method handles the following exception types in a specific manner:

  • WorkerTaskStopConditionFulfilled: never raising or logging

  • WorkerTaskNonZeroExit: raising or logging depending on the value of the nonzero_exit_handling property

Returns:

None

Raises:

Exception – The exception that was added first to the queue of pending exceptions

_handle_KeyboardInterrupt_or_TotalTimeout(exc: Exception)[source]#

utopya.yaml module#

Takes care of the YAML setup for Utopya.

In the module import order, this module needs to be downstream from all modules that implement objects that require a custom YAML representation.

utopya.yaml._parameter_shorthand_constructor(loader, node) Parameter[source]#

Constructs a Parameter object from a scalar YAML node using scalar_node_to_object().

The YAML tag is used as shorthand mode argument to the from_shorthand class method.