core#
The ZnTrack Node class.
- class zntrack.core.node.Node(name: str)[source]#
A node in a ZnTrack workflow.
- name#
The name of the Node.
- Type:
str, default = cls.__name__
- state#
Information about the state of the Node.
- Type:
NodeStatus
- nwd#
The node working directory.
- Type:
pathlib.Path
- classmethod convert_notebook(nb_name: str = None)[source]#
Use jupyter_class_to_file to convert an .ipynb notebook to a .py file.
- Parameters:
nb_name (str) – Notebook name when not using config.nb_name (this is not recommended)
- classmethod from_rev(name=None, remote=None, rev=None, lazy: bool = None, results: bool = True) Node [source]#
Create a Node instance from an experiment.
- load(lazy: bool = None, results: bool = True) None [source]#
Load the node’s output from disk.
- lazy#
Whether to load the node lazily. If None, the value from the config is used.
- Type:
bool, default = None
- results#
Whether to load the results. If False, only the parameters are loaded.
- Type:
bool, default = True
- property nwd: Path#
Get the node working directory.
- save(parameter: bool = True, results: bool = True, meta_only: bool = False) None [source]#
Save the node’s output to disk.
- property state: NodeStatus#
Get the state of the node.
- class zntrack.core.node.NodeIdentifier(module: str, cls: str, name: str, remote: str, rev: str)[source]#
All information that uniquely identifies a node.
- class zntrack.core.node.NodeStatus(loaded: bool, results: NodeStatusResults, remote: str = None, rev: str = None)[source]#
The status of a node.
- loaded#
Whether the attributes of the Node are loaded from disk. This is False for a newly created Node, and also when some attributes could not be loaded.
- Type:
bool
- results#
The status of the node results, e.g. whether the computation was successful.
- Type:
NodeStatusResults
- remote#
Where the Node’s data comes from. This could be the current “workspace” or a “remote” location, such as a git repository.
- Type:
str, default = None
- rev#
The revision of the Node. This could be the current “HEAD” or a specific revision.
- Type:
str, default = None
- tmp_path#
The temporary path used for loading the data. This is only set within the context manager ‘use_tmp_path’. If neither ‘remote’ nor ‘rev’ is set, tmp_path will not be used.
- Type:
pathlib.Path, default = DISABLE_TMP_PATH|None
- property fs: _DVCFileSystem#
Get the file system of the Node.
- magic_patch() ContextManager [source]#
Patch the open function to use the Node’s file system.
Opening a relative path will use the Node’s file system. Opening an absolute path will use the local file system.
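The relative/absolute rule described above can be illustrated with a minimal standard-library sketch. It is not the ZnTrack implementation: `patch_open_sketch` and `root` are hypothetical names, and a local directory stands in for the Node’s file system, which in ZnTrack is a DVC file system.

```python
import builtins
import contextlib
import tempfile
from pathlib import Path
from unittest import mock


@contextlib.contextmanager
def patch_open_sketch(root: Path):
    """Hypothetical helper: redirect relative paths passed to open() to `root`."""
    original_open = builtins.open

    def patched_open(file, *args, **kwargs):
        # Relative paths are resolved against `root` (the stand-in file system);
        # absolute paths and file descriptors are passed through unchanged.
        if isinstance(file, (str, Path)) and not Path(file).is_absolute():
            file = root / file
        return original_open(file, *args, **kwargs)

    # mock.patch restores the built-in open when the block exits.
    with mock.patch("builtins.open", patched_open):
        yield


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data.txt").write_text("from the node")
    with patch_open_sketch(root):
        # "data.txt" is relative, so it is read from `root`.
        with open("data.txt") as handle:
            content = handle.read()

print(content)
```

Restoring the original `open` on exit keeps the redirection limited to the `with` block, which is the main reason to expose such a patch as a context manager.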
- tmp_path#
alias of DISABLE_TMP_PATH
- use_tmp_path(path: Path = None) ContextManager [source]#
Load the data for ‘*_path’ into a temporary directory.
If you cannot use ‘node.state.fs.open’, you can use this as an alternative. It loads the data into a temporary directory and deletes it afterwards. The respective paths ‘node.*_path’ are replaced automatically inside the context manager.
The temporary path is only used if either ‘remote’ or ‘rev’ is set. Otherwise, the data is loaded from the current directory.
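The copy, swap, and clean-up behaviour described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not ZnTrack code: `NodeSketch`, its `data_path` attribute, and `use_tmp_path_sketch` are hypothetical names, and a plain file copy stands in for fetching data from a remote or revision.

```python
import contextlib
import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class NodeSketch:
    """Hypothetical stand-in for a Node with a single '*_path' attribute."""

    data_path: Path


@contextlib.contextmanager
def use_tmp_path_sketch(node: NodeSketch):
    original = node.data_path
    with tempfile.TemporaryDirectory() as tmp:
        # Copy the data into a temporary directory and point the node at it.
        node.data_path = Path(tmp) / original.name
        shutil.copy2(original, node.data_path)
        try:
            yield node
        finally:
            # Restore the original path; the temporary directory is deleted
            # when the enclosing 'with' block exits.
            node.data_path = original


with tempfile.TemporaryDirectory() as storage:
    source = Path(storage) / "outs.txt"
    source.write_text("result")
    node = NodeSketch(data_path=source)
    with use_tmp_path_sketch(node):
        inside = node.data_path
        text = inside.read_text()
    restored = node.data_path == source
    removed = not inside.exists()

print(text, restored, removed)
```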
- zntrack.core.node.get_dvc_cmd(node: Node, git_only_repo: bool, quiet: bool = False, verbose: bool = False, force: bool = True, external: bool = False, always_changed: bool = False, desc: str = None) List[List[str]] [source]#
Get the ‘dvc stage add’ command to run the node.
The @nodify decorator.
- class zntrack.core.nodify.DVCRunOptions(no_commit: bool, external: bool, always_changed: bool, no_run_cache: bool, force: bool)[source]#
Collection of DVC run options.
- All attributes are documented under the dvc run method.
References
https://dvc.org/doc/command-reference/run#options.
- property dvc_args: list#
Get the activated options.
- Returns:
A list of strings for the subprocess call, e.g. ["--no-commit", "--external"].
- Return type:
list
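One way such a property could work is sketched below. `RunOptionsSketch` is a hypothetical stand-in that reuses the field names from the DVCRunOptions signature; mapping each enabled field to a flag by replacing "_" with "-" is an assumption, not a confirmed detail of the ZnTrack source.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class RunOptionsSketch:
    """Hypothetical stand-in for DVCRunOptions, not the ZnTrack class."""

    no_commit: bool
    external: bool
    always_changed: bool
    no_run_cache: bool
    force: bool

    @property
    def dvc_args(self) -> List[str]:
        # Keep only the options set to True and spell each as a CLI flag,
        # replacing "_" in the field name with "-".
        return [
            "--" + field.name.replace("_", "-")
            for field in fields(self)
            if getattr(self, field.name)
        ]


options = RunOptionsSketch(
    no_commit=True,
    external=True,
    always_changed=False,
    no_run_cache=False,
    force=False,
)
print(options.dvc_args)  # ['--no-commit', '--external']
```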
- class zntrack.core.nodify.NodeConfig(params: dotdict | dict = <factory>, outs: List[str | Path] | Dict[str, str | Path] | str | Path = None, outs_no_cache: List[str | Path] | Dict[str, str | Path] | str | Path = None, outs_persist: List[str | Path] | Dict[str, str | Path] | str | Path = None, outs_persist_no_cache: List[str | Path] | Dict[str, str | Path] | str | Path = None, metrics: List[str | Path] | Dict[str, str | Path] | str | Path = None, metrics_no_cache: List[str | Path] | Dict[str, str | Path] | str | Path = None, deps: List[str | Path] | Dict[str, str | Path] | str | Path = None, plots: List[str | Path] | Dict[str, str | Path] | str | Path = None, plots_no_cache: List[str | Path] | Dict[str, str | Path] | str | Path = None)[source]#
DataClass to contain the arguments passed by the user.
- All DVC attributes, with "_" in place of "-".
- zntrack.core.nodify.check_type(obj, types, allow_iterable=False, allow_none=False, allow_dict=False) bool [source]#
Check if the obj is of the given types.
This includes a recursive search through nested lists / dicts and fails if any of the values is not in types.
- Parameters:
obj – object to check
types – single class or tuple of classes to check against
allow_iterable – check list entries if a list is provided
allow_none – accept None even if not in types.
allow_dict – allow for {key: types}
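The recursive check described above could look roughly like the following sketch. `check_type_sketch` is a hypothetical re-implementation written from the parameter descriptions only; edge cases such as empty lists, tuples, or dict keys may be handled differently in ZnTrack.

```python
from pathlib import Path


def check_type_sketch(obj, types, allow_iterable=False, allow_none=False, allow_dict=False):
    """Hypothetical sketch: return True if obj, or every nested value, is in types."""
    if obj is None and allow_none:
        return True
    if isinstance(obj, (list, tuple)):
        # Lists are only accepted when allow_iterable is set; check each entry.
        if not allow_iterable:
            return False
        return all(
            check_type_sketch(item, types, allow_iterable, allow_none, allow_dict)
            for item in obj
        )
    if isinstance(obj, dict):
        # Dicts are only accepted when allow_dict is set; check each value.
        if not allow_dict:
            return False
        return all(
            check_type_sketch(value, types, allow_iterable, allow_none, allow_dict)
            for value in obj.values()
        )
    return isinstance(obj, types)


print(check_type_sketch("a.txt", str))
print(check_type_sketch(["a.txt", Path("b.txt")], (str, Path), allow_iterable=True))
print(check_type_sketch(["a.txt", 1], (str, Path), allow_iterable=True))
```

The three calls cover a plain value, a valid list, and a list that fails because one entry is an int.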
- zntrack.core.nodify.execute_function_call(func)[source]#
Run the function call.
- Load the parameters from Files.zntrack / Files.params.
- Deserialize them.
- Update the cfg: NodeConfig.
- Return func(cfg).
- Parameters:
func (callable) – decorated function
- Returns:
The return value of the decorated function (not used).
- zntrack.core.nodify.module_to_path(module: str, suffix='.py') Path [source]#
Convert module a.b.c to path(a/b/c).
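A conversion like this can be written in one line with pathlib. The sketch below is not the ZnTrack source: `module_to_path_sketch` is a hypothetical name, and appending `suffix` to the last path component is an assumption based on the signature.

```python
from pathlib import Path


def module_to_path_sketch(module: str, suffix: str = ".py") -> Path:
    """Hypothetical sketch: turn a dotted module name into a relative path."""
    # "a.b.c" -> Path("a", "b", "c"), then append the suffix to the last part.
    return Path(*module.split(".")).with_suffix(suffix)


print(module_to_path_sketch("a.b.c").as_posix())
print(module_to_path_sketch("src.my_module", suffix=".ipynb").as_posix())
```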
- zntrack.core.nodify.nodify(*, params: dict = None, outs: List[str | Path] | Dict[str, str | Path] | str | Path = None, outs_no_cache: List[str | Path] | Dict[str, str | Path] | str | Path = None, outs_persist: List[str | Path] | Dict[str, str | Path] | str | Path = None, outs_persist_no_cache: List[str | Path] | Dict[str, str | Path] | str | Path = None, metrics: List[str | Path] | Dict[str, str | Path] | str | Path = None, metrics_no_cache: List[str | Path] | Dict[str, str | Path] | str | Path = None, deps: List[str | Path] | Dict[str, str | Path] | str | Path = None, plots: List[str | Path] | Dict[str, str | Path] | str | Path = None, plots_no_cache: List[str | Path] | Dict[str, str | Path] | str | Path = None)[source]#
Wrapper function that converts a function into a DVC stage.
Special Parameters#
- params: dict
for the params.yaml file context
- **kwargs: str|Path|list
All other parameters correspond to dvc run options and can be a str / Path or a list of them.
- zntrack.core.nodify.prepare_dvc_script(node_name, dvc_run_option: DVCRunOptions, custom_args: list, nb_name, module, func_or_cls, call_args=None) list [source]#
Prepare the dvc cmd to be called by subprocess.
- Parameters:
node_name (str) – Name of the Node
dvc_run_option (DVCRunOptions) – dataclass to collect special DVC run options
custom_args (list[str]) – all the params / deps / … to be added to the script
nb_name (str|None) – Notebook name for jupyter support
module (str) – like “src.my_module”
func_or_cls (str) – The name of the Node class or function to be imported and run
call_args (str) – Additional str like “(run_func=True)” or “.load().run_and_save”
- Returns:
The list to be passed to the subprocess call.
- Return type:
list[str]
- zntrack.core.nodify.save_node_config_to_files(cfg: NodeConfig, node_name: str)[source]#
Save the values from cfg to zntrack.json / params.yaml.
- Parameters:
cfg (NodeConfig) – The NodeConfig object which should be serialized to zntrack.json / params.yaml
node_name (str) – The name of the node, usually func.__name__.