Project

Provides a project class that can run experiments.

class zntrack.project.Experiment(name: str, project: Project)

A DVC Experiment.

apply() → None: Apply the experiment.

load() → None: Load the nodes from this experiment.

class zntrack.project.Project(initialize: bool = True, remove_existing_graph: bool = False, automatic_node_names: bool = False, git_only_repo: bool = True, force: bool = False)

The ZnTrack Project class.

graph

The znflow graph of the project.

Type: znflow.DiGraph

initialize

If True, initialize a git repository and a dvc repository.

Type: bool, default = True

remove_existing_graph

If True, remove ‘dvc.yaml’, ‘zntrack.json’ and ‘params.yaml’ before writing new nodes.

Type: bool, default = False

automatic_node_names

If True, automatically add a number to the node name if the name is already used in the graph.

Type: bool, default = False
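The renaming behaviour described for automatic_node_names can be pictured with a small, library-independent sketch. The helper `unique_node_name` and the `_<n>` suffix format are assumptions made for illustration; the description above only states that a number is added, not how ZnTrack formats it.

```python
def unique_node_name(name: str, used: set) -> str:
    """Return `name`, or `name` with a numeric suffix if it is already taken.

    Hypothetical helper: the `_<n>` suffix format is an assumption and may
    differ from what ZnTrack actually produces.
    """
    if name not in used:
        return name
    counter = 1
    # Increase the counter until the suffixed name is free.
    while f"{name}_{counter}" in used:
        counter += 1
    return f"{name}_{counter}"


used_names = set()
assigned = []
for requested in ["Train", "Train", "Evaluate", "Train"]:
    final = unique_node_name(requested, used_names)
    used_names.add(final)
    assigned.append(final)

print(assigned)  # ['Train', 'Train_1', 'Evaluate', 'Train_2']
```

With automatic_node_names left at its default of False, no such renaming takes place.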

git_only_repo

The DVC graph relies on file outputs for connecting stages. By default, ZnTrack uses a ‘--metrics-no-cache’ file output for each stage. Unlike ‘--outs-no-cache’, this keeps the DVC run cache available. If the project has a DVC remote available, ‘--outs’ can be used instead; this requires a DVC remote to be set up.

Type: bool, default = True
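One reading of this description is that the flag selects between the two DVC output options. The sketch below encodes that reading only; `stage_output_flag` is a hypothetical helper, not a ZnTrack function, and the mapping of False to ‘--outs’ is inferred from the text rather than confirmed by it.

```python
def stage_output_flag(git_only_repo: bool = True) -> str:
    """Return the DVC output option implied by `git_only_repo`.

    Hypothetical helper. Assumption: True keeps outputs out of the DVC
    cache (tracked by git only), False uses regular cached outputs, which
    in practice need a DVC remote to share.
    """
    return "--metrics-no-cache" if git_only_repo else "--outs"


print(stage_output_flag())       # --metrics-no-cache
print(stage_output_flag(False))  # --outs
```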

force

If True, overwrite existing nodes.

Type: bool, default = False

auto_remove(remove_empty_dirs=True): Remove all nodes from ‘dvc.yaml’ that are not in the graph.
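The pruning rule can be illustrated on plain dictionaries. This is a sketch of the idea only: `prune_stages`, the stage names and the commands are hypothetical, and the real method also operates on the files on disk (see remove_empty_dirs), which is not modelled here.

```python
def prune_stages(dvc_stages: dict, graph_nodes: set) -> dict:
    """Keep only the stages whose name is still present in the graph.

    Hypothetical helper that mimics the rule on an in-memory mapping.
    """
    return {name: stage for name, stage in dvc_stages.items() if name in graph_nodes}


# Placeholder content standing in for the 'stages' section of a dvc.yaml file.
stages = {
    "Prepare": {"cmd": "python prepare.py"},
    "Train": {"cmd": "python train.py"},
    "OldBaseline": {"cmd": "python baseline.py"},
}
graph_nodes = {"Prepare", "Train"}

pruned = prune_stages(stages, graph_nodes)
print(list(pruned))  # ['Prepare', 'Train']
```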

property branches: Get the branches in the project.

build(**kwargs) → None: Build the project graph without running it.

create_branch(name: str) → Branch: Create a branch in the project.

create_experiment(name: str = None, queue: bool = True) → Experiment: Create a new experiment.

property experiments: dict[str, zntrack.project.zntrack_project.Experiment]: List all experiments.

get_nodes() → dict[str, znflow.node.Node]: Get the nodes in the project.

group(*names: List[str])

Group nodes together.

Parameters: names (list[str], optional) – The name(s) of the group. If no name is given, the group is named ‘GroupX’, where X is the number of groups + 1. If more than one name is given, the groups are nested as ‘nwd = name[0]/name[1]/…/name[-1]’.
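The naming rule for groups can be sketched as follows. `group_nwd` and its `existing_groups` argument are hypothetical; the sketch reproduces only the path composition stated above and ignores any additional prefix the library may place in front of the node working directory.

```python
from pathlib import PurePosixPath


def group_nwd(*names: str, existing_groups: int = 0) -> PurePosixPath:
    """Compose the nested group path name[0]/name[1]/.../name[-1].

    Hypothetical helper. Without names, fall back to 'GroupX', where X is
    the number of existing groups + 1.
    """
    if not names:
        names = (f"Group{existing_groups + 1}",)
    return PurePosixPath(*names)


print(group_nwd("models", "baseline"))  # models/baseline
print(group_nwd(existing_groups=2))     # Group3
```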

load(): Load all nodes in the project.

property nodes: dict[str, znflow.node.Node]: Get the nodes in the project.

remove(name): Remove all nodes with the given name from the project.

repro() → None: Run dvc repro.

run(eager=False, repro: bool = True, optional: dict = None, save: bool = True, environment: dict = None, nodes: list = None, auto_remove: bool = False)

Run the project graph.

Parameters:

eager (bool, default = False) – If True, run the nodes in eager mode. If False, run the nodes using dvc.
save (bool, default = True) – When using ‘eager=True’, save the results to disk if True. Otherwise, the results are kept only in memory.
repro (bool, default = True) – If True, run dvc repro after running the nodes.
optional (dict, default = None) – A dictionary of optional arguments for each node. Use {node_name: {arg_name: arg_value}} to pass arguments to nodes. A possible entry is, e.g., ‘always_changed: True’.
environment (dict, default = None) – A dictionary of environment variables for all nodes.
nodes (list, default = None) – A list of node names to run. If None, run all nodes.
auto_remove (bool, default = False) – If True, remove all nodes from ‘dvc.yaml’ that are not in the graph. This is the same as calling ‘project.auto_remove()’.
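How the ‘nodes’ and ‘optional’ arguments combine can be shown with a small stand-alone sketch. `plan_run` and the node names are hypothetical and do not call ZnTrack; the sketch demonstrates only the {node_name: {arg_name: arg_value}} shape and the "None means all nodes" rule described above.

```python
def plan_run(all_nodes: list, nodes: list = None, optional: dict = None) -> dict:
    """Map each selected node name to its optional arguments.

    Hypothetical helper: `nodes=None` selects every node, and nodes without
    an entry in `optional` receive an empty argument dictionary.
    """
    selected = all_nodes if nodes is None else [n for n in all_nodes if n in nodes]
    optional = optional or {}
    return {name: dict(optional.get(name, {})) for name in selected}


plan = plan_run(
    all_nodes=["Prepare", "Train", "Evaluate"],
    nodes=["Train", "Evaluate"],
    optional={"Train": {"always_changed": True}},
)
print(plan)  # {'Train': {'always_changed': True}, 'Evaluate': {}}
```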

run_exp(jobs: int = 1) → None: Run all queued experiments.