Project

Provides a project class that can run experiments.

class zntrack.project.Experiment(name: str, project: Project)

A DVC Experiment.

apply() -> None

Apply the experiment.

load() -> None

Load the nodes from this experiment.

class zntrack.project.Project(initialize: bool = True, remove_existing_graph: bool = False, automatic_node_names: bool = False, git_only_repo: bool = True, force: bool = False)

The ZnTrack Project class.

graph

The znflow graph of the project.

Type:

znflow.DiGraph

initialize

If True, initialize a git repository and a dvc repository.

Type:

bool, default = True

remove_existing_graph

If True, remove ‘dvc.yaml’, ‘zntrack.json’ and ‘params.yaml’ before writing new nodes.

Type:

bool, default = False

automatic_node_names

If True, automatically add a number to the node name if the name is already used in the graph.

Type:

bool, default = False
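
The renaming behaviour described for automatic_node_names can be pictured with a small standalone sketch. It does not use ZnTrack itself; the helper `unique_node_name` and the `<name>_<n>` suffix format are assumptions made for illustration, and the actual format ZnTrack produces may differ.

```python
def unique_node_name(name: str, used: set) -> str:
    """Return `name`, or `name` with a numeric suffix if it is already taken."""
    if name not in used:
        return name
    counter = 1
    # Hypothetical suffix format "<name>_<n>"; ZnTrack's real format may differ.
    while f"{name}_{counter}" in used:
        counter += 1
    return f"{name}_{counter}"


used_names = set()
resolved_names = []
for requested in ["Train", "Train", "Evaluate", "Train"]:
    resolved = unique_node_name(requested, used_names)
    used_names.add(resolved)
    resolved_names.append(resolved)

print(resolved_names)
```

With automatic_node_names left at its default of False, no such renaming is described.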

git_only_repo

The DVC graph relies on file outputs for connecting stages. By default, ZnTrack uses a ‘--metrics-no-cache’ file output for each stage. In contrast to ‘--outs-no-cache’, this keeps the DVC run cache available. If the project has a DVC remote available, ‘--outs’ can be used instead. This requires a DVC remote to be set up.

Type:

bool, default = True

force

If True, overwrite existing nodes.

Type:

bool, default = False

auto_remove(remove_empty_dirs=True)

Remove all nodes from ‘dvc.yaml’ that are not in the graph.
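
As a rough illustration of this pruning step, the sketch below works on a plain dictionary shaped like a parsed ‘dvc.yaml’ (a top-level "stages" mapping). The helper `prune_stages`, the example commands, and the assumption that each stage name equals a node name are all hypothetical; this is not ZnTrack's implementation.

```python
def prune_stages(dvc_yaml: dict, graph_node_names: set) -> list:
    """Delete stages that are not in the graph and return their names."""
    stages = dvc_yaml.get("stages", {})
    # Assumption: a stage is identified by the name of the node that wrote it.
    removed = [name for name in stages if name not in graph_node_names]
    for name in removed:
        del stages[name]
    return removed


# Hypothetical parsed 'dvc.yaml' content with one stale stage.
dvc_yaml = {
    "stages": {
        "Train": {"cmd": "python train.py"},
        "OldNode": {"cmd": "python old.py"},
    }
}
removed = prune_stages(dvc_yaml, graph_node_names={"Train"})
print(removed, sorted(dvc_yaml["stages"]))
```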

property branches

Get the branches in the project.

build(**kwargs) -> None

Build the project graph without running it.

create_branch(name: str) -> Branch

Create a branch in the project.

create_experiment(name: str = None, queue: bool = True) -> Experiment

Create a new experiment.

property experiments: dict[str, zntrack.project.zntrack_project.Experiment]

List all experiments.

get_nodes() -> dict[str, znflow.node.Node]

Get the nodes in the project.

group(*names: List[str])

Group nodes together.

Parameters:

names (list[str], optional) – The name(s) of the group. If no name is given, the group is named ‘GroupX’, where X is the number of groups + 1. If more than one name is given, the groups are nested as ‘nwd = name[0]/name[1]/…/name[-1]’.
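
The naming rule for groups can be sketched without ZnTrack as follows. The helper `resolve_group` and its `existing_groups` argument are hypothetical; the sketch only mirrors the two rules stated above (a ‘GroupX’ default and ‘/’-joined nesting) and says nothing about where ZnTrack places the resulting directory.

```python
def resolve_group(names: tuple, existing_groups: int) -> str:
    """Return the nested group path for the given names."""
    if not names:
        # Default name 'GroupX', where X is the number of groups + 1.
        names = (f"Group{existing_groups + 1}",)
    # Several names nest as name[0]/name[1]/.../name[-1].
    return "/".join(names)


print(resolve_group((), existing_groups=0))
print(resolve_group(("preprocessing",), existing_groups=1))
print(resolve_group(("models", "baseline"), existing_groups=2))
```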

load()

Load all nodes in the project.

property nodes: dict[str, znflow.node.Node]

Get the nodes in the project.

remove(name)

Remove all nodes with the given name from the project.

repro() -> None

Run dvc repro.

run(eager=False, repro: bool = True, optional: dict = None, save: bool = True, environment: dict = None, nodes: list = None, auto_remove: bool = False)

Run the Project Graph.

Parameters:
  • eager (bool, default = False) – If True, run the nodes in eager mode. If False, run the nodes using DVC.

  • save (bool, default = True) – With ‘eager=True’, save the results to disk. If False, the results are only kept in memory.

  • repro (bool, default = True) – If True, run ‘dvc repro’ after running the nodes.

  • optional (dict, default = None) – A dictionary of optional arguments for each node. Use {node_name: {arg_name: arg_value}} to pass arguments to nodes. A possible argument is, for example, ‘always_changed: True’.

  • environment (dict, default = None) – A dictionary of environment variables for all nodes.

  • nodes (list, default = None) – A list of node names to run. If None, run all nodes.

  • auto_remove (bool, default = False) – If True, remove all nodes from ‘dvc.yaml’ that are not in the graph. This is the same as calling ‘project.auto_remove()’.
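
The shapes of the ‘optional’, ‘environment’ and ‘nodes’ arguments can be illustrated with plain Python data. The node names, the environment variable, and the helpers `options_for` and `select_nodes` below are hypothetical and only mimic the lookup rules described in the parameter list; they are not part of ZnTrack.

```python
from typing import Optional

# {node_name: {arg_name: arg_value}}, as described for 'optional'.
optional = {"Train": {"always_changed": True}}
# Environment variables applied to all nodes (example variable is an assumption).
environment = {"OMP_NUM_THREADS": "1"}
# Names of the nodes to run; None would mean "run all nodes".
nodes = ["Train"]


def options_for(node_name: str, optional: Optional[dict]) -> dict:
    """Look up the per-node optional arguments, defaulting to none."""
    return (optional or {}).get(node_name, {})


def select_nodes(all_names: list, nodes: Optional[list]) -> list:
    """Keep every node if `nodes` is None, otherwise only the listed ones."""
    if nodes is None:
        return list(all_names)
    return [name for name in all_names if name in nodes]


print(options_for("Train", optional))
print(options_for("Evaluate", optional))
print(select_nodes(["Train", "Evaluate"], nodes))
```

Given an existing project object, dictionaries of this shape would be passed as ‘project.run(optional=optional, environment=environment, nodes=nodes)’.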

run_exp(jobs: int = 1) -> None

Run all queued experiments.