Project
Providing a project class that can run experiments.
- class zntrack.project.Project(initialize: bool = True, remove_existing_graph: bool = False, automatic_node_names: bool = False, git_only_repo: bool = True, force: bool = False)
The ZnTrack Project class.
- graph
The znflow graph of the project.
- Type:
znflow.DiGraph
- initialize
If True, initialize a git repository and a dvc repository.
- Type:
bool, default = True
- remove_existing_graph
If True, remove ‘dvc.yaml’, ‘zntrack.json’ and ‘params.yaml’ before writing new nodes.
- Type:
bool, default = False
- automatic_node_names
If True, automatically add a number to the node name if the name is already used in the graph.
- Type:
bool, default = False
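The renaming rule described for automatic_node_names can be pictured with a small standalone sketch. It does not use ZnTrack: the helper `unique_node_name` and the `_<n>` suffix format are assumptions made for illustration, and ZnTrack's actual naming scheme may differ.

```python
def unique_node_name(name: str, used: set[str]) -> str:
    """Return `name`, or `name` plus a numeric suffix if it is already taken.

    Hypothetical helper: the "_<n>" suffix format is an assumption,
    not ZnTrack's documented behaviour.
    """
    if name not in used:
        return name
    counter = 1
    # Increase the counter until the suffixed name is free.
    while f"{name}_{counter}" in used:
        counter += 1
    return f"{name}_{counter}"


used_names: set[str] = set()
assigned = []
for requested in ["Train", "Train", "Evaluate", "Train"]:
    final = unique_node_name(requested, used_names)
    used_names.add(final)
    assigned.append(final)

print(assigned)
```

With automatic_node_names left at its default of False, a repeated name is not renamed for you, so each node needs a distinct name.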
- git_only_repo
The DVC graph relies on file outputs for connecting stages. By default, ZnTrack uses a ‘--metrics-no-cache’ file output for each stage. Unlike ‘--outs-no-cache’, this keeps the DVC run cache available. If the project has a DVC remote, ‘--outs’ can be used instead; this requires the DVC remote to be set up.
- Type:
bool, default = True
- force
If True, overwrite existing nodes.
- Type:
bool, default = False
- auto_remove(remove_empty_dirs=True)
Remove all nodes from ‘dvc.yaml’ that are not in the graph.
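Conceptually, auto_remove keeps only the stages that still correspond to a node in the graph. The standalone sketch below shows that set logic on an already-parsed ‘dvc.yaml’ mapping. The helper `prune_stages`, the stage names, and the commands are hypothetical. This is not ZnTrack's implementation, and it ignores the `remove_empty_dirs` option.

```python
def prune_stages(dvc_yaml: dict, graph_node_names: set[str]) -> dict:
    """Return a copy of the parsed dvc.yaml with only stages present in the graph.

    Hypothetical helper for illustration; it works on a plain dict
    and does not touch any files.
    """
    stages = dvc_yaml.get("stages", {})
    kept = {name: spec for name, spec in stages.items() if name in graph_node_names}
    return {**dvc_yaml, "stages": kept}


# Hypothetical parsed content of a dvc.yaml file.
dvc_yaml = {
    "stages": {
        "Preprocess": {"cmd": "python preprocess.py"},
        "Train": {"cmd": "python train.py"},
        "OldStage": {"cmd": "python old.py"},
    }
}

pruned = prune_stages(dvc_yaml, {"Preprocess", "Train"})
removed = sorted(set(dvc_yaml["stages"]) - set(pruned["stages"]))
print(removed)
```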
- property branches
Get the branches in the project.
- create_experiment(name: str = None, queue: bool = True) → Experiment
Create a new experiment.
- property experiments: dict[str, zntrack.project.zntrack_project.Experiment]
List all experiments.
- group(*names: List[str])
Group nodes together.
- Parameters:
names (list[str], optional) – The name(s) of the group. If None, the group is named ‘GroupX’, where X is the number of groups + 1. If more than one name is given, the groups are nested as ‘nwd = name[0]/name[1]/…/name[-1]’.
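The naming rule for groups can be written out as a small standalone function. This sketches only the rule as stated in the parameter description. `group_path` and its `existing_groups` argument are hypothetical and not part of ZnTrack.

```python
def group_path(names: tuple[str, ...], existing_groups: int) -> str:
    """Return the group path for the given names.

    Hypothetical helper: with no names, fall back to 'GroupX', where X is
    the number of existing groups + 1; otherwise nest the names with '/'.
    """
    if not names:
        return f"Group{existing_groups + 1}"
    return "/".join(names)


default_group = group_path((), existing_groups=2)
nested_group = group_path(("models", "v1"), existing_groups=0)
print(default_group)
print(nested_group)
```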
- property nodes: dict[str, znflow.node.Node]
Get the nodes in the project.
- run(eager=False, repro: bool = True, optional: dict = None, save: bool = True, environment: dict = None, nodes: list = None, auto_remove: bool = False)
Run the Project Graph.
- Parameters:
eager (bool, default = False) – If True, run the nodes in eager mode. If False, run the nodes using dvc.
save (bool, default = True) – Only relevant with ‘eager=True’: if True, save the results to disk; otherwise, the results will only be in memory.
repro (bool, default = True) – If True, run dvc repro after running the nodes.
optional (dict, default = None) – A dictionary of optional arguments for each node. Use {node_name: {arg_name: arg_value}} to pass arguments to nodes. Possible arg_names are e.g. ‘always_changed: True’
environment (dict, default = None) – A dictionary of environment variables for all nodes.
nodes (list, default = None) – A list of node names to run. If None, run all nodes.
auto_remove (bool, default = False) – If True, remove all nodes from ‘dvc.yaml’ that are not in the graph. This is the same as calling ‘project.auto_remove()’
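The `optional` and `environment` arguments are plain dictionaries. The sketch below assembles keyword arguments for `run` using only the parameters listed above. The node name ‘TrainModel’, the environment variable, and the helper `build_run_kwargs` are hypothetical. It does not import ZnTrack, so the final call appears only as a comment.

```python
def build_run_kwargs(node_names: list[str]) -> dict:
    """Assemble keyword arguments matching the documented `run` parameters.

    Hypothetical helper for illustration only.
    """
    # {node_name: {arg_name: arg_value}}, as described for `optional`.
    optional = {name: {"always_changed": True} for name in node_names}
    # Environment variables applied to all nodes (example value is an assumption).
    environment = {"OMP_NUM_THREADS": "1"}
    return {
        "eager": False,        # run via dvc rather than in eager mode
        "repro": True,         # run dvc repro afterwards
        "optional": optional,
        "environment": environment,
        "nodes": node_names,   # restrict the run to these nodes
        "auto_remove": False,
    }


run_kwargs = build_run_kwargs(["TrainModel"])
print(run_kwargs["optional"])

# With an existing zntrack.project.Project instance named `project`:
# project.run(**run_kwargs)
```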