Metadata collection with ZnTrack

ZnTrack can collect metadata about a Node. One example is the execution time of a Node's run method, or of individual methods inside the Node. This is done with the zntrack.tools.timeit decorator, which takes the name of the metrics field that stores the timings, as shown in the following example.

[1]:
import zntrack
from time import sleep

zntrack.config.nb_name = "05_metadata.ipynb"
[3]:
!git init
!dvc init
Initialized empty Git repository in /tmp/tmp2n5usz7x/.git/
Initialized DVC repository.

You can now commit the changes to git.

+---------------------------------------------------------------------+
|                                                                     |
|        DVC has enabled anonymous aggregate usage analytics.         |
|     Read the analytics documentation (and how to opt-out) here:     |
|             <https://dvc.org/doc/user-guide/analytics>              |
|                                                                     |
+---------------------------------------------------------------------+

What's next?
------------
- Check out the documentation: <https://dvc.org/doc>
- Get help and share ideas: <https://dvc.org/chat>
- Star us on GitHub: <https://github.com/iterative/dvc>
[4]:
class SleepNode(zntrack.Node):
    metadata = zntrack.zn.metrics()

    @zntrack.tools.timeit("metadata")
    def run(self):
        self.sleep_1s()
        self.sleep_2s()

    @zntrack.tools.timeit("metadata")
    def sleep_1s(self):
        sleep(1)

    def sleep_2s(self):
        sleep(2)
[5]:
with zntrack.Project() as project:
    node = SleepNode()

project.run()
DeprecationWarning for write_graph: Building a graph is now done using 'with zntrack.Project() as project: ...' (Deprecated since 0.6.0)
Running DVC command: 'stage add --name SleepNode --force ...'
Creating 'dvc.yaml'
Adding stage 'SleepNode' in 'dvc.yaml'

To track the changes with git, run:

        git add dvc.yaml nodes/SleepNode/.gitignore

To enable auto staging, run:

        dvc config core.autostage true
Jupyter support is an experimental feature! Please save your notebook before running this command!
Submit issues to https://github.com/zincware/ZnTrack.
[NbConvertApp] Converting notebook 05_metadata.ipynb to script
/data/fzills/miniconda3/envs/zntrack/lib/python3.10/site-packages/nbformat/__init__.py:93: MissingIDFieldWarning: Code cell is missing an id field, this will become a hard error in future nbformat versions. You may want to use `normalize()` on your notebooks before validations (available since nbformat 5.1.4). Previous versions of nbformat are fixing this issue transparently, and will stop doing so in the future.
  validate(nb)
[NbConvertApp] Writing 1732 bytes to 05_metadata.py
Running DVC command: 'repro SleepNode'
Running stage 'SleepNode':
> zntrack run src.SleepNode.SleepNode --name SleepNode
Could not load field metadata for node SleepNode.
Generating lock file 'dvc.lock'
Updating lock file 'dvc.lock'

To track the changes with git, run:

        git add dvc.lock

To enable auto staging, run:

        dvc config core.autostage true
Use `dvc push` to send your updates to remote storage.
[6]:
!dvc metrics show
Path                           run     sleep_1s
nodes/SleepNode/metadata.json  3.1168  1.00106

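A single method can also be timed over multiple calls. In that case the individual timings are collected and reported together with their mean and standard deviation, as in the following example: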

[7]:
class SleepNodeMulti(zntrack.Node):
    metadata = zntrack.zn.metrics()

    @zntrack.tools.timeit("metadata")
    def run(self):
        self.sleep(1)
        self.sleep(2)

    @zntrack.tools.timeit("metadata")
    def sleep(self, seconds):
        sleep(seconds)
[8]:
with zntrack.Project() as project:
    node = SleepNodeMulti()

project.run()
DeprecationWarning for write_graph: Building a graph is now done using 'with zntrack.Project() as project: ...' (Deprecated since 0.6.0)
Running DVC command: 'stage add --name SleepNodeMulti --force ...'
Adding stage 'SleepNodeMulti' in 'dvc.yaml'

To track the changes with git, run:

        git add nodes/SleepNodeMulti/.gitignore dvc.yaml

To enable auto staging, run:

        dvc config core.autostage true
[NbConvertApp] Converting notebook 05_metadata.ipynb to script
/data/fzills/miniconda3/envs/zntrack/lib/python3.10/site-packages/nbformat/__init__.py:93: MissingIDFieldWarning: Code cell is missing an id field, this will become a hard error in future nbformat versions. You may want to use `normalize()` on your notebooks before validations (available since nbformat 5.1.4). Previous versions of nbformat are fixing this issue transparently, and will stop doing so in the future.
  validate(nb)
[NbConvertApp] Writing 1732 bytes to 05_metadata.py
Running DVC command: 'repro SleepNodeMulti'
Running stage 'SleepNodeMulti':
> zntrack run src.SleepNodeMulti.SleepNodeMulti --name SleepNodeMulti
Could not load field metadata for node SleepNodeMulti.
Updating lock file 'dvc.lock'

To track the changes with git, run:

        git add dvc.lock

To enable auto staging, run:

        dvc config core.autostage true
Use `dvc push` to send your updates to remote storage.
[9]:
!dvc metrics show
Path                                run     sleep.mean    sleep.std    sleep_1s
nodes/SleepNode/metadata.json       3.1168  -             -            1.00106
nodes/SleepNodeMulti/metadata.json  3.1003  1.50156       0.50054      -
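
To illustrate how a decorator can produce this kind of output, here is a minimal sketch in plain Python. It is not ZnTrack's implementation: `timeit_sketch` and `Sleeper` are hypothetical names, and the rule of storing a single float on the first call and switching to values, mean and std on repeated calls is an assumption modelled on the metrics shown above.

```python
import functools
import statistics
import time


def timeit_sketch(field):
    """Hypothetical timing decorator; an illustration, not ZnTrack's code."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            start = time.perf_counter()
            result = func(self, *args, **kwargs)
            elapsed = time.perf_counter() - start

            # Create the storage dict on the instance on first use.
            store = getattr(self, field, None)
            if store is None:
                store = {}
                setattr(self, field, store)

            previous = store.get(func.__name__)
            if previous is None:
                # First call: store the plain execution time.
                store[func.__name__] = elapsed
            else:
                # Repeated call: keep all values and aggregate them.
                if isinstance(previous, dict):
                    values = previous["values"] + [elapsed]
                else:
                    values = [previous, elapsed]
                store[func.__name__] = {
                    "values": values,
                    "mean": statistics.fmean(values),
                    "std": statistics.pstdev(values),
                }
            return result

        return wrapper

    return decorator


class Sleeper:
    """Plain class standing in for a Node (assumption for this sketch)."""

    metadata = None

    @timeit_sketch("metadata")
    def run(self):
        self.sleep(0.01)
        self.sleep(0.02)

    @timeit_sketch("metadata")
    def sleep(self, seconds):
        time.sleep(seconds)


sleeper = Sleeper()
sleeper.run()
print(sleeper.metadata)
```

The time recorded for `run` includes both `sleep` calls, since the timed intervals are nested. The same nesting explains why run is slightly larger than the sum of the individual sleep timings in the metrics above.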

The metrics can also be accessed directly from Python. This is possible because the timings are stored in an ordinary zn.metrics field, here the metadata attribute whose name is passed to the decorator.

[10]:
SleepNodeMulti.from_rev().metadata
[10]:
{'sleep': {'values': [1.0010173320770264, 2.0021069049835205],
  'mean': 1.5015621185302734,
  'std': 0.5005447864532471},
 'run': 3.1002955436706543}
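
As a quick check of how the aggregated entries relate to the raw values, the following sketch recomputes the mean and standard deviation from the two values printed above, using only the Python standard library. The numbers agree with the population standard deviation (`statistics.pstdev`); that ZnTrack computes it this way internally is an assumption inferred from the output, not something stated in this notebook.

```python
import statistics

# Raw timings copied from the 'sleep' entry of the metadata shown above.
values = [1.0010173320770264, 2.0021069049835205]

mean = statistics.fmean(values)
std = statistics.pstdev(values)  # population standard deviation

print(round(mean, 5), round(std, 5))  # 1.50156 0.50054
```

The rounded results match the sleep.mean and sleep.std columns reported by `dvc metrics show`.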