Checkpoint feature
Implement a checkpoint feature that allows an interrupted computation to be continued.
Possible things to consider:
- a cluster may restart a job automatically -> detect an existing checkpoint automatically and resume from there
- but the same job can also be run multiple times, so a checkpoint has to be matched to the right run
- we need to include intermediate results and data in addition to the models/pipeline
- only some models/pipeline configurations allow checkpoints
Maybe we can reuse the backup_dir of #9 (closed).
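
A minimal sketch of how the points above could fit together, not a proposal for the final design. All names here (`save_checkpoint`, `load_latest_checkpoint`, `run`, `job_id`, `supports_checkpoints`) are hypothetical, `pickle` is only a stand-in for whatever serialization the models/pipeline need, and `checkpoint_dir` could be the backup_dir if its layout turns out to be suitable. Checkpoints are stored per `job_id` so that separate runs of the same job do not pick up each other's state.

```python
import os
import pickle
import tempfile
from pathlib import Path


def save_checkpoint(checkpoint_dir, job_id, step, state):
    """Atomically write the state reached after `step` for the run `job_id`."""
    job_dir = Path(checkpoint_dir) / job_id
    job_dir.mkdir(parents=True, exist_ok=True)
    # Zero-padded step numbers keep lexicographic and numeric order identical
    # (assumption: fewer than 1,000,000 steps).
    target = job_dir / f"step_{step:06d}.pkl"
    # Write to a temporary file first, then rename, so a job killed mid-write
    # never leaves a half-written checkpoint under the final name.
    fd, tmp_name = tempfile.mkstemp(dir=job_dir, suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        pickle.dump({"step": step, "state": state}, handle)
    os.replace(tmp_name, target)
    return target


def load_latest_checkpoint(checkpoint_dir, job_id):
    """Return (step, state) of the newest checkpoint, or None if there is none."""
    job_dir = Path(checkpoint_dir) / job_id
    if not job_dir.is_dir():
        return None
    candidates = sorted(job_dir.glob("step_*.pkl"))
    if not candidates:
        return None
    with open(candidates[-1], "rb") as handle:
        payload = pickle.load(handle)
    return payload["step"], payload["state"]


def run(checkpoint_dir, job_id, n_steps, supports_checkpoints=True):
    """Toy computation (sum of 0..n_steps-1) that resumes from a checkpoint."""
    start, total = 0, 0
    if supports_checkpoints:
        restored = load_latest_checkpoint(checkpoint_dir, job_id)
        if restored is not None:
            last_step, total = restored
            start = last_step + 1  # continue after the last completed step
    for step in range(start, n_steps):
        total += step  # the intermediate result that must survive a restart
        if supports_checkpoints:
            save_checkpoint(checkpoint_dir, job_id, step, total)
    return total
```

If the cluster restarts the job with the same `job_id`, `run` finds the newest checkpoint and continues from the following step. A deliberate re-run from scratch would use a new `job_id`. Configurations that cannot be checkpointed pass `supports_checkpoints=False` and always start at step 0.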