
Add checkpoint resume: continue pipeline from last generated JSONL upon unexpected termination #149

@miaode74

System Info

Description
When running DataFlow pipelines (e.g., Reasoning Pipeline, Text Pipeline), an unexpected interruption (network failure, crash, etc.) currently forces a full rerun from step 0, even if intermediate JSONL outputs already exist. This wastes time and compute resources.

Expected Behavior

  1. On startup, scan the output directory for existing step_*.jsonl files and determine the highest completed step index.
  2. Introduce a --resume flag (or enable automatic resume) that skips all already finished steps and proceeds from the next one.
  3. Update the documentation with usage examples for this resume feature.
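
The scan described in item 1 could be sketched as follows. This is a minimal illustration, not DataFlow's actual code: the function name `next_step_to_run` is hypothetical, and the `step_<N>.jsonl` naming is taken from the usage example in this issue.

```python
import re
from pathlib import Path

# Matches file names such as step_0.jsonl, step_12.jsonl (naming assumed from this issue).
STEP_FILE = re.compile(r"^step_(\d+)\.jsonl$")


def next_step_to_run(output_dir):
    """Return the index of the first step that has no usable output yet.

    Hypothetical helper: a step counts as complete when a non-empty
    step_<N>.jsonl file exists for it in output_dir.
    """
    out = Path(output_dir)
    if not out.is_dir():
        return 0  # nothing has been written yet, start from step 0

    done = set()
    for path in out.iterdir():
        match = STEP_FILE.match(path.name)
        if match and path.is_file() and path.stat().st_size > 0:
            done.add(int(match.group(1)))

    # Walk up from 0 and stop at the first missing step, so a gap
    # (e.g. step_2.jsonl missing while step_3.jsonl exists) is not skipped.
    step = 0
    while step in done:
        step += 1
    return step
```

When the existing files are contiguous, this equals the highest completed index plus one. The sketch only ignores empty files; a file that was truncated mid-write by the crash would still be treated as complete, so a real implementation would probably need an additional completeness check.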

Usage Example

# Assuming output/step_0.jsonl … output/step_3.jsonl already exist
dataflow run reasoning-pipeline \
  --input data/questions.jsonl \
  --output output/ \
  --resume
# Should automatically resume from step_4
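
One way the proposed flag could be wired up is sketched below. Everything here is an assumption for illustration: `build_parser` and `plan_steps` are hypothetical names, and the real `dataflow` CLI may parse its arguments quite differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags in the usage example (hypothetical)."""
    parser = argparse.ArgumentParser(prog="dataflow-resume-sketch")
    parser.add_argument("--input", required=True, help="input JSONL file")
    parser.add_argument("--output", required=True, help="directory for step_*.jsonl files")
    parser.add_argument(
        "--resume",
        action="store_true",
        help="skip steps whose output already exists",
    )
    return parser


def plan_steps(total_steps, completed_steps, resume):
    """Return the step indices to execute.

    completed_steps is the number of leading steps that already have output;
    it is ignored unless resume is True.
    """
    start = completed_steps if resume else 0
    return list(range(start, total_steps))
```

With `step_0.jsonl` through `step_3.jsonl` present and a six-step pipeline, `plan_steps(6, 4, resume=True)` yields steps 4 and 5, while omitting `--resume` keeps today's behavior of rerunning everything from step 0.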

### Others

_No response_

Labels

enhancement (New feature or request)
