I added a resume_from_checkpoint field to TrainingArguments, and in the startup script I pass the checkpoint path to resume_from_checkpoint.
@dataclass
class TrainingArguments(transformers.TrainingArguments):
    cache_dir: Optional[str] = field(default=None)
    optim: str = field(default='adamw_torch')
    resume_from_checkpoint: Optional[str] = field(
        default=None,
        metadata={
            'help': 'Path to a checkpoint directory to resume training from (e.g., `output/checkpoint-1000/`)'
        }
    )
    max_length: int = field(
        default=4096,
        metadata={
            'help':
            'Maximum sequence length. Sequences will be right padded (and possibly truncated).'
        },
    )
    use_lora: bool = False
    fix_vit: bool = True
    fix_sampler: bool = False
    fix_llm: bool = True
    label_names: List[str] = field(default_factory=lambda: ['samples'])

However, ChartMoETrainer still starts training from scratch.
What settings do I need to change to resume training from a checkpoint?
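One possible cause, for anyone who hits the same thing: the Hugging Face documentation describes `TrainingArguments.resume_from_checkpoint` as a value that `Trainer` does not use directly. The training script is expected to read it and pass it to `trainer.train(resume_from_checkpoint=...)`. The sketch below is a minimal, standard-library-only illustration of that wiring. It assumes that ChartMoETrainer subclasses `transformers.Trainer` and that checkpoints are saved as `checkpoint-<step>` directories under `output_dir`. The helper names `find_latest_checkpoint` and `resolve_resume_path` are hypothetical and are not part of the ChartMoE code.

```python
import os
import re
from typing import Optional

# Checkpoint folders are assumed to be named "checkpoint-<step>".
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_latest_checkpoint(output_dir: str) -> Optional[str]:
    """Return the checkpoint-<step> subdirectory with the highest step, or None."""
    if not os.path.isdir(output_dir):
        return None
    best_step, best_path = -1, None
    for name in os.listdir(output_dir):
        match = _CHECKPOINT_RE.match(name)
        path = os.path.join(output_dir, name)
        # Skip anything that is not a directory named checkpoint-<step>.
        if match and os.path.isdir(path):
            step = int(match.group(1))
            if step > best_step:
                best_step, best_path = step, path
    return best_path


def resolve_resume_path(resume_from_checkpoint: Optional[str],
                        output_dir: str) -> Optional[str]:
    """Prefer the explicitly configured path, else the newest checkpoint on disk."""
    if resume_from_checkpoint:
        return resume_from_checkpoint
    return find_latest_checkpoint(output_dir)


# In the training script (assumed layout, not taken from the ChartMoE code):
#
#   checkpoint = resolve_resume_path(training_args.resume_from_checkpoint,
#                                    training_args.output_dir)
#   trainer.train(resume_from_checkpoint=checkpoint)
#
# If `checkpoint` is None, training starts from scratch as before.
```

If the training script currently calls `trainer.train()` with no arguments, that alone would explain why the checkpoint path is ignored, so it is worth checking that call first.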