Created by: wkcn
Hi there, thanks for your great work!
I found that the arguments in train_cmd are not quoted in metaseq/launcher/slurm.py.
As a result, dry-run prints a command with unquoted arguments, e.g. --adam-betas (0.9, 0.95), which cannot be pasted into a shell and run directly. The correct command should be --adam-betas '(0.9, 0.95)'.
Patch Description
Quote the arguments of train_cmd in the function gen_train_command of metaseq/launcher/slurm.py.
With this PR, the printed command changes from --adam-betas (0.9, 0.95) to --adam-betas '(0.9, 0.95)' : )
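For illustration only, here is a minimal sketch of the quoting idea using Python's standard shlex.quote. The helper name quote_train_cmd and the example argument list (including train.py) are hypothetical and are not taken from metaseq; the actual change in gen_train_command may look different.

```python
import shlex


def quote_train_cmd(train_cmd):
    """Return a copy of train_cmd with each argument shell-quoted.

    Hypothetical helper for illustration; not the metaseq implementation.
    """
    # shlex.quote leaves "safe" tokens such as --adam-betas untouched and
    # wraps anything containing spaces, parentheses, etc. in single quotes.
    return [shlex.quote(str(arg)) for arg in train_cmd]


# Assumed example argument list, not the real output of gen_train_command.
cmd = ["python", "train.py", "--adam-betas", "(0.9, 0.95)"]
print(" ".join(quote_train_cmd(cmd)))
```

Quoting each argument separately, rather than the joined string, keeps flags and values as distinct shell words, so the printed command can be copied into a shell as-is.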
Testing steps
Run the following command to print the dry-run message, then check that the printed command contains --adam-betas '(0.9, 0.95)'.
opt-baselines \
-n 1 -g 2 \
-p test_v0 \
--model-size 125m \
--azure \
--data ./dataset \
--checkpoints-dir "./outputs" \
--local \
--dry-run