Monitoring
Checking Job Status
# List your running jobs
squeue -u $USER
# Detailed job info
scontrol show job <job_id>
# Cancel a job
scancel <job_id>
Log Directory
Log Structure
Key Files
log.out
benchmark.out
Worker Logs ({node}_prefill_w0.err, {node}_decode_w0.err)
config.yaml
Common Commands
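As a rough sketch of how the worker logs listed under Key Files might be inspected, the helper below prints the last few lines of every file matching the `{node}_prefill_w0.err` / `{node}_decode_w0.err` naming pattern. The function name `tail_worker_logs` and the log directory argument are assumptions made for illustration; pass whatever directory your job actually wrote its logs to.

```shell
# Hypothetical helper (not part of any tool): show the tail of each worker log.
# $1 = log directory (assumed: the directory containing the *_prefill_w*.err
#      and *_decode_w*.err files), $2 = number of lines to show (default 20).
tail_worker_logs() {
    log_dir="$1"
    lines="${2:-20}"
    # Prefill worker logs are visited first, then decode worker logs.
    for f in "$log_dir"/*_prefill_w*.err "$log_dir"/*_decode_w*.err; do
        # Skip patterns that matched nothing (the glob stays literal in that case).
        [ -f "$f" ] || continue
        echo "==> $f <=="
        tail -n "$lines" "$f"
    done
}
```

For example, `tail_worker_logs /path/to/log_dir 50` would print the last 50 lines of each worker log, where `/path/to/log_dir` is a placeholder. To follow a single file live, plain `tail -f` on that file is usually enough.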
Connecting to Running Jobs
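One common way to get a shell inside a running Slurm job is to launch an extra job step with `srun --jobid`. The sketch below wraps that call in a small function; the names `is_job_id` and `attach_to_job` are hypothetical, and the numeric-only check is a simplification (array job IDs such as `1234_5` would be rejected). It assumes a Slurm version that supports `--overlap` (20.11 or later).

```shell
# Return success only if the argument looks like a plain numeric Slurm job ID.
is_job_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical helper: open an interactive shell inside an already-running job.
attach_to_job() {
    if ! is_job_id "$1"; then
        echo "usage: attach_to_job <job_id>" >&2
        return 2
    fi
    # --overlap lets this step share resources with steps that are already running;
    # --pty attaches a pseudo-terminal so the shell is interactive.
    srun --jobid="$1" --overlap --pty bash
}
```

Calling `attach_to_job <job_id>` with an ID taken from `squeue -u $USER` should drop you into a shell on the job's first allocated node; adding `-w <node>` to the `srun` call is one way to target a specific node instead.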