# Introduction `srtctl` is a command-line tool for running distributed LLM inference benchmarks on SLURM clusters. It replaces complex shell scripts and 50+ CLI flags with clean, declarative YAML configuration files. ## Table of Contents * [Why srtctl?](#why-srtctl) * [How It Works](#how-it-works) * [Commands](#commands) * [Next Steps](#next-steps) ## Why srtctl? Running large language models across multiple GPUs and nodes requires orchestrating many moving parts: SLURM job scripts, container mounts, SGLang configuration, worker coordination, and benchmark execution. Traditionally, this meant maintaining brittle bash scripts with hardcoded parameters. `srtctl` solves this by providing: * **Declarative configuration** - Define your entire job in a single YAML file * **Validation** - Catch configuration errors before submitting to SLURM * **Reproducibility** - Every job saves its full configuration for later reference * **Parameter sweeps** - Run grid searches across configurations with a single command * **Profiling support** - Built-in torch/nsys profiling modes ## How It Works When you run `srtctl apply -f config.yaml`, the tool: 1. Validates your configuration against the schema 2. Resolves any aliases from your cluster config (`srtslurm.yaml`) 3. Generates a SLURM batch script and SGLang configuration files 4. Submits to SLURM Once allocated, workers launch inside containers, discover each other through ETCD and NATS, and begin serving. If you've configured a benchmark, it runs automatically against the serving endpoint and saves results to the log directory. ## Commands | Command | Description | | -------------------------------------------------- | --------------------------------------- | | `srtctl apply -f ` | Submit job(s) to SLURM | | `srtctl apply -f --setup-script