SGLang Router
This page explains the sglang router mode for prefill-decode (PD) disaggregation, an alternative to the default Dynamo frontend architecture.
Overview
By default, srtctl uses Dynamo frontends to coordinate between prefill and decode workers. This requires NATS/ETCD infrastructure and the dynamo package.
SGLang Router is an alternative that uses sglang's native sglang_router for PD disaggregation.
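| Aspect | Dynamo Frontends (default) | SGLang Router |
| --- | --- | --- |
| Infrastructure | NATS + ETCD + dynamo | sglang_router only |
| Routing | Dynamo's coordination | sglang's native PD routing |
| Scaling | nginx + multiple frontends | nginx + multiple routers |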
Configuration
Enable sglang router in your recipe's backend section:
backend:
  use_sglang_router: true

That's it. Workers launch with sglang.launch_server instead of dynamo.sglang, and the router handles request distribution.
Architecture Modes
Single Router (enable_multiple_frontends: false)
The simplest mode, with one router on node 0 and no nginx:
Router directly on port 8000
Good for testing or small deployments
No load balancing overhead
Multiple Routers (enable_multiple_frontends: true, default)
nginx load-balances across multiple router instances:
nginx on node 0 listens on port 8000 (public)
Routers listen on port 30080 (internal)
nginx round-robins requests to routers
Routers are distributed across nodes using the same logic as Dynamo frontends
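The two modes differ mainly in which process owns port 8000. The following is a minimal sketch of that port layout and of round-robin selection, not srtctl's actual code. The port numbers come from the text above; the function names `port_layout` and `round_robin` and the upstream addresses are hypothetical.

```python
from itertools import cycle

PUBLIC_PORT = 8000            # from the text: the port clients connect to
INTERNAL_ROUTER_PORT = 30080  # from the text: router port when nginx is in front


def port_layout(enable_multiple_frontends: bool) -> dict:
    """Describe which process listens on which port in each mode (hypothetical helper)."""
    if enable_multiple_frontends:
        # nginx owns the public port; routers move to the internal port.
        return {
            "public_listener": "nginx",
            "public_port": PUBLIC_PORT,
            "router_port": INTERNAL_ROUTER_PORT,
        }
    # Single-router mode: the router itself serves the public port.
    return {
        "public_listener": "router",
        "public_port": PUBLIC_PORT,
        "router_port": PUBLIC_PORT,
    }


def round_robin(upstreams: list, n_requests: int) -> list:
    """Return the upstream picked for each request under plain round-robin."""
    picker = cycle(upstreams)
    return [next(picker) for _ in range(n_requests)]


if __name__ == "__main__":
    print(port_layout(True))
    # Hypothetical router addresses, for illustration only.
    print(round_robin(["node0:30080", "node1:30080"], 5))
```

With equal weights, nginx's default upstream balancing behaves like the rotation shown here; weighted setups would differ.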
How Router Distribution Works
The num_additional_frontends setting controls how many additional routers spawn beyond the first:
| Setting | Total routers | Distribution |
| --- | --- | --- |
| num_additional_frontends: 0 | 1 | Node 0 only |
| num_additional_frontends: 4 | 5 | Node 0 + 4 distributed |
| num_additional_frontends: 9 | 10 | Node 0 + 9 distributed (default) |
Routers are distributed across the available nodes using ceiling division.
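The exact placement code is not reproduced on this page, so the following is only one plausible reading of "ceiling division": cap the number of routers per node at ceil(total_routers / num_nodes) and fill nodes in order, starting at node 0. The function name `place_routers` and this specific placement rule are assumptions; the real srtctl logic may differ.

```python
import math


def place_routers(num_additional_frontends: int, num_nodes: int) -> list[int]:
    """Return the node index for each router (assumed placement rule, see lead-in)."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    if num_additional_frontends < 0:
        raise ValueError("num_additional_frontends must be non-negative")

    # The first router always exists, so the total is one more than the setting.
    total_routers = 1 + num_additional_frontends

    # Ceiling division: the most routers any single node has to host.
    per_node = math.ceil(total_routers / num_nodes)

    # Fill node 0 first, then node 1, and so on.
    return [i // per_node for i in range(total_routers)]


if __name__ == "__main__":
    # Default setting (9 additional routers) on a hypothetical 4-node job.
    print(place_routers(9, 4))
```

Under this rule the first router always lands on node 0, which matches the "Node 0 + N distributed" wording in the table above.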
Port Configuration
Bootstrap Port
The sglang router needs the disaggregation bootstrap port to connect to prefill workers. This must match the disaggregation-bootstrap-port in your sglang config.
The default bootstrap port is 30001 (matching most recipes). If you use a different port, ensure it's consistent across prefill and decode configs.
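A consistency check along these lines can catch a mismatch before a job is submitted. This is a minimal sketch that assumes the prefill and decode settings have already been loaded into plain dictionaries keyed by the sglang option name; the dictionary shape and the helper name `bootstrap_port` are hypothetical, and falling back to 30001 when the key is absent is an assumption based on the default stated above.

```python
DEFAULT_BOOTSTRAP_PORT = 30001  # default stated in the text
KEY = "disaggregation-bootstrap-port"


def bootstrap_port(prefill_cfg: dict, decode_cfg: dict) -> int:
    """Return the shared bootstrap port, or raise if the two configs disagree."""
    # Assumption: a missing key means the default port is in effect.
    prefill_port = prefill_cfg.get(KEY, DEFAULT_BOOTSTRAP_PORT)
    decode_port = decode_cfg.get(KEY, DEFAULT_BOOTSTRAP_PORT)
    if prefill_port != decode_port:
        raise ValueError(
            f"{KEY} mismatch: prefill={prefill_port}, decode={decode_port}"
        )
    return prefill_port


if __name__ == "__main__":
    print(bootstrap_port({KEY: 30001}, {}))
```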
Server Port
Workers listen on port 30000 by default. This is standard sglang behavior and doesn't need configuration.
Complete Example
Here's a full recipe using sglang router:
Troubleshooting
Port Conflicts
If you see bind() to 0.0.0.0:8000 failed (Address already in use):
This means nginx and a router are both trying to use port 8000
Ensure you're using the latest template (routers use port 30080 internally)
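To see whether something is already listening on a port before launching, a quick TCP probe is usually enough. The sketch below uses only the Python standard library; the helper name `port_in_use` is hypothetical, and it only detects listeners that accept connections on the probed address.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP listener accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Ports taken from the text: 8000 is public, 30080 is the internal router port.
    for port in (8000, 30080):
        print(port, "in use" if port_in_use(port) else "free")
```

Run it on node 0: if 8000 is reported as in use before nginx starts, another process (possibly a router from an older template) is holding the port.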
Router Not Connecting to Workers
Check that:
disaggregation-bootstrap-port matches in prefill/decode configs
Workers are fully started before the router tries to connect
Network connectivity between router and worker nodes
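The second and third checks can be combined into a simple readiness probe that retries a TCP connection until the worker port opens. This is a sketch using only the standard library, not part of srtctl; the helper name `wait_for_port` and the hostname in the usage example are hypothetical, and an open port does not guarantee the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 60.0,
                  interval_s: float = 1.0) -> bool:
    """Retry a TCP connect to host:port until it succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means the port is open and reachable.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


if __name__ == "__main__":
    # 30000 is the default worker port from the text; the hostname is made up.
    ready = wait_for_port("prefill-node-0", 30000, timeout_s=5.0)
    print("worker reachable" if ready else "worker not reachable")
```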
Benchmark Can't Reach Endpoint
The benchmark connects to http://<node0>:8000. Ensure:
nginx is running (if enable_multiple_frontends: true)
Router is running (if enable_multiple_frontends: false)
Port 8000 is accessible
Comparison with Dynamo
| Aspect | Dynamo Frontends | SGLang Router |
| --- | --- | --- |
| Startup | Slower (NATS/ETCD + dynamo install) | Faster (just sglang) |
| Complexity | More moving parts | Simpler |
| Maturity | Production-tested | Newer |
| Config | Via dynamo.sglang | Via sglang.launch_server |
| Scaling | Same nginx approach | Same nginx approach |
Both modes support the same enable_multiple_frontends and num_additional_frontends settings for horizontal scaling.