Auto-Scaling Guide
How Nimbus automatically scales server instances based on player demand — scale-up, scale-down, custom states, smart scheduling, and multi-node.
Nimbus includes an auto-scaling engine that dynamically adjusts the number of running instances per group based on player demand. This ensures you always have enough server capacity without wasting resources on idle instances.
How scaling works
The scaling engine runs a continuous loop:
- Every heartbeat interval (default: 10 seconds), ping all READY services via Server List Ping to update player counts
- For each dynamic group, count players on routable services (those with no custom state)
- Apply the scale-up and scale-down rules
Static groups are never auto-scaled.
Scale-up
The engine scales up when the fill rate exceeds the scale threshold:
fill_rate = total_players / (routable_count * players_per_instance)
if fill_rate > scale_threshold
AND routable_count < max_instances
→ start one new instance
Example: A BedWars group with players_per_instance = 16 and scale_threshold = 0.8:
| Routable servers | Players | Fill rate | Action |
|---|---|---|---|
| 2 | 20 | 62.5% | No action |
| 2 | 27 | 84.4% | Scale up (> 80%) |
| 3 | 27 | 56.3% | No action |
| 3 | 40 | 83.3% | Scale up (> 80%) |
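The rule and the table rows above can be reproduced with a small sketch. The class and method names are hypothetical and not part of Nimbus; only the formula and the field names come from this guide. The value max_instances = 10 is an assumption borrowed from the BedWars example configuration.

```java
// Hypothetical helper, not a Nimbus class: mirrors the scale-up rule above.
final class ScaleUpRule {

    // fill_rate = total_players / (routable_count * players_per_instance)
    static double fillRate(int totalPlayers, int routableCount, int playersPerInstance) {
        if (routableCount <= 0 || playersPerInstance <= 0) {
            // The guide does not say how zero routable capacity is handled,
            // so this sketch refuses to guess.
            throw new IllegalArgumentException("routableCount and playersPerInstance must be positive");
        }
        return (double) totalPlayers / (routableCount * playersPerInstance);
    }

    // True when one new instance should be started in this evaluation cycle.
    static boolean shouldScaleUp(int totalPlayers, int routableCount, int playersPerInstance,
                                 double scaleThreshold, int maxInstances) {
        return fillRate(totalPlayers, routableCount, playersPerInstance) > scaleThreshold
                && routableCount < maxInstances;
    }

    public static void main(String[] args) {
        // Rows from the BedWars table: {routable servers, players}
        int[][] rows = { {2, 20}, {2, 27}, {3, 27}, {3, 40} };
        for (int[] row : rows) {
            double rate = fillRate(row[1], row[0], 16);
            boolean up = shouldScaleUp(row[1], row[0], 16, 0.8, 10);
            System.out.printf("%d servers, %d players: %.1f%% -> %s%n",
                    row[0], row[1], rate * 100, up ? "scale up" : "no action");
        }
    }
}
```

The comparison is strict, so a group sitting exactly at the threshold does not scale.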
Cooldown
Only one instance is started per evaluation cycle. After a scale-up, the engine waits 30 seconds before scaling the same group again. After a scale-down, the cooldown is 120 seconds. This prevents thrashing when player counts fluctuate rapidly.
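A minimal sketch of the per-group cooldown described above. GroupCooldown is a hypothetical class, and it assumes a single shared timer blocks both scale-up and scale-down for the group; the guide does not state whether the real engine tracks the two directions separately.

```java
// Hypothetical per-group cooldown tracker; not a Nimbus class.
final class GroupCooldown {
    static final long SCALE_UP_COOLDOWN_MS = 30_000L;    // 30 seconds after a scale-up
    static final long SCALE_DOWN_COOLDOWN_MS = 120_000L; // 120 seconds after a scale-down

    // Assumption: one timestamp gates every scaling action for the group.
    private long blockedUntilMs = 0L;

    void recordScaleUp(long nowMs) {
        blockedUntilMs = nowMs + SCALE_UP_COOLDOWN_MS;
    }

    void recordScaleDown(long nowMs) {
        blockedUntilMs = nowMs + SCALE_DOWN_COOLDOWN_MS;
    }

    // True once the most recent cooldown has fully elapsed.
    boolean canScale(long nowMs) {
        return nowMs >= blockedUntilMs;
    }

    public static void main(String[] args) {
        GroupCooldown bedwars = new GroupCooldown();
        long now = System.currentTimeMillis();
        bedwars.recordScaleUp(now);
        System.out.println("10 s later: " + bedwars.canScale(now + 10_000));
        System.out.println("30 s later: " + bedwars.canScale(now + 30_000));
    }
}
```

Passing the clock value in as a parameter keeps the tracker deterministic and easy to test.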
Stress testing
During an active stress test, the scaling engine is completely paused to avoid reacting to simulated player counts. Scaling resumes automatically when the stress test ends.
Global service limit
The scaling engine respects the controller.max_services setting (default: 20) as a global hard cap across all groups. Even if a group's max_instances allows more, the engine will not start new instances once the total service count reaches this limit. This prevents resource exhaustion from misconfigured scaling rules.
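As a rough illustration, the global cap acts as a second gate next to the per-group limit. The class and parameter names below are hypothetical; the sketch only encodes the two conditions stated in this guide.

```java
// Hypothetical check combining the per-group and global limits; not a Nimbus class.
final class GlobalServiceCap {

    // groupRoutable / groupMaxInstances: the per-group bound used by the scale-up rule.
    // totalServices / controllerMaxServices: the controller.max_services hard cap.
    static boolean mayStart(int groupRoutable, int groupMaxInstances,
                            int totalServices, int controllerMaxServices) {
        boolean groupHasRoom = groupRoutable < groupMaxInstances;
        boolean controllerHasRoom = totalServices < controllerMaxServices;
        return groupHasRoom && controllerHasRoom;
    }

    public static void main(String[] args) {
        // The group allows 10 instances, but the controller already runs 20 services.
        System.out.println(mayStart(3, 10, 20, 20));
    }
}
```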
Scale-down
The engine scales down when a service has been empty for longer than the idle timeout:
if service.playerCount == 0
AND service has been empty for > idle_timeout seconds
AND routable_count > min_instances
→ stop that service
The idle timer starts when a service's player count drops to zero and resets if any player joins.
If idle_timeout is set to 0, scale-down is disabled -- empty instances will never be stopped automatically. This is the default, and appropriate for lobbies that should always be running.
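The idle timer and the scale-down conditions can be sketched as follows. IdleTracker is a hypothetical class, and the way player-count updates are delivered to it is an assumption; the three conditions and the meaning of idle_timeout = 0 are taken from this guide.

```java
// Hypothetical idle tracker for a single service; not a Nimbus class.
final class IdleTracker {
    // -1 means the service currently has players, so no timer is running.
    private long emptySinceMs = -1L;

    // Call on every player-count update (for example after each ping).
    void onPlayerCount(int playerCount, long nowMs) {
        if (playerCount > 0) {
            emptySinceMs = -1L;   // any player joining resets the timer
        } else if (emptySinceMs < 0) {
            emptySinceMs = nowMs; // timer starts when the count first drops to zero
        }
    }

    boolean shouldStop(long nowMs, int idleTimeoutSeconds, int routableCount, int minInstances) {
        if (idleTimeoutSeconds == 0) {
            return false;         // idle_timeout = 0 disables scale-down
        }
        if (emptySinceMs < 0) {
            return false;         // service is not empty
        }
        boolean idleLongEnough = nowMs - emptySinceMs > idleTimeoutSeconds * 1000L;
        return idleLongEnough && routableCount > minInstances;
    }

    public static void main(String[] args) {
        IdleTracker tracker = new IdleTracker();
        tracker.onPlayerCount(0, 0L);
        // 301 seconds empty, idle_timeout = 300, 2 routable services, min_instances = 1
        System.out.println(tracker.shouldStop(301_000L, 300, 2, 1));
    }
}
```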
Custom states and routing
Services can have a custom state set by game plugins via the SDK (e.g., "WAITING", "INGAME", "ENDING"). Services with a custom state are treated specially:
- Excluded from routable count -- they don't count toward capacity calculations
- Never scaled down -- even if empty, a service mid-game won't be stopped
- Not chosen for routing -- the proxy won't send new players to them
This is critical for game servers. Without custom states, a BedWars server mid-game would count toward capacity, and the scaling engine might not start new instances even though no server is actually accepting players.
// In your game plugin:
Nimbus.setState("INGAME"); // Server is no longer routable
// ... game plays out ...
Nimbus.clearState(); // Server is routable again
How custom states affect scaling
Consider a BedWars group with 4 instances:
| Service | Players | Custom State | Routable? |
|---|---|---|---|
| BedWars-1 | 16 | INGAME | No |
| BedWars-2 | 14 | INGAME | No |
| BedWars-3 | 8 | null | Yes |
| BedWars-4 | 0 | null | Yes |
The scaling engine sees: 2 routable servers, 8 total routable players. Fill rate = 8 / (2 * 16) = 25%. No scale-up needed.
If BedWars-3 and BedWars-4 both fill to 14 players, the routable services look like this:
| Service | Players | Custom State | Routable? |
|---|---|---|---|
| BedWars-3 | 14 | null | Yes |
| BedWars-4 | 14 | null | Yes |
Fill rate = 28 / (2 * 16) = 87.5% > 80% threshold. A new instance (BedWars-5) starts.
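The two scenarios above can be checked with a short sketch that filters out services carrying a custom state before computing the fill rate. The Service type and the method names are hypothetical stand-ins, not Nimbus SDK types.

```java
// Hypothetical model of the routable-only fill rate; not a Nimbus class.
final class RoutableFillRate {

    static final class Service {
        final String name;
        final int players;
        final String customState; // null means no custom state, i.e. routable

        Service(String name, int players, String customState) {
            this.name = name;
            this.players = players;
            this.customState = customState;
        }

        boolean isRoutable() {
            return customState == null;
        }
    }

    // Counts only routable services and their players, as described above.
    static double fillRate(Service[] services, int playersPerInstance) {
        int routableCount = 0;
        int routablePlayers = 0;
        for (Service service : services) {
            if (service.isRoutable()) {
                routableCount++;
                routablePlayers += service.players;
            }
        }
        if (routableCount == 0) {
            // The guide does not describe this case, so the sketch does not model it.
            throw new IllegalStateException("no routable services");
        }
        return (double) routablePlayers / (routableCount * playersPerInstance);
    }

    public static void main(String[] args) {
        Service[] group = {
            new Service("BedWars-1", 16, "INGAME"),
            new Service("BedWars-2", 14, "INGAME"),
            new Service("BedWars-3", 8, null),
            new Service("BedWars-4", 0, null),
        };
        System.out.println(fillRate(group, 16));
    }
}
```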
Configuration
Scaling is configured per group in the [group.scaling] section:
[group.scaling]
min_instances = 1 # Minimum instances (always running)
max_instances = 10 # Maximum instances (hard cap)
players_per_instance = 16 # Expected players per server
scale_threshold = 0.8 # Scale up at this fill rate (0.0 - 1.0)
idle_timeout = 300 # Seconds before stopping an empty server
| Field | Default | Description |
|---|---|---|
| min_instances | 1 | Floor -- instances are started on boot to meet this |
| max_instances | 4 | Ceiling -- scaling will never exceed this |
| players_per_instance | 40 | Used to calculate fill rate |
| scale_threshold | 0.8 | Fill rate that triggers scale-up (80%) |
| idle_timeout | 0 | Seconds empty before scale-down (0 = disabled) |
Practical examples
Lobby servers
Always-on, spread players evenly. Never scale down below 1.
[group]
name = "Lobby"
type = "STATIC"
[group.scaling]
min_instances = 1
max_instances = 4
players_per_instance = 100
scale_threshold = 0.8
idle_timeout = 0 # Never stop lobbies
BedWars (minigame)
Ephemeral servers that scale with demand. Stop empty servers after 5 minutes.
[group]
name = "BedWars"
type = "DYNAMIC"
[group.scaling]
min_instances = 1
max_instances = 10
players_per_instance = 16
scale_threshold = 0.8
idle_timeout = 300
[group.lifecycle]
stop_on_empty = false # Let idle_timeout handle it
restart_on_crash = true
max_restarts = 3
Survival server
Single persistent instance, no scaling needed.
[group]
name = "Survival"
type = "STATIC"
[group.scaling]
min_instances = 1
max_instances = 1
Event mode
Expecting a traffic spike? Temporarily increase capacity:
[group.scaling]
min_instances = 5
max_instances = 20
players_per_instance = 50
scale_threshold = 0.7 # Scale earlier for headroom
idle_timeout = 600
Or edit the group config file and reload:
reload
Manual scaling
Console commands
# Manually start an instance
start BedWars
# Manually stop an instance
stop BedWars-3
# View current scaling state
status
REST API
# Start a new instance
curl -X POST http://127.0.0.1:8080/api/services/BedWars/start \
-H "Authorization: Bearer <token>"
# Stop a specific instance
curl -X POST http://127.0.0.1:8080/api/services/BedWars-3/stop \
-H "Authorization: Bearer <token>"
Monitoring
The status command shows current scaling state for all groups:
status
This displays:
- Active instances per group
- Player counts per instance
- Custom states
- Scaling bounds (min/max)
Tuning tips
Threshold selection
| Threshold | Behavior | Best for |
|---|---|---|
| 0.6 | Scale early, lots of headroom | Competitive games where lag = lost players |
| 0.8 | Balanced (default) | Most use cases |
| 0.95 | Pack tightly, minimal waste | Large lobbies, budget-constrained |
Idle timeout
| Timeout | Behavior |
|---|---|
| 0 | Never auto-stop (good for lobbies) |
| 60 | Aggressive cleanup (1 minute) |
| 300 | Balanced (5 minutes, good for minigames) |
| 900 | Conservative (15 minutes, good for longer game modes) |
Players per instance
Set this to the expected peak capacity, not the absolute maximum. If your BedWars maps hold 16 players, set players_per_instance = 16. If your lobby can handle 200 but you want to keep it comfortable, set players_per_instance = 100.
The scale_threshold and players_per_instance settings work together. A threshold of 0.8 with 16 players per instance means scaling triggers once the average exceeds 12.8 players per routable server, which in practice means 13.
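To see where a given combination actually triggers, the following sketch searches for the smallest total player count whose fill rate exceeds the threshold. The helper is hypothetical; it reuses the fill-rate formula from the Scale-up section rather than any Nimbus API.

```java
// Hypothetical tuning helper; not a Nimbus class.
final class ScaleTrigger {

    // Smallest total player count for which fill_rate > scale_threshold,
    // or -1 if the threshold cannot be exceeded within the group's capacity.
    static int minPlayersToTrigger(int routableCount, int playersPerInstance, double scaleThreshold) {
        int capacity = routableCount * playersPerInstance;
        for (int players = 0; players <= capacity; players++) {
            if ((double) players / capacity > scaleThreshold) {
                return players;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        for (int servers = 1; servers <= 3; servers++) {
            System.out.println(servers + " routable server(s): scale-up at "
                    + minPlayersToTrigger(servers, 16, 0.8) + " players");
        }
    }
}
```

Using the same division and strict comparison as the scale-up rule keeps the result consistent with what the rule would decide, including at exact-threshold values.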
Multi-node scaling
By default, Nimbus runs all services on a single machine. For larger networks, you can enable cluster mode to distribute services across multiple machines with automatic placement, failover, and an optional TCP load balancer.
The scaling engine works the same way in multi-node — it just has more machines to place services on. Remote services report their player counts via heartbeats, so scaling decisions remain accurate. If no remote node is available, dynamic services fall back to running locally on the controller.
Static services are never distributed — they always run on the controller because they have persistent data (worlds, configs) in services/static/. Only dynamic and proxy services are placed on remote agent nodes.
See the dedicated Multi-Node & Load Balancer guide for full setup instructions, placement strategies, load balancer configuration, and failure handling.
Smart Scaling module
The built-in Smart Scaling module (nimbus-module-scaling) adds proactive, time-based scaling on top of the reactive scaling engine. While the core engine reacts to current player counts, the Smart Scaling module predicts demand and pre-starts servers.
Features
- Time schedules — Define rules like "evenings need 3 lobbies, weekends need 4"
- Predictive warmup — Analyzes player history (last 7 days, same weekday/hour) to pre-start servers before demand hits
- Lead time — Start servers X minutes before a schedule activates
- Player count history — Snapshots every 60s, stored in the database, auto-pruned after 90 days
The module only starts servers — it never stops them. Scale-down remains the responsibility of the core scaling engine's idle timeout.
Schedule configuration
Each group can have its own schedule in config/modules/scaling/<GroupName>.toml:
[schedule]
enabled = true
timezone = "Europe/Berlin"
[[schedule.rules]]
name = "evening-peak"
days = ["MON", "TUE", "WED", "THU", "FRI"]
from = "17:00"
to = "23:00"
min_instances = 3
[[schedule.rules]]
name = "weekend"
days = ["SAT", "SUN"]
from = "10:00"
to = "01:00"
min_instances = 4
[[schedule.rules]]
name = "night"
days = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]
from = "02:00"
to = "08:00"
min_instances = 1
max_instances = 2
[warmup]
enabled = true
lead_time_minutes = 10
Schedule rules override the group's min_instances during their active window. The max_instances in a rule is optional and caps scaling during that period. Overnight ranges (e.g. 22:00 to 02:00) are supported.
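One way to match a time against a rule window, including overnight ranges, is sketched below. This is an illustration and not the module's actual code: the class name is hypothetical, the window is assumed to include `from` and exclude `to`, and the sketch ignores the `days` list, since the guide does not say which day the after-midnight part of an overnight rule belongs to.

```java
// Hypothetical window matcher; not part of nimbus-module-scaling.
final class ScheduleWindow {

    // True when `now` lies in [from, to). If `to` is not after `from`,
    // the window is treated as wrapping past midnight.
    static boolean isActive(java.time.LocalTime from, java.time.LocalTime to,
                            java.time.LocalTime now) {
        if (from.isBefore(to)) {
            return !now.isBefore(from) && now.isBefore(to);
        }
        // Overnight: active from `from` until midnight, and from midnight until `to`.
        return !now.isBefore(from) || now.isBefore(to);
    }

    public static void main(String[] args) {
        java.time.LocalTime from = java.time.LocalTime.parse("10:00");
        java.time.LocalTime to = java.time.LocalTime.parse("01:00");
        // The "weekend" rule above runs from 10:00 to 01:00 the next morning.
        System.out.println(isActive(from, to, java.time.LocalTime.parse("00:30")));
        System.out.println(isActive(from, to, java.time.LocalTime.parse("02:00")));
    }
}
```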
Console commands
scaling status # Active schedules + recent decisions
scaling schedule list # All rules for all groups
scaling schedule info <group> # Rules for a specific group
scaling history <group> [hours] # Player count history (default 24h)
scaling predict <group> # Predictions for next 6 hours
scaling reload # Reload configs
REST API
GET /api/scaling/status # Overview + recent decisions
GET /api/scaling/schedules # All schedule rules
GET /api/scaling/schedules/{group} # Rules for a group
GET /api/scaling/history/{group} # Player history (?hours=24)
GET /api/scaling/predictions/{group} # Predictions (next 6h)
GET /api/scaling/decisions # Recent scaling decisions
How it works
Every 30 seconds, the module:
- Collects snapshots -- Records current player count + service count per group
- Evaluates schedules -- Matches current time against rules, checks if min_instances is met
- Evaluates predictions -- Compares predicted player load (historical average) against current capacity
- Starts servers -- If more services are needed, calls serviceManager.startService()
A 60-second cooldown per group prevents rapid-fire starts.
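The evaluation steps above might combine into a single "how many servers to start" figure along these lines. This is a guess at the shape of the logic and not the module's implementation: the class name, the rounding of predicted load up to whole instances, and taking the larger of the schedule and prediction targets are all assumptions.

```java
// Hypothetical decision helper; not part of nimbus-module-scaling.
final class SmartScalingDecision {

    // Number of additional instances to start; never negative, because the
    // module only starts servers and leaves stopping to the core engine.
    static int instancesToStart(int runningInstances, int scheduleMinInstances,
                                double predictedPlayers, int playersPerInstance) {
        // Assumption: predicted load is converted to whole instances, rounding up.
        int neededForPrediction = (int) Math.ceil(predictedPlayers / playersPerInstance);
        // Assumption: the stricter of the two targets wins.
        int target = Math.max(scheduleMinInstances, neededForPrediction);
        return Math.max(0, target - runningInstances);
    }

    public static void main(String[] args) {
        // 1 instance running, active rule wants 3, prediction of 20 players at 16 per instance
        System.out.println(instancesToStart(1, 3, 20.0, 16));
    }
}
```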
Startup order
Nimbus uses a phased startup to ensure the proxy is ready before any backend server starts:
- Phase 1 -- Start all proxy groups, wait for them to reach READY state (up to 120 seconds)
- Phase 2 -- Start all backend groups
The scaling engine is deliberately started after the phased boot completes, so it cannot race the startup order.
Next steps
- Server Groups -- Group configuration details
- SDK -- Setting custom states from game plugins
- API Reference -- REST API for manual control
- nimbus.toml Reference -- Load balancer and cluster configuration