Nimbus v1.0.0

Auto-Scaling Guide

How Nimbus automatically scales server instances based on player demand — scale-up, scale-down, custom states, smart scheduling, and multi-node.

Nimbus includes an auto-scaling engine that dynamically adjusts the number of running instances per group based on player demand. This ensures you always have enough server capacity without wasting resources on idle instances.

How scaling works

The scaling engine runs a continuous loop:

  1. Every heartbeat interval (default: 10 seconds), ping all READY services via Server List Ping to update player counts
  2. For each dynamic group, count players on routable services (those with no custom state)
  3. Apply the scale-up and scale-down rules

Static groups are never auto-scaled.

Scale-up

The engine scales up when the fill rate exceeds the scale threshold:

Scale-Up Formula
fill_rate = total_players / (routable_count * players_per_instance)

if fill_rate > scale_threshold
   AND routable_count < max_instances
→ start one new instance

Example: A BedWars group with players_per_instance = 16 and scale_threshold = 0.8:

| Routable servers | Players | Fill rate | Action |
|---|---|---|---|
| 2 | 20 | 62.5% | No action |
| 2 | 27 | 84.4% | Scale up (> 80%) |
| 3 | 27 | 56.3% | No action |
| 3 | 40 | 83.3% | Scale up (> 80%) |
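The formula and the example rows can be checked with a short sketch. The function names are illustrative and not part of Nimbus, and the sketch assumes at least one routable service, since the text does not say how the engine handles a group with none.

```python
def fill_rate(total_players: int, routable_count: int, players_per_instance: int) -> float:
    """Fraction of the routable capacity that is currently in use."""
    # Assumption: routable_count >= 1 (the zero case is not described in this guide).
    return total_players / (routable_count * players_per_instance)


def should_scale_up(total_players: int, routable_count: int, players_per_instance: int,
                    scale_threshold: float, max_instances: int) -> bool:
    """Apply both scale-up conditions from the formula above."""
    rate = fill_rate(total_players, routable_count, players_per_instance)
    return rate > scale_threshold and routable_count < max_instances


# The four rows of the BedWars example (players_per_instance = 16, threshold = 0.8).
for servers, players in [(2, 20), (2, 27), (3, 27), (3, 40)]:
    decision = "scale up" if should_scale_up(players, servers, 16, 0.8, 10) else "no action"
    print(f"{servers} routable servers, {players} players: {decision}")
```

Note that the comparison is strict: a group sitting exactly at the threshold does not scale.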

Cooldown

Only one instance is started per evaluation cycle. After a scale-up, the engine waits 30 seconds before scaling the same group again. After a scale-down, the cooldown is 120 seconds. This prevents thrashing when player counts fluctuate rapidly.
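One way to picture the cooldown is a per-group timestamp before which no further scaling action is allowed. This is a minimal sketch, not the engine's actual code: the class name is hypothetical, and it assumes a single shared cooldown per group whose length depends on the last action.

```python
SCALE_UP_COOLDOWN = 30.0     # seconds, from the text above
SCALE_DOWN_COOLDOWN = 120.0  # seconds, from the text above


class CooldownTracker:
    """Remembers, per group, the earliest time the next scaling action may run."""

    def __init__(self) -> None:
        self._blocked_until: dict[str, float] = {}  # group name -> timestamp (seconds)

    def can_scale(self, group: str, now: float) -> bool:
        # Assumption: the group may scale again exactly when the cooldown has elapsed.
        return now >= self._blocked_until.get(group, 0.0)

    def record_scale_up(self, group: str, now: float) -> None:
        self._blocked_until[group] = now + SCALE_UP_COOLDOWN

    def record_scale_down(self, group: str, now: float) -> None:
        self._blocked_until[group] = now + SCALE_DOWN_COOLDOWN


tracker = CooldownTracker()
tracker.record_scale_up("BedWars", now=100.0)
print(tracker.can_scale("BedWars", now=110.0))  # still cooling down
print(tracker.can_scale("Lobby", now=110.0))    # other groups are unaffected
```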

Stress testing

During an active stress test, the scaling engine is completely paused to avoid reacting to simulated player counts. Scaling resumes automatically when the stress test ends.

Global service limit

The scaling engine respects the controller.max_services setting (default: 20) as a global hard cap across all groups. Even if a group's max_instances allows more, the engine will not start new instances once the total service count reaches this limit. This prevents resource exhaustion from misconfigured scaling rules.
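The interaction between the per-group cap and the global cap can be sketched as a single check. The function name is hypothetical; only the two limits and the default of 20 come from this guide.

```python
def may_start_instance(routable_count: int, max_instances: int,
                       total_services: int, max_services: int = 20) -> bool:
    """True only if both the per-group cap and the global cap leave room."""
    return routable_count < max_instances and total_services < max_services


# A group allowed up to 10 instances is still blocked once 20 services run in total.
print(may_start_instance(routable_count=3, max_instances=10, total_services=20))
```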

Scale-down

The engine scales down when a service has been empty for longer than the idle timeout:

Scale-Down Formula
if service.playerCount == 0
   AND service has been empty for > idle_timeout seconds
   AND routable_count > min_instances
→ stop that service

The idle timer starts when a service's player count drops to zero and resets if any player joins.

If idle_timeout is set to 0, scale-down is disabled -- empty instances will never be stopped automatically. This is the default, and appropriate for lobbies that should always be running.
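The idle timer and the scale-down conditions can be sketched as follows. The `Service` class and its field names are illustrative assumptions, not the Nimbus data model; timestamps are plain seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Service:
    name: str
    player_count: int = 0
    empty_since: Optional[float] = None  # when the service last became empty

    def update_players(self, count: int, now: float) -> None:
        """Record a new player count and maintain the idle timer."""
        self.player_count = count
        if count == 0:
            if self.empty_since is None:
                self.empty_since = now  # timer starts when the count drops to zero
        else:
            self.empty_since = None     # any player resets the timer


def should_stop(service: Service, now: float, idle_timeout: int,
                routable_count: int, min_instances: int) -> bool:
    if idle_timeout == 0:
        return False  # scale-down disabled
    if service.player_count != 0 or service.empty_since is None:
        return False
    idle_for = now - service.empty_since
    return idle_for > idle_timeout and routable_count > min_instances
```

For example, a service that empties at t = 1000 with `idle_timeout = 300` becomes eligible for shutdown only after t = 1300, and only while the group has more routable services than `min_instances`.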

Custom states and routing

Services can have a custom state set by game plugins via the SDK (e.g., "WAITING", "INGAME", "ENDING"). Services with a custom state are treated specially:

  • Excluded from routable count -- they don't count toward capacity calculations
  • Never scaled down -- even if empty, a service mid-game won't be stopped
  • Not chosen for routing -- the proxy won't send new players to them

This is critical for game servers. Without custom states, a BedWars server mid-game would count toward capacity, and the scaling engine might not start new instances even though no server is actually accepting players.

Game Plugin — Custom State
// In your game plugin:
Nimbus.setState("INGAME");    // Server is no longer routable
// ... game plays out ...
Nimbus.clearState();          // Server is routable again

How custom states affect scaling

Consider a BedWars group with 4 instances:

| Service | Players | Custom State | Routable? |
|---|---|---|---|
| BedWars-1 | 16 | INGAME | No |
| BedWars-2 | 14 | INGAME | No |
| BedWars-3 | 8 | null | Yes |
| BedWars-4 | 0 | null | Yes |

The scaling engine sees: 2 routable servers, 8 total routable players. Fill rate = 8 / (2 * 16) = 25%. No scale-up needed.

If BedWars-3 and BedWars-4 both fill up to 14 players:

| Service | Players | Custom State | Routable? |
|---|---|---|---|
| BedWars-3 | 14 | null | Yes |
| BedWars-4 | 14 | null | Yes |

Fill rate = 28 / (2 * 16) = 87.5% > 80% threshold. A new instance (BedWars-5) starts.
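The two calculations above can be reproduced with a small sketch that filters out services carrying a custom state before computing the fill rate. `ServiceInfo` and `routable_fill_rate` are hypothetical names, and returning `None` when nothing is routable is an assumption, since this guide does not describe that case.

```python
from typing import NamedTuple, Optional


class ServiceInfo(NamedTuple):
    name: str
    players: int
    custom_state: Optional[str]  # None means the service is routable


def routable_fill_rate(services: list[ServiceInfo],
                       players_per_instance: int) -> Optional[float]:
    """Fill rate over routable services only; None if no service is routable."""
    routable = [s for s in services if s.custom_state is None]
    if not routable:
        return None  # assumption: behavior for this case is not documented here
    total_players = sum(s.players for s in routable)
    return total_players / (len(routable) * players_per_instance)


bedwars = [
    ServiceInfo("BedWars-1", 16, "INGAME"),
    ServiceInfo("BedWars-2", 14, "INGAME"),
    ServiceInfo("BedWars-3", 8, None),
    ServiceInfo("BedWars-4", 0, None),
]
print(routable_fill_rate(bedwars, 16))  # 8 / (2 * 16)
```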

Configuration

Scaling is configured per group in the [group.scaling] section:

config/groups/<group>.toml
[group.scaling]
min_instances = 1          # Minimum instances (always running)
max_instances = 10         # Maximum instances (hard cap)
players_per_instance = 16  # Expected players per server
scale_threshold = 0.8      # Scale up at this fill rate (0.0 - 1.0)
idle_timeout = 300         # Seconds before stopping an empty server

| Field | Default | Description |
|---|---|---|
| min_instances | 1 | Floor -- instances are started on boot to meet this |
| max_instances | 4 | Ceiling -- scaling will never exceed this |
| players_per_instance | 40 | Used to calculate fill rate |
| scale_threshold | 0.8 | Fill rate that triggers scale-up (80%) |
| idle_timeout | 0 | Seconds empty before scale-down (0 = disabled) |
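To illustrate how omitted fields fall back to the defaults in the table, here is a minimal sketch that models the [group.scaling] section as a dataclass. The class and the `from_table` helper are hypothetical, and rejecting unknown keys is a design choice of this sketch rather than documented Nimbus behavior.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ScalingConfig:
    # Defaults taken from the table above.
    min_instances: int = 1
    max_instances: int = 4
    players_per_instance: int = 40
    scale_threshold: float = 0.8
    idle_timeout: int = 0  # 0 = scale-down disabled

    @classmethod
    def from_table(cls, table: dict) -> "ScalingConfig":
        """Build a config from an already-parsed [group.scaling] table."""
        known = {f.name for f in fields(cls)}
        unknown = set(table) - known
        if unknown:
            # Assumption: typos in field names should fail loudly.
            raise KeyError(f"unknown scaling fields: {sorted(unknown)}")
        return cls(**table)


cfg = ScalingConfig.from_table({"max_instances": 10, "players_per_instance": 16})
print(cfg)
```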

Practical examples

Lobby servers

Always-on, spread players evenly. Never scale down below 1.

config/groups/lobby.toml
[group]
name = "Lobby"
type = "DYNAMIC"             # Static groups are never auto-scaled

[group.scaling]
min_instances = 1
max_instances = 4
players_per_instance = 100
scale_threshold = 0.8
idle_timeout = 0             # Never stop lobbies

BedWars (minigame)

Ephemeral servers that scale with demand. Stop empty servers after 5 minutes.

config/groups/bedwars.toml
[group]
name = "BedWars"
type = "DYNAMIC"

[group.scaling]
min_instances = 1
max_instances = 10
players_per_instance = 16
scale_threshold = 0.8
idle_timeout = 300

[group.lifecycle]
stop_on_empty = false        # Let idle_timeout handle it
restart_on_crash = true
max_restarts = 3

Survival server

Single persistent instance, no scaling needed.

config/groups/survival.toml
[group]
name = "Survival"
type = "STATIC"

[group.scaling]
min_instances = 1
max_instances = 1

Event mode

Expecting a traffic spike? Temporarily increase capacity:

config/groups/<group>.toml — Event Mode
[group.scaling]
min_instances = 5
max_instances = 20
players_per_instance = 50
scale_threshold = 0.7        # Scale earlier for headroom
idle_timeout = 600

After editing the group config file, reload to apply the new limits:

Nimbus
reload

Manual scaling

Console commands

Nimbus
# Manually start an instance
start BedWars

# Manually stop an instance
stop BedWars-3

# View current scaling state
status

REST API

Terminal
# Start a new instance
curl -X POST http://127.0.0.1:8080/api/services/BedWars/start \
  -H "Authorization: Bearer <token>"

# Stop a specific instance
curl -X POST http://127.0.0.1:8080/api/services/BedWars-3/stop \
  -H "Authorization: Bearer <token>"
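The same two calls can be prepared from a script. The sketch below only builds the requests with Python's standard library and does not send them; the helper names are illustrative, and the response format is not covered in this guide, so it is not parsed here.

```python
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # same address as in the curl examples


def build_post(path: str, token: str) -> urllib.request.Request:
    """Create an authenticated POST request without sending it."""
    return urllib.request.Request(
        BASE_URL + path,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def start_instance_request(group: str, token: str) -> urllib.request.Request:
    return build_post(f"/api/services/{group}/start", token)


def stop_instance_request(service: str, token: str) -> urllib.request.Request:
    return build_post(f"/api/services/{service}/stop", token)


req = start_instance_request("BedWars", "<token>")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```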

Monitoring

The status command shows current scaling state for all groups:

Nimbus
status

This displays:

  • Active instances per group
  • Player counts per instance
  • Custom states
  • Scaling bounds (min/max)

Tuning tips

Threshold selection

| Threshold | Behavior | Best for |
|---|---|---|
| 0.6 | Scale early, lots of headroom | Competitive games where lag = lost players |
| 0.8 | Balanced (default) | Most use cases |
| 0.95 | Pack tightly, minimal waste | Large lobbies, budget-constrained |

Idle timeout

| Timeout | Behavior |
|---|---|
| 0 | Never auto-stop (good for lobbies) |
| 60 | Aggressive cleanup (1 minute) |
| 300 | Balanced (5 minutes, good for minigames) |
| 900 | Conservative (15 minutes, good for longer game modes) |

Players per instance

Set this to the expected peak capacity, not the absolute maximum. If your BedWars maps hold 16 players, set players_per_instance = 16. If your lobby can handle 200 but you want to keep it comfortable, set players_per_instance = 100.

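scale_threshold and players_per_instance work together: scale-up triggers once the average load per routable server exceeds scale_threshold * players_per_instance. With a threshold of 0.8 and 16 players per instance, that product is 12.8, so scaling triggers at an average of 13 players per routable server.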
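When tuning, it can help to compute the exact player count at which a group starts scaling. The helper below is a hypothetical sketch that searches for the smallest total that exceeds the threshold, using the same strict comparison as the scale-up formula; it only searches up to the nominal capacity.

```python
from typing import Optional


def scale_up_trigger(routable_count: int, players_per_instance: int,
                     scale_threshold: float) -> Optional[int]:
    """Smallest total player count whose fill rate exceeds the threshold."""
    capacity = routable_count * players_per_instance
    for players in range(capacity + 1):
        if players / capacity > scale_threshold:
            return players
    return None  # threshold is not exceeded within the nominal capacity


for servers in (1, 2, 3):
    print(servers, "routable server(s):", scale_up_trigger(servers, 16, 0.8), "players")
```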

Multi-node scaling

By default, Nimbus runs all services on a single machine. For larger networks, you can enable cluster mode to distribute services across multiple machines with automatic placement, failover, and an optional TCP load balancer.

The scaling engine works the same way in multi-node — it just has more machines to place services on. Remote services report their player counts via heartbeats, so scaling decisions remain accurate. If no remote node is available, dynamic services fall back to running locally on the controller.

Static services are never distributed — they always run on the controller because they have persistent data (worlds, configs) in services/static/. Only dynamic and proxy services are placed on remote agent nodes.

See the dedicated Multi-Node & Load Balancer guide for full setup instructions, placement strategies, load balancer configuration, and failure handling.

Smart Scaling module

The built-in Smart Scaling module (nimbus-module-scaling) adds proactive, time-based scaling on top of the reactive scaling engine. While the core engine reacts to current player counts, the Smart Scaling module predicts demand and pre-starts servers.

Features

  • Time schedules — Define rules like "evenings need 3 lobbies, weekends need 4"
  • Predictive warmup — Analyzes player history (last 7 days, same weekday/hour) to pre-start servers before demand hits
  • Lead time — Start servers X minutes before a schedule activates
  • Player count history — Snapshots every 60s, stored in the database, auto-pruned after 90 days

The module only starts servers — it never stops them. Scale-down remains the responsibility of the core scaling engine's idle timeout.

Schedule configuration

Each group can have its own schedule in config/modules/scaling/<GroupName>.toml:

config/modules/scaling/<GroupName>.toml
[schedule]
enabled = true
timezone = "Europe/Berlin"

[[schedule.rules]]
name = "evening-peak"
days = ["MON", "TUE", "WED", "THU", "FRI"]
from = "17:00"
to = "23:00"
min_instances = 3

[[schedule.rules]]
name = "weekend"
days = ["SAT", "SUN"]
from = "10:00"
to = "01:00"
min_instances = 4

[[schedule.rules]]
name = "night"
days = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]
from = "02:00"
to = "08:00"
min_instances = 1
max_instances = 2

[warmup]
enabled = true
lead_time_minutes = 10

Schedule rules override the group's min_instances during their active window. The max_instances in a rule is optional and caps scaling during that period. Overnight ranges (e.g. 22:00 to 02:00) are supported.
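Overnight ranges are the subtle part of rule matching. The sketch below shows one plausible interpretation, not the module's confirmed logic: it assumes the end time is exclusive, that the part after midnight belongs to the day on which the window started, and that `now` is already expressed in the schedule's timezone.

```python
from datetime import datetime, time

DAYS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]  # index matches datetime.weekday()


def rule_is_active(days: list[str], start: time, end: time, now: datetime) -> bool:
    """Check whether a schedule rule covers the given local time."""
    today = DAYS[now.weekday()]
    yesterday = DAYS[(now.weekday() - 1) % 7]
    current = now.time()
    if start <= end:
        # Same-day window, e.g. 17:00 to 23:00.
        return today in days and start <= current < end
    # Overnight window, e.g. 10:00 to 01:00: either the window started today,
    # or it started yesterday and has not ended yet.
    return (today in days and current >= start) or (yesterday in days and current < end)


weekend = (["SAT", "SUN"], time(10, 0), time(1, 0))
# 2024-01-08 is a Monday; 00:30 still falls inside Sunday's overnight window.
print(rule_is_active(*weekend, now=datetime(2024, 1, 8, 0, 30)))
```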

Console commands

Nimbus — Smart Scaling
scaling status                    # Active schedules + recent decisions
scaling schedule list             # All rules for all groups
scaling schedule info <group>     # Rules for a specific group
scaling history <group> [hours]   # Player count history (default 24h)
scaling predict <group>           # Predictions for next 6 hours
scaling reload                    # Reload configs

REST API

REST API — Smart Scaling
GET  /api/scaling/status              # Overview + recent decisions
GET  /api/scaling/schedules           # All schedule rules
GET  /api/scaling/schedules/{group}   # Rules for a group
GET  /api/scaling/history/{group}     # Player history (?hours=24)
GET  /api/scaling/predictions/{group} # Predictions (next 6h)
GET  /api/scaling/decisions           # Recent scaling decisions

How it works

Every 30 seconds, the module:

  1. Collects snapshots — Records current player count + service count per group
  2. Evaluates schedules — Matches current time against rules, checks if min_instances is met
  3. Evaluates predictions — Compares predicted player load (historical average) against current capacity
  4. Starts servers — If more services are needed, calls serviceManager.startService()

A 60-second cooldown per group prevents rapid-fire starts.
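The evaluation steps can be sketched roughly as follows. Everything here is an assumption for illustration: the function names are hypothetical, the prediction is a plain average of snapshots from the same weekday and hour, and predicted players are converted to instances by dividing by players_per_instance and rounding up. The module's real prediction logic may differ.

```python
import math
from datetime import datetime


def predicted_players(history: list[tuple[datetime, int]], weekday: int, hour: int) -> float:
    """Average player count of past snapshots with the same weekday and hour."""
    samples = [players for ts, players in history
               if ts.weekday() == weekday and ts.hour == hour]
    return sum(samples) / len(samples) if samples else 0.0


def instances_to_start(current_instances: int, schedule_min: int, predicted: float,
                       players_per_instance: int, max_instances: int) -> int:
    """How many servers to pre-start; never negative, because the module never stops servers."""
    needed_for_prediction = math.ceil(predicted / players_per_instance)
    target = min(max(schedule_min, needed_for_prediction), max_instances)
    return max(0, target - current_instances)


history = [
    (datetime(2024, 1, 6, 18, 0), 40),   # Saturday, 18:00
    (datetime(2024, 1, 6, 18, 30), 60),  # Saturday, 18:30
    (datetime(2024, 1, 7, 18, 0), 500),  # Sunday, ignored for a Saturday prediction
]
expected = predicted_players(history, weekday=5, hour=18)
print(instances_to_start(1, 3, expected, 16, 10))
```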

Startup order

Nimbus uses a phased startup to ensure the proxy is ready before any backend server starts:

  1. Phase 1 — Start all proxy groups, wait for them to reach READY state (up to 120 seconds)
  2. Phase 2 — Start all backend groups

The scaling engine is deliberately started after the phased boot completes, so it cannot race the startup order.
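The ordering guarantee can be expressed as a short sketch. The function, the `is_proxy` flag, and the injected callbacks are hypothetical stand-ins for Nimbus internals; only the phase order and the 120-second limit come from the text above.

```python
from typing import Callable


def phased_startup(groups: list[dict],
                   start_group: Callable[[dict], None],
                   wait_until_ready: Callable[[dict, float], None],
                   start_scaling_engine: Callable[[], None],
                   proxy_timeout: float = 120.0) -> None:
    proxies = [g for g in groups if g["is_proxy"]]
    backends = [g for g in groups if not g["is_proxy"]]

    # Phase 1: start every proxy group, then wait for each to become READY.
    for group in proxies:
        start_group(group)
    for group in proxies:
        wait_until_ready(group, proxy_timeout)

    # Phase 2: start the backend groups.
    for group in backends:
        start_group(group)

    # Only now may the scaling engine begin evaluating groups.
    start_scaling_engine()


log: list[str] = []
phased_startup(
    [{"name": "BedWars", "is_proxy": False}, {"name": "Proxy", "is_proxy": True}],
    start_group=lambda g: log.append(f"start {g['name']}"),
    wait_until_ready=lambda g, timeout: log.append(f"wait {g['name']}"),
    start_scaling_engine=lambda: log.append("scaling engine"),
)
print(log)
```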

Next steps