Nimbus — Minecraft Cloud System

Server groups define how Nimbus manages collections of server instances, including scaling, lifecycle, and JVM settings. Each group has its own TOML file in the groups/ directory (e.g., config/groups/Lobby.toml, config/groups/BedWars.toml).

Group configs can be hot-reloaded at runtime using the reload console command.

Complete Example

config/groups/BedWars.toml

[group]
name = "BedWars"
type = "DYNAMIC"
template = "BedWars"
software = "PAPER"
version = "1.21.4"

[group.resources]
memory = "2G"
max_players = 50

[group.scaling]
min_instances = 1
max_instances = 10
players_per_instance = 40
scale_threshold = 0.8
idle_timeout = 0

[group.lifecycle]
stop_on_empty = false
restart_on_crash = true
max_restarts = 5

[group.jvm]
optimize = true

`[group]`

Core group identity and server software settings.

Option	Type	Default	Description
`name`	String	required	Group name in PascalCase. Only `a-z`, `A-Z`, `0-9`, `-`, `_` allowed. Services are named `<Name>-<N>` (e.g., `BedWars-1`).
`type`	Enum	`"DYNAMIC"`	`STATIC` or `DYNAMIC`. See Static vs Dynamic below.
`template`	String	required	Template directory name inside `templates/`. Only `a-z`, `A-Z`, `0-9`, `-`, `_`, `.` allowed.
`software`	Enum	`"PAPER"`	Server software. One of: `PAPER`, `PUFFERFISH`, `PURPUR`, `LEAF`, `FOLIA`, `VELOCITY`, `FORGE`, `FABRIC`, `NEOFORGE`, `CUSTOM`.
`version`	String	`"1.21.4"`	Minecraft version (e.g., `"1.21.4"`, `"1.8.8"`). Must match format `X.Y` or `X.Y.Z`.
`modloader_version`	String	`""`	Modloader version for `FORGE`, `FABRIC`, or `NEOFORGE`. If empty, Nimbus uses the latest stable version.
`jar_name`	String	`""`	Custom JAR filename for `CUSTOM` software. Defaults to `"server.jar"` if empty.
`ready_pattern`	String	`""`	Custom regex pattern for detecting when a `CUSTOM` server is ready. Nimbus watches stdout for this pattern.
`java_path`	String	`""`	Override the Java binary path for this group. Takes priority over the version-based lookup in nimbus.toml.
`templates`	`List<String>`	`[]`	Template stacking: list of template names applied in order (later overrides earlier). When set, supersedes the singular `template` field. Example: `templates = ["base", "lobby-overlay"]`.
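
The stacking rule for `templates` can be illustrated with a small sketch. It assumes each template is represented as an in-memory mapping of relative path to file content; the function names and that representation are hypothetical and say nothing about how Nimbus implements stacking internally.

```python
from typing import Dict, List


def effective_templates(group: dict) -> List[str]:
    """Return the template names to apply, in order.

    A non-empty `templates` list supersedes the singular `template` field.
    """
    stacked = group.get("templates") or []
    return list(stacked) if stacked else [group["template"]]


def stack_templates(names: List[str], contents: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Overlay template files in order; later templates override earlier ones."""
    merged: Dict[str, str] = {}
    for name in names:
        merged.update(contents[name])
    return merged


# Hypothetical template contents: relative path -> file content.
contents = {
    "base": {"server.properties": "motd=Base", "plugins/core.jar": "core-v1"},
    "lobby-overlay": {"server.properties": "motd=Lobby"},
}
group = {"template": "base", "templates": ["base", "lobby-overlay"]}
result = stack_templates(effective_templates(group), contents)
```

Here `result` keeps `plugins/core.jar` from `base` and takes `server.properties` from `lobby-overlay`, because the overlay is applied last.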

`[group.resources]`

Memory and player capacity settings.

Option	Type	Default	Description
`memory`	String	`"1G"`	JVM heap size (`-Xmx`). Format: number + `M` or `G` (e.g., `"512M"`, `"2G"`). Counts toward the controller's `max_memory` budget.
`max_players`	Int	`50`	Maximum players per instance. Must be ≥ 1. Used for display and scaling calculations.

`[group.scaling]`

Auto-scaling behavior for dynamic groups. Ignored for static groups.

Option	Type	Default	Description
`min_instances`	Int	`1`	Minimum running instances. Nimbus ensures at least this many are always running. Must be ≥ 0 and ≤ `max_instances`.
`max_instances`	Int	`4`	Maximum instances the scaling engine will create.
`players_per_instance`	Int	`40`	Target player capacity per instance. Used in the fill-rate calculation.
`scale_threshold`	Double	`0.8`	Fill-rate threshold (0.0 - 1.0) that triggers scale-up. When the ratio of total players to total capacity exceeds this value, a new instance is started.
`idle_timeout`	Long	`0`	Seconds before an empty instance is stopped. Set to `0` to disable idle shutdown (instances stay running indefinitely). Only applies when current instances exceed `min_instances`.
`warm_pool_size`	Int	`0`	Number of pre-staged services kept in PREPARED state. These services have templates extracted and JVM configured but are not yet started — they launch in seconds instead of minutes. `0` = disabled.

Scaling Formula

The scaling engine runs every heartbeat_interval milliseconds (configured in nimbus.toml) and evaluates each dynamic group:

Scale Up - A new instance starts when:

fill_rate = total_players / (routable_instances * players_per_instance)
fill_rate > scale_threshold AND current_instances < max_instances

Scale Down - An empty instance stops when:

instance_players == 0
AND seconds_idle > idle_timeout
AND idle_timeout > 0
AND current_instances > min_instances

Services with a custom state (e.g., INGAME, ENDING) are excluded from capacity calculations. They are not considered "routable" and won't accept new players, so they don't count toward the fill rate.
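
The two conditions above can be written as a pair of predicates. This is a minimal sketch, not the scaling engine itself: the `Scaling` class and function names are hypothetical, and the handling of zero routable capacity is an assumption added only to avoid a division by zero.

```python
from dataclasses import dataclass


@dataclass
class Scaling:
    min_instances: int = 1
    max_instances: int = 4
    players_per_instance: int = 40
    scale_threshold: float = 0.8
    idle_timeout: int = 0  # seconds; 0 disables idle shutdown


def fill_rate(total_players: int, routable_instances: int, cfg: Scaling) -> float:
    capacity = routable_instances * cfg.players_per_instance
    if capacity == 0:
        # Assumption: no routable capacity counts as "full" if anyone is online.
        return float("inf") if total_players > 0 else 0.0
    return total_players / capacity


def should_scale_up(total_players: int, routable_instances: int,
                    current_instances: int, cfg: Scaling) -> bool:
    return (fill_rate(total_players, routable_instances, cfg) > cfg.scale_threshold
            and current_instances < cfg.max_instances)


def should_scale_down(instance_players: int, seconds_idle: int,
                      current_instances: int, cfg: Scaling) -> bool:
    return (instance_players == 0
            and cfg.idle_timeout > 0
            and seconds_idle > cfg.idle_timeout
            and current_instances > cfg.min_instances)


# Two routable instances of 16 slots each give a capacity of 32 players.
cfg = Scaling(min_instances=2, max_instances=20, players_per_instance=16,
              scale_threshold=0.7, idle_timeout=120)
```

With this configuration, 22 players give a fill rate of 22 / 32 = 0.6875, which stays below the threshold, while the 23rd player raises it to 0.71875 and triggers a scale-up.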

Example: High-Volume Game Server

config/groups/BedWars.toml

[group.scaling]
min_instances = 2       # Always have 2 ready
max_instances = 20      # Scale up to 20
players_per_instance = 16
scale_threshold = 0.7   # Start new instance at 70% fill
idle_timeout = 120      # Remove empty instances after 2 minutes

`[group.lifecycle]`

Instance lifecycle management.

Option	Type	Default	Description
`stop_on_empty`	Boolean	`false`	Stop the instance when the last player leaves. Useful for game servers where an empty instance has no purpose.
`restart_on_crash`	Boolean	`true`	Automatically restart an instance if its process exits unexpectedly.
`max_restarts`	Int	`5`	Maximum consecutive automatic restarts. After this limit, the instance stays stopped to prevent crash loops. Must be ≥ 0.
`drain_timeout`	Long	`30`	Seconds to wait for graceful drain (players leaving) before force-stopping a service.
`deploy_on_stop`	Boolean	`false`	Copy changed files back to the template directory on service stop (deploy-back). Only files that differ from the template are copied.
`deploy_excludes`	`List<String>`	see below	Glob patterns to skip during deploy-back. Default: `["logs/", "crash-reports/", "cache/", "libraries/", "*.tmp"]`.

Deploy-back example

A common iteration workflow for DYNAMIC groups: start a service, edit its configs or plugins live in the service directory, then stop the service so the changes land back in the template for the next spawn.

config/groups/lobby.toml

[group.lifecycle]
stop_on_empty = false
restart_on_crash = true
deploy_on_stop = true
deploy_excludes = [
  "logs/",
  "crash-reports/",
  "cache/",
  "libraries/",
  "*.tmp",
  "world/playerdata/",   # don't snapshot per-player state
  "plugins/*/data/",     # skip plugin runtime data
]

On every graceful stop, Nimbus hashes each file in the service directory, compares it to the template, and copies back only the files that differ (respecting deploy_excludes). If the copy fails, the service is moved to CRASHED with a descriptive error — the template is never left in a half-written state.

Deploy-back is intended for iterating on configs and plugins, not for persisting world or player data. For stateful services that need their full working directory preserved across restarts, use type = "STATIC" or enable [group.sync] instead. Crash-stopped services never trigger deploy-back — only graceful stops do.
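
The hash-and-compare step described above might look roughly like the following sketch. The exclude matching rules are an assumption (patterns ending in `/` are treated as directory prefixes anchored at the service root, other patterns match the file name); the actual glob semantics Nimbus uses are not specified here, and all function names are hypothetical.

```python
import fnmatch
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def is_excluded(rel_path: str, patterns: List[str]) -> bool:
    """Assumed rules: 'dir/' patterns match a leading directory prefix,
    segment by segment; any other pattern matches the file name."""
    parts = rel_path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            want = pattern.rstrip("/").split("/")
            if len(parts) > len(want) and all(
                fnmatch.fnmatchcase(have, w) for have, w in zip(parts, want)
            ):
                return True
        elif fnmatch.fnmatchcase(parts[-1], pattern):
            return True
    return False


def changed_files(service_dir: Path, template_dir: Path, excludes: List[str]) -> List[str]:
    """Relative paths that are new or differ from the template, minus excludes."""
    changed = []
    for file in sorted(service_dir.rglob("*")):
        if not file.is_file():
            continue
        rel = file.relative_to(service_dir).as_posix()
        if is_excluded(rel, excludes):
            continue
        original = template_dir / rel
        if not original.is_file() or sha256_of(file) != sha256_of(original):
            changed.append(rel)
    return changed
```

Only the paths returned by `changed_files` would be candidates for copying back; unchanged files and anything under an excluded directory are skipped.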

`[group.placement]`

Controls where services in this group run in a multi-node cluster. Ignored when cluster mode is disabled.

Option	Type	Default	Description
`node`	String	`""`	Placement target: `""` = any available node (default scheduler), `"local"` = controller only, `"<node_name>"` = pin to a specific agent node.
`fallback`	String	`"wait"`	What happens when the pinned node is offline. `"wait"` = refuse to start until the node is back; `"local"` = start on the controller instead (UNSAFE for stateful groups — data diverges); `"fail"` = log an error and skip.

Pinning stateful groups to a node. You can assign a STATIC group (e.g. Lobby, Survival) to a specific agent by setting node = "worker-1". The service's data directory lives on the agent's filesystem under services/static/<name>/ and is preserved across restarts by the agent's existing static-workdir handling.

For pinned stateful services, keep the default fallback = "wait". Setting fallback = "local" causes Nimbus to start a second copy of the service on the controller if the pinned agent is offline, which creates divergent data on two hosts. Automated recovery is a separate (not-yet-implemented) migration flow.
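
The `node` and `fallback` options can be read as a small decision function. The sketch below is an illustration under assumptions: the function name, the `"any"` return value for scheduler-chosen placement, and the set of online agents are hypothetical and not part of the Nimbus API.

```python
from typing import Optional, Set

CONTROLLER = "local"


def resolve_placement(node: str, fallback: str, online_agents: Set[str]) -> Optional[str]:
    """Return where to start the service, or None if it must not start now."""
    if node == "":
        return "any"  # default scheduler picks any available node
    if node == CONTROLLER:
        return CONTROLLER
    if node in online_agents:
        return node
    # The pinned agent is offline: apply the fallback policy.
    if fallback == "local":
        return CONTROLLER  # unsafe for stateful groups: data diverges
    if fallback in ("wait", "fail"):
        return None  # "wait" retries once the node is back; "fail" logs and skips
    raise ValueError(f"unknown fallback: {fallback!r}")
```

For example, `resolve_placement("worker-1", "wait", set())` returns `None`, so nothing starts while `worker-1` is offline, whereas the same call with `fallback = "local"` returns the controller.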

Initial data seeding. The first time a pinned static service starts on an agent, the agent downloads the group template and creates a fresh workdir. If you want to seed the workdir with existing data (e.g., an existing world), copy the data into services/static/<name>/ on the agent before the first start — Nimbus will not overwrite files that already exist.

config/groups/lobby.toml

[group]
name = "Lobby"
type = "STATIC"
template = "lobby"
software = "PAPER"
version = "1.21.4"

[group.placement]
node = "worker-1"     # pin to this agent
fallback = "wait"     # don't fall back to controller (data integrity)

`[group.sync]`

State sync policy for stateful services that should float between nodes while keeping the canonical copy of their working directory on the controller. Enables multi-node deployment of services like Lobbies and Survival worlds without manually distributing data.

State sync only works with STATIC groups. Setting enabled = true on a DYNAMIC group logs a warning and the flag is force-disabled — dynamic services are rebuilt from templates on every start, so sync has no sensible semantics.

Option	Type	Default	Description
`enabled`	Boolean	`false`	When true, the controller stores the service's canonical working directory under `services/state/<name>/`. On every start, the agent pulls only the files that changed; on every graceful stop, it pushes its changes back.
`excludes`	`List<String>`	see below	rsync-style glob patterns. Matched files are neither uploaded nor deleted during reconcile.

Default excludes:

excludes = ["logs/", "cache/", "crash-reports/", "*.tmp", "*.lock", "*.pid", "session.lock"]

How it works

On start (remote node): the agent fetches the controller's manifest, compares it against its local cached copy, downloads only the files that changed (or everything on the first start), and reconciles local state to match the canonical copy.
While running: data lives on the agent's filesystem. The controller doesn't interfere, and the MC process writes normally.
On graceful stop: the agent computes the delta between its current state and the controller's manifest and uploads only the changed files in one multipart request; the controller then atomically commits staging → canonical.
On crash (not graceful): no push happens. The controller's canonical copy stays at the last successful push, and any changes since then are lost on the dead node.

Data loss model

Scenario	What you keep	What you lose
Planned restart (via `service stop`, scale-down, or config reload)	Everything up to the stop moment	Nothing
Node failure / crash / kill	Everything up to the last graceful stop	Runtime changes since the last stop
Network blip during push	Controller stays at pre-push state (atomic staging)	Current session's changes (until next successful push)

Unplanned crashes lose data since the last graceful stop. If your use case can't tolerate any data loss, use pinning ([group.placement] node = "<id>") instead — data stays on the assigned agent, no loss, but also no automatic migration.

Sync vs. Pin

Aspect	`sync.enabled = true`	`placement.node = "<id>"`
Data location	Controller (canonical) + node (cache)	Node only
Service mobility	Moves freely across nodes	Stuck to one node
Crash loss window	Since last graceful stop	Zero (data stays on the same disk)
Controller restart	Safe (canonical preserved)	Safe (data on node)
Permanent node loss	Auto-recovers on any other node	Requires manual recovery
Bandwidth	Delta upload on every stop	None

Setting both is a configuration error — sync wins and a warning is logged.

Example

config/groups/lobby.toml

[group]
name = "Lobby"
type = "STATIC"
template = "lobby"
software = "PAPER"
version = "1.21.4"

[group.sync]
enabled = true
excludes = [
    "logs/",
    "cache/",
    "crash-reports/",
    "plugins/.paper-remapped/",
    "*.tmp",
    "session.lock"
]

Bandwidth optimization

Sync is incremental: the agent's first step is fetching the controller's manifest (a JSON list of path → sha256), comparing against its local copy, and only downloading or uploading files that actually differ. For a 5 GB world where only a few chunks changed during the last session, push transfers only the changed .mca files — typically tens of MB.

The first start ever (or after nimbus cluster state reset <name>) is a full transfer because there's nothing to compare against.
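
The manifest comparison can be sketched with plain dictionaries of path to sha256. This is a simplified illustration: the function names are hypothetical, `excludes` handling is left out, and whether a push also propagates deletions is not covered here.

```python
from typing import Dict, List, Tuple

Manifest = Dict[str, str]  # relative path -> sha256 hex digest


def pull_plan(canonical: Manifest, local: Manifest) -> Tuple[List[str], List[str]]:
    """Files to download and local files to delete so local matches canonical."""
    download = sorted(p for p, h in canonical.items() if local.get(p) != h)
    delete = sorted(p for p in local if p not in canonical)
    return download, delete


def push_plan(canonical: Manifest, local: Manifest) -> List[str]:
    """Files to upload on graceful stop: new or changed relative to canonical."""
    return sorted(p for p, h in local.items() if canonical.get(p) != h)


# Placeholder digests; real manifests would carry full sha256 values.
canonical = {"world/region/r.0.0.mca": "aaa", "server.properties": "bbb"}
local = {
    "world/region/r.0.0.mca": "ccc",   # chunk file changed during the session
    "server.properties": "bbb",        # unchanged, not transferred
    "world/region/r.0.1.mca": "ddd",   # newly generated region
}
to_upload = push_plan(canonical, local)
```

In this example `to_upload` contains only the two region files. A first start corresponds to `pull_plan(canonical, {})`, where every canonical file is downloaded.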

`[group.jvm]`

JVM and performance optimization settings.

Option	Type	Default	Description
`optimize`	Boolean	`true`	Enable automatic performance optimization. When enabled with no custom `args`, applies Aikar's JVM flags and optimizes `spigot.yml` + `paper-world-defaults.yml` for Paper/Pufferfish/Purpur/Leaf/Folia servers.
`args`	`List<String>`	`[]`	Custom JVM arguments passed before the `-jar` flag. When set alongside `optimize = true`, these args are used instead of Aikar's flags, but config optimization still applies.

Performance Optimization

When optimize = true (the default), Nimbus automatically:

Aikar's JVM Flags — Applies optimized G1GC tuning flags that reduce GC pauses and improve throughput. Flags are adjusted automatically for large heaps (12G+). Applied to all server types.
Config Tuning (Paper/Pufferfish/Purpur/Leaf/Folia only) — Optimizes spigot.yml (merge radius, entity activation ranges) and paper-world-defaults.yml (chunk save throttling, explosion optimization, despawn ranges).

Three modes (a group file contains only one [group.jvm] table, so pick one):

config/groups/*.toml

# Mode 1: Full auto (default) — Aikar's flags + config tuning
[group.jvm]
optimize = true

# Mode 2: Custom JVM flags + config tuning
[group.jvm]
optimize = true
args = ["-XX:+UseZGC", "-Dcom.mojang.eula.agree=true"]

# Mode 3: No optimization — fully manual
[group.jvm]
optimize = false
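
How `optimize` and `args` combine can be summarized in a few lines. This is a sketch under assumptions: `jvm_plan` is a hypothetical name, and `AIKAR_FLAGS_SUBSET` is an abbreviated stand-in containing three of Aikar's flags, not the full set Nimbus applies.

```python
from typing import List, Tuple

# Abbreviated stand-in for Aikar's flag set (assumption: illustration only).
AIKAR_FLAGS_SUBSET = ["-XX:+UseG1GC", "-XX:+ParallelRefProcEnabled", "-XX:MaxGCPauseMillis=200"]
CONFIG_TUNED = {"PAPER", "PUFFERFISH", "PURPUR", "LEAF", "FOLIA"}


def jvm_plan(optimize: bool, args: List[str], software: str) -> Tuple[List[str], bool]:
    """Return (extra JVM flags, whether spigot.yml/paper-world-defaults.yml get tuned)."""
    if args:
        flags = list(args)                 # custom args replace Aikar's flags
    elif optimize:
        flags = list(AIKAR_FLAGS_SUBSET)   # mode 1: full auto
    else:
        flags = []                         # mode 3: fully manual
    tune_configs = optimize and software in CONFIG_TUNED
    return flags, tune_configs
```

Mode 2 corresponds to `jvm_plan(True, ["-XX:+UseZGC"], "PAPER")`: the custom flag is used as-is, and config tuning still applies because `optimize` is true.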

`[group.sandbox]`

Per-group override of the managed sandbox backend and its resource limits. Defaults come from the global [sandbox] section in nimbus.toml; the per-group block only needs to carry the keys you want to diverge on. Added in v0.12.0.

Option	Type	Default	Description
`mode`	String	`""`	`""` inherits `[sandbox] default_mode`. Explicit values: `"bare"` (plain `ProcessBuilder`, no kernel enforcement), `"managed"` (cgroup v2 via `systemd-run --user --scope`; falls back to bare on unsupported platforms), `"docker"` (delegated to the Docker module — kept for backwards compatibility; prefer `[group.docker] enabled = true`).
`memory_limit_mb`	Long	`0`	Hard memory cap in MB applied when `mode = "managed"`. `0` auto-derives from `[group.resources] memory` plus the global overhead budget (default 30 % or 256 MB, whichever is larger).
`cpu_quota`	Double	`0.0`	CPU quota as a multiplier — `1.0` = one core, `2.5` = 2.5 cores, `0.0` = unlimited. Applied as `CPUQuota=N%` to the cgroup scope.
`tasks_max`	Int	`0`	Maximum task (thread + process) count inside the scope. `0` = unlimited.

config/groups/Lobby.toml

[group.sandbox]
mode = "managed"          # force managed even if global default is "bare"
memory_limit_mb = 4096    # hard cap, overrides the auto-derived value
cpu_quota = 2.0           # two cores
tasks_max = 512

managed mode is kernel-enforced. If a service exceeds memory_limit_mb, the kernel OOM-kills it and Nimbus surfaces the crash with an "OOM-gekillt (Exit 137)" diagnosis (German for "OOM-killed") in the operator console and in service.lastCrashReport. Set a realistic cap: -Xmx plus 30 % native-memory headroom is a safe starting point.
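
The auto-derived memory cap and the `CPUQuota` mapping can be worked through numerically. The sketch below assumes `1G` equals 1024 MB and rounds the percentage overhead up to a whole MB; the function and parameter names are hypothetical, and the exact rounding Nimbus uses is not documented here.

```python
import re
from typing import Optional


def parse_memory_mb(value: str) -> int:
    """Parse a memory string such as "512M" or "2G" into MB (1G = 1024 MB assumed)."""
    match = re.fullmatch(r"(\d+)([MG])", value)
    if not match:
        raise ValueError(f"invalid memory value: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return amount * 1024 if unit == "G" else amount


def derived_memory_limit_mb(heap: str, memory_limit_mb: int = 0,
                            overhead_percent: int = 30, overhead_min_mb: int = 256) -> int:
    """Explicit cap wins; otherwise heap + max(percent overhead, minimum overhead)."""
    if memory_limit_mb > 0:
        return memory_limit_mb
    heap_mb = parse_memory_mb(heap)
    percent_overhead = -(-heap_mb * overhead_percent // 100)  # ceiling division
    return heap_mb + max(percent_overhead, overhead_min_mb)


def cpu_quota_property(cpu_quota: float) -> Optional[str]:
    """Map the core multiplier to a CPUQuota percentage; 0.0 means unlimited."""
    if cpu_quota <= 0:
        return None
    return f"CPUQuota={cpu_quota * 100:g}%"
```

Under these assumptions a `"2G"` heap derives a cap of 2048 + 615 = 2663 MB, a `"512M"` heap falls back to the 256 MB minimum overhead for a cap of 768 MB, and `cpu_quota = 2.5` becomes `CPUQuota=250%`.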

Static vs Dynamic

Aspect	STATIC	DYNAMIC
Template handling	Template copied once; existing files preserved on restart	Template re-applied from scratch on every start
World data	Persisted across restarts	Wiped on every start
Scaling	No auto-scaling; instances managed manually	Auto-scaled based on player count and thresholds
Use case	Survival worlds, persistent lobbies, build servers	Minigame servers, temporary game instances
Instance count	Fixed at `min_instances`	Ranges from `min_instances` to `max_instances`

Full Examples

Proxy (Velocity)

config/groups/Proxy.toml

[group]
name = "Proxy"
type = "STATIC"
template = "Proxy"
software = "VELOCITY"
version = "3.4.0"

[group.resources]
memory = "512M"
max_players = 500

[group.scaling]
min_instances = 1
max_instances = 1

[group.lifecycle]
restart_on_crash = true
max_restarts = 10

[group.jvm]
optimize = true

Proxy ports start at 25565. Backend ports start at 30000. Port allocation is automatic.

Lobby

config/groups/Lobby.toml

[group]
name = "Lobby"
type = "STATIC"
template = "Lobby"
software = "PAPER"
version = "1.21.4"

[group.resources]
memory = "1G"
max_players = 100

[group.scaling]
min_instances = 1
max_instances = 3
players_per_instance = 80
scale_threshold = 0.8
idle_timeout = 0

[group.lifecycle]
restart_on_crash = true
max_restarts = 5

[group.jvm]
optimize = true

Game Server (BedWars)

config/groups/BedWars.toml

[group]
name = "BedWars"
type = "DYNAMIC"
template = "BedWars"
software = "PAPER"
version = "1.21.4"

[group.resources]
memory = "2G"
max_players = 16

[group.scaling]
min_instances = 1
max_instances = 10
players_per_instance = 16
scale_threshold = 0.7
idle_timeout = 120

[group.lifecycle]
stop_on_empty = true
restart_on_crash = true
max_restarts = 3

[group.jvm]
optimize = true

Modded Server (Forge)

config/groups/ModdedSMP.toml

[group]
name = "ModdedSMP"
type = "STATIC"
template = "ModdedSMP"
software = "FORGE"
version = "1.20.1"
modloader_version = "47.2.0"

[group.resources]
memory = "6G"
max_players = 30

[group.scaling]
min_instances = 1
max_instances = 1

[group.lifecycle]
restart_on_crash = true
max_restarts = 3

[group.jvm]
optimize = true

Fabric Server

config/groups/FabricSMP.toml

[group]
name = "FabricSMP"
type = "STATIC"
template = "FabricSMP"
software = "FABRIC"
version = "1.21.4"

[group.resources]
memory = "4G"
max_players = 40

[group.scaling]
min_instances = 1
max_instances = 1

[group.lifecycle]
restart_on_crash = true
max_restarts = 3

[group.jvm]
optimize = true

Folia Server (Regionized Multithreading)

config/groups/FoliaLobby.toml

[group]
name = "FoliaLobby"
type = "STATIC"
template = "FoliaLobby"
software = "FOLIA"
version = "1.21.4"

[group.resources]
memory = "4G"
max_players = 200

[group.scaling]
min_instances = 1
max_instances = 2

[group.lifecycle]
restart_on_crash = true
max_restarts = 3

[group.jvm]
optimize = true

Folia Plugin Compatibility

Folia uses regionized multithreading, which breaks most Bukkit/Paper plugins. Only use plugins that explicitly support Folia's threading model. The Nimbus SDK and NimbusPerms are fully Folia-compatible.

Custom Server Software

config/groups/CustomServer.toml

[group]
name = "CustomServer"
type = "STATIC"
template = "CustomServer"
software = "CUSTOM"
version = "1.21.4"
jar_name = "custom-server.jar"
ready_pattern = "Server started on port \\d+"

[group.resources]
memory = "2G"
max_players = 50

[group.scaling]
min_instances = 1
max_instances = 1

[group.lifecycle]
restart_on_crash = true
max_restarts = 3

[group.jvm]
optimize = true

Dedicated Services

Dedicated services are single-instance, fixed-port Minecraft servers managed alongside groups. Each one lives in its own TOML file under config/dedicated/<Name>.toml (path configurable via [paths] dedicated). Unlike groups, they are never scaled; the scheduler only starts, stops, or restarts the single instance.

Complete Example

config/dedicated/sandbox.toml

[dedicated]
name = "sandbox"
port = 30010
software = "PAPER"
version = "1.21.4"
memory = "4G"
proxy_enabled = true
restart_on_crash = true
max_restarts = 5

[dedicated.jvm]
optimize = true

[dedicated.placement]
node = "worker-2"
fallback = "wait"

[dedicated.sync]
enabled = false

`[dedicated]`

Option	Type	Default	Description
`name`	String	required	Service name. Used verbatim (no `-1` suffix). Must match `[A-Za-z0-9_-]+`.
`port`	Int	required	Backend port. Must not collide with other dedicated services or the group backend port range.
`software`	Enum	`"PAPER"`	Same enum as group software (`PAPER`, `PURPUR`, `FOLIA`, `FORGE`, `FABRIC`, ...).
`version`	String	`"1.21.4"`	Minecraft version. Same format rules as group configs.
`jar_name`	String	`""`	Custom JAR filename for `CUSTOM` software.
`ready_pattern`	String	`""`	Custom stdout ready pattern for `CUSTOM` software.
`java_path`	String	`""`	Override the resolved Java binary for this service.
`proxy_enabled`	Boolean	`true`	Register the service with Velocity proxies. Set to `false` for services that should be reachable only via direct connect (e.g. build servers).
`memory`	String	`"2G"`	JVM heap (`-Xmx`). Same format as groups.
`restart_on_crash`	Boolean	`true`	Auto-restart after an unexpected process exit.
`max_restarts`	Int	`5`	Crash-loop cap.

`[dedicated.jvm]`

Same fields as [group.jvm]: optimize (Boolean, default true), args (List<String>, default []).

`[dedicated.placement]`

Same fields as [group.placement]: node (String, default ""), fallback (String, default "wait").

`[dedicated.sync]`

Same fields and semantics as [group.sync]: enabled (Boolean, default false), excludes (List<String>, default rsync-style log/cache excludes). Sync stores canonical data under dedicated/<name>/ on the controller instead of services/state/<name>/.

Server JARs are auto-downloaded on first start via the same resolver used for groups. No template is required; Nimbus creates and manages the working directory under paths.dedicated/<name>/.

Validation Rules

Nimbus validates every group config on load and rejects invalid configurations:

name must not be blank
template must not be blank
version must match X.Y or X.Y.Z format (e.g., 1.21.4, 1.8.8)
memory must match format like 512M or 2G
min_instances must be ≥ 0 and ≤ max_instances
scale_threshold must be between 0.0 and 1.0
max_players must be ≥ 1
max_restarts must be ≥ 0

Invalid group configs are skipped with an error log message. Other valid groups will still load normally.
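
The listed rules translate directly into a checker. This is a minimal sketch that operates on an already-parsed config dictionary and falls back to the documented defaults; `validate_group` and the error strings are hypothetical and do not reproduce Nimbus's own messages.

```python
import re
from typing import List

VERSION_RE = re.compile(r"\d+\.\d+(\.\d+)?")  # X.Y or X.Y.Z
MEMORY_RE = re.compile(r"\d+[MG]")            # e.g. 512M or 2G


def validate_group(cfg: dict) -> List[str]:
    """Return a list of rule violations; an empty list means the config is valid."""
    group = cfg.get("group", {})
    resources = group.get("resources", {})
    scaling = group.get("scaling", {})
    lifecycle = group.get("lifecycle", {})
    errors: List[str] = []

    if not str(group.get("name", "")).strip():
        errors.append("name must not be blank")
    if not str(group.get("template", "")).strip():
        errors.append("template must not be blank")
    if not VERSION_RE.fullmatch(str(group.get("version", "1.21.4"))):
        errors.append("version must match X.Y or X.Y.Z")
    if not MEMORY_RE.fullmatch(str(resources.get("memory", "1G"))):
        errors.append("memory must look like 512M or 2G")
    if not 0 <= scaling.get("min_instances", 1) <= scaling.get("max_instances", 4):
        errors.append("min_instances must be >= 0 and <= max_instances")
    if not 0.0 <= scaling.get("scale_threshold", 0.8) <= 1.0:
        errors.append("scale_threshold must be between 0.0 and 1.0")
    if resources.get("max_players", 50) < 1:
        errors.append("max_players must be >= 1")
    if lifecycle.get("max_restarts", 5) < 0:
        errors.append("max_restarts must be >= 0")
    return errors
```

A loader built on this would skip any group for which the returned list is non-empty and log the messages, while continuing with the remaining groups.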
