Why We Systematically Terminate Servers for Game Server Security

Uninterrupted uptime is a prerequisite for any online game to reach its potential. Given that requirement, methodically and deliberately terminating your own game servers may seem counterintuitive, if not outright counterproductive. In infrastructure management, however, it is a proactive process that protects the long-term health, stability, and security of your online environment. Maintaining that operational integrity demands 24/7 vigilance, and GameFabric handles it all for you.

Why Simple Patching Isn't Enough

The core of this challenge lies in how security vulnerabilities are addressed at the operating system level. Every day, new Linux package updates are released to fix security vulnerabilities, which are publicly tracked as CVEs (Common Vulnerabilities and Exposures). A disciplined team makes sure these patches are installed promptly across the entire server fleet.

But installing a patch on a live server is not always enough. Many fixes, kernel updates in particular, only take full effect once the host machine has been rebooted. This presents an operational dilemma: how do you keep rebooting a global server fleet to stay secure without interrupting active player sessions or causing disruptive downtime?

Graceful Rolling Reboots

GameFabric’s orchestration system is built to solve this problem without compromising your player experience. We do this with a graceful rolling reboot process that strengthens game server security without noticeable disruption.

When a security update is released, our hosts automatically install it and flag themselves as requiring a reboot. Our cluster operator then marks one of these hosts as “unschedulable.” This flag tells the system not to start any new game sessions on that specific machine.
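The selection step can be sketched in a few lines of Python. This is only an illustration, not GameFabric's implementation: the `Host` record, its field names, and the `cordon_next` function are hypothetical, and the rule that only one host is cordoned at a time is an assumption drawn from the word "rolling."

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical host record; the field names are assumptions."""
    name: str
    reboot_required: bool = False
    unschedulable: bool = False


def cordon_next(hosts):
    """Mark at most one reboot-pending host unschedulable and return it.

    Returns None when a host is already cordoned (assumed one-at-a-time
    policy) or when no host is waiting for a reboot.
    """
    if any(h.unschedulable for h in hosts):
        return None
    for host in hosts:
        if host.reboot_required:
            host.unschedulable = True
            return host
    return None
```

Cordoning one host at a time keeps the capacity that is out of rotation small. A real operator might allow several hosts in parallel, depending on fleet size.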

The process of evacuating this flagged host then begins. Non-active game servers (those that are ready but have no players) are immediately terminated and their capacity is seamlessly recreated on other hosts within the cluster. What’s more, our dedicated buffer nodes automatically absorb this load if the primary fleet is at capacity, meaning there’s no loss of potential player slots.
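As a rough illustration of how buffer nodes absorb evacuated capacity, the arithmetic might look like the following Python sketch. The function name `place_evacuated` and the slot-counting model are assumptions made for this example; the real scheduler will weigh many more factors.

```python
def place_evacuated(idle_servers, free_primary_slots, free_buffer_slots):
    """Split evacuated idle servers between primary and buffer capacity.

    Returns (on_primary, on_buffer, unplaced). Primary hosts are filled
    first, and buffer nodes take whatever does not fit (assumed policy).
    """
    on_primary = min(idle_servers, free_primary_slots)
    remaining = idle_servers - on_primary
    on_buffer = min(remaining, free_buffer_slots)
    return on_primary, on_buffer, remaining - on_buffer
```

For example, evacuating 10 idle servers with 6 free primary slots and 8 free buffer slots places 6 on the primary fleet and 4 on buffer nodes, leaving none unplaced.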

For active game sessions, a different, more patient process unfolds. Our integration with Agones allows GameFabric’s orchestrator to know precisely which servers have active players. These “allocated” servers are given a generous window, typically 24 hours, for players to complete their sessions. If a player is in the middle of a match, they won’t be abruptly disconnected. However, to keep the underlying host secure, even these active sessions have a maximum lifetime, after which they are terminated.
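The difference between idle and allocated servers can be expressed as a small decision function. The sketch below is illustrative only: `Ready` and `Allocated` are Agones GameServer states, but the function itself, the treatment of every other state, and the choice to measure the window from the moment the host was cordoned are assumptions.

```python
from datetime import datetime, timedelta, timezone

# "Typically 24 hours" according to the text; treated as configurable here.
DRAIN_WINDOW = timedelta(hours=24)


def should_terminate(state, cordoned_at, now, window=DRAIN_WINDOW):
    """Decide whether a game server on a cordoned host should be stopped now.

    Idle ("Ready") servers go immediately. "Allocated" servers are only
    stopped once the drain window has elapsed. Other states are left
    alone in this simplified sketch.
    """
    if state == "Ready":
        return True
    if state == "Allocated":
        return now - cordoned_at >= window
    return False
```

With a host cordoned at midnight UTC, an allocated server would still be running 23 hours later and would be terminated once the 24-hour mark is reached.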

Once the host is completely empty of all game sessions, it is safely rebooted, fully activating all security patches. It then rejoins the cluster as a healthy, secure host, ready to accept players again.
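The final step of the cycle can be sketched the same way. The `DrainingHost` record and the `reboot` callback are hypothetical stand-ins for whatever mechanism actually restarts a machine; the point is only that the reboot is gated on the host being empty.

```python
from dataclasses import dataclass, field


@dataclass
class DrainingHost:
    """Hypothetical record of a cordoned host awaiting its reboot."""
    name: str
    reboot_required: bool = True
    unschedulable: bool = True
    sessions: list = field(default_factory=list)


def reboot_if_empty(host, reboot):
    """Reboot the host only when no game sessions remain on it.

    `reboot` is any callable taking the host name (an assumption for this
    sketch). On success the host is made schedulable again.
    """
    if host.sessions:
        return False
    reboot(host.name)
    host.reboot_required = False
    host.unschedulable = False
    return True
```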

The Burden of DIY

Managing your own server infrastructure requires building the graceful reboot process yourself. It means developing a strategy for buffer capacity, enforcing session limits, and maintaining the operational discipline to track and act on daily CVEs. This is, in effect, a complex and continuous internal product that requires dedicated engineering resources and an SRE team.

GameFabric and its veteran team of site reliability engineers handle this complex, continuous process for you around the clock. By regularly terminating individual servers in a controlled manner, our platform delivers both game server security and uptime for your entire fleet, freeing your studio from the burden of critical infrastructure management.

Keep your games secure and always online. Schedule your personalized GameFabric demo today.
