Guide · Redundancy & uptime design without double-sign

Design for high uptime without ever double-signing.

Uptime and redundancy matter, but not at the cost of slashing. This page explains how to design failover and redundancy for Ethereum validators – active/standby, remote signers, DVT – and how Validator Tools can help you reason about these designs and avoid double-sign risk.

Redundancy · Failover · Remote signers · DVT · Zero double-sign tolerance

1. Uptime and redundancy: what you are really optimising for

There are two goals that must be satisfied at the same time:

  • Stay live enough – validators perform duties reliably, even when hardware or networks fail.
  • Never double-sign – no matter how failover is triggered, a validator key must never be able to sign from two places at the same time.

In practice, many slashing incidents are the result of designs that optimised for uptime without a rigorous model for how and where keys can be active.
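To make "double-sign" concrete, the sketch below mirrors the slashable conditions in the Ethereum consensus specification: two different attestations with the same target epoch (a double vote), one attestation surrounding another (a surround vote), and two different block proposals for the same slot. The `AttestationData` class is a simplified stand-in for the spec container, and comparing `signing_root` values is an assumption used here in place of comparing the full attestation data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttestationData:
    # Simplified stand-in: the real spec container carries more fields.
    source_epoch: int
    target_epoch: int
    signing_root: str  # assumption: differs whenever the signed data differs


def _surrounds(outer: AttestationData, inner: AttestationData) -> bool:
    # Surround vote: outer's source is earlier and its target is later.
    return (outer.source_epoch < inner.source_epoch
            and inner.target_epoch < outer.target_epoch)


def is_slashable_pair(a: AttestationData, b: AttestationData) -> bool:
    """Return True if one key signing both a and b would be slashable."""
    double_vote = (a.signing_root != b.signing_root
                   and a.target_epoch == b.target_epoch)
    return double_vote or _surrounds(a, b) or _surrounds(b, a)


def is_slashable_proposal(slot_a: int, root_a: str,
                          slot_b: int, root_b: str) -> bool:
    """Two different blocks signed for the same slot are slashable."""
    return slot_a == slot_b and root_a != root_b
```

Two instances holding the same key and seeing slightly different views of the chain can easily produce such a pair, which is why the rest of this guide focuses on keeping signing authority singular.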

Pattern · Single signer + standby infra
One active validator client or remote signer, with standby hardware or nodes that can attach but do not hold copies of the key or run independently.

Pattern · Remote signer + multiple CL/EL
Several CL/EL nodes, one signing authority. Resilient, but requires careful configuration and clear ownership of signer policies.

Anti-pattern · Two validator clients with same keys
Two independent nodes configured identically, both able to sign. Looks like redundancy, behaves like a double-sign accident waiting to happen.

Design principle: redundancy should be built around infrastructure (nodes, networks, clients), while signing authority for a given validator key should remain singular and clearly modelled.

2. Risk patterns to avoid in redundancy setups

When teams add redundancy under time pressure, three risk patterns show up over and over again:

2.1 “Backup node” with full key copy

A second node is provisioned with a full copy of the validator key store. A health check or manual switch flips traffic to it in case of problems. If the original is not fully stopped or if both are briefly active, double-signing is possible.
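One way to reason about this hazard is as an explicit activation guard: the backup may start signing only when the primary is positively confirmed to be out of action, not merely failing a health check. The sketch below is a minimal illustration under stated assumptions. The `PrimaryStatus` fields, the `may_activate_backup` helper and the default of two quiet epochs are all hypothetical, not a Validator Tools feature or a client recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimaryStatus:
    process_stopped: bool            # positively confirmed, not inferred from a timeout
    keys_removed_or_disabled: bool   # primary cannot sign even if it restarts
    epochs_since_last_signature: int


def may_activate_backup(status: PrimaryStatus,
                        min_quiet_epochs: int = 2) -> tuple[bool, list[str]]:
    """Return (allowed, reasons_blocking). The threshold is an assumed value."""
    reasons = []
    if not status.process_stopped:
        reasons.append("primary process not confirmed stopped")
    if not status.keys_removed_or_disabled:
        reasons.append("primary still holds usable keys")
    if status.epochs_since_last_signature < min_quiet_epochs:
        reasons.append("primary signed too recently")
    return (not reasons, reasons)
```

The guard deliberately treats "unreachable" as "unknown": a primary that lost its network link may still be signing, so a failed health check alone never satisfies `process_stopped`.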

2.2 Uncoordinated multi-region setups

Validators are deployed in multiple regions with slightly different configurations and scripts. Failover is triggered manually. Over time, drift between configurations increases the chance that two regions will run the same validators at once.
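Drift of this kind can be caught with a simple periodic comparison of which validator keys each region is configured to run. The following sketch assumes you can already produce, per region, the set of public keys loaded there; the `overlapping_keys` function and the region names are illustrative only.

```python
from itertools import combinations


def overlapping_keys(region_keys: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Return every pair of regions that share at least one validator key."""
    overlaps = {}
    for a, b in combinations(sorted(region_keys), 2):
        shared = region_keys[a] & region_keys[b]
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


if __name__ == "__main__":
    regions = {
        "eu-west": {"0xaaa", "0xbbb"},
        "us-east": {"0xbbb", "0xccc"},  # 0xbbb is configured in two regions
    }
    for pair, keys in overlapping_keys(regions).items():
        print(pair, sorted(keys))
```

An empty result does not prove the setup is safe, but any non-empty result is a configuration in which two regions could sign for the same validator.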

2.3 Mixing DVT/DKG with ad-hoc backups

Distributed validator technology or key splitting is combined with traditional “copy the keys” backups. The combined system is poorly documented and operators no longer have a clear mental model of where signing authority actually lives.

Validator Tools cannot magically prevent bad topology choices, but it can help make them visible: where keys live, which components can sign, and what failover paths exist.

3. Step by step: using Validator Tools to design redundancy without double-sign risk

These steps show how to use Validator Tools to model your redundancy design, check it for double-sign hazards, and track changes over time.

  1. Install the desktop application. Download and install the latest version of the Validator Tools GUI for your operating system (Windows, macOS or Linux). Run it on an operator workstation with network access to your validator clients, beacon nodes and any remote signers.
  2. Register all nodes and signers. Add each CL/EL/VC node and any remote signing infrastructure (HSMs, signer clusters, DVT nodes) to the infrastructure view, including which validators or key IDs they can serve.
  3. Describe your intended redundancy pattern. For each validator group (e.g. “core mainnet”, “client A EU”), select a redundancy pattern in the GUI: active–standby, remote signer with multiple CL/EL nodes, DVT cluster, or “single node, no redundancy”.
  4. Attach validators and keys to a single signing authority. Ensure that, for each validator key, the app shows one and only one signing authority: either a specific validator client instance or a named remote signer / DVT cluster. All other components should be modelled as infrastructure, not as independent signers.
  5. Model failover triggers and paths. For each pattern, document in Validator Tools how failover happens: manually via a runbook, automatically via health checks, or through external orchestration. Link those triggers to the nodes and signers they affect.
  6. Run a “double-sign risk” check. Use the app’s topology check to highlight any situation where: a key appears to be live behind multiple signers, two validator clients could sign for the same key, or DVT topology is mixed with unmanaged backups.
  7. Align documentation and runbooks to the model. Update your failover runbooks so that they explicitly reference the same patterns and components that appear in the GUI. If the written runbook and the model disagree, treat that as a bug.
  8. Review changes as deliberate design updates. When you add new regions, nodes or signers, update the model first, then the infrastructure. Use Validator Tools’ history view to track how redundancy has evolved and who changed what.

It is often easiest to formalise redundancy for one validator group first, prove that the model matches reality, and then apply the same patterns to the rest of your fleet.
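The three hazards listed in step 6 can also be expressed as a small check over a hand-written model, which is useful for reviews or CI even outside the GUI. This is a sketch only: the `Component` record, the `kind` labels and `find_hazards` are hypothetical and do not reflect how Validator Tools stores or checks its topology.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    kind: str             # "validator_client", "remote_signer", "dvt_cluster" or "backup"
    keys: frozenset[str]  # key IDs this component holds or can sign for


def find_hazards(components: list[Component]) -> list[str]:
    """Flag keys that violate the one-signing-authority rule."""
    holders = defaultdict(list)
    for component in components:
        for key in component.keys:
            holders[key].append(component)

    hazards = []
    for key in sorted(holders):
        comps = holders[key]
        signers = [c for c in comps if c.kind != "backup"]
        clients = [c for c in signers if c.kind == "validator_client"]
        if len(signers) > 1:
            names = ", ".join(sorted(c.name for c in signers))
            hazards.append(f"{key}: live behind multiple signers ({names})")
        if len(clients) > 1:
            hazards.append(f"{key}: more than one validator client can sign")
        if any(c.kind == "dvt_cluster" for c in comps) and any(c.kind == "backup" for c in comps):
            hazards.append(f"{key}: DVT cluster mixed with a full-key backup")
    return hazards
```

Running such a check against every proposed change to the model gives the "update the model first" rule in step 8 something concrete to enforce.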

4. Safer redundancy design patterns

While every setup is unique, a few patterns have emerged as relatively safe and easy to reason about:

  • Single signer, multiple CL/EL nodes. A remote signer (or central validator client) holds the keys, while multiple CL/EL nodes connect to it. If one node fails, another can attach to the same signer without copying keys.
  • Active–standby validator clients with clean handover. Only one validator client is configured to sign for a set of keys at a time. Standby instances do not hold the keys or are configured as “cold” until an explicit migration procedure is completed.
  • DVT / key sharing, without extra full-key backups. When using DVT or threshold signing, treat the DVT cluster as the single signing authority, and avoid duplicating keys outside that system.

Validator Tools can represent these patterns explicitly, so your design decisions are visible and reviewable, not just implied by scripts.

Important: “safe pattern” does not mean “cannot be misconfigured”. It means the default shape of the design makes double-signing less likely, and makes misconfigurations easier to see in the topology view.
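For the active–standby pattern, an explicit migration procedure usually includes carrying the slashing protection history across with the keys. The sketch below checks that an export in the EIP-3076 slashing protection interchange format (a JSON document with a `data` list whose entries each carry a `pubkey`) contains an entry for every key about to be moved. The `keys_missing_history` helper is illustrative, and whether a key with no signing history appears in an export depends on the client, so treat a missing entry as a prompt to investigate rather than a definitive error.

```python
import json


def keys_missing_history(interchange_json: str, keys_to_migrate: list[str]) -> list[str]:
    """Return the keys that have no entry in the interchange export."""
    doc = json.loads(interchange_json)
    covered = {entry["pubkey"].lower() for entry in doc.get("data", [])}
    return [key for key in keys_to_migrate if key.lower() not in covered]
```

A handover script can refuse to enable the standby while this list is non-empty, so that the new signer never starts without knowing what the old one already signed.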

5. Using Validator Tools as a design review surface

In many teams, redundancy decisions are scattered across Terraform, Ansible, shell scripts and ad-hoc documentation. Validator Tools provides a neutral surface where:

  • infrastructure engineers can see how validators and signers are wired together,
  • security teams can review where keys exist and how failover works,
  • governance or risk committees can understand the impact of regional outages.

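You can treat the topology and redundancy views as a diagram that stays in step with the model, because they are built directly from the registered components, keys and validators rather than drawn by hand. They remain accurate only as long as the registered components reflect what is actually deployed.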

If your current redundancy is “whatever our scripts do”, it is worth modelling it explicitly at least once. You can download Validator Tools, register your existing nodes and signers, and see what your current topology actually looks like from a double-sign risk perspective.

6. Practical recommendations for uptime without double-sign

To make redundancy work in your favour, not against you:

  • Separate infrastructure redundancy from signing authority. It is fine to have many CL/EL nodes and networks; there should still be a single, clearly defined signer per validator key.
  • Treat failover as a design, not a script hack. What triggers failover, who approves it and how it is executed should be documented and reflected in your tooling.
  • Model before you deploy. Use Validator Tools (or another modelling approach) to sketch redundancy first, then build infrastructure to match it, not the other way around.
  • Revisit your design after major changes. Enabling DVT, moving to remote signers, or adding regions should always trigger a design review and a new topology snapshot.
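A design review after a major change is easier when it starts from a diff of signing authority rather than a diff of scripts. The sketch below compares two topology snapshots, each assumed to be a plain mapping from key ID to the name of its signing authority; the snapshot shape and `signer_changes` are hypothetical and not an export format of Validator Tools.

```python
from typing import Optional


def signer_changes(before: dict[str, str],
                   after: dict[str, str]) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Return keys whose signing authority was added, removed or changed."""
    changes = {}
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


if __name__ == "__main__":
    before = {"k1": "vc-eu-1", "k2": "vc-eu-1"}
    after = {"k1": "remote-signer-1", "k2": "vc-eu-1", "k3": "dvt-cluster-a"}
    for key, (old, new) in signer_changes(before, after).items():
        print(f"{key}: {old} -> {new}")
```

Every line in the output is a key whose signing authority moved, which is exactly the set of changes a reviewer needs to approve.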

Uptime is only a win if it doesn’t come with slashing risk attached. If you want a structured way to design and maintain redundancy, you can download Validator Tools, connect it to your validator stack and start by modelling a single, critical validator group. Pair that design with the monitoring & alerting guide so the failover conditions are clear and observable.