Cubist Blog - Your validator can get slashed even if you do everything by the book

Getting slashed is every staker's nightmare: your validator loses its signing privileges, you lose money and, if you're a service, you lose credibility, too.

In an ideal world, behaving honestly would be enough to protect you from slashing. Unfortunately, our world is far from ideal; even honest operators can accidentally sign two divergent messages with the same key. This blog post outlines a handful of such accidental slashing scenarios, and explains how you can protect yourself and your validators—even if something goes wrong.

Accidental slashing scenarios

The most basic recipe for accidental slashing is running two validator clients with the same validator key(s). At first glance, this setup seems easy to avoid: make sure you maintain a separate set of keys for each validator client. In practice, though, this is both insufficient and surprisingly tough; it is all too easy to run and maintain your validators in subtly slashable ways.

Scenario One: Unreliable networks and machines

Validator clients and the beacon node clients with which they interface are complex pieces of software operating complex protocols. As a result, there are all sorts of ways that a single validator client can produce two conflicting, slashable signatures. For example, if the network is unreliable, the beacon node (which tries to maintain connections only to reliable peers) may end up on a different fork of the chain.¹ If the node subsequently asks a validator client to sign for the same slot both pre-fork and post-fork, that validator will get slashed for signing conflicting messages.

Luckily, most validator clients offer a local anti-slashing database. This database tracks every signature that the client's validators produce. Before signing, the client checks the database to ensure that it's not about to produce a slashable signature; if it is, the client refuses to sign and submit the offending message. This is a great step forward, and you should always use the local anti-slashing database! As we'll see shortly, though, it doesn't solve the slashing problem when you have more than one validator client or when you're trying to run a highly reliable client.
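To make the check-before-signing idea concrete, here is a minimal Python sketch. It is not how any particular client implements its database: the class and method names (`LocalSlashingDB`, `may_sign_block`, `may_sign_attestation`) are hypothetical, the history lives in memory, and real clients are typically more conservative than this. It only encodes the three slashable conditions: double proposals, double votes, and surround votes.

```python
from dataclasses import dataclass, field


@dataclass
class LocalSlashingDB:
    """Hypothetical in-memory stand-in for a client's anti-slashing database."""

    # pubkey -> {slot: signing_root} for block proposals
    blocks: dict = field(default_factory=dict)
    # pubkey -> [(source_epoch, target_epoch, signing_root)] for attestations
    attestations: dict = field(default_factory=dict)

    def may_sign_block(self, pubkey: str, slot: int, signing_root: str) -> bool:
        signed = self.blocks.setdefault(pubkey, {})
        if slot in signed:
            # Re-signing the identical block is harmless;
            # a different block for the same slot is a double proposal.
            return signed[slot] == signing_root
        signed[slot] = signing_root
        return True

    def may_sign_attestation(
        self, pubkey: str, source: int, target: int, signing_root: str
    ) -> bool:
        history = self.attestations.setdefault(pubkey, [])
        for old_source, old_target, old_root in history:
            if old_target == target:
                # Same target epoch: only the identical message is safe (double vote).
                return old_root == signing_root
            if old_source < source and target < old_target:
                return False  # an earlier attestation surrounds the new one
            if source < old_source and old_target < target:
                return False  # the new attestation surrounds an earlier one
        history.append((source, target, signing_root))
        return True


db = LocalSlashingDB()
print(db.may_sign_block("0xaa", 10, "0x01"))  # first signature for this slot
print(db.may_sign_block("0xaa", 10, "0x02"))  # conflicting block, refused
```

Note that the protection is only as good as the history: if the same key signs through a second client with its own, separate database, neither database sees the other's signatures.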

Scenario Two: Software upgrades

In practice, you'll eventually need to upgrade your validator software and/or patch the underlying operating system. The most straightforward upgrade workflow is to stop your validator client, update it, and then start it again. If you have many clients to upgrade, though—or if you're trying to avoid serious downtime—you may end up accidentally operating two copies of the same validator client at once. This means having two clients with the same keys—and thus two clients in a position to sign conflicting, slashable messages.

Upgrade-related slashing sounds deceptively simple. In reality, many operators depend on complex infrastructure scripts for pulling up and tearing down validators; a simple misunderstanding of infrastructure-as-code semantics can lead to a validator coming online before its out-of-date twin signs off.

Scenario Three: Maintenance and migrations

If you run multiple validator machines, you may occasionally need to load-balance validator keys among those different machines. This process can introduce slashable mistakes. For example, if you accidentally copy a key to a new validator client without removing it from the original, that's it–you're in danger of being slashed.

There are more subtle versions of this issue, too. For example, forgetting to update the local slashing database after migrating validator keys from one client to another introduces a ticking slashing timebomb. The operator thinks they're protected by the database, but since the database excludes certain keys' histories, simple network issues can once again introduce slashable signatures.
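One way to catch the missing-history mistake before it bites is to compare the keys you are migrating against the slashing-protection history you exported for them. The sketch below assumes the history is in the EIP-3076 interchange JSON format, with a top-level `data` list whose records carry a `pubkey` field; those field names reflect my reading of the EIP, so check them against the spec. The function name `keys_missing_history` is hypothetical.

```python
import json


def keys_missing_history(interchange_json: str, migrated_pubkeys: list) -> list:
    """Return the migrated keys that have no record in the interchange file."""
    interchange = json.loads(interchange_json)
    # Collect every public key that has a history record (case-insensitive hex).
    covered = {record["pubkey"].lower() for record in interchange.get("data", [])}
    return [pk for pk in migrated_pubkeys if pk.lower() not in covered]


# Illustrative export; structure assumed from EIP-3076, values are made up.
export = json.dumps({
    "metadata": {"interchange_format_version": "5"},
    "data": [
        {"pubkey": "0xAA", "signed_blocks": [], "signed_attestations": []},
    ],
})

# "0xbb" is being migrated without any history, so it should be flagged.
print(keys_missing_history(export, ["0xaa", "0xbb"]))
```

A check like this only tells you a record exists; it does not prove the record is complete or up to date.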

Similarly, both at setup and during migrations, it's possible to improperly configure validator clients such that they get slashed. For example, you can accidentally use a fixed–and fatally small–amount of storage for the local slashing database; you can also configure database storage such that it's accidentally non-persistent. Both of these errors re-introduce the possibility of producing slashable signatures from a single validator client.

Scenario Four: Client bugs

The previous two scenarios are distressing, but at least they're within your control. Not so with validator client bugs: if your client incorrectly requests a slashable signature, its local database has already failed to protect you, and there's very little you can do to protect yourself.

Bugs that lead to slashing aren't always obvious. For example, a Lido operator recently experienced a power outage that forced them to move validators from one machine to another. This operator properly deleted all the migrated keys from its Prysm instances, and even confirmed that those keys had been deleted. Unfortunately, a bug in Prysm meant that the confirmation was false: those keys hadn't actually been deleted. When the operator re-started the buggy client, it re-imported the supposedly deleted keys and started signing slashable messages.

The validator node operator didn't do anything wrong—but they ended up slashed anyway.

Global slashing databases can help

Tools like Web3Signer and Cubist's key management solution, CubeSigner, can maintain global slashing databases. For example, CubeSigner stores every signature it generates for every one of your validators, and automatically applies anti-slashing checks following EIP-3076 before every signature it produces. Even if you make slashable signature requests, and even if you make them from different validator clients, CubeSigner will refuse to sign a slashable message. Finally, CubeSigner does this without the classic remote-signing pitfalls: it doesn't compromise on latency or reliability. If you want to talk more about anti-slashing, reach out!
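To illustrate why a global database helps, here is a minimal Python sketch of a shared store for block proposals. It is not CubeSigner's or Web3Signer's implementation; the class name `GlobalBlockGuard`, the table layout, and the use of SQLite are all assumptions made for illustration. The point is that every signing request, no matter which validator client sends it, goes through one atomic check-and-record step.

```python
import sqlite3


class GlobalBlockGuard:
    """Hypothetical shared anti-slashing store for block proposals."""

    def __init__(self, path: str = ":memory:"):
        # In a real deployment the path would point at durable, shared storage.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS signed_blocks ("
            "pubkey TEXT NOT NULL, slot INTEGER NOT NULL, "
            "signing_root TEXT NOT NULL, PRIMARY KEY (pubkey, slot))"
        )

    def approve(self, pubkey: str, slot: int, signing_root: str) -> bool:
        """Record the request and approve it, or refuse a conflicting one."""
        try:
            with self.conn:  # commits on success, rolls back on error
                self.conn.execute(
                    "INSERT INTO signed_blocks VALUES (?, ?, ?)",
                    (pubkey, slot, signing_root),
                )
            return True
        except sqlite3.IntegrityError:
            # Something was already signed for this key and slot;
            # only an identical message is safe to sign again.
            row = self.conn.execute(
                "SELECT signing_root FROM signed_blocks WHERE pubkey = ? AND slot = ?",
                (pubkey, slot),
            ).fetchone()
            return row[0] == signing_root


guard = GlobalBlockGuard()
print(guard.approve("0xaa", 100, "0x01"))  # request from one validator client
print(guard.approve("0xaa", 100, "0x02"))  # conflicting request from another client
```

Because the primary key is the (key, slot) pair rather than anything client-specific, a duplicate client left running after an upgrade or migration cannot get a conflicting block signed. A full version would apply the same idea to attestations.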


[1]: Kiln has a great write-up on this too.
