> For the complete documentation index, see [llms.txt](https://docs.snowbridge.network/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.snowbridge.network/resources/emergency-procedures.md).

# Emergency Procedures

On-call runbook. Halt mechanics are at the top so they're easy to reach under pressure. For routine governance, see [Governance and Operational Processes](/resources/governance-and-operational-processes.md).

Halt is technically reversible but the halt referendum on Polkassembly is public. See [Decision authority](#decision-authority) for when solo action vs team confirmation applies.

## Producing the preimage and submission links

The governance page at [app.snowbridge.network/governance](https://app.snowbridge.network/governance) is the single source of truth during an incident. Select the halt scope (see [Halt scopes reference](#halt-scopes-reference)) and the page emits both the preimage **and** the two ready-to-submit papi.how links. Take the links straight to [Submitting](#submitting).

**Fallback (UI down only)**: call `buildHaltBridgePreimage` then `buildHaltBridgeSubmissionUrls` from `@snowbridge/api`. Produces the same preimage + URLs the UI shows.

## Halt scopes reference

Pick the narrowest scope that covers the failure mode. Governance page form fields:

* **All** Every component. Default if nothing else selected.
* **Ethereum client** Halts `EthereumBeaconClient::submit` and short-circuits `Verifier::verify` for all BridgeHub consumers. Stops V1 + V2 inbound `submit` and `outbound-queue-v2::submit_delivery_receipt`. Use for beacon-light-client or sync-committee compromise. `force_checkpoint` stays available (root-only) for recovery.
* **Inbound queue** Both V1 + V2 inbound pallets on BridgeHub.
* **Inbound queue V1** V1 inbound only.
* **Inbound queue V2** V2 inbound only.
* **Outbound queue** V1 outbound on BridgeHub **and** AssetHub system-frontend (short-circuits `PausableExporter` for V1 + V2 at XcmRouter). V2 has no local outbound halt, so system-frontend is the primary V2 outbound lever.
* **System frontend** AssetHub system-frontend only. Blocks V1 + V2 P→E at `PausableExporter` (`SendError::NotApplicable`). V1 BridgeHub outbound keeps draining in-flight messages.
* **Gateway** Sends `Command::SetOperatingMode(Halted)` to the Ethereum Gateway via both V1 + V2 system pallets. Delivery is relayer-dependent, so schedule **before** local outbound halts.
* **Gateway V2** V2-only Gateway halt. Blocks `v2_sendMessage` and `v2_registerToken` once delivered. Pair with **Inbound queue V2** + **AssetHub max fee V2** for a V2-only pause.
* **AssetHub max fee** Sets `BridgeHubEthereumBaseFee` + `BridgeHubEthereumBaseFeeV2` to `u128::MAX`. Fee deterrent, not a router halt.
* **AssetHub max fee V2** V2-only variant. Writes only `BridgeHubEthereumBaseFeeV2`. Only V2-isolated P→E lever.

| Failure mode                                    | Scopes                                                          |
| ----------------------------------------------- | --------------------------------------------------------------- |
| Beacon light client / sync committee compromise | **Ethereum client**                                             |
| Ethereum Gateway compromise                     | **Gateway** + **AssetHub max fee**                              |
| Inbound-queue bug (one version)                 | **Inbound queue V1** or **Inbound queue V2**                    |
| Outbound-queue / system-frontend bug            | **Outbound queue**                                              |
| V2 P→E only (V1 keeps flowing)                  | **AssetHub max fee V2** (fee deterrent only)                    |
| Full V2 pause                                   | **Gateway V2** + **Inbound queue V2** + **AssetHub max fee V2** |
| Full P→E halt (V1 + V2)                         | **Outbound queue** or **System frontend**                       |
| Uncertain                                       | **All**                                                         |

When uncertain: **All**. To block both directions immediately: **Gateway** + **AssetHub max fee**.

## Submitting

Submission goes through OpenGov's **Whitelisted Caller** track, which requires the Polkadot Fellowship to whitelist the call first. From the governance page's result panel, two papi.how links handle this end-to-end:

1. **Asset Hub batch** Click **Open**. Notes the preimage and opens the public Whitelisted Caller referendum on Asset Hub. Anyone on the team can submit. Sign in papi.how.
2. **Fellowship whitelist** Click **Copy** and share the link in the Polkadot Fellowship Element channel (see [Comms](#comms-during-an-incident)). Must be submitted by a Fellow of rank 3 or higher. Bottleneck of the flow.

Enactment defaults to `After(10)` blocks (matches opengov-cli's default).

**Wall-clock: hours, not minutes.** Run Polkadot Fellowship escalation in parallel with the Asset Hub submission.

### Fallback: opengov-cli

If the governance page is unreachable, construct the same submission locally with [opengov-cli](https://github.com/joepetrowski/opengov-cli) and the preimage bytes (which the SDK fallback in [Producing](#producing-the-preimage-and-submission-links) can still generate offline):

{% code overflow="wrap" %}

```
opengov-cli submit-referendum \
    --proposal 0x<preimage-bytes> \
    --network polkadot \
    --track whitelistedcaller \
    --output AppsUiLink
```

{% endcode %}

Emits the same two papi.how URLs the UI shows. Use only when the UI is down; the UI is the single source of truth the team drives from during an incident.

## Verifying the halt

After the call executes, query each affected chain's `OperatingMode` storage (expected: `Halted`).

| Halt scope                                          | Chain     | Storage                                                                                                              |
| --------------------------------------------------- | --------- | -------------------------------------------------------------------------------------------------------------------- |
| **Ethereum client**                                 | BridgeHub | `ethereumBeaconClient.operatingMode`                                                                                 |
| **Inbound queue V1**                                | BridgeHub | `ethereumInboundQueue.operatingMode`                                                                                 |
| **Inbound queue V2**                                | BridgeHub | `ethereumInboundQueueV2.operatingMode`                                                                               |
| **Outbound queue** (BridgeHub)                      | BridgeHub | `ethereumOutboundQueue.operatingMode`                                                                                |
| **Outbound queue** / **System frontend** (AssetHub) | AssetHub  | `systemFrontend.operatingMode`                                                                                       |
| **Gateway** (BridgeHub side)                        | BridgeHub | `ethereumSystem.operatingMode`, `ethereumSystemV2.operatingMode`                                                     |
| **Gateway** (Ethereum contract)                     | Ethereum  | `Gateway.operatingMode() == Halted`. **Relayer-dependent**: watch for `SetOperatingMode` event before confirming.    |
| **AssetHub max fee**                                | AssetHub  | `bridgeHubEthereumBaseFee` + `bridgeHubEthereumBaseFeeV2` == `u128::MAX` (`340282366920938463463374607431768211455`) |
| **AssetHub max fee V2**                             | AssetHub  | `bridgeHubEthereumBaseFeeV2` == `u128::MAX`                                                                          |

Polkadot-side halt is the firm guarantee. If Gateway isn't halted yet (no relayer delivery), `sendToken`/`sendMessage` on Ethereum still accept calls but nothing downstream processes them.

## Detection

Triggers for the incident flow:

* **Funds drained or unexpectedly moved**. Highest priority. Halt first, investigate after.
* **Bug bounty report** (HackenProof or direct), verified by a team member as a valid exploit with working PoC.

When in doubt: post in Slack, treat as incident until ruled out.

## Decision authority

| Action                          | Threshold                                                                                 |
| ------------------------------- | ----------------------------------------------------------------------------------------- |
| Solo halt                       | 1 member. **Only for visible exploit / funds being drained.**                             |
| Confirmed halt                  | 2 members agree. Default for bug bounty, anomalies, "I don't understand what I'm seeing." |
| Escalate to Polkadot Fellowship | 2 members agree. Same conversation as confirmed halt in practice.                         |
| Public comms                    | Full team. **Only after fix is deployed and bridge is resuming.**                         |
| Emergency upgrade               | Coordinated with Polkadot Fellowship. Code is exploit-sensitive.                          |

A halt referendum on Polkassembly is public, so solo authority is reserved for cases where the incident is already public (funds moving). Otherwise discuss in Slack first.

## Comms during an incident

Each step assumes the previous one has happened.

1. **Slack** `#snowbridge-security` Post the signal (link to explorer, alert, bounty report). Non-visible signals: wait for at least one teammate to confirm before halting. Visible exploit: skip ahead.
2. **Halt** See [Producing](#producing-the-preimage-and-submission-links) + [Submitting](#submitting). For visible exploits, run in parallel with steps 3 and 4.
3. **Internal confirmation** 2+ members agree it's a real incident. Retroactive for solo-halt cases.
4. **Element with Polkadot Fellowship** New room, invite Adrian, Bastian, Oliver. Fellowship coordination happens here.
5. **Integrators** Telegram. Hydration first, then others. Tell them what's halted + expected resume timing.
6. **No public comms** (Twitter/X, forum, blog, public Discord) until fix is deployed and resume is in flight.

## Resuming the bridge

Same flow as halting: the governance page emits the resume preimage and the two submission links. Select scopes matching what was halted, then proceed via [Submitting](#submitting).

Fallback: `buildResumeBridgePreimage` + `buildResumeBridgeSubmissionUrls` in `@snowbridge/api`.

Before submitting, confirm:

* Fix is deployed and verified in production.
* Full team has signed off.
* Monitoring is back to baseline, no fresh anomalies.
* AssetHub fee values being restored to pre-incident values. Resume writes `BridgeHubEthereumBaseFee` and `BridgeHubEthereumBaseFeeV2` back to known good defaults (currently `14_929_540_998` for V1, `1_000_000_000` for V2). Double-check these match what was live before.

Public comms can begin once resume executes and the bridge is processing again.

## Emergency Upgrade

For cases a halt alone can't contain (e.g. critical pallet logic bug):

* **Halt first anyway.** Buys time to develop the upgrade without pressure. Skip only if halting is itself harmful.
* **Restrict the code.** Upgrade code for an unpatched vulnerability is itself exploit material. Private branch, limited reviewers (team + necessary Polkadot Fellowship contacts). Don't publicise until executed on-chain.
* **Coordinate the pathway with Polkadot Fellowship.** Whitelisted Caller for runtime, multi-sig for contracts. Use the Element channel.
* **Resume** only after the upgrade is verified live ([Resuming the bridge](#resuming-the-bridge)).

## Post-mortem

Within 48h of resume:

* **Owner** Whoever drove the incident (defaults to whoever halted first).
* **Format** Google Doc, shared with team + Polkadot Fellowship contacts from the Element channel.
* **Contents** Timeline (timestamps), root cause, halt scope + reason, what worked, what didn't, action items with owners and dates.
* **Action items** Track in the team issue tracker, not the doc itself.

Also write one for false-positive halts. Tuning detection signals to reduce false positives is itself useful output.


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://docs.snowbridge.network/resources/emergency-procedures.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
