Statement Distribution
This subsystem is responsible for distributing signed statements that we have generated and forwarding statements
generated by our peers. Received candidate receipts and statements are passed to the Candidate Backing
subsystem to handle producing local statements. On receiving
StatementDistributionMessage::Share
, this subsystem distributes the message across the network with redundancy to
ensure a fast backing process.
Overview
Goal: every well-connected node is aware of every next potential parachain block.
Validators can either:
- receive parachain block from collator, check block, and gossip statement.
- receive statements from other validators, check the parachain block if it originated within their own group, gossip forward statement if valid.
Validators must have statements, candidates, and persisted validation from all other validators. This is because we need to store statements from validators who've checked the candidate on the relay chain, so we know who to hold accountable in case of disputes. Any validator can be selected as the next relay-chain block author, and this is not revealed in advance for security reasons. As a result, all validators must have a up to date view of all possible parachain candidates + backing statements that could be placed on-chain in the next block.
This blog post puts it another way: "Validators who aren't assigned to the parachain still listen for the attestations [statements] because whichever validator ends up being the author of the relay-chain block needs to bundle up attested parachain blocks for several parachains and place them into the relay-chain block."
Backing-group quorum (that is, enough backing group votes) must be reached before the block author will consider the candidate. Therefore, validators need to consider all seconded candidates within their own group, because that's what they're assigned to work on. Validators only need to consider backable candidates from other groups. This informs the design of the statement distribution protocol to have separate phases for in-group and out-group distribution, respectively called "cluster" and "grid" mode (see below).
With Async Backing
Asynchronous backing changes the runtime to accept parachain candidates from a certain allowed range of historic relay-parents. These candidates must be backed by the group assigned to the parachain as-of their corresponding relay parents.
Protocol
To address the concern of dealing with large numbers of spam candidates or statements, the overall design approach is to
combine a focused "clustering" protocol for legitimate fresh candidates with a broad-distribution "grid" protocol to
quickly get backed candidates into the hands of many validators. Validators do not eagerly send each other heavy
CommittedCandidateReceipt
, but instead request these lazily through request/response protocols.
A high-level description of the protocol follows:
Messages
Nodes can send each other a few kinds of messages: Statement
, BackedCandidateManifest
,
BackedCandidateAcknowledgement
.
Statement
messages contain only a signed compact statement, without full candidate info.BackedCandidateManifest
messages advertise a description of a backed candidate and stored statements.BackedCandidateAcknowledgement
messages acknowledge that a backed candidate is fully known.
Request/response protocol
Nodes can request the full CommittedCandidateReceipt
and PersistedValidationData
, along with statements, over a
request/response protocol. This is the AttestedCandidateRequest
; the response is AttestedCandidateResponse
.
Importability and the Hypothetical Frontier
The prospective parachains subsystem maintains prospective "fragment trees" which can be used to determine whether a particular parachain candidate could possibly be included in the future. Candidates which either are within a fragment tree or would be part of a fragment tree if accepted are said to be in the "hypothetical frontier".
The statement-distribution subsystem keeps track of all candidates, and updates its knowledge of the hypothetical frontier based on events such as new relay parents, new confirmed candidates, and newly backed candidates.
We only consider statements as "importable" when the corresponding candidate is part of the hypothetical frontier, and only send "importable" statements to the backing subsystem itself.
Cluster Mode
- Validator nodes are partitioned into groups (with some exceptions), and validators within a group at a relay-parent
can send each other
Statement
messages for any candidates within that group and based on that relay-parent. - This is referred to as the "cluster" mode.
- Right now these are the same as backing groups, though "cluster" specifically refers to the set of nodes communicating with each other in the first phase of distribution.
Seconded
statements must be sent beforeValid
statements.Seconded
statements may only be sent to other members of the group when the candidate is fully known by the local validator.- "Fully known" means the validator has the full
CommittedCandidateReceipt
andPersistedValidationData
, which it receives on request from other validators or from a collator. - The reason for this is that sending a statement (which is always a
CompactStatement
carrying nothing but a hash and signature) to the cluster, is also a signal that the sending node is available to request the candidate from. - This makes the protocol easier to reason about, while also reducing network messages about candidates that don't really exist.
- "Fully known" means the validator has the full
- Validators in a cluster receiving messages about unknown candidates request the candidate (and statements) from other cluster members which have it.
- Spam considerations
- The maximum depth of candidates allowed in asynchronous backing determines the maximum amount of
Seconded
statements originating from a validator V which each validator in a cluster may send to others. This bounds the number of candidates. - There is a small number of validators in each group, which further limits the amount of candidates.
- The maximum depth of candidates allowed in asynchronous backing determines the maximum amount of
- We accept candidates which don't fit in the fragment trees of any relay parents.
- "Accept" means "attempt to request and store in memory until useful or expired".
- We listen to prospective parachains subsystem to learn of new additions to the fragment trees.
- Use this to attempt to import the candidate later.
Grid Mode
- Every consensus session provides randomness and a fixed validator set, which is used to build a redundant grid
topology.
- It's redundant in the sense that there are 2 paths from every node to every other node. See "Grid Topology" section for more details.
- This grid topology is used to create a sending path from each validator group to every validator.
- When a node observes a candidate as backed, it sends a
BackedCandidateManifest
to their "receiving" nodes. - If receiving nodes don't yet know the candidate, they request it.
- Once they know the candidate, they respond with a
BackedCandidateAcknowledgement
. - Once two nodes perform a manifest/acknowledgement exchange, they can send
Statement
messages directly to each other for any new statements they might need.- This limits the amount of statements we'd have to deal with w.r.t. candidates that don't really exist. See "Manifest Exchange" section.
- There are limitations on the number of candidates that can be advertised by each peer, similar to those in the cluster. Validators do not request candidates which exceed these limitations.
- Validators request candidates as soon as they are advertised, but do not import the statements until the candidate is part of the hypothetical frontier, and do not re-advertise or acknowledge until the candidate is considered both backable and part of the hypothetical frontier.
- Note that requesting is not an implicit acknowledgement, and an explicit acknowledgement must be sent upon receipt.
Disabled validators
After a validator is disabled in the runtime, other validators should no longer accept statements from it. Filtering out of statements from disabled validators on the node side is purely an optimization, as it will be done in the runtime as well.
We use the state of the relay parent to check whether a validator is disabled to avoid race conditions and ensure that disabling works well in the presense of re-enabling.
Messages
Incoming
ActiveLeaves
- Notification of a change in the set of active leaves.
StatementDistributionMessage::Share
- Notification of a locally-originating statement. That is, this statement comes from our node and should be distributed to other nodes.
- Sent by the Backing Subsystem after it successfully imports a locally-originating statement.
StatementDistributionMessage::Backed
- Notification of a candidate being backed (received enough validity votes from the backing group).
- Sent by the Backing Subsystem after it successfully imports a statement for the first time and after sending ~Share~.
StatementDistributionMessage::NetworkBridgeUpdate
- See next section.
Network bridge events
- v1 compatibility
- Messages for the v1 protocol are routed to the legacy statement distribution.
Statement
- Notification of a signed statement.
- Sent by a peer's Statement Distribution subsystem when circulating statements.
BackedCandidateManifest
- Notification of a backed candidate being known by the sending node.
- For the candidate being requested by the receiving node if needed.
- Announcement.
- Sent by a peer's Statement Distribution subsystem.
BackedCandidateKnown
- Notification of a backed candidate being known by the sending node.
- For informing a receiving node which already has the candidate.
- Acknowledgement.
- Sent by a peer's Statement Distribution subsystem.
Outgoing
NetworkBridgeTxMessage::SendValidationMessages
- Sends a peer all pending messages / acknowledgements / statements for a relay parent, either through the cluster or the grid.
NetworkBridgeTxMessage::SendValidationMessage
- Circulates a compact statement to all peers who need it, either through the cluster or the grid.
NetworkBridgeTxMessage::ReportPeer
- Reports a peer (either good or bad).
CandidateBackingMessage::Statement
- Note a validator's statement about a particular candidate.
ProspectiveParachainsMessage::GetHypotheticalMembership
- Gets the hypothetical frontier membership of candidates under active leaves' fragment trees.
NetworkBridgeTxMessage::SendRequests
- Sends requests, initiating the request/response protocol.
Request/Response
We also have a request/response protocol because validators do not eagerly send each other heavy
CommittedCandidateReceipt
, but instead need to request these lazily.
Protocol
-
Requesting Validator
- Requests are queued up with
RequestManager::get_or_insert
.- Done as needed, when handling incoming manifests/statements.
RequestManager::dispatch_requests
sends any queued-up requests.- Calls
RequestManager::next_request
to completion.- Creates the
OutgoingRequest
, saves the receiver inRequestManager::pending_responses
.
- Creates the
- Does nothing if we have more responses pending than the limit of parallel requests.
- Calls
- Requests are queued up with
-
Peer
- Requests come in on a peer on the
IncomingRequestReceiver
.- Runs in a background responder task which feeds requests to
answer_request
throughMuxedMessage
. - This responder task has a limit on the number of parallel requests.
- Runs in a background responder task which feeds requests to
answer_request
on the peer takes the request and sends a response.- Does this using the response sender on the request.
- Requests come in on a peer on the
-
Requesting Validator
receive_response
on the original validator yields a response.- Response was sent on the request's response sender.
- Uses
RequestManager::await_incoming
to await on pending responses in an unordered fashion. - Runs on the
MuxedMessage
receiver.
handle_response
handles the response.
API
dispatch_requests
- Dispatches pending requests for candidate data & statements.
answer_request
- Answers an incoming request for a candidate.
- Takes an incoming
AttestedCandidateRequest
.
receive_response
- Wait on the next incoming response.
- If there are no requests pending, this future never resolves.
- Returns
UnhandledResponse
handle_response
- Handles an incoming response.
- Takes
UnhandledResponse
Manifests
A manifest is a message about a known backed candidate, along with a description of the statements backing it. It can be one of two kinds:
Full
: Contains information about the candidate and should be sent to peers who may not have the candidate yet. This is also called anAnnouncement
.Acknowledgement
: Omits information implicit in the candidate, and should be sent to peers which are guaranteed to have the candidate already.
Manifest Exchange
Manifest exchange is when a receiving node received a Full
manifest and replied with an Acknowledgement
. It
indicates that both nodes know the candidate as valid and backed. This allows the nodes to send Statement
messages
directly to each other for any new statements.
Why? This limits the amount of statements we'd have to deal with w.r.t. candidates that don't really exist. Limiting out-of-group statement distribution between peers to only candidates that both peers agree are backed and exist ensures we only have to store statements about real candidates.
In practice, manifest exchange means that one of three things have happened:
- They announced, we acknowledged.
- We announced, they acknowledged.
- We announced, they announced.
Concerning the last case, note that it is possible for two nodes to have each other in their sending set. Consider:
1 2
3 4
If validators 2 and 4 are in group B, then there is a path 2->1->3
and 4->3->1
. Therefore, 1 and 3 might send each
other manifests for the same candidate at the same time, without having seen the other's yet. This also counts as a
manifest exchange, but is only allowed to occur in this way.
After the exchange is complete, we update pending statements. Pending statements are those we know locally that the remote node does not.
Alternative Paths Through The Topology
Nodes should send a BackedCandidateAcknowledgement(CandidateHash, StatementFilter)
notification to any peer which has
sent a manifest, and the candidate has been acquired by other means. This keeps alternative paths through the topology
open, which allows nodes to receive additional statements that come later, but not after the candidate has been posted
on-chain.
This is mostly about the limitation that the runtime has no way for block authors to post statements that come after the parablock is posted on-chain and ensure those validators still get rewarded. Technically, we only need enough statements to back the candidate and the manifest + request will provide that. But more statements might come shortly afterwards, and we want those to end up on-chain as well to ensure all validators in the group are rewarded.
For clarity, here is the full timeline:
- candidate seconded
- backable in cluster
- distributed along grid
- latecomers issue statements
- candidate posted on chain
- really latecomers issue statements
Cluster Module
The cluster module provides direct distribution of unbacked candidates within a group. By utilizing this initial phase
of propagating only within clusters/groups, we bound the number of Seconded
messages per validator per relay-parent,
helping us prevent spam. Validators can try to circumvent this, but they would only consume a few KB of memory and it is
trivially slashable on chain.
The cluster module determines whether to accept/reject messages from other validators in the same group. It keeps track of what we have sent to other validators in the group, and pending statements. For the full protocol, see "Protocol".
Grid Module
The grid module provides distribution of backed candidates and late statements outside the backing group. For the full protocol, see the "Protocol" section.
Grid Topology
For distributing outside our cluster (aka backing group) we use a 2D grid topology. This limits the amount of peers we send messages to, and handles view updates.
The basic operation of the grid topology is that:
- A validator producing a message sends it to its row-neighbors and its column-neighbors.
- A validator receiving a message originating from one of its row-neighbors sends it to its column-neighbors.
- A validator receiving a message originating from one of its column-neighbors sends it to its row-neighbors.
This grid approach defines 2 unique paths for every validator to reach every other validator in at most 2 hops, providing redundancy.
Propagation follows these rules:
- Each node has a receiving set and a sending set. These are different for each group. That is, if a node receives a candidate from group A, it checks if it is allowed to receive from that node for candidates from group A.
- For groups that we are in, receive from nobody and send to our X/Y peers.
- For groups that we are not part of:
- We receive from any validator in the group we share a slice with and send to the corresponding X/Y slice in the other dimension.
- For any validators we don't share a slice with, we receive from the nodes which share a slice with them.
Example
For size 11, the matrix would be:
0 1 2
3 4 5
6 7 8
9 10
e.g. for index 10, the neighbors would be 1, 4, 7, 9 -- these are the nodes we could directly communicate with (e.g. either send to or receive from).
Now, which of these neighbors can 10 receive from? Recall that the sending/receiving sets for 10 would be different for different groups. Here are some hypothetical scenarios:
- Scenario 1: 9 belongs to group A but not 10. Here, 10 can directly receive candidates from group A from 9. 10 would propagate them to the nodes in {1, 4, 7} that are not in A.
- Scenario 2: 6 is in group A instead of 9, and 7 is not in group A. 10 can receive group A messages from 7 or 9. 10 will try to relay these messages, but 7 and 9 together should have already propagated the message to all x/y peers of 10. If so, then 10 will just receive acknowledgements in reply rather than requests.
- Scenario 3: 10 itself is in group A. 10 would not receive candidates from this group from any other nodes through the grid. It would itself send such candidates to all its neighbors that are not in A.
Seconding Limit
The seconding limit is a per-validator limit. Before asynchronous backing, we had a rule that every validator was only
allowed to second one candidate per relay parent. With asynchronous backing, we have a 'maximum depth' which makes it
possible to second multiple candidates per relay parent. The seconding limit is set to max depth + 1
to set an upper
bound on candidates entering the system.
Candidates Module
The candidates module provides a tracker for all known candidates in the view, whether they are confirmed or not, and how peers have advertised the candidates. What is a confirmed candidate? It is a candidate for which we have the full receipt and the persisted validation data. This module gets confirmed candidates from two sources:
- It can be that a validator fetched a collation directly from the collator and validated it.
- The first time a validator gets an announcement for an unknown candidate, it will send a request for the candidate.
Upon receiving a response and validating it (see
UnhandledResponse::validate_response
), it will mark the candidate as confirmed.
Requests Module
The requests module provides a manager for pending requests for candidate data, as well as pending responses. See "Request/Response Protocol" for a high-level description of the flow. See module-docs for full details.
Glossary
- Acknowledgement: A partial manifest sent to a validator that already has the candidate to inform them that the sending node also knows the candidate. Concludes a manifest exchange.
- Announcement: A full manifest indicating that a backed candidate is known by the sending node. Initiates a manifest exchange.
- Attestation: See "Statement".
- Backable vs. Backed:
- Note that we sometimes use "backed" to refer to candidates that are "backable", but not yet backed on chain.
- Backed should technically mean that the parablock candidate and its backing statements have been added to a relay chain block.
- Backable is when the necessary backing statements have been acquired but those statements and the parablock candidate haven't been backed in a relay chain block yet.
- Fragment tree: A parachain fragment not referenced by the relay-chain. It is a tree of prospective parachain blocks.
- Manifest: A message about a known backed candidate, along with a description of the statements backing it. There
are two kinds of manifest,
Acknowledgement
andAnnouncement
. See "Manifests" section. - Peer: Another validator that a validator is connected to.
- Request/response: A protocol used to lazily request and receive heavy candidate data when needed.
- Reputation: Tracks reputation of peers. Applies annoyance cost and good behavior benefits.
- Statement: Signed statements that can be made about parachain candidates.
- Seconded: Proposal of a parachain candidate. Implicit validity vote.
- Valid: States that a parachain candidate is valid.
- Target: Target validator to send a statement to.
- View: Current knowledge of the chain state.
- Explicit view / immediate view
- The view a peer has of the relay chain heads and highest finalized block.
- Implicit view
- Derived from the immediate view. Composed of active leaves and minimum relay-parents allowed for candidates of various parachains at those leaves.
- Explicit view / immediate view