Availability Recovery
This subsystem is the inverse of the Availability Distribution subsystem: validators will serve the availability chunks kept in the availability store to nodes who connect to them. And the subsystem will also implement the other side: the logic for nodes to connect to validators, request availability pieces, and reconstruct the AvailableData.
This version of the availability recovery subsystem is based off of direct connections to validators. In order to recover any given AvailableData, we must recover at least f + 1 pieces from validators of the session. Thus, we will connect to and query randomly chosen validators until we have received f + 1 pieces.
Protocol
PeerSet: Validation
Input:
NetworkBridgeUpdate(update)AvailabilityRecoveryMessage::RecoverAvailableData(candidate, session, backing_group, response)
Output:
NetworkBridge::SendValidationMessageNetworkBridge::ReportPeerAvailabilityStore::QueryChunk
Functionality
We hold a state which tracks the currently ongoing recovery tasks, as well as which request IDs correspond to which task. A recovery task is a structure encapsulating all recovery tasks with the network necessary to recover the available data in respect to one candidate.
#![allow(unused)] fn main() { struct State { /// Each recovery is implemented as an independent async task, and the handles only supply information about the result. ongoing_recoveries: FuturesUnordered<RecoveryHandle>, /// A recent block hash for which state should be available. live_block_hash: Hash, // An LRU cache of recently recovered data. availability_lru: LruCache<CandidateHash, Result<AvailableData, RecoveryError>>, } /// This is a future, which concludes either when a response is received from the recovery tasks, /// or all the `awaiting` channels have closed. struct RecoveryHandle { candidate_hash: CandidateHash, interaction_response: RemoteHandle<Concluded>, awaiting: Vec<ResponseChannel<Result<AvailableData, RecoveryError>>>, } struct Unavailable; struct Concluded(CandidateHash, Result<AvailableData, RecoveryError>); struct RecoveryTaskParams { validator_authority_keys: Vec<AuthorityId>, validators: Vec<ValidatorId>, // The number of pieces needed. threshold: usize, candidate_hash: Hash, erasure_root: Hash, } enum RecoveryTask { RequestFromBackers { // a random shuffling of the validators from the backing group which indicates the order // in which we connect to them and request the chunk. shuffled_backers: Vec<ValidatorIndex>, } RequestChunksFromValidators { // a random shuffling of the validators which indicates the order in which we connect to the validators and // request the chunk from them. shuffling: Vec<ValidatorIndex>, received_chunks: Map<ValidatorIndex, ErasureChunk>, requesting_chunks: FuturesUnordered<Receiver<ErasureChunkRequestResponse>>, } } struct RecoveryTask { to_subsystems: SubsystemSender, params: RecoveryTaskParams, source: Source, } }
Signal Handling
On ActiveLeavesUpdate, if activated is non-empty, set state.live_block_hash to the first block in Activated.
Ignore BlockFinalized signals.
On Conclude, shut down the subsystem.
AvailabilityRecoveryMessage::RecoverAvailableData(receipt, session, Option<backing_group_index>, response)
- Check the 
availability_lrufor the candidate and return the data if so. - Check if there is already an recovery handle for the request. If so, add the response handle to it.
 - Otherwise, load the session info for the given session under the state of 
live_block_hash, and initiate a recovery task withlaunch_recovery_task. Add a recovery handle to the state and add the response channel to it. - If the session info is not available, return 
RecoveryError::Unavailableon the response channel. 
Recovery logic
launch_recovery_task(session_index, session_info, candidate_receipt, candidate_hash, Option<backing_group_index>)
- Compute the threshold from the session info. It should be 
f + 1, wheren = 3f + k, wherek in {1, 2, 3}, andnis the number of validators. - Set the various fields of 
RecoveryParamsbased on the validator lists insession_infoand information about the candidate. - If the 
backing_group_indexisSome, start in theRequestFromBackersphase with a shuffling of the backing group validator indices and aNonerequesting value. - Otherwise, start in the 
RequestChunksFromValidatorssource withreceived_chunks,requesting_chunks, andnext_shufflingall empty. - Set the 
to_subsystemssender to be equal to a clone of theSubsystemContext's sender. - Initialize 
received_chunksto an empty set, as well asrequesting_chunks. 
Launch the source as a background task running run(recovery_task).
run(recovery_task) -> Result<AvailableData, RecoeryError>
#![allow(unused)] fn main() { // How many parallel requests to have going at once. const N_PARALLEL: usize = 50; }
- 
Request
AvailabilityStoreMessage::QueryAvailableData. If it exists, return that. - 
If the task contains
RequestFromBackers- Loop:
- If the 
requesting_povisSome, poll for updates on it. If it concludes, setrequesting_povtoNone. - If the 
requesting_povisNone, take the next backer off theshuffled_backers.- If the backer is 
Some, issue aNetworkBridgeMessage::Requestswith a network request for theAvailableDataand wait for the response. - If it concludes with a 
Noneresult, return to beginning. - If it concludes with available data, attempt a re-encoding.
- If it has the correct erasure-root, break and issue a 
Ok(available_data). - If it has an incorrect erasure-root, return to beginning.
 
 - If it has the correct erasure-root, break and issue a 
 - Send the result to each member of 
awaiting. - If the backer is 
None, set the source toRequestChunksFromValidatorswith a random shuffling of validators and emptyreceived_chunks, andrequesting_chunksand break the loop. 
 - If the backer is 
 
 - If the 
 
 - Loop:
 - 
If the task contains
RequestChunksFromValidators:- Request 
AvailabilityStoreMessage::QueryAllChunks. For each chunk that exists, add it toreceived_chunksand remote the validator fromshuffling. - Loop:
- If 
received_chunks + requesting_chunks + shufflinglengths are less than the threshold, break and returnErr(Unavailable). - Poll for new updates from 
requesting_chunks. Check merkle proofs of any received chunks. If the request simply fails due to network issues, insert into the front ofshufflingto be retried. - If 
received_chunkshas more thanthresholdentries, attempt to recover the data.- If that fails, return 
Err(RecoveryError::Invalid) - If correct:
- If re-encoding produces an incorrect erasure-root, break and issue a 
Err(RecoveryError::Invalid). - break and issue 
Ok(available_data) 
 - If re-encoding produces an incorrect erasure-root, break and issue a 
 
 - If that fails, return 
 - Send the result to each member of 
awaiting. - While there are fewer than 
N_PARALLELentries inrequesting_chunks,- Pop the next item from 
shuffling. If it's empty andrequesting_chunksis empty, returnErr(RecoveryError::Unavailable). - Issue a 
NetworkBridgeMessage::Requestsand wait for the response inrequesting_chunks. 
 - Pop the next item from 
 
 - If 
 
 - Request