
Introduction

This book contains the Polkadot Fellowship Requests for Comments (RFCs) detailing proposed changes to the technical implementation of the Polkadot network.

GitHub: polkadot-fellows/RFCs

RFC-0000: Pre-ELVES soft consensus

Start Date: Date of initial proposal
Description: Provide and exploit a soft consensus before launching approval checks
Authors: Jeff Burdges, Alistair Stewart

Summary

Availability (bitfield) votes gain a preferred_fork flag which expresses the validator's opinion upon relay chain equivocations and babe forks, while still sharing availability votes for all relay chain blocks. We make relay chain block production require a supermajority with preferred_fork set, so forks cannot advance if they split the honest validators, which creates an early soft consensus. We similarly defend ELVES from relay chain equivocation attacks and prevent redundant approvals across babe forks.

Motivation

We've always known relay chain equivocations break the ELVES threat model. We originally envisioned ELVES having fallback pathways, but doing fallbacks requires dangerous subtle debugging. We support more assignment schemes in ELVES this way too, including one novel post-quantum one, and very low CPU usage schemes.

We expect this early soft consensus creates back pressure that improves performance under babe forks.

Alistair: TODO?

Stakeholders

We modify the availability votes and restrict relay chain blocks, fork choice, and ELVES start conditions, so this mostly affects the parachains protocol. See the alternatives notes on the flag under sassafras chains like JAM.

Explanation

Availability voting

At present, availability votes have a bitfield representing the cores, a relay_parent, and a signature. We process these on-chain in several steps: We first validate the signatures, zero any bits for cores included/enacted between the relay_parent and our predecessor, sum the set bits for each core, and finally include/enact the core if this exceeds 2/3rds of the validators.

Availability votes gain a preferred_fork flag, which honest validators set for exactly one relay_parent on their availability votes in a block production slot. We say a validator prefers a fork given by chain head h if it provides an availability vote with relay_parent = h and preferred_fork set.

Validators receive a minor equivocation slash if they set preferred_fork for two different relay_parents in the same slot. In sassafras, this means preferred fork equivocations can only occur for relay chain equivocations, but under babe preferred fork equivocations could occur between primary and secondary blocks, or between different primary blocks.

All validators still provide availability votes for all forks, because those non-preferred votes could still help enact candidates faster, but those non-preferred votes have preferred_fork zeroed.
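The voting rules above can be sketched as a data structure plus an equivocation check. This is a minimal sketch, not the proposed wire format: the `AvailabilityVote` layout, the explicit `slot` field, and the `Hash` and `ValidatorIndex` aliases are assumptions made for illustration, and signatures are omitted.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical 32-byte relay chain block hash.
pub type Hash = [u8; 32];
/// Hypothetical validator index within the session.
pub type ValidatorIndex = u32;

/// Sketch of an availability vote carrying the new flag (signature omitted).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct AvailabilityVote {
    pub validator: ValidatorIndex,
    /// Block production slot in which the vote was issued (assumed field).
    pub slot: u64,
    pub relay_parent: Hash,
    /// One bit per core.
    pub bitfield: Vec<bool>,
    /// Honest validators set this for exactly one relay_parent per slot.
    pub preferred_fork: bool,
}

/// Returns validators who set `preferred_fork` for two different
/// relay parents within the same slot. Non-preferred votes are ignored.
pub fn preferred_fork_equivocators(votes: &[AvailabilityVote]) -> Vec<ValidatorIndex> {
    let mut seen: HashMap<(ValidatorIndex, u64), Hash> = HashMap::new();
    let mut offenders = Vec::new();
    for v in votes.iter().filter(|v| v.preferred_fork) {
        match seen.entry((v.validator, v.slot)) {
            Entry::Vacant(e) => {
                e.insert(v.relay_parent);
            }
            Entry::Occupied(e) => {
                if *e.get() != v.relay_parent && !offenders.contains(&v.validator) {
                    offenders.push(v.validator);
                }
            }
        }
    }
    offenders
}
```

A validator who votes on two forks but sets the flag on only one of them is not reported, which matches the rule that non-preferred votes remain welcome.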

Around this, validators could optionally provide an early availability vote that commits to their preferred fork, and then later provide a second availability vote stating the same preferred fork but a fuller bitfield, provided doing so somehow helps relay chain block producers.

Fork choice

We require relay chain block producers build upon forks preferred by 2 f + 1 validators. In other words, a relay chain block with parent p must contain availability bitfield votes from 2 f + 1 validators with relay_parent = p and preferred_fork set. It follows our preferred fork votes override other fork choice priorities.

A relay chain block producer could lack this 2 f + 1 threshold for a prospective parent block p, in which case they must build upon the parent of p instead. We know availability votes simply being slow would cause this sometimes, in which case adding slightly more delay could save the relay chain slot. Alternatively though, two distinct relay chain blocks in the same slot could each wind up preferred by f+1 validators, in which case we must abandon the slot entirely.
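The threshold rule can be sketched as follows. This is an illustrative sketch only: the function names and the `ParentChoice` enum are hypothetical, and it assumes f = (n - 1) / 3 rounded down for n validators. The same 2 f + 1 test is what gates the start of approvals for a block.

```rust
/// Supermajority threshold 2f + 1, assuming f = floor((n - 1) / 3).
pub fn supermajority(n: usize) -> usize {
    2 * (n.saturating_sub(1) / 3) + 1
}

/// What a block producer may do with a prospective parent p (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum ParentChoice {
    /// Enough validators prefer p, so we may build upon it.
    BuildOnParent,
    /// Too few validators prefer p, so we must build upon the parent of p.
    BuildOnGrandparent,
}

/// Decide from the number of availability votes with relay_parent = p
/// and preferred_fork set.
pub fn choose_parent(preferred_votes_for_p: usize, n: usize) -> ParentChoice {
    if preferred_votes_for_p >= supermajority(n) {
        ParentChoice::BuildOnParent
    } else {
        ParentChoice::BuildOnGrandparent
    }
}
```

With 1024 validators the threshold is 683, so two blocks in the same slot that each gather 342 preference votes both fail the test and the slot is abandoned.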

Elves

We only launch the approvals process aka (machine) elves for a relay chain block p once 2 f + 1 validators prefer that block, aka 2 f + 1 validators provide availability votes with relay_parent = p and preferred_fork set. We could optionally delay this further until we have some valid descendant of p.

Fast pruning

In fact, this new fork choice logic creates more short relay chain forks than exist currently: If the validators split their votes, then we create a new fork in a later slot. We no longer need to process every fork now though.

Instead, availability votes from honest validators must express the correct preferred fork, which requires that validators carefully time when they judge and announce their preference flags. In babe, we need primary slots to be preferred over secondary slots, so the validators need logic that delays sending availability votes for a secondary slot, giving the primary slot enough time. We also prefer the primary slot with the smallest VRF, so we need some delay even once we receive a primary.

We suggest roughly this approach:

First, download only relay chain block headers, from which we determine our tentative preferred fork.

Second, we download and import only our currently tentatively preferred fork. We download our availability chunks as soon as we import a currently tentatively preferred relay chain block. We've no particular target for availability chunks other than simply some delay timer. In babe, we add some extra delay here for secondary slots, like perhaps 2 seconds minus the actual execution time, so that a fast secondary slot cannot beat a primary slot.

We sometimes obtain an even more preferable header during import, chunk distribution, and delays for our first tentatively preferred fork. Also, the first could simply turn out invalid. In either case, we loop to repeat this second step on our new tentative preferred fork. We repeat this process until an import succeeds and its timers run out, without receiving any more preferable header. Actual equivocations cannot be preferable over one another, so this loop terminates reasonably quickly.

Next, we broadcast our availability vote with its relay_parent set to our tentatively preferred fork, and with its preferred_fork set.

Finally, if 2 f + 1 other validators have a different preference from us, then we download and import their preferred relay chain block, fetch chunks for it, and provide availability votes with preferred_fork zero. It's possible this occurs earlier than our preference finishes, in which case we probably still send out our preference, if only for forensic evidence.
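The preference order that drives this loop under babe can be sketched as a strict comparison on headers of the same slot. This is a minimal sketch under stated assumptions: the `SlotClaim` type is hypothetical, and the VRF output is reduced to a `u128` purely for illustration. Import, chunk fetching, and timers are left out.

```rust
/// Hypothetical summary of how a header claims its slot.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SlotClaim {
    /// Primary claim with its VRF output (reduced to u128 as an assumption).
    Primary { vrf_output: u128 },
    /// Secondary claim, always less preferable than any primary.
    Secondary,
}

/// True if `candidate` is strictly more preferable than `current`.
/// Equal claims, such as equivocations by one author, never beat each other.
pub fn more_preferable(candidate: &SlotClaim, current: &SlotClaim) -> bool {
    match (candidate, current) {
        (SlotClaim::Primary { vrf_output: a }, SlotClaim::Primary { vrf_output: b }) => a < b,
        (SlotClaim::Primary { .. }, SlotClaim::Secondary) => true,
        _ => false,
    }
}

/// Index of the tentatively preferred header among those seen so far,
/// keeping the earliest header on ties.
pub fn tentative_preference(headers: &[SlotClaim]) -> Option<usize> {
    let mut best: Option<usize> = None;
    for (i, h) in headers.iter().enumerate() {
        best = match best {
            Some(b) if !more_preferable(h, &headers[b]) => Some(b),
            _ => Some(i),
        };
    }
    best
}
```

Because the comparison is strict, a validator only restarts the second step when a genuinely better header arrives, which is why the loop settles quickly.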

Concerns: Drawbacks, Testing, Security, and Privacy

Adds subtle timing constraints, which could entrench existing performance obstacles. We might explore variations that ignore wall clock time.

We've always known relay chain equivocations break the ELVES threat model. We originally envisioned ELVES having fallback pathways, but these were complex and demanded unused code paths, which cannot realistically be debugged. Although complex, the early soft consensus scheme feels less complex overall. We know timing is painful to optimise in a distributed system, but at least doing so uses everyday code paths.

Performance, Ergonomics, and Compatibility

We expect early soft consensus introduces back pressure that radically alters performance. We no longer run approvals checks upon all forks. As primary slots occur once every other slot in expectation, one might expect a 25% reduction in CPU load, but this depends upon diverse factors.

We apply back pressure by dropping some whole relay chain blocks though, so this shall increase the expected parachain blocktime somewhat, but how much depends upon future optimisation work.

Compatibility

Major upgrade

Prior Art and References

...

Unresolved Questions

We halt the chain when less than 2/3 of validators are online. We consider this reasonable since governance now runs on a parachain, ELVES would not be secure, and nothing can be finalized anyway. We could perhaps add some "recovery mode" where the relay chain embeds entire system parachain blocks, but doing so might not warrant the effort required.

Sassafras

Arguably, a sassafras RC like JAM could avoid the preferred_fork flag, by only releasing availability votes for at most one sassafras equivocation. We wanted availability for babe forks, but sassafras has only equivocations, so those blocks can simply be dropped.

In principle, a sassafras equivocation could still enter the valid chain, assuming 2/3rds of validators provide availability votes for the same equivocation. If JAM lacks the preferred_fork flag then enactment proceeds slower in this case, but this should almost never occur.

Threshold randomness

We think threshold randomness could reduce the tranche zero approval checker assignments by roughly 40%, meaning a fixed 15 vs the expected 25 in the elves paper (30 in production now).

We do know threshold VRF based schemes that address relay chain equivocations directly, by using as input the relay chain block hash. We have many more options with early soft consensus though. TODO In particular, we only know two post-quantum approaches to elves, and the bandwidth efficient one needs early soft consensus.

Mid-strength consensus

In this RFC, we only require that each relay chain block contain preference votes for its parent from 2/3rds of validators. We could enforce the opposite direction too: Around y>2 seconds after a validator V has seen preference votes for a chain head X from 2/3rds of validators, V begins rejecting any relay chain block that does not build upon X. This is tricky because the y>2 second delay must be long enough so that most honest nodes learn both X and its preference votes. In this, we might treat preferred_fork votes as evidence for finality of the parent of the vote's relay_parent. This strengthens MEV defenses that assume some honest nodes.

Avoid wall clock time

We know parachains could base their slots upon relay chain slots, instead of wall clock time (RFC ToDo). After this happens, we could avoid or minimize wall clock timing in the relay chain too, so that relay chain slots could have a floating duration based upon workload.

Partial relay chain blocks

Above, we only discuss abandoning relay chain blocks which fail early soft consensus. We could alternatively treat them as partial blocks and build extension partial blocks that complete them, with elves probably using randomness from the final partial block.

RFC-0000: Validator Rewards

Start Date: Date of initial proposal
Description: Rewards protocol for Polkadot validators
Authors: Jeff Burdges, ...

Summary

An off-chain approximation protocol should assign rewards based upon the approvals and availability work done by validators.

All validators track which approval votes they actually use, reporting the aggregate, after which an on-chain median computation gives a good approximation under byzantine assumptions. Approval checkers report aggregate information about which availability chunks they use too, but in availability we need a tit-for-tat game to enforce honesty, because approval committees could often bias results thanks to their small size.

Motivation

We want all or most polkadot subsystems to be profitable for validators, because otherwise operators might profit from running modified code. In particular, almost all rewards in Kusama/Polkadot should come from work done securing parachains, primarily approval checking, but also backing, availability, and support of XCMP.

Among these tasks, our highest priorities must be approval checks, which ensure soundness, and sending availability chunks to approval checkers. We prove backers must be paid strictly less than approval checkers.

At present though, validators' rewards have relatively little relationship to validators' operating costs, in terms of bandwidth and CPU time. Worse, polkadot's scaling makes us particularly vulnerable to "no-shows" caused by validators skipping their approval checks.

We're particularly concerned about the impact of hardware specs upon the number of parachain cores. We've requested relatively low spec machines so far, only four physical CPU cores, although some run even lower specs like only two physical CPU cores. Alone, rewards cannot fix our low spec validator problem, but rewards and outreach together should have far more impact than either alone.

In future, we'll further increase validator spec requirements, which directly improves polkadot's throughput, and which repeats this dynamic of purging underspecced nodes, except outreach becomes more important because de facto too many slow validators can "out vote" the faster ones.

Stakeholders

We alter the validators' rewards protocol, but with negligible impact upon rewards for honest validators who comply with hardware and bandwidth recommendations.

We shall still reward participation in relay chain consensus of course, which de facto means block production but not finality, but these current reward levels shall wind up greatly reduced. Any validators who manipulate block rewards now could lose rewards here, simply because of rewards being shifted from block production to availability, but this sounds desirable.

We've discussed roughly this rewards protocol in https://hackmd.io/@rgbPIkIdTwSICPuAq67Jbw/S1fHcvXSF and https://github.com/paritytech/polkadot-sdk/issues/1811 as well as related topics like https://github.com/paritytech/polkadot-sdk/issues/5122

Logic

Categories

We alter the current rewards scheme by reducing to roughly these proportions of total rewards:

  • 15-20% - Relay chain block production and uncle logic
  • 5% - Anything else related to relay chain finality, primarily beefy proving, but maybe other tasks exist.
  • Any existing rewards for on-chain validity statements would only cover backers, so those rewards must be removed.

We add roughly these proportions of total rewards covering parachain work:

  • 70-75% - approval and backing validity checks, with the backing rewards being required to be less than approval rewards.
  • 5-10% - Availability redistribution from availability providers to approval checkers. We do not reward for availability distribution from backers to availability providers.

Observation

We track this data for each candidate during the approvals process:

/// Our subjective record of our availability transfers for this candidate.
pub struct CandidateRewards {
    /// Anyone who backed this parablock
    backers: [AuthorityId; NumBackers],
    /// Anyone who we think no-showed, even only briefly.
    noshows: HashSet<AuthorityId>,
    /// Anyone who sent us chunks for this candidate
    downloaded_from: HashMap<AuthorityId,u16>,
    /// Anyone to whom we sent chunks for this candidate
    uploaded_to: HashMap<AuthorityId,u16>,
}

We no longer require this data during disputes.

After we approve a relay chain block, then we collect all its CandidateRewards into an ApprovalsTally, with one ApprovalTallyLine for each validator. In this, we compute approval_usages from the final run of the approvals loop, plus 0.8 for each backer.

We say a validator 𝑢 uses an approval vote by a validator 𝑣 on a candidate 𝑐 if the approval assignments loop by 𝑢 counted the vote by 𝑣 towards approving the candidate 𝑐.

/// Our subjective record of what we used from, and provided to, all other validators on the finalized chain
pub struct ApprovalsTally(Vec<ApprovalTallyLine>);

/// Our subjective record of what we used from, and provided to, one other validator on the finalized chain
pub struct ApprovalTallyLine {
    /// Approvals by this validator which our approvals gadget used in marking candidates approved.
    approval_usages: u32,
    /// How many times we think this validator no-showed, even only briefly.
    noshows: u32,
    /// Availability chunks we downloaded from this validator for our approval checks we used.
    used_downloads: u32,
    /// Availability chunks we uploaded to this validator whose approval checks we used.
    used_uploads: u32,
}

At finality we sum these ApprovalsTally into a single ApprovalsTally for the whole epoch so far. We can optionally sum them earlier at chain heads, but this requires mutability.
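The epoch-wide sum can be sketched as a line-by-line accumulation. This is only a sketch: the `new` and `accumulate` methods are hypothetical names, the fields are made public for illustration, and saturating addition is an assumption chosen so that an overflow cannot wrap a count around.

```rust
/// Per-validator line, mirroring the fields described in the text.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct ApprovalTallyLine {
    pub approval_usages: u32,
    pub noshows: u32,
    pub used_downloads: u32,
    pub used_uploads: u32,
}

/// One line per validator, indexed by validator position.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct ApprovalsTally(pub Vec<ApprovalTallyLine>);

impl ApprovalsTally {
    /// An all-zero tally for `num_validators` validators.
    pub fn new(num_validators: usize) -> Self {
        ApprovalsTally(vec![ApprovalTallyLine::default(); num_validators])
    }

    /// Adds the tally of one finalized relay chain block into the epoch tally.
    pub fn accumulate(&mut self, block: &ApprovalsTally) {
        for (total, line) in self.0.iter_mut().zip(&block.0) {
            total.approval_usages = total.approval_usages.saturating_add(line.approval_usages);
            total.noshows = total.noshows.saturating_add(line.noshows);
            total.used_downloads = total.used_downloads.saturating_add(line.used_downloads);
            total.used_uploads = total.used_uploads.saturating_add(line.used_uploads);
        }
    }
}
```

Accumulating only at finality means the epoch tally is written once per finalized block, whereas summing at chain heads would need a tally per fork.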

Messages

After the epoch is finalized, we share the first three fields of each ApprovalTallyLine in our ApprovalsTally.

/// Our subjective record of what we used from some other validator on the finalized chain
pub struct ApprovalTallyMessageLine {
    /// Approvals by this validator which our approvals gadget used in marking candidates approved.
    approval_usages: u32,
    /// How many times we think this validator no-showed, even only briefly.
    noshows: u32,
    /// Availability chunks we downloaded from this validator for our approval checks we used.
    used_downloads: u32,
}

/// Our subjective record of what we used from all other validators on the finalized chain
pub struct ApprovalsTallyMessage(Vec<ApprovalTallyMessageLine>);

Actual ApprovalsTallyMessages sent over the wire must be signed of course, likely by the grandpa ed25519 key.

Rewards computation

We compute the approvals rewards for each validator by taking the median of the approval_usages fields for each validator across all validators' ApprovalsTallyMessages. We compute some noshows_percentiles for each validator similarly, but using a 2/3 percentile instead of the median.

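// Assumes exactly one ApprovalsTallyMessage per validator, so every sorted list has num_validators entries.
let mut approval_usages_medians = Vec::new();
let mut noshows_percentiles = Vec::new();
for i in 0..num_validators {
    let mut v: Vec<u32> = approvals_tally_messages.iter().map(|atm| atm.0[i].approval_usages).collect();
    v.sort();
    approval_usages_medians.push(v[num_validators/2]);
    let mut v: Vec<u32> = approvals_tally_messages.iter().map(|atm| atm.0[i].noshows).collect();
    v.sort();
    // In ascending order, roughly 2/3 of the reports are at least v[num_validators/3].
    noshows_percentiles.push(v[num_validators/3]);
}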

Assuming more than 50% honesty, these medians tell us how many approval votes from each validator were used.

We re-weight the used_downloads claimed by the ith validator by their median times their expected f+1 chunks, divided by how many chunk downloads they claimed, and sum the results for each availability provider.

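#[cfg(offchain)]
let mut my_missing_uploads: Vec<i64> = my_approvals_tally.0.iter().map(|l| i64::from(l.used_uploads)).collect();
let mut reweighted_total_used_downloads = vec![0u64; num_validators];
// Here f is a u64, and v indexes the approval checker who sent the message atm.
for (v, atm) in approvals_tally_messages.iter().enumerate() {
    // Total chunk downloads claimed by approval checker v.
    let d: u64 = atm.0.iter().map(|l| u64::from(l.used_downloads)).sum();
    if d == 0 { continue; }
    for i in 0..num_validators {
        // Chunks v claims from provider i, re-weighted so v's claims sum to (f+1) times v's median.
        let atm_from_i = u64::from(atm.0[i].used_downloads) * u64::from(approval_usages_medians[v]) * (f + 1) / d;
        #[cfg(offchain)]
        if i == me { my_missing_uploads[v] -= atm_from_i as i64 };
        reweighted_total_used_downloads[i] += atm_from_i;
    }
}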

We distribute rewards on-chain using approval_usages_medians and reweighted_total_used_downloads. Approval checkers could later change from whom they download chunks using my_missing_uploads.

We deduct a small amount of rewards using noshows_percentiles too, likely 1% of the rewards for an approval, but excuse some small number of noshows, as in noshows_percentiles[i].saturating_sub(MAX_NO_PENALTY_NOSHOWS).
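The no-show deduction can be sketched in integer arithmetic. This is a hedged sketch: the value of `MAX_NO_PENALTY_NOSHOWS`, the function name, and the choice to count in hundredths of one approval reward are all assumptions for illustration, with the 1% penalty taken from the text as a likely figure rather than a fixed one.

```rust
/// Hypothetical number of excused no-shows per epoch.
pub const MAX_NO_PENALTY_NOSHOWS: u32 = 10;

/// Approval reward points for one validator, in hundredths of one approval.
/// Each penalised no-show costs 1% of one approval, which is one unit here.
pub fn approval_points_centi(approval_usages_median: u32, noshows_percentile: u32) -> u64 {
    let earned = u64::from(approval_usages_median) * 100;
    let penalised = u64::from(noshows_percentile.saturating_sub(MAX_NO_PENALTY_NOSHOWS));
    earned.saturating_sub(penalised)
}
```

Counting in hundredths avoids fractional rewards on-chain, and the two saturating subtractions keep both the excused allowance and the final reward from going below zero.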

Strategies

In theory, validators could adopt whatever strategy they like to penalize validators who stiff them on availability redistribution rewards, except they should not stiff back, only choose other availability providers. We discuss one good strategy below, but initially this could go unimplemented.

Consensus

We avoid placing rewards logic on the relay chain now, so we must collect the signed ApprovalsTallyMessages and do the above computations either somewhere sufficiently trusted, like a parachain, or via some distributed protocol with its own assumptions.

In-core

A dedicated rewards parachain could easily collect the ApprovalsTallyMessages and do the above computations. In this, we logically have two phases, first we build the on-chain Merkle tree M of ApprovalsTallyMessages, and second we process those into the rewards data.

Any in-core approach risks enough malicious collators biasing the rewards by censoring the ApprovalsTallyMessages for some validators during the first phase. After this first phase completes, our second phase proceeds deterministically.

As an option, each validator could handle this second phase itself by creating a single heavy transaction with n state accesses in this Merkle tree M, and this transaction sends the era points.

A remark for future developments:

JAM-like non/sub-parachain accumulation could mitigate the risk of the rewards parachain being captured.

JAM services all have either parachain accumulation or else non/sub-parachain accumulation.

  • A parachain should mean any service that tracks mutable state roots onto the relay chain, with its accumulation updating the state roots. Inherently, these state roots create some capture risk for the parachain, although how much depends upon numerous other factors.
  • A non/sub-parachain means the service does not maintain state like a blockchain does, but could use some tiny state within the relay chain. Although seemingly less powerful than parachains, these non/sub-parachain accumulations could reduce the capture risk so that any validator could create a block for the service, without knowing any existing state.

In our case, each ApprovalsTallyMessage would become a block for the first phase rewards service, so then the accumulation tracks an MMR of the rewards service block hashes, which becomes M from Option 1. At 1024 validators this requires 9 * 32 = 288 bytes for the MMR and 1024/8 = 128 bytes for a bitfield, so 416 bytes of relay chain state in total. Any validator could then add their ApprovalsTallyMessage in any order, but only one per relay chain block, so the submission timeframe should be long enough to prevent censorship.

Arguably after JAM, we should migrate critical functions to non/sub-parachain aka JAM services without mutable state, so this covers validator elections, DKGs, and rewards. Yet, non/sub-parachains cannot eliminate all censorship risks, so the near term benefits seem questionable.

Off-core

All validators could collect ApprovalsTallyMessages and independently compute rewards off-core. At that point, all validators have opinions about all other validators' rewards, but even among honest validators these opinions could differ if some lack some ApprovalsTallyMessages.

We'd have the same in-core computation problem if we perform statistics like medians upon these opinions. We could however take an optimistic approach where each validator computes medians like above, but then shares their hash of the final rewards list. If 2/3rds voted for the same hash, then we distribute rewards as above. If not, then we distribute no rewards until governance selects the correct hash.
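The optimistic path can be sketched as a count over the hashes validators publish. This is a minimal sketch under assumptions: the function name is hypothetical, `Hash` stands in for whatever hash type is used, signature checks are omitted, and "2/3rds" is read as at least 2n/3 votes out of n validators.

```rust
use std::collections::HashMap;

/// Hypothetical 32-byte hash of a validator's final rewards list.
pub type Hash = [u8; 32];

/// Returns the rewards list hash backed by at least 2/3 of the `n`
/// validators, or `None` if no hash reaches that level, in which case
/// no rewards are distributed until governance selects the correct hash.
pub fn agreed_rewards_hash(votes: &[Hash], n: usize) -> Option<Hash> {
    let mut counts: HashMap<Hash, usize> = HashMap::new();
    for h in votes {
        *counts.entry(*h).or_insert(0) += 1;
    }
    counts
        .into_iter()
        .find(|&(_, c)| n > 0 && 3 * c >= 2 * n)
        .map(|(h, _)| h)
}
```

Validators who never publish a hash still count towards `n`, so missing votes can only prevent agreement and never create it.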

We never validate in-core the signatures on ApprovalsTallyMessages or the computation, so this approach permits more direct cheating by a malicious 2/3rds majority, but if that occurs then we've broken our security assumptions anyway. It's likely these hashes do diverge during some network disruptions though, which increases our "drama" factor considerably, which may be unacceptable.

Explanation

Backing

Polkadot's efficiency creates subtle liveness concerns: Anytime one node cannot perform one of its approval checks then Polkadot loses in expectation 3.25 approval checks, or 0.10833 parablocks. This makes back pressure essential.
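The two figures agree if one parablock costs roughly 30 approval checks. That count of 30 is an assumption here, since this section does not state it, but under it the conversion is:

$$ \frac{3.25 \text{ approval checks}}{30 \text{ approval checks per parablock}} \approx 0.10833 \text{ parablocks} $$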

We cannot throttle approval checks securely either, so reactive off-chain back pressure only makes sense during or before the backing phase. In other words, if nodes feel overworked themselves, or perhaps believe others to be, then they should drop backing checks, never approval checks. It follows backing work must be rewarded less well and less reliably than approvals, as otherwise validators could benefit from behavior that harms the network.

We propose that one backing statement be rewarded at 80% of one approval statement, so backers earn only 80% of what approval checkers earn. We omit rewards for availability distribution, so backers spend more on bandwidth too. Approval checkers always fetch chunks first from backers though, so good backers earn roughly 7% there, meaning backing checks earn roughly 13% less than approval checks. We should lower this 80% if we ever increase availability redistribution rewards.
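As a worked check of these figures, taking one approval check as the unit and using the roughly 7% that good backers earn from availability redistribution:

$$ \begin{aligned} \text{backing reward} &\approx 0.80 + 0.07 = 0.87 \\ \text{shortfall versus approvals} &\approx 1 - 0.87 = 0.13 \end{aligned} $$

Raising the availability redistribution share would shrink this 13% margin, which is why the 80% figure should then be lowered.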

Although imperfect, we believe this simplifies implementation, and provides robustness against mistakes elsewhere, including mistakes by governance, but incurs minimal risk. In principle, backers might not distribute systematic chunks, but approval checkers fetch systematic chunks from backers first anyway, so likely this yields negligible gains.

As always we require that backers' rewards covers their operational costs plus some profit, but approval checks must be more profitable.

Approvals

In polkadot, all validators run an approval assignment loop for each candidate, in which the validator listens to other approval checkers' assignments and approval statements/votes, with which it marks checkers no-show or done, and marks candidates approved. Also, this loop determines and announces validators' own approval checker assignments.

Any validator should always conclude whatever approval checks it begins, but our approval assignment loop ignores some approval checks, either because they were announced too soon or because an earlier no-show delivered its approval vote before the final approval. We say a validator $u$ uses an approval vote by a validator $v$ on a candidate $c$ if the approval assignments loop by $u$ counted the vote by $v$ towards approving the candidate $c$. We should not reward votes announced too soon, so we unavoidably omit rewards for some honest no-show replacements too. We expect the 80% discount for backing covers these losses, so approval checks remain more profitable than backing.

We propose a simple approximate solution based upon computing medians across validators for used votes.

  1. In an epoch $e$, each validator $u$ counts the number $\alpha_{u,v}$ of votes they used from each validator $v$, including themselves. Any time a validator marks a candidate approved, they increment these counts appropriately.

  2. After epoch $e$'s last block gets finalized, all validators of epoch $e$ submit an approvals tally message ApprovalsTallyMessage that reveals their number $\alpha_{u,v}$ of useful approvals they saw from each validator $v$ on candidates that became available in epoch $e$. We do not send $\alpha_{u,u}$ for tit-for-tat reasons discussed below, not for bias concerns. We record these approvals tally messages on-chain.

  3. After some delay, we compute on-chain the median $\alpha_v := \textrm{median} \{ \alpha_{u,v} : u \}$ of used approval statements for each validator $v$.

As discussed in https://hackmd.io/@rgbPIkIdTwSICPuAq67Jbw/S1fHcvXSF we could compute these medians using the on-line algorithm if substrate had a nice priority queue.
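One common on-line median technique keeps two priority queues, one for each half of the values seen so far. The sketch below uses the standard library `BinaryHeap` purely for illustration. It is not substrate code, the `RunningMedian` type is hypothetical, and it may differ from the algorithm in the linked notes. It returns the upper median, which equals `v[n/2]` in an ascending sort of `n` reports.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Streaming median over u32 reports using two heaps.
/// Invariant: every value in `low` is <= every value in `high`, and
/// `high` holds either as many values as `low` or exactly one more.
#[derive(Default)]
pub struct RunningMedian {
    /// Max-heap holding the lower half.
    low: BinaryHeap<u32>,
    /// Min-heap holding the upper half.
    high: BinaryHeap<Reverse<u32>>,
}

impl RunningMedian {
    /// Inserts one report in O(log n).
    pub fn push(&mut self, x: u32) {
        let goes_low = matches!(self.high.peek(), Some(&Reverse(m)) if x < m);
        if goes_low {
            self.low.push(x);
        } else {
            self.high.push(Reverse(x));
        }
        // Rebalance so that the size invariant holds again.
        if self.high.len() > self.low.len() + 1 {
            if let Some(Reverse(v)) = self.high.pop() {
                self.low.push(v);
            }
        } else if self.low.len() > self.high.len() {
            if let Some(v) = self.low.pop() {
                self.high.push(Reverse(v));
            }
        }
    }

    /// Current upper median, or `None` before any report arrives.
    pub fn median(&self) -> Option<u32> {
        self.high.peek().map(|r| r.0)
    }
}
```

Each validator $v$ would need its own `RunningMedian`, fed with $\alpha_{u,v}$ as the approvals tally messages arrive, so no sorted list per validator is ever stored.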

We never achieve true consensus on approval checkers and their approval votes. Yet, our approval assignment loop gives a rough consensus, under our Byzantine assumption and some synchrony assumption. It then follows that misreporting by malicious validators should not appreciably alter the median $\alpha_v$ and hence rewards.

We never tally used approval assignments to candidate equivocations or other forks. Any validator should always conclude whatever approval checks it begins, even on other forks, but we expect relay chain equivocations should be vanishingly rare, and sassafras should make forks uncommon.

We account for noshows similarly, and deduct a much smaller amount of rewards, but require a 2/3 percentile level, not just a median.

Availability redistribution

As approval checkers could easily perform useless checks, we shall reward availability providers for the availability chunks they provide that resulted in useful approval checks. We enforce honesty using a tit-for-tat mechanism because chunk transfers are inherently subjective.

An approval checker reconstructs the full parachain block by downloading $f+1$ distinct chunks from other validators, where at most $f$ validators are byzantine, out of the $n \ge 3 f + 1$ total validators. In downloading chunks, validators prefer the $f+1$ systematic chunks over the non-systematic chunks, and prefer fetching from validators who already voted valid, like backing checkers. It follows some validators should receive credit for more than one chunk per candidate.

We expect a validator $v$ has actually performed more approval checks $\omega_v$ than the median $\alpha_v$ for which they actually received credit. In fact, approval checkers even ignore some of their own approval checks, meaning $\alpha_{v,v} \le \omega_v$ too.

Alongside the approvals count for epoch $e$, approval checker $v$ computes the counts $\beta_{u,v}$ of the number of chunks they downloaded from each availability provider $u$, excluding themselves, for which they perceive the approval check turned out useful, meaning their own approval counts in $\alpha_{v,v}$. Approval checkers publish $\beta_{u,v}$ alongside $\alpha_{u,v}$ in the approvals tally message ApprovalsTallyMessage. We originally proposed including the self availability usage $\beta_{v,v}$ here, but this should not matter, and excluding it simplifies the code.

Symmetrically, availability provider $u$ computes the counts $\gamma_{u,v}$ of the number of chunks they uploaded to each approval checker $v$, again including themselves, again for which they perceive the approval check turned out useful. Availability provider $u$ never reveals its $\gamma_{u,v}$ however.

At this point, $\alpha_v$, $\alpha_{v,v}$, and $\alpha_{u,v}$ all potentially differ. We established consensus upon $\alpha_v$ above however, with which we avoid approval checkers printing unearned availability provider rewards:

After receiving "all" pairs $(\alpha_{u,v},\beta_{u,v})$, validator $w$ re-weights the $\beta_{u,v}$ and their own $\gamma_{w,v}$.

$$ \begin{aligned} \beta'_{w,v} &= \frac{(f+1) \alpha_v}{\sum_u \beta_{u,v}} \beta_{w,v} \\ \gamma'_{w,v} &= \frac{(f+1) \alpha_w}{\sum_v \gamma_{w,v}} \gamma_{w,v} \end{aligned} $$

At this point, we compute $\beta'_w = \sum_v \beta'_{w,v}$ on-chain for each $w$ and reward $w$ proportionally.
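The on-chain half of this re-weighting, the totals $\sum_v (f+1) \alpha_v \beta_{w,v} / \sum_u \beta_{u,v}$ per provider $w$, can be sketched as follows. This is a sketch under assumptions: the function name and the `beta[v][w]` matrix layout are hypothetical, integer division rounds each term down, and checkers who claim no downloads at all are skipped to avoid dividing by zero.

```rust
/// Re-weighted download credit for every availability provider.
///
/// `beta[v][w]` is the number of chunks approval checker `v` claims to have
/// downloaded from provider `w`, `alpha[v]` is the median of used approvals
/// for `v`, and `f` is the byzantine bound. Each row of `beta` is assumed to
/// have `alpha.len()` entries.
pub fn reweighted_download_totals(beta: &[Vec<u64>], alpha: &[u64], f: u64) -> Vec<u64> {
    let mut totals = vec![0u64; alpha.len()];
    for (v, row) in beta.iter().enumerate() {
        // Total chunk downloads claimed by checker v.
        let claimed: u64 = row.iter().sum();
        if claimed == 0 {
            continue;
        }
        for (w, &b) in row.iter().enumerate() {
            // Scale v's claims so that they sum to (f + 1) * alpha[v].
            totals[w] += (f + 1) * alpha[v] * b / claimed;
        }
    }
    totals
}
```

Because each checker's claims are normalised to $(f+1) \alpha_v$, a checker who inflates their download counts only redistributes a fixed budget among providers and cannot print extra credit.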

Tit-for-tat

We employ a tit-for-tat strategy to punish validators who lie about from whom they obtain availability chunks. We only alter validators' future choices in from whom they obtain availability chunks, and never punish by lying ourselves, so nothing here breaks polkadot, but not having roughly this strategy enables cheating.

An availability provider $w$ defines $\delta'_{w,v} := \gamma'_{w,v} - \beta'_{w,v}$ to be the re-weighted number of chunks by which $v$ stiffed $w$. Now $w$ increments their cumulative stiffing perception $\eta_{w,v}$ from $v$ by the value $\delta'_{w,v}$, so $\eta_{w,v} \mathrel{+}= \delta'_{w,v}$.

In future, anytime $w$ seeks chunks in reconstruction $w$ skips $v$ proportional to $\eta_{w,v} / \sum_u \eta_{w,u}$, with each skip reducing $\eta_{w,v}$ by 1. We expect honest accidental availability stiffs have only small $\delta'_{w,v}$, so they clear out quickly, but intentional skips add up more quickly.
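The bookkeeping for one availability provider $w$ can be sketched as a small ledger. This is a sketch only: the `StiffingLedger` type and its methods are hypothetical, floating point is used for brevity, and treating non-positive $\eta_{w,v}$ as "never skip" is an assumption the text does not settle.

```rust
/// Provider w's private, cumulative perception of being stiffed by each v.
pub struct StiffingLedger {
    eta: Vec<f64>,
}

impl StiffingLedger {
    /// Starts with no grievances against any of the `n` validators.
    pub fn new(n: usize) -> Self {
        StiffingLedger { eta: vec![0.0; n] }
    }

    /// After an epoch, adds delta' = gamma' - beta' for every validator v.
    pub fn record_epoch(&mut self, gamma_prime: &[f64], beta_prime: &[f64]) {
        for ((eta, g), b) in self.eta.iter_mut().zip(gamma_prime).zip(beta_prime) {
            *eta += g - b;
        }
    }

    /// Relative weight with which w skips v, namely eta_v over the sum of
    /// all positive eta entries (assumption: negative entries count as zero).
    pub fn skip_weight(&self, v: usize) -> f64 {
        let total: f64 = self.eta.iter().filter(|e| **e > 0.0).sum();
        if total <= 0.0 || self.eta[v] <= 0.0 {
            0.0
        } else {
            self.eta[v] / total
        }
    }

    /// Each actual skip of v pays off one unit of grievance, floored at zero.
    pub fn record_skip(&mut self, v: usize) {
        self.eta[v] = (self.eta[v] - 1.0).max(0.0);
    }
}
```

Since every skip reduces the grievance by one, small accidental discrepancies are forgiven after a few skips while persistent stiffing keeps the weight high.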

We keep $\gamma_{w,v}$ and $\alpha_{u,u}$ secret so that approval checkers cannot really know others' stiffing perceptions, although $\alpha_{u,v}$ leaks some relevant information. We expect this secrecy keeps skips secret and thus prevents the tit-for-tat escalating beyond one round, which hopefully creates a desirable Nash equilibrium.

We favor systematic chunks to reduce reconstruction costs, so we face costs when skipping them. We could however fetch systematic chunks from availability providers as well as backers, or even other approval checkers, so this might not become problematic in practice.

Concerns: Drawbacks, Testing, Security, and Privacy

We do not pay backers individually for availability distribution per se. We could only do so by including this information into the availability bitfields, which complicates on-chain computation. Also, if one of the two backers does not distribute then the availability core should remain occupied longer, meaning the lazy backer loses some rewards too. It's likely future protocol improvements change this, so we should monitor for lazy backers outside the rewards system.

We discuss approvals being considered by the tit-for-tat in earlier drafts. An adversary who successfully manipulates the rewards median votes would've already violated polkadot's security assumptions though, which requires a hard fork and correcting the dot allocation. Incorrect reports of approval_usages remain interesting statistics though.

Adversarial validators could manipulate their availability votes though, even without being a supermajority. If they still download honestly, then this costs them more rewards than they earn. We do not prevent validators from preferentially obtaining their pieces from their friends though. We should analyze, or at least observe, the long-term consequences.

A priori, a whale nominator's validators could stiff validators but then rotate quickly enough that they never suffer being skipped back. We discuss several possible solutions, and their difficulties, under "Rob's nominator-wise skipping" in https://hackmd.io/@rgbPIkIdTwSICPuAq67Jbw/S1fHcvXSF but overall less seems like more here. Also frequent validator rotation could be penalized elsewhere.

Performance, Ergonomics, and Compatibility

We operate off-chain except for final rewards votes and median tallies. We expect lower overhead rewards protocols would lack information, thereby admitting easier cheating.

Initially, we designed the ELVES approval gadget to allow on-chain operation, in part for rewards computation, but doing so looks expensive. Also, on-chain rewards computation remains only an approximation, and could even be biased more easily than the off-chain protocol presented here.

We already teach validators about missed parachain blocks, but we'll teach approval checking more going forwards, because current efforts focus more upon backing.

JAM's block exports should not complicate availability rewards, but could impact some alternative schemes.

Prior Art and References

None

Unresolved Questions


Synthetic parachain flag

Any rewards protocol could simply be "out voted" by too many slow validators: An increase in the number of parachain cores increases the workload, but this creates no-shows if too few validators can handle this workload.

We could add a synthetic parachain flag, only settable by governance, which treats no-shows as positive approval votes for that parachain, but without adding rewards. We should never enable this for real parachains, only for synthetic ones like gluttons. We should not enable the synthetic parachain flag long-term even for gluttons, because validators could easily modify their code. Yet, synthetic approval checks might enable pushing the hardware upgrades more aggressively over the short-term.

RFC-0117: The Unbrick Collective

Start Date: 22 August 2024
Description: The Unbrick Collective aims to help teams rescuing a para once it stops producing blocks
Authors: Bryan Chen, Pablo Dorado

Summary

A follow-up to RFC-0014. This RFC proposes adding a new collective to the Polkadot Collectives Chain: The Unbrick Collective, as well as improvements in the mechanisms that will allow teams operating paras that had stopped producing blocks to be assisted, in order to restore the production of blocks of these paras.

Motivation

Since the initial launch of Polkadot parachains, there have been many incidents causing parachains to stop producing new blocks (therefore, being bricked) and many occurrences that required Polkadot governance to update the parachain head state/wasm. This can be due to many reasons, ranging from incorrectly registering the initial head state, inability to use the sudo key, bad runtime migrations, and bad weight configuration, to bugs in the development of the Polkadot SDK.

Currently, when the para is not unlocked in the paras registrar [1], the Root origin is required to perform such actions, involving the governance process to invoke this origin, which can be very resource-expensive for the teams. The long voting and enactment times could also result in significant damage to the parachain and its users.

Finally, other instances of governance that might enact a call using the Root origin (like the Polkadot Fellowship), due to the nature of their mission, are not fit to carry out these kinds of tasks.

In consequence, the idea of an Unbrick Collective that can provide assistance to para teams when they brick and further protection against future halts is reasonable enough.

Stakeholders

  • Parachain teams
  • Parachain users
  • OpenGov users
  • Polkadot Fellowship

Explanation

The Collective

The Unbrick Collective is defined as an unranked collective of members, not paid by the Polkadot Treasury. Its main goal is to serve as a point of contact and assistance for enacting the actions needed to unbrick a para. Such actions are:

  • Updating the Parachain Verification Function (a.k.a. a new WASM) of a para.
  • Updating the head state of a para.
  • A combination of the above.

In order to ensure these changes are safe enough for the network, actions enacted by the Unbrick Collective must be whitelisted via mechanisms similar to those followed by collectives like the Polkadot Fellowship. This prevents unintended, unsupervised changes to other paras.

Also, teams might opt in to delegate handling their para in the registry to the Collective. This allows similar actions to be performed using the paras registrar, giving a shorter path to unbrick a para.

Initially, the unbrick collective has powers similar to a parachain's own sudo, but permits more decentralized control. In the future, Polkadot shall provide functionality like SPREE or JAM that exceeds sudo permissions, so the unbrick collective cannot modify those state roots or code.

The Unbrick Process

flowchart TD
    A[Start] 

    A -- Bricked --> C[Request para unlock via Root]
    C -- Approved --> Y
    C -- Rejected --> A
    
    D[unbrick call proposal on WhitelistedUnbrickCaller]
    E[whitelist call proposal on the Unbrick governance]
    E -- call whitelisted --> F[unbrick call enacted]
    D -- unbrick called --> F
    F --> Y

    A -- Not bricked --> O[Opt-in to the Collective]
    O -- Bricked --> D
    O -- Bricked --> E

    Y[update PVF / head state] -- Unbricked --> Z[End]

Initially, a para team has two paths to handle a potential unbrick of their para in the case it stops producing blocks.

  1. Opt-in to the Unbrick Collective: This is done by delegating the handling of the para in the paras registrar to an origin related to the Collective. This doesn't require unlocking the para. This way, the collective is enabled to perform changes in the paras module, after the Unbrick Process proceeds.
  2. Request a Para Unlock: In case the para hasn't delegated its handling in the paras registrar, it will still be possible for the para team to submit a proposal to unlock the para, which can be assisted by the Collective. However, this involves submitting a proposal to the Root governance origin.

Belonging to the Collective

The collective will be initially created without members (no seeding). There will be additional governance proposals to set up the seed members.

The origins able to modify the members of the collective are:

  • The Fellows track in the Polkadot Fellowship.
  • Root track in the Relay.
  • More than two thirds of the existing Unbrick Collective.
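
The last condition is a strict supermajority. As a small illustration only, it can be checked with integer arithmetic, avoiding rounding; the function name and the `u32` member counts are assumptions and do not correspond to any pallet API.

```rust
/// Returns true when `ayes` is strictly more than two thirds of `members`.
/// Compares 3 * ayes > 2 * members in u64 to avoid overflow and rounding.
fn more_than_two_thirds(ayes: u32, members: u32) -> bool {
    3 * u64::from(ayes) > 2 * u64::from(members)
}
```

For example, 6 of 9 members is exactly two thirds and does not pass, while 7 of 9 does.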

The members are responsible for verifying the technical details of the unbrick requests (e.g. the hash of the new PVF being set). Therefore, they must have the technical capacity to perform such tasks.

Suggested requirements to become a member are the following:

  • Rank 3 or above in the Polkadot Fellowship.
  • Being a CTO or Technical Lead in a para team that has opted in to delegate management of the PVF/head state of the para to the Unbrick Collective.

Drawbacks

The ability to modify the Head State and/or the PVF of a para means a possibility to perform arbitrary modifications of it (e.g. take control of the native parachain token or any bridged assets in the para).

This could introduce a new attack vector, and therefore, such great power needs to be handled carefully.

Testing, Security, and Privacy

The implementation of this RFC will be tested on testnets (Rococo and Westend) first.

An audit will be required to ensure the implementation doesn't introduce unwanted side effects.

There are no privacy related concerns.

Performance, Ergonomics, and Compatibility

Performance

This RFC should not introduce any performance impact.

Ergonomics

This RFC should improve the experience for new and existing parachain teams, lowering the barrier to unbrick a stalled para.

Compatibility

This RFC is fully compatible with existing interfaces.

Prior Art and References

Unresolved Questions

  • What are the parameters for the WhitelistedUnbrickCaller track?
  • Any other methods that shall be updated to accept Unbrick origin?
  • Any other requirements to become a member?
  • We would like to keep this simple, so no funding support from the Polkadot treasury. But do we want to compensate the members somehow? i.e. Allow parachain teams to donate to the collective.
  • We hope SPREE/JAM would be carefully audited for misuse risks before being provided to parachain teams, but could the unbrick collective have elections that warrant trust beyond sudo powers?
  • An auditing framework/collective makes sense for parachain code upgrades, but could also strengthen the unbrick collective.
  • Do we want to have this collective offer additional technical support to help bricked parachains? i.e. help debug the code, create the rescue plan, create postmortem report, provide resources on how to avoid getting bricked

[1] The paras registrar refers to a pallet in the Relay, responsible for gathering the registration info of the paras, the locked/unlocked state, and the manager info.

RFC-0145: Remove the host-side runtime memory allocator

Start Date: 2025-05-16
Description: Update the runtime-host interface to no longer make use of a host-side allocator
Authors: Pierre Krieger, Someone Unknown

Summary

Update the runtime-host interface so that it no longer uses the host-side allocator.

Prior Art

The API of these new functions was heavily inspired by the API used by the C programming language.

This RFC is mainly based on RFC-4 by @tomaka, which was never adopted, and this RFC supersedes it.

Changes from RFC-4

  • The original RFC required checking if an output buffer address provided to a host function is inside the VM address space range and to stop the runtime execution if that's not the case. That requirement has been removed in this version of the RFC, as in the general case, the host doesn't have exhaustive information about the VM's memory organization. Thus, attempting to write to an out-of-bounds region will result in a "normal" runtime panic.
  • Function signatures introduced by PPP#7 have been used in this RFC, as the PPP has already been properly implemented and documented. However, it has never been officially adopted, nor have its functions been in use.
  • Return values were harmonized to i64 everywhere where they represent either a positive outcome as a positive integer or a negative outcome as a negative error code.
  • ext_offchain_network_peer_id_version_1 now returns a result code instead of silently failing if the network status is unavailable.
  • Added new versions of ext_misc_runtime_version and ext_offchain_random_seed.
  • Addressed discussions from the original RFC-4 discussion thread.

Motivation

The heap allocation of the runtime is currently controlled by the host using a memory allocator on the host side.

The API of many host functions contains buffer allocations. For example, when calling ext_hashing_twox_256_version_1, the host allocates a 32-byte buffer using the host allocator, and returns a pointer to this buffer to the runtime. The runtime later has to call ext_allocator_free_version_1 on this pointer to free the buffer.

Even though no benchmark has been done, it is pretty obvious that this design is very inefficient. To continue with the example of ext_hashing_twox_256_version_1, it would be more efficient to instead write the output hash to a buffer allocated by the runtime on its stack and passed by pointer to the function. Allocating a buffer on the stack, in the worst case, consists simply of decreasing a number; in the best case, it is free. Doing so would save many VM memory reads and writes by the allocator, and would save a function call to ext_allocator_free_version_1.

Furthermore, the existence of the host-side allocator has become questionable over time. It is implemented in a very naive way, and for determinism and backwards compatibility reasons, it needs to be implemented exactly identically in every client implementation. Runtimes make substantial use of heap memory allocations, and each allocation needs to go through the runtime <-> host boundary twice (once for allocating and once for freeing). Moving the allocator to the runtime side would be a good idea, although it would increase the runtime size. But before the host-side allocator can be deprecated, all the host functions that use it must be updated to avoid using it.

Stakeholders

No attempt was made to convince stakeholders.

Explanation

New definitions

New Definition I: Runtime Optional Positive Integer

The runtime optional positive integer is a signed 64-bit value. Non-negative values in the range of [0..2³²) represent corresponding unsigned 32-bit values. The value of -1 represents a non-existing value (an absent value). All other values are invalid.
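
The definition maps naturally onto a small decoder. The following Rust sketch is illustrative only; the function names and the choice of `Result<Option<u32>, i64>` (returning the offending raw value for invalid inputs) are assumptions, not part of the proposed interface.

```rust
/// Decode a runtime optional positive integer (New Definition I).
/// Ok(Some(n)) for values in [0, 2^32), Ok(None) for -1,
/// Err(raw) for every other (invalid) value.
fn decode_optional_positive(raw: i64) -> Result<Option<u32>, i64> {
    match raw {
        -1 => Ok(None),
        0..=0xFFFF_FFFF => Ok(Some(raw as u32)),
        other => Err(other),
    }
}

/// Encode the inverse mapping: absent values become -1.
fn encode_optional_positive(value: Option<u32>) -> i64 {
    match value {
        Some(n) => i64::from(n),
        None => -1,
    }
}
```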

New Definition II: Runtime Optional Pointer-Size

The runtime optional pointer-size has exactly the same definition as runtime pointer-size (Definition 216) with the value of 2⁶⁴-1 representing a non-existing value (an absent value).
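
As an illustration, the sketch below decodes such a value in Rust. It assumes that the runtime pointer-size of Definition 216 packs the 32-bit pointer into the least significant half and the 32-bit length into the most significant half of the 64-bit value; the `PointerSize` type and the function names are hypothetical.

```rust
/// A (pointer, length) pair describing a buffer in VM memory.
#[derive(Debug, PartialEq, Clone, Copy)]
struct PointerSize {
    ptr: u32,
    len: u32,
}

/// Pack a pointer-size: length in the high 32 bits, pointer in the low 32 bits
/// (assumed layout of Definition 216).
fn pack_pointer_size(ps: PointerSize) -> u64 {
    (u64::from(ps.len) << 32) | u64::from(ps.ptr)
}

/// Decode a runtime optional pointer-size (New Definition II):
/// the all-ones value 2^64 - 1 means "absent".
fn decode_optional_pointer_size(raw: u64) -> Option<PointerSize> {
    if raw == u64::MAX {
        None
    } else {
        Some(PointerSize {
            ptr: raw as u32,          // low 32 bits
            len: (raw >> 32) as u32,  // high 32 bits
        })
    }
}
```

Note that when the value travels as a Wasm `i64`, the absent marker 2⁶⁴-1 has the same bit pattern as -1.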

Changes to host functions

ext_storage_get

The function is deprecated. Users are encouraged to use ext_storage_read_version_2 instead.

ext_storage_read

The new version 2 is introduced, deprecating ext_storage_read_version_1. The new signature is

(func $ext_storage_read_version_2
    (param $key i64) (param $value_out i64) (param $value_offset i32) (result i64))
Arguments
  • key is a pointer-size (Definition 216) to the storage key being read;
  • value_out is a pointer-size (Definition 216) to a buffer where the value read should be stored. If the buffer is not long enough to accommodate the value, the value is truncated to the length of the buffer;
  • value_offset is a 32-bit offset from which the value reading should start.
Result

The result is an optional positive integer (New Definition I), representing either the full length of the value in storage or the absence of such a value in storage.

Changes

The logic of the function is unchanged since the previous version. Only the result representation has changed.
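
To make the truncation and return-value rules concrete, here is a mock of the described behaviour in plain Rust. It is not the host implementation: storage is modelled as a `BTreeMap`, buffers as slices instead of VM memory, and the name `storage_read_v2_mock` is hypothetical. The sketch returns the total stored length regardless of the offset, which is one reading of the Result description above.

```rust
use std::collections::BTreeMap;

/// Mock of the described semantics: copy `value[value_offset..]` into
/// `value_out`, truncating to the buffer length, and return the full length
/// of the stored value, or -1 if the key is absent.
fn storage_read_v2_mock(
    storage: &BTreeMap<Vec<u8>, Vec<u8>>,
    key: &[u8],
    value_out: &mut [u8],
    value_offset: u32,
) -> i64 {
    let Some(value) = storage.get(key) else {
        return -1; // absent value (New Definition I)
    };
    // An offset past the end of the value yields an empty tail.
    let start = (value_offset as usize).min(value.len());
    let tail = &value[start..];
    let n = tail.len().min(value_out.len());
    value_out[..n].copy_from_slice(&tail[..n]);
    value.len() as i64
}
```

A caller can compare the returned length with its buffer size and, if the value was truncated, read again with a larger buffer or a non-zero offset.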

ext_storage_clear_prefix

The new version 3 is introduced, deprecating ext_storage_clear_prefix_version_2. The new signature is

(func $ext_storage_clear_prefix_version_3
    (param $maybe_prefix i64) (param $maybe_limit i64) (param $maybe_cursor_in i64)
    (param $maybe_cursor_out i64) (param $backend i32) (param $unique i32) (param $loops i32)
    (result i32))
Arguments
  • maybe_prefix is a pointer-size (Definition 216) containing a (possibly empty) storage prefix being cleared;
  • maybe_limit is an optional positive integer (New Definition I) representing either the maximum number of backend deletions which may happen, or the absence of such a limit. The number of backend iterations may surpass this limit by no more than one;
  • maybe_cursor_in is an optional pointer-size (New Definition II) representing the cursor returned by the previous (unfinished) call to this function. It should be absent on the first call;
  • maybe_cursor_out is a pointer-size (Definition 216) to a buffer where the continuation cursor will optionally be written (see also the Result section);
  • backend is a pointer (Definition 215) to a 4-byte buffer where a 32-bit integer representing the number of items removed from the backend database will be written;
  • unique is a pointer (Definition 215) to a 4-byte buffer where a 32-bit integer representing the number of unique keys removed, taking into account both the backend and the overlay, will be written;
  • loops is a pointer (Definition 215) to a 4-byte buffer where a 32-bit integer representing the number of iterations (each requiring a storage seek/read) which were done will be written.
Result

The result represents the length of the continuation cursor which was written to the buffer provided in maybe_cursor_out. A zero value represents the absence of such a cursor and no need for continuation (the prefix has been completely cleared). If the buffer is not large enough to accommodate the cursor, the latter will be truncated, but the full length of the cursor will always be returned.

Changes

The new version adopts PPP#7, hence the significant change in the function interface with respect to the previous version. The reasoning for such a change was provided in the original proposal discussion.

ext_storage_root

The new version 3 is introduced, deprecating ext_storage_root_version_2. The signature is

(func $ext_storage_root_version_3
    (param $out i64) (result i32))
Arguments
  • out is a pointer-size (Definition 216) to a buffer where the SCALE-encoded storage root, calculated after committing all the existing operations, will be stored.
Result

The result is the length of the output stored in the buffer provided in out. If the buffer is not large enough to accommodate the data, the latter will be truncated, but the full length of the output data will always be returned.

Changes

The new version adopts PPP#6 deprecating the argument that used to represent the storage version.

ext_storage_next_key

The new version 2 is introduced, deprecating ext_storage_next_key_version_1. The signature is

(func $ext_storage_next_key_version_2
    (param $key_in i64) (param $key_out i64) (result i32))
Changes

The logic of the function is unchanged since the previous version. The signature has changed to align with the new memory allocation strategy.

Arguments
  • key_in is a pointer-size (Definition 216) to a buffer containing a storage key;
  • key_out is a pointer-size (Definition 216) to an output buffer where the next key in the storage in the lexicographical order will be written.
Result

The result is the length of the output key, or zero if no next key was found. If the buffer provided in key_out is not large enough to accommodate the data, the latter will be truncated, but the full length of the output data will always be returned.
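
The lookup and truncation rule can be sketched with an ordered map. This is a mock under stated assumptions, not the host implementation: storage is a `BTreeMap` (whose byte-wise key ordering is lexicographic), buffers are slices, and the name `next_key_v2_mock` is hypothetical.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};

/// Mock of the described semantics: find the smallest key strictly greater
/// than `key_in`, copy it (truncated if needed) into `key_out`, and return
/// its full length, or 0 if there is no next key.
fn next_key_v2_mock(
    storage: &BTreeMap<Vec<u8>, Vec<u8>>,
    key_in: &[u8],
    key_out: &mut [u8],
) -> i32 {
    let next = storage
        .range::<[u8], _>((Excluded(key_in), Unbounded))
        .next();
    match next {
        None => 0,
        Some((key, _value)) => {
            let n = key.len().min(key_out.len());
            key_out[..n].copy_from_slice(&key[..n]);
            key.len() as i32
        }
    }
}
```

Because the full key length is always returned, a caller whose buffer was too small learns the required size from the result and can repeat the call.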

ext_default_child_storage_get

The function is deprecated. Users are encouraged to use ext_default_child_storage_read_version_2 instead.

ext_default_child_storage_read

The new version 2 is introduced, deprecating ext_default_child_storage_read_version_1. The new signature is

(func $ext_default_child_storage_read_version_2
    (param $storage_key i64) (param $key i64) (param $value_out i64) (param $value_offset i32)
    (result i64))
Arguments
  • storage_key is a pointer-size (Definition 216) to the child storage key (Definition 219);
  • key is a pointer-size (Definition 216) to the storage key being read;
  • value_out is a pointer-size (Definition 216) to a buffer where the value read should be stored. If the buffer is not long enough to accommodate the value, the value is truncated to the length of the buffer;
  • value_offset is a 32-bit offset from which the value reading should start.
Result

The result is an optional positive integer (New Definition I), representing either the full length of the value in storage or the absence of such a value in storage.

Changes

The logic of the function is unchanged since the previous version. Only the result representation has changed.

ext_default_child_storage_storage_kill

The new version 4 is introduced, deprecating ext_default_child_storage_storage_kill_version_3. The new signature is

(func $ext_default_child_storage_storage_kill_version_4
    (param $storage_key i64) (param $maybe_limit i64) (param $maybe_cursor_in i64)
    (param $maybe_cursor_out i64) (param $backend i32) (param $unique i32) (param $loops i32)
    (result i32))
Arguments
  • storage_key is a pointer-size (Definition 216) to the child storage key (Definition 219);
  • maybe_limit is an optional positive integer representing either the maximum number of backend deletions which may happen, or the absence of such a limit. The number of backend iterations may surpass this limit by no more than one;
  • maybe_cursor_in is an optional pointer-size representing the cursor returned by the previous (unfinished) call to this function. It should be absent on the first call;
  • maybe_cursor_out is a pointer-size (Definition 216) to a buffer where the continuation cursor will optionally be written (see also the Result section);
  • backend is a pointer (Definition 215) to a 4-byte buffer where a 32-bit integer representing the number of items removed from the backend database will be written;
  • unique is a pointer (Definition 215) to a 4-byte buffer where a 32-bit integer representing the number of unique keys removed, taking into account both the backend and the overlay, will be written;
  • loops is a pointer (Definition 215) to a 4-byte buffer where a 32-bit integer representing the number of iterations (each requiring a storage seek/read) which were done will be written.
Result

The result represents the length of the continuation cursor which was written to the buffer provided in maybe_cursor_out. A zero value represents the absence of such a cursor and no need for continuation (the prefix has been completely cleared). If the buffer is not large enough to accommodate the cursor, the latter will be truncated, but the full length of the cursor will always be returned.

Changes

The new version adopts PPP#7, hence the significant change in the function interface with respect to the previous version. The reasoning for such a change was provided in the original proposal discussion.

ext_default_child_storage_clear_prefix

The new version 3 is introduced, deprecating ext_default_child_storage_clear_prefix_version_2. The new signature is

(func $ext_default_child_storage_clear_prefix_version_3
    (param $storage_key i64) (param $prefix i64) (param $maybe_limit i64)
    (param $maybe_cursor_in i64) (param $maybe_cursor_out i64) (param $backend i32)
    (param $unique i32) (param $loops i32) (result i32))
Arguments
  • storage_key is a pointer-size (Definition 216) to the child storage key (Definition 219);
  • prefix is a pointer-size (Definition 216) containing a storage prefix being cleared;
  • maybe_limit is an optional positive integer representing either the maximum number of backend deletions which may happen, or the absence of such a limit. The number of backend iterations may surpass this limit by no more than one;
  • maybe_cursor_in is an optional pointer-size representing the cursor returned by the previous (unfinished) call to this function. It should be absent on the first call;
  • maybe_cursor_out is a pointer-size (Definition 216) to a buffer where the continuation cursor will optionally be written (see also the Result section);
  • backend is a pointer (Definition 215) to a 4-byte buffer where a 32-bit integer representing the number of items removed from the backend database will be written;
  • unique is a pointer (Definition 215) to a 4-byte buffer where a 32-bit integer representing the number of unique keys removed, taking into account both the backend and the overlay, will be written;
  • loops is a pointer (Definition 215) to a 4-byte buffer where a 32-bit integer representing the number of iterations (each requiring a storage seek/read) which were done will be written.
Result

The result represents the length of the continuation cursor which was written to the buffer provided in maybe_cursor_out. A zero value represents the absence of such a cursor and no need for continuation (the prefix has been completely cleared). If the buffer is not large enough to accommodate the cursor, the latter will be truncated, but the full length of the cursor will always be returned.

Changes

The new version adopts PPP#7, hence the significant change in the function interface with respect to the previous version. The reasoning for such a change was provided in the original proposal discussion.

ext_default_child_storage_root

The new version 3 is introduced, deprecating ext_default_child_storage_root_version_2. The signature is

(func $ext_default_child_storage_root_version_3
    (param $storage_key i64) (param $out i64) (result i32))
Arguments
  • storage_key is a pointer-size (Definition 216) to the child storage key (Definition 219);
  • out is a pointer-size (Definition 216) to a buffer where the SCALE-encoded storage root, calculated after committing all the existing operations, will be stored.
Result

The result is the length of the output stored in the buffer provided in out. If the buffer is not large enough to accommodate the data, the latter will be truncated, but the full length of the output data will always be returned.

Changes

The new version adopts PPP#6 deprecating the argument that used to represent the storage version.

ext_default_child_storage_next_key

The new version 2 is introduced, deprecating ext_default_child_storage_next_key_version_1. The signature is

(func $ext_default_child_storage_next_key_version_2
    (param $storage_key i64) (param $key_in i64) (param $key_out i64) (result i32))
Arguments
  • storage_key is a pointer-size (Definition 216) to the child storage key (Definition 219);
  • key_in is a pointer-size (Definition 216) to a buffer containing a storage key;
  • key_out is a pointer-size (Definition 216) to an output buffer where the next key in the storage in the lexicographical order will be written.
Result

The result is the length of the output key, or zero if no next key was found. If the buffer provided in key_out is not large enough to accommodate the data, the latter will be truncated, but the full length of the output data will always be returned.

Changes

The logic of the function is unchanged since the previous version. The signature has changed to align with the new memory allocation strategy.

ext_trie_{blake2|keccak}_256_[ordered_]root

The following functions share the same signatures and set of changes:

  • ext_trie_blake2_256_root
  • ext_trie_blake2_256_ordered_root
  • ext_trie_keccak_256_root
  • ext_trie_keccak_256_ordered_root

For the aforementioned functions, versions 3 were introduced, and the corresponding versions 2 were deprecated. The signature is:

(func $ext_trie_{blake2|keccak}_256_[ordered_]root_version_3
    (param $input i64) (param $version i32) (param $out i32))
Arguments
  • input is a pointer-size (Definition 216) to the SCALE-encoded vector of the trie key-value pairs;
  • version is the state version, where 0 denotes V0 and 1 denotes V1 state version. Other state versions may be introduced in the future;
  • out is a pointer (Definition 215) to a 32-byte buffer, where the calculated trie root will be stored.
Changes

The logic of the function is unchanged since the previous version. The signature has changed to align with the new memory allocation strategy.

ext_misc_runtime_version

The new version 2 is introduced, deprecating ext_misc_runtime_version_version_1. The signature is

(func $ext_misc_runtime_version_version_2
    (param $wasm i64) (param $out i64) (result i64))
Arguments
  • wasm is a pointer-size (Definition 216) to the Wasm blob from which the version information should be extracted;
  • out is a pointer-size (Definition 216) to the buffer where the SCALE-encoded extracted version information will be stored.
Result

The result is an optional positive integer (New Definition I) representing the length of the output data. If the buffer is not large enough to accommodate the data, the latter will be truncated, but the full length of the output data will always be returned. An absent value represents the absence of the version information in the Wasm blob or a failure to read one.

Changes

The logic of the function is unchanged since the previous version. The signature has changed to align with the new memory allocation strategy.

ext_crypto_{ed25519|sr25519|ecdsa}_public_keys

The following functions are deprecated:

  • ext_crypto_ed25519_public_keys_version_1
  • ext_crypto_sr25519_public_keys_version_1
  • ext_crypto_ecdsa_public_keys_version_1

Users are encouraged to use the new *_num_public_keys and *_public_key counterparts.

ext_crypto_{ed25519|sr25519|ecdsa}_num_public_keys

New functions, all sharing the same signature and logic, are introduced:

  • ext_crypto_ed25519_num_public_keys_version_1
  • ext_crypto_sr25519_num_public_keys_version_1
  • ext_crypto_ecdsa_num_public_keys_version_1

The signature is:

(func $ext_crypto_{ed25519|sr25519|ecdsa}_num_public_keys_version_1
    (param $id i32) (result i32))
Arguments
  • id is a pointer (Definition 215) to the key type identifier (Definition 220).
Result

The result represents a (possibly zero) number of keys of the given type known to the keystore.

ext_crypto_{ed25519|sr25519|ecdsa}_public_key

New functions, all sharing the same signature and logic, are introduced:

  • ext_crypto_ed25519_public_key_version_1
  • ext_crypto_sr25519_public_key_version_1
  • ext_crypto_ecdsa_public_key_version_1

The signature is:

(func $ext_crypto_{ed25519|sr25519|ecdsa}_public_key_version_1
    (param $id i32) (param $index i32) (param $out i32))
Arguments
  • id is a pointer (Definition 215) to the key type identifier (Definition 220);
  • index is the index of the key in the keystore. If the index is out of bounds (determined by the value returned by the respective _num_public_keys function) the function will panic;
  • out is a pointer (Definition 215) to the output buffer of the respective size (depending on key type) where the key will be written.
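
Together, the `_num_public_keys` and `_public_key` functions replace the single call that used to return a host-allocated list. The Rust sketch below shows the intended calling pattern against a mock keystore. The `Keystore` trait, `MockKeystore`, and `all_public_keys` are hypothetical names introduced for illustration, and the 32-byte key size is an assumption matching ed25519 and sr25519 public keys.

```rust
/// Hypothetical stand-in for the pair of host functions, for keys of N bytes.
trait Keystore<const N: usize> {
    /// Mirrors `*_num_public_keys`: number of keys known to the keystore.
    fn num_public_keys(&self) -> u32;
    /// Mirrors `*_public_key`: writes key `index` into a caller-owned buffer.
    fn public_key(&self, index: u32, out: &mut [u8; N]);
}

/// In-memory mock holding 32-byte keys.
struct MockKeystore(Vec<[u8; 32]>);

impl Keystore<32> for MockKeystore {
    fn num_public_keys(&self) -> u32 {
        self.0.len() as u32
    }
    fn public_key(&self, index: u32, out: &mut [u8; 32]) {
        // Indexing panics when out of bounds, mirroring the described behaviour.
        *out = self.0[index as usize];
    }
}

/// Enumerate all keys: query the count first, then fetch each key into a
/// buffer owned by the caller, so no host-side allocation is needed.
fn all_public_keys<const N: usize, K: Keystore<N>>(keystore: &K) -> Vec<[u8; N]> {
    (0..keystore.num_public_keys())
        .map(|index| {
            let mut out = [0u8; N];
            keystore.public_key(index, &mut out);
            out
        })
        .collect()
}
```

In a real runtime the per-key buffer would typically live on the stack, which is the saving the Motivation section describes.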

ext_crypto_{ed25519|sr25519|ecdsa}_generate

The following functions share the same signatures and set of changes:

  • ext_crypto_ed25519_generate
  • ext_crypto_sr25519_generate
  • ext_crypto_ecdsa_generate

For the aforementioned functions, versions 2 are introduced, and the corresponding versions 1 are deprecated. The signature is:

(func $ext_crypto_{ed25519|sr25519|ecdsa}_generate_version_2
    (param $id i32) (param $seed i64) (param $out i32))
Arguments
  • id is a pointer (Definition 215) to the key type identifier (Definition 220). The function will panic if the identifier is invalid;
  • seed is a pointer-size (Definition 216) to the SCALE-encoded Option value (Definition 200) containing the BIP-39 seed which must be valid UTF-8. The function will panic if the seed is not valid UTF-8;
  • out is a pointer (Definition 215) to the output buffer of the respective size (depending on key type) where the generated key will be written.
Changes

The logic of the functions is unchanged since the previous version. The signature has changed to align with the new memory allocation strategy.

ext_crypto_{ed25519|sr25519|ecdsa}_sign[_prehashed]

The following functions share the same signatures and set of changes:

  • ext_crypto_ed25519_sign
  • ext_crypto_sr25519_sign
  • ext_crypto_ecdsa_sign
  • ext_crypto_ecdsa_sign_prehashed

For the aforementioned functions, versions 2 are introduced, and the corresponding versions 1 are deprecated. The signature is:

(func $ext_crypto_{ed25519|sr25519|ecdsa}_sign{_prehashed|}_version_2
    (param $id i32) (param $pub_key i32) (param $msg i64) (param $out i32) (result i64))
Arguments
  • id is a pointer (Definition 215) to the key type identifier (Definition 220). The function will panic if the identifier is invalid;
  • pub_key is a pointer (Definition 215) to the public key bytes (as returned by the respective _public_key function);
  • msg is a pointer-size (Definition 216) to the message that is to be signed;
  • out is a pointer (Definition 215) to the output buffer of the respective size (depending on key type) where the signature will be written.
Result

The function returns 0 on success. On error, -1 is returned and the output buffer should be considered uninitialized.

Changes

The logic of the functions is unchanged since the previous version. The signature has changed to align with the new memory allocation strategy.

ext_crypto_secp256k1_ecdsa_recover[_compressed]

The following functions share the same signatures and set of changes:

  • ext_crypto_secp256k1_ecdsa_recover
  • ext_crypto_secp256k1_ecdsa_recover_compressed

For the aforementioned functions, versions 3 are introduced, and the corresponding versions 2 are deprecated. The signature is:

(func $ext_crypto_secp256k1_ecdsa_recover[_compressed]_version_3
    (param $sig i32) (param $msg i32) (param $out i32) (result i64))
Arguments
  • sig is a pointer (Definition 215) to the buffer containing the 65-byte signature in RSV format. V must be either 0/1 or 27/28;
  • msg is a pointer (Definition 215) to the buffer containing the 256-bit Blake2 hash of the message;
  • out is a pointer (Definition 215) to the output buffer of the respective size (depending on key type) where the recovered public key will be written.
Result

The function returns 0 on success. On error, it returns a negative ECDSA verification error code, where -1 stands for incorrect R or S, -2 stands for invalid V, and -3 stands for invalid signature.
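As an illustration of the result convention described above, a runtime-side wrapper could translate the raw i64 into a typed result. This is a minimal sketch: the `RecoverError` type and `decode_recover_result` function are hypothetical names, not part of the RFC, and the `Unknown` variant covers values the text above does not define.

```rust
/// Hypothetical typed view of the raw i64 returned by
/// `ext_crypto_secp256k1_ecdsa_recover[_compressed]_version_3`.
#[derive(Debug, PartialEq, Eq)]
pub enum RecoverError {
    /// -1: incorrect R or S.
    InvalidRs,
    /// -2: invalid V.
    InvalidV,
    /// -3: invalid signature.
    InvalidSignature,
    /// Any other value; not defined by the text above.
    Unknown(i64),
}

/// Map the raw host function result to `Ok(())` or a typed error.
pub fn decode_recover_result(raw: i64) -> Result<(), RecoverError> {
    match raw {
        0 => Ok(()),
        -1 => Err(RecoverError::InvalidRs),
        -2 => Err(RecoverError::InvalidV),
        -3 => Err(RecoverError::InvalidSignature),
        other => Err(RecoverError::Unknown(other)),
    }
}
```

On `Ok(())`, the caller would then read the recovered public key from the `out` buffer it passed in.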

Changes

The signature has changed to align with the new memory allocation strategy. The return error encoding, defined under Definition 221, is changed to promote the unification of host function result reporting (zero and positive values are for success, and the negative values are for failure codes).

ext_hashing_{keccak|sha2|blake2|twox}_

The following functions share the same signatures and set of changes:

  • ext_hashing_keccak_256
  • ext_hashing_keccak_512
  • ext_hashing_sha2_256
  • ext_hashing_blake2_128
  • ext_hashing_blake2_256
  • ext_hashing_twox_64
  • ext_hashing_twox_128
  • ext_hashing_twox_256

For the aforementioned functions, versions 2 are introduced, and the corresponding versions 1 are deprecated. The signature is:

(func $ext_hashing_{keccak|sha2|blake2|twox}_{64|128|256|512}_version_2
    (param $data i64) (param $out i32))
Arguments
  • data is a pointer-size (Definition 216) to the data to be hashed.
  • out is a pointer (Definition 215) to the output buffer of the respective size (depending on hash type) where the calculated hash will be written.
Changes

The logic of the functions is unchanged since the previous version. The signature has changed to align with the new memory allocation strategy.

ext_offchain_submit_transaction

The new version 2 is introduced, deprecating ext_offchain_submit_transaction_version_1. The signature is unchanged.

(func $ext_offchain_submit_transaction_version_2
    (param $data i64) (result i64))
Arguments
  • data is a pointer-size (Definition 216) to the byte array storing the encoded extrinsic.
Result

The result is 0 for success or -1 for failure.

Changes

The logic and the signature of the function are unchanged since the previous version. The only change is the interpretation of the result value to avoid an unneeded allocation and promote the unification of host function result reporting (zero and positive values are for success, and the negative values are for failure codes).

ext_offchain_network_state

The function is deprecated. Users are encouraged to use ext_offchain_network_peer_id_version_1 instead.

ext_offchain_network_peer_id

A new function is introduced. The signature is

(func $ext_offchain_network_peer_id_version_1
    (param $out i32) (result i64))
Arguments
  • out is a pointer (Definition 215) to the output buffer, 38 bytes long, where the network peer ID will be written.
Result

The result is 0 for success or -1 for failure.

ext_offchain_random_seed

The new version 2 is introduced, deprecating ext_offchain_random_seed_version_1. The signature is:

(func $ext_offchain_random_seed_version_2
    (param $out i32))
Arguments
  • out is a pointer (Definition 215) to the output buffer, 32 bytes long, where the random seed will be written.
Changes

The logic of the function is unchanged since the previous version. The signature has changed to align with the new memory allocation strategy.

ext_offchain_local_storage_get

The function is deprecated. Users are encouraged to use ext_offchain_local_storage_read_version_1 instead.

ext_offchain_local_storage_read

A new function is introduced. The signature is

(func $ext_offchain_local_storage_read_version_1
    (param $kind i32) (param $key i64) (param $value_out i64) (param $offset i32) (result i64))
Arguments
  • kind is an offchain storage kind, where 0 denotes the persistent storage (Definition 222), and 1 denotes the local storage (Definition 223);
  • key is a pointer-size (Definition 216) to the storage key being read;
  • value_out is a pointer-size (Definition 216) to a buffer where the value read should be stored. If the buffer is not large enough to accommodate the value, the value is truncated to the length of the buffer;
  • offset is a 32-bit offset from which the value reading should start.
Result

The result is an optional positive integer (New Definition I), representing either the full length of the value in storage or the absence of such a value in storage.
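The truncation and offset rules above can be modelled with a small host-side sketch. Everything here is hypothetical: `local_storage_read` is an illustrative stand-in working on Rust slices rather than Wasm memory, and encoding an absent value as -1 is an assumption about New Definition I, which is defined elsewhere in the RFC.

```rust
/// Hypothetical model of `ext_offchain_local_storage_read_version_1`.
/// `stored` is the value held under the key, if any.
pub fn local_storage_read(stored: Option<&[u8]>, offset: u32, value_out: &mut [u8]) -> i64 {
    let value = match stored {
        Some(v) => v,
        // Assumption: an absent value is reported as -1.
        None => return -1,
    };
    // Start reading at `offset`, clamped to the end of the value.
    let start = (offset as usize).min(value.len());
    let tail = &value[start..];
    // Truncate to the output buffer if the remaining bytes do not fit.
    let n = tail.len().min(value_out.len());
    value_out[..n].copy_from_slice(&tail[..n]);
    // The full length of the stored value is returned, not the bytes copied.
    value.len() as i64
}
```

Because the full length is returned, a caller whose buffer was too small can compare the result with the offset plus the buffer length and issue a second read for the rest.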

ext_offchain_http_request_start

The new version 2 is introduced, deprecating ext_offchain_http_request_start_version_1. The signature is unchanged.

(func $ext_offchain_http_request_start_version_2
    (param $method i64) (param $uri i64) (param $meta i64) (result i64))
Arguments

  • method is a pointer-size (Definition 216) to the HTTP method. Possible values are “GET” and “POST”;
  • uri is a pointer-size (Definition 216) to the URI;
  • meta is a future-reserved field containing additional, SCALE-encoded parameters. Currently, an empty array should be passed.

Result

On success, a positive request identifier is returned. On error, -1 is returned.

Changes

The logic and the signature of the function are unchanged since the previous version. The only change is the interpretation of the result value to avoid an unneeded allocation and promote the unification of host function result returning (zero and positive values are for success, and the negative values are for failure codes).

ext_offchain_http_request_add_header

The new version 2 is introduced, deprecating ext_offchain_http_request_add_header_version_1. The signature is unchanged.

(func $ext_offchain_http_request_add_header_version_2
    (param $request_id i32) (param $name i64) (param $value i64) (result i64))
Arguments
  • request_id is an i32 integer indicating the ID of the started request, as returned by ext_offchain_http_request_start;
  • name is a pointer-size (Definition 216) to the HTTP header name;
  • value is a pointer-size (Definition 216) to the HTTP header value.
Result

The result is 0 for success or -1 for failure.

Changes

The logic and the signature of the function are unchanged since the previous version. The only change is the interpretation of the result value to avoid an unneeded allocation and promote the unification of host function result returning (zero and positive values are for success, and the negative values are for failure codes).

ext_offchain_http_request_write_body

The new version 2 is introduced, deprecating ext_offchain_http_request_write_body_version_1. The signature is unchanged.

(func $ext_offchain_http_request_write_body_version_2
    (param $request_id i32) (param $chunk i64) (param $deadline i64) (result i64))
Arguments
  • request_id is an i32 integer indicating the ID of the started request, as returned by ext_offchain_http_request_start;
  • chunk is a pointer-size (Definition 216) to the chunk of bytes. Writing an empty chunk finalizes the request;
  • deadline is a pointer-size (Definition 216) to the SCALE-encoded Option value (Definition 200) containing the UNIX timestamp (Definition 191). Passing None blocks indefinitely.
Result

On success, 0 is returned. On failure, a negative error code is returned, where -1 denotes the deadline was reached, -2 denotes that an I/O error occurred, and -3 denotes that the request ID provided was invalid.

Changes

The logic and the signature of the function are unchanged since the previous version. The only change is the interpretation of the result value to avoid an unneeded allocation and promote the unification of host function result returning (zero and positive values are for success, and the negative values are for failure codes).

ext_offchain_http_request_wait

The new version 2 is introduced, deprecating ext_offchain_http_request_wait_version_1. The signature is:

(func $ext_offchain_http_request_wait_version_2
    (param $ids i64) (param $deadline i64) (param $out i64))
Arguments
  • ids is a pointer-size (Definition 216) to the SCALE-encoded array of started request IDs, as returned by ext_offchain_http_request_start;
  • deadline is a pointer-size (Definition 216) to the SCALE-encoded Option value (Definition 200) containing the UNIX timestamp (Definition 191). Passing None blocks indefinitely;
  • out is a pointer-size (Definition 216) to the buffer of i32 integers where the request statuses will be stored. The number of elements of the buffer must be strictly equal to the number of elements in the ids array; otherwise, the function panics.
Changes

The logic of the function is unchanged since the previous version. The signature has changed to align with the new memory allocation strategy.

ext_offchain_http_response_read_body

The new version 2 is introduced, deprecating ext_offchain_http_response_read_body_version_1. The signature is unchanged.

(func $ext_offchain_http_response_read_body_version_2
    (param $request_id i32) (param $buffer i64) (param $deadline i64) (result i64))
Arguments
  • request_id is an i32 integer indicating the ID of the started request, as returned by ext_offchain_http_request_start;
  • buffer is a pointer-size (Definition 216) to the buffer where the body is written;
  • deadline is a pointer-size (Definition 216) to the SCALE-encoded Option value (Definition 200) containing the UNIX timestamp (Definition 191). Passing None blocks indefinitely.
Result

On success, the number of bytes written to the buffer is returned. A value of 0 means the entire response was consumed and no further calls to the function are needed for the provided request ID. On failure, a negative error code is returned, where -1 denotes the deadline was reached, -2 denotes that an I/O error occurred, and -3 denotes that the request ID provided was invalid.
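The result convention above suggests a simple read loop on the caller's side. The sketch below is only an illustration: `read_chunk` is a hypothetical closure standing in for a call to ext_offchain_http_response_read_body_version_2 with a fixed request ID and deadline, and `HttpError` is an illustrative type, not part of the RFC.

```rust
/// Hypothetical typed view of the negative result codes listed above.
#[derive(Debug, PartialEq, Eq)]
pub enum HttpError {
    DeadlineReached, // -1
    IoError,         // -2
    InvalidId,       // -3
    Unknown(i64),    // any other negative value; not defined above
}

/// Read a whole response body by calling `read_chunk` until it returns 0.
pub fn read_full_body<F>(mut read_chunk: F, chunk_size: usize) -> Result<Vec<u8>, HttpError>
where
    F: FnMut(&mut [u8]) -> i64,
{
    let mut body = Vec::new();
    let mut buffer = vec![0u8; chunk_size];
    loop {
        match read_chunk(&mut buffer) {
            // 0 means the entire response was consumed.
            0 => return Ok(body),
            n if n > 0 => {
                // Clamp defensively so a misbehaving host cannot cause a panic.
                let n = (n as usize).min(buffer.len());
                body.extend_from_slice(&buffer[..n]);
            }
            -1 => return Err(HttpError::DeadlineReached),
            -2 => return Err(HttpError::IoError),
            -3 => return Err(HttpError::InvalidId),
            other => return Err(HttpError::Unknown(other)),
        }
    }
}
```

The buffer is reused across calls, so the only allocation that grows is the caller-owned `body`.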

Changes

The logic and the signature of the function are unchanged since the previous version. The only change is the interpretation of the result value to avoid an unneeded allocation and promote the unification of host function result returning (zero and positive values are for success, and the negative values are for failure codes).

ext_allocator_

The functions are deprecated and must not be used in new code.

ext_input_read

A new function is introduced. The signature is

(func $ext_input_read_version_1
    (param $buffer i64))
Arguments
  • buffer is a pointer-size (Definition 216) to the buffer where the input data will be written. If the buffer is not large enough to accommodate the input data, the function will panic.

Other changes

Currently, all runtime entrypoints have the following identical Wasm function signatures:

(func $runtime_entrypoint (param $data i32) (param $len i32) (result i64))

After this RFC is implemented, such entrypoints are still supported, but considered deprecated. New entrypoints must have the following signature:

(func $runtime_entrypoint (param $len i32) (result i64))

A runtime function called through such an entrypoint receives the length of the SCALE-encoded input data as its only argument. The function must then allocate exactly that number of bytes and call the ext_input_read host function to obtain the encoded input data.
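The flow of a new-style entrypoint can be sketched with a mock of linear memory. All names below are hypothetical. The pointer-size layout (pointer in the lower 32 bits, length in the upper 32 bits) is an assumption recalled from the Polkadot specification's Definition 216, and the allocation is faked by always using offset 0.

```rust
/// Pack a pointer and a length into a pointer-size value.
/// Assumed layout: lower 32 bits = pointer, upper 32 bits = length.
pub fn pack_pointer_size(ptr: u32, len: u32) -> i64 {
    (((len as u64) << 32) | ptr as u64) as i64
}

/// Split a pointer-size value back into (pointer, length).
pub fn unpack_pointer_size(value: i64) -> (u32, u32) {
    let raw = value as u64;
    (raw as u32, (raw >> 32) as u32)
}

/// Hypothetical stand-in for `ext_input_read_version_1`: copies `input`
/// into `memory` at the location described by `buffer`.
pub fn mock_ext_input_read(memory: &mut [u8], input: &[u8], buffer: i64) {
    let (ptr, len) = unpack_pointer_size(buffer);
    let (ptr, len) = (ptr as usize, len as usize);
    // The host function panics if the buffer cannot hold the input.
    assert!(len >= input.len(), "buffer too small for input data");
    memory[ptr..ptr + input.len()].copy_from_slice(input);
}

/// Sketch of a new-style entrypoint: it receives only the input length,
/// "allocates" that many bytes, and asks the host to fill them.
pub fn entrypoint(memory: &mut [u8], host_input: &[u8], len: u32) -> Vec<u8> {
    let ptr = 0u32; // fake allocation: the buffer starts at offset 0
    mock_ext_input_read(memory, host_input, pack_pointer_size(ptr, len));
    memory[..len as usize].to_vec()
}
```

In a real runtime the buffer would come from the runtime's own allocator, and the bytes read would then be SCALE-decoded into the entrypoint's arguments.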


RFC-0150: Allow Voting While Delegating

Start Date: June 5th, 2025
Description: Allow voters to simultaneously delegate and vote
Authors: polka.dom (polkadotdom)

Summary

This RFC proposes changes to pallet-conviction-voting that allow for simultaneous voting and delegation. For example, Alice could delegate to Bob, then later vote on a referendum while keeping their delegation to Bob intact. It is a strict subset of Leemo's RFC 35.

Motivation

Backdrop

Under our current voting system, a voter can either vote or delegate. To vote, they must first ensure they have no delegate, and to delegate, they must first clear their current votes.

The Issue

Empirically, the vast majority of people do not vote on day-to-day policy. This was foreseen and is the reason governance has delegation. More worryingly, however, it has also been observed that most people do not delegate either, leaving a large percentage of our voting population unrepresented.

Factors Limiting Delegation

One could think of three major reasons for this lack of delegation.

  • The voter does not know of anyone who accurately represents them.
  • The voter does not want their right to vote stripped, in consideration of some yet unknown, highly important, referendum.
  • The voter does not want to clear their voting data so as to delegate.

This RFC aims to solve the second and third issues and thus align governance more accurately with true voter preferences.

An Aside

One may ask, could a voter not just undelegate, vote, then delegate again? Could this just be built into the user interface? Unfortunately, this does not work due to the need to clear their votes before redelegation. In practice the voter would undelegate, vote, wait until the referendum is closed, hope that there's no other referenda they would like to vote on, then redelegate. At best it's a temporally extended friction. At worst the voter goes unrepresented in voting for the duration of the vote clearing period.

Stakeholders

Runtime developers: If runtime developers are relying on the previous assumptions for their VotingHooks implementations, they will need to rethink their approach. In addition, a runtime migration is needed. Lastly, it is a serious change in governance that requires some consideration beyond the technical.

App developers: Apps like Subsquare and Polkassembly would need to update their user interface logic. They will also need to handle the new error.

Users: We will want users to be aware of the new functionality, though this is not required.

Technical Writers: This change will require rewrites of documentation and tutorials.

Explanation

New Data & Runtime Logic

The Voting enum, which currently holds the user's vote data, would first be collapsed and its underlying fields consolidated, as there would no longer be a distinction between the enum's variants. A (poll index -> retracted votes count) field would then be added to the resulting structure. Its role is to keep track of the per-poll balance that has been clawed back from the user by those delegating to them. See here for a potential implementation.

The implementation must allow for the (poll index -> retracted votes) data to exist even if the user does not currently have a vote for that poll. A simple example that highlights the necessity is as follows: A delegator votes first, then the delegate does. If the delegator is not allowed to create the retracted votes data on the delegate, the tally count would be corrupted when the delegate votes.

It follows then that the delegator must also handle clean up of that data when their vote is removed. Otherwise, the delegate has no immediate monetary incentive to clean the retracted vote's state.

All changes to pallet-conviction-voting's STF would follow those simple changes. For example, when a user votes standard, the final amount added to the poll's tally would be balance + (amount delegated to user - retracted votes). Then, if they are delegating, it will update their delegate's vote data with the newly retracted votes.

The retracted amount is always the full delegated amount. For example, if Alice delegates 10 UNITS to Bob and then votes with 5 UNITS, the full 10 UNITS is still added as a clawback to Bob for that poll. This is both for simplicity and to ensure we don't make unnecessary assumptions about what Alice wants.
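The tally arithmetic described above can be sketched as follows. This is a simplified illustration, not the pallet's implementation: `VoterState`, its fields, and its methods are hypothetical names, and conviction multipliers are ignored.

```rust
use std::collections::BTreeMap;

type PollIndex = u32;
type Balance = u128;

/// Hypothetical, simplified per-account voting data.
#[derive(Default)]
pub struct VoterState {
    /// Total balance currently delegated to this account.
    pub delegated_to_me: Balance,
    /// poll index -> votes retracted by delegators who voted themselves.
    pub retracted: BTreeMap<PollIndex, Balance>,
}

impl VoterState {
    /// Amount a standard vote with `balance` adds to the poll's tally:
    /// balance + (amount delegated to user - retracted votes).
    pub fn tally_contribution(&self, poll: PollIndex, balance: Balance) -> Balance {
        let retracted = self.retracted.get(&poll).copied().unwrap_or(0);
        balance + self.delegated_to_me.saturating_sub(retracted)
    }

    /// Record a clawback when a delegator votes on `poll`. The full
    /// delegated amount is retracted, regardless of the delegator's vote size.
    /// The entry is created even if this account has no vote on `poll` yet.
    pub fn add_clawback(&mut self, poll: PollIndex, delegated_amount: Balance) {
        *self.retracted.entry(poll).or_insert(0) += delegated_amount;
    }
}
```

With Alice delegating 10 UNITS to Bob, Bob's vote of 20 UNITS counts as 30 on every poll until Alice votes on a given poll, after which it counts as 20 on that poll only.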

Because you need to add the clawback, a delegator's vote can affect a delegate's voting data. If a delegator's vote or delegation makes the delegate's voting data exceed MaxVotes, the transaction will fail. In practice, this means this new system is somewhere between the old and the ideal. However, this will incentivize delegates to stay on top of voting data clearance. And given our current referenda rates and MaxVotes set to 512, it would be difficult to hit this limit.

A new error is to be introduced that signals MaxVotes was reached specifically for the delegate's voting data.

Locked Balance

A user's locked balance will be the greater of the delegation lock and the voting lock.

Migrations

A runtime migration is necessary, though simple considering voting and delegation are currently separate. It would iterate over the VotingFor storage item and convert the old vote data structure to the new structure.

Drawbacks

There are two potential drawbacks to this system:

An unbounded rate of change of the voter preferences function

If implemented, there will be no friction in delegating, undelegating, and voting. Therefore, there could be large and immediate shifts in the voter preferences function. In other voting systems we see bounds added to the rate of change (voting cycles, etc). That said, it is unclear whether this is desired or advantageous. Additionally, there are more easily parameterized and analytically tractable ways to handle this than what we currently have. See future directions.

Lessened value in becoming a delegate

If a delegate's voting power can be stripped from them at any point, then there is necessarily a reduction in their power within the system. This provides less incentive to become a delegate. But again, there are more customizable ways to handle this if it proves necessary.

Testing, Security, and Privacy

This change would make the voting STF more complicated and therefore harder to harden, though sufficient unit testing should mitigate this.

Performance, Ergonomics, and Compatibility

Performance

The proposed changes would increase both the compute and storage requirements of all voting functions by about 2x. Complexity is unchanged.

Ergonomics

Voting and delegation will both become more ergonomic for users, as there are no longer hard constraints affecting what you can do and when you can do it.

Compatibility

Runtime developers will need to add the migration and ensure their hooks still work.

App developers will need to update their user interfaces to accommodate the new functionality. They will need to handle the new error as well.

Prior Art and References

A current implementation can be found here.

Unresolved Questions

None

Future Directions and Related Material

It is possible we would like to add a system parameter for the rate of change of the voting/delegation system. This could prevent wild swings in the voter preferences function and motivate/shield delegates by solidifying their positions over some amount of time. However, it's unclear that this would be valuable or even desirable.


RFC-0151: Crowdsourced Decision Deposits

Start Date: July 7th, 2025
Description: Allow decision deposits to be crowdsourced.
Authors: polka.dom & Phunky

Summary

This RFC proposes changes to pallet-referenda that would allow for many people to contribute to a single referendum's decision deposit.

Motivation

Backdrop

Currently there are two types of deposits that must be placed for an OpenGov referendum to begin its deciding stage: the submission deposit, which is minuscule, and the decision deposit. Each of these can only be placed as a lump sum by a single account.

The Issue

The decision deposit can be (by design) quite large, reaching values up to 100k DOT on the relay chain. Perhaps unsurprisingly, it can be observed that we are incurring voter signal loss due to this high barrier of entry, seen here and here.

The primary motivation of this RFC is then to reduce that signal loss while still retaining high security assumptions.

Stakeholders

Governance Actors: All actors in governance would be affected by this RFC, as it changes the dynamics of our federal and local voting systems.

Runtime Developers: Runtime developers will need to update their SDK version and enact a runtime migration.

DApp Developers: App developers will need to integrate the new changes into their UI/UX.

Technical Writers: A rewrite of existing documentation is not needed, but documentation for the new features would be warranted.

Explanation

The changes to pallet-referenda would be as follows:

  • A referendum's status must be modified to include a list of deposits instead of just one.

  • An additional extrinsic for contributing partially to a decision deposit should be created. Retrofitting the existing one would work as well, but at the cost of more breaking changes.

  • The kill and refund_decision_deposit extrinsics must be updated to deal with a list of deposits.

  • New per-track info fields for the minimum contributable amount and the maximum number of contributors must be added. The minimum contribution helps with griefing (see below), and the maximum number of contributors keeps the storage/compute bounded.

  • The last available slot for contribution must be reserved for those contributing the full remaining amount. This is to prevent griefing.

  • Any amount contributed greater than the remaining amount required should not be locked/used.

  • New errors must be created for max contributors reached and contribution under the minimum.

  • A new event for partial decision deposit placed should be created. The current event could be used, but may be confusing as it's known to mean the full amount.
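The contribution rules listed above can be sketched as a single function. This is an illustration under stated assumptions, not the pallet's code: all type, field, and variant names are hypothetical; the `LastSlotReserved` and `AlreadyFunded` errors are additions for the sketch (the RFC names only the max-contributors and under-minimum errors); and allowing a final contribution below the minimum when it completes the deposit is an assumption.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ContributeError {
    MaxContributorsReached,
    BelowMinimum,
    /// Hypothetical: the last slot only accepts the full remaining amount.
    LastSlotReserved,
    /// Hypothetical: the decision deposit is already complete.
    AlreadyFunded,
}

pub struct DecisionDeposit {
    pub required: u128,
    pub min_contribution: u128,
    pub max_contributors: usize,
    /// (account id, locked amount) per contributor.
    pub deposits: Vec<(u64, u128)>,
}

impl DecisionDeposit {
    /// Amount still missing from the decision deposit.
    pub fn remaining(&self) -> u128 {
        let placed: u128 = self.deposits.iter().map(|(_, amount)| *amount).sum();
        self.required.saturating_sub(placed)
    }

    /// Try to contribute `offered`; returns the amount actually locked.
    pub fn contribute(&mut self, who: u64, offered: u128) -> Result<u128, ContributeError> {
        let remaining = self.remaining();
        if remaining == 0 {
            return Err(ContributeError::AlreadyFunded);
        }
        if self.deposits.len() >= self.max_contributors {
            return Err(ContributeError::MaxContributorsReached);
        }
        // Anything above the remaining amount is not locked.
        let locked = offered.min(remaining);
        let completes = locked == remaining;
        let is_last_slot = self.deposits.len() + 1 == self.max_contributors;
        if is_last_slot && !completes {
            return Err(ContributeError::LastSlotReserved);
        }
        if locked < self.min_contribution && !completes {
            return Err(ContributeError::BelowMinimum);
        }
        self.deposits.push((who, locked));
        Ok(locked)
    }
}
```

With these checks, dust contributions can fill at most `max_contributors - 1` slots, so a single account able to cover the remainder can always complete the deposit.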

Drawbacks

See performance section.

Testing, Security, and Privacy

This RFC opens up referenda to a griefing attack if improperly structured. It goes as follows: Alice opens a referendum, and Bob creates n = MaxContributors faux accounts and fills all contributor spots with dust contributions, ensuring the referendum never achieves its full decision deposit and in turn never makes it to the deciding phase.

One can avoid anything catastrophic by reserving the final contributor spot only for those contributing the remaining amount, but it is more difficult to keep antagonists from wasting contributor spots in general. A simple route is to make minimum contribution = decision deposit / max contributors; however, that might leave the barrier to contributing still too high. The tuning of this, or perhaps some unseen fix, is an open question.

Performance, Ergonomics, and Compatibility

Performance

The decision deposit field would take up potentially max_contributors times more storage/PoV. However, with the decision deposit being just a handful of bytes, this should be manageable. Similarly the kill and refund_decision_deposit extrinsics would become max_contributors times more compute intensive. All other metrics would be conserved or nominal.

Ergonomics

This RFC will make our referenda pipeline more ergonomic and open.

Compatibility

DApps would need to account for the new decision deposit structure and potentially the new extrinsic if they so choose. In addition, for runtime developers, a storage migration is necessary to convert the old ReferendumStatus to the new.

Prior Art and References

N/A

Unresolved Questions

N/A

Future Directions and Related Material

Find the current WIP here.

Considering Governance will soon be in a smart contracts environment, this change could be further augmented through contracts.


RFC-0152: Decentralized Convex-Preference Coretime Market for Polkadot

Start Date: 2025-06-30
Description: This RFC proposes a decentralized market mechanism for allocating Coretime on Polkadot, replacing the existing Dutch auction method (RFC17). The proposed model leverages convex preference interactions among agents, eliminating explicit bidding and centralized price determination. This ensures fairness, transparency, and decentralization.
Conflicts-With: RFC-0017
Authors: Diego Correa Tristain algoritmia@labormedia.cl

Summary

This RFC proposes a decentralized market mechanism for allocating Coretime on Polkadot, replacing the existing Dutch auction method (RFC17). The proposed model leverages convex preference interactions among agents, eliminating explicit bidding and centralized price determination. This ensures fairness, transparency, and decentralization.

Motivation

The current auction-based model (RFC17) presents critical issues:

  • Front-running and timing asymmetry: Actors with superior infrastructure or timing strategies possess unfair advantages.

  • Complexity and cognitive overhead: Auctions pose challenges for participant comprehension and effective engagement.

  • Resource hoarding and inefficiency: Auctions allow strategic actors to monopolize resources, restricting equitable participation.

The decentralized convex-preference model addresses these issues by facilitating asynchronous, equitable and transparent access before state coordination and deterministic verifiability during and after protocol consensus.

Stakeholders

The primary stakeholders are:

  • Parachain Teams & Developers
  • Governance Bodies (Polkadot Fellowship, Polkadot Governance, Technical Committees)
  • Core Developers & Runtime Engineers
  • Application Builders / Smart Contract Developers
  • End Users of Polkadot Ecosystem dApps
  • Token Holders & Investors
  • Researchers / Economists / Protocol Designers
  • Communication Hubs (e.g., The Kusamarian, Polkadot Forum Moderators, Ecosystem Ambassadors)

Explanation

Guide-Level Explanation

Agents participating in the Coretime market (such as parachains, parathreads, or smart contracts) declare two parameters:

  • Asset Holdings: Their initial allocation of Coretime and tokens (e.g., DOT).

  • Preference Parameter (α): A scalar value between 0 and 1 indicating their valuation preference between Coretime and tokens.

These parameters are recorded transparently on-chain. Transactions between agents are conducted through deterministic convex optimizations, ensuring local Pareto-optimal exchanges. A global equilibrium price naturally emerges from these local interactions without any centralized authority or external pricing mechanism (Tristain, 2024).

Reference-Level Explanation

Economic Model

Agents' preferences are represented using a Cobb-Douglas utility function:

$U_i(x, y) = x^{α_i} y^{1-α_i}$

where:

  • $x$ represents the quantity of Coretime.
  • $y$ represents the quantity of tokens.
  • $α_i \in [0,1]$ is the scalar preference parameter.
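To make the utility function concrete, the sketch below evaluates it together with the marginal rate of substitution, $\frac{\partial U_i/\partial x}{\partial U_i/\partial y} = \frac{α_i}{1-α_i}\cdot\frac{y}{x}$, which gives the rate at which an agent is locally willing to trade tokens for Coretime. The function names are hypothetical, and `f64` is used purely for illustration: an on-chain implementation would need deterministic fixed-point arithmetic, as discussed under Testing, Security, and Privacy.

```rust
/// Cobb-Douglas utility U(x, y) = x^alpha * y^(1 - alpha),
/// where x is Coretime, y is tokens, and alpha is in [0, 1].
pub fn utility(alpha: f64, x: f64, y: f64) -> f64 {
    x.powf(alpha) * y.powf(1.0 - alpha)
}

/// Marginal rate of substitution (alpha / (1 - alpha)) * (y / x):
/// tokens the agent would give up for one more unit of Coretime.
/// Only meaningful for alpha < 1 and x > 0.
pub fn marginal_rate_of_substitution(alpha: f64, x: f64, y: f64) -> f64 {
    (alpha / (1.0 - alpha)) * (y / x)
}
```

For example, an agent with α = 0.5 holding 4 units of Coretime and 8 tokens values one more unit of Coretime at about 2 tokens.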

Mechanism Implementation

Implementation involves the following components:

  1. Preference Declaration: Agents MUST explicitly register their scalar preference (α) and initial asset holdings on-chain.
  2. Interaction Module: A dedicated runtime pallet or smart contract SHOULD manage interactions, ensuring Pareto-optimal deterministic outcomes.
  3. Convergence Enforcement: Interaction ordering MUST follow a deterministic protocol prioritizing transactions significantly enhancing price convergence, sequencing from higher to lower exchange ratios.
  4. On-chain Verifiability: Transaction histories and convergence processes MUST be transparently auditable and verifiable on-chain.

Example Flow Diagram

Preference & Asset Declaration → Paired-exchange Convex Optimization → Interaction Ordering (High-to-Low Exchange Impact) → Global Price Convergence → On-chain Auditability

Drawbacks

Performance

  • Initial implementation complexity due to the introduction of a new runtime module.

User Experience

  • User education and UI development required for scalar preference parameter comprehension.

Governance Burden

  • Additional review and audit complexity due to innovative economic logic.

Testing, Security, and Privacy

The implementation of this decentralized convex-preference Coretime market mechanism demands particular care in maintaining determinism, accuracy, and security in all on-chain interactions. Key considerations include:

Precision and Determinism in Arithmetic

  • The proposed mechanism relies on convex optimization over continuous variables, which REQUIRES floating-point arithmetic or high-precision fixed-point alternatives.

  • To ensure deterministic behavior across all nodes, arithmetic operations MUST be implemented using deterministic libraries or Wasm-compatible fixed-point math, avoiding non-deterministic floating-point behavior across architectures.

  • Verifiability of Pareto-optimal outcomes across interactions MUST be reproducible and provable, potentially leveraging range-limited arithmetic or bounded rational approximations for optimization solvers.

Security

  • Preference declarations and asset holdings MUST be immutably recorded on-chain, subject to strict validation and input constraints to prevent manipulation.

  • The optimization process MUST prevent overflow, underflow, or division-by-zero attacks in edge-case preference combinations (e.g., α close to 0 or 1).

  • Any deterministic interaction ordering logic MUST be auditable and resistant to manipulation or reordering incentives by privileged actors.

Privacy

  • Although the model emphasizes transparency and verifiability, it MAY be beneficial in future iterations to support privacy-preserving preference commitment schemes (e.g., via homomorphic encryption or zero-knowledge commitments).

  • This MAY allow agents to express preferences without revealing them publicly, while still enabling fair participation and on-chain verification.

Testing and Recommendations

  • Simulation of multiple interacting agents with heterogeneous preferences and randomized initial allocations SHOULD be used to validate global convergence and equilibrium behavior.

  • Fuzz testing and symbolic execution SHOULD be applied to the interaction module to identify corner cases in the optimization pipeline.

  • Formal verification of convergence routines and boundedness of the optimization space is RECOMMENDED for high-assurance deployments.

Performance, Ergonomics, and Compatibility

This leads to a more fluid, computation-bound system where efficiency stems from algorithmic design and verification speed, not from externally imposed timing constraints. Compatibility with existing Substrate pallets can be explored through modular implementation.

Performance

The system's performance depends on the availability of computational resources, not on arbitrary time windows or rounds. Price discovery and convergence are calculated as fast as the system can process the deterministic interaction rules. Pair-wise interactions can be batched and accumulated asynchronously. This enhances real-time responsiveness while removing artificial scheduling constraints.

Ergonomics

Agents only need to express a simple scalar preference and their token/Coretime holdings, removing cognitive complexity. This lightweight interaction model improves usability, especially for smaller participants.

Compatibility

The mechanism is fully compatible with asynchronous execution architectures. Because it relies on deterministic local state transitions, it integrates seamlessly with Byzantine fault-tolerant consensus protocols and supports scalable, decentralized implementations.

Prior Art and References

RFC-1

Initial Forum Discussion (superseded) : Invitation to Critically Evaluate Core Time Pricing Model Framework

RFC Draft Proposal Preliminary Forum Thread: RFC: Decentralized Convex-Preference Coretime Market for Polkadot Draft

"Emergent Properties of Distributed Agents with Two-Stage Convex Zero-Sum Optimal Exchange Network": Tristain, 2024

Personally, I want to express special gratitude to Edmundo Beteta for introducing me to Microeconomics Theory and guiding my curiosity at the Faculty of Economics and Administration, Universidad de Chile.

Unresolved Questions

  • Optimal method for initial rollout (experimental sandbox vs. partial deployment on Polkadot).

  • OPTIONAL criteria and heuristics for deterministic interaction ordering.

  • Extend the model to support multi-asset allocations with additional priority mechanisms.

  • Apply similar decentralized convex-preference principles to broader decentralized resource allocation challenges (e.g. JAM, energy/resource coordination, price stabilization).

RFC-0154: AURA Multi-Slot Collation

Start Date: 25th of August 2025
Description: Multi-Slot AURA for System Parachains
Authors: bhargavbh, burdges, AlistairStewart

Summary

This RFC proposes a modification to the AURA round-robin block production mechanism for system parachains (e.g. Polkadot Hub). The proposed change increases the number of consecutive block production slots assigned to each collator from the current single-slot allocation to a configurable value, initially set at four. This modification aims to enhance censorship resistance by mitigating data-withholding attacks.

Motivation

The Polkadot Relay Chain guarantees the safety of parachain blocks, but it does not provide explicit guarantees for liveness or censorship resistance. With the planned migration of core Relay Chain functionalities (such as Balances, Staking, and Governance) to the Polkadot Hub system parachain in early November 2025, it becomes critical to establish a mechanism for achieving censorship resistance for these parachains without compromising throughput. For example, if governance functionality is migrated to Polkadot Hub, malicious collators could systematically censor aye votes for a Relay Chain runtime upgrade, potentially altering the referendum's outcome. This demonstrates that censorship attacks on a system parachain can have a direct and undesirable impact on the security of the Relay Chain. This proposal addresses such censorship vulnerabilities by modifying the AURA block production mechanism used by system parachain collators, with minimal honesty assumptions on the collators.

Stakeholders

  • Collators: Operators responsible for block production on the Polkadot Hub and other system parachains.
  • Users and Applications: Entities that interact with the Polkadot Hub or other system parachains.

Threat Model

This analysis of censorship resistance for AURA-based parachains operates under the following assumptions:

  • Collator Honesty: The model assumes the presence of at least one honest collator. We intentionally chose the most relaxed security assumption as collators are not slashable (unlike validators). Note that all system parachains use AURA via the Aura-Ext pallet.

  • Backer Honesty: The backer assigned to a block candidate is assumed to be honest. This is a reasonable assumption given 2/3rd honesty on the relay chain and that backers are assigned randomly by ELVES. Additionally, we assume that backers are responsible for disbursing the withheld block to the victim collators. Pre-PVFs can help improve the resilience of backers against DoS attacks: the pre-PVF lets backers check slot ownership, so backers can filter out spamming collators at this stage. However, pre-PVFs have not yet been implemented. The stronger assumption that backers disburse the block is only needed for efficiency and is not essential for censorship resistance itself (i.e. the collator can always reconstruct the block from the availability layer).

  • Availability Layer: We also assume that the availability layer is robust and a collator can fetch the latest parablock (header and body) directly from the availability layer (or the backer) in a reasonable time, i.e., <6s from backer and <18s from availability layer provided by ELVES.

  • Scope: We focus mainly on honest collators' ability to produce blocks and get them backed, rather than on censorship at the transaction level. Ideally, we want to achieve the property that honest collators eventually get their blocks backed even if there is a slight delay (and provide a provable bound on this delay).

Proposed Changes

The current AURA mechanism, which assigns a single block production slot per collator, is vulnerable to data-withholding attacks. A malicious collator can strategically produce a block and then selectively withhold it from subsequent collators. This can prevent honest collators from building their blocks in a timely manner, effectively censoring their block production.

Illustrative Attack Scenario:

Consider 3 collators A, B and C assigned to consecutive slots by the AURA mechanism. If A and C conspire to censor collator B, i.e., to prevent B's block from getting backed, they can execute the following attack: A produces block $b_A$ and submits it to the backers, but selectively withholds $b_A$ from B. Then C builds on top of $b_A$ and gets its block in before B can recover $b_A$ from the availability layer and build on top of it.

Proposed Solution

This proposal modifies the AURA round-robin mechanism to assign $x$ consecutive slots to each collator. The specific value of $x$ is contingent upon the asynchronous backing parameters of the system parachain and will be derived using a generic formula provided in this document. The collator selected by AURA will be responsible for producing $x$ consecutive blocks. This modification will require corresponding adjustments to the AURA authorship checks within the PVF (Parachain Validation Function). For the current configuration of Polkadot Hub, $x=4$.
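As a minimal sketch of what the adjusted authorship check could look like, assume the expected author is obtained by first dividing the slot number by $x$ and then applying the usual round-robin modulo over the collator set. The RFC does not spell out this formula, and the name `expected_author` is hypothetical.

```rust
/// Hypothetical multi-slot AURA author lookup (illustration only).
///
/// Classic AURA picks `authorities[slot % n]`. Assuming each collator owns
/// `slots_per_collator` consecutive slots, the slot is first grouped into
/// windows of that size and the window index is used for the round-robin.
pub fn expected_author<'a, A>(
    slot: u64,
    slots_per_collator: u64,
    authorities: &'a [A],
) -> Option<&'a A> {
    // No author can be derived from an empty set or a zero-width window.
    if authorities.is_empty() || slots_per_collator == 0 {
        return None;
    }
    let window = slot / slots_per_collator;
    let index = window % authorities.len() as u64;
    authorities.get(index as usize)
}
```

With three collators and `slots_per_collator = 4`, slots 0 to 3 map to the first collator, slots 4 to 7 to the second, and slot 12 wraps around to the first again. Setting `slots_per_collator = 1` reproduces the single-slot behaviour.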

Analysis

The number of consecutive slots to be assigned to ensure AURA's censorship resistance depends on Async Backing Parameters like unincluded_segment_length. We now describe our approach for deriving $x$ based on the parameters of async backing and other variables such as block production time and latency in the availability layer. The relevant values can then be plugged in to obtain $x$ for any system parachain.

Clearly, the number of consecutive slots $x$ in the round-robin is lower bounded by the time required to reconstruct the previous block from the availability layer ($b$) plus the block building time ($a$). Hence, we need to set $x$ such that $x\geq a+b$. But with async backing, a malicious collator can repeatedly withhold its block and just-in-time front-run the honest collator for every block of the unincluded segment. Hence, $x\geq (a+b)\cdot m$ is sufficient, where $m$ is the max allowed candidate depth (unincluded segment allowed).

Independently, there is a check on the relay chain which filters out parablocks anchoring to very old relay_parents in verify_backed_candidates. Any parablock which is anchored to a relay parent older than the oldest element in allowed_relay_parents gets rejected. Hence, the malicious collator cannot front-run and censor the subsequent collator after this delay, as the parablock is no longer valid. The update of allowed_relay_parents occurs in process_inherent_data, where the buffer length of AllowedRelayParents is set by the scheduler parameter lookahead (set to 3 by default). Therefore, the async_backing delay (asyncdelay) tolerated by the relay chain backers is $3*6s = 18s$. Hence, the number of consecutive slots is the minimum of the above two values:

$$x \geq \min\big((a+b)\cdot m,\; a + b + \text{asyncdelay}\big)$$

where $m$ is the max_candidate_depth (or unincluded segment as seen from the collator's perspective).

Number of consecutive slots for Polkadot Hub

Assuming the previous block data can be fetched from backers, we comfortably have $a+b \leq 6s$, i.e. block building plus reconstruction time takes at most 6s. Using the current asyncdelay of 18s, it suffices to set $x$ to 4. If the max_candidate_depth ($m$) for Polkadot Hub is set to $m\leq3$, then this reduces (improves) $x$ from 4 to $m$. Note that a channel would have to be provided for collators to fetch blocks from backers as the preferred option and only recover from the availability layer as the fail-safe option.
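The arithmetic above can be checked with a small sketch. It assumes that the bound is expressed in seconds and converted to slots by rounding up against the 6s slot duration, which is how the numbers in this section appear to be used. The function names are hypothetical.

```rust
/// asyncdelay tolerated by the relay chain: `lookahead` relay blocks.
pub fn async_delay_secs(lookahead: u32, relay_block_secs: u32) -> u32 {
    lookahead * relay_block_secs
}

/// Hypothetical helper evaluating x >= min((a+b)*m, a+b+asyncdelay).
///
/// `build_plus_recover_secs` is a+b, `max_candidate_depth` is m.
/// The result is a number of slots, assuming the time bound is rounded
/// up to whole slots of `slot_secs` seconds. Returns `None` for a zero
/// slot duration.
pub fn min_consecutive_slots(
    build_plus_recover_secs: u32,
    max_candidate_depth: u32,
    async_delay_secs: u32,
    slot_secs: u32,
) -> Option<u32> {
    if slot_secs == 0 {
        return None;
    }
    let depth_bound = build_plus_recover_secs * max_candidate_depth;
    let relay_bound = build_plus_recover_secs + async_delay_secs;
    let bound_secs = depth_bound.min(relay_bound);
    // Ceiling division: a partially used slot still counts as a slot.
    Some((bound_secs + slot_secs - 1) / slot_secs)
}
```

With $a+b = 6s$ and asyncdelay $= 18s$, any $m \geq 4$ yields 4 slots, while $m = 3$ yields 3 slots, matching the values quoted for Polkadot Hub.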

Performance, Ergonomics, and Compatibility

The proposed changes are security critical and mitigate censorship attacks on core functionality like balances, staking and governance on Polkadot Hub. This approach is compatible with Slot-Based collation and the currently deployed FixedVelocityConsensusHook. Further analysis is needed to integrate with custom ConsensusHooks that leverage Elastic Scaling.

Multi-slot collation is, however, vulnerable to liveness attacks: adversarial collators can stall the chain by not showing up, but they then also lose out on block production rewards. The number of missed blocks due to collators skipping is the same as in the current implementation; only the distribution of missed slots changes (they are chunked together instead of being evenly distributed). Secondly, when the ratio of adversarial (censoring) collators $\alpha$ is high (close to 1), the ratio of uncensored blocks to all blocks produced drops to $(1-\alpha)/(x\alpha)$. For more practical lower values of $\alpha<1/4$, the ratio of uncensored to all blocks is almost 1.

The latency for backing of blocks is affected as follows:

  • Censored Blocks: $(x-1)*6s$, compared to the blocks being indefinitely censored. $x$ is the number of consecutive slots per collator.
  • An adversarial collator not showing up can slow the chain by $x*6s$ instead of $6s$. This is however not an economically rational attack as there are incentives for collating paid retrospectively.

Effective multi-slot collation requires that collators be able to prioritize transactions that have been targeted for censorship. The implementation should incorporate a framework for priority transactions (e.g., governance votes, election extrinsics) to ensure that such transactions are included in the uncensored blocks.

Prior Art and References

This RFC is related to RFC-7, which details the selection mechanism for System Parachain Collators. In general, a more robust collator selection mechanism that reduces the proportion of malicious actors would directly benefit the effectiveness of the ideas presented in this RFC.

Future Directions

A resilient mechanism is needed for prioritising transactions in block production for collators that are actively targeted for censorship. There are two potential approaches:

  • One approach is to categorise which transactions or extrinsics are more likely to be censored and should be considered priority. This would allow an honest collator to maximize the utility of its consecutive block production slots and prioritise when building the uncensored block. While this is dependent on the specific parachain's functionality, a generic framework would be beneficial for runtime engineers to tag relevant transaction types. However, if there exist transactions which are cheap and high priority (e.g. a governance vote), this approach is not ideal as it lets an adversary spam the collators with cheap high-priority transactions.
  • Alternatively, one could design a robust tipping mechanism where transactions actively being censored would have to pay a higher tip to get themselves included. Even if the adversary initiates a bidding war, since 100% of the tip is forwarded to the collator, it only increases the revenue of the collator, further incentivising it to remain honest. A careful analysis of such an incentive mechanism is required; however, it is beyond the scope of this RFC.

RFC-XXXX: Adding customized mandatory context to proof of possession statement

Start Date: 20 May 2025
Description: Change SessionKeys runtime API to generate a customized ownership proof for each crypto type
Authors: Andrew Berger, Syed Hosseini

Summary

This RFC is an amendment to RFC-0048. It proposes changing OpaqueKeysInner::create_ownership_proof and OpaqueKeys::ownership_proof_is_valid to invoke generation and validation procedures specific to each crypto type. This enables different crypto schemes to implement a proof of possession that fits their security needs. In short, this RFC delegates the procedure of generating and validating proof of possession to the crypto schemes rather than dictating a uniform generation and verification procedure.

Motivation

Following RFC-0048, all submitted keys accompany a signature of the account_id by the same key, proving that the submitter knows the private key corresponding to the submitted key. However, a scheme should mandate a context-specific approach for generating proof of possession and a different context for signing anything else to prevent rogue key attacks [3]. While this is critical for schemes with aggregatable public keys, the other (non-aggregatable) crypto schemes opt for backward compatibility and accept signatures not prepended with mandatory context.

However, the current RFC does not allow using different API calls and procedures to generate proof of possession for different crypto schemes.

After this RFC, the procedure for generating and verifying proof of possession would be at the discretion of the crypto scheme itself, not deterministically tied to the way they sign other messages.

Stakeholders

  • Polkadot runtime implementors
  • Polkadot node implementors
  • Validator operators

Explanation

The RFC does not change the structure introduced by RFC-0048. The proof is a sequence of signatures:

type Proof = (Signature, Signature, ..);

However, each signature is generated by the crypto scheme instead of each private session key signing the account_id. By default, the following statement is signed by the crypto scheme:

"POP_"|account_id

The prefix could alert signers if they are misled into signing false proof of possession statements. More importantly, a new crypto scheme could specify a different structure for its proof of possession.
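To make the default statement concrete, here is a minimal sketch that builds the bytes to be signed. It assumes that `|` denotes byte concatenation and that `account_id` is passed in already encoded as raw bytes. The names `POP_CONTEXT`, `pop_statement` and `legacy_statement` are hypothetical and are not part of the runtime API.

```rust
/// Context prefix for proof-of-possession statements, as given in the text.
pub const POP_CONTEXT: &[u8] = b"POP_";

/// Statement signed under this proposal: "POP_" | account_id.
/// Assumes `account_id` is the already-encoded account identifier.
pub fn pop_statement(account_id: &[u8]) -> Vec<u8> {
    let mut statement = Vec::with_capacity(POP_CONTEXT.len() + account_id.len());
    statement.extend_from_slice(POP_CONTEXT);
    statement.extend_from_slice(account_id);
    statement
}

/// Statement signed under the original RFC-0048: the bare account_id.
pub fn legacy_statement(account_id: &[u8]) -> Vec<u8> {
    account_id.to_vec()
}
```

Because the two statements always differ by the prefix, a signature over a bare `account_id` cannot be replayed as a proof of possession under the prefixed scheme.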

Because RFC-0048 has not been deployed, the version of the SessionKeys could still be set to 1 as requested by RFC-0048.

Drawbacks

Each crypto scheme needs to implement explicit generate_proof_of_possession and verify_proof_of_possession runtime APIs in addition to its existing capabilities (sign, verify, etc.).

Testing, Security, and Privacy

The proof of possession for current crypto schemes is virtually identical to the one defined in RFC-0048. On the other hand, the changes proposed by this RFC allow the generation of secure proof of possession for BLS keys.

Performance, Ergonomics, and Compatibility

Performance

The performance is the same as the one discussed in RFC-0048.

Ergonomics

Separating the generation of proof of possession from signing allows a crypto scheme more freedom to implement proof of possession that is fitted to its needs.

Compatibility

The significant difference is that the proof of possession suggested by RFC-0048 signs:

account_id

whereas the current proposal changes the signed statement to:

"POP_"|account_id

for the current crypto schemes. However, future crypto schemes such as BLS, which are not bound to backward compatibility, could produce more sophisticated proof of possession.

Prior Art and References

This is a minor amendment to RFC-0048.

Unresolved Questions

None.

- [1] The Substrate implementation of proof-of-possession generation for all crypto schemes (current and experimental ones) is in Pull 6010.

- [2] Substrate implementation of RFC-0048, in which the implementation of OpaqueKeysInner::create_ownership_proof and OpaqueKeys::ownership_proof_is_valid should be modified to call the generate_proof_of_possession and verify_proof_of_possession runtime APIs instead of directly calling sign.

- [3] Ristenpart, T., & Yilek, S. (2007). The power of proofs-of-possession: Securing multiparty signatures against rogue-key attacks. In Annual International Conference on the Theory and Applications of Cryptographic Techniques (pp. 228–245). Springer.

RFC-1: Agile Coretime

Start Date: 30 June 2023
Description: Agile periodic-sale-based model for assigning Coretime on the Polkadot Ubiquitous Computer.
Authors: Gavin Wood

Summary

This proposes a periodic, sale-based method for assigning Polkadot Coretime, the analogue of "block space" within the Polkadot Network. The method takes into account the need for long-term capital expenditure planning for teams building on Polkadot, yet also provides a means to allow Polkadot to capture long-term value in the resource which it sells. It supports the possibility of building rich and dynamic secondary markets to optimize resource allocation and largely avoids the need for parameterization.

Motivation

Present System

The Polkadot Ubiquitous Computer, or just Polkadot UC, represents the public service provided by the Polkadot Network. It is a trust-free, WebAssembly-based, multicore, internet-native omnipresent virtual machine which is highly resilient to interference and corruption.

The present system of allocating the limited resources of the Polkadot Ubiquitous Computer is through a process known as parachain slot auctions. This is a parachain-centric paradigm whereby a single core is long-term allocated to a single parachain which itself implies a Substrate/Cumulus-based chain secured and connected via the Relay-chain. Slot auctions are on-chain candle auctions which proceed for several days and result in the core being assigned to the parachain for six months at a time up to 24 months in advance. Practically speaking, we only see two year periods being bid upon and leased.

Funds behind the bids made in the slot auctions are merely locked, they are not consumed or paid and become unlocked and returned to the bidder on expiry of the lease period. A means of sharing the deposit trustlessly known as a crowdloan is available allowing token holders to contribute to the overall deposit of a chain without any counterparty risk.

Problems

The present system is based on a model of one-core-per-parachain. This is a legacy interpretation of the Polkadot platform and is not a reflection of its present capabilities. By restricting ownership and usage to this model, more dynamic and resource-efficient means of utilizing the Polkadot Ubiquitous Computer are lost.

More specifically, it is impossible to lease out cores at anything less than six months, and apparently unrealistic to do so at anything less than two years. This removes the ability to dynamically manage the underlying resource, and generally experimentation, iteration and innovation suffer. It bakes into the platform an assumption of permanence for anything deployed into it and restricts the market's ability to find a more optimal allocation of the finite resource.

There is no ability to determine capital requirements for hosting a parachain beyond two years from the point of its initial deployment onto Polkadot. While it would be unreasonable to have perfect and indefinite cost predictions for any real-world platform, not having any clarity whatsoever beyond "market rates" two years hence can be a very off-putting prospect for teams to buy into.

However, quite possibly the most substantial problem is both a perceived and often real high barrier to entry of the Polkadot ecosystem. By forcing innovators to either raise seven-figure sums through investors or appeal to the wider token-holding community, Polkadot makes it difficult for a small band of innovators to deploy their technology into Polkadot. While not being actually permissioned, it is also far from the barrierless, permissionless ideal which an innovation platform such as Polkadot should be striving for.

Requirements

  1. The solution SHOULD provide an acceptable value-capture mechanism for the Polkadot network.
  2. The solution SHOULD allow parachains and other projects deployed on to the Polkadot UC to make long-term capital expenditure predictions for the cost of ongoing deployment.
  3. The solution SHOULD minimize the barriers to entry in the ecosystem.
  4. The solution SHOULD work well when the Polkadot UC has up to 1,000 cores.
  5. The solution SHOULD work when the number of cores which the Polkadot UC can support changes over time.
  6. The solution SHOULD facilitate the optimal allocation of work to cores of the Polkadot UC, including by facilitating the trade of regular core assignment at various intervals and for various spans.
  7. The solution SHOULD avoid creating additional dependencies on functionality which the Relay-chain need not strictly provide for the delivery of the Polkadot UC.

Furthermore, the design SHOULD be implementable and deployable in a timely fashion; three months from the acceptance of this RFC should not be unreasonable.

Stakeholders

Primary stakeholder sets are:

  • Protocol researchers and developers, largely represented by the Polkadot Fellowship and Parity Technologies' Engineering division.
  • Polkadot Parachain teams both present and future, and their users.
  • Polkadot DOT token holders.

Socialization:

The essentials of this proposal were presented at Polkadot Decoded 2023 Copenhagen on the Main Stage. A small amount of socialization at the Parachain Summit preceded it and some substantial discussion followed it. The Parity Ecosystem team is currently soliciting views from ecosystem teams who would be key stakeholders.

Explanation

Overview

Upon implementation of this proposal, the parachain-centric slot auctions and associated crowdloans cease. Instead, Coretime on the Polkadot UC is sold by the Polkadot System in two separate formats: Bulk Coretime and Instantaneous Coretime.

When a Polkadot Core is utilized, we say it is dedicated to a Task rather than a "parachain". The Task to which a Core is dedicated may change at every Relay-chain block and while one predominant type of Task is to secure a Cumulus-based blockchain (i.e. a parachain), other types of Tasks are envisioned.

Bulk Coretime is sold periodically on a specialised system chain known as the Coretime-chain and allocated in advance of its usage, whereas Instantaneous Coretime is sold on the Relay-chain immediately prior to usage on a block-by-block basis.

This proposal does not fix what should be done with revenue from sales of Coretime and leaves it for a further RFC process.

Owners of Bulk Coretime are tracked on the Coretime-chain and the ownership status and properties of the owned Coretime are exposed over XCM as a non-fungible asset.

At the request of the owner, the Coretime-chain allows a single Bulk Coretime asset, known as a Region, to be used in various ways including transferal to another owner, allocated to a particular task (e.g. a parachain) or placed in the Instantaneous Coretime Pool. Regions can also be split out, either into non-overlapping sub-spans or exactly-overlapping spans with less regularity.

The Coretime-chain periodically instructs the Relay-chain to assign its cores to alternative tasks as and when Core allocations change due to new Regions coming into effect.

Renewal and Migration

There is a renewal system which allows a Bulk Coretime assignment of a single core to be renewed unchanged with a known price increase from month to month. Renewals are processed in a period prior to regular purchases, effectively giving them precedence over a fixed number of cores available.

Renewals are only enabled when a core's assignment does not include an Instantaneous Coretime allocation and has not been split into shorter segments.

Thus, renewals are designed to ensure only that committed parachains get some guarantees about price for predicting future costs. This price-capped renewal system only allows cores to be reused for their same tasks from month to month. In any other context, Bulk Coretime would need to be purchased regularly.

As a migration mechanism, pre-existing leases (from the legacy lease/slots/crowdloan framework) are initialized into the Coretime-chain and cores assigned to them prior to Bulk Coretime sales. In the sale where the lease expires, the system offers a renewal, as above, to allow a priority sale of Bulk Coretime and ensure that the Parachain suffers no downtime when transitioning from the legacy framework.

Instantaneous Coretime

Processing of Instantaneous Coretime happens in part on the Polkadot Relay-chain. Credit is purchased on the Coretime-chain for regular DOT tokens, and this results in a DOT-denominated Instantaneous Coretime Credit account on the Relay-chain being credited for the same amount.

Though the Instantaneous Coretime Credit account records a balance for an account identifier (very likely controlled by a collator), it is non-transferable and non-refundable. It can only be consumed in order to purchase some Instantaneous Coretime with immediate availability.

The Relay-chain reports this usage back to the Coretime-chain in order to allow it to reward the providers of the underlying Coretime, either the Polkadot System or owners of Bulk Coretime who contributed to the Instantaneous Coretime Pool.

Specifically the Relay-chain is expected to be responsible for:

  • holding non-transferable, non-refundable DOT-denominated Instantaneous Coretime Credit balance information.
  • setting and adjusting the price of Instantaneous Coretime based on usage.
  • allowing collators to consume their Instantaneous Coretime Credit at the current pricing in exchange for the ability to schedule one PoV for near-immediate usage.
  • ensuring the Coretime-chain has timely accounting information on Instantaneous Coretime Sales revenue.
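The credit-related responsibilities listed above can be sketched as a small in-memory ledger. This is an illustration under stated assumptions, not the Relay-chain implementation: `CreditLedger`, `AccountId`, `Balance` and the `revenue` counter are hypothetical names, and price adjustment is left out. The ledger deliberately exposes no transfer or refund operation, mirroring the non-transferable, non-refundable nature of the credit.

```rust
use std::collections::HashMap;

pub type AccountId = u64; // hypothetical account identifier
pub type Balance = u128; // DOT-denominated credit amount

#[derive(Debug, PartialEq, Eq)]
pub enum CreditError {
    InsufficientCredit,
}

/// Hypothetical ledger of Instantaneous Coretime Credit.
#[derive(Default)]
pub struct CreditLedger {
    balances: HashMap<AccountId, Balance>,
    /// Revenue accumulated from consumed credit, to be reported back
    /// to the Coretime-chain.
    pub revenue: Balance,
}

impl CreditLedger {
    /// Record credit purchased on the Coretime-chain for `who`.
    pub fn credit(&mut self, who: AccountId, amount: Balance) {
        let balance = self.balances.entry(who).or_insert(0);
        *balance = balance.saturating_add(amount);
    }

    /// Current credit balance of `who` (zero if unknown).
    pub fn balance(&self, who: AccountId) -> Balance {
        self.balances.get(&who).copied().unwrap_or(0)
    }

    /// Consume credit at the current `price` to schedule one PoV.
    pub fn consume(&mut self, who: AccountId, price: Balance) -> Result<(), CreditError> {
        let balance = self
            .balances
            .get_mut(&who)
            .ok_or(CreditError::InsufficientCredit)?;
        if *balance < price {
            return Err(CreditError::InsufficientCredit);
        }
        *balance -= price;
        self.revenue = self.revenue.saturating_add(price);
        Ok(())
    }
}
```

A collator credited with 100 units can consume 30 at the current price, leaving 70, while an attempt to consume more than the remaining balance is rejected without changing any state.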

Coretime-chain

The Coretime-chain is a new system parachain. It has the responsibility of providing the Relay-chain via UMP with information of:

  • The number of cores which should be made available.
  • Which tasks should be running on which cores and in what ratios.
  • Accounting information for Instantaneous Coretime Credit.

It also expects information from the Relay-chain via DMP:

  • The number of cores available to be scheduled.
  • Account information on Instantaneous Coretime Sales.

The specific interface is properly described in RFC-5.

Detail

Parameters

This proposal includes a number of parameters which need not necessarily be fixed. Their usage is explained below, but their values are suggested or specified in the later section Parameter Values.

Reservations and Leases

The Coretime-chain includes some governance-set reservations of Coretime; these cover every System-chain. Additionally, governance is expected to initialize details of the pre-existing leased chains.

Regions

A Region is an assignable period of Coretime with a known regularity.

All Regions are associated with a unique Core Index, which identifies the core whose assignment ownership of the Region controls.

All Regions are also associated with a Core Mask, an 80-bit bitmap, to denote the regularity at which it may be scheduled on the core. If all bits are set in the Core Mask value, it is said to be Complete. 80 is selected since this results in the size of the datatype used to identify any Region of Polkadot Coretime to be a very convenient 128-bit. Additionally, if TIMESLICE (the number of Relay-chain blocks in a Timeslice) is 80, then a single bit in the Core Mask bitmap represents exactly one Core for one Relay-chain block in one Timeslice.

All Regions have a span. Region spans are quantized into periods of TIMESLICE blocks; BULK_PERIOD is a whole multiple of TIMESLICE.

The Timeslice type is a u32 which can be multiplied by TIMESLICE to give a BlockNumber value representing the same quantity in terms of Relay-chain blocks.
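The size argument above can be illustrated with a short sketch: a `u32` Timeslice, an 80-bit Core Mask and a 16-bit Core Index pack into exactly 128 bits. The 16-bit width of the Core Index, the field order, the value `TIMESLICE = 80` and all type and function names here are assumptions made for illustration.

```rust
/// Assumed number of Relay-chain blocks per Timeslice (illustrative value).
pub const TIMESLICE: u32 = 80;

/// 80-bit Core Mask stored as 10 bytes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CoreMask(pub [u8; 10]);

impl CoreMask {
    /// Mask with all 80 bits set.
    pub const fn complete() -> Self {
        CoreMask([0xFF; 10])
    }
    /// Mask with no bits set.
    pub const fn void() -> Self {
        CoreMask([0; 10])
    }
    /// A mask is Complete when every bit is set.
    pub fn is_complete(&self) -> bool {
        self.0.iter().all(|byte| *byte == 0xFF)
    }
    /// Number of set bits, i.e. the share of the core out of 80 parts.
    pub fn count_ones(&self) -> u32 {
        self.0.iter().map(|byte| byte.count_ones()).sum()
    }
}

/// Hypothetical Region identifier: 32 + 16 + 80 = 128 bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RegionId {
    pub begin: u32, // Timeslice at which the Region begins
    pub core: u16,  // Core Index (width assumed)
    pub mask: CoreMask,
}

impl RegionId {
    /// Pack the identifier into a single u128: begin | core | mask.
    pub fn to_u128(&self) -> u128 {
        let mut mask_bits: u128 = 0;
        for byte in self.mask.0 {
            mask_bits = (mask_bits << 8) | byte as u128;
        }
        ((self.begin as u128) << 96) | ((self.core as u128) << 80) | mask_bits
    }
}

/// Convert a Timeslice into a Relay-chain block number, if it fits in u32.
pub fn timeslice_to_block(timeslice: u32) -> Option<u32> {
    timeslice.checked_mul(TIMESLICE)
}
```

A Complete mask reports 80 set bits, and packing places `begin` in the top 32 bits, `core` in the next 16 and the mask in the low 80 bits.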

Regions can be tasked to a TaskId (aka ParaId) or pooled into the Instantaneous Coretime Pool. This process can be Provisional or Final. If done only provisionally or not at all then they are fresh and have an Owner which is able to manipulate them further including reassignment. Once Final, then all ownership information is discarded and they cannot be manipulated further. Renewal is not possible when only provisionally tasked/pooled.

Bulk Sales

A sale of Bulk Coretime occurs on the Coretime-chain every BULK_PERIOD blocks.

In every sale, a BULK_LIMIT of individual Regions are offered for sale.

Each Region offered for sale has a different Core Index, ensuring that they each represent an independently allocatable resource on the Polkadot UC.

The Regions offered for sale have the same span: they last exactly BULK_PERIOD blocks, and begin immediately following the span of the previous Sale's Regions. The Regions offered for sale also have the complete, non-interlaced, Core Mask.

The Sale Period ends as soon as the span of the Coretime Regions being sold begins. At this point, the next Sale Price is set according to the previous Sale Price together with the number of Regions sold compared to the desired and maximum number of Regions to be sold. See Price Setting for additional detail on this point.

Following the end of the previous Sale Period, there is an Interlude Period lasting INTERLUDE_PERIOD of blocks. After this period is elapsed, regular purchasing begins with the Purchasing Period.

This is designed to give at least two weeks worth of time for the purchased regions to be partitioned, interlaced, traded and allocated.

The Interlude

The Interlude period is a period prior to Regular Purchasing where renewals are allowed to happen. This has the effect of ensuring existing long-term tasks/parachains have a chance to secure their Bulk Coretime for a well-known price prior to general sales.

Regular Purchasing

Any account may purchase Regions of Bulk Coretime if they have the appropriate funds in place during the Purchasing Period, which is from INTERLUDE_PERIOD blocks after the end of the previous sale until the beginning of the Region of the Bulk Coretime which is for sale as long as there are Regions of Bulk Coretime left for sale (i.e. no more than BULK_LIMIT have already been sold in the Bulk Coretime Sale). The Purchasing Period is thus roughly BULK_PERIOD - INTERLUDE_PERIOD blocks in length.

The Sale Price varies during an initial portion of the Purchasing Period called the Leadin Period and then stays stable for the remainder. This initial portion is LEADIN_PERIOD blocks in duration. During the Leadin Period the price decreases towards the Sale Price, which it lands at by the end of the Leadin Period. The actual curve by which the price starts and descends to the Sale Price is outside the scope of this RFC, though a basic suggestion is provided in the Price Setting Notes, below.
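The timeline described above can be sketched as a function from a block offset (counted from the end of the previous Sale Period) to a phase and a price. The linear descent from twice the Sale Price used here is purely an illustrative assumption, since the actual curve is outside the scope of this RFC. `SaleConfig`, `SalePhase`, `phase_at` and `price_at` are hypothetical names.

```rust
/// Hypothetical sale parameters, all measured in blocks.
pub struct SaleConfig {
    pub bulk_period: u32,
    pub interlude_period: u32,
    pub leadin_period: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SalePhase {
    Interlude, // renewals only
    Leadin,    // regular purchasing, price descending
    Stable,    // regular purchasing at the Sale Price
    Over,      // the Regions' span has begun
}

/// Phase at `offset` blocks after the end of the previous Sale Period.
pub fn phase_at(cfg: &SaleConfig, offset: u32) -> SalePhase {
    let leadin_end = cfg.interlude_period.saturating_add(cfg.leadin_period);
    if offset >= cfg.bulk_period {
        SalePhase::Over
    } else if offset < cfg.interlude_period {
        SalePhase::Interlude
    } else if offset < leadin_end {
        SalePhase::Leadin
    } else {
        SalePhase::Stable
    }
}

/// Regular purchase price at `offset`, or `None` when purchasing is closed.
/// Assumption: the Leadin price falls linearly from 2x the Sale Price.
pub fn price_at(cfg: &SaleConfig, sale_price: u128, offset: u32) -> Option<u128> {
    match phase_at(cfg, offset) {
        SalePhase::Interlude | SalePhase::Over => None,
        SalePhase::Stable => Some(sale_price),
        SalePhase::Leadin => {
            let elapsed = offset - cfg.interlude_period;
            let remaining = (cfg.leadin_period - elapsed) as u128;
            let premium = sale_price * remaining / cfg.leadin_period as u128;
            Some(sale_price + premium)
        }
    }
}
```

With `bulk_period = 100`, `interlude_period = 10` and `leadin_period = 20`, purchasing opens at offset 10 at twice the Sale Price, reaches 1.5 times the Sale Price at offset 20 and settles at the Sale Price from offset 30 onwards.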

Renewals

At any time when there are remaining Regions of Bulk Coretime to be sold, including during the Interlude Period, certain Bulk Coretime assignments may be Renewed. This is similar to a purchase in that funds must be paid and it consumes one of the Regions of Bulk Coretime which would otherwise be placed for purchase. However, there are two key differences.

Firstly, the price paid is the minimum of RENEWAL_PRICE_CAP more than what the purchase/renewal price was in the previous renewal and the current (or initial, if yet to begin) regular Sale Price.

Secondly, the purchased Region comes preassigned with exactly the same workload as before. It cannot be traded, repartitioned, interlaced or exchanged. As such unlike regular purchasing the Region never has an owner.

Renewal is only possible for cores which have been assigned as a result of a previous renewal, which are migrating from legacy slot leases, or which fill their Bulk Coretime with an unsegmented, fully and finally assigned workload which does not include placement in the Instantaneous Coretime Pool. The renewed workload will be the same as this initial workload.

Manipulation

Regions may be manipulated in various ways by their owner:

  1. Transferred in ownership.
  2. Partitioned into quantized, non-overlapping segments of Bulk Coretime with the same ownership.
  3. Interlaced into multiple Regions over the same period whose eventual assignments take turns to be scheduled.
  4. Assigned to a single, specific task (identified by TaskId aka ParaId). This may be either provisional or final.
  5. Pooled into the Instantaneous Coretime Pool, in return for a pro-rata amount of the revenue from the Instantaneous Coretime Sales over its period.

Enactment

Specific functions of the Coretime-chain

Several functions of the Coretime-chain SHALL be exposed through dispatchables and/or a nonfungible trait implementation integrated into XCM:

1. transfer

Regions may have their ownership transferred.

A transfer(region: RegionId, new_owner: AccountId) dispatchable shall have the effect of altering the current owner of the Region identified by region from the signed origin to new_owner.

An implementation of the nonfungible trait SHOULD include equivalent functionality. RegionId SHOULD be used for the AssetInstance value.

2. partition

Regions may be split apart into two non-overlapping interior Regions of the same Core Mask which together concatenate to the original Region.

A partition(region: RegionId, pivot: Timeslice) dispatchable SHALL have the effect of removing the Region identified by region and adding two new Regions of the same owner and Core Mask. One new Region will begin at the same point of the old Region but end at pivot timeslices into the Region, whereas the other will begin at this point and end at the end point of the original Region.

Also:

  • owner field of region must be equal to the Signed origin.
  • pivot must equal neither the begin nor end fields of the region.
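The partition rules above can be sketched as a pure function. This is a simplified illustration, not the Coretime-chain implementation: the flattened `Region` struct, the `String` stand-in for `AccountId`, and the reading of `pivot` as an absolute timeslice (rather than an offset into the Region) are all assumptions.

```rust
pub type Timeslice = u32;
pub type CoreMask = [u8; 10];

/// Hypothetical flattened view of a Region (key and record combined).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Region {
    pub begin: Timeslice,
    pub end: Timeslice,
    pub mask: CoreMask,
    /// Stand-in for `AccountId`.
    pub owner: String,
}

/// Split `region` at `pivot`, returning the two new Regions, or `None` if
/// the caller is not the owner or `pivot` is not strictly inside the span.
pub fn partition(region: &Region, caller: &str, pivot: Timeslice) -> Option<(Region, Region)> {
    if region.owner != caller {
        return None;
    }
    if pivot <= region.begin || pivot >= region.end {
        return None;
    }
    // Both halves keep the owner and Core Mask; only the span changes.
    let first = Region { end: pivot, ..region.clone() };
    let second = Region { begin: pivot, ..region.clone() };
    Some((first, second))
}
```

For a Region spanning timeslices 100 to 200, a pivot of 150 yields one Region over 100 to 150 and another over 150 to 200.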

3. interlace

Regions may be decomposed into two Regions of the same span whose eventual assignments take turns on the core by virtue of having complementary Core Masks.

An interlace(region: RegionId, mask: CoreMask) dispatchable shall have the effect of removing the Region identified by region and creating two new Regions. The new Regions will each have the same span and owner of the original Region, but one Region will have a Core Mask equal to mask and the other will have Core Mask equal to the XOR of mask and the Core Mask of the original Region.

Also:

  • owner field of region must be equal to the Signed origin.
  • mask must have some bits set AND must not equal the Core Mask of the old Region AND must only have bits set which are also set in the old Region's Core Mask.
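The mask conditions and the XOR rule can be sketched on the 80-bit `CoreMask` type used later in this RFC. The `interlace` function here is a hypothetical helper that operates on masks only; ownership checks and storage updates are left out.

```rust
pub type CoreMask = [u8; 10];

/// Return the Core Masks of the two Regions produced by interlacing, or
/// `None` if `mask` violates one of the three conditions.
pub fn interlace(original: CoreMask, mask: CoreMask) -> Option<(CoreMask, CoreMask)> {
    let is_empty = mask.iter().all(|byte| *byte == 0);
    // Every bit set in `mask` must also be set in `original`.
    let is_subset = mask
        .iter()
        .zip(original.iter())
        .all(|(m, o)| (*m & !*o) == 0);
    if is_empty || mask == original || !is_subset {
        return None;
    }
    // The second Region receives the bits of `original` not taken by `mask`.
    let mut complement = [0u8; 10];
    for i in 0..complement.len() {
        complement[i] = original[i] ^ mask[i];
    }
    Some((mask, complement))
}
```

Because `mask` is a strict, non-empty subset of the original Core Mask, the XOR is always non-empty and the two results never overlap.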

4. assign

Regions may be assigned to a core.

An assign(region: RegionId, target: TaskId, finality: Finality) dispatchable shall have the effect of placing an item in the workplan corresponding to the region's properties and assigned to the target task.

If the region's end has already passed (taking into account any advance notice requirements) then this operation is a no-op. If the region's beginning has already passed, then it is effectively altered to become the next schedulable timeslice.
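The trimming rule can be sketched as follows. The function name and the `next_schedulable` parameter (the first timeslice that can still be scheduled once advance notice is accounted for) are assumptions made for illustration.

```rust
pub type Timeslice = u32;

/// Return the span that would actually be placed in the workplan, or `None`
/// when the whole Region has already passed and the operation is a no-op.
pub fn effective_span(
    begin: Timeslice,
    end: Timeslice,
    next_schedulable: Timeslice,
) -> Option<(Timeslice, Timeslice)> {
    if end <= next_schedulable {
        return None;
    }
    // A Region whose beginning has passed is trimmed to start at the next
    // schedulable timeslice.
    Some((begin.max(next_schedulable), end))
}
```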

finality may have the value of either Final or Provisional. If Final, then the operation is free, the region record is removed entirely from storage and renewal may be possible: if the Region's span is the entire BULK_PERIOD, then the Coretime-chain records in storage that the allocation happened during this period in order to facilitate the possibility for a renewal. (Renewal only becomes possible when the full Core Mask of a core is finally assigned for the full BULK_PERIOD.)

Also:

  • owner field of region must be equal to the Signed origin.

5. pool

Regions may be consumed in exchange for a pro rata portion of the Instantaneous Coretime Sales Revenue from its period and regularity.

A pool(region: RegionId, beneficiary: AccountId, finality: Finality) dispatchable shall have the effect of placing an item in the workplan corresponding to the region's properties and assigned to the Instantaneous Coretime Pool. The details of the region will be recorded in order to allow for a pro rata share of the Instantaneous Coretime Sales Revenue at the time of the Region relative to any other providers in the Pool.

If the region's end has already passed (taking into account any advance notice requirements) then this operation is a no-op. If the region's beginning has already passed, then it is effectively altered to become the next schedulable timeslice.

finality may have the value of either Final or Provisional. If Final, then the operation is free and the region record is removed entirely from storage.

Also:

  • owner field of region must be equal to the Signed origin.

6. Purchases

A dispatchable purchase(price_limit: Balance) shall be provided. Any account may call purchase to purchase Bulk Coretime at the maximum price of price_limit.

This may be called successfully only:

  1. during the regular Purchasing Period;
  2. when the caller is a Signed origin and their account balance is reducible by the current sale price;
  3. when the current sale price is no greater than price_limit; and
  4. when the number of cores already sold is less than BULK_LIMIT.

If successful, the caller's account balance is reduced by the current sale price and a new Region item for the following Bulk Coretime span is issued with the owner equal to the caller's account.
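The four conditions and the resulting state change can be sketched as below. The `Sale` struct, the `PurchaseError` variants and the representation of the caller's balance as a plain integer are assumptions for illustration; issuing the new Region item is omitted.

```rust
pub type Balance = u128;

#[derive(Debug, PartialEq, Eq)]
pub enum PurchaseError {
    NotInPurchasingPeriod,
    InsufficientFunds,
    PriceAboveLimit,
    SoldOut,
}

/// Hypothetical snapshot of the sale state needed to validate a purchase.
pub struct Sale {
    pub in_purchasing_period: bool,
    pub current_price: Balance,
    pub cores_sold: u16,
    pub bulk_limit: u16,
}

/// Check the conditions in order and, on success, charge the caller and
/// count the core as sold. Returns the price paid.
pub fn purchase(
    sale: &mut Sale,
    caller_balance: &mut Balance,
    price_limit: Balance,
) -> Result<Balance, PurchaseError> {
    if !sale.in_purchasing_period {
        return Err(PurchaseError::NotInPurchasingPeriod);
    }
    if *caller_balance < sale.current_price {
        return Err(PurchaseError::InsufficientFunds);
    }
    if sale.current_price > price_limit {
        return Err(PurchaseError::PriceAboveLimit);
    }
    if sale.cores_sold >= sale.bulk_limit {
        return Err(PurchaseError::SoldOut);
    }
    *caller_balance -= sale.current_price;
    sale.cores_sold += 1;
    Ok(sale.current_price)
}
```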

7. Renewals

A dispatchable renew(core: CoreIndex) shall be provided. Any account may call renew to purchase Bulk Coretime and renew an active allocation for the given core.

This may be called during the Interlude Period as well as the regular Purchasing Period and has the same effect as purchase followed by assign, except that:

  1. The price of the sale is the Renewal Price (see next).
  2. The Region is allocated exactly as the given core is currently allocated for the present Region.

Renewal is only valid where a Region's span is assigned to Tasks (not placed in the Instantaneous Coretime Pool) for the entire unsplit BULK_PERIOD over all of the Core Mask and with Finality. There are thus three possibilities of a renewal being allowed:

  1. Purchased unsplit Coretime with final assignment to tasks over the full Core Mask.
  2. Renewed Coretime.
  3. A legacy lease which is ending.

Renewal Price

The Renewal Price is the minimum of the current regular Sale Price (or the initial Sale Price if in the Interlude Period) and:

  • If the workload being renewed came to be through the Purchase and Assignment of Bulk Coretime, then the price paid during that Purchase operation.
  • If the workload being renewed was previously renewed, then the price paid during this previous Renewal operation plus RENEWAL_PRICE_CAP.
  • If the workload being renewed is a migration from a legacy slot auction lease, then the nominal price for a Regular Purchase (outside of the Lead-in Period) of the Sale during which the legacy lease expires.
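The three cases above can be sketched as a single function. This is an illustration under stated assumptions: the `RenewalOrigin` enum is hypothetical, and "plus RENEWAL_PRICE_CAP" is read as adding that proportion (2% with the suggested value) of the previously paid price.

```rust
pub type Balance = u128;

/// Suggested value from the parameter table, expressed in percent.
pub const RENEWAL_PRICE_CAP_PERCENT: Balance = 2;

/// Hypothetical description of how the workload being renewed came to exist.
pub enum RenewalOrigin {
    /// Purchased and then assigned; carries the price paid at purchase.
    Purchased(Balance),
    /// Previously renewed; carries the price paid at that renewal.
    Renewed(Balance),
    /// Migrating from a legacy lease; carries the nominal regular price of
    /// the sale during which the lease expires.
    LegacyLease(Balance),
}

/// The Renewal Price: the lesser of the case-specific price and the current
/// (or initial, during the Interlude Period) regular Sale Price.
pub fn renewal_price(origin: RenewalOrigin, sale_price: Balance) -> Balance {
    let candidate = match origin {
        RenewalOrigin::Purchased(paid) => paid,
        RenewalOrigin::Renewed(paid) => paid + paid * RENEWAL_PRICE_CAP_PERCENT / 100,
        RenewalOrigin::LegacyLease(nominal) => nominal,
    };
    candidate.min(sale_price)
}
```

Under this reading, a workload renewed at 1,000 in the previous period pays at most 1,020 in the next one, and less if the regular Sale Price has fallen below that.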

8. Instantaneous Coretime Credits

A dispatchable purchase_credit(amount: Balance, beneficiary: RelayChainAccountId) shall be provided. Any account with at least amount spendable funds may call this. This increases the Instantaneous Coretime Credit balance on the Relay-chain of the beneficiary by the given amount.

This Credit is consumable on the Relay-chain as part of the Task scheduling system and its specifics are out of the scope of this proposal. When consumed, revenue is recorded and provided to the Coretime-chain for proper distribution. The API for doing this is specified in RFC-5.

Notes on the Instantaneous Coretime Market

For an efficient market to form around the provision of Bulk-purchased Cores into the pool of cores available for Instantaneous Coretime purchase, it is crucial to ensure that price changes for the purchase of Instantaneous Coretime are reflected well in the revenues of private Coretime providers during the same period.

To ensure this, it is crucial that Instantaneous Coretime, once purchased, cannot be held indefinitely prior to eventual use. If this were the case, a nefarious collator could purchase Coretime when cheap and utilize it some time later when expensive, depriving private Coretime providers of their revenue.

It must therefore be assumed that Instantaneous Coretime, once purchased, has a definite and short "shelf-life", after which it becomes unusable. This incentivizes collators to avoid purchasing Coretime unless they expect to utilize it imminently and thus helps create an efficient market-feedback mechanism whereby a higher price will actually result in material revenues for private Coretime providers who contribute to the pool of Cores available to service Instantaneous Coretime purchases.

Notes on Economics

The specific pricing mechanisms are out of scope for the present proposal. Proposals on economics should be properly described and discussed in another RFC. However, for the sake of completeness, I provide some basic illustration of how price setting could potentially work.

Bulk Price Progression

The present proposal assumes the existence of a price-setting mechanism which takes into account several parameters:

  • OLD_PRICE: The price of the previous sale.
  • BULK_TARGET: the target number of cores to be purchased as Bulk Coretime Regions or renewed during the previous sale.
  • BULK_LIMIT: the maximum number of cores which could have been purchased/renewed during the previous sale.
  • CORES_SOLD: the actual number of cores purchased/renewed in the previous sale.
  • SELLOUT_PRICE: the price at which the most recent Bulk Coretime was purchased (not renewed) prior to selling more cores than BULK_TARGET (or immediately after, if none were purchased before). This may not have a value if no Bulk Coretime was purchased.

In general we would expect the price to increase the closer CORES_SOLD gets to BULK_LIMIT and to decrease the closer it gets to zero. If it is exactly equal to BULK_TARGET, then we would expect the price to remain the same.

In the edge case that no cores were purchased yet more cores were sold (through renewals) than the target, then we would also avoid altering the price.

A simple example of this would be the formula:

IF SELLOUT_PRICE == NULL AND CORES_SOLD > BULK_TARGET THEN
    RETURN OLD_PRICE
END IF
EFFECTIVE_PRICE := IF CORES_SOLD > BULK_TARGET THEN
    SELLOUT_PRICE
ELSE
    OLD_PRICE
END IF
NEW_PRICE := IF CORES_SOLD < BULK_TARGET THEN
    EFFECTIVE_PRICE * MAX(CORES_SOLD, 1) / BULK_TARGET
ELSE
    EFFECTIVE_PRICE + EFFECTIVE_PRICE *
        (CORES_SOLD - BULK_TARGET) / (BULK_LIMIT - BULK_TARGET)
END IF

This exists only as a trivial example to demonstrate that a basic solution exists, and is not intended as a concrete proposal.
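For readers who prefer code, the same illustrative formula can be transcribed into Rust. Integer arithmetic and the function name are assumptions of this sketch, and it requires `bulk_limit > bulk_target` to avoid a division by zero.

```rust
pub type Balance = u128;

/// Transcription of the example formula. `sellout_price` is `None` when no
/// Bulk Coretime was purchased (as opposed to renewed) in the previous sale.
pub fn next_bulk_price(
    old_price: Balance,
    sellout_price: Option<Balance>,
    cores_sold: Balance,
    bulk_target: Balance,
    bulk_limit: Balance,
) -> Balance {
    // Edge case: the target was exceeded through renewals alone.
    if sellout_price.is_none() && cores_sold > bulk_target {
        return old_price;
    }
    let effective_price = if cores_sold > bulk_target {
        // Always `Some` here because of the early return above.
        sellout_price.unwrap_or(old_price)
    } else {
        old_price
    };
    if cores_sold < bulk_target {
        effective_price * cores_sold.max(1) / bulk_target
    } else {
        effective_price
            + effective_price * (cores_sold - bulk_target) / (bulk_limit - bulk_target)
    }
}
```

With the suggested BULK_TARGET of 30 and BULK_LIMIT of 45, selling exactly 30 cores leaves the price unchanged, selling 15 halves it, and selling all 45 doubles the sellout price.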

Intra-Leadin Price-decrease

During the Leadin Period of a sale, the effective price starts higher than the Sale Price and falls to end at the Sale Price at the end of the Leadin Period. The price can thus be defined as the Sale Price multiplied by a simple factor no less than one. A function which returns this factor would accept a fraction between zero and one specifying the portion of the Leadin Period which has passed.

Thus, given SALE_PRICE, we can define PRICE as:

PRICE := SALE_PRICE * FACTOR((NOW - LEADIN_BEGIN) / LEADIN_PERIOD)

We can define a very simple progression where the price decreases monotonically from double the Sale Price at the beginning of the Leadin Period.

FACTOR(T) := 2 - T
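A minimal sketch of this progression in integer arithmetic follows. The function name is hypothetical, and the clamping of `now` to the Leadin Period is an assumption so that the function returns the Sale Price once the Leadin Period has ended; `leadin_period` must be non-zero.

```rust
pub type Balance = u128;
pub type BlockNumber = u32;

/// PRICE := SALE_PRICE * FACTOR(T) with FACTOR(T) := 2 - T, where T is the
/// elapsed fraction of the Leadin Period.
pub fn leadin_price(
    sale_price: Balance,
    now: BlockNumber,
    leadin_begin: BlockNumber,
    leadin_period: BlockNumber,
) -> Balance {
    // Clamp the elapsed time into [0, leadin_period].
    let elapsed = Balance::from(now.saturating_sub(leadin_begin).min(leadin_period));
    let period = Balance::from(leadin_period);
    // SALE_PRICE * (2 - elapsed / period), rearranged to stay in integers.
    sale_price * (2 * period - elapsed) / period
}
```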

Parameter Values

Parameters are either suggested or specified. If suggested, it is non-binding and the proposal should not be judged on the value since other RFCs and/or the governance mechanism of Polkadot is expected to specify/maintain it. If specified, then the proposal should be judged on the merit of the value as-is.

| Name | Value | Status |
| --- | --- | --- |
| BULK_PERIOD | 28 * DAYS | specified |
| INTERLUDE_PERIOD | 7 * DAYS | specified |
| LEADIN_PERIOD | 7 * DAYS | specified |
| TIMESLICE | 8 * MINUTES | specified |
| BULK_TARGET | 30 | suggested |
| BULK_LIMIT | 45 | suggested |
| RENEWAL_PRICE_CAP | Perbill::from_percent(2) | suggested |

Instantaneous Price Progression

This proposal assumes the existence of a Relay-chain-based price-setting mechanism for the Instantaneous Coretime Market which alters from block to block, taking into account several parameters: the last price, the size of the Instantaneous Coretime Pool (in terms of cores per Relay-chain block) and the amount of Instantaneous Coretime waiting for processing (in terms of Core-blocks queued).

The ideal situation is to have the size of the Instantaneous Coretime Pool be equal to some factor of the Instantaneous Coretime waiting. This allows all Instantaneous Coretime sales to be processed with some limited latency while giving limited flexibility over ordering to the Relay-chain apparatus which is needed for efficient operation.

If we set a factor of three, and thus aim to retain a queue of Instantaneous Coretime Sales which can be processed within three Relay-chain blocks, then we would increase the price if the queue goes above three times the amount of cores available, and decrease if it goes under.

Let us assume the values OLD_PRICE, FACTOR, QUEUE_SIZE and POOL_SIZE. A simple definition of the NEW_PRICE would be thus:

NEW_PRICE := IF QUEUE_SIZE < POOL_SIZE * FACTOR THEN
    OLD_PRICE * 0.95
ELSE
    OLD_PRICE / 0.95
END IF

This exists only as a trivial example to demonstrate that a basic solution exists, and is not intended as a concrete proposal.
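The same illustrative rule can be written in Rust with integer arithmetic. The function name is an assumption, and multiplying by 95/100 or 100/95 stands in for the `0.95` factor of the pseudocode.

```rust
pub type Balance = u128;

/// Lower the price by 5% while the queue is shorter than `factor` times the
/// pool size, and raise it by the inverse ratio otherwise.
pub fn next_instantaneous_price(
    old_price: Balance,
    factor: u128,
    queue_size: u128,
    pool_size: u128,
) -> Balance {
    if queue_size < pool_size * factor {
        old_price * 95 / 100
    } else {
        old_price * 100 / 95
    }
}
```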

Notes on Types

This exists only as a short illustration of a potential technical implementation and should not be treated as anything more.

Regions

This data schema achieves a number of goals:

  • Coretime can be individually traded at a level of a single usage of a single core.
  • Coretime Regions of arbitrary span and up to 1/80th interlacing can be exposed as NFTs and exchanged.
  • Any Coretime Region can be contributed to the Instantaneous Coretime Pool.
  • Unlimited number of individual Coretime contributors to the Instantaneous Coretime Pool. (Effectively limited only in number of cores and interlacing level; with current values this would allow 80,000 individual payees per timeslice).
  • All keys are self-describing.
  • Workload to communicate core (re-)assignments is well-bounded and low in weight.
  • All mandatory bookkeeping workload is well-bounded in weight.

type Timeslice = u32; // 80 block amounts.
type CoreIndex = u16;
type CoreMask = [u8; 10]; // 80-bit bitmap.

// 128-bit (16 bytes)
struct RegionId {
    begin: Timeslice,
    core: CoreIndex,
    mask: CoreMask,
}
// 296-bit (37 bytes)
struct RegionRecord {
    end: Timeslice,
    owner: AccountId,
}

type Regions = Map<RegionId, RegionRecord>;

// 40-bit (5 bytes). Could be 32-bit with a more specialised type.
enum CoreTask {
    Off,
    Assigned { target: TaskId },
    InstaPool,
}
// 120-bit (15 bytes). Could be 14 bytes with a specialised 32-bit `CoreTask`.
struct ScheduleItem {
    mask: CoreMask, // 80 bit
    task: CoreTask, // 40 bit
}

/// The work we plan on having each core do at a particular time in the future.
type Workplan = Map<(Timeslice, CoreIndex), BoundedVec<ScheduleItem, 80>>;
/// The current workload of each core. This gets updated with workplan as timeslices pass.
type Workload = Map<CoreIndex, BoundedVec<ScheduleItem, 80>>;

enum Contributor {
    System,
    Private(AccountId),
}

struct ContributionRecord {
    begin: Timeslice,
    end: Timeslice,
    core: CoreIndex,
    mask: CoreMask,
    payee: Contributor,
}
type InstaPoolContribution = Map<ContributionRecord, ()>;

/// Signed, so that a negative entry can record contributions leaving the pool.
type SignedTotalMaskBits = i32;
type InstaPoolIo = Map<Timeslice, SignedTotalMaskBits>;

type PoolSize = Value<TotalMaskBits>;

/// Counter for the total CoreMask which could be dedicated to a pool. `u32` so we don't ever get
/// an overflow.
type TotalMaskBits = u32;
struct InstaPoolHistoryRecord {
    total_contributions: TotalMaskBits,
    maybe_payout: Option<Balance>,
}
/// Total InstaPool rewards for each Timeslice and the number of Core Mask bits which contributed.
type InstaPoolHistory = Map<Timeslice, InstaPoolHistoryRecord>;

CoreMask tracks unique "parts" of a single core. It is used with interlacing in order to give a unique identifier to each component of any possible interlacing configuration of a core, allowing for simple self-describing keys for all core ownership and allocation information. It also allows for each core's workload to be tracked and updated progressively, keeping ongoing compute costs well-bounded and low.

Regions are issued into the Regions map and can be transferred, partitioned and interlaced as the owner desires. Regions can only be tasked if they begin after the current scheduling deadline (if they have missed this, then the region can be auto-trimmed until it is).

Once tasked, they are removed from there and a record is placed in Workplan. In addition, if they are contributed to the Instantaneous Coretime Pool, then an entry is placed in InstaPoolContribution and InstaPoolIo.

Each timeslice, InstaPoolIo is used to update the current value of PoolSize. A new entry in InstaPoolHistory is inserted, with the total_contributions field of InstaPoolHistoryRecord being informed by the PoolSize value. Each core has its Workload mutated according to its Workplan for the upcoming timeslice.

When Instantaneous Coretime Market Revenues are reported for a particular timeslice from the Relay-chain, this information gets placed in the maybe_payout field of the relevant record of InstaPoolHistory.

Payments can be requested for any records in InstaPoolContribution whose begin is the key for a value in InstaPoolHistory whose maybe_payout is Some. In this case, the total_contributions is reduced by the ContributionRecord's mask and a pro rata amount paid. The ContributionRecord is mutated by incrementing begin, or removed if begin becomes equal to end.
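The payout step can be sketched with in-memory maps. This is a simplified illustration rather than the intended implementation: `HistoryRecord` and `Contribution` are cut-down stand-ins for `InstaPoolHistoryRecord` and `ContributionRecord`, the Core Mask is reduced to its bit count, and reducing the remaining payout alongside `total_contributions` (so that later claimants still receive their pro rata share) is an assumption.

```rust
use std::collections::BTreeMap;

pub type Timeslice = u32;
pub type Balance = u128;

pub struct HistoryRecord {
    pub total_contributions: u32,
    pub maybe_payout: Option<Balance>,
}

pub struct Contribution {
    pub begin: Timeslice,
    pub end: Timeslice,
    /// Number of bits set in the contribution's Core Mask.
    pub mask_bits: u32,
}

/// Pay out every consecutive timeslice, starting at `begin`, whose revenue
/// has been reported. Returns the total paid; the contribution is finished
/// (and could be removed) once `begin == end`.
pub fn claim(
    history: &mut BTreeMap<Timeslice, HistoryRecord>,
    contribution: &mut Contribution,
) -> Balance {
    let mut paid: Balance = 0;
    while contribution.begin < contribution.end {
        let Some(record) = history.get_mut(&contribution.begin) else { break };
        let Some(payout) = record.maybe_payout else { break };
        if record.total_contributions == 0 {
            break;
        }
        let share = payout * Balance::from(contribution.mask_bits)
            / Balance::from(record.total_contributions);
        record.total_contributions =
            record.total_contributions.saturating_sub(contribution.mask_bits);
        record.maybe_payout = Some(payout.saturating_sub(share));
        // Drop the history entry once every contributor has been paid.
        if record.total_contributions == 0 {
            history.remove(&contribution.begin);
        }
        paid += share;
        contribution.begin += 1;
    }
    paid
}
```

Claims stop at the first timeslice whose revenue has not yet been reported, so a contributor may call again later to collect the remainder.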

Example:

// Simple example with a `u16` `CoreMask` and bulk sold in 100 timeslices.
Regions:
{ core: 0u16, begin: 100, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
// First split @ 50
Regions:
{ core: 0u16, begin: 100, mask: 0b1111_1111_1111_1111u16 } => { end: 150u32, owner: Alice };
{ core: 0u16, begin: 150, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
// Share half of first 50 blocks
Regions:
{ core: 0u16, begin: 100, mask: 0b1111_1111_0000_0000u16 } => { end: 150u32, owner: Alice };
{ core: 0u16, begin: 100, mask: 0b0000_0000_1111_1111u16 } => { end: 150u32, owner: Alice };
{ core: 0u16, begin: 150, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
// Sell half of them to Bob
Regions:
{ core: 0u16, begin: 100, mask: 0b1111_1111_0000_0000u16 } => { end: 150u32, owner: Alice };
{ core: 0u16, begin: 100, mask: 0b0000_0000_1111_1111u16 } => { end: 150u32, owner: Bob };
{ core: 0u16, begin: 150, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
// Bob splits first 10 and assigns them to himself.
Regions:
{ core: 0u16, begin: 100, mask: 0b1111_1111_0000_0000u16 } => { end: 150u32, owner: Alice };
{ core: 0u16, begin: 100, mask: 0b0000_0000_1111_1111u16 } => { end: 110u32, owner: Bob };
{ core: 0u16, begin: 110, mask: 0b0000_0000_1111_1111u16 } => { end: 150u32, owner: Bob };
{ core: 0u16, begin: 150, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
// Bob shares first 10 3 ways and sells smaller shares to Charlie and Dave
Regions:
{ core: 0u16, begin: 100, mask: 0b1111_1111_0000_0000u16 } => { end: 150u32, owner: Alice };
{ core: 0u16, begin: 100, mask: 0b0000_0000_1100_0000u16 } => { end: 110u32, owner: Charlie };
{ core: 0u16, begin: 100, mask: 0b0000_0000_0011_0000u16 } => { end: 110u32, owner: Dave };
{ core: 0u16, begin: 100, mask: 0b0000_0000_0000_1111u16 } => { end: 110u32, owner: Bob };
{ core: 0u16, begin: 110, mask: 0b0000_0000_1111_1111u16 } => { end: 150u32, owner: Bob };
{ core: 0u16, begin: 150, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
// Bob assigns to his para B, Charlie and Dave assign to their paras C and D; Alice assigns first 50 to A
Regions:
{ core: 0u16, begin: 150, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
Workplan:
(100, 0) => vec![
    { mask: 0b1111_1111_0000_0000u16, task: Assigned(A) },
    { mask: 0b0000_0000_1100_0000u16, task: Assigned(C) },
    { mask: 0b0000_0000_0011_0000u16, task: Assigned(D) },
    { mask: 0b0000_0000_0000_1111u16, task: Assigned(B) },
]
(110, 0) => vec![{ mask: 0b0000_0000_1111_1111u16, task: Assigned(B) }]
// Alice assigns her remaining 50 timeslices to the InstaPool paying herself:
Regions: (empty)
Workplan:
(100, 0) => vec![
    { mask: 0b1111_1111_0000_0000u16, task: Assigned(A) },
    { mask: 0b0000_0000_1100_0000u16, task: Assigned(C) },
    { mask: 0b0000_0000_0011_0000u16, task: Assigned(D) },
    { mask: 0b0000_0000_0000_1111u16, task: Assigned(B) },
]
(110, 0) => vec![{ mask: 0b0000_0000_1111_1111u16, task: Assigned(B) }]
(150, 0) => vec![{ mask: 0b1111_1111_1111_1111u16, task: InstaPool }]
InstaPoolContribution:
{ begin: 150, end: 200, core: 0, mask: 0b1111_1111_1111_1111u16, payee: Alice }
InstaPoolIo:
150 => 16
200 => -16
// Actual notifications to relay chain.
// Assumes:
// - Timeslice is 10 blocks.
// - Timeslice 0 begins at block #0 (so timeslice 100 begins at block #1000).
// - Relay needs 10 blocks notice of change.
//
Workload: 0 => vec![]
PoolSize: 0

// Block 990:
Relay <= assign_core(core: 0u16, begin: 1000, assignment: vec![(A, 8), (C, 2), (D, 2), (B, 4)])
Workload: 0 => vec![
    { mask: 0b1111_1111_0000_0000u16, task: Assigned(A) },
    { mask: 0b0000_0000_1100_0000u16, task: Assigned(C) },
    { mask: 0b0000_0000_0011_0000u16, task: Assigned(D) },
    { mask: 0b0000_0000_0000_1111u16, task: Assigned(B) },
]
PoolSize: 0

// Block 1090:
Relay <= assign_core(core: 0u16, begin: 1100, assignment: vec![(A, 8), (B, 8)])
Workload: 0 => vec![
    { mask: 0b1111_1111_0000_0000u16, task: Assigned(A) },
    { mask: 0b0000_0000_1111_1111u16, task: Assigned(B) },
]
PoolSize: 0

// Block 1490:
Relay <= assign_core(core: 0u16, begin: 1500, assignment: vec![(Pool, 16)])
Workload: 0 => vec![
    { mask: 0b1111_1111_1111_1111u16, task: InstaPool },
]
PoolSize: 16
InstaPoolIo:
200 => -16
InstaPoolHistory:
150 => { total_contributions: 16, maybe_payout: None }

// Sometime after block 1500:
InstaPoolHistory:
150 => { total_contributions: 16, maybe_payout: Some(P) }

// Sometime after block 1990:
InstaPoolIo: (empty)
PoolSize: 0
InstaPoolHistory:
150 => { total_contributions: 16, maybe_payout: Some(P0) }
151 => { total_contributions: 16, maybe_payout: Some(P1) }
152 => { total_contributions: 16, maybe_payout: Some(P2) }
...
199 => { total_contributions: 16, maybe_payout: Some(P49) }

// Sometime later still Alice calls for a payout
InstaPoolContribution: (empty)
InstaPoolHistory: (empty)
// Alice gets rewarded P0 + P1 + ... P49.

Rollout

Rollout of this proposal comes in several phases:

  1. Finalise the specifics of implementation; this may be done through a design document or through a well-documented prototype implementation.
  2. Implement the design, including all associated aspects such as unit tests, benchmarks and any support software needed.
  3. If any new parachain is required, launch it.
  4. Formal audit of the implementation and any manual testing.
  5. Announcement to the various stakeholders of the imminent changes.
  6. Software integration and release.
  7. Governance upgrade proposal(s).
  8. Monitoring of the upgrade process.

Performance, Ergonomics and Compatibility

No specific considerations.

Parachains already deployed into the Polkadot UC must have a clear plan of action to migrate to an agile Coretime market.

While this proposal does not introduce documentable features per se, adequate documentation must be provided to potential purchasers of Polkadot Coretime. This SHOULD include any alterations to the Polkadot-SDK software collection.

Testing, Security and Privacy

Regular testing through unit tests, integration tests, manual testnet tests, zombie-net tests and fuzzing SHOULD be conducted.

A regular security review SHOULD be conducted prior to deployment through a review by the Web3 Foundation economic research group.

Any final implementation MUST pass a professional external security audit.

The proposal introduces no new privacy concerns.

RFC-3 proposes a means of implementing the high-level allocations within the Relay-chain.

RFC-5 proposes the API for interacting with Relay-chain.

Additional work should specify the interface for the instantaneous market revenue so that the Coretime-chain can ensure Bulk Coretime placed in the instantaneous market is properly compensated.

Drawbacks, Alternatives and Unknowns

Unknowns include the economic and resource parameterisations:

  • The initial price of Bulk Coretime.
  • The price-change algorithm between Bulk Coretime sales.
  • The price increase per Bulk Coretime period for renewals.
  • The price decrease graph in the Leadin period for Bulk Coretime sales.
  • The initial price of Instantaneous Coretime.
  • The price-change algorithm for Instantaneous Coretime sales.
  • The percentage of cores to be sold as Bulk Coretime.
  • The fate of revenue collected.

Prior Art and References

Robert Habermeier initially wrote on the subject of a blockspace-centric Polkadot in the article Polkadot Blockspace over Blockchains. While not going into details, the article served as an early reframing piece for moving beyond one-slot-per-chain models and building out secondary market infrastructure for resource allocation.

RFC-5: Coretime Interface

Start Date: 06 July 2023
Description: Interface for manipulating the usage of cores on the Polkadot Ubiquitous Computer.
Authors: Gavin Wood, Robert Habermeier

Summary

In the Agile Coretime model of the Polkadot Ubiquitous Computer, as proposed in RFC-1 and RFC-3, it is necessary for the allocating parachain (envisioned to be one or more pallets on a specialised Brokerage System Chain) to communicate the core assignments to the Relay-chain, which is responsible for ensuring those assignments are properly enacted.

This is a proposal for the interface which will exist around the Relay-chain in order to communicate this information and instructions.

Motivation

The background motivation for this interface is splitting out coretime allocation functions and secondary markets from the Relay-chain onto System parachains. A well-understood and general interface is necessary for ensuring the Relay-chain receives coretime allocation instructions from one or more System chains without introducing dependencies on the implementation details of either side.

Requirements

  • The interface MUST allow the Relay-chain to be scheduled on a low-latency basis.
  • Individual cores MUST be schedulable, either in full to a single task (a ParaId or the Instantaneous Coretime Pool) or to many unique tasks in differing ratios.
  • Typical usage of the interface SHOULD NOT overload the VMP message system.
  • The interface MUST allow for the allocating chain to be notified of all accounting information relevant for making accurate rewards for contributing to the Instantaneous Coretime Pool.
  • The interface MUST allow for Instantaneous Coretime Market Credits to be communicated.
  • The interface MUST allow for the allocating chain to instruct changes to the number of cores which it is able to allocate.
  • The interface MUST allow for the allocating chain to be notified of changes to the number of cores which are able to be allocated by the allocating chain.

Stakeholders

Primary stakeholder sets are:

  • Developers of the Relay-chain core-management logic.
  • Developers of the Brokerage System Chain and its pallets.

Socialization:

The content of this RFC was discussed in the Polkadot Fellows channel.

Explanation

The interface has two sections: The messages which the Relay-chain is able to receive from the allocating parachain (the UMP message types), and messages which the Relay-chain is able to send to the allocating parachain (the DMP message types). These messages are expected to be able to be implemented in a well-known pallet and called with the XCM Transact instruction.

Future work may include these messages being introduced into the XCM standard.

UMP Message Types

request_core_count

Prototype:

fn request_core_count(
    count: u16,
)

Requests the Relay-chain to alter the number of schedulable cores to count. Under normal operation, the Relay-chain SHOULD send a notify_core_count(count) message back.

request_revenue_info_at

Prototype:

fn request_revenue_info_at(
    when: BlockNumber,
)

Requests that the Relay-chain send a notify_revenue_info message back, at or soon after Relay-chain block number when, whose until parameter is equal to when.

How far into the past when may lie may be limited; if so, the limit should be understood on a channel outside of this proposal. In the case that the request cannot be serviced because when is too old a block, a notify_revenue_info message must still be returned, but its revenue field may be None.

credit_account

Prototype:

fn credit_account(
    who: AccountId,
    amount: Balance,
)

Instructs the Relay-chain to add the amount of DOT to the Instantaneous Coretime Market Credit account of who.

It is expected that Instantaneous Coretime Market Credit on the Relay-chain is NOT transferrable and only redeemable when used to assign cores in the Instantaneous Coretime Pool.

assign_core

Prototype:

type PartsOf57600 = u16;
enum CoreAssignment {
    InstantaneousPool,
    Task(ParaId),
}
fn assign_core(
    core: CoreIndex,
    begin: BlockNumber,
    assignment: Vec<(CoreAssignment, PartsOf57600)>,
    end_hint: Option<BlockNumber>,
)

Requirements:

assert!(core < core_count);
assert!(assignment.iter().map(|x| x.0).is_sorted());
assert_eq!(assignment.iter().map(|x| x.0).unique().count(), assignment.len());
assert_eq!(assignment.iter().map(|x| x.1).sum(), 57600);

Where:

  • core_count is assumed to be the sole parameter in the last received notify_core_count message.
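These requirements can also be checked with the standard library alone, as in the sketch below. The function name is hypothetical, `ParaId` is represented by a plain `u32`, and the `Ord` derive on `CoreAssignment` (which orders `InstantaneousPool` before any `Task`) is an assumption needed to give "sorted" a meaning.

```rust
pub type CoreIndex = u16;
pub type PartsOf57600 = u16;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum CoreAssignment {
    InstantaneousPool,
    /// Stand-in for `Task(ParaId)`.
    Task(u32),
}

/// Check the requirements on an `assign_core` message.
pub fn is_valid_assignment(
    core: CoreIndex,
    core_count: u16,
    assignment: &[(CoreAssignment, PartsOf57600)],
) -> bool {
    let in_range = core < core_count;
    // Strictly increasing targets are both sorted and unique.
    let sorted_unique = assignment.windows(2).all(|pair| pair[0].0 < pair[1].0);
    // Sum in `u32` so that malformed input cannot overflow `u16`.
    let total: u32 = assignment.iter().map(|x| u32::from(x.1)).sum();
    in_range && sorted_unique && total == 57_600
}
```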

Instructs the Relay-chain to ensure that the core indexed as core is utilised for a number of assignments in specific ratios given by assignment starting as soon after begin as possible. Core assignments take the form of a CoreAssignment value which can either task the core to a ParaId value or indicate that the core should be used in the Instantaneous Pool. Each assignment comes with a ratio value, represented as the numerator of the fraction with a denominator of 57,600.

If end_hint is Some and the inner value is greater than the current block number, then the Relay-chain should optimize in the expectation of receiving a new assign_core(core, ...) message at or prior to the block number of the inner value. Specific functionality should remain unchanged regardless of the end_hint value.

On the choice of denominator: 57,600 is a very composite number which factors into: 2 ** 8, 3 ** 2, 5 ** 2. By using it as the denominator we allow for various useful fractions to be perfectly represented including thirds, quarters, fifths, tenths, 80ths, percent and 256ths.
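A short sketch shows which fractions this denominator represents exactly. The helper `exact_parts` is hypothetical and exists only to illustrate the arithmetic.

```rust
pub const DENOMINATOR: u64 = 57_600;

/// Return the numerator over 57,600 which equals `numerator / denominator`
/// exactly, or `None` if the fraction exceeds one or cannot be represented.
pub fn exact_parts(numerator: u64, denominator: u64) -> Option<u16> {
    if denominator == 0 || numerator > denominator {
        return None;
    }
    let scaled = numerator.checked_mul(DENOMINATOR)?;
    if scaled % denominator != 0 {
        return None;
    }
    u16::try_from(scaled / denominator).ok()
}
```

For example, one third is 19,200 parts, one 80th is 720 parts and one 256th is 225 parts, whereas one seventh has no exact representation.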

DMP Message Types

notify_core_count

Prototype:

fn notify_core_count(
    count: u16,
)

Indicate that from this block onwards, the range of acceptable values of the core parameter of assign_core message is [0, count). assign_core will be a no-op if provided with a value for core outside of this range.

notify_revenue_info

Prototype:

fn notify_revenue_info(
    until: BlockNumber,
    revenue: Option<Balance>,
)

Provide the amount of revenue accumulated from Instantaneous Coretime Sales from Relay-chain block number last_until to until, not including until itself. last_until is defined as being the until argument of the last notify_revenue_info message sent, or zero for the first call. If revenue is None, this indicates that the information is no longer available.

This explicitly disregards the possibility of multiple parachains requesting and being notified of revenue information. The Relay-chain must be configured to ensure that only a single revenue information destination exists.

Realistic Limits of the Usage

For request_revenue_info, a successful request should be possible if when is no less than the Relay-chain block number on arrival of the message less 100,000.

For assign_core, a successful request should be possible if begin is no less than the Relay-chain block number on arrival of the message plus 10 and assignment contains no more than 100 items.
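The two limits can be written as simple predicates. This is only a sketch: BlockNumber is assumed to be a u32, and the function names are hypothetical. The predicates describe the range in which the text says a request should be able to succeed; they are not rules for rejecting requests outside it.

```rust
type BlockNumber = u32; // assumption: the concrete type is not given here

/// True when `when` is no less than the arrival block number less 100,000.
fn revenue_request_within_limits(when: BlockNumber, arrival: BlockNumber) -> bool {
    when >= arrival.saturating_sub(100_000)
}

/// True when `begin` is no less than the arrival block number plus 10 and
/// the assignment list holds no more than 100 items.
fn assign_core_within_limits(begin: BlockNumber, arrival: BlockNumber, items: usize) -> bool {
    begin >= arrival.saturating_add(10) && items <= 100
}
```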

Performance, Ergonomics and Compatibility

No specific considerations.

Testing, Security and Privacy

Standard Polkadot testing and security auditing applies.

The proposal introduces no new privacy concerns.

RFC-1 proposes a means of determining allocation of Coretime using this interface.

RFC-3 proposes a means of implementing the high-level allocations within the Relay-chain.

Drawbacks, Alternatives and Unknowns

None at present.

Prior Art and References

None.

RFC-0007: System Collator Selection

Start Date: 07 July 2023
Description: Mechanism for selecting collators of system chains.
Authors: Joe Petrowski

Summary

As core functionality moves from the Relay Chain into system chains, so increases the reliance on the liveness of these chains for the use of the network. It is not economically scalable, nor necessary from a game-theoretic perspective, to pay collators large rewards. This RFC proposes a mechanism -- part technical and part social -- for ensuring reliable collator sets that are resilient to attempts to stop any subsystem of the Polkadot protocol.

Motivation

In order to guarantee access to Polkadot's system, the collators on its system chains must propose blocks (provide liveness) and allow all transactions to eventually be included. That is, some collators may censor transactions, but there must exist one collator in the set who will include a given transaction. In fact, all collators may censor varying subsets of transactions, but as long as no transaction is in the intersection of every subset, it will eventually be included. The objective of this RFC is to propose a mechanism to select such a set on each system chain.

While the network as a whole uses staking (and inflationary rewards) to attract validators, collators face different challenges in scale and have lower security assumptions than validators. Regarding scale, there exist many system chains, and it is economically expensive to pay collators a premium. Likewise, any staked DOT for collation is not staked for validation. Since collator sets do not need to meet Byzantine Fault Tolerance criteria, staking as the primary mechanism for collator selection would remove stake that is securing BFT assumptions, making the network less secure.

Another problem with economic scalability relates to the increasing number of system chains, and corresponding increase in need for collators (i.e., increase in collator slots). "Good" (highly available, non-censoring) collators will not want to compete in elections on many chains when they could use their resources to compete in the more profitable validator election. Such dilution decreases the required bond on each chain, leaving them vulnerable to takeover by hostile collator groups.

This RFC proposes a system whereby collation is primarily an infrastructure service, with the on-chain Treasury reimbursing costs of semi-trusted node operators, referred to as "Invulnerables". The system need not trust the individual operators, only that as a set they would be resilient to coordinated attempts to halt a single chain or to censor a particular subset of transactions.

In the case that users do not trust this set, this RFC also proposes that each chain always have available collator positions that can be acquired by anyone by placing a bond.

Requirements

  • System MUST have at least one valid collator for every chain.
  • System MUST allow anyone to become a collator, provided they reserve/hold enough DOT.
  • System SHOULD select a set of collators with reasonable expectation that the set will not collude to censor any subset of transactions.
  • Collators selected by governance SHOULD have a reasonable expectation that the Treasury will reimburse their operating costs.

Stakeholders

  • Infrastructure providers (people who run validator/collator nodes)
  • Polkadot Treasury

Explanation

This protocol builds on the existing Collator Selection pallet and its notion of Invulnerables. Invulnerables are collators (identified by their AccountIds) who will be selected as part of the collator set every session. Operations relating to the management of the Invulnerables are done through privileged, governance origins. The implementation should maintain an API for adding and removing Invulnerable collators.

In addition to Invulnerables, there are also open slots for "Candidates". Anyone can register as a Candidate by placing a fixed bond. However, with a fixed bond and fixed number of slots, there is an obvious selection problem: The slots fill up without any logic to replace their occupants.

This RFC proposes that the collator selection protocol allow Candidates to increase (and decrease) their individual bonds, sort the Candidates according to bond, and select the top N Candidates. The selection and changeover should be coordinated by the session manager.

A FRAME pallet already exists for sorting ("bagging") "top N" groups, the Bags List pallet. This pallet's SortedListProvider should be integrated into the session manager of the Collator Selection pallet.

Despite the lack of apparent economic incentives (i.e., inflation), several reasons exist why one may want to bond funds to participate in the Candidates election, for example:

  • They want to build credibility to be selected as Invulnerable;
  • They want to ensure availability of an application, e.g. a stablecoin issuer might run a collator on Asset Hub to ensure transactions in its asset are included in blocks;
  • They fear censorship themselves, e.g. a voter might think their votes are being censored from governance, so they run a collator on the governance chain to include their votes.

Unlike the fixed-bond mechanism that fills up its Candidates, the election mechanism ensures that anyone can join the collator set by placing the Nth highest bond.
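The election described above can be sketched as follows. This is a plain in-memory sort for illustration only; it does not use the Bags List pallet or its SortedListProvider, and the types AccountId and Balance, the Candidate struct, and select_collators are all hypothetical stand-ins.

```rust
type AccountId = u32; // placeholder for a real account identifier
type Balance = u128;

#[derive(Clone, Debug)]
struct Candidate {
    who: AccountId,
    bond: Balance,
}

/// Returns the collator set for a session: every Invulnerable, followed by
/// the `n` Candidates with the highest bonds.
fn select_collators(
    invulnerables: &[AccountId],
    candidates: &[Candidate],
    n: usize,
) -> Vec<AccountId> {
    // An Invulnerable is already selected, so skip it among the Candidates.
    let mut ranked: Vec<&Candidate> = candidates
        .iter()
        .filter(|c| !invulnerables.contains(&c.who))
        .collect();
    // Stable sort, highest bond first; equal bonds keep registration order
    // (an assumption, the text does not define tie-breaking).
    ranked.sort_by(|a, b| b.bond.cmp(&a.bond));
    invulnerables
        .iter()
        .copied()
        .chain(ranked.into_iter().take(n).map(|c| c.who))
        .collect()
}
```

With two open slots and bonds of 50, 70 and 60, the Candidates bonding 70 and 60 are selected; a newcomer joins by outbidding the current Nth highest bond.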

Set Size

In order to achieve the requirements listed under Motivation, it is reasonable to have approximately:

  • 20 collators per system chain,
  • of which 15 are Invulnerable, and
  • five are elected by bond.

Drawbacks

The primary drawback is a reliance on governance for continued treasury funding of infrastructure costs for Invulnerable collators.

Testing, Security, and Privacy

The vast majority of cases can be covered by unit testing. Integration tests should ensure that the Collator Selection UpdateOrigin, which has permission to modify the Invulnerables and desired number of Candidates, can handle updates over XCM from the system's governance location.

Performance, Ergonomics, and Compatibility

This proposal has very little impact on most users of Polkadot, and should improve the performance of system chains by reducing the number of missed blocks.

Performance

As chains have strict PoV size limits, care must be taken in the PoV impact of the session manager. Appropriate benchmarking and tests should ensure that conservative limits are placed on the number of Invulnerables and Candidates.

Ergonomics

The primary group affected is Candidate collators, who, after implementation of this RFC, will need to compete in a bond-based election rather than a race to claim a Candidate spot.

Compatibility

This RFC is compatible with the existing implementation and can be handled via upgrades and migration.

Prior Art and References

Written Discussions

Prior Feedback and Input From

  • Kian Paimani
  • Jeff Burdges
  • Rob Habermeier
  • SR Labs Auditors
  • Current collators including Paranodes, Stake Plus, Turboflakes, Peter Mensik, SIK, and many more.

Unresolved Questions

None at this time.

There may exist in the future system chains for which this model of collator selection is not appropriate. These chains should be evaluated on a case-by-case basis.

RFC-0008: Store parachain bootnodes in relay chain DHT

Start Date: 2023-07-14
Description: Parachain bootnodes shall register themselves in the DHT of the relay chain
Authors: Pierre Krieger

Summary

The full nodes of the Polkadot peer-to-peer network maintain a distributed hash table (DHT), which is currently used for full node discovery and validator discovery.

This RFC proposes to extend this DHT to be used to discover full nodes of the parachains of Polkadot.

Motivation

The maintenance of bootnodes has long been an annoyance for everyone.

When a bootnode is newly deployed or removed, every chain specification must be updated in order to take the update into account. This has led to various non-optimal solutions, such as pulling chain specifications from GitHub repositories. When it comes to RPC nodes, UX developers often have trouble finding up-to-date addresses of parachain RPC nodes. With the ongoing migration from RPC nodes to light clients, similar problems would happen with chain specifications as well.

Furthermore, there exist multiple possible variants of a given chain specification: with the non-raw storage, with the raw storage, with just the genesis trie root hash, with or without checkpoint, etc. All of this creates confusion. Removing the need for parachain developers to be aware of and manage these different versions would be beneficial.

Since the PeerId and addresses of bootnodes need to be stable, extra maintenance work is required from the chain maintainers. For example, they need to be extra careful when migrating nodes within their infrastructure. In some situations, bootnodes are put behind domain names, which also requires maintenance work.

Because the list of bootnodes in chain specifications is so annoying to modify, the consequence is that the number of bootnodes is rather low (typically between 2 and 15). In order to better resist downtimes and DoS attacks, a better solution would be to use every node of a certain chain as potential bootnode, rather than special-casing some specific nodes.

While this RFC doesn't solve these problems for relay chains, it aims to solve them for parachains by storing the list of all the full nodes of a parachain on the relay chain DHT.

Assuming that this RFC is implemented, and that light clients are used, deploying a parachain wouldn't require more work than registering it onto the relay chain and starting the collators. There wouldn't be any need for special infrastructure nodes anymore.

Stakeholders

This RFC has been opened on my own initiative because I think that this is a good technical solution to a usability problem that many people are encountering and that they don't realize can be solved.

Explanation

The content of this RFC only applies to parachains and parachain nodes that are "Substrate-compatible". It is in no way mandatory for parachains to comply with this RFC.

Note that "Substrate-compatible" is very loosely defined as "implements the same mechanisms and networking protocols as Substrate". The author of this RFC believes that "Substrate-compatible" should be very precisely specified, but there is controversy on this topic.

While a lot of this RFC concerns the implementation of parachain nodes, it makes use of the resources of the Polkadot chain, and as such it is important to describe them in the Polkadot specification.

This RFC adds two mechanisms: a registration in the DHT, and a new networking protocol.

DHT provider registration

This RFC heavily relies on the functionalities of the Kademlia DHT already in use by Polkadot. You can find a link to the specification here.

Full nodes of a parachain registered on Polkadot should register themselves onto the Polkadot DHT as the providers of a key corresponding to the parachain that they are serving, as described in the Content provider advertisement section of the specification. This uses the ADD_PROVIDER system of libp2p-kademlia.

This key is: sha256(concat(scale_compact(para_id), randomness)) where the value of randomness can be found in the randomness field when calling the BabeApi_currentEpoch function. For example, for a para_id equal to 1000, and at the time of writing of this RFC (July 14th 2023 at 09:13 UTC), it is sha256(0xa10f12872447958d50aa7b937b0106561a588e0e2628d33f81b5361b13dbcf8df708), which is equal to 0x483dd8084d50dbbbc962067f216c37b627831d9339f5a6e426a32e3076313d87.

In order to avoid downtime when the key changes, parachain full nodes should also register themselves as providers of a secondary key that uses a value of randomness equal to the randomness field when calling BabeApi_nextEpoch.

Implementers should be aware that their implementation of Kademlia might already hash the key before XOR'ing it. The key is not meant to be hashed twice.

The compact SCALE encoding has been chosen in order to avoid problems related to the number of bytes and endianness of the para_id.
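As a rough illustration of how the input to sha256 is assembled, the sketch below builds the bytes that are hashed. It assumes para_id is a u32 and that randomness is 32 bytes long, as in the example above; scale_compact_u32 and provider_key_preimage are hypothetical helper names. The hashing step itself is left to whichever SHA-256 implementation the node already uses.

```rust
/// SCALE compact encoding of a u32 (single-byte, two-byte, four-byte and
/// big-integer modes, all little-endian).
fn scale_compact_u32(v: u32) -> Vec<u8> {
    match v {
        0..=0x3f => vec![(v as u8) << 2],
        0x40..=0x3fff => (((v as u16) << 2) | 0b01).to_le_bytes().to_vec(),
        0x4000..=0x3fff_ffff => ((v << 2) | 0b10).to_le_bytes().to_vec(),
        _ => {
            // Big-integer mode: prefix byte 0b11 means "4 bytes follow".
            let mut out = vec![0b11];
            out.extend_from_slice(&v.to_le_bytes());
            out
        }
    }
}

/// Bytes to be passed to sha256: concat(scale_compact(para_id), randomness).
fn provider_key_preimage(para_id: u32, randomness: &[u8; 32]) -> Vec<u8> {
    let mut preimage = scale_compact_u32(para_id);
    preimage.extend_from_slice(randomness);
    preimage
}
```

For a para_id of 1000 the compact encoding is the two bytes 0xa1 0x0f, which matches the prefix of the example preimage above.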

New networking protocol

A new request-response protocol should be added, whose name is /91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3/paranode (that hexadecimal number is the genesis hash of the Polkadot chain, and should be adjusted appropriately for Kusama and others).
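A small sketch of deriving the protocol name from a relay chain genesis hash, assuming the hash is rendered as lowercase hexadecimal without a 0x prefix, as in the name above; paranode_protocol_name is a hypothetical helper.

```rust
/// Builds "/<genesis hash as lowercase hex>/paranode".
fn paranode_protocol_name(genesis_hash: &[u8; 32]) -> String {
    let hex: String = genesis_hash.iter().map(|b| format!("{:02x}", b)).collect();
    format!("/{}/paranode", hex)
}
```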

The request consists in a SCALE-compact-encoded para_id. For example, for a para_id equal to 1000, this is 0xa10f.

Note that because this is a request-response protocol, the request is always prefixed with its length in bytes. While the body of the request is simply the SCALE-compact-encoded para_id, the data actually sent onto the substream is both the length and body.

The response consists in a protobuf struct, defined as:

syntax = "proto2";

message Response {
    // Peer ID of the node on the parachain side.
    bytes peer_id = 1;

    // Multiaddresses of the parachain side of the node. The list and format are the same as for the `listenAddrs` field of the `identify` protocol.
    repeated bytes addrs = 2;

    // Genesis hash of the parachain. Used to determine the name of the networking protocol to connect to the parachain. Untrusted.
    bytes genesis_hash = 3;

    // So-called "fork ID" of the parachain. Used to determine the name of the networking protocol to connect to the parachain. Untrusted.
    optional string fork_id = 4;
};

The maximum size of a response is set to an arbitrary 16 kiB. The responding side should make sure to conform to this limit. Given that fork_id is typically very small and that the only variable-length field is addrs, this is easily achieved by limiting the number of addresses.

Implementers should be aware that addrs might be very large, and are encouraged to limit the number of addrs to an implementation-defined value.
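One way a responder might keep within the limit is to estimate the encoded size and drop addresses until it fits. The sketch below is an approximation under stated assumptions: it mirrors the protobuf wire format for length-delimited fields (a one-byte tag, a varint length, then the payload), assumes every non-optional field is emitted, and uses a hypothetical Rust struct rather than generated protobuf code.

```rust
const MAX_RESPONSE_SIZE: usize = 16 * 1024;

// Hypothetical in-memory mirror of the Response message.
struct Response {
    peer_id: Vec<u8>,
    addrs: Vec<Vec<u8>>,
    genesis_hash: Vec<u8>,
    fork_id: Option<String>,
}

/// Number of bytes a protobuf varint needs to encode `n`.
fn varint_len(mut n: usize) -> usize {
    let mut len = 1;
    while n >= 0x80 {
        n >>= 7;
        len += 1;
    }
    len
}

/// Size of one length-delimited field whose field number is below 16.
fn field_len(payload: usize) -> usize {
    1 + varint_len(payload) + payload
}

/// Estimated size of the encoded response in bytes.
fn encoded_len(r: &Response) -> usize {
    field_len(r.peer_id.len())
        + r.addrs.iter().map(|a| field_len(a.len())).sum::<usize>()
        + field_len(r.genesis_hash.len())
        + r.fork_id.as_ref().map_or(0, |f| field_len(f.len()))
}

/// Drops trailing addresses until the estimate fits within the limit.
fn truncate_addrs(r: &mut Response) {
    while encoded_len(r) > MAX_RESPONSE_SIZE && !r.addrs.is_empty() {
        r.addrs.pop();
    }
}
```

Which addresses to drop first is a policy choice; this sketch simply removes them from the end of the list.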

Drawbacks

The peer_id and addrs fields are in theory not strictly needed, as the PeerId and addresses could be always equal to the PeerId and addresses of the node being registered as the provider and serving the response. However, the Cumulus implementation currently uses two different networking stacks, one for the parachain and one for the relay chain, using two separate PeerIds and addresses, and as such the PeerId and addresses of the other networking stack must be indicated. Asking them to use only one networking stack wouldn't be feasible in a realistic time frame.

The values of the genesis_hash and fork_id fields cannot be verified by the requester and are expected to be unused at the moment. Instead, a client that desires connecting to a parachain is expected to obtain the genesis hash and fork ID of the parachain from the parachain chain specification. These fields are included in the networking protocol nonetheless in case an acceptable solution is found in the future, and in order to allow use cases such as discovering parachains in a not-strictly-trusted way.

Testing, Security, and Privacy

Because not all nodes want to be used as bootnodes, implementers are encouraged to provide a way to disable this mechanism. However, it is very much encouraged to leave this mechanism on by default for all parachain nodes.

This mechanism doesn't add or remove any security by itself, as it relies on existing mechanisms. However, if the principle of chain specification bootnodes is entirely replaced with the mechanism described in this RFC (which is the objective), then it becomes important whether the mechanism in this RFC can be abused in order to make a parachain unreachable.

Due to the way Kademlia works, it would become the responsibility of the 20 Polkadot nodes whose sha256(peer_id) is closest to the key (described in the explanations section) to store the list of bootnodes of each parachain. Furthermore, when a large number of providers (here, a provider is a bootnode) are registered, only the providers closest to the key are kept, up to a certain implementation-defined limit.
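The notion of "closest" here is Kademlia's XOR metric. A minimal sketch, assuming the peer IDs and the key have already been hashed to 32 bytes (the hashing itself is omitted) and using hypothetical names:

```rust
type Hash = [u8; 32];

/// XOR distance between two hashes, read as a big-endian integer.
fn xor_distance(a: &Hash, b: &Hash) -> Hash {
    let mut out = [0u8; 32];
    for i in 0..32 {
        out[i] = a[i] ^ b[i];
    }
    out
}

/// Returns the `k` hashed peer IDs closest to `key`, nearest first.
fn closest_nodes(key: &Hash, hashed_peer_ids: &[Hash], k: usize) -> Vec<Hash> {
    let mut sorted = hashed_peer_ids.to_vec();
    // Byte arrays compare lexicographically, which matches big-endian order.
    sorted.sort_by_key(|id| xor_distance(id, key));
    sorted.truncate(k);
    sorted
}
```

With k set to 20, the returned nodes are the ones that would be responsible for storing the provider records of a given key.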

For this reason, an attacker can abuse this mechanism by randomly generating libp2p PeerIds until they find the 20 entries closest to the key representing the target parachain. They are then in control of the parachain bootnodes. Because the key changes periodically and isn't predictable, and assuming that the Polkadot DHT is sufficiently large, it is not realistic for an attack like this to be maintained in the long term.

Furthermore, parachain clients are expected to cache a list of known good nodes on their disk. If the mechanism described in this RFC went down, it would only prevent new nodes from accessing the parachain, while clients that have connected before would not be affected.

Performance, Ergonomics, and Compatibility

Performance

The DHT mechanism generally has a low overhead, especially given that publishing providers is done only every 24 hours.

Doing a Kademlia iterative query then sending a provider record shouldn't take more than around 50 kiB in total of bandwidth for the parachain bootnode.

Assuming 1000 parachain full nodes, the 20 Polkadot full nodes corresponding to a specific parachain will each receive a sudden spike of a few megabytes of networking traffic when the key rotates. Again, this is relatively negligible. If this becomes a problem, one can add a random delay before a parachain full node registers itself to be the provider of the key corresponding to BabeApi_nextEpoch.

Maybe the biggest uncertainty is the traffic that the 20 Polkadot full nodes will receive from light clients that desire knowing the bootnodes of a parachain. Light clients are generally encouraged to cache the peers that they use between restarts, so they should only query these 20 Polkadot full nodes at their first initialization. If this ever becomes a problem, this value of 20 is an arbitrary constant that can be increased for more redundancy.

Ergonomics

Irrelevant.

Compatibility

Irrelevant.

Prior Art and References

None.

Unresolved Questions

While it fundamentally doesn't change much to this RFC, using BabeApi_currentEpoch and BabeApi_nextEpoch might be inappropriate. I'm not familiar enough with good practices within the runtime to have an opinion here. Should it be an entirely new pallet?

It is possible that in the future a client could connect to a parachain without having to rely on a trusted parachain specification.

RFC-0010: Burn Coretime Revenue

Start Date: 19.07.2023
Description: Revenue from Coretime sales should be burned
Authors: Jonas Gehrlein

Summary

The Polkadot UC will generate revenue from the sale of available Coretime. The question then arises: how should we handle these revenues? Broadly, there are two reasonable paths: burning the revenue, thereby removing it from total issuance, or diverting it to the Treasury. This Request for Comment (RFC) presents arguments favoring burning as the preferred mechanism for handling revenues from Coretime sales.

Motivation

How to handle the revenue accrued from Coretime sales is an important economic question that influences the value of DOT and should be properly discussed before deciding on either option. Now is the best time to start this discussion.

Stakeholders

Polkadot DOT token holders.

Explanation

This RFC discusses potential benefits of burning the revenue accrued from Coretime sales instead of diverting it to the Treasury. The arguments for burning follow.

It's in the interest of the Polkadot community to have a consistent and predictable Treasury income, because volatility in the inflow can be damaging, especially in situations when it is insufficient. As such, this RFC operates under the presumption of a steady and sustainable Treasury income flow, which is crucial for the Polkadot community's stability. The assurance of a predictable Treasury income, as outlined in a prior discussion here, or through other equally effective measures, serves as a baseline assumption for this argument.

Consequently, we need not concern ourselves with this particular issue here. This naturally raises the question: why should we introduce additional volatility to the Treasury by aligning it with the variable Coretime sales? It's worth noting that Coretime revenues often exhibit an inverse relationship with periods when Treasury spending should ideally be ramped up. During periods of low Coretime utilization (indicated by lower revenue), Treasury should spend more on projects and endeavours to increase the demand for Coretime. This pattern underscores that Coretime sales, by their very nature, are an inconsistent and unpredictable source of funding for the Treasury. Given the importance of maintaining a steady and predictable inflow, it's unnecessary to rely on another volatile mechanism. Some might argue that we could have both: a steady inflow (from inflation) and some added bonus from Coretime sales, but burning the revenue would offer further benefits as described below.

  • Balancing Inflation: While DOT as a utility token inherently profits from a (reasonable) net inflation, it also benefits from a deflationary force that functions as a counterbalance to the overall inflation. Right now, the only mechanism on Polkadot that burns fees is the one for underutilized DOT in the Treasury. Finding another, more direct target for burns makes sense, and the Coretime market is a good option.

  • Clear incentives: By burning the revenue accrued on Coretime sales, prices paid by buyers are clearly costs. This removes distortion from the market that might arise when the paid tokens end up elsewhere within the network. In that case, some actors might have secondary motives of influencing the price of Coretime sales, because they benefit down the line. For example, actors that actively participate in the Coretime sales are likely to also benefit from a higher Treasury balance, because they might frequently request funds for their projects. While those effects might appear far-fetched, they could accumulate. Burning the revenues makes sure that the prices paid are clearly costs to the actors themselves.

  • Collective Value Accrual: Following the previous argument, burning the revenue also generates some externality, because it reduces the overall issuance of DOT and thereby increases the value of each remaining token. In contrast to the aforementioned argument, this benefits all token holders collectively and equally. Therefore, I'd consider this the preferable option, because burns let all token holders participate in Polkadot's success as Coretime usage increases.

RFC-0012: Process for Adding New System Collectives

Start Date: 24 July 2023
Description: A process for adding new (and removing existing) system collectives.
Authors: Joe Petrowski

Summary

Since the introduction of the Collectives parachain, many groups have expressed interest in forming new -- or migrating existing groups into -- on-chain collectives. While adding a new collective is relatively simple from a technical standpoint, the Fellowship will need to merge new pallets into the Collectives parachain for each new collective. This RFC proposes a means for the network to ratify a new collective, thus instructing the Fellowship to instate it in the runtime.

Motivation

Many groups have expressed interest in representing collectives on-chain. Some of these include:

  • Parachain technical fellowship (new)
  • Fellowship(s) for media, education, and evangelism (new)
  • Polkadot Ambassador Program (existing)
  • Anti-Scam Team (existing)

Collectives that form part of the core Polkadot protocol should have a mandate to serve the Polkadot network. However, as part of the Polkadot protocol, the Fellowship, in its capacity of maintaining system runtimes, will need to include modules and configurations for each collective.

Once a group has developed a value proposition for the Polkadot network, it should have a clear path to having its collective accepted on-chain as part of the protocol. Acceptance should direct the Fellowship to include the new collective with a given initial configuration into the runtime. However, the network, not the Fellowship, should ultimately decide which collectives are in the interest of the network.

Stakeholders

  • Polkadot stakeholders who would like to organize on-chain.
  • Technical Fellowship, in its role of maintaining system runtimes.

Explanation

The group that wishes to operate an on-chain collective should publish the following information:

  • Charter, including the collective's mandate and how it benefits Polkadot. This would be similar to the Fellowship Manifesto.
  • Seeding recommendation.
  • Member types, i.e. should members be individuals or organizations.
  • Member management strategy, i.e. how do members join and get promoted, if applicable.
  • How much, if at all, members should get paid in salary.
  • Any special origins this Collective should have outside itself. For example, the Fellowship can whitelist calls for referenda via the WhitelistOrigin.

This information could all be in a single document or, for example, a GitHub repository.

After publication, members should seek feedback from the community and Technical Fellowship, and make any revisions needed. When the collective believes the proposal is ready, they should bring a remark with the text APPROVE_COLLECTIVE("{collective name}, {commitment}") to a Root origin referendum. The proposer should provide instructions for generating commitment. The passing of this referendum would be unequivocal direction to the Fellowship that this collective should be part of the Polkadot runtime.

Note: There is no need for a REJECT referendum. Proposals that have not been approved are simply not included in the runtime.

Removing Collectives

If someone believes that an existing collective is not acting in the interest of the network or in accordance with its charter, they should likewise have a means to instruct the Fellowship to remove that collective from Polkadot.

An on-chain remark from the Root origin with the text REMOVE_COLLECTIVE("{collective name}, {para ID}, [{pallet indices}]") would instruct the Fellowship to remove the collective via the listed pallet indices on paraId. Should someone want to construct such a remark, they should have a reasonable expectation that a member of the Fellowship would help them identify the pallet indices associated with a given collective, whether or not the Fellowship member agrees with removal.
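The two remark texts can be produced with simple string formatting. In this sketch the function names are hypothetical, the commitment is treated as an opaque string supplied by the proposer, and the use of ", " to separate pallet indices is an assumption, since the text does not fix the separator.

```rust
/// Text of the remark asking the Fellowship to add a collective.
fn approve_collective_remark(name: &str, commitment: &str) -> String {
    format!("APPROVE_COLLECTIVE(\"{}, {}\")", name, commitment)
}

/// Text of the remark asking the Fellowship to remove a collective.
fn remove_collective_remark(name: &str, para_id: u32, pallet_indices: &[u8]) -> String {
    let indices: Vec<String> = pallet_indices.iter().map(|i| i.to_string()).collect();
    format!(
        "REMOVE_COLLECTIVE(\"{}, {}, [{}]\")",
        name,
        para_id,
        indices.join(", ")
    )
}
```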

Collective removal may also come with other governance calls, for example voiding any scheduled Treasury spends that would fund the given collective.

Drawbacks

Passing a Root origin referendum is slow. However, given the network's investment (in terms of code maintenance and salaries) in a new collective, this is an appropriate step.

Testing, Security, and Privacy

No impacts.

Performance, Ergonomics, and Compatibility

Generally all new collectives will be in the Collectives parachain. Thus, performance impacts should strictly be limited to this parachain and not affect others. As the majority of logic for collectives is generalized and reusable, we expect most collectives to be instances of similar subsets of modules. That is, new collectives should generally be compatible with UIs and other services that provide collective-related functionality, with few modifications needed to support new ones.

Prior Art and References

The launch of the Technical Fellowship, see the initial forum post.

Unresolved Questions

None at this time.

RFC-0013: Prepare Core runtime API for MBMs

Start Date: July 24, 2023
Description: Prepare the Core Runtime API for Multi-Block-Migrations
Authors: Oliver Tale-Yazdi

Summary

Introduces breaking changes to the Core runtime API by letting Core::initialize_block return an enum. The version of Core is bumped from 4 to 5.

Motivation

The main feature that motivates this RFC is Multi-Block-Migrations (MBM); these make it possible to split a migration over multiple blocks.
Further it would be nice to not hinder the possibility of implementing a new hook poll, that runs at the beginning of the block when there are no MBMs and has access to AllPalletsWithSystem. This hook can then be used to replace the use of on_initialize and on_finalize for non-deadline critical logic.
In a similar fashion, it should not hinder the future addition of a System::PostInherents callback that always runs after all inherents were applied.

Stakeholders

  • Substrate Maintainers: They have to implement this, including tests, audit and maintenance burden.
  • Polkadot Runtime developers: They will have to adapt the runtime files to this breaking change.
  • Polkadot Parachain Teams: They have to adapt to the breaking changes but then eventually have multi-block migrations available.

Explanation

Core::initialize_block

This runtime API function is changed from returning () to ExtrinsicInclusionMode:

fn initialize_block(header: &<Block as BlockT>::Header)
+  -> ExtrinsicInclusionMode;

With ExtrinsicInclusionMode defined as:

enum ExtrinsicInclusionMode {
  /// All extrinsics are allowed in this block.
  AllExtrinsics,
  /// Only inherents are allowed in this block.
  OnlyInherents,
}

A block author MUST respect the ExtrinsicInclusionMode that is returned by initialize_block. The runtime MUST reject blocks that have non-inherent extrinsics in them while OnlyInherents was returned.
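The two obligations can be sketched with a simplified model. The Extrinsic struct with its is_inherent flag, and the functions build_block_body and check_block_body, are hypothetical and stand in for the real block-builder and runtime logic.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ExtrinsicInclusionMode {
    AllExtrinsics,
    OnlyInherents,
}

// Hypothetical stand-in for a real extrinsic.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Extrinsic {
    is_inherent: bool,
}

/// Author side: inherents always go in; pool transactions are skipped
/// entirely when only inherents are allowed.
fn build_block_body(
    mode: ExtrinsicInclusionMode,
    inherents: Vec<Extrinsic>,
    pool: Vec<Extrinsic>,
) -> Vec<Extrinsic> {
    let mut body = inherents;
    if mode == ExtrinsicInclusionMode::AllExtrinsics {
        body.extend(pool);
    }
    body
}

/// Runtime side: reject a block containing a non-inherent extrinsic while
/// OnlyInherents was returned.
fn check_block_body(
    mode: ExtrinsicInclusionMode,
    body: &[Extrinsic],
) -> Result<(), &'static str> {
    match mode {
        ExtrinsicInclusionMode::AllExtrinsics => Ok(()),
        ExtrinsicInclusionMode::OnlyInherents => {
            if body.iter().all(|x| x.is_inherent) {
                Ok(())
            } else {
                Err("non-inherent extrinsic in an OnlyInherents block")
            }
        }
    }
}
```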

Coming back to the motivations and how they can be implemented with this runtime API change:

1. Multi-Block-Migrations: The runtime is being put into lock-down mode for the duration of the migration process by returning OnlyInherents from initialize_block. This ensures that no user provided transaction can interfere with the migration process. It is absolutely necessary to ensure this, otherwise a transaction could call into un-migrated storage and violate storage invariants.

2. poll is possible by using apply_extrinsic as the entry point and is not hindered by this approach. It would not be possible to use a pallet inherent like System::last_inherent to achieve this, for two reasons. First, pallets do not have access to AllPalletsWithSystem, which is required to invoke the poll hook on all pallets. Second, the runtime does not currently enforce an order of inherents.

3. System::PostInherents can be done in the same manner as poll.

Drawbacks

The previous drawback of cementing the order of inherents has been addressed and removed by redesigning the approach. No further drawbacks have been identified thus far.

Testing, Security, and Privacy

The new logic of initialize_block can be tested by checking that the block-builder will skip transactions when OnlyInherents is returned.

Security: n/a

Privacy: n/a

Performance, Ergonomics, and Compatibility

Performance

The performance overhead is minimal in the sense that no clutter was added after fulfilling the requirements. The only performance difference is that initialize_block also returns an enum that needs to be passed through the WASM boundary. This should be negligible.

Ergonomics

The new interface allows for more extensible runtime logic. In the future, this will be utilized for multi-block-migrations which should be a huge ergonomic advantage for parachain developers.

Compatibility

The advice here is OPTIONAL and outside of the RFC. To not degrade user experience, it is recommended to ensure that an updated node can still import historic blocks.

Prior Art and References

The RFC is currently being implemented in polkadot-sdk#1781 (formerly substrate#14275). Related issues and merge requests:

Unresolved Questions

Please suggest a better name for BlockExecutiveMode. We already tried: RuntimeExecutiveMode, ExtrinsicInclusionMode. The names of the modes Normal and Minimal were also called AllExtrinsics and OnlyInherents, so if you have naming preferences, please post them.
=> renamed to ExtrinsicInclusionMode

Is post_inherents more consistent than last_inherent? If so, we should change it.
=> renamed to last_inherent

The long-term future here is to move the block building logic into the runtime. Currently there is a tight dance between the block author and the runtime; the author has to call into different runtime functions in quick succession and exact order. Any misstep causes the block to be invalid.
This can be unified and simplified by moving both parts into the runtime.

RFC-0014: Improve locking mechanism for parachains

Start Date: July 25, 2023
Description: Improve locking mechanism for parachains
Authors: Bryan Chen

Summary

This RFC proposes a set of changes to the parachain lock mechanism. The goal is to allow a parachain manager to self-service the parachain without root track governance action.

This is achieved by removing the existing lock conditions and only locking a parachain when:

  • A parachain manager explicitly locks the parachain
  • OR a parachain block is produced successfully

Motivation

The manager of a parachain has permission to manage the parachain when the parachain is unlocked. Parachains are by default locked when onboarded to a slot. This requires the parachain wasm/genesis to be valid; otherwise a root track governance action on the relaychain is required to update the parachain.

The current reliance on root track governance actions for managing parachains can be time-consuming and burdensome. This RFC aims to address this technical difficulty by allowing parachain managers to take self-service actions, rather than relying on general public voting.

The key scenarios this RFC seeks to improve are:

  1. Rescue a parachain with invalid wasm/genesis.

While we have various resources and templates to build a new parachain, it is still not a trivial task. It is very easy to make a mistake that results in an invalid wasm/genesis. With the lack of tools to help detect those issues[1], it is very likely that the issues are only discovered after the parachain is onboarded on a slot. In this case, the parachain is locked and the parachain team has to go through a lengthy governance process to rescue the parachain.

  2. Perform lease renewal for an existing parachain.

One way to perform lease renewal for a parachain is by doing a lease swap with another parachain with a longer lease. This requires the other parachain to be operational and able to perform an XCM transact call into the relaychain to dispatch the swap call. Combined with the overhead of setting up a new parachain, this is a time-consuming and expensive process. Ideally, the parachain manager should be able to perform the lease swap call without having a running parachain[2].

Requirements

  • A parachain manager SHOULD be able to rescue a parachain by updating the wasm/genesis without root track governance action.
  • A parachain manager MUST NOT be able to update the wasm/genesis if the parachain is locked.
  • A parachain SHOULD be locked when it successfully produced the first block.
  • A parachain manager MUST be able to perform lease swap without having a running parachain.

Stakeholders

  • Parachain teams
  • Parachain users

Explanation

Status quo

A parachain can either be locked or unlocked[3]. When a parachain is locked, the parachain manager does not have any privileges. When a parachain is unlocked, the parachain manager can perform the following actions with the paras_registrar pallet:

  • deregister: Deregister a Para Id, freeing all data and returning any deposit.
  • swap: Initiate or confirm lease swap with another parachain.
  • add_lock: Lock the parachain.
  • schedule_code_upgrade: Schedule a parachain upgrade to update parachain wasm.
  • set_current_head: Set the parachain's current head.

Currently, a parachain can be locked under the following conditions:

  • From an add_lock call, which can be dispatched by the relaychain Root origin, the parachain, or the parachain manager.
  • When a parachain is onboarded on a slot[4].
  • When a crowdloan is created.

Only the relaychain Root origin or the parachain itself can unlock the lock[5].

This creates an issue: if the parachain is unable to produce a block, the parachain manager is unable to do anything and has to rely on the relaychain Root origin to manage the parachain.

Proposed changes

This RFC proposes to change the lock and unlock conditions.

A parachain can be locked only under the following conditions:

  • Relaychain governance MUST be able to lock any parachain.
  • A parachain MUST be able to lock its own lock.
  • A parachain manager SHOULD be able to lock the parachain.
  • A parachain SHOULD be locked when it successfully produced a block for the first time.

A parachain can be unlocked only under the following conditions:

  • Relaychain governance MUST be able to unlock any parachain.
  • A parachain MUST be able to unlock its own lock.

Note that creating a crowdloan MUST NOT lock the parachain, and onboarding a parachain SHOULD NOT lock it until a new block is successfully produced.
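The proposed lock and unlock rules can be sketched as a small state machine. This is an illustrative model under stated assumptions, not the paras_registrar implementation: the Origin enum, the ParaState struct, and all method names are hypothetical.

```rust
/// Who is asking to change the lock. Names are illustrative, not pallet types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Origin {
    RelayGovernance,
    Parachain,
    Manager,
}

/// Hypothetical per-parachain lock state; parachains start unlocked.
#[derive(Debug, Default)]
struct ParaState {
    locked: bool,
}

impl ParaState {
    /// Governance, the parachain itself, and the manager may all lock.
    fn lock(&mut self, origin: Origin) -> bool {
        match origin {
            Origin::RelayGovernance | Origin::Parachain | Origin::Manager => {
                self.locked = true;
                true
            }
        }
    }

    /// Only governance or the parachain itself may unlock.
    fn unlock(&mut self, origin: Origin) -> bool {
        match origin {
            Origin::RelayGovernance | Origin::Parachain => {
                self.locked = false;
                true
            }
            Origin::Manager => false,
        }
    }

    /// Creating a crowdloan no longer locks the parachain.
    fn on_crowdloan_created(&mut self) {}

    /// Onboarding to a slot no longer locks the parachain.
    fn on_onboarded(&mut self) {}

    /// The first successfully produced block locks the parachain.
    fn on_first_block_produced(&mut self) {
        self.locked = true;
    }

    /// The manager can update wasm/genesis only while unlocked.
    fn manager_can_update_code(&self) -> bool {
        !self.locked
    }
}
```

In this model a parachain that never produces a block stays manageable by its manager, which is the rescue scenario the RFC targets.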

Migration

A one-off migration is proposed in order to apply this change retrospectively so that existing parachains can also benefit from this RFC. This migration will unlock parachains that conform to the following conditions:

  • Parachain is locked.
  • Parachain never produced a block, including under expired leases.
  • Parachain manager never explicitly locked the parachain.
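The migration conditions above amount to a simple predicate over each registered parachain. The sketch below assumes a hypothetical ParaRecord with one boolean per condition; how a real runtime would determine whether a parachain ever produced a block, or whether the manager locked it explicitly, is outside the scope of this illustration.

```rust
/// Hypothetical per-parachain record; field names are illustrative.
struct ParaRecord {
    locked: bool,
    ever_produced_block: bool,
    manager_locked_explicitly: bool,
}

/// A parachain is unlocked by the migration only if all three conditions hold.
fn should_unlock(p: &ParaRecord) -> bool {
    p.locked && !p.ever_produced_block && !p.manager_locked_explicitly
}

/// Applies the one-off migration and returns how many parachains were unlocked.
fn migrate(paras: &mut [ParaRecord]) -> usize {
    let mut unlocked = 0;
    for p in paras.iter_mut() {
        if should_unlock(p) {
            p.locked = false;
            unlocked += 1;
        }
    }
    unlocked
}
```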

Drawbacks

Parachain locks are designed in such a way as to ensure the decentralization of parachains. If parachains are not locked when they should be, it could introduce centralization risk for new parachains.

For example, one possible scenario is that a collective may decide to launch a parachain fully decentralized. However, if the parachain is unable to produce a block, the parachain manager will be able to replace the wasm and genesis without the consent of the collective.

This risk is considered tolerable, as it requires the wasm/genesis to be invalid in the first place. It is not yet practically possible to develop a parachain without any centralization risk.

Another case is that a parachain team may decide to use a crowdloan to help secure a slot lease. Previously, creating a crowdloan would lock a parachain. This means crowdloan participants would know exactly the genesis of the parachain for the crowdloan they are participating in. However, this actually provides little assurance to crowdloan participants. For example, if the genesis block is determined before a crowdloan is started, it is not possible to have an onchain mechanism to enforce reward distributions for crowdloan participants. They always have to rely on the parachain team to fulfill the promise after the parachain is alive.

Existing operational parachains will not be impacted.

Testing, Security, and Privacy

The implementation of this RFC will be tested on testnets (Rococo and Westend) first.

An audit may be required to ensure the implementation does not introduce unwanted side effects.

There are no privacy-related concerns.

Performance

This RFC should not introduce any performance impact.

Ergonomics

This RFC should improve the developer experience for new and existing parachain teams.

Compatibility

This RFC is fully compatible with existing interfaces.

Prior Art and References

  • Parachain Slot Extension Story: https://github.com/paritytech/polkadot/issues/4758
  • Allow parachain to renew lease without actually run another parachain: https://github.com/paritytech/polkadot/issues/6685
  • Always treat parachain that never produced block for a significant amount of time as unlocked: https://github.com/paritytech/polkadot/issues/7539

Unresolved Questions

None at this stage.

This RFC is only intended to be a short-term solution. Slots will be removed in the future, and the lock mechanism is likely to be replaced with a more generalized parachain management and recovery system. Therefore the long-term impacts of this RFC are not considered.

[1]: https://github.com/paritytech/cumulus/issues/377
[2]: https://github.com/paritytech/polkadot/issues/6685
[3]: https://github.com/paritytech/polkadot/blob/994af3de79af25544bf39644844cbe70a7b4d695/runtime/common/src/paras_registrar.rs#L51-L52C15
[4]: https://github.com/paritytech/polkadot/blob/994af3de79af25544bf39644844cbe70a7b4d695/runtime/common/src/paras_registrar.rs#L473-L475
[5]: https://github.com/paritytech/polkadot/blob/994af3de79af25544bf39644844cbe70a7b4d695/runtime/common/src/paras_registrar.rs#L333-L340

RFC-0022: Adopt Encointer Runtime

Start Date: Aug 22nd 2023
Description: Permanently move the Encointer runtime into the Fellowship runtimes repo.
Authors: @brenzi for Encointer Association, 8000 Zurich, Switzerland

Summary

Encointer has been a system chain on Kusama since Jan 2022 and has been developed and maintained by the Encointer association. This RFC proposes to treat Encointer like any other system chain and include it in the fellowship repo with this PR.

Motivation

Encointer does not seek to be in control of its runtime repository. As a decentralized system, the fellowship has a more suitable structure to maintain a system chain runtime repo than the Encointer association does.

Also, Encointer aims to update its runtime in batches with other system chains in order to have consistency for interoperability across system chains.

Stakeholders

  • Fellowship: Will continue to take on the review and auditing work for the Encointer runtime, but the process is streamlined with other system chains and therefore less time-consuming compared to the separate repo and CI process we currently have.
  • Kusama Network: Tokenholders can easily see the changes of all system chains in one place.
  • Encointer Association: Further decentralization of the Encointer Network necessities like devops.
  • Encointer devs: Being able to work directly in the Fellowship runtimes repo to streamline and synergize with other developers.

Explanation

Our PR has all details about our runtime and how we would move it into the fellowship repo.

Noteworthy: All Encointer-specific pallets will still be located in encointer's repo for the time being: https://github.com/encointer/pallets

It will still be the duty of the Encointer team to keep its runtime up to date and provide adequate test fixtures. Frequent dependency bumps with Polkadot releases would be beneficial for interoperability and could be streamlined with other system chains but that will not be a duty of fellowship. Whenever possible, all system chains could be upgraded jointly (including Encointer) with a batch referendum.

Further notes:

  • Encointer will publish all its crates on crates.io
  • Encointer does not carry out external auditing of its runtime nor pallets. It would be beneficial but not a requirement from our side if Encointer could join the auditing process of other system chains.

Drawbacks

Unlike all other system chains, development and maintenance of the Encointer Network is mainly financed by the KSM Treasury and possibly the DOT Treasury in the future. Encointer is dedicated to maintaining its network and runtime code for as long as possible, but there is a dependency on funding which is not in the hands of the fellowship. The only risk in the context of funding, however, is that the Encointer runtime will see less frequent updates if there's less funding.

Testing, Security, and Privacy

No changes to the existing system are proposed. Only changes to how maintenance is organized.

Performance, Ergonomics, and Compatibility

No changes

Prior Art and References

Existing Encointer runtime repo

Unresolved Questions

None identified

More info on Encointer: encointer.org

RFC-0026: Sassafras Consensus Protocol

Start Date: September 06, 2023
Description: Sassafras consensus protocol specification
Authors: Davide Galassi

Abstract

Sassafras is a novel consensus protocol designed to address the recurring fork-related challenges encountered in other lottery-based protocols.

The protocol aims to create a mapping between each epoch's slots and the authorities set while ensuring that the identity of authorities assigned to the slots remains undisclosed until the slot is actively claimed during block production.

1. Motivation

Sassafras Protocol has been rigorously described in a comprehensive research paper authored by the Web3 Foundation research team.

This RFC is primarily intended to detail the critical implementation aspects vital for ensuring interoperability and to clarify certain aspects that are left open by the research paper and thus subject to interpretation during implementation.

1.1. Relevance to Implementors

This RFC focuses on providing implementors with the necessary insights into the core protocol's operation.

In instances of inconsistency between this document and the research paper, this RFC should be considered authoritative to eliminate ambiguities and ensure interoperability.

1.2. Supporting Sassafras for Polkadot

Beyond promoting interoperability, this RFC also aims to facilitate the implementation of Sassafras within the greater Polkadot ecosystem.

Although the specifics of deployment strategies are beyond the scope of this document, it lays the groundwork for the integration of Sassafras.

2. Stakeholders

The protocol has a central role in the next generation block authoring consensus systems.

2.1. Blockchain Core Developers

Developers responsible for creating blockchains who intend to leverage the benefits offered by the Sassafras Protocol.

2.2. Polkadot Ecosystem Contributors

Developers contributing to the Polkadot ecosystem, both relay-chain and para-chains.

3. Notation

This section outlines the notation adopted throughout this document to ensure clarity and consistency.

3.1. Data Structures Definitions

Data structures are mostly defined using standard ASN.1 syntax with few exceptions.

To ensure interoperability of serialized structures, the order of the fields must match the definitions found within this specification.

3.2. Types Alias

  • Unsigned integer: Unsigned ::= INTEGER (0..MAX)
  • n-bit unsigned integer: Unsigned<n> ::= INTEGER (0..2^n - 1)
    • 8-bit unsigned integer (octet): Unsigned8 ::= Unsigned<8>
    • 32-bit unsigned integer: Unsigned32 ::= Unsigned<32>
    • 64-bit unsigned integer: Unsigned64 ::= Unsigned<64>
  • Non-homogeneous sequence (struct/tuple): Sequence ::= SEQUENCE
  • Variable length homogeneous sequence (vector): Sequence<T> ::= SEQUENCE OF T
  • Fixed length homogeneous sequence (array): Sequence<T,n> ::= Sequence<T> (SIZE(n))
  • Variable length octet-string: OctetString ::= Sequence<Unsigned8>
  • Fixed length octet-string: OctetString<n> ::= Sequence<Unsigned8, n>

3.3. Pseudo-Code

It is convenient to make use of code snippets as part of the protocol description. As a convention, the code is formatted in a style similar to Rust, and can make use of the following set of predefined procedures:

Sequences

  • CONCAT(x₀: OctetString, ..., xₖ: OctetString) -> OctetString: Concatenates the input octet-strings as a new octet string.

  • LENGTH(s: Sequence) -> Unsigned: The number of elements in the sequence s.

  • GET(s: Sequence<T>, i: Unsigned) -> T: The i-th element of the sequence s.

  • PUSH(s: Sequence<T>, x: T): Appends x as the new last element of the sequence s.

  • POP(s: Sequence<T>) -> T: Extracts and returns the last element of the sequence s.

Codec

  • ENCODE(x: T) -> OctetString: Encodes x as an OctetString according to SCALE codec.

  • DECODE<T>(x: OctetString) -> T: Decodes x as a type T object according to SCALE codec.

Other

  • BLAKE2(x: OctetString) -> OctetString<32>: Standard Blake2b hash of x with 256-bit digest.

3.4. Incremental Introduction of Types and Functions

More types and helper functions are introduced incrementally as they become relevant within the document's context.

4. Protocol Introduction

The timeline is segmented into a sequentially ordered sequence of slots. This entire sequence of slots is further partitioned into distinct segments known as epochs.

Sassafras aims to map each slot within a target epoch to the authorities scheduled for that epoch, utilizing a ticketing system.

The core protocol operation can be roughly divided into four phases.

4.1. Submission of Candidate Tickets

Each authority scheduled for the target epoch generates and shares a set of candidate tickets. Every ticket has an unbiasable pseudo random score and is bundled with an anonymous proof of validity.

4.2. Validation of Candidate Tickets

Each candidate ticket undergoes a validation process for the associated validity proof and compliance with other protocol-specific constraints. Valid tickets are persisted on-chain.

4.3. Tickets Slots Binding

After collecting all valid candidate tickets and before the beginning of the target epoch, a deterministic method is used to uniquely associate a subset of these tickets to the slots of the target epoch.

4.4. Claim of Ticket Ownership

During the block production phase of the target epoch, the author is required to prove ownership of the ticket associated to the block's slot. This step discloses the identity of the ticket owner.

5. Bandersnatch VRFs Cryptographic Primitives

This section is not intended to serve as an exhaustive exploration of the mathematically intensive foundations of the cryptographic primitive. Rather, its primary aim is to offer a concise and accessible explanation of the primitive's role and interface which is relevant within the scope of the protocol. For a more detailed explanation, refer to the Bandersnatch VRFs technical specification.

Bandersnatch VRF comes in two variants:

  • Bare VRF: Extension to the IETF ECVRF RFC 9381,
  • Ring VRF: Anonymous signatures leveraging zk-SNARK.

Together with the input, which determines the VRF output, both variants offer the capability to sign some arbitrary additional data (extra) which doesn't contribute to the VRF output.

5.1 Bare VRF Interface

VRF signature construction.

fn vrf_sign(
    secret: SecretKey,
    input: OctetString,
    extra: OctetString,
) -> VrfSignature;

VRF signature verification. Returns a Boolean indicating the validity of the signature (1 on success).

fn vrf_verify(
    public: PublicKey,
    input: OctetString,
    extra: OctetString,
    signature: VrfSignature
) -> Unsigned<1>;

VRF output derivation from input and secret.

fn vrf_output(
    secret: SecretKey,
    input: OctetString,
) -> OctetString<32>;

VRF output derivation from a VRF signature.

fn vrf_signed_output(
    signature: VrfSignature,
) -> OctetString<32>;

The following condition is always satisfied:

let signature = vrf_sign(secret, input, extra);
vrf_output(secret, input) == vrf_signed_output(signature)

SecretKey, PublicKey and VrfSignature types are intentionally left undefined. Their definitions can be found in the Bandersnatch VRF specification and related documents.

5.4.2. Ring VRF Interface

Ring VRF signature construction.

fn ring_vrf_sign(
    secret: SecretKey,
    prover: RingProver,
    input: OctetString,
    extra: OctetString,
) -> RingVrfSignature;

Ring VRF signature verification. Returns a Boolean indicating the validity of the signature (1 on success). Note that verification doesn't require the signer's public key.

fn ring_vrf_verify(
    verifier: RingVerifier,
    input: OctetString,
    extra: OctetString,
    signature: RingVrfSignature,
) -> Unsigned<1>;

VRF output derivation from a ring VRF signature.

fn ring_vrf_signed_output(
    signature: RingVrfSignature,
) -> OctetString<32>;

The following condition is always satisfied:

let signature = vrf_sign(secret, input, extra);
let ring_signature = ring_vrf_sign(secret, prover, input, extra);
vrf_signed_output(signature) == ring_vrf_signed_output(ring_signature);

RingProver, RingVerifier, and RingVrfSignature are intentionally left undefined. Their definitions can be found in the Bandersnatch VRF specification and related documents.

6. Sassafras Protocol

6.1. Protocol Configuration

The ProtocolConfiguration type contains some parameters to tweak the protocol behavior and primarily influences certain checks carried out during tickets validation. It is defined as:

ProtocolConfiguration ::= Sequence {
    epoch_length: Unsigned32,
    attempts_number: Unsigned8,
    redundancy_factor: Unsigned8,
}

Where:

  • epoch_length: Number of slots for each epoch.
  • attempts_number: Maximum number of tickets that each authority is allowed to submit.
  • redundancy_factor: Expected ratio between the cumulative number of valid tickets which can be submitted by the scheduled authorities and the epoch's duration in slots.

The attempts_number influences the anonymity of block producers. As all published tickets have a public attempt number less than attempts_number, all the tickets which share the same attempt number value must belong to different block producers, which reduces anonymity as we approach the epoch tail. Bigger values guarantee more anonymity but also require more computation.

Details about how these parameters drive the tickets validity probability can be found in section 6.5.2.

6.2. Header Digest Log

Each block header contains a Digest log, which is defined as an ordered sequence of DigestItems:

DigestItem ::= Sequence {
    id: OctetString<4>,
    data: OctetString
}

Digest ::= Sequence<DigestItem>

The Digest sequence is used to propagate information required for the correct protocol progress. Outside the protocol's context, the information within each DigestItem is opaque and maps to some SCALE-encoded protocol-specific structure.

For Sassafras-related items, the DigestItem's id is set to the ASCII string "SASS".

Possible digest items for Sassafras:

  • Epoch change signal: Information about next epoch. This is mandatory for the first block of a new epoch.
  • Epoch tickets signal: Sequence of tickets for claiming slots in the next epoch. This is mandatory for the first block in the epoch's tail.
  • Slot claim info: Additional data required for block verification. This is mandatory for each block and must be the second-to-last entry in the log.
  • Seal: Block signature added by the block author. This is mandatory for each block and must be the last entry in the log.

If any digest entry is unexpected, not found where mandatory or found in the wrong position, then the block is considered invalid.
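The placement rules above can be sketched as a check over the already-decoded Sassafras items of a header. This is a minimal sketch under several assumptions: the SassafrasItem enum and the digest_is_valid function are hypothetical, only Sassafras items are considered (other digest entries are ignored), and a signal that appears in a block where it is not mandatory is treated as "unexpected", which is one possible reading of the rule.

```rust
/// Hypothetical decoded view of the Sassafras digest items, in log order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SassafrasItem {
    EpochChange,
    EpochTickets,
    SlotClaim,
    Seal,
}

/// Checks presence and position of the Sassafras digest items.
///
/// `first_of_epoch`: the block is the first block of a new epoch.
/// `first_of_tail`: the block is the first block in the epoch's tail.
fn digest_is_valid(items: &[SassafrasItem], first_of_epoch: bool, first_of_tail: bool) -> bool {
    let n = items.len();
    // Seal must be last and slot claim second-to-last.
    if n < 2 || items[n - 1] != SassafrasItem::Seal || items[n - 2] != SassafrasItem::SlotClaim {
        return false;
    }
    let head = &items[..n - 2];
    // Neither seal nor slot claim may appear anywhere else.
    if head
        .iter()
        .any(|i| matches!(i, SassafrasItem::Seal | SassafrasItem::SlotClaim))
    {
        return false;
    }
    // Assumption: each signal appears exactly once when mandatory, never otherwise.
    let epoch_changes = head.iter().filter(|i| **i == SassafrasItem::EpochChange).count();
    let tickets = head.iter().filter(|i| **i == SassafrasItem::EpochTickets).count();
    epoch_changes == usize::from(first_of_epoch) && tickets == usize::from(first_of_tail)
}
```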

6.3. On-Chain Randomness

A sequence of four randomness entries is maintained on-chain.

RandomnessBuffer ::= Sequence<OctetString<32>, 4>

During epoch N:

  • The first entry is the current randomness accumulator and incorporates verifiable random elements from all previously executed blocks. The accumulation procedure is described in section 6.10.

  • The second entry is the snapshot of the accumulator before the execution of the first block of epoch N. This is the randomness used for tickets targeting epoch N+2.

  • The third entry is the snapshot of the accumulator before the execution of the first block of epoch N-1. This is the randomness used for tickets targeting epoch N+1 (the next epoch).

  • The fourth entry is the snapshot of the accumulator before the execution of the first block of epoch N-2. This is the randomness used for tickets targeting epoch N (the current epoch).

The buffer's entries are updated after each block execution.
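The snapshot roles described above imply that, at each epoch boundary, the snapshots shift by one position while the accumulator keeps its slot. The sketch below illustrates only that rotation and the lookup of the randomness for a target epoch; the per-block accumulation (section 6.10) is not modelled. The method names on_epoch_change and ticket_randomness are hypothetical, and entries are indexed from 0, so index 3 is the last of the four entries.

```rust
type Randomness = [u8; 32];

/// Four-entry buffer: index 0 is the accumulator, indices 1..=3 are snapshots.
struct RandomnessBuffer([Randomness; 4]);

impl RandomnessBuffer {
    /// Shifts the snapshots before the first block of a new epoch is executed.
    /// After the shift, index 1 holds the accumulator value as it was before
    /// that first block.
    fn on_epoch_change(&mut self) {
        self.0[3] = self.0[2];
        self.0[2] = self.0[1];
        self.0[1] = self.0[0];
    }

    /// Randomness for tickets targeting `current_epoch + offset`.
    /// Only offsets 0, 1 and 2 are covered by the buffer.
    fn ticket_randomness(&self, offset: usize) -> Option<&Randomness> {
        match offset {
            0 => Some(&self.0[3]), // current epoch N
            1 => Some(&self.0[2]), // next epoch N+1
            2 => Some(&self.0[1]), // epoch N+2
            _ => None,
        }
    }
}
```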

6.4. Epoch Change Signal

The first block produced during epoch N must include a descriptor for some of the parameters to be used by the subsequent epoch (N+1).

This signal descriptor is defined as:

NextEpochDescriptor ::= Sequence {
    randomness: OctetString<32>,
    authorities: Sequence<PublicKey>,
}

Where:

  • randomness: Randomness accumulator snapshot relevant for validation of next epoch blocks. In other words, randomness used to construct the tickets targeting epoch N+1.
  • authorities: List of authorities scheduled for next epoch.

This descriptor is SCALE encoded and embedded in a DigestItem.

6.4.1. Startup Parameters

Some of the initial parameters used by the first epoch (#0), are set through the genesis configuration, which is defined as:

GenesisConfig ::= Sequence {
    authorities: Sequence<PublicKey>,
}

The on-chain RandomnessBuffer is initialized after the genesis block construction. The first buffer entry is set as the Blake2b hash of the genesis block, each of the other entries is set as the Blake2b hash of the previous entry.

Since block #0 is generated by each node as part of the genesis process, the first block that an authority explicitly produces for epoch #0 is block #1. Therefore, block #1 is required to contain the NextEpochDescriptor for the following epoch.

NextEpochDescriptor for epoch #1:

  • randomness: Third entry (index 2) of the randomness buffer.
  • authorities: The same sequence as specified in the genesis configuration.

6.5. Tickets Creation and Submission

During epoch N, each authority scheduled for epoch N+2 constructs a set of tickets which may be eligible (6.5.2) for on-chain submission.

These tickets are constructed using the on-chain randomness snapshot taken before the execution of the first block of epoch N together with other parameters and aim to secure ownership of one or more slots of epoch N+2 (target epoch).

Each authority is allowed to submit a maximum number of tickets, constrained by attempts_number field of the ProtocolConfiguration.

The ideal timing for the candidate authority to start constructing the tickets is subject to strategy. A recommended approach is to initiate tickets creation once the last block of epoch N-1 is either probabilistically or, even better, deterministically finalized. This delay is suggested to prevent wasting resources creating tickets that will be unusable if a different chain branch is chosen as canonical.

Tickets generated during epoch N are shared with the tickets relayers, which are the authorities scheduled for epoch N+1. Relayers validate and collect (off-chain) the tickets targeting epoch N+2.

When epoch N+1 starts, collected tickets are submitted on-chain by relayers as inherent extrinsics, a special type of transaction inserted by the block author at the beginning of the block's transactions sequence.

6.5.1. Ticket Identifier

Each ticket has an associated identifier defined as:

TicketId ::= OctetString<32>;

The value of TicketId is completely determined by the output of Bandersnatch VRFs given the following unbiasable input:

let ticket_vrf_input = CONCAT(
    BYTES("sassafras_ticket_seal"),
    target_epoch_randomness,
    BYTES(attempt)
);

let ticket_id = vrf_output(authority_secret_key, ticket_vrf_input);

Where:

  • target_epoch_randomness: element of RandomnessBuffer which contains the randomness for the epoch the ticket is targeting.
  • attempt: value going from 0 to the configured attempts_number - 1.
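As an illustration of how the VRF input is assembled, the sketch below builds the concatenated octet string in plain Rust. It covers only the CONCAT step; the VRF evaluation itself is left to a Bandersnatch implementation. The helper names concat and ticket_vrf_input are hypothetical, and encoding BYTES(attempt) as a single octet is an assumption, since this document does not define BYTES.

```rust
/// CONCAT over octet strings, as used in the pseudo-code.
fn concat(parts: &[&[u8]]) -> Vec<u8> {
    parts.concat()
}

/// Builds the ticket VRF input for a given target-epoch randomness and attempt.
/// Assumption: the attempt index is serialized as one octet.
fn ticket_vrf_input(target_epoch_randomness: &[u8; 32], attempt: u8) -> Vec<u8> {
    concat(&[
        &b"sassafras_ticket_seal"[..],
        &target_epoch_randomness[..],
        &[attempt][..],
    ])
}
```

Because the input depends only on the on-chain randomness and the attempt index, an authority cannot bias the resulting TicketId beyond choosing which attempts to submit.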

6.5.2. Tickets Threshold

A ticket is valid for on-chain submission if its TicketId value, when interpreted as a big-endian 256-bit integer normalized as a float within the range [0..1], is less than the ticket threshold computed as:

T = (r·s)/(a·v)

Where:

  • v: epoch's authorities number
  • s: epoch's slots number
  • r: redundancy factor
  • a: attempts number

In an epoch with s slots, the goal is to achieve an expected number of valid tickets equal to r·s.

It's crucial to ensure that the probability of having fewer than s winning tickets is very low, even in scenarios where up to 1/3 of the authorities might be offline. To accomplish this, we first define the winning probability of a single ticket as T = (r·s)/(a·v).

Let n be the actual number of participating authorities, where v·2/3 ≤ n ≤ v. These n authorities each make a attempts, for a total of a·n attempts.

Let X be the random variable associated to the number of winning tickets, then its expected value is E[X] = T·a·n = (r·s·n)/v. By setting r = 2, we get s·4/3 ≤ E[X] ≤ s·2. Using Bernstein's inequality we get Pr[X < s] ≤ e^(-s/21).

For instance, with s = 600 this results in Pr[X < s] < 4·10⁻¹³. Consequently, this approach offers considerable tolerance for offline nodes and ensures that all slots are likely to be filled with tickets.

For more details about threshold formula refer to probabilities and parameters paragraph in the Web3 Foundation description of the protocol.
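The threshold check and the bound above can be reproduced numerically with a short sketch. The function names are hypothetical, and the normalization is an approximation: it uses only the 8 most significant octets of the big-endian TicketId, which is enough to illustrate the comparison but is not a bit-exact 256-bit computation.

```rust
/// T = (r·s)/(a·v), computed in floating point.
fn ticket_threshold(redundancy: u8, slots: u32, attempts: u8, authorities: u32) -> f64 {
    let num = f64::from(redundancy) * f64::from(slots);
    let den = f64::from(attempts) * f64::from(authorities);
    num / den
}

/// Approximate normalization of a big-endian 256-bit id into [0, 1).
/// Assumption: only the 8 most significant octets are used.
fn normalize(ticket_id: &[u8; 32]) -> f64 {
    let mut hi = [0u8; 8];
    hi.copy_from_slice(&ticket_id[..8]);
    u64::from_be_bytes(hi) as f64 / 2f64.powi(64)
}

/// A ticket is eligible for submission if its normalized id is below T.
fn is_eligible(ticket_id: &[u8; 32], threshold: f64) -> bool {
    normalize(ticket_id) < threshold
}

/// Upper bound e^(-s/21) on Pr[X < s] for r = 2.
fn shortfall_bound(slots: u32) -> f64 {
    (-f64::from(slots) / 21.0).exp()
}
```

For example, with r = 2, s = 600, a = 32 and v = 100 (illustrative values, not recommended parameters), the threshold is 1200/3200 = 0.375, and shortfall_bound(600) evaluates below 4·10⁻¹³, matching the figure quoted above.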

6.5.3. Ticket Envelope

Each ticket candidate is represented by a TicketEnvelope:

TicketEnvelope ::= Sequence {
    attempt: Unsigned8,
    extra: OctetString,
    signature: RingVrfSignature
}

Where:

  • attempt: Index associated to the ticket.
  • extra: Additional data available for user-defined applications.
  • signature: Ring VRF signature of the envelope data (attempt and extra).

Envelope data is signed using Bandersnatch Ring VRF (5.4.2).

let signature = ring_vrf_sign(
    secret_key,
    ring_prover,
    ticket_vrf_input,
    extra,
);

With ticket_vrf_input defined as in 6.5.1.

6.6. On-chain Tickets Validation

Validation rules:

  1. Ring VRF signature is verified using the ring_verifier derived from the constant ring context parameters (SNARK SRS) and the next epoch authorities public keys.

  2. TicketId is locally computed from the RingVrfSignature and its value is checked to be less than tickets' threshold.

  3. On-chain tickets submission can't occur within a block that is part of the epoch's tail, which encompasses a configurable number of slots at the end of the epoch. This constraint is to give time to persisted on-chain tickets to be probabilistically (or even better deterministically) finalized and thus to further reduce the fork chances at the beginning of the target epoch.

  4. All tickets which are proposed within a block must be valid and all of them must end up being persisted on-chain. Because the total number of tickets persisted on-chain is limited by the epoch's length, this may require dropping some of the previously persisted tickets. We remove tickets with greater TicketId value first.

  5. No ticket duplicates are allowed.

If at least one of the checks fails then the block must be considered invalid.

Pseudo-code for ticket validation for steps 1 and 2:

    let ticket_vrf_input = CONCAT(
        BYTES("sassafras_ticket_seal"),
        target_epoch_randomness,
        BYTES(envelope.attempt)
    );

    // Step 1: verify the ring VRF signature carried by the envelope.
    let result = ring_vrf_verify(
        ring_verifier,
        ticket_vrf_input,
        envelope.extra,
        envelope.signature
    );
    ASSERT(result == 1);

    // Step 2: derive the ticket identifier and check it against the threshold.
    let ticket_id = ring_vrf_signed_output(envelope.signature);
    ASSERT(ticket_id < ticket_threshold);

Valid tickets are persisted on-chain in a bounded sorted sequence of TicketBody objects. Items within this sequence are sorted according to their TicketId, interpreted as a 256-bit big-endian unsigned integer.

    TicketBody ::= Sequence {
        id: TicketId,
        attempt: Unsigned8,
        extra: OctetString,
    }

    Tickets ::= Sequence<TicketBody>

The on-chain tickets sequence length bound is set equal to the epoch length in slots according to the protocol configuration.
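Validation rules 3 to 5 and the bounded sorted sequence can be sketched in Rust as follows. This is a minimal illustration, not the reference implementation: the names `in_epoch_tail`, `persist_ticket`, `TicketError` and `sample_ticket` are hypothetical, and treating a submitted ticket that would itself be dropped as an error is this sketch's reading of rule 4 ("all of them must end up being persisted").

```rust
/// 32-byte ticket identifier. Comparing the arrays lexicographically gives the
/// same order as comparing them as 256-bit big-endian unsigned integers.
type TicketId = [u8; 32];

#[derive(Clone, Debug, PartialEq, Eq)]
struct TicketBody {
    id: TicketId,
    attempt: u8,
    extra: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq)]
enum TicketError {
    /// Rule 5: a ticket with the same id is already persisted.
    Duplicate,
    /// Rule 4 (as read here): the ticket would be dropped immediately.
    NotRetained,
}

/// Rule 3: true when `slot_index` (relative to the epoch start) lies within
/// the last `tail_slots` slots of an epoch of `epoch_length` slots.
fn in_epoch_tail(slot_index: u32, epoch_length: u32, tail_slots: u32) -> bool {
    slot_index >= epoch_length.saturating_sub(tail_slots)
}

/// Inserts `ticket` into `tickets`, which is kept sorted by ascending id and
/// bounded to `max_len` entries. When the bound is exceeded, the ticket with
/// the greatest id is dropped.
fn persist_ticket(
    tickets: &mut Vec<TicketBody>,
    ticket: TicketBody,
    max_len: usize,
) -> Result<(), TicketError> {
    let pos = match tickets.binary_search_by(|t| t.id.cmp(&ticket.id)) {
        Ok(_) => return Err(TicketError::Duplicate),
        Err(pos) => pos,
    };
    // The sequence is full and the new id is greater than every retained id.
    if pos >= max_len {
        return Err(TicketError::NotRetained);
    }
    tickets.insert(pos, ticket);
    tickets.truncate(max_len);
    Ok(())
}

/// Hypothetical helper for experiments: a ticket whose id starts with `first_byte`.
fn sample_ticket(first_byte: u8) -> TicketBody {
    let mut id = [0u8; 32];
    id[0] = first_byte;
    TicketBody { id, attempt: 0, extra: Vec::new() }
}
```

Because the sequence stays sorted, dropping the greatest TicketId is a truncation at the tail, and the duplicate check reuses the same binary search that finds the insertion point.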

6.7. Ticket-Slot Binding

Before the beginning of the target epoch, the on-chain sequence of tickets must be associated with the epoch's slots such that there is at most one ticket per slot.

Given an ordered sequence of tickets [t₀, t₁, ..., tₙ], the tickets are associated according to the following outside-in strategy:

    slot_index  : [  0,  1,  2,  3 ,  ... ]
    tickets     : [ t₀, tₙ, t₁, tₙ₋₁, ... ]

Here slot_index is the slot number relative to the epoch's first slot: slot_index = slot - epoch_first_slot.

The association between tickets and slots is recorded on-chain and thus is public. What remains confidential is the ticket author's identity and, consequently, who is enabled to claim the corresponding slot. This information is known only to the ticket's author.

If the number of published tickets is less than the number of slots in the epoch, some orphan slots at the end of the epoch will remain unbound to any ticket. For the orphan slots claiming strategy refer to 6.8.2. Note that this fallback situation always applies to the first two epochs after genesis.
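The outside-in association, including orphan slots, can be sketched in Rust as follows. The function name `bind_tickets_to_slots` is hypothetical, and the sketch assumes the input is already sorted by ascending TicketId and holds no more tickets than the epoch has slots.

```rust
/// Associates tickets with the slots of an epoch in outside-in order:
/// t0, tn, t1, t(n-1), ... Returns one entry per slot, where `None` marks an
/// orphan slot that is left to the secondary claiming method.
fn bind_tickets_to_slots<T: Clone>(tickets: &[T], epoch_length: usize) -> Vec<Option<T>> {
    let mut front = 0usize; // next ticket to take from the low end
    let mut back = tickets.len(); // one past the next ticket to take from the high end
    let mut slots = Vec::with_capacity(epoch_length);
    for slot_index in 0..epoch_length {
        if front >= back {
            // All tickets are assigned; the remaining slots are orphans.
            slots.push(None);
        } else if slot_index % 2 == 0 {
            slots.push(Some(tickets[front].clone()));
            front += 1;
        } else {
            back -= 1;
            slots.push(Some(tickets[back].clone()));
        }
    }
    slots
}
```

For five tickets `[t0, t1, t2, t3, t4]` and an epoch of seven slots, the result is `[t0, t4, t1, t3, t2, None, None]`: the two trailing slots are orphans.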

6.8. Slot Claim

With tickets bound to the target epoch slots, every designated authority acquires the information about the slots for which they are required to produce a block.

The procedure for slot claiming depends on whether a given slot has an associated ticket according to the on-chain state. If a slot has an associated ticket, then the primary authoring method is used. Conversely, the protocol resorts to the secondary method as a fallback.

6.8.1. Primary Method

An authority can claim a slot using the primary method if it is the legitimate owner of the ticket associated with the given slot.

Let target_epoch_randomness be the entry in RandomnessBuffer relative to the epoch the block is targeting, and let attempt be the attempt used to construct the ticket associated with the slot to claim. The VRF input for slot claiming is constructed as:

    let seal_vrf_input = CONCAT(
        BYTES("sassafras_ticket_seal"),
        target_epoch_randomness,
        BYTES(attempt)
    );

The seal_vrf_input, when signed with the correct authority secret key, must generate the same TicketId which has been associated to the target slot according to the on-chain state.

6.8.2. Secondary Method

Given that the authorities scheduled for the target epoch are kept on-chain in an ordered sequence, the index of the authority which has the privilege to claim an orphan slot is given by the following procedure:

    let hash_input = CONCAT(
        target_epoch_randomness,
        ENCODE(relative_slot_index),
    );
    let hash = BLAKE2(hash_input);
    let index_bytes = CONCAT(GET(hash, 0), GET(hash, 1), GET(hash, 2), GET(hash, 3));
    let index = DECODE<Unsigned32>(index_bytes) % LENGTH(authorities);

With relative_slot_index the slot offset relative to the target epoch's start and authorities the sequence of target epoch authorities.

The VRF input for claiming a slot with the secondary method is constructed as:

    let seal_vrf_input = CONCAT(
        BYTES("sassafras_fallback_seal"),
        target_epoch_randomness
    );
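The authority index selection for orphan slots can be sketched in Rust as follows. The BLAKE2 call is left to the caller, since it is not part of the Rust standard library; the function names are hypothetical, and the sketch assumes 32-byte randomness, a 32-bit relative_slot_index, and SCALE's little-endian encoding of fixed-width integers.

```rust
/// Builds CONCAT(target_epoch_randomness, ENCODE(relative_slot_index)).
/// Assumes `relative_slot_index` is a 32-bit integer, which SCALE encodes as
/// four little-endian bytes.
fn fallback_hash_input(target_epoch_randomness: &[u8; 32], relative_slot_index: u32) -> Vec<u8> {
    let mut input = Vec::with_capacity(36);
    input.extend_from_slice(target_epoch_randomness);
    input.extend_from_slice(&relative_slot_index.to_le_bytes());
    input
}

/// Derives the authority index from `hash`, where `hash` stands for the
/// BLAKE2 digest of the input built above (computed by the caller).
/// Returns `None` when there are no authorities to choose from.
fn fallback_authority_index(hash: &[u8; 32], authorities_len: usize) -> Option<usize> {
    if authorities_len == 0 {
        return None;
    }
    // DECODE<Unsigned32> over the first four bytes of the hash (little-endian).
    let index_bytes = [hash[0], hash[1], hash[2], hash[3]];
    let value = u32::from_le_bytes(index_bytes);
    Some(value as usize % authorities_len)
}
```

Every node computes the same index from public on-chain data, so the author of an orphan slot is publicly known ahead of time, unlike the author of a ticket-bound slot.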

6.8.3. Claim Data

ClaimData is a digest entry which contains additional information required by the protocol to verify the block:

    ClaimData ::= Sequence {
        slot: Unsigned32,
        authority_index: Unsigned32,
        randomness_source: VrfSignature,
    }

  • slot: The slot number.
  • authority_index: Block's author index relative to the on-chain authorities sequence.
  • randomness_source: VRF signature used to generate per-block randomness.

Given the seal_vrf_input constructed using the primary or secondary method, the randomness source signature is generated as follows:

    let randomness_vrf_input = CONCAT(
        BYTES("sassafras_randomness"),
        vrf_output(authority_secret_key, seal_vrf_input)
    );

    let randomness_source = vrf_sign(
        authority_secret_key,
        randomness_vrf_input,
        []
    );

    let claim = ClaimData {
        slot,
        authority_index,
        randomness_source
    };

    PUSH(block_header.digest, ENCODE(claim));

The ClaimData object is SCALE encoded and pushed as the second-to-last element of the header digest log.

6.8.4. Block Seal

A block is finally sealed as follows:

    let unsealed_header_bytes = ENCODE(block_header);

    let seal = vrf_sign(
        authority_secret_key,
        seal_vrf_input,
        unsealed_header_bytes
    );

    PUSH(block_header.digest, ENCODE(seal));

With block_header the block's header without the seal digest log entry.

The seal object is a VrfSignature, which is SCALE encoded and pushed as the last entry of the header digest log.

6.9. Slot Claim Verification

The last entry is extracted from the header digest log, and is SCALE decoded as a VrfSignature object. The unsealed header is then SCALE encoded in order to be verified.

The next entry is extracted from the header digest log, and is SCALE decoded as a ClaimData object.

The validity of the two signatures is assessed using the authority public key corresponding to the authority_index found in the ClaimData, together with the VRF input (which depends on the primary/secondary method) and the additional data used by the block author.

    let seal_signature = DECODE<VrfSignature>(POP(header.digest));
    let unsealed_header_bytes = ENCODE(header);
    let claim_data = DECODE<ClaimData>(POP(header.digest));

    let authority_public_key = GET(authorities, claim_data.authority_index);

    // Verify seal signature
    let result = vrf_verify(
        authority_public_key,
        seal_vrf_input,
        unsealed_header_bytes,
        seal_signature
    );
    ASSERT(result == 1);

    let randomness_vrf_input = CONCAT(
        BYTES("sassafras_randomness"),
        vrf_signed_output(seal_signature)
    );

    // Verify per-block entropy source signature
    let result = vrf_verify(
        authority_public_key,
        randomness_vrf_input,
        [],
        claim_data.randomness_source
    );
    ASSERT(result == 1);

With:

  • header: The block's header.
  • authorities: Sequence of authorities for the target epoch, as recorded on-chain.
  • seal_vrf_input: VRF input data constructed as specified in 6.8.

If signature verification succeeds, the verification process diverges based on whether the slot is associated with a ticket according to the on-chain state.

6.9.1. Primary Method

For slots tied to a ticket, the primary verification method is employed. This method verifies ticket ownership using the TicketId associated to the slot.

    let ticket_id = vrf_signed_output(seal_signature);
    ASSERT(ticket_id == expected_ticket_id);

With expected_ticket_id the ticket identifier committed on-chain in the associated TicketBody.

6.9.2. Secondary Method

If the slot doesn't have any associated ticket, then the authority_index contained in the ClaimData must match the one returned by the procedure outlined in section 6.8.2.

6.10. Randomness Accumulator

The randomness accumulator is updated using the randomness_source signature found within the ClaimData object. In particular, fresh randomness is derived and accumulated after block execution as follows:

    let fresh_randomness = vrf_signed_output(claim.randomness_source);
    randomness_buffer[0] = BLAKE2(CONCAT(randomness_buffer[0], fresh_randomness));

7. Drawbacks

None

8. Testing, Security, and Privacy

It is critical that implementations of this RFC undergo thorough and rigorous testing. A security audit may be desirable to ensure the implementation does not introduce emergent side effects.

9. Performance, Ergonomics, and Compatibility

9.1. Performance

Adopting Sassafras consensus significantly reduces the frequency of short-lived forks, which are eliminated by design.

Forks may only result from network disruption or protocol attacks. In such cases, the choice of which fork to follow upon recovery is clear-cut, with only one valid option.

9.2. Ergonomics

No specific considerations.

9.3. Compatibility

The adoption of Sassafras affects the native client and thus can't be introduced via a "simple" runtime upgrade.

A deployment strategy should be carefully engineered for live networks. This subject is left open for a dedicated RFC.

10. Prior Art and References

11. Unresolved Questions

None

12. Future Directions and Related Material

While this RFC lays the groundwork and outlines the core aspects of the protocol, several crucial topics remain to be addressed in future RFCs.

12.1. Interactions with On-Chain Code

  • Storage: Types, organization and genesis configuration.

  • Host interface: Interface that the hosting environment exposes to on-chain code (also known as host functions).

  • Unrecorded on-chain interface: Interface that on-chain code exposes to the hosting environment (also known as runtime API).

  • Transactional on-chain interface: Interface that on-chain code exposes to the world to alter the state (also known as transactions or extrinsics in the Polkadot ecosystem).

12.2. Deployment Strategies

  • Protocol Migration: Investigate how Sassafras can seamlessly replace an already operational instance of another protocol. Future RFCs may focus on deployment strategies to facilitate a smooth transition.

12.3. ZK-SNARK Parameters

  • Parameters Setup: Determine the setup procedure for the zk-SNARK SRS (Structured Reference String) initialization. Future RFCs may provide insights into whether this process should include an ad-hoc initialization ceremony or if we can reuse an SRS from another ecosystem (e.g. Zcash or Ethereum).

12.4. Anonymous Submission of Tickets

  • Mixnet Integration: Submitting tickets directly to the relay can pose a risk of deanonymization through traffic analysis. Subsequent RFCs may investigate the potential for incorporating a mix network protocol or other privacy-enhancing mechanisms to address this concern.


RFC-0032: Minimal Relay

Start Date: 20 September 2023
Description: Proposal to minimise Relay Chain functionality.
Authors: Joe Petrowski, Gavin Wood

Summary

The Relay Chain contains most of the core logic for the Polkadot network. While this was necessary prior to the launch of parachains and development of XCM, most of this logic can exist in parachains. This is a proposal to migrate several subsystems into system parachains.

Motivation

Polkadot's scaling approach allows many distinct state machines (known generally as parachains) to operate with common guarantees about the validity and security of their state transitions. Polkadot provides these common guarantees by executing the state transitions on a strict subset (a backing group) of the Relay Chain's validator set.

However, state transitions on the Relay Chain need to be executed by all validators. If any of those state transitions can occur on parachains, then the resources of the complement of a single backing group could be used to offer more cores. As in, they could be offering more coretime (a.k.a. blockspace) to the network.

By minimising state transition logic on the Relay Chain by migrating it into "system chains" -- a set of parachains that, with the Relay Chain, make up the Polkadot protocol -- the Polkadot Ubiquitous Computer can maximise its primary offering: secure blockspace.

Stakeholders

  • Parachains that interact with affected logic on the Relay Chain;
  • Core protocol and XCM format developers;
  • Tooling, block explorer, and UI developers.

Explanation

The following pallets and subsystems are good candidates to migrate from the Relay Chain:

  • Identity
  • Balances
  • Staking
    • Staking
    • Election Provider
    • Bags List
    • NIS
    • Nomination Pools
    • Fast Unstake
  • Governance
    • Treasury and Bounties
    • Conviction Voting
    • Referenda

Note: The Auctions and Crowdloan pallets will be replaced by Coretime, its system chain and interface described in RFC-1 and RFC-5, respectively.

Migrations

Some subsystems are simpler to move than others. For example, migrating Identity can be done by simply preventing state changes in the Relay Chain, using the Identity-related state as the genesis for a new chain, and launching that new chain with the genesis and logic (pallet) needed.

Other subsystems cannot experience any downtime like this because they are essential to the network's functioning, like Staking and Governance. However, these can likely coexist with a similarly-permissioned system chain for some time, much like how "Gov1" and "OpenGov" coexisted at the latter's introduction.

Specific migration plans will be included in release notes of runtimes from the Polkadot Fellowship when beginning the work of migrating a particular subsystem.

Interfaces

The Relay Chain, in many cases, will still need to interact with these subsystems, especially Staking and Governance. These subsystems will require making some APIs available either via dispatchable calls accessible to XCM Transact or possibly XCM Instructions in future versions.

For example, Staking provides a pallet-API to register points (e.g. for block production) and offences (e.g. equivocation). With Staking in a system chain, that chain would need to allow the Relay Chain to update validator points periodically so that it can correctly calculate rewards.

A pub-sub protocol may also lend itself to these types of interactions.

Functional Architecture

This RFC proposes that system chains form individual components within the system's architecture and that these components are chosen as functional groups. This approach allows synchronous composability where it is most valuable, but isolates logic in such a way that provides flexibility for optimal resource allocation (see Resource Allocation). For the subsystems discussed in this RFC, namely Identity, Governance, and Staking, this would mean:

  • People Chain, for identity and personhood logic, providing functionality related to the attributes of single actors;
  • Governance Chain, for governance and system collectives, providing functionality for pluralities to express their voices within the system;
  • Staking Chain, for Polkadot's staking system, including elections, nominations, reward distribution, slashing, and non-interactive staking; and
  • Asset Hub, for fungible and non-fungible assets, including DOT.

The Collectives chain and Asset Hub already exist, so implementation of this RFC would mean two new chains (People and Staking), with Governance moving to the currently-known-as Collectives chain and Asset Hub being increasingly used for DOT over the Relay Chain.

Note that one functional group will likely include many pallets, as we do not know how pallet configurations and interfaces will evolve over time.

Resource Allocation

The system should minimise wasted blockspace. These three (and other) subsystems may not each consistently require a dedicated core. However, core scheduling is far more agile than functional grouping. While migrating functionality from one chain to another can be a multi-month endeavour, cores can be rescheduled almost on-the-fly.

Migrations are also breaking changes to some use cases, for example other parachains that need to route XCM programs to particular chains. It is thus preferable to do them a single time in migrating off the Relay Chain, reducing the risk of needing parachain splits in the future.

Therefore, chain boundaries should be based on functional grouping where synchronous composability is most valuable, and efficient resource allocation should be managed by the core scheduling protocol.

Many of these system chains (including Asset Hub) could often share a single core in a semi-round robin fashion (the coretime may not be uniform). When needed, for example during NPoS elections or slashing events, the scheduler could allocate a dedicated core to the chain in need of more throughput.

Deployment

Actual migrations should happen based on some prioritization. This RFC proposes to migrate Identity, Staking, and Governance as the systems to work on first. A brief discussion on the factors involved in each one:

Identity

Identity will be one of the simpler pallets to migrate into a system chain, as its logic is largely self-contained and it does not "share" balances with other subsystems. That is, any DOT is held in reserve as a storage deposit and cannot be used for several purposes at once, the way locked DOT can.

Therefore, migration can take place as follows:

  1. The pallet can be put in a locked state, blocking most calls to the pallet and preventing updates to identity info.
  2. The frozen state will form the genesis of a new system parachain.
  3. Functions will be added to the pallet that allow migrating the deposit to the parachain. The parachain deposit is on the order of 1/100th of the Relay Chain's. Therefore, this will result in freeing up Relay State as well as most of each user's reserved balance.
  4. The pallet and any leftover state can be removed from the Relay Chain.

User interfaces that render Identity information will need to source their data from the new system parachain.

Note: In the future, it may make sense to decommission Kusama's Identity chain and do all account identities via Polkadot's. However, the Kusama chain will serve as a dress rehearsal for Polkadot.

Staking

Migrating the staking subsystem will likely be the most complex technical undertaking, as the Staking system cannot stop (the system MUST always have a validator set) nor run in parallel (the system MUST have only one validator set) and the subsystem itself is made up of subsystems in the runtime and the node. For example, if offences are reported to the Staking parachain, validator nodes will need to submit their reports there.

Handling balances also introduces complications. The same balance can be used for staking and governance. Ideally, all balances stay on Asset Hub, and only report "credits" to system chains like Staking and Governance. However, staking mutates balances by issuing new DOT on era changes and for rewards. Allowing DOT directly on the Staking parachain would simplify staking changes.

Given the complexity, it would be pragmatic to include the Balances pallet in the Staking parachain in its first version. Any other systems that use overlapping locks, most notably governance, will need to recognise DOT held on both Asset Hub and the Staking parachain.

There is more discussion about staking in a parachain in Moving Staking off the Relay Chain.

Governance

Migrating governance into a parachain will be less complicated than staking. Most of the primitives needed for the migration already exist. The Treasury supports spending assets on remote chains and collectives like the Polkadot Technical Fellowship already function in a parachain. That is, XCM already provides the ability to express system origins across chains.

Therefore, actually moving the governance logic into a parachain will be simple. It can run in parallel with the Relay Chain's governance, which can be removed when the parachain has demonstrated sufficient functionality. The Relay Chain may maintain a Root-level emergency track for situations like parachains halting.

The only complication arises from the fact that both Asset Hub and the Staking parachain will have DOT balances; therefore, the Governance chain will need to be able to credit users' voting power based on balances from both locations. This is not expected to be difficult to handle.

Kusama

Although Polkadot and Kusama both have system chains running, they have to date only been used for introducing new features or bodies, for example fungible assets or the Technical Fellowship. There has not yet been a migration of logic/state from the Relay Chain into a parachain. Given its more realistic network conditions than testnets, Kusama is the best stage for rehearsal.

In the case of identity, Polkadot's system may be sufficient for the ecosystem. Therefore, Kusama should be used to test the migration of logic and state from Relay Chain to parachain, but these features may be (at the will of Kusama's governance) dropped from Kusama entirely after a successful migration on Polkadot.

For Governance, Polkadot already has the Collectives parachain, which would become the Governance parachain. The entire group of DOT holders is itself a collective (the legislative body), and governance provides the means to express voice. Launching a Kusama Governance chain would be sensible to rehearse a migration.

The Staking subsystem is perhaps where Kusama would provide the most value in its canary capacity. Staking is the subsystem most constrained by PoV limits. Ensuring that elections, payouts, session changes, offences/slashes, etc. work in a parachain on Kusama -- with its larger validator set -- will give confidence to the chain's robustness on Polkadot.

Drawbacks

These subsystems will have fewer resources available in cores than on the Relay Chain. Staking in particular may require some optimizations to deal with these constraints.

Testing, Security, and Privacy

Standard audit/review requirements apply. More powerful multi-chain integration test tools would be useful in development.

Performance, Ergonomics, and Compatibility

Performance

This is an optimization. The removal of public/user transactions on the Relay Chain ensures that its primary resources are allocated to system performance.

Ergonomics

This proposal alters very little for coretime users (e.g. parachain developers). Application developers will need to interact with multiple chains, making ergonomic light client tools particularly important for application development.

Existing parachains that interact with these subsystems will need to configure their runtimes to recognize the new locations in the network.

Compatibility

Implementing this proposal will require some changes to pallet APIs and/or a pub-sub protocol. Application developers will need to interact with multiple chains in the network.

Prior Art and References

Unresolved Questions

There remain some implementation questions, like how to use balances for both Staking and Governance. See, for example, Moving Staking off the Relay Chain.

Ideally the Relay Chain becomes transactionless, such that not even balances are represented there. With Staking and Governance off the Relay Chain, this is not an unreasonable next step.

With Identity on Polkadot, Kusama may opt to drop its People Chain.


RFC-0042: Add System version that replaces StateVersion on RuntimeVersion

Start Date: 25th October 2023
Description: Add System Version and remove State Version
Authors: Vedhavyas Singareddi

Summary

At the moment, the state_version field on RuntimeVersion determines which state version is used for storage. We have a use case where we want the extrinsics root to be derived using StateVersion::V1. Rather than defining an additional field under RuntimeVersion, we propose replacing state_version with system_version, which can be used to derive both the storage and extrinsic state versions.

Motivation

Since the extrinsic state version is always StateVersion::V0, deriving the extrinsic root requires full extrinsic data. This is problematic when the extrinsics root must be verified and the extrinsics are large. This problem is further explored in https://github.com/polkadot-fellows/RFCs/issues/19

For the Subspace project, we have an enshrined rollup called Domain with optimistic verification, where Fraud proofs are used to detect malicious behavior. One of the Fraud proof variants derives the Domain block extrinsic root on Subspace's consensus chain. Since StateVersion::V0 requires full extrinsic data, we are forced to pass all the extrinsics through the Fraud proof. One of the main challenges here is that some extrinsics could be big enough that this variant of Fraud proof may not be included in the Consensus block due to the block's weight restriction. If the extrinsic root is derived using StateVersion::V1, then we do not need to pass the full extrinsic data but rather, at maximum, 32 bytes of extrinsic data.

Stakeholders

  • Technical Fellowship, in its role of maintaining system runtimes.

Explanation

To use a project-specific StateVersion for extrinsic roots, we previously proposed an implementation that introduced a parameter to frame_system::Config, but that unfortunately did not feel correct. We would therefore like to propose adding this change to the RuntimeVersion object. The system version, if introduced, will be used to derive both the storage and extrinsic state versions. If the system version is 0, then both the Storage and Extrinsic State versions would use V0. If the system version is 1, then the Storage State version would use V1 and the Extrinsic State version would use V0. If the system version is 2, then both the Storage and Extrinsic State versions would use V1.
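The mapping from system version to state versions can be sketched in Rust as follows. The `StateVersion` enum here is a local stand-in for illustration rather than the real Substrate type, `DerivedVersions` and `derive_state_versions` are hypothetical names, and returning `None` for values above 2 is an assumption, since the RFC does not say how such values are handled.

```rust
/// Local stand-in for the state version type, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StateVersion {
    V0,
    V1,
}

/// State versions derived from a single `system_version` value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DerivedVersions {
    storage: StateVersion,
    extrinsics: StateVersion,
}

/// Maps `system_version` to the storage and extrinsic state versions.
/// Values above 2 are not defined by the RFC; this sketch returns `None`.
fn derive_state_versions(system_version: u8) -> Option<DerivedVersions> {
    let (storage, extrinsics) = match system_version {
        0 => (StateVersion::V0, StateVersion::V0),
        1 => (StateVersion::V1, StateVersion::V0),
        2 => (StateVersion::V1, StateVersion::V1),
        _ => return None,
    };
    Some(DerivedVersions { storage, extrinsics })
}
```

Under this mapping, a chain that currently uses state version 0 or 1 keeps its behavior by setting system_version to the same value; only system_version 2 changes how the extrinsics root is derived.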

If implemented, the new RuntimeVersion definition would look something similar to

    /// Runtime version (Rococo).
    #[sp_version::runtime_version]
    pub const VERSION: RuntimeVersion = RuntimeVersion {
        spec_name: create_runtime_str!("rococo"),
        impl_name: create_runtime_str!("parity-rococo-v2.0"),
        authoring_version: 0,
        spec_version: 10020,
        impl_version: 0,
        apis: RUNTIME_API_VERSIONS,
        transaction_version: 22,
        system_version: 1,
    };

Drawbacks

There should be no drawbacks, as system_version replaces state_version with the same behavior, but documentation should be updated so that chains know which system_version to use.

Testing, Security, and Privacy

AFAIK, should not have any impact on the security or privacy.

Performance, Ergonomics, and Compatibility

These changes should be compatible with existing chains if they use their state_version value for system_version.

Performance

I do not believe there is any performance hit with this change.

Ergonomics

This does not break any exposed APIs.

Compatibility

This change should not break any compatibility.

Prior Art and References

We previously proposed a similar change that added a parameter to frame_system::Config, but did not feel that was the correct way to introduce it.

Unresolved Questions

I do not have any specific questions about this change at the moment.

IMO, this change is pretty self-contained and there won't be any future work necessary.


RFC-0043: Introduce storage_proof_size Host Function for Improved Parachain Block Utilization

Start Date: 30 October 2023
Description: Host function to provide the storage proof size to runtimes.
Authors: Sebastian Kunert

Summary

This RFC proposes a new host function for parachains, storage_proof_size. It shall provide the size of the currently recorded storage proof to the runtime. Runtime authors can use the proof size to improve block utilization by retroactively reclaiming unused storage weight.

Motivation

The number of extrinsics that are included in a parachain block is limited by two constraints: execution time and proof size. FRAME weights cover both concepts, and block-builders use them to decide how many extrinsics to include in a block. However, these weights are calculated ahead of time by benchmarking on a machine with reference hardware. The execution-time properties of the state-trie and its storage items are unknown at benchmarking time. Therefore, we make some assumptions about the state-trie:

  • Trie Depth: We assume a trie depth to account for intermediary nodes.
  • Storage Item Size: We make a pessimistic assumption based on the MaxEncodedLen trait.

These pessimistic assumptions lead to an overestimation of storage weight, negatively impacting block utilization on parachains.

In addition, the current model does not account for multiple accesses to the same storage items. While these repeated accesses do not increase the storage-proof size, the runtime-side weight monitoring counts them multiple times. Since the proof size is completely opaque to the runtime, we cannot implement retroactive storage weight correction.

A solution must provide a way for the runtime to track the exact storage-proof size consumed on a per-extrinsic basis.

Stakeholders

  • Parachain Teams: They MUST include this host function in their runtime and node.
  • Light-client Implementors: They SHOULD include this host function in their runtime and node.

Explanation

This RFC proposes a new host function that exposes the storage-proof size to the runtime. As a result, runtimes can implement storage weight reclaiming mechanisms that improve block utilization.

This RFC proposes the following host function signature:

    fn ext_storage_proof_size_version_1() -> u64;

The host function MUST return an unsigned 64-bit integer value representing the current proof size. In block-execution and block-import contexts, this function MUST return the current size of the proof. To achieve this, parachain node implementors need to enable proof recording for block imports. In other contexts, this function MUST return 18446744073709551615 (u64::MAX), which represents disabled proof recording.

Performance, Ergonomics, and Compatibility

Performance

Parachain nodes need to enable proof recording during block import to correctly implement the proposed host function. Benchmarking conducted with balance transfers has shown a performance reduction of around 0.6% when proof recording is enabled.

Ergonomics

The host function proposed in this RFC allows parachain runtime developers to keep track of the proof size. Typical usage patterns would be to keep track of the overall proof size or the difference between subsequent calls to the host function.
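The second usage pattern, taking the difference between subsequent calls, can be sketched in Rust as follows. The host function is modeled as a closure so the sketch stays self-contained; `measure_proof_size` and `reclaimable_proof_size` are hypothetical names, and in a real runtime the closure would wrap the proposed host function.

```rust
/// Value returned by the host function when proof recording is disabled.
const PROOF_RECORDING_DISABLED: u64 = u64::MAX;

/// Runs `extrinsic` and returns the proof size it consumed, measured as the
/// difference between two readings of `proof_size`. Returns `None` when
/// either reading reports that proof recording is disabled.
fn measure_proof_size(proof_size: impl Fn() -> u64, extrinsic: impl FnOnce()) -> Option<u64> {
    let before = proof_size();
    extrinsic();
    let after = proof_size();
    if before == PROOF_RECORDING_DISABLED || after == PROOF_RECORDING_DISABLED {
        return None;
    }
    Some(after.saturating_sub(before))
}

/// Proof size that can be handed back to the block: the benchmarked
/// (pessimistic) estimate minus what was actually measured, never negative.
fn reclaimable_proof_size(benchmarked: u64, measured: u64) -> u64 {
    benchmarked.saturating_sub(measured)
}
```

Checking for u64::MAX before subtracting matters because a disabled recorder would otherwise make the measured size look like zero and over-reclaim weight.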

Compatibility

Parachain teams will need to include this host function to upgrade.

Prior Art and References


RFC-0045: Lowering NFT Deposits on Asset Hub

Start Date2 November 2023
DescriptionA proposal to reduce the minimum deposit required for collection creation on the Polkadot and Kusama Asset Hubs.
AuthorsAurora Poppyseed, Just_Luuuu, Viki Val, Joe Petrowski

Summary

This RFC proposes lowering the current deposit requirements on the Polkadot and Kusama Asset Hubs for creating an NFT collection and minting an individual NFT, along with the corresponding metadata and attribute deposits. The objective is to lower the barrier to entry for NFT creators, fostering a more inclusive and vibrant ecosystem while maintaining network integrity and preventing spam.

Motivation

The current deposit of 10 DOT for collection creation (along with 0.01 DOT for item deposit and 0.2 DOT for metadata and attribute deposits) on the Polkadot Asset Hub and 0.1 KSM on Kusama Asset Hub presents a significant financial barrier for many NFT creators. By lowering the deposit requirements, we aim to encourage more NFT creators to participate in the Polkadot NFT ecosystem, thereby enriching the diversity and vibrancy of the community and its offerings.

The initial introduction of a 10 DOT deposit was an arbitrary starting point that does not consider the actual storage footprint of an NFT collection. This proposal aims to adjust the deposit first to a value based on the deposit function, which calculates a deposit based on the number of keys introduced to storage and the size of corresponding values stored.

Further, it suggests a direction for a future of calculating deposits variably based on adoption and/or market conditions. There is a discussion on tradeoffs of setting deposits too high or too low.

Requirements

  • Deposits SHOULD be derived from the deposit function, adjusted by the corresponding pricing mechanism.

Stakeholders

  • NFT Creators: Primary beneficiaries of the proposed change, particularly those who found the current deposit requirements prohibitive.
  • NFT Platforms: As the facilitator of artists' relations, NFT marketplaces have a vested interest in onboarding new users and making their platforms more accessible.
  • dApp Developers: Making the blockspace more accessible will encourage developers to create and build unique dApps in the Polkadot ecosystem.
  • Polkadot Community: Stands to benefit from an influx of artists, creators, and diverse NFT collections, enhancing the overall ecosystem.

Previous discussions have been held within the Polkadot Forum, with artists expressing their concerns about the deposit amounts.

Explanation

This RFC proposes a revision of the deposit constants in the configuration of the NFTs pallet on the Polkadot Asset Hub. The new deposit amounts would be determined by a standard deposit formula.

As of v1.1.1, the Collection Deposit is 10 DOT and the Item Deposit is 0.01 DOT (see here).

Based on the storage footprint of these items, this RFC proposes changing them to:

pub const NftsCollectionDeposit: Balance = system_para_deposit(1, 130);
pub const NftsItemDeposit: Balance = system_para_deposit(1, 164);

This results in the following deposits (calculated using this repository):

Polkadot

| Name | Current Rate (DOT) | Calculated with Function (DOT) |
|---|---|---|
| collectionDeposit | 10 | 0.20064 |
| itemDeposit | 0.01 | 0.20081 |
| metadataDepositBase | 0.20129 | 0.20076 |
| attributeDepositBase | 0.2 | 0.2 |

Similarly, the prices for Kusama were calculated as:

Kusama:

| Name | Current Rate (KSM) | Calculated with Function (KSM) |
|---|---|---|
| collectionDeposit | 0.1 | 0.006688 |
| itemDeposit | 0.001 | 0.000167 |
| metadataDepositBase | 0.006709666617 | 0.0006709666617 |
| attributeDepositBase | 0.00666666666 | 0.000666666666 |

Enhanced Approach to Further Lower Barriers for Entry

This RFC proposes further lowering these deposits below the rate normally charged for such a storage footprint. This is based on the economic argument that sub-rate deposits are a subsidy for growth and adoption of a specific technology. If the NFT functionality on Polkadot gains adoption, it becomes more attractive for future entrants, who would be willing to pay the non-subsidized rate because of the existing community.

Proposed Rate Adjustments

parameter_types! {
	pub const NftsCollectionDeposit: Balance = system_para_deposit(1, 130);
	pub const NftsItemDeposit: Balance = system_para_deposit(1, 164) / 40;
	pub const NftsMetadataDepositBase: Balance = system_para_deposit(1, 129) / 10;
	pub const NftsAttributeDepositBase: Balance = system_para_deposit(1, 0) / 10;
	pub const NftsDepositPerByte: Balance = system_para_deposit(0, 1);
}

This adjustment would result in the following DOT and KSM deposit values:

| Name | Proposed Rate Polkadot | Proposed Rate Kusama |
|---|---|---|
| collectionDeposit | 0.20064 DOT | 0.006688 KSM |
| itemDeposit | 0.005 DOT | 0.000167 KSM |
| metadataDepositBase | 0.002 DOT | 0.0006709666617 KSM |
| attributeDepositBase | 0.002 DOT | 0.000666666666 KSM |

Short- and Long-Term Plans

The plan presented above is recommended as an immediate step to make Polkadot a more attractive place to launch NFTs, although it should be noted that a fortyfold reduction in the Item Deposit is just as arbitrary as the value it replaces. As explained earlier, this is meant as a subsidy to gain more momentum for NFTs on Polkadot.

In the long term, an implementation should account for what should happen to the deposit rates assuming that the subsidy is successful and attracts a lot of deployments. Many options are discussed in the Addendum.

The deposit should be calculated as a function of the number of existing collections with maximum DOT and stablecoin values limiting the amount. With asset rates available via the Asset Conversion pallet, the system could take the lower value required. A sigmoid curve would make sense for this application to avoid sudden rate changes, as in:

$$ \mathrm{minDeposit} + \frac{\min(\mathrm{DotDeposit}, \mathrm{StableDeposit}) - \mathrm{minDeposit}}{1 + e^{a - b \cdot x}} $$

where the constant a moves the inflection to lower or higher x values, the constant b adjusts the rate of the deposit increase, and the independent variable x is the number of collections or items, depending on application.
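To make the shape of this curve concrete, here is a minimal sketch of the formula. It is an illustration only: the function name `sigmoid_deposit` and its parameter names are hypothetical, and it uses `f64` for readability, whereas a runtime implementation would likely need fixed-point arithmetic.

```rust
/// Evaluates the sigmoid deposit curve:
/// min_deposit + (min(dot_deposit, stable_deposit) - min_deposit) / (1 + e^(a - b * x)).
///
/// `x` is the number of existing collections or items, `a` shifts the
/// inflection point, and `b` controls how quickly the deposit rises.
/// Assumption: all names here are illustrative, not taken from any pallet.
pub fn sigmoid_deposit(
    min_deposit: f64,
    dot_deposit: f64,
    stable_deposit: f64,
    a: f64,
    b: f64,
    x: f64,
) -> f64 {
    // The upper limit is the lower of the DOT-denominated and stablecoin-denominated caps.
    let cap = dot_deposit.min(stable_deposit);
    min_deposit + (cap - min_deposit) / (1.0 + (a - b * x).exp())
}
```

At the inflection point, where `a - b * x = 0`, the deposit sits exactly halfway between `min_deposit` and the cap; for small `x` it stays close to `min_deposit`, and for large `x` it approaches the cap.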

Drawbacks

Modifying deposit requirements necessitates a balanced assessment of the potential drawbacks. Highlighted below are cogent points extracted from the discourse on the Polkadot Forum conversation, which provide critical perspectives on the implications of such changes.

Adjusting NFT deposit requirements on Polkadot and Kusama Asset Hubs involves key challenges:

  1. State Growth and Technical Concerns: Lowering deposit requirements can lead to increased blockchain state size, potentially causing state bloat. This growth needs to be managed to prevent strain on the network's resources and maintain operational efficiency. As stated earlier, the deposit levels proposed here are intentionally low with the thesis that future participants would pay the standard rate.

  2. Network Security and Market Response: Adapting to the cryptocurrency market's volatility is crucial. The mechanism for setting deposit amounts must be responsive yet stable, avoiding undue complexity for users.

  3. Economic Impact on Previous Stakeholders: The change could have varied economic effects on creators, platform operators, and investors who were active before the change. Balancing these interests is essential to ensure the adjustment benefits the ecosystem without negatively impacting its value dynamics. However, in the particular case of the Polkadot and Kusama Asset Hubs this does not pose a concern, since there are very few collections currently and previous stakeholders would therefore not be much affected. As of 9 January 2024, there are 42 collections on Polkadot Asset Hub and 191 on Kusama Asset Hub, with a relatively low volume.

Testing, Security, and Privacy

Security concerns

As noted above, state bloat is a security concern. In the case of abuse, governance could adapt by increasing deposit rates and/or using forceDestroy on collections agreed to be spam.

Performance, Ergonomics, and Compatibility

Performance

The primary performance consideration stems from the potential for state bloat due to increased activity from lower deposit requirements. It's vital to monitor and manage this to avoid any negative impact on the chain's performance. Strategies for mitigating state bloat, including efficient data management and periodic reviews of storage requirements, will be essential.

Ergonomics

The proposed change aims to enhance the user experience for artists, traders, and other users of the Kusama and Polkadot Asset Hubs, making Polkadot and Kusama more accessible and user-friendly.

Compatibility

The change does not impact compatibility as a redeposit function is already implemented.

Unresolved Questions

If this RFC is accepted, there should not be any unresolved questions regarding how to adapt the implementation of deposits for NFT collections.

Addendum

Several innovative proposals have been considered to enhance the network's adaptability and manage deposit requirements more effectively. The RFC recommends a mixture of the function-based model and the stablecoin model, but some tradeoffs of each are maintained here for those interested.

Enhanced Weak Governance Origin Model

The concept of a weak governance origin, controlled by a consortium like a system collective, has been proposed. This model would allow for dynamic adjustments of NFT deposit requirements in response to market conditions, adhering to storage deposit norms.

  • Responsiveness: To address concerns about delayed responses, the model could incorporate automated triggers based on predefined market indicators, ensuring timely adjustments.
  • Stability vs. Flexibility: Balancing stability with the need for flexibility is challenging. To mitigate the issue of frequent changes in DOT-based deposits, a mechanism for gradual and predictable adjustments could be introduced.
  • Scalability: The model's scalability is a concern, given the numerous deposits across the system. A more centralized approach to deposit management might be needed to avoid constant, decentralized adjustments.

Function-Based Pricing Model

Another proposal is to use a mathematical function to regulate deposit prices, initially allowing low prices to encourage participation, followed by a gradual increase to prevent network bloat.

  • Choice of Function: A logarithmic or sigmoid function is favored over an exponential one, as these functions increase prices at a rate that encourages participation while preventing prohibitive costs.
  • Adjustment of Constants: To finely tune the pricing rise, one of the function's constants could correlate with the total number of NFTs on Asset Hub. This would align the deposit requirements with the actual usage and growth of the network.

Linking Deposit to USD(x) Value

This approach suggests pegging the deposit value to a stable currency like the USD, introducing predictability and stability for network users.

  • Market Dynamics: One perspective is that fluctuations in native currency value naturally balance user participation and pricing, deterring network spam while encouraging higher-value collections. Conversely, there's an argument for allowing broader participation if the DOT/KSM value increases.
  • Complexity and Risks: Implementing a USD-based pricing system could add complexity and potential risks. The implementation needs to be carefully designed to avoid unintended consequences, such as excessive reliance on external financial systems or currencies.

Each of these proposals offers unique advantages and challenges. The optimal approach may involve a combination of these ideas, carefully adjusted to address the specific needs and dynamics of the Polkadot and Kusama networks.


RFC-0047: Assignment of availability chunks to validators

Start Date03 November 2023
DescriptionAn evenly-distributing indirection layer between availability chunks and validators.
AuthorsAlin Dima

Summary

Propose a way of permuting the availability chunk indices assigned to validators, in the context of recovering available data from systematic chunks, with the purpose of fairly distributing network bandwidth usage.

Motivation

Currently, the ValidatorIndex is always identical to the ChunkIndex. Since the validator array is only shuffled once per session, naively using the ValidatorIndex as the ChunkIndex would pose an unreasonable stress on the first N/3 validators during an entire session, when favouring availability recovery from systematic chunks.

Therefore, the relay chain node needs a deterministic way of evenly distributing the first ~(N_VALIDATORS / 3) systematic availability chunks to different validators, based on the relay chain block and core. The main purpose is to ensure fair distribution of network bandwidth usage for availability recovery in general and in particular for systematic chunk holders.

Stakeholders

Relay chain node core developers.

Explanation

Systematic erasure codes

An erasure coding algorithm is considered systematic if it preserves the original unencoded data as part of the resulting code. The implementation of the erasure coding algorithm used for polkadot's availability data is systematic. Roughly speaking, the first N_VALIDATORS/3 chunks of data can be cheaply concatenated to retrieve the original data, without running the resource-intensive and time-consuming reconstruction algorithm.

You can find the concatenation procedure of systematic chunks for polkadot's erasure coding algorithm here

In a nutshell, it performs a column-wise concatenation with 2-byte chunks. The output could be zero-padded at the end, so scale decoding must be aware of the expected length in bytes and ignore trailing zeros (this assertion is already being made for regular reconstruction).
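The column-wise concatenation described above can be sketched as follows. This is one reading of the description and not the actual implementation: the function name `concat_systematic_chunks` is hypothetical, and the sketch assumes that all systematic chunks are passed in chunk-index order and have the same, even length.

```rust
/// Concatenates systematic chunks column-wise in 2-byte units: first the
/// first 2 bytes of every chunk, then the next 2 bytes of every chunk, etc.
///
/// Returns `None` if there are no chunks, or if the chunks differ in length
/// or have an odd length. Trailing zero padding is NOT removed here; the
/// caller must know the expected decoded length.
pub fn concat_systematic_chunks(chunks: &[Vec<u8>]) -> Option<Vec<u8>> {
    let shard_len = chunks.first()?.len();
    if shard_len % 2 != 0 || chunks.iter().any(|c| c.len() != shard_len) {
        return None;
    }
    let mut out = Vec::with_capacity(shard_len * chunks.len());
    for offset in (0..shard_len).step_by(2) {
        for chunk in chunks {
            out.extend_from_slice(&chunk[offset..offset + 2]);
        }
    }
    Some(out)
}
```

For two chunks `[1, 2, 3, 4]` and `[5, 6, 7, 8]`, the result is `[1, 2, 5, 6, 3, 4, 7, 8]`.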

Availability recovery at present

According to the polkadot protocol spec:

A validator should request chunks by picking peers randomly and must recover at least f+1 chunks, where n=3f+k and k in {1,2,3}.

For parity's polkadot node implementation, the process was further optimised. At this moment, it works differently based on the estimated size of the available data:

(a) for small PoVs (up to 128 KiB), sequentially try requesting the unencoded data from the backing group, in a random order. If this fails, fall back to option (b).

(b) for large PoVs (over 128 KiB), launch N parallel requests for the erasure coded chunks (currently, N has an upper limit of 50), until enough chunks are recovered. Validators are tried in a random order. Then, reconstruct the original data.

All options require that after reconstruction, validators then re-encode the data and re-create the erasure chunks trie in order to check the erasure root.

Availability recovery from systematic chunks

As part of the effort of increasing polkadot's resource efficiency, scalability and performance, work is under way to modify the Availability Recovery protocol by leveraging systematic chunks. See this comment for preliminary performance results.

In this scheme, the relay chain node will first attempt to retrieve the ~N/3 systematic chunks from the validators that should hold them, before falling back to recovering from regular chunks, as before.

A re-encoding step is still needed for verifying the erasure root, so the erasure coding overhead cannot be completely brought down to 0.

Not being able to retrieve even one systematic chunk would make systematic reconstruction impossible. Therefore, backers can be used as a backup to retrieve a couple of missing systematic chunks, before falling back to retrieving regular chunks.

Chunk assignment function

Properties

The function that decides the chunk index for a validator will be parameterized by at least (validator_index, core_index) and have the following properties:

  1. deterministic
  2. relatively quick to compute and resource-efficient.
  3. when considering a fixed core_index, the function should describe a permutation of the chunk indices
  4. the validators that map to the first N/3 chunk indices should have as little overlap as possible for different cores.

In other words, we want a uniformly distributed, deterministic mapping from ValidatorIndex to ChunkIndex per core.

It's desirable to not embed this function in the runtime, for performance and complexity reasons. However, this means that the function needs to be kept very simple and with minimal or no external dependencies. Any change to this function could result in parachains being stalled and needs to be coordinated via a runtime upgrade or governance call.

Proposed function

Pseudocode:

pub fn get_chunk_index(
  n_validators: u32,
  validator_index: ValidatorIndex,
  core_index: CoreIndex
) -> ChunkIndex {
  let threshold = systematic_threshold(n_validators); // Roughly n_validators/3
  let core_start_pos = core_index * threshold;

  (core_start_pos + validator_index) % n_validators
}
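A self-contained version of this pseudocode, together with a check of the permutation property, might look like the sketch below. The newtype definitions, the body of `systematic_threshold`, and the helper `is_permutation_for_core` are assumptions made for illustration; the RFC only states that the threshold is roughly `n_validators / 3`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ValidatorIndex(pub u32);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CoreIndex(pub u32);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ChunkIndex(pub u32);

/// Assumed stand-in for the real threshold computation (roughly n_validators / 3).
pub fn systematic_threshold(n_validators: u32) -> u32 {
    n_validators.saturating_sub(1) / 3 + 1
}

/// Maps a validator to a chunk index for the given core.
/// `n_validators` must be non-zero.
pub fn get_chunk_index(
    n_validators: u32,
    validator_index: ValidatorIndex,
    core_index: CoreIndex,
) -> ChunkIndex {
    // Widen to u64 so that `core_index * threshold` cannot overflow.
    let n = u64::from(n_validators);
    let threshold = u64::from(systematic_threshold(n_validators));
    let core_start_pos = u64::from(core_index.0) * threshold;
    ChunkIndex(((core_start_pos + u64::from(validator_index.0)) % n) as u32)
}

/// Checks that, for a fixed core, the mapping is a permutation of 0..n_validators.
pub fn is_permutation_for_core(n_validators: u32, core_index: CoreIndex) -> bool {
    let mut seen = vec![false; n_validators as usize];
    for v in 0..n_validators {
        let ChunkIndex(c) = get_chunk_index(n_validators, ValidatorIndex(v), core_index);
        if seen[c as usize] {
            return false;
        }
        seen[c as usize] = true;
    }
    true
}
```

With 10 validators and the assumed threshold of 4, core 0 maps validator `v` to chunk `v`, while core 1 maps validator 0 to chunk 4 and validator 7 to chunk 1, so the systematic chunks of different cores land on different validators.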

Network protocol

The request-response /req_chunk protocol will be bumped to a new version (from v1 to v2). For v1, the request and response payloads are:

/// Request an availability chunk.
pub struct ChunkFetchingRequest {
	/// Hash of candidate we want a chunk for.
	pub candidate_hash: CandidateHash,
	/// The index of the chunk to fetch.
	pub index: ValidatorIndex,
}

/// Receive a requested erasure chunk.
pub enum ChunkFetchingResponse {
	/// The requested chunk data.
	Chunk(ChunkResponse),
	/// Node was not in possession of the requested chunk.
	NoSuchChunk,
}

/// This omits the chunk's index because it is already known by
/// the requester and by not transmitting it, we ensure the requester is going to use his index
/// value for validating the response, thus making sure he got what he requested.
pub struct ChunkResponse {
	/// The erasure-encoded chunk of data belonging to the candidate block.
	pub chunk: Vec<u8>,
	/// Proof for this chunk's branch in the Merkle tree.
	pub proof: Proof,
}

Version 2 will add an index field to ChunkResponse:

#[derive(Debug, Clone, Encode, Decode)]
pub struct ChunkResponse {
	/// The erasure-encoded chunk of data belonging to the candidate block.
	pub chunk: Vec<u8>,
	/// Proof for this chunk's branch in the Merkle tree.
	pub proof: Proof,
	/// Chunk index.
	pub index: ChunkIndex,
}

An important thing to note is that in version 1, the ValidatorIndex value is always equal to the ChunkIndex. Until the chunk rotation feature is enabled, this will also be true for version 2. However, after the feature is enabled, this will generally not be true.

The requester will send the request to the validator with index V. The responder will map the V validator index to the C chunk index and respond with the C-th chunk. This mapping can be seamless, by having each validator store their chunk by ValidatorIndex (just as before).

The protocol implementation MAY check the returned ChunkIndex against the expected mapping to ensure that it received the right chunk. In practice, this is desirable during availability-distribution and systematic chunk recovery. However, regular recovery may not check this index, which is particularly useful when participating in disputes that don't allow for easy access to the validator->chunk mapping. See Appendix A for more details.

In any case, the requester MUST verify the chunk's proof using the provided index.

During availability-recovery, given that the requester may not know (if the mapping is not available) whether the received chunk corresponds to the requested validator index, it has to keep track of received chunk indices and ignore duplicates. Such duplicates should be considered the same as an invalid/garbage response (drop it and move on to the next validator - we can't punish via reputation changes, because we don't know which validator misbehaved).
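The duplicate handling described above could be sketched as follows. This is an illustration, not the node's actual data structure: the type `ReceivedChunks` and its methods are hypothetical, and the chunk index is represented as a plain `u32`. The sketch assumes the chunk's proof has already been verified against the returned index.

```rust
use std::collections::HashSet;

/// Tracks which chunk indices have already been received during a recovery.
#[derive(Default)]
pub struct ReceivedChunks {
    seen: HashSet<u32>,
}

impl ReceivedChunks {
    /// Records a received chunk index.
    /// Returns `true` if the index is new and the chunk should be kept, or
    /// `false` if it is a duplicate, in which case the response is dropped
    /// like any other invalid response and the next validator is tried.
    pub fn accept(&mut self, chunk_index: u32) -> bool {
        self.seen.insert(chunk_index)
    }

    /// Number of distinct chunks received so far.
    pub fn count(&self) -> usize {
        self.seen.len()
    }
}
```

Recovery would continue requesting chunks until `count()` reaches the recovery threshold.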

Upgrade path

Step 1: Enabling new network protocol

In the beginning, both /req_chunk/1 and /req_chunk/2 will be supported, until all validators and collators have upgraded to use the new version. V1 will be considered deprecated. During this step, the mapping will still be 1:1 (ValidatorIndex == ChunkIndex), regardless of protocol. Once all nodes are upgraded, a new release will be cut that removes the v1 protocol. Only once all nodes have upgraded to this version will step 2 commence.

Step 2: Enabling the new validator->chunk mapping

Considering that the Validator->Chunk mapping is critical to para consensus, the change needs to be enacted atomically via governance, only after all validators have upgraded the node to a version that is aware of this mapping, functionality-wise. It needs to be explicitly stated that after the governance enactment, validators that run older client versions that don't support this mapping will not be able to participate in parachain consensus.

Additionally, an error will be logged when starting a validator with an older version, after the feature was enabled.

On the other hand, collators will not be required to upgrade in this step (but are still required to upgrade for step 1), as regular chunk recovery will work as before, granted that version 1 of the networking protocol has been removed. Note that collators only perform availability-recovery in rare, adversarial scenarios, so it is fine to not optimise for this case and let them upgrade at their own pace.

To support enabling this feature via the runtime, we will use the NodeFeatures bitfield of the HostConfiguration struct (added in https://github.com/paritytech/polkadot-sdk/pull/2177). Adding and enabling a feature with this scheme does not require a runtime upgrade, but only a referendum that issues a Configuration::set_node_feature extrinsic. Once the feature is enabled and new configuration is live, the validator->chunk mapping ceases to be a 1:1 mapping and systematic recovery may begin.

Drawbacks

  • Getting access to the core_index that used to be occupied by a candidate in some parts of the dispute protocol is very complicated (see Appendix A). This RFC assumes that availability-recovery processes initiated during disputes will only use regular recovery, as before. This is acceptable since disputes are rare occurrences in practice, and it can be optimised later if need be. Adding the core_index to the CandidateReceipt would mitigate this problem and will likely be needed in the future for CoreJam and/or Elastic scaling. Related discussion about updating CandidateReceipt
  • It's a breaking change that requires all validators and collators to upgrade their node version at least once.

Testing, Security, and Privacy

Extensive testing will be conducted - both automated and manual. This proposal doesn't affect security or privacy.

Performance, Ergonomics, and Compatibility

Performance

This is a necessary data availability optimisation, as Reed-Solomon erasure coding has proven to be a top consumer of CPU time in polkadot as we scale up the parachain block size and number of availability cores.

With this optimisation, preliminary performance results show that CPU time used for Reed-Solomon coding/decoding can be halved and total PoV recovery time decreased by 80% for large PoVs. See more here.

Ergonomics

Not applicable.

Compatibility

This is a breaking change. See upgrade path section above. All validators and collators need to have upgraded their node versions before the feature will be enabled via a governance call.

Prior Art and References

See comments on the tracking issue and the in-progress PR

Unresolved Questions

Not applicable.

Future Directions and Related Material

This enables future optimisations for the performance of availability recovery, such as retrieving batched systematic chunks from backers/approval-checkers.

Appendix A

This appendix details the intricacies of getting access to the core index of a candidate in parity's polkadot node.

Here, core_index refers to the index of the core that a candidate was occupying while it was pending availability (from backing to inclusion).

Availability-recovery can currently be triggered by the following phases in the polkadot protocol:

  1. During the approval voting process.
  2. By other collators of the same parachain.
  3. During disputes.

Getting the right core index for a candidate can be troublesome. Here's a breakdown of how different parts of the node implementation can get access to it:

  1. The approval-voting process for a candidate begins after observing that the candidate was included. Therefore, the node has easy access to the block where the candidate got included (and also the core that it occupied).

  2. The pov_recovery task of the collators starts availability recovery in response to noticing a candidate getting backed, which enables easy access to the core index the candidate started occupying.

  3. Disputes may be initiated on a number of occasions:

    3.a. The dispute is initiated by the validator as a result of finding an invalid candidate while participating in the approval-voting protocol. In this case, availability-recovery is not needed, since the validator already issued their vote.

    3.b. The dispute is initiated by the validator noticing dispute votes recorded on-chain. In this case, we can safely assume that the backing event for that candidate has been recorded and kept in memory.

    3.c. The dispute is initiated as a result of getting a dispute statement from another validator. It is possible that the dispute is happening on a fork that was not yet imported by this validator, so the subsystem may not have seen this candidate being backed.

A naive attempt of solving 3.c would be to add a new version for the disputes request-response networking protocol. Blindly passing the core index in the network payload would not work, since there is no way of validating that the reported core_index was indeed the one occupied by the candidate at the respective relay parent.

Another attempt could be to include in the message the relay block hash where the candidate was included. This information would be used in order to query the runtime API and retrieve the core index that the candidate was occupying. However, considering it's part of an unimported fork, the validator cannot call a runtime API on that block.

Adding the core_index to the CandidateReceipt would solve this problem and would enable systematic recovery for all dispute scenarios.


RFC-0048: Generate ownership proof for SessionKeys

Start Date13 November 2023
DescriptionChange SessionKeys runtime api to support generating an ownership proof for the on chain registration.
AuthorsBastian Köcher

Summary

This RFC proposes to change the SessionKeys::generate_session_keys runtime api interface. This runtime api is used by validator operators to generate new session keys on a node. The public session keys are then registered manually on chain by the validator operator. Before this RFC, the on chain logic could not ensure that the account setting the public session keys also possesses the private session keys. To solve this, the RFC proposes to pass the account id of the account doing the registration on chain to generate_session_keys. Further, this RFC proposes to change the return value of generate_session_keys so that it returns not only the public session keys, but also the proof of ownership for the private session keys. The validator operator then needs to send the public session keys and the proof together when registering new session keys on chain.

Motivation

When submitting the new public session keys to the on chain logic there doesn't exist any verification of possession of the private session keys. This means that users can basically register any kind of public session keys on chain. While the on chain logic ensures that there are no duplicate keys, someone could try to prevent others from registering new session keys by setting them first. While this wouldn't bring the "attacker" any kind of advantage, more like disadvantages (potential slashes on their account), it could prevent someone from e.g. changing its session key in the event of a private session key leak.

After this RFC, this kind of attack would no longer be possible, because the on chain logic can verify that the sending account is in possession of the private session keys.

Stakeholders

  • Polkadot runtime implementors
  • Polkadot node implementors
  • Validator operators

Explanation

We are first going to explain the proof format being used:

type Proof = (Signature, Signature, ..);

The proof is a SCALE encoded tuple of signatures, one per private session key, each signing the account_id. The actual type of each signature depends on the corresponding session key cryptographic algorithm. The order of the signatures in the proof is the same as the order of the session keys in the SessionKeys type declared in the runtime.

The version of the SessionKeys needs to be bumped to 1 to reflect the changes to the signature of SessionKeys_generate_session_keys:

pub struct OpaqueGeneratedSessionKeys {
	pub keys: Vec<u8>,
	pub proof: Vec<u8>,
}

fn SessionKeys_generate_session_keys(account_id: Vec<u8>, seed: Option<Vec<u8>>) -> OpaqueGeneratedSessionKeys;

The default calling convention for runtime apis is applied, meaning the parameters are passed as a SCALE encoded array together with the length of the encoded array. The return value is the SCALE encoded return value, returned as a u64 (array_ptr | length << 32). So, the actual exported function signature looks like:

fn SessionKeys_generate_session_keys(array: *const u8, len: usize) -> u64;
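The packing of the return value as `array_ptr | length << 32` can be illustrated with the following sketch. The helper names `pack_ptr_and_len` and `unpack_ptr_and_len` are hypothetical, and the sketch assumes 32-bit pointers and lengths, as in a 32-bit Wasm runtime.

```rust
/// Packs a 32-bit pointer and a 32-bit length into one u64:
/// the pointer occupies the low 32 bits and the length the high 32 bits.
pub fn pack_ptr_and_len(ptr: u32, len: u32) -> u64 {
    u64::from(ptr) | (u64::from(len) << 32)
}

/// Splits a packed u64 back into `(ptr, len)`.
pub fn unpack_ptr_and_len(packed: u64) -> (u32, u32) {
    ((packed & 0xFFFF_FFFF) as u32, (packed >> 32) as u32)
}
```

The caller unpacks the returned u64, reads `len` bytes starting at `ptr` from the runtime's memory, and SCALE decodes them into the return type.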

The on chain logic for setting the SessionKeys needs to be changed as well. It already gets the proof passed as Vec<u8>. This proof needs to be decoded to the actual Proof type as explained above. The proof and the SCALE encoded account_id of the sender are used to verify the ownership of the SessionKeys.

Drawbacks

Validator operators need to pass their account id when rotating their session keys in a node. This will require updating some high level docs and making users familiar with the slightly changed ergonomics.

Testing, Security, and Privacy

Testing of the new changes only requires passing an appropriate owner for the current testing context. The changes to the proof generation and verification got audited to ensure they are correct.

Performance, Ergonomics, and Compatibility

Performance

The session key generation is an offchain process and thus doesn't influence the performance of the chain. Verifying the proof is done on chain as part of the transaction logic for setting the session keys. Verifying the proof requires one signature verification per individual session key. As setting the session keys happens quite rarely, it should not influence the overall system performance.

Ergonomics

The interfaces have been optimized to make it as easy as possible to generate the ownership proof.

Compatibility

This RFC introduces a new version of the SessionKeys runtime api. Thus, nodes should be updated before a runtime containing these changes is enacted; otherwise they will fail to generate session keys. The RPC that exists around this runtime api needs to be updated to support passing the account id and returning the ownership proof alongside the public session keys.

UIs would need to be updated to support the new RPC and the changed on chain logic.

Prior Art and References

None.

Unresolved Questions

None.

Future Directions and Related Material

Substrate implementation of the RFC.


RFC-0050: Fellowship Salaries

Start Date15 November 2023
DescriptionProposal to set rank-based Fellowship salary levels.
AuthorsJoe Petrowski, Gavin Wood

Summary

The Fellowship Manifesto states that members should receive a monthly allowance on par with gross income in OECD countries. This RFC proposes concrete amounts.

Motivation

One motivation for the Technical Fellowship is to provide an incentive mechanism that can induct and retain technical talent for the continued progress of the network.

In order for members to uphold their commitment to the network, they should receive support to ensure that their needs are met such that they have the time to dedicate to their work on Polkadot. Given the high expectations of Fellows, it is reasonable to consider contributions and requirements on par with a full-time job. Providing a livable wage to those making such contributions makes it pragmatic to work full-time on Polkadot.

Note: Goals of the Fellowship, expectations for each Dan, and conditions for promotion and demotion are all explained in the Manifesto. This RFC is only to propose concrete values for allowances.

Stakeholders

  • Fellowship members
  • Polkadot Treasury

Explanation

This RFC proposes agreeing on salaries relative to a single level, the III Dan. As such, changes to the amount or asset used would only be on a single value, and all others would adjust relatively. A III Dan is someone whose contributions match the expectations of a full-time individual contributor. The salary at this level should be reasonably close to averages in OECD countries.

Dan     Factor
I       0.125
II      0.25
III     1
IV      1.5
V       2.0
VI      2.5
VII     2.5
VIII    2.5
IX      2.5

Note that there is a sizable increase between II Dan (Proficient) and III Dan (Fellow). By the third Dan, it is generally expected that one is working on Polkadot as their primary focus in a full-time capacity.

Salary Asset

Although the Manifesto (Section 8) specifies a monthly allowance in DOT, this RFC proposes the use of USDT instead. The allowance is meant to provide members stability in meeting their day-to-day needs and recognize contributions. Using USDT provides more stability and less speculation.

This RFC proposes that a III Dan earn 80,000 USDT per year. The salary at this level is commensurate with average salaries in OECD countries (note: 77,000 USD in the U.S., with an average engineer at 100,000 USD). The other ranks would thus earn:

Dan     Annual Salary
I       10,000
II      20,000
III     80,000
IV      120,000
V       160,000
VI      200,000
VII     200,000
VIII    200,000
IX      200,000
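As a minimal sketch of how these amounts follow from the factor table, the snippet below multiplies a III Dan base salary by each rank's factor. The names FACTORS_MILLI, annual_salary, and salary_table are hypothetical and not part of the Salary pallet; factors are stored in thousandths purely to keep the arithmetic in integers.

```rust
/// Salary factor of each Dan relative to III Dan, in thousandths (1000 = 1.0).
/// Values are taken from the factor table in this RFC.
const FACTORS_MILLI: [(&str, u64); 9] = [
    ("I", 125),
    ("II", 250),
    ("III", 1000),
    ("IV", 1500),
    ("V", 2000),
    ("VI", 2500),
    ("VII", 2500),
    ("VIII", 2500),
    ("IX", 2500),
];

/// Annual salary for one rank, given the III Dan base salary.
fn annual_salary(base_iii_dan: u64, factor_milli: u64) -> u64 {
    base_iii_dan * factor_milli / 1000
}

/// Annual salary for every rank, in the order of the factor table.
fn salary_table(base_iii_dan: u64) -> Vec<(&'static str, u64)> {
    FACTORS_MILLI
        .iter()
        .map(|&(dan, factor)| (dan, annual_salary(base_iii_dan, factor)))
        .collect()
}
```

Calling salary_table(80_000) reproduces the table above. Changing the single base value rescales every rank, which is the property the Explanation section asks for.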

The salary levels for Architects (IV, V, and VI Dan) are typical of senior engineers.

Allowances will be managed by the Salary pallet.

Projections

Based on the current membership, the maximum yearly and monthly costs are shown below:

Dan     Salary     Members    Yearly       Monthly
I       10,000     27         270,000      22,500
II      20,000     11         220,000      18,333
III     80,000     8          640,000      53,333
IV      120,000    3          360,000      30,000
V       160,000    5          800,000      66,667
VI      200,000    3          600,000      50,000
> VI    200,000    0          0            0
Total                         2,890,000    240,833
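The projection can be reproduced with a few lines of arithmetic. The sketch below assumes monthly figures are the yearly figure divided by 12 and rounded to the nearest unit, which matches the table; the function names are hypothetical.

```rust
/// Yearly cost of one rank: salary times number of members at that rank.
fn yearly_cost(annual_salary: u64, members: u64) -> u64 {
    annual_salary * members
}

/// Monthly cost, rounded to the nearest whole unit (assumed rounding rule).
fn monthly_cost(yearly: u64) -> u64 {
    (yearly + 6) / 12
}

/// Total (yearly, monthly) cost over a list of (annual_salary, members) pairs.
fn totals(ranks: &[(u64, u64)]) -> (u64, u64) {
    let yearly: u64 = ranks
        .iter()
        .map(|&(salary, members)| yearly_cost(salary, members))
        .sum();
    (yearly, monthly_cost(yearly))
}
```

Feeding in the membership numbers from the table yields the totals shown in its last row.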

Note that these are the maximum amounts; members may choose to take a passive (lower) level. On the other hand, more people will likely join the Fellowship in the coming years.

Updates

Updates to these levels, whether relative ratios, the asset used, or the amount, shall be done via RFC.

Drawbacks

By not using DOT for payment, the protocol relies on the stability of other assets and the ability to acquire them. However, the asset of choice can be changed in the future.

Testing, Security, and Privacy

N/A.

Performance, Ergonomics, and Compatibility

Performance

N/A

Ergonomics

N/A

Compatibility

N/A

Prior Art and References

Unresolved Questions

None at present.

RFC-0056: Enforce only one transaction per notification

Start Date: 2023-11-30
Description: Modify the transactions notifications protocol to always send only one transaction at a time
Authors: Pierre Krieger

Summary

When two peers connect to each other, they open (amongst other things) a so-called "notifications protocol" substream dedicated to gossiping transactions to each other.

Each notification on this substream currently consists of a SCALE-encoded Vec<Transaction> where Transaction is defined in the runtime.

This RFC proposes to modify the format of the notification to become (Compact(1), Transaction). This maintains backwards compatibility, as this new format decodes as a Vec of length equal to 1.

Motivation

There are three motivations behind this change:

  • It is technically impossible to decode a SCALE-encoded Vec<Transaction> into a list of SCALE-encoded transactions without knowing how to decode a Transaction. That's because a Vec<Transaction> consists of several Transactions one after the other in memory, without any delimiter that indicates the end of a transaction and the start of the next. Unfortunately, the format of a Transaction is runtime-specific. This means that the code that receives notifications is necessarily tied to a specific runtime, and it is not possible to write runtime-agnostic code.

  • Notifications protocols are already designed to be optimized to send many items. Currently, when it comes to transactions, each item is a Vec<Transaction> that consists of multiple sub-items of type Transaction. This two-step hierarchy is completely unnecessary, and was originally written at a time when the networking protocol of Substrate didn't have proper multiplexing.

  • It makes the implementation much more straightforward by not having to repeat code related to back-pressure. See explanations below.

Stakeholders

Low-level developers.

Explanation

To give an example, if you send one notification with three transactions, the bytes that are sent on the wire are:

concat(
    leb128(total-size-in-bytes-of-the-rest),
    scale(compact(3)), scale(transaction1), scale(transaction2), scale(transaction3)
)

But you can also send three notifications of one transaction each, in which case it is:

concat(
    leb128(size(scale(transaction1)) + 1), scale(compact(1)), scale(transaction1),
    leb128(size(scale(transaction2)) + 1), scale(compact(1)), scale(transaction2),
    leb128(size(scale(transaction3)) + 1), scale(compact(1)), scale(transaction3)
)

Right now the sender can choose which of the two encodings to use. This RFC proposes to make the second encoding mandatory.

The format of the notification would become a SCALE-encoded (Compact(1), Transaction). A SCALE-compact encoded 1 is one byte of value 4. In other words, the format of the notification would become concat(&[4], scale_encoded_transaction). This is equivalent to forcing the Vec<Transaction> to always have a length of 1, and I expect the Substrate implementation to simply modify the sending side to add a for loop that sends one notification per item in the Vec.

As explained in the motivation section, this allows extracting scale(transaction) items without having to know how to decode them.
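A minimal sketch of the proposed format, treating the transaction as opaque bytes. The function names are hypothetical and this is not the Substrate implementation; it only illustrates that the receiver can split off the transaction without knowing its runtime-specific layout.

```rust
/// SCALE compact encoding of the integer 1: a single byte of value 4.
const COMPACT_ONE: u8 = 4;

/// Builds the body of one notification: Compact(1) followed by the
/// already SCALE-encoded transaction.
fn encode_notification(scale_encoded_transaction: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + scale_encoded_transaction.len());
    out.push(COMPACT_ONE);
    out.extend_from_slice(scale_encoded_transaction);
    out
}

/// Extracts the opaque transaction bytes from a notification body.
/// Returns None if the notification does not start with Compact(1),
/// i.e. if it does not follow the one-transaction format.
fn decode_notification(notification: &[u8]) -> Option<&[u8]> {
    match notification.split_first() {
        Some((&COMPACT_ONE, transaction)) => Some(transaction),
        _ => None,
    }
}
```

A sender holding several transactions would call encode_notification once per transaction and send each result as its own notification. The length prefix shown in the wire examples above is added by the notifications substream layer and is not part of this sketch.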

By "flattening" the two-step hierarchy, an implementation only needs to back-pressure individual notifications rather than back-pressure notifications and transactions within notifications.

Drawbacks

This RFC chooses to maintain backwards compatibility at the cost of introducing a very small wart (the Compact(1)).

An alternative could be to introduce a new version of the transactions notifications protocol that sends one Transaction per notification, but this is significantly more complicated to implement and can always be done later in case the Compact(1) is bothersome.

Testing, Security, and Privacy

Irrelevant.

Performance, Ergonomics, and Compatibility

Performance

Irrelevant.

Ergonomics

Irrelevant.

Compatibility

The change is backwards compatible if done in two steps: modify the sender to always send one transaction per notification, then, after a while, modify the receiver to enforce the new format.

Prior Art and References

Irrelevant.

Unresolved Questions

None.

None. This is a simple isolated change.

RFC-0059: Add a discovery mechanism for nodes based on their capabilities

Start Date: 2023-12-18
Description: Nodes having certain capabilities register themselves in the DHT to be discoverable
Authors: Pierre Krieger

Summary

This RFC proposes to make the mechanism of RFC #8 more generic by introducing the concept of "capabilities".

Implementations can implement certain "capabilities", such as serving old block headers or being a parachain bootnode.

The discovery mechanism of RFC #8 is extended to be able to discover nodes of specific capabilities.

Motivation

The Polkadot peer-to-peer network is made of nodes. Not all these nodes are equal. Some nodes store only the headers of recent blocks, some nodes store all the block headers and bodies since the genesis, some nodes store the storage of all blocks since the genesis, and so on.

It is currently not possible to know ahead of time (without connecting to it and asking) which nodes have which data available, and it is not easily possible to build a list of nodes that have a specific piece of data available.

If you want to download for example the header of block 500, you have to connect to a randomly-chosen node, ask it for block 500, and if it says that it doesn't have the block, disconnect and try another randomly-chosen node. In certain situations such as downloading the storage of old blocks, nodes that have the information are relatively rare, and finding through trial and error a node that has the data can take a long time.

This RFC attempts to solve this problem by giving the possibility to build a list of nodes that are capable of serving specific data.

Stakeholders

Low-level client developers. People interested in accessing the archive of the chain.

Explanation

Reading RFC #8 first might help with comprehension, as this RFC is very similar.

Please keep in mind while reading that everything below applies to both relay chains and parachains, unless mentioned otherwise.

Capabilities

This RFC defines a list of so-called capabilities:

  • Head of chain provider. An implementation with this capability must be able to serve to other nodes block headers, block bodies, justifications, call proofs, and storage proofs of "recent" (see below) blocks, and, for relay chains, to serve to other nodes warp sync proofs where the starting block is a session change block and must participate in Grandpa and Beefy gossip.
  • History provider. An implementation with this capability must be able to serve to other nodes block headers and block bodies of any block since the genesis, and must be able to serve to other nodes justifications of any session change block since the genesis up until and including their currently finalized block.
  • Archive provider. This capability is a superset of History provider. In addition to the requirements of History provider, an implementation with this capability must be able to serve call proofs and storage proof requests of any block since the genesis up until and including their currently finalized block.
  • Parachain bootnode (only for relay chains). An implementation with this capability must be able to serve the network request described in RFC 8.

More capabilities might be added in the future.

In the context of the head of chain provider, the word "recent" means: any not-yet-finalized block that is equal to or an ancestor of a block that it has announced through a block announce, and any finalized block whose height is greater than the height of its current finalized block minus 16. This does not include blocks that have been pruned because they're not a descendant of its current finalized block. In other words, blocks that aren't a descendant of the current finalized block can be thrown away. A gap of blocks is required due to race conditions: when a node finalizes a block, it takes some time for its peers to be made aware of this, during which they might send requests concerning older blocks. The choice of the number of blocks in this gap is arbitrary.
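The finalized half of this definition can be written as a small predicate. This is only a sketch: the names FINALIZED_GAP and is_recent_finalized are hypothetical, and the rule for not-yet-finalized blocks is left out because it depends on which blocks the node has announced.

```rust
/// Number of finalized blocks behind the finalized head that must stay servable.
const FINALIZED_GAP: u64 = 16;

/// Returns true if a finalized block at `height` counts as "recent" for a node
/// whose current finalized block is at `finalized_height`.
///
/// The condition `height > finalized_height - 16` is written with an addition
/// on the left-hand side so that it cannot underflow near the genesis block.
fn is_recent_finalized(height: u64, finalized_height: u64) -> bool {
    height <= finalized_height && height.saturating_add(FINALIZED_GAP) > finalized_height
}
```

With a finalized head at height 100, heights 85 to 100 are considered recent and height 84 is not.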

Substrate is currently by default a head of chain provider. After it has finished warp syncing, it downloads the list of old blocks, after which it becomes a history provider. If Substrate is instead configured as an archive node, then it downloads all blocks since the genesis and builds their state, after which it becomes an archive provider, history provider, and head of chain provider. If block pruning is enabled and the chain is a relay chain, then Substrate unfortunately doesn't implement any of these capabilities, not even head of chain provider. This is considered a bug that should be fixed, see https://github.com/paritytech/polkadot-sdk/issues/2733.

DHT provider registration

This RFC heavily relies on the functionalities of the Kademlia DHT already in use by Polkadot. You can find a link to the specification here.

Implementations that have the history provider capability should register themselves as providers under the key sha256(concat("history", randomness)).

Implementations that have the archive provider capability should register themselves as providers under the key sha256(concat("archive", randomness)).

Implementations that have the parachain bootnode capability should register themselves as provider under the key sha256(concat(scale_compact(para_id), randomness)), as described in RFC 8.

"Register themselves as providers" consists of sending ADD_PROVIDER requests to nodes close to the key, as described in the Content provider advertisement section of the specification.

The value of randomness can be found in the randomness field when calling the BabeApi_currentEpoch function.

In order to avoid downtime when the key changes, nodes should also register themselves under a secondary key that uses a value of randomness equal to the randomness field when calling BabeApi_nextEpoch.

Implementers should be aware that their implementation of Kademlia might already hash the key before XOR'ing it. The key is not meant to be hashed twice.
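The key derivation described in this section can be sketched as follows. The snippet only builds the input to sha256, since SHA-256 is not in the Rust standard library; the caller is expected to hash the result exactly once. The names Capability, scale_compact_u32, and provider_key_preimage are hypothetical, and it is an assumption of this sketch that "history" and "archive" are concatenated as raw ASCII bytes with no length prefix.

```rust
/// Capabilities that register under a DHT provider key.
enum Capability {
    History,
    Archive,
    ParachainBootnode { para_id: u32 },
}

/// SCALE compact encoding of a u32.
fn scale_compact_u32(value: u32) -> Vec<u8> {
    if value < 1 << 6 {
        // Single-byte mode: value in the upper six bits, mode bits 0b00.
        vec![(value as u8) << 2]
    } else if value < 1 << 14 {
        // Two-byte mode, little endian, mode bits 0b01.
        (((value as u16) << 2) | 0b01).to_le_bytes().to_vec()
    } else if value < 1 << 30 {
        // Four-byte mode, little endian, mode bits 0b10.
        ((value << 2) | 0b10).to_le_bytes().to_vec()
    } else {
        // Big-integer mode: prefix byte ((4 bytes - 4) << 2) | 0b11, then the value.
        let mut out = vec![0b11];
        out.extend_from_slice(&value.to_le_bytes());
        out
    }
}

/// Builds concat(prefix, randomness); the DHT key is sha256 of this value.
fn provider_key_preimage(capability: &Capability, randomness: &[u8]) -> Vec<u8> {
    let mut preimage = match capability {
        Capability::History => b"history".to_vec(),
        Capability::Archive => b"archive".to_vec(),
        Capability::ParachainBootnode { para_id } => scale_compact_u32(*para_id),
    };
    preimage.extend_from_slice(randomness);
    preimage
}
```

A node would compute this once with the current epoch's randomness and once with the next epoch's randomness, and register under both resulting keys.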

Implementations must not register themselves if they don't fulfill the capability yet. For example, a node configured to be an archive node but that is still building its archive state in the background must register itself only after it has finished building its archive.

Secondary DHTs

Implementations that have the history provider capability must also participate in a secondary DHT that consists only of nodes with that capability. The protocol name of that secondary DHT must be /<genesis-hash>/kad/history.

Similarly, implementations that have the archive provider capability must also participate in a secondary DHT that consists only of nodes with that capability and whose protocol name is /<genesis-hash>/kad/archive.

Just like implementations must not register themselves if they don't fulfill their capability yet, they must also not participate in the secondary DHT if they don't fulfill their capability yet.

Head of the chain providers

Implementations that have the head of the chain provider capability do not register themselves as providers, but instead are the nodes that participate in the main DHT. In other words, they are the nodes that serve requests of the /<genesis_hash>/kad protocol.

Any implementation that isn't a head of the chain provider (read: light clients) must not participate in the main DHT. This is already presently the case.

Implementations must not participate in the main DHT if they don't fulfill the capability yet. For example, a node that is still in the process of warp syncing must not participate in the main DHT. However, assuming that warp syncing doesn't last more than a few seconds, it is acceptable to ignore this requirement in order to avoid complicating implementations too much.

Drawbacks

None that I can see.

Testing, Security, and Privacy

The content of this section is basically the same as the one in RFC 8.

This mechanism doesn't add or remove any security by itself, as it relies on existing mechanisms.

Due to the way Kademlia works, it would become the responsibility of the 20 Polkadot nodes whose sha256(peer_id) is closest to the key (described in the explanations section) to store the list of nodes that have specific capabilities. Furthermore, when a large number of providers are registered, only the providers closest to the key are kept, up to a certain implementation-defined limit.
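To illustrate what "closest to the key" means, here is a minimal sketch of the Kademlia XOR metric. It assumes that both the key and the peer ids have already been hashed to 32 bytes; the function names are hypothetical and this is not taken from any particular libp2p implementation.

```rust
/// XOR distance between two 32-byte identifiers.
/// Comparing the resulting arrays lexicographically is the same as comparing
/// the distances as big-endian 256-bit integers.
fn xor_distance(a: &[u8; 32], b: &[u8; 32]) -> [u8; 32] {
    let mut distance = [0u8; 32];
    for i in 0..32 {
        distance[i] = a[i] ^ b[i];
    }
    distance
}

/// Returns the `k` hashed peer ids closest to `key`, nearest first.
fn closest_peers(key: &[u8; 32], mut hashed_peer_ids: Vec<[u8; 32]>, k: usize) -> Vec<[u8; 32]> {
    hashed_peer_ids.sort_by_key(|id| xor_distance(id, key));
    hashed_peer_ids.truncate(k);
    hashed_peer_ids
}
```

With k set to 20, the returned peers are the ones that would be responsible for storing the provider records of a given capability key.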

For this reason, an attacker can abuse this mechanism by randomly generating libp2p PeerIds until they find the 20 entries closest to the key representing the target capability. They are then in control of the list of nodes with that capability. While doing this can in no way be actually harmful, it could lead to eclipse attacks.

Because the key changes periodically and isn't predictable, and assuming that the Polkadot DHT is sufficiently large, it is not realistic for an attack like this to be maintained in the long term.

Performance, Ergonomics, and Compatibility

Performance

The DHT mechanism generally has a low overhead, especially given that publishing providers is done only every 24 hours.

Doing a Kademlia iterative query then sending a provider record shouldn't take more than around 50 kiB in total of bandwidth for the parachain bootnode.

Assuming 1000 nodes with a specific capability, the 20 Polkadot full nodes corresponding to that capability will each receive a sudden spike of a few megabytes of networking traffic when the key rotates. Again, this is relatively negligible. If this becomes a problem, one can add a random delay before a node registers itself to be the provider of the key corresponding to BabeApi_next_epoch.

Maybe the biggest uncertainty is the traffic that the 20 Polkadot full nodes will receive from light clients that want to know the nodes with a capability. If this ever becomes a problem, this value of 20 is an arbitrary constant that can be increased for more redundancy.

Ergonomics

Irrelevant.

Compatibility

Irrelevant.

Prior Art and References

Unknown.

Unresolved Questions

While it fundamentally doesn't change much to this RFC, using BabeApi_currentEpoch and BabeApi_nextEpoch might be inappropriate. I'm not familiar enough with good practices within the runtime to have an opinion here. Should it be an entirely new pallet?

This RFC would make it possible to reliably discover archive nodes, which would make it possible to reliably send archive node requests, something that isn't currently possible. This could solve the problem of finding archive RPC node providers by migrating archive-related request to using the native peer-to-peer protocol rather than JSON-RPC.

If we ever decide to break backwards compatibility, we could divide the "history" and "archive" capabilities in two, between nodes capable of serving older blocks and nodes capable of serving newer blocks. We could even add to the peer-to-peer network nodes that are only capable of serving older blocks (by reading from a database) but do not participate in the head of the chain, and that just exist for historical purposes.

RFC-0078: Merkleized Metadata

Start Date: 22 February 2024
Description: Include merkleized metadata hash in extrinsic signature for trust-less metadata verification.
Authors: Zondax AG, Parity Technologies

Summary

To interact with chains in the Polkadot ecosystem it is required to know how transactions are encoded and how to read state. For doing this, Polkadot-SDK, the framework used by most of the chains in the Polkadot ecosystem, exposes metadata about the runtime to the outside. UIs, wallets, and others can use this metadata to interact with these chains. This makes the metadata a crucial piece of the transaction encoding as users are relying on the interacting software to encode the transactions in the correct format.

It gets even more important when the user signs the transaction in an offline wallet, as the device by its nature cannot get access to the metadata without relying on the online wallet to provide it. This means the offline wallet needs to trust an online party, rendering the security assumptions of the offline device moot.

This RFC proposes a way for offline wallets to leverage metadata within the constraints of these devices. The design idea is that the metadata is chunked and these chunks are put into a merkle tree. The root hash of this merkle tree represents the metadata. The offline wallets can use the root hash to decode transactions by getting proofs for the individual chunks of the metadata. This root hash is also included in the signed data of the transaction (but not sent as part of the transaction). The runtime then includes its known metadata root hash when verifying the transaction. If the metadata root hash known by the runtime differs from the one that the offline wallet used, it very likely means that the online wallet provided some fake data, and the verification of the transaction fails.

Users depend on offline wallets to correctly display decoded transactions before signing. With merkleized metadata, they can be assured of the transaction's legitimacy, as incorrect transactions will be rejected by the runtime.

Motivation

Polkadot's innovative design (both relay chain and parachains) gives developers the ability to upgrade their network as frequently as they need. These systems manage to keep integrations working after upgrades with the help of FRAME Metadata. This Metadata, which is on the order of half a MiB for most Polkadot-SDK chains, completely describes chain interfaces and properties. Securing this metadata is key for users to be able to interact with the Polkadot-SDK chain in the expected way.

On the other hand, offline wallets provide a secure way for blockchain users to hold their own keys (some do a better job than others). These devices are seldom upgraded, usually support one particular network, and have very little internal memory. Currently in the Polkadot ecosystem there is no secure way for these offline devices to know the latest Metadata of the Polkadot-SDK chain they are interacting with. This results in a plethora of similar yet slightly different offline wallets for all different Polkadot-SDK chains, as well as the burden of keeping these regularly updated, thus not fully leveraging Polkadot-SDK's unique forkless upgrade feature.

The two main reasons why this is not possible today are:

  1. Metadata is too large for offline devices. Currently Polkadot-SDK metadata is on average 500 KiB, which is more than what the most widely adopted offline devices can hold.
  2. Metadata is not authenticated. Even if there was enough space on offline devices to hold the metadata, the user would be trusting the entity providing this metadata to the hardware wallet. In the Polkadot ecosystem, this is how Polkadot Vault currently works.

This RFC proposes a solution to make FRAME Metadata compatible with offline signers in a secure way. As it leverages FRAME Metadata, it does not only ensure that offline devices can always keep up to date with every FRAME based chain, but also that every offline wallet will be compatible with all FRAME based chains, avoiding the need of per-chain implementations.

Requirements

  1. Metadata's integrity MUST be preserved. If any compromise were to happen, extrinsics sent with compromised metadata SHOULD fail.
  2. Metadata information that could be used in signable extrinsic decoding MAY be included in digest, yet its inclusion MUST be indicated in signed extensions.
  3. Digest MUST be deterministic with respect to metadata.
  4. Digest MUST be cryptographically strong against pre-image, both first (finding an input that results in given digest) and second (finding an input that results in same digest as some other input given).
  5. Extra-metadata information necessary for extrinsic decoding and constant within runtime version MUST be included in digest.
  6. It SHOULD be possible to quickly withdraw offline signing mechanism without access to cold signing devices.
  7. Digest format SHOULD be versioned.
  8. Work necessary for proving metadata authenticity MAY be omitted at discretion of signer device design (to support automation tools).

Reduce metadata size

Metadata should be stripped of parts that are not necessary to parse a signable extrinsic, then it should be separated into a finite set of self-descriptive chunks. Thus, a subset of chunks necessary for signable extrinsic decoding and rendering could be sent, possibly in small portions (ultimately, one at a time), to cold devices together with the proof.

  1. Single chunk with proof payload size SHOULD fit within a few kB;
  2. Chunks handling mechanism SHOULD support chunks being sent in any order without memory utilization overhead;
  3. Unused enum variants MUST be stripped (this has great impact on transmitted metadata size; examples: era enum, enum with all calls for call batching).

Stakeholders

  • Runtime implementors
  • UI/wallet implementors
  • Offline wallet implementors

The idea for this RFC was brought up by runtime implementors and was extensively discussed with offline wallet implementors. It was designed in such a way that it can work easily with the existing offline wallet solutions in the Polkadot ecosystem.

Explanation

The FRAME metadata provides a wide range of information about a FRAME based runtime. It contains information about the pallets, the calls per pallet, the storage entries per pallet, runtime APIs, and type information about most of the types that are used in the runtime. For decoding extrinsics on an offline wallet, what is mainly required is type information. Most of the other information in the FRAME metadata is actually not required for decoding extrinsics and thus it can be removed. Therefore, the following is a proposal on a custom representation of the metadata and how this custom metadata is chunked, ensuring that only the needed chunks required for decoding a particular extrinsic are sent to the offline wallet. The necessary information to transform the FRAME metadata type information into the type information presented in this RFC will be provided. However, not every single detail on how to convert from FRAME metadata into the RFC type information is described.

First, the MetadataDigest is introduced. After that, ExtrinsicMetadata is covered and finally the actual format of the type information. Then pruning of unrelated type information is covered and how to generate the TypeRefs. In the last step, merkle tree calculation is explained.

Metadata digest

The metadata digest is the compact representation of the metadata. The hash of this digest is the metadata hash. Below the type declaration of the Hash type and the MetadataDigest itself can be found:

type Hash = [u8; 32];

enum MetadataDigest {
    #[index = 1]
    V1 {
        type_information_tree_root: Hash,
        extrinsic_metadata_hash: Hash,
        spec_version: u32,
        spec_name: String,
        base58_prefix: u16,
        decimals: u8,
        token_symbol: String,
    },
}

The Hash is 32 bytes long and blake3 is used for calculating it. The hash of the MetadataDigest is calculated by blake3(SCALE(MetadataDigest)). Therefore, MetadataDigest is at first SCALE encoded, and then those bytes are hashed.
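As an illustration of the blake3(SCALE(MetadataDigest)) step, the sketch below SCALE-encodes version one of the digest by hand, using only the standard library. The struct MetadataDigestV1 and the helper names are hypothetical. Because blake3 is an external crate, the hash function is passed in by the caller rather than implemented here.

```rust
type Hash = [u8; 32];

/// The fields of the V1 variant, in declaration order.
struct MetadataDigestV1 {
    type_information_tree_root: Hash,
    extrinsic_metadata_hash: Hash,
    spec_version: u32,
    spec_name: String,
    base58_prefix: u16,
    decimals: u8,
    token_symbol: String,
}

/// Appends a SCALE compact length. Lengths of 2^30 or more are not handled here.
fn push_compact_len(len: usize, out: &mut Vec<u8>) {
    if len < 1 << 6 {
        out.push((len as u8) << 2);
    } else if len < 1 << 14 {
        out.extend_from_slice(&(((len as u16) << 2) | 0b01).to_le_bytes());
    } else if len < 1 << 30 {
        out.extend_from_slice(&(((len as u32) << 2) | 0b10).to_le_bytes());
    } else {
        panic!("length too large for this sketch");
    }
}

/// Appends a SCALE-encoded string: compact byte length, then the UTF-8 bytes.
fn push_str(s: &str, out: &mut Vec<u8>) {
    push_compact_len(s.len(), out);
    out.extend_from_slice(s.as_bytes());
}

/// SCALE encoding of MetadataDigest::V1: variant index 1, then each field in order.
fn scale_encode(digest: &MetadataDigestV1) -> Vec<u8> {
    let mut out = vec![1u8];
    out.extend_from_slice(&digest.type_information_tree_root);
    out.extend_from_slice(&digest.extrinsic_metadata_hash);
    out.extend_from_slice(&digest.spec_version.to_le_bytes());
    push_str(&digest.spec_name, &mut out);
    out.extend_from_slice(&digest.base58_prefix.to_le_bytes());
    out.push(digest.decimals);
    push_str(&digest.token_symbol, &mut out);
    out
}

/// Metadata hash: the caller-supplied 32-byte hash (blake3 in this RFC)
/// applied to the SCALE-encoded digest.
fn metadata_hash(digest: &MetadataDigestV1, hash_fn: impl Fn(&[u8]) -> Hash) -> Hash {
    hash_fn(&scale_encode(digest))
}
```

In practice the encoding would more likely come from a derive macro of a SCALE codec library; spelling it out by hand only shows that the first encoded byte is the digest version.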

The MetadataDigest itself is represented as an enum. This is done to make it future proof, because a SCALE encoded enum is prefixed by the index of the variant. This index represents the version of the digest. As seen above, there is no index zero and it starts directly with one. Version one of the digest contains the following elements:

  • type_information_tree_root: The root of the merkleized type information tree.
  • extrinsic_metadata_hash: The hash of the extrinsic metadata.
  • spec_version: The spec_version of the runtime as found in the RuntimeVersion when generating the metadata. While this information can also be found in the metadata, it is hidden in a big blob of data. To avoid transferring this big blob of data, we directly add this information here.
  • spec_name: Similar to spec_version, but being the spec_name found in the RuntimeVersion.
  • base58_prefix: The SS58 prefix used for address encoding.
  • decimals: The number of decimals for the token.
  • token_symbol: The symbol of the token.

Extrinsic metadata

For decoding an extrinsic, more information on what types are being used is required. The actual format of the extrinsic is the format as described in the Polkadot specification. The metadata for an extrinsic is as follows:

struct ExtrinsicMetadata {
    version: u8,
    address_ty: TypeRef,
    call_ty: TypeRef,
    signature_ty: TypeRef,
    signed_extensions: Vec<SignedExtensionMetadata>,
}

struct SignedExtensionMetadata {
    identifier: String,
    included_in_extrinsic: TypeRef,
    included_in_signed_data: TypeRef,
}

To begin with, TypeRef is a unique identifier for a type as found in the type information. Using this TypeRef, it is possible to look up the type in the type information tree. More details on this process can be found in the section Generating TypeRef.

The actual ExtrinsicMetadata contains the following information:

  • version: The version of the extrinsic format. As of writing this, the latest version is 4.
  • address_ty: The address type used by the chain.
  • call_ty: The call type used by the chain. The call in FRAME based runtimes represents the type of transaction being executed on chain. It references the actual function to execute and the parameters of this function.
  • signature_ty: The signature type used by the chain.
  • signed_extensions: FRAME based runtimes can extend the base extrinsic with extra information. This extra information that is put into an extrinsic is called "signed extensions". These extensions offer the runtime developer the possibility to include data directly into the extrinsic, like nonce, tip, amongst others. This means that this data is sent alongside the extrinsic to the runtime. The other possibility these extensions offer is to include extra information only in the signed data that is signed by the sender. This means that this data needs to be known by both sides, the signing side and the verification side. An example for this kind of data is the genesis hash that ensures that extrinsics are unique per chain. Another example is the metadata hash itself that will also be included in the signed data. The offline wallets need to know which signed extensions are present in the chain and this is communicated to them using this field.

The SignedExtensionMetadata provides information about a signed extension:

  • identifier: The identifier of the signed extension. An identifier is required to be unique in the Polkadot ecosystem, as otherwise extrinsics may be built incorrectly.
  • included_in_extrinsic: The type that will be included in the extrinsic by this signed extension.
  • included_in_signed_data: The type that will be included in the signed data by this signed extension.

Type Information

As SCALE is not self descriptive like JSON, a decoder always needs to know the format of the type to decode it properly. This is where the type information comes into play. The format of the extrinsic is fixed as described above and ExtrinsicMetadata provides information on which type information is required for which part of the extrinsic. So, offline wallets only need access to the actual type information. It is a requirement that the type information can be chunked into logical pieces to reduce the amount of data that is sent to the offline wallets for decoding the extrinsics. So, the type information is structured in the following way:

struct Type {
    path: Vec<String>,
    type_def: TypeDef,
    type_id: Compact<u32>,
}

enum TypeDef {
    Composite(Vec<Field>),
    Enumeration(EnumerationVariant),
    Sequence(TypeRef),
    Array(Array),
    Tuple(Vec<TypeRef>),
    BitSequence(BitSequence),
}

struct Field {
    name: Option<String>,
    ty: TypeRef,
    type_name: Option<String>,
}

struct Array {
    len: u32,
    type_param: TypeRef,
}

struct BitSequence {
    num_bytes: u8,
    least_significant_bit_first: bool,
}

struct EnumerationVariant {
    name: String,
    fields: Vec<Field>,
    index: Compact<u32>,
}

enum TypeRef {
    Bool,
    Char,
    Str,
    U8,
    U16,
    U32,
    U64,
    U128,
    U256,
    I8,
    I16,
    I32,
    I64,
    I128,
    I256,
    CompactU8,
    CompactU16,
    CompactU32,
    CompactU64,
    CompactU128,
    CompactU256,
    Void,
    PerId(Compact<u32>),
}

The Type declares the structure of a type. The type has the following fields:

  • path: A path declares the position of a type locally to the place where it is defined. The path is not globally unique, this means that there can be multiple types with the same path.
  • type_def: The high-level type definition, e.g. the type is a composition of fields where each field has a type, the type is a composition of different types as tuple etc.
  • type_id: The unique identifier of this type.

Every Type is composed of multiple different types. Each of these "sub types" can reference either a full Type again or reference one of the primitive types. This is where TypeRef becomes relevant as the type referencing information. To reference a Type in the type information, a unique identifier is used. As primitive types can be represented using a single byte, they are not put as separate types into the type information. Instead the primitive types are directly part of TypeRef to not require the overhead of referencing them in an extra Type. The special primitive type Void represents a type that encodes to nothing and can be decoded from nothing. As FRAME doesn't support Compact as primitive type it requires a more involved implementation to convert a FRAME type to a Compact primitive type. SCALE only supports u8, u16, u32, u64 and u128 as Compact which maps onto the primitive type declaration in the RFC. One special case is a Compact that wraps an empty Tuple which is expressed as primitive type Void.

The TypeDef variants have the following meaning:

  • Composite: A struct like type that is composed of multiple different fields. Each Field can have its own type. The order of the fields is significant. A Composite with no fields is expressed as primitive type Void.
  • Enumeration: Stores an EnumerationVariant. An EnumerationVariant is a struct that is described by a name, an index and a vector of Fields, each of which can have its own type. Typically Enumerations have more than just one variant, and in those cases Enumeration will appear multiple times, each time with a different variant, in the type information. Enumerations can become quite large, yet usually only one variant is required for decoding a type. This design therefore helps reduce the size of the proof. An Enumeration with no variants is expressed as primitive type Void.
  • Sequence: A vector like type wrapping the given type.
  • BitSequence: A vector storing bits. num_bytes represents the size in bytes of the internal storage. If least_significant_bit_first is true the least significant bit is first, otherwise the most significant bit is first.
  • Array: A fixed-length array of a specific type.
  • Tuple: A composition of multiple types. A Tuple that is composed of no types is expressed as primitive type Void.

Using the type information together with the SCALE specification provides enough information on how to decode types.

Prune unrelated Types

The FRAME metadata contains not only the type information for decoding extrinsics, but also type information about storage types. The scope of the RFC is only about decoding transactions on offline wallets. Thus, a lot of type information can be pruned. To know which type information is required to decode all possible extrinsics, ExtrinsicMetadata has been defined. The extrinsic metadata contains all the types that define the layout of an extrinsic. Therefore, all the types that are accessible from the types declared in the extrinsic metadata can be collected. Collecting all accessible types requires iterating recursively over all types, starting from the types in ExtrinsicMetadata. Note that some types are accessible, but they don't appear in the final type information and thus can be pruned as well. These are, for example, inner types of Compact or the types referenced by BitSequence. The result of collecting these accessible types is a list of all the types that are required to decode each possible extrinsic.
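
The recursive collection described above can be sketched as a reachability walk over the type graph. This is a minimal illustration, not the reference implementation: TypeGraph and collect_accessible_types are hypothetical names, type ids are plain u32 values, and the extra pruning of Compact inner types and BitSequence references is not modelled.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for the FRAME type registry: each type id maps to
/// the ids of the types it references directly.
type TypeGraph = BTreeMap<u32, Vec<u32>>;

/// Collects every type id reachable from `roots` (the types named in the
/// extrinsic metadata). Everything else can be pruned.
fn collect_accessible_types(graph: &TypeGraph, roots: &[u32]) -> BTreeSet<u32> {
    let mut accessible = BTreeSet::new();
    let mut stack: Vec<u32> = roots.to_vec();

    while let Some(id) = stack.pop() {
        // `insert` returns false for an id that was already visited, which
        // also keeps the walk from looping on self-referential types.
        if !accessible.insert(id) {
            continue;
        }
        if let Some(children) = graph.get(&id) {
            stack.extend(children.iter().copied());
        }
    }

    accessible
}
```

An explicit stack is used instead of function recursion so that deeply nested types cannot exhaust the call stack.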

Generating TypeRef

Each TypeRef basically references one of the following types:

  • One of the primitive types. All primitive types can be represented by 1 byte and thus, they are directly part of the TypeRef itself to remove an extra level of indirection.
  • A Type using its unique identifier.

In FRAME metadata a primitive type is represented like any other type. So, the first step is to remove all the primitive-only types from the list of types that were generated in the previous section. The resulting list of types is sorted using the id provided by FRAME metadata. In the last step the TypeRefs are created. Each reference to a primitive type is replaced by one of the corresponding TypeRef primitive type variants and every other reference is replaced by the type's unique identifier. The unique identifier of a type is the index of the type in our sorted list. For Enumerations all variants have the same unique identifier, even though they are represented as multiple entries in the type information. All variants need to have the same unique identifier as the reference doesn't know which variant will appear in the actual encoded data.

// Pseudocode: parameter types are omitted for brevity.
let mut pruned_types = get_pruned_types();

// Primitive types are expressed directly in `TypeRef`, so drop them.
pruned_types.retain(|ty| !ty.is_primitive_type());

// Sort by FRAME metadata id; variants of the same enumeration share an id
// and are ordered by their variant index.
pruned_types.sort_by(|left, right| {
    left.frame_metadata_id()
        .cmp(&right.frame_metadata_id())
        .then_with(|| left.variant_index().cmp(&right.variant_index()))
});

fn generate_type_ref(ty, ty_list) -> TypeRef {
    if ty.is_primitive_type() {
        return TypeRef::primitive_from_ty(ty);
    }

    TypeRef::from_id(
        // Determine the id by using the position of the type in the
        // list of unique frame metadata ids.
        ty_list.position_by_frame_metadata_id(ty.frame_metadata_id())
    )
}

fn replace_all_sub_types_with_type_refs(ty, ty_list) -> Type {
    for sub_ty in ty.sub_types() {
        replace_all_sub_types_with_type_refs(sub_ty, ty_list);
        sub_ty = generate_type_ref(sub_ty, ty_list)
    }

    ty
}

let mut final_ty_list = Vec::new();
for ty in pruned_types {
    final_ty_list.push(replace_all_sub_types_with_type_refs(ty, pruned_types))
}

Building the Merkle Tree Root

A complete binary merkle tree with blake3 as the hashing function is proposed. For building the merkle tree root, the initial data has to be hashed as a first step. This initial data is referred to as the leaves of the merkle tree. The leaves need to be sorted to make the tree root deterministic. The type information is sorted using their unique identifiers and, for the Enumeration, variants are sorted using their index. After sorting and hashing all leaves, two leaves have to be combined to one hash. The combination of these two hashes is referred to as a node.

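// `leaves` is a double-ended queue of the sorted, hashed leaves.
let mut nodes = leaves;
while nodes.len() > 1 {
    // Take the two rearmost nodes and push their parent to the front.
    let right = nodes.pop_back();
    let left = nodes.pop_back();
    nodes.push_front(blake3::hash(scale::encode((left, right))));
}

// An empty tree is represented by the all-zeros hash.
let merkle_tree_root = if nodes.is_empty() { [0u8; 32] } else { nodes.back() };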

The merkle_tree_root in the end is the last node left in the list of nodes. If there are no nodes left in the list, it means that the initial data set was empty. In this case, an all-zeros hash is used to represent the empty tree.

Building a tree with 5 leaves (numbered 0 to 4):

nodes: 0 1 2 3 4

nodes: [3, 4] 0 1 2

nodes: [1, 2] [3, 4] 0

nodes: [[3, 4], 0] [1, 2]

nodes: [[[3, 4], 0], [1, 2]]

The resulting tree visualized:

     [root]
     /    \
    *      *
   / \    / \
  *   0  1   2
 / \
3   4

Building a tree with 6 leaves (numbered 0 to 5):

nodes: 0 1 2 3 4 5

nodes: [4, 5] 0 1 2 3

nodes: [2, 3] [4, 5] 0 1

nodes: [0, 1] [2, 3] [4, 5]

nodes: [[2, 3], [4, 5]] [0, 1]

nodes: [[[2, 3], [4, 5]], [0, 1]]

The resulting tree visualized:

       [root]
      /      \
     *        *
   /   \     / \
  *     *   0   1
 / \   / \
2   3 4   5
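
The two walkthroughs above can be reproduced with a small sketch that keeps the queue logic but swaps the hash for a caller-supplied combine function. The names merkle_root and show_shape are illustrative assumptions. A real implementation would pass a closure that hashes the SCALE-encoded pair with blake3 and would map the empty case to the all-zeros hash.

```rust
use std::collections::VecDeque;

/// Folds `leaves` into a single root exactly as in the loop above: pop the
/// two rearmost nodes, combine them, and push the parent to the front.
/// Returns `None` for an empty input.
fn merkle_root<T>(leaves: Vec<T>, combine: impl Fn(T, T) -> T) -> Option<T> {
    let mut nodes: VecDeque<T> = leaves.into();
    while nodes.len() > 1 {
        let right = nodes.pop_back().expect("length checked above");
        let left = nodes.pop_back().expect("length checked above");
        nodes.push_front(combine(left, right));
    }
    nodes.pop_back()
}

/// Renders the tree shape for `leaf_count` leaves numbered from 0, using
/// string concatenation in place of hashing.
fn show_shape(leaf_count: usize) -> Option<String> {
    let leaves: Vec<String> = (0..leaf_count).map(|i| i.to_string()).collect();
    merkle_root(leaves, |left, right| format!("[{left}, {right}]"))
}
```

Calling show_shape(5) yields the nesting shown in the last step of the five-leaf walkthrough, and show_shape(6) yields that of the six-leaf one.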

Inclusion in an Extrinsic

To ensure that the offline wallet used the correct metadata to show the extrinsic to the user the metadata hash needs to be included in the extrinsic. The metadata hash is generated by hashing the SCALE encoded MetadataDigest:

blake3::hash(SCALE::encode(MetadataDigest::V1 { .. }))

For the runtime the metadata hash is generated at compile time. Wallets will have to generate the hash using the FRAME metadata.

The signing side should control whether it wants to add the metadata hash or omit it. To accomplish this, one extra byte has to be added to the extrinsic itself. If this byte is 0 the metadata hash is not required, and if the byte is 1 the metadata hash is added using V1 of the MetadataDigest. This leaves room for future versions of the MetadataDigest format. When the metadata hash should be included, it is only added to the data that is signed. This has the advantage of not requiring 32 extra bytes in the extrinsic itself, because the runtime knows the metadata hash as well and can add it to the signed data if required. This is similar to the genesis hash, except that the genesis hash is not added conditionally to the signed data. So, to recap:

  • Included in the extrinsic is u8, the "mode". The mode is either 0 which means to not include the metadata hash in the signed data or the mode is 1 to include the metadata hash in V1.
  • Included in the signed data is an Option<[u8; 32]>. Depending on the mode the value is either None or Some(metadata_hash).
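
The recap above can be sketched as a small mapping from the mode byte to the value placed in the signed data. The function name, the UnknownMode error type, and the choice to reject modes other than 0 and 1 are assumptions made for illustration. The RFC only states that other values leave room for future MetadataDigest versions.

```rust
/// Hypothetical error for a mode byte that this sketch does not understand.
#[derive(Debug, PartialEq)]
struct UnknownMode(u8);

/// Maps the "mode" byte from the extrinsic to the `Option<[u8; 32]>` that is
/// appended to the signed data.
fn metadata_hash_for_signed_data(
    mode: u8,
    metadata_hash: [u8; 32],
) -> Result<Option<[u8; 32]>, UnknownMode> {
    match mode {
        // Mode 0: the metadata hash is not part of the signed data.
        0 => Ok(None),
        // Mode 1: the V1 metadata hash is part of the signed data.
        1 => Ok(Some(metadata_hash)),
        // Assumption: anything else is rejected until a later version defines it.
        other => Err(UnknownMode(other)),
    }
}
```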

Drawbacks

The chunking may not be the optimal case for every kind of offline wallet.

Testing, Security, and Privacy

All implementations are required to strictly follow the RFC to generate the metadata hash. This includes which hash function to use and how to construct the metadata types tree. So, all implementations follow the same security criteria. As the chains will calculate the metadata hash at compile time, the build process needs to be trusted. However, this is already a solved problem in the Polkadot ecosystem by using reproducible builds. So, anyone can rebuild a chain runtime to ensure that a proposal actually contains the changes as advertised.

Implementations can also be tested easily against each other by taking some metadata and ensuring that they all come to the same metadata hash.

Privacy of users should also not be impacted. This assumes that wallets will generate the metadata hash locally and don't leak any information to third party services about which chunks a user will send to their offline wallet. Besides that, there is no leak of private information as getting the raw metadata from the chain is an operation that is done by almost everyone.

Performance, Ergonomics, and Compatibility

Performance

There should be no measurable impact on performance to Polkadot or any other chain using this feature. The metadata root hash is calculated at compile time and at runtime it is optionally used when checking the signature of a transaction. This means that at runtime no performance heavy operations are done.

Ergonomics & Compatibility

The proposal alters the way a transaction is built, signed, and verified. So, this imposes some required changes to any kind of developer who wants to construct transactions for Polkadot or any chain using this feature. As the developer can pass 0 for disabling the verification of the metadata root hash, it can be easily ignored.

Prior Art and References

RFC 46 produced by the Alzymologist team is a previous work reference that goes in this direction as well.

On other ecosystems, there are other solutions to the problem of trusted signing. Cosmos for example has a standardized way of transforming a transaction into some textual representation and this textual representation is included in the signed data. Basically achieving the same as what the RFC proposes, but it requires that for every transaction applied in a block, every node in the network always has to generate this textual representation to ensure the transaction signature is valid.

Unresolved Questions

None.

Future Directions and Related Material

  • Does it work with all kind of offline wallets?
  • Generic types currently appear multiple times in the metadata, once per instantiation. It may be useful to have each generic type only once in the metadata and declare the generic parameters at their instantiation.
  • The metadata doesn't contain any kind of semantic information. This means that the offline wallet for example doesn't know what is a balance etc. The current solution for this problem is to match on the type name, but this isn't a sustainable solution.
  • MetadataDigest only provides one token and decimal. However, a lot of chains support multiple tokens for paying fees etc. This is probably more a question of having semantic information as mentioned above.

RFC-0084: General transactions in extrinsic format

Start Date: 12 March 2024
Description: Support more extrinsic types by updating the extrinsic format
Authors: George Pisaltu

Summary

This RFC proposes a change to the extrinsic format to incorporate a new transaction type, the "general" transaction.

Motivation

"General" transactions, a new type of transaction that this RFC aims to support, are transactions which obey the runtime's extensions and carry the corresponding extension data, yet do not have hard-coded signatures. They are first described in Extrinsic Horizon and supported in 3685. They enable users to authorize origins in new, more flexible ways (e.g. ZK proofs, mutations over pre-authenticated origins). As of now, all transactions are limited to the account signing model for origin authorization and any additional origin changes happen in extrinsic logic, which cannot leverage the validation process of extensions.

An example of a use case for such an extension would be sponsoring the transaction fee for some other user. A new extension would be put in place to verify that a part of the initial payload was signed by the author under whom the extrinsic should run and change the origin, but the payment for the whole transaction should be handled under a sponsor's account. A POC for this can be found in 3712.

The new "general" transaction type would coexist with both current transaction types for a while and, therefore, the current number of supported transaction types, capped at 2, is insufficient. A new extrinsic type must be introduced alongside the current signed and unsigned types. Currently, an encoded extrinsic's first byte indicates the type of extrinsic using the most significant bit (0 for unsigned, 1 for signed), and the 7 following bits indicate the extrinsic format version, which has been equal to 4 for a long time.

By taking one bit from the extrinsic format version encoding, we can support 2 additional extrinsic types while also having a minimal impact on our capability to extend and change the extrinsic format in the future.

Stakeholders

  • Runtime users
  • Runtime devs
  • Wallet devs

Explanation

An extrinsic is currently encoded as one byte to identify the extrinsic type and version. This RFC aims to change the interpretation of this byte regarding the reserved bits for the extrinsic type and version. In the following explanation, bits represented using T make up the extrinsic type and bits represented using V make up the extrinsic version.

Currently, the bit allocation within the leading encoded byte is 0bTVVV_VVVV. In practice in the Polkadot ecosystem, the leading byte would be 0bT000_0100 as the version has been equal to 4 for a long time.

This RFC proposes for the bit allocation to change to 0bTTVV_VVVV. As a result, the extrinsic format version will be bumped to 5 and the extrinsic type bit representation would change as follows:

bits    type
00      unsigned
10      signed
01      reserved
11      reserved
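
A minimal sketch of parsing the leading byte under the proposed 0bTTVV_VVVV layout follows. The enum, the constants and the function name are illustrative assumptions, not taken from any existing codebase.

```rust
/// Extrinsic types encoded in the two most significant bits.
#[derive(Debug, PartialEq, Clone, Copy)]
enum ExtrinsicType {
    Unsigned,
    Signed,
    Reserved,
}

/// The two `T` bits of `0bTTVV_VVVV`.
const TYPE_MASK: u8 = 0b1100_0000;
/// The six `V` bits of `0bTTVV_VVVV`.
const VERSION_MASK: u8 = 0b0011_1111;

/// Splits the leading byte into the extrinsic type and format version.
fn decode_leading_byte(byte: u8) -> (ExtrinsicType, u8) {
    let ty = match byte & TYPE_MASK {
        0b0000_0000 => ExtrinsicType::Unsigned,
        0b1000_0000 => ExtrinsicType::Signed,
        // Both `01` and `11` are reserved by this RFC.
        _ => ExtrinsicType::Reserved,
    };
    (ty, byte & VERSION_MASK)
}
```

A version 4 signed extrinsic (0b1000_0100) decodes to the same type and version under both the old and the new layout, because its second most significant bit is 0.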

Drawbacks

This change would reduce the maximum possible extrinsic format version from the current 127 to 63. In order to bypass the new, lower limit, the extrinsic format would have to change again.

Testing, Security, and Privacy

There is no impact on testing, security or privacy.

Performance, Ergonomics, and Compatibility

This change would allow Polkadot to support new types of transactions, with the specific "general" transaction type in mind at the time of writing this proposal.

Performance

There is no performance impact.

Ergonomics

The impact to developers and end-users is minimal as it would just be a bitmask update on their part for parsing the extrinsic type along with the version.

Compatibility

This change breaks backwards compatibility because any transaction that is neither signed nor unsigned, but a new transaction type, would be interpreted as having a future extrinsic format version.

Prior Art and References

The design was originally proposed in the TransactionExtension PR, which is also the motivation behind this effort.

Unresolved Questions

None.

Future Directions and Related Material

Following this change, the "general" transaction type will be introduced as part of the Extrinsic Horizon effort, which will shape future work.

RFC-0091: DHT Authority discovery record creation time

Start Date: 2024-05-20
Description: Add creation time for DHT authority discovery records
Authors: Alex Gheorghe (alexggh)

Summary

Extend the DHT authority discovery records with a signed creation time, so that nodes can determine which record is newer and always decide to prefer the newer records to the old ones.

Motivation

Currently, we use the Kademlia DHT to store records containing the p2p address of an authority discovery key. The problem is that if a node changes its PeerId/network key, it publishes a new record, but because of the distributed and replicated nature of the DHT there is no way to tell which record is newer. Both the old and the new PeerId therefore live in the network until the old record expires (36h). This creates all sorts of problems and can leave the node that changed its address improperly connected for up to 36h.

After this RFC, nodes keep the newer record and propagate it to nodes that still store the old record, so all nodes converge on the new record faster (on the order of minutes, not 36h).

Implementation of the RFC: https://github.com/paritytech/polkadot-sdk/pull/3786.

Current issue without this enhancement: https://github.com/paritytech/polkadot-sdk/issues/3673

Stakeholders

Polkadot node developers.

Explanation

This RFC heavily relies on the functionalities of the Kademlia DHT already in use by Polkadot. You can find a link to the specification here.

In a nutshell, on a specific node the current authority-discovery protocol publishes Kademlia DHT records at startup and periodically. The records contain the full address of the node for each authority key it owns. The node also tries to find the full address of all authorities in the network by querying the DHT and picking up the first record it finds for each authority id found on chain.

The authority discovery DHT records use the protobuf protocol and the current format is specified here. This RFC proposes extending the schema in a backwards compatible manner by adding a new optional creation_time field to AuthorityRecord, which nodes can use to determine which of the records is newer.

Diff of dht-v3.proto vs dht-v2.proto

@@ -1,10 +1,10 @@
 syntax = "proto3";

-package authority_discovery_v2;
+package authority_discovery_v3;

 // First we need to serialize the addresses in order to be able to sign them.
 message AuthorityRecord {
 	repeated bytes addresses = 1;
+	// Time since UNIX_EPOCH in nanoseconds, scale encoded
+	TimestampInfo creation_time = 2;
 }

 message PeerSignature {
@@ -13,11 +15,17 @@
 	bytes public_key = 2;
 }

+// Information regarding the creation data of the record
+message TimestampInfo {
+       // Time since UNIX_EPOCH in nanoseconds, scale encoded
+       bytes timestamp = 1;
+}
+

Each time a node wants to resolve an authority ID it will issue a query with a certain redundancy factor, and from all the results it receives it will pick only the newest record. Additionally, in order to reduce the time until all nodes have the newest record, nodes can optionally implement logic that sends the new record to nodes that answered with an older record.
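
The selection step can be sketched as follows. The Record struct and pick_newest function are hypothetical, the creation time is assumed to be already decoded into nanoseconds since UNIX_EPOCH, and treating a record without creation_time as older than any record that has one is an assumption of this sketch.

```rust
/// Hypothetical, already-verified lookup response from one peer.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    /// Peer that answered the query.
    peer: &'static str,
    /// Address advertised for the authority.
    address: &'static str,
    /// Decoded `creation_time`; `None` for records from older nodes.
    creation_time: Option<u128>,
}

/// Returns the newest record plus the peers that answered with an older one,
/// so the caller can optionally push the newest record to them.
fn pick_newest(responses: &[Record]) -> Option<(&Record, Vec<&'static str>)> {
    // `Option<u128>` orders `None` before any `Some(_)`, so records without
    // a creation time lose against records that carry one.
    let newest = responses.iter().max_by_key(|r| r.creation_time)?;
    let stale_peers = responses
        .iter()
        .filter(|r| r.creation_time < newest.creation_time)
        .map(|r| r.peer)
        .collect();
    Some((newest, stale_peers))
}
```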

Drawbacks

In theory the new protocol creates a bit more traffic on the DHT network, because it waits for DHT records to be received from more than one node, while in the current implementation we just take the first record that we receive and cancel all in-flight requests to other peers. However, because the redundancy factor will be relatively small and this operation happens rarely (every 10 minutes), this cost is negligible.

Testing, Security, and Privacy

This RFC's implementation (https://github.com/paritytech/polkadot-sdk/pull/3786) has been tested on various local test networks and on Versi.

With regard to security, the creation time is wrapped inside SignedAuthorityRecord, so it will be signed with the authority id key. Malicious nodes therefore cannot manipulate this field without the receiving node noticing.

Performance, Ergonomics, and Compatibility

Irrelevant.

Performance

Irrelevant.

Ergonomics

Irrelevant.

Compatibility

The changes are backwards compatible with the existing protocol, so nodes running the old protocol and nodes running the new protocol can coexist in the network. This is achieved by using protobuf for serializing and deserializing the records: new fields are ignored when deserializing with the older protocol, and when an old record is deserialized with the new protocol the new field is None and the new code accepts the record as valid.

Prior Art and References

The enhancements have been inspired by the algorithm specified here.

Unresolved Questions

N/A

Future Directions and Related Material

N/A

RFC-0097: Unbonding Queue

Date: 19.06.2024
Description: This RFC proposes a safe mechanism to scale the unbonding time from staking on the Relay Chain proportionally to the overall unbonding stake. This approach significantly reduces the expected duration for unbonding, while ensuring that a substantial portion of the stake is always available for slashing validators that behave maliciously within a 28-day window.
Authors: Jonas Gehrlein & Alistair Stewart

Summary

This RFC proposes a flexible unbonding mechanism for tokens that are locked from staking on the Relay Chain (DOT/KSM), aiming to enhance user convenience without compromising system security.

Locking tokens for staking ensures that Polkadot is able to slash tokens backing misbehaving validators. When changing the locking period, we still need to make sure that Polkadot can slash enough tokens to deter misbehaviour. This means that not all tokens can be unbonded immediately; however, we can still allow some tokens to be unbonded quickly.

The new mechanism leads to a significantly reduced unbonding time on average, by queuing up new unbonding requests and scaling their unbonding duration relative to the size of the queue. New requests are executed after a minimum of 2 days, when the queue is comparatively empty, and after the conventional 28 days if the sum of requests (in terms of stake) exceeds some threshold. In scenarios between these two bounds, the unbonding duration scales proportionately. The new mechanism will never be worse than the current fixed 28 days.

In this document we also present an empirical analysis by retrospectively fitting the proposed mechanism to the historic unbonding timeline and show that the average unbonding duration would drastically reduce, while still being sensitive to large unbonding events. Additionally, we discuss implications for UI, UX, and conviction voting.

Note: Our proposition solely focuses on the locks imposed from staking. Other locks, such as governance, remain unchanged. Also, this mechanism should not be confused with the already existing feature of FastUnstake, which lets users unstake tokens immediately that have not received rewards for 28 days or longer.

As an initial step to gauge its effectiveness and stability, it is recommended to implement and test this model on Kusama before considering its integration into Polkadot, with appropriate adjustments to the parameters. In the following, however, we limit our discussion to Polkadot.

Motivation

Polkadot has one of the longest unbonding periods among all Proof-of-Stake protocols, because security is the most important goal. Staking on Polkadot is still attractive compared to other protocols because of its above-average staking APY. However, the long unbonding period harms usability and deters potential participants who want to contribute to the security of the network.

The current length of the unbonding period imposes significant costs for any entity that even wants to perform basic tasks such as a reorganization / consolidation of their stashes, or updating their private key infrastructure. It also limits participation of users that have a large preference for liquidity.

The combination of long unbonding periods and high returns has led to the proliferation of liquid staking, where parachains or centralised exchanges offer users their staked tokens before the 28-day unbonding period is over, either in original DOT/KSM form or as derivative tokens. Liquid staking is harmless if few tokens are involved, but it could result in many validators being selected by a few entities if a large fraction of DOTs were involved. This may lead to centralization (see here for more discussion on threats of liquid staking) and an opportunity for attacks.

The new mechanism greatly increases the competitiveness of Polkadot, while maintaining sufficient security.

Stakeholders

  • Every DOT/KSM token holder

Explanation

Before diving into the details of how to implement the unbonding queue, we give readers context about why Polkadot has a 28-day unbonding period in the first place. The reason for it is to prevent long-range attacks (LRAs), which become theoretically possible if more than 1/3 of validators collude. In essence, an LRA describes the inability of users who disconnect from the consensus at time t0 and reconnect later to realize that validators which were legitimate at t0, but dropped out in the meantime, are not to be trusted anymore. That means, for example, that a user syncing the state could be fooled into trusting validators that fell outside the active set of validators after t0 and are building a competing, malicious chain (fork).

LRAs of longer than 28 days are mitigated by the use of trusted checkpoints, which are assumed to be no more than 28 days old. A new node that syncs Polkadot will start at the checkpoint and look for proofs of finality of later blocks, signed by 2/3 of the validators. In an LRA fork, some of the validator sets may be different but only if 2/3 of some validator set in the last 28 days signed something incorrect.

If we detect an LRA of no more than 28 days with the current unbonding period, then we should be able to detect misbehaviour from over 1/3 of validators whose nominators are still bonded. The stake backing these validators is a considerable fraction of the total stake (empirically it is 0.287 or so). If we allowed more than this stake to unbond, without checking who it was backing, then the LRA might be free of cost for an attacker. The proposed mechanism allows up to half this stake to unbond within 28 days. This halves the amount of tokens that can be slashed, but this is still very high in absolute terms. For example, at the time of writing (19.06.2024) this would translate to around 120 million DOT.

Attacks other than an LRA, such as backing incorrect parachain blocks, should be detected and slashed within 2 days. This is why the mechanism has a minimum unbonding period.

In practice an LRA does not affect clients that follow consensus more frequently than every 2 days, such as running nodes or bridges. However, any time a node syncs Polkadot, it could be misled if an attacker is able to connect to it first.

In short, in the light of the huge benefits obtained, we are fine by only keeping a fraction of the total stake of validators slashable against LRAs at any given time.

Mechanism

When a user (nominator or validator) decides to unbond their tokens, they don't become instantly available. Instead, they enter an unbonding queue. The following specification illustrates how the queue works, given a user wants to unbond some portion of their stake denoted as new_unbonding_stake. We also store a variable, max_unstake that tracks how much stake we allow to unbond potentially earlier than 28 eras (28 days on Polkadot and 7 days on Kusama).

To calculate max_unstake, we record for each era how much stake was used to back the lowest-backed 1/3 of validators. We store this information for the last 28 eras and let min_lowest_third_stake be the minimum of this over the last 28 eras. max_unstake is determined by MIN_SLASHABLE_SHARE * min_lowest_third_stake. In addition, we can use UPPER_BOUND and LOWER_BOUND as variables to scale the unbonding duration of the queue.
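
The bookkeeping for max_unstake can be sketched as a sliding window over the last 28 eras. The struct and method names are hypothetical, stake is modelled as a plain u128, and MIN_SLASHABLE_SHARE is written as the fraction 1/2 from the recommended parameters.

```rust
use std::collections::VecDeque;

/// Number of eras whose lowest-third stake is remembered.
const HISTORY_ERAS: usize = 28;
/// MIN_SLASHABLE_SHARE expressed as a fraction (recommended value: 1/2).
const MIN_SLASHABLE_SHARE_NUM: u128 = 1;
const MIN_SLASHABLE_SHARE_DEN: u128 = 2;

/// Hypothetical storage item: stake backing the lowest-backed 1/3 of
/// validators, one entry per era, oldest first.
struct LowestThirdHistory {
    eras: VecDeque<u128>,
}

impl LowestThirdHistory {
    fn new() -> Self {
        Self { eras: VecDeque::new() }
    }

    /// Records the value for a finished era and drops the oldest entry once
    /// more than `HISTORY_ERAS` entries would be stored.
    fn record_era(&mut self, lowest_third_stake: u128) {
        if self.eras.len() == HISTORY_ERAS {
            self.eras.pop_front();
        }
        self.eras.push_back(lowest_third_stake);
    }

    /// max_unstake = MIN_SLASHABLE_SHARE * min_lowest_third_stake.
    /// Assumption: with no history yet, nothing may unbond early.
    fn max_unstake(&self) -> u128 {
        let min_lowest_third_stake = self.eras.iter().copied().min().unwrap_or(0);
        min_lowest_third_stake * MIN_SLASHABLE_SHARE_NUM / MIN_SLASHABLE_SHARE_DEN
    }
}
```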

At any time we store back_of_unbonding_queue_block_number which expresses the block number when all the existing unbonders have unbonded.

Let's assume a user wants to unbond some of their stake, i.e., new_unbonding_stake, and issues the request at some arbitrary block number denoted as current_block_number. Then:

unbonding_time_delta = new_unbonding_stake / max_unstake * UPPER_BOUND

This delta is added to back_of_unbonding_queue_block_number, and the user's resulting unbonding block must neither undercut current_block_number + LOWER_BOUND nor exceed current_block_number + UPPER_BOUND.

back_of_unbonding_queue_block_number = max(current_block_number, back_of_unbonding_queue_block_number) + unbonding_time_delta

This determines at which block the user has their tokens unbonded, making sure that it is in the limit of LOWER_BOUND and UPPER_BOUND.

unbonding_block_number = min(UPPER_BOUND, max(back_of_unbonding_queue_block_number - current_block_number, LOWER_BOUND)) + current_block_number

Ultimately, the user's tokens are unbonded at unbonding_block_number.
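
The three formulas above can be combined into a small sketch. The UnbondingQueue struct, the integer types, and the fallback to UPPER_BOUND when max_unstake is zero are assumptions made for illustration. The division is performed last so that integer arithmetic does not truncate the ratio to zero.

```rust
/// Recommended bounds from this RFC, in blocks.
const LOWER_BOUND: u64 = 28_800;
const UPPER_BOUND: u64 = 403_200;

/// Hypothetical queue state.
struct UnbondingQueue {
    back_of_unbonding_queue_block_number: u64,
}

impl UnbondingQueue {
    /// Enqueues a request and returns `unbonding_block_number`.
    fn unbond(
        &mut self,
        new_unbonding_stake: u128,
        max_unstake: u128,
        current_block_number: u64,
    ) -> u64 {
        // unbonding_time_delta = new_unbonding_stake / max_unstake * UPPER_BOUND
        // Assumption: a zero max_unstake falls back to the full UPPER_BOUND.
        let unbonding_time_delta: u64 = new_unbonding_stake
            .saturating_mul(u128::from(UPPER_BOUND))
            .checked_div(max_unstake)
            .map_or(UPPER_BOUND, |delta| u64::try_from(delta).unwrap_or(u64::MAX));

        // Move the back of the queue; it never lies before the current block.
        self.back_of_unbonding_queue_block_number = self
            .back_of_unbonding_queue_block_number
            .max(current_block_number)
            .saturating_add(unbonding_time_delta);

        // Clamp the waiting time to [LOWER_BOUND, UPPER_BOUND].
        let wait = (self.back_of_unbonding_queue_block_number - current_block_number)
            .max(LOWER_BOUND)
            .min(UPPER_BOUND);
        current_block_number + wait
    }
}
```

With max_unstake = 1000 and an empty queue at block 1,000,000, a request of 10 tokens waits for LOWER_BOUND, while a request of 2000 tokens is capped at UPPER_BOUND.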

Proposed Parameters

There are a few constants to be exogenously set. They are up for discussion, but we make the following recommendation:

  • MIN_SLASHABLE_SHARE: 1/2 - This is the share of stake backing the lowest 1/3 of validators that is slashable at any point in time. It offers a trade-off between security and unbonding time. Half is a sensible choice. Here, we have sufficient stake to slash while allowing for a short average unbonding time.
  • LOWER_BOUND: 28800 blocks (or 2 eras): This value represents a minimum unbonding time of 2 days for any stake.
  • UPPER_BOUND: 403200 blocks (or 28 eras): This value represents the maximum unbonding time a user can face. It equals the current unbonding time and should be familiar to users.

Rebonding

Users that chose to unbond might want to cancel their request and rebond. There is no security loss in doing this, but with the scheme above, it could imply that a large unbond increases the unbonding time for everyone else later in the queue. When the large stake is rebonded, however, the participants later in the queue move forward and can unbond more quickly than originally estimated. It would require an additional extrinsic by the user though.

Thus, we should store the unbonding_time_delta with the unbonding account. If it rebonds when it is still unbonding, then this value should be subtracted from back_of_unbonding_queue_block_number. So unbonding and rebonding leaves this number unaffected. Note that we must store unbonding_time_delta, because in later eras max_unstake might have changed and we cannot recompute it.
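
A minimal sketch of this bookkeeping, assuming a hypothetical UnbondingChunk record and free functions enqueue and rebond (none of these names come from the RFC): the delta computed at unbonding time is stored with the chunk and subtracted again on rebond, rather than being recomputed from a max_unstake that may have changed.

```rust
/// Hypothetical record stored with an unbonding account.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UnbondingChunk {
    pub stake: u128,
    /// Delta computed when the unbond was requested. It is stored because
    /// max_unstake may differ in later eras and cannot be recomputed.
    pub unbonding_time_delta: u64,
}

/// Advances the back of the queue when a chunk starts unbonding.
pub fn enqueue(back_of_queue: u64, current_block_number: u64, chunk: &UnbondingChunk) -> u64 {
    back_of_queue
        .max(current_block_number)
        .saturating_add(chunk.unbonding_time_delta)
}

/// Moves the back of the queue forward again when the chunk is rebonded.
pub fn rebond(back_of_queue: u64, chunk: &UnbondingChunk) -> u64 {
    back_of_queue.saturating_sub(chunk.unbonding_time_delta)
}
```

Unbonding followed by rebonding returns the queue to its previous position, so later participants are not penalised by a cancelled request.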

Empirical Analysis

We can use the proposed unbonding queue calculation, with the recommended parameters, and simulate the queue over the course of Polkadot's unbonding history. Instead of doing the analysis on a per-block basis, we calculate it on a daily basis. To simulate the unbonding queue, we require the ratio between the daily total stake of the lowest third backed validators and the daily total stake (which determines the max_unstake) and the sum of daily and newly unbonded tokens. Due to the NPoS algorithm, the first number has only small variations and we used a constant as approximation (0.287) determined by sampling a bunch of empirical eras. At this point, we want to thank Parity's Data team for allowing us to leverage their data infrastructure in these analyses.

The following graph plots said statistics.

Empirical Queue

The abovementioned graph combines two metrics into a single graph.

  • Unbonded Amount: The number of daily and newly unbonded tokens over time, scaled to the y-axis of 28 days. In particular, it is normalized by daily_unbonded / max(daily_unbonded) * 28.
  • Unbonding Days: The daily expected unbonding days given the history of daily_unbonded.

We can observe that historical unbonds only trigger an unbonding time larger than LOWER_BOUND in situations with extensive and/or clustered unbonding amounts. The average unbonding time across the whole timeseries is ~2.67 days. We can, however, see it taking effect pushing unbonding times up during large unbonding events. In the largest events, we hit a maximum of 28 days. This gives us reassurance that it is sufficiently sensitive and it makes sense to match the UPPER_BOUND with the historically largest unbonds.

The main parameter affecting the situation is the max_unstake. The relationship is obvious: decreasing the max_unstake makes the queue more sensitive, i.e., having it spike more quickly and higher with unbonding events. Given that these events historically were mostly associated with parachain auctions, we can assume that, in the absence of major systemic events, users will experience drastically reduced unbonding times. The analysis can be reproduced or changed to other parameters using this repository.

Additional Considerations

Deferred slashing

Currently we defer applying many slashes until around 28 days have passed. This was implemented so we can conveniently cancel slashes via governance in the case that the slashing was due to a bug. While rare on Polkadot, such bugs cause a significant fraction of slashes. This includes slashing for attacks other than LRAs, for which we've assumed that 2 days is enough to slash. But 2 days is not enough to cancel slashes via OpenGov.

Owing to the way exposures (which nominators back which validators with how many tokens) are stored, it is hard to search on chain for whether a nominator currently has deferred slashes that need to be applied to them. So we cannot simply check when a nominator attempts to withdraw their bond.

We can solve this by freezing the unbonding queue while there are pending slashes in the staking system. In the worst case, where the slash is applied, we would force all members of the queue to unbond with 28 days minus the days they have already spent in the queue (i.e., nobody ever needs to wait more than 28 days) and pause the unbonding queue until there are no deferred slashes in the system. This solution is potentially easier to implement but could cause disruptions for unbonding stakers that are not slashed, because they do not benefit from the queue. It is crucial to note that unbonding is still always possible for all stakers in the usual 28 days. Since slashes should occur rarely, this should not cause disruptions too often in practice. In addition, we could complement the solution by adding a new extrinsic with which any account can point out the unbonding accounts that have deferred slashes. The chain would then set the unbonding_block_number of the affected accounts to after the time when the slash would be applied, which will be no more than 28 days from the time the staker unbonded. After removing the offenders from the queue, we could unfreeze the unbonding queue and restore operation for unslashed accounts immediately. Finding nominators with deferred slashes, however, requires iterating through all nominators, which is only feasible off chain. The non-slashed unbonding accounts should have plenty of incentive to do so, since they seek to reduce the opportunity cost of being forced to wait potentially much longer than necessary.

This solution resolves the situation securely: in the worst case, where no user submits the extrinsic, no staker would exceed the usual unbonding duration of 28 days and all slashes would be applied as intended.

UX/UI

Owing to the nature of the unbonding queue, the more a user slices up their stake to be unbonded, the shorter their expected unbonding time becomes. This, however, comes at the cost of creating more and/or larger transactions, i.e., incurring higher transaction costs. We leave it to UI implementations to provide a good UX to inform users about this trade-off and help them find their individual willingness to pay to unbond even faster. For most users, splitting up their stake will not lead to any meaningful advantage because their effect on the queue is negligible.

Conviction voting

Changing the (expected) unbonding period has an indirect impact on conviction voting, because the governance locks do not stack with the staking locks. In other words, if a user is already locked in staking, they can, for free, choose a conviction vote whose lock is shorter than or equal to that locking time. Currently, with a fixed unbonding period of 28 days, this means that the 3x conviction vote comes essentially for free. There have been discussions about rescaling the conviction weights to improve the parametrization. But the transition between the old locks and new locks poses significant challenges.

We argue that, under our unbonding queue, the current conviction voting scheme aligns better with voters' impact on governance, avoiding an expensive migration of existing locks to a new scheme. For example, if the average unbonding period is around 2 days from staking, locking tokens for an additional 26 days justifies a higher weight (in this case 3x). Voters that seek maximum liquidity are free to do so, but it is fair for them to be weighted less in governance decisions that naturally affect the long-term success of Polkadot.

Potential Extension

In addition to a simple queue, we could add a market component that lets users always unbond from staking at the minimum possible waiting time (== LOWER_BOUND, e.g., 2 days) by paying a variable fee. To achieve this, it is reasonable to split the total unbonding capacity into two chunks, with the first capacity for the simple queue and the remaining capacity for the fee-based unbonding. By doing so, we allow users to choose whether they want the quickest unbond by paying a dynamic fee or to join the simple queue. Setting a capacity restriction for both queues enables us to guarantee a predictable unbonding time in the simple queue, while allowing users with the respective willingness to pay to get out even earlier. The fees are dynamically adjusted and are proportional to the unbonding stake (and thereby expressed as a percentage of the requested unbonding stake). In contrast to a unified queue, this prevents the issue that users paying a fee jump in front of other users not paying a fee, pushing their unbonding time back (which would be bad for UX). The revenue generated could be burned.

This extension and further specifications are left out of this RFC, because it adds further complexity and the empirical analysis above suggests that average unbonding times will already be close to the LOWER_BOUND, making a more complex design unnecessary. We advise first implementing the discussed mechanism and assessing after some experience whether an extension is desirable.

Drawbacks

  • Lower security for LRAs: Without a doubt, the theoretical security against LRAs decreases. But, as we argue, the attack is still costly enough to deter attacks and the attack is sufficiently theoretical. Here, the benefits outweigh the costs.
  • Griefing attacks: A large holder could pretend to unbond a large amount of their tokens to prevent other users from exiting the network earlier. This would, however, be costly because the holder loses out on staking rewards. The larger the impact on the queue, the higher the costs. In any case, it must be noted that the UPPER_BOUND is still 28 days, which means that nominators are never left with a longer unbonding period than currently. There is not enough gain for the attacker to endure this cost.
  • Challenge for Custodians and Liquid Staking Providers: Changing the unbonding time, especially making it flexible, requires entities that offer staking derivatives to rethink and rework their products.

Testing, Security, and Privacy

NA

Performance, Ergonomics, and Compatibility

NA

Performance

The authors cannot see any potential impact on performance.

Ergonomics

The authors cannot see any potential impact on ergonomics for developers. We discussed potential impact on UX/UI for users above.

Compatibility

The authors cannot see any potential impact on compatibility. This should be assessed by the technical fellows.

Prior Art and References

RFC-0099: Introduce a transaction extension version

Start Date: 03 July 2024
Description: Introduce a versioning for transaction extensions.
Authors: Bastian Köcher

Summary

This RFC proposes a change to the extrinsic format to include a transaction extension version.

Motivation

The extrinsic format can be extended with transaction extensions. These transaction extensions are runtime specific and can differ per chain. Each transaction extension can add data to the extrinsic itself or extend the signed payload. This means that adding a transaction extension breaks the chain-specific extrinsic format. A recent example was the introduction of CheckMetadataHash to Polkadot and all its system chains. As the extension added one byte to the extrinsic, it broke a lot of tooling. Introducing an extra version for the transaction extensions makes it possible to change these transaction extensions while remaining backwards compatible. Based on the version of the transaction extensions, each chain runtime could decode the extrinsic correctly and also create the correct signed payload.

Stakeholders

  • Runtime users
  • Runtime devs
  • Wallet devs

Explanation

RFC84 introduced the extrinsic format 5. The idea is to piggyback onto this change of the extrinsic format to add the extra version for the transaction extensions. If required, this could also come as extrinsic format 6, but 5 is not yet deployed anywhere.

The extrinsic format supports the following types of transactions:

  • Bare: Does not add anything to the extrinsic.
  • Signed: (Address, Signature, Extensions)
  • General: Extensions

The Signed and General transaction would change to:

  • Signed: (Address, Signature, Version, Extensions)
  • General: (Version, Extensions)

The Version is a SCALE-encoded u8 representing the version of the transaction extensions.

In the chain runtime the version can be used to determine which set of transaction extensions should be used to decode and to validate the transaction.
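
As a rough illustration of version-based decoding, the sketch below reads the leading version byte and selects the matching set of extensions. Everything in it is hypothetical: the Extensions variants, their fields, and the fixed-width little-endian layout are stand-ins chosen for brevity, not the encoding of any real runtime.

```rust
/// Hypothetical extension sets; a real runtime defines its own.
#[derive(Debug, PartialEq, Eq)]
pub enum Extensions {
    V0 { nonce: u32 },
    V1 { nonce: u32, mode: u8 },
}

#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    UnexpectedEof,
    UnknownVersion(u8),
}

/// Reads a little-endian u32 from the start of `bytes`.
fn read_u32_le(bytes: &[u8]) -> Result<u32, DecodeError> {
    let slice = bytes.get(..4).ok_or(DecodeError::UnexpectedEof)?;
    let mut raw = [0u8; 4];
    raw.copy_from_slice(slice);
    Ok(u32::from_le_bytes(raw))
}

/// Decodes `(Version, Extensions)`: the first byte picks the decoder.
pub fn decode_extensions(input: &[u8]) -> Result<Extensions, DecodeError> {
    let (version, rest) = input.split_first().ok_or(DecodeError::UnexpectedEof)?;
    match *version {
        0 => Ok(Extensions::V0 { nonce: read_u32_le(rest)? }),
        1 => {
            let nonce = read_u32_le(rest)?;
            let mode = *rest.get(4).ok_or(DecodeError::UnexpectedEof)?;
            Ok(Extensions::V1 { nonce, mode })
        }
        unknown => Err(DecodeError::UnknownVersion(unknown)),
    }
}
```

Old transactions carrying version 0 keep decoding after version 1 is introduced, and an unrecognised version is rejected instead of being misparsed.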

Drawbacks

This adds one byte more to each signed transaction.

Testing, Security, and Privacy

There is no impact on testing, security or privacy.

Performance, Ergonomics, and Compatibility

This will ensure that changes to the transaction extensions can be made in a backwards compatible way.

Performance

There is no performance impact.

Ergonomics

Runtime developers need to take care of the versioning and make sure to bump the version as required, so that there are no compatibility-breaking changes without a version bump. It will also add a little more code to the runtime to decode these old versions, but this should be negligible.

Compatibility

When introduced together with extrinsic format version 5 from RFC84, it can be implemented in a backwards compatible way. So, transactions can still be sent using the old extrinsic format and decoded by the runtime.

Prior Art and References

None.

Unresolved Questions

None.

RFC-0101: XCM Transact remove require_weight_at_most parameter

Start Date: 12 July 2024
Description: Remove require_weight_at_most parameter from XCM Transact
Authors: Adrian Catangiu

Summary

The Transact XCM instruction currently forces the user to set a specific maximum weight allowed to the inner call and then also pay for that much weight regardless of how much the call actually needs in practice.

This RFC proposes improving the usability of Transact by removing that parameter and instead getting and charging the actual weight of the inner call from its dispatch info on the remote chain.

Motivation

The UX of using Transact is poor because of having to guess/estimate the require_weight_at_most weight used by the inner call on the target.

We've seen multiple Transact on-chain failures caused by guessing wrong values for this require_weight_at_most even though the rest of the XCM program would have worked.

In practice, this parameter only adds UX overhead with no real practical value. Use cases fall in one of two categories:

  1. Unpaid execution of Transacts - in these cases the require_weight_at_most is not really useful, caller doesn't have to pay for it, and on the call site it either fits the block or not;
  2. Paid execution of single Transact - the weight to be spent by the Transact is already covered by the BuyExecution weight limit parameter.

We've had multiple OpenGov root/whitelisted_caller proposals initiated by core-devs completely or partially fail because of incorrect configuration of require_weight_at_most parameter. This is a strong indication that the instruction is hard to use.

Stakeholders

  • Runtime Users
  • Runtime Devs
  • Wallets
  • dApps

Explanation

The proposed enhancement is simple: remove require_weight_at_most parameter from the instruction:

- Transact { origin_kind: OriginKind, require_weight_at_most: Weight, call: DoubleEncoded<Call> },
+ Transact { origin_kind: OriginKind, call: DoubleEncoded<Call> },

The XCVM implementation shall no longer use require_weight_at_most for weighing. Instead, it shall weigh the Transact instruction by decoding and weighing the inner call.
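
To make the new weighing step concrete, here is a small sketch in which the executor decodes the inner call and reads its weight from the call itself. The Call enum, its one-byte encoding, and the weight numbers are hypothetical placeholders, and Weight is a simplified two-field struct rather than the real runtime type.

```rust
/// Simplified two-dimensional weight, standing in for the runtime's type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Weight {
    pub ref_time: u64,
    pub proof_size: u64,
}

/// Hypothetical runtime call with a toy one-byte encoding.
#[derive(Debug, PartialEq, Eq)]
pub enum Call {
    Remark,
    Transfer,
}

#[derive(Debug, PartialEq, Eq)]
pub enum WeighError {
    FailedToDecode,
}

impl Call {
    /// Decodes the toy encoding; a real chain would SCALE-decode here.
    pub fn decode(encoded: &[u8]) -> Option<Call> {
        match encoded {
            [0] => Some(Call::Remark),
            [1] => Some(Call::Transfer),
            _ => None,
        }
    }

    /// Placeholder for the weight reported by the call's dispatch info.
    pub fn dispatch_weight(&self) -> Weight {
        match self {
            Call::Remark => Weight { ref_time: 100, proof_size: 0 },
            Call::Transfer => Weight { ref_time: 200, proof_size: 64 },
        }
    }
}

/// Weighs a Transact by decoding its inner call; no caller-supplied
/// require_weight_at_most is involved.
pub fn weigh_transact(encoded_call: &[u8]) -> Result<Weight, WeighError> {
    let call = Call::decode(encoded_call).ok_or(WeighError::FailedToDecode)?;
    Ok(call.dispatch_weight())
}
```

The sender no longer guesses a limit: the weight charged is whatever the decoded call reports on the executing chain.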

Drawbacks

No drawbacks: existing scenarios work as before, while this also allows new/easier flows.

Testing, Security, and Privacy

Currently, an XCVM implementation can weigh a message just by looking at the decoded instructions without decoding the Transact's call, but assuming require_weight_at_most weight for it. With the new version it has to decode the inner call to know its actual weight.

But this does not actually change the security considerations, as can be seen below.

With the new Transact the weighing happens after decoding the inner call. The entirety of the XCM program containing this Transact needs to be either covered by enough bought weight using a BuyExecution, or the origin has to be allowed to do free execution.

The security considerations around how much someone can execute for free are the same for both this new version and the old. In both cases, an "attacker" can do the XCM decoding (including Transact inner calls) for free by adding a large enough BuyExecution without actually having the funds available.

In both cases, decoding is done for free, but in both cases execution fails early on BuyExecution.

Performance, Ergonomics, and Compatibility

Performance

No performance change.

Ergonomics

Ergonomics are slightly improved by simplifying Transact API.

Compatibility

Compatible with previous XCM programs.

Prior Art and References

None.

Unresolved Questions

None.

RFC-0108: Remove XCM testnet NetworkIds

Start Date: 23 July 2024
Description: Remove the NetworkIds for testnets Westend and Rococo
Authors

Summary

This RFC aims to remove the NetworkIds of Westend and Rococo, arguing that testnets shouldn't go in the language.

Motivation

We've already seen the plans to phase out Rococo and Paseo has appeared. Instead of constantly changing the testnets included in the language, we should favor specifying them via their genesis hash, using NetworkId::ByGenesis.

Stakeholders

  • Runtime devs
  • Wallets
  • dApps

Explanation

Remove Westend and Rococo from the included NetworkIds in the language.

Drawbacks

This RFC will make it less convenient to specify a testnet, but not by a large amount.

Testing, Security, and Privacy

None.

Performance, Ergonomics, and Compatibility

Performance

None.

Ergonomics

It will very slightly reduce the ergonomics of testnet developers but improve the stability of the language.

Compatibility

NetworkId::Rococo and NetworkId::Westend can just use NetworkId::ByGenesis, as can other testnets.
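
The following sketch shows what referring to a testnet by genesis hash could look like. The NetworkId enum here is a trimmed, illustrative copy with only three variants, and EXAMPLE_TESTNET_GENESIS is a made-up placeholder, not the genesis hash of any real network.

```rust
/// Trimmed, illustrative copy of the XCM NetworkId enum.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum NetworkId {
    /// A network identified by the hash of its genesis block.
    ByGenesis([u8; 32]),
    Polkadot,
    Kusama,
}

/// Placeholder value; a real deployment would use the testnet's actual
/// genesis hash.
pub const EXAMPLE_TESTNET_GENESIS: [u8; 32] = [0x11; 32];

/// Builds the identifier for the example testnet.
pub fn example_testnet() -> NetworkId {
    NetworkId::ByGenesis(EXAMPLE_TESTNET_GENESIS)
}

/// Checks whether an identifier refers to the example testnet.
pub fn is_example_testnet(id: &NetworkId) -> bool {
    matches!(id, NetworkId::ByGenesis(hash) if *hash == EXAMPLE_TESTNET_GENESIS)
}
```

Code that previously matched on a dedicated testnet variant would instead compare the genesis hash, which works for any testnet without changing the language.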

Prior Art and References

A previous attempt to add NetworkId::Paseo: https://github.com/polkadot-fellows/xcm-format/pull/58.

Unresolved Questions

None.

RFC-0004: Remove the host-side runtime memory allocator

Start Date: 2023-07-04
Description: Update the runtime-host interface to no longer make use of a host-side allocator
Authors: Pierre Krieger

Summary

Update the runtime-host interface to no longer make use of a host-side allocator.

Motivation

The heap allocation of the runtime is currently controlled by the host using a memory allocator on the host side.

The API of many host functions consists in allocating a buffer. For example, when calling ext_hashing_twox_256_version_1, the host allocates a 32-byte buffer using the host allocator, and returns a pointer to this buffer to the runtime. The runtime later has to call ext_allocator_free_version_1 on this pointer in order to free the buffer.

Even though no benchmark has been done, it is pretty obvious that this design is very inefficient. To continue with the example of ext_hashing_twox_256_version_1, it would be more efficient to instead write the output hash to a buffer that was allocated by the runtime on its stack and passed by pointer to the function. Allocating a buffer on the stack in the worst case scenario simply consists in decreasing a number, and in the best case scenario is free. Doing so would save many Wasm memory reads and writes by the allocator, and would save a function call to ext_allocator_free_version_1.

Furthermore, the existence of the host-side allocator has become questionable over time. It is implemented in a very naive way, and for determinism and backwards compatibility reasons it needs to be implemented exactly identically in every client implementation. Runtimes make substantial use of heap memory allocations, and each allocation needs to go twice through the runtime <-> host boundary (once for allocating and once for freeing). Moving the allocator to the runtime side, while it would increase the size of the runtime, would be a good idea. But before the host-side allocator can be deprecated, all the host functions that make use of it need to be updated to not use it.

Stakeholders

No attempt was made at convincing stakeholders.

Explanation

New host functions

This section contains a list of new host functions to introduce.

(func $ext_storage_read_version_2
    (param $key i64) (param $value_out i64) (param $offset i32) (result i64))
(func $ext_default_child_storage_read_version_2
    (param $child_storage_key i64) (param $key i64) (param $value_out i64)
    (param $offset i32) (result i64))

The signature and behaviour of ext_storage_read_version_2 and ext_default_child_storage_read_version_2 are identical to those of their version 1 counterparts, but the return value has a different meaning. The new functions directly return the number of bytes that were written in the value_out buffer. If the entry doesn't exist, a value of -1 is returned. Given that the host must never write more bytes than the size of the buffer in value_out, and that the size of this buffer is expressed as a 32-bit number, a 64-bit value of -1 is not ambiguous.

The runtime execution stops with an error if value_out is outside of the range of the memory of the virtual machine, even if the size of the buffer is 0 or if the amount of data to write would be 0 bytes.
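
A host-side sketch of these semantics is shown below. It assumes that the Wasm memory is modelled as a byte slice, that value_out has already been split into a pointer and a length, and that an offset past the end of the value simply yields zero bytes. The function name, the Trap type, and the treatment of a zero-length buffer at the very end of memory as in range are assumptions of this sketch.

```rust
/// Marker for "runtime execution stops with an error".
#[derive(Debug, PartialEq, Eq)]
pub struct Trap;

/// Sketch of the return-value rules of ext_storage_read_version_2.
/// `value` is the storage entry, or None if it does not exist.
pub fn storage_read_v2(
    memory: &mut [u8],
    value: Option<&[u8]>,
    out_ptr: u32,
    out_len: u32,
    offset: u32,
) -> Result<i64, Trap> {
    // The bounds check comes first: it applies even if nothing is written.
    let start = out_ptr as usize;
    let end = start.checked_add(out_len as usize).ok_or(Trap)?;
    if end > memory.len() {
        return Err(Trap);
    }

    // A missing entry is reported as -1.
    let value = match value {
        Some(v) => v,
        None => return Ok(-1),
    };

    // Copy as many bytes as fit, starting at `offset` within the value.
    let tail = value.get(offset as usize..).unwrap_or(&[]);
    let written = tail.len().min(out_len as usize);
    memory[start..start + written].copy_from_slice(&tail[..written]);
    Ok(written as i64)
}
```

Because the number of bytes written never exceeds a 32-bit buffer length, every successful result is non-negative and cannot be confused with -1.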

(func $ext_storage_next_key_version_2
    (param $key i64) (param $out i64) (result i32))
(func $ext_default_child_storage_next_key_version_2
    (param $child_storage_key i64) (param $key i64) (param $out i64) (result i32))

The behaviour of these functions is identical to their version 1 counterparts. Instead of allocating a buffer, writing the next key to it, and returning a pointer to it, the new version of these functions accepts an out parameter containing a pointer-size to the memory location where the host writes the output. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine, even if the function wouldn't write anything to out. These functions return the size, in bytes, of the next key, or 0 if there is no next key. If the size of the next key is larger than the buffer in out, the bytes of the key that fit the buffer are written to out and any extra byte that doesn't fit is discarded.

Some notes:

  • It is never possible for the next key to be an empty buffer, because an empty key has no preceding key. For this reason, a return value of 0 can unambiguously be used to indicate the lack of next key.
  • The ext_storage_next_key_version_2 and ext_default_child_storage_next_key_version_2 are typically used in order to enumerate keys that start with a certain prefix. Given that storage keys are constructed by concatenating hashes, the runtime is expected to know the size of the next key and can allocate a buffer that can fit said key. When the next key doesn't belong to the desired prefix, it might not fit the buffer, but given that the start of the key is written to the buffer anyway this can be detected in order to avoid calling the function a second time with a larger buffer.
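
The truncation rule and the prefix check described in these notes can be sketched as follows. Both function names are hypothetical, and the out buffer is modelled as a slice that has already passed the memory-range check.

```rust
/// Sketch of the host side: writes as much of the next key as fits into
/// `out` and returns the full key size, or 0 if there is no next key.
pub fn next_key_v2(next_key: Option<&[u8]>, out: &mut [u8]) -> u32 {
    match next_key {
        None => 0,
        Some(key) => {
            let written = key.len().min(out.len());
            out[..written].copy_from_slice(&key[..written]);
            key.len() as u32
        }
    }
}

/// Sketch of the runtime side: decides from the (possibly truncated)
/// bytes in `out` whether the returned key still belongs to `prefix`.
pub fn key_has_prefix(prefix: &[u8], out: &[u8], returned_len: u32) -> bool {
    let available = (returned_len as usize).min(out.len());
    returned_len != 0 && out[..available].starts_with(prefix)
}
```

A key that is too long for the buffer but does not start with the prefix can be rejected immediately, without a second call using a larger buffer.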

(func $ext_hashing_keccak_256_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_keccak_512_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_sha2_256_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_blake2_128_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_blake2_256_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_twox_64_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_twox_128_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_twox_256_version_2
    (param $data i64) (param $out i32))
(func $ext_trie_blake2_256_root_version_3
    (param $data i64) (param $version i32) (param $out i32))
(func $ext_trie_blake2_256_ordered_root_version_3
    (param $data i64) (param $version i32) (param $out i32))
(func $ext_trie_keccak_256_root_version_3
    (param $data i64) (param $version i32) (param $out i32))
(func $ext_trie_keccak_256_ordered_root_version_3
    (param $data i64) (param $version i32) (param $out i32))
(func $ext_default_child_storage_root_version_3
    (param $child_storage_key i64) (param $out i32))
(func $ext_crypto_ed25519_generate_version_2
    (param $key_type_id i32) (param $seed i64) (param $out i32))
(func $ext_crypto_sr25519_generate_version_2
    (param $key_type_id i32) (param $seed i64) (param $out i32) (result i32))
(func $ext_crypto_ecdsa_generate_version_2
    (param $key_type_id i32) (param $seed i64) (param $out i32) (result i32))

The behaviour of these functions is identical to their version 1 or version 2 counterparts. Instead of allocating a buffer, writing the output to it, and returning a pointer to it, the new version of these functions accepts an out parameter containing the memory location where the host writes the output. The output is always of a size known at compilation time. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine.
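
The out-parameter convention can be illustrated with a small host-side helper that writes a fixed-size output into Wasm memory and traps when the destination is out of range. The helper name, the Trap type, and the modelling of memory as a byte slice are assumptions of this sketch; the hash value in the usage note is a dummy.

```rust
/// Marker for "runtime execution stops with an error".
#[derive(Debug, PartialEq, Eq)]
pub struct Trap;

/// Writes an output whose size N is known at compile time to the location
/// designated by `out`, without any host-side allocation.
pub fn write_fixed_output<const N: usize>(
    memory: &mut [u8],
    out: u32,
    output: &[u8; N],
) -> Result<(), Trap> {
    let start = out as usize;
    let end = start.checked_add(N).ok_or(Trap)?;
    // The whole destination range must lie inside the virtual machine memory.
    let destination = memory.get_mut(start..end).ok_or(Trap)?;
    destination.copy_from_slice(&output[..]);
    Ok(())
}
```

For a 32-byte hash, the runtime would reserve 32 bytes on its stack and pass their address as out, so no call to the allocator is needed on either side.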

(func $ext_default_child_storage_root_version_3
    (param $child_storage_key i64) (param $out i32))
(func $ext_storage_root_version_3
    (param $out i32))

The behaviour of these functions is identical to their version 1 and version 2 counterparts. Instead of allocating a buffer, writing the output to it, and returning a pointer to it, the new versions of these functions accept an out parameter containing the memory location where the host writes the output. The output is always of a size known at compilation time. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine.

I have taken the liberty to take the version 1 of these functions as a base rather than the version 2, as a PPP deprecating the version 2 of these functions has previously been accepted: https://github.com/w3f/PPPs/pull/6.

(func $ext_storage_clear_prefix_version_3
    (param $prefix i64) (param $limit i64) (param $removed_count_out i32)
    (result i32))
(func $ext_default_child_storage_clear_prefix_version_3
    (param $child_storage_key i64) (param $prefix i64)
    (param $limit i64) (param $removed_count_out i32) (result i32))
(func $ext_default_child_storage_kill_version_4
    (param $child_storage_key i64) (param $limit i64)
    (param $removed_count_out i32) (result i32))

The behaviour of these functions is identical to their version 2 and 3 counterparts. Instead of allocating a buffer, writing the output to it, and returning a pointer to it, versions 3 and 4 of these functions accept a removed_count_out parameter containing the memory location of an 8-byte buffer where the host writes the number of keys that were removed, in little endian. The runtime execution stops with an error if removed_count_out is outside of the range of the memory of the virtual machine. The functions return 1 to indicate that there are keys remaining, and 0 to indicate that all keys have been removed.
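
A sketch of how a host might report the result of such a call, assuming the removal itself has already happened. The function name, the Trap type, and the modelling of memory as a byte slice are hypothetical.

```rust
/// Marker for "runtime execution stops with an error".
#[derive(Debug, PartialEq, Eq)]
pub struct Trap;

/// Writes the removed-key count as 8 little-endian bytes at
/// `removed_count_out` and returns 1 if keys remain, 0 otherwise.
pub fn report_removed(
    memory: &mut [u8],
    removed_count_out: u32,
    removed: u64,
    keys_remaining: bool,
) -> Result<i32, Trap> {
    let start = removed_count_out as usize;
    let end = start.checked_add(8).ok_or(Trap)?;
    let slot = memory.get_mut(start..end).ok_or(Trap)?;
    slot.copy_from_slice(&removed.to_le_bytes());
    Ok(if keys_remaining { 1 } else { 0 })
}
```

The runtime reads the count back with u64::from_le_bytes from its own 8-byte buffer and repeats the call while the return value is 1.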

Note that there is an alternative proposal to add new host functions with the same names: https://github.com/w3f/PPPs/pull/7. This alternative doesn't conflict with this one except for the version number. One proposal or the other will have to use versions 4 and 5 rather than 3 and 4.

(func $ext_crypto_ed25519_sign_version_2
    (param $key_type_id i32) (param $key i32) (param $msg i64) (param $out i32) (result i32))
(func $ext_crypto_sr25519_sign_version_2
    (param $key_type_id i32) (param $key i32) (param $msg i64) (param $out i32) (result i32))
(func $ext_crypto_ecdsa_sign_version_2
    (param $key_type_id i32) (param $key i32) (param $msg i64) (param $out i32) (result i32))
(func $ext_crypto_ecdsa_sign_prehashed_version_2
    (param $key_type_id i32) (param $key i32) (param $msg i64) (param $out i32) (result i64))

The behaviour of these functions is identical to their version 1 counterparts. The new versions of these functions accept an out parameter containing the memory location where the host writes the signature. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine, even if the function wouldn't write anything to out. The signatures are always of a size known at compilation time. On success, these functions return 0. If the public key can't be found in the keystore, these functions return 1 and do not write anything to out.

Note that the return value is 0 on success and 1 on failure, while the previous version of these functions writes 1 on success (as it represents a SCALE-encoded Some) and 0 on failure (as it represents a SCALE-encoded None). Returning 0 on success and non-zero on failure is consistent with common practices in the C programming language and is less surprising than the opposite.

(func $ext_crypto_secp256k1_ecdsa_recover_version_3
    (param $sig i32) (param $msg i32) (param $out i32) (result i64))
(func $ext_crypto_secp256k1_ecdsa_recover_compressed_version_3
    (param $sig i32) (param $msg i32) (param $out i32) (result i64))

The behaviour of these functions is identical to their version 2 counterparts. The new versions of these functions accept an out parameter containing the memory location where the host writes the recovered public key. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine, even if the function wouldn't write anything to out. The public keys are always of a size known at compilation time. On success, these functions return 0. On failure, these functions return a non-zero value and do not write anything to out.

The non-zero value returned on failure is:

  • 1: incorrect value of R or S
  • 2: incorrect value of V
  • 3: invalid signature

These values are equal to the values returned on error by the version 2 (see https://spec.polkadot.network/chap-host-api#defn-ecdsa-verify-error), but incremented by 1 in order to reserve 0 for success.
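
The mapping from a recovery result to the new return value can be sketched as below. The EcdsaVerifyError enum and its variant names are hypothetical labels for the three failure cases listed above.

```rust
/// Hypothetical labels for the three failure cases.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EcdsaVerifyError {
    BadRS,
    BadV,
    BadSignature,
}

/// Maps a recovery result to the version 3 return value:
/// 0 on success, otherwise the version 2 error value incremented by 1.
pub fn recover_return_code<T>(result: &Result<T, EcdsaVerifyError>) -> i64 {
    match result {
        Ok(_) => 0,
        Err(EcdsaVerifyError::BadRS) => 1,
        Err(EcdsaVerifyError::BadV) => 2,
        Err(EcdsaVerifyError::BadSignature) => 3,
    }
}
```

On the runtime side, a single comparison against 0 distinguishes success from failure, and out is only read in the success case.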

(func $ext_crypto_ed25519_num_public_keys_version_1
    (param $key_type_id i32) (result i32))
(func $ext_crypto_ed25519_public_key_version_2
    (param $key_type_id i32) (param $key_index i32) (param $out i32))
(func $ext_crypto_sr25519_num_public_keys_version_1
    (param $key_type_id i32) (result i32))
(func $ext_crypto_sr25519_public_key_version_2
    (param $key_type_id i32) (param $key_index i32) (param $out i32))
(func $ext_crypto_ecdsa_num_public_keys_version_1
    (param $key_type_id i32) (result i32))
(func $ext_crypto_ecdsa_public_key_version_2
    (param $key_type_id i32) (param $key_index i32) (param $out i32))

These functions supersede the ext_crypto_ed25519_public_key_version_1, ext_crypto_sr25519_public_key_version_1, and ext_crypto_ecdsa_public_key_version_1 host functions.

Instead of calling ext_crypto_ed25519_public_key_version_1 in order to obtain the list of all keys at once, the runtime should instead call ext_crypto_ed25519_num_public_keys_version_1 in order to obtain the number of public keys available, then ext_crypto_ed25519_public_key_version_2 repeatedly. The ext_crypto_ed25519_public_key_version_2 function writes the public key of the given key_index to the memory location designated by out. The key_index must be between 0 (included) and n (excluded), where n is the value returned by ext_crypto_ed25519_num_public_keys_version_1. Execution must trap if key_index is out of range.

The same explanations apply for ext_crypto_sr25519_public_key_version_1 and ext_crypto_ecdsa_public_key_version_1.

Host implementers should be aware that the list of public keys (including their ordering) must not change while the runtime is running. This is most likely done by copying the list of all available keys either at the start of the execution or the first time the list is accessed.
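
One way to satisfy this requirement is to snapshot the keystore once and serve both functions from the copy. The sketch below assumes 32-byte ed25519 public keys; the KeySnapshot type, its methods, and the all_public_keys helper are hypothetical, and writing into Wasm memory is left out for brevity.

```rust
/// Marker for "execution must trap".
#[derive(Debug, PartialEq, Eq)]
pub struct Trap;

/// Copy of the available public keys, taken once so that the count and the
/// ordering stay stable while the runtime is running.
pub struct KeySnapshot {
    keys: Vec<[u8; 32]>,
}

impl KeySnapshot {
    pub fn new(keys: Vec<[u8; 32]>) -> Self {
        KeySnapshot { keys }
    }

    /// Counterpart of ext_crypto_ed25519_num_public_keys_version_1.
    pub fn num_public_keys(&self) -> u32 {
        self.keys.len() as u32
    }

    /// Counterpart of ext_crypto_ed25519_public_key_version_2; traps when
    /// key_index is out of range.
    pub fn public_key(&self, key_index: u32) -> Result<[u8; 32], Trap> {
        self.keys.get(key_index as usize).copied().ok_or(Trap)
    }
}

/// Runtime-side enumeration: ask for the count, then fetch each key.
pub fn all_public_keys(snapshot: &KeySnapshot) -> Result<Vec<[u8; 32]>, Trap> {
    (0..snapshot.num_public_keys())
        .map(|key_index| snapshot.public_key(key_index))
        .collect()
}
```

Since every index below the reported count is valid for the lifetime of the snapshot, the enumeration loop never traps even if the underlying keystore changes concurrently.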

(func $ext_offchain_http_request_start_version_2
  (param $method i64) (param $uri i64) (param $meta i64) (result i32))

The behaviour of this function is identical to its version 1 counterpart. Instead of allocating a buffer, writing the request identifier in it, and returning a pointer to it, the version 2 of this function simply returns the newly-assigned identifier to the HTTP request. On failure, this function returns -1. An identifier of -1 is invalid and is reserved to indicate failure.

(func $ext_offchain_http_request_write_body_version_2
  (param $request_id i32) (param $chunk i64) (param $deadline i64) (result i32))
(func $ext_offchain_http_response_read_body_version_2
  (param $request_id i32) (param $buffer i64) (param $deadline i64) (result i64))

The behaviour of these functions is identical to their version 1 counterparts, except that instead of allocating a buffer, writing two bytes in it, and returning a pointer to it, the new versions simply indicate what happened:

  • For ext_offchain_http_request_write_body_version_2, 0 on success.
  • For ext_offchain_http_response_read_body_version_2, 0 or a non-zero number of bytes on success.
  • -1 if the deadline was reached.
  • -2 if there was an I/O error while processing the request.
  • -3 if the identifier of the request is invalid.

These values are equal to the values returned on error by version 1 (see https://spec.polkadot.network/chap-host-api#defn-http-error), but tweaked in order to reserve non-negative numbers for success.

When it comes to ext_offchain_http_response_read_body_version_2, host implementers must not read too much data at once, so as not to create ambiguity in the returned value. Given that the size of the buffer is always less than or equal to 4 GiB, this is not a problem.

(func $ext_offchain_http_response_wait_version_2
    (param $ids i64) (param $deadline i64) (param $out i32))

The behaviour of this function is identical to its version 1 counterpart, except that instead of allocating a buffer, writing the output to it, and returning a pointer to it, the new version accepts an out parameter containing the memory location where the host writes the output. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine.

The encoding of the response code is also modified compared to its version 1 counterpart: each response code now encodes to 4 little-endian bytes as described below:

  • 100-999: the request has finished with the given HTTP status code.
  • -1 if the deadline was reached.
  • -2 if there was an I/O error while processing the request.
  • -3 if the identifier of the request is invalid.

The buffer passed to out must always have a size of 4 * n where n is the number of elements in the ids.
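As a sketch of how a runtime might read the buffer written by ext_offchain_http_response_wait_version_2, the function below splits the output into 4-byte little-endian codes. The names WaitStatus and decode_wait_output are hypothetical, and the sketch assumes that the negative codes are stored as two's complement 32-bit integers, which the text above implies but does not state.

```rust
/// Hypothetical decoded form of one 4-byte response code.
#[derive(Debug, PartialEq, Eq)]
pub enum WaitStatus {
    /// The request finished with this HTTP status code (100-999).
    Finished(u16),
    DeadlineReached,
    IoError,
    InvalidId,
    /// Any value the RFC does not define.
    Unknown(i32),
}

/// Decodes the `out` buffer, which holds `4 * n` bytes for `n` request ids.
/// Returns `None` if the length is not a multiple of 4.
pub fn decode_wait_output(out: &[u8]) -> Option<Vec<WaitStatus>> {
    if out.len() % 4 != 0 {
        return None;
    }
    let statuses = out
        .chunks_exact(4)
        .map(|chunk| {
            // Assumption: codes are two's complement i32, little endian.
            let code = i32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
            match code {
                100..=999 => WaitStatus::Finished(code as u16),
                -1 => WaitStatus::DeadlineReached,
                -2 => WaitStatus::IoError,
                -3 => WaitStatus::InvalidId,
                other => WaitStatus::Unknown(other),
            }
        })
        .collect();
    Some(statuses)
}
```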

(func $ext_offchain_http_response_header_name_version_1
    (param $request_id i32) (param $header_index i32) (param $out i64) (result i64))
(func $ext_offchain_http_response_header_value_version_1
    (param $request_id i32) (param $header_index i32) (param $out i64) (result i64))

These functions supersede the ext_offchain_http_response_headers_version_1 host function.

Contrary to ext_offchain_http_response_headers_version_1, only one header indicated by header_index can be read at a time. Instead of calling ext_offchain_http_response_headers_version_1 once, the runtime should call ext_offchain_http_response_header_name_version_1 and ext_offchain_http_response_header_value_version_1 multiple times with an increasing header_index, until a value of -1 is returned.

These functions accept an out parameter containing a pointer-size to the memory location where the header name or value should be written. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine, even if the function wouldn't write anything to out.

These functions return the size, in bytes, of the header name or header value. If the request doesn't exist or is in an invalid state (as documented for ext_offchain_http_response_headers_version_1), or if the header_index is out of range, a value of -1 is returned. Given that the host must never write more bytes than the size of the buffer in out, and that the size of this buffer is expressed as a 32-bit number, a 64-bit value of -1 is not ambiguous.

If the buffer in out is too small to fit the entire header name or value, only the bytes that fit are written and the rest are discarded.
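The rules above (trap on an out-of-range buffer first, return -1 for a missing header, otherwise write what fits and return the full size) can be sketched on the host side as follows. This is an illustration under assumptions: read_header_field and the slice-based model of the runtime memory are hypothetical, Err stands in for a trap, and the pointer-size value is taken to hold the pointer in its low 32 bits and the size in its high 32 bits.

```rust
/// Packs a pointer and a length into a pointer-size value
/// (pointer in the low 32 bits, size in the high 32 bits).
pub fn pointer_size(ptr: u32, len: u32) -> i64 {
    (((len as u64) << 32) | ptr as u64) as i64
}

/// Splits a pointer-size value back into `(pointer, size)`.
pub fn unpack_pointer_size(value: i64) -> (u32, u32) {
    let v = value as u64;
    (v as u32, (v >> 32) as u32)
}

/// Hypothetical host-side logic shared by the header name and value functions.
/// `fields` holds either all header names or all header values of one response.
pub fn read_header_field(
    memory: &mut [u8],
    fields: &[&[u8]],
    header_index: u32,
    out: i64,
) -> Result<i64, &'static str> {
    let (ptr, len) = unpack_pointer_size(out);
    let start = ptr as usize;
    let end = start.checked_add(len as usize).ok_or("trap: out overflows")?;
    // The buffer is validated first: an out-of-range `out` traps even if
    // nothing would be written.
    let buf = memory.get_mut(start..end).ok_or("trap: out outside of memory")?;
    let field = match fields.get(header_index as usize) {
        Some(field) => *field,
        None => return Ok(-1),
    };
    // Only the bytes that fit are written; the rest are discarded.
    let n = field.len().min(buf.len());
    buf[..n].copy_from_slice(&field[..n]);
    Ok(field.len() as i64)
}
```

A runtime would call this with header_index 0, 1, 2, and so on until -1 is returned, comparing the returned size against its buffer length to detect truncation.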

(func $ext_offchain_submit_transaction_version_2
    (param $data i64) (result i32))
(func $ext_offchain_http_request_add_header_version_2
    (param $request_id i32) (param $name i64) (param $value i64) (result i32))

Instead of allocating a buffer, writing 1 or 0 in it, and returning a pointer to it, version 2 of these functions returns 0 or 1, where 0 indicates success and 1 indicates failure. The runtime must interpret any non-0 value as failure, but the client must always return 1 in case of failure.

(func $ext_offchain_local_storage_read_version_1
    (param $kind i32) (param $key i64) (param $value_out i64) (param $offset i32) (result i64))

This function supersedes the ext_offchain_local_storage_get_version_1 host function, and uses an API and logic similar to ext_storage_read_version_2.

It reads the offchain local storage key indicated by kind and key starting at the byte indicated by offset, and writes the value to the pointer-size indicated by value_out.

The function returns the number of bytes that were written in the value_out buffer. If the entry doesn't exist, a value of -1 is returned. Given that the host must never write more bytes than the size of the buffer in value_out, and that the size of this buffer is expressed as a 32-bit number, a 64-bit value of -1 is not ambiguous.

The runtime execution stops with an error if value_out is outside of the range of the memory of the virtual machine, even if the size of the buffer is 0 or if the amount of data to write would be 0 bytes.

(func $ext_offchain_network_peer_id_version_1
    (param $out i64))

This function writes the PeerId of the local node to the memory location indicated by out. A PeerId is always 38 bytes long. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine.

(func $ext_input_size_version_1
    (result i64))
(func $ext_input_read_version_1
    (param $offset i64) (param $out i64))

When a runtime function is called, the host uses the allocator to allocate memory within the runtime where to write some input data. These two new host functions provide an alternative way to access the input that doesn't make use of the allocator.

The ext_input_size_version_1 host function returns the size in bytes of the input data.

The ext_input_read_version_1 host function copies some data from the input data to the memory of the runtime. The offset parameter indicates the offset within the input data where to start copying, and must be less than or equal to the value returned by ext_input_size_version_1. The out parameter is a pointer-size containing the buffer where to write to. The runtime execution stops with an error if offset is strictly greater than the size of the input data, or if out is outside of the range of the memory of the virtual machine, even if the amount of data to copy would be 0 bytes.
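The two checks described above can be sketched on the host side as follows. The function name input_read, the slice-based model of the runtime memory, and the use of Err for a trap are hypothetical. The sketch also assumes that the host copies as many bytes as fit in the buffer, and it returns the copied length purely for illustration; the host function itself has no return value.

```rust
/// Hypothetical host-side model of `ext_input_read_version_1`.
/// `out_ptr` and `out_len` are the two halves of the `out` pointer-size.
pub fn input_read(
    input: &[u8],
    memory: &mut [u8],
    offset: u64,
    out_ptr: u32,
    out_len: u32,
) -> Result<usize, &'static str> {
    // An offset equal to the input size is allowed; anything larger traps.
    if offset > input.len() as u64 {
        return Err("trap: offset past the end of the input");
    }
    let start = out_ptr as usize;
    let end = start.checked_add(out_len as usize).ok_or("trap: out overflows")?;
    // The buffer must lie within memory even when 0 bytes would be copied.
    let buf = memory.get_mut(start..end).ok_or("trap: out outside of memory")?;
    let remaining = &input[offset as usize..];
    // Assumption: copy whatever fits, never more than the buffer holds.
    let n = remaining.len().min(buf.len());
    buf[..n].copy_from_slice(&remaining[..n]);
    Ok(n)
}
```

A runtime would typically call ext_input_size_version_1 first, reserve a buffer of that size in its own memory, and then issue a single read at offset 0.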

Other changes

In addition to the new host functions, this RFC proposes two changes to the runtime-host interface:

  • The following function signature is now also accepted for runtime entry points: (func (result i64)).
  • Runtimes no longer need to expose a constant named __heap_base.

All the host functions that are being superseded by new host functions are now considered deprecated and should no longer be used. The following other host functions are similarly also considered deprecated:

  • ext_storage_get_version_1
  • ext_default_child_storage_get_version_1
  • ext_allocator_malloc_version_1
  • ext_allocator_free_version_1
  • ext_offchain_network_state_version_1

Drawbacks

This RFC might be difficult to implement in Substrate due to the internal code design. It is not clear to the author of this RFC how difficult it would be.

Prior Art

The API of these new functions is heavily inspired by the APIs commonly used in the C programming language.

Unresolved Questions

The changes in this RFC would need to be benchmarked. This involves implementing the RFC and measuring the speed difference.

It is expected that most host functions are as fast as or faster than their deprecated counterparts, with the following exceptions:

  • ext_input_size_version_1/ext_input_read_version_1 is inherently slower than obtaining a buffer with the entire data due to the two extra function calls and the extra copying. However, given that this only happens once per runtime call, the cost is expected to be negligible.

  • The ext_crypto_*_public_keys, ext_offchain_network_state, and ext_offchain_http_* host functions are likely slightly slower than their deprecated counterparts, but given that they are used only in offchain workers this is acceptable.

  • It is unclear how replacing ext_storage_get with ext_storage_read and ext_default_child_storage_get with ext_default_child_storage_read will impact performance.

  • It is unclear how the changes to ext_storage_next_key and ext_default_child_storage_next_key will impact performance.

Future Possibilities

After this RFC, we can remove the allocator from the source code of the host altogether in a future version, by removing support for all the deprecated host functions. This would remove the possibility to synchronize older blocks, which is probably controversial and requires some preparation that is out of scope of this RFC.

RFC-0006: Dynamic Pricing for Bulk Coretime Sales

| | |
| --- | --- |
| Start Date | July 09, 2023 |
| Description | A dynamic pricing model to adapt the regular price for bulk coretime sales |
| Authors | Tommi Enenkel (Alice und Bob) |
| License | MIT |

Summary

This RFC proposes a dynamic pricing model for the sale of Bulk Coretime on the Polkadot UC. The proposed model updates the regular price of cores for each sale period by taking into account the number of cores sold in the previous sale, as well as a limit of cores and a target number of cores sold. It ensures a minimum price and limits price growth to a maximum price increase factor, while also giving governance control over the steepness of the price change curve. It allows governance to address challenges arising from changing market conditions and should offer predictable and controlled price adjustments.

Accompanying visualizations are provided at [1].

Motivation

RFC-1 proposes periodic Bulk Coretime Sales as a mechanism to sell continuous regions of blockspace (suggested to be 4 weeks in length). A number of Blockspace Regions (compare RFC-1 & RFC-3) are provided for sale to the Broker-Chain each period and shall be sold in a way that provides value-capture for the Polkadot network. The exact pricing mechanism is out of scope for RFC-1 and shall be provided by this RFC.

A dynamic pricing model is needed. A limited number of Regions are offered for sale each period. The model needs to find the price for a period based on supply and demand of the previous period.

The model shall give Coretime consumers predictability about upcoming price developments and confidence that Polkadot governance can adapt the pricing model to changing market conditions.

Requirements

  1. The solution SHOULD provide a dynamic pricing model that increases price with growing demand and reduces price with shrinking demand.
  2. The solution SHOULD have a slow rate of change for price if the number of Regions sold is close to a given sales target and increase the rate of change as the number of sales deviates from the target.
  3. The solution SHOULD provide the possibility to always have a minimum price per Region.
  4. The solution SHOULD provide a maximum factor of price increase should the limit of Regions sold per period be reached.
  5. The solution SHOULD allow governance to control the steepness of the price function.

Stakeholders

The primary stakeholders of this RFC are:

  • Protocol researchers and developers
  • Polkadot DOT token holders
  • Polkadot parachain teams
  • Brokers involved in the trade of Bulk Coretime

Explanation

Overview

The dynamic pricing model sets the new price based on supply and demand in the previous period. The model is a function of the number of Regions sold, piecewise-defined by two power functions.

  • The left side ranges from 0 to the target. It represents situations where demand was lower than the target.
  • The right side ranges from the target to limit. It represents situations where demand was higher than the target.

The curve of the function forms a plateau around the target and then falls off to the left and rises up to the right. The shape of the plateau can be controlled via a scale factor for the left side and right side of the function respectively.

Parameters

From here on, we will also refer to Regions sold as 'cores' to stay congruent with RFC-1.

| Name | Suggested Value | Description | Constraints |
| --- | --- | --- | --- |
| BULK_LIMIT | 45 | The maximum number of cores being sold | 0 < BULK_LIMIT |
| BULK_TARGET | 30 | The target number of cores being sold | 0 < BULK_TARGET <= BULK_LIMIT |
| MIN_PRICE | 1 | The minimum price a core will always cost. | 0 < MIN_PRICE |
| MAX_PRICE_INCREASE_FACTOR | 2 | The maximum factor by which the price can change. | 1 < MAX_PRICE_INCREASE_FACTOR |
| SCALE_DOWN | 2 | The steepness of the left side of the function. | 0 < SCALE_DOWN |
| SCALE_UP | 2 | The steepness of the right side of the function. | 0 < SCALE_UP |

Function

$$
P(n) = \begin{cases}
    (P_{\text{old}} - P_{\text{min}}) \left(1 - \left(\frac{T - n}{T}\right)^d\right) + P_{\text{min}} & \text{if } n \leq T \\
    (F - 1) \cdot P_{\text{old}} \cdot \left(\frac{n - T}{L - T}\right)^u + P_{\text{old}} & \text{if } n > T
\end{cases}
$$

  • $P_{\text{old}}$ is the old_price, the price of a core in the previous period.
  • $P_{\text{min}}$ is the MIN_PRICE, the minimum price a core will always cost.
  • $F$ is the MAX_PRICE_INCREASE_FACTOR, the maximum factor by which the price can change from one period to another.
  • $d$ is the SCALE_DOWN, the steepness of the left side of the function.
  • $u$ is the SCALE_UP, the steepness of the right side of the function.
  • $T$ is the BULK_TARGET, the target number of cores being sold.
  • $L$ is the BULK_LIMIT, the maximum number of cores being sold.
  • $n$ is cores_sold, the number of cores being sold.

Left side

The left side is a power function that describes an increasing, concave-downward curve that approaches old_price. We realize this by using the form $y = a(1 - x^d)$, usually used as a downward sloping curve, but in our case flipped horizontally by letting the argument $x = \frac{T-n}{T}$ decrease with $n$, inverting the curve twice.

This approach is chosen over a decaying exponential because it lets us better control the shape of the plateau, in particular allowing us to get a straight line by setting SCALE_DOWN to $1$.

Right side

The right side is a power function of the form $y = a(x^u)$.

Pseudo-code

NEW_PRICE := IF CORES_SOLD <= BULK_TARGET THEN
    (OLD_PRICE - MIN_PRICE) * (1 - ((BULK_TARGET - CORES_SOLD)^SCALE_DOWN / BULK_TARGET^SCALE_DOWN)) + MIN_PRICE
ELSE
    ((MAX_PRICE_INCREASE_FACTOR - 1) * OLD_PRICE * ((CORES_SOLD - BULK_TARGET)^SCALE_UP / (BULK_LIMIT - BULK_TARGET)^SCALE_UP)) + OLD_PRICE
END IF
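The pseudo-code above translates directly into the following Rust sketch. The PricingParams struct and the new_price function are names invented for this example, and the use of f64 is an assumption made for readability; an on-chain implementation would more likely use fixed-point arithmetic.

```rust
/// Hypothetical container for the governance-controlled parameters.
pub struct PricingParams {
    pub bulk_target: f64,
    pub bulk_limit: f64,
    pub min_price: f64,
    pub max_price_increase_factor: f64,
    pub scale_down: f64,
    pub scale_up: f64,
}

/// Computes the price for the next period from the previous price and
/// the number of cores sold in the previous period.
pub fn new_price(p: &PricingParams, old_price: f64, cores_sold: f64) -> f64 {
    if cores_sold <= p.bulk_target {
        // Left side: rises from MIN_PRICE at 0 cores sold to OLD_PRICE at the target.
        let x = (p.bulk_target - cores_sold) / p.bulk_target;
        (old_price - p.min_price) * (1.0 - x.powf(p.scale_down)) + p.min_price
    } else {
        // Right side: rises from OLD_PRICE at the target to
        // MAX_PRICE_INCREASE_FACTOR * OLD_PRICE at the limit.
        let x = (cores_sold - p.bulk_target) / (p.bulk_limit - p.bulk_target);
        (p.max_price_increase_factor - 1.0) * old_price * x.powf(p.scale_up) + old_price
    }
}
```

The right-hand branch is only reached when cores_sold exceeds BULK_TARGET, which requires BULK_LIMIT to be strictly larger than BULK_TARGET, so the division is well defined whenever cores_sold stays within the limit.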

Properties of the Curve

Minimum Price

We introduce MIN_PRICE to control the minimum price.

The left side of the function shall be allowed to fall all the way to MIN_PRICE as the number of cores sold approaches 0. The rationale is that if there are actually 0 cores sold, the previous sale price was too high and the price needs to adapt quickly.

Price forms a plateau around the target

If the number of cores is close to BULK_TARGET, less extreme price changes might be sensible. This ensures that a drop in sold cores or an increase doesn’t lead to immediate price changes, but rather slowly adapts. Only if more extreme changes in the number of sold cores occur, does the price slope increase.

We introduce SCALE_DOWN and SCALE_UP to control the steepness of the left and the right side of the function respectively.

Max price increase factor

We introduce MAX_PRICE_INCREASE_FACTOR as the factor that controls how much the price may increase from one period to another.

Introducing this variable gives governance an additional control lever and avoids the necessity for a future runtime upgrade.

Example Configurations

Baseline

This example proposes the baseline parameters. If not mentioned otherwise, other examples use these values.

The minimum price of a core is 1 DOT, the price can double every 4 weeks. Price change around BULK_TARGET is dampened slightly.

BULK_TARGET = 30
BULK_LIMIT = 45
MIN_PRICE = 1
MAX_PRICE_INCREASE_FACTOR = 2
SCALE_DOWN = 2
SCALE_UP = 2
OLD_PRICE = 1000
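
As a worked illustration of the baseline, the function can be evaluated at three sample values of $n$ (the sample points are chosen here for illustration and are not part of the proposal):

$$
\begin{aligned}
P(15) &= (1000 - 1)\left(1 - \left(\tfrac{30 - 15}{30}\right)^2\right) + 1 = 999 \cdot 0.75 + 1 = 750.25 \\
P(30) &= (1000 - 1)\left(1 - 0^2\right) + 1 = 1000 \\
P(45) &= (2 - 1) \cdot 1000 \cdot \left(\tfrac{45 - 30}{45 - 30}\right)^2 + 1000 = 2000
\end{aligned}
$$

Selling exactly the target leaves the price unchanged, selling half the target lowers it by about a quarter, and selling out doubles it.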

More aggressive pricing

We might want to have a more aggressive price growth, allowing the price to triple every 4 weeks and have a linear increase in price on the right side.

BULK_TARGET = 30
BULK_LIMIT = 45
MIN_PRICE = 1
MAX_PRICE_INCREASE_FACTOR = 3
SCALE_DOWN = 2
SCALE_UP = 1
OLD_PRICE = 1000

Conservative pricing to ensure quick corrections in an affluent market

If governance considers the risk that a sudden surge in DOT price might price chains out from bulk coretime markets, it can ensure the model quickly reacts to a quick drop in demand, by setting 0 < SCALE_DOWN < 1 and setting the max price increase factor more conservatively.

BULK_TARGET = 30
BULK_LIMIT = 45
MIN_PRICE = 1
MAX_PRICE_INCREASE_FACTOR = 1.5
SCALE_DOWN = 0.5
SCALE_UP = 2
OLD_PRICE = 1000

Linear pricing

By setting the scaling factors to 1 and potentially adapting the max price increase, we can achieve a linear function.

BULK_TARGET = 30
BULK_LIMIT = 45
MIN_PRICE = 1
MAX_PRICE_INCREASE_FACTOR = 1.5
SCALE_DOWN = 1
SCALE_UP = 1
OLD_PRICE = 1000

Drawbacks

None at present.

Prior Art and References

This pricing model is based on the requirements from the basic linear solution proposed in RFC-1, which is a simple dynamic pricing model and only used as proof. The present model adds additional considerations to make the model more adaptable under real conditions.

Future Possibilities

This RFC, if accepted, shall be implemented in conjunction with RFC-1.

References

RFC-34: XCM Absolute Location Account Derivation

| | |
| --- | --- |
| Start Date | 05 October 2023 |
| Description | XCM Absolute Location Account Derivation |
| Authors | Gabriel Facco de Arruda |

Summary

This RFC proposes changes that enable the use of absolute locations in AccountId derivations, which allows protocols built using XCM to have static account derivations in any runtime, regardless of its position in the family hierarchy.

Motivation

These changes would allow protocol builders to leverage absolute locations to maintain the exact same derived account address across all networks in the ecosystem, thus enhancing user experience.

One such protocol, that is the original motivation for this proposal, is InvArch's Saturn Multisig, which gives users a unifying multisig and DAO experience across all XCM connected chains.

Stakeholders

  • Ecosystem developers

Explanation

This proposal aims to make it possible to derive accounts for absolute locations, enabling protocols that require the ability to maintain the same derived account in any runtime. This is done by deriving accounts from the hash of described absolute locations, which are static across different destinations.

The same location can be represented in relative form and absolute form like so:

// Relative location (from own perspective)
{
    parents: 0,
    interior: Here
}

// Relative location (from perspective of parent)
{
    parents: 0,
    interior: [Parachain(1000)]
}

// Relative location (from perspective of sibling)
{
    parents: 1,
    interior: [Parachain(1000)]
}

// Absolute location
[GlobalConsensus(Kusama), Parachain(1000)]

Using DescribeFamily, the above relative locations would be described like so:

// Relative location (from own perspective)
// Not possible.

// Relative location (from perspective of parent)
(b"ChildChain", Compact::<u32>::from(*index)).encode()

// Relative location (from perspective of sibling)
(b"SiblingChain", Compact::<u32>::from(*index)).encode()

The proposed description for absolute location would follow the same pattern, like so:

(
    b"GlobalConsensus",
    network_id,
    b"Parachain",
    Compact::<u32>::from(para_id),
    tail
).encode()
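
To make the byte layout of such a description concrete, the sketch below builds it by hand without the parity-scale-codec crate. It relies on two properties of SCALE: a tuple encodes as the concatenation of its fields, and fixed-size byte arrays such as b"GlobalConsensus" carry no length prefix. The function names are hypothetical, and the caller is assumed to pass network_id and tail already SCALE-encoded, since their exact encodings depend on the XCM version and on the tail descriptor.

```rust
/// SCALE compact encoding of a `u32`, written out by hand.
pub fn compact_u32(n: u32) -> Vec<u8> {
    if n < 1 << 6 {
        // Single-byte mode: value in the upper six bits, mode bits 0b00.
        vec![(n as u8) << 2]
    } else if n < 1 << 14 {
        // Two-byte mode: mode bits 0b01.
        (((n as u16) << 2) | 0b01).to_le_bytes().to_vec()
    } else if n < 1 << 30 {
        // Four-byte mode: mode bits 0b10.
        ((n << 2) | 0b10).to_le_bytes().to_vec()
    } else {
        // Big-integer mode: prefix 0b11 with zero extra length, then 4 bytes.
        let mut v = vec![0b11];
        v.extend_from_slice(&n.to_le_bytes());
        v
    }
}

/// Hypothetical helper producing the bytes of the proposed description.
/// `encoded_network_id` and `encoded_tail` must already be SCALE-encoded.
pub fn describe_absolute_parachain(
    encoded_network_id: &[u8],
    para_id: u32,
    encoded_tail: &[u8],
) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(b"GlobalConsensus");
    out.extend_from_slice(encoded_network_id);
    out.extend_from_slice(b"Parachain");
    out.extend_from_slice(&compact_u32(para_id));
    out.extend_from_slice(encoded_tail);
    out
}
```

Because none of the inputs depend on where the describing chain sits in the hierarchy, the resulting bytes, and therefore any hash derived from them, are the same on every chain.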

This proposal requires the modification of two XCM types defined in the xcm-builder crate: the WithComputedOrigin barrier and the DescribeFamily MultiLocation descriptor.

WithComputedOrigin

The WithComputedOrigin barrier serves as a wrapper around other barriers, consuming origin modification instructions and applying them to the message origin before passing to the inner barriers. One of the origin modifying instructions is UniversalOrigin, which serves the purpose of signaling that the origin should be a Universal Origin that represents the location as an absolute path prefixed by the GlobalConsensus junction.

In its current state, the barrier transforms locations with the UniversalOrigin instruction into relative locations, so the proposed changes aim to make it return absolute locations instead.

DescribeFamily

The DescribeFamily location descriptor is part of the HashedDescription MultiLocation hashing system and exists to describe locations in an easy format for encoding and hashing, so that an AccountId can be derived from this MultiLocation.

This implementation contains a match statement that does not match against absolute locations, so changes to it involve matching against absolute locations and providing appropriate descriptions for hashing.

Drawbacks

No drawbacks have been identified with this proposal.

Testing, Security, and Privacy

Tests can be done using simple unit tests, as this is not a change to XCM itself but rather to types defined in xcm-builder.

Security considerations should be taken with the implementation to make sure no unwanted behavior is introduced.

This proposal does not introduce any privacy considerations.

Performance, Ergonomics, and Compatibility

Performance

Depending on the final implementation, this proposal should not introduce much overhead to performance.

Ergonomics

The ergonomics of this proposal depend on the final implementation details.

Compatibility

Backwards compatibility should remain unchanged, although that depends on the final implementation.

Prior Art and References

  • DescribeFamily type: https://github.com/paritytech/polkadot-sdk/blob/master/polkadot/xcm/xcm-builder/src/location_conversion.rs#L122
  • WithComputedOrigin type: https://github.com/paritytech/polkadot-sdk/blob/master/polkadot/xcm/xcm-builder/src/barriers.rs#L153

Unresolved Questions

Implementation details and the overall code are still up for discussion.

RFC-0035: Conviction Voting Delegation Modifications

| | |
| --- | --- |
| Start Date | October 10, 2023 |
| Description | Conviction Voting Delegation Modifications |
| Authors | ChaosDAO |

Summary

This RFC proposes to make modifications to voting power delegations as part of the Conviction Voting pallet. The changes being proposed include:

  1. Allow a Delegator to vote independently of their Delegate if they so desire.
  2. Allow nested delegations – for example Charlie delegates to Bob who delegates to Alice – when Alice votes then both Bob and Charlie vote alongside Alice (in the current implementation Charlie will not vote when Alice votes).
  3. Make a change so that when a delegate votes abstain their delegated votes also vote abstain.
  4. Allow a Delegator to delegate/undelegate their votes for all tracks with a single call.

Motivation

It has become clear since the launch of OpenGov that there are a few common tropes which pop up time and time again:

  1. The frequency of referenda is often too high for network participants to have sufficient time to review, comprehend, and ultimately vote on each individual referendum. This means that these network participants end up being inactive in on-chain governance.
  2. There are active network participants who are reviewing every referendum and are providing feedback in an attempt to help make the network thrive – but often these participants do not control enough voting power to influence the network with their positive efforts.
  3. Delegating votes for all tracks currently requires long batched calls which result in high fees for the Delegator - resulting in a reluctance from many to delegate their votes.

We believe (based on feedback from token holders with a larger stake in the network) that if there were some changes made to delegation mechanics, these larger stake holders would be more likely to delegate their voting power to active network participants – thus greatly increasing the support turnout.

Stakeholders

The primary stakeholders of this RFC are:

  • The Polkadot Technical Fellowship who will have to research and implement the technical aspects of this RFC
  • DOT token holders in general

Explanation

This RFC proposes to make 4 changes to the convictionVoting pallet logic in order to improve the user experience of those delegating their voting power to another account.

  1. Allow a Delegator to vote independently of their Delegate if they so desire – this would empower network participants to more actively delegate their voting power to active voters, removing the tedious steps of having to undelegate across an entire track every time they do not agree with their delegate's voting direction for a particular referendum.

  2. Allow nested delegations – for example Charlie delegates to Bob who delegates to Alice – when Alice votes then both Bob and Charlie vote alongside Alice (in the current runtime Charlie will not vote when Alice votes) – This would allow network participants who control multiple (possibly derived) accounts to be able to delegate all of their voting power to a single account under their control, which would in turn delegate to a more active voting participant. Then if the delegator wishes to vote independently of their delegate they can control all of their voting power from a single account, which again removes the pain point of having to issue multiple undelegate extrinsics in the event that they disagree with their delegate.

  3. Have delegated votes follow their delegate's abstain votes – there are times when delegates vote abstain on a particular referendum, and adding this functionality will increase the support of that referendum. It has a secondary benefit of meaning that Validators who are delegating their voting power do not lose points in the 1KV program in the event that their delegate votes abstain (another pain point which may be preventing those network participants from delegating).

  4. Allow a Delegator to delegate/undelegate their votes for all tracks with a single call - in order to delegate votes across all tracks, a user must currently batch 15 calls, resulting in high costs for delegation. A single delegate_all/undelegate_all call would considerably reduce the complexity, and therefore the cost, of delegating for prospective Delegators.
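
One possible reading of changes 1 to 3 is sketched below: an account's own vote takes precedence, otherwise the vote is inherited by walking up the delegation chain, and an abstain vote propagates like any other. This is not the convictionVoting pallet's code; the Vote enum, the effective_vote function, and the string account names are hypothetical, and conviction and balance weighting are deliberately left out.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Vote {
    Aye,
    Nay,
    Abstain,
}

/// Resolves the vote an account ends up casting on one referendum.
/// Returns `None` if nobody along the delegation chain has voted.
pub fn effective_vote<'a>(
    account: &'a str,
    delegations: &HashMap<&'a str, &'a str>,
    direct_votes: &HashMap<&'a str, Vote>,
) -> Option<Vote> {
    let mut seen = HashSet::new();
    let mut current = account;
    loop {
        // Change 1: a vote cast by the account itself overrides its delegate.
        if let Some(vote) = direct_votes.get(current) {
            return Some(*vote);
        }
        // Guard against delegation cycles (an assumption of this sketch).
        if !seen.insert(current) {
            return None;
        }
        // Change 2: follow nested delegations one hop at a time.
        current = *delegations.get(current)?;
    }
}
```

With Charlie delegating to Bob and Bob delegating to Alice, an abstain vote by Alice resolves to abstain for all three, while a direct vote by Bob would override Alice's vote for both Bob and Charlie.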

Drawbacks

We do not foresee any drawbacks by implementing these changes. If anything we believe that this should help to increase overall voter turnout (via the means of delegation) which we see as a net positive.

Testing, Security, and Privacy

We feel that the Polkadot Technical Fellowship would be the most competent collective to identify the testing requirements for the ideas presented in this RFC.

Performance, Ergonomics, and Compatibility

Performance

This change may add extra chain storage requirements on Polkadot, especially with respect to nested delegations.

Ergonomics & Compatibility

The change to add nested delegations may affect governance interfaces such as Nova Wallet, which will have to apply changes to its indexers to support nested delegations. It may also affect the Polkadot Delegation Dashboard as well as Polkassembly & SubSquare.

We want to highlight the importance for ecosystem builders to create a mechanism for indexers and wallets to be able to understand that changes have occurred such as increasing the pallet version, etc.

Prior Art and References

N/A

Unresolved Questions

N/A

Additionally, we would like to re-open the conversation about the potential for there to be free delegations. This was discussed by Dr Gavin Wood at Sub0 2022 and we feel this would go a long way towards increasing the number of network participants that are delegating: https://youtu.be/hSoSA6laK3Q?t=526

Overall, we strongly feel that delegations are a great way to increase voter turnout, and the ideas presented in this RFC would hopefully help in that aspect.

RFC-0044: Rent based registration model

| | |
| --- | --- |
| Start Date | 6 November 2023 |
| Description | A new rent based parachain registration model |
| Authors | Sergej Sakac |

Summary

This RFC proposes a new model for a sustainable on-demand parachain registration, involving a smaller initial deposit and periodic rent payments. The new model considers that on-demand chains may be unregistered and later re-registered. The proposed solution also ensures a quick startup for on-demand chains on Polkadot in such cases.

Motivation

With the support of on-demand parachains on Polkadot, there is a need to explore a new, more cost-effective model for registering validation code. In the current model, the parachain manager is responsible for reserving a unique ParaId and covering the cost of storing the validation code of the parachain. These costs can escalate, particularly if the validation code is large. We need a better, sustainable model for registering on-demand parachains on Polkadot to help smaller teams deploy more easily.

This RFC suggests a new payment model to create a more financially viable approach to on-demand parachain registration. In this model, a lower initial deposit is required, followed by recurring payments upon parachain registration.

This new model will coexist with the existing one-time deposit payment model, offering teams seeking to deploy on-demand parachains on Polkadot a more cost-effective alternative.

Requirements

  1. The solution SHOULD NOT affect the current model for registering validation code.
  2. The solution SHOULD offer an easily configurable way for governance to adjust the initial deposit and recurring rent cost.
  3. The solution SHOULD provide an incentive to prune validation code for which rent is not paid.
  4. The solution SHOULD allow anyone to re-register validation code under the same ParaId without the need for redundant pre-checking if it was already verified before.
  5. The solution MUST be compatible with the Agile Coretime model, as described in RFC#0001.
  6. The solution MUST allow anyone to pay the rent.
  7. The solution MUST prevent the removal of validation code if it could still be required for disputes or approval checking.

Stakeholders

  • Future Polkadot on-demand Parachains

Explanation

This RFC proposes a set of changes that will enable the new rent based approach to registering and storing validation code on-chain. The new model, compared to the current one, will require periodic rent payments. The parachain won't be pruned automatically if the rent is not paid, but by permitting anyone to prune the parachain and rewarding the caller, there will be an incentive for the removal of the validation code.

On-demand parachains should still be able to utilize the current one-time payment model. However, given the size of the deposit required, it's highly likely that most on-demand parachains will opt for the new rent-based model.

Importantly, this solution doesn't require any storage migrations in the current system nor does it introduce any breaking changes. The following provides a detailed description of this solution.

Registering an on-demand parachain

In the current implementation of the registrar pallet, there are two constants that specify the necessary deposit for parachains to register and store their validation code:

trait Config {
	// -- snip --

	/// The deposit required for reserving a `ParaId`.
	#[pallet::constant]
	type ParaDeposit: Get<BalanceOf<Self>>;

	/// The deposit to be paid per byte stored on chain.
	#[pallet::constant]
	type DataDepositPerByte: Get<BalanceOf<Self>>;
}

This RFC proposes the addition of three new constants that will determine the payment amount and the frequency of the recurring rent payment:

trait Config {
	// -- snip --

	/// Defines how frequently the rent needs to be paid.
	///
	/// The duration is set in sessions instead of block numbers.
	#[pallet::constant]
	type RentDuration: Get<SessionIndex>;

	/// The initial deposit amount for registering validation code.
	///
	/// This is defined as a proportion of the deposit that would be required in the regular
	/// model.
	#[pallet::constant]
	type RentalDepositProportion: Get<Perbill>;

	/// The recurring rental cost defined as a proportion of the initial rental registration deposit.
	#[pallet::constant]
	type RentalRecurringProportion: Get<Perbill>;
}

Users will be able to reserve a ParaId and register their validation code for a proportion of the regular deposit required. However, they must also make additional rent payments at intervals of T::RentDuration.

For registering using the new rental system we will have to make modifications to the paras-registrar pallet. We should expose two new extrinsics for this:

mod pallet {
	// -- snip --

	pub fn register_rental(
		origin: OriginFor<T>,
		id: ParaId,
		genesis_head: HeadData,
		validation_code: ValidationCode,
	) -> DispatchResult { /* ... */ }

	pub fn pay_rent(origin: OriginFor<T>, id: ParaId) -> DispatchResult {
		/* ... */
	}
}

A call to register_rental will require the reservation of only a percentage of the deposit that would otherwise be required to register the validation code when using the regular model. As described in the On-demand para re-registration section below, we will also store the code hash of each parachain to enable faster re-registration after a parachain has been pruned, so the total initial deposit is increased to cover the storage of that hash.

// The logic for calculating the initial deposit for a parachain registered with the
// new rent-based model:

// Per-byte storage fee, taken from the existing registrar constant.
let per_byte_fee = T::DataDepositPerByte::get();

let validation_code_deposit = per_byte_fee.saturating_mul((validation_code.0.len() as u32).into());

let head_deposit = per_byte_fee.saturating_mul((genesis_head.0.len() as u32).into());
let hash_deposit = per_byte_fee.saturating_mul(HASH_SIZE);

// Only the validation code part is discounted; the `ParaId`, head data and hash
// deposits are charged in full.
let deposit = T::RentalDepositProportion::get().mul_ceil(validation_code_deposit)
	.saturating_add(T::ParaDeposit::get())
	.saturating_add(head_deposit)
	.saturating_add(hash_deposit);

Once the ParaId is reserved and the validation code is registered the rent must be periodically paid to ensure the on-demand parachain doesn't get removed from the state. The pay_rent extrinsic should be callable by anyone, removing the need for the parachain to depend on the parachain manager for rent payments.
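To make the deposit formula above concrete, here is a minimal, self-contained sketch using plain integers instead of the runtime's balance types. The names RentConfig, rental_deposit, recurring_rent and mul_ceil are hypothetical, the parts-per-billion fields only imitate how Perbill expresses a proportion, and HASH_SIZE = 32 is an assumption about the size of the stored code hash.

```rust
/// Denominator for parts-per-billion proportions (imitates `Perbill`).
const BILLION: u128 = 1_000_000_000;

/// Assumed size in bytes of the stored validation code hash.
pub const HASH_SIZE: u128 = 32;

/// Hypothetical stand-in for the registrar configuration constants.
pub struct RentConfig {
    pub para_deposit: u128,
    pub per_byte_fee: u128,
    /// `RentalDepositProportion`, in parts per billion.
    pub rental_deposit_parts: u32,
    /// `RentalRecurringProportion`, in parts per billion.
    pub rental_recurring_parts: u32,
}

/// Multiplies `value` by `parts / 1e9`, rounding up.
pub fn mul_ceil(parts: u32, value: u128) -> u128 {
    value
        .saturating_mul(parts as u128)
        .saturating_add(BILLION - 1)
        / BILLION
}

/// Initial deposit: discounted code deposit plus full `ParaId`, head and hash deposits.
pub fn rental_deposit(cfg: &RentConfig, code_len: usize, head_len: usize) -> u128 {
    let validation_code_deposit = cfg.per_byte_fee.saturating_mul(code_len as u128);
    let head_deposit = cfg.per_byte_fee.saturating_mul(head_len as u128);
    let hash_deposit = cfg.per_byte_fee.saturating_mul(HASH_SIZE);

    mul_ceil(cfg.rental_deposit_parts, validation_code_deposit)
        .saturating_add(cfg.para_deposit)
        .saturating_add(head_deposit)
        .saturating_add(hash_deposit)
}

/// Recurring rent: a proportion of the initial rental deposit.
pub fn recurring_rent(cfg: &RentConfig, code_len: usize, head_len: usize) -> u128 {
    mul_ceil(cfg.rental_recurring_parts, rental_deposit(cfg, code_len, head_len))
}
```

With illustrative values (a per-byte fee of 10, a ParaId deposit of 1000, a 10% deposit proportion and a 5% recurring proportion), a 1000-byte validation code with a 100-byte head yields an initial deposit of 3320 and a rent of 166 per RentDuration. These numbers are made up for the example and are not proposed configuration values.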

On-demand parachain pruning

If the rent is not paid, anyone has the option to prune the on-demand parachain and claim a portion of the initial deposit reserved for storing the validation code. This type of 'light' pruning only removes the validation code, while the head data and validation code hash are retained. The validation code hash is stored to allow anyone to register it again as well as to enable quicker re-registration by skipping the pre-checking process.

The moment the rent is no longer paid, the parachain won't be able to purchase on-demand access, meaning no new blocks are allowed. This stage is called the "hibernation" stage, during which all the parachain-related data is still stored on-chain, but new blocks are not permitted. The reason for this is to ensure that the validation code is available in case it is needed in the dispute or approval checking subsystems. Waiting for one entire session will be enough to ensure it is safe to deregister the parachain.

This means that anyone can prune the parachain only once the "hibernation" stage is over, which lasts for an entire session after the moment that the rent is not paid.
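The lifecycle described above can be sketched as a small state function. This is only an illustration under stated assumptions: paid_until (the first session no longer covered by rent), RentStatus, rent_status, can_place_order and can_prune are hypothetical names, and the hibernation length of exactly one session is taken from the text above.

```rust
type SessionIndex = u32;

/// Number of sessions a parachain hibernates before it may be pruned.
pub const HIBERNATION_SESSIONS: SessionIndex = 1;

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum RentStatus {
    /// Rent is paid: on-demand orders are allowed.
    Active,
    /// Rent has lapsed: no new blocks, but the validation code is still on-chain.
    Hibernating,
    /// Hibernation is over: anyone may prune the validation code.
    Prunable,
}

/// Derives the status from the first unpaid session and the current session.
pub fn rent_status(paid_until: SessionIndex, now: SessionIndex) -> RentStatus {
    if now < paid_until {
        RentStatus::Active
    } else if now < paid_until.saturating_add(HIBERNATION_SESSIONS) {
        RentStatus::Hibernating
    } else {
        RentStatus::Prunable
    }
}

/// On-demand orders are only accepted while the rent is paid.
pub fn can_place_order(status: RentStatus) -> bool {
    status == RentStatus::Active
}

/// Pruning is only allowed once the hibernation stage has ended.
pub fn can_prune(status: RentStatus) -> bool {
    status == RentStatus::Prunable
}
```

For a parachain whose rent covers sessions up to but excluding session 10, the sketch reports Active in session 9, Hibernating in session 10 and Prunable from session 11 onwards.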

The pruning described here is a light form of pruning, since it only removes the validation code. As with all parachains, the parachain or para manager can use the deregister extrinsic to remove all associated state.

Ensuring rent is paid

The paras pallet will be loosely coupled with the para-registrar pallet. This approach enables all the pallets tightly coupled with the paras pallet to have access to the rent status information.

Once the validation code is stored without having its rent paid the assigner_on_demand pallet will ensure that an order for that parachain cannot be placed. This is easily achievable given that the assigner_on_demand pallet is tightly coupled with the paras pallet.

On-demand para re-registration

If the rent isn't paid on time, and the parachain gets pruned, the new model should provide a quick way to re-register the same validation code under the same ParaId. This can be achieved by skipping the pre-checking process, as the validation code hash will be stored on-chain, allowing us to easily verify that the uploaded code remains unchanged.

/// Stores the validation code hash for parachains that successfully completed the
/// pre-checking process.
///
/// This is stored to enable faster on-demand para re-registration in case its pvf has been earlier
/// registered and checked.
///
/// NOTE: During a runtime upgrade where the pre-checking rules change this storage map should be
/// cleared appropriately.
#[pallet::storage]
pub(super) type CheckedCodeHash<T: Config> =
	StorageMap<_, Twox64Concat, ParaId, ValidationCodeHash>;

To enable parachain re-registration, we should introduce a new extrinsic in the paras-registrar pallet. The logic of this extrinsic will be the same as regular registration, with the distinction that it can be called by anyone, and the required deposit will be smaller since it only has to cover the storage of the validation code.
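The hash comparison that lets re-registration skip pre-checking might look roughly like the sketch below. It is not the pallet's implementation: Registrar, record_checked and re_register are hypothetical names, a HashMap stands in for the CheckedCodeHash storage map, and the standard library's DefaultHasher stands in for the real validation code hash. Falling back to pre-checking when the hash differs is also an assumption, since the RFC only describes the matching case.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

type ParaId = u32;
/// Stand-in for `ValidationCodeHash` (the real hash is not a 64-bit value).
type CodeHash = u64;

/// Hashes the validation code with a stand-in hasher.
fn code_hash(code: &[u8]) -> CodeHash {
    let mut hasher = DefaultHasher::new();
    code.hash(&mut hasher);
    hasher.finish()
}

#[derive(Debug, PartialEq, Eq)]
pub enum Registration {
    /// The uploaded code matches the stored hash: pre-checking can be skipped.
    SkipPreChecking,
    /// Unknown para or different code: assumed to go through pre-checking again.
    NeedsPreChecking,
}

/// Hypothetical in-memory stand-in for the `CheckedCodeHash` storage map.
#[derive(Default)]
pub struct Registrar {
    checked_code_hash: HashMap<ParaId, CodeHash>,
}

impl Registrar {
    /// Records the hash once the code has passed pre-checking.
    pub fn record_checked(&mut self, id: ParaId, code: &[u8]) {
        self.checked_code_hash.insert(id, code_hash(code));
    }

    /// Decides whether re-registration of `code` under `id` may skip pre-checking.
    pub fn re_register(&self, id: ParaId, code: &[u8]) -> Registration {
        match self.checked_code_hash.get(&id) {
            Some(stored) if *stored == code_hash(code) => Registration::SkipPreChecking,
            _ => Registration::NeedsPreChecking,
        }
    }
}
```

Because only the hash is retained after pruning, the check costs one hash computation over the uploaded code and one storage read.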

Drawbacks

This RFC does not alter the process of reserving a ParaId, and therefore, it does not propose reducing it, even though such a reduction could be beneficial.

Even though this RFC doesn't delve into the specifics of the configuration values for parachain registration but rather focuses on the mechanism, configuring it carelessly could lead to potential problems.

Since the validation code hash and head data are not removed when the parachain is pruned but only when the deregister extrinsic is called, the T::DataDepositPerByte must be set to a higher value to create a strong enough incentive for removing it from the state.

Testing, Security, and Privacy

The implementation of this RFC will be tested on Rococo first.

Proper research should be conducted on setting the configuration values of the new system since these values can have great impact on the network.

An audit is required to ensure the implementation's correctness.

The proposal introduces no new privacy concerns.

Performance, Ergonomics, and Compatibility

Performance

This RFC should not introduce any performance impact.

Ergonomics

This RFC does not affect the current parachains, nor the parachains that intend to use the one-time payment model for parachain registration.

Compatibility

This RFC does not break compatibility.

Prior Art and References

Prior discussion on this topic: https://github.com/paritytech/polkadot-sdk/issues/1796

Unresolved Questions

None at this time.

As noted in this GitHub issue, we want to raise the per-byte cost of on-chain data storage. However, a substantial increase in this cost would make it highly impractical for on-demand parachains to register on Polkadot. This RFC offers an alternative solution for on-demand parachains, ensuring that the per-byte cost increase doesn't overly burden the registration process.


RFC-0054: Remove the concept of "heap pages" from the client

Start Date: 2023-11-24
Description: Remove the concept of heap pages from the client and move it to the runtime.
Authors: Pierre Krieger

Summary

Rather than enforce a limit to the total memory consumption on the client side by loading the value at :heappages, enforce that limit on the runtime side.

Motivation

From the early days of Substrate up until recently, the runtime was present in two forms: the wasm runtime (wasm bytecode passed through an interpreter) and the native runtime (native code directly run by the client).

Since the wasm runtime has a lower amount of available memory (4 GiB maximum) compared to the native runtime, and in order to ensure that the wasm and native runtimes always produce the same outcome, it was necessary to clamp the amount of memory available to both runtimes to the same value.

In order to achieve this, a special storage key (a "well-known" key) :heappages was introduced and represents the number of "wasm pages" (one page equals 64 KiB) of memory that are available to the memory allocator of the runtimes. If this storage key is absent, it defaults to 2048, which is 128 MiB.

The native runtime has since disappeared, but the concept of "heap pages" still exists. This RFC proposes a simplification to the design of Polkadot by removing the concept of "heap pages" as is currently known, and proposes alternative ways to achieve the goal of limiting the amount of memory available.

Stakeholders

Client implementers and low-level runtime developers.

Explanation

This RFC proposes the following changes to the client:

  • The client no longer considers :heappages as special.
  • The memory allocator of the runtime is no longer bounded by the value of :heappages.

With these changes, the memory available to the runtime is now only bounded by the available memory space (4 GiB), and optionally by the maximum amount of memory specified in the Wasm binary (see https://webassembly.github.io/spec/core/bikeshed/#memories%E2%91%A0). In Rust, the latter can be controlled during compilation with the flag -Clink-arg=--max-memory=....

Since the client-side change is strictly more tolerant than before, we can perform the change immediately after the runtime has been updated, and without having to worry about backwards compatibility.

This RFC proposes three alternative paths (different chains might choose to follow different paths):

  • Path A: add back the same memory limit to the runtime, like so:

    • At initialization, the runtime loads the value of :heappages from the storage (using ext_storage_get or similar), and sets a global variable to the decoded value.
    • The runtime tracks the total amount of memory that it has allocated using its instance of #[global_allocator] (https://github.com/paritytech/polkadot-sdk/blob/e3242d2c1e2018395c218357046cc88caaed78f3/substrate/primitives/io/src/lib.rs#L1748-L1762). This tracking should also be added around the host functions that perform allocations.
    • If an allocation is attempted that would go over the value in the global variable, the memory allocation fails.
  • Path B: define the memory limit using the -Clink-arg=--max-memory=... flag.

  • Path C: don't add anything to the runtime. This is effectively the same as setting the memory limit to ~4 GiB (compared to the current default limit of 128 MiB). This solution is viable only because we're compiling for 32bits wasm rather than for example 64bits wasm. If we ever compile for 64bits wasm, this would need to be revisited.

Each parachain can choose the option that they prefer, but the author of this RFC strongly suggests either option C or B.
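For chains that nevertheless pick path A, the tracking allocator could be sketched as follows. This is a minimal illustration rather than the polkadot-sdk allocator: LimitedAlloc and set_heap_pages are hypothetical names, the sketch wraps the standard library's System allocator (a real runtime is no_std and would wrap its own allocator), and reading :heappages from storage is left out.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Size of one wasm page in bytes (64 KiB).
pub const WASM_PAGE_SIZE: usize = 64 * 1024;

/// Hypothetical allocator wrapper that counts live bytes and enforces a limit.
pub struct LimitedAlloc {
    allocated: AtomicUsize,
    limit: AtomicUsize,
}

impl LimitedAlloc {
    pub const fn new(heap_pages: usize) -> Self {
        Self {
            allocated: AtomicUsize::new(0),
            limit: AtomicUsize::new(heap_pages.saturating_mul(WASM_PAGE_SIZE)),
        }
    }

    /// To be called at initialization with the decoded value of `:heappages`.
    pub fn set_heap_pages(&self, heap_pages: usize) {
        self.limit
            .store(heap_pages.saturating_mul(WASM_PAGE_SIZE), Ordering::Relaxed);
    }

    /// Number of bytes currently allocated through this wrapper.
    pub fn allocated(&self) -> usize {
        self.allocated.load(Ordering::Relaxed)
    }
}

unsafe impl GlobalAlloc for LimitedAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let size = layout.size();
        // Reserve the bytes first; roll back if the limit would be exceeded.
        let previous = self.allocated.fetch_add(size, Ordering::Relaxed);
        if previous.saturating_add(size) > self.limit.load(Ordering::Relaxed) {
            self.allocated.fetch_sub(size, Ordering::Relaxed);
            return std::ptr::null_mut();
        }
        let ptr = unsafe { System.alloc(layout) };
        if ptr.is_null() {
            self.allocated.fetch_sub(size, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.allocated.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}
```

A runtime would install such a wrapper with #[global_allocator]. The per-allocation cost is one atomic addition and one comparison, and one subtraction per deallocation.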

Drawbacks

In case of path A, there is one situation where the behaviour pre-RFC is not equivalent to the one post-RFC: when a host function that performs an allocation (for example ext_storage_get) is called, without this RFC this allocation might fail due to reaching the maximum heap pages, while after this RFC this will always succeed. This is most likely not a problem, as storage values aren't supposed to be larger than a few megabytes at the very maximum.

In the unfortunate event where the runtime runs out of memory, path B would make it more difficult to relax the memory limit, as we would need to re-upload the entire Wasm, compared to updating only :heappages in path A or before this RFC. In the case where the runtime runs out of memory only in the specific event where the Wasm runtime is modified, this could brick the chain. However, this situation is no different than the thousands of other ways that a bug in the runtime can brick a chain, and there's no reason to be particularly worried about this one.

Testing, Security, and Privacy

This RFC would reduce the chance of a consensus issue between clients. The :heappages are a rather obscure feature, and it is not clear what happens in some corner cases such as the value being too large (error? clamp?) or malformed. This RFC would completely erase these questions.

Performance, Ergonomics, and Compatibility

Performance

In case of path A, it is unclear how performance would be affected. Path A consists of moving client-side operations to the runtime without changing these operations, and as such performance differences are expected to be minimal. Overall, we're talking about one addition/subtraction per malloc and per free, so this is more than likely completely negligible.

In case of paths B and C, the performance impact would be a net positive, as this RFC strictly removes things.

Ergonomics

This RFC would isolate the client and runtime more from each other, making it a bit easier to reason about the client or the runtime in isolation.

Compatibility

Not a breaking change. The runtime-side changes can be applied immediately (without even having to wait for changes in the client), then as soon as the runtime is updated, the client can be updated without any transition period. One can even consider updating the client before the runtime, as it corresponds to path C.

Prior Art and References

None.

Unresolved Questions

None.

This RFC follows the same path as https://github.com/polkadot-fellows/RFCs/pull/4 by scoping everything related to memory allocations to the runtime.


RFC-0070: X Track for @kusamanetwork

Start Date: January 29, 2024
Description: Add a governance track to facilitate posts on the @kusamanetwork's X account
Author: Adam Clay Steeber

Summary

This RFC proposes adding a trivial governance track on Kusama to facilitate X (formerly known as Twitter) posts on the @kusamanetwork account. The technical aspect of implementing this in the runtime is very inconsequential and straight-forward, though it might get more technical if the Fellowship wants to regulate this track with a non-existent permission set. If this is implemented it would need to be followed up with:

  1. the establishment of specifications for proposing X posts via this track, and
  2. the development of tools/processes to ensure that the content contained in referenda enacted in this track would be automatically posted on X.

Motivation

The overall motivation for this RFC is to decentralize the management of the Kusama brand/communication channel to KSM holders. This is necessary in my opinion primarily because of the inactivity of the account in recent history, with posts spanning weeks or months apart. I am currently unaware of who/what entity manages the Kusama X account, but if they are affiliated with Parity or W3F this proposed solution could also offload some of the legal ramifications of making (or not making) announcements to the public regarding Kusama. While centralized control of the X account would still be present, it could become totally moot if this RFC is implemented and the community becomes totally autonomous in the management of Kusama's X posts.

This solution does not cover every single communication front for Kusama, but it does cover one of the largest. It also establishes a precedent for other communication channels that could be offloaded to openGov, provided this proof-of-concept is successful.

Finally, this RFC is the epitome of experimentation that Kusama is ideal for. This proposal may spark newfound excitement for Kusama and help us realize Kusama's potential for pushing boundaries and trying new unconventional ideas.

Stakeholders

This idea has not been formalized by any individual (or group of) KSM holder(s). To my knowledge the socialization of this idea is contained entirely in my recent X post here, but it is possible that an idea like this one has been discussed in other places. It appears to me that the ecosystem would welcome a change like this which is why I am taking action to formalize the discussion.

Explanation

The implementation of this idea can be broken down into 3 primary phases:

Phase 1 - Track configurations

First, we begin with this RFC to ensure all feedback can be discussed and implemented in the proposal. After the Fellowship and the community come to a reasonable agreement on the changes necessary to make this happen, the Fellowship can merge changes into Kusama's runtime to include this new track with appropriate track configurations. As a starting point, I recommend the following track configurations:

const APP_X_POST: Curve = Curve::make_linear(7, 28, percent(50), percent(100));
const SUP_X_POST: Curve = Curve::make_reciprocal(?, ?, percent(?), percent(?), percent(?));

// I don't know how to configure the make_reciprocal variables to get what I imagine for support,
// but I recommend starting at 50% support and sharply decreasing such that 1% is sufficient quarterway
// through the decision period and hitting 0% at the end of the decision period, or something like that.

	(
		69,
		pallet_referenda::TrackInfo {
			name: "x_post",
			max_deciding: 50,
			decision_deposit: 1 * UNIT,
			prepare_period: 10 * MINUTES,
			decision_period: 4 * DAYS,
			confirm_period: 10 * MINUTES,
			min_enactment_period: 1 * MINUTES,
			min_approval: APP_X_POST,
			min_support: SUP_X_POST,
		},
	),

I also recommend restricting permissions of this track to only submitting remarks or batches of remarks - that's all we'll need for its purpose. I'm not sure how easy that is to configure, but it is important since we don't want such an agile track to be able to make highly consequential calls.

Phase 2 - Establish Specs for X Post Track Referenda

It is important that we establish the specifications of referenda that will be submitted in this track to ensure that whatever automation tool is built can easily make posts once a referendum is enacted. As stated above, we really only need a system.remark (or batch of remarks) to indicate the contents of a proposed X post. The most straight-forward way to do this is to require remarks to adhere to X's requirements for making posts via their API.

For example, if I wanted to propose a post that contained the text "Hello World!" I would propose a referendum in the X post track that contains the following call data: 0x0000607b2274657874223a202248656c6c6f20576f726c6421227d (i.e. system.remark('{"text": "Hello World!"}')).
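The call data in that example can be reproduced with a short sketch. The pallet index 0 and call index 0 for system.remark are inferred from the leading 0x0000 of the hex string above, and compact_len and remark_call_data are hypothetical helper names. The length prefix follows the SCALE compact encoding, of which only the one-, two- and four-byte modes are handled here.

```rust
/// Assumed index of the System pallet (from the `0x00` prefix in the example).
const SYSTEM_PALLET_INDEX: u8 = 0;
/// Assumed index of the `remark` call within the System pallet.
const REMARK_CALL_INDEX: u8 = 0;

/// SCALE compact encoding of a length (one-, two- and four-byte modes only).
pub fn compact_len(len: usize) -> Vec<u8> {
    match len {
        0..=0x3f => vec![(len as u8) << 2],
        0x40..=0x3fff => (((len as u16) << 2) | 0b01).to_le_bytes().to_vec(),
        0x4000..=0x3fff_ffff => (((len as u32) << 2) | 0b10).to_le_bytes().to_vec(),
        _ => panic!("remark too large for this sketch"),
    }
}

/// Builds the hex call data for `system.remark(remark)`.
pub fn remark_call_data(remark: &[u8]) -> String {
    let mut call = vec![SYSTEM_PALLET_INDEX, REMARK_CALL_INDEX];
    call.extend(compact_len(remark.len()));
    call.extend_from_slice(remark);
    let hex: String = call.iter().map(|byte| format!("{:02x}", byte)).collect();
    format!("0x{}", hex)
}
```

Calling remark_call_data with the 24 bytes of {"text": "Hello World!"} produces the call data shown above; the 0x60 after the two index bytes is the compact-encoded length 24.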

At first, we could support text posts only to prove the concept. Later on we could expand this spec to add support for media, likes, retweets, replies, polls, and whatever other X features we want.

Phase 3 - Release, Tooling, & Documentation

Once we agree on track configurations and specs for referenda in this track, the Fellowship can move forward with merging these changes into Kusama's runtime and include them in its next release. We could also move forward with developing the necessary tools that would listen for enacted referenda to post automatically on X. This would require coordination with whoever controls the X account; they would either need to run the tools themselves or add a third party as an authorized user to run the tools to make posts on the account's behalf. This is a bottleneck for decentralization, but as long as the tools are run by the X account manager or by a trusted third party it should be fine. I'm open to more decentralized solutions, but those always come at a cost of complexity.

For the tools themselves, we could open a bounty on Kusama for developers/teams to bid on. We could also just ask the community to step up with a Treasury proposal to have anyone fund the build. Or, the Fellowship could make the release of these changes contingent on their endorsement of developers/teams to build these tools. Lots of options! For the record, my team and I could develop all the necessary tools, but just because I'm proposing these changes doesn't entitle me to funds to build the tools needed to implement them. Here's what would be needed:

  • a listener tool that would listen for enacted referenda in this track, verify the format of the remark(s), and submit to X's API with authenticating credentials
  • a UI to allow layman users to propose referenda on this track

After everything is complete, we can update the Kusama wiki to include documentation on the X post specifications and include links to the tools/UI.

Drawbacks

The main drawback to this change is that it requires a lot of off-chain coordination. It's easy enough to include the track on Kusama but it's a totally different challenge to make it function as intended. The tools need to be built and the auth tokens need to be managed. It would certainly add an administrative burden to whoever manages the X account since they would either need to run the tools themselves or manage auth tokens.

This change also introduces on-going costs to the Treasury since it would need to compensate people to support the tools necessary to facilitate this idea. The ultimate question is whether these on-going costs would be worth the ability for KSM holders to make posts on Kusama's X account.

There's also the risk of misconfiguring the track to make referenda too easy to pass, potentially allowing a malicious actor to get content posted on X that violates X's ToS. If that happens, we risk getting Kusama banned on X!

This change might also be outside the scope of the Fellowship/openGov. Perhaps the best solution for the X account is to have the Treasury pay for a professional agency to manage posts. It wouldn't be decentralized but it would probably be more effective in terms of creating good content.

Finally, this solution is merely pseudo-decentralization since the X account manager would still have ultimate control of the account. It's decentralized insofar as the auth tokens are given to people actually running the tools; a house of cards is required to facilitate X posts via this track. Not ideal.

Testing, Security, and Privacy

There's major precedent for configuring tracks on openGov given the amount of power tracks have, so it shouldn't be hard to come up with a sound configuration. That's why I recommend restricting permissions of this track to remarks and batches of remarks, or something equally inconsequential.

Building the tools for this implementation is really straightforward and could be audited by Fellowship members, and the community at large, on GitHub.

The largest security concern would be the management of Kusama's X account's auth tokens. We would need to ensure that they aren't compromised.

Performance, Ergonomics, and Compatibility

Performance

If a track on Kusama promises users that compliant referenda enacted therein would be posted on Kusama's X account, users would expect that track to perform as promised. If the house of cards tumbles down and a compliant referendum doesn't actually get anything posted, users might think that Kusama is broken or unreliable. This could be damaging to Kusama's image and cause people to question the soundness of other features on Kusama.

As mentioned in the drawbacks, the performance of this feature would depend on off-chain coordination. We can reduce the administrative burden of this coordination by funding third parties with the Treasury to deal with it, but then we're relying on trusting these parties.

Ergonomics

By adding a new track to Kusama, governance platforms like Polkassembly or Nova Wallet would need to include it on their applications. This shouldn't be too much of a burden or overhead since they've already built the infrastructure for other openGov tracks.

Compatibility

This change wouldn't break any compatibility as far as I know.

References

One reference to a similar feature requiring on-chain/off-chain coordination would be the Kappa-Sigma-Mu Society. Nothing on-chain necessarily enforces the rules or facilitates bids, challenges, defenses, etc. However, the Society has managed to maintain itself with integrity to its rules. So I don't think this is totally out of Kusama's scope. But it will require some off-chain effort to maintain.

Unresolved Questions

  • Who will develop the tools necessary to implement this feature? How do we select them?
  • How can this idea be better implemented with on-chain/substrate features?


RFC-0073: Decision Deposit Referendum Track

Start Date: 12 February 2024
Description: Add a referendum track which can place the decision deposit on any other track
Authors: JelliedOwl

Summary

The current size of the decision deposit on some tracks is too high for many proposers. As a result, those needing to use it have to find someone else willing to put up the deposit for them - and a number of legitimate attempts to use the root track have timed out. This track would provide a more affordable (though slower) route for these holders to use the root track.

Motivation

There have been recent attempts to use the Kusama root track which have timed out with no decision deposit placed. Usually, these referenda have been related to parachain registration related issues.

Explanation

We propose to address this by adding a new referendum track [22] Referendum Deposit which can place the decision deposit on another referendum. This would require the following changes:

  • [Referenda Pallet] Modify the placeDecisionDeposit function to additionally allow it to be called by root, with a root call bypassing the requirement for a deposit payment.
  • [Runtime] Add a new referendum track which can only call referenda->placeDecisionDeposit and the utility functions.
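The pallet change in the first bullet can be illustrated with a simplified model. This is a sketch under assumptions, not the referenda pallet's code: Origin, DecisionDeposit, Referendum and the balance map are hypothetical stand-ins, and recording a root call as WaivedByRoot is just one possible way to represent a bypassed deposit.

```rust
use std::collections::HashMap;

type AccountId = u64;
type Balance = u128;

pub enum Origin {
    Root,
    Signed(AccountId),
}

#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    /// The referendum already has a decision deposit.
    HasDeposit,
    /// The signed caller cannot afford the deposit.
    InsufficientFunds,
}

#[derive(Debug, PartialEq, Eq)]
pub enum DecisionDeposit {
    Paid { who: AccountId, amount: Balance },
    /// Assumed representation of a deposit placed by root without payment.
    WaivedByRoot,
}

#[derive(Default)]
pub struct Referendum {
    pub decision_deposit: Option<DecisionDeposit>,
}

/// Places the decision deposit; a root origin bypasses the payment.
pub fn place_decision_deposit(
    origin: Origin,
    referendum: &mut Referendum,
    free_balances: &mut HashMap<AccountId, Balance>,
    required: Balance,
) -> Result<(), Error> {
    if referendum.decision_deposit.is_some() {
        return Err(Error::HasDeposit);
    }
    let deposit = match origin {
        Origin::Root => DecisionDeposit::WaivedByRoot,
        Origin::Signed(who) => {
            let balance = free_balances.entry(who).or_insert(0);
            if *balance < required {
                return Err(Error::InsufficientFunds);
            }
            *balance -= required;
            DecisionDeposit::Paid { who, amount: required }
        }
    };
    referendum.decision_deposit = Some(deposit);
    Ok(())
}
```

In this model, a referendum enacted on the new track would dispatch the call with a root origin, so the target referendum can start deciding without any account paying the full deposit.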

Referendum track parameters - Polkadot

  • Decision deposit: 1000 DOT
  • Decision period: 14 days
  • Confirmation period: 12 hours
  • Enactment period: 2 hours
  • Approval & Support curves: As per the root track, timed to match the decision period
  • Maximum deciding: 10

Referendum track parameters - Kusama

  • Decision deposit: 33.333333 KSM
  • Decision period: 7 days
  • Confirmation period: 6 hours
  • Enactment period: 1 hour
  • Approval & Support curves: As per the root track, timed to match the decision period
  • Maximum deciding: 10

Drawbacks

This track would provide a route to starting a root referendum with a much-reduced slashable deposit. This might be undesirable but, assuming the decision deposit cost for this track is still high enough, slashing would still act as a disincentive.

An alternative to this might be to reduce the decision deposit size of some of the more expensive tracks. However, part of the purpose of the high deposit - at least on the root track - is to prevent spamming the limited queue with junk referenda.

Testing, Security, and Privacy

Additional test cases will be needed for the modified pallet and runtime. No security or privacy issues.

Performance, Ergonomics, and Compatibility

Performance

No significant performance impact.

Ergonomics

Only changes related to adding the track. Existing functionality is unchanged.

Compatibility

No compatibility issues.

Prior Art and References

Unresolved Questions

Feedback is welcome on whether my proposed implementation is the best way to address the issue, including which calls the track should be allowed to make. Are the track parameters correct, or should we use something different? Alternatives would be welcome.


RFC-0074: Stateful Multisig Pallet

Start Date: 15 February 2024
Description: Add Enhanced Multisig Pallet to System chains
Authors: Abdelrahman Soliman (Boda)

Summary

A pallet to facilitate enhanced multisig accounts. The main enhancement is that we store a multisig account in the state with related info (signers, threshold, etc.). The module affords enhanced control over administrative operations such as adding/removing signers, changing the threshold, deleting the account, and canceling an existing proposal. Each signer can approve/reject a proposal while it still exists. The proposal is not intended to migrate or get rid of the existing multisig; it allows both options to coexist.

For the rest of the RFC we use the following terms:

  • proposal to refer to an extrinsic that is to be dispatched from a multisig account after getting enough approvals.
  • Stateful Multisig to refer to the proposed pallet.
  • Stateless Multisig to refer to the current multisig pallet in polkadot-sdk.

Motivation

Problem

Entities in the Polkadot ecosystem need to have a way to manage their funds and other operations in a secure and efficient way. Multisig accounts are a common way to achieve this. Entities by definition change over time, members of the entity may change, threshold requirements may change, and the multisig account may need to be deleted. For even more enhanced hierarchical control, the multisig account may need to be controlled by other multisig accounts.

Current native solutions for multisig operations are less optimal, performance-wise (as we'll explain later in the RFC), and lack fine-grained control over the multisig account.

Stateless Multisig

We refer to the current multisig pallet in polkadot-sdk as stateless because the multisig account is only derived and not stored in the state. Deriving the account is deterministic, as it relies on the exact users (sorted) and threshold, but this does not allow for control over the multisig account. It's also tightly coupled to the exact users and threshold. This makes it hard for an organization to manage existing accounts and to change the threshold or add/remove signers.

We believe as well that the stateless multisig is not efficient in terms of block footprint as we'll show in the performance section.

Pure Proxy

Pure proxy can achieve a stored and deterministic multisig account from different users, but it's unneeded complexity as a way around the limitations of the current multisig pallet. It also doesn't have the same fine-grained control over the multisig account.

Other points mentioned by @tbaut

  • pure proxies aren't (yet) a thing cross chain
  • the end user complexity is much much higher with pure proxies, also for new users smart contract multisig are widely known while pure proxies are obscure.
  • you can shoot yourself in the foot by deleting the proxy, effectively losing access to funds with pure proxies.

Requirements

Basic requirements for the Stateful Multisig are:

  • The ability to have concrete and permanent (unless deleted) multisig accounts in the state.
  • The ability to add/remove signers from an existing multisig account by the multisig itself.
  • The ability to change the threshold of an existing multisig account by the multisig itself.
  • The ability to delete an existing multisig account by the multisig itself.
  • The ability to cancel an existing proposal by the multisig itself.
  • Signers of multisig account can start a proposal on behalf of the multisig account which will be dispatched after getting enough approvals.
  • Signers of multisig account can approve/reject a proposal while it still exists.

Use Cases

  • Corporate Governance: In a corporate setting, multisig accounts can be employed for decision-making processes. For example, a company may require the approval of multiple executives to initiate significant financial transactions.

  • Joint Accounts: Multisig accounts can be used for joint accounts where multiple individuals need to authorize transactions. This is particularly useful in family finances or shared business accounts.

  • Decentralized Autonomous Organizations (DAOs): DAOs can utilize multisig accounts to ensure that decisions are made collectively. Multiple key holders can be required to approve changes to the organization's rules or the allocation of funds.

and much more...

Stakeholders

  • Polkadot holders
  • Polkadot developers

Explanation

I've created the stateful multisig pallet during my studies in Polkadot Blockchain Academy under supervision from @shawntabrizi and @ank4n. After that, I've enhanced it to be fully functional and this is a draft PR#3300 in polkadot-sdk. I'll list all the details and design decisions in the following sections. Note that the PR is not 1-1 exactly to the current RFC as the RFC is a more polished version of the PR after updating based on the feedback and discussions.

Let's start with a sequence diagram to illustrate the main operations of the Stateful Multisig.

multisig operations

Notes on above diagram:

  • It's a 3 step process to execute a proposal. (Start Proposal --> Approvals --> Execute Proposal)
  • Execute is an explicit extrinsic for a simpler API. It can be optimized to be executed automatically after getting enough approvals.
  • Any user can create a multisig account and they don't need to be part of it. (Alice in the diagram)
  • A proposal is any extrinsic including control extrinsics (e.g. add/remove signer, change threshold,..etc).
  • Any multisig account signer can start a proposal on behalf of the multisig account. (Bob in the diagram)
  • Any multisig account owner can execute a proposal if it's approved by enough signers. (Dave in the diagram)
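
The three-step flow in the notes above can be sketched in plain Rust. This is only an in-memory illustration, not the pallet code: the `Multisig`, `Proposal` and `FlowError` types are hypothetical stand-ins, accounts are string names, and deposits and storage are left out.

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory stand-in for a stored multisig account.
struct Multisig {
    signers: BTreeSet<&'static str>,
    threshold: usize,
}

/// Hypothetical in-memory stand-in for a pending proposal.
struct Proposal {
    approvers: BTreeSet<&'static str>,
    executed: bool,
}

#[derive(Debug, PartialEq)]
enum FlowError {
    UnAuthorizedSigner,
    NotEnoughApprovers,
}

impl Multisig {
    /// Step 1: a signer starts a proposal; the proposer counts as the first approver.
    fn start_proposal(&self, who: &'static str) -> Result<Proposal, FlowError> {
        if !self.signers.contains(who) {
            return Err(FlowError::UnAuthorizedSigner);
        }
        let mut approvers = BTreeSet::new();
        approvers.insert(who);
        Ok(Proposal { approvers, executed: false })
    }

    /// Step 2: other signers add their approvals.
    fn approve(&self, proposal: &mut Proposal, who: &'static str) -> Result<(), FlowError> {
        if !self.signers.contains(who) {
            return Err(FlowError::UnAuthorizedSigner);
        }
        proposal.approvers.insert(who);
        Ok(())
    }

    /// Step 3: a signer executes explicitly once the threshold is met.
    fn execute_proposal(&self, proposal: &mut Proposal, who: &'static str) -> Result<(), FlowError> {
        if !self.signers.contains(who) {
            return Err(FlowError::UnAuthorizedSigner);
        }
        if proposal.approvers.len() < self.threshold {
            return Err(FlowError::NotEnoughApprovers);
        }
        proposal.executed = true;
        Ok(())
    }
}
```

With signers Bob, Charlie and Dave and a threshold of 2, Bob starting the proposal and Charlie approving it is enough for Dave to execute it; an account outside the signer set (such as the creator, Alice) cannot start a proposal.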

State Transition Functions

We use the following enum to store either the call or its hash:

#![allow(unused)]
fn main() {
enum CallOrHash<T: Config> {
	Call(<T as Config>::RuntimeCall),
	Hash(T::Hash),
}
}
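
As a rough illustration of why either variant can identify the same proposal, the sketch below maps both to one key. It is a simplified, non-generic stand-in: calls are modelled as raw bytes, the hash is a `u64` from the standard library's `DefaultHasher`, and `hash_call` and `proposal_key` are hypothetical helpers (the pallet would presumably rely on the runtime's own hashing).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Simplified, non-generic stand-in for `CallOrHash<T>`:
/// a call is modelled as raw bytes and a hash as a `u64`.
enum CallOrHash {
    Call(Vec<u8>),
    Hash(u64),
}

/// Stand-in for the runtime hasher (assumption, not the pallet's hashing).
fn hash_call(call: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    call.hash(&mut hasher);
    hasher.finish()
}

impl CallOrHash {
    /// Both variants resolve to the same key, so a proposal started with a
    /// hash can later be referenced by supplying the full call.
    fn proposal_key(&self) -> u64 {
        match self {
            CallOrHash::Call(call) => hash_call(call),
            CallOrHash::Hash(hash) => *hash,
        }
    }
}
```
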
  • create_multisig - Create a multisig account with a given threshold and initial signers. (Needs Deposit)
#![allow(unused)]
fn main() {
		/// Creates a new multisig account and attach signers with a threshold to it.
		///
		/// The dispatch origin for this call must be _Signed_. It is expected to be a normal AccountId and not a
		/// Multisig AccountId.
		///
		/// T::BaseCreationDeposit + T::PerSignerDeposit * signers.len() will be held from the caller's account.
		///
		/// # Arguments
		///
		/// - `signers`: Initial set of accounts to add to the multisig. These may be updated later via `add_signer`
		/// and `remove_signer`.
		/// - `threshold`: The threshold number of accounts required to approve an action. Must be greater than 0 and
		/// less than or equal to the total number of signers.
		///
		/// # Errors
		///
		/// * `TooManySignatories` - The number of signatories exceeds the maximum allowed.
		/// * `InvalidThreshold` - The threshold is greater than the total number of signers.
		pub fn create_multisig(
			origin: OriginFor<T>,
			signers: BoundedBTreeSet<T::AccountId, T::MaxSignatories>,
			threshold: u32,
		) -> DispatchResult 
}
  • start_proposal - Start a multisig proposal. (Needs Deposit)
#![allow(unused)]
fn main() {
		/// Starts a new proposal for a dispatchable call for a multisig account.
		/// The caller must be one of the signers of the multisig account.
		/// T::ProposalDeposit will be held from the caller's account.
		///
		/// # Arguments
		///
		/// * `multisig_account` - The multisig account ID.
		/// * `call_or_hash` - The enum having the call or the hash of the call to be approved and executed later.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `UnAuthorizedSigner` - The caller is not a signer of the multisig account.
		/// * `TooManySignatories` - The number of signatories exceeds the maximum allowed. (shouldn't really happen as it's the first approval)
		pub fn start_proposal(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		) -> DispatchResult
}
  • approve - Approve a multisig proposal.
#![allow(unused)]
fn main() {
		/// Approves a proposal for a dispatchable call for a multisig account.
		/// The caller must be one of the signers of the multisig account.
		///
		/// If a signer did approve -> reject -> approve, the proposal will be approved.
		/// If a signer did approve -> reject, the proposal will be rejected.
		///
		/// # Arguments
		///
		/// * `multisig_account` - The multisig account ID.
		/// * `call_or_hash` - The enum having the call or the hash of the call to be approved.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `UnAuthorizedSigner` - The caller is not a signer of the multisig account.
		/// * `TooManySignatories` - The number of signatories exceeds the maximum allowed.
		/// This shouldn't really happen as it's an approval, not an addition of a new signer.
		pub fn approve(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		) -> DispatchResult
}
  • reject - Reject a multisig proposal.
#![allow(unused)]
fn main() {
		/// Rejects a proposal for a multisig account.
		/// The caller must be one of the signers of the multisig account.
		///
		/// Between approving and rejecting, last call wins.
		/// If a signer did approve -> reject -> approve, the proposal will be approved.
		/// If a signer did approve -> reject, the proposal will be rejected.
		///
		/// # Arguments
		///
		/// * `multisig_account` - The multisig account ID.
		/// * `call_or_hash` - The enum having the call or the hash of the call to be rejected.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `UnAuthorizedSigner` - The caller is not a signer of the multisig account.
		/// * `SignerNotFound` - The caller has not approved the proposal.
		#[pallet::call_index(3)]
		#[pallet::weight(Weight::default())]
		pub fn reject(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		) -> DispatchResult
}
  • execute_proposal - Execute a multisig proposal. (Releases Deposit)
#![allow(unused)]
fn main() {
		/// Executes a proposal for a dispatchable call for a multisig account.
		/// Proposal needs to be approved by enough signers (meeting or exceeding the multisig threshold) before it can be executed.
		/// The caller must be one of the signers of the multisig account.
		///
		/// This function does an extra check to make sure that all approvers still exist in the multisig account.
		/// That is to make sure that the multisig account is not compromised by removing a signer during an active proposal.
		///
		/// Once finished, the withheld deposit will be returned to the proposal creator.
		///
		/// # Arguments
		///
		/// * `multisig_account` - The multisig account ID.
		/// * `call_or_hash` - We should have gotten the RuntimeCall (preimage) and stored it in the proposal by the time the extrinsic is called.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `UnAuthorizedSigner` - The caller is not a signer of the multisig account.
		/// * `NotEnoughApprovers` - The number of approvers is below the threshold.
		/// * `ProposalNotFound` -  The proposal does not exist.
		/// * `CallPreImageNotFound` -  The proposal doesn't have the preimage of the call in the state.
		pub fn execute_proposal(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		) -> DispatchResult
}
  • cancel_proposal - Cancel a multisig proposal. (Releases Deposit)
#![allow(unused)]
fn main() {
		/// Cancels an existing proposal for a multisig account.
		/// Proposal needs to be rejected by enough signers (meeting or exceeding the multisig threshold) before it can be canceled.
		/// The caller must be one of the signers of the multisig account.
		///
		/// This function does an extra check to make sure that all rejectors still exist in the multisig account.
		/// That is to make sure that the multisig account is not compromised by removing a signer during an active proposal.
		///
		/// Once finished, the withheld deposit will be returned to the proposal creator.
		///
		/// # Arguments
		///
		/// * `origin` - The origin multisig account who wants to cancel the proposal.
		/// * `call_or_hash` - The call or hash of the call to be canceled.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `ProposalNotFound` - The proposal does not exist.
		pub fn cancel_proposal(
		origin: OriginFor<T>, 
		multisig_account: T::AccountId, 
		call_or_hash: CallOrHash) -> DispatchResult
}
  • cancel_own_proposal - Cancel a multisig proposal started by the caller in case no other signers approved it yet. (Releases Deposit)
#![allow(unused)]
fn main() {
		/// Cancels an existing proposal for a multisig account only if the proposal doesn't have approvers other than
		/// the proposer.
		///
		/// This function needs to be called from the proposer of the proposal as the origin.
		///
		/// The withheld deposit will be returned to the proposal creator.
		///
		/// # Arguments
		///
		/// * `multisig_account` - The multisig account ID.
		/// * `call_or_hash` - The hash of the call to be canceled.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `ProposalNotFound` - The proposal does not exist.
		pub fn cancel_own_proposal(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		) -> DispatchResult
}
  • cleanup_proposals - Cleanup proposals of a multisig account. (Releases Deposit)
#![allow(unused)]
fn main() {
		/// Cleanup proposals of a multisig account. This function will iterate over a max limit per extrinsic to ensure
		/// we don't have unbounded iteration over the proposals.
		///
		/// The withheld deposit will be returned to the proposal creator.
		///
		/// # Arguments
		///
		/// * `multisig_account` - The multisig account ID.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `ProposalNotFound` - The proposal does not exist.
		pub fn cleanup_proposals(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
		) -> DispatchResult
}
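
The "last call wins" rule documented for `approve` and `reject` can be illustrated with two sets. This is a minimal sketch under simplifying assumptions: the `Votes` type is hypothetical, accounts are plain `u32` values, the approver is removed eagerly on rejection for brevity, and the `SignerNotFound` check on `reject` is not modelled.

```rust
use std::collections::BTreeSet;

/// Hypothetical vote tracker for one proposal.
#[derive(Default)]
struct Votes {
    approvers: BTreeSet<u32>,
    rejectors: BTreeSet<u32>,
}

impl Votes {
    /// Record an approval; any earlier rejection by the same signer is dropped.
    fn approve(&mut self, who: u32) {
        self.rejectors.remove(&who);
        self.approvers.insert(who);
    }

    /// Record a rejection; any earlier approval by the same signer is dropped.
    fn reject(&mut self, who: u32) {
        self.approvers.remove(&who);
        self.rejectors.insert(who);
    }

    /// Precondition sketched for `execute_proposal`: approvals meet the threshold.
    fn can_execute(&self, threshold: usize) -> bool {
        self.approvers.len() >= threshold
    }

    /// Precondition sketched for `cancel_proposal`: rejections meet the threshold.
    fn can_cancel(&self, threshold: usize) -> bool {
        self.rejectors.len() >= threshold
    }
}
```

A signer who goes approve, reject, approve ends up only in `approvers`, while approve followed by reject leaves them only in `rejectors`.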

Note: Next functions need to be called from the multisig account itself. Deposits are reserved from the multisig account as well.

  • add_signer - Add a new signer to a multisig account. (Needs Deposit)
#![allow(unused)]
fn main() {
		/// Adds a new signer to the multisig account.
		/// This function needs to be called from a Multisig account as the origin.
		/// Otherwise it will fail with MultisigNotFound error.
		///
		/// T::PerSignerDeposit will be held from the multisig account.
		///
		/// # Arguments
		///
		/// * `origin` - The origin multisig account who wants to add a new signer to the multisig account.
		/// * `new_signer` - The AccountId of the new signer to be added.
		/// * `new_threshold` - The new threshold for the multisig account after adding the new signer.
		///
		/// # Errors
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `InvalidThreshold` - The threshold is greater than the total number of signers or is zero.
		/// * `TooManySignatories` - The number of signatories exceeds the maximum allowed.
		pub fn add_signer(
			origin: OriginFor<T>,
			new_signer: T::AccountId,
			new_threshold: u32,
		) -> DispatchResult
}
  • remove_signer - Remove a signer from a multisig account. (Releases Deposit)
#![allow(unused)]
fn main() {
		/// Removes a signer from the multisig account.
		/// This function needs to be called from a Multisig account as the origin.
		/// Otherwise it will fail with MultisigNotFound error.
		/// If only one signer exists and is removed, the multisig account and any pending proposals for this account will be deleted from the state.
		///
		/// # Arguments
		///
		/// * `origin` - The origin multisig account who wants to remove a signer from the multisig account.
		/// * `signer_to_remove` - The AccountId of the signer to be removed.
		/// * `new_threshold` - The new threshold for the multisig account after removing the signer. Accepts zero if
		/// the signer is the only one left.
		///
		/// # Errors
		///
		/// This function can return the following errors:
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `InvalidThreshold` - The new threshold is greater than the total number of signers or is zero.
		/// * `UnAuthorizedSigner` - The caller is not a signer of the multisig account.
		pub fn remove_signer(
			origin: OriginFor<T>,
			signer_to_remove: T::AccountId,
			new_threshold: u32,
		) -> DispatchResult
}
  • set_threshold - Change the threshold of a multisig account.
#![allow(unused)]
fn main() {
		/// Sets a new threshold for a multisig account.
		///	This function needs to be called from a Multisig account as the origin.
		/// Otherwise it will fail with MultisigNotFound error.
		///
		/// # Arguments
		///
		/// * `origin` - The origin multisig account who wants to set the new threshold.
		/// * `new_threshold` - The new threshold to be set.
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `InvalidThreshold` - The new threshold is greater than the total number of signers or is zero.
		pub fn set_threshold(origin: OriginFor<T>, new_threshold: u32) -> DispatchResult
}
  • delete_multisig - Delete a multisig account. (Releases Deposit)
#![allow(unused)]
fn main() {
		/// Deletes a multisig account and all related proposals.
		///
		///	This function needs to be called from a Multisig account as the origin.
		/// Otherwise it will fail with MultisigNotFound error.
		///
		/// # Arguments
		///
		/// * `origin` - The origin multisig account that wants to delete itself.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		pub fn delete_account(origin: OriginFor<T>) -> DispatchResult
}
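
The threshold rule shared by `add_signer`, `remove_signer` and `set_threshold` (non-zero and no larger than the signer count) could be checked along these lines. This is a sketch only: `MAX_SIGNATORIES` is an assumed stand-in for `T::MaxSignatories`, accounts are `u32` values, and the helper names are hypothetical.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
enum Error {
    InvalidThreshold,
    TooManySignatories,
}

/// Assumed stand-in for `T::MaxSignatories`.
const MAX_SIGNATORIES: usize = 3;

/// A threshold is valid when it is non-zero and no larger than the signer count.
fn ensure_valid_threshold(threshold: u32, signer_count: usize) -> Result<(), Error> {
    if threshold == 0 || threshold as usize > signer_count {
        return Err(Error::InvalidThreshold);
    }
    Ok(())
}

/// Add a signer and set the new threshold in one step,
/// leaving the signer set untouched on error.
fn add_signer(
    signers: &mut BTreeSet<u32>,
    new_signer: u32,
    new_threshold: u32,
) -> Result<u32, Error> {
    let mut updated = signers.clone();
    updated.insert(new_signer);
    if updated.len() > MAX_SIGNATORIES {
        return Err(Error::TooManySignatories);
    }
    ensure_valid_threshold(new_threshold, updated.len())?;
    *signers = updated;
    Ok(new_threshold)
}
```

Passing the new threshold together with the signer change keeps the account valid at every step, since the check runs against the signer count after the change.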

Storage/State

  • Use 2 main storage maps to store multisig accounts and proposals.
#![allow(unused)]
fn main() {
#[pallet::storage]
  pub type MultisigAccount<T: Config> = StorageMap<_, Twox64Concat, T::AccountId, MultisigAccountDetails<T>>;

/// The set of open multisig proposals. A proposal is uniquely identified by the multisig account and the call hash.
/// (maybe a nonce as well in the future)
#[pallet::storage]
pub type PendingProposals<T: Config> = StorageDoubleMap<
    _,
    Twox64Concat,
    T::AccountId, // Multisig Account
    Blake2_128Concat,
    T::Hash, // Call Hash
    MultisigProposal<T>,
>;
}

As for the values:

#![allow(unused)]
fn main() {
pub struct MultisigAccountDetails<T: Config> {
	/// The signers of the multisig account. This is a BoundedBTreeSet to ensure faster operations (add, remove).
	/// As well as lookups and faster set operations to ensure approvers is always a subset of signers. (e.g. in case of removal of a signer during an active proposal)
	pub signers: BoundedBTreeSet<T::AccountId, T::MaxSignatories>,
	/// The threshold of approvers required for the multisig account to be able to execute a call.
	pub threshold: u32,
	pub deposit: BalanceOf<T>,
}
}
#![allow(unused)]
fn main() {
pub struct MultisigProposal<T: Config> {
    /// Proposal creator.
    pub creator: T::AccountId,
    pub creation_deposit: BalanceOf<T>,
    /// The extrinsic when the multisig operation was opened.
    pub when: Timepoint<BlockNumberFor<T>>,
    /// The approvers achieved so far, including the depositor.
    /// The approvers are stored in a BoundedBTreeSet to ensure faster lookup and operations (approve, reject).
    /// It's also bounded to ensure that the size doesn't go over the limit required by the Runtime.
    pub approvers: BoundedBTreeSet<T::AccountId, T::MaxSignatories>,
    /// The rejectors for the proposal so far.
    /// The rejectors are stored in a BoundedBTreeSet to ensure faster lookup and operations (approve, reject).
    /// It's also bounded to ensure that the size doesn't go over the limit required by the Runtime.
    pub rejectors: BoundedBTreeSet<T::AccountId, T::MaxSignatories>,
    /// The block number until which this multisig operation is valid. None means no expiry.
    pub expire_after: Option<BlockNumberFor<T>>,
}
}

For optimization we're using BoundedBTreeSet to allow for efficient lookups and removals. Especially in the case of approvers, we need to be able to remove an approver from the list when they withdraw their approval by rejecting (which we do lazily when execute_proposal is called).
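
To make the bounded-set behaviour concrete, here is a small standard-library sketch. The `BoundedSet` type is a hypothetical wrapper written for illustration, not the FRAME type; it only shows the idea of a sorted set that refuses inserts beyond a fixed capacity.

```rust
use std::collections::BTreeSet;

/// Hypothetical bounded wrapper around `BTreeSet`, capped at `MAX` elements.
struct BoundedSet<const MAX: usize> {
    inner: BTreeSet<u32>,
}

impl<const MAX: usize> BoundedSet<MAX> {
    fn new() -> Self {
        BoundedSet { inner: BTreeSet::new() }
    }

    /// Insert `item`, handing it back as an error if the set is already full.
    /// Returns `Ok(true)` when newly added and `Ok(false)` when already present.
    fn try_insert(&mut self, item: u32) -> Result<bool, u32> {
        if self.inner.contains(&item) {
            return Ok(false);
        }
        if self.inner.len() >= MAX {
            return Err(item);
        }
        Ok(self.inner.insert(item))
    }

    /// Removal is logarithmic in the number of elements.
    fn remove(&mut self, item: &u32) -> bool {
        self.inner.remove(item)
    }

    /// Lookup is logarithmic in the number of elements.
    fn contains(&self, item: &u32) -> bool {
        self.inner.contains(item)
    }
}
```

The bound is what keeps the stored value's size predictable: a full set rejects a new approver until an existing one is removed.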

There's an extra storage map for the deposits of the multisig accounts per signer added. This is to ensure that we can release the deposits when the multisig removes them even if the constant deposit per signer changed in the runtime later on.
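
A minimal sketch of that bookkeeping, assuming `u32` account IDs and `u128` balances: the `SignerDeposits` type and `creation_deposit` helper are hypothetical names, and the actual hold/release on balances is omitted.

```rust
use std::collections::BTreeMap;

/// Hypothetical ledger of what was actually held for each signer when added.
#[derive(Default)]
struct SignerDeposits {
    held: BTreeMap<u32, u128>,
}

impl SignerDeposits {
    /// Record the per-signer deposit that is configured at the time of addition.
    fn on_add(&mut self, signer: u32, per_signer_deposit: u128) {
        self.held.insert(signer, per_signer_deposit);
    }

    /// Release exactly what was recorded, regardless of the current runtime constant.
    fn on_remove(&mut self, signer: u32) -> Option<u128> {
        self.held.remove(&signer)
    }
}

/// Creation deposit as described for `create_multisig`:
/// base deposit plus a per-signer deposit for each initial signer.
fn creation_deposit(base: u128, per_signer: u128, signer_count: u32) -> u128 {
    base.saturating_add(per_signer.saturating_mul(u128::from(signer_count)))
}
```

If the per-signer constant changes from 100 to 250 between two additions, each signer's removal still releases the amount that was recorded for them.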

Considerations & Edge cases

Removing a signer from the multisig account during an active proposal

We need to ensure that the approvers are always a subset of the signers. This is also partially why we're using BoundedBTreeSet for signers and approvers. Once execute_proposal is called, we ensure that the proposal is still valid and that the approvers are still a subset of the current signers.
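
One possible way to express this check is to count only approvals from accounts that are still signers and compare that count against the current threshold. The function name and the `u32` account type are assumptions made for this sketch.

```rust
use std::collections::BTreeSet;

/// Count only approvals from accounts that are still signers, then compare
/// against the multisig's current threshold.
fn has_enough_valid_approvals(
    signers: &BTreeSet<u32>,
    approvers: &BTreeSet<u32>,
    threshold: usize,
) -> bool {
    approvers.intersection(signers).count() >= threshold
}
```

If accounts 1, 2 and 3 approved and account 3 is then removed from the signers, only two approvals remain valid, so a threshold of 3 is no longer met while a threshold of 2 still is.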

Multisig account deletion and cleaning up existing proposals

Once the last signer of a multisig account is removed, or the multisig approves the account deletion, we delete the multisig account from the state and keep the proposals until someone calls cleanup_proposals, possibly multiple times, each call iterating over at most a fixed number of proposals. This is to ensure we don't have unbounded iteration over the proposals. Users are already incentivized to call cleanup_proposals to get their deposits back.
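
The bounded cleanup can be sketched with an ordered map keyed by `(multisig account, call hash)`. The cap `MAX_CLEANUP_PER_CALL`, the `u32`/`u64`/`u128` types and the tuple layout are assumptions for illustration; the real pallet would iterate its storage double map instead.

```rust
use std::collections::BTreeMap;

/// Assumed cap on how many proposals one `cleanup_proposals` call may remove.
const MAX_CLEANUP_PER_CALL: usize = 2;

/// Remove at most `MAX_CLEANUP_PER_CALL` proposals of `multisig` and return the
/// `(creator, deposit)` pairs whose deposits should be released.
/// Proposals are keyed by `(multisig account, call hash)`.
fn cleanup_proposals(
    proposals: &mut BTreeMap<(u32, u64), (u32, u128)>,
    multisig: u32,
) -> Vec<(u32, u128)> {
    // Collect a bounded batch of keys first, so the map is not mutated while iterating.
    let keys: Vec<(u32, u64)> = proposals
        .range((multisig, u64::MIN)..=(multisig, u64::MAX))
        .take(MAX_CLEANUP_PER_CALL)
        .map(|(key, _)| *key)
        .collect();
    keys.iter().filter_map(|key| proposals.remove(key)).collect()
}
```

With three leftover proposals and a cap of two, the first call releases two deposits, the second releases the last one, and proposals of other multisig accounts are untouched.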

Multisig account deletion and existing deposits

We currently just delete the account without checking for deposits (would like to hear your thoughts here). The options are:

  • Not taking deposits to begin with and charging a fee instead.
  • Transferring the remaining deposits to the treasury.
  • Returning an error on deletion. (don't like this)

Approving a proposal after the threshold is changed

We always use latest threshold and don't store each proposal with different threshold. This allows the following:

  • In case threshold is lower than the number of approvers then the proposal is still valid.
  • In case threshold is higher than the number of approvers then we catch it during execute proposal and error.

Drawbacks

  • New pallet to maintain.

Testing, Security, and Privacy

Standard audit/review requirements apply.

Performance, Ergonomics, and Compatibility

Performance

A back-of-the-envelope calculation shows that the stateful multisig is more efficient than the stateless multisig, given its smaller footprint on blocks.

Quick review over the extrinsics for both as it affects the block size:

Stateless Multisig: Both as_multi and approve_as_multi have similar parameters:

#![allow(unused)]
fn main() {
origin: OriginFor<T>,
threshold: u16,
other_signatories: Vec<T::AccountId>,
maybe_timepoint: Option<Timepoint<BlockNumberFor<T>>>,
call_hash: [u8; 32],
max_weight: Weight,
}

Stateful Multisig: We have the following extrinsics:

#![allow(unused)]
fn main() {
pub fn start_proposal(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		)
}
#![allow(unused)]
fn main() {
pub fn approve(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		)
}
#![allow(unused)]
fn main() {
pub fn execute_proposal(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		)
}

The main takeaway is that we don't need to pass the threshold and other signatories in the extrinsics. This is because we already have the threshold and signatories in the state (only once).

So now for the calculations, given the following:

  • K is the number of multisig accounts.
  • N is number of signers in each multisig account.
  • For each proposal we need to have 2N/3 approvals.

The table estimates how much the total block and state sizes grow if each of the K multisig accounts has one proposal that gets approved by 2N/3 signers and is then executed.

Note: We're not calculating the cost of the proposal itself, as it is almost the same in both the stateful and stateless multisig and gets cleaned up from the state once the proposal is executed or canceled.

Stateless effect on blocksizes = 2/3KN^2 (as each user of the 2/3 users will need to call approve_as_multi with all the other signatories(N) in extrinsic body)

Stateful effect on blocksizes = K * N (as each user will need to call approve with the multisig account only in extrinsic body)

Stateless effect on statesizes = Nil (as the multisig account is not stored in the state)

Stateful effect on statesizes = K*N (as each multisig account (K) will be stored with all the signers (N) in the state)

| Pallet | Block Size | State Size |
|----------------|:-------------:|-----------:|
| Stateless | 2/3KN^2 | Nil |
| Stateful | K*N | K*N |

Simplified table removing K from the equation:

| Pallet | Block Size | State Size |
|----------------|:-------------:|-----------:|
| Stateless | N^2 | Nil |
| Stateful | N | N |

So even though the stateful multisig has a larger state size, it's still more efficient in terms of block size and total footprint on the blockchain.
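
For a concrete feel of the estimate, the formulas can be evaluated directly. The functions below simply count account IDs, using the same rough model as the text (names are hypothetical, and real extrinsics also carry other fields that are ignored here).

```rust
/// Account IDs carried in extrinsic bodies under the stateless pallet:
/// each of the 2N/3 approvers lists the other signatories (roughly N), for K multisigs.
fn stateless_block_ids(k: u64, n: u64) -> u64 {
    k * (2 * n / 3) * n
}

/// Upper bound used in the text for the stateful pallet: at most N approval
/// calls per multisig, each carrying only the multisig account ID.
fn stateful_block_ids(k: u64, n: u64) -> u64 {
    k * n
}

/// Signer IDs kept in state by the stateful pallet: N per multisig account.
fn stateful_state_ids(k: u64, n: u64) -> u64 {
    k * n
}
```

With K = 100 multisig accounts of N = 9 signers each, the stateless estimate is 100 * 6 * 9 = 5,400 account IDs in blocks, against 900 in blocks plus 900 in state for the stateful pallet.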

Ergonomics

The Stateful Multisig will have better ergonomics for managing multisig accounts for both developers and end-users.

Compatibility

This RFC is compatible with the existing implementation and can be handled via upgrades and migration. It's not intended to replace the existing multisig pallet.

Prior Art and References

multisig pallet in polkadot-sdk

Unresolved Questions

  • On account deletion, should we transfer remaining deposits to treasury or remove signers' addition deposits completely and consider it as fees to start with?
  • Batch addition/removal of signers.
  • Add expiry to proposals. After a certain time, proposals will not accept any more approvals or executions and will be deleted.
  • Implement call filters. This will allow multisig accounts to only accept certain calls.

RFC-0077: Increase maximum length of identity PGP fingerprint values from 20 bytes

| | |
| --- | --- |
| Start Date | 20 Feb 2024 |
| Description | Increase the maximum length of identity PGP fingerprint values from 20 bytes |
| Authors | Luke Schoen |

Summary

This proposes to increase the maximum length of PGP Fingerprint values from a 20 bytes/chars limit to a 40 bytes/chars limit.

Motivation

Background

Pretty Good Privacy (PGP) Fingerprints are shorter versions of their corresponding Public Key that may be printed on a business card.

They may be used by someone to validate the correct corresponding Public Key.

It should be possible to add PGP Fingerprints to Polkadot on-chain identities.

GNU Privacy Guard (GPG) is compliant with PGP and the two acronyms are used interchangeably.

Problem

When setting a Polkadot on-chain identity, users may provide a PGP Fingerprint value in the "pgpFingerprint" field. That value may be longer than 20 bytes/chars (e.g. PGP Fingerprints are 40 bytes/chars long); however, the field can only store a maximum of 20 bytes/chars of information.

Possible disadvantages of the current 20 bytes/chars limitation:

  • Discourages users from using the "pgpFingerprint" field.
  • Discourages users from using Polkadot on-chain identities for Web2 and Web3 dApp software releases where the latest "pgpFingerprint" field could be used to verify the correct PGP Fingerprint that has been used to sign the software releases so users that download the software know that it was from a trusted source.
  • Encourages dApps to link to Web2 sources to allow their users to verify the correct fingerprint associated with software releases, rather than to use the Web3 Polkadot on-chain identity "pgpFingerprint" field of the releaser of the software, since it may be the case that the "pgpFingerprint" field of most on-chain identities is not widely used due to the maximum length of 20 bytes/chars restriction.
  • Discourages users from setting an on-chain identity by creating an extrinsic using Polkadot.js with identity > setIdentity(info), since if they try to provide their 40 character long PGP Fingerprint or GPG Fingerprint, which is longer than the maximum length of 20 bytes/chars, they will encounter an error.
  • Discourages users from using on-chain Web3 registrars to judge on-chain identity fields, where the shortest value they are able to generate for a "pgpFingerprint" is longer than the maximum length of 20 bytes.

Solution Requirements

The maximum length of identity PGP Fingerprint values should be increased from the current 20 bytes/chars limit to at least a 40 bytes/chars limit to support PGP Fingerprints and GPG Fingerprints.

Stakeholders

  • Any Polkadot account holder wishing to use a Polkadot on-chain identity for their:
    • PGP Fingerprints that are longer than 20 bytes/chars
    • GPG Fingerprints that are longer than 20 bytes/chars

Explanation

If a user tries to set an on-chain identity by creating an extrinsic using Polkadot.js with identity > setIdentity(info), and they provide their 40 character long PGP Fingerprint or GPG Fingerprint, which is longer than the maximum length of 20 bytes/chars [u8;20], then they will encounter this error:

createType(Call):: Call: failed decoding identity.setIdentity:: Struct: failed on args: {...}:: Struct: failed on pgpFingerprint: Option<[u8;20]>:: Expected input with 20 bytes (160 bits), found 40 bytes

Increasing maximum length of identity PGP Fingerprint values from the current 20 bytes/chars limit to at least a 40 bytes/chars limit would overcome these errors and support PGP Fingerprints and GPG Fingerprints, satisfying the solution requirements.
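
The length mismatch behind that error can be reproduced in a few lines. This is an illustrative sketch only: `fits_field` is a hypothetical helper that mimics a fixed-size field, and the fingerprint used with it is a made-up placeholder, not a real key.

```rust
/// Copies the fingerprint's text bytes into a fixed-size field of `LEN` bytes,
/// or returns `None` when the input length does not match the field size.
fn fits_field<const LEN: usize>(fingerprint: &str) -> Option<[u8; LEN]> {
    let bytes = fingerprint.as_bytes();
    if bytes.len() != LEN {
        return None;
    }
    let mut field = [0u8; LEN];
    field.copy_from_slice(bytes);
    Some(field)
}
```

A 40-character value such as `0123456789ABCDEF0123456789ABCDEF01234567` does not fit `fits_field::<20>`, matching the "Expected input with 20 bytes (160 bits), found 40 bytes" failure, but it does fit `fits_field::<40>`.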

Drawbacks

No drawbacks have been identified.

Testing, Security, and Privacy

Implementations would be tested for adherence by checking that 40 bytes/chars PGP Fingerprints are supported.

No effect on security or privacy has been identified beyond what already exists.

No implementation pitfalls have been identified.

Performance, Ergonomics, and Compatibility

Performance

It would be an optimization, since the associated exposed interfaces to developers and end-users could start being used.

To minimize additional overhead the proposal suggests a 40 bytes/chars limit since that would at least provide support for PGP Fingerprints, satisfying the solution requirements.

Ergonomics

No potential ergonomic optimizations have been identified.

Compatibility

Updates to Polkadot.js Apps, API and its documentation and those referring to it may be required.

Prior Art and References

No prior articles or references.

Unresolved Questions

No further questions at this stage.

Relates to RFC entitled "Increase maximum length of identity raw data values from 32 bytes".

RFC-0088: Add slashable locked deposit, purchaser reputation, and reserved cores for on-chain identities to broker pallet

| | |
| --- | --- |
| Start Date | 25 Apr 2024 |
| Description | Add slashable locked deposit, purchaser reputation, and reserved cores for on-chain identities to broker pallet |
| Authors | Luke Schoen |

Summary

This proposes to require a slashable deposit in the broker pallet when initially purchasing or renewing Bulk Coretime or Instantaneous Coretime cores.

Additionally, it proposes to record a reputational status based on the behavior of the purchaser, as it relates to their use of Kusama Coretime cores that they purchase, and to possibly reserve a proportion of the cores for prospective purchasers that have an on-chain identity.

Motivation

Background

There are sales of Kusama Coretime cores that are scheduled to occur later this month by Coretime Marketplace Lastic.xyz initially in limited quantities, and potentially also by RegionX in future, subject to their Polkadot referendum #582. This poses a risk in that some Kusama Coretime core purchasers may buy Kusama Coretime cores when they have no intention of actually placing a workload on them or leasing them out, which would prevent those that wish to purchase and actually use Kusama Coretime cores from being able to use any cores at all.

Problem

The types of purchasers may include:

  • Collectors (e.g. purchase a significant core such as the first core that is sold just to increase their likelihood of receiving an NFT airdrop for being one of the first purchasers).
  • Resellers (e.g. purchase a core that may be used at a popular period of time to resell closer to the date to realise a profit)
  • Market makers (e.g. buy cores just to change the floor price or volume).
  • Anti-competitive (e.g. competitor to Polkadot ecosystem purchases cores possibly in violation of anti-trust laws just to restrict access to prospective Kusama Coretime sales cores by the Kusama community that wish to do business in the Polkadot ecosystem).

Chaotic repercussions could include the following:

  • Generation of "white elephant" Kusama Coretime cores, similar to "white elephant" properties in the real-estate industry that never actually get used, leased or tenanted.
  • Kusama Coretime core resellers scalping the core time faster than the average core time consumer, and then choosing to use dynamic pricing that causes prices to fluctuate based on demand.
  • Resellers that own the Kusama Coretime scalping organisations may actually turn out to be the Official Kusama Coretime sellers.
  • Official Kusama Coretime sellers may establish a monopoly on the market and abuse that power by charging exorbitant additional fees for each purchase, since they could then increase their floor prices even more, pretending that there are fewer cores available and more demand to make extra profits from their scalping organisations, similar to how it occurred in these concert ticket sales. This could cause Kusama Coretime costs to no longer be affordable to the Kusama community.
  • Official Kusama Coretime sellers may run pre-sale events, but their websites may be unable to handle the traffic and crash multiple times, causing them to end up cancelling those pre-sales and the pre-sale registrants missing out on getting a core that way, which would then cause available Kusama Coretime cores to be bought and resold at a higher price on third-party sites.
  • The scalping activity may be illegal in some jurisdictions and raise anti-trust issues similar to the Taylor Swift debacle over concert tickets.

Solution Requirements

  1. On-chain identity. It may be possible to circumvent bots and scalpers to an extent by requiring a proportion of Kusama Coretime purchasers to have an on-chain identity. As such, a possible solution could be to allow the configuration of a threshold in the Broker pallet that reserves a proportion of the cores for accounts that have an on-chain identity, that reverts to a waiting list of anonymous account purchasers if the reserved proportion of cores remain unsold.

  2. Slashable deposit. A viable solution could be to require a slashable deposit to be locked prior to the purchase or renewal of a core, similar to how decision deposits are used in OpenGov to prevent spam, but where if you buy a Kusama Coretime core you could be challenged by one or more collectives of fishermen to provide proof against certain criteria of how you used it, and if you fail to provide adequate evidence in response to that scrutiny, then you would lose a proportion of that deposit and face restrictions on purchasing or renewing cores in future that may also be configured on-chain.

  3. Reputation. To disincentivise certain behaviours, a reputational status indicator could be used to record the historic behavior of the purchaser and whether on-chain judgement has determined they have adequately rectified that behaviour, as it relates to their usage of Kusama Coretime cores that they purchase.
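
The arithmetic behind the first two requirements could look roughly like the sketch below. Everything here is an assumption made for illustration: the function names, the use of whole percentages as configuration values, and the `u32`/`u128` types are not part of the Broker pallet.

```rust
/// Split the cores on sale into a share reserved for accounts with an
/// on-chain identity and the remainder open to everyone.
/// `reserved_percent` is an assumed configuration value, capped at 100.
fn split_cores(total_cores: u32, reserved_percent: u32) -> (u32, u32) {
    let pct = u64::from(reserved_percent.min(100));
    let reserved = (u64::from(total_cores) * pct / 100) as u32;
    (reserved, total_cores - reserved)
}

/// Reserved cores that identity holders did not buy revert to the waiting list.
fn released_to_waiting_list(reserved: u32, sold_to_identities: u32) -> u32 {
    reserved.saturating_sub(sold_to_identities)
}

/// Slash an assumed percentage of a locked deposit; returns `(slashed, remaining)`.
fn slash(deposit: u128, slash_percent: u32) -> (u128, u128) {
    let pct = u128::from(slash_percent.min(100));
    // Split the multiplication to avoid overflow for very large deposits.
    let slashed = deposit / 100 * pct + deposit % 100 * pct / 100;
    (slashed, deposit - slashed)
}
```

For example, reserving 20% of 50 cores sets aside 10 cores; if only 7 of them sell to identity holders, 3 revert to the waiting list, and a 25% slash of a deposit of 1,000 removes 250.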

Stakeholders

  • Any Kusama account holder wishing to use the Broker pallet in any upcoming Kusama Coretime sales.
  • Any prospective Kusama Coretime purchaser, developer, and user.
  • KSM holders.

Drawbacks

Performance

If set too high, the slashable deposit may have an economic impact, with fewer Kusama Coretime cores being purchased.

Testing, Security, and Privacy

Lack of a slashable deposit in the Broker pallet is a security concern, since it exposes Kusama Coretime sales to potential abuse.

Reserving a proportion of Kusama Coretime sales cores for those with on-chain identities should not be to the exclusion of accounts that wish to remain anonymous or cause cores to be wasted unnecessarily. As such, if cores that are reserved for on-chain identities remain unsold then they should be released to anonymous accounts that are on a waiting list.

No implementation pitfalls have been identified.

Performance, Ergonomics, and Compatibility

Performance

It should improve performance, as it reduces the potential for state bloat: there is less risk of the undesirable Kusama Coretime sales activity that would arise if there were no slashable deposit requirement and no reputational risk to purchasers who waste or misuse Kusama Coretime cores.

The solution aims to minimize the risk of some Kusama Coretime cores not being used or leased to perform any tasks at all.

It will be important to monitor and manage the slashable deposits, purchaser reputations, and utilization of the proportion of cores that are reserved for accounts with an on-chain identity.

Ergonomics

The mechanism for setting a slashable deposit amount should avoid undue complexity for users.

Compatibility

Updates to Polkadot.js Apps, API and its documentation and those referring to it may be required.

Prior Art and References

Prior Art

No prior articles.

Unresolved Questions

None

None

RFC-0001: Secondary Market for Regions

Start Date: 2024-06-09
Description: Implement a secondary market for region listings and sales
Authors: Aurora Poppyseed, Philip Lucsok

Summary

This RFC proposes the addition of a secondary market feature to either the broker pallet or as a separate pallet maintained by Lastic, enabling users to list and purchase regions. This includes creating, purchasing, and removing listings, as well as emitting relevant events and handling associated errors.

Motivation

Currently, the broker pallet lacks functionality for a secondary market, which limits users' ability to freely trade regions. This RFC aims to introduce a secure and straightforward mechanism for users to list regions they own for sale and allow other users to purchase these regions.

While integrating this functionality directly into the broker pallet is one option, another viable approach is to implement it as a separate pallet maintained by Lastic. This separate pallet would have access to the broker pallet and add minimal functionality necessary to support the secondary market.

Adding smart contracts to the Coretime chain could also address this need; however, this process is expected to be lengthy and complex. We cannot afford to wait for this extended timeline to enable basic secondary market functionality. By proposing either integration into the broker pallet or the creation of a dedicated pallet, we can quickly enhance the flexibility and utility of the broker pallet, making it more user-friendly and valuable.

Stakeholders

Primary stakeholders include:

  • Developers working on the broker pallet.
  • Secondary Coretime marketplaces.
  • Users who own regions and wish to trade them.
  • Community members interested in enhancing the broker pallet’s capabilities.

Explanation

This RFC introduces the following key features:

  1. Storage Changes:

    • Addition of Listings storage map to keep track of regions listed for sale and their prices.
  2. New Dispatchable Functions:

    • create_listing: Allows a region owner to list a region for sale.
    • purchase_listing: Allows a user to purchase a listed region.
    • remove_listing: Allows a region owner to remove their listing.
  3. Events:

    • ListingCreated: Emitted when a new listing is created.
    • RegionSold: Emitted when a region is sold.
    • ListingRemoved: Emitted when a listing is removed.
  4. Error Handling:

    • ExpiredRegion: The region has expired and cannot be listed or sold.
    • UnknownListing: The listing does not exist.
    • InvalidPrice: The listing price is invalid.
    • NotOwner: The caller is not the owner of the region.
  5. Testing:

    • Comprehensive tests to verify the correct functionality of the new features, including listing creation, purchase, removal, and handling of edge cases such as expired regions and unauthorized actions.
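
The storage map, dispatchables, events, and errors listed above can be sketched as a plain in-memory model. This is not FRAME code and not the proposed pallet: `Market`, `Region`, the integer type aliases, and the zero-price check behind `InvalidPrice` are assumptions made for illustration, and the balance transfer between buyer and seller is left out.

```rust
use std::collections::HashMap;

pub type AccountId = u32;
pub type Balance = u128;
pub type RegionId = u64;
pub type Timeslice = u32;

#[derive(Debug, PartialEq, Eq)]
pub enum Error { ExpiredRegion, UnknownListing, InvalidPrice, NotOwner }

#[derive(Debug, PartialEq, Eq)]
pub enum Event {
    ListingCreated { region: RegionId, price: Balance },
    RegionSold { region: RegionId, buyer: AccountId, price: Balance },
    ListingRemoved { region: RegionId },
}

/// Stand-in for the broker pallet's region record (assumed shape).
pub struct Region { pub owner: AccountId, pub end: Timeslice }

#[derive(Default)]
pub struct Market {
    pub regions: HashMap<RegionId, Region>,
    /// The `Listings` storage map: region -> asking price.
    pub listings: HashMap<RegionId, Balance>,
    pub events: Vec<Event>,
}

impl Market {
    pub fn create_listing(&mut self, who: AccountId, region: RegionId,
                          price: Balance, now: Timeslice) -> Result<(), Error> {
        let r = self.regions.get(&region).ok_or(Error::NotOwner)?;
        if r.owner != who { return Err(Error::NotOwner); }
        if r.end <= now { return Err(Error::ExpiredRegion); }
        if price == 0 { return Err(Error::InvalidPrice); }
        self.listings.insert(region, price);
        self.events.push(Event::ListingCreated { region, price });
        Ok(())
    }

    pub fn purchase_listing(&mut self, buyer: AccountId, region: RegionId,
                            now: Timeslice) -> Result<(), Error> {
        let price = *self.listings.get(&region).ok_or(Error::UnknownListing)?;
        let r = self.regions.get_mut(&region).ok_or(Error::UnknownListing)?;
        if r.end <= now { return Err(Error::ExpiredRegion); }
        // Payment from buyer to seller is omitted in this sketch.
        r.owner = buyer;
        self.listings.remove(&region);
        self.events.push(Event::RegionSold { region, buyer, price });
        Ok(())
    }

    pub fn remove_listing(&mut self, who: AccountId, region: RegionId) -> Result<(), Error> {
        if !self.listings.contains_key(&region) { return Err(Error::UnknownListing); }
        let r = self.regions.get(&region).ok_or(Error::UnknownListing)?;
        if r.owner != who { return Err(Error::NotOwner); }
        self.listings.remove(&region);
        self.events.push(Event::ListingRemoved { region });
        Ok(())
    }
}
```

Each call checks ownership and expiry before touching storage, so a failed call leaves both maps unchanged and emits no event.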

Drawbacks

The main drawback of adding the additional complexity directly to the broker pallet is the potential increase in maintenance overhead. Therefore, we propose adding additional functionality as a separate pallet on the Coretime chain. To take the pressure off from implementing these features, implementation along with unit tests would be taken care of by Lastic (Aurora Makovac, Philip Lucsok).

There are potential risks of security vulnerabilities in the new market functionalities, such as unauthorized region transfers or incorrect balance adjustments. Therefore, extensive security measures would have to be implemented.

Testing, Security, and Privacy

Testing

  • Comprehensive unit tests need to be provided to ensure the correctness of the new functionalities.
  • Scenarios tested should include successful and failed listing creation, purchases, and removals, as well as edge cases like expired regions and non-owner actions.

Security

  • Security audits should be performed to identify any vulnerabilities.
  • Ensure that only region owners can create or remove listings.
  • Validate all inputs to prevent invalid operations.

Privacy

  • The proposal does not introduce new privacy concerns as it only affects region trading functionality within the existing framework.

Performance, Ergonomics, and Compatibility

Performance

  • This feature is expected to introduce minimal overhead since it primarily involves read and write operations to storage maps.
  • Efforts will be made to optimize the code to prevent unnecessary computational costs.

Ergonomics

  • The new functions are designed to be intuitive and easy to use, providing clear feedback through events and errors.
  • Documentation and examples will be provided to assist developers and users.

Compatibility

  • This proposal does not break compatibility with existing interfaces or previous versions.
  • No migrations are necessary as it introduces new functionality without altering existing features.

Prior Art and References

  • All related discussions are going to be under this PR.

Unresolved Questions

  • Are there additional security measures needed to prevent potential abuses of the new functionalities?
  • Integration with external NFT marketplaces for more robust trading options.
  • Development of user interfaces to interact with the new marketplace features seamlessly.
  • Exploration of adding smart contracts to the Coretime chain, which would provide greater flexibility and functionality for the secondary market and other decentralized applications. This would require a longer time for implementation, so this proposes an intermediary solution.

RFC-0002: Smart Contracts on the Coretime Chain

Start Date: 2024-06-09
Description: Implement smart contracts on the Coretime chain
Authors: Aurora Poppyseed, Phil Lucksok

Summary

This RFC proposes the integration of smart contracts on the Coretime chain to enhance flexibility and enable complex decentralized applications, including secondary market functionalities.

Motivation

Currently, the Coretime chain lacks the capability to support smart contracts, which limits the range of decentralized applications that can be developed and deployed. By enabling smart contracts, the Coretime chain can facilitate more sophisticated functionalities such as automated region trading, dynamic pricing mechanisms, and other decentralized applications that require programmable logic. This will enhance the utility of the Coretime chain, attract more developers, and create more opportunities for innovation.

Additionally, while there is a proposal (#885) to allow EVM-compatible contracts on Polkadot’s Asset Hub, the implementation of smart contracts directly on the Coretime chain will provide synchronous interactions and avoid the complexities of asynchronous operations via XCM.

Stakeholders

Primary stakeholders include:

  • Developers working on the Coretime chain.
  • Users who want to deploy decentralized applications on the Coretime chain.
  • Community members interested in expanding the capabilities of the Coretime chain.
  • Secondary Coretime marketplaces.

Explanation

This RFC introduces the following key components:

  1. Smart Contract Support:

    • Integrate support for deploying and executing smart contracts on the Coretime chain.
    • Use a well-established smart contract platform, such as Ethereum’s Solidity or Polkadot's Ink!, to ensure compatibility and ease of use.
  2. Storage and Execution:

    • Define a storage structure for smart contracts and their associated data.
    • Ensure efficient and secure execution of smart contracts, with proper resource management and gas fee mechanisms.
  3. Integration with Existing Pallets:

    • Ensure that smart contracts can interact with existing pallets on the Coretime chain, such as the broker pallet.
    • Provide APIs and interfaces for seamless integration and interaction.
  4. Security and Auditing:

    • Implement robust security measures to prevent vulnerabilities and exploits in smart contracts.
    • Conduct thorough security audits and testing before deployment.

Drawbacks

There are several drawbacks to consider:

  • Complexity: Adding smart contracts introduces significant complexity to the Coretime chain, which may increase maintenance overhead and the potential for bugs.
  • Performance: The execution of smart contracts can be resource-intensive, potentially affecting the performance of the Coretime chain.
  • Security: Smart contracts are prone to vulnerabilities and exploits, necessitating rigorous security measures and continuous monitoring.

Testing, Security, and Privacy

Testing

  • Comprehensive unit tests and integration tests should be developed to ensure the correct functionality of smart contracts.
  • Test scenarios should include various use cases and edge cases to validate the robustness of the implementation.

Security

  • Security audits should be performed to identify and mitigate vulnerabilities.
  • Implement best practices for smart contract development to minimize the risk of exploits.
  • Continuous monitoring and updates will be necessary to address new security threats.

Privacy

  • The proposal does not introduce new privacy concerns as it extends existing functionalities with programmable logic.

Performance, Ergonomics, and Compatibility

Performance

  • The introduction of smart contracts may impact performance due to the additional computational overhead.
  • Optimization techniques, such as efficient gas fee mechanisms and resource management, should be employed to minimize performance degradation.

Ergonomics

  • The new functionality should be designed to be intuitive and easy to use for developers, with comprehensive documentation and examples.
  • Provide developer tools and SDKs to facilitate the creation and deployment of smart contracts.

Compatibility

  • This proposal should maintain compatibility with existing interfaces and functionalities of the Coretime chain.
  • Ensure backward compatibility and provide migration paths if necessary.

Prior Art and References

  • Ethereum’s implementation of smart contracts using Solidity.
  • Polkadot’s Ink! smart contract platform.
  • Existing decentralized applications and use cases on other blockchain platforms.
  • Proposal #885: EVM-compatible contracts on Asset Hub, which highlights the community's interest in integrating smart contracts within the Polkadot ecosystem.

Unresolved Questions

  • What specific security measures should be implemented to prevent smart contract vulnerabilities?
  • How can we ensure optimal performance while supporting complex smart contracts?
  • What are the best practices for integrating smart contracts with existing pallets on the Coretime chain?
  • Further enhancements could include advanced developer tools and SDKs for smart contract development.
  • Integration with external decentralized applications and platforms to expand the ecosystem.
  • Continuous updates and improvements to the smart contract platform based on community feedback and emerging best practices.
  • Exploration of additional use cases for smart contracts on the Coretime chain, such as decentralized finance (DeFi) applications, voting systems, and more.

By enabling smart contracts on the Coretime chain, we can significantly expand its capabilities and attract a wider range of developers and users, fostering innovation and growth in the ecosystem.

RFC-0000: Feature Name Here

Start Date: 13 July 2024
Description: Implement off-chain parachain runtime upgrades
Authors: eskimor

Summary

Change the upgrade process of a parachain runtime upgrade to become an off-chain process with regards to the relay chain. Upgrades are still contained in parachain blocks, but will no longer need to end up in relay chain blocks nor in relay chain state.

Motivation

Having parachain runtime upgrades go through the relay chain has always been seen as a scalability concern. Due to optimizations in statement distribution and asynchronous backing, it became less crucial and got de-prioritized; the original issue can be found here.

With the introduction of Agile Coretime and, in general, our efforts to further reduce the barrier to entry for Polkadot, the issue becomes more relevant again: We would like to reduce the required storage deposit for PVF registration, with the aim to not only make it cheaper to run a parachain (bulk + on-demand coretime), but also reduce the amount of capital required for the deposit. With this we would hope for far more parachains to get registered: thousands, potentially even tens of thousands. With so many PVFs registered, updates are expected to become more frequent, and even attacks on service quality for other parachains would become a higher risk.

Stakeholders

  • Parachain Teams
  • Relay Chain Node implementation teams
  • Relay Chain runtime developers

Explanation

The issues with on-chain runtime upgrades are:

  1. Needlessly costly.
  2. A single runtime upgrade more or less occupies an entire relay chain block, thus it might also affect other parachains, especially if their candidates are also not negligible (due to messages, for example) or they want to upgrade their runtime at the same time.
  3. The signalling of the parachain to notify the relay chain of an upcoming runtime upgrade already contains the upgrade. Therefore the only way to rate limit upgrades is to drop an already distributed update in the size of megabytes, with the result that the parachain misses a block and, more importantly, it will try again with the very next block, until it finally succeeds. If we imagine reducing the capacity for runtime upgrades to, let's say, 1 every 100 relay chain blocks, this results in lots of wasted effort and lost blocks.

We discussed introducing a separate signalling before submitting the actual runtime, but I think we should just go one step further and make upgrades fully off-chain. This also helps bring down deposit costs in a secure way, as we are also actually reducing costs for the network.

Introduce a new UMP message type RequestCodeUpgrade

As part of elastic scaling we are already planning to increase flexibility of UMP messages, we can now use this to our advantage and introduce another UMP message:

enum UMPSignal {
  // For elastic scaling
  OnCore(CoreIndex),
  // For off-chain upgrades
  RequestCodeUpgrade(Hash),
}

We could also make that new message a regular XCM, calling an extrinsic on the relay chain, but we will want to look into that message right after validation on the backers on the node side, making a straightforward semantic message more apt for the purpose.

Handle RequestCodeUpgrade on backers

We will introduce a new request/response protocol for both collators and validators, with the following request/response:

struct RequestBlob {
  blob_hash: Hash,
}

struct BlobResponse {
  blob: Vec<u8>
}

This protocol will be used by backers to request the PVF from collators in the following conditions:

  1. They received a collation sending RequestCodeUpgrade.
  2. They received a collation, but they don't yet have the code that was previously registered on the relay chain. (E.g. disk pruned, new validator)

In case they received the collation via PoV distribution instead of from the collator itself, they will use the exact same message to fetch from the validator they got the PoV from.
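
A minimal sketch of how a backer might apply the two conditions above and validate a response. Everything beyond `RequestBlob` and `BlobResponse` is an assumption: `BlobStore` and `hashes_to_fetch` are hypothetical helpers, and the 64-bit `DefaultHasher` digest is only a stand-in for the real cryptographic code hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash as _, Hasher};

/// Stand-in for a real 32-byte code hash (assumption for this sketch).
pub type Hash = u64;

pub fn blob_hash(blob: &[u8]) -> Hash {
    let mut h = DefaultHasher::new();
    blob.hash(&mut h);
    h.finish()
}

pub struct RequestBlob { pub blob_hash: Hash }
pub struct BlobResponse { pub blob: Vec<u8> }

/// Hypothetical local PVF store, keyed by hash.
#[derive(Default)]
pub struct BlobStore { blobs: HashMap<Hash, Vec<u8>> }

impl BlobStore {
    pub fn insert(&mut self, blob: Vec<u8>) -> Hash {
        let h = blob_hash(&blob);
        self.blobs.insert(h, blob);
        h
    }

    pub fn has(&self, hash: Hash) -> bool { self.blobs.contains_key(&hash) }

    /// Responder side: serve the blob if we have it.
    pub fn answer(&self, req: &RequestBlob) -> Option<BlobResponse> {
        self.blobs.get(&req.blob_hash).map(|b| BlobResponse { blob: b.clone() })
    }

    /// Requester side: keep the blob only if it matches the requested hash.
    pub fn accept(&mut self, req: &RequestBlob, resp: BlobResponse) -> bool {
        if blob_hash(&resp.blob) != req.blob_hash { return false; }
        self.blobs.insert(req.blob_hash, resp.blob);
        true
    }
}

/// Hashes a backer should request for a collation: the code named in a
/// `RequestCodeUpgrade` signal (condition 1) and the currently registered
/// code (condition 2), each only if missing locally.
pub fn hashes_to_fetch(store: &BlobStore, requested_upgrade: Option<Hash>,
                       current_code: Hash) -> Vec<Hash> {
    let mut wanted = Vec::new();
    for h in requested_upgrade.into_iter().chain(std::iter::once(current_code)) {
        if !store.has(h) && !wanted.contains(&h) { wanted.push(h); }
    }
    wanted
}
```

Checking the hash on receipt means a backer never has to trust the peer that served the blob, whether that peer is a collator or another validator.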

Get the new code to all validators

Once the candidate issuing RequestCodeUpgrade got backed on chain, validators will start fetching the code from the backers as part of availability distribution.

To mitigate attack vectors we should make sure that serving requests for code can be treated as low priority requests. Thus I am suggesting the following scheme:

Validators will notice via a runtime API (TODO: Define) that a new code has been requested; the API will return the Hash and a counter, which starts at some configurable value, e.g. 10. The validators are now aware of the new hash and start fetching, but they don't have to wait for the fetch to succeed to sign their bitfield.

Then on each further candidate from that chain that counter gets decremented. Validators which have not yet succeeded fetching will now try again. This game continues until the counter reaches 0. Now it is mandatory to have the code in order to sign a 1 in the bitfield.

PVF pre-checking will happen after the candidate which brought the counter to 0 has been successfully included and thus is also able to assume that 2/3 of the validators have the code.

This scheme serves two purposes:

  1. Fetching can happen over a longer period of time with low priority. E.g. if we waited for the PVF at the very first availability distribution, this might actually affect liveness of other chains on the same core. Distributing megabytes of data to a thousand validators might take a bit. Thus this helps isolate parachains from each other.
  2. By configuring the initial counter value we can affect how much an upgrade costs. E.g. forcing the parachain to produce 10 blocks, means 10x the cost for issuing an update. If too frequent upgrades ever become a problem for the system, we have a knob to make them more costly.
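
The counter scheme can be sketched as a tiny state machine. This is an illustration only: `PendingUpgrade` and its methods are hypothetical names, and the runtime API that would expose this state is still marked TODO in this RFC.

```rust
pub type Hash = [u8; 32];

/// Hypothetical view of what the runtime API might return for a parachain
/// that has issued `RequestCodeUpgrade`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PendingUpgrade {
    pub code_hash: Hash,
    pub counter: u32,
}

impl PendingUpgrade {
    /// `initial_counter` is the configurable start value, e.g. 10.
    pub fn new(code_hash: Hash, initial_counter: u32) -> Self {
        Self { code_hash, counter: initial_counter }
    }

    /// Called for each further candidate from the same parachain.
    pub fn on_further_candidate(&mut self) {
        self.counter = self.counter.saturating_sub(1);
    }

    /// Once the counter reaches 0, having the new code becomes mandatory.
    pub fn code_is_mandatory(&self) -> bool {
        self.counter == 0
    }

    /// Whether a validator may sign a 1 bit for a candidate of this chain.
    pub fn may_sign_available(&self, has_chunk: bool, has_new_code: bool) -> bool {
        has_chunk && (!self.code_is_mandatory() || has_new_code)
    }
}
```

While the counter is above 0, a validator that is still fetching the code can keep signing, so a slow PVF download does not stall availability for the core.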

On-chain code upgrade process

First when a candidate is backed we need to make the new hash available (together with a counter) via a runtime API so validators in availability distribution can check for it and fetch it if changed (see previous section). For performance reasons, I think we should not do an additional call, but replace the existing one with one containing the new additional information (Option<(Hash, Counter)>).

Once the candidate gets included (counter 0), the hash is given to pre-checking and only after pre-checking succeeded (and a full session passed) it is finally enacted and the parachain can switch to the new code. (Same process as it used to be.)

Handling new validators

Backers

If a backer receives a collation for a parachain for which it does not yet have the code as enacted on chain (see "On-chain code upgrade process"), it will use the above request/response protocol to fetch it from whomever it received the collation from.

Availability Distribution

Validators in availability distribution will be changed to only sign a 1 in the bitfield of a candidate if they not only have the chunk, but also the currently active PVF. They will fetch it from backers in case they don't have it yet.

How do other parties get hold of the PVF?

Two ways:

  1. Discover collators via relay chain DHT and request from them: Preferred way, as it is less load on validators.
  2. Request from validators, which will serve on a best effort basis.

Pruning

We covered how validators get hold of new code, but when can they prune old ones? In principle it is not an issue if some validators prune code, because:

  1. We changed it so that a candidate is not deemed available if validators were not able to fetch the PVF.
  2. Backers can always fetch the PVF from collators as part of the collation fetching.

But the majority of validators should always keep the latest code of any parachain and only prune the previous one, once the first candidate using the new code got finalized. This ensures that disputes will always be able to resolve.
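
The pruning rule can be sketched as follows. `ParaCode` and its methods are hypothetical names, and the `u64` hash is a stand-in; the sketch only captures the rule that older code becomes prunable once a candidate using the latest code is finalized.

```rust
/// Stand-in for a real code hash (assumption for this sketch).
pub type Hash = u64;

/// PVFs a validator keeps on disk for one parachain.
#[derive(Debug)]
pub struct ParaCode {
    /// Latest code enacted on chain; never pruned.
    pub current: Hash,
    /// Older code that may still be needed to resolve disputes.
    pub retired: Vec<Hash>,
}

impl ParaCode {
    pub fn new(current: Hash) -> Self {
        Self { current, retired: Vec::new() }
    }

    /// A new code got enacted: keep the old one around for now.
    pub fn on_enacted(&mut self, new_code: Hash) {
        let old = std::mem::replace(&mut self.current, new_code);
        self.retired.push(old);
    }

    /// A candidate got finalized. Returns the hashes that may now be pruned:
    /// everything older, but only once the finalized candidate used the
    /// latest code.
    pub fn on_finalized_candidate(&mut self, code_used: Hash) -> Vec<Hash> {
        if code_used == self.current {
            std::mem::take(&mut self.retired)
        } else {
            Vec::new()
        }
    }
}
```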

Drawbacks

The major drawback of this solution is the same as for any solution that moves work off-chain: it adds complexity to the node. E.g. nodes needing the PVF need to store them separately, together with their own pruning strategy as well.

Testing, Security, and Privacy

Implementations adhering to this RFC will respond to PVF requests with the actual PVF, if they have it. Requesters will persist received PVFs on disk until they are replaced by a new one. Implementations must not be lazy here: if validators only fetched the PVF when needed, they could be prevented from participating in disputes.

Validators should treat incoming requests for PVFs in general with rather low priority, but should prefer fetches from other validators over requests from random peers.

Given that we are altering what set bits in the availability bitfields mean (not only chunk, but also PVF available), it is important to have enough validators upgraded before we allow collators to make use of the new runtime upgrade mechanism. Otherwise we would risk disputes not being able to succeed.

This RFC has no impact on privacy.

Performance, Ergonomics, and Compatibility

Performance

This proposal lightens the load on the relay chain and is thus in general beneficial for the performance of the network. This is achieved by the following:

  1. Code upgrades are still propagated to all validators, but only once, not twice (First statements, then via the containing relay chain block).
  2. Code upgrades are only communicated to validators and other nodes which are interested, not any full node as it has been before.
  3. Relay chain block space is preserved. Previously we could only do one runtime upgrade per relay chain block, occupying almost all of the blockspace.
  4. Signalling an upgrade no longer contains the upgrade, hence if we need to push back on an upgrade for whatever reason, no network bandwidth and core time gets wasted because of this.

Ergonomics

End users are only affected by better performance and more stable block times. Parachains will need to implement the introduced request/response protocol and adapt to the new signalling mechanism via an UMP message, instead of sending the code upgrade directly.

For parachain operators we should emit events on initiated runtime upgrade and on each block, reporting the current counter and how many blocks to go until the upgrade gets passed to pre-checking. This is especially important for on-demand chains or bulk users not occupying a full core. Furthermore, the behaviour of requiring multiple blocks to fully initiate a runtime upgrade needs to be well documented.

Compatibility

We will continue to support the old mechanism for code upgrades for a while, but will start to impose stricter limits over time, with the number of registered parachains going up. With those limits in place parachains not migrating to the new scheme might be having a harder time upgrading and will miss more blocks. I guess we can be lenient for a while still, so the upgrade path for parachains should be rather smooth.

In total the protocol changes we need are:

For validators and collators:

  1. New request/response protocol for fetching PVF data from collators and validators.
  2. New UMP message type for signalling a runtime upgrade.

Only for validators:

  1. New runtime API for determining to-be-enacted code upgrades.
  2. Different behaviour of bitfields (only sign a 1 bit, if validator has chunk + "hot" PVF).
  3. Altered behaviour in availability-distribution: Fetch missing PVFs.

Prior Art and References

Off-chain runtime upgrades have been discussed before; the architecture described here is simpler though, as it piggybacks on already existing features, namely:

  1. availability-distribution: No separate "I have code" messages anymore.
  2. Existing pre-checking.

https://github.com/paritytech/polkadot-sdk/issues/971

Unresolved Questions

  1. What about the initial runtime, shall we make that off-chain as well?
  2. Good news, at least after the first upgrade, no code will be stored on chain any more, this means that we also have to redefine the storage deposit now. We no longer charge for chain storage, but validator disk storage -> Should be cheaper. Solution to this: Not only store the hash on chain, but also the size of the data. Then define a price per byte and charge that, but:
    • how do we charge - I guess deposit has to be provided via other means, runtime upgrade fails if not provided.
    • how do we signal to the chain that the code is too large for it to reject the upgrade? Easy: Make available and vote nay in pre-checking.
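
The per-byte charging idea can be sketched as simple arithmetic. This is not a resolved design: `CodeInfo`, `required_deposit`, `upgrade_allowed`, and the `price_per_byte` parameter are hypothetical, and how the deposit is actually provided remains an open question above.

```rust
/// What the relay chain would store per code under this idea:
/// the hash plus the size of the data.
pub struct CodeInfo {
    pub hash: [u8; 32],
    pub size_bytes: u32,
}

/// Deposit owed for keeping `info` on validator disks.
pub fn required_deposit(info: &CodeInfo, price_per_byte: u128) -> u128 {
    u128::from(info.size_bytes).saturating_mul(price_per_byte)
}

/// The runtime upgrade fails if the deposit was not provided.
pub fn upgrade_allowed(info: &CodeInfo, price_per_byte: u128, deposit_provided: u128) -> bool {
    deposit_provided >= required_deposit(info, price_per_byte)
}
```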

TODO: Fully resolve these questions and incorporate in RFC text.

Further Hardening

By no longer having code upgrade go through the relay chain, occupying a full relay chain block, the impact on other parachains is already greatly reduced, if we make distribution and PVF pre-checking low-priority processes on validators. The only thing attackers might be able to do is delay upgrades of other parachains.

This seems like a problem to be solved once we actually see it as a problem in the wild (and it can already be mitigated by adjusting the counter). The good thing is that we have all the ingredients to go further if need be. Signalling no longer actually includes the code, hence there is no need to reject the candidate: The parachain can make progress even if we choose not to immediately act on the request, and no relay chain resources are wasted either.

We could for example introduce another UMP Signalling message RequestCodeUpgradeWithPriority which not just requests a code upgrade, but also offers some DOT to get ranked up in a queue.

Generalize this off-chain storage mechanism?

Making this storage mechanism more general purpose is worth thinking about. E.g. by resolving the above "fee" question, we might also be able to resolve the pruning question in a more generic way and thus could indeed open this storage facility for other purposes as well. E.g. smart contracts: the PoV would only need to reference contracts by hash, while the actual contract code is stored on validators and collators and thus no longer needs to be part of the PoV.

A possible avenue would be to change the response to:

enum BlobResponse {
  Blob(Vec<u8>),
  Blobs(MerkleTree),
}

With this the hash specified in the request can also be a merkle root and the responder will respond with the entire merkle tree (only hashes, no payload). Then the requester can traverse the leaf hashes and use the same request response protocol to request any locally missing blobs in that tree.

One leaf would, for example, be the PVF; others could be smart contracts. With a properly specified format (e.g. which leaf is the PVF?), what we get here is that a parachain can update not only its PVF, but also additional data, incrementally. E.g. adding another smart contract does not require resubmitting the entire PVF to validators: only the root hash on the relay chain gets updated, then validators fetch the merkle tree and only fetch any missing leaves. That additional data could be made available to the PVF via a to-be-added host function. The nice thing about this approach is that, while we can upgrade incrementally, lifetime is still tied to the PVF and we get all the same guarantees. Assuming the validators store blobs by hash, we even get disk sharing if multiple parachains use the same data (e.g. same smart contracts).
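
The incremental fetch can be sketched as below. The shape of `MerkleTree` is an assumption (a flat list of leaf hashes; inner nodes and root verification are omitted), the `u64` hash is a stand-in, and `blobs_to_request` is a hypothetical helper.

```rust
use std::collections::HashSet;

/// Stand-in for a real hash (assumption for this sketch).
pub type Hash = u64;

/// Simplified tree: only the leaf hashes, no payload.
pub struct MerkleTree {
    pub leaves: Vec<Hash>,
}

pub enum BlobResponse {
    Blob(Vec<u8>),
    Blobs(MerkleTree),
}

/// Given a response, list the leaf blobs that still have to be requested
/// with the same request/response protocol.
pub fn blobs_to_request(resp: &BlobResponse, local: &HashSet<Hash>) -> Vec<Hash> {
    match resp {
        // A plain blob is the payload itself; nothing further to fetch.
        BlobResponse::Blob(_) => Vec::new(),
        // For a tree, only leaves missing locally need fetching.
        BlobResponse::Blobs(tree) => tree
            .leaves
            .iter()
            .copied()
            .filter(|leaf| !local.contains(leaf))
            .collect(),
    }
}
```

A validator that already holds the PVF leaf and most contracts would, after a root hash update, request only the newly added leaf.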

RFC-0106: Remove XCM fees mode

Start Date: 23 July 2024
Description: Remove the SetFeesMode instruction and fees_mode register from XCM
Authors: Francisco Aguirre

Summary

The SetFeesMode instruction and the fees_mode register allow for the existence of JIT withdrawal. JIT withdrawal complicates the fee mechanism and leads to bugs and unexpected behaviour. The proposal is to remove said functionality. This is another effort to simplify fee handling in XCM.

Motivation

The JIT withdrawal mechanism creates bugs such as not being able to get fees when all assets are put into holding and none left in the origin location. This is a confusing behavior, since there are funds for fees, just not where the XCVM wants them. The XCVM should have only one entrypoint to fee payment, the holding register. That way there is also less surface for bugs.
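
The failure case described above can be illustrated with a toy model. This is not the XCM executor: `Machine`, `pay_fee`, and `FeeError` are hypothetical names, and assets are reduced to a single balance for brevity.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum FeeError { InsufficientFunds }

/// Toy model of the two places fees can come from.
pub struct Machine {
    /// Assets already placed in the holding register.
    pub holding: u128,
    /// Assets still in the origin's account.
    pub origin_balance: u128,
    /// Models the fees mode set by `SetFeesMode`.
    pub jit_withdraw: bool,
}

impl Machine {
    pub fn pay_fee(&mut self, fee: u128) -> Result<(), FeeError> {
        // With JIT withdrawal, fees bypass holding and come from the origin.
        let source = if self.jit_withdraw {
            &mut self.origin_balance
        } else {
            &mut self.holding
        };
        *source = source.checked_sub(fee).ok_or(FeeError::InsufficientFunds)?;
        Ok(())
    }
}
```

With all funds in holding and none left at the origin, `pay_fee` fails in JIT mode even though enough assets exist; with holding as the single entrypoint, the same payment succeeds.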

Stakeholders

  • Runtime Users
  • Runtime Devs
  • Wallets
  • dApps

Explanation

The SetFeesMode instruction will be removed. The Fees Mode register will be removed.

Drawbacks

Users will have to make sure to put enough assets in WithdrawAsset, where previously some things might have been charged directly from their accounts. This leads to more predictable behaviour, though, so it will only be a drawback for a minority of users.

Testing, Security, and Privacy

Implementations and benchmarking must change for most existing pallet calls that send XCMs to other locations.

Performance, Ergonomics, and Compatibility

Performance

Performance will be improved since unnecessary checks will be avoided.

Ergonomics

JIT withdrawal was a way of side-stepping the regular flow of XCM programs. By removing it, the spec is simplified but now old use-cases have to work with the original intended behaviour, which may result in more implementation work.

Ergonomics for users will undoubtedly improve since the system is more predictable.

Compatibility

Existing programs in the ecosystem will break. The instruction should be deprecated as soon as this RFC is approved (but still fully supported), then removed in a subsequent XCM version (probably deprecate in v5, remove in v6).

Prior Art and References

The previous RFC PR on the xcm-format repo, before XCM RFCs were moved to fellowship RFCs: https://github.com/polkadot-fellows/xcm-format/pull/57.

Unresolved Questions

None.

The new generic fees mechanism is related to this proposal and further stimulates it as the JIT withdraw fees mechanism will become useless anyway.

RFC-0111: Pure Proxy Replication

Start Date: 12 Aug 2024
Description: Replication of pure proxy account ownership to a remote chain
Authors: @muharem @xlc

Summary

This RFC proposes a solution to replicate an existing pure proxy from one chain to others. The aim is to address the current limitations where pure proxy accounts, which are keyless, cannot have their proxy relationships recreated on different chains. This leads to issues where funds or permissions transferred to the same keyless account address on chains other than its origin chain become inaccessible.

Motivation

A pure proxy is a new account created by a primary account. The primary account is set as a proxy for the pure proxy account, managing it. Pure proxies are keyless and non-reproducible, meaning they lack a private key and have an address derived from a preimage determined by on-chain logic. More on pure proxies can be found here.

For the purpose of this document, we define a keyless account as a "pure account", the controlling account as a "proxy account", and the entire relationship as a "pure proxy".

The relationship between a pure account (e.g., account ID: pure1) and its proxy (e.g., account ID: alice) is stored on-chain (e.g., parachain A) and currently cannot be replicated to another chain (e.g., parachain B). Because the account pure1 is keyless and its proxy relationship with alice is not replicable from the parachain A to the parachain B, alice does not control the pure1 account on the parachain B.

Although this behaviour is not promised, users and clients often mistakenly expect alice to control the same pure1 account on the parachain B. As a result, assets transferred to the account or permissions granted for it are inaccessible. Several factors contribute to this misuse:

  • regular accounts on different parachains with the same account ID are typically accessible for the owner and controlled by the same private key (e.g., within System Parachains);
  • users and clients do not distinguish between keyless and regular accounts;
  • multisig account IDs are reused across different chains even when a member of the multisig is a pure account;
  • users may prefer an account with a registered identity (e.g. for a cross-chain treasury spend proposal), even if the account is keyless.

Given that these mistakes are likely, it is necessary to provide a solution to either prevent them or enable access to a pure account on a target chain.

Stakeholders

Runtime Users, Runtime Devs, wallets, cross-chain dApps.

Explanation

One possible solution is to allow a proxy to create or replicate a pure proxy relationship for the same pure account on a target chain. For example, Alice, as the proxy of the pure1 pure account on parachain A, should be able to set a proxy for the same pure1 account on parachain B.

To minimise security risks, the parachain B should grant the parachain A the least amount of permission necessary for the replication. First, Parachain A claims to Parachain B that the operation is commanded by the pure account, and thus by its proxy, and second, provides proof that the account is keyless.

The replication process will be facilitated by XCM, with the first claim made using the DescendOrigin instruction. The replication call on parachain A would require a signed origin by the pure account and construct an XCM program for parachain B, where it first descends the origin, resulting in the ParachainA/AccountId32(pure1) origin location on the receiving side.

To prove that the pure account is keyless, the client must provide the initial preimage used by the chain to derive the pure account. Parachain A verifies it and sends it to parachain B with the replication request.

We can draft a pallet extension for the proxy pallet, which needs to be initialised on both sides to enable replication:

// Simplified version to illustrate the concept.
mod pallet_proxy_replica {
  /// The part of the pure account preimage that has to be provided by a client.
  struct Witness {
    /// Pure proxy spawner
    spawner: AccountId,
    /// Disambiguation index
    index: u16,
    /// The block height at which the pure account was created.
    block_number: BlockNumber,
    /// The extrinsic index at which the pure account was created.
    ext_index: u32,
    // Part of the preimage, but constant.
    // proxy_type: ProxyType::Any,
  }
  // ...

  /// The replication call to be initiated on the source chain.
  // Simplified version, the XCM part will be abstracted by the `Config` trait.
  fn replicate(origin: SignedOrigin, witness: Witness, proxy: xcm::Location) -> ... {
    // The call must be signed by the pure account itself.
    let pure = ensure_signed(origin)?;
    // Prove that the signer is keyless by re-deriving it from the witness.
    ensure!(pure == proxy_pallet::derive_pure_account(&witness), Error::NotPureAccount);
    let xcm = vec![
      // Results in the `ParachainA/AccountId32(pure1)` origin on the receiving side.
      DescendOrigin(pure),
      Transact(
        // …
        origin_kind: OriginKind::Xcm,
        call: pallet_proxy_replica::create(witness, proxy).encode(),
      )
    ];
    xcmTransport::send(xcm)?;
  }
  // …

  /// The call initiated by the source chain on the receiving chain.
  // `Config::CreateOrigin` - generally open for whitelisted parachain IDs and
  // converts `Origin::Xcm(ParachainA/AccountId32(pure1))` to `AccountID(pure1)`.
  fn create(origin: Config::CreateOrigin, witness: Witness, proxy: xcm::Location) -> ... {
    let pure = T::CreateOrigin::ensure_origin(origin)?;
    // The receiving chain repeats the keyless check instead of trusting the sender.
    ensure!(pure == proxy_pallet::derive_pure_account(&witness), Error::NotPureAccount);
    proxy_pallet::create_pure_proxy(pure, proxy);
  }
}

Drawbacks

There are two disadvantages to this approach:

  • The receiving chain has to trust the sending chain's claim that the account controlling the pure account has commanded the replication.
  • Clients must obtain witness data.

We could eliminate the first disadvantage by allowing only the spawner of the pure proxy to recreate the pure proxies, if they sign the transaction on a remote chain and supply the witness/preimage. Since the preimage of a pure account includes the account ID of the spawner, we can verify that the account signing the transaction is indeed the spawner of the given pure account. However, this approach would grant exclusive rights to the spawner over the pure account, which is not a property of pure proxies at present. This is why it's not an option for us.

As an alternative to requiring clients to provide witness data, we could label pure accounts on the source chain and trust that label on the receiving chain. However, this would require the receiving chain to place greater trust in the source chain. If the source chain is compromised, any type of account on the trusting chain could also be compromised.

A conceptually different solution would be to not implement replication of pure proxies and instead inform users that ownership of a pure proxy on one chain does not imply ownership of the same account on another chain. This solution seems complex, as it would require UIs and clients to adapt to this understanding. Moreover, mistakes would likely remain unavoidable.

Testing, Security, and Privacy

Each chain expressly authorizes another chain to replicate its pure proxies, accepting the inherent risk of that chain potentially being compromised. This authorization allows a malicious actor from the compromised chain to take control of any pure proxy account on the chain that granted the authorization. However, this is limited to pure proxies that originated from the compromised chain if they have a chain-specific seed within the preimage.

There is a security issue, not introduced by the proposed solution but worth mentioning. The same spawner can create pure accounts with the same address on different chains, each controlled by a different account. This is possible because the current preimage version of the proxy pallet does not include any non-reproducible, chain-specific data, and elements like block numbers and extrinsic indexes can be reproduced with some effort. This issue could be addressed by adding a chain-specific seed into the preimages of pure accounts.

Performance, Ergonomics, and Compatibility

Performance

The replication is facilitated by XCM, which adds some additional load to the communication channel. However, since the number of replications is not expected to be large, the impact is minimal.

Ergonomics

The proposed solution does not alter any existing interfaces. It does require clients to obtain the witness data, which should not be an issue with the support of an indexer.

Compatibility

None.

Prior Art and References

None.

Unresolved Questions

None.

  • Pure Proxy documentation - https://wiki.polkadot.network/docs/learn-proxies-pure

RFC-0112: Compress the State Response Message in State Sync

Start Date: 14 August 2024
Description: Compress the state response message to reduce the data transfer during the state syncing
Authors: Liu-Cheng Xu

Summary

This RFC proposes compressing the state response message during the state syncing process to reduce the amount of data transferred.

Motivation

State syncing can require downloading several gigabytes of data, particularly for blockchains with large state sizes, such as Astar, which has a state size exceeding 5 GiB (https://github.com/AstarNetwork/Astar/issues/1110). This presents a significant challenge for nodes with slower network connections. Additionally, the current state sync implementation lacks a persistence feature (https://github.com/paritytech/polkadot-sdk/issues/4), meaning any network disruption forces the node to re-download the entire state, making the process even more difficult.

Stakeholders

This RFC benefits all projects utilizing the Substrate framework, specifically in improving the efficiency of state syncing.

  • Node Operators.
  • Substrate Users.

Explanation

The largest portion of the state response message consists of either CompactProof or Vec<KeyValueStateEntry>, depending on whether a proof is requested (source):

  • CompactProof: When proof is requested, compression yields a lower ratio but remains beneficial, as shown in warp sync tests in the Performance section below.
  • Vec<KeyValueStateEntry>: Without proof, this is theoretically compressible because the entries are generated by iterating the storage sequentially starting from an empty storage key, which means many entries in the message share the same storage prefix, making it ideal for compression.
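
The shared-prefix argument can be made concrete with a small sketch. It does not implement the compression proposed by this RFC; it only counts how many key bytes repeat the previous key's prefix when keys arrive in sorted order, which is the redundancy a general-purpose compressor can exploit. The function names and the sample keys are hypothetical.

```rust
/// Length of the common prefix of two byte strings.
fn common_prefix_len(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b).take_while(|(x, y)| x == y).count()
}

/// Total number of key bytes that merely repeat the prefix of the
/// preceding key, assuming `keys` are in iteration (sorted) order.
fn redundant_prefix_bytes(keys: &[Vec<u8>]) -> usize {
    keys.windows(2)
        .map(|pair| common_prefix_len(&pair[0], &pair[1]))
        .sum()
}

/// Total number of key bytes in the message.
fn total_key_bytes(keys: &[Vec<u8>]) -> usize {
    keys.iter().map(Vec::len).sum()
}
```

With sequentially iterated storage, entries under the same pallet and storage item share a long prefix, so the ratio of redundant to total key bytes is expected to be high. Values are not covered by this sketch.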

Drawbacks

None identified.

Testing, Security, and Privacy

The code changes required for this RFC are straightforward: compress the state response on the sender side and decompress it on the receiver side. Existing sync tests should ensure functionality remains intact.

Performance, Ergonomics, and Compatibility

Performance

This RFC optimizes network bandwidth usage during state syncing, particularly for blockchains with gigabyte-sized states, while introducing negligible CPU overhead for compression and decompression. For example, compressing the state response during a recent Polkadot warp sync (around height #22076653) reduces the data transferred from 530,310,121 bytes to 352,583,455 bytes — a 33% reduction, saving approximately 169 MiB of data.

Performance data is based on this patch, with logs available here.

Ergonomics

None.

Compatibility

No compatibility issues identified.

Prior Art and References

None.

Unresolved Questions

None.

None.

RFC-0114: Introduce secp256r1_ecdsa_verify_prehashed Host Function to verify NIST-P256 elliptic curve signatures

Start Date: 16 August 2024
Description: Host function to verify NIST-P256 elliptic curve signatures.
Authors: Rodrigo Quelhas

Summary

This RFC proposes a new host function, secp256r1_ecdsa_verify_prehashed, for verifying NIST-P256 signatures. The function takes as input the message hash, r and s components of the signature, and the x and y coordinates of the public key. By providing this function, runtime authors can leverage a more efficient verification mechanism for "secp256r1" elliptic curve signatures, reducing computational costs and improving overall performance.

Motivation

The "secp256r1" elliptic curve is standardized by NIST and uses the same calculations as the "secp256k1" elliptic curve with different input parameters. The cost of combined attacks and the security conditions are almost the same for both curves. A host function that verifies "secp256r1" signatures in the runtime brings several benefits. First, the curve is widely used and supported in modern devices and platforms such as Apple's Secure Enclave, WebAuthn, and Android Keystore, which demonstrates user adoption. Second, the host function could enable valuable account abstraction features, allowing more efficient and flexible account management through transactions signed on mobile devices. Because most modern devices and applications rely on the "secp256r1" elliptic curve, the host function enables more efficient verification of device-native transaction signing mechanisms. For example:

  1. Apple's Secure Enclave: A separate "Trusted Execution Environment" in Apple hardware that can sign arbitrary messages and can only be accessed through biometric identification.
  2. WebAuthn: Web Authentication (WebAuthn) is a web standard published by the World Wide Web Consortium (W3C). WebAuthn aims to standardize an interface for authenticating users to web-based applications and services using public-key cryptography. It is supported by almost all modern web browsers.
  3. Android Keystore: Android Keystore is an API that manages private keys and signing methods. Private keys are not exposed to the application when Keystore is used as its signing method. Signing can also be performed in the "Trusted Execution Environment" of the microchip.
  4. Passkeys: Passkeys build on FIDO Alliance and W3C standards. They replace passwords with cryptographic key pairs, which can also be elliptic curve keys.

Stakeholders

  • Runtime Authors

Explanation

This RFC proposes a new host function for runtime authors to leverage a more efficient verification mechanism for "secp256r1" elliptic curve signatures.

Proposed host function signature:

fn ext_secp256r1_ecdsa_verify_prehashed_version_1(
    // r and s components of the signature
    sig: &[u8; 64],
    // message hash
    msg: &[u8; 32],
    // x and y coordinates of the public key
    pub_key: &[u8; 64],
) -> bool;

The host function MUST return true if the signature is valid or false otherwise.
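
As an illustration of the inputs, the following sketch shows how a caller might assemble the two 64-byte arguments from their 32-byte components. The RFC states that the function takes the r and s components and the x and y coordinates; the concatenation order (r followed by s, x followed by y) and the helper name `concat32` are assumptions made for this example, not part of the specification.

```rust
/// Concatenate two 32-byte values into one 64-byte buffer.
/// Assumed layout: `first` occupies bytes 0..32, `second` bytes 32..64.
fn concat32(first: &[u8; 32], second: &[u8; 32]) -> [u8; 64] {
    let mut out = [0u8; 64];
    out[..32].copy_from_slice(first);
    out[32..].copy_from_slice(second);
    out
}

/// Build the `sig` argument from the signature components (assumed order: r, s).
fn signature_bytes(r: &[u8; 32], s: &[u8; 32]) -> [u8; 64] {
    concat32(r, s)
}

/// Build the `pub_key` argument from the point coordinates (assumed order: x, y).
fn public_key_bytes(x: &[u8; 32], y: &[u8; 32]) -> [u8; 64] {
    concat32(x, y)
}
```

A runtime would then pass these buffers, together with the 32-byte message hash, to the host function and act on the returned boolean.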

Drawbacks

N/A

Testing, Security, and Privacy

Security

The changes do not directly affect protocol security; parachains are not required to use the host function.

Performance, Ergonomics, and Compatibility

Performance

N/A

Ergonomics

The host function proposed in this RFC allows parachain runtime developers to use a more efficient verification mechanism for "secp256r1" elliptic curve signatures.

Compatibility

Parachain teams will need to include this host function to upgrade.

Prior Art and References

RFC-0120: Referenda Confirmation by Candle Mechanism

Start Date: 22 March 2024
Description: Proposal to decide polls after confirm period via a mechanism similar to a candle auction
Authors: Pablo Dorado, Daniel Olano

Summary

In an attempt to mitigate risks derived from unwanted behaviours around long decision periods on referenda, this proposal describes how to finalize and decide a result of a poll via a mechanism similar to candle auctions.

Motivation

The referenda protocol provides permissionless and efficient mechanisms that enable governance actors to decide the future of the blockchains around the Polkadot network. However, these mechanisms pose a series of game-theoretic risks. One of them is that an actor can use the public nature of a poll's tally to determine the best point in time to alter the poll in a meaningful way.

While this behaviour is expected under the current design of the referenda logic, the recent extension of ongoing times (up to 1 month) increases the incentives for a bad actor to cause losses to a proposer, reflected as wasted opportunity cost. This otherwise reasonable outcome thus becomes an attack vector and a risk to mitigate, especially when such an attack can compromise critical guarantees of the protocol (such as its upgradeability).

To mitigate this, the underlying referenda mechanisms should incentivize actors to cast their votes on a poll as early as possible. This proposal suggests using a candle auction that is determined right after the confirm period finishes, thus decreasing the chances for actors to alter the results of a poll in the confirming state and instead incentivizing them to cast their votes earlier, in the deciding state.

Stakeholders

  • Governance actors: Tokenholders and Collectives that vote on polls that have this mechanism enabled should be aware this change affects the outcome of failing a poll on its confirm period.
  • Runtime Developers: This change requires runtime developers to change configuration parameters for the Referenda Pallet.
  • Tooling and UI developers: Applications that interact with referenda must update to reflect the new Finalizing state.

Explanation

Currently, the process of a referendum/poll is defined as a sequence within an ongoing state (where accounts can vote), comprising a preparation period, a decision period, and a confirm period. If the poll is passing before the decision period ends, it can move forward to the confirm period and still go back to the decision period if the poll starts failing. Once the decision period ends, a failure of the poll in the confirm period leads to the poll being ultimately rejected.

stateDiagram-v2
    sb: Submission
    pp: Preparation Period
    dp: Decision Period
    cp: Confirmation Period
    state dpd <<choice>>
    state ps <<choice>>
    cf: Approved
    rj: Rejected

    [*] --> sb
    sb --> pp
    pp --> dp: decision period starts
    dp --> cp: poll is passing
    dp --> ps: decision period ends
    ps --> cp: poll is passing
    cp --> dpd: poll fails
    dpd --> dp: decision period not deadlined
    ps --> rj: poll is failing
    dpd --> rj: decision period deadlined
    cp --> cf
    cf --> [*]
    rj --> [*]

This specification proposes three changes to implement this candle mechanism:

  1. This mechanism MUST be enabled via a configuration parameter. Once enabled, the referenda system MAY record the next poll ID from which to start enabling this mechanism. This is to preserve backwards compatibility with currently ongoing polls.

  2. A record of the poll status (whether it is passing or not) is stored once the decision period is finished.

  3. A Finalization period is included as part of the ongoing state. From this point, the poll MUST be immutable.

    This period begins the moment after the confirm period ends and extends the decision for a couple of blocks, until the VRF seed used to determine the candle block can be considered "good enough", that is, not known before the ongoing period (decision/confirmation) was over.

    Once that happens, a random block within the confirm period is chosen, and the decision of approving or rejecting the poll is based on the status immediately before the block where the candle was "lit-off".
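
The candle selection can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposed implementation: the seed is taken as an already-derived `u64`, the confirm period is the half-open block range `[confirm_start, confirm_end)`, the poll status is stored as a sorted list of `(block, passing)` changes, and all names are hypothetical. Reducing the seed with a modulo is a simplification that introduces a slight bias.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Outcome {
    Approved,
    Rejected,
}

/// Pick the candle block inside `[confirm_start, confirm_end)` from a seed.
/// Returns `None` for an empty or inverted range.
fn candle_block(seed: u64, confirm_start: u32, confirm_end: u32) -> Option<u32> {
    let len = confirm_end.checked_sub(confirm_start)?;
    if len == 0 {
        return None;
    }
    Some(confirm_start + (seed % u64::from(len)) as u32)
}

/// Status in effect immediately before `block`, given status changes
/// sorted by block number. `None` if no status was recorded earlier.
fn status_before(changes: &[(u32, bool)], block: u32) -> Option<bool> {
    changes
        .iter()
        .take_while(|(at, _)| *at < block)
        .last()
        .map(|(_, passing)| *passing)
}

/// Decide the poll from the status immediately before the candle block.
fn decide(
    seed: u64,
    confirm_start: u32,
    confirm_end: u32,
    changes: &[(u32, bool)],
) -> Option<Outcome> {
    let candle = candle_block(seed, confirm_start, confirm_end)?;
    let passing = status_before(changes, candle)?;
    Some(if passing { Outcome::Approved } else { Outcome::Rejected })
}
```

Because the candle block is only known after the confirm period is over, a vote that flips the tally late in the confirm period affects the outcome only if the candle happens to fall after it.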

When enabled, the state diagram for the referenda system is the following:

stateDiagram-v2
    sb: Submission
    pp: Preparation Period
    dp: Decision Period
    cp: Confirmation Period
    cds: Finalization
    state dpd <<choice>>
    state ps <<choice>>
    state cd <<choice>>
    cf: Approved
    rj: Rejected

    [*] --> sb
    sb --> pp
    pp --> dp: decision period starts
    dp --> cp: poll is passing
    ps --> cp: poll is passing
    dp --> ps: decision period ends
    ps --> rj: poll is failing
    cp --> dpd: poll fails
    dpd --> cp: decision period over
    dpd --> dp: decision period not over
    cp --> cds: confirmation period ends
    cds --> cd: define moment when candle lit-off
    cd --> cf: poll passed
    cd --> rj: poll failed
    cf --> [*]
    rj --> [*]

Drawbacks

This approach doesn't include a mechanism to determine whether a change of the poll status in the confirming period is due to a legitimate change of mind of the voters, or an exploitation of its aforementioned vulnerabilities (like a sniping attack), instead treating all of them as potential attacks.

This is an issue that can be addressed by additional mechanisms, and heuristics that can help determine the probability of a change of poll status to happen as a result of a legitimate behaviour.

Testing, Security, and Privacy

The implementation of this RFC will be tested on testnets (Paseo and Westend) first. Furthermore, it should be enabled in a canary network (like Kusama) to ensure the behaviours it is trying to address are indeed avoided.

An audit will be required to ensure the implementation doesn't introduce unwanted side effects.

There are no privacy related concerns.

Performance, Ergonomics, and Compatibility

Performance

The added steps introduce some overhead (a pessimization), which is necessary to implement the proposed changes. An implementation MUST exit from the Finalization period as early as possible to minimize this impact.

Ergonomics

This proposal does not alter the interfaces already exposed to developers or end users. However, they must be aware of the additional overhead the new period might incur (which depends on the implemented VRF).

Compatibility

This proposal does not break compatibility with existing interfaces or older versions, but it alters the previous implementation of the referendum processing algorithm.

An acceptable upgrade strategy that can be applied is defining a point in time (block number, poll index) from which to start applying the new mechanism, thus, not affecting the already ongoing referenda.

Prior Art and References

Unresolved Questions

  • How to determine in a statistically meaningful way that a change in the poll status corresponds to an organic behaviour, and not an unwanted, malicious behaviour?

A proposed implementation of this change can be seen on this Pull Request.

RFC-0124: Extrinsic version 5

Start Date: 18 October 2024
Description: Definition and specification of version 5 extrinsics
Authors: George Pisaltu

Summary

This RFC proposes the definition of version 5 extrinsics along with changes to the specification and encoding from version 4.

Motivation

RFC84 introduced the specification of General transactions, a new type of extrinsic besides the Signed and Unsigned variants available previously in version 4. Additionally, RFC99 introduced versioning of transaction extensions through an extra byte in the extrinsic encoding. Both of these changes require an extrinsic format version bump as both the semantics around extensions as well as the actual encoding of extrinsics need to change to accommodate these new features.

Stakeholders

  • Runtime users
  • Runtime devs
  • Wallet devs

Explanation

Changes to extrinsic authorization

The introduction of General transactions allows the authorization of any and all origins through extensions. This means that, with the appropriate extension, General transactions can replicate the same behavior as present-day v4 Signed transactions. Specifically for Polkadot chains, an example implementation for such an extension is VerifySignature, introduced in the Transaction Extension PR3685. Other extensions can be inserted into the extension pipeline to authorize different custom origins. Therefore, a Signed extrinsic variant is redundant to a General one strictly in terms of user functionality and could eventually be deprecated and removed.

Encoding format for version 5

As with version 4, the encoded extrinsic v5 is a SCALE encoded vector of bytes (u8), therefore starting with the encoded length of the following bytes in compact format. The leading byte after the length determines the version and type of extrinsic, as specified by RFC84. For reasons mentioned above, this RFC removes the Signed variant for v5 extrinsics.

For Bare extrinsics, the following bytes will just be the encoded call and nothing else.

For General transactions, as stated in RFC99, an extension version byte must be added to the extrinsic format. This byte should allow runtimes to expose more than one set of extensions which can be used for a transaction. As far as the v5 extrinsic encoding is concerned, this extension byte should be encoded immediately after the leading encoding byte. The extension version byte should be included in payloads to be signed by all extensions configured by runtime devs to ensure a user's extension version choice cannot be altered by third parties.

After the extension version byte, the extensions will be encoded next, followed by the call itself.

A quick visualization of the encoding:

  • Bare extrinsics: (extrinsic_encoded_len, 0b0000_0101, call)
  • General transactions: (extrinsic_encoded_len, 0b0100_0101, extension_version_byte, extensions, call)
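
The layout above can be sketched in a few lines of Rust. This is an illustrative sketch rather than the reference encoder: the call and the extensions are assumed to be already SCALE-encoded byte slices, the helper names are hypothetical, and the compact length prefix is implemented by hand to keep the example free of external crates.

```rust
/// Extrinsic format version described in this section (lower six bits).
const EXTRINSIC_VERSION: u8 = 5;
/// Type bits of the leading byte (top two bits), matching the bytes listed above.
const BARE: u8 = 0b0000_0000;
const GENERAL: u8 = 0b0100_0000;

/// SCALE compact encoding of a length that fits in a `u32`.
fn compact_len(len: u32) -> Vec<u8> {
    match len {
        // Single-byte mode: value in the upper six bits.
        0..=0x3F => vec![(len as u8) << 2],
        // Two-byte mode, little-endian, low bits 0b01.
        0x40..=0x3FFF => (((len as u16) << 2) | 0b01).to_le_bytes().to_vec(),
        // Four-byte mode, little-endian, low bits 0b10.
        0x4000..=0x3FFF_FFFF => ((len << 2) | 0b10).to_le_bytes().to_vec(),
        // Big-integer mode: marker byte followed by four little-endian bytes.
        _ => {
            let mut out = vec![0b11];
            out.extend_from_slice(&len.to_le_bytes());
            out
        }
    }
}

/// Prefix the body with its compact-encoded length.
fn with_length_prefix(body: Vec<u8>) -> Vec<u8> {
    let mut out = compact_len(body.len() as u32);
    out.extend(body);
    out
}

/// Bare extrinsic: (len, 0b0000_0101, call).
fn encode_bare(call: &[u8]) -> Vec<u8> {
    let mut body = vec![BARE | EXTRINSIC_VERSION];
    body.extend_from_slice(call);
    with_length_prefix(body)
}

/// General transaction: (len, 0b0100_0101, extension_version_byte, extensions, call).
fn encode_general(extension_version: u8, extensions: &[u8], call: &[u8]) -> Vec<u8> {
    let mut body = vec![GENERAL | EXTRINSIC_VERSION, extension_version];
    body.extend_from_slice(extensions);
    body.extend_from_slice(call);
    with_length_prefix(body)
}
```

For a two-byte call `[0xAA, 0xBB]`, the Bare form is `[0x0C, 0x05, 0xAA, 0xBB]`: the length 3 in compact single-byte mode, the leading byte, then the call.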

Signatures on Polkadot in General transactions

In order to run a transaction with a signed origin in extrinsic version 5, a user must create the transaction with an instance of at least one extension responsible for authorizing Signed origins with a provided signature.

As stated before, PR3685 comes with a Transaction Extension which replicates the current Signed transactions in v5 extrinsics, namely VerifySignature. I will use this extension as an example on how to replicate current Signed transaction functionality in the new v5 extrinsic format, though the runtime logic is not constrained to this particular implementation.

This extension leverages the new inherited implication functionality introduced in TransactionExtension and creates a payload to be signed using the data of all extensions after itself in the extension pipeline. This extension can be configured to accept a MultiSignature, which makes it compatible with all signature types currently used in Polkadot.

In the context of using an extension such as VerifySignature, for example, to replicate current Signed transaction functionality, the steps to generate the payload to be signed would be:

  1. The extension version byte, call, extension and extension implicit should be encoded (by "extension" and its implicit we mean only the data associated with extensions that follow this one in the composite extension type);
  2. The result of the encoding should then be hashed using the BLAKE2_256 hasher;
  3. The result of the hash should then be signed with the signature type specified in the extension definition.

// Step 1: encode the bytes
let encoded = (extension_version_byte, call, transaction_extension, transaction_extension_implicit).encode();
// Step 2: hash them
let payload = blake2_256(&encoded[..]);
// Step 3: sign the payload
let signature = keyring.sign(&payload[..]);

Summary of changes in version 5

In order to minimize the number of changes to the extrinsic format version and also to help all consumers downstream in the transition period between these extrinsic versions, we should:

  • Remove the Signed variant starting with v5 extrinsics
  • Add the General variant starting with v5 extrinsics
  • Enable runtimes to support both v4 and v5 extrinsics

Drawbacks

The metadata will have to accommodate two distinct extrinsic format versions at a given point in time in order to provide the new functionality in a non-breaking way for users and tooling.

Although having to support multiple extrinsic versions in metadata involves extra work, the change is ultimately an improvement to metadata and the extra functionality may be useful in other future scenarios.

Testing, Security, and Privacy

There is no impact on testing, security or privacy.

Performance, Ergonomics, and Compatibility

This change makes the authorization through signatures configurable by runtime devs in version 5 extrinsics, as opposed to version 4 where the signing payload algorithm and signatures were hardcoded. This moves the responsibility of ensuring proper authentication through TransactionExtension to the runtime devs, but a sensible default which closely resembles the present day behavior will be provided in VerifySignature.

Performance

There is no performance impact.

Ergonomics

Tooling will have to adapt to be able to tell which authorization scheme is used by a particular transaction by decoding the extension and checking which particular TransactionExtension in the pipeline is enabled to do the origin authorization. Previously, this was done by simply checking whether the transaction is signed or unsigned, as there was only one method of authentication.

Compatibility

As long as extrinsic version 4 is still exposed in the metadata when version 5 is introduced, the changes will not break existing infrastructure. This should give enough time for tooling to support version 5 and to remove version 4 in the future.

Prior Art and References

This is a result of the work in Extrinsic Horizon and RFC99.

Unresolved Questions

None.

Following this change, extrinsic version 5 will be introduced as part of the Extrinsic Horizon effort, which will shape future work.

RFC-0138: Election mechanism for invulnerable collators on system chains

Start Date: 28 January 2025
Description: Mechanism for electing invulnerable collators on system chains.
Authors: George Pisaltu

Summary

The current election mechanism for permissionless collators on system chains was introduced in RFC-7. This RFC proposes a mechanism to facilitate replacements in the invulnerable sets of system chains by breaking down barriers that exist today.

Motivation

Following RFC-7 and the introduction of the collator election mechanism, anyone can now collate on a system chain on the permissionless slots, but the invulnerable set has been a contentious issue among current collators on system chains as the path towards an invulnerable slot is almost impossible to pursue. From a technical standpoint, nothing is preventing a permissionless collator, or anyone for that matter, from submitting a referendum to remove one collator from the invulnerable set and add themselves in their place. However, as it quickly becomes obvious, such a referendum would be very difficult to pass under normal circumstances.

The first reason this would be contentious is that there is no significant difference between collators with good performance. There is no reasonable way to keep track of arbitrary data on-chain which could clearly and consistently distinguish one collator from another. Collators that perform well propose blocks when they are supposed to and that is what is being tracked on-chain. Any other metrics for performance are arbitrary as far as the runtime logic is concerned and should be reasoned upon by humans using public discussion and a referendum.

The second reason for this is the inherently social aspect of this action. Even just proposing the referendum would be perceived as an attack on a specific collator in the set, singling them out, when in reality the proposer likely just wants to be part of the set and doesn't necessarily care who is kicked. In order to consolidate their position, the other invulnerables will rally behind the one that was challenged and the bid to replace one invulnerable will probably fail.

Existing invulnerables have a vested interest in protecting any other invulnerable from such attacks so that they themselves would be protected if need be. The existing collator set has already demonstrated that they can work together and subvert the free market mechanism offered by the runtime when they agreed to not outbid each other on permissionless slots after the new collator selection mechanism was introduced.

The existing invulnerables on a given system chain are there for a reason; they have demonstrated reliability in the past and were rewarded by governance with invulnerable slots and a bounty to cover their expenses. This means they have a solid reputation and a strong say in governance over matters related to collation. The optics of a permissionless collator actively challenging an invulnerable, even when it's justified, combined with the support of other invulnerables, make the invulnerable set de facto immutable.

While there should be strong guarantees of stability for invulnerables, they should not be a closed circle. The aim of this RFC is to provide a clear, reasonable, fair, and socially acceptable path for a permissionless collator with a proven track record to become an invulnerable while preserving the stability of the invulnerable set of a system parachain.

Stakeholders

  • Infrastructure providers (people who run validator/collator nodes)
  • Polkadot Treasury

Explanation

Proposal

This RFC proposes a periodic, mandatory, round-robin, two-round election mechanism for invulnerables.

How it works

The election should be implemented on top of the current logic in the collator-selection pallet. In this mechanism, candidates would register for the first round of the next election by placing deposits.

When the period between elections passes, the first round of the election starts with every candidate that registered, excluding the incumbent, as an option on the ballot. Votes should be expressed using tokens which should not be available for other transactions while the election is ongoing in order to introduce some opportunity cost to voting. After a certain amount of time passes, the election closes and the candidate who wins the first round of the election advances to the second and final round of the election. The deposits held for voting in the first round must be released before the second round.

In the second round of the election, the winner of the first round has the chance to replace the invulnerable currently holding the slot. A referendum is submitted to replace the incumbent with the winner of the first round of the election, turning the second round of the election into a conviction-voting compatible referendum. If the referendum fails, the incumbent keeps their slot.

The period between elections should be configurable at the collator-selection pallet level. A full election cycle ends once the pallet has held an election for every single invulnerable slot. To qualify for the ballot, candidates must have been collating for at least one period from a permissionless slot or be the incumbent.
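The round-robin rotation and the ballot qualification rule can be sketched as follows. This is a minimal illustration, not the pallet's design: the `ElectionSchedule` type, its field names, and the `BlockNumber` alias are hypothetical, and the sketch assumes that "one period" in the qualification rule means the configured period between elections.

```rust
/// Hypothetical block-number type; a real pallet would use the runtime's own.
pub type BlockNumber = u32;

/// Illustrative round-robin schedule over the invulnerable slots.
pub struct ElectionSchedule {
    /// Number of invulnerable slots on the chain.
    pub slots: u32,
    /// Configurable period between two consecutive elections, in blocks.
    pub period: BlockNumber,
}

impl ElectionSchedule {
    /// Slot contested in the `n`-th election (0-based), or `None` if there are no slots.
    pub fn slot_for_election(&self, n: u32) -> Option<u32> {
        if self.slots == 0 {
            None
        } else {
            Some(n % self.slots)
        }
    }

    /// Length of a full election cycle: one election for every slot.
    pub fn full_cycle(&self) -> BlockNumber {
        self.slots.saturating_mul(self.period)
    }

    /// Ballot qualification: the incumbent, or a collator that has held a
    /// permissionless slot for at least one period (assumed interpretation).
    pub fn qualifies(&self, is_incumbent: bool, blocks_as_permissionless: BlockNumber) -> bool {
        is_incumbent || blocks_as_permissionless >= self.period
    }
}
```

With 5 slots and a period of 100 blocks, election number 7 contests slot 2 and a full cycle spans 500 blocks.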

Motivations behind the particularities of this mechanism

  • Round-robin - It is not desirable to allow any election of the entire invulnerable set at once because the main purpose of invulnerables is to ensure the stability, reliability and liveness of the parachain. It is safer to change them one by one and, in case mistakes happen, governance has time to react without endangering the liveness of any chain.
  • Two-round voting - It's useful to separate the election process into two distinct steps: the first, less important step of determining the challenger at the pallet level through deposits; the second, more important step of actually trying to replace the invulnerable by referendum, which is the same mechanism the invulnerable used to acquire the slot in the first place. It is not so important who is trying to replace the incumbent as long as they meet the requirements and they have a clear way to get to the second round of the election.
  • Mandatory - The runtime, not any particular individual, is actively pushing the invulnerables to convince people that they not only deserve to keep their invulnerable slots, but that they deserve it more than any of the other candidates that registered; the rules of the chain enforce this mechanism so no blame or ill-intent can be attributed to other individuals.
  • Periodic - In order to provide a reasonable path towards an invulnerable slot, no seat can be permanent and should be challenged periodically.
  • Ballot qualification - Any invulnerable collator must have a proven track record as a collator, so allowing only current permissionless collators to run against the current invulnerable minimizes the chance of human error by restricting the number of incompatible choices.

Corner cases

  • If no candidate registers for an election, the slot will become empty, unless the number of collators is lower than the minimum number allowed by the pallet configuration, defined in MinEligibleCollators.
  • If the top two candidates are tied on votes, the candidate that registered first wins the election.
  • In case no collator registers or qualifies for the first round of the election, the incumbent is automatically granted the win and gets to keep the invulnerable slot.
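The first-round tie-break described above can be sketched as a single comparison. This is only an illustration under stated assumptions: the `Candidate` struct and its fields are hypothetical and do not reflect the collator-selection pallet's storage types, and what happens when the ballot is empty is left to the caller, as covered by the corner cases.

```rust
/// A first-round candidate (hypothetical shape, for illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Candidate {
    pub who: &'static str,
    /// Position in registration order; a lower value means an earlier registration.
    pub registered_at: u32,
    /// Total tokens locked in favour of this candidate.
    pub votes: u128,
}

/// Returns the first-round winner, or `None` when nobody is on the ballot.
/// The candidate with the most votes wins; on equal votes the earlier
/// registration wins.
pub fn first_round_winner(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates.iter().max_by(|a, b| {
        a.votes
            .cmp(&b.votes)
            // Reverse the registration comparison so that the earlier
            // registration ranks higher when votes are equal.
            .then_with(|| b.registered_at.cmp(&a.registered_at))
    })
}
```

A `None` result corresponds to the case where no challenger registered or qualified, so no referendum needs to be submitted for that slot.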

Drawbacks

The first major drawback of this proposal is that it would put more responsibility on governance by having people vote regularly in order to maintain the invulnerable collator set on each chain. Today the collator-selection pallet employs a fire-and-forget system where the invulnerables are chosen once by governance vote. Although in theory governance can always intervene to elect new invulnerables, for the reasons stated in this RFC this is not the case in practice. Moving away from this system means more action is needed from governance to ensure the stability of the invulnerable collator sets on each system chain, which automatically increases the probability of errors. However, governance is the ultimate source of truth on-chain and there is a lot more at stake in the hands of governance than the invulnerable collator sets on system chains, so I think this risk is acceptable.

The second drawback of this proposal is the imperfect voting mechanism. Probably the simplest and most fair voting system for this scenario would have been First Past the Post, where all candidates participate in a single election round and the candidate with the most votes wins the election outright. However, the downside of such a system is the technical complexity behind running such an election on-chain. This election mechanism would require a multiple choice referendum implementation in the collator-selection pallet or at the system level somewhere else (e.g. on the Collectives chain), which would be a mix between the conviction-voting and staking pallets and would possibly communicate with all system chains via XCM. While this voting system could be useful in other contexts as well, I don't think it's worth conditioning the invulnerable collator redesign on a separate implementation of the multiple choice voting system when the proposed Two-Round System achieves the objectives of this RFC.

Testing, Security, and Privacy

All election mechanisms as well as corner cases can be covered with unit tests.

Performance, Ergonomics, and Compatibility

Performance

The chain will have to run extrinsics to start and end elections periodically, but the impact in terms of weight and PoV size is negligible.

Ergonomics

The invulnerables will be the most affected group, as they will have to now compete in elections periodically to secure their spots. Permissionless candidates will now have a clear, though not guaranteed, path towards becoming an invulnerable, at least for a period of time.

Compatibility

Any changes to the election mechanism of invulnerables should be compatible with the current invulnerable set interaction with the collator set chosen at the session boundary. The current invulnerable set for each chain can be grandfathered in when upgrading the collator-selection pallet version.

Prior Art and References

This RFC builds on RFC-7, which introduced the election mechanism for system chain collators.

Unresolved Questions

  • How long should the period between individual elections be? How long should the full election cycle be?
    • There should be a bit more than one month between individual elections, so that if there are 5 invulnerables on system chains, a full election cycle would take 6 months.
  • How long should the voting stay open?
    • It probably should just be a fixed period (e.g. 1 week) or maybe it can be the entire period before the next election begins.

The main spinoff of this RFC might be a multiple choice poll implementation in a separate pallet to hold a First Past the Post election instead of the Two-Round System proposed, which would prompt a migration to the new voting system within the collator-selection pallet. Additionally, a more complex solution where the voting for all system chains happens in a single place which then sends XCM responses with election results back to system chains can be implemented in the next iteration of this RFC.

RFC-114: Adjust Tipper Track Confirmation Periods

Start Date     17-Aug-24
Description    Big and Small Tipper Track Confirmation Period Modification
Authors        Leemo / ChaosDAO

Summary

This RFC proposes to change the duration of the Confirmation Period for the Big Tipper and Small Tipper tracks in Polkadot OpenGov:

  • Small Tipper: 10 Minutes -> 12 Hours

  • Big Tipper: 1 Hour -> 1 Day

Motivation

These are the current confirmation period durations of the treasury tracks in Polkadot OpenGov. Confirmation periods for the Spender tracks were adjusted based on RFC-20 and its related conversation.

Track Description    Confirmation Period Duration
Treasurer            7 Days
Big Spender          7 Days
Medium Spender       4 Days
Small Spender        2 Days
Big Tipper           1 Hour
Small Tipper         10 Minutes

You can see a general trend across the Spender tracks: as the privilege level (the amount the track can spend) increases, the confirmation period approximately doubles.

I believe that the Big Tipper and Small Tipper tracks' confirmation periods should be adjusted to match this trend.

In the current state it is possible to somewhat positively snipe these tracks, and whilst the power/privilege level of these tracks is very low (they cannot spend a large amount of funds), I believe we should increase the confirmation periods to something higher. This is backed up by the recent sentiment in the greater community regarding referendums submitted on these tracks. The parameters of Polkadot OpenGov can be adjusted based on the general sentiment of token holders when necessary.

Stakeholders

The primary stakeholders of this RFC are:

  • DOT token holders - as this affects the protocol's treasury
  • Entities wishing to submit a referendum on these tracks - as this affects the referendum's timeline
  • Projects with governance app integrations - see the Performance, Ergonomics, and Compatibility section below

Explanation

This RFC proposes to change the duration of the confirmation period for both the Big Tipper and Small Tipper tracks. To achieve this the confirm_period parameter for those tracks should be changed.

You can see the lines of code that need to be adjusted here:

  • Big Tipper: https://github.com/polkadot-fellows/runtimes/blob/f4c5d272d4672387771fb038ef52ca36f3429096/relay/polkadot/src/governance/tracks.rs#L245

  • Small Tipper: https://github.com/polkadot-fellows/runtimes/blob/f4c5d272d4672387771fb038ef52ca36f3429096/relay/polkadot/src/governance/tracks.rs#L231

This RFC proposes to change the confirm_period for the Big Tipper track to DAYS (i.e. 1 Day) and the confirm_period for the Small Tipper track to 12 * HOURS (i.e. 12 Hours).
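To make the proposed values concrete, the following sketch expresses the old and new confirmation periods in blocks. It is not the runtime source: the constant names ending in `_OLD` and `_NEW` are hypothetical, and the time constants assume a 6-second block time, which is how `MINUTES`, `HOURS`, and `DAYS` are conventionally derived in the relay chain runtime.

```rust
/// Block number type, assumed to be `u32` for this sketch.
pub type BlockNumber = u32;

// Assumed 6-second block time.
pub const MILLISECS_PER_BLOCK: u64 = 6_000;
pub const MINUTES: BlockNumber = 60_000 / (MILLISECS_PER_BLOCK as BlockNumber);
pub const HOURS: BlockNumber = MINUTES * 60;
pub const DAYS: BlockNumber = HOURS * 24;

// Current `confirm_period` values (hypothetical constant names).
pub const SMALL_TIPPER_CONFIRM_OLD: BlockNumber = 10 * MINUTES;
pub const BIG_TIPPER_CONFIRM_OLD: BlockNumber = HOURS;

// Values proposed by this RFC (hypothetical constant names).
pub const SMALL_TIPPER_CONFIRM_NEW: BlockNumber = 12 * HOURS;
pub const BIG_TIPPER_CONFIRM_NEW: BlockNumber = DAYS;
```

Under the 6-second assumption, the Small Tipper confirmation period grows from 100 to 7,200 blocks and the Big Tipper confirmation period from 600 to 14,400 blocks.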

Drawbacks

The drawback of changing these confirmation periods is that the lifecycle of referenda submitted on those tracks would be ultimately longer, and it would add a greater potential to negatively "snipe" referenda on those tracks by knocking the referendum out of its confirmation period once the decision period has ended. This can be a good or a bad thing depending on your outlook of positive vs negative sniping.

Testing, Security, and Privacy

This change will enhance the security of the protocol as it relates to its treasury. The confirmation period is one of the last lines of defense for the Polkadot token holder DAO to react to a potentially bad referendum and vote NAY in order for its confirmation period to be aborted.

Performance, Ergonomics, and Compatibility

Performance

This is a simple change (code wise) that should not affect the performance of the Polkadot protocol, outside of increasing the duration of the confirmation periods for these 2 tracks.

Ergonomics & Compatibility

As per the implementation of changes described in RFC-20, it was identified that governance UIs automatically update to meet the new parameters:

  • Nova Wallet - directly uses on-chain data, and change will be automatically reflected.
  • Polkassembly - directly uses on-chain data via RPC to fetch trackInfo, so the change will be automatically reflected.
  • SubSquare - scan script will update their app to the latest parameters and it will be automatically reflected in their app.

Prior Art and References

N/A

Unresolved Questions

Some token holders may want these confirmation periods to remain as they are currently and for them not to increase. If this is something that the Polkadot Technical Fellowship considers to be an issue to implement into a runtime upgrade then I can create a Wish For Change to obtain token holder approval.

The parameters of Polkadot OpenGov will likely continue to change over time. There are additional discussions in the community regarding adjusting the min_support for some tracks so that it does not trend towards 0%, similar to the current state of the Whitelisted Caller track. This is outside of the scope of this RFC and requires a lot more discussion.

RFC-TODO: Stale Nomination Reward Curve

Start Date     10 July 2024
Description    Introduce a decaying reward curve for stale nominations in staking.
Authors        Shawn Tabrizi

Summary

This is a proposal to reduce the impact of stale nominations in the Polkadot staking system. With this proposal, nominators are incentivized to update or renew their selected validators once per time period. Nominators that do not update or renew their selected validators would be considered stale, and a decaying multiplier would be applied to their nominations, reducing the weight of their nomination and rewards.

Motivation

One of Polkadot's primary utilities is providing a high quality security layer for applications built on top of it. To achieve this, Polkadot runs a Nominated Proof-of-Stake system, allowing nominators to vote on who they think are the best validators for Polkadot.

This system functions best when nominators and validators are active participants in the network. Nominators should consistently evaluate the quality and preferences of validators, and adjust their nominations accordingly.

Unfortunately, many Polkadot nominators do not play an active role in the NPoS system. Many set their nominations and then seldom look back at them.

This can lead to many negative behaviors:

  • Incumbents who received early nominations basically achieve tenure.
  • Validator quality and performance can decrease without recourse.
  • The validator set is not optimal for Polkadot.
  • New validators have a harder time entering the active set.
  • Validators are able to "sneakily" increase their commission.

Stakeholders

Primary stakeholders are:

  • Nominators
  • Validators

Explanation

Detail-heavy explanation of the RFC, suitable for explanation to an implementer of the changeset. This should address corner cases in detail and provide justification behind decisions, and provide rationale for how the design meets the solution requirements.

Drawbacks

Description of recognized drawbacks to the approach given in the RFC. Non-exhaustively, drawbacks relating to performance, ergonomics, user experience, security, or privacy.

Testing, Security, and Privacy

Describe the impact of the proposal on these three high-importance areas - how implementations can be tested for adherence, effects that the proposal has on security and privacy per-se, as well as any possible implementation pitfalls which should be clearly avoided.

Performance, Ergonomics, and Compatibility

Describe the impact of the proposal on the exposed functionality of Polkadot.

Performance

Is this an optimization or a necessary pessimization? What steps have been taken to minimize additional overhead?

Ergonomics

If the proposal alters exposed interfaces to developers or end-users, which types of usage patterns have been optimized for?

Compatibility

Does this proposal break compatibility with existing interfaces, older versions of implementations? Summarize necessary migrations or upgrade strategies, if any.

Prior Art and References

Provide references to either prior art or other relevant research for the submitted design.

Unresolved Questions

Provide specific questions to discuss and address before the RFC is voted on by the Fellowship. This should include, for example, alternatives to aspects of the proposed design where the appropriate trade-off to make is unclear.

Describe future work which could be enabled by this RFC, if it were accepted, as well as related RFCs. This is a place to brain-dump and explore possibilities, which themselves may become their own RFCs.