polkadot_sdk_docs/guides/enable_elastic_scaling_mvp.rs
//! # Enable elastic scaling MVP for a parachain
//!
//! <div class="warning">This guide assumes full familiarity with Asynchronous Backing and its
//! terminology, as defined in <a href="https://paritytech.github.io/polkadot-sdk/master/polkadot_sdk_docs/guides/async_backing_guide/index.html">the Polkadot SDK Docs</a>.
//! Furthermore, the parachain should have already been upgraded according to that guide.</div>
//!
//! ## Quick introduction to elastic scaling
//!
//! [Elastic scaling](https://polkadot.com/blog/elastic-scaling-streamling-growth-on-polkadot)
//! is a feature that will enable parachains to seamlessly scale the number of cores they use up
//! or down. This can be desirable in order to increase the compute or storage throughput of a
//! parachain, or to lower the latency between a transaction being submitted and it being
//! included in a parachain block.
//!
//! At present, with Asynchronous Backing enabled, a parachain can only include a block on the relay
//! chain every 6 seconds, regardless of how many cores the parachain acquires. Elastic scaling
//! builds further on the 10x throughput increase of Async Backing, enabling collators to submit up
//! to 3 parachain blocks per relay chain block, resulting in a further 3x throughput increase.
//!
//! ## Current limitations of the MVP
//!
//! The full implementation of elastic scaling spans the entire relay/parachain stack and is
//! still a [work in progress](https://github.com/paritytech/polkadot-sdk/issues/1829).
//! The MVP is still considered experimental software, so stability is not guaranteed.
//! If you encounter any problems,
//! [please open an issue](https://github.com/paritytech/polkadot-sdk/issues).
//! The current limitations of the MVP are described below:
//!
//! 1. **Limited core count**. Parachain block authoring is sequential, so each block will
//!    start being built only after the previous block is imported. The current block production is
//!    capped at 2 seconds of execution. Therefore, assuming the full 2 seconds are used, a
//!    parachain can only utilise at most 3 cores in a relay chain slot of 6 seconds. If the full
//!    execution time is not being used, higher core counts can be achieved.
//! 2. **Single collator requirement for consistently scaling beyond a core at full authorship
//!    duration of 2 seconds per block.** Using the current implementation with multiple collators
//!    adds latency to the block production pipeline. Assuming block execution takes about the
//!    same time as authorship, the additional overhead is equal to the duration of the
//!    authorship plus the block announcement. Each collator must first import the previous block
//!    before authoring a new one, so the highest throughput is achieved using a single collator.
//!    Experiments show that the peak performance using more than one collator (measured up to 10
//!    collators) is utilising 2 cores with an authorship time of 1.3 seconds per block, which
//!    leaves 400ms for networking overhead. This would allow for 2.6 seconds of execution per
//!    relay chain block, compared to the 2 seconds that async backing enables.
//!    [More experiments](https://github.com/paritytech/polkadot-sdk/issues/4696) are being
//!    conducted in this space.
//! 3. **Trusted collator set.** The collator set needs to be trusted until there’s a mitigation
//!    that would prevent or deter multiple collators from submitting the same collation to multiple
//!    backing groups. A solution is being discussed
//!    [here](https://github.com/polkadot-fellows/RFCs/issues/92).
//! 4. **Fixed scaling.** For true elasticity, the parachain must be able to seamlessly acquire or
//!    sell coretime as the user demand grows and shrinks over time, in an automated manner. This is
//!    currently lacking - a parachain can only scale up or down by “manually” acquiring coretime.
//!    This is not in the scope of the relay chain functionality. Parachains can already start
//!    implementing such autoscaling, but we aim to provide a framework/examples for developing
//!    autoscaling strategies.
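//!
//! The arithmetic behind limitations 1 and 2 can be sketched as follows. This is a minimal
//! illustration, not code from the SDK: `max_usable_cores` and `networking_budget_millis` are
//! hypothetical helpers, and the second one assumes that importing a block takes as long as
//! authoring it.
//!
//! ```rust
//! // Relay chain slot duration in milliseconds, as stated in this guide.
//! const RELAY_CHAIN_SLOT_MILLIS: u64 = 6_000;
//!
//! // Hypothetical helper: with sequential authoring, the number of cores that can be kept
//! // busy is bounded by how many authoring periods fit into one relay chain slot.
//! fn max_usable_cores(authoring_millis: u64) -> Option<u64> {
//!     if authoring_millis == 0 {
//!         return None;
//!     }
//!     Some(RELAY_CHAIN_SLOT_MILLIS / authoring_millis)
//! }
//!
//! // Hypothetical helper for the multi-collator case: each block gets an equal share of the
//! // relay chain slot, out of which the next author spends one period importing the previous
//! // block and one period authoring (assumed equal). What remains is left for networking.
//! fn networking_budget_millis(cores: u64, authoring_millis: u64) -> Option<u64> {
//!     if cores == 0 {
//!         return None;
//!     }
//!     let per_block_millis = RELAY_CHAIN_SLOT_MILLIS / cores;
//!     authoring_millis
//!         .checked_mul(2)
//!         .and_then(|busy_millis| per_block_millis.checked_sub(busy_millis))
//! }
//! ```
//!
//! Under these assumptions, `max_usable_cores(2_000)` gives `Some(3)`, which matches the 3 core
//! limit above, and `networking_budget_millis(2, 1_300)` gives `Some(400)`, which matches the
//! 400ms quoted for the 2 core experiment.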
//!
//! Another hard limitation that is not envisioned to ever be lifted is that parachains which create
//! forks will generally not be able to utilise the full number of cores they acquire.
//!
//! ## Using elastic scaling MVP
//!
//! ### Prerequisites
//!
//! - Ensure Asynchronous Backing is enabled on the network and you have enabled it on the parachain
//!   using [`crate::guides::async_backing_guide`].
//! - Ensure the `AsyncBackingParams.max_candidate_depth` value is configured to a value that is at
//!   least double the maximum targeted parachain velocity. For example, if the parachain will build
//!   at most 3 candidates per relay chain block, the `max_candidate_depth` should be at least 6.
//! - Use a trusted single collator for maximum throughput.
//! - Ensure enough coretime is assigned to the parachain. For maximum throughput the upper bound is
//!   3 cores.
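//!
//! The `max_candidate_depth` rule above can be expressed as a small check. This is a minimal
//! sketch; the helper names are hypothetical and are not part of the SDK.
//!
//! ```rust
//! // Hypothetical helper: the smallest `max_candidate_depth` that satisfies the
//! // "at least double the maximum targeted velocity" rule.
//! fn min_candidate_depth(max_velocity: u32) -> u32 {
//!     max_velocity.saturating_mul(2)
//! }
//!
//! // Hypothetical helper: returns true if the configured depth is large enough for the
//! // targeted velocity.
//! fn candidate_depth_is_sufficient(max_candidate_depth: u32, max_velocity: u32) -> bool {
//!     max_candidate_depth >= min_candidate_depth(max_velocity)
//! }
//! ```
//!
//! For a targeted velocity of 3, `min_candidate_depth(3)` is 6, so a configured depth of 5 would
//! be rejected by `candidate_depth_is_sufficient`.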
//!
//! <div class="warning">Phase 1 is NOT needed if using the <code>polkadot-parachain</code> or
//! <code>polkadot-omni-node</code> binary, or <code>polkadot-omni-node-lib</code> built from the
//! latest polkadot-sdk release! Simply pass the <code>--authoring slot-based</code>
//! ([`polkadot_omni_node_lib::cli::Cli::experimental_use_slot_based`]) parameter to the command
//! line and jump to Phase 2.</div>
//!
//! The following steps assume using the cumulus parachain template.
//!
//! ### Phase 1 - (For custom parachain node) Update Parachain Node
//!
//! This assumes you are using
//! [the latest parachain template](https://github.com/paritytech/polkadot-sdk/tree/master/templates/parachain).
//!
//! This phase consists of plugging in the new slot-based collator.
//!
//! 1. In `node/src/service.rs` import the slot-based collator instead of the lookahead collator.
#![doc = docify::embed!("../../cumulus/polkadot-omni-node/lib/src/nodes/aura.rs", slot_based_colator_import)]
//!
//! 2. In `start_consensus()`
//!     - Remove the `overseer_handle` param (also remove the
//!       `OverseerHandle` type import if it’s not used elsewhere).
//!     - Rename `AuraParams` to `SlotBasedParams`, remove the `overseer_handle` field and add a
//!       `slot_offset` field with a value of `Duration::from_secs(1)`.
//!     - Replace the single future returned by `aura::run` with the two futures returned by the
//!       slot-based collator and spawn them as separate tasks:
#![doc = docify::embed!("../../cumulus/polkadot-omni-node/lib/src/nodes/aura.rs", launch_slot_based_collator)]
//!
//! 3. In `start_parachain_node()` remove the `overseer_handle` param passed to `start_consensus`.
//!
//! ### Phase 2 - Activate fixed factor scaling in the runtime
//!
//! This phase consists of a couple of changes that need to be made to the parachain’s runtime in
//! order to utilise fixed factor scaling.
//!
//! First of all, you need to decide the upper limit on how many parachain blocks you need to
//! produce per relay chain block (in direct correlation with the number of acquired cores). This
//! should be either 1 (no scaling), 2 or 3. This is called the parachain velocity.
//!
//! If you configure a velocity which is different from the number of assigned cores, the measured
//! velocity in practice will be the minimum of these two.
//!
//! The chosen velocity will also be used to compute:
//! - The slot duration, by dividing the 6000 ms relay chain slot duration by the velocity.
//! - The unincluded segment capacity, by multiplying the velocity by 2 and adding 1 to it.
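//!
//! As a worked example of the rules above, the sketch below derives the effective velocity, the
//! slot duration and the unincluded segment capacity from a chosen velocity. The function names
//! are hypothetical and only illustrate the arithmetic; the velocity is assumed to be at least 1.
//!
//! ```rust
//! // Relay chain slot duration in milliseconds, as stated in this guide.
//! const RELAY_CHAIN_SLOT_DURATION_MILLIS: u32 = 6_000;
//!
//! // The velocity observed in practice is the minimum of the configured velocity and the
//! // number of assigned cores.
//! fn effective_velocity(configured_velocity: u32, assigned_cores: u32) -> u32 {
//!     configured_velocity.min(assigned_cores)
//! }
//!
//! // Parachain slot duration: the relay chain slot duration divided by the velocity.
//! fn slot_duration_millis(velocity: u32) -> u32 {
//!     assert!(velocity >= 1, "velocity must be at least 1");
//!     RELAY_CHAIN_SLOT_DURATION_MILLIS / velocity
//! }
//!
//! // Unincluded segment capacity: twice the velocity, plus one.
//! fn unincluded_segment_capacity(velocity: u32) -> u32 {
//!     2 * velocity + 1
//! }
//! ```
//!
//! For velocities of 1, 2 and 3 this gives slot durations of 6000, 3000 and 2000 ms and
//! unincluded segment capacities of 3, 5 and 7. A velocity of 3 configured on a parachain with
//! only 2 assigned cores gives an effective velocity of 2.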
//!
//! Let’s assume a desired maximum velocity of 3 parachain blocks per relay chain block. The needed
//! changes would all be done in `runtime/src/lib.rs`:
//!
//! 1. Rename `BLOCK_PROCESSING_VELOCITY` to `MAX_BLOCK_PROCESSING_VELOCITY` and increase it to the
//!    desired value. In this example, 3.
//!
//!    ```ignore
//!    const MAX_BLOCK_PROCESSING_VELOCITY: u32 = 3;
//!    ```
//!
//! 2. Set the `MILLISECS_PER_BLOCK` to the desired value.
//!
//!    ```ignore
//!    const MILLISECS_PER_BLOCK: u32 =
//!        RELAY_CHAIN_SLOT_DURATION_MILLIS / MAX_BLOCK_PROCESSING_VELOCITY;
//!    ```
//!    Note: for a parachain which measures time in terms of its own block number, changing block
//!    time may cause complications, requiring additional changes. See here for more information:
//!    [`crate::guides::async_backing_guide#timing-by-block-number`].
//!
//! 3. Increase the `UNINCLUDED_SEGMENT_CAPACITY` to the desired value.
//!
//!    ```ignore
//!    const UNINCLUDED_SEGMENT_CAPACITY: u32 = 2 * MAX_BLOCK_PROCESSING_VELOCITY + 1;
//!    ```