Welcome

Hello and a warm welcome to the revive Solidity compiler book!

Warning

Solidity on PVM is running on the pallet-revive runtime. This introduces observable semantic differences in comparison with the EVM.

Study the differences section carefully. Ignoring these differences may lead to defunct contracts.

Notable examples:

The 63/64 gas rule isn’t implemented in the pallet (this introduces a potential DoS vector when calling other contracts)

Contract instantiation works differently (by hash instead of by code)

The gas model implemented by pallet-revive differs from Ethereum

The heap size is fixed instead of gas-metered and the stack size is fixed as well (contracts working fine on EVM may trap on PVM)

Target audience

Solidity dApp developers should read the user guide. Solidity on PolkaVM introduces important differences to EVM which should be well understood.
Contributors will find the developer guide helpful for getting up to speed.

Other Polkadot contracts resources

Head to contracts.polkadot.io for more general information about contracts on Polkadot.

About

This mdBook documents the revive Solidity compiler project. The content is found under book/. Run make book to observe changes.

`resolc` user guide

resolc is a Solidity v0.8 compiler for Polkadot native smart contracts. Solidity compiled with resolc targets PolkaVM (PVM). Thanks to additional compiler optimizations and the PVM JIT, contract code can execute much faster than the EVM equivalent. resolc supports almost all Solidity v0.8 features including inline assembly, offering a high level of compatibility with the Ethereum Solidity reference implementation.

`revive` vs. `resolc` nomenclature

revive is the name of the overarching “Solidity to PolkaVM” compiler project, which contains multiple components (for example the Yul parser, the code generation library, the resolc executable itself, and many more things).

resolc is the name of the compiler driver executable, combining many revive components in a single and easy to use binary application.

In other words, revive is the whole compiler infrastructure (more like LLVM) and resolc is a user-facing single-entrypoint compiler frontend (more like clang).

Installation

Building Solidity contracts for PolkaVM requires installing the following two compilers:

solc: The Ethereum Solidity reference compiler implementation.
resolc: The revive Solidity compiler YUL frontend and PolkaVM code generator.

`resolc` binary releases

resolc is supported on all major operating systems and installation is straightforward. Please find our binary releases for the following platforms:

Linux (MUSL)
MacOS (universal)
Windows
Wasm via emscripten

Installing the `solc` dependency

resolc uses solc during the compilation process. Please refer to the Ethereum Solidity documentation for installation instructions.

`revive` NPM package

We distribute the revive compiler as a Node.js module.

Building `resolc` from source

Please follow the build instructions in the revive README.md.

CLI usage

We aim to keep the resolc CLI usage close to solc. There are a few things and options worthwhile to know about in resolc which do not exist in the Ethereum world. This chapter explains those in more detail than the CLI help message.

Tip

For the complete help about CLI options, please see resolc --help.

LLVM optimization levels

-O, --optimization <OPTIMIZATION>

resolc exposes the optimization level setting for the LLVM backend. The performance and size of compiled contracts vary widely between different optimization levels.

Valid levels are the following:

0: No optimizations are applied.
1: Basic optimizations for execution time.
2: Advanced optimizations for execution time.
3: Aggressive optimizations for execution time.
s: Optimize for code size.
z: Aggressively optimize for code size.

By default, -Oz is applied.
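The level list above can be modelled as a small Rust enum. This is only an illustrative sketch: the type and variant names are hypothetical and not taken from the resolc sources.

```rust
/// LLVM optimization levels accepted by `-O`, as listed in this chapter.
/// The type and variant names are illustrative, not resolc's own.
#[derive(Debug, Clone, Copy, PartialEq)]
enum OptimizationLevel {
    NoOptimization, // `0`
    Basic,          // `1`
    Advanced,       // `2`
    Aggressive,     // `3`
    Size,           // `s`
    AggressiveSize, // `z`
}

impl OptimizationLevel {
    /// The level applied when `-O` is not given (`z`).
    const DEFAULT: OptimizationLevel = OptimizationLevel::AggressiveSize;

    /// Maps the single-character level to its variant, or `None` if invalid.
    fn from_char(level: char) -> Option<OptimizationLevel> {
        match level {
            '0' => Some(OptimizationLevel::NoOptimization),
            '1' => Some(OptimizationLevel::Basic),
            '2' => Some(OptimizationLevel::Advanced),
            '3' => Some(OptimizationLevel::Aggressive),
            's' => Some(OptimizationLevel::Size),
            'z' => Some(OptimizationLevel::AggressiveSize),
            _ => None,
        }
    }
}
```

The same characters are used by the `settings.optimizer.mode` standard JSON setting.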

Stack size

--stack-size <STACK_SIZE>

PVM is a register machine with a traditional stack memory space for local variables. This controls the total amount of stack space the contract can use.

You are incentivized to keep this value as small as possible:

Increasing the stack size will increase gas costs due to increased startup costs.
The stack size contributes to the total memory size a contract can use, which includes the contract’s code size.

Default value: 131072

Warning

If the contract uses more stack memory than configured, it will compile fine but eventually revert execution at runtime!

Heap size

--heap-size <HEAP_SIZE>

Because PVM lacks dynamic memory metering, PVM contracts emulate the EVM heap memory with a static buffer. Consequently, instead of unbounded memory with quadratically growing gas costs, PVM contracts have a finite amount of memory available at constant gas cost.

You are incentivized to keep this value as small as possible:

Increasing the heap size will increase startup costs.
The heap size contributes to the total memory size a contract can use, which includes the contract’s code size.

Default value: 131072
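For intuition, the following Rust sketch computes what filling the default heap would cost under the EVM memory expansion formula from the Ethereum Yellow Paper (3·a + ⌊a²/512⌋ gas for a 32-byte words). The function names are made up for this illustration. On PVM no such per-word charge applies because the buffer is static.

```rust
/// EVM memory expansion cost in gas for `words` 32-byte words,
/// following the Yellow Paper formula 3 * a + floor(a^2 / 512).
fn evm_memory_cost(words: u64) -> u64 {
    3 * words + words * words / 512
}

/// Number of whole 32-byte EVM words that fit into `heap_size` bytes.
fn words_in_heap(heap_size: u64) -> u64 {
    heap_size / 32
}
```

With the default of 131072 bytes, `words_in_heap` yields 4096 words, for which `evm_memory_cost` gives 45056 gas.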

Warning

If the contract uses more heap memory than configured, it will compile fine but eventually revert execution at runtime!

solc

--solc <SOLC>

Specify the path to the solc executable. By default, the one in ${PATH} is used.

Debug artifacts

--debug-output-dir <DEBUG_OUTPUT_DIRECTORY>

Dump all intermediary compiler artifacts to files in the specified directory. This includes the YUL IR, optimized and unoptimized LLVM IR, the ELF object and the PVM assembly. Useful for debugging and development purposes.

Debug info

-g

Generate source based debug information in the output code file. Useful for debugging and development purposes and disabled by default.

Deploy time linking

--link [--libraries <LIBRARIES>] <INPUT_FILES>

In Solidity, 3 things can happen with libraries:

1. They are not externally callable and thus can be inlined.
   1.1. The solc Solidity optimizer inlines those (usually the case). Note: resolc always activates the solc Solidity optimizer.
   1.2. If the solc Solidity optimizer is disabled or for some reason fails to inline them (both rare), they are not inlined and require linking.
2. They are externally callable but still linked at compile time. This is the case if at compile time the library address is known (i.e. --libraries supplied in CLI or the corresponding setting in STD JSON input).
3. They are linked at deploy time. This happens when the compiler does not know the library address (i.e. --libraries flag is missing or the provided libraries are incomplete, same for STD JSON input). This case is rare because it’s discouraged and should never be used by production dApps.

In cases 1.2 and 3:

Some of the produced code blobs will be in the “unlinked” raw ELF object format and not yet deployable.
To make them deployable, they need to be “linked” (done using the resolc --link linker mode explained below).
The compiler emits DELEGATECALL instructions to call non-inlined (unlinked) libraries. The contract deployer must make sure to deploy any libraries prior to contract deployment.

Warning

Using deploy time linking is officially discouraged, mainly because bytecode hashes change after the fact. We decided to support it in resolc regardless, due to popular request.

Similar to how it works in solc, --libraries may be used to provide libraries during linking mode.

Unlike with solc, where linking implies a simple string substitution mechanism, resolc needs to resolve actual missing ELF symbols. This is due to how factory dependencies work in PVM. As a consequence, it isn’t sufficient to just provide the unlinked blobs to the linker. Instead, they must be provided in the exact same directory structure in which the Solidity source code was found at compile time.

Example:

The contract src/foo/bar.sol:Bar is involved in deploy time linking. It may be a factory dependency.
The contract blob needs to be provided inside a relative src/foo/ directory to --link. Otherwise symbol resolution may fail.
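As a rough illustration of the layout rule in this example, the following Rust sketch derives the relative directory from a `path:Contract` identifier. The helper names are hypothetical and not part of resolc, and the sketch makes no assumption about how the blob file itself is named.

```rust
use std::path::{Path, PathBuf};

/// Splits an identifier like `src/foo/bar.sol:Bar` into source path and contract name.
fn split_identifier(identifier: &str) -> Option<(&str, &str)> {
    identifier.rsplit_once(':')
}

/// Relative directory that mirrors where the Solidity source was found at compile time.
fn expected_blob_directory(identifier: &str) -> Option<PathBuf> {
    let (source, _contract) = split_identifier(identifier)?;
    Path::new(source).parent().map(Path::to_path_buf)
}
```

For `src/foo/bar.sol:Bar` this yields `src/foo`, matching the directory named above.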

Note

Tooling is supposed to take care of this. In the future, we may append explicit linkage data to simplify the deploy time linking feature.

JS NPM package

The resolc compiler driver is published as an NPM package under @parity/resolc.

It’s usable from Node.js code or directly from the command line:

npx @parity/resolc@latest --bin crates/integration/contracts/flipper.sol -o /tmp/out

Note

While the npm package makes a nice portable option, it doesn’t expose all options.

Tooling integration

resolc achieved successful integration with a variety of third party developer tools.

Solidity toolkits

Support for resolc is available in forks of the hardhat and foundry Solidity toolkits.

Compiler explorer

resolc is available on godbolt.org for the Solidity and Yul input languages. See also the announcement post on the forum.

Remix IDE

There is a Remix IDE fork with resolc support at remix.polkadot.io. Unfortunately this is no longer actively maintained (there might be bugs and outdated resolc versions).

Standard JSON interface

The revive compiler is mostly compatible with the solc standard JSON interface. There are a few differences and additional (PVM related) input configurations:

The `settings.polkavm` object

Used to configure PVM specific compiler settings.

`settings.polkavm.debugInformation`

A boolean value that enables debug information. Corresponds to resolc -g.

The `settings.polkavm.memoryConfig` object

Used to apply PVM specific memory configuration settings.

`settings.polkavm.memoryConfig.heapSize`

A numerical value that configures the contract heap size. Corresponds to resolc --heap-size.

`settings.polkavm.memoryConfig.stackSize`

A numerical value that configures the contract stack size. Corresponds to resolc --stack-size.

The `settings.optimizer` object

The settings.optimizer object is augmented with support for PVM specific optimization settings.

`settings.optimizer.mode`

A single char value to configure the LLVM optimizer settings. Corresponds to resolc -O.
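Put together, the PVM specific settings described so far could be assembled as in the following Rust sketch. The struct and method names are hypothetical. It is also an assumption of this sketch that the single char mode is written as a one-character JSON string. Only the JSON key names are taken from this chapter.

```rust
/// PVM-specific compiler settings (illustrative type, not part of resolc).
struct PvmSettings {
    debug_information: bool,
    heap_size: u32,
    stack_size: u32,
    optimizer_mode: char,
}

impl PvmSettings {
    /// Renders the `settings` object of a standard JSON input as compact JSON.
    /// Assumption: the optimizer mode is encoded as a one-character string.
    fn to_settings_json(&self) -> String {
        format!(
            r#"{{"polkavm":{{"debugInformation":{},"memoryConfig":{{"heapSize":{},"stackSize":{}}}}},"optimizer":{{"mode":"{}"}}}}"#,
            self.debug_information, self.heap_size, self.stack_size, self.optimizer_mode
        )
    }
}
```

The rendered object would be placed under the top-level `settings` key of the standard JSON input.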

`settings.llvmArguments`

Allows specifying arbitrary command line arguments to LLVM initialization. Used mainly for development and debugging purposes.

The `settings.outputSelection` object

Used to select desired outputs.

The “all” (`*`) wildcard

resolc supports the “all” (*) wildcard for the file-level (first-level) and contract-level (second-level) keys. A file-level key can be either the wildcard or a specific file name, whereas the contract-level key can only be the wildcard for robustness reasons.

Thus, output can be requested in 2 ways:

// All files and all contracts:
{
  "settings": {
    "outputSelection": {
      "*": {
        "*": [/* specific contract-level output fields */],
        "": [/* specific file-level output fields */]
      }
    }
  }
}

// Specific files and all their contracts:
{
  "settings": {
    "outputSelection": {
      "path/to/my/file.sol": {
        "*": [/* specific contract-level output fields */],
        "": [/* specific file-level output fields */]
      },
      // Rest of files...
    }
  }
}

The contract-level `evm` output selection

Note

Currently, resolc supports requesting either the full evm output, or one more level of specificity, such as evm.bytecode.

When requesting code generation, such as evm.bytecode or evm.assembly, the resolc compilation process additionally needs ast, metadata, irOptimized, and evm.methodIdentifiers selectors. These selectors will be automatically added if code generation is needed, but will only be included in the output if explicitly requested.

{
  "settings": {
    "outputSelection": {
      "path/to/my/file1.sol": {
        // Contracts in this file will generate bytecode.
        // Only these fields of the JSON output selection will be in the `contracts` output.
        "*": ["abi", "evm.methodIdentifiers", "metadata", "evm.bytecode"],
        // Only this field of the JSON output selection will be in the `sources` output.
        "": ["ast"]
      },
      "path/to/my/file2.sol": {
        // No contracts in this file will generate bytecode.
        "*": ["abi", "evm.methodIdentifiers", "metadata"],
        // No `ast` will be in the `sources` output (only the automatically added `id`,
        // similar to solc as this is not a configurable output selection).
        "": []
      }
    }
  }
}

Differences to EVM

This section highlights some potentially observable differences in the YUL EVM dialect translation compared to Ethereum Solidity.

Solidity developers deploying dApps to pallet-revive ought to read and understand this section well.

Deploy code vs. runtime code

Our contract runtime does not differentiate between runtime code and deploy (constructor) code. Instead, both are emitted into a single PVM contract code blob and live on-chain. Therefore, in EVM terminology, the deploy code equals the runtime code.

Tip

In constructor code, the codesize instruction will return the call data size instead of the actual code blob size.

Solidity

We are aware of the following differences in the translation of Solidity code.

The `63/64` gas rule

pallet-revive doesn’t apply the 63/64 gas rule. We strongly advise changing any code that calls untrusted contracts so that it supplies only a limited amount of gas!
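To see why this matters, recall that on the EVM (EIP-150) a call can forward at most all but one 64th of the caller's remaining gas, so the caller always keeps a small reserve. The following Rust sketch contrasts this with a call that forwards everything. It is a simplified model: gas is treated as a single number, the function names are hypothetical, and it assumes that a call without an explicit limit can forward all remaining gas.

```rust
/// Maximum gas an EVM caller can forward under EIP-150:
/// all but one 64th of the remaining gas.
fn evm_max_forwarded_gas(remaining: u64) -> u64 {
    remaining - remaining / 64
}

/// Gas the caller keeps if the callee burns everything it was given.
/// `limit` is the explicit gas limit supplied with the call, if any.
/// Assumption of this sketch: without a limit, all remaining gas is forwarded.
fn caller_reserve_without_rule(remaining: u64, limit: Option<u64>) -> u64 {
    match limit {
        Some(limit) => remaining - limit.min(remaining),
        None => 0,
    }
}
```

In this model a malicious callee can leave the caller with nothing unless the caller passes an explicit limit.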

`address.creationCode`

This returns the bytecode keccak256 hash instead.

YUL functions

The below list contains noteworthy differences in the translation of YUL functions.

Note

Many functions receive memory buffer offset pointer or size arguments. The PVM pointer size is 32 bit. Supplying memory offset or buffer size values above 2^32-1 may lead to OutOfGas errors that trap contract execution.

The solc compiler ought to always emit valid memory references, so Solidity dApp authors don’t need to worry about this unless they deal with low level assembly code.

In general, revive preserves the memory layout, meaning low level memory operations are supported. However, a few caveats apply:

The EVM linear heap memory is emulated using a fixed byte buffer of 128 KiB by default. This implies that the maximum memory a contract can use is limited to the configured heap size (on Ethereum, contract memory is capped by gas and therefore varies).
Thus, accessing memory offsets larger than the fixed buffer size will trap the contract at runtime with an OutOfBound error.
The compiler might detect and optimize unused memory reads and writes, leading to a different msize compared to what the EVM would see.
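The two limits described here (32 bit pointers and the fixed buffer size) can be illustrated with a small Rust sketch. This is not the code the compiler emits. The names are hypothetical, `u128` stands in for 256-bit EVM values, and how the runtime reports each failure is described in the text above, not by this error type.

```rust
use std::convert::TryFrom;

/// Default heap size in bytes (the `--heap-size` default).
const DEFAULT_HEAP_SIZE: u32 = 131_072;

/// Illustrative failure reasons; the real runtime errors differ in name and kind.
#[derive(Debug, PartialEq)]
enum AccessError {
    /// The offset or length does not fit into a 32 bit pointer.
    PointerOverflow,
    /// The access ends beyond the fixed heap buffer.
    OutOfBound,
}

/// Validates an EVM-style memory access `[offset, offset + length)`.
fn check_access(offset: u128, length: u128, heap_size: u32) -> Result<(u32, u32), AccessError> {
    let offset = u32::try_from(offset).map_err(|_| AccessError::PointerOverflow)?;
    let length = u32::try_from(length).map_err(|_| AccessError::PointerOverflow)?;
    let end = offset.checked_add(length).ok_or(AccessError::PointerOverflow)?;
    if end > heap_size {
        return Err(AccessError::OutOfBound);
    }
    Ok((offset, length))
}
```

An access is accepted only if both values fit into 32 bits and the whole range lies inside the heap buffer.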

`calldataload`, `calldatacopy`

In the constructor code, the offset is ignored and this always returns 0.

Warning

pallet-revive restricts the calldata size (to 128kb at the time of writing).

`codecopy`

Only supported in constructor code.

`create`, `create2`

Deployments on revive work differently than on the EVM. In a nutshell: Instead of supplying the deploy code concatenated with the constructor arguments (the EVM deploy model), the revive runtime expects two pointers:

A buffer containing the code hash to deploy.
The constructor arguments buffer.

To make contract instantiation using the new keyword in Solidity work seamlessly, revive translates the dataoffset and datasize instructions so that they assume the contract hash instead of the contract code. The hash is always of constant size. Thus, revive is able to supply the expected code hash and constructor arguments pointer to the runtime.
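The difference between the two deploy models can be sketched in Rust as follows. The types and function names are hypothetical and only model the shape of the inputs. They are not the pallet-revive API.

```rust
/// Size of the code hash in bytes (the constant `datasize` yields on PVM).
const CODE_HASH_SIZE: usize = 32;

/// EVM deploy model: one buffer holding the deploy code followed by the constructor arguments.
fn evm_create_input(deploy_code: &[u8], constructor_arguments: &[u8]) -> Vec<u8> {
    [deploy_code, constructor_arguments].concat()
}

/// PVM deploy model: the code hash and the constructor arguments as two separate buffers.
struct PvmCreateInput<'a> {
    code_hash: [u8; CODE_HASH_SIZE],
    constructor_arguments: &'a [u8],
}

/// Builds the PVM-style input from an already known code hash.
fn pvm_create_input<'a>(
    code_hash: [u8; CODE_HASH_SIZE],
    constructor_arguments: &'a [u8],
) -> PvmCreateInput<'a> {
    PvmCreateInput { code_hash, constructor_arguments }
}
```

Note that the PVM-style input never contains the contract code itself, which is why the code must already be known to the chain by its hash.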

Warning

This might fall apart in code creating contracts inside assembly blocks. We strongly discourage using the create family opcodes to manually craft deployments in assembly blocks! Usually, the reason for using assembly blocks is to save gas, which is likely futile on revive due to the underlying differences in the VM architectures, gas models and transaction costs.

`dataoffset`

Returns the contract hash.

`datasize`

Returns the contract hash size (constant value of 32).

`revert`, `return`

pallet-revive restricts the returndata size (to 128kb at the time of writing).

`prevrandao`, `difficulty`

Translates to a constant value of 2500000000000000.

`pc`, `extcodecopy`

These are only valid on the EVM (they have no use case on PVM) and produce a compile time error.

`blobhash`, `blobbasefee`

These relate to the Ethereum rollup model and produce a compile time error. Polkadot offers a superior rollup model, removing the use case for blob data related opcodes.

Difference regarding the `solc` `via-ir` mode

There are two different compilation pipelines available in solc and there are small differences between them.

Since resolc processes the YUL IR, always assume the solc IR based codegen behavior for contracts compiled with the revive compiler.

Example: State variable initialization order in inheritance

With via-ir, base constructors run before derived state variables are initialized:

contract InnerContract {
    uint public innerConstructedStartTokenId;

    constructor() {
        innerConstructedStartTokenId = _startTokenId();
    }

    function _startTokenId() internal view virtual returns (uint) {
        return 0;
    }
}

contract Test is InnerContract {
    uint public START_TOKEN_ID = 1;

    constructor() InnerContract() {
    }

    function _startTokenId() internal view virtual override returns (uint) {
        return START_TOKEN_ID;
    }
}

Here, innerConstructedStartTokenId in Test returns 0 (with legacy EVM codegen it’d return 1).

Rust contract libraries

Note

This is not yet implemented but something for consideration on the roadmap.

Solidity - tightly coupled to the EVM - introduces some inherent inefficiencies that are by design and either need to be followed or can’t be easily worked around, even with efforts like better optimized compiler and VM implementations. This represents a technical dead end. So far the EVM sees no adoption beyond the blockchain industry. Chances are that the EVM ends up deprecated for technical reasons (or maybe not and the RISC-V idea gets abandoned, who knows).

PVM, however, is a general purpose VM. It supports LLVM based mainstream programming languages like Rust. It’s a common software engineering practice to compose applications from pieces written in multiple languages, using each for its own strengths. For example, AI solutions traditionally use the Python scripting language for convenient developer experience, while the underlying AI models get implemented in a lower level language such as C++.

The same pattern can of course be applied to dApps, where we’d expect application specific languages like Solidity mixed with libraries implementing computationally complex algorithms in a lower level language. Business logic and user interfaces are naturally implemented as regular Solidity dApps which can include (link against) Rust libraries. Rust is a fast, safe low level language and the Polkadot SDK is written in Rust itself, making it an excellent choice.

For example, ZK proof verifiers or expensive DeFi primitives would benefit greatly from Rust implementations.

revive provides tooling support and a small Rust contracts SDK for seamless integration with Rust libraries.

`revive-runner` sandbox

Running contract code usually requires a blockchain node. While local dev nodes can be used, sometimes it’s just not desirable to do so. Instead, it can be much more convenient to run and debug contract code with a stripped down environment.

This is where the revive-runner comes in handy. In a nutshell, it is a single-binary no-blockchain pallet-revive runtime.

Installation and usage

Inside the root revive repository directory, install it from source (requires Rust installed):

make install-revive-runner

After installing, see revive-runner --help for usage help.

Trace logs

The standard RUST_LOG environment variable controls the log output from the contract execution. This includes revive runtime logs and PVM execution trace logs. Sometimes it’s convenient to have more fine-grained insight. Some useful filters:

RUST_LOG=runtime=trace: The pallet-revive runtime trace logs.
RUST_LOG=polkavm=trace: Low level PolkaVM instruction tracing.

Automatic contract instantiation

To avoid running the contract in an uninitialized state, revive-runner automatically instantiates the contract before calling it (constructor arguments can be provided).

Example

Suppose we want to trace the syscalls of the execution of a compiled contract file Flipper.pvm:

RUST_LOG=runtime=trace revive-runner -f Flipper.pvm
[DEBUG runtime::revive] Contract memory usage: purgable=6144/3145728 KB baseline=103063/1572864
[TRACE runtime::revive::strace] call_data_size() = Ok(0) gas_consumed: Weight { ref_time: 985209, proof_size: 0 }
[TRACE runtime::revive::strace] value_transferred(out_ptr: 4294836096) = Ok(()) gas_consumed: Weight { ref_time: 2937634, proof_size: 0 }
[TRACE runtime::revive::strace] call_data_copy(out_ptr: 131216, out_len: 0, offset: 0) = Ok(()) gas_consumed: Weight { ref_time: 4084483, proof_size: 0 }
[TRACE runtime::revive::strace] seal_return(flags: 0, data_ptr: 131216, data_len: 0) = Err(TrapReason::Return(ReturnData { flags: 0, data: [] })) gas_consumed: Weight { ref_time: 5510615, proof_size: 0 }
[TRACE runtime::revive] frame finished with: Ok(ExecReturnValue { flags: (empty), data: [] })
[TRACE runtime::revive::strace] call_data_size() = Ok(0) gas_consumed: Weight { ref_time: 985209, proof_size: 0 }
[TRACE runtime::revive::strace] seal_return(flags: 1, data_ptr: 131088, data_len: 0) = Err(TrapReason::Return(ReturnData { flags: 1, data: [] })) gas_consumed: Weight { ref_time: 2456669, proof_size: 0 }
[TRACE runtime::revive] frame finished with: Ok(ExecReturnValue { flags: REVERT, data: [] })
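When working with longer traces, it can help to extract the syscall name and the reported `ref_time` from each strace line. The following Rust sketch does this for lines shaped like the ones above. The function name is hypothetical and the parsing assumes exactly this log format, which may change between versions.

```rust
/// Extracts the syscall name and the reported `ref_time` from one strace log line.
/// Returns `None` for lines that are not strace entries.
fn parse_strace_line(line: &str) -> Option<(&str, u64)> {
    const MARKER: &str = "runtime::revive::strace] ";
    const REF_TIME: &str = "ref_time: ";

    // Everything after the log target is the syscall invocation.
    let rest = &line[line.find(MARKER)? + MARKER.len()..];
    let name = &rest[..rest.find('(')?];

    // The weight is printed last, so search for `ref_time` from the right.
    let tail = &rest[rest.rfind(REF_TIME)? + REF_TIME.len()..];
    let digits_end = tail
        .find(|character: char| !character.is_ascii_digit())
        .unwrap_or(tail.len());
    let ref_time = tail[..digits_end].parse().ok()?;

    Some((name, ref_time))
}
```

Applied to the first strace line of the example output, this returns `("call_data_size", 985209)`. Lines from other log targets, such as the `frame finished` entries, yield `None`.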

Developer guide

This chapter covers internal aspects of the compiler and helps contributors get started with the revive codebase.

Contributor guide

The revive compiler is an open source software project and we gladly accept quality contributions from anyone!

Getting started

A quick reference on how to build the Solidity compiler is maintained in the project’s README.md.

Using the `Makefile`

The Makefile comprehensively encapsulates all development aspects of this codebase. It is kept concise and readable. Please read and use it! You’ll learn for example:

How to build and install a resolc development version.
How to run tests and benchmarks.
How to cross-compile resolc.

As a general rule-of-thumb: If make test runs fine locally, chances for green CI pipelines are good.

Codebase organization

For the most part, revive is a rather standard Rust workspace codebase. There are some non-Rust dependencies, which sometimes complicates things a little bit.

The `crates/` dir

All Rust crates live under the crates/ directory. The workspace automatically considers any crate found therein. If you need to add a new crate, please implement it there.

Compiler library crates should be named with the revive- prefix. The crate location doesn’t need the prefix.

Dependencies

Dependencies should be added as workspace dependencies. Try to avoid pinning dependencies whenever possible. If possible, add dev dependencies as dev-dependencies only.

Please do always include the Cargo.lock dependency lock file with your PR. Please don’t run cargo update together with other changes (it is preferred to update the lock file in a dedicated dependency update PR).

Contribution rules

Changes must be submitted via a pull request (PR) to the github upstream repository.
Ensure that your branch passes make test locally when submitting a pull request.
A PR must not be merged until CI fully passes. Exceptions can be made (for example to fix CI issues itself).
No force pushes to the main branch or to open PR branches.
Maintainers can request changes or deny contributions at their own discretion.

Style guide

We require the official Rust formatter and clippy linter. In addition to that, please also consider the following best-effort aspects:

Avoid magic numbers and strings. Instead, add them as module constants.
Avoid abbreviated variable and function names. Always provide meaningful and readable symbols.
Don’t write macros and don’t use third party macros for things that can easily be expressed in few lines of code or outlined into functions.
Avoid import aliasing. Please use the parent or fully qualified path for conflicting symbols.
Any inline comments must provide additional semantic meaning, explain counter-intuitive behavior or highlight non-obvious design decisions. In other words, try to make the code expressive enough to a degree it doesn’t need comments expressing the same thing again in the English language. Delete such comments if your AI assistant generated them.
Public items must have a meaningful doc comment.
Provide meaningful panic messages to .expect() or just use .unwrap().
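A short Rust sketch of what several of these points can look like in practice. The constants and the function are invented for this illustration and do not exist in the codebase under these names.

```rust
use std::convert::TryFrom;

/// The EVM word size in bytes (a module constant instead of a magic number).
pub const EVM_WORD_SIZE_BYTES: usize = 32;

/// The default contract heap size in bytes.
pub const DEFAULT_HEAP_SIZE_BYTES: usize = 131_072;

/// Returns how many whole EVM words fit into a heap of the given size.
pub fn heap_capacity_in_words(heap_size_bytes: usize) -> u32 {
    let word_count = heap_size_bytes / EVM_WORD_SIZE_BYTES;
    // The panic message states the violated expectation.
    u32::try_from(word_count).expect("the heap word count should fit into a 32 bit pointer")
}
```

The names are spelled out instead of abbreviated, the public items carry doc comments, and the single inline comment explains intent instead of restating the code.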

AI policy

Contributors may use whatever AI assistance tools they wish to whatever degree they wish in the process of creating their contribution, given they acknowledge the following:

Project maintainers may reject any contribution (or portions of it) if the contribution shows signs of problematic involvement of generative AI.

Judgement of “problematic involvement” lies at the sole discretion of project maintainers. No proof (whether a contribution was in fact AI generated or not) is required. Rationale:

No one enjoys reading soulless and uncanny LLM slop. Please review and fix any AI slop yourself prior to submitting a PR.
A Solidity compiler is security sensitive software. Even minuscule mistakes can ultimately lead to loss of funds. AI models are inherently stochastic. They regularly fail to capture important nuances or produce straight hallucinations. Code that was “blindly” generated has no home here.
revive is a large codebase. Generative AI assistants may not have enough “context window” to sufficiently capture correctness, consistency and style aspects of the codebase. We’d like to keep this codebase maintainable by humans for the foreseeable future.

Compiler architecture and internals

revive relies on solc, the Ethereum Solidity compiler, as the Solidity frontend to process smart contracts written in Solidity. LLVM, a popular and powerful compiler framework, is used as the compiler backend and does the heavy lifting in terms of optimizations and RISC-V code generation.

revive mainly takes care of lowering the Yul intermediate representation (IR) produced by solc to LLVM IR. This approach provides a good balance between maintaining a high level of Ethereum compatibility, good contract performance and feasible engineering efforts.

`resolc`

resolc is the overarching compiler driver library and binary.

When compiling a Solidity source file with resolc, the following steps happen under the hood:

solc is used to lower the Solidity source code into YUL intermediate representation.
revive lowers the YUL IR into LLVM IR.
LLVM optimizes the code and emits a RISC-V ELF shared object (through LLD).
The PolkaVM linker finally links the ELF shared object into a PolkaVM blob.
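The four steps above can be summarized as a chain of stages, each producing the input of the next. The following Rust sketch only models that ordering. The enum is hypothetical and does not correspond to types in the resolc driver.

```rust
/// The stages of the compilation process, in execution order (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Stage {
    Solc,
    Revive,
    Llvm,
    PolkavmLinker,
}

impl Stage {
    /// All stages in the order they run.
    const PIPELINE: [Stage; 4] = [Stage::Solc, Stage::Revive, Stage::Llvm, Stage::PolkavmLinker];

    /// The artifact a stage hands over to the next one.
    fn artifact(self) -> &'static str {
        match self {
            Stage::Solc => "YUL IR",
            Stage::Revive => "LLVM IR",
            Stage::Llvm => "RISC-V ELF shared object",
            Stage::PolkavmLinker => "PolkaVM blob",
        }
    }
}

/// Lists the artifacts produced along the pipeline.
fn pipeline_artifacts() -> Vec<&'static str> {
    Stage::PIPELINE.iter().map(|stage| stage.artifact()).collect()
}
```

Most of these intermediate artifacts can be inspected with the --debug-output-dir option described in the user guide.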

This compilation process can be visualized as follows:

[Figure: Architecture Overview]

Reproducible contract builds

Because on-chain contract code is identified via its code blob hash, it is crucial to maintain reproducible contract builds. A given compiler version must reproduce the contract build exactly on every target platform resolc supports via the official binary releases.

To ensure this, we employ the following measures:

The code generation must be fully deterministic. For example, iterating over a standard HashMap is not deterministic because its iteration order depends on randomized internal state, making it an invalid operation in revive. To circumvent that, a BTreeMap can be used instead.
We release fully statically linked resolc binaries. This prevents dynamic linking of potentially differing libraries.
The only non-bundled dependency is the solc compiler. This is considered fine because the same properties apply to solc.
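The first measure can be illustrated with a minimal Rust sketch. The function names and the symbol table are made up for this example. The point is only that a BTreeMap iterates in key order, whereas the iteration order of the standard HashMap is unspecified and can differ between runs.

```rust
use std::collections::{BTreeMap, HashMap};

/// Moves entries from a `HashMap` (unspecified iteration order) into a
/// `BTreeMap`, which always iterates in ascending key order.
fn into_deterministic(symbols: HashMap<String, u32>) -> BTreeMap<String, u32> {
    symbols.into_iter().collect()
}

/// Emits one line per symbol. The output order is stable across runs and
/// platforms because it follows the key order of the `BTreeMap`.
fn emit_in_key_order(symbols: &BTreeMap<String, u32>) -> Vec<String> {
    symbols
        .iter()
        .map(|(name, offset)| format!("{}@{}", name, offset))
        .collect()
}
```

Anything that influences the emitted code, such as the order in which functions or symbols are generated, should go through an ordered collection like this.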

The `revive` compiler libraries

The main compiler logic is implemented in the revive-yul and revive-llvm-context crates.

The Yul library implements a lexer and parser and lowers the resulting tree into LLVM IR. It does so by emitting LLVM IR using the LLVM builder and our own revive-llvm-context compiler context crate. The revive LLVM context crate encapsulates code generation logic (decoupled from the parser).

The Yul library also implements a simple visitor interface (see visitor.rs). If you want to work with the AST, it is strongly recommended to implement visitors. The LLVM code generation is implemented using a dedicated trait for historical reasons only.
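For readers unfamiliar with the pattern, here is a minimal Rust sketch of a visitor over a toy expression tree. It is not the interface from visitor.rs. The AST, the trait and the method names are invented for this illustration.

```rust
use std::collections::BTreeMap;

/// Toy expression tree standing in for the real Yul AST.
enum Expression {
    Literal(u64),
    Call { name: String, arguments: Vec<Expression> },
}

/// Hypothetical visitor trait; the default methods traverse the whole tree.
trait Visitor {
    fn visit_literal(&mut self, _value: u64) {}

    fn visit_call(&mut self, _name: &str, arguments: &[Expression]) {
        for argument in arguments {
            self.visit_expression(argument);
        }
    }

    fn visit_expression(&mut self, expression: &Expression) {
        match expression {
            Expression::Literal(value) => self.visit_literal(*value),
            Expression::Call { name, arguments } => self.visit_call(name, arguments),
        }
    }
}

/// Example visitor: counts how often each function is called.
struct CallCounter {
    calls: BTreeMap<String, usize>,
}

impl Visitor for CallCounter {
    fn visit_call(&mut self, name: &str, arguments: &[Expression]) {
        *self.calls.entry(name.to_string()).or_insert(0) += 1;
        for argument in arguments {
            self.visit_expression(argument);
        }
    }
}
```

A visitor overrides only the node kinds it cares about and inherits the traversal for everything else, which keeps analyses decoupled from the tree definition.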

EVM heap memory

PVM doesn’t offer an API similar to the EVM linear heap memory. Hence the emitted contract code emulates the linear EVM heap memory using a static byte buffer. Data inside this byte buffer is kept big endian for EVM compatibility reasons (unaligned access is allowed, which makes optimizing this non-trivial).

Unlike with the EVM, where heap memory usage is gas metered, our heap size is static (the size is user controllable via a setting flag). The compiler emits bound checks to prevent overflows.
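The idea can be sketched in plain Rust as follows. This is a conceptual model only, not the LLVM IR the compiler emits. The type and function names are hypothetical, and where the sketch returns `None` the real contract code traps.

```rust
/// Size of an EVM word in bytes.
const WORD_SIZE: usize = 32;

/// A fixed-size byte buffer emulating the linear EVM heap memory.
struct Heap {
    buffer: Vec<u8>,
}

impl Heap {
    fn new(size: usize) -> Self {
        Heap { buffer: vec![0; size] }
    }

    /// Bound check: the word `[offset, offset + WORD_SIZE)` must lie inside the buffer.
    fn word_end(&self, offset: usize) -> Option<usize> {
        let end = offset.checked_add(WORD_SIZE)?;
        if end <= self.buffer.len() { Some(end) } else { None }
    }

    /// Stores a big endian word at any (possibly unaligned) offset.
    fn store_word(&mut self, offset: usize, word: [u8; WORD_SIZE]) -> Option<()> {
        let end = self.word_end(offset)?;
        self.buffer[offset..end].copy_from_slice(&word);
        Some(())
    }

    /// Loads a big endian word from any (possibly unaligned) offset.
    fn load_word(&self, offset: usize) -> Option<[u8; WORD_SIZE]> {
        let end = self.word_end(offset)?;
        let mut word = [0u8; WORD_SIZE];
        word.copy_from_slice(&self.buffer[offset..end]);
        Some(word)
    }
}

/// Encodes a native integer as a 256-bit big endian word.
fn word_from_u64(value: u64) -> [u8; WORD_SIZE] {
    let mut word = [0u8; WORD_SIZE];
    word[WORD_SIZE - 8..].copy_from_slice(&value.to_be_bytes());
    word
}
```

Because the target is little endian, every word that crosses between registers and this buffer needs a byte swap, which is part of what makes these accesses costly to optimize.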

The LLVM dependency

LLVM is a special non-Rust dependency. We access its builder interface via the inkwell wrapper crate.

We use upstream LLVM, but release and use our custom builds. We require the compiler builtins specifically built for the PVM rv64emacb target and always leave assertions on. Furthermore, we need cross builds because resolc itself targets emscripten and musl. The revive-llvm-builder functions as a cross-platform build script and is used to build and release the LLVM dependency.

We also maintain the lld-sys crate for interfacing with LLD. The LLVM linker is used during the compilation process, but we don’t want to distribute another binary.

Custom optimizations

At the moment, no significant custom optimizations are implemented. Thus, we are missing some optimization opportunities that neither solc nor LLVM can realize (due to their lack of domain specific knowledge about the semantics of our target environment). Furthermore, solc optimizes for EVM gas and a target machine orthogonal to our target (BE 256-bit stack machine EVM vs. 64-bit LE RISC architecture PVM). We have started working on an additional IR layer between Yul and LLVM to capture missed optimization opportunities, though.

PVM and the pallet-revive runtime target

The revive compiler targets PolkaVM (PVM) via pallet-revive on Polkadot.

Target CPU configuration

The exact target CPU configuration can be found here.

Note

The PVM linker requires fully relocatable ELF objects.

Why PVM

PVM is a RISC-V based VM designed to overcome the flaws of WebAssembly (Wasm). Wasm was believed to be a more efficient successor to the rather slow EVM. However, Wasm is far from an ideal target for smart contracts as some of its design decisions are unfavorable for short-lived workloads. The main problem is on-chain Wasm bytecode compilation or interpretation overhead. Prior benchmarks consistently ignoring this overhead seeded the blockchain industry with flawed assumptions: Only when ignoring the startup overhead is Wasm much faster than the slow EVM. In practice however, gains are nullified entirely and Wasm loses completely even against very slow VMs like the EVM. Executing Wasm contracts is in fact so inefficient that typical contract workloads are orders of magnitude more expensive than the equivalent EVM variant.

On the other hand, since RISC-V is similar to CPUs found in validator hardware (x86 and ARM), bytecode translation mostly boils down to a linear mapping from one instruction to another. The embedded ISA specification reduces the number of general purpose registers, in turn removing the need for expensive register allocation. This guarantees single-pass O(n) JIT compilation of contract bytecode. The close proximity of PVM bytecode to actual validator CPU bytecode effectively allows moving all expensive compilation workload off-chain. Benchmarks (1, 2) show that with the PVM JIT, sandboxed PVM code executes at around half the speed of native code, which is in the same ballpark as the state-of-the-art wasmtime Wasm implementation (while EVM sits somewhere around 1/10 to less than 1/100 of native speed). However, the PVM JIT compiler only uses a fraction of the time wasmtime requires to compile the code.

Note

The PVM JIT isn’t available yet in pallet-revive. At the time of writing, the contract code is interpreted, which is orders of magnitude slower than the JIT.

Host environment: `pallet-revive`

The revive compiler targets the pallet-revive runtime environment.

pallet-revive exposes a syscall-like interface for contract interactions with the host environment. This is provided by the revive-runtime-api library.

After the initial launch on the Polkadot Asset Hub blockchain, the runtime API is considered stable and backwards compatible indefinitely.

Testing strategy

Contributors are encouraged to implement some appropriate unit and integration tests together with any bug fixes or new feature implementations. However, when it comes to testing the code generation logic, our testing strategy goes way beyond simple unit and integration tests. This chapter explains how the revive compiler implementation is tested for correctness and how we define correctness.

Tip

Running the integration tests requires the evm tool from go-ethereum in your $PATH.

Either install it using your package manager or build it from source:
git clone https://github.com/ethereum/go-ethereum/
cd go-ethereum
make all
export PATH=/path/to/go-ethereum/build/bin/:$PATH

Bug compatibility with Ethereum Solidity

As a Solidity compiler, we aim to preserve contract code semantics as closely as possible to Solidity compiled to EVM with the solc reference implementation. As highlighted in the user guide, due to the underlying target difference, this isn’t always possible. However, wherever it is possible, we follow the philosophy of bug compatibility with the Ethereum contracts stack.

Differential integration tests

A high level of bug compatibility with Ethereum is ensured through differential testing with the Ethereum solc and EVM contracts stack. The revive-integration library is the central integration test utility, providing a set of Solidity integration test cases. Further, it implements differential tests against the reference implementation by combining the revive-runner sandbox, the go-ethereum EVM tool and the revive-differential.

The revive-runner library provides a declarative test specification format. This vastly simplifies writing differential test cases and removes a lot of room for errors in test logic. Example:

{
    "differential": true,
    "actions": [
        {
            "Instantiate": {
                "code": {
                    "Solidity": {
                        "contract": "Bitwise"
                    }
                }
            }
        },
        {
            "Call": {
                "dest": {
                    "Instantiated": 0
                },
                "data": "3fa4f245"
            }
        }
    ]
}

The above example instantiates the Bitwise contract and calls it with some defined calldata. The revive-runner library implements a helper wrapper to execute test specs on the go-ethereum standalone evm tool. This allows the revive-runner to execute specs against both the EVM and the pallet-revive runtime. Key to differential testing is setting "differential": true, resulting in the following:

The Bitwise contract is compiled to EVM and PVM code.
The runner executes the defined actions on the EVM and collects all state changes (storage, balance) and execution results.
The runner executes each action on the PVM. The state changes observed after each step, as well as the final execution result, are asserted to match their EVM counterparts exactly.

Note how we never defined any expected outcome manually. Instead, we simply observe and collect the data defining the “correct” outcome.

Differential testing in combination with declarative test specifications proved to be simple, yet very effective, in ensuring expected Ethereum Solidity semantics on pallet-revive.
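The differential flow described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the revive-runner implementation: `ToyBackend` and `run_differential` are hypothetical names, and the backends only simulate execution instead of compiling and running the Bitwise contract. The sketch shows the structure of the check: outcomes are collected from the reference side first and then compared step by step against the target side.

```python
# Simplified model of a differential run. ToyBackend and run_differential are
# hypothetical names; they are not part of the revive-runner library.

SPEC = {
    "differential": True,
    "actions": [
        {"Instantiate": {"code": {"Solidity": {"contract": "Bitwise"}}}},
        {"Call": {"dest": {"Instantiated": 0}, "data": "3fa4f245"}},
    ],
}


class ToyBackend:
    """Stand-in for an execution environment (EVM or PVM)."""

    def __init__(self, name):
        self.name = name
        self.contracts = []  # instantiated contracts, addressed by index
        self.storage = {}    # simulated state changes

    def execute(self, action):
        if "Instantiate" in action:
            contract = action["Instantiate"]["code"]["Solidity"]["contract"]
            self.contracts.append(contract)
            result = f"instantiated {contract}"
        elif "Call" in action:
            index = action["Call"]["dest"]["Instantiated"]
            data = action["Call"]["data"]
            # Record the call as a simulated state change.
            self.storage[(index, "last_call")] = data
            result = f"called {self.contracts[index]} with {data}"
        else:
            raise ValueError(f"unsupported action: {action}")
        # Return the execution result together with a snapshot of the state.
        return {"result": result, "storage": dict(self.storage)}


def run_differential(spec, reference, target):
    """Collect outcomes on the reference, then check the target per step."""
    expected = [reference.execute(action) for action in spec["actions"]]
    for step, (action, want) in enumerate(zip(spec["actions"], expected)):
        got = target.execute(action)
        if got != want:
            raise AssertionError(f"step {step}: {target.name} diverged from {reference.name}")
    return expected


outcomes = run_differential(SPEC, ToyBackend("evm"), ToyBackend("pvm"))
print(outcomes[-1]["result"])
```

Nothing in the spec states an expected result. The reference backend alone defines what counts as correct, which is the property the differential approach relies on.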

The differential testing utility

A lot of nuanced bugs caused by tiny implementation details inside the revive compiler and the pallet-revive runtime could be identified and eliminated early on thanks to the differential testing strategy. Thus, we decided to take this approach further and created a comprehensive test runner and a large suite of more complex test cases.

The Revive Differential Tests follow the exact same strategy but implement a much more powerful test spec format, spec runner and reports. This allows differentially testing much more complex test cases (for example testing Uniswap pair creations and swaps), executed via transactions sent to actual blockchain nodes.

Cross compilation

We cross-compile the resolc.js frontend executable to Wasm for running it in a Node.js or browser environment.

The musl target is used to obtain statically linked ELF binaries for Linux.

Wasm via emscripten

The REVIVE_LLVM_TARGET_PREFIX environment variable is used to control the target environment LLVM dependency. This requires a compatible LLVM build, obtainable via the revive-llvm build script. Example:

# Build the host LLVM dependency with PolkaVM target support
make install-llvm
export LLVM_SYS_211_PREFIX=${PWD}/target-llvm/gnu/target-final

# Build the target LLVM dependency with PolkaVM target support
revive-llvm emsdk
source emsdk/emsdk_env.sh
revive-llvm --target-env emscripten build --llvm-projects lld
export REVIVE_LLVM_TARGET_PREFIX=${PWD}/target-llvm/emscripten/target-final

# Build the resolc frontend executable
make install-wasm
make test-wasm

musl libc

rust-musl-cross is a straightforward way to cross-compile Rust to musl. The Dockerfile is an executable example of how to do that.

FAQ

What EVM version do you support?

The notion of an EVM version doesn’t apply to resolc, so we neither support nor reject any particular one. We support Solidity versions instead, from solc version 0.8.0 onwards.

Is inline assembly supported?

Yes, almost all inline assembly features are supported (see the differences in Yul translation chapter).

Do you support opcode `XY`?

See above, the same applies.

In what Solidity version should I write my dApp?

We generally recommend always using the latest supported version to benefit from the latest bug fixes, features and performance improvements.

Find out about the latest supported version by running resolc --supported-solc-versions or checking here.

Tool `XY` says the contract size is larger than 24kb and will fail to deploy?

The 24 KB code size restriction only exists for the EVM. Our limit is currently around 1 MB and may increase further in the future.

Is `resolc` a drop-in replacement for `solc`?

No. resolc aims to work similarly to solc, but it’s not considered a drop-in replacement.

Vision and Roadmap

The revive compiler speeds up Solidity contracts significantly. revive provides a decisive edge over other contract platforms. Notably, the compiler eliminates the need to rewrite Solidity dApps in Rust, or even as single dApp parachains, for scaling reasons. Retaining as much compatibility with Ethereum Solidity as possible keeps entry barriers low.

We believe in Dr. Gavin Wood’s ĐApps: What Web 3.0 Looks Like manifesto and the ecosystem of the Solidity programming language. Our motivation lies in the realization that for a true web3 revolution, significant scaling efforts, like the ones provided by the PVM and this project, are necessary to unfold.

Roadmap

The first major release, resolc v1.0.0, emits functional PVM code from given Solidity sources. It relies on solc and LLVM for optimizations. The main priority of this release was delivering a mostly feature-complete and safe Solidity v0.8.0 compiler.

Focus for the second major release is on the custom optimization pipeline, which aims to significantly improve emitted code blob sizes.

The below roadmap gives a rough overview of the project’s development timeline.

Roadmap
