#graypaper:polkadot.io
last updated 2025-05-24 03:29 UTC
# 2024-04-17 20:29 syed:
hmm, it requires that each shard is divisible by 64, not the number of available shards.
# 2024-04-17 20:32 syed: If we want to decode shards/segments that are smaller than 64 bytes then we have a problem with using SIMD, I think, because neighbouring bytes are not encoded in the same polynomial.
# 2024-04-17 20:33 syed: so 64 bytes are encoded in 32 polynomials, but each recovers bytes with a 32-byte gap between them.
# 2024-04-17 20:34 syed: obviously we can rearrange the bytes before encoding, but that kills the whole purpose of the SIMD.
# 2024-04-17 20:37 syed: If we insist on recovering less than 64 bytes at a time then maybe we shouldn't use SIMD, or perhaps we should use smaller registers.
# 2024-04-18 06:38 gav: I just don't see how you can have 341 of 1023 validators recover 64 bytes.
# 2024-04-18 06:39 gav: how many bytes would they each store to recover those 64 bytes?
# 2024-04-18 06:50 syed: They could get only two bytes out of those 64 and recover 1024*2
# 2024-04-18 07:05 gav: ok so this is the minimal size you can recover at a time? 4*1024
# 2024-04-18 08:42 syed: It is just that for up to 32 consecutive bytes you have to recover double the amount, but you get the next consecutive bytes for free.
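To make the layout syed describes concrete, here is a hedged sketch (the polynomial count of 32 and the mapping come only from the messages above; the real codec may differ): byte i of a 64-byte chunk lands in polynomial i mod 32, so neighbouring bytes sit in different polynomials while bytes 32 apart share one.

```python
# Hypothetical illustration of the interleaving discussed above: 64
# consecutive bytes spread over 32 polynomials, bytes i and i+32 forming
# the two halves of polynomial i's symbol (a 32-byte gap between them).

def polynomial_of(byte_index: int, n_polys: int = 32) -> int:
    """Which polynomial a byte of a 64-byte chunk lands in (sketch only)."""
    return byte_index % n_polys

# Neighbouring bytes never share a polynomial...
assert polynomial_of(0) != polynomial_of(1)
# ...but a byte and the one 32 positions later do share one.
assert polynomial_of(0) == polynomial_of(32)
```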
# 2024-04-19 18:40 gav: the final metering spec is still being determined. it is heavily dependent on practical implementation speed on realistic hardware, so will necessarily be in flux for some time as implementations arrive and evolve and hardware requirements are decided.
(edited)
# 2024-04-19 18:40 gav: (for ethereum it was the very last protocol element we finalized, and did so through performance analysis of the 3 impls; i expect we'll do something fairly similar here, but with some research over how memory usage interacts with instruction speed)
(edited)
# 2024-04-19 18:42 gav: (and it's not a reference implementation - it just happens to be the first:)
# 2024-04-28 06:39 aedigix: Is there any concerns with endianness for the choice of VM? EVM is big-endian (not really sure the justification for this as I can't find it in the YP)
# 2024-04-29 16:21 dave: Any concern about services intentionally inserting keys into the state trie whose hashes have a long common prefix? With the trie as specified this will result in long trie paths, with a bunch of branch nodes which have one half empty. Possibly could change the definition of M so that instead of always branching on b0 in the branch case you branch on the first bit such that l and r are non-empty? This would of course make the trie implementation more complicated and possibly ruin some optimisations.
# 2024-04-29 16:57 alistair: That doesn't really solve the problem because if you can use brute force to find two hashes which agree on n bits then you can also with an extra 50% hash computations in expectation, find hashes with 1,2,3,..,n-1 bits of prefix agreeing with the first two hashes. Now you still have a depth n trie. The only downside is that you now have to get n+1 hashes included instead of 2.
(edited)
# 2024-04-29 17:26 dave: No it doesn't solve the problem, but it reduces it to the same problem that currently exists in polkadot, so at least it's not making it worse
# 2024-04-29 17:28 dave: Maybe not the "same" problem as services don't exist in polkadot, but you can do similar things with eg account keys
# 2024-04-29 17:45 jeff: I'd expect scale covers this.
Afaik big endian, aka network byte order, is used in network protocols because it makes testing more robust: if you omit an endianness conversion in only one place then protocol implementations typically fail in testing on little-endian machines. The bad situation this avoids is going into production with code that doesn't work when ported to a big-endian machine. There is no reason to use network byte order if everyone uses some library that enforces a particular convention, like say protobuf, which likely covers our case more or less. Also, you could skip network byte order when you have a very tightly specified protocol: djb's crypto, like ed25519, uses little endian because he's fighting much bigger implementor risks, while older crypto specifications like secp256k1 often used big endian, just because they didn't understand implementation risks, so they addressed only minor ones.
(edited)
# 2024-04-29 20:02 arkadiy: (127)
The C(s,h) definition interleaves the hash of the key with the service id to create the trie key, so that a service won't be able to create common-prefix chains in other services' subtries with this kind of attack.
(edited)
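As an illustration only (the authoritative definition of C(s, h) is equation (127) in the GP; the byte-level details below are invented for this sketch), interleaving the service id with the key hash means two services targeting the same hash prefix still get diverging trie keys:

```python
def interleaved_trie_key(service_id: bytes, key_hash: bytes) -> bytes:
    """Illustrative interleaving of service-id bytes with key-hash bytes.
    Not the GP's actual C(s, h); see equation (127) for the real definition."""
    out = bytearray()
    for s_byte, h_byte in zip(service_id, key_hash):
        out += bytes([s_byte, h_byte])
    return bytes(out)

sid_a, sid_b = bytes([1, 0, 0, 0]), bytes([2, 0, 0, 0])
h = bytes(range(4))  # a hash prefix two services might both grind toward
# The interleaved keys diverge at byte 0, so a shared hash prefix no
# longer yields a shared trie prefix across different services.
assert interleaved_trie_key(sid_a, h)[0] != interleaved_trie_key(sid_b, h)[0]
```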
# 2024-04-30 00:39 eclesiomelo: just saw that the definition of the header judgement marker is not accurate, given that
t != 0
will include valid judgements also, so (107)
Hj ≡ [r | (r, t) <− J, t ≠ ⌊2/3V⌋ + 1]
looks correct, right?
(edited)
# 2024-05-02 17:20 gav: Yes thanks - well spotted! that'll be corrected in the next revision:)
(edited)
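The corrected marker definition above reads directly as a list comprehension; a minimal sketch, with J as (report-hash, vote-total) pairs and V the validator count:

```python
def header_judgement_marker(J, V):
    """Hashes r of reports whose vote total t is not the 2/3V + 1
    supermajority, mirroring H_j = [r | (r, t) <- J, t != floor(2V/3) + 1]."""
    threshold = (2 * V) // 3 + 1
    return [r for (r, t) in J if t != threshold]

# e.g. with V = 6 validators the threshold is 5, so only "b" is kept:
assert header_judgement_marker([("a", 5), ("b", 3)], 6) == ["b"]
```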
# 2024-05-08 09:18 xlchen: this is the only other reference of C that I can find
(edited)
# 2024-05-08 09:25 karim: Maybe this?
\mathbf{C}
in the LaTeX for easier search / mapping.
# 2024-05-08 09:19 xlchen: so it is just an intermediate variable? some output from
(EA,ρ′,δ†,χ,ι,φ)
?
# 2024-05-10 03:43 xlchen: is this a typo? because the total length should be 336 bytes?
# 2024-05-11 10:14 gav: > <@xlchen:matrix.org> so it is just an intermediate variable? some output from
(EA,ρ′,δ†,χ,ι,φ)
?
Exactly.
# 2024-05-11 10:15 gav: > <@xlchen:matrix.org> is this a typo? because the total length should be 336 bytes?
yes - thanks!
# 2024-05-23 08:42 helikon: Hi, at the bottom of the left column of page 6, the set of all dictionaries, D, is defined as:
# 2024-05-23 08:42 helikon: And on the right column of the same page, D is defined as:
# 2024-05-23 11:13 dave: D excludes sets with duplicate keys, so seems like subset is accurate
# 2024-05-23 11:44 helikon: Ah, got it now thanks, I assumed uniqueness of keys. Actually the following (4) just explains that.
(edited)
# 2024-05-23 20:12 oliver.tale-yazdi: Will there be some more explanation on the 2/3+1 honesty assumption?
What time-window is to be considered when evaluating this? Current session? Or something like current session + past session?
# 2024-05-23 20:53 jeff: I'd think the jam paper could just say "byzantine assumptions, including 2/3rd honest, for the current and previous sessions".
It'll be the (machine) elves paper that clarifies our assumptions, but maybe not something jam should discuss to precisely, due to some technicalities:
We make "synchronous" byzantine assumptions for the current session and all previous sessions, which includes that less than 1/3 of validators were misbehaving, including being offline. We use the "network synchrony" assumption only narrowly, and make a serious effort (no-shows) to mitigate its weakness, which makes it less bad than it sounds.
Arguably, we only require these assumptions for the current and immediately previous session, so long as all parachains are somehow appropriately publicly monitored. We should not discuss this however, since we cannot define this monitoring, and we'll do things like elections on parachains, which probably do not satisfy such a monitoring condition. And non-turnstiled zk parachains are likely eventually too. It's not worth discussing this.
(edited)
# 2024-05-27 07:27 xlchen: The formula doesn't appear to be matching with the description?
# 2024-05-27 08:57 gav: the formula is correct; the text needs to be reworded slightly.
(edited)
# 2024-05-27 08:58 gav: The intention was that
a
always has 3 elements, but the last element, and only the last, may potentially be None.
(edited)
# 2024-05-27 09:02 xlchen: so it is something like
((Sig, Idx), (Sig, Idx), Optional<(Sig, Idx)>)
?
# 2024-06-04 10:06 xlchen: Just to confirm, the Bandersnatch public key size is 32 bytes?
# 2024-06-04 10:17 syed: > <@xlchen:matrix.org> Just to confirm, the Bandersnatch public key size is 32 bytes?
Yes
# 2024-06-04 10:19 syed: We are moving to twisted Edwards form partially to save that one byte
# 2024-06-06 08:16 xlchen: it said there are 3 items in the judgement state but the formula have 4?
# 2024-06-06 08:17 xlchen: and
λ: The validator keys and metadata which were active in the prior epoch.
# 2024-06-06 09:10 gav: Yeah that's old - the stuff about psi\_k should have been removed.
(edited)
# 2024-06-11 20:49 purpletentacle: While this is a minor point, I think it is aligned with the intention of maintaining the formal rigor intended in the graypaper.
I feel there is some inconsistency in terminology concerning "integer" versus "natural numbers." Specifically, when adhering to Section 3.4 (numbers), I believe that the text in sections like 3.6 should use "natural" instead of "integer" to be consistent with the corresponding math expressions ($\\N$, etc.). This happens in various other sections as well, where talking about natural numbers or positive integers (unsigned?) would be more appropriate than just "integers".
Generally, the mathematical expressions are accurate, but the narrative often uses "integer" in a broader sense. I think this is due to how the term is used in programming (unsigned integers, etc.) rather than math.
This inconsistency becomes particularly evident in Appendix C.1.2, where the encoding is defined solely for natural numbers, and there is no mention at all of how negative values should be handled. Also, I don't see any real need for serializing negative integers. Probably this section could be restricted to only encoding natural numbers, and this should be enough.
If useful, I am glad to review and make a PR with suggestions addressing this along the text.
(edited)
# 2024-06-11 21:09 purpletentacle: ...
I have also a few of questions about C.2. .. (262) (263)
- why is H\_j not included when serializing the header? is that correct or an unintentional omission?
- why there is a claim that E\_U(H) has "no inverse"? wouldn't that imply that there is no way to decode a serialized header? (In C.1. "...We define the deserialization function E−1 = E−1 as the inverse of E and able to decode some sequence into the original value...."
(edited)
# 2024-06-11 21:21 purpletentacle: ....
5. Header
"... Excepting the Genesis header, all block headers H have an associated parent header, whose hash is H\_p..."
but expression (263) for header serialization does not indicate optionality (?H\_p). So should there be some exception or fixed value for the genesis case? For instance, H\_p for H^0 defined as some fixed pre-established value?
or instead, should the serialization of H^0 be a special case, with C.2 explicitly indicating that H\_p is not encoded for H^0?
(edited)
# 2024-06-11 21:40 purpletentacle: in the case of 38, the hash is done over the encoded Extrinsic... H(E(E)), which makes sense..
I believe that 36 is missing the encoding function E to be consistent.. so... H_p = H(E(P(H)))
otherwise, maybe it would be worth indicating somewhere that hashing implies encoding?
(edited)
# 2024-06-12 04:20 gav: - H\_j is not serialized but implied by E\_J.
- where exactly does it say E\_U(H) cannot have an inverse?
(edited)
# 2024-06-12 04:22 gav: Yeah. It’s not optional; genesis is a very special case and it makes no sense to complicate the protocol due to it. It will likely just be defined as zeroes. The genesis header and state are not yet defined in the protocol.
(edited)
# 2024-06-12 04:25 gav: > <@purpletentacle:matrix.org> in the case of 38, the hash is done over the encoded Extrinsic... H(E(E)), which makes sense..
> I believe that 36 is missing the encoding function E to be consistent.. so... H_p = H(E(P(H)))
> otherwise, maybe it would be worth indicating somewhere that hashing implies encoding?
I believe it is mentioned previously that when hashing, values are assumed to be encoded with the regular serialisation function if they are not explicitly so.
# 2024-06-12 04:27 gav: I think I’ve generally made it explicit but as it is unambiguous I gave the option for omission particularly to help improve readability of some longer formulae.
# 2024-06-16 12:53 ltfschoen: it says here
https://jam.web3.foundation/rules that "Clean-room implementation using the Graypaper and public implementor chat channels as the only resources.".
it is my understanding that the Graypaper chat channel is
https://matrix.to/#/#graypaper:polkadot.io
where are the "public implementer" chat channels?
and is there a reason why it doesn't include the Jam chat channel here
https://matrix.to/#/#jam:polkadot.io?
also, it says "Each team is only allowed to work on one implementation", but in the application form it says "What programming language(s) are you using and which language set are you applying for? e.g. "Rust, set B"", so i don't understand why it's asking us "What programming language(s) are you using". is it just wanting to know if we'll be using multiple languages in our single implementation (e.g. if we'll be doing an implementation in Rust and using FFI from Ruby and Python then we'd answer
Rust, Ruby, Python, set B
?
i was actually going to try and work on multiple implementations using multiple language sets (e.g. Rust, Swift, Ruby, Python, TS) in parallel as a contingency in case i got stuck and couldn't get support with one of them. after submission of our initial application form, will it be possible for us to later change the programming language(s) and language set that we'll be using for the one implementation that could be eligible for the JAM prize?
(edited)
# 2024-06-16 14:48 oliver.tale-yazdi: Definitions 284 and 291 both define function M_S, just with different arity. The definition section also mentions it twice:
# 2024-06-16 14:50 oliver.tale-yazdi: Is it supposed to be disambiguated by their arity, or is it a name clash?
# 2024-06-16 16:07 gav: They’re meant to be different functions. The name clash is an oversight.
# 2024-06-18 03:07 gav: > <@aedigix:matrix.org> Is there any concerns with endianness for the choice of VM? EVM is big-endian (not really sure the justification for this as I can't find it in the YP)
AFAIK there wasn't an especially good reason for it other than it made sense from a mathematician's perspective.
# 2024-06-18 03:08 gav: In any case modern architectures are either natively LE or have LE-compatibility modes, so it makes sense on the VM side to stick with LE.
# 2024-06-18 03:09 gav: As for serialization, it'll be SCALE, so LE also. This can already be seen via the definitions for the encode function.
(edited)
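For illustration, a fixed-width little-endian encoding in the SCALE style (the GP's own codec is defined by its encode function; this only demonstrates the LE byte order being referred to):

```python
def encode_u32_le(x: int) -> bytes:
    # Little-endian fixed-width encoding, as used for SCALE's u32.
    return x.to_bytes(4, "little")

# The least significant byte comes first on the wire.
assert encode_u32_le(1) == b"\x01\x00\x00\x00"
assert encode_u32_le(0x12345678) == b"\x78\x56\x34\x12"
```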
# 2024-06-18 03:15 mkalohood: missed deleting. Big endian and little endian involve two related concepts: 1. Host order: different CPUs process data with different byte orders. Intel x86 CPUs are little endian; MIPS-architecture CPUs are big endian.
2. Network order: the network transmits data as a byte stream, and the byte order used during transmission is called network order. It is independent of the CPU/OS of the specific devices, ensuring data can be parsed correctly when transferred between different devices; it defines the byte order as big endian.
So basically big endian is used before network data is sent.
But I don't think big endian is the deciding factor in VM selection. I think we can take it apart: the VM communication layer and data operation management can be separated.
# 2024-06-18 03:48 gav: > <@purpletentacle:matrix.org> ...
> I have also a few of questions about C.2. .. (262) (263)
>
> - why is H\_j not included when serializing the header? is that correct or an unintentional omission?
> - why there is a claim that E\_U(H) has "no inverse"? wouldn't that imply that there is no way to decode a serialized header? (In C.1. "...We define the deserialization function E−1 = E−1 as the inverse of E and able to decode some sequence into the original value...."
On the first point, H_j is not serialized - it's auxiliary data implied through E_J.
# 2024-06-18 03:50 gav: > <@gav:polkadot.io> - H\_j is not serialized but implied by E\_J.
> - where exactly does it say E\_U(H) cannot have an inverse?
On the second point "the latter has no inverse", it was merely meant to state that no function
D_U(Y) -> H
was defined explicitly. However, I removed it as it's clearly misleading.
# 2024-06-18 03:53 gav: > <@gav:polkadot.io> I believe it is mentioned previously that when hashing, values are assumed to be encoded with the regular serialisation function if they are not explicitly so.
I made this explicit now.
# 2024-06-19 20:15 purpletentacle: sorry to insist on this point.. but it is not 100% clear to me
(36) defines the header as including H_j
but later it is not included
(edited)
# 2024-06-19 20:17 purpletentacle: and when deserializing, it should be kept as ?H\_j = None (optional) if the corresponding extrinsic E\_J is not available?
(edited)
# 2024-06-19 20:18 purpletentacle: I find it a bit odd that H is defined as "containing" H\_j when it is actually always external to it
(edited)
# 2024-06-19 20:18 purpletentacle: Btw, I see the definition later at 10.3 but I am still a bit confused and unsure about this value being part of the header or not..
is it fair to assume that the seal applies to the encoded header, and as a consequence, it will never include H\_j ?
if that is the case, when is H\_j relevant or useful?
(edited)
# 2024-06-19 23:55 gav: Yes it’s not especially clear and I’ll clarify it for the next minor revision. As of 0.2.1, it doesn’t get serialised or deserialised but is just defined as an equivalence based on the value of E_J. It is used in the definition of Safrole, equation 56. However, in future versions of the spec this may change and it may be featured in the encoding of the header.
(edited)
# 2024-06-21 08:37 purpletentacle:
The judgements state includes three items, an allow-set ($\psi_\mathbf{a}$), a ban-set ($\psi_\mathbf{b}$) and a punish-set ($\psi_\mathbf{p}$). The allow-set contains the hashes of all work-reports which were disputed and judged to be accurate. The ban-set contains the hashes of all work-reports which were disputed and whose accuracy could not be confidently confirmed. The punish-set is a set of keys of Bandersnatch keys which were found to have guaranteed a report which was confidently found to be invalid.
\begin{equation}
\psi \equiv \tup{\psi_\mathbf{a}, \psi_\mathbf{b}, \psi_\mathbf{p}, \psi_\mathbf{k}}
\end{equation}
`\subsection{Extrinsic}
it looks like a misplaced backtick.. I would say it can be ignored
# 2024-06-21 09:53 gav: > <@qiwei:matrix.org> something missing here for 0.2.1
Yes there’s an error there, the latter component should have been removed.
# 2024-06-21 09:53 gav: It will be fixed in the next revision but feel free to place an issue.
# 2024-06-21 09:55 gav: > <@purpletentacle:matrix.org>
> The judgements state includes three items, an allow-set ($\psi_\mathbf{a}$), a ban-set ($\psi_\mathbf{b}$) and a punish-set ($\psi_\mathbf{p}$). The allow-set contains the hashes of all work-reports which were disputed and judged to be accurate. The ban-set contains the hashes of all work-reports which were disputed and whose accuracy could not be confidently confirmed. The punish-set is a set of keys of Bandersnatch keys which were found to have guaranteed a report which was confidently found to be invalid.
> \begin{equation}
> \psi \equiv \tup{\psi_\mathbf{a}, \psi_\mathbf{b}, \psi_\mathbf{p}, \psi_\mathbf{k}}
> \end{equation}
>
> `\subsection{Extrinsic}
>
>
>
>
> it looks like a misplaced backtick.. I would say it can be ignored
Again, will be fixed in next revision.
# 2024-06-23 13:40 sourabhniyogi: For Appendix B.6 (General), B.7 (Accumulate) and Appendix B.8 (Refine) functions:
1. For ω parameter inputs, I did not understand why the number of parameters on the left hand side did not equal that of the right hand side:
- lookup \[h\_o, b\_0, b\_z\] = ω\_{1..4} (3 vs 4) but then ω\_0 is used so ... ok
- read \[k\_o, k\_z, b\_o, b\_z\] = ω\_{1..5} (4 vs 5) but then w\_0 is used so .. ok
- write \[k\_o, k\_z, v\_o, v\_z\] = ω\_{0..4} (4 vs 5) but there is no w\_4 ... huh?
- new \[o,l,g\_l, g\_h, m\_l, m\_h\] = ω\_{0..6} (6 vs 7) ... huh?
...
- machine \[p\_o, p\_z, i\] = ω\_{0..3} (3 vs 4) ... huh?
- peek \[n,a,b,l\] = ω\_{0..4} (4 vs 5) ... huh?
... and so on
Then you use ω\_0' as a return parameter which has nothing to do with the input ω\_0, huh. ω isn't really doing that much for you it seems except to group the inputs, maybe just describe the inputs and output?
2. At least a line or two describing each function, specifically referencing these ω parameter inputs, would help a lot! Whereas PVM opcode semantics are quite commonplace and need little additional explication, these functions are the heart of JAM and the additional explication would increase speed of comprehension and reduce guesswork.
3. I believe most of the functions probably deserve at least a passing reference in the main body, or solid exposition around present (242), (246), (252). We can guess import+export+historical\_lookup but nothing references what peek, poke, machine, assign, delegate, quit (?), ... do yet.
4. invoke = 13 is a copy of solicit and surely deserves a different name to the invoke = 20, like invoke\_accumulate vs invoke\_refine.
5. Not clear what's going on with numbers:
- new's bump function 42, 9
- designate's 176
- invoke=20's 13 and 60
6. Not clear why all these 64-bit parameters (g, a, m) have to be split into 2 32-bit and joined back together?
7. A diagram for the whole DA system would be worth a thousand words. I imagine you have them in your JAM slides.
8. If you don't want to add more exposition because you want hyper compactness ok I get it but maybe using
https://en.wikibooks.org/wiki/LaTeX/Macros will allow implementers to disambiguate notation by just reading latex (?!) -- especially for any overloaded notation (E, W, s, C, c, t, ...) that is used to reference more than one concept, implementers can go look at the LaTeX source, where the macros would be unambiguous.
(Feel free to ignore most of the above, not very confident, just learning =). Will of course take any edits you make and follow up with deep look to check if I understand in our stubs)
(edited)
# 2024-06-24 01:55 gav: Regarding the use of omega, did you understand that the range is inclusive on the lower bound and exclusive on the upper?
# 2024-06-24 01:59 sourabhniyogi: Yes, for the cases that appear to be ranges. Not all your ω parameters are ranges though.
# 2024-06-24 02:23 sourabhniyogi: Got it, it's clear. [Normies will think X_{1..4} will have 4 elements. You obviously think it should have just 3]
# 2024-06-24 02:29 gav: I would draw your attention to Section 3 Notational Conventions:
# 2024-06-24 02:29 gav: A range may be denoted using an ellipsis, for example: \[0, 1, 2, 3\]\_{⋅⋅⋅2} = \[0, 1\] and \[0, 1, 2, 3\]\_{1⋅⋅⋅+2} = \[1, 2\]
(edited)
# 2024-06-24 02:30 gav: I believe it is not uncommon in computer programming to use (inclusive...exclusive) ranges. Rust, notably, does this.
(edited)
# 2024-06-24 02:31 gav: In any case, regardless of convention I have endeavoured to make my notation clear. But it is very important that anyone serious about interpreting the GP thoroughly read and understand Section 3.
(edited)
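Python's half-open slices happen to match this inclusive...exclusive convention exactly:

```python
# The GP's ellipsis ranges map directly onto half-open slicing.
xs = [0, 1, 2, 3]
assert xs[:2] == [0, 1]        # x_{⋅⋅⋅2}: first two elements
assert xs[1:1 + 2] == [1, 2]   # x_{1⋅⋅⋅+2}: two elements starting at index 1
```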
# 2024-06-24 02:03 sourabhniyogi: My point is that people would expect ω_0 and ω_0' to be related, of the same rough type -- you take pains in most other cases to have them be the same type. So, it's an expectation violation.
# 2024-06-24 02:09 gav: I see no such violation. Perhaps you were misinterpreting my intentions.
(edited)
# 2024-06-24 02:08 gav: I have intentionally avoided attempting to document the host functions in the GP. They will inevitably get documented eventually as a user-resource. But at present the host function definitions are explicit and unambiguous, the primary point of the GP's appendix. Defining them in English as well might lead to people who are less well able to read maths instead relying solely on the English description which will inevitably be more ambiguous and less well defined, increasing the speed of miscomprehension.
(edited)
# 2024-06-24 02:12 gav: > Not clear why all these 64-bit parameters (g, a, m) have to be split into 2 32-bit and joined back together?
How else do you expect to be able to represent a 64-bit value across 32-bit registers?
(edited)
# 2024-06-24 03:47 sourabhniyogi: I expected the caller to do this nitty-gritty mapping -- I see, you mean 100% of ω_{i...j} are 32-bit registers; now ω_0' vs ω_0 being unrelated types makes sense.
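A minimal sketch of the 32-bit register split being discussed (the helper names are made up; this just shows the low/high halving):

```python
MASK32 = 0xFFFF_FFFF

def split_u64(v: int) -> tuple:
    """Split a 64-bit value into (low, high) 32-bit register halves."""
    return v & MASK32, (v >> 32) & MASK32

def join_u64(lo: int, hi: int) -> int:
    """Recombine the two 32-bit register halves into the 64-bit value."""
    return (hi << 32) | lo

lo, hi = split_u64(0x0123_4567_89AB_CDEF)
assert (lo, hi) == (0x89AB_CDEF, 0x0123_4567)
assert join_u64(lo, hi) == 0x0123_4567_89AB_CDEF
```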
# 2024-06-24 02:13 gav: Macros have been used on occasion (see e.g. preamble.tex). I will likely increase the usage in time.
# 2024-06-24 02:27 gav: > invoke = 13 is a copy of solicit and surely deserves a different name to the invoke = 20, like invoke\_accumulate vs invoke\_refine.
Unintended.
invoke = 13
will be removed in the next revision. Thanks for reporting this.
(edited)
# 2024-06-26 13:27 danicuki: "Boolean values. B_s denotes the set of Boolean strings of length s, thus B_s = ⟦{⊥, ⊺}⟧_s. When dealing with Boolean values we may assume an implicit equivalence mapping to a bit whereby ⊺ = 1 and ⊥ = 0, thus B_□ = ⟦N_2⟧_□. We use the function bits(Y) ∈ B_□ to denote the sequence of bits, ordered with the least significant first, which represent the octet sequence Y, thus bits([5, 0]) = [1, 0, 1, 0, 0, ...]. "
I didn't understand why bits([5, 0]) and not simple bits(5)? What this 0 represent in the [5, 0] sequence?
(edited)
# 2024-06-26 13:49 dave: bits takes an octet (=byte) _sequence_, not just a single byte. 0 is simply the second byte in the sequence.
# 2024-06-26 15:48 danicuki: So bits([5]) would be [1,0,1,0,0,0,0,0] and bits([5,0]) would be [1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0]?
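A minimal transcription of the definition quoted above (LSB-first per octet), which confirms both of those expansions:

```python
def bits(octets: bytes) -> list:
    """LSB-first bit sequence of an octet sequence, per the GP's bits(Y)."""
    return [(byte >> i) & 1 for byte in octets for i in range(8)]

# 5 = 0b101, least significant bit first:
assert bits(bytes([5])) == [1, 0, 1, 0, 0, 0, 0, 0]
# A trailing zero byte contributes eight more zero bits:
assert bits(bytes([5, 0])) == [1, 0, 1, 0, 0, 0, 0, 0] + [0] * 8
```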
# 2024-06-27 01:26 sergei_astapov: > <@danicuki:matrix.org> what is C here?
> gav
It’s the beefy commitment set.
It should be mentioned in the definitions section.
It’s defined at the end of the accumulation definitions.
Section 14 iirc.
# 2024-06-27 19:56 sourabhniyogi: Potential issues in Appendix A:
1.
sub_imm
is missing in A.5.9 - not sure what the opcode is (
add_imm
is 2) but see:
https://github.com/koute/jamtestvectors/blob/master\_pvm\_initial/pvm/programs/inst\_sub\_imm.json
vs
https://github.com/koute/jamtestvectors/blob/master\_pvm\_initial/pvm/programs/inst\_add\_imm.json
2.
cmov_imm_iz
appears here as Opcode 85 but is missing in Appendix A
https://github.com/koute/jamtestvectors/blob/master\_pvm\_initial/pvm/programs/inst\_cmov\_if\_zero\_imm\_ok.json
3. the family of "branch" opcodes { 24, 30, 47, 48, 41, 43 } are repeated in A.5.9 and A.5.10 -- is this intended?
4. The "above condition" referenced in "In the case that the above condition is not met, then the instruction is considered invalid, and it results in a panic" after (224-227) is not clear to me.
5.
mov_reg
(op code 82) and
sbrk
(op code 87) seem to need B and D swapped in some way, either in 223 or in the Mutations, whatever works \[but not sure what
sbrk
does though\].
6. You have to finish the sentence around (213) "This allows for compact representation of both positive and negative encoded values, important as ."
7. Based on
https://github.com/koute/jamtestvectors/blob/master\_pvm\_initial/pvm/programs/inst\_add.json (and many "ALU" operations I think you have a mix up of A, B, D (the
inst_add.json
has 1+2 => 3 r\_A = 7, r\_D = 9, r\_B = 8 where w\[r\_A\] = 1, w\[r\_B\] = 2, w\[r\_D\] = 3). Not sure if its the test vectors that are incorrect or the GP here.
8. The mutations of
cmov_iz
(opcode 85) and
cmov_nz
(opcode 84) appear to be swapped in A.5.11.
(edited)
# 2024-06-28 07:01 gav: 2. Will be fixed.
3. Will be fixed.
4. This refers to the implied decode of the immediate value which could conceivably fail. If the condition cannot fail, then the statement is redundant and you need not pay it any attention.
5. Will be fixed.
6. Will be sorted in the next revision.
7. Not sure what the issue is here. The example you gave seems in line with the GP.
8. Will be fixed.
(edited)
# 2024-06-28 07:24 sourabhniyogi: Ok, for 7, the test vector for
add
is computing 1+2 = 3 with inputs w_7 (1) + w_8 (2) going into w_9 (3) as is easily seen here:
https://github.com/koute/jamtestvectors/blob/master_pvm_initial/pvm/programs/inst_add.json#L28
but looking closely at the test vector code:
(a) decimal 121 is hexadecimal 0x79 (so w_7 are the high order 4 bits INPUT, while the
OUTPUT w_9 is the low order 4 bits)
(b) the byte following (a) is "8" which is the other INPUT
Basically r_A+r_D are together in one byte with r_B following.
In contrast, the GP in (227) has both r_A+r_B put together in one byte and with a byte holding r_D following that byte.
The fix, I believe, is to adjust (227) subscripts from
_A, _B, _D
to
_D, _A, _B
# 2024-06-28 07:34 gav: I see - for this the tests will be altered (GP stays as is)
(edited)
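Per the resolution above (the GP's (227) layout stands: r_A and r_B share one byte, with r_D in the following byte), a hedged decoding sketch; which nibble holds which register is an assumption here, so consult (227) for the authoritative assignment:

```python
def decode_regs(operands: bytes) -> tuple:
    """Sketch of three-register operand decoding per the (227) layout:
    r_A and r_B packed one nibble each into the first byte (nibble order
    assumed, not confirmed by the chat), r_D in the second byte."""
    r_a = operands[0] & 0x0F          # assumed: low nibble
    r_b = (operands[0] >> 4) & 0x0F   # assumed: high nibble
    r_d = operands[1]
    return r_a, r_b, r_d

# With r_A = 7 and r_B = 8 packed into 0x87, and r_D = 9 following:
assert decode_regs(bytes([0x87, 0x09])) == (7, 8, 9)
```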
# 2024-06-30 19:55 purpletentacle: This is more a question than a specific correction.. but I feel there are some information gaps that I am not able to fill, and probably others may find themselves in the same situation. Hopefully it is not purely because of my lack of knowledge.
I am confused by how the Bandersnatch ring root function O is defined in 3.0 and/or Appendix G.
This is required to complete the update of the key root in safrole.
...
\where z &= \mathcal{O}([k_b \mid k \orderedin \gamma'_\mathbf{k}]) \\
...
The appendix defines:
O(⟦H_B⟧) ≡ PCS_commitment(⟦H_B⟧)
GP points to
https://github.com/davxy/bandersnatch-vrfs-spec/blob/main/specification.pdf (which seems to be a work in progress; a few TODOs, etc.). The document briefly defines
prove
and
verify
. Proving requires a secret key, so it could not be applicable to the key root update. Verify does not seem to be applicable either.
- I cannot link with confidence what is PCS_Commitment in Galassi's document.
- It is allowed to use external code (FFI) for bandersnatch VRF.
Would it be possible to provide more information about this step in Safrole, expand the appendix about Bandersnatch VRF a bit more, or provide some reference on how the test vectors used the reference crate that was used for this purpose?
# 2024-07-01 00:51 sourabhniyogi: davxy: gav Some freshman questions on Safrole:
1. For your Safrole test vectors ("tiny"), what are the values of
$jam\_entropy (was BYTES( "sassafras\_randomness"))
$jam\_ticket\_seal (was BYTES("sassafras\_ticket\_seal"))
$jam\_fallback\_seal (was BYTES("sassafras\_fallback\_seal"))
and what term in the GP does "entropy" (documented as "Per block entropy (originated from block entropy source VRF)") refer to, from the test vector
https://github.com/w3f/jamtestvectors/blob/master/safrole/publish-tickets-no-mark-6.json#L4
?
2. I see "attempt" and "attempts\_number" were renamed to "entry index" and "N" in the GP, but what happened to the "ticket threshold" and "redundancy\_factor" - are those \[what I thought were key\] concepts gone (in which case, how?) or still to be documented?
3. Since a high level goal of JAM Implementation is to get NON-Rust implementations for Safrole almost everyone will surely use FFI into your recommended crypto package. Can you ( @davxy) provide a single working test case like say this one
https://github.com/w3f/jamtestvectors/blob/master/safrole/publish-tickets-no-mark-6.json
(or the scale equivalent) to set up the ring verifier, get the ring vrf output, and get the entropy buffer updated? With a single well-worked-out test case in Rust handling the "Bare VRF" and "Ring VRF" (specifically making the specific flavors exceptionally clear to "I'm not a cryptographer, I just use cryptography" engineers), I'll bet everyone can use that to set up their
extern "C"
type FFI and pass most of the cases in rapid order. Is this possible?
4. For the "tickets\_verifier\_key" eg
https://github.com/w3f/jamtestvectors/blob/master/safrole/publish-tickets-no-mark-6.json#L210
they are 384 bytes and it is labeled as "gamma\_z: The Bandersnatch ring root." in
safrole.asn
. However, in the GP it is documented as being element Y\_R in (47), yet I.1.2 says it is 144 bytes. Can you explain what this discrepancy could be due to?
(edited)
# 2024-07-01 09:01 davxy: sourabhniyogi:
1. For Safrole test vectors the only thing we use is $jam\_ticket\_seal.
In particular this is used for the ring-vrf input construction (context in the GP) to obtain the ticket score (aka ticket-id) during the candidate ticket verification procedure.
The value of $jam\_ticket\_seal is constant, defined as the "jam\_ticket\_seal" ASCII string.
This applies in general to values starting with $ (e.g. $foo in the GP => the "foo" ASCII string).
(Thank you BTW as I've just spotted that I was using "jam\_seal" instead of "jam\_ticket\_seal", I'll add the fix to the upcoming vectors PR)
The other constant strings ($jam\_fallback\_seal and $jam\_entropy) are used for block verification and per-block entropy production (which is passed as input to Safrole).
NOTE: The actual value of per-block entropy used by the Safrole test vectors need not have been really produced using the signature in the header. Here we abstract away from the value's origin; we don't really care for the sake of the Safrole test vectors and the Safrole STF. In this specific case I've used:
- entropy\_0 = blake2b(\[42\_u8; 32\])\[..32\]
- entropy\_i+1 = blake2b(entropy\_i)\[..32\]
2. If you take the Sassafras RFC as a reference then there are some differences.
One of these is the attempts number and redundancy factor, which in the GP are in practice simplified to a single thing (the attempts number).
Threshold is gone.
Even though reading Sassafras RFC can help (as it is a quite similar protocol), always take the GP as the source of truth for JAM.
3. Sure thing. I'll post here as soon as it is ready.
4. This is a very interesting observation. Current implementation serializes 3 extra fields (part of the SNARK SRS).
Serialization of these fields may be important **in a general application**, but here these values are constant.
I will definitely get rid of these from serialized data (I'm on it). The final size will be 144 (i.e. the last 144 bytes of what you see right now)
(edited)
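davxy's entropy chain above is easy to reproduce. A minimal Python sketch, assuming (per the later correction in this thread) blake2b-256 rather than blake2b-512 truncated to 32 bytes; the helper names are ours, not from the GP:

```python
import hashlib

def blake2b_256(data: bytes) -> bytes:
    # blake2b with a native 32-byte digest (not a truncated 64-byte one).
    return hashlib.blake2b(data, digest_size=32).digest()

def entropy_chain(n: int) -> list[bytes]:
    # entropy_0 = blake2b([42u8; 32]); entropy_{i+1} = blake2b(entropy_i)
    e = blake2b_256(bytes([42] * 32))
    chain = [e]
    for _ in range(n - 1):
        e = blake2b_256(e)
        chain.append(e)
    return chain
```

As the thread notes, the values' origin is irrelevant for the Safrole STF vectors; this just reproduces the chaining rule.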
# 2024-07-01 09:16 oliver.tale-yazdi: > <@davxy:matrix.org> sourabhniyogi:
> [...]
> - entropy\_0 = blake2b(\[42\_u8; 32\])\[..32\]
> - entropy\_i+1 = blake2b(entropy\_i)\[..32\]
> [...]
I think the [..32] after the blake2b is not correct, see https://github.com/w3f/jamtestvectors/issues/6
# 2024-07-01 09:18 davxy: I've just seen your issue. Yeah truncated Blake2b-512 has been used for the test vectors. Need to change to blake2b 256. Thank you
(edited)
# 2024-07-01 11:06 purpletentacle: > Blake2b-512 has been used for the test vectors. Need to change to blake2b 256.
> "jam_seal" instead of "jam_ticket_seal", I'll add the fix to the upcoming vectors PR)
Should we then assume that the current test vectors are incorrect and wait for the next release?
As @sourabhniyogi explained, having access to something like 3) as primitives could be very useful, because it can also help to detect these subtle issues in the test vectors themselves
Thank you again for the great responsiveness
# 2024-07-01 11:56 sourabhniyogi: > <@davxy:matrix.org> sourabhniyogi:
> [...]
Thank you!
# 2024-07-01 16:41 davxy: Alright, we're getting there.
I've updated the PR with all the fixes we've discussed so far. Check the description for the changes.
(If you find / suspect something else, please tell.)
In order to reduce the ring keys commitment to 144 bytes (as per the GP) I had to patch an upstream dependency, so I'll temporarily point ark-ec-vrfs to the patched ring-proof for the moment (not pushed yet, I'll notify here when ready).
Juan Leni | zondax.ch: Yeah. Unfortunately there was the blake2b issue that invalidated the tickets data, plus some pseudo-randomly constructed things that depended on the previously used hash.
Tomorrow you'll have a clear example of how to use the ring VRF (maybe I'll put it in the bandersnatch-specs repo)
(edited)
# 2024-07-02 21:52 davxy: It is quite easy. The signature you see in the extrinsic is a ring signature.
You can deserialize it into the RingSignature struct you see in the example (the one that contains the output + ring proof).
Once deserialized you can:
- validate the proof by constructing the Verifier as the example shows
- use the struct's output entry to actually generate the VRF output hash
Does this help?
To be more concrete, I'll add an example as you suggested
(edited)
# 2024-07-02 22:30 sourabhniyogi: > <@davxy:matrix.org> It is quite easy. The signature you see in the extrinsic is a ring signature.
> [...]
> To be more concrete, I'll add an example as you suggested
Yes that final example should do it, wonderful
# 2024-07-03 16:36 purpletentacle: is 20-24 secs the typical time for ring\_prove\_verify? Running on a MacBook with an M3... wow.
(edited)
# 2024-07-03 17:34 davxy: Not really. That's a bit too much :-)
❯ cargo run --release
Compiling ark-ec-vrfs-bandersnatch-example v0.1.0 (/mnt/ssd/develop/bandersnatch-vrfs-spec/example)
Finished `release` profile [optimized] target(s) in 1.48s
Running `target/release/ark-ec-vrfs-bandersnatch-example`
* Time taken by ring-vrf-sign: 629.248169ms
Ring signature verified
vrf-output-hash: 6b260bfda2e3ef118c529f30b60dfa4678fbeef3682b55ba002aa8633f1b0364
* Time taken by ring-vrf-verify: 6.105183ms
* Time taken by ietf-vrf-sign: 394.206µs
Ietf signature verified
vrf-output-hash: 6b260bfda2e3ef118c529f30b60dfa4678fbeef3682b55ba002aa8633f1b0364
* Time taken by ietf-vrf-verify: 599.059µs
I have a beefy threadripper 3970X, but I don't expect 24s
See here latest benchmarks:
https://github.com/davxy/crypto-benches/blob/main/vrf/README.md#verify
# 2024-07-03 17:38 davxy: Proving is expected to be slow (well, not that slow). But proving is done once per epoch, offchain, and by sufficiently powerful candidate validators
# 2024-07-03 17:46 davxy: Verification is more critical. Number of tickets per block is limited to 16 (by the GP) and multiple tickets verification can be done in parallel.
# 2024-07-03 17:50 davxy: Oh. I also forgot to add the "parallel" feature to ark-ec-vrfs.
These are the timings (as you can see, proving is a lot faster):
❯ cargo run --release
* Time taken by ring-vrf-sign: 156.818104ms
Ring signature verified
vrf-output-hash: 6b260bfda2e3ef118c529f30b60dfa4678fbeef3682b55ba002aa8633f1b0364
* Time taken by ring-vrf-verify: 6.4541ms
* Time taken by ietf-vrf-sign: 412.089µs
Ietf signature verified
vrf-output-hash: 6b260bfda2e3ef118c529f30b60dfa4678fbeef3682b55ba002aa8633f1b0364
* Time taken by ietf-vrf-verify: 613.725µs
(edited)
# 2024-07-23 00:46 sourabhniyogi: The key insight, we found, is from RFC 26 \[which we're not supposed to read and get too confused by because Safrole != Sassafras, but is an excellent introduction nevertheless \] in these:
(a)
ietf_vrf_output (Non anonymous VRF): vrf_output(secret, input) == vrf_signed_output(signature)
(b)
ring_vrf_output (Anonymous VRF): vrf_signed_output(signature) == ring_vrf_signed_output(ring_signature);
The (a) case has the block author reveal himself through the signature in block authoring; you can consider that a "non-anonymous VRF".
The (b) case does NOT reveal the ticket submitter; you can consider that an "anonymous VRF", which is the beauty of ring VRFs, an important invention.
The key AHA I think for people is you will put (a)+(b) together with the same Bandersnatch key like so:
vrf\_output(secret, input) == vrf\_signed\_output(signature) == ring\_vrf\_signed\_output(ring\_signature)
and this basically maps onto this line here from davxy :
https://github.com/davxy/bandersnatch-vrfs-spec/blob/1ec75e9a3af3a2be7dbca5090171c01e29ac5854/example/src/main.rs#L268C33-L268C48
Once you understand that, all the GP notation will fall into place for you, I hope. The ticket submitter uses (b) _anonymously_, and once it's time to author blocks (assuming the non-fallback case, and that his ticket submission is among the lowest), he uses (a) _non-anonymously_. The ticket id is the VRF output common to both (a) and (b).
(edited)
# 2024-07-03 00:08 sourabhniyogi: Perfect, thank you! I think most non-cryptographer non-Rust engineers can jump into this and execute now. [We will proceed to try to pass our first few tiny tests with a FFI, hope others will do the same]
(edited)
# 2024-07-04 10:28 davxy: Test vectors PR updated to make use of a not "random" SRS (the stuff used to construct the RingContext).
Check the README for some pointers and update ark-ec-vrfs for new constructor.
# 2024-07-04 12:34 sourabhniyogi: Question: In Section 15, Guaranteeing: "With two guarantor signatures, the work-report may be distributed to the forthcoming Jam chain block author in order to be used in the E_G, which leads to a reward for the guarantors." But how could a work-report be distributed to a _forthcoming_ block author if, via Safrole, block authors are only knowable when they reveal themselves by actually authoring a block?
# 2024-07-04 12:52 gav: By distributing to all possible block authors (i.e. all validators).
(edited)
# 2024-07-04 12:59 purpletentacle: I know networking is still not fully defined, but I have a quick follow-up question on that: the idea in JAM is that every node will connect to every other node and there will be no gossip at all, right?
(edited)
# 2024-07-04 16:52 gav: > <@purpletentacle:matrix.org> I know networking is still not fully defined.. but I have a quick follow up question on that.. the idea in jam is that every node will connect to every other node and there will be no gossip at all, right?
Yes indeed.
# 2024-07-04 16:55 gav: There will likely be two revisions of the protocol - one initial JAMSNP focusing on simplicity and speed of implementation and a second JAMNP focussed on optimisation and security.
(edited)
# 2024-07-04 22:59 sourabhniyogi: Thank you for this sketch, it's very helpful! Is it reasonable to have a "tiny" setup (say 6 validators; a single team, _maybe_ with one other team; enabling a Milestone 1 "solo" networking PoC that is easy to simulate and then to run with real QUIC) and a "medium" setup (say 16-32 validators; enabling a Milestone 2 multi-team networking PoC with 4-5 teams; possible for a single team to simulate and run for real with basic resources) WELL before the 1023-validator setup (which few teams will have the resources for; not sure it's even possible to simulate)? I am wondering whether a "lite Appendix H" erasure coding (not in the GP, but suitable for Milestones 1+2) would be worth it to support the tiny and medium cases for simplicity / speed of implementation, and whether it could have significant utility over the next 6-9 months?
(edited)
# 2024-07-05 00:24 xlchen: those constants should be configurable and in theory all we need to do is config the node with corresponding values
# 2024-07-05 00:25 xlchen: (I made the mistake of hardcoding the constants and later needed to refactor the code to make them configurable)
# 2024-07-05 02:41 sourabhniyogi: Absolutely! I mean to ask if before 341\*3=1023 (JAM Toaster sized) we can have
- tiny C=2 x 3 = 6 validators
- xsmall C = 4 x 3 = 12 validators
- small C = 6 x 3 = 18 validators
- medium C = 10 x 3 = 30 validators
- large C = 30 x 3 = 90 validators
- xlarge C = 50 x 3 =150 validators
- xxlarge C = 100 x 3 = 300 validators
- The Toaster C = 341 x 3 = 1023 validators
where we can just have different RS codes in the various JAM configurations to match. I believe we only need 3 configurations (one for the first 3 milestones), but I think the RS code of 342:1026 is "impedance matched" to C=341+1, so probably some values of C are more natural than others for smaller C?
Is the concept of a "bootnode" gone? Where does the 18 byte per validator directory live?
(edited)
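The scaling ladder above, together with xlchen's advice to keep the constants configurable, might be sketched as a small preset table. The names and the validators = 3 x cores relation come from the list above; nothing here is GP-specified except the full 1023-validator size:

```python
# Illustrative network-size presets mirroring the proposal above.
# Only "toaster" (C = 341, 1023 validators) matches the full GP spec.
PRESETS = {
    "tiny": 2, "xsmall": 4, "small": 6, "medium": 10,
    "large": 30, "xlarge": 50, "xxlarge": 100, "toaster": 341,
}

def validator_count(cores: int) -> int:
    # Each core is backed by 3 validators.
    return 3 * cores
```

Keeping such values in a config (rather than hardcoded constants) avoids the refactor xlchen describes.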
# 2024-07-05 02:44 xlchen: but I guess we still need some bootnodes; it is just that technically the network doesn't need them to stay alive, so they don't need to be specified?
(edited)
# 2024-07-05 02:45 xlchen: we can still have some implementation-specific genesis config that includes some bootnodes, and make it such that all implementations support that format. It is just that this doesn't need to be defined in the GP
# 2024-07-05 04:45 tomusdrw: > <@gav:polkadot.io> There will likely be two revisions of the protocol - one initial JAMSNP focusing on simplicity and speed of implementation and a second JAMNP focussed on optimisation and security.
What's the plan for in-browser light clients? It would be good to have something WebSocket-based (or maybe even REST 🤪) since nothing fancier can be initiated from the browser context AFAIR.
Alternatively I can imagine a websocket<>jam gateway, but if it's not part of the major client implementations we risk heavy centralization and/or a monoculture of the implementations one can connect to.
# 2024-07-05 11:21 gav: bootnodes are always a painpoint. JAM won't address them directly, but we might use something like IPFS or WebTorrent to distribute them in a resilient way.
(edited)
# 2024-07-05 11:24 gav: yes indeed - this should be figured out and standardised fairly early on, but it's not crucial for the (validator) protocol per se.
(edited)
# 2024-07-06 13:55 shwchg: Hi everyone! In Appendix H of the Gray Paper, we are trying to understand what GF(16) means. The latest version mentions that 16-bit GF points are selected, so we believe that GF(16) here refers to GF(2^16) -- do we understand this correctly?
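For intuition on what GF(2^16) arithmetic looks like: elements are 16-bit values, addition is XOR, and multiplication is carry-less polynomial multiplication reduced modulo a degree-16 polynomial. A toy Python sketch; note the reduction polynomial below is an arbitrary illustrative choice, not the Cantor-basis construction of Lin, Chung, and Han that the GP's Appendix H builds on:

```python
POLY = 0x1100B  # x^16 + x^12 + x^3 + x + 1; illustrative choice only

def gf16_add(a: int, b: int) -> int:
    # Addition in GF(2^16) is bitwise XOR.
    return a ^ b

def gf16_mul(a: int, b: int) -> int:
    # Carry-less (polynomial) multiplication, reducing modulo POLY
    # whenever the running product exceeds degree 15.
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10000:
            a ^= POLY
    return r & 0xFFFF
```

Real implementations use precomputed log/exp tables or SIMD carry-less multiply instead of this bit loop.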
# 2024-07-07 10:53 oliver.tale-yazdi: I don't understand how the selection of fallback keys works here. It looks like k is being indexed with a random index (I assume u32). But k only has 600 elements, so it will try to access invalid indices? Is there a hidden modulo somewhere?
(edited)
# 2024-07-08 05:36 bhcme: Hi everyone! 👋
I wonder why the rate of 342:1026 is used for erasure coding, as stated in the opening sentence of Appendix H, Erasure Coding: “The foundation of the data-availability and distribution system of Jam is a systematic Reed-Solomon erasure coding function in gf(16) of rate 342:1026, as defined by Lin, Chung, and Han 2014.”
In particular, the precondition for the efficient implementation of the Reed-Solomon encoding and erasure decoding algorithm mentioned in the above paper only applies for (n = 2^r, k)
# 2024-07-08 06:41 sourabhniyogi: Can we get PVM Host function test cases soon? Here is my wish:
https://github.com/w3f/jamtestvectors/pull/3#issuecomment-2212706605
Some candidate typo fixes for Appendix B: \[for GP v0.2.3\]
1. For info, it has b_o instead of o on the 4th mutation line
2. For designate, does the 176 (144+32 I gather?) need to be 336 to match (51)-(55)? If not, what is the content of the 176 bytes per validator?
3. For peek, it seems to need l instead of i on the 2nd and 3rd mutation lines
4. Similarly for poke, it seems to need l instead of i on the 3rd and 4th mutation lines
(edited)
# 2024-07-08 08:37 gav: Appendix A and B are rather different things so it makes little sense to comment on that PR.
(edited)
# 2024-07-08 08:44 gav: > <@sourabhniyogi:matrix.org> Can we get PVM Host function test cases soon? Here is my wish:
> [...]
Test vectors will arrive when they’re ready and in the order that they’re ready. I’d caution you not to be pushy about it. Doing so will not aid their swift arrival.
# 2024-07-12 14:34 danicuki: here is the source code for it (generate the image at https://yuml.me/)
// greek letters
// -----------------------
[τ'] -> [H]
[β'] -> [H]
[β'] -> [EG]
[β'] -> [C]
[γ'] -> [H]
[γ'] -> [τ]
[γ'] -> [ET]
[γ'] -> [γ]
[γ'] -> [ι]
[γ'] -> [η']
[γ'] -> [κ']
[η'] -> [H]
[η'] -> [η]
[η'] -> [τ]
[κ'] -> [H]
[κ'] -> [τ]
[κ'] -> [κ]
[κ'] -> [γ]
[κ'] -> [ψ']
[λ'] -> [H]
[λ'] -> [τ]
[λ'] -> [λ]
[λ'] -> [κ]
[ψ'] -> [EJ]
[ψ'] -> [ψ]
[δ†] -> [EP]
[δ†] -> [δ]
[δ†] -> [τ']
[ρ†] -> [EJ]
[ρ†] -> [ρ]
[ρ‡] -> [EA]
[ρ‡] -> [ρ†]
[ρ'] -> [EG]
[ρ'] -> [ρ‡]
[ρ'] -> [κ]
[ρ'] -> [τ']
[δ' χ' ι'] -> [EA]
[δ' χ' ι'] -> [ρ']
[δ' χ' ι'] -> [δ†]
[δ' χ' ι'] -> [χ]
[δ' χ' ι'] -> [ι]
[δ' χ' ι'] -> [φ]
[φ'] -> [EA]
[φ'] -> [ρ']
[φ'] -> [δ†]
[φ'] -> [χ]
[φ'] -> [ι]
[φ'] -> [φ]
[C] -> [EA]
[C] -> [ρ']
[C] -> [ρ†]
[C] -> [χ]
[C] -> [ι]
[C] -> [φ]
[α'] -> [EG]
[α'] -> [φ']
[α'] -> [α]
[π'] -> [EG]
[π'] -> [EP]
[π'] -> [EA]
[π'] -> [ET]
[π'] -> [τ']
[π'] -> [τ]
[π'] -> [π]
(edited)
# 2024-07-12 14:33 danicuki: I have created a diagram mapping all state transition components dependencies. I hope it is useful.
# 2024-07-13 19:19 celadari: Hello everyone,
I have some questions, I don't know if I should post them here or on the Jam Chat channel but 🤷♂️ here we go :
### Appendix G:
1. **Signature Set and Key Generation:**
- To confirm, is ( F\_{m,k} ) the set of signatures using the IETF VRF (as per Goldberg 2023) where keys are generated from the Bandersnatch curve?
2. **Size of Y:**
- Why is ( Y ) of size 96 bytes? We perform decode(x:32) and decode(x32:). Does this mean that the first 32 bytes are the public key and the rest is additional data?
3. **Merkle Root and Public Keys:**
- To confirm, is ( O(\[H\_B\]) ) the Merkle root of the Merkle tree where the leaves are the authors' public keys?
4. **Use of RingVRF:**
- To confirm, do we use the ring VRF from the Jeffrey 2023 paper, with the first ring VRF construction (part 4 of the paper) and with \( \text{Com}^*.\text{Commit}(\text{ring}) = O([H_B]) \)
5. **Size of Y (second instance):**
- Why is ( Y ) of size 784 bytes? We perform decode(x:32) and decode(x32:) again. Does this mean that the first 32 bytes are the public key and the rest is additional data?
(edited)
# 2024-07-13 19:20 celadari: ### Appendix I.4.5:
1. **$jam_fallback_seal:**
- For $jam_fallback_seal, do we create an output with the Bandersnatch VRF as mentioned in the questions above? If yes, what arguments, context, and additional data are required?
2. **$jam_ticket_seal:**
- For $jam_ticket_seal, do we use the ring VRF? If yes, what arguments, context, and additional data are required?
3. **$jam_entropy:**
- For $jam_entropy, related to on-chain entropy generation, what does it mean, and how do we compute it?
### Equation (59):
1. **Understanding Components:**
- What are \( i_y \), \( i_r \), \( H_a \), \( H \), and \( E_U(H) \)?
### Equation (60):
1. **Defining \( H_a \):**
- Do we define \( H_a \) with the \( i \) from the equation "let \( i = \gamma_s'[H_t]^{\circlearrow}"?
2. **Understanding \( H \) and \( E_U(H) \):**
- What is \( H \) and \( E_U(H) \)?
# 2024-07-13 19:20 celadari: Let me know if I should move them to other channel or if you need more precisions
Thank you in advance
# 2024-07-14 06:52 gav: > <@celadari:matrix.org> Hello everyone,
> [...]
Your interpretation of the use of the colon in subscripting is correct.
# 2024-07-14 06:55 gav: > <@celadari:matrix.org> ### Appendix I.4.5:
> [...]
Appendix I is the index. Maybe ask your questions in terms of the actual protocol specification?
# 2024-07-14 07:00 gav: Re 59, you can find all the information in the text around it. E.g. i is determined from subscripting gamma’_s, a member of the sequence of tickets, telling you that it’s a ticket and thus that i_y is a VRF output, used as a ticket identifier (see (50) and text around it).
# 2024-07-14 07:00 gav: Don’t expect to be spoon fed this stuff. It’s not playschool. You need to read carefully.
# 2024-07-14 07:05 celadari: > <@gav:polkadot.io> Appendix I is the index. Maybe ask your questions in terms of the actual protocol specification?
I'm not sure I understand what you mean by "Appendix I is the index" 🤔
At least for the term $jam_entropy => can you give a pointer to this term please?
# 2024-07-14 07:13 gav: I’m not sure why you think it reasonable to ask for someone else to do a PDF text search for you like they’ve nothing better to be doing Sunday morning.. anyway (62) X_E = $jam_entropy
(edited)
# 2024-07-14 07:14 gav: As i say do not expect to be spoon fed. Any more such questions may elicit little response.
# 2024-07-15 10:50 xlchen: just to confirm, a validator key can be selected multiple times right
# 2024-07-15 13:40 dakkk: Reading the integer encoding in Appendix C, I'm unable to understand how x=0 is handled by the general natural serialization rule 274; if x is 0, I can't apply the first if, since (2**0 == 1) > 0 (so no valid l exists), but applying the second if doesn't seem correct either.
Am I missing something, or is the case x=0 just not covered?
(edited)
# 2024-07-15 14:01 celadari: > <@dakkk:matrix.org> sent an image.
then in this case, 0 < 2**64 no ?
# 2024-07-15 14:04 dakkk: yes, but following this path, 0 is encoded as \xff\x00\x00\x00\x00\x00\x00\x00\x00; that doesn't seem right to me
# 2024-07-15 14:10 celadari: what happens when you generate several elements from 1 to 2**7?
# 2024-07-15 14:13 gav: > <@dakkk:matrix.org> yes, but following this path, 0 is encoded as \xff\x00\x00\x00\x00\x00\x00\x00\x00; it doesn't seem right to me
yeah, zero is a special case and should be in the first branch
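The discussion above can be made concrete. Below is a sketch of the general natural-number codec as we read Appendix C (our reading, not normative), with x = 0 falling into the first branch as gav indicates:

```python
def encode_natural(x: int) -> bytes:
    # Variable-length encoding of a natural x < 2^64, per our reading of
    # GP Appendix C: for x in [2^(7l), 2^(7(l+1))) the prefix byte is
    # 256 - 2^(8-l) + floor(x / 2^(8l)), followed by l little-endian
    # bytes of x mod 2^(8l); x >= 2^56 gets a 0xFF prefix plus 8 bytes.
    # x = 0 lands naturally in the l = 0 (single-byte) branch.
    assert 0 <= x < 2**64
    for l in range(8):
        if x < 2**(7 * (l + 1)):
            prefix = 256 - 2**(8 - l) + (x >> (8 * l))
            return bytes([prefix]) + (x & (2**(8 * l) - 1)).to_bytes(l, "little")
    return bytes([255]) + x.to_bytes(8, "little")
```

With zero in the first branch, 0 encodes as the single byte 0x00 rather than the nine-byte \xff... sequence dakkk was getting.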
# 2024-07-16 18:07 sourabhniyogi: Right sorry -- 684 is still not a multiple of 64 or power of 2.
# 2024-07-17 13:21 gav: No, it's 342 x 2 bytes.
It's therefore the minimum reconstructible datum.
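The arithmetic here can be spelled out as a back-of-envelope sketch; the helper name is ours, while the 342:1026 rate and the 2-byte GF(2^16) points come from Appendix H:

```python
K, N = 342, 1026   # rate 342:1026: any 342 of the 1026 shards reconstruct
POINT_BYTES = 2    # one GF(2^16) point is 2 bytes

def min_reconstructible_datum() -> int:
    # One point per original shard: 342 * 2 = 684 bytes. This is why 684
    # need not be a multiple of 64 or a power of two.
    return K * POINT_BYTES
```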
# 2024-07-18 03:38 sourabhniyogi: Right, we understand that is what the GP says but that's not what cheme's PR test vectors are doing. We have our encoder + decoder FFI using the Rust libraries (and "passing" for arbitrary blob sizes), but the process by which cheme's erasure coding test vectors are generated are clearly on a different page than the GP's Appendix H.
(edited)
# 2024-07-18 07:13 gav: > <@sourabhniyogi:matrix.org> Right, we understand that is what the GP says but that's not what cheme's PR test vectors are doing. We have our encoder + decoder FFI using the Rust libraries (and "passing" for arbitrary blob sizes), but the process by which cheme's erasure coding test vectors are generated are clearly on a different page than the GP's Appendix H.
Ok. Is it possible that “Parity implementation (serialized)” is just 6 of the graypaper ECs concatenated?
# 2024-07-16 09:22 xlchen: I don't get this. What does "the lowest items of the sorted union" mean?
# 2024-07-16 09:23 xlchen: is the tickets accumulator a sorted array by the ticket id?
# 2024-07-16 09:34 xlchen: ok, after reading the Sassafras RFC I think I get it. Please confirm: this is basically a PoW-ish mining process; validators try to mine tickets with the lowest VRF outputs, they have a limited submission allowance, and only EPOCH_LENGTH tickets will be accepted
# 2024-07-16 09:35 gav: > <@xlchen:matrix.org> ok after read the sassafras RFC, I think I get it. [...]
That's about right.
# 2024-07-16 09:36 gav: "union" => combining two sets. "sorted" => sorting that. "lowest items" => take only the first N items.
# 2024-07-16 09:38 xlchen: I see. So the ticket extrinsic shouldn't contain invalid tickets (when the pool is full and the VRF output is high), otherwise the block becomes invalid?
(edited)
# 2024-07-16 09:40 oliver.tale-yazdi: > <@xlchen:matrix.org> I see. So the ticket extrinsic shouldn't contain invalid ticket (when pool is full and high VRF output) otherwise the block become invalid?
or incorrectly sorted. The conformance tests cover these cases
# 2024-07-16 09:43 gav: Yes, broadly speaking we always put the onus on the block-producer to make blocks which don't waste the block importer's time.
# 2024-07-16 12:12 gav: Some fixes just went in, especially for disputes/judgements/verdicts; if you're working on that, best to use main.
(edited)
# 2024-07-17 09:19 xlchen: for serialization of numbers, what is the default serialization method?
# 2024-07-17 09:20 xlchen: I guess when unspecified, it is the general natural number serialization method, which is variable-length?
# 2024-07-17 11:07 gav: It was used in an older version of the PVM. It might be used again in the future, but right now it's superfluous.
# 2024-07-17 14:33 celadari: equation (59) is for the normal lottery case (header "anonymously" signed with the ring VRF), right?
so shouldn't it be
$$\mathbf{H}_s \in \bar{\mathbb{F}}_{\gamma_z}^{[]}\langle X_T \frown \eta_3' ++ i_r \rangle$$ ?
(edited)
# 2024-07-17 15:09 gav: No - neither case is anonymous at the point of seal verification. It’s only anonymous at the point of ticket submission.
(edited)
# 2024-07-17 16:11 subotic: Probably just a minor detail: two lines after (214), "The PVM exit reason r". Shouldn't this be $\varepsilon$?
# 2024-07-17 17:24 gav: > <@subotic:matrix.org> Probably just a minor detail, two lines after (214): The PVM exit reason r. Shouldn't this be $\varepsilon$?
yes - typo. will be fixed in the next release. thanks
# 2024-07-17 20:06 proxy720: Hi @gav, I'm looking at the latest version of the paper (DRAFT 0.3.0). I think there is a typo in a couple of equations in appendix C (the scale encoder):
- (273): first case: ... ^ E_l(x mod 2^{8l}), should it be E_{l-1}?
- (273): second case: ... ^ E_3, I think it wants to be E_2
- (274): the same, I think the indices should be decreasing
But maybe I'm wrong...
# 2024-07-17 20:14 proxy720: > <@proxy720:matrix.org> Hi @gav, I'm looking at the latest version of the paper (DRAFT 0.3.0). I think there is a typo in a couple of equations in appendix C (the scale encoder):
> - (273): first case: ... ^ E_l(x mod 2*8l), should it be E_{l-1} ?
> - (273): second case: ... ^ E_3, I think it wants to be E_2
> - (274): the same, I think the indices should be decreasing
> But maybe I'm wrong...
Yes, I am wrong; the expressions are not recursive but refer to (272). Sorry, my bad.
# 2024-07-17 23:27 xlchen: A small annoyance: all the formulas are numbered, which is obviously very useful for many reasons. However, the numbers are not fixed and are subject to change between versions. This makes referring to a particular formula from code a bit harder, as the reference also needs to be fixed to a specific GP version; eventually the repo will contain references to multiple GP versions, which just makes things harder. I don't know if there is a solution to this, but it would be great if we had a good way to reference the GP from code comments / docs / chats
# 2024-07-18 07:09 gav: > <@xlchen:matrix.org> A small annoyance: all the formulas are numbered, which is obviously very useful for many reasons. However, the numbers are not fixed and are subject to change between versions. This makes referring to a particular formula from code a bit harder, as the reference also needs to be fixed to a specific GP version; eventually the repo will contain references to multiple GP versions, which just makes things harder. I don't know if there is a solution to this, but it would be great if we had a good way to reference the GP from code comments / docs / chats
Once there is a 1.0 then this should be better. Until then if you want a fixed reference you could use the latex label, where there is one (and if there isn’t, then let me know or make a pr)
# 2024-07-18 08:31 sourabhniyogi: Cheme's erasure coding process (now ours, via FFI into the same Rust libs) is as follows:
Encoding:
- blob is split into 4096-byte segments
- each 4096-byte segment is erasure coded into 12-byte subshards shared among 1026 (342 x 3) validators, ie 12,312 bytes
Decoding:
- available 12-byte subshards in a 12,312 byte encoding (only 342 of which are needed for reconstruction) are used to reconstruct the 4096-byte segment
- each of the 4096-byte segments are put together to reconstruct the blob
GP wants 4104-byte segments (W\_C = 684 x W\_S = 6) instead of 4096. This is a real discrepancy.
(edited)
# 2024-07-18 09:03 gav: Ok - yeah cheme's current system is incorrect and the test vectors shouldn't have been generated.
# 2024-07-18 09:04 gav: If you're confident you have the GP properly implemented, feel free to create your own test vectors and submit them as a PR.
# 2024-07-18 09:06 gav: The lowest reconstructible datum of EC should be 684 bytes, reconstructible from 342 parties out of the total of 1026 generated during EC, each presenting 2 bytes.
# 2024-07-18 09:08 gav: So 342 byte-pairs gets expanded to 1026 byte-pairs and from this any 342 byte-pairs can be used to reconstruct the original 342 byte-pairs.
For ECing segments (4104 bytes, 2052 byte-pairs), then we basically do the ECing above 6 times in parallel and concatenate.
# 2024-07-18 09:09 gav: For ECing the Work-Bundle, then we do the above N times in parallel and concatenate, where N is
CEIL(WORK_BUNDLE_SIZE / 684)
(edited)
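The byte-pair arithmetic gav lays out above can be sketched as a toy layout calculation (the helper names are illustrative, not from the GP; this computes sizes only, it is not an erasure codec):

```python
import math

# Toy sketch of the EC layout described above: a work-bundle is split into
# 684-byte pieces, and each piece is independently expanded from 342
# byte-pairs to 1026 byte-pairs (rate 1/3).

PIECE_BYTES = 684    # smallest independently-reconstructible datum
PAIRS_IN = 342       # original byte-pairs per piece
PAIRS_OUT = 1026     # byte-pairs after expansion

def piece_count(bundle_size: int) -> int:
    """N = CEIL(WORK_BUNDLE_SIZE / 684)."""
    return math.ceil(bundle_size / PIECE_BYTES)

def encoded_size(bundle_size: int) -> int:
    """Total bytes produced by ECing all pieces of a work-bundle."""
    return piece_count(bundle_size) * PAIRS_OUT * 2

# A 4104-byte segment is exactly 6 pieces glued together (4104 = 684 * 6),
# so its encoding is 6 * 1026 * 2 = 12312 bytes.
assert piece_count(4104) == 6
assert encoded_size(4104) == 12312
```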
# 2024-07-18 09:17 sourabhniyogi: Understood! But I think we're up against the fact that Lin, Chung, and Han 2014, referenced in the GP, only works with powers of 2, and we need to get down into "what package can do _that_" nitty-gritty.
# 2024-07-18 09:30 sourabhniyogi: This paper (which Han pointed one of our team to) appears to get over the "powers of 2" limitation and gets to what the GP wants:
https://ieeexplore.ieee.org/document/8955804
but I don't believe it's reasonable for teams to be implementing it independently. IMHO every team should have an FFI into a very widely used erasure coding package, or (similar to the davxy Bandersnatch repo, I imagine) we should all just use the same one, implemented once, well?
(edited)
# 2024-07-18 09:48 gav: Agreed, implementation of the underlying RS EC is not something which is needed for the prize. I'd be very happy to see a high-quality treasury-funded implementation for use across client implementations.
(edited)
# 2024-07-18 10:52 gav: 684 bytes split out into 1026 \* 2 bytes, any 342 of which can be used to reconstruct.
(edited)
# 2024-07-18 10:53 gav: once you've got this to match, the rest should be straightforward.
# 2024-07-18 11:36 alistair: > <@gav:polkadot.io> 684 bytes split out into 1026 \* 2 bytes, any 342 of which can be used to reconstruct.
This could be 1024*2.
(edited)
# 2024-07-18 11:59 oliver.tale-yazdi: > <@gav:polkadot.io> once you've got this to match, the rest should be straightforward.
yep i can recover it from a random 342 subset of these
# 2024-07-18 12:02 shwchg: I'll briefly explain the method from the paper mentioned in GP Appendix H, using 256:1024 as an example:
1. Use the original 256 GF points as the values at the evaluation points, and then reverse-generate a polynomial expression, as defined by the authors.
2. Therefore, substituting the 0th to 255th elements into this polynomial will give us the original data we need.
3. Next, substitute the 256th to 1023rd elements sequentially into the polynomial to obtain the redundancy data we need.
This method for quickly computing the values at the evaluation points is a recursive formula; the inverse transform is a recursive formula too.
(edited)
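The systematic construction shwchg outlines can be illustrated with a toy codec using Lagrange interpolation over the prime field GF(65537), standing in for the GP's GF(2^16)/Cantor-basis FFT (an assumed simplification for clarity; the real scheme is vastly faster but algebraically analogous):

```python
# Toy systematic Reed-Solomon over the prime field GF(65537): the first k
# codeword symbols ARE the data (evaluations at points 0..k-1 of the unique
# degree < k polynomial through them); parity symbols are evaluations at
# points k..n-1. Any k symbols recover the polynomial, hence the data.

P = 65537  # prime modulus; symbols are 16-bit values

def interpolate_eval(points, x):
    """Evaluate at x the unique polynomial through `points` (Lagrange)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def encode(data, n):
    """Systematic encode: codeword[0:k] = data, the rest are parity evals."""
    k = len(data)
    pts = list(enumerate(data))
    return data + [interpolate_eval(pts, x) for x in range(k, n)]

def recover(shards, k):
    """Recover the original k symbols from any k (index, value) shards."""
    pts = shards[:k]
    return [interpolate_eval(pts, x) for x in range(k)]

data = [7, 42, 1000, 65535]                  # k = 4 two-byte symbols
code = encode(data, 12)                      # n = 12, rate 1/3 like 342:1026
survivors = [(i, code[i]) for i in (2, 5, 9, 11)]  # any 4 shards suffice
assert recover(survivors, 4) == data
```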
# 2024-07-18 12:04 shwchg: Could you explain more clearly how you generated this vector? No matter how I look at it, it doesn't seem like it can be used for 342:1026.
# 2024-07-18 12:10 shwchg: The provided test vector contains original data with 1026 bytes, which is 513 octet pairs (Y_2). According to GP, every 342 octet pairs will be packed into 1 "segment" for encoding. Therefore, in this case, we should get 2 "segments" of codewords, totaling 1026 * 2 = 2052 octet pairs.
However, this provided test vector only contains 1 segment (1026 octet pairs) of codewords.
Please let me know if I got anything wrong.
# 2024-07-18 12:25 gav: The data I provided in the gist 9c07...a574 is 1368 nibbles, which is 684 bytes (342 byte-pairs).
(edited)
# 2024-07-18 12:26 gav: This gets ECed into 1026 byte-pairs (segment_ec) - concatenating the first 342 results in the data. The other 684 byte-pairs are the additional redundancy.
(edited)
# 2024-07-18 12:29 gav: Once you can do this, then you just need to glue 6 of them together to do segment encoding (4104 = 684 * 6).
# 2024-07-18 12:39 shwchg: Sorry for that. I originally thought it was base64 (because the other vectors are encoded using base64), but this doesn't answer my question. The paper has the strict limitation that (n, k) should be 2^r, so how did you apply it to generate this vector?
# 2024-07-18 12:46 gav: For now just assume it's a normal systematic RS EC in GF(2^16)
# 2024-07-18 12:48 gav: We have found that the crate parity uses for EC can parallelise effectively if you're careful about which shards are which and you ensure indices are equivalent for all shards.
(edited)
# 2024-07-18 12:49 gav: We'll probably improve this in the future to allow for 32x parallelism using 4 indices (8 shards per index), which should get very close to the theoretical maximum performance without necessitating a fresh implementation.
# 2024-07-18 12:49 gav: However, for M1/M2 performance is not especially relevant - you just need to be able to get the right EC results.
(edited)
# 2024-07-18 12:51 gav: As for hex/base64, yeah I switched it - I'm not sure why base64 was ever used as it makes it largely unreadable to human eyes. I'll push people to stick to hex so it's trivial to see things like string lengths right in a text editor and not worry about weird padding artefacts.
(edited)
# 2024-07-18 13:19 shwchg: So in the future, it is possible that there will be changes in the encoding rate and data segmentation, right? When will this be confirmed, and what would you recommend we do for our implementation in the meantime?
# 2024-07-18 13:29 shwchg: Regarding the recovery of the test vector you provided earlier, Can you provide us with more relevant implementation details?
I want to make sure that our approach is correct.
# 2024-07-18 14:01 gav: No changes are planned and I don't view any changes as likely.
(edited)
# 2024-07-18 14:21 gav: I believe this is the paper describing the underlying EC schema: D. G. Cantor, "On arithmetical algorithms over finite fields",
Journal of Combinatorial Theory, Series A, vol. 50, no. 2, pp. 285-300, 1989.
Systematic Reed Solomon Finite-Field Erasure Codec, GF(2^16), Cantor basis 2
(edited)
# 2024-07-18 14:56 prematurata: hello, when trying to implement the codec for the dictionary encoding "method 1", or this formal definition, I find it hard to write a generic codec. Let me explain better.
The notation seems to indicate that when encoding a dictionary, the following steps should be taken:
- encode the number of key/value pairs (formula 275)
- then for each key/value pair:
- concatenate the encoded key
- concatenate the encoded corresponding value
If my understanding is correct, then let's think about a dict with different key types and value types, each with variable length. There is no explicit mechanism in the formal specification that would allow an implementation to properly encode & decode the key/value pairs.
I would've expected to also see some kind of E(k*) = E(|k|) concatenated with E(k). Same for d[k].
(edited)
# 2024-07-18 14:58 prematurata: unless i am misinterpreting the formula (which is a strong possibility here)
# 2024-07-18 15:01 gav: No dictionary with differing key/value types uses this encode function.
# 2024-07-18 15:03 prematurata: > <@gav:polkadot.io> No dictionary with differing key/value types use this encode function.
thanks. maybe it's worth mentioning in the graypaper?
# 2024-07-18 15:05 prematurata: also for future reference, I think it might be a good idea. In case JAM ends up needing to use it, we might want to be sure that we only use it when that condition is satisfied.
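For reference, the length-prefix scheme prematurata sketches (E(k*) = E(|k|) ++ E(k)) might look like this in toy form. The helpers are hypothetical, not the GP's encoder, and per the reply above the GP's dictionary serializer never needs this, since the key/value types there are fixed-size:

```python
# Hypothetical length-prefixed encoding for variable-length keys/values.
# encode_len is a toy stand-in for the GP's natural-number encoding (here:
# a single length byte, so items must be under 256 bytes).

def encode_len(n: int) -> bytes:
    assert n < 256
    return bytes([n])

def encode_var(blob: bytes) -> bytes:
    """E(k*) = E(|k|) ++ E(k): prefix the item with its length."""
    return encode_len(len(blob)) + blob

def decode_var(stream: bytes) -> tuple[bytes, bytes]:
    """Inverse: split one prefixed item off the front, return (item, rest)."""
    n, rest = stream[0], stream[1:]
    return rest[:n], rest[n:]

wire = encode_var(b"key") + encode_var(b"value")
k, wire = decode_var(wire)
v, wire = decode_var(wire)
assert (k, v, wire) == (b"key", b"value", b"")
```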
# 2024-07-18 16:28 sourabhniyogi: Oliver Tale-Yazdi: Which package did you use to decode the
https://gist.github.com/gavofyork/4cde92327cd91eff3e2ab91d316cb83a "normal systematic RS EC in GF(2^16)"? Can you post your (I presume Rust) decoder in a similar gist? An encode-decode combo?
Happy to submit a PR with the same shape of test vectors as cheme's PR after we succeed, have you review it, based on the above.
(edited)
# 2024-07-19 17:14 sourabhniyogi: gav: Do you have wire format preferences for JAMSNP
https://hackmd.io/@polkadot/jamsnp
For speed, I'd like to suggest we start with JSON and end with scale or protobuf. Not arguing for protobuf over scale here ... but even though JSON is something engineers tend to snicker at, it usually makes debugging life easier, between individuals and especially between teams, at the expense of a bunch of (I suggest, temporary) serialization/deserialization. Or perhaps JAMSNP can support more than 1 wire format.
Are the details of JAMSNP appropriate to develop by implementers and then turned into a GP Appendix after we get a couple "serious" implementations in to code complete form, or do you (+JAM protocol engineers) own this problem?
If you don't own this, should implementers put something together into a polkadot-fellows RFC and then GP compatible notation?
(edited)
# 2024-07-19 20:32 gav: SCALE, to the degree it matters. Most items will have a standard binary representation anyway.
# 2024-07-21 20:07 sourabhniyogi: Ok -- maybe C.2 should be expanded to cover all the SCALE encodings, not just for blocks but for JAMSNP stream objects? Here are my notes:
- CE: Ticket submission - Ticket is an _element_ of ${\\bf E}\_T$ (Section 6.7 Equation 73) -- serialization of elements of $\\mathbb{C}$ in C.2 (Equation 288)
- CE: Work Report publication - Guaranteed Work Report is an _element_ of ${\\bf E}\_G$ (Section 11.4 Equation 136) -- serialization of $\\mathbb{W}$ is C.2 (Equation 286)
- CE: Assurance publication - Assurance is an _element_ of ${\\bf E}\_A$ (Section 11.2 Equation 123) -- need serialization in C.2 and a set definition in I.1.1.
- CE: Judgement publication - Dispute contains _components_ of ${\\bf E}\_D$ (Section 10.2 Equation 97) -- need serialization in C.2, though ${\\bf B}$ covers ${\\bf c}, {\\bf v}, {\\bf f}$ components.
- CE: Preimage publication - Preimage is an _element_ of ${\\bf E}\_P$ (Section 11.2 Equation 153) -- need serialization in C.2 and set definition in I.1.1.
- CE: Block publication - Block is Section 4.1 Equation 13; serialization of ${\\bf B}$ is thorough in C.2 (Equations 280-282).
- CE: Work Package Submission and Sharing - Work Package is Section 14.3 Equation 174; serialization of elements of $\\mathbb{P}$ in C.2 (Equation 287).
- CE: Audit-announcement - Announcement - see Section 17.3 Equation 196. Need serialization in C.2 and set definition in I.1.1.
- CE: AuditDA query AuditDAQuery & response AuditDAResponse - need serialization in C.2 and maybe set definitions in I.1.1.
- CE: ImportDA query ImportDAQuery & response ImportDAResponse - need serialization in C.2 and maybe set definitions in I.1.1.
- CE: Public ImportDA reconstruction ImportDAReconstructionQuery & response ImportDAReconstructionResponse - need serialization in C.2 and maybe set definitions in I.1.1.
Beyond the above encoding nitty-gritty, it would be powerful for JAMSNP to include, for each stream, a short description of when the sender is typically expected to send and the expected behavior of the receiver when receiving each of the above. You could leave it as an exercise for JAM implementers, but why?
Pedantic question: Why is V=1023, C=341 and not V=1026, C=342 where the latter matches Appendix H's 1026?
Do you have a recommendation on how implementers get started with a "tiny" C, V, W\_C, W\_S?
Above is relative to 0.3.1 (7/17/24) -- In addition, the start of section 14 is messed up.
(edited)
# 2024-07-22 12:36 celadari: I have a question regarding the safrole algorithm. Since the author of the block "gives away who he is," shouldn't the header also include proof that they are entitled to the ticket? From my understanding, $$H\_s$$ is a Bandersnatch signature of $$E\_U(H)$$ with same context as ticket, but I don't see how it is sufficient to prove ownership of the ticket.
Shouldn't we also include opring, using the notation from the 2023 paper by Jeffrey Burdges, where opring contains a NARK.Proof(comring, pk, ring)?
I also reviewed the code in the Bandersnatch VRFs spec example. I noticed that the code assumes the prover\_key\_index is the correct one by using the same variable prover_key_index for both signing and verification.
I feel there is something obvious that I am not seeing :)
(edited)
# 2024-07-22 12:58 oliver.tale-yazdi: >gives away who he is," shouldn't the header also include proof that they are entitled to the ticket
IIUC: the header seal signature is the proof. It is a bandersnatch sig (not a bandersnatch ring sig).
This sig can only be created by the holder of the key for the slot of that block - which was determined in advance by the ticket contest.
So by issuing a standard bandersnatch signature, they give away that they control the respective authoring key.
# 2024-07-22 13:12 celadari: I agree that it is a bandersnatch signature, but the ticket itself is generated using a bandersnatch ring signature.
How do you "relate" the holder of the key with the ticket seal in $$\\gamma\_s$$?
$$i$$ is known ($$\\gamma\_s\[H\_t\]$$) - and so is the expected Y(H\_s) - so what prevents me from signing E\_U(H) with my own key, signing \[\] with context $$X_E + Y(H_v)$$ as well (since I know the expected Y(H\_s)), and then pretending I am the owner of the ticket?
(edited)
# 2024-07-22 13:19 oliver.tale-yazdi: Not sure if I understand what you are asking, but if you just sign Y(H_v) with a different key, then it won't verify. Y(H_v) is passed into the signature verification function vrf_verify alongside the public key. Only the author key can verify for that Y(H_v).
So you cannot just "steal" the ticket by copying the Y(H_v).
# 2024-07-22 13:29 celadari: What do you mean by "sign Y(H_v)" ? H_v and H_s are both headers provided by the one submitting the block.
The vrf value/ticket is known (looking at $$\gamma_s$$), that is, we know what $$Y(H_s)$$ is supposed to be. So we know the context of the $$H_v$$ signature. So I can sign \[\] with context X_E+Y(H_s) using my own key and sign E_U(H) with context X_T+eta_3+i_r (I can try i_r equal to 0 or 1); I don't see what is going to fail here
# 2024-07-22 19:19 gav: > Pedantic question: Why is V=1023, C=341 and not V=1026, C=342 where the latter matches Appendix H's 1026?
Keeping the number no greater than 1024 is (or at least was) helpful for ensuring the speed of a few algorithms, including EC and SNARKs.
(edited)
# 2024-07-22 20:16 gav: The numbers are correct. There is nothing to resolve. There are 1023 validators and 342 are required for EC reconstruction.
(edited)
# 2024-07-22 23:18 sourabhniyogi: Got it. Clarifying that chunks for "1023..1024..1025" in Appendix H's 342:1026 don't actually get distributed to anyone might be useful, in a footnote in H.1 or 14.2. This would preempt others like me who perceive a discrepancy.
(edited)
# 2024-07-23 09:47 gav: In fact the latest GP (main branch) revises this to a rate of 342:1023, thereby removing this weirdness.
(edited)
# 2024-07-22 19:20 gav: > Do you have a recommendation on how implementers get started with a "tiny" C, V, W_C, W_S?
Don't know what you mean.
# 2024-07-22 19:36 sourabhniyogi: I mean that those of us who want to just start producing blocks with a single machine simulating V nodes will choose a tiny V, like 6 (C=2). But the EC parameters have to match. Ideally we'd have "tiny" test vectors with very low V to match the "full" test vectors. Safrole test vectors have tiny V=6 already and V=1024. So the recommendation request is how to adjust the other parameters W\_C, W\_S, ... and the EC procedure such that all teams can do basically the same thing.
For me this low-V situation (V=6 or 9 or 12) enables a quick way for 2-4 teams to test against each other, in a ZombieNet-like way, instead of the impractically large V=1023 or V=1026 situation (not for mainnet, just for a single team to get all components working together). The EC component is the most glaring, so we'll just use some other Reed-Solomon parameters in the low-V case, but I'm sure you will have better parameter selection?
(edited)
# 2024-07-22 20:20 gav: > <@sourabhniyogi:matrix.org> I mean that for those of us who want to just start doing producing blocks with a single machine simulating V nodes will choose a tiny V, like 6 (C=2). But the EC parameters have to match. Ideally we'd have "tiny" test vectors with very low V to match the "full" test vectors. Safrole test vectors have tiny V=6 already and V=1024. So the recommendation request is how to adjust other parameters W\_C, W\_S, ... and the EC procedure such that all teams can do basically the same thing.
>
> For me this low V situation (V=6 or 9 or 12) enables a quick way for 2-4 teams to test against each other, in a ZombieNet like way, instead of the impractically large V=1023 or V=1026 situation (not for mainnet, just for a single team to get all components working together). The EC component is most glaring so we'll just use some other Reed-Solomon parameters in the low V case but I'm sure you will have better parameter selection?
Safrole should have test vectors for V=1023.
@davxy:matrix.org please confirm.
# 2024-07-22 19:21 gav: > In addition, the start of section 14 is messed up.
Don't know what you mean.
# 2024-07-22 19:29 sourabhniyogi: see final pdf. Not a big deal, but maybe you had some content there.
# 2024-07-22 20:17 gav: > <@sourabhniyogi:matrix.org> sent an image.
No idea how you generated that. My pdf (and that uploaded to github) is fine.
# 2024-07-22 19:23 gav: > but I don't see how it is sufficient to prove ownership of the ticket.
See the top line of (60)
# 2024-07-22 19:23 gav: The ticket ID (VRF output) is required to be the same. This guarantees the sealer is the ticket owner.
# 2024-07-22 19:28 celadari: But how do we go from a bandersnatch signature (H\_s) to a bandersnatch ringVRF? :)
(The only obvious way I see is to have opring.)
When I look at the ringVRF construction from 2023 Jeffrey Burdges I don't see how we can do this (aside from using opring)?
And the ticket is signed with message/additional data equal to empty \[\], and H\_s is supposed to be on E\_U: I think the signature changes if the additional data/message changes
(edited)
# 2024-07-22 20:18 gav: > <@celadari:matrix.org> But how do we go from bandersnatch signature (H\_s) to bandersnatch ringVRF ? :)
> (The only obvious way I see is to have opring)
>
> When I look at the ringVRF construction from 2023 Jeffrey Burdges I don't see how we can do this (aside from using opring) ?
>
> And the ticket is signed with message/additional data equal to empty \[\] and H\_s is supposed to be on E\_U : I think the signature changes if additional data/message changes
Will need to defer to @davxy:matrix.org for the RingVRF/Bandersnatch specifics.
# 2024-07-22 20:21 gav: I have no plans to alter the EC; the tiny Safrole test vectors were given only as a convenience since proof generation can be quite slow. The same is not true for EC.
# 2024-07-22 20:23 gav: V=6 seems fair for a testnet but it’s totally up to you. Experiment.
(edited)
# 2024-07-23 06:14 davxy: > <@gav:polkadot.io> Safrole should have test vectors for V=1023.
@davxy:matrix.org please confirm.
I confirm that "full" vectors are generated with V=1023
# 2024-07-23 06:24 davxy: > <@gav:polkadot.io> Will need to defer to @davxy:matrix.org for the RingVRF/Bandersnatch specifics.
@sourabhniyogi's reply is quite insightful :-) @celadari I'll add something later (currently afk)
# 2024-07-23 10:33 cisco: On another topic, I think some maxes in appendix A should be mins, since they are used to disallow "out of bounds" access
# 2024-07-23 14:10 celadari: Thank you Davide 🙌. After checking the gist I realized that the ietfVRF output and ringVRF output are the same: obvious when looking at the 2024 paper but not obvious when looking at the 2023 paper.
So an ietfVRF proof on an output made by ringVRF is still valid.
So even if the ticket in gamma_s was made using ringVRF, using ietfVRF.Verify with H_s as signature, H_a as public key and E_U(H) as aux_data should produce the same output (since the output is independent of aux_data)
# 2024-07-29 13:20 jay-chrawnna: More lectures and resources will be added over time. Please DM me directly if you have a request to make it more useful!
# 2024-07-29 14:36 celadari: > <@davxy:matrix.org> sourabhniyogi:
>
> 1. For Safrole test vectors the only thing we use is $jam\_ticket\_seal.
> In particular this is used for the ring-vrf input construction (context in the GP) to obtain the ticket score (aka ticket-id) during the candidate ticket verification procedure.
> The value of $jam\_ticket\_seal" is constant and defined as "jam\_ticket\_seal" ASCII string.
> This is a thing that applies in general for values starting with $ (e.g. $foo in the GP => "foo" ascii string).
> (Thank you BTW as I've just spotted that I was using "jam\_seal" instead of "jam\_ticket\_seal", I'll add the fix to the upcoming vectors PR)
> The other constant strings ($jam\_fallback\_seal and $jam\_entropy) are used for block verification and per-block entropy production (which is passed as input to Safrole).
> NOTE: It is not relevant whether the per-block entropy used by the Safrole test vectors was really produced using the signature in the header. Here we abstract away from the value's origin; we don't really care for the sake of the Safrole test vectors and Safrole STF. In the specific case I've used:
>
> - entropy\_0 = blake2b(\[42\_u8; 32\])\[..32\]
> - entropy\_i+1 = blake2b(entropy\_i)\[..32\]
> 2. If you take as a reference the Sassafras RFC then there some differences.
> One of these is the attempts number and redundancy factor, which in the GP are in practice simplified to one single thing (attempts number).
> Threshold is gone.
> Even though reading Sassafras RFC can help (as it is a quite similar protocol), always take the GP as the source of truth for JAM.
> 3. Sure thing. I'll post here as soon as it is ready.
> 4. This is a very interesting observation. Current implementation serializes 3 extra fields (part of the SNARK SRS).
> Serialization of these fields may be important **in a general application**, but here these values are constant.
> I will definitely get rid of these from serialized data (I'm on it). The final size will be 144 (i.e. the last 144 bytes of what you see right now)
Hi Davide,
I'm quoting this old message as it is related to my doubt. I have checked messages on the chat here and there and I just wanted to confirm my understanding of the test vectors regarding safrole.
- Just to confirm, in the test vectors "input.entropy" refers here to Y(H_v) of the GP, right?
- Thus, in the test vectors we don't verify equations (59) and (60) (parts related to ark-ec-vrfs that I asked you about last week), right?
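The entropy chain quoted above (entropy_0 = blake2b([42_u8; 32]), entropy_{i+1} = blake2b(entropy_i)) can be reproduced with a stock Blake2b, assuming `hashlib`'s blake2b with a 32-byte digest matches the hash used for the vectors:

```python
import hashlib

def blake2b_256(data: bytes) -> bytes:
    """Blake2b with a 32-byte digest."""
    return hashlib.blake2b(data, digest_size=32).digest()

# entropy_0 = blake2b([42_u8; 32])[..32]
entropy = [blake2b_256(bytes([42] * 32))]
# entropy_{i+1} = blake2b(entropy_i)[..32]
for _ in range(3):
    entropy.append(blake2b_256(entropy[-1]))

assert all(len(e) == 32 for e in entropy)
```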
# 2024-07-29 17:42 davxy: - As that old message reports, the input.entropy **for the test vectors** is generated in some "arbitrary" manner (the quoted message reports how I generated it). In a real scenario YES, it is generated as per equation (66) of the last GP release (DRAFT 0.3.1 Jul 17). In the context of the safrole test vectors, it is not really important how we generate it, as H_v is not present at all.
- In the safrole test vectors **we don't verify** the validity of the signatures contained within the header (i.e. H\_v and H\_s) because they are not part of Safrole's specific STF. These will probably be verified in some upcoming test vectors to assess the block header's validity
(edited)
# 2024-07-29 18:36 celadari: Regarding "recent history", why do we assign H\_r to beta\[0\] instead of assigning it to beta\[|beta| - 1\] ?
Since we produce beta' by appending the new elements to the end and taking the last H elements of the array (equation 81) I would have thought the last element of beta should be used for equation (80)
(edited)
# 2024-07-30 04:33 xlchen: not 100% sure what the modulo subscription operator is doing here. Does it mean the memory access is never out of bounds but wrapped?
# 2024-07-30 10:37 gav: > <@xlchen:matrix.org> not 100% sure what the modulo subscription operator is doing here. Does it mean the memory access is never out of bounds but wrapped?
Yes.
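A minimal reading of that answer, as a sketch of wrap-around indexing only (not the PVM's full memory model):

```python
# Modulo subscription: index memory modulo its length, so an access past
# the end wraps back to the start instead of trapping out-of-bounds.

def read_wrapped(memory: bytes, addr: int, count: int) -> bytes:
    n = len(memory)
    return bytes(memory[(addr + i) % n] for i in range(count))

mem = bytes(range(8))                                    # 0,1,2,...,7
assert read_wrapped(mem, 6, 4) == bytes([6, 7, 0, 1])    # wraps past the end
```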
# 2024-07-31 09:01 prematurata: am I wrong or is there no definition of what Xv is? It's used in 98 first and then 101. I have the feeling 99 might be missing something?
# 2024-07-31 09:18 gav: > <@prematurata:matrix.org> am I wrong or there is no definition of what Xv is? being used in 98 first and 101. I have the feeling 99 might miss something?
v is either true or false. Both possibilities are defined in (99)
# 2024-07-31 09:20 prematurata: thanks for the answer. I might be missing the link between v and (99) then. but thanks for confirming
(edited)
# 2024-07-31 10:04 oliver.tale-yazdi: that
T
is true and the upside down
⟂
is false.
(edited)
# 2024-07-31 10:05 philip.poloczek: > <@danicuki:matrix.org> I think there is a bug in the version without background:
>
https://graypaper.com/graypaper_no_background.pdf
Thanks for reporting this. It's fixed now. Sorry for the inconvenience.
I've added a disclaimer to the site that the white version is just for convenience and the gray version/GitHub release is the decisive one
# 2024-07-31 14:56 prematurata: let me know if that's correct i will gladly open a PR with the fix
# 2024-07-31 14:57 celadari: since we append new elements at the end and only take the H last elements:
(edited)
# 2024-07-31 15:20 gav: > <@celadari:matrix.org> sorry for insisting but shouldn't we use ?
yes that looks to be a typo...
# 2024-08-02 11:19 prematurata: what is this sentence referring to? I tried to find the above condition being referred to there but I am unsure
# 2024-08-02 11:26 subotic: > <@prematurata:matrix.org> what is this sentence referring to? I tried to find the above condition being referred to there but I am unsure
maybe it should say above conditions are not met? This is how I understood it at least.
# 2024-08-02 11:27 prematurata: oh that might also make more sense. but I don't see any conditions there; the 3 above elements are just definitions, which is part of the reason why I'm confused
(edited)
# 2024-08-02 11:32 subotic: > <@prematurata:matrix.org> oh that might also make more sense. but i dont see any conditions (there) the 3 above elements are just definitions, that's part of the reason why i'm confused
hmm, good point.
# 2024-08-02 12:32 cisco: I interpreted it as if the definitions can't be computed
# 2024-08-02 12:35 gav: > <@cisco:parity.io> I interpreted it as if the definitions can't be computed
Yes
# 2024-08-02 12:36 prematurata: > <@cisco:parity.io> I interpreted it as if the definitions can't be computed
I thought about that as well ... as far as I can think of, there is only one case where that cannot be computed, and that's if the next instruction is right after the one we're evaluating
# 2024-08-02 12:36 gav: Because of its inverse encoding (E^-1), there's a possibility it can fail (i.e. that there is no v_x which can satisfy the condition)
(edited)
# 2024-08-02 12:37 gav: The other conditions (equalities) can always be met.
(edited)
# 2024-08-02 12:39 gav: then that condition would obviously not be possible to be met. No value exists which is both greater than 1 and less than 0.
# 2024-08-02 12:39 gav: It's not (always) about being evaluatable or computable. Of course practical implementations must evaluate/compute, but formal specs only need specify relations and conditions.
(edited)
# 2024-08-02 12:40 gav: Therefore we say "if the condition cannot be met" rather than "if you can't compute the value".
# 2024-08-02 12:41 gav: There are various reasons why a value "may not be computable" - perhaps you don't have the operands, perhaps you don't know an algorithm to do it, perhaps the CPU is broken... these are practical concerns, since computing is a practical endeavour. We wouldn't want to define the protocol in terms of practical concerns.
(edited)
# 2024-08-02 12:41 gav: Meeting a condition is not a practical endeavour. It's purely theoretical.
# 2024-08-02 12:56 prematurata: i am still trying to wrap my head around that. so i can't really find a practical l_x for which E_lx is not computable (not considering the case where the argument's input size is zero, or formally ζı+1 ∈ ϖ).
I understand the reasoning about condition vs computable but, feel free to correct me: since E^-1 is the inverse of E, and there is no input for which E fails to produce output... then I'd expect (always mathematically speaking) that if the inverse function is "called" with a member of the output set of E, which is **Y**l (aka a sequence of l octets), then, unless otherwise specified, there would be no reason to believe the inverse fails for some inputs.
# 2024-08-02 12:58 prematurata: So if my reasoning is correct, then I'd propose that the disclaimer be global: if, for some reason, the defined values cannot be computed then a panic should occur
(edited)
# 2024-08-02 15:12 gav: Well, there are sequences of octets which when passed into E^-1 do not provide a value in N_(2^32), right?
(edited)
# 2024-08-02 15:18 prematurata: yes you're right. but if you mean when the input sequence is longer than 4 octets... well, that should be enforced by the min in the l_x definition.
(edited)
# 2024-08-02 15:37 prematurata: I don't see how little endian encoding which is defined in 273 has a way for its inverse to produce a value **not** in N_(2^32), when subscripted with x being 0<=x<=4
(edited)
# 2024-08-02 17:51 gav: (Until 0.3 series with the PVM changes that equation was originally the decode function without the subscript, implying that it could fail to decode. That's no longer the case so I'll remove the now-superfluous disclaimer.)
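To make the resolution concrete, here is a minimal sketch (mine, not from the GP) of a fixed-length little-endian codec: the inverse is total on 4-octet inputs, always yielding a value in N_(2^32), which is why the failure disclaimer became superfluous.

```python
def se(x: int, l: int) -> bytes:
    """E_l: fixed-length little-endian encoding of a natural x < 2**(8*l)."""
    return x.to_bytes(l, "little")

def se_inv(octets: bytes) -> int:
    """E_l^-1: total on any l-octet input; every 4-octet sequence
    decodes to a value in N_(2**32), so the decode cannot fail."""
    return int.from_bytes(octets, "little")
```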
# 2024-08-04 10:17 gav: Heads-up: 0.4 will likely include a formalisation of #57 (dependent on final review).
# 2024-08-05 03:23 jay-chrawnna: we've released the clips of the Belgium lecture for those using the YouTube playlist. Because the Gray Paper is a living project, we opt to title & number the clips based on the version of the graypaper at time of lecture.
You'll find them on graypaper.com soon™️
# 2024-08-05 04:35 xlchen: A minor inconsistency: e is used as epoch index for Safrole, but m is used as epoch index for validator activity statistics
# 2024-08-05 06:06 xlchen: is this a typo? (this part is super confusing because judgements are now renamed to verdicts but there is still a variable j used in verdicts)
# 2024-08-05 06:13 xlchen: I am also unsure what exactly the squared part means here. It reads like epoch index minus 2? But it doesn't make much sense to me here
# 2024-08-05 06:15 xlchen: I guess it means that value a must be either the current epoch index or the previous one?
# 2024-08-05 09:09 xlchen: thanks. this reminds me I should finish watching the jam lectures
# 2024-08-05 06:23 subotic: In 226, shouldn't the iota in the first case be i + 1 + skip(i)?
(edited)
# 2024-08-05 07:52 xlchen: > Specifically, the machine does not halt, the instruction counter increments by one
# 2024-08-05 07:52 xlchen: we don't want to update it twice, or make an exception case here
# 2024-08-05 08:05 subotic: so this means that the first case is covered through (220). Got it. Thanks!
# 2024-08-05 09:49 celadari: > <@xlchen:matrix.org> we don't want to update it twice, or make an exception case here
Yes but following the formalism if we go to b we execute it first. We don't do b + 1 + skip(b), we first go to b and execute it. So I think what Ivan said makes sense 🤔
# 2024-08-05 23:13 xlchen: and the usual steps continues. i.e. increment the counter and execute the next instruction
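The counter rule under discussion can be sketched like this (an illustrative model; the opcode-to-argument-length table is invented, not from the GP). A branch sets the counter directly to its target b; the instruction at b is executed, and only then does the usual i + 1 + skip(i) advance apply again.

```python
# Hypothetical mapping from opcode to the number of argument bytes it
# carries; in the GP this is derived from the instruction mask.
ARG_BYTES = {0: 0, 1: 2, 2: 4}

def skip(program: bytes, i: int) -> int:
    """Number of argument bytes of the opcode at index i (illustrative)."""
    return ARG_BYTES.get(program[i], 0)

def advance(program: bytes, i: int) -> int:
    """After executing the non-branching instruction at i, the counter
    moves past the opcode byte and its skip(i) argument bytes."""
    return i + 1 + skip(program, i)
```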
# 2024-08-05 10:49 gav: > <@xlchen:matrix.org> is this a typo? (this part is super confusing because judgements now renamed to verdicts but there is still a variable j used in verdicts)
Yes. I’ve cleared up the naming.
# 2024-08-05 10:49 gav: Disputes includes verdicts (a final aggregated decision on whether a WR was bad) and offenders (proofs of those validators who issued judgements against such a verdict).
(edited)
# 2024-08-05 10:50 gav: Verdicts are made up of individual judgements (which is a vote and a validator's signature).
(edited)
# 2024-08-05 10:51 gav: (In previous versions, we did not have offenders, and both judgements and verdicts were called judgements, confusingly)
(edited)
# 2024-08-06 09:21 tomusdrw: Hey there! The last couple of days I've been working on a tool that would help to read and annotate the Gray Paper. Looking for some feedback:
https://graypaper.fluffylabs.dev/
On the top of the priority list to implement is to display the notes within the document and migrate them between versions. I found myself having a lot of notes on a printed version, but they are hard to maintain when the document keeps changing, hence the website.
Another important feature for me is the ability to share a link to a selection - I've noticed here that many people share screenshots (which I believe is still good), but if the screenshots were augmented with such a link it's easier for others to find the context of the discussion (and the graypaper version the quote is coming from).
Curious to hear what would be important for you or what you'd like to see improved. The (pretty hacky) code is on github, so feel free to file an issue or a PR there.
# 2024-08-06 09:44 oliver.tale-yazdi: The linking feature is awesome! I tried a few things to have links to a specific formula but this is top. Does it search for the text or how does it resolve a link to a spot in the paper?
# 2024-08-06 09:56 tomusdrw: > <@oliver.tale-yazdi:parity.io> The linking feature is awesome! I tried a few things to have links to a specific formular but this is top. Does it search for the text or how does it resolve a link to a spot in the paper?
As mentioned earlier it's pretty hacky :D The PDF is converted to HTML and it has a very specific structure. The link saves the HTML nodes that are selected and then searches for these exact nodes when loaded. I'm still trying to figure out how to best "migrate" the selection from one version to another, but I'm positive I'll figure something out, given I have full information about the selection (including page, section, subsection, gp version and the text content)
# 2024-08-06 09:57 tomusdrw: The migration to newer version will most likely need some fuzzy searching if the exact text is not found, but at least we can limit that to some particular section.
# 2024-08-06 09:58 tomusdrw: We are also working on using synctex to map PDF/HTML selection to raw latex sources as another option.
# 2024-08-06 10:07 kianenigma: This is super useful, thanks for sharing! will start using it and share feedback :)
# 2024-08-06 10:28 cisco: Awesome tool! I also took a lot of notes on a printed version but all the changes make my notes harder to find.
You could maybe show the visual diffs between the different versions and associate notes to a particular version if stuff changed
# 2024-08-06 12:14 prematurata: ~~Hello, i believe there is no formalism on how to properly encode/decode workresults. The o term is either byte array or enum. (123 and 124).~~ ~~considering the Eg extrinsic is a tuple containing workreport 138 and 119, and that there is no specific formalism for term o in 123 on appendix C, i believe that note 20 at page 49 might also apply here?~~
(edited)
# 2024-08-06 13:15 prematurata: is (284) correct? it misses the segment-root from the S set
# 2024-08-07 13:01 tomusdrw: AFAICT the PVM program should run up until it reaches an invalid instruction. What's the reason for that? Couldn't programs with invalid instructions be cheaply rejected during some initial (linear) validation based on the instruction mask? We also have well-defined trap instruction to terminate the program, so no need for that to panic (like there was in EVM afair).
The only reasons I could think of is:
1. Avoid the need of pre-validation completely (i.e. don't assume there is anything like that in GP)
2. Support some future extensions?
Is there anything I'm missing here? Perhaps the compilers generate a code with invalid instructions in case of panic?
# 2024-08-07 14:22 luke_f: hello.
i think there is a mistake in recent history - equation 83
**the subscript s is supposed to be subscript x**
since _s_ is an availability specification, that does not have a _p_ sub-component (eq 122)
_x_, however, is a refinement context, that does have a _p_ component that is a hash (eq 121)
or maybe it refers to S subscript _h_? the work package hash?
(edited)
# 2024-08-07 17:18 gav: It refers to the work package hash in the specification component.
# 2024-08-08 01:58 luke_f: > <@gav:polkadot.io> It refers to the work package hash in the specification component.
thank you
# 2024-08-08 07:26 tomusdrw: gav: any comments on this one? My current read of the GP is that it's fine to have programs with invalid instructions, since they obviously might never be reached in code and we still need to panic only if that particular instruction is reached during execution. And obviously that's paid for.
However I'm thinking if it would be "less wasteful" to disallow programs with invalid instructions to be executed at all. But perhaps it's an overcomplication or not practical for some reasons yet unknown to me?
(edited)
# 2024-08-08 08:52 prematurata: and maybe this is also a typo... the posterior m can be greater than Y, but then |ya| would not be equal to E... what I think we want to make sure is that the current m is greater than Y, so that we know the lottery "time" is over
(edited)
# 2024-08-08 10:00 gav: > <@tomusdrw:matrix.org> AFAICT the PVM program should run up until it reaches an invalid instruction. What's the reason for that? Couldn't programs with invalid instructions be cheaply rejected during some initial (linear) validation based on the instruction mask? We also have well-defined trap instruction to terminate the program, so no need for that to panic (like there was in EVM afair).
>
> The only reasons I could think of is:
> 1. Avoid the need of pre-validation completely (i.e. don't assume there is anything like that in GP)
> 2. Support some future extensions?
>
> Is there anything I'm missing here? Perhaps the compilers generate a code with invalid instructions in case of panic?
Point 1.
# 2024-08-08 10:01 gav: The PVM is designed to be linear complexity in terms of execution steps, not program size.
(edited)
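gav's point - validity is checked as instructions are reached, so cost scales with executed steps rather than program size - can be sketched as a toy interpreter loop (illustrative only; the opcodes and exit labels are invented, not GP-normative):

```python
# Minimal step-by-step interpreter sketch: an illegal opcode causes a
# panic only when it is actually reached, so no pre-validation pass
# over the whole program is needed.
VALID_OPCODES = {0: "trap", 1: "add", 2: "ecalli"}  # illustrative subset

def run(program: list, max_steps: int) -> str:
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            return "panic"
        op = program[pc]
        if op not in VALID_OPCODES:
            return "panic"          # illegal instruction: panic on reach
        if VALID_OPCODES[op] == "trap":
            return "halt"           # regular termination
        pc += 1
    return "out-of-gas"
```

Note how a program carrying an invalid opcode after a trap still halts normally, since the bad instruction is never reached.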
# 2024-08-08 10:02 gav: Correct. Will be in next round of corrections thanks
(edited)
# 2024-08-08 10:04 gav: Yeah that is already corrected to the prior m on the latest GP.
(edited)
# 2024-08-08 12:23 prematurata: > <@gav:polkadot.io> Yeah that is already corrected to the prior m on the latest GP.
oops sorry. i must have missed one release
# 2024-08-08 12:25 gav: > <@tomusdrw:matrix.org> AFAICT the PVM program should run up until it reaches an invalid instruction. What's the reason for that? Couldn't programs with invalid instructions be cheaply rejected during some initial (linear) validation based on the instruction mask? We also have well-defined trap instruction to terminate the program, so no need for that to panic (like there was in EVM afair).
>
> The only reasons I could think of is:
> 1. Avoid the need of pre-validation completely (i.e. don't assume there is anything like that in GP)
> 2. Support some future extensions?
>
> Is there anything I'm missing here? Perhaps the compilers generate a code with invalid instructions in case of panic?
(There are actually a number of exit conditions, an illegal instruction is only one of them and I wouldn't expect it to happen in production.)
# 2024-08-09 14:11 danicuki: In the serialization definitions, there is the variable-size prefix 29-bit natural serialization function \se_{4*}. But I don't see it being used anywhere. I see only mentions of \se_4 (without the *). Should we use \se_{4*} by default when we see \se_4?
# 2024-08-09 15:51 gav: > <@danicuki:matrix.org> In the serialization definitions, there is the variable-size prefix 29-bit natural serialization function \se_{4*}. But I don't see it being used anywhere. I see only mentions of \se_4 (without the *). Should we use \se_{4*} by default when we see \se_4?
No. When you see \se_4, use \se_4. The variable-length encoding functions are not presently used. They might be used in the future but for now you obviously don’t need to implement them to build a conformant implementation.
# 2024-08-11 17:19 finsig: For integer encoding, eq275, there are values of x that satisfy 2^(7l) <= x < 2^(7(l+1)) but do not correctly encode with length l. For example, 2^14-1 satisfies the criteria for length 1, 2^7 <= 2^14-1 < 2^14, and encodes to bytes [255]. However, it requires length 2 bytes [255,63] = 2^14-1. Should the criteria instead be: 2^(7(l-1)) <= x < 2^(7l)?
# 2024-08-11 17:40 gav: I’m not sure what you mean. It satisfies the criteria for l=1, which implies it needs two bytes: the discriminator summed with the most-significant bits in the first byte (128 + 63 = 191), together with least significant byte in the second (255), giving [191,255]
(edited)
# 2024-08-12 01:59 finsig: > <@gav:polkadot.io> I’m not sure what you mean. It satisfies the criteria for l=1, which implies it needs two bytes: the discriminator summed with the most-significant bits in the first byte (128 + 63 = 191), together with least significant byte in the second (255), giving [191,255]
sorry, my mistake. I understand now.
# 2024-08-12 10:04 oliver.tale-yazdi: The text in the guarantees extrinsic and the Block Serialization mention a core-index, but it looks like it was moved into the work-report. So just the text and encoding logic needs fixing?
# 2024-08-14 02:31 sourabhniyogi: davxy: Do you have a snippet of code that represents the above computation? Would like to check our assumptions. This "ring root" is used in a new epoch as per:
(edited)
# 2024-08-14 22:57 sourabhniyogi: gav: Doesn't host function write need a "s" register input like read has in \omega_0?
(edited)
# 2024-08-14 23:12 sourabhniyogi: For import, GP says "This process may in fact be lazy as the Refine function makes no usage of the data until the import hostcall is made." -- does this imply that PVM interpreters should "page fault" when hitting an "import" instruction? Is this the only host function having a lazy evaluation / page fault or would there be others (say, export [though I imagine it would be the last thing being done]), in interacting with the DA systems?
(edited)
# 2024-08-14 23:17 sourabhniyogi: For Epoch Markers, at the start of Section 5.1 you have "If not ∅, then the epoch marker specifies key and entropy relevant to the following epoch in case the ticket contest does not complete adequately (a very much unexpected eventuality)" but then in equation (72) you have something that appears contradictory:
# 2024-08-14 23:19 sourabhniyogi: Does the epoch marker appear on rare occasions (only when the ticket contest does not complete adequately) or on every new epoch (what (72) implies)
# 2024-08-15 06:41 gav: It is merely noting that not all import data may need to be fetched ahead of the start of execution.
(edited)
# 2024-08-15 06:45 gav: It’s obvious, really, but I wanted to point it out as for M4 it will be necessary to parallelise execution with import segment fetching.
# 2024-08-15 06:52 davxy: > relevant to the following epoch in case the ticket contest does not complete adequately (a very much unexpected eventuality)"
It is rare that the ticket contest doesn't complete successfully. If the contest is successful, then the previously announced epoch marker becomes irrelevant for the sake of slot assignments to validators.
# 2024-08-15 11:35 oliver.tale-yazdi: To me it looks like the types of the sets in definition 101 are incompatible: Lambda U Kappa being a set of validator key tuples and Psi-o a set of Ed25519 keys:
# 2024-08-15 12:46 gav: > <@oliver.tale-yazdi:parity.io> To me it looks like the types of the sets in definition 101 are incompatible: Lambda U Kappa being a set of validator key tuples and Psi-o a set of Ed25519 keys:
yeah - it should be the Ed25519 portions of lambda and kappa.
# 2024-08-16 08:03 cisco: Looking at the PVM's standard program initialization. What is that z? I've identified all others and even read compiled pvm files successfully without mentioning it
# 2024-08-17 00:53 sourabhniyogi: For the historical_lookup host function, the $H$ in $H({\bf a},t,h)$ should be a $\Lambda$ to match 92+94 of Section 9.
Any reason why you don't just assign some opcodes to all the host functions? Can just add 100 or 128 to the existing ones, 13 becomes 113 or 141?
(edited)
# 2024-08-17 08:26 gav: > <@sourabhniyogi:matrix.org> For the historical_lookup host function, the $H$ in $H({\bf a},t,h)$ should be a $\Lambda$ to match 92+94 of Section 9.
>
> Any reason why you don't just assign some opcodes to all the host functions? Can just add 100 or 128 to the existing ones, 13 becomes 113 or 141?
Then each of the contexts would need to be integrated and defined as part of the basic PVM, which is obviously a terrible idea.
# 2024-08-17 08:27 gav: > <@sourabhniyogi:matrix.org> For the historical_lookup host function, the $H$ in $H({\bf a},t,h)$ should be a $\Lambda$ to match 92+94 of Section 9.
>
> Any reason why you don't just assign some opcodes to all the host functions? Can just add 100 or 128 to the existing ones, 13 becomes 113 or 141?
On the first point, yes indeed - refactoring error and will be corrected in 0.3.5.
# 2024-08-17 14:15 gav: I’m not really sure what you’re proposing, but if you mean to use opcodes which are not yet in use in place of host functions I would caution against it. Opcodes which are not in use should panic as an illegal instruction. If they do anything else then it will fail test vectors.
# 2024-08-18 02:12 sourabhniyogi: https://github.com/koute/polkavm/pull/156 has the idea. Understand the caution, but not sure how else implementers can tackle implementing 23 host functions (8 or so critical to get the basics of DA + a battery of refine-accumulate code for "tiny" JAM services: import, export, solicit, forget, historical_lookup, read, write, lookup) without something along these lines.
(edited)
# 2024-08-18 07:13 mkchung: In eq(93), the preimage appears to be of arbitrary size - which is then erasure encoded & distributed in segments of size wc*ws to DA. But how is the E_p (lookup extrinsic) in section 12.1 (eq 155-158) able to know the length of the preimage? Without knowing the length, we probably cannot remove the padded zeros from the wc * ws segments. What am I missing?
# 2024-08-18 10:07 gav: > <@sourabhniyogi:matrix.org>
https://github.com/koute/polkavm/pull/156
> has the idea. Understand the caution, but not sure how else implementers can tackle implementing 23 host functions (8 or so critical to get the basics of DA+ a battery of refine-accumulate code for "tiny" JAM services: import, export, solicit, forget, historical\_lookup, read, write, lookup) without something along these lines.
you'll need to implement the host functions regardless. you'll need to implement PVM too regardless.
# 2024-08-18 10:07 gav: i don't understand why you're trying to use polkavm given that even for M1 you'll need to write your own.
(edited)
# 2024-08-18 10:59 sourabhniyogi: We are implementing our own PVM interpreter of course! We have already done so for the Appendix A instruction set and covered the tests of https://github.com/w3f/jamtestvectors/pull/3. We just need to have some similar byte code for Appendix B. To do that, we have to assemble some byte code that has host functions, and so we only use koute's assembler to map some assembly code (the PR has 23 of them) into PVM byte code to test that assembler.
We are not FFIing into koute/polkavm. We are only using it as an assembler to generate byte code to implement 23 host functions of Appendix B, following koute's assemble/disassemble tool.
https://github.com/w3f/jamtestvectors/pull/3#issuecomment-2257688558
# 2024-08-18 11:01 gav: but why not just use ecalli along with the relevant host call index?
(edited)
# 2024-08-18 11:07 oliver.tale-yazdi: you can call this with the universal interpreter of that project with something like
pvme call test.pvm entry 42 69 --host-functions "get_third_number:100"
and then compare the result to your interpreter
# 2024-08-18 11:11 sourabhniyogi: > <@gav:polkadot.io> but why not just use ecalli along with the relevant host call index?
AHA! Is there some line in the GP that explains ecalli -- This is the missing bit of info.
# 2024-08-18 11:16 sourabhniyogi: Well, I missed this "link" between Appendix A and Appendix B -- where is it documented?
# 2024-08-18 11:27 sourabhniyogi: Ok -- I got it, thank you! (Closed my PR for now, it no longer makes sense) Any reason why assign and new don't get a host index in GP?
# 2024-08-18 10:09 gav: and in using it, rather than altering the code to introduce new non-standard instructions in order to empower ArchVisitor, you should obviously just invoke ArchVisitor using the ecall.
(edited)
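The resolution of this thread - host calls go through the single ecalli instruction with an immediate index, not through new opcodes - can be sketched as follows (indices and handler names are illustrative, not GP-normative):

```python
# Hedged sketch: the interpreter decodes `ecalli index` and hands the
# index to an embedder-supplied, context-specific dispatch table, so
# the basic PVM stays ignorant of any particular host-call set.
def make_dispatcher(host_calls: dict):
    def on_ecalli(index: int, state: dict):
        handler = host_calls.get(index)
        if handler is None:
            return "panic"  # unknown host-call index
        return handler(state)
    return on_ecalli

# e.g. a refine-context table (entries are stand-ins, not real indices)
refine_table = {
    0: lambda s: s.setdefault("imports", []).append("segment") or "ok",
    1: lambda s: "ok",  # stand-in for `export`
}
dispatch = make_dispatcher(refine_table)
```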
# 2024-08-18 10:18 gav: > <@mkchung:matrix.org> In eq(93), the preimage appears to be of arbitrary size - which is then erasure encoded & distributed in segments of size wc*ws to DA. But how does the E_p (lookup extrinsic) in section 12.1 (eq 155-158) able to know the length of the preimage? Without knowing the length, we probably cannot remove the padded zeros from the wc * ws segments. What am I missing?
93 describes account state storage.
# 2024-08-18 10:18 gav: this is not erasure-coded. it is stored directly in state as a byte-sequence (with an implied length).
(edited)
# 2024-08-18 10:21 gav: (157) shows exactly how the length can be verified: it is stored, along with the hash, as a key in δ[s]_l
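A toy illustration of gav's remark (my own data structure, assuming Blake2b-256 as the hash): because the lookup dictionary is keyed by (hash, length), the length of a solicited preimage is known before the data itself arrives, and a provided blob can be checked against both.

```python
import hashlib

def solicit(lookup_meta: dict, blob_hash: bytes, length: int) -> None:
    """Record a request: an empty availability list marks 'requested'."""
    lookup_meta.setdefault((blob_hash, length), [])

def provide(lookup_meta: dict, preimages: dict, blob: bytes, timeslot: int) -> bool:
    """Accept a preimage only if (hash, length) was solicited and is
    still unprovided; store it and mark the timeslot it became available."""
    h = hashlib.blake2b(blob, digest_size=32).digest()
    key = (h, len(blob))
    if key in lookup_meta and lookup_meta[key] == []:
        preimages[h] = blob
        lookup_meta[key] = [timeslot]
        return True
    return False
```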
# 2024-08-18 11:56 gav: 0.3.5 is released. Mostly trivial corrections, but there's also some small alterations to the work bundle serialisation format.
# 2024-08-19 09:14 danicuki: I have a doubt about state encoding, more specifically:
C(12) ↦ E4(χ)
C(13) ↦ E4(π)
How can we E4 here if χ and π are not actual integers? Would it be just E(χ) and E(π)?
# 2024-08-19 14:59 oliver.tale-yazdi: I also have a Q about what r is in definition 111. It must be a work-report hash since it comes from V, but rho does not contain work-report hashes but full work reports?
Do we have to compute the hashes and then remove them?
# 2024-08-19 16:31 gav: > <@danicuki:matrix.org> I have a doubt about state encoding, more specifically:
>
> C(12) ↦ E4(χ)
> C(13) ↦ E4(π)
>
> How can we E4 here if χ and π are not actual integers? Would it be just E(χ) and E(π)?
chi is just a tuple of numbers, so each element should be encoded accordingly.
# 2024-08-19 16:33 gav: For tuples and sequences, E_4 is the same as E: totally transparent.
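A sketch of that transparency (my own illustration): applied to a tuple of naturals, E_4 simply encodes each element in turn and concatenates.

```python
def se4(x: int) -> bytes:
    """E_4: fixed 4-byte little-endian natural encoding."""
    return x.to_bytes(4, "little")

def se4_tuple(xs) -> bytes:
    """On tuples/sequences E_4 is transparent: elementwise encoding,
    concatenated in order."""
    return b"".join(se4(x) for x in xs)
```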
# 2024-08-19 16:34 gav: > <@oliver.tale-yazdi:parity.io> I also have a Q about what r is in definition 111. It must be a work-report hash since it comes from V, but rho does not contain work-report hashes but full work reports?
> Do we have to compute the hashes and then remove them?
Yes - indeed this should have been placed in a hash function H(…)
# 2024-08-19 21:23 mkchung: > <@gav:polkadot.io> (157) shows exactly how the length can be verified: it is stored, along with the hash, as a key in δ[s]_l
Is E_P the only way to get a preimage into the on-chain state δ? In other words, is it solely triggered by the solicit host func (Ωs) at accumulate? Presumably the block author would need to retrieve these preimages off chain from DA in order to include them in the block as E_P?
# 2024-08-19 22:37 sourabhniyogi: How does code get deployed?
As per 14.3, a Work Package has codehash $c$ (but not a length) -- _must_ code deployment go through (i) export, (ii) solicit in accumulate, resulting in (iii) E_P-driven state trie updates, as stated by Eq (91), or is there some other route?
More precisely, how does a service's FIRST code hash get deployed? It's not happening in host function new... so is there some privileged service that service creators have to use to get started that does (i)-(iii)?
(edited)
# 2024-08-19 23:05 sourabhniyogi: In 14.3 right before Eq (177) you have "a sequence of hashed of blob hashes and lengths" which I think should be ungarbled as something like "a sequence of extrinsic hashes and lengths" -- but shortly thereafter you have "We make an assumption that the preimage to each extrinsic hash in each work-item is known by the guarantor. In general this data will be passed to the guarantor alongside the work-package." _These_ preimages have nothing to do with the E_P-driven state trie updates from solicit, correct?
By "In general this data will be passed to the guarantor", do you mean the Extrinsic preimage _via JAMNP_, like what you have as --> Vec<Extrinsic> ++ Vec<JustifiedImport>, or something else?
What does JustifiedImport refer to in GP?
(edited)
# 2024-08-20 00:45 sourabhniyogi: In B.3 "The Refine invocation also ... explicitly accepts the work payload, ${\bf y}$, ..., the import and extrinsic data blobs (both just concatenated segments) as dictated by the work-item, ${\bf i}$ and ${\bf x}$" -- which refers to 14.3's (177) (carried through to (184) and then (253) but then _nowhere_ in the refine-specific host functions of (254)), correct?
Assuming the above understanding is correct, I am expecting some refine-specific host functions analogous to import (which gets at ${\bf i}$) that get at the payload ${\bf y}$ (which might be called import_y) and the extrinsic "raw" preimage data ${\bf x}$ (which might be called import_x). But I don't see anything -- so which host functions read ${\bf x}$ and ${\bf y}$? Is there some setup of ${\bf x}$ and ${\bf y}$ in memory that we missed?
(edited)
# 2024-08-20 05:29 gav: As you postulate. The first service (which will appear in the Genesis block) will include functionality for creating new services permissionlessly.
(edited)
# 2024-08-20 05:32 gav: > <@sourabhniyogi:matrix.org> In 14.3 right before Eq (177) you have "a sequence of hashed of blob hashes and lengths" which I think should be ungarbled as something like "a sequence of extrinsic hashes and lengths" -- but shortly thereafter you have "We make an assumption that the preimage to each extrinsic hash in each work-item is known by the guarantor. In general this data will be passed to the guarantor alongside the work-package." _These_ preimages have nothing to do with the E_P-driven state trie updates from solicit, correct?
>
> By "In general this data will be passed to the guarantor", do you mean the Extrinsic preimage _via JAMNP_, like what you have as --> Vec<Extrinsic> ++ Vec<JustifiedImport>, or something else?
>
> What does JustifiedImport refer to in GP?
On the first two points, yes.
# 2024-08-20 05:35 gav: > <@sourabhniyogi:matrix.org> In B.3 "The Refine invocation also ... explicitly accepts the work payload, ${\bf y}$, ..., the import and extrinsic data blobs (both just concatenated segments) as dictated by the work-item, ${\bf i}$ and ${\bf x}$" -- which refers to 14.3's (177) (carried through to (184) and then (253) but then _nowhere_ in the refine-specific host functions of (254)), correct?
>
> Assuming the above understanding is correct, I am expecting some refine-specific host functions analogous to import (which gets at ${\bf i}$) that get at the payload ${\bf y}$ (which might be called import_y) and the extrinsic "raw" preimage data ${\bf x}$ (which might be called import_x). But I don't see anything -- so which host functions read ${\bf x}$ and ${\bf y}$? Is there some setup of ${\bf x}$ and ${\bf y}$ in memory that we missed?
It should not read "both just concatenated segments"; only imports are segments. Extrinsic data are (arbitrary length) blobs.
# 2024-08-20 05:36 gav: Extrinsic data are passed into the refine function directly. Imported segments are read through the import host call.
# 2024-08-20 05:51 sourabhniyogi: Thank you for explaining -- still unclear though -- which refine instructions / host functions can read/copy extrinsic data ${\bf x}$ (I understand, passed into refine function directly) into RAM? What about the payload ${\bf y}$?
(edited)
# 2024-08-20 06:29 gav: > <@sourabhniyogi:matrix.org> Thank you for explaining -- still unclear though -- which refine instructions / host functions can read/copy extrinsic data ${\bf x}$ (I understand, passed into refine function directly) into RAM? What about the payload ${\bf y}$?
> Extrinsic data are passed into the refine function directly.
See (251)
# 2024-08-20 06:29 gav: Arguments to Refine are encoded into a; this includes all extrinsic data.
# 2024-08-20 06:30 gav: (NOTE: "Extrinsic" is used for two separate and very different concepts in the GP. That's not ideal. I'll likely change this use of "extrinsic" to "argument" in a later draft.)
(edited)
# 2024-08-20 10:32 sourabhniyogi: Understood -- I am postulating that PVM programs must have host functions in B.8 that copy
(a) ${\bf x}$ extrinsic data encoded into the $a$ argument of Refine, call it import_x, to copy $a_{\bf x}$ into RAM, and
(b) ${\bf y}$ payload data encoded into the $a$ argument of Refine, call it import_y, to copy $a_{\bf y}$ into RAM.
I don't see any such way within { historical_lookup, import, export, machine, peek, poke, invoke, expunge } to access elements of $a$.
It seems I am missing a concept or there needs to be 2 more host functions, import_x and import_y.
(edited)
# 2024-08-20 10:50 sourabhniyogi: The postulated import_x and import_y would be like import here:
# 2024-08-20 10:53 sourabhniyogi: except instead of copying ${\bf i}_{\omega_0}$ it would copy $a_{{\bf x}_{\omega_0}}$ or $a_{\bf y}$ into RAM (and no need for the min(\omega_2, W_C W_S), no index $\omega_0$ for payload ${\bf y}$). Perhaps names like extrinsic (?) and payload would be more appropriate than import_x and import_y -- What do you think?
(edited)
# 2024-08-21 16:48 dave: Time as defined in the GP seems to be TAI-based. That is, it is defined simply as "seconds passed since the beginning of the JAM Common Era". Practically speaking most systems are UTC-based. The difference is that UTC is adjusted by leap seconds, whereas TAI is not. Sticking with the current definition seems good, as it's conceptually simple. It's probably worth explicitly talking about this in the GP though, as just doing the natural thing (current UNIX timestamp minus epoch timestamp) will on most systems not be correct. Well, it might be, if there are no more leap seconds. AIUI they will be abolished by 2035 but there may be some before then. I'm not sure if there's a "standard" way of dealing with this. I believe it is possible in Linux to make the system clock follow TAI, however I don't know if this is a good idea or not.
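A sketch of the pitfall dave describes (all constant values here are illustrative assumptions, not from the GP): a naive `unix_now - jam_epoch_unix` is UTC-based and drifts from a TAI-like clock by any leap seconds inserted after the epoch.

```python
# Illustrative constants: a hypothetical JAM Common Era epoch as a
# Unix timestamp, and the TAI-UTC offset in force at that epoch.
JAM_EPOCH_UNIX = 1_735_732_800
TAI_MINUS_UTC_AT_EPOCH = 37

def jam_time(unix_now: float, tai_minus_utc_now: int) -> float:
    """TAI-like seconds since the epoch: the naive UTC difference plus
    the leap seconds inserted since the epoch."""
    return (unix_now - JAM_EPOCH_UNIX) + (tai_minus_utc_now - TAI_MINUS_UTC_AT_EPOCH)
```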
# 2024-08-22 11:17 lucasvo: I've been working through the paper and one thing I'm unsure about is how the ordering of work packages is defined, and in practice how is it determined?
# 2024-08-22 12:49 gav: It’s not defined except through the prerequisite field. This currently doesn’t give a strong ordering for accumulation (as availability may complete out of order) but from 0.4 onwards it will.
(edited)
# 2024-08-22 20:35 sourabhniyogi: TAI makes sense in a final ratification. Since most will create some abstraction to get the "JCE", it seems hardly problematic to address the leap seconds within that abstraction.
Are you taking the lead on the next JAMNP writeup?
We are wondering if there really is no gossiping of Tickets, Blocks, etc. within JAMNP, as it seems pretty wild for each validator to broadcast to all V-1 fellow validators. Not only that, if each ticket is sent to V-1 validators, all the cool anonymity will be cancelled by QUIC. Is there some detail of how QUIC will be used for gossiping objects quickly in JAMNP? Or some expectation that JAMNP will support gossiping soon enough?
(edited)
# 2024-08-22 20:39 dave: Tickets are to be sent via a proxy; the proxy will know the sender's identity obviously but is trusted to not reveal it. If you assume 2/3 honest validators then most of the tickets will be anonymous
# 2024-08-22 20:40 dave: The plan is for new block distribution to be done via the availability system
# 2024-08-23 01:59 xlchen: to my understanding, those are unordered sets; in that case, what order should be used for serialization? or are they just FIFO arrays without duplications?
(edited)
# 2024-08-23 08:26 gav: Actually not via the “availability system” but through a separate distribution (and redistribution) of erasure coded block pieces. It’ll be its own distribution system but will reuse the erasure coding logic.
(edited)
# 2024-08-23 09:01 gav: They're values which can be derived from the maps (so not strictly needed in the DB) but determining the total size/number of items directly from the Merkle tree is non-trivial. So they're in there to facilitate implementations' tracking of the changes rather than rebuilding each block (or having a separate database to track them) both of which add a lot of implementation bloat and/or complexity.
(edited)
# 2024-08-23 09:01 sourabhniyogi: > <@gav:polkadot.io> Actually not via the “availability system” but through a separate distribution (and redistribution) of erasure coded block pieces. It’ll be its own distribution system but will reuse the erasure coding logic.
Makes sense for "big" Block, but wouldn't tiny JAM objects (Guarantee, Assurance, Dispute, PreimageLookup) benefit from libp2p style gossiping in speed over erasure coding?
# 2024-08-23 17:10 dave: We require that the code hashes in the guarantees extrinsic match the current code hashes for the relevant services. This may not be the case by the time the WRs are accumulated though? Is the intention that accumulation code should protect itself against this if necessary, eg by including a version in the refine output?
# 2024-08-23 21:02 danicuki: Why is (h4:) negated here? Is this correct? Why not simply h4:?
# 2024-08-23 21:46 xlchen: I don’t exactly know but I think it is to protect from hash collision
# 2024-08-24 16:56 gav: > <@xlchen:matrix.org> I don’t exactly know but I think it is to protect from hash collision
Exactly.
# 2024-08-23 22:56 sourabhniyogi: gav: Some basic questions on Assurances.
1. From (185)+(186) _Availability Specifier_ $s$ (with h, l, u, e) ALONE (included in a Work Report), will an assurer/validator be able to reconstruct the _auditable_ work package (by querying fellow validators) after getting their chunk and a proof of inclusion using $s$ ALONE? If not, what else does a [non-guarantor] assurer need other than $s$ from a Guarantee to reconstruct the _auditable_ work package (by querying fellow validators)?
2. If the answer to (1) is "YES, $s$ alone is sufficient for work package reconstruction (by querying fellow validators)", we need a _test case_ of a work package ${\bf p}$ and some pretend exported items with this "Availability Specifier" $s$ computed (which is equivalent to an Auditable Work Package and sufficient for assuring, if we understand correctly). When can we get a ${\bf p}, {\bf e}, s$ test case? Having this will be essential for JAM implementers to have proper $E_G$ verifiable work report generation and to get through a full $E_A$ Assurance generation.
3. Given the proof referenced in Section 16 "... Firstly, their erasure-coded chunk for this report. The validity of this chunk can be trivially proven through the work-report’s work-package erasure-root and a Merkle proof of inclusion in the correct location. The proof should be included from the guarantor." ... which QUIC message covers this proof _submission_ by guarantors to all other validators in JAMNP?
4. In section 16 "Availability Assurance", you refer to "provided" and "required" manifests, seemingly for the first time, which appear to be "inside" the availability specifier $s$. Are these ${\bf b}^\clubsuit$ (package, imported items, extrinsics related) and ${\bf s}^\clubsuit$ (exported items) -- which one is "provided" and which one is "required"? (The 14.3 paragraph "Guarantors are required to erasure-code and distribute two data sets: ..." is similar enough to cause us to believe section 16 could be folded into 14.3?)
(edited)
# 2024-08-23 23:09 sourabhniyogi: 5. What is ${\bf s}$ vs $n$ within ${\bf s}[n]$ in 185? The meaning of ${\bf s}$ flips between imported and exported items in this section so we'd like to confirm.
(edited)
# 2024-08-24 16:54 gav: > <@dave:parity.io> We require that the code hashes in the guarantees extrinsic match the current code hashes for the relevant services. This may not be the case by the time the WRs are accumulated though? Is the intention that accumulation code should protect itself against this if necessary, eg by including a version in the refine output?
For the few blocks where a service is undergoing an upgrade, you’d probably want to avoid sending reports to it since it’s fundamentally impossible to predict which code is current come Accumulation.
# 2024-08-24 16:59 gav: 1. They'll be able to ensure that the chunks they get from each validator are correct and that the bundle/work-package itself is correct. The bundle is all that is needed for a validator to audit.
2. As I’ve said before, test cases will be provided once they are ready. I won’t repeat myself further.
(edited)
# 2024-08-24 17:02 gav: 3. I don’t know what you mean by “proof of submission”. In any case JAMSNP doesn't presently include any message/stream type for providing DA chunks. It will be included in due course.
(edited)
# 2024-08-24 17:38 gav: > <@sourabhniyogi:matrix.org> 5. What is ${\bf s}$ vs $n$ within ${\bf s}[n]$ in 185? The meaning of ${\bf s}$ flips between imported and exported items in this section so we'd like to confirm.
I presume you mean:
# 2024-08-24 17:40 gav: Here, M(s) is the segment root of the exporting work-package and n is the index of a segment exported by it.
(edited)
# 2024-08-24 17:40 gav: s is therefore the sequence of exported segments from said exporting work-package.
(edited)
# 2024-08-24 18:25 sourabhniyogi: By proof submission, I mean "The proof should be included from the guarantor." I believe you stubbed JustifiedImport in as --> Vec<Extrinsic> ++ Vec<JustifiedImport> in https://hackmd.io/@polkadot/jamsnp , so we'll run with that for now.
(edited)
# 2024-08-24 18:30 gav: > <@sourabhniyogi:matrix.org> By proof submission, I mean "The proof should be included from the guarantor." I believe you stubbed JustifiedImport in as --> Vec<Extrinsic> ++ Vec<JustifiedImport> in https://hackmd.io/@polkadot/jamsnp , so we'll run with that for now.
That's only for sharing of Work Packages between guarantors on the same core.
# 2024-08-24 18:30 gav: There is not yet an instruction for sharing (justified) DA chunks.
(edited)
# 2024-08-24 18:33 gav: It'll likely just be Vec<Hash> ++ Blob ++ Vec<Hash> ++ Vec<SegmentChunk> ++ Vec<Hash>.
(edited)
# 2024-08-24 18:36 gav: The Vec<Hash>s will just be complementary Merkle-node-hashes from leaf to root. The first will contain hashes for the blob-subtree, the second for the segments subtree and the third for the super-tree.
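A minimal sketch (Python, illustrative only: the real node-combining hash and sibling ordering are fixed by the GP, not by this code) of how a verifier typically folds such a complementary-hash path from leaf to root:

```python
import hashlib

def node_hash(left: bytes, right: bytes) -> bytes:
    # Placeholder combining rule; the GP specifies the actual hash and framing.
    return hashlib.blake2b(left + right, digest_size=32).digest()

def verify_path(leaf: bytes, index: int, siblings: list[bytes], root: bytes) -> bool:
    # Fold the leaf with each complementary hash; at every level the low bit
    # of the index says whether the sibling sits on the left or the right.
    acc = leaf
    for sib in siblings:
        acc = node_hash(sib, acc) if index & 1 else node_hash(acc, sib)
        index >>= 1
    return acc == root
```

Under that shape, the three Vec<Hash>es would just be three such sibling lists applied in sequence: blob-subtree, then segments subtree, then super-tree.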
# 2024-08-26 18:47 noahjoeris: Heya. Isn't the posterior authorizer pool also dependent on the header?
# 2024-08-26 23:58 xlchen: The return type for the merkle justification generation function is wrong
# 2024-08-27 01:26 xlchen: this is also using log2(|v|) which will be undefined when v is empty
# 2024-08-27 01:28 xlchen: it requires the length of v to be no more than 2^x so log2(|v|) - x is always <= 0?
# 2024-08-27 14:00 clw0908: Hello everyone,
When storing storage or preimage, we will specify the service identifier and the storage or preimage key at the same time, but we do not additionally store the mapping between the service identifier and the storage or preimage key.
# 2024-08-28 02:37 xlchen: my understanding is that this is used to construct the state root but it doesn't mean you also have to store the state into the db in exactly this format
# 2024-08-28 02:38 xlchen: so you can/should store the keys in whatever format you want
# 2024-08-27 14:00 clw0908: So assuming we have a service account identifier (s), how do we retrieve all of (s)'s storage(bold s) keys or preimage(bold p) keys?
# 2024-08-28 01:40 mkchung: In host function new=9, what is the x_i used for creating the service account?
Additionally, for l∶{(c, l)↦[]} portion, are we writing this a_l on-chain, similar to the (h,z)=[] in the solicit=13 case? This should also trigger E_P, correct?
# 2024-08-28 06:28 sourabhniyogi: Basic questions on Is-Authorized.
1. Why is the authorizer not a 4th entry point alongside refine/accumulate/on\_transfer?
2. In (182) you use blob-hood of ${\bf o}$ (being an element of Y) to determine if work is authorized, but how does the PVM code actually return and "set" ${\bf o}$ -- not clear from (248) how Is-Authorized can return a blob, e.g. a public key or recovered address. Is it through \omega_10 + \omega_11 pointing to where the blob is returned in memory? If so, what is the "not authorized" return mechanism?
3. In the 8.2 GP lecture you mention validating signatures, e.g. validating ${\bf j}$ right at the start of the work package. Shall we add ECRECOVER/... signature verification host functions for this purpose?
4. Why did you not support access to historical_lookup in Is-Authorized, to get at service-specific state?
5. It seems clear we will want "opinionated" inclusion of cryptographic primitives in ecalli host functions (BLS, Keccak/Blake2b/xx hash functions, Bandersnatch, Edwards, etc) due to the usual "interpreted code is Nx slower than compiled, we must have this primitive!" concerns. How is this problem NOT going to immediately reveal itself on the first big JAM service (parachain validation) which uses all these must have primitives?
(edited)
# 2024-08-29 13:35 dave: Re (1) AIUI authorisation is for usage of a core, this usage is not tied to a particular service
# 2024-09-09 08:04 gav: (2) Should be obvious from the PVM definition. If you panic then the result will not be in Y.
# 2024-09-09 08:04 gav: (3) PVM should really be fast enough. This isn't Ethereum.
(edited)
# 2024-09-09 08:04 gav: (4). IsAuthorized is intended to be as lightweight as possible to minimize the possibility of validator-griefing. This could potentially change in the future once we start actually writing code on JAM prototypes and understand the kinds of use-cases better.
# 2024-08-28 07:40 prematurata: it looks to me that appendix F (304) is wrong.
4
should be added and i' mont sure about the underlined red text. look like it does not make muc sense
# 2024-08-28 08:54 luke_f: Hello
I'm having a hard time understanding Equation 141
Specifically R is a constant as far as I can tell, but it is used as a function here?
(edited)
# 2024-08-28 09:01 celadari: I also had the same question but I assumed in the end that it meant "multiply" R by (floor(tau/R) - 1)
# 2024-08-28 08:56 cisco: How is it that the PVM argument invocation can return a blob? I guess the memory contents on a success are the blob.
Here the return type is ( (N_G, Y) U {panic, oom}, X ) but the inner function R seems to return ({panic, oom} U N_G, X U Y)
(edited)
# 2024-08-28 14:51 sourabhniyogi: Ok, we'll run with your interpretation for now 😃, returning a "panic" being "authorization failed" if a new host function that we added to verify a signature against a hardcoded public key (or read via historical_lookup) doesn't pass verification.
(edited)
# 2024-08-30 01:21 sourabhniyogi: We realized (again) that passing in \omega_10 + \omega_11 is to be the way to send in "a=arguments" (refine: authorization hash, accumulate: wrangled results, transfer: transfer memos) and ALSO return a blob.
# 2024-08-28 23:52 sourabhniyogi: General JAM Service Invocation/Interaction-related questions:
1. How are the refine, accumulate and on_transfer entry points (described after (90)) specified in the jump table ${\bf j}$ of Eq (213)? We are currently just using pub @refine:, pub @accumulate: and pub @on_transfer in koute's assembler, but this appears underspecified in GP except for some 0/1/2 entry points
2. Given that refine can ONLY access on-chain state from historical_lookup host function calls (through accumulate's solicit host function calls that result in $a_p$ preimage writes through $E_P$ extrinsics) ... I am struggling to see how accumulate can drive refine inputs -- _except_ via some observer who "sees" the finalized post-accumulate state and submits a new work package based on that state in the form of extrinsics and payloads. Why can't accumulate also have access to the export host function such that refine can import segments exported by accumulate?
3. Why can't refine have access to the read host function to get at state written to by accumulate's write host function calls? I imagine it would be historical_read, analogous to historical_lookup.
4. How can there be a refine-less Work Package / Service, _not_ needing Assurances to trigger an accumulate, that is purely accumulate based?
5. What is the _idiomatic_ way (via a specific instruction) to "panic" for authorization code to return "Not Authorized"? We are doing ecalli 42 as a hack right now on failed authorizations.
6. For multiple services going through accumulate, what is the ordering of ${\bf S}$ in 157? Assuming serial execution (which is sort of implied by new [building up ${\bf x}_n$] and transfer [building up ${\bf x}_{\bf s}$]), is the context ${\bf x}$ (of Eq 253) carried over from one service's accumulate to the next?
7. From upgrade it appears to us that the invocation context ${\bf x}$ in ${\bf X}$ needs its ${\bf s}$ to be a dictionary like ${\bf x}_{\bf n}$, but right now it's just the ${\bf s}$ of ${\cal A}$?
8. The host function service account lookup ${\bf d}$ needs an explanation before B.5 On-transfer. It is used in lookup, read, info but we're not sure whether it's in ${\bf x}$ and how it's initialized.
9. Why did you keep machine/invoke/peek/poke/expunge solely for refine and not also accumulate?
10. $\bar{x}$, the full extrinsic data (not hash-len combo) given to the guarantors alongside the work package, needs to be put into Audit DA, since re-execution of refine by auditors requires it. This step is not described, but is of course necessary, correct?
(edited)
# 2024-08-29 13:48 dave: Re (4), AIUI the blessed services are always accumulated every block even if there are no work reports for them
# 2024-08-29 15:38 sourabhniyogi: Yipes, I read (28) + Sect 11 "After enough assurances the work-report is considered available, and the work outputs transform the state of their associated ser- vice by virtue of accumulation, covered in section 12. The report may also be timed-out, implying it may be replaced by another report without accumulation." + (162) input of refine results M(s) as all implying that accumulation REQUIRES refine.
# 2024-08-29 15:50 dave: For most services that is true -- accumulate will not be called unless some work reports have become available, the 3 blessed services are an exception to this. See (159)
# 2024-09-09 08:28 gav: 1. There is no explicit jump table - only an entry point. These are now set to 0 (isAuthorized), 5 (Refine), 10 (Accumulate) and 15 (OnTransfer).
(edited)
# 2024-09-09 08:30 gav: 2. Regarding JAM, Refine -> Accumulate is a one-way street and as such sensibly designed services must be able to perform Refine with some degree of asynchrony to Accumulate. This is why I write that JAM is "mostly coherent". Getting any state-changes resulting from Accumulation as inputs to a later Refine would imply making a state proof, which introduces synchrony and as such a degree of latency.
Re: "Why can't...": Lazy and Dumb Question. If you think you can add such a feature without blowing up the protocol's complexity, submit your PR to the GP repo.
(edited)
# 2024-09-09 08:31 gav: 3. Lazy and Dumb Question. If you think you can add such a feature without blowing up the protocol's complexity, submit your PR to the GP repo.
(edited)
# 2024-09-09 08:35 gav: sourabhniyogi: This is not a forum to request your (IMO totally unachievable) product dreams. If you have a serious suggestion to improve the JAM protocol, make a PR to the GP repo. Be ready to defend its implications against some of the best minds of the industry. Beware that designing and writing a high-performance secure decentralized protocol is nontrivial and you can't just wish features to exist.
# 2024-09-09 08:37 gav: 4. With the particular exception of the privileged services, a service cannot be accumulated without at least one refine result. So in short, there generally cannot.
I've covered this many, many times in basically every talk I've given. Please familiarise yourself with this content. I'm not here to be your personal oracle, spoon-feeding you with material at your own pace.
(edited)
# 2024-09-09 08:39 gav: 5. Any kind of panic works, but if you want a convention, I'd use trap.
(edited)
# 2024-09-09 09:03 gav: 8. There is indeed a missing argument for Refine's lookup case. This will be fixed in the next release. Refine's read and info cases properly supply this parameter.
(edited)
# 2024-09-09 09:05 gav: 9. They are quite heavy-weight facilities and it is far from clear that they can be used effectively in the <10ms of gas which Accumulate is given. This is something which might change as we begin writing prototype services.
# 2024-09-10 09:07 sourabhniyogi: You already have service users submitting a refinement context within the work package, which includes an anchor block's state root but not an unfinalized accumulate state root. With Safrole, assuming very high liveness, you also win mostly forkless state roots.
So my lazy dumb strawman is that you can win historical_read parallel to historical_lookup in refine by modifying the "anchor block" centric refinement context included in work packages to have state roots not
(a) relative to an anchor block
but instead
(b) whatever the work package submitter has observed to be a recent [potentially unfinalized] state root
The user already has to submit their work packages relative to (a), so what is lost with (b)?
I am not making a request for product dreams here 😅, I am asking why an observed unfinalized state root from the user is an insufficient state proof.
If 2 or 3 Guarantors refine a work package according to some unfinalized state root but it doesn't get finalized, then the work report doesn't get assured, it doesn't get audited, and so it just times out.
The claim is that if this timeout is quite rare in practice, the experience for users improves tremendously with historical_read and a streamlined "accumulate=>refine".
The lazy/dumb GP modification would be to extend what you have in 9.2 "By retaining historical information on its availability, we become confident that any validator with a recently finalized view of the chain is able to determine whether any given preimage was available at any time within the period where auditing may occur." from preimages alone to what is done with write operations. Still asynchronous post-write here, but with less latency than (a).
The new requirement is that any service's aggressive work results (with aggressive (b)) would only be able to affect accumulate if the refinement context's state root in conception (b) was finalized.
I understand this is in the "better to implement JAM to safely crawl before JAM walks/runs" region here. But you claim running with (b)'s aggressive refinement context is unachievable because ... what?
(edited)
# 2024-09-10 09:09 sourabhniyogi: Typo: (252) should have "historical_lookup" instead of "lookup"
# 2024-09-10 20:47 gav: JAM (and Polkadot before it) is secure only because of Elves. Elves requires that all validators are able to audit all Work Reports *regardless of whether they’re synced to the same fork or not*. All host functions available to Refine must return exactly the same result for any node on any fork beyond the lookup-anchor. Since we cannot assume any particular state is known (even the lookup anchor block’s state, which could be very old and is likely pruned) then our host functions must be pretty much stateless. Historical lookup works only because we know, at any block up to 24 hours later, whether a preimage was known at the time and if so what it was. We manage this only with specialised data structures (in accounts) and limiting the rate at which a preimage may be supplied, removed and supplied again to avoid state blowup. This can all be discovered by a thorough read of the GP. The design would not cleanly apply to more arbitrary and general state changes such as service storage.
(edited)
# 2024-09-12 14:00 sourabhniyogi: Thank you very much for this explanation. We will properly understand ELVES=>JAM constraints to see if there is any way we could improve the JAM service developer/user experience.
# 2024-08-29 11:37 sourabhniyogi: (160) should have $l: {\bf r}_l$ to match 11.1.4's eq (121) for the hash of the payload.
(252) should use historical_lookup instead of lookup
(edited)
# 2024-09-04 03:33 shwchg: some questions about dispute and audit:
Will disputes enter a special voting phase if someone casts a false judgement during each tranche settlement? I would like to know the details of how E_D is formed.
(edited)
# 2024-09-04 09:28 dave: AIUI H_V, like the other markers, was (intentionally) redundant. It simply provided verdict information for those downloading only headers, not block extrinsics. All validators need to download full blocks, so its removal makes no difference to them (other than not having to generate it etc obviously)
# 2024-09-04 09:32 shwchg: Sorry, I might not have been clear. H_V still exists in the generation of s, so is it simply being removed from the function?
# 2024-09-04 09:35 dave: Ah sorry think there is some confusion. H_j is what was removed
# 2024-09-04 09:41 shwchg: my bad, that H_j is header verdict XD
Regarding dispute voting, what are your thoughts?
# 2024-09-04 09:45 dave: Re (2) if a node sees a negative judgement for a report, then it should start auditing it if it hasn't already. I'm not sure if it's supposed to send out an announcement in this case; I would guess this is not necessary, but probably also harmless.
# 2024-09-04 09:49 shwchg: got it, so next is to convert the judgments into Verdict and move on to the logic in Chapter 10. Thank you!
# 2024-09-04 09:52 dave: > <@dave:parity.io> Re (2) if a node sees a negative judgement for a report, then it should start auditing it if it hasn't already. I'm not sure if it's supposed to send out an announcement in this case; I would guess this is not necessary, but probably also harmless.
In the case that there is a negative judgement, it is thus expected that all validators will produce a judgement. If a block author has seen enough judgements to build a verdict then they will do this.
# 2024-09-04 20:07 mateuszsikora: Hey, correct me if I am wrong but there is no specification in the GP of how to handle out of memory in the PVM. We have 2 cases:
1. A generic program without memory segmentation. In this case we start from empty memory and we can call sbrk until the memory exceeds 2^32. What should happen then?
2. A "standard program initialization" program. In this case we have the heap between 2Z_Q + Q(|o|) and 2^32 - 2Z_Q - Z_I - P(s) (the beginning of the stack segment). What should happen when the memory we have allocated exceeds the heap segment?
I guess in both cases it should be a page fault but what should be the address then? We don't pass any address to sbrk so there is no strict point where this fault could happen
# 2024-09-05 17:38 jan: Memory allocation/deallocation handling is still a work-in-progress, and it's possible the sbrk instruction will get modified and/or removed. I'd suggest you temporarily skip it and focus on other parts of JAM and/or PVM.
---
If you're interested in some history as to why sbrk is there then let me give you some background.
Historically I designed PolkaVM (on which PVM in the GP is based) to be a VM which is as "powerful" as WASM VMs (so it can completely replace our current WASM-based executor in Polkadot 1.0 and our WASM-based smart contracts VM) while being as simple as possible to implement, and without sacrificing any performance.
So this is where the idea for sbrk came from (which is similar to what WASM has): the VM maintains a heap pointer, and the guest program can use sbrk to query that pointer and/or to bump it up. And every time it crosses a page boundary the VM allocates new memory for the program.
So this design has numerous benefits. First, it's very simple to use as a guest program (pseudo code):
// Get a pointer to the new allocation.
let pointer = sbrk(0);
// Actually allocate it.
if sbrk(size) != 0 {
    // Allocation succeeded.
    // Now `pointer` points to `size` bytes you can use.
}
This is also great for use cases like e.g. tiny smart contracts which can use this directly as an allocator without having to bring a heavyweight allocator of their own (which would consume a lot of space).
Secondly, it's simple to implement in the VM, something like this (pseudo code again):
fn sbrk(size) -> Pointer {
    if size == 0 {
        // The guest wants to know the current heap pointer.
        return current_heap_pointer;
    }
    // The guest wants to allocate.
    let new_heap_pointer = current_heap_pointer + size;
    if new_heap_pointer > max_heap_pointer {
        // Allocation failed.
        return 0;
    }
    let next_page_boundary = align_to_page_size(current_heap_pointer);
    if new_heap_pointer > next_page_boundary {
        allocate_new_pages(next_page_boundary..align_to_page_size(new_heap_pointer));
    }
    current_heap_pointer += size;
    return current_heap_pointer;
}
And this (along with the memory map I came up with, which is what we now call "standard program initialization") also makes it very easy to write an interpreter for this, because when handling loads/stores from memory you only have to do something like this:
fn load_value32(address) -> value {
    if address >= stack_address && address + 4 <= stack_address_end {
        return stack[address - stack_address];
    } else if address >= rw_data_address && address + 4 <= align_to_page_size(current_heap_pointer) {
        return rw_data[address - rw_data_address];
    } else if address >= ro_data_address && address + 4 <= ro_data_address_end {
        return ro_data[address - ro_data_address];
    } else {
        // Address is inaccessible.
        return Err;
    }
}
It's cheap, fast, and doesn't require any crazy data structures and doesn't require any handling of corner cases (for example, accesses which could read both from the stack and from RW data don't have to be handled, because they're impossible by definition; the interpreter can just keep them in separate arrays, and call it a day).
So that's how (and why) it __was__ originally designed, but then came JAM and changed things. (: (Again, remember, I started working on this **before** JAM, and some things were just grandfathered into JAM.)
What JAM introduces is a concept of inner VMs (see the machine, peek, poke, invoke and expunge host functions in section B.8 of the GP) where one VM can spawn another VM, and as it is currently designed those inner VMs are extremely flexible and have completely free-form memories and are dynamically paged.
What this essentially means is that all of those nice properties of sbrk that I've listed - simple and easy to implement, fast, doesn't require fancy data structures - they all now go out of the window!
So we will probably be replacing sbrk with something else that's more appropriate for the more flexible inner VM model. And unfortunately also most likely orders of magnitude harder to implement (at least if you want to reach at least the half-speed milestone), but it is what it is. I'm still finishing some other stuff up, but I'll most likely be working on this soon-ish. (If any of you have any good and/or crazy ideas feel free to message me!)
(edited)
# 2024-09-09 06:54 mateuszsikora: Thank you Jan Bujak, that clarifies a lot. It would be nice to include a note in the GP that sbrk might be modified or removed, and that it is advisable not to implement it yet
# 2024-09-06 00:49 xlchen: This appears to be the original version described in wikipedia, not the modern version?
# 2024-09-06 08:53 celadari: Looks like it ! Modern version looks a bit simpler to implement as well
# 2024-09-06 08:53 celadari: Modern version requires random variables j in [0, i] in the loop, and function Q_l (equation 306) is said to be constrained to [0, l]. So it should be possible to use the modern version.
PS: I don't see why the Q_l result is contained in [0, l] but for now I'll assume this hypothesis (I don't understand the 4i modulo 32 to be honest).
# 2024-09-06 03:22 xlchen: I am trying to understand this, which doesn't make much sense to me
# 2024-09-06 08:34 celadari: I think R here refers to the rotation period 10 (appendix I.4.4) and it means multiply R by (floor(tau'/R) - 1)
(edited)
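A quick numeric check of that reading (Python sketch; assumes R is the rotation period of 10 timeslots from appendix I.4.4):

```python
R = 10  # rotation period in timeslots (appendix I.4.4)

def prev_rotation_start(tau: int) -> int:
    # R * (floor(tau / R) - 1): the first timeslot of the rotation
    # immediately before the one containing tau.
    return R * (tau // R - 1)

# e.g. tau = 25 lies in the rotation starting at 20, so this yields 10
```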
# 2024-09-06 03:24 xlchen: if it is referring to this R (in that case the font isn't right), it takes two arguments
# 2024-09-06 03:26 xlchen: and the last part just isn't making sense to me. the only logical interpretation I can come up with is that it is a typo that is supposed to limit some t passed to the function R as the second argument?
# 2024-09-06 20:17 danicuki: Hi. I have a doubt about the definition of erasure-coding function. Is it a recursive function? If so, what is the base case?
# 2024-09-09 07:45 gav: > <@xlchen:matrix.org> it requires the length of v to be no more than 2^x so log2(|v|) - x is always <= 0?
The outcome is that it skips the final $x$ items of the proof.
# 2024-09-09 07:50 gav: The ceil(log_2(|v|)) just gives the (generally maximum, but here constant) number of nodes from root to leaf. Subtracting x would result in a negative number in the case that there are fewer than 2^x leaves. So we clamp it to zero and take only the first such proof items.
# 2024-09-09 07:52 gav: This is useful since we know we'll get a well-aligned subtree of 2^x data items (the "page") and thus need only proof data to get it from the root to the sub-tree root.
(edited)
# 2024-09-09 07:52 gav: Consequently, if there are no more items in total than the items on our page, our proof can be empty (i.e. the subtree root is equal to the root). This is where the max(0, ...) comes from.
(edited)
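A small numeric sketch of that clamped proof length (Python; `x` is the page-size exponent, so pages hold 2^x items):

```python
from math import ceil, log2

def proof_len(n_leaves: int, x: int) -> int:
    # Hashes needed to link a well-aligned 2^x-item page's subtree root
    # to the overall root; zero when the whole tree fits on one page.
    return max(0, ceil(log2(n_leaves)) - x)

# 1024 leaves with 64-item pages: 10 - 6 = 4 proof items
# 50 leaves with 64-item pages: the tree fits on one page, so 0 items
```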
# 2024-09-09 07:55 gav: It is never required to enumerate a service's storage keys or preimages.
(edited)
# 2024-09-09 09:27 gav: > <@danicuki:matrix.org> Hi. I have a doubt about the definition of erasure-coding function. Is it a recursive function? If so, what is the base case?
It is not.
# 2024-09-09 09:31 gav: > <@danicuki:matrix.org> The only place I see C defined is here:
C_k is the "true" definition, and it assumes an input of data in multiples of 684 bytes, and this multiple is the number of "chunks", k. The k subscript may generally be elided since the number of chunks is implied by the input, but sometimes I put it in regardless to make things clearer.
# 2024-09-09 09:35 gav: We use a 256-bit hash-sequence to create a 32-bit integer sequence. It would be wasteful to compute a full 256-bit hash for just 32 bits of entropy. So we do a hash only every 8th item and otherwise take the 256 bits of the hash and split them into 8 32-bit integers.
(edited)
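A sketch of that expansion (Python; the hash function, counter framing and byte order here are placeholder assumptions, not the GP's actual definitions):

```python
import hashlib
import struct

def u32_sequence(seed: bytes, count: int) -> list[int]:
    # Each 256-bit hash is split into eight 32-bit integers, so we only
    # need to hash once per eight outputs.
    out: list[int] = []
    counter = 0
    while len(out) < count:
        digest = hashlib.blake2b(seed + struct.pack("<I", counter),
                                 digest_size=32).digest()
        out.extend(struct.unpack("<8I", digest))
        counter += 1
    return out[:count]
```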
# 2024-09-09 19:37 danicuki: this formula is a little bit confusing because pc has two meanings here and the letters "c" are almost identical.
(edited)
# 2024-09-10 05:15 xlchen: I don't fully get this part. Firstly I think the code hash of a service can never be null. And when o is empty, it returns what? The same service account? But what about other fields in the return context?
# 2024-09-10 05:16 xlchen: and it is not invoking the PVM when o is an empty array, which I am not sure is correct. the only case this can be triggered is for privileged services and it should still invoke the PVM when there is no work result to accumulate?
# 2024-09-10 05:35 xlchen: I also need some clarification about this regards to privileged services without work report
# 2024-09-10 05:36 xlchen: when there are two work results for a service, does it receive 2 minimal accumulation gas + (gas ratio * remaining gas)? or 1 minimal gas?
# 2024-09-10 05:36 xlchen: how about privileged services without work results? they should still receive the minimal accumulation gas right?
# 2024-09-10 06:40 gav: > <@danicuki:matrix.org> this formula is a little bit confusing because pc has two meanings here and the letters "c" are almost identical.
It only has one meaning; the two lines are complementary. One defines the set it belongs to (i.e. its "type") and the other defines the value within this set.
# 2024-09-10 07:06 gav: > <@xlchen:matrix.org> I don't fully get this part. Firstly I think the code hash of a service can never be null. And when o is empty, it returns what? The same service account? But what about other fields in the return context?
This is not comparing the hash but the code itself, which can be null in the case that the service doesn't currently host the preimage of its code hash.
# 2024-09-10 07:09 gav: Yes, the function implies this, since it is taking the sum over all work results attributable to the given service, and each element of that sum includes the minimum accumulation gas together with its share of the remainder of the core.
(edited)
# 2024-09-10 07:15 xlchen: how about the minimum accumulation gas from privileged services if they don't have any work results?
# 2024-09-10 07:18 gav: > <@xlchen:matrix.org> how about the minimum accumulation gas from privileged services if they don't have any work results?
At present, that's zero according to the formula.
# 2024-09-10 07:19 gav: > <@xlchen:matrix.org> the remaining gas should also subtract those?
?
# 2024-09-10 07:20 xlchen: so if the gas is zero for privileged services without work results, then why include them in S? invoking those with zero gas surely wouldn’t yield anything other than OOG?
# 2024-09-10 07:24 gav: Yeah, it's not final, I'm just describing correct behaviour at present.
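The apportioning gav describes (each work result gets the minimum accumulation gas plus its share of the core's remainder, and a service with no results gets nothing) can be sketched as follows — a hedged illustration with hypothetical names, not the GP's exact formula; the per-result shares are assumed to be precomputed:

```python
def service_accumulate_gas(result_shares: list[int], g_min: int) -> int:
    """Total accumulation gas for one service: the sum over all of its
    work results, where each result contributes the minimum accumulation
    gas plus its (precomputed, hypothetical) share of the core's remainder.

    A service with no work results gets an empty sum, i.e. zero --
    matching "at present, that's zero according to the formula" above.
    """
    return sum(g_min + share for share in result_shares)
```

So two work results yield two copies of the minimum gas (xlchen's question above), e.g. `service_accumulate_gas([10, 5], 100)` gives 215, while `service_accumulate_gas([], 100)` gives 0.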
# 2024-09-11 04:36 gav: 0.3.7 is released:
https://github.com/gavofyork/graypaper/releases/tag/v0.3.7
In addition to the usual corrections/clarifications, this contains a couple of modest but important changes:
- Erasure-coding is now validator-major not chunk-major: this optimizes the happy case (where you have the first 342 validators' shares) and doesn't really affect the general case.
- There's a new item in chi (privileged services) for managing always-accumulate services and how much gas they get. The three privileged services *no longer always-accumulate implicitly*.
# 2024-09-11 07:34 dakkk: I've found a possible discrepancy in single-step state transition of PVM:
In 214 psi_1 receives (c,k,j) + pvm_state, while in 218 psi_1 receives (c,j) + pvm_state
# 2024-09-11 15:14 tomusdrw: Question regarding BitSequence codec:
Am I getting this right, that in case of a variable-length bit sequence encoding, one should prefix it with the length of the bit sequence itself, not with the length of its packed representation?
# 2024-09-11 20:20 xlchen: it has to be the bit length, otherwise how would you figure it out? but I don’t think variable-length bit sequences are used atm, so no need to implement it
# 2024-09-11 15:17 tomusdrw: Also IMHO it would be good to only have one canonical representation of some encoding, so GP should strictly define that:
1. The remaining bits (i.e. the remaining bitLength % 8) should be set to 0 (and decoding should fail in the other case)
2. The boolean discriminator can be only 0 or 1, so any other number should fail the decoding.
# 2024-09-11 15:24 emielvanderhoek: Bitsequence with length 3 for octet 249 and octet 1 both decode as [True,False,False]. With encoding, the bits outside of the sequence matter to get the right octet back.
(edited)
# 2024-09-12 00:31 gav: 1. This is already the implication. (We define only the encoding function, which implies that the remaining bits are set to zero. Decoding is just the inverse of the encoding function and so would naturally be invalid if these bits happened to be set, since there would be no valid operand to the encoder function which could produce that output.)
2. What do you mean by “Boolean discriminator”?
(edited)
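A minimal sketch of such a codec (function names hypothetical; the LSB-first bit order follows the summation form b_i · 2^i discussed in this thread). Since the decoder is defined as the inverse of the encoder, any set padding bit makes the input non-canonical and decoding fails:

```python
def encode_bits(bits: list) -> bytes:
    """Pack a bit sequence into octets, up to 8 bits per octet, LSB-first:
    each octet is the sum b_i * 2**i over its bits (missing bits are 0)."""
    out = []
    for i in range(0, len(bits), 8):
        out.append(sum(b << j for j, b in enumerate(bits[i:i + 8])))
    return bytes(out)

def decode_bits(data: bytes, n: int) -> list:
    """Inverse of encode_bits for a bit sequence of known length n.

    Rejects non-canonical input: if any padding bit beyond n is set,
    no bit sequence could have encoded to `data`, so decoding fails.
    """
    bits = [(data[i // 8] >> (i % 8)) & 1 for i in range(n)]
    if encode_bits(bits) != bytes(data):
        raise ValueError("non-canonical encoding: padding bits are set")
    return bits
```

For example, `[1, 0, 0]` encodes to `b"\x01"`, and `b"\xf9"` (padding bits set) is rejected when decoded at length 3.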
# 2024-09-12 06:27 tomusdrw: 1. Okay, fair point, I didn't think about the implication like that. What I had in mind about being more explicit with bits is to change the sum limits to be the full 0..8 and then define b_i * 2^i if i < |b|, 0 otherwise
(edited)
# 2024-09-12 06:30 tomusdrw: > <@gav:polkadot.io> 1. This is already the implication. (We define only the encoding function, which implies that the remaining bits are set to zero. Decoding is just the inverse of the encoding function and so would naturally be invalid in if these bits happened to be set, since there would be no valid operand to the encoder function which could produce that output.)
> 2. What do you mean by “Boolean discriminator”?
2. Sorry, I should have been more precise: I meant the discriminator for a set in union with the empty set. But given your other explanation about just providing the canonical encoding it's pretty clear that any number other than 0 and 1 should be rejected.
# 2024-09-12 07:06 emielvanderhoek: This also implies that the only PVM testvectors that are currently valid are the ones where |c| is accidentally a multiple of 8.
# 2024-09-12 07:08 emielvanderhoek: We can work around this for now by not having a ‘strict’ decoding of a bitsequence. I.e. simply ignoring the bits outside of the bitmask.
# 2024-09-12 07:27 gav: I think this would only confuse matters. It’s pretty clear currently: using a summation to define each octet, and totally ignoring the fact that the octets in the resultant sequence could hypothetically be expressed in bits, some of which may have no correspondence with bits in the bit sequence.
(edited)
# 2024-09-12 07:28 gav: It is bits in, octets (values between 0 and 255 inclusive) out. There’s absolutely no need to bring “unused bits” into it. Concerns specific to particular low-level implementations are not generally going to be made explicit in the GP.
(edited)
# 2024-09-12 09:14 tomusdrw: Omg, sorry I'm so dumb. Only now I've realized it's an actual sum, so obviously the other bits don't need any special treatment
# 2024-09-12 08:24 jan: The remaining bits in the PVM code blob's bitmask (in case the size of the instructions' slice is not divisible by 8) don't really matter and will produce the same result (as all out-of-bounds instructions are assumed to be traps, and regardless of what skip they have they will decode all the same), except the very first "extra" out-of-bounds bit, which nominally should be 1, because that bit determines the skip of the last instruction. So just assuming they're all zeros would screw up decoding of the last instruction in case the instructions' slice is divisible by 8.
But you're right that there's a discrepancy here between the GP and the test vectors. It essentially boils down to this: what should we do if the number of instructions is not divisible by 8? Naively reading the GP, the |c| = |k| would suggest to me that the number of bits in the bitmask should always be equal to the number of instruction bytes, but that indeed is not the case for our current test vectors.
So how do we fix this discrepancy? First, even if the number of instruction bytes is divisible by 8 we need to assume that out-of-bounds bits are 1 (only the very first one needs to be, but it's simpler to assume all of them are), otherwise the decoding of the last instruction will be broken (alternatively, skip can be defined as "the number of 0s until the next 1, or until the end of the bitmask"; then conceptually the out-of-bounds bits can be 0).
Second, decide what to do when the number of instruction bytes is not divisible by 8:
a) change the |c| = |k| to say that |c| should be rounded up to the nearest multiple of 8 (so, essentially, "round_up_to_nearest(8, instruction_bytes) == bits_in_bitmask"); essentially, allow the number of instructions and the number of bits in the bitmask not to match, but enforce that the bitmask can have at most 7 "extra" bits of padding (of which, as I've previously explained, only the first bit matters)
b) delete the |c| = |k| (since AFAIK we always know the length of the whole **p** anyway, so the only purpose this equation serves is to add an additional constraint); essentially, allow the number of instructions and the number of bits in the bitmask not to match, and not enforce anything about the bitmask's length
c) force the number of instruction bytes to always be divisible by 8, in which case we could change the encoding of the **p** code blob to something like p = E(|j|) ⌢ E_1(z) ⌢ E_z(j) ⌢ E(c) ⌢ E(k) (basically remove E(|c|), since, again, the lengths here can be implicitly calculated from the length of **p**)
So I think (b) is probably not a great idea. (a) is what I currently have implemented in PolkaVM, and I don't dislike the (c) option (it can waste up to 6 bytes, but it doesn't need any extra validation of the sizes, so one less place for the implementations to diverge I guess). gav
# 2024-09-12 08:30 gav: Firstly I don’t understand the problem of instructions not being divisible by 8. Bit strings need not be divisible by 8 either.
# 2024-09-12 08:36 jan: > <@gav:polkadot.io> Firstly I don’t understand the problem of instructions not being divisible by 8. Bit strings need not be divisible by 8 either.
We have a bit in the bit mask for every instruction, right? But we can't just encode singular bits; we need to physically encode them as bytes. So if we have, say, an instructions blob which only takes 2 bytes of space, then sure, theoretically we only need 2 bits in the bit mask, but practically we must encode a full 8 bits because 8 bits is the lowest granularity we can store. So this is what I mean by "instructions not being divisible by 8".
# 2024-09-12 09:03 jan: The implicit "remaining bits of the bitmask encoded as bytes must be zero" of GP's bit string encoding has two issues here as far as I can see:
1. this being an implicit requirement can be confusing for the implementers (as we've seen from the questions here; it might not be entirely obvious to everyone that those should be zeros, nor whether this should be validated), so it could be worthwhile to add a clarifying footnote or something explicitly saying this so that there's no room for confusion,
2. for the PVM code blob specifically this breaks the decoding of the very last instruction's skip value, _unless_ we say that the skip calculation goes only as far as the end of the bitmask (so effectively the bitmask would implicitly have a 1 in there, because it would behave as if there actually was a 1 encoded there, even though we'd physically require encoding 0s there!)
(edited)
# 2024-09-12 09:05 jan: Hmm, but okay, another way of dealing with this could maybe be inverting the mask, so that operands get a 1 and instruction opcodes get a 0 in the bitmask - then the zero padding would work.
# 2024-09-12 09:19 emielvanderhoek: Rereading GP-0.3.6-eq:276 (encoding of bitsequence), to me it seems to explicitly force leading zeros when encoding to an octet.
Example: Value [T,F,F] leads to value 1 or 0x01 (0-filled). And no other value.
(edited)
# 2024-09-12 09:29 gav: > We have a bit in the bit mask for every instruction, right? But we can't just encode singular bits; we need to physically encode them as bytes
The encoding is irrelevant from the perspective of the PVM spec.
(edited)
# 2024-09-12 09:32 gav: I say again: the length of any sequence (including a sequence of bits) does not need to be divisible by 8. From the perspective of the spec there is absolutely nothing special about bit-sequences compared to any other kind of sequence.
# 2024-09-12 09:34 gav: There is a perfectly well-defined serialization codec. It is independent of the business logic and does not prejudice it at all.
(edited)
# 2024-09-12 09:37 gav: Reading what you're saying, the only thing I can think of which might need a tweak is the skip function.
# 2024-09-12 09:40 gav: Since the final instruction's opcode bitmask could not be followed by a 1 as that would imply the opcode bitmask sequence as being longer than the instruction data sequence.
# 2024-09-12 09:40 jan: Sure, but the bits are physically there, so everyone needs to agree to handle them in the same way. (:
So, again, if the implicit assumption is that those extra physical bits in a bitstream must be zero then that doesn't work for the PVM as currently defined, and we need to fix it somehow.
One simple way we could fix it is to invert the bitmask (so change the 1 to 0 in the skip equation) at a tiny cost to performance (~0.006% worse compilation speed, as I just implemented and measured it); if that's fine with you then we can go with that.
# 2024-09-12 09:41 gav: > Sure, but the bits are physically there, so everyone needs to agree to handle them in the same way. (:
Not necessarily.
# 2024-09-12 09:42 gav: Maybe someone implements the bit sequence as Vec<bool>. Maybe they're in C++ and it's std::vector<bool>. Maybe they're in Scheme or Haskell and it's a linked list of bools.
# 2024-09-12 09:42 gav: The serialization format and the internal business logic are NOT the same.
# 2024-09-12 09:43 gav: AFAICT nobody has brought up any valid concern related to bit-sequences. The spec is 100% clear.
(edited)
# 2024-09-12 09:44 gav: Bit sequences are arbitrary in length and anywhere spec subscripts by an index >= length is undefined regardless of what might happen to be in any particular place in a machine's physical RAM.
(edited)
# 2024-09-12 09:46 gav: You seem to be conflating some element of (perfectly well defined) business logic with the fact that the serialization of a bit sequence of type B_8 and value [1, 1, 1, 1, 1, 1, 1, 0] happens to give the same octet sequence as a bit sequence of type B_7 and value [1, 1, 1, 1, 1, 1, 1].
(edited)
# 2024-09-12 09:48 gav: As I say, at present the skip function appears to be in part undefined. This is the only alteration I see the need for. And it's just a broken function - it has nothing to do with the serialization format of sequences, or subscripting into bit sequences.
(edited)
# 2024-09-12 10:03 jan: Okay, sure, but the bits are physically there, so it needs to be defined somehow (implicitly or explicitly) how they are handled. Are they validated (enforced to be zero) or are they ignored? The protocol will physically transmit those bytes, so it needs to define how they're handled (even if, again, this definition is only implicit and not explicitly written out). So I disagree that this is in any way an implementation-only concern. It's just a matter of whether the spec defines what happens explicitly or implicitly.
Like, for example, let's take this pseudo code as an example:
a = spawn_pvm(instructions = [1,2,3,4,5,6,7], bitmask = [0b11111110])
b = spawn_pvm(instructions = [1,2,3,4,5,6,7], bitmask = [0b11111111])
We need the behaviour of this snippet to be the same for every implementation, hence the spec must define what happens here. Sure, the spec doesn't explicitly have to concern itself with whether the last bit is there, but it needs to at least implicitly define (as a consequence of what it does explicitly define) exactly what happens, whether that be just ignoring the extra bits, or checking whether the extra bits are zero and if not then rejecting the program. If you want to say "this concern is too low level for the spec to explicitly define, and we will only define it implicitly" then okay, fair enough.
Anyway, so, can we change the bitmask for the skip to be the other way around - 0s for instructions and 1s for the arguments? That should resolve the issue with only a minimal hit to the performance of any potential implementations and not require any extra modifications.
# 2024-09-12 10:06 gav: > but the bits are physically there
Not in the spec they're not.
(edited)
# 2024-09-12 10:08 gav: It sounds as though your implementation of the spec is probably based on non-conformant assumptions about memory layouts.
# 2024-09-12 10:08 gav: And that you're conflating these non-conformant assumptions with the spec's serialization format for bit sequences.
# 2024-09-12 10:10 gav: Now I don't think there's any issue with altering the skip function in the way I state above.
# 2024-09-12 10:11 gav: Implementations are free to include a bounds-check, but they can also just place extra bits (of value 1) on the end of their (bit sequence) k, and thus allow writing a conformant skip function which needs no explicit bounds check, as long as its argument is properly constrained (and we know it should be by virtue of how it's used in the spec).
(edited)
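This sentinel trick can be sketched as follows (an illustrative Python version with hypothetical names; the real skip is defined by the GP's equation, and appending 1-bits is purely an internal-representation choice):

```python
def with_sentinel(k: list, extra: int = 8) -> list:
    """Internal-representation trick: append `extra` 1-bits to the bitmask
    so a naive skip scan always finds a terminating 1 without any explicit
    bounds check, for every index within the rightful length of k."""
    return list(k) + [1] * extra

def skip_unchecked(i: int, k1: list) -> int:
    # k1 must include the sentinel 1-bits; the scan then needs no
    # length check for any opcode index i within the original bitmask.
    j = 0
    while k1[i + 1 + j] == 0:
        j += 1
    return j
```

For example, with k = [1, 0, 0, 1, 0], `skip_unchecked(0, with_sentinel(k))` is 2 and `skip_unchecked(3, with_sentinel(k))` is 1 (the last instruction's scan terminates on the sentinel).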
# 2024-09-12 10:13 gav: Inverting the meaning of bits in k is not going to help unless you presume (generally incorrectly, but perhaps by design in your implementation) that there are accessible trailing zero bits in whatever your internal representation of sequence k is.
(edited)
# 2024-09-12 10:14 gav: > It's just a matter of whether the spec defines what happens explicitly or implicitly.
I've no idea what you're talking about.
# 2024-09-12 10:15 gav: Beyond any as-yet unknown errors, the spec defines correct behaviour perfectly.
# 2024-09-12 10:15 gav: There is nothing implicit about the serialization format nor about k, nor about what happens when you subscript into k.
# 2024-09-12 10:19 jan: Okay, can I ask an offtopic question?
Leaving the spec-land, can you clarify what exactly are implementations supposed to do with the extra bits in the real world to be conformant with the spec? Are they supposed to ignore it? Or validate that they are zeros? Since the spec doesn't define what to do with them and doesn't concern itself with them am I correct in assuming that they should be ignored?
# 2024-09-12 10:22 jan: Okay, so they should be ignored by the implementations. Got it.
# 2024-09-12 10:22 gav: There is a definition of how to serialize a sequence of N_2 values ("bits", booleans, whatever you like to call it).
# 2024-09-12 10:22 gav: This implies a deserialization function definition which requires canonical serialization.
(edited)
# 2024-09-12 10:24 gav: As for internal (in-RAM) representations: it's purely an implementation concern.
# 2024-09-12 10:25 gav: The GP specifies correctness through the serialization of blocks only. It does not EVER specify how a machine should represent any particular datum in physical memory. For all the GP knows, it could be a pen-and-paper block execution.
(edited)
# 2024-09-12 10:26 gav: As such there is simply no concern of "extra bits" with regards to correctness.
# 2024-09-12 10:29 gav: If I might rephrase your question (possibly incorrectly) to "I plan to implement a bit sequence in Rust as a Vec<u8> and thus will naturally have access to a multiple of 8 bits at a time. If the bit sequence is a non-multiple of 8 in length and I attempt to dereference a bit whose index falls into those bits beyond the rightful length but still within the final byte which I am able to access, how should I proceed?"
(edited)
# 2024-09-12 10:30 jan: (Although the question would be the same regardless of the programming language, as long as we're implementing JAM on a computer.)
# 2024-09-12 10:31 gav: The answer would be: "Either your implementation is incorrect (because the GP does not subscript at that index) or the GP is incorrect (because it does subscript at that index and thus includes an undefined term)."
(edited)
# 2024-09-12 10:32 gav: Again not true. This question can only be framed with the presumption that there is accessible capacity beyond the rightful length, which seems pretty specific to the presumptions inherent in your implementation.
(edited)
# 2024-09-12 10:33 jan: Okay, so the implementation should not access those bytes, and if it does this should not change any observable behavior, so any practical implementations of JAM on a computer will have to ignore those bits.
# 2024-09-12 10:34 gav: > <@jan:parity.io> Okay, so the implementation should not access those bytes, and if it does this should not change any observable behavior, so any practical implementations of JAM on a computer will have to ignore those bits.
It's not as simple as that.
# 2024-09-12 10:34 gav: Implementations are totally free to access whatever parts of RAM they want in whatever ways they want to.
# 2024-09-12 10:35 gav: As long as their behaviour is in line with the GP it really doesn't matter.
# 2024-09-12 10:37 gav: From the GP's perspective, there are no "extra bits", or more generally stated, there is no capacity beyond the length. For the sequence s = [0, 1], s[2] is undefined. If the spec ever tries to evaluate it, there is a mistake in the spec. Maybe a corresponding operation in some implementations would define some result. That's a potential avenue for an implementation-specific optimisation perhaps, but it's irrelevant from the perspective of asking about correct behaviour.
(edited)
# 2024-09-12 10:40 gav: To help illustrate this a bit further: you'd probably want to implement an efficient bit-sequence as a Vec<usize>, implying that you'd have up to 31 "extra bits" on 32-bit architectures and 63 "extra bits" on 64-bit architectures. Clearly the spec cannot care about what architecture any given impl instance is on. Therefore it must surely be that the GP cannot possibly consider the existence of any such "extra bits".
(edited)
# 2024-09-12 10:41 jan: Hm... but isn't there at least one part of the spec that accesses those "extra" bits? Namely, the machine hostcall defines the program as being p_z bytes long, so doesn't this imply that there will be extra bits in there?
# 2024-09-12 10:43 gav: Yes, this draws upon the exact same (deterministic) codec as before.
# 2024-09-12 10:44 jan: Okay, fair enough. So in this case the bits should be zero to correctly decode, otherwise an error is returned?
# 2024-09-12 10:45 gav: The point where wire-format (the p which is passed in machine and passed into Phi) changes into business logic is the E function.
# 2024-09-12 10:45 dakkk: In appendix A, why is the gas delta always 0 for every instruction?
# 2024-09-12 10:45 gav: Before, one can argue that there may be "extra bits" (though I'd say it's unhelpful and unnecessary to frame it in that way). Afterwards there are most certainly not. As I say, some implementations will likely use optimized language primitives like C++'s vector<bool> in order to represent this data and there will be no such accessible-beyond-rightful-length data.
(edited)
# 2024-09-12 10:47 gav: We have not yet defined sensible gas costs - it'll be one of the last things we do.
(edited)
# 2024-09-12 10:48 gav: > <@jan:parity.io> Okay, fair enough. So in this case the bits should be zero to correctly decode, otherwise an error is returned?
That's the implication of the math, yeah.
# 2024-09-12 10:48 dakkk: > <@gav:polkadot.io> We have not yet defined sensible gas costs - it'll be one of the last things we do.
okiedokie; so I assume in koute pvm test vectors the default value is 1
# 2024-09-12 13:26 jan: > <@gav:polkadot.io> That's the implication of the math, yeah.
Okay, thank you for clarifying everything. Now I'm clear on the intended behavior. Sorry for being so dense. (:
So the skip equation should be fixed, as it currently does access out-of-bounds elements of k as far as I understand: j∈N∶ k_{i+1+j} = 1 (since j is ∈N, so it goes up to infinity; but even without this it would do an out-of-bounds access for the very last instruction due to the +1)
And we might consider deleting/changing this sentence which precedes the skip equation, as it explicitly talks about the extra padding. I'm not saying it's incorrect (logically it's true), but in the context of our "there are no extra bits" conversation it may be confusing when (outside of the serialization codec section) you have a sentence which explicitly says that those bits do in fact exist:
> We assert that the length of the bitmask is no smaller than the length of the instruction blob (and in fact is simply rounded to the nearest multiple of eight for ease of octet-encoding).
So, let me quickly summarize the main options that I can see:
a) change skip to not read out of bounds,
b) keep skip as-is, define out-of-bounds reads to be 1 (after all, we already do something similar with the instructions blob),
c) keep skip as-is, define out-of-bounds reads to be 1, and use a dedicated serialization codec for this bit mask which requires that |k| mod 8 = 0 and ceil(|c| / 8) = |k| / 8 (so this would make the business logic explicitly use the "extra" bits, which I now know you don't want)
I can make a PR to the GP. I'm guessing you'd like to go with (a), correct?
(For reference, I initially implemented (c) in PolkaVM because that was marginally the fastest option which didn't require adding any extra unnecessary padding.)
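Option (a), a bounds-checked skip, can be sketched like this (an illustrative Python version with hypothetical names; the authoritative definition is the GP's skip equation, and treating the end of the bitmask as an implicit 1 is precisely the fix under discussion):

```python
def skip(i: int, k: list) -> int:
    """Number of argument bytes of the instruction whose opcode is at i.

    Scans the bitmask k for the next 1 after position i, but never reads
    out of bounds: the end of the bitmask acts as an implicit 1, so the
    last instruction's skip is still well-defined.
    """
    j = 0
    while i + 1 + j < len(k) and k[i + 1 + j] == 0:
        j += 1
    return j
```

For example, with k = [1, 0, 0, 1, 0] (opcodes at 0 and 3), `skip(0, k)` is 2 and `skip(3, k)` is 1, the latter terminating at the end of the bitmask rather than reading past it.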
# 2024-09-12 13:33 jan: > <@dakkk:matrix.org> okiedokie; so I assume in koute pvm test vectors the default value is 1
Correct. The test vectors currently assume that every instruction costs 1 gas, but this is just a strictly temporary measure so that people can implement their gas metering machinery without needing the final gas cost model to be defined (since gas is a core part of JAM you must have at least some gas cost model, and assuming every instruction costs only 1 gas is the simplest one you can have). We will be defining a proper gas cost model in the future and updating the vectors.
# 2024-09-12 15:56 gav: > <@jan:parity.io> Okay, thank you for clarifying everything. Now I'm clear on the intended behavior. Sorry for being so dense. (:
>
> So the skip equation should be fixed, as it currently does access out-of-bounds elements of k as far as I understand: j∈N∶ k_{i+1+j} = 1 (since j is ∈N, so it goes up to infinity; but even without this it would do an out-of-bounds access for the very last instruction due to the +1)
>
> And we might consider deleting/changing this sentence which precedes the skip equation, as it explicitly talks about the extra padding. I'm not saying it's incorrect (logically it's true), but in the context of our "there are no extra bits" conversation it may be confusing when (outside of the serialization codec section) you have a sentence which explicitly says that those bits do in fact exist:
>
> > We assert that the length of the bitmask is no smaller than the length of the instruction blob (and in fact is simply rounded to the nearest multiple of eight for ease of octet-encoding).
>
> So, let me quickly summarize the main options that I can see:
>
> a) change skip to not read out of bounds,
> b) keep skip as-is, define out-of-bounds reads to be 1 (after all, we already do something similar with the instructions blob),
> c) keep skip as-is, define out-of-bounds reads to be 1, and use a dedicated serialization codec for this bit mask which requires that |k| mod 8 = 0 and ceil(|c| / 8) = |k| / 8 (so this would make the business logic explicitly use the "extra" bits, which I now know you don't want)
>
> I can make a PR to the GP. I'm guessing you'd like to go with (a), correct?
>
> (For reference, I initially implemented (c) in PolkaVM because that was marginally the fastest option which didn't require adding any extra unnecessary padding.)
(a) sure.
# 2024-09-12 15:57 gav: And indeed that parenthesised text should be removed as it is neither useful nor commensurate with everything else.
# 2024-09-12 15:57 jan: > <@gav:polkadot.io> (a) sure.
Got it. I'll make a PR to the GP and update the test vectors with the fixed paddings then.
# 2024-09-13 06:53 gav: sourabhniyogi: Before you or your team invest too much effort into attempting to redevelop the JAM protocol, please note that following the resolution of Ordered Accumulations I do not anticipate any significant changes beyond tweaks, corrections and high-value-low-impact optimizations. Following this issue I consider JAM essentially feature complete and it will be very difficult to convince me to merge significant, novel feature additions into the GP. As a rough timeline I would like to have PolkaJAM entering audit in Q2 next year, which pretty much implies a security audit of the GP starting in Q1 and thus a spec freeze by EOY.
(edited)
# 2024-09-13 08:15 prematurata: so to avoid confusion, may I suggest removing the _X from the omega?
# 2024-09-15 08:59 vinsystems: Yes, it's the correct link. Should l be determined implicitly by the number of octets of x? Or is it explicitly passed as an argument to the encode function?
# 2024-09-15 09:18 vinsystems: Is there any case in which l calculated from the encoded number is 0?
(edited)
# 2024-09-16 06:36 jan: To make this equation a little more clear, here's a visualization of what it does. It essentially encodes numbers as varints in the following way:
At most 7bit - 0xxxxxxx
At most 14bit - 10xxxxxx xxxxxxxx
At most 21bit - 110xxxxx xxxxxxxx xxxxxxxx
...
At most 56bit - 11111110 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
At most 64bit - 11111111 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
So the first encoded byte tells you how long the encoded number is by the number of 1s either until the first 0 (starting at the most significant bit) or until the end of the byte. If there's "free space" left after the first 0 then part of the number is packed in there (the xs in the first byte). And then the rest of the bits of the encoded number are appended (the xs after the first byte). So as you can see the number of x bits is always divisible by 7 (hence the 7l in the GP equation), except in the very last case, where we don't have to encode the 0 in the first byte anymore because we'll hit the end of the byte when counting the 1s when decoding anyway (hence the "otherwise if x < 2^64" part in the GP).
Here are some example numbers encoded using this:
number -> encoded as
01111111 -> 01111111 (127 -> [127])
10000000 -> 10000000, 10000000 (128 -> [128, 128])
00111111_11111111 -> 10111111, 11111111 (16383 -> [63, 255])
So if you're maybe familiar with the "standard" uleb128 varint serialization scheme - this is essentially similar, except instead of putting a continuation bit in every byte it packs all of the continuation bits into the first byte and limits the number to at most 64 bits. (The rationale for this is that it is more efficient to decode on modern CPUs, because you only need to look at the first byte to know the length of the varint, instead of having to check the most significant bit of every byte.)
Hopefully this makes it a little clearer.
# 2024-09-19 15:09 vinsystems: In the last example
00111111_11111111 -> 10111111, 11111111 (16383 -> [63, 255])
shouldn't it be 16383 -> [191, 255]?
In this case, l = 1.
According to (272): 16383 -> 2⁸ - 2⁷ + (16383 / 2⁸) = 191, concatenated with (16383 mod 2⁸) = 255.
# 2024-09-19 15:12 jan: Yes, you're correct. That was a copy-paste error on my part when I converted binary to decimal. 10111111 is, of course, 191 in decimal. Sorry for the confusion.
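Putting jan's description and the correction together, the whole scheme can be sketched as follows — a hedged reconstruction from this discussion (function names hypothetical, not copied from the GP), with 16383 encoding to [191, 255] as established above:

```python
def encode_varint(x: int) -> bytes:
    """Encode 0 <= x < 2**64; the first byte's run of leading 1s gives the length."""
    assert 0 <= x < 2 ** 64
    for l in range(8):
        if x < 2 ** (7 * (l + 1)):
            # prefix = l leading 1s, then a 0, then the top bits of x
            prefix = 2 ** 8 - 2 ** (8 - l) + (x >> (8 * l))
            return bytes([prefix]) + (x & ((1 << (8 * l)) - 1)).to_bytes(l, "little")
    return bytes([0xFF]) + x.to_bytes(8, "little")  # full 64-bit case

def decode_varint(data: bytes) -> int:
    """Inverse of encode_varint; only the first byte determines the length."""
    l = 0
    while l < 8 and data[0] & (0x80 >> l):
        l += 1
    if l == 8:
        return int.from_bytes(data[1:9], "little")
    top = data[0] - (2 ** 8 - 2 ** (8 - l))
    return (top << (8 * l)) | int.from_bytes(data[1:1 + l], "little")
```

This reproduces the worked examples: 127 → [127], 128 → [128, 128], and 16383 → [191, 255].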
# 2024-09-15 09:42 vinsystems: > <@gav:polkadot.io> If 1 <= x < 128 then l = 0.
Ok, I thought l = the number of octets of x. Thanks.
# 2024-09-17 13:08 celadari: Hi, I have a question regarding state Merklization. We say we use a Patricia trie, but looking at equation 295, it seems there are no extension nodes for common prefixes, which looks more like a regular prefix trie with Merkle hashing.
Could you confirm if my understanding is correct? If so (and we want a Patricia trie), wouldn't we need to define l and r using b_t1 and b_t0, where t1 and t0 are the largest common prefixes among paths starting with 1 and 0, respectively?
# 2024-09-17 15:17 dave: It's intentional that there are no extension nodes. The keys are all hashes and so it isn't expected that there will be long common prefixes.
(edited)
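The structure dave describes — a binary trie with Merkle hashing and no extension nodes, viable because the keys are hashes with no long common prefixes — can be sketched like this (an illustrative toy, assuming 32-byte keys, SHA-256 in place of the GP's hash, and made-up leaf/branch domain tags; it is not GP equation 295):

```python
import hashlib

def merklize(items: list, depth: int = 0) -> bytes:
    """Root hash of a binary trie over (32-byte key, value) pairs.

    No extension nodes: at each level we simply split on bit `depth` of
    the key (MSB-first here, an assumption) and hash the two subtrees.
    """
    if not items:
        return b"\x00" * 32  # hypothetical empty-subtree marker
    if len(items) == 1:
        (key, value), = items
        return hashlib.sha256(b"leaf" + key + value).digest()
    zero = [(k, v) for k, v in items if not (k[depth // 8] >> (7 - depth % 8)) & 1]
    one = [(k, v) for k, v in items if (k[depth // 8] >> (7 - depth % 8)) & 1]
    return hashlib.sha256(merklize(zero, depth + 1) + merklize(one, depth + 1)).digest()
```

Since keys are uniformly distributed hashes, the expected depth is around log2(n) rather than the full key length, which is why extension nodes buy little here.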
# 2024-09-17 15:29 celadari: I see! I'm definitely not a database expert, but doesn't it mean we'll have at least 512 jumps in the db?
# 2024-09-18 01:07 gav: > <@celadari:matrix.org> I see! I'm definitely not a database expert, but doesn't it mean we'll have at least 512 jumps in the db?
Only if it is implemented in the least optimal way.
# 2024-09-17 16:08 clw0908: Hello, a question about the 4 invocation function entry points:
As the GP mentions, the four entry points (0 (isAuthorized), 5 (Refine), 10 (Accumulate) and 15 (OnTransfer)) all pass through Ψ_M -> Ψ_H -> Ψ -> Ψ_1.
My question is: what is the relationship between these four entry points and the PVM's instruction counter? Will these different entry points affect the PVM's instruction counter?
(edited)
# 2024-09-18 02:50 clw0908: > <@gav:polkadot.io> The instruction counter is initialised with the entry point.
Why initialize the instruction counter with the entry point?
Does it mean that different invocation functions execute different parts of the instruction data (bold C in GP(218))?
# 2024-09-18 02:51 jan: e.g. for refine you set the instruction pointer to 5 and start execution from there
# 2024-09-18 02:57 jan: To give more background on this: the way this will work in practice is that the program will have 4 unconditional jump instructions at the very start, and the first 3 of those instructions will be encoded in such a way as to be padded and always take up 5 bytes of space. Hence the entry points of 0, 5, 10 and 15. This design allows for hardcoded addresses for each of the entry points without having to specify them dynamically anywhere in the protocol.
# 2024-09-18 03:03 jan: (And just to be clear in case it isn't obvious: as a PVM implementation you don't and shouldn't care about this, and this shouldn't be handled in any special way on the VM level. You're supposed to set your instruction counter to the hardcoded address of the entry point, and just start execution from there, regardless of what exactly happens to be there. The fact that there will be usually unconditional jump instructions there is just an implementation detail, and is not in any way required by the GP.)
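The fixed-offset dispatch described above can be sketched like this. Only the numeric offsets 0, 5, 10 and 15 come from the discussion; all names and the prologue layout are hypothetical illustrations:

```python
# Hypothetical mnemonics; the GP fixes only the numeric entry offsets.
# Conventionally each entry slot holds an unconditional jump, padded so the
# first three occupy exactly 5 bytes each.
PROLOGUE = [
    (0,  "jump is_authorized_body"),
    (5,  "jump refine_body"),
    (10, "jump accumulate_body"),
    (15, "jump on_transfer_body"),
]

ENTRY_POINTS = {"is_authorized": 0, "refine": 5, "accumulate": 10, "on_transfer": 15}

def initial_instruction_counter(entry: str) -> int:
    # A conforming PVM host simply initializes the instruction counter with
    # the hardcoded offset and starts executing whatever happens to be there.
    return ENTRY_POINTS[entry]
```

As jan notes, the VM itself treats these offsets like any other starting address; the jump-prologue convention is a guest-program detail, not a VM rule.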
# 2024-09-18 13:28 prematurata: > <@gav:polkadot.io> Yes it should read blackboard A
what about the subscripted t?
# 2024-09-19 09:45 prematurata: is m of deferred transfer Y_64 or Y_128? i ask because in 12.4 the text reads "a memo component m of 64 octets" but then it's defined as m ∈ Y_M (161) and M is equal to 128 in the appendix.
What is even more interesting is that the Ω_T fn on p.45 says we should deserialize an M-long octet sequence into a T value. So my guess is that (161) is wrong.
Assuming m is Y_64 then deserializing T means that 128-64 octets must be used to get the other elements in T (s, d, a, g). but even if all of them were 8 octets that would mean that we're left with 32 extra octets. so I'm a bit lost
(edited)
# 2024-09-20 02:24 sourabhniyogi: How does Coreplay, with its stop-freeze-registers-resume pattern fit into JAM?
This would be the major "feature request", in addition to:
- connecting JAM back into Polkadot's { Staking, Coretime, and Asset Hub } system chains
- getting complete about the rewards/punishment for guaranteeing/auditing/...
- JAMNP
- getting clear about how the (refineless) privileged services work from a user point of view
I hope by Strongly Web3's "equidistant from all teams" principle, we can have multiple teams (3+) undergoing audits and not just PolkaJAM in the Q2 timeframe =).
(edited)
# 2024-09-20 07:50 gav: sourabhniyogi: I am happy to field specific questions on interpretation of the Gray paper. I am not happy to be managed.
# 2024-09-20 08:04 gav: > <@prematurata:matrix.org> is m of deferred transfer Y_64 or Y_128? i ask because in 12.4 the text reads "a memo component m of 64 octets" but then it's defined as m ∈ Y_M (161) and M is equal to 128 in the appendix.
>
> What is even more interesting is that the Ω_T fn on p.45 says we should deserialize an M-long octet sequence into a T value. So my guess is that (161) is wrong.
>
> Assuming m is Y_64 then deserializing T means that 128-64 octets must be used to get the other elements in T (s, d, a, g). but even if all of them were 8 octets that would mean that we're left with 32 extra octets. so I'm a bit lost
That was a typo in the text: where it reads 64 should read 128 and is fixed in the next release. The other points are invalid. The deserialization you refer to is moot (I have since removed the function as it's a no-op here), and "deserializes" only the memo data. The other fields of the record are provided elsewhere.
# 2024-09-20 13:50 prematurata: > <@gav:polkadot.io> That was a typo in the text: where it reads 64 should read 128 and is fixed in the next release. The other points are invalid. The deserialization you refer to is moot (I have since removed the function as it's a no-op here), and "deserializes" only the memo data. The other fields of the record are provided elsewhere.
thanks
# 2024-09-22 21:19 danicuki: What is the meaning of av[c] here? As far as I understood, av is an integer.
I interpret it as the core associated with validator in index av. But I am not sure the formula is accurate to the meaning.
# 2024-09-22 21:32 danicuki: I think I found the answer: it should be af, not av, right?
# 2024-09-23 08:49 danicuki: Should Formula (128)
af[c] ⇒ ρ†[c]≠∅
be
af[c] ⇔ ρ†[c]≠∅
?
# 2024-09-23 09:25 danicuki: Another doubt: What does H(Hp, af) mean in formula 126? The H function should take only one argument, no?
# 2024-09-23 10:49 gav: > <@danicuki:matrix.org> Should Formula (128)
>
> af[c] ⇒ ρ†[c]≠∅
>
> be
>
> af[c] ⇔ ρ†[c]≠∅
>
> ?
No.
# 2024-09-23 10:50 gav: H is assumed to encode any arguments given prior to hashing. Multiple arguments are treated as tuples.
(edited)
# 2024-09-23 14:17 danicuki: > <@gav:polkadot.io> H is assumed to encode any arguments given prior to hashing. Multiple arguments are treated as tuples.
Just to make sure I understood correctly:
af is a binary string of 341 elements (e.g. represented in programming languages as an array of integers). It is not clear to me how we should hash this tuple. There is no clear definition of how to hash a tuple in the GP, only H(m ∈ Y).
Should I assume: concat the Hp binary with the 0 and 1 af array and then hash the result of this concatenation?
# 2024-09-23 17:58 vinsystems: After implementing the codec functions, I noticed that there is already a crate
parity-scale-codec. Does this crate have the same encode implementations as the GP codec functions?
(edited)
# 2024-09-23 19:59 tomusdrw: There are some similarities, but nope. AFAIR SCALE was initially mentioned in the Gray Paper (and is mentioned on the JAM prize website), but it is no longer. Main difference is the variable-length encoding of numbers.
# 2024-09-23 22:47 gav: FWIW the only difference between SCALE and JAM's serialization codec is in treatment of compact integers.
# 2024-09-23 22:36 gav: There's a lot of programming languages in the world. I'd avoid trying to make blanket statements about them.
(edited)
# 2024-09-23 22:38 gav: As I wrote:
> H is assumed to encode any arguments given prior to hashing. Multiple arguments are treated as tuples.
# 2024-09-23 22:39 gav: The GP defines how to encode tuples and how to hash octet-sequences.
# 2024-09-24 02:34 gav: To avoid further confusion I'll make it even more explicit with the sentence:
> The inputs of a hash function should be expected to be passed through our serialization codec $\mathcal{E}$ to yield an octet sequence to which the cryptography may be applied.
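Under that sentence, hashing a multi-argument call reduces to hashing the serialization of the tuple, which is the concatenation of the serialized elements. A minimal sketch, assuming Blake2b-256 for H and arguments that have already been passed through the codec E (`gp_hash` is a hypothetical name):

```python
import hashlib

def gp_hash(*encoded_args: bytes) -> bytes:
    """Sketch: H applied to a tuple of already-E-encoded arguments.
    Assumes Blake2b-256 as the underlying hash; a tuple serializes as the
    concatenation of its elements' encodings."""
    return hashlib.blake2b(b"".join(encoded_args), digest_size=32).digest()
```

So for the H(Hp, af) case discussed above, one would hash E(Hp) concatenated with E(af), exactly the "concat then hash" reading danicuki suggests.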
# 2024-09-24 02:58 xlchen: GP defines BLS pubkey to be 144 bytes but it seems like the BLS pubkey is usually 48 bytes?
# 2024-09-24 03:04 xlchen: seems like the raw type (blst_p1) is 144 bytes but that should be internal details
# 2024-09-24 08:59 davxy: The public key for the scheme consists of two public elements, each corresponding to one of the two groups used by BLS for pairing: G1 and G2. Specifically, it is represented as (g1^sk, g2^sk), where g1 is the generator of G1 and g2 is the generator of G2. Each point in G1 is compressed into 48 bytes, while each point in G2 is compressed into 96 bytes. This scheme allows for fast verification of aggregate signatures (which are very useful for bridges). In particular the G1 component allows for aggregate signature correctness verification without the need for N expensive pairings, with N the number of aggregated signatures. The full scheme is detailed in this paper:
https://eprint.iacr.org/2022/1611 and is implemented by this library:
https://github.com/w3f/bls. The authors of the paper are Jeff, Syed, Alistair and Oana Ciobotaru
(edited)
# 2024-09-24 15:07 dakkk: I have a doubt; in A.1, we define the program blob p (octets containing instructions c, bitmask k and jump table j).
Then in A.7 we say "We thus define the standard program code format p, which includes not only the instructions and jump table (previously represented by the term c), but ..."
This I think is wrong since we already defined p as the combination of c, k and j; the term c previously indicates only the instruction data.
# 2024-09-24 16:45 dave: Would it make sense to maintain an extended recent block history for checking prerequisite WP hashes? As it stands, a work package can have a recent enough anchor block to in theory be reportable, but not actually be reportable because the reporting of its prerequisite WP has been forgotten about. A WP with a prerequisite essentially has a shorter lifetime than other WPs because of this.
# 2024-09-25 02:18 xlchen: isn't this a circular dependency? The sealing key depends on unsigned block, but it includes Hv, and Hv depends on the VRF output of Hs?
# 2024-09-25 06:17 celadari: I think you first generate the anonymous VRF (with ring root) as a ticket submitter - you don't need aux for that: only context.
This anonymous VRF happens to be equal to Y(H_s). You compute a signature H_v and then you compute H_s.
Everything I said is you as a block author.
(edited)
# 2024-09-25 09:04 celadari: I think you generate Y(H\_s) as a normal VRF using the context X\_F ++ eta\_3: anonymous and non-anonymous method give same VRF so here you can use whichever you want. You don't need aux for VRF output: just the context.
You then compute signature H\_v and then you compute signature H\_s.
Again, everything I said is you as a block author.
If someone can confirm what I said it would be nice
(edited)
# 2024-09-25 02:20 xlchen: or does the message (encoded unsigned block) won't impact the VRF output? so I can pass empty data when calculating the VRF output?
# 2024-09-25 15:41 davxy: I confirm.
Y(H_s) can also be generated independently from the signature (you just need the context). That is what breaks the cyclic dep.
# 2024-09-28 02:02 gav: 1. Yes.
2. Yes.
3. No - the GP describes the signed material, but other data may be passed in addition for context. The signed material itself doesn’t need the block hash.
(edited)
# 2024-09-30 14:33 dave: Re (3), because the block's hash is not included in the signed data, it doesn't seem unlikely that an announcement intended for a block on one fork could be fiddled with and then used in the context of a block on a different fork? Maybe this is not a problem but that is not clear to me
# 2024-09-30 16:40 dave: One more question on this... AFAICT, as currently specified, auditing of a block is performed by the prior validator set rather than the posterior validator set. This seems a bit odd, given that eg the availability assurance stuff uses the posterior validator set. Wondering if this is intentional or if it was simply missed when lots of things were changed to use posterior state?
# 2024-09-30 22:45 gav: They’re two very different processes - auditing is entirely off-chain, recorded publicly only through grandpa; assurance is recorded on-chain - so I wouldn’t draw any conclusions about one from the other.
(edited)
# 2024-09-30 22:47 dave: Well auditors may request shards from assurers. If they're always the same set you might for example only service shard requests if you have seen an appropriate audit announcement first (not sure if this would be sensible)
# 2024-09-30 04:37 gav: Note that the syntax in the preceding portion is the little-used numeric tuple subscript (the subscript “2”). I intend to remove this syntax in an upcoming revision. Instead the entire term “(x_s)_l[h,z]_2” can simply be replaced with “w”.
# 2024-09-30 07:36 emielsebastiaan: Graypaper releases with black backgrounds remain a pain point for me personally. Does anyone have a convenience download of the 0.3.8 release without the background and with black text? For a while these were available for download on the Graypaper site in the resources section.
# 2024-09-30 22:39 gav: But regardless, even if they were not it would not alter anything in the spec.
# 2024-09-30 22:49 gav: David Emett: There is not really the prior/posterior difference for auditing as it's a fully off-chain process. With assurance, you actually check signatures on-chain, so you need to select which of the two sets those sigs refer to. Auditing doesn't happen inside a block. No audit signatures are routinely checked on-chain (judgements are the exception, and these are intentionally drawn from the prior set). Auditing happens _between_ blocks. So there really isn't a prior or posterior. There's just the "current" validator key set. I suppose that could be phrased as the posterior of the block associated with the timeslot at the point that the auditing process began.
(edited)
# 2024-09-30 22:49 gav: The slight annoyance for clients is that there may be several forks.
# 2024-09-30 22:50 gav: So clients need to track all forks and do all audits of those forks where they are in the resultant (i.e. posterior) set.
(edited)
# 2024-09-30 23:14 dave: AIUI auditing happens in the context of a block, and this block has a prior validator set and a posterior validator set. We presumably need to pick one of these to be the validator set that is expected to audit the block? I don't understand how you could just use the "current" validator set as (a) there is no consensus over this and (b) auditing is stateful and thus we presumably need the same validator set throughout the whole auditing process for a block?
# 2024-09-30 23:18 gav: So both auditing and block STF happen in the context of some pre-existing chain-head and its implied (posterior) state. In the context of the block STF (which essentially validates a candidate child-block and determines the implied (posterior) state), then we now have two states - the old implied state and the new implied state. I call these two the prior and the posterior, since one is the state before the STF and the other is after the effects of the STF.
(edited)
# 2024-09-30 23:20 gav: Now, depending on how you argue, this could be named the prior or the posterior. But the reality is that neither name is sensible.
(edited)
# 2024-09-30 23:23 gav: AFAIK I never actually specify either term in the section on auditing (nor should I have). I guess the confusion comes from the fact that I didn't suffix kappa with a prime. But this wouldn't make sense since there is no concept of a non-prime kappa here.
# 2024-09-30 23:26 dave: You state here that the terms are in the context of the block that is being audited, so sigma prime is the posterior state for example
# 2024-09-30 23:28 dave: Particularly given that you use for example bold W to refer to the accumulated reports, which is defined in terms of prior state and the extrinsic
# 2024-09-30 23:29 gav: So basically replacing
> assume ourselves focused on some block B with other terms corresponding, so σ′ is said block’s posterior state, H is its header &c.
with
> assume ourselves focused on the most recent implied state of the chain, σ. The header of the most recent block of the chain is assumed as H.
(edited)
# 2024-09-30 23:29 dave: It would seem odd to say that "non-prime" names actually refer to items in the posterior state of the block
(edited)
# 2024-09-30 23:33 gav: But using prime doesn't make sense here for the reasons I mention above.
# 2024-09-30 23:42 dave: Not sure I agree. But in any case, if we agree that the validator set which should audit a block is the validator set in the state after the block has been executed, then I'm happy
(edited)
# 2024-09-30 23:54 gav: Ok - I notice that we need to draw upon the prior of rho in order to describe work-reports which became available during the block. So it's going to be convenient (indeed, pretty much necessary) to have both prior and posterior even though auditing _per se_ doesn't impact their relationship.
(edited)
# 2024-09-30 23:55 gav: So we'll keep it as-is and I'll add some clarifying text and a prime to kappa.
# 2024-10-01 00:35 dave: LGTM. Getting back to the original question, should the \kappa bits be \kappa'?
# 2024-10-01 03:48 gav: I think kappa’ makes more sense yes. That's now in the PR.
(edited)
# 2024-10-04 11:06 prematurata: > <@gav:polkadot.io> Shouldn't what be Y?
sorry, (181) the return should be a sequence of Y imho as the output of zeroPad(P) does not guarantee the output size
# 2024-10-07 07:27 gav: The zeropad function guarantees that the output length is an integer multiple of the subscript (l = W_S·W_E, the length of a single segment in bytes). In this case the zeropad function just has the effect of padding the segment to exactly this size (blackboard Y_l === blackboard G) since its input is always greater than zero and never bigger than a segment.
(edited)
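The padding behaviour discussed above can be sketched as follows. `zero_pad` is a hypothetical helper standing in for the GP's subscripted zeropad, assumed to append zero octets up to the next integer multiple of n:

```python
def zero_pad(data: bytes, n: int) -> bytes:
    """Sketch of zeropad_n: append zero octets until the length is an
    integer multiple of n (a no-op if it already is)."""
    return data + bytes(-len(data) % n)
```

When the input is non-empty and no larger than one segment, padding to a multiple of the segment length yields exactly one full segment, which is the case gav describes.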
# 2024-10-08 13:01 gav: The bits function for building tree nodes is now MSB-first and the PVM spec has several fixes as well as a fair amount of restructuring needed for OrdAcc.
(edited)
# 2024-10-08 13:06 gav: > <@dakkk:matrix.org> that "i" should be "l"?
Please link to GPreader - images don't seem to work on this server
# 2024-10-09 09:04 prematurata: was looking at 0.4.0 i noticed that the isAuthorized invocation's F function at 267 should "return" 2 registers.
it was like this even in 0.3.8 but i believe it is a typo. F should return **all** registers considering it is being used by Ψ_M and Ψ_H, which use F to build ω''
(edited)
# 2024-10-09 09:05 prematurata: same goes for the number of input registers. shouldn't that be 13?
# 2024-10-09 09:06 prematurata: side note: if i am wrong then what is the expected output if n != gas? i see w_8 is now being used
# 2024-10-09 09:28 gav: Planned changes for 0.5 will be comparatively minor and very much directed towards tweaking and optimization:
- Post-Accumulate Preimage Integration
- Extrinsic Commitment should be hash of hashes
- 64-bit PVM
# 2024-10-09 09:43 gav: > <@prematurata:matrix.org> same goes for the number of input registers. shouldnt that be 13?
Yes, this is an old typo and will be corrected in 0.4.1
# 2024-10-09 09:44 prematurata: > <@gav:polkadot.io> Yes, this is an old typo and will be corrected in 0.4.1
thanks for confirming, then... is w_8 value to be placed in w_1 and the rest set to zero?
# 2024-10-09 09:51 gav: w\_1 isn't used any more. (it was also a typo from when the register sets were limited to the argument/return value registers)
(edited)
# 2024-10-09 13:09 prematurata: since you might be there gav you might also want to check the first case of C (279) and (280) > check fn
(edited)
# 2024-10-09 15:53 prematurata: quick question, initially i thought it was a typo but I am now sure i am missing something, what is _t in relation to a service account? for example in the new accumulate function both bold s and bold a reference this _t
(edited)
# 2024-10-09 15:56 prematurata: as well as (X_{bold s})_t in the first matching case of new
# 2024-10-09 19:18 tomusdrw: hi there! we've just deployed a new version of the GP Reader. It has much shorter links and displays notes as annotations on the document itself. If you run into any issues please let us know on github. The GP is also updated to the latest 0.4.1 released earlier today.
# 2024-10-10 06:50 dakkk: gav: in your opinion what are the parts of the paper that you consider almost "written on stone" (or at least enough solid to be considered almost ready)?
# 2024-10-10 06:53 gav: Nothing is likely to change very much now. Safrole might not change at all.
# 2024-10-10 06:54 gav: It’s hard to see what will need to be tweaked until we begin actually testing and benchmarking.
# 2024-10-10 06:55 gav: But beyond the issues already in the GP repo and its readme, there’s nothing on my mind which needs altering.
# 2024-10-10 12:57 dvladco: Hi everyone, we noticed that the polkavm rust implementation charges 1 extra gas when load/store fails with out of bounds error, however we couldn’t find this described in the graypaper, is this expected behaviour?
# 2024-10-10 13:01 jan: Do you mean that there is extra gas charged on top of the 1 gas that the instruction costs normally?
If so then that's a bug. Nevertheless, it doesn't really matter what PolkaVM does here. PolkaVM is not a reference implementation of the GP; it's just an implementation, so you need to remember that just because PolkaVM does something it doesn't mean it's correct. :P
# 2024-10-10 13:18 dvladco: I understand, I was asking this because there is a PR for the jamtestvectors and as I can see these are based on the PolkaVM implementation, so we are also trying to validate our implementation against these tests
# 2024-10-10 13:20 dvladco: in this case I assume a more appropriate place to ask this would be in the PR itself :)
(edited)
# 2024-10-10 13:21 jan: The gas cost model in general is still a work-in-progress, so this is not specced in the GP yet (nor it should be considered final behavior), however the reason you're seeing 1 extra gas being charged there is because we charge gas on an entry to a basic block, and not per instruction.
# 2024-10-10 13:22 jan: Basically, for efficiency's sake what we do is we calculate the gas cost for the whole basic block, and charge the whole cost at the start.
# 2024-10-10 13:22 jan: And in that particular test case you have a load/store instruction in the middle of a basic block, so the execution gets interrupted.
# 2024-10-10 13:23 jan: But since we've already entered the basic block the gas was already charged, as if the _whole_ basic block was executed.
(edited)
# 2024-10-10 13:25 jan: So charging the gas at the start of basic blocks allows us to do two things: 1) be more efficient (charging gas on each executed instruction is expensive), and 2) it should allow us to have a better gas cost model which will also take the dependencies between the instructions into account instead of only assigning static costs to each instruction in isolation
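The per-basic-block charging scheme jan outlines can be sketched as follows. The representation (a program as a list of basic blocks, each a list of (static_cost, execute_fn) pairs) is hypothetical, not PolkaVM's actual implementation:

```python
class OutOfGas(Exception):
    pass

def run_blocks(blocks, gas):
    """Sketch of basic-block gas metering: the whole block's precomputed
    cost is charged on entry, before any of its instructions execute."""
    for block in blocks:
        cost = sum(c for c, _ in block)  # static cost of the entire block
        if gas < cost:
            raise OutOfGas
        gas -= cost
        for _, execute in block:
            # An instruction may trap mid-block (e.g. an out-of-bounds
            # load/store), but the full block cost has already been paid.
            execute()
    return gas
```

This reproduces the observed behaviour: a load/store trapping in the middle of a block still "costs" the instructions after it, because charging happened at block entry.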
# 2024-10-10 13:34 dvladco: oh I see, thanks for explaining, that's what I needed to know. So charging for the whole block even though it is probably not going to be fully executed is what we actually want, is this behaviour going to be described in the gray paper eventually?
# 2024-10-10 13:52 gav: The behaviour will be fully described, but for now we will be removing these non-concordant test instances.
# 2024-10-11 08:02 luke_f: Hello
Could anyone please help me understand Eq. 159?
https://graypaper.fluffylabs.dev/#/c71229b/152702152702
Is the symbol ⋃ defined anywhere? I couldn't find a definition.
I’m interpreting it as the union of all dictionaries in Eq. 158, but I’m not sure what exactly the union of dictionaries means in this context. Could someone explain?
Thank you!
# 2024-10-11 15:01 prematurata: > <@dakkk:matrix.org> I'm a bit confused about this list indexing notation, where a dot is used. I've found this notation only in zip/lace functions of appendix H. I'm not sure what it means, and I can't find a reference in the Notation chapter
>
>
https://graypaper.fluffylabs.dev/#/c71229b/382700382700
I 'm interested as well. i assumed
times
but didnt explore any further
# 2024-10-14 12:32 dave: A question concerning audit announcements. If we see a negative judgment for a work-report from some other auditor, we are required to audit the work-report ourselves. The GP states that we must announce our intention to audit this work-report ("In all cases, we publish a signed statement of which of the cores we believe we are required to audit"). Currently this isn't really possible in JAM-SNP as the "evidence" for requirement to audit a work-report only covers the no-shows case. I can extend the announcement message format to support this but I'm not sure what the reason is for announcing in this case, is it actually necessary?
# 2024-10-15 02:21 stanleyli: Hi David Emett , I have read your wonderful JAMNP-S simple.md. I have a few questions and was wondering if you could answer them.
(1)Are the [Segment Shard] of CE139/140 the same as those of CE137?
(2) in CE 139, you mentioned that "Guarantors should initially use protocol 139 to fetch segment shards. If a reconstructed import segment is inconsistent with its reconstructed proof". What exactly is this "reconstructed proof" when CE 139 only asked for [Segment Shard]? Perhaps you meant Guarantor will use [Segment Shard] retrieved from Assurer to verify against its own pageProof that it sent out during CE 137?
(3) You mentioned that guarantors will use CE139/CE140 to request import segment shards from assurers. However, Builder specify Imported segments using the combination of ({tree_root, index})[
https://github.com/w3f/jamtestvectors/blob/master/codec/data/work_package.json#L22-L35] in CE 133 and not [Erasure Root ++ Shard Index ++ len++[Segment Index]]. Should CE 139 have some alternative request format using only [segment_root ++ index]?
(4) I need clarification on how pageProof is being used in CE139/CE140? Our understanding is that the page proof is quite large as it's encoded with Justification & ↕s_{i⋅⋅⋅+64} in
https://graypaper.fluffylabs.dev/#/c71229b/1a15001a1500. So it's probably possible to recover exported segments from the page proof alone;
Do I understand this wrong?
# 2024-10-15 11:09 dave: > Are the [Segment Shard] of CE139/140 the same as those of CE137?
Yes. In CE137 the assurer receives a Segment Shard sequence, the CE139/140 Segment Index is an index into this sequence.
# 2024-10-15 11:19 dave: > What exactly is this "reconstructed proof" when CE 139 only asked for [Segment Shard]
The segments that are erasure coded consist of the segments exported by the work package _plus_ a sequence of "proof" pages, essentially containing Merkle proofs from the segment root to the exported segments. When you fetch an exported segment, you also need to fetch the corresponding proof to the segment root, so that you can convince yourself and auditors that the segment is correct.
# 2024-10-15 11:23 dave: In the full network protocol these proof pages might be implicitly returned, but with SNP you will need to explicitly ask for them.
# 2024-10-15 11:39 dave: > You mentioned that guarantors will use CE139/CE140 to request import segment shards from assurers. However, Builder specify Imported segments using the combination of ({tree\_root, index})\[
https://github.com/w3f/jamtestvectors/blob/master/codec/data/work\_package.json#L22-L35\] in CE 133 and not \[Erasure Root ++ Shard Index ++ len++\[Segment Index\]\]. Should CE 139 have some alternative request format using only \[segment\_root ++ index\]
Guarantors will need to maintain a map from Segment Root to Erasure Root. This can be built from the work-reports placed on-chain / distributed via CE135. Note that an individual assurer cannot in general prove the relationship between a segment shard and the segment root, only between the shard and the erasure root. This is why for CE139/140 an erasure root is sent rather than a segment root. Note also that any segment root --> erasure root mapping constructed based on reports included on-chain may have incorrect entries, as reports included on-chain are not necessarily correct. Guarantors can conclude that an erasure root was incorrect if after fetching all shards with CE 140 and verifying all the justifications, the reconstructed segment and proof do not match up.
(edited)
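The bookkeeping dave describes could look roughly like this. The helper names are hypothetical; neither the GP nor JAMNP mandates any particular data structure:

```python
class SegmentRootIndex:
    """Sketch of a guarantor's segments-root -> erasure-root map, built from
    work-reports seen on-chain or distributed via CE135, so that CE139/140
    requests can be keyed by erasure root."""

    def __init__(self):
        self._by_segments_root = {}

    def record_report(self, segments_root, erasure_root):
        # Entries may later prove incorrect, since included reports are not
        # necessarily correct; callers verify via fetched justifications.
        self._by_segments_root[segments_root] = erasure_root

    def erasure_root_for(self, segments_root):
        return self._by_segments_root.get(segments_root)
```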
# 2024-10-15 11:51 dave: > I need clarification on how is pageProof being used in CE139/CE140? Our understanding is that page proof is quite large as it's encoded with Justification & ↕si⋅⋅⋅+64 in
https://graypaper.fluffylabs.dev/#/c71229b/1a15001a1500. So it's probably possible to recover exported segments from pageproof alone;
The proof pages contain Merkle proofs from the segment root to the exported segments. Each proof page/segment contains proofs for 64 exported segments. You certainly can't reconstruct the exported segments themselves from these proof pages. CE139/140 is intended to be used to fetch proof page shards as well as exported segment shards. You need to explicitly request proof page shards by passing the appropriate segment indices.
# 2024-10-17 23:22 stanleyli: Hi gav: is there a typo in the Page proof function here?
https://graypaper.fluffylabs.dev/#/293bf5a/1a4f001a5100
In eq 196, is the ↕s referring to the actual segment or the segment hash? If it’s the segment, then each element within the page proof (encoded as 6Hash + G) would make this page proof extremely large (up to 64 G) ?
(edited)
# 2024-10-18 03:27 clw0908: Hello everyone:
Based on the GP's definition, I speculate the following, (not certain if it's correct):
(1) The genesis state will have privileged service code to obtain preimages.
(2) Users will use (1) to get their code into on-chain preimages.
My question is: How will the privileged service get the code into the system?
# 2024-10-18 11:23 gav: > <@stanleyli:matrix.org> Hi gav: is there a typo in the Page proof function here?
https://graypaper.fluffylabs.dev/#/293bf5a/1a4f001a5100
>
> In eq 196, is the ↕s referring to the actual segment or the segment hash? If it’s the segment, then each element within the page proof (encoded as 6Hash + G) would make this page proof extremely large (up to 64 G) ?
Yes, as David Emett suggested, there should be a
H^#(...)
around the
s_{i...+64}
# 2024-10-21 03:08 sourabhniyogi: I am likely missing a crucial point or important detail concerning the idea that, in order for guarantors to use CE139/CE140, every validator must maintain a map from every Segment Root to Erasure Root -- this now requires every validator to become an indexer of pretty much a month's worth of work packages' work reports to be able to be a guarantor. No problem in indexing a month's worth of ImportDA activity, but validators need to take every single work report across all of JAM, and index ALL the _potential_ segment roots to Erasure Roots just to be able to _maybe_ fetch some import segment in some future work package's work items -- Could this be?? If so, what is the simplest process to map an on-chain work report to all its segment roots _without_ auditing the work report?
As of 0.4.x, imported segments can now also be specified in a SECOND way with $H^\boxplus$ as (work package hash, segment index) combinations, and this further requires that validators build a lookup dictionary mapping these into segment roots. But given CE139 has
[Erasure-Root ++ Shard Index ++ len++[Segment Index]]
as the request key, it seems that this second way to specify imported segments is (a) sufficient for builders to specify import segments ... and (b) has far lower indexing requirements because validators just need to store a month's worth of workpackagehash => erasure-Root mappings to use CE139 (which are just sitting in the work report!). Would it be reasonable to drop the first "older" way with segmentroot and just keep this SECOND method as the sole method to specify imported segments in a work package? If not, why not?
I must have missed something crucial here... Thank you for your help!
(edited)
# 2024-10-21 09:10 dave: Each work report has only one segment root, this is in the availability specifier
# 2024-10-21 09:15 dave: Re the WP hash to segment root mapping, this only needs to be tracked for ~1 epoch, as the chain can only check mappings going this far back
# 2024-10-21 09:21 dave: This is also a reason for still permitting the "older" way: the "newer" way using WP hashes will only work for referencing segments exported by WPs reported in the last ~hour, whereas the "older" way should work for any segments exported in the last ~28 days
# 2024-10-21 09:21 dave: > <@dave:parity.io> Each work report has only one segment root, this is in the availability specifier
Probably this is what you are missing?
# 2024-10-21 09:23 dave: It should be possible to build this index using only the last 28 days worth of blocks
# 2024-10-21 20:29 sourabhniyogi: Alright, thank you very much for clarifying the "one segment root", there are so many different things "segments root" and "segment roots" could mean I'll make a point to use singular "segment-root (e)" based on your design and my now clear understanding of what is going on!
# 2024-10-21 20:36 sourabhniyogi: One remaining nitpick question is this: Are you saying that a work item of a work package is _invalid_ if it uses a (workpackagehash, index) combination to specify an import segment when the workpackagehash is more than an epoch old? Is this implied/stated in the GP somewhere? If not, should it be added or made more explicit?
(edited)
# 2024-10-21 21:31 dave: > <@sourabhniyogi:matrix.org> Alright, thank you very much for clarifying the "one segment root", there are so many different things "segments root" and "segment roots" could mean I'll make a point to use singular "segment-root (e)" based on your design and my now clear understanding of what is going on!
FWIW I changed the SNP doc to use "segments-root" as opposed to "segment root"
# 2024-10-21 23:02 dave: > <@sourabhniyogi:matrix.org> One remaining nitpick question is this: Are you saying that a work item of a work package is _invalid_ if it uses a (workpackagehash, index) combination to specify an import segment if the workpackagehash is more than an epoch old? Is this implied/stated in the GP somewhere? If not, it should be added or made more explicit?
Well, "more than an epoch old" is not exact. The GP is quite explicit about what work reports are valid for inclusion in a block and what happens to them after this point. The WP hash -> segment root mapping is checked at accumulation time, against the history of accumulated work packages, lower-case xi. This happens in the E function with the "x u w_l = w_l u x" check. The history is a circular buffer of length E (= timeslots per epoch), with one entry pushed every block. So it can store _at least_ one epoch's worth of history but possibly more if there are "missing" blocks
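As a rough sketch of the circular buffer dave describes (Python, with illustrative names and types; `E = 600` is the full-JAM timeslots-per-epoch value, and storing one set of accumulated work-package hashes per block is an assumption for demonstration, not the GP's exact encoding):

```python
from collections import deque

E = 600  # timeslots per epoch (full JAM value; test configs use smaller E)

class AccumulationHistory:
    """Toy model of the accumulated-work-package history (lower-case xi)."""

    def __init__(self, epoch_len=E):
        # deque with maxlen behaves as a circular buffer: pushing a new
        # entry when full evicts the oldest one.
        self.buf = deque(maxlen=epoch_len)

    def push_block(self, accumulated_wp_hashes):
        # One entry pushed per block.
        self.buf.append(set(accumulated_wp_hashes))

    def contains(self, wp_hash):
        # The WP hash -> segments-root mapping can only be checked against
        # entries still in the buffer.
        return any(wp_hash in s for s in self.buf)
```

With "missing" blocks fewer entries are pushed per wall-clock epoch, which is why the buffer can cover somewhat more than one epoch of real time.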
# 2024-10-21 23:44 sourabhniyogi: No debate on that, thanks for explaining. But on this last nitpick I mean the work items in the work package, as of 0.4.x have this $H^\boxplus$ source here:
https://graypaper.fluffylabs.dev/#/293bf5a/199e0019a500
As far as I can tell, I could have an old work package hash from a long time ago -- "whereas a value drawn from H⊞ implies the hash value is
the hash of the exporting work-package. In the latter case, it must be converted into a segment-root by the guarantor and this conversion reported in the work-report for on-chain validation." If you believe this last statement is only applied for around an epoch (with the circular buffer mechanics, as you mention) then we should ensure this is clear in GP, don't you think?
# 2024-10-22 10:37 gav: > <@sourabhniyogi:matrix.org> No debate on that, thanks for explaining. But on this last nitpick I mean the work items in the work package, as of 0.4.x have this $H^\boxplus$ source here:
>
https://graypaper.fluffylabs.dev/#/293bf5a/199e0019a500
> As far as I can tell, I could have an old work package hash from a long time ago -- "whereas a value drawn from H⊞ implies the hash value is
> the hash of the exporting work-package. In the latter case, it must be converted into a segment-root by the guarantor and this conversion reported in the work-report for on-chain validation." If you believe this last statement is only applied for around an epoch (with the circular buffer mechanics, as you mention) then we should ensure this is clear in GP, don't you think?
The guarantor will not be punished for introducing a WR with an old/unvalidatable SR lookup entry. They might just not get rewarded.
And it is beyond the scope of the GP (at least currently) to go deep into the optimum strategies across all off-chain behaviour.
# 2024-10-21 04:12 ksc85pwpj5: Excuse me, I would like to ask what the difference is between x ∪ wl and wl ∪ x, and what is the meaning of x ∪ wl = wl ∪ x.
(edited)
# 2024-10-21 06:11 prematurata: If I remember properly, the left or right side takes priority in case of keys existing in both elements. This means that a U b = b U a effectively means there are no conflicting keys, or if there are, their values are the same
# 2024-10-21 11:24 gav: Indeed. I was a bit torn over how to express this in the paper.
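In Python dict terms the rule above reads as: the union operator lets one side win on conflicting keys, so a U b = b U a holds exactly when any shared keys map to equal values. A minimal illustration (Python 3.9+ `|` operator; not the GP's notation, just an analogy):

```python
def union_commutes(a: dict, b: dict) -> bool:
    # With dict union, the right-hand operand wins on conflicting keys,
    # so the two unions differ exactly when some shared key has
    # differing values.
    return (a | b) == (b | a)
```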
# 2024-10-21 12:22 gav: It is the third item of (x\_s)\_l\[h, z\] with the condition that it's a three-item array (notice it's behind the "if" condition)
(edited)
# 2024-10-21 12:28 gav: It wasn't especially clear that the value was used as it was only referenced with the "_2" subscript prior to the "if". This will be improved in 0.4.3.
# 2024-10-22 11:14 dvladco: Hello, I am struggling to understand appendix A.7. More specifically, I can't find the explanation for a bunch of variables like o, w, z, s in formula 260 and later a, W, R, V, A in formula 264. I have managed to decode RAM, mainly by looking at the polkavm rust implementation, however I don't understand how to decode the registers from the program code p
# 2024-10-22 11:27 jan: The formula (264) defines the memory map of the standard program initialization (that is: what is put at a given memory address, whether it's initialized, whether it's read-only/read-write, etc.). The formula (260) is essentially the standard JAM "program blob". You don't decode the registers from **p** (260); the initial values of the registers are given in (265).
# 2024-10-22 11:54 dvladco: Yeah, that would make sense, however I got confused by this sentence:
"We thus define the standard program code format p, which includes not only the instructions and jump table (previously represented by the term c), but also information on the state of the ram and registers at program start"
# 2024-10-22 11:56 dvladco: here it says that we get registers from program code p, and later in formula 259 we have this: Y(p) -> (c, ω, μ)?
(edited)
# 2024-10-22 12:19 jan: The wording here might be a little confusing as-is. The initial registers are defined as part of the standard program initialization (eq. 265), but they're
not explicitly part of the program blob (eq. 260). The Y in eq. 259 is meant to represent the standard program initialization, but it doesn't necessarily mean that the values of the registers are directly extracted from **p**.
# 2024-10-22 12:46 dvladco: Thank you, this explanation makes sense. Also, formula 266 shows a as separate from p. I guess formula 259 should be something like this then: Y(p, a) -> (c, ω, μ)?
(edited)
# 2024-10-22 13:19 dvladco: one more thing: formula 265 describes what the initial registers are, but I can't find defined anywhere how to convert a, which is a series of octets with at most Z_I elements, to registers [N_R]_13
# 2024-10-22 13:53 gav: It is |a|, the number of items in a, which should easily fit in register index 8.
# 2024-10-22 14:35 dvladco: I think I understand now, so when we refer to data arguments here it means a slice of bytes that can later be decoded when executing the program. I thought there would be a way to pass arguments as register values directly when invoking the PVM
# 2024-10-22 16:07 gav: It is perhaps not super clear, but it's intended to serialize all of the inner elements as E_4
# 2024-10-24 08:06 prematurata: it's not the first time i ask about this derived terms. my memory is not working properly :)
# 2024-10-24 13:02 luke_fishman: Hello gav
I think there is a mistake in Equation 183:
E(WQ, W∗ ...n) if i = 0
the second argument is of the type ⟦W⟧ (list of work reports), but the second argument that function E expects is D⟨H → H⟩ (Equation 164)
(edited)
# 2024-10-24 13:03 dave: > <@luke_fishman:matrix.org> Hello gav
> I think there is a mistake in Equation 183:
> E(WQ, W∗ ...n) if i = 0
> the second argument is of the type ⟦W⟧ (list of work reports), but the second argument that function E expects is D⟨H → H⟩ (Equation 164)
There is a PR open to fix this:
https://github.com/gavofyork/graypaper/pull/112
# 2024-10-24 21:13 danicuki: In formula 268, psi\_i definition:
https://graypaper.fluffylabs.dev/#/439ca37/29f80229fc02
↦ r where (g, r, ∅) = ΨM (pc, 0, GI , E(p, c), F, ∅)
What is pc? I believe there is a typo in the argument, since work packages do not have a field 'c'
(edited)
# 2024-10-24 21:25 emielsebastiaan: I believe a correction is needed in GP in the State Transition Dependency Graph section for gamma and kappa.
Suggested change: Remove ψ' from GP-0.4.3-eq:21 and add ψ' to GP-0.4.3-eq:19.
Likely an unchanged bit from an earlier version of GP.
Details here:
https://github.com/gavofyork/graypaper/pull/118 (edited)
# 2024-10-25 09:10 gav: > <@danicuki:matrix.org> In formula 268, psi\_i definition:
>
>
https://graypaper.fluffylabs.dev/#/439ca37/29f80229fc02
>
> ↦ r where (g, r, ∅) = ΨM (pc, 0, GI , E(p, c), F, ∅)
>
> What is pc? I believe there is a typo in the argument, since work packages do not have a field 'c'
Bold c is defined as the preimage of regular c.
# 2024-10-25 12:16 dave: The state keys as currently defined in the State Merklization section can collide for different bits of state. e.g. The keys for general service storage and for service preimages are calculated identically. It seems like a_l keys can be made to collide with almost anything, as both h and l are completely controllable by the code calling solicit.
# 2024-10-25 16:10 tomusdrw: > <@danicuki:matrix.org> What does \[WHAT, ω8, . . . \] mean here? What do I put on ...? I assume to pass through ω2, ω3, ... ?
>
https://graypaper.fluffylabs.dev/#/439ca37/2a60002a6000
That's how I understand it as well. You assign ω0=WHAT, ω1 = ω8, and the rest stays the same. The same notation is used later as parameters pass-through (e.g. Omega_G).
# 2024-10-25 16:33 sourabhniyogi: If you all have a precise idea of what a "null authorizer" should be in our JAM implementations [disconnected from the CoreTime system chain], like down to pvm assembly code included in genesis state, kindly share --I think we can converge on something very quickly?
# 2024-10-26 11:12 gav: (They're not going to be especially meaningful yet as we don't have a sensible gas model yet)
# 2024-10-26 11:51 gav: Note that it is omega_7 which gets set to the error code WHAT. omega otherwise remains the same.
# 2024-10-26 20:18 gav: Extrinsic preimages are expected to be passed alongside the work package by the block builder.
(edited)
# 2024-10-26 20:19 gav: Import segments are expected to be retrieved by the guarantor through erasure-code reconstruction from pieces from the validators collected via the network.
(edited)
# 2024-10-27 12:22 gav: But yeah, it's a provision from an older revision where it was desirable to ensure that equivalent keys in different services wouldn't result in similar trie paths.
# 2024-10-27 12:23 gav: It's less important with the current trie as there's already a provision to split it based on service ID. It's still more optimal to hash in order to uniformally distribute tree paths, and generally more secure to "salt" the hash with the service ID.
(edited)
# 2024-10-28 16:43 prematurata: Something i might be missing.
https://graypaper.fluffylabs.dev/#/439ca37/2bb6002bb800 here the preimage code (I guess) is provided to the invocation argument (d\[s\]\_bold\_c),
but the preimage section does not seem to mention any mechanism of storing/fetching the preimage for the service. I marked my code with a TODO and now I am fixing all the leftovers.
omega_n (the new service host call) does not seem to have access to such a preimage; rather, the codehash is provided directly in memory... so I'm a bit lost as to where the preimage is being handled
(edited)
# 2024-10-28 23:15 gav: > <@tomusdrw:matrix.org>
https://graypaper.fluffylabs.dev/#/439ca37/2e70002e7100 Should this d that's being subscripted here be just \mi? I'm having a hard time figuring out where the data necessary to build that dictionary is coming from. In other contexts d is just a dictionary indexed by service id afaict.
Yes. Will be corrected in next revision
# 2024-10-30 01:42 charliewinston14: Hello. I asked this in the Jam chat and didn't get a response so trying here (if this is not allowed let me know ill delete this post). Question about erasure coding in JAM. I’ve broken the blob into pieces, and then those pieces into the octet pairs. The pairs were then converted to 16 bits. I’m now trying to figure out the field element formula. It looks to be the summation of each bit multiplied by “vj” and this is where I’m not sure what value to use. If j = 7, then vj = α14 +α4 +α. What is α here?
# 2024-10-30 07:49 gav: Version 0.4.4 is out. This includes several corrections but two important protocol alterations. One change to the state merklisation, and one to the way that segment root lookup dictionary validations are done.
# 2024-10-30 07:50 gav: I’ll probably be pushing on with 0.5 mostly now with a number of smaller protocol tweaks. See the milestone in the GP report if you want to know what to expect.
# 2024-10-30 10:50 syed: The From_V function is basically computing the sum you are pointing at, but in real life you shouldn't do that: you should directly compute the code polynomial from the (m_0, m_1, ..., m_15) vectors without the need to convert to the standard field element representation.
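For the earlier question about α: it is the class of x, i.e. the element 0b10, in GF(2^16) under whatever reduction polynomial/basis the erasure code fixes. A toy Python sketch of that arithmetic follows; the pentanomial used here is a hypothetical stand-in chosen for illustration, NOT necessarily the GP's actual basis, so consult the spec/test vectors for the real one:

```python
# Assumed reduction polynomial x^16 + x^5 + x^3 + x + 1 (illustrative only).
MOD = (1 << 16) | (1 << 5) | (1 << 3) | (1 << 1) | 1

def gf_mul(a: int, b: int) -> int:
    # Carry-less ("Russian peasant") multiplication with reduction.
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << 16):
            a ^= MOD
    return r

def gf_pow(a: int, n: int) -> int:
    r = 1
    while n:
        if n & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        n >>= 1
    return r

ALPHA = 0b10  # alpha is the class of x
# v_7 = alpha^14 + alpha^4 + alpha, as in the question above
v7 = gf_pow(ALPHA, 14) ^ gf_pow(ALPHA, 4) ^ ALPHA
```

The field element for an octet pair is then the XOR of the v_j for each set bit j, which is exactly the summation the formula describes.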
# 2024-10-30 11:48 davxy: Currently, JAMNP does not incorporate any Beefy-related messaging. This gadget will need to be implemented on top of GRANDPA, which itself is not described by JAMNP yet.
I would tentatively assume that the protocol won’t diverge _significantly_ from the one described for Polkadot. Of course, there are some differences, such as constructing the MMR with accumulation outputs, and using only BLS instead of BLS or ECDSA for signatures. However, for context, there are numerous introductory resources on these concepts available.
Regarding the state of the "BEEFY Distribution" section in GP, it currently suggests signing each finalized block B. I'm not sure if this will later shift to a "sign every X blocks" approach similar to Beefy on Polkadot.
The signatures are for sure BLS signatures, which can be later aggregated for efficient light client verification. Aggregated signatures will necessitate an aggregated key, and APK proofs might be introduced to verify the validity of these aggregated keys (there is at least a talk and one excellent tutorial from Syed online).
Who is in charge of aggregation of BLS keys in JAM? Who is in charge of maintaining the keys commitment eventually required for APK proofs? Probably this aspect is outside of the core JAM protocol per se.
These aspects likely warrant further discussion, and it is probably not worth you (or anyone else) focusing on this right now, given that GRANDPA is not in M1, and potentially not even in M2 either, meaning BEEFY is further down the line (it may be the last component indeed). I believe further details will be provided when the time is appropriate.
(edited)
# 2024-10-30 18:40 davxy: Yeah. I think this walkthrough is very helpful for understanding some low level aspects underlying APK (aggregated public key) proofs. It may also serve as a beneficial read before diving into the ring-proof primitive we're using for Safrole (technical spec draft available here), as they share some similarities. However, unless you're interested for your own learning, it may not be essential to review this material. Regarding your JAM impl, you might prefer to just use FFI
(edited)
# 2024-10-31 10:48 sourabhniyogi: Below are some basic questions on Ordered Accumulation and Prerequisites.
*Background*. Let's say we wish to compute $C[n + 1] = A[n] + B[n] + C[n]$ (some kind of recurrence relationship, say Fibonacci, Tribonacci, and Quadranacci in a toy example, all of $A$, $B$, $C$ trying to write their value into service storage of key 0) using the refine work results of services $A[n]$, $B[n]$, $C[n]$ coming into accumulate. We would like service $C$ to be able to read the results of $A, B, C$ and write out $C[n+1]$ to the service storage of $C$ in $C$'s accumulate code.
Questions:
1. If we have a _single_ work package $p$ with 3 work items from service $A$, $B$, $C$ (with no prerequisites), there is **NO** way to have $C$'s write _guaranteed_ to happen **AFTER** $A$ and $B$. This is because $\Delta_{+}$ will initialize _parallelized_ execution of $\Delta_1$ for all 3 services through $\Delta_*$. Can you confirm this?
2. If we have _three_ work packages $p_A, p_B, p_C$ with 1 work item each ($p_A$ with refine work results from service $A$, $p_B$ from service $B$, $p_C$ from $C$), we **CAN** have $C$'s write happen after $A$ and $B$, by specifying BOTH $p_A$ and $p_B$ as _prerequisite_ work packages of $p_C$. This SET of TWO prereqs was not possible until GP 0.4.5 supporting a _set_ of prerequisites. Can you confirm this?
3. If (2) is correct, for the $C[n + 1] = A[n] + B[n] + C[n]$ write to happen _at the same time slot_ for some n, then another to happen within 2 timeslots for the next n, we **REQUIRE** _three_ cores for _three_ work packages, one core for each work package. In a tiny $V=6,C=2$ test configuration, we CANNOT achieve this, but in a small configuration with $V=9,C=3$, we CAN. Can you confirm this?
4. Actually, the answer to (3) is more nuanced! We could actually solve this ordered accumulation $C[n + 1] = A[n] + B[n] + C[n]$ with only TWO cores working on TWO packages: one package $p_{A,B}$ with 2 work items using services $A$ and $B$, and then another package $p_C$ with ONE prerequisite, $p_{A,B}$. Then with a tiny $V=6,C=2$ test configuration, we CAN achieve the read of $C$ based on the writes of $A$ and $B$, all in one accumulate at the same time slot. Can you confirm this?
5. With a tiny testnet of $V=6,C=2$, it IS possible to test (4) using $\Delta_+$ with $i$ splitting ${\bf w}$ into two pieces that test $\Delta_*$ and the tail recursion of $\Delta_+$. To observe $N$ calls to $\Delta_+$, you need $N-1$ cores though, and need to figure out how to manage $g$ in $N$ splits. Can you confirm this?
[If this is wrong, please advise how we should test $\Delta_+$]
6. Setting aside gas limits for a while (because at present every operation is gas 1 or so, and the accumulation limit is 5-6 orders of magnitude higher), we can have 25 cores working on 25 work packages like this all in one accumulate of $Z[n+1]$, with each work package having 2 prerequisites (except for $p_{A,B}$, which has none):
* $C[n + 1] = A[n] + B[n] + C[n]$ ($p_C$ depending on $p_{A,B}$ using 2 cores)
* $D[n + 1] = B[n] + C[n+1] + D[n]$ ($p_D$ depending on $p_{A,B}$ and $p_{C}$ using 1 more core for $D$)
* $E[n + 1] = C[n+1] + D[n+1] + E[n]$ ($p_E$ depending on $p_{C}$ and $p_{D}$ using 1 more core for $E$)
* $F[n + 1] = D[n+1] + E[n+1] + F[n]$ ($p_F$ depending on $p_{D}$ and $p_{E}$ using 1 more core for $F$)
* ... and so on until ...
* $Z[n + 1] = X[n+1] + Y[n+1] + Z[n]$ ($p_Z$ depending on $p_{X}$ and $p_{Y}$ using 1 more core for $Z$)
Assuming all 25 cores can complete their 25 refines and get 25 work reports guaranteed and assured in a medium configuration ($V=120,C=40$), we can have the entire ordered accumulation done in ONE timeslot. Can you confirm this?
Because there is a lot of formatting that github treats at least somewhat better, also putting it here:
https://github.com/gavofyork/graypaper/issues/129 -- is that better?
# 2024-10-31 13:48 gav: 2. The intention was only to give the same guarantee as in (1). However the way the queuing works at present will, I think, allow this.
(edited)
# 2024-10-31 13:54 gav: The invariant you can definitely rely on is that work packages will not be accumulated any later than their dependencies.
(edited)
# 2024-10-31 13:55 gav: In reality the order of item accumulation of any single service may come down to that service’s Accumulate code.
(edited)
# 2024-10-31 13:57 gav: For dependent WPs with items in different services, you’ll just be guaranteed that the dependency service doesn’t see the dependent service’s changes before it’s own happen.
# 2024-10-31 13:58 gav: > <@sourabhniyogi:matrix.org> Below are some basic questions on Ordered Accumulation and Prerequisites.
> [...]
Sounds right, yes.
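The one invariant stated above (a work package is never accumulated later than its dependencies) amounts to a topological ordering over the prerequisite graph. A toy Python sketch, with illustrative names and none of the GP's actual $\Delta_+$ queuing machinery:

```python
from graphlib import TopologicalSorter

def accumulation_order(deps: dict) -> list:
    # deps maps a work-package id to the set of its prerequisite ids.
    # static_order yields every node after all of its predecessors,
    # i.e. no package before its dependencies.
    return list(TopologicalSorter(deps).static_order())
```

For example, the $p_{A,B}$ / $p_C$ / $p_D$ chain from question 6 yields an order with $p_{A,B}$ first, then $p_C$, then $p_D$.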
# 2024-10-31 14:00 jaymansfield: Where can I find function C defined? I don’t see it in section 12 and in the appendix under C it says “see equation ??”
(edited)
# 2024-10-31 14:05 gav: > <@jaymansfield:matrix.org> Where can I find function C defined? I don’t see it in section 12 and in the appendix under C it says “see equation ??”
Please provide link with GP reader.
# 2024-10-31 14:11 jaymansfield: Looking for function C to be able to generate the accumulate root
# 2024-10-31 14:13 gav: and as the GP says right above your selection, it's defined in section 12.
(edited)
# 2024-10-31 14:15 jaymansfield: > <@gav:polkadot.io> and as the GP says right above your selection, it's defined in section 12.
Thanks I’ll take a look. Confusing though as there is no mention of C in 177 either.
# 2024-10-31 14:19 jaymansfield: Oh thanks got it now! I should have specified which GP version. In v0.4.5 I found it as 182.
# 2024-10-31 14:22 gav: I was just referencing the GP you used in your link, but yeah, using numbers is a bad idea :)
(edited)
# 2024-10-31 14:28 gav: > <@gav:polkadot.io> Sounds right, yes.
Answered in the issue.
# 2024-10-31 18:04 dvladco: Hello, I have a few questions about the solicit host call for the accumulation host functions
1: This condition:
if h ≠ ∇ ∧ (h, z) ∉ (x_s)l
I assume should be
if h ≠ ∇ ∧ (h, z) ∉ K((x_s)l)
since l is a dictionary, so here we are trying to verify whether the key is in the dictionary
2: one line below, here:
if (x_s)l[(h, z)] = [x, y]
I can't figure out where the _x_ and _y_ (italic) come from? (edit: and _t_ as well)
(edited)
# 2024-10-31 18:11 gav: 1. When it’s unambiguous then I don’t bother writing the extra K() since it clutters and adds no clarity. This is often the case in dictionaries when we care about key inclusion.
# 2024-10-31 18:12 gav: 2. They are free variables defined by the fact that the value in question has two items.
(edited)
# 2024-10-31 18:14 dvladco: nevermind about t, I see in the latest version it's changed :)
# 2024-10-31 18:16 dvladco: So basically, if l = [x, y] just means we check that there are two items
# 2024-10-31 22:29 dave: The preimage hash is passed in to eg solicit directly, it's trivial to pass in almost-colliding hashes. As the trie key construction function doesn't preserve all hash bits, these almost-colliding hashes could actually collide in the trie if they are not hashed beforehand
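The concern above can be illustrated with a toy model: if a trie key keeps only some bits of an attacker-chosen hash h, two solicited hashes differing only in the dropped bits collide; hashing h salted with the service ID first restores collision resistance. The 27-byte truncation and blake2b salt below are illustrative stand-ins, not the GP's actual key construction:

```python
from hashlib import blake2b

def naive_key(h: bytes) -> bytes:
    # Keeps only the first 27 bytes of attacker-controlled input,
    # discarding 5 bytes entirely.
    return h[:27]

def hashed_key(service_id: int, h: bytes) -> bytes:
    # Hashing (service_id ++ h) first means every bit of h influences
    # the key, so near-collisions in h no longer collide.
    return blake2b(service_id.to_bytes(4, "little") + h,
                   digest_size=27).digest()

h1 = bytes(32)
h2 = bytes(27) + b"\x01" * 5  # differs from h1 only in the dropped bytes
```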
# 2024-11-04 10:47 dvladco: Hello, I would like to confirm whether the gas calculation here is correct; it seems to me that ω8 contains the memory address for memo
# 2024-11-04 16:50 dakkk: phi is called authorizer queue in the graypaper
EDIT: I edited my question to match the GP
(edited)
# 2024-11-04 16:52 gav: Where does the “maximum number of items” come from? (The formalism is correct, it is the exact number not the maximum)
# 2024-11-04 16:54 gav: Ok that should not have the word “maximum” in there.
(edited)
# 2024-11-06 13:16 dakkk: gav: is it possible to modify the formula numbering system within the graypaper? Currently, when formulas are added or removed, subsequent formula numbers shift, which can be problematic when referencing formulas from code.
A potential solution to this issue is to implement section-based numbering. For instance, the third formula in Chapter 10 would be labeled as 10.3. This approach would also enhance readability when formula numbers are referenced within the paper itself, as it would immediately indicate the formula's location.
# 2024-11-06 13:36 prematurata: > <@dakkk:matrix.org> gav: is it possible to modify the formula numbering system within the graypaper? Currently, when formulas are added or removed, subsequent formula numbers shift, which can be problematic when referencing formulas from code.
>
> A potential solution to this issue is to implement section-based numbering. For instance, the third formula in Chapter 10 would be labeled as 10.3. This approach would also enhance readability when formula numbers are referenced within the paper itself, as it would immediately indicate the formula's location.
I'd love this. I do reference formulas in my code as well for future reference. I just recently added graypaper v number next to the formula "Id" but this suggestion is much more robust
# 2024-11-06 13:38 rick: > <@prematurata:matrix.org> I'd love this. I do reference formulas in my code as well for future reference. I just recently added graypaper v number next to the formula "Id" but this suggestion is much more robust
Permanent formula ids would be very helpful
# 2024-11-06 14:52 qiwei: in the quit host-call, if it halts with a transfer, the service account balance will be below the threshold balance; just wondering, is the account supposed to be removed from state afterwards? I did not find such logic (maybe I missed it..)
# 2024-11-06 17:10 gav: > <@qiwei:matrix.org> in the quit host-call, if halt with a transfer, the service account balance will be below threshold balance, just wondering is the account supposed to be removed from state afterwards? I did not find such logic (maybe i missed it..)
I'm not sure what you mean. The quit hostcall would delete the account in question.
# 2024-11-07 15:14 dvladco: Hey, is d here taken from u? I don't see it being explicitly defined anywhere
(edited)
# 2024-11-07 18:33 dave: Is it intended that reports can sit around in rho essentially forever? Seems like some shenanigans might be possible if someone in control of a core leaves a report sitting there long enough for auditors to no longer be able to audit it
# 2024-11-07 18:35 celadari: I think there is a timeout: if a new report comes in and the timeout has passed, the old report is flushed and the new report takes its place in rho
# 2024-11-07 18:45 dave: There is a timeout but the report is only discarded if a new one comes along to replace it. Someone in control of the core could ensure no other report comes along. Might not be abusable but doesn't seem impossible to me and would be easy to fix by forcibly clearing reports after a certain amount of time
# 2024-11-07 18:52 sourabhniyogi: What does it mean to be "in control of the core" if validators are rotated within an epoch at high frequency?
# 2024-11-07 18:54 sourabhniyogi: If you mean the coretime owner, and they aren't using their core at a high frequency, that's not really a concern, I believe.
# 2024-11-07 20:10 dave: Yes I mean the coretime owner. I'm not concerned about the core not getting used, I'm concerned about an invalid report potentially getting accumulated because noone audits it. The core not getting used would be a cost to execute this attack 😅
# 2024-11-08 11:17 gav: I don't think it matters - the only reason it wouldn't go into accumulation is because it's not yet available and if it happens to stick around for ages before 2/3+1 of validators finally all decide it is available, then there's no great harm in initiating the auditing from that point.
(edited)
# 2024-11-08 11:22 gav: Also, it's not the coretime owner who can prevent its becoming available: it's only the guarantors, and they do a switcheroo every 10 blocks.
(edited)
# 2024-11-08 11:25 gav: If two guarantors and the coretime owner coordinated, then they could keep an unavailable WP in rho indefinitely and only choose to make it available at some arbitrary late stage (assuming the coretime owner has indefinite funds). But again, auditing only begins once available is assured.
# 2024-11-08 11:40 dave: > <@gav:polkadot.io> If two guarantors and the coretime owner coordinated, then they could keep an unavailable WP in rho indefinitely and only choose to make it available at some arbitrary late stage (assuming the coretime owner has indefinite funds). But again, auditing only begins once available is assured.
Yes this is the case I was imagining. Of course this assumes it is _possible_ to make the WP available at an arbitrary later point. It's not obvious to me that this wouldn't be possible. If you wait long enough to do this then auditors will be able to assemble the bundle but may not be able to check the report because they have discarded necessary state. What then?
# 2024-11-08 11:43 dave: > <@gav:polkadot.io> I don't think it matters - the only reason it wouldn't go into accumulation is because it's not yet available and if it happens to stick around for ages before 2/3+1 of validators finally all decide it is available, then there's no great harm in initiating the auditing from that point.
My point is that I think it _is_ harmful to kick off auditing so late as we don't keep eg preimages forever
# 2024-11-08 11:44 dave: > <@gav:polkadot.io> What state might they have discarded?
Forgotten preimages for example
# 2024-11-08 11:48 gav: I think the tolerances are pretty high on that, but it’s a lot easier to reason about correctness if it’s prevented outright. We can disable availability for WRs which should have timed out. This should avert any possibility of misuse.
(edited)
# 2024-11-08 11:50 dakkk: > <@dakkk:matrix.org> gav: is it possible to modify the formula numbering system within the graypaper? Currently, when formulas are added or removed, subsequent formula numbers shift, which can be problematic when referencing formulas from code.
>
> A potential solution to this issue is to implement section-based numbering. For instance, the third formula in Chapter 10 would be labeled as 10.3. This approach would also enhance readability when formula numbers are referenced within the paper itself, as it would immediately indicate the formula's location.
If anyone is interested in using this type of numbering in their project, it is sufficient to put this line at the beginning of graypaper.tex:
```
\numberwithin{equation}{section}
```
# 2024-11-10 19:38 basedafdev: in the latest GP, the state-key constructor for (service, hash) in the second term exceeds 32 bytes. E_4(l) ~ H(h) = 36 bytes. Unless i'm missing something here
# 2024-11-14 10:03 dakkk: davxy: in your safrole test vectors, is it possible to know the validators' secret keys?
# 2024-11-14 10:35 davxy: dakkk | JamPy: the i-th key is generated as follows:
1. Interpret i as a 32-bit unsigned integer.
2. Encode it in LE.
3. Repeat the encoded value 8 times (you get a 32-byte data).
- Ed25519 secret: is data
- Bandersnatch secret: use data to call Secret::from_seed in ark-ec-vrfs
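The seed construction davxy describes can be sketched as follows; only the 32-byte seed is shown, and the Bandersnatch step via ark-ec-vrfs is not reproduced here:

```python
import struct

def validator_seed(i: int) -> bytes:
    """32-byte test-vector seed: the 4-byte LE encoding of i, repeated 8 times."""
    return struct.pack("<I", i) * 8

# The Ed25519 secret is this seed directly; the Bandersnatch secret comes from
# passing it to Secret::from_seed in ark-ec-vrfs (not reproduced here).
```

For example, `validator_seed(0)` is 32 zero bytes, and `validator_seed(1)` is the byte pattern `01 00 00 00` repeated 8 times.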
# 2024-11-14 10:47 dakkk: > <@davxy:matrix.org> dakkk | JamPy: the i-th key is generated as follows:
> 1. Interpret i as a 32-bit unsigned integer.
> 2. Encode it in LE.
> 3. Repeat the encoded value 8 times (you get a 32-byte data).
>
> - Ed25519 secret: is data
> - Bandersnatch secret: use data to call Secret::from_seed in ark-ec-vrfs
thank you (y)
# 2024-11-14 11:33 prematurata: > <@davxy:matrix.org> dakkk | JamPy: the i-th key is generated as follows:
> 1. Interpret i as a 32-bit unsigned integer.
> 2. Encode it in LE.
> 3. Repeat the encoded value 8 times (you get a 32-byte data).
>
> - Ed25519 secret: is data
> - Bandersnatch secret: use data to call Secret::from_seed in ark-ec-vrfs
not sure who's behind jamcha.in but I'd suggest we all use this to generate keys
# 2024-11-14 12:17 jaymansfield: > <@davxy:matrix.org> dakkk | JamPy: the i-th key is generated as follows:
> 1. Interpret i as a 32-bit unsigned integer.
> 2. Encode it in LE.
> 3. Repeat the encoded value 8 times (you get a 32-byte data).
>
> - Ed25519 secret: is data
> - Bandersnatch secret: use data to call Secret::from_seed in ark-ec-vrfs
Thank you for this! Needed it as well.
# 2024-11-14 15:11 gav: This (sans-serif) R is a constant; it's a simple product.
(edited)
# 2024-11-14 21:27 gav: > <@cisco:parity.io> The prerequisites of a work report are already a set, so shouldn't the D function just do w_x_p U ...?
Yes. Will be fixed in next release
# 2024-11-15 08:44 dakkk: Your assumption seems correct, but the state serialization is not intended to be "reversible"
(edited)
# 2024-11-15 08:47 prematurata: > <@dakkk:matrix.org> Your assumption seems correct, but the state serialization is not intended to be "reversible"
I agree with you, but according to my (bad) memory this is the only case where we serialize something but are unable to deserialize it... I wonder if we really want to introduce a first here
# 2024-11-15 08:49 dakkk: > <@prematurata:matrix.org> I agree with you, but according to my (bad) memory this is the only case where we serialize something but are unable to deserialize it... I wonder if we really want to introduce a first here
it's not the first case; check the preimage metadata serialization key: we are hashing a hash, and that's not reversible
# 2024-11-15 08:51 prematurata: > <@dakkk:matrix.org> it's not the first case; check the preimage metadata serialization key: we are hashing a hash, and that's not reversible
I knew my bad memory failed me again =) tkz
# 2024-11-15 08:54 dakkk: > <@prematurata:matrix.org> I knew my bad memory failed me again =) tkz
No problem; btw it would be an interesting feature to have a deserializable encoding of the state, so for instance we can use it for storing on disk. But as mentioned earlier it is not the main purpose of this
# 2024-11-15 09:48 xlchen: we don't really need the ability to decode state keys for node usage. so it is mostly useful for dev / debug / indexing needs. the simple solution is just to save the preimage in an aux store next to the state store
# 2024-11-15 09:59 dave: We need to be able to decode state values for warp sync so this does seem like an oversight. In any case if there is an ambiguous encoding that results in two different states having the same state root this is at the least not ideal...
# 2024-11-15 10:02 dave: In general it should be possible to execute a block given just the initial state trie without needing a "real" state to deal with ambiguity
# 2024-11-15 10:09 prematurata: since 0.5.0 is merged do you think it would be better to have formula numbers follow the subchapters as well? ex...
when reading chapter 12.2, the current 0.5.0 first formula is 12.13. what about (not sure it's possible in latex) renaming 12.13 to 12.2.1?
# 2024-11-15 10:10 gav: I’d avoid paying too much attention to the versions of main branch.
# 2024-11-15 10:11 prematurata: yeah i should have better explained myself. nevertheless what do you htink about changing the formula numbering a bit further?
# 2024-11-15 10:12 prematurata: i think it would be beneficial to encode the subsection in the formula numbering as well
# 2024-11-15 11:55 gav: > <@prematurata:matrix.org> since 0.5.0 is merged do you think it would be better to have formula numbers follow the subchapters as well? ex...
>
> when reading chapter 12.2, the current 0.5.0 first formula is 12.13. what about (not sure it's possible in latex) renaming 12.13 to 12.2.1?
To be honest, the bigger size of formula numbers is already screwing up the layouts.
# 2024-11-15 19:58 gav: It avoids the need for the code to be fetched (and fetchable) at time of service creation.
# 2024-11-15 20:01 gav: It’s not like you can’t put a huge preimage into a service’s storage if you’re willing to pay.
# 2024-11-15 20:01 tomusdrw: > <@gav:polkadot.io> It avoids the need for the code to be fetched (and fetchable) at time of service creation.
not sure I get it. If I understand correctly, l is the code length, and the new host call could just end with some error code if l >= W_C.
# 2024-11-15 20:02 gav: But (assuming we keep the check where it’s actually used) then it’s one extra piece of math that’s not really needed
# 2024-11-15 20:03 tomusdrw: I see. Though, you could still have some genesis services with code larger than W_C if it's checked only in new.
(edited)
# 2024-11-15 20:04 gav: Yeah. Though then it’d be the chain publishers fault if it screwed the PVM
# 2024-11-15 20:05 gav: Realistically, implementations will likely need to limit the PVM code size, so if the protocol theoretically supports unlimited blob sizes when added in genesis, it's a bit problematic
# 2024-11-15 20:06 tomusdrw: Well, if it's not explicit in GP it could lead to some cross-implementation issues. My curiosity is satisfied though. So up to you where the check should be :)
(edited)
# 2024-11-17 11:16 stanleyli: Hello guys, I have a question about the page proof. According to the definition in the GP, constructing a Merkle tree appears to follow the diagram provided. In this context, B(1,2) represents the branch of leaf 1 and leaf 2, and its calculation method is described as H($node ⌢ N(v\_{...|v|/2}, H) ⌢ N(v\_{|v|/2...}, H)).
https://graypaper.fluffylabs.dev/#/364735a/354e0135ac01 (edited)
# 2024-11-17 11:18 stanleyli: The method doesn't have any problem, but it leads to a final result formatted as [hash, hash, ..., hash, blob], where the earlier elements are hashes but the last becomes a blob (length = G = 4104). This results in a significantly longer output overall.
Would it make sense to modify the N() function so that it returns H(v) when |v|=1? This way, every node would consistently be a hash, and each element in the trace path would have a uniform hash length.
Or perhaps I've misunderstood something?
(edited)
# 2024-11-18 11:29 gav: > <@stanleyli:matrix.org> The method doesn't have any problem, but it leads to a final result formatted as [hash, hash, ..., hash, blob], where the earlier elements are hashes but the last becomes a blob (length = G = 4104). This results in a significantly longer output overall.
>
> Would it make sense to modify the N() function so that it returns H(v) when |v|=1? This way, every node would consistently be a hash, and each element in the trace path would have a uniform hash length.
>
> Or perhaps I've misunderstood something?
The justification function curly-J uses the constancy function _C_, which basically hashes all data items before passing them to the trace function _T_.
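The effect of pre-hashing the leaves can be sketched with a toy binary Merkle tree; this is not the GP's exact N/T/C functions, and the odd-width padding rule and prefixes here are illustrative assumptions:

```python
import hashlib

def H(b: bytes) -> bytes:
    return hashlib.blake2b(b, digest_size=32).digest()

def root(items):
    # Pre-hash every data item (the role of the constancy preprocessor):
    # after this step the tree only ever handles 32-byte values, so every
    # element of a trace/justification path has uniform hash length.
    nodes = [H(b"$leaf" + it) for it in items]
    while len(nodes) > 1:
        if len(nodes) % 2:
            nodes.append(nodes[-1])  # assumed padding rule for odd widths
        nodes = [H(b"$node" + nodes[i] + nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
    return nodes[0]
```

Without the pre-hash, the sibling supplied at the last trace step would be a raw data blob (e.g. a full 4104-byte segment) rather than a 32-byte hash, which is exactly the non-uniformity stanleyli observed.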
# 2024-11-19 15:50 gav: Graypaper version 0.5.0 is out. No vectors yet but they should be coming soon. There are several important changes, chiefly to the PVM, which is now 64-bit.
# 2024-11-20 11:48 qiwei: In Appendix I, there is "ZG = 2^14: The standard pvm program initialization page size. See section A.7.", but A.7 is still using ZP; maybe it needs updating to use ZG?
# 2024-11-20 15:35 dakkk: Is it correct to deduct that in order to complete Milestone 1 of JAM prize we need to wait version 1.0 of the graypaper? Or can we deliver an 0.5 implementation?
(edited)
# 2024-11-21 17:43 charliewinston14: Good afternoon everyone. I was hoping someone could shed some light on how to generate the accumulate root that's in block history? Not sure how it differs from the MMR peaks I already generated in the history items.
(edited)
# 2024-11-24 15:41 celadari: Hey guys, in the refinement context, what's the difference between the anchor (a) and the lookup-anchor (l)?
# 2024-11-24 17:24 gav: In summary, the anchor is used to anchor the WP (and thus WR) to a particular block which must (still) be in recent history when reported, ensuring that the WP is recent. The lookup-anchor need not be very recent at all (it can be quite old), but must be finalised; this is used to ensure that the point at which the historical_lookup (in refine) operates is both finalised and within the amount of lookup-history which we maintain on-chain.
(edited)
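The two constraints gav distinguishes can be illustrated as a simple slot-arithmetic check; the names, window sizes, and slot arithmetic below are assumptions for the sketch, not graypaper symbols:

```python
# Illustrative validity check for a refinement context: the anchor must be
# recent; the lookup-anchor may be much older but must be finalized and
# inside the on-chain lookup-history window.
def context_valid(anchor_slot: int, lookup_slot: int,
                  current_slot: int, finalized_slot: int,
                  recent_window: int = 8, lookup_window: int = 14_400) -> bool:
    anchor_recent = current_slot - anchor_slot <= recent_window
    lookup_final = lookup_slot <= finalized_slot
    lookup_in_history = current_slot - lookup_slot <= lookup_window
    return anchor_recent and lookup_final and lookup_in_history
```

Note the asymmetry: a very old anchor fails, while an equally old lookup-anchor passes as long as it is finalised and still inside the lookup-history window.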
# 2024-11-24 17:27 gav: You can find out more by reading the section on Reporting (for anchor) and the historical_lookup and its dependent function (for lookup-anchor)
# 2024-11-25 08:11 gav: However specifically and only for standard memory initialization, we use an additional two initialization "page" sizes (2^14 and 2^16) in the math.
(edited)
# 2024-11-25 16:40 dakkk: Got it; so the always accumulated services and queuing would have the majority of the available gas, 341000000 - (341\*100000) = 306900000 (~90%); is it the correct order of magnitude?
(edited)
# 2024-11-25 17:54 davxy: The doubt is mostly about the big 🍰 reserved for queued and always accumulate services
# 2024-11-25 18:52 gav: > <@dakkk:matrix.org> Got it; so the always accumulated services and queuing would have the majority of the available gas, 341000000 - (341\*100000) = 306900000 (~90%); is it the correct order of magnitude?
Well no not necessarily but the fact that always-accumulate exists at all means they can’t necessarily be equivalent.
# 2024-11-26 00:22 mkchung:
https://graypaper.fluffylabs.dev/#/911af30/36cb02362803
Can state-key constructor functions C be enhanced with a "type" identifier? something like:
(i ∈ N28, t ∈ N2) ↦ [i, 0, 0, . . . ,t]
(i, s ∈ NS, t ∈ N2) ↦ [i, n0, 0, n1, 0, n2, 0, n3, 0, 0, . . . ,t] where n = E4(s)
(s, h, t) ↦ [n0, h0, n1, h1, n2, h2, n3, h3, h4, h5, . . . , h26, t] where n = E4(s)
so that we can determine the expected type (δ, a\_s, a\_p, a\_l) from the C key alone?
(edited)
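The proposed constructors above can be sketched directly in code; note this is mkchung's proposal under discussion (with a trailing type octet t), not the graypaper's actual state-key constructor C:

```python
# Two of the proposed 32-byte keys with a trailing type octet t.
def key_chapter(i: int, t: int) -> bytes:
    # (i, t) -> [i, 0, 0, ..., t]
    return bytes([i]) + bytes(30) + bytes([t])

def key_service(i: int, s: int, t: int) -> bytes:
    # (i, s, t) -> [i, n0, 0, n1, 0, n2, 0, n3, 0, 0, ..., t] where n = E4(s)
    n = s.to_bytes(4, "little")
    body = bytes([i]) + b"".join(bytes([b, 0]) for b in n)  # 9 bytes
    return body + bytes(31 - len(body)) + bytes([t])
```

The type octet lets a reader of the raw trie classify a key without context, at the cost of one byte of key space, which is precisely the trade-off gav objects to for the information-dense third constructor.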
# 2024-11-26 06:49 mkchung: "T" is intended to be a type identifier (octet) so that it can potentially support up to 255 different types in state merklelization.
C1-C15 and account related states (i.e., δ, a_s, a_p, a_l) are all different types. I think it's probably beneficial to have one octet from C key as a way to "validate" what encoded struct we are expecting (if collision is not a concern)?
(edited)
# 2024-11-26 06:55 mkchung: Right now it's quite difficult to differentiate account related states (i.e., δ, a_s, a_p, a_l) from C key alone without this "type identifier".
# 2024-11-26 07:23 gav: The first two key compositions should be easy to spot given the zeroes. The last one is harder, sure because of the information density, but precisely because of this we can’t really afford to waste a byte as that would further reduce the cryptographic security.
# 2024-11-26 21:35 dave: FWIW it is possible for a node to maintain the "original" state keys if it executes every block from genesis, but in the case of warp sync this is _not_ possible because not all bits of the original state keys influence the state root and so they cannot all be proven
# 2024-11-26 21:37 dave: One further issue is that in the case of the preimage lookup dictionary l, the key transformation involves a hash and so cannot be reversed
# 2024-11-26 21:39 dave: In that case you could track the original key and prove this, but if we really want to address this issue I think the simpler solution is to simply include the original key in the transformed value
# 2024-11-26 21:43 dave: We could possibly do a similar thing for the discarded key bits
# 2024-11-26 21:43 dave: Of course this adds complexity, which I think is the reason Gav hasn't done it (not that I can speak for him)
# 2024-11-26 23:04 xlchen: there is no reason why we can't have an extra aux store to store the preimages / hash mappings to support reverse lookup. this is something like archive node feature that's optional but nice to have thing. I don't think it needs be spec'ed
# 2024-11-27 01:36 sourabhniyogi: Since CE129 is not intended for debug purposes but for warp-sync, this issue is not relevant, thank you. Closed the issue =). But the general need for a state representation for STF testing (and debugging state) is still there.
For the syncing problem, we do need a clear understanding of how implementations should do the warp sync with CE129 and how AuditDA (1 hour) and ImportDA (28 days) should work at similar timescales, right? We imagine syncing could be skipped for AuditDA but ImportDA cannot. We doubt teams need to implement it until getting through M1+M2, right?
We should spec out the RPC endpoints (with JSON request/responses) to cover the debugging case of state sharing without regard to CE 129 warp syncing... and the top 10 others.
(edited)
# 2024-11-27 01:46 dave: Don't understand what you mean by skipping sync for DA? Could you elaborate?
# 2024-11-27 03:35 sourabhniyogi: A new validator who joins a JAM network has to get all chunks associated with its validator index -- for both AuditDA (1 hour) and ImportDA (28 days). For Audit DA, if there is some upper bound on how many new validators can join (not sure where this will happen) then within an hour of participating it's up to date -- so a new validator could "skip" this Audit DA syncing operation since there is so much redundancy. Or it could look through the last hour's worth of work reports and ask the guarantors for its chunks.
But for Import DA, the new validator should have some fast way of getting all the chunks it is responsible for holding. It could look through the last 28 days of work reports and ask the guarantors for its chunks... but I am not sure that it can.
# 2024-11-27 09:16 dave: Where did you get that from? Once enough validators claim receipt of their shards I do not believe there is any requirement for the guarantors of a report to send out more shards. There is also no mechanism for validators to report that they have obtained the shards after this point, although you could argue that isn't really necessary.
# 2024-11-27 16:47 sourabhniyogi: Consider the situation where on Dec 1 there is a segment exported in a V=1024 network. Under normal conditions, up to 2/3 of the network can be dead and the segment will be available in ImportDA.
BUT if every day from Dec 1 to 11, 100 validators leave and 100 new join to take their place, in such a way that the entire network has been completely replaced, but
(a) no one has the data to reconstruct the segment
(b) no one in the new set has used any JAMNP method to fetch their chunk from the guarantor,
what happens?
I think the present answer is "there is no plan".
If the plan is "oh we have 28 day unbonding periods", what happens with unbonding queues that are way shorter than that?
This sets my expectation that we need a way to have ImportDA syncs for a changing of the guard, where one validator takes the place of another at a specific validator index.
(edited)
# 2024-11-27 16:54 dave: There is a question of whether there are sufficient incentives for old validators to maintain import DA data and make it available to the current validator set for the full 28 days. I don't know the answer to that.
# 2024-11-27 16:56 dave: AFAIK though there is no mechanism to "hand over" import DA data to new validators
# 2024-11-27 16:57 gav: Indeed there is not. Thus far the assumption is that it doesn’t matter.
# 2024-11-27 17:06 sourabhniyogi: Concerning the accumulation queue, we are struggling to come up with a test case where we have 12.3
https://graypaper.fluffylabs.dev/#/911af30/167100167100 filled with anything because 11.38
https://graypaper.fluffylabs.dev/#/911af30/156001156001 makes it difficult to impossible to do so.
Concretely, if package M has package F as a prerequisite, because of 11.38 requirement, M can't even get a work report in for 12.3 to matter. Put another way, 11.38 has prereqs constraining refine of M before 12.3 accumulation can matter.
How can we get a good test case for accumulation queue then?
We must be misunderstanding something here.
# 2024-11-27 17:09 dave: AIUI the dependencies stuff in the backend is to deal with availability of work reports happening out of order
# 2024-11-27 17:10 dave: So say you have two WPs A and B, with B depending on A. A and B can be reported in the same block. If B becomes available before A, then it will enter the "ready" queue but must wait for A to be made available and accumulated before it can be accumulated itself.
(edited)
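The ordering behaviour dave describes can be modelled with a toy ready queue; the function and names are illustrative, not graypaper definitions:

```python
# Toy model: a report enters the ready queue when it becomes available, and
# is accumulated only once every prerequisite has itself been accumulated.
def accumulation_order(availability_order, deps):
    accumulated, ready = [], []
    for wp in availability_order:
        ready.append(wp)
        progress = True
        while progress:
            progress = False
            for r in list(ready):
                if all(d in accumulated for d in deps.get(r, ())):
                    accumulated.append(r)
                    ready.remove(r)
                    progress = True
    return accumulated

# B depends on A but becomes available first, so B waits in the queue:
accumulation_order(["B", "A"], {"B": ["A"]})  # → ["A", "B"]
```

With in-order availability (`["A", "B"]`) the queue never holds anything, which matches sourabhniyogi's observation that delayed assurances are needed to exercise it.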
# 2024-11-27 17:15 dave: That's a requirement for reporting. Availability happens after this
# 2024-11-27 17:17 sourabhniyogi: If B doesn't get reported until A is reported, there is no way for B to become available before A, right? No "If B becomes available before A" possibility exists then, thus no way for the accumulation queue to be filled with B.
(edited)
# 2024-11-27 17:19 dave: Why do you say that? The availability of the two packages is pretty independent
# 2024-11-27 17:20 dave: For one if the guarantors for A go offline then A just won't become available at all
# 2024-11-27 17:31 sourabhniyogi: 1. Guarantors can assure the data as soon as they have generated the work report.
2. For non-guarantors, they will only be able to assure the data based on observing a guarantee extrinsic that includes a work report.
The majority of the network (all but 2 or 3 validators out of V) is in 2.
So for the B-depends-on-A:
* By 11.38 requirement on reporting/guarantees, there is no way to generate a work report for B until A has a work report
* Because of 2, the majority of the network cannot provide an assurance
* Because the above "the majority of the network cannot provide an assurance", we are having difficulty generating any situation where 12.3 accumulation queue is filled with anything.
What are we getting wrong in the above chain of logic? Is there a test case where 12.3 accumulation queue is filled with anything?
Note that we ARE able to generate a test case like this:
(1) |E_G|=2, where refine of A+B are executed _in the same slot_
(2) |E_A|=V, where all V validators assure both A+B and then due to the "B depending on A" first A is accumulated (having no dependencies) then B (having a dependency on A)
# 2024-11-27 17:35 sourabhniyogi: and what we have to do to get a test case working is:
(1) |E_G|=2, where refine of A+B are executed in the same slot (that is, 11.38 is NOT a problem)
then simulate a forced delay of assurances in order to get our accumulation queue filled. Is this the only way?
# 2024-11-27 17:36 dave: What do you mean by "Because of 2, the majority of the network cannot provide an assurance"
# 2024-11-27 17:39 dave: If in your test case all reports are assured immediately then no you probably won't get anything in the queue. So yes your test needs to simulate assurances being delayed. Not sure why this is impossible?
# 2024-11-27 17:40 sourabhniyogi: Not impossible -- but our belief is that this is the only way. Can you think of another?
# 2024-11-27 17:41 dave: > <@dave:parity.io> AIUI the dependencies stuff in the backend is to deal with availability of work reports happening out of order
Not aware of one. As I said, I believe the entire point of the dependencies stuff is to deal with this happening.
# 2024-11-27 17:41 sourabhniyogi: ok, great, we don't have a misconception, thank you!
# 2024-12-02 08:54 vinsystems: Is it possible to have different epoch judgments in the same dispute extrinsic?
(edited)
# 2024-12-05 11:05 luke_fishman: Hello, guess I am missing something.
In Refine Invocation we only ever call the Argument Invocation with **x = (∅, \[\])**.
This then passes on through ΨM and ΨH, and eventually one of the Omegas in B.8 is getting called with this pair as an argument.
So if we only ever pass an "empty" pair, how will the internal Omegas ever operate on a non-empty argument?
(edited)
# 2024-12-05 11:17 luke_fishman: ok I think I can answer myself (sorry for the spam):
by using a series of host calls I can progressively "fill up" the context pair.
For example, I could use machine => poke => peek in order to create a machine in the **m** dictionary, write some program into it, then verify it is there using peek
# 2024-12-06 08:05 gav: > <@amritj:matrix.org> We use the paged proof hashed segments for building segment justification.
>
> The segment root is built with a constancy preprocessor that prepends "$leaf" to the data.
>
> Shouldn't we also prepend "$leaf" here?
>
>
https://graypaper.fluffylabs.dev/#/911af30/1a18011a1f01
Yes in the present system it would be needed. I might alter things so that for the pages proofs tree we don’t bother with the leaf/node prefixes at all since they’re redundant here.
# 2024-12-06 15:05 amritj: One more suggestion: I think we should also check that the export count in the work-package items equals the number of actually exported segments.
Otherwise, a work item can export more segments than Wm (2\*\*11) and I think may even overwrite the export-segment index of the next work item
https://graypaper.fluffylabs.dev/#/911af30/1a0c021a0c02 (edited)
# 2024-12-06 16:33 gav: > <@amritj:matrix.org> One more suggestion, I think we should also check the export count in the work package items to be equal to the length of the actually exported segments.
>
> Otherwise, work item can export more segments than Wm (2\*\*11) and I think may even overwrite index of the export segment of next work item
>
>
https://graypaper.fluffylabs.dev/#/911af30/1a0c021a0c02
It's technically impossible to validly export more items than W_M already, due to the condition in the export host-call (the only way to validly append an export)
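The bound gav refers to can be sketched as a guard on the only append path; this is an illustrative model of the check, not the host-call's actual signature or semantics:

```python
W_M = 2 ** 11  # maximum exported segments per work package

def export_segment(exported: list, segment: bytes) -> bool:
    """Append a segment if the per-package limit allows it; return success."""
    if len(exported) >= W_M:
        return False  # export host-call refuses: list is already full
    exported.append(segment)
    return True
```

Since every valid export goes through this one gate, the total can never exceed W_M regardless of what counts the builder declares, which is why gav doubts a misreported count can hurt anyone but the builder.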
# 2024-12-06 17:06 amritj: Yes, we are checking whether the export segment list is full, i.e. whether the export segment offset + current work item export count exceeds W\_M (2\*\*11), but the export segment offset is calculated by summing up the export counts of the work items stated in the work package built by the builder.
But we are not validating that the export count equals the number of segments actually exported by the work items after refine.
Maybe I am missing something?
(edited)
# 2024-12-06 17:22 gav: > <@amritj:matrix.org> Yes, we are checking if the export segment list is full if the export segment offset + current work item export count is more than W\_M (2\*\*11) , but the export segment offset is calculated by summing up the export counts of work items mentioned in the work package built by the builder
>
> But we are not validating if the export count is equal to no of segments actually exported by the work items after refine
>
> Maybe I am missing something?
https://github.com/gavofyork/graypaper/pull/160
# 2024-12-06 17:24 gav: I'm not (yet) totally convinced that the builder can really break anything for anyone other than themselves by misreporting these counts.
(edited)
# 2024-12-07 10:58 amritj: Question:
In auditing, we consider the report audited if the report has no negative judgement and there exists some tranche where all validators required to audit have a positive judgement about the report.
In case some validator announced an audit but failed to deliver the judgement, new audits are announced to cover that.
So the new tranche A_n will be = old tranche auditors + new auditors just announced - auditors who failed to deliver a judgement?
https://graypaper.fluffylabs.dev/#/911af30/1eab001eae00 (edited)
# 2024-12-07 11:26 gav: Once it has been determined by the auditor that they will audit the new tranche, the VRF is used to determine which cores (and thus which WRs) should be audited. These are announced and by doing so the node is obliged to publish a judgement. They receive corresponding announcements and, later, judgements, from other auditors and will audit a new tranche only if by the tranche's time limit they have not received a judgement for each announcement.
(edited)
# 2024-12-07 11:37 amritj: In the new tranche, auditors will only audit reports that were announced to be audited in the previous tranche but whose judgement was not received, right?
(edited)
# 2024-12-07 11:46 amritj: Ahh, gotcha: the new audit tranche only contains reports that did not receive all judgements; other reports that did receive all judgements are excluded from the new tranche list because they are already validated. Thanks, I got a bit confused!
(edited)
# 2024-12-07 11:58 gav: So we perceive m_n missing audits at tranche n: those two formulae determine which of the work-reports (w) we will be required to audit this tranche.
# 2024-12-07 11:58 gav: For any tranche, as the number of perceived missing-judgements increases, we audit a greater selection of the overall work-reports.
(edited)
# 2024-12-07 11:59 gav: We don't (necessarily) audit the report(s) with the missing judgement(s).
(edited)
# 2024-12-07 12:00 gav: And we _always_ pass a judgement on any audits we have announced ourselves.
(edited)
# 2024-12-07 12:03 amritj: Thanks, I was confused about rechecking that all previous-tranche WRs are present in J\_T to consider this WR audited, but as per eq 17.16 we only include in the new tranche those work reports that did not have all announced judgements, so there is no need to check work reports that are not included in the next tranche again!
(edited)
# 2024-12-07 19:04 gav: > <@amritj:matrix.org> Yes, we are checking if the export segment list is full if the export segment offset + current work item export count is more than W\_M (2\*\*11) , but the export segment offset is calculated by summing up the export counts of work items mentioned in the work package built by the builder
>
> But we are not validating if the export count is equal to no of segments actually exported by the work items after refine
>
> Maybe I am missing something?
https://github.com/gavofyork/graypaper/pull/160/commits/184515d80354f96f0410b48cea950ed7c9e4f03f
# 2024-12-07 19:05 gav: This will be updated to a new system; WPs should not be invalidated by a bad export count.
# 2024-12-08 03:47 amritj: I asked this question in Jam room, not solved there so asking here too:
Question:
The segments are erasure-coded and distributed to the validators. Each validator receives a single shard out of the 1,023 shards, determined by their validator index. Therefore, the shard index of the shard they receive equals to their validator index.
According to the JAMNP protocol, to fetch data from the assurers, we need to provide each assurer with the shard index we require from them.
The exported segments are expected to remain available for 28 days. During this period, it is assumed that more than 341 validators will remain consistent. However, there is still a significant likelihood that validator indexes may change during this time due to the addition or removal of validators.
I asked if we have to remember historical validator sets to know who to request shards from, which David Emett confirmed
But we are not able to find any way to distribute this data to a new validator added to the network, and without this data he won't be able to compute work reports.
I proposed not requiring the shard index in the request to fetch data from an assurer; instead, the assurer himself gives us the shard index, which we can of course verify. Then we only need the erasure root and segment index. But as new validators are added to the system, if we randomly choose 342 assurers to request data from, some of the responses will be empty. So either we request from ~400-500 validators (the churn rate, as confirmed by sourabhniyogi, is low, so we will most probably receive more than 342 shards), OR we somehow store which validators were expected to be active in the requested WR segment's epoch, by storing a status component like we do for preimages in the validator metadata, which I am not so sure about
The above solution is only for new validators, old validators will use the normal method as they know which validators have which shard
Is this a good solution to the problem or is there a better answer we can't see?
(edited)
# 2024-12-08 04:54 amritj: Btw, a highly unlikely attack vector, but I'm still curious if it's possible:
For a report to be marked valid, it goes through two checks: first, guaranteeing, and second, auditing.
To pass the guaranteeing stage, we need two guarantors to sign. To pass the auditing stage, we need all 10 randomly selected auditors to give a positive judgment about the report.
Let's say one of the guarantors is a malicious actor and tries to bribe their fellow guarantor to sign an invalid report. The bribe would need to be large enough to offset the guarantor's potential loss if they are found guilty. Since only one more actor needs to accept the risk and the bribe, let's assume the other guarantor accepts the bribe.
After guaranteeing, we have to somehow pass the auditing stage. To do this, the attacker sets up an off-chain bribe market offering the auditors of this report a deal: if all of them announce nothing, and prove they are auditors of the report after the invalid report is marked valid, they will receive a huge sum of money.
The auditors face no risk even if they are the only one remaining silent, as they aren't passing a positive judgment. If all 10 auditors stay silent and no one announces anything, the report will automatically be marked valid.
(edited)
# 2024-12-08 05:30 prematurata: A couple of bugs I found in 0.5.2:
- The WorkItem's new a field (accumulationGasLimit) is not taken care of in the codec defined in C.26
- A.34 has been changed and the Q fn in A.32 renamed to Z... yet some parts of the graypaper still report the old nomenclature (paragraph after sbrk & A.33)
# 2024-12-08 08:35 gav: > <@prematurata:matrix.org> A couple of bugs I found in 0.5.2:
> - The WorkItem's new a field (accumulationGasLimit) is not taken care of in the codec defined in C.26
> - A.34 has been changed and the Q fn in A.32 renamed to Z... yet some parts of the graypaper still report the old nomenclature (paragraph after sbrk & A.33)
Thanks will be fixed in next revision
# 2024-12-08 08:38 gav: > <@amritj:matrix.org> I asked this question in Jam room, not solved there so asking here too:
>
> Question:
>
> The segments are erasure-coded and distributed to the validators. Each validator receives a single shard out of the 1,023 shards, determined by their validator index. Therefore, the shard index of the shard they receive equals to their validator index.
>
> According to the JAMNP protocol, to fetch data from the assurers, we need to provide each assurer with the shard index we require from them.
>
> The exported segments are expected to remain available for 28 days. During this period, it is assumed that more than 341 validators will remain consistent. However, there is still a significant likelihood that validator indexes may change during this time due to the addition or removal of validators.
>
> I asked if we have to remember historical validator sets to know who to request shards from, which David Emett confirmed
>
> But we are not able to find any way to distribute this data to a new validator added to the network, and without this data he won't be able to compute work reports.
>
> I proposed to not require shard index in the request to fetch data from assurer but assurer himself give us the shard index which we can ofcourse verify is correct or not, so we only need the erasure root and segment index, but as some new validators added to the system if we randomly choose 342 assurers to request data from some of the responses will be empty so either we can request from ~400-500 validators (churn rate as confirmed by sourabhniyogi is low, so we will most probably receive more than 342 shards) OR somehow store which validators were expected to be active in the requested WR segment epoch by storing status component like we do in preimages in the validator metadata which I am not much sure about
>
> The above solution is only for new validators, old validators will use the normal method as they know which validators have which shard
>
> Is this a good solution to the problem or is there a better answer we can't see?
We may consider storing historical (28 days worth) of validator keys in state.
# 2024-12-08 08:43 gav: Honest Non-auditors would see the bribe market and realise they can profit by speculatively evaluating the report, finding it is invalid and making the judgement (honest participants in a dispute get a small reward). Once there is a single negative judgement in the system, all validators must audit. Expected loss of the guarantors is therefore huge and it becomes (wildly) unprofitable to attempt the attack.
(edited)
# 2024-12-08 09:48 amritj: Yeah, that's the way I thought it would fail, but what if only the auditors of that work report know about the bribe?
One way: after the auditors announce that they are going to audit the report, the attacker may reach all the WR auditors to pass a positive judgement.
But in this way, even if an auditor is willing to accept the bribe, they don't know if the others will do the same, so they carry a huge risk.
But if someone has built the logic to accept the bribe they are not a good auditor, and all these bad auditors could also, before announcing to everyone which report they are going to audit, tell the bribe market about it; if the bribe market finds ~30 such auditors working on the same report it may ask them not to send the announcement and let the report pass, otherwise everything goes normally. This could go on for years, waiting for the perfect moment.
Kinda filmy, and highly unlikely
(edited)
# 2024-12-08 10:00 amritj: There could be some hacker group that over the years injected this kind of virus into some of the validators' machines, working in the background without the validators knowing about it, waiting for the perfect chance
(edited)
# 2024-12-08 10:24 gav: It’s an independent VRF per validator, stemming from both an on-chain seed and their secret key. There’s no way of knowing which validator will self-select before they announce.
(edited)
# 2024-12-08 10:25 gav: And once the announcement happens, then it’s over as the escalation will begin if there’s no judgement.
(edited)
# 2024-12-08 10:26 gav: As the attacker, you would need to know that the auditors were all compromised before committing to guaranteeing the invalid report. And this is impossible as it comes from 1023 VRFs whose entropy is secret (because at least some are not compromised).
(edited)
# 2024-12-08 10:28 gav: Of course if you knew enough (> 98%) of the validator secret keys you might be able to pull this off with non-negative expected profit, but we already assume <33% compromised nodes.
(edited)
# 2024-12-08 10:52 amritj: Yeah, I calculated the probability, and the chance of all 30 auditors being in the compromised set, even if 33% of the validators are compromised, is 2.01×10^-15. That's extremely low.
To have a ~50% chance, ~98% of the validators would need to be compromised.
I think I must work on my math skills before asking these questions 😅
(edited)
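The figure can be checked with a hypergeometric draw over the numbers in the message above (1023 validators, a 341-validator compromised set, 30 auditors):

```python
from math import comb

VALIDATORS, COMPROMISED, AUDITORS = 1023, 341, 30

# Probability that all 30 initially self-selected auditors fall inside the
# compromised third, drawing without replacement (hypergeometric):
p = comb(COMPROMISED, AUDITORS) / comb(VALIDATORS, AUDITORS)
print(f"{p:.2e}")  # on the order of 2e-15, in line with the figure above
```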
# 2024-12-08 11:09 gav: All you know is that every (honest) validator has a roughly 1/100 chance of self-selecting for any given core/WR.
(edited)
# 2024-12-08 11:11 gav: i.e. ~99% chance of not, so assume that you compromise the secret keys of >90% of validators, that means there's still ~100 validators who are honestly self-selecting.
# 2024-12-08 11:11 gav: the chance that none of them unexpectedly audits your bad WR is therefore 0.99^100 = 36%, so you've around 2/3 chance of being discovered.
(edited)
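The 36% figure follows directly, assuming ~100 honest validators each with an independent ~1/100 self-selection chance:

```python
# Chance that none of ~100 honest validators self-selects to audit the bad
# work-report, each having an independent ~1/100 chance:
p_silent = 0.99 ** 100
print(f"{p_silent:.2f}")  # ≈ 0.37, i.e. roughly a 2/3 chance of discovery
```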
# 2024-12-08 11:12 gav: Once discovered, we can expect even the self-interested nodes acting under bribes to act honestly unless they're already bribed with the auditor slash amount (~10% of stake).
(edited)
# 2024-12-08 11:17 gav: So you have to totally cover the slash of 2 guarantors (2x 100% stake), probably some of the third guarantor (who would have elevated chances of self-selecting out of spite of being left out of the guaranteeing process), and still have such a great hack that you compromise 90% of the secret keys (which in due course we can assume will be technically impractical owing to multiple implementations and hardware key support) or somehow bribe 90% to voluntarily give up their secret keys knowing that they could get slashed if the secret is misused (e.g. as a guarantor).
# 2024-12-08 11:18 gav: And even then your attack would need to give you _guaranteed_ 2x revenue on that total cost just for it to have expected break-even
(edited)
# 2024-12-08 11:25 gav: In any case I think all these arguments are made in a more rigorous format in the ELVES paper, in case you want to dive deeper into it
# 2024-12-08 11:28 amritj: I understand it now, thanks man for answering my dumb questions with such great explanations, really appreciate it.
My mind was stuck on the idea that we only need to disrupt the network between validators at the exact time of the audit announcement, making A_n empty, and so pass the audit test.
But the number of validators required to pull this off is huge; the only (next to impossible) case I can think of is if more than 95% of the validators use the same Internet Provider, and the provider performed a huge targeted attack on the JAM protocol validators, stopping their communication at the exact time. Not much sure about this either.
(reading the ELVES paper now)
(edited)
# 2024-12-08 11:30 gav: Good that there's some effort to comprehend the underlying game theory! Less trust:)
# 2024-12-08 18:51 gav: > <@amritj:matrix.org> I asked this question in Jam room, not solved there so asking here too:
>
> Question:
>
> The segments are erasure-coded and distributed to the validators. Each validator receives a single shard out of the 1,023 shards, determined by their validator index. Therefore, the shard index of the shard they receive equals to their validator index.
>
> According to the JAMNP protocol, to fetch data from the assurers, we need to provide each assurer with the shard index we require from them.
>
> The exported segments are expected to remain available for 28 days. During this period, it is assumed that more than 341 validators will remain consistent. However, there is still a significant likelihood that validator indexes may change during this time due to the addition or removal of validators.
>
> I asked if we have to remember historical validator sets to know who to request shards from, which David Emett confirmed
>
> But we are not able to find any way to distribute this data to a new validator added to the network, and without this data he won't be able to compute work reports.
>
> I proposed to not require shard index in the request to fetch data from assurer but assurer himself give us the shard index which we can ofcourse verify is correct or not, so we only need the erasure root and segment index, but as some new validators added to the system if we randomly choose 342 assurers to request data from some of the responses will be empty so either we can request from ~400-500 validators (churn rate as confirmed by sourabhniyogi is low, so we will most probably receive more than 342 shards) OR somehow store which validators were expected to be active in the requested WR segment epoch by storing status component like we do in preimages in the validator metadata which I am not much sure about
>
> The above solution is only for new validators, old validators will use the normal method as they know which validators have which shard
>
> Is this a good solution to the problem or is there a better answer we can't see?
David Emett: might have an opinion on this; personally I'd be tempted to store 28 days' worth of historical validator key sets on-chain (full key 336 bytes, 10 validators change per epoch and 4 bytes to redirect to last change = 9 MB), as well as 28 days' worth of entropy (32 bytes per 6 seconds = 13 MB)
# 2024-12-09 13:24 dave: Is historical entropy needed? Not clear to me what for. Re historical validator info, my preference was to just expand the epoch mark to include Ed25519 keys and metadata, as this seems simpler (from a spec perspective at least) and the chain history will be required anyway to construct the SR->ER map
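As a sanity check on the 13 MB entropy figure quoted above, assuming one 32-byte entropy value per 6-second slot:

```python
# Sanity check of the "28 days of entropy = 13 MB" arithmetic above,
# assuming one 32-byte entropy value per 6-second slot:
slots = 28 * 24 * 60 * 60 // 6     # 403,200 slots in 28 days
entropy_mb = slots * 32 / 1e6
print(f"{entropy_mb:.1f} MB")      # 12.9 MB
```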
# 2024-12-08 18:52 gav: Then, you can select the shards as desired and find (or cross-reference) the validator's info fairly easily.
# 2024-12-08 18:55 gav: > validator indexes may change during this time due to the addition or removal of validators
Index churn is somewhat problematic when considering our current EC reconstruction algorithms. The staking chain would need to be aware of this and enforce index-validator affinity, so it wouldn't assign a different index to the same validator from hour to hour without good cause.
(edited)
# 2024-12-08 19:00 gav: > old validators will use the normal method as they know which validators have which shard
Only as long as they know which era/block the segments-root belongs to and they know the validator information for that time?
(edited)
# 2024-12-09 04:48 amritj: > <@gav:polkadot.io> > old validators will use the normal method as they know which validators have which shard
>
> Only as long as they know which era/block the segments-root belongs to and they know the validator information for that time?
Yeah, old validators need to maintain a mapping for historical validators.
And new validators can now easily too if they can fetch historical validator set data from other nodes and verify it with epoch mark
(edited)
# 2024-12-11 20:57 danicuki: I have a doubt about this formula:
(7.2) β† ≡ β except β†[|β| − 1]_s = H_r
What is the correct interpretation:
1. β† is a copy of β except that in the last element of β† we replace s for the value of Hr
2. β† is a copy of β only if the value of s in the last element is equal to Hr, otherwise β† is nil
https://graypaper.fluffylabs.dev/#/5b732de/0fd5010fde01 (edited)
# 2024-12-11 20:59 dave: It's just putting the correct state root in for the last block, which will be 0 in beta (prior state)
# 2024-12-11 21:00 danicuki: So, in this case, wouldn't it be more precise to put β†[|β| − 1]_s ≡ H_r
(edited)
# 2024-12-11 21:02 danicuki: and then comes another question: in the reports vectors, the prior_state_root (Hr) is not given in the header, which means it is null. Should I replace the existing value with null? Because the post_state vectors don't do this
# 2024-12-11 21:06 celadari: I don't understand the second part of the question regarding reports. Can you tell which equations you're refering to ?
# 2024-12-11 21:12 danicuki: If I have to replace β† last element, Hr should not be null in the vector.
# 2024-12-11 21:14 celadari: Hr is not null,
But beta[|beta|-1] is null (more specifically 32 bytes of zeros)
# 2024-12-11 21:29 celadari: Yeah, I wasn't super precise but yeah Hr is part of the header
# 2024-12-11 21:29 danicuki: My interpretation is:
1. take the last element of history and put H_r in s (which was zero in the last iteration) (7.2)
2. create a new element for history (7.3)
3. add this newly created element at the end of the list, removing the first element if the list is bigger than the constant H (7.3)
(edited)
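The three steps above can be sketched as follows; the field and function names are made up for illustration, and a history bound of H = 8 entries is assumed:

```python
from dataclasses import dataclass, replace

H = 8  # assumed bound on recent-history length

@dataclass(frozen=True)
class HistoryItem:
    header_hash: bytes
    mmr: tuple         # accumulation-result MMR (simplified)
    state_root: bytes  # the 's' component, zeroed when first appended
    reported: tuple    # work-report dict (simplified)

def update_recent_history(beta, prior_state_root, new_item):
    # step 1 (7.2): beta-dagger is a copy of beta, except the last item's
    # state root is replaced by H_r (it was zeroed when appended)
    beta_dagger = list(beta)
    if beta_dagger:
        beta_dagger[-1] = replace(beta_dagger[-1], state_root=prior_state_root)
    # steps 2-3 (7.3): append the new item and keep at most H entries
    beta_dagger.append(new_item)
    return beta_dagger[-H:]
```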
# 2024-12-11 21:32 danicuki: actually, my main doubt is: are these vectors affecting the recent history state component? I think they are not
# 2024-12-11 21:33 danicuki: I have the same question for the authorizer pool (I think someone already asked this, I didn't see the answer)
# 2024-12-11 21:35 danicuki: My local tests were not passing because I was trying to match my post state recent history with the ones in the vectors, which actually are not correct, they are supposed to be ignored in these tests
# 2024-12-11 21:48 davxy: No. Recent block history last block state root is mutated by the recent block history STF.
(edited)
# 2024-12-11 21:49 davxy: > <@danicuki:matrix.org> I have the same question for the authorizer pool (I think someone already asked this, I didn't see the answer)
Authorizer pool is now mutated by the "authorizers" STF. Vectors not published yet.
# 2024-12-11 21:50 danicuki: But I see authorizers (alpha) being changed in reports vectors, no?
# 2024-12-15 18:43 dave: Not sure what you mean by decode? This stuff is for calculating the state root from the state, which is a one-way operation
# 2024-12-16 21:50 jaymansfield: Is there a target date for having version 1.0 of the GP ready?
# 2024-12-16 22:22 jaymansfield: Thank you. I’m assuming M1 submissions can’t be before this date then? Or will the conformance tool come before this?
# 2024-12-17 08:34 gav: > <@jaymansfield:matrix.org> Thank you. I’m assuming M1 submissions can’t be before this date then? Or will the conformance tool come before this?
Correct. From the published rules:
> Prizes are paid no earlier than the ratification by the Polkadot Fellowship of version 1.0 of the JAM protocol. Payment of the prize by the Web3.0 Foundation is conditional upon the successful completion of all KYC/AML processes
# 2024-12-18 00:54 xlchen: I am a bit confused about what is a work package bundle. I can't find a formal definition of it
# 2024-12-18 01:06 xlchen: I see. Thanks. It would be better if it were a bit more explicit
# 2024-12-18 14:58 prematurata: davxy afk: since you're the bandersnatch god :) I have a question...
`Hs` has a direct dependency on `Hv`, which is included in the context to sign as its last element (6.16 and 6.15).
`Hv` seems to also depend on `Hs`.... well, it depends on `Y(Hs)` (see 6.17). While i know it needs to match the ticket in case we're not in fallback mode, i believe there is some cryptography magic i didnt fully understand to know what `Y(Hs)` would be without really having `Hs`.... otherwise we would have a circular dependency
(edited)
# 2024-12-18 15:01 prematurata: If i understand properly this magic should be using G.4 with the same message, privkey and **empty context** to get a RingVRF proof on which i could then apply G.5 and get the same result
(edited)
# 2024-12-18 15:10 alxmirap: I may be able to answer that, I believe. The encoding in equations 6.15/6.16 uses the function \Epsilon_U(H), defined in Appendix C as being the full \Epsilon(H) without the H_s term (equations C.19 and C.20).
I believe this encoding function was created precisely for the case of omitting this field to avoid the circular dependency.
# 2024-12-18 15:13 alxmirap: This function is introduced in section 5, just before Eq 5.1.
# 2024-12-18 15:20 dave: The VRF output is independent of the additional context, so there isn't really a circular dependency
# 2024-12-18 15:21 prematurata: i think i didnt explain myself
- \Epsilon_U(H) contains `Hv`
- `Hs` uses \Epsilon_U(H)
- `Hv` uses `Y(Hs)`
# 2024-12-18 18:12 yu2c: Hello, question here:
\mathbb{H}^0 = [0]_32 is called the "zero-hash" in Ch 3.8.1.
However, \mathbb{H}_0 is called the "zero hash".
Is there any difference between these terms?
Does "zero hash" also refer to [0]_32 or does it take 0 as the input of the hash func?
# 2024-12-18 21:25 gav: Both terms are equivalent in meaning but one is a typo and will be corrected to the other. For now please treat them as equal.
(edited)
# 2024-12-18 22:55 basedafdev: In appendix D.2, the state serialization spec for $\pi$ is a bit ambiguous to me. \pi is defined as a tuple of prior stats for all validators and current stats for all validators; how is E_4(\pi) possible?
# 2024-12-19 00:06 gav: But yeah I can look into reformulating this as someone else had a similar query.
# 2024-12-19 13:29 luke_fishman: OK then i must be missing something.
g (of the same column) is ω9
g (gas cost) = 10 + ω8 + 2^32 ⋅ ω9 (clearly bigger than ω9)
so:
if Q < ω9 (the case to return HIGH) => Q < g (gas cost) => not enough gas => fallback to Eq. B.20 => registers unchanged
so we can never get ω7' = HIGH
(edited)
# 2024-12-19 19:24 gav: The gas cost for transfer is only a placeholder. I’ll make it more sensible soon.
# 2024-12-19 19:25 gav: But the g it is referring to is not the instruction gas cost on the left but the let expression above.
# 2024-12-20 09:26 gav: @room Version 0.5.3 is released, containing quite a few corrections and clarifications. No really major changes this time, though we now have *active removal of reports on timeout*.
(edited)
# 2024-12-21 18:54 clearloop: **9.3 Account Footprint and Threshold Balance - formula (9.8)**
What about changing the `a_l` here to `a_o`, it looks too similar to `a_\mathbf{l}`
(edited)
# 2024-12-21 19:50 sourabhniyogi: When we refer to JAM's DA capabilities, does that refer to the preimages (supported by accumulate host functions solicit/forget) as well as the ImportDA + AuditDA (supported by refine host functions import/export), or only the latter?
For rollups utilizing JAM to store blocks and headers, should they be using one or the other or both in designs intended for max scalability for a whole ecosystem (e.g. OP Stack)?
(edited)
# 2024-12-22 17:05 sourabhniyogi: Which system is the CoreChains service going to write blocks and headers to ? For zk rollup ecosystems using JAM DA vs preimages, where should proofs be stored -- is there a design judgment on this that should emulate that of the CoreChains service?
(edited)
# 2024-12-22 20:05 xlchen: If I am building it, blocks will just be the body of work item, headers stored in service storage, parachain code blob in preimage, chain state in DA
(edited)
# 2024-12-22 20:44 gav: > <@sourabhniyogi:matrix.org> Which system is the CoreChains service going to write blocks and headers to ? For zk rollup ecosystems using JAM DA vs preimages, where should proofs be stored -- is there a design judgment on this that should emulate that of the CoreChains service?
Feel free to design your services exactly as you want.
# 2024-12-26 13:51 yu2c: Hello, I have a question about the structure of the recent history $\beta$ in the GP.
(7.1) is (header hash, accumulation-result MMR, state root, work-report dict.), however in
(7.3) the item $n$ is (work-report dict., header hash, accumulation-result MMR, state root).
The order seems inconsistent. Which one should I follow? I assume this is a tuple, so the order is important for implementation because they will be concatenated at the end
# 2024-12-26 15:46 gav: Tuple item order is irrelevant. It only matters for serialisation and that is well-defined in the appendix.
(edited)
# 2024-12-28 18:56 jaymansfield: Was it intentional that CE-130 was skipped in the networking specs?
# 2024-12-29 07:06 sisco0: In the graypaper, it is stated that "Size-Synchrony Antagonism" is not a known concept in the literature. However, the following concepts might be related: the CAP Theorem and the Blockchain Trilemma (link). Would not the "Size-Synchrony Antagonism" term be part of any of the cited ones?
Citing the Blockchain Trilemma research paper:
1. Adding more nodes increases communication overhead
2. As 𝑁 increases, 𝐿 grows at least logarithmically, potentially linearly or quadratically, depending on protocol specifics.
(edited)
# 2024-12-29 08:21 gav: CAP theorem comes closer since what it calls “consistency” is similar in nature to what i termed “coherency”. However each of its three concepts is a binary condition and the proof simply states that no system can fulfill all three. Furthermore it doesn’t conceptualise the size of the system directly (how could it under a binary condition?), instead going for the antagonist concepts “availability” and “partition-tolerance”. As such I’d argue CAP theorem is a more concrete trilemma on specifically replicated data systems.
(edited)
# 2024-12-29 08:21 gav: The blockchain trilemma does conceptualise the size of the system but does not involve the coherency at all. It simply states that as a (blockchain) system scales then it becomes either centralised or insecure. This trilemma implicitly assumes total coherence.
(edited)
# 2024-12-30 04:01 yu2c: Sorry, I'm a bit confused about the definition of the serialization of the guarantee extrinsic $Eg$.
In GP 5.6, $\mathbf{g}$ is defined.
However, in appendix C.16, there's already a definition for the serialization of $Eg$.
Is $\mathbf{g}$ specifically defined for $H_x$? Or which definition should we follow? Thanks
# 2024-12-30 07:51 prematurata: As of now 2 different serializations are needed in 2 different contexts
# 2024-12-30 20:53 charliewinston14: Question about CE-137. In the specs a bundle shard is defined as [u8]. Is there a fixed size for them? There is no reference to a "bundle shard" in the GP itself and not sure if it goes by a different name in it.
# 2024-12-31 11:57 dave: A bundle shard is one of the pieces you get back after erasure coding a WP bundle. Size is not fixed as it depends on the size of the original bundle
# 2024-12-31 09:58 luke_fishman: a doubt about the encoding of service storage and pre-image lookups in D.2:
https://graypaper.fluffylabs.dev/#/6e1c0cd/373302377802
if i am reading this correctly, after encoding it would be impossible to reconstruct the original hash keys when decoding
∀(s ↦ a) ∈ δ, (k ↦ v) ∈ a_s ∶ C(s, E_4(2^32 − 1) ⌢ k_{0...28}) ↦ v
am i missing something?
thank you and a happy new year
# 2024-12-31 10:02 gav: > <@luke_fishman:matrix.org> a doubt about the encoding of service storage and pre-image lookups in D.2:
> https://graypaper.fluffylabs.dev/#/6e1c0cd/373302377802
> if i am reading this correctly, after encoding it would be impossible to reconstruct the original hash keys when decoding
> ∀(s ↦ a) ∈ δ, (k ↦ v) ∈ a_s ∶ C(s, E_4(2^32 − 1) ⌢ k_{0...28}) ↦ v
> am i missing something?
> thank you and a happy new year
You’re not missing anything. HNY:)
# 2024-12-31 10:03 luke_fishman: and this is intended? decode(encode(service_account)) != service_account
(edited)
# 2024-12-31 10:07 gav: Obviously, since decode is defined only as the inverse of encode, that is not true unless service account has more degrees of freedom than its encoding. This should not be the case.
# 2024-12-31 10:10 luke_fishman: right, obviously should not be the case.
but you have confirmed above the i am not missing anything -> impossible to reconstruct the hash keys of storage dictionary s, preimage lookup dictionaries p and l.
so... here i get stuck? where is the missing information to allow to reconstruct those hashes?
# 2024-12-31 10:12 gav: the mappings are not serialised alongside the rest of the service account.
(edited)
# 2024-12-31 10:17 gav: The trie root is a commitment to them. In the protocol there’s no facility or direct need to enumerate the mappings’ keys thus no need to actually store them, only to be able to query them. It’s basically up to the implementation exactly how they’re stored, but good ones will need to take into account the commitment scheme (ie trie) when designing for it.
(edited)
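A tiny sketch of why the keys are unrecoverable. The helper name is made up and the real C(s, ...) construction in the GP may arrange bytes differently; only the 28-byte truncation matters here:

```python
import hashlib

# Hypothetical helper: concatenates the service index, E_4(2^32 - 1) and the
# first 28 bytes of the 32-byte storage key k (the truncation is the point).
def storage_state_key(service: int, k: bytes) -> bytes:
    assert len(k) == 32
    chapter = (2**32 - 1).to_bytes(4, "little")  # E_4(2^32 - 1)
    return service.to_bytes(4, "little") + chapter + k[:28]

k = hashlib.blake2b(b"some storage key", digest_size=32).digest()
key = storage_state_key(7, k)
# Only k's first 28 bytes survive: the final 4 bytes are discarded, so the
# serialization alone cannot enumerate the original 32-byte hash keys.
assert len(key) == 36 and key.endswith(k[:28])
```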
# 2025-01-02 10:05 gav: v_x belongs to N_32 - it’s constructed from at most 4 octets from the instruction data. (l_x is min(…, 4))
(edited)
# 2025-01-02 19:56 tomusdrw: I'm wondering about an edge case with arguments parsing in PVM. Some instructions simply use `ζı+1` (like here), but for immediates/offsets we use `ℓ` which is mask dependent (like here).
ζ is zero-padded so it's fine if we go beyond the program length, but what if `ζı+1` is actually the next instruction?
AFAICT from GP we should treat the instruction byte as argument (and read registers from it) and in the next step just move according to the mask (i.e. execute that instruction).
Is that expected or should there be some special casing for that in the GP?
# 2025-01-02 20:04 tomusdrw: In an edge case to that edge case we may have a program that is just instructions (i.e. all bits in the mask are set), but we are still executing every instruction correctly, because we read the next instruction bytes as arguments to the current instruction (or 0s in case we go beyond the program length). I just want to confirm that I'm understanding it correctly.
(edited)
# 2025-01-02 23:57 subotic: Not sure if I'm correct, but I understand the definitions in the GP as how the program must be encoded. So if there is no argument present where one should be as per GP, then the program is incorrect and we need to panic.
# 2025-01-03 00:50 gav: > <@tomusdrw:matrix.org> I'm wondering about an edge case with arguments parsing in PVM. Some instructions simply use `ζı+1` (like here), but for immediates/offsets we use `ℓ` which is mask dependent (like here).
> ζ is zero-padded so it's fine if we go beyond the program length, but what if `ζı+1` is actually the next instruction?
> AFAICT from GP we should treat the instruction byte as argument (and read registers from it) and in the next step just move according to the mask (i.e. execute that instruction).
> Is that expected or should there be some special casing for that in the GP?
It’s perfectly fine to concoct programs which reuse a previous instruction’s argument as the next instruction’s opcode.
# 2025-01-03 00:55 jan: Basically what Gav said. The program code blob is just that, a blob of bytes. You have the instruction pointer which tells you at which position to decode an instruction, and you have the skip value which tells you how to increment the instruction pointer after that instruction is executed. And the skip value doesn't necessarily have to be the same as the "length" of the instruction.
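A toy model of what jan describes. The opcodes, the one-byte argument read and the mask layout are invented; only the skip mechanism mirrors the discussion, showing that an argument byte can overlap the next opcode or run past the end of the code:

```python
# Toy program: 0x07 sits at the next instruction boundary candidate but is
# consumed as 0x10's immediate; the mask says instructions start at 0 and 2.
code = bytes([0x10, 0x07, 0x20])
mask = [True, False, True]

def skip(i):
    # distance to the next set bit in the mask (simplified, no upper cap)
    j = i + 1
    while j < len(mask) and not mask[j]:
        j += 1
    return j - i

trace = []
i = 0
while i < len(code):
    opcode = code[i]
    # zero-padded read of one argument byte, which may overlap the next
    # opcode or lie beyond the program length
    arg = code[i + 1] if i + 1 < len(code) else 0
    trace.append((opcode, arg))
    i += skip(i)

print(trace)  # [(16, 7), (32, 0)]
```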
# 2025-01-03 21:35 charliewinston14: Hi. Question about block sealing (by ticket). Maybe I am missing something, but tickets are generated using a ring signature, so there doesn't seem to be a way to determine which ring member created one. But then, when a block is received, how do we validate that it was created by the correct author matching the ticket? Couldn't any other member of the ring just create the same block and set themselves as the author (pretending the ticket was theirs)?
(edited)
# 2025-01-03 21:46 davxy: The block author signs the block header using a "standard" Bandersnatch VRF signature. The VRF output generated from this signature matches the Ring VRF output, which serves as the ticket ID. This proves the block author's ownership of the ticket, as they are the only one capable of producing this specific output. For further details, please refer to the Bandersnatch VRF specification paper and example
https://github.com/davxy/bandersnatch-vrfs-spec. (edited)
# 2025-01-05 09:10 luke_fishman: a couple of questions please regarding chapter 14.4 (computation of work results)
1. in Eq. 14.11 => definition of I(p,j) => invocation of refine => the function S (import segment data) is being called with two arguments, but in 14.14 it is defined with only a single (work item) argument. which one is it?
2. maybe related, who/what is the bold face **s** used by the functions J and S in 14.14? where does it come from?
Thank you
# 2025-01-05 11:15 gav: 1. You can disregard the bold-l parameter; it is assumed to be “part of the environment” in each of the functions which utilise L and is not passed explicitly into it.
(edited)
# 2025-01-05 11:17 gav: 2. Bold-s is constrained as the correct operand to the merkle root equality.
# 2025-01-05 11:29 luke_fishman: understood, thank you.
follow up question, could you shed some light on how to tell the difference between H and H⊞?
i saw your answer in the other room about the encoding in C.29 but that didn't clear it up for me.
are they both Y_32 or is there any "mark" to tell them apart?
(edited)
# 2025-01-05 11:42 gav: One is drawn from the set Y_32 and the other from a bijective mapping, denoted by square-plus (that little window symbol)
# 2025-01-05 11:52 luke_fishman: yes indeed there is the window thingy mark. that's not what i meant.
maybe it's a silly question.
the intention was: in code, how do i tell them apart?
if they are both 32-octet-long binaries, how could i tell if i'm looking at an H or an H⊞?
(edited)
# 2025-01-05 11:58 gav: Well that all depends on what language you’re using. In general, you’ll need to use some extra memory to store whether it’s with or without the mark.
# 2025-01-06 06:44 luke_fishman: gav: follow up question please, about the construction of the segment-root dictionary l.
initially we have:
K(l) ≡ {h ∣ w ∈ p_w, (h⊞, n) ∈ w_i}, ∣l∣ ≤ 8
this is clear enough: from the work items in a work package we take up to 8 hashes, of the kind that is the work-package hash. okie dokie.
then we come to:
∃p, c ∈ P, N_C ∶ H(p) = h in 14.13
which means to me one of two things:
1. we have here knowledge of "all" the work packages (all in what scope?) or some subset/list of work packages, and we go through it to find the work-packages that the work-items refer to
2. we only keep in the segment-root dictionary the hashes that refer to the work-package passed into the work result computation function Ξ. so all the keys will be identical, H(p), but since s as far as i can see is not dependent on the core index c, we would end up with a dictionary of all identical key -> value pairs
option 2 doesn't make much sense, which brings me back to option 1, but i see no reference to any list or set of work-packages
(edited)
# 2025-01-06 06:50 luke_fishman: is there a video lecture of chapter 14 as part of the JAM tour? i could not find any
# 2025-01-06 10:30 gav: Yes, as the guarantor you'll need to have seen the WRs of any WP hashes mentioned in the SR Lookup, so you can be sure that it's correct otherwise you might end up guaranteeing WP with a WR which won't be accumulated.
(edited)
# 2025-01-06 10:35 gav: Basically just what will be in the recent blocks by the time of becoming available.
(edited)
# 2025-01-06 10:39 gav: This mechanism is designed to allow pipelining when sequences of unidependent work packages are executed on the same core.
# 2025-01-06 10:41 gav: Without it, it would be hard to reference data in the DA without figuring out the SR manually which may not necessarily be known at the time of authoring the importing package.
# 2025-01-07 10:52 0xjunha: I have a question regarding the leaf node encoding function (
https://graypaper.fluffylabs.dev/#/911af30/37f402371103) for state merklization.
In a Merkle trie, a leaf node will be typically placed at a depth less than 256, implying that it should represent the remaining, unconsumed bits of the
state_key
after navigating to that point - so that we can guess which
state_key
that leaf node represents.
When considering what the leaf node holds as the encoded state key (
bits(k)...248
), should this refer to:
a) "The first 248 bits of the full 256-bit state key"? Or
b) "The first 248 bits of the unconsumed path bits"?
For example, let's say we have a state key like
0b1101_1010_1001...
and a leaf node is at depth 10, meaning we've navigated the trie using the path
0b1101_1010_10
to reach this node, should the leaf node:
a) Store the first 248 bits of
0b1101_1010_1001...
? Or
b) Only store the first 248 bits of
0b01...
, skipping the first 10 bits, thus excluding the part already used for navigation (
0b1101_1010_10
)? Actually in this case the remaining part is only 246 bits, so we have 2 bits of free space.
From my interpretation of the formalism, it seems the encoding takes the "first 248 bits of the full 256-bit state key" regardless of the leaf node's position (depth), so option a), but just wanted to confirm this understanding.
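Under reading a), the stored key bits can be computed without any reference to depth; a minimal sketch (helper name hypothetical):

```python
def leaf_key_bits(state_key: bytes) -> bytes:
    # Reading (a): a leaf stores the first 248 bits (31 bytes) of the
    # FULL 256-bit state key, regardless of the leaf's depth in the trie.
    assert len(state_key) == 32
    return state_key[:31]
```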
# 2025-01-08 19:54 danicuki: Getting back to an old question: I still didn't figure out what C(p) means in this formula. Here:
d ↦ [join(c) | c ← ᵀ[C(p) | p ← unzip_684(d)]]
d is a binary with size multiple of both k and 684
I unzip684 d, so I get k binaries p of size 684
then I take each one of this binaries p and apply C(p) - what is C(p) ?
# 2025-01-10 18:08 danicuki: I got it. Thanks for sharing this. I tried to use this library, but it accepts only shards with size that are multiple of 64 bytes. In case of JAM, shards are 684 bytes. Do you know how we could deal with this?
Another question is: 684 / 1023 configuration would be for production network. What would be the testnet / tiny / small network configuration?
# 2025-01-10 20:56 danicuki: About the Erasure Code definition, if C ∶ ⟦Y2⟧342 → ⟦Y2⟧1023, and p ∈ ⟦Y684⟧, is there a formal specification missing in formula H.6 to transform p ∈ ⟦Y684⟧ into ⟦Y2⟧342 ?
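One way to read the shape-plumbing in H.6 (with a stand-in for the erasure code C, since the real C is the Reed-Solomon extension of 342 two-byte elements to 1023): a 684-byte piece already is [Y2]_342, so no extra transform is needed, only reinterpretation.

```python
def unzip(d: bytes, n: int):
    # split d into consecutive n-byte pieces
    assert len(d) % n == 0
    return [d[i:i + n] for i in range(0, len(d), n)]

def C_stub(p: bytes):
    # Stand-in for C : [Y2]_342 -> [Y2]_1023. A 684-byte piece IS 342
    # two-byte elements; the real C extends them to 1023 via Reed-Solomon.
    words = [p[i:i + 2] for i in range(0, len(p), 2)]
    return words + [b"\x00\x00"] * (1023 - len(words))  # fake parity

def encode(d: bytes):
    rows = [C_stub(p) for p in unzip(d, 684)]  # k rows of 1023 words
    cols = list(zip(*rows))                    # transpose: 1023 columns
    return [b"".join(c) for c in cols]         # join each column -> one shard

shards = encode(bytes(684 * 2))  # 1023 shards, 2 bytes per 684-byte piece
```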
# 2025-01-11 03:40 amritj: I am also facing this problem right now. I have temporarily padded my data to support 64 bytes and will fix it later.
There is also a discussion about this issue on GitHub here:
https://github.com/w3f/jamtestvectors/pull/4
Maybe someone else has a concrete solution.
I believe the configuration for testnet, tiny, or small networks will remain the same because the data size will still be 684k. In all these networks, the same validator holds multiple pieces of the 1023 chunks.
(edited)
# 2025-01-11 17:32 weigen: Hi, I have a question about disputes. Based on my understanding, the faults (f) in the extrinsic include validators who issued invalid judgments on a work report. Since the work report is actually valid, it should belong to the good set (Psi_g).
This logic aligns with the behavior seen in this test vector (
https://github.com/davxy/jam-test-vectors/blob/polkajam-vectors/disputes/tiny/progress_with_verdicts-4.json#L82, https://github.com/davxy/jam-test-vectors/blob/polkajam-vectors/disputes/tiny/progress_with_verdicts-4.json#L191), where work reports included in faults (f) are added to the good set of the posterior Psi. However, according to the Gray Paper (
https://graypaper.fluffylabs.dev/#/579bd12/128a01128a01), the reports in faults (f) should instead be placed in the bad set (Psi_b) and not in the good set.
This creates a conflict: if the work reports in faults (f) are valid and should be part of the good set, why are they assigned to the bad set instead? My understanding is that validators who issued invalid judgments on these valid reports are placed in faults (f), but the reports themselves remain valid.
Am I interpreting this correctly, or have I missed something?
# 2025-01-12 10:44 dave: So eg if a WR is found to be bad, then validators who supported the WR (ie produced a positive judgment) can be reported via faults
# 2025-01-13 07:43 danicuki: > <@clearloop:matrix.org> can't we just use https://github.com/paritytech/erasure-coding/tree/main directly? for the testvector, I think it is incomplete or we'd better do it after the accumulation part, if I'm not mistaken, the testvectors required a lot of external logic for assembling the chunks
This parity example just takes a generic byte string and splits it into chunks. It still uses the Reed Solomon library, that only accepts strings with 64 bytes multiple size
# 2025-01-16 05:02 clearloop: thanks! I have skipped the types specified in erasure_coding after seeing parity/erasure_coding, will get it back these days!
# 2025-01-13 12:24 gav: If invoke returns zero in omega_7, then it unambiguously means Halt.
# 2025-01-13 12:25 gav: You’re probably trying to associate a _type_ with omega_7. This would be an incorrect approach.
# 2025-01-13 16:56 ycc3741: gav: Does the official source provide any test data for verifying whether each class has been correctly serialized according to the specifications in Appendix C.2 before performing the hash operation?
# 2025-01-13 16:57 ycc3741: We are currently encountering some issues and would like to verify whether the serialization is correct. Having the serialized results available would greatly facilitate development. Thank you!
# 2025-01-13 17:08 jaymansfield: > <@ycc3741:matrix.org> We are currently encountering some issues and would like to verify whether the serialization is correct. Having the serialized results available would greatly facilitate development. Thank you!
Just a tip in the meantime until there are official full block/state serialization vectors.. I found switching to parsing all of the *.bin vectors rather than the json equivalents helped me catch a few encoding issues.
# 2025-01-15 14:52 prasad-kumkar: For state merklization, is there a recommended approach for data ordering before constructing the Merkle tree? While I noticed the state key construction function C appears to provide some ordering through key generation, I'm unsure if:
1. This is indeed meant to determine the ordering before Merkleization
2. If so, would service indices (s) create well-distributed keys since they are sparsely distributed 32-bit integers?
I initially thought of splitting data in half recursively (like traditional Merkle trees), but noticed JAM's approach might be different.
# 2025-01-15 16:02 gav: This is a Merkle trie, so the keys (given by C) determine exactly the node structure of the tree.
# 2025-01-15 16:03 gav: You don’t get to decide. “Ordering” is moot as we never iterate. But if you really want an order (eg for RPC) then you can apply dictionary ordering to the keys.
# 2025-01-15 16:05 gav: Service indices are sparse, as are keys generally: the function C is designed to be sparse and mostly uniform. The tree’s implied node structure (by virtue of the commitment scheme) should be able to manage this perfectly well.
# 2025-01-15 16:07 gav: One reasonable question might be whether (or to what degree) keys/values should be kept in-memory, and whether (or to what degree) the nodes of the tree should be persisted and whether that persistence should be in-memory or on-disk.
# 2025-01-15 16:08 gav: For now I’d leave this for implementers to decide. I expect that getting to M4 or M5 will almost certainly need aggressive use of RAM to store/memoize one or both of these databases.
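The point that the keys alone fix the node structure can be illustrated with a toy binary trie (not the GP node encoding; names hypothetical, assumes distinct 32-byte keys):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.blake2b(data, digest_size=32).digest()

def bit(key: bytes, i: int) -> int:
    # i-th bit of the key, most significant bit first
    return (key[i // 8] >> (7 - i % 8)) & 1

def merklize(items, depth=0) -> bytes:
    # Toy binary Merkle trie: split on the current key bit until one
    # item remains, so the key set alone determines the node structure.
    if not items:
        return bytes(32)
    if len(items) == 1:
        key, value = items[0]
        return H(b"leaf" + key + value)
    left = [kv for kv in items if bit(kv[0], depth) == 0]
    right = [kv for kv in items if bit(kv[0], depth) == 1]
    return H(b"branch" + merklize(left, depth + 1) + merklize(right, depth + 1))
```

Since the shape is implied by the keys, the root is independent of any insertion order.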
# 2025-01-15 19:52 jaymansfield: A suggestion for CE-132.. it might be a good idea to allow multiple tickets to be submitted by a proxy in a single stream. The current specification works well for the tiny chain spec but it may not be the best performance-wise when using the full chain spec. The full chain spec will have 2046 tickets to be distributed to everyone within approx. 22 minutes (half of the lottery time and allowing 3 min for connectivity changes). This will result in 90+ incoming streams per minute for each validator just to consume tickets.
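The arithmetic behind that estimate, for reference:

```python
# Full chain spec: 2046 tickets distributed within ~22 minutes; with one
# stream per ticket that is ~93 incoming streams per minute per validator.
tickets, minutes = 2046, 22
streams_per_minute = tickets / minutes  # 93.0
```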
# 2025-01-15 19:56 dave: Might happen for the full network protocol, don't think it's particularly important though
# 2025-01-15 20:23 mkchung: For CE-134 "Work-package sharing", why is "slot" not included as part of the msg?
Guarantor -> Guarantor
--> Core Index ++ Segments-Root Mappings
--> Work-Package Bundle
--> FIN
<-- Work-Report Hash ++ Ed25519 Signature
<-- FIN
Perhaps this can be something like
--> Core Index ++ slot ++ Segments-Root Mappings
?
# 2025-01-15 21:27 dave: Not really necessary as it stands, the recipient can just accept or reject based on whether there is an appropriate core assignment or not. What an appropriate assignment is probably needs to be specified in more detail to ensure different implementations work well together, but will likely just be based on the current time and state at the head of the chain
# 2025-01-17 18:15 gav: > <@cisco:parity.io> If this function returns B_{8n}, shouldn't it say "forall i in N_{8n}" instead of "...in N_{2^{8n}}"?
Yes. Will be fixed in next revision
# 2025-01-19 10:18 vinsystems: Question about the schema.asn of PVM test programs: "expected-status" is described as "the status code of the execution, i.e. the way the program is supposed to end".
1.- Is this the GP exit reason? (halt, panic, out of gas, page fault, hostcall fault)
2.- If (1) is yes, should "trap" be replaced by "panic" and the other exit reasons be added to "expected-status"?
(edited)
# 2025-01-20 09:57 vinsystems: It's a bit confusing because "trap" is an instruction and "panic" is an exit reason.
What do you do when a page fault occurs? Do you store the lowest address of the access in a status register before executing the trap instruction to switch from PVM mode to JAM kernel mode?
# 2025-01-20 12:51 jan: I will align the naming in the next version of the test vectors to make it less confusing.
# 2025-01-20 12:55 jan: For toplevel PVMs a page fault is no different than executing a "trap" instruction or any other condition which would result in a "panic" exit reason, and for those you don't even need to know the address of the page that faulted.
For inner PVMs (those are PVMs triggered with the invoke hostcall) a page fault will interrupt the execution of the inner PVM and should return the address of the page which triggered the fault.
# 2025-01-20 13:00 jan: And in case you're wondering, the store instructions are atomic, so for example if the inner PVM tries to write 4 bytes into the memory, and the first two bytes end up at the 2 last bytes of page N (which was already faulted) and the last two bytes end up at the 2 first bytes of page N +1 (which was not faulted) then such a write will
only trigger a page fault with the address of the N + 1 page and memory will not be modified.
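A sketch of that atomicity rule as described (page size and names hypothetical): check every page the write touches before writing; if any is inaccessible, report the faulting page's address and leave memory untouched.

```python
PAGE = 4096  # hypothetical page size, for illustration only

def try_store(memory: bytearray, accessible: set, addr: int, data: bytes):
    # Stores are atomic: check all touched pages first, write only if
    # every page is accessible; otherwise fault without modifying memory.
    first, last = addr // PAGE, (addr + len(data) - 1) // PAGE
    for page in range(first, last + 1):
        if page not in accessible:
            return ("page-fault", page * PAGE)  # memory not modified
    memory[addr:addr + len(data)] = data
    return ("ok", None)
```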
# 2025-01-20 18:01 sourabhniyogi: Small clarification question about preimages included in the genesis state (example): To be conformant to GP, should it include a matching a_l in the state trie _even though no preimage was ever "requested"_? If so, what would (9.7) (GP link) prescribe as its $[x]$ value?
(edited)
# 2025-01-21 12:29 sourabhniyogi: Question on JAM DA throughput: How does the model referenced in Sec 20 arrive at the distributed availability of 852MB/s? Simplest model based on W_B=12MB (max encoded work package size, derived from bandwidth considerations I think?) with 6s on guarantee and 6s on assurance yields 1MB/s per core x 341 cores = 341MB/s -- so what accounts for the difference?
# 2025-01-21 13:24 dave: JAM is pipelined so peak throughput is 341 WPs per block, not per 2 blocks. Don't know exactly where the 852MB/s number comes from, but the bundle size does not include exported segments so possibly that is the other missing bit
(edited)
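For reference, the pipelined back-of-envelope figure under these assumptions (12MB bundle per work package, one package per core per 6s block, 341 cores):

```python
W_B_MB, slot_s, cores = 12, 6, 341
# With pipelining, each core handles one bundle per block, not per 2 blocks.
pipelined_MB_s = W_B_MB / slot_s * cores  # 682.0
# Still short of 852 MB/s; the gap is plausibly the exported segments,
# which are not counted in the bundle size.
```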
# 2025-01-22 04:52 sourabhniyogi: Ok, got it -- this 2x pipelining "easter egg" is not obvious from GP or JAMNP. We now see the conditions for it are carefully enabled through the ordering of assurances and guarantees on "rho". Is the idea that this performance optimization is optional for M3 "Kusama performance" but practically required for M4 "Polkadot performance" -- yet not really part of the JAM protocol per se and thus outside of GP? Does a hint in CE133/134 to achieve this factor of 2x make sense?
# 2025-01-23 08:47 gav: Implementations are expected to author blocks in a reasonably efficient manner. Asynchrony which is possible should be exploited. This will likely be required as early as M2.
# 2025-01-23 08:50 gav: We will work to provide some basic M2 test vectors demonstrating this expectation.
# 2025-01-23 09:11 gav: For the inner aspects of state transition (block execution), asynchrony is perhaps not quite so relevant as with block authoring.
(edited)
# 2025-01-23 09:12 gav: There will likely need to be some to hit M3 and M4 (e.g. signature checking, inter-block Merkle root calculation).
# 2025-01-23 09:12 gav: But it's up to teams how the optimise, given the correctness-constraints of the GP.
# 2025-01-23 09:14 gav: For block authoring (M2), skipping a block for either guaranteeing or assuring will be considered incorrect behaviour.
# 2025-01-23 09:14 gav: Nodes are expected to provide (and use) information in a timely fashion; this goes for all aspects of block authoring and production.
(edited)
# 2025-01-24 10:06 dvladco: Hi, in the GP v0.5.4 the instructions rot_l_32 and rot_r_32 have 3 registers but ω_B is never used, is this a typo? and should we rotate by ω_B instead?
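If the intended semantics is indeed to rotate by ω_B, a sketch of rot_r_32 following the usual 32-bit recipe (truncate inputs, operate, sign-extend the result) might look like this; this is a guess at the intent, not confirmed by the GP text quoted above:

```python
MASK32 = 0xFFFF_FFFF

def rot_r_32(value: int, amount: int) -> int:
    # Hypothetical semantics: rotate the low 32 bits of value right by
    # (amount mod 32), then sign-extend to 64 bits (unsigned representation).
    v = value & MASK32
    s = amount % 32
    r = ((v >> s) | (v << (32 - s))) & MASK32
    return r if r < 0x8000_0000 else r | 0xFFFF_FFFF_0000_0000
```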
# 2025-01-25 14:48 dave: Serialisation is defined in appendix C. In particular, see C.22
# 2025-01-26 12:07 weigen: Maybe the constraint of 𝑛∈𝑁_16 is needed, since in C.22 the input of ε_2 should be N_16
# 2025-01-26 14:22 gav: It’s not strictly needed, though if N \ N_{2^16} were ever fed into E_2 the result would be undefined.
(edited)
# 2025-01-26 14:26 gav: Since it is deserialised with E_2, any foreign-born availability specification will always be in range for correct reserialisation. And though your node could create an availability spec with an out-of-range value, it would only break your own node, since it could not be encoded and this would be needed for it to be sent to another node.
(edited)
# 2025-01-26 17:18 gav: It’s a bit fiddly but it’s so that they can be separately distributed and individual items concisely proven to be correct.
(edited)
# 2025-01-26 17:25 sourabhniyogi: With
tiny
having
rotation_period: 4
(which is not all that different from
full
having
rotation_period: 10
) we have found in our "run ~12MB work packages back-to-back" tests that work packages are reasonable for one validator to start towards the end of one rotation but by the time a guarantee is signable by one of the three, it is the case that one or two of them have rotated out of the core. Ok, so ... what is a good mode of operation for
(a) don't start the work if you're towards the end according to rule R, because you won't be rewarded for it!
(b) finish it because you'll be rewarded for it!
We do believe its valuable to have this rule R or reward process specified in more detail to ensure different implementations work well together -- can we come up with a good Schelling point at this point?
Not getting the slot in the "tiny" network (where C is just 2, R is just 4) with "big" work packages (close to 12MB) makes this issue quite prominent in regression tests.
(edited)
# 2025-01-26 18:40 dave: Agree this will probably need to be specified at some point for different impls to work well together. I think this rule will need to be informed by the performance characteristics of a full 1000-validator network though, with real builder nodes, and we aren't in a position to run such a network yet. I can say that at the moment our node follows a pretty simple rule: as a validator, accept a package for core C if we will be "assigned" to that core in the next slot or the slot after that. Note that it is possible for block authors to include packages using assignments from the previous rotation, so there is quite a bit of leeway.
# 2025-01-26 18:43 dave: Given this leeway perhaps a better rule would be to allow if we're currently assigned or will be assigned in the next slot. In any case I would recommend making this sort of thing easy to tweak in your implementation!
# 2025-01-26 18:46 dave: The rule should also probably be a bit more lenient for work packages received indirectly from another guarantor on the same assignment, to avoid a situation where the "primary" guarantor _just_ accepts a package but the guarantors it then shares it with do not.
# 2025-01-26 18:51 dave: Of course at the end of the day none of this behaviour will be required; nodes will be free to accept or reject packages however they think will maximise their profit. A specified rule seems like a good starting point though
# 2025-01-26 22:45 sourabhniyogi: Alright - we'll do "to allow if we're currently assigned or will be assigned in the next slot." for now sure thing thank you
# 2025-01-29 16:46 vinsystems: If a panic occurs when 80 -> load_imm_jump executes the branch function, ω_A shouldn't be changed, right?
# 2025-01-29 16:49 jan: It's always changed, regardless of whether the jump fails or not.
# 2025-01-29 16:49 jan: (This is the equivalent of RISC-V's call instruction; the usual use of this instruction is to load the return address and jump to another function.)
(edited)
# 2025-01-30 13:18 subotic: Regarding page-fault and this paragraph https://graypaper.fluffylabs.dev/#/579bd12/243c00245500, I understand that when trying to write to writable memory and I cannot, I emit a page-fault. What is not so clear to me based on this definition is that if I want to write to read-only memory, I should emit a panic. I only know that because of the test-vectors. Or did I miss a place in the GP where this is defined?
# 2025-01-30 13:25 subotic: Great, thanks for the link. As always, things are more complicated than they seem.
# 2025-01-30 14:22 gav: There's a few changes in this from 0.5.4 all centred around the PVM
# 2025-01-30 14:26 gav: - import host call has been removed in favour of a new fetch hostcall; this reduces the amount of data placed in PVM memory up-front and provides a means of extracting data beyond just that concerning the current work-item but for other work-items too.
- All data providing host-calls now accept an offset parameter to allow any contiguous subportion of the data to be read.
- OOB has been (almost) removed. When the outer PVM has a host-call in which it is passed a memory address it cannot access, then it panics irrecoverably.
# 2025-01-30 14:27 gav: The page faulting specification has also been updated and formalised.
# 2025-01-30 14:28 gav: This release signals the end of the 0.5 series and, potentially (but probably not), the final protocol revision (not including the obvious stuff which we know will need doing before 1.0 such as gas pricing).
(edited)
# 2025-01-30 14:30 gav: 0.6 series should contain primarily cleanups, finesse, formatting, discussion and corrections.
# 2025-01-30 14:30 gav: 0.7 and 0.8 will be any important tweaks or optimisations brought on through service-prototyping and Toaster-testing.
(edited)
# 2025-02-02 11:16 gav: Version 0.6.1 is released.
This is just a few small corrections and a simplification of fetch.
(edited)
# 2025-02-03 09:49 jan: Yes, since X\_4 requires the input to be N\_32 there should be a modulo there, otherwise the result would be undefined.
Side note, since I already had questions regarding this: you can think of every 32-bit instruction variant as _always_ taking a modulo of its inputs, even though sometimes in the GP we omit this modulo to make the equations simpler if the result would be equivalent anyway; for example, in the case of add_32 the result gets truncated anyway, so truncating the inputs is unnecessary. The high-level intent is to allow every 32-bit instruction variant to be implemented as follows (to allow for an efficient recompiler implementation): 1) truncate all inputs to 32-bit, 2) do the operation, 3) sign-extend to 64-bit, so if you find any equation in the GP for 32-bit instruction variants that would have given a result that doesn't match this, please ping me and we'll correct it.
(edited)
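That three-step recipe, sketched for add_32:

```python
MASK32 = 0xFFFF_FFFF

def sext32(x: int) -> int:
    # step 3: sign-extend a 32-bit value to 64 bits (unsigned representation)
    return x if x < 0x8000_0000 else x | 0xFFFF_FFFF_0000_0000

def add_32(a: int, b: int) -> int:
    # steps 1+2: truncate both inputs to 32 bits, then add modulo 2^32
    return sext32(((a & MASK32) + (b & MASK32)) & MASK32)
```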
# 2025-02-03 11:11 emielsebastiaan: Related to an issue we found in testvectors for PVM instruction 206.
Does GP allow for negative output of a modulo operation?
Since this is not explicitly mentioned in section 3, I assume the answer is no.
If the answer is no, then PolkaVM probably has an incorrect implementation of GP.
https://github.com/gavofyork/graypaper/issues/222
# 2025-02-03 12:05 emielsebastiaan: Yes in that case GP should be adjusted to remove any ambiguity.
GP should explicitly state that the modulo operator on a negative number yields a negative number, and not a positive number as expected by 'Maths'.
# 2025-02-04 11:31 davxy: > <@carlos-romano:matrix.org> anyone please? 🙏
I'll take a look. Could you please specify one particular test vector so I can review it directly?
# 2025-02-04 19:28 davxy: The "reports" STF exercised by these vectors has been modified to not change the content of the auth queues. The content of the queues is changed by the "authorizations" test vectors. I'll add a note to the readme to make this explicit
(edited)
# 2025-02-05 07:06 carlos-romano: ahh ok thanks, so then we don't need to check them against the STF test vectors post state right? Thanks a lot 🙏
# 2025-02-04 12:03 gav: > <@emielsebastiaan:matrix.org> Yes in that case GP should be adjusted to remove any ambiguity.
> GP should explicitly state that the modulo operator on a negative number yields a negative number, and not a positive number as expected by 'Maths'.
This is corrected/clarified in main.
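Concretely, the clarified rule is C-style (truncated) remainder, where the sign follows the dividend; note that Python's built-in % follows the divisor instead:

```python
def trunc_rem(a: int, b: int) -> int:
    # C-style (truncated) remainder: quotient rounds toward zero,
    # so the remainder takes the sign of the dividend a.
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return a - b * q
```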
# 2025-02-04 16:28 prematurata: I've a question about 14.13.
bold_l is being constructed by essentially ensuring that all the work-package hashes (the special hash) in the work items' import segments i have a key in bold_l (14.11), and that the "pointing" value in the bold_l dictionary is the **previously** computed "segmentRoot" of the Availability specification (14.13).
- If this is correct then it means guarantors need to maintain a datastore containing a dictionary of previously computed WorkResults, correct?
- I don't see any limitation on imported segments referencing work-package hashes which generated a report on the same core as the one we're trying to compute. This means that a work package p could have work items w whose import-segments reference a work-package hash w which was computed on another core
(edited)
# 2025-02-05 09:05 gav: 1. Correct
2. Nodes need only keep fairly recent WPH->SR history, since it is known that the resultant report must pass the on-chain WPH->SR lookup whose history is limited. Ultimately guarantors are free to ignore work packages whose imports they deem unreasonable, unknowable or unlikely to result in a profitable endeavour. To pass the conformance tests nodes need only be able to make guarantees under reasonable conditions.
(edited)
# 2025-02-05 09:23 prematurata: Thanks for this. Will it ever be specified in graypaper? Especially the "fairly recent" or "reasonable conditions"?
# 2025-02-05 09:24 prematurata: I think the same might be applied on 14.14 which basically is implying that there is a datastore from the MerkleTreeRoot and its "preimage"
# 2025-02-05 09:40 gav: > <@prematurata:matrix.org> Thanks for this. Will it ever be specified in graypaper? Especially the "fairly recent" or "reasonable conditions"?
Guaranteeing is a strategic endeavour, and the Gray Paper generally avoids dictating strategy.
# 2025-02-05 09:43 gav: However the aspect of “fairly recent” is well-specified in terms of on-chain behaviour. See equations 12.4-12.8
# 2025-02-05 09:43 gav: Of course a guarantor would need to guess when their guaranteed work-report would likely make it to accumulation in order to apply this limit to the strategy of guaranteeing.
# 2025-02-05 11:34 gav: But given that guarantors are responsible for the same core for 10 blocks at a time and receive all guarantees across (even across other cores), then it's quite possible to make a pretty decent guess on the quality and size of the core's backlog and how long it might be before the new report would make it on-chain.
(edited)
# 2025-02-05 14:17 prematurata: Thanks gavin. I also have another question about the new fetch.
- Is it intended for μ′\_{o⋅⋅⋅+l} to be updated even in case of a panic? Ex when w9 is 5 and memory between w10 and +32 is readable.
- also what is bold x?
(edited)
# 2025-02-05 14:41 gav: In the case of a panic, memory remains the same (though since panic at the top level is unrecoverable, it doesn’t really matter)
# 2025-02-05 14:43 gav: There is a superfluous condition there - the additional omega_9 term for a panic. This will be removed in the next revision. This will then make it very clear that memory does not change in the case of a panic.
# 2025-02-06 08:42 prematurata: no, I have no idea. my first guess would have been A.7 but according to that it seems bold x woudld then be a set containing numbers \in N_{2^32} and that would make no sense in the fetch definition
# 2025-02-06 11:50 gav: Bold x is any value which satisfies the various conditions on it. You'll find that practically speaking, there's only one which anyone could reasonably know.
(edited)
# 2025-02-06 11:54 gav: It's used specifically for extrinsics, where an extrinsic is specified in the WP as a hash/len. We assert that we know the preimage since we would not get to this part of the guarantee pipeline without knowing.
# 2025-02-06 11:55 gav: Alternatively it could have been made explicit with an additional term being passed in to the Refine function and its context, but that would have just complicated the formulation.
# 2025-02-06 11:56 gav: The GP isn't about describing a node's internal _data-logistics_; that's an implementation-specific consideration. It only concerns outward *behaviour*.
# 2025-02-06 16:02 emielsebastiaan: > <@gav:polkadot.io> @room : v0.6.2 is out - contains all the latest corrections.
3 PRs with pvm instruction modifications are pending review at your convenience.
# 2025-02-06 16:05 gav: Merged one - will wait for Jan Bujak to take a look at the others.
# 2025-02-06 16:09 jan: Yeah, since in the GP we "store" the values unsigned those were definitely missing the conversions back, LGTM.
# 2025-02-07 15:09 gav: In the case of void, the pages must all be accessible because we're making them inaccessible.
# 2025-02-07 15:10 gav: We wouldn't want exactly that condition for zero, whose job is to initialize pages to being accessible and zero.
# 2025-02-07 15:11 gav: We could introduce a condition to require them to be previously inaccessible, but currently we don't. This was intended, but maybe it should be changed if requiring it brings us more performant impls ( Jan Bujak ?)
# 2025-02-07 15:30 jan: The lack of a requirement in zero that the pages are inaccessible is indeed intended, so that e.g. this host call can be used to clear/reinitialize memory without first having to void it nor track what is already allocated. Requiring it wouldn't really change anything performance-wise since you have to (or the OS has to) iterate over the page map to find the holes to fill anyway. (And in practice not requiring this check can make things simpler because you can just ask the OS to zero-allocate an address range in bulk instead of having to do this yourself.)
Hm, but now that I think about it we probably should make void not require the pages to be accessible either, as that could simplify its usage in certain cases (and, same as with zero, the page map has to be iterated over anyway by the implementation, whether it returns an error or not). Basically have it take a range of pages we want it to free, and when it returns it'll guarantee that the whole range is now free. So the HUH error branch could just be deleted altogether (since voiding the first 64k or out-of-bounds would be a no-op, as there can never be anything allocated there anyway).
# 2025-02-09 02:00 ascriv: Is anyone aware of any non-rust libraries which implement bandersnatch vrf signatures? Seems the main implementations are in Rust for now
(edited)
# 2025-02-09 17:08 jay_ztc: Not aware of any other than davxys work on this. Have it on my list to investigate as a potential SPOF for the network.
# 2025-02-09 18:55 davxy: AFAIK, arc-ec-vrfs is currently the only implementation available. This is likely because the scheme is not standardized, thus any existing implementation is likely to be tied to JAM.
For those interested in implementing it, the details of Bandersnatch VRFs are specified here: https://github.com/davxy/bandersnatch-vrfs-spec. Implementing the "plain" VRF is **relatively** straightforward, especially with the support of a "bigint" library, making it a manageable task.
However, the complexity increases significantly when dealing with the ring-VRF variant. Even though it is thoroughly specified here: https://github.com/davxy/ring-proof-spec , implementing it requires a library that supports the underlying SNARK framework it relies on. Since there are no official standards (only de facto ones, at best), you'll likely need to build many things from scratch.
We have been using arkworks for this, as it provides **most** of the tools necessary. That said, some additional components were developed by the W3F team and ourselves to fill in the gaps anyway.
In summary, if someone is inclined to implement any of these components (also by building over arkworks, for example), I'm available to provide some support. Additionally, test vectors are available for validating conformance.
(edited)
# 2025-02-09 19:20 ascriv: Since implementing it from scratch is so challenging, it seems essentially everyone will rely on the one implementation that is available (using something like go’s FFI if they’re not implementing jam in rust), which becomes a redundancy and a failure point. It’s probably well audited and as a single point of failure still probably ok though.
# 2025-02-09 19:21 ascriv: I would love if people reimplement it of their own volition in their chosen language but I think we might need additional incentives if we do think relying on just this one implementation is in fact an issue
# 2025-02-10 18:35 vinsystems: 1.- In the preimages extrinsic, do the service-data pairs have to be ordered _only_ by service id? Or do we also have to consider the order of the data blob?
2.- Is the R function of eq (12.30) the one defined in eq (12.23)?
(edited)
# 2025-02-10 19:42 gav: 1. Both, primarily service ID and then data blob. Same goes for any tuple.
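That rule is plain lexicographic tuple ordering; e.g.:

```python
# Order primarily by service id, then by data blob: exactly the
# lexicographic order Python gives tuples by default.
pairs = [(2, b"\x01"), (1, b"\xff"), (1, b"\x00")]
ordered = sorted(pairs)  # [(1, b"\x00"), (1, b"\xff"), (2, b"\x01")]
```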
# 2025-02-10 19:44 gav: 2. No. R in 12.30 and 12.31 are the same. R in 12.23 and 2.24 are the same. I might rename one of them to avoid the confusion.
# 2025-02-10 19:58 leonidas_m: The GP assumes that we know the preimage data because there's only one value that satisfies the necessary conditions. However, since the Refine function neither allows passing the preimage explicitly as a parameter nor permits querying the state, it's unclear where this data should come from.
Should the preimage be retrieved from a node's internal data store (local database), or is it expected to be fetched from an external source (e.g., network)?
# 2025-02-10 21:23 gav: Validator nodes are expected to be sent solicited preimages directly from external sources.
(edited)
# 2025-02-12 00:39 charliewinston14: Hello I have a few questions I was hoping someone could help with.
1. Is there a formal definition of a work package bundle other than the textual description in 14.4.1? It mentions the attributes but not the data types of each attribute.
2. The extrinsic data in a work package bundle, is it a list of hashes and lengths? Or is this the full preimages themselves?
3. The exported segments in a work package bundle, are they the actual segments of length 4104? Or are they a list of root and indexes?
4. Is the extrinsic that’s passed in CE133 the same as what’s in a work package bundle? If so why is it included in CE133 if it can just be calculated based off the work package by the receiver using X(w) in 14.14
# 2025-02-12 15:06 gav: > <@charliewinston14:matrix.org> Hello I have a few questions I was hoping someone could help with.
>
> 1. Is there a formal definition of a work package bundle other than the textual description in 14.4.1? It mentions the attributes but not the data types of each attribute.
> 2. The extrinsic data in a work package bundle, is it a list of hashes and lengths? Or is this the full preimages themselves?
> 3. The exported segments in a work package bundle, are they the actual segments of length 4104? Or are they a list of root and indexes?
> 4. Is the extrinsic that’s passed in CE133 the same as what’s in a work package bundle? If so why is it included in CE133 if it can just be calculated based off the work package by the receiver using X(w) in 14.14
>
1. It is the second argument to A in 14.15
# 2025-02-12 15:08 gav: 3. A bundle has the actual segments and justifications. See 14.14 S and J.
# 2025-02-12 15:09 gav: 4. 14.14 X assumes knowledge of the relevant hash preimages. The receiver may not have such knowledge therefore it makes sense to provide it. Theoretically we could make it an on-demand thing later.
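Putting these answers together, a bundle could be modeled roughly as below. This is only a sketch of the shape described in the thread (second argument to A in 14.15; actual segments and justifications per 14.14 S and J; full extrinsic preimages); the field names are illustrative, not the GP's:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class WorkPackageBundle:
    # The encoded work package itself.
    work_package: bytes
    # Full extrinsic preimages, not merely (hash, length) pairs.
    extrinsics: List[bytes] = field(default_factory=list)
    # The actual imported segments (14.14 S), not (root, index) pairs.
    segments: List[bytes] = field(default_factory=list)
    # Merkle justifications for each imported segment (14.14 J).
    justifications: List[List[bytes]] = field(default_factory=list)
```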
# 2025-02-14 11:49 sourabhniyogi: How is the strangely odd number 81 in the a_o service storage size formula derived? The 32 within the same formula is the storage for the key, but 81 is 17 more bytes than 64...
(edited)
# 2025-02-14 19:27 ascriv: For initializing the ring context for bandersnatch ring vrf stuff, there must be a common seed we’ll all be using? So that it remains deterministic
# 2025-02-16 18:27 ascriv: I think I have a correction for the state transition dependency graph: lambda prime needs to be added to the inputs for the state transition for the validator statistics, since we need to compute the reporters set which requires G* which requires lambda prime
# 2025-02-17 08:36 celadari: Hi everyone,
I have a question about accumulation regarding the call to the accumulate function.
We call Ψ_A in equation (12.19) [GP version 0.6.1] at the accumulation stage.
I'm following the formalism of the white paper.
Should this function call alter δ, or does it return a new account (like the Ω functions) that needs to be accumulated/saved later in equation (12.21)?
Am I clear? 😅
# 2025-02-17 08:41 gav: The accumulate function is used in the final definition of posterior delta. It is up to implementations to determine at what point they alter any particular internal data structure(s) which may represent delta or some partial/intermediate value of it.
(edited)
# 2025-02-17 08:43 gav: One thing which will be very helpful to know is that no two accumulate functions which both contribute to the same “wave” of accumulations will have contradictory changes.
# 2025-02-18 04:38 sourabhniyogi: We believe the notation ${\bf p}_{\bf c}$ used in Eq B.1 here needs to be adjusted to accommodate service $h$ and authorization code hash $u$ of 14.2 here -- that the preimage $u$ of service $h$ is the authorization code input to $\Psi_M$ of B.2. Can someone confirm this interpretation is correct?
Specifically, the genesis state with bootstrap service 0 will have a null authorizer code hash (say, 0x12344321...) and the very first work package ${\bf p}$ (to create a new service) will have $h=0$ and $u=0x12344321...$ to reference this null authorizer.
Does that make sense?
(edited)
# 2025-02-18 05:07 sourabhniyogi: Sorry -- how does Eq B.1's first input of $\Psi_M$ (which is ${\bf p}_{\bf c}$) get at a work package's authorization code?
(edited)
# 2025-02-18 13:26 leonidas_m: Hey, I've noticed that some PVM tests (eg inst_load_u8_nok, inst_store_u8_trap_inaccessible) charge more than 1 extra gas when a page fault occurs even though only a single instruction is executed. In previous commits, similar tests were removed because the cost model was based on polkaVM and wasn't specified in the GP. Are these tests facing the same issue now or am I misunderstanding something?
# 2025-02-19 07:58 leonidas_m: Looks like the same issue also affects the following test vectors:
inst_store_imm_indirect_u16_with_offset_nok
inst_store_imm_indirect_u32_with_offset_nok
inst_store_imm_indirect_u64_with_offset_nok
inst_store_imm_indirect_u8_with_offset_nok
inst_store_imm_u8_trap_inaccessible
inst_store_indirect_u16_with_offset_nok
inst_store_indirect_u32_with_offset_nok
inst_store_indirect_u64_with_offset_nok
inst_store_indirect_u8_with_offset_nok
# 2025-02-19 12:37 celadari: Hi guys,
Just some question related to safrole:
Equations (6.15), (6.16), (6.17), (6.18), (6.19), (6.20) - GP version 0.6.2 - refer to gamma_s' and eta_3' => so we should update gamma_s and eta_3 first? Therefore, we should compute equations (6.23), (6.24) before checking equations (6.15)-(6.20)?
# 2025-02-19 13:28 oliver.tale-yazdi: I had a similar Q and that is what I ended up doing. It seems that the gamma states cannot be updated all at once anymore but need to be done in two steps
# 2025-02-19 13:33 celadari: Can you elaborate please ? 🙃
Should we update gamma then before making the header check of equation (6.15)-(6.20) ?
# 2025-02-19 14:57 oliver.tale-yazdi: We first do gamma_k & gamma_z, then check entropy marker H_e and then gamma_s & gamma_a.
Not sure if that is optimal, it was just what i coded up first 🤷
# 2025-02-19 14:21 ascriv: That’s how I interpret the equations, since they involve posterior variables, the posterior variables must be computed already
# 2025-02-20 12:50 gav: Its result can easily be fed into the serialization function.
(edited)
# 2025-02-20 14:31 gav: Cool - looks like you’ve already done most of the work for a PR to sort this - want to submit one?
# 2025-02-20 18:09 sourabhniyogi: If someone else agrees with it, we will give it a shot!
(edited)
# 2025-02-22 00:16 charliewinston14: Hello.
I’m having some difficulty understanding the erasure root formula in the GP, specifically calculating “s♣” in 14.16.
Hoping someone can point me in the right direction.
I had no problem calculating “b♣” and have my erasure coding function C and paged-proof generation function P already.
The s♣ formula has C#_6(s ⌢ P(s)), where s is an array and P(s) is an array as well. Does that mean to concatenate them both together and then pass them to the chunking function? I think it's the # that is confusing me, as that normally means apply to each of the sub-items. There is also a # on the binary merkle call, so I'm assuming that I need to call the merkle function multiple times and not just once with the results of the erasure encoding, but I'm not understanding the formula at all. Can someone give me a tip on how to proceed with it?
https://graypaper.fluffylabs.dev/#/5f542d7/1b4c011b5701
# 2025-02-22 01:57 ascriv: Is there a very rough estimate for v1.0.0? Trying to think if I can make milestone 1 by that time :v
# 2025-02-22 02:45 gav: Latest estimate is by end of Q3. But will depend a lot on the outcome of Toaster and initial service development.
# 2025-02-22 03:47 ymcsabo: Hi, in the gray paper section 15.2, it mentions advanced nodes and naive nodes. What are some of the examples of those two types of nodes?
# 2025-02-22 07:24 gav: There is no clear delineation. It’s about *strategy*. Some implementations (or node configurations) may use a more sophisticated strategy for predicting the best work package to execute and guarantee. This will allow their operators to take greater rewards under some circumstances. But again, this is strategy and therefore largely out of scope for the GP.
(edited)
# 2025-02-22 07:25 luke_fishman: I need some clarification regarding the advancement of the instruction counter i in the PVM.
Reading A.1 and A.7, I understand the counter i' will always advance to the next instruction unless the exit reason is panic or halt.
So if we have a program like:
ecalli ..
op1 ...
op2 ...
host call fail => continue from op1
host call succeed => continue from op2 (due to the extra skip in [A.33] (https://graypaper.fluffylabs.dev/#/5f542d7/2b70012b7001))
However, reading the text below A.34, I understand that i' is:
exit reason == continue => i + 1 + skip
out of gas => i
panic or halt => 0
page fault => i
host call => i
but this makes [A.33] (https://graypaper.fluffylabs.dev/#/5f542d7/2b70012b7001) not make sense, as now we will have:
host call fail => counter stays => reinvoke into the failing host call
# 2025-02-22 07:34 gav: If the hostcall succeeds (where you pointed) then i’’ is used as the new instruction counter for the invocation of Phi_H which effectively skips past the ecalli instruction. If the hostcall results in anything other than a continue (the last condition) then Phi_H is not invoked again anyway.
(edited)
# 2025-02-22 07:43 luke_fishman: right. my code does just that. no issue
so let's talk about the case where the host call succeeds
we start with Phi_1 which returned ecalli, and i' = i + 1 + skip (i.e. pointing to the instruction after the ecalli)
and so we don't need to advance again after the host call has succeeded, since we already point to the next instruction
or, Phi_1 should not advance the counter on an ecalli exit reason, and after the host call finishes with success then the counter advances
basically, why does the text below A.34 say that i' points to the host call, when it seems to me from A.7 that it has already progressed beyond it
(edited)
# 2025-02-22 08:06 gav: I see your point, yes. i’’ needs not be defined; i’ should be used instead. Feel free to make a PR if i don’t get to it first.
# 2025-02-22 08:07 luke_fishman: yep. that's what i thought , i'' is not needed
Thank you for confirming
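The conclusion of the thread (i'' is unnecessary; i' already points past the ecalli) can be restated as a toy sketch; all names here are illustrative, not from the GP:

```python
def after_ecalli(i: int, skip: int, host_call_continues: bool):
    # Phi_1 exits on an ecalli with the counter already advanced:
    # i' = i + 1 + skip, i.e. the instruction after the ecalli.
    i_prime = i + 1 + skip
    if host_call_continues:
        # Re-invoke the machine at i' directly; no second advance needed.
        return i_prime
    # Any non-continue host-call result: the machine is not re-invoked.
    return None
```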
# 2025-02-22 11:48 luke_fishman: regarding i in B.9:
i = check((E_4^{-1}(H(E(s, η0′, Ht))) mod (2^32 − 2^9)) + 2^8)
does the decode_4 imply that only the first (last?) 4 bytes of the hash are to be taken?
(edited)
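One possible reading of the formula, sketched below. Treating E_4^{-1} as "decode the first 4 bytes of the hash, little-endian" is exactly the assumption being asked about, blake2b-256 stands in for H, and the GP's `check` (bumping to an unused index) is omitted:

```python
import hashlib

def derive_index(encoded: bytes) -> int:
    # Sketch of B.9: i = check((E_4^-1(H(...)) mod (2^32 - 2^9)) + 2^8).
    # ASSUMPTION under discussion: E_4^-1 reads the FIRST 4 bytes of the
    # 32-byte hash, little-endian.
    h = hashlib.blake2b(encoded, digest_size=32).digest()
    raw = int.from_bytes(h[:4], "little")
    return (raw % (2**32 - 2**9)) + 2**8
```

Whatever the byte-selection convention turns out to be, the result always lands in the range [2^8, 2^32 − 2^8).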
# 2025-02-22 15:19 sourabhniyogi: Question on what the first value of a_t is in new here, which is defined in 9.8:
Assume the preimage of the code is 1149 bytes, and recall GP constants: B_S = 100, B_I = 10, B_L = 1.
What is the value of a_i and a_o and thus a_t:
(1): a_i = 0, a_o = 0 ==> a_t = 100
(2): a_i = 2, a_o = 81 + 1149 = 1230 ==> a_t = 100 + 10 * 2 + 1 * 1230 = 1350
The order of operations in new is not clear, especially with the a_t and l "happening in the same line" here:
https://graypaper.fluffylabs.dev/#/5f542d7/31b90231b902
# 2025-02-22 15:30 gav: a_t is a dependent variable whose value is implied through the (non-negotiable) definition of bold-l, which is fully defined as c and l are both fixed values.
(edited)
# 2025-02-22 15:31 gav: balance is required to be equal to this (dependent) variable. There exists only one solution to this statement.
(edited)
# 2025-02-22 15:33 gav: Ordering is a point of implementation strategy, for implementation languages which require the practitioner to specify it (ie imperative ones).
(edited)
# 2025-02-22 15:36 gav: Plenty of languages, like formal logic, don’t generally insist on specifying a solution in terms of ordered mutations.
# 2025-02-22 15:37 gav: If this is a new concept, I’d suggest reading some undergrad computer science texts such as “Structure and Interpretation of Computer Programs”.
(edited)
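For what it's worth, the dependent-variable reading can be checked numerically with the constants quoted in the question above (a hypothetical sketch, not GP code):

```python
# Threshold balance a_t = B_S + B_I * a_i + B_L * a_o, using the
# constants quoted in the question (B_S = 100, B_I = 10, B_L = 1).
B_S, B_I, B_L = 100, 10, 1

def a_t(a_i: int, a_o: int) -> int:
    # a_t is fully determined once the item count a_i and octet
    # count a_o are fixed; there is nothing to "order".
    return B_S + B_I * a_i + B_L * a_o
```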
# 2025-02-23 03:51 luke_fishman: good morning everyone
very small specific question about encoding
looking at the encoding in the calculation of i in B.9: E(s, η0′, Ht)
for me the symbol E means general encoding (C.6)
But the text under C.6 says
_"Note that at present this is utilized only in encoding the length prefix of variable-length sequences."_
which would imply:
- service index is encoded as 4 bytes (refs 9.1, C.23)
- timeslot is encoded as 4 bytes as well (refs I.1.1, C.16, C.20, C.22)
(edited)
# 2025-02-23 12:04 gav: I generally prefer not using a prime unless the plain (non-prime) term is also used.
# 2025-02-23 18:32 ascriv: I assume that when inspecting memory during the sbrk instruction, this should not cause a memory-access exception, right? Also, is it right to interpret the math as saying “find the earliest inaccessible contiguous memory segment starting at or above h of length wa, and set it as mutable”?
# 2025-02-23 20:49 ascriv: neg_add_imm_64 has a +2^64 but then mods by 2^64 so this addition is the same as +0, so it’s redundant. Unless there’s a typo
# 2025-02-23 21:06 ascriv: rot_r_64_imm performs a left shift as written (ith bit of w’a = i+vx bit of wb), but the name suggests a right shift. Is this correct?
# 2025-02-23 23:10 ascriv: Also, shouldn’t we be doing Z inverse on the result of the max (227) and min (229) instructions? To convert back to unsigned before storing in the register
# 2025-02-24 02:58 charliewinston14: Morning all. Are the "EC shards" referenced in CE137 the same as the "bundle shards" referenced in CE138? What is the difference between these two APIs? Are they essentially the same except CE137 also returns segment shards?
# 2025-02-24 12:03 gav: > <@shwchg:matrix.org> Hi Dr.Wood
>
https://github.com/gavofyork/graypaper/pull/248
> Can we use beta_dagger as the only reference (instead of beta + beta_dagger) for all processes in Section 11?
>
The two should give equal effects in Section 11 as the only difference is the placement of the beefy root.
# 2025-02-24 12:04 gav: > <@ascriv:matrix.org> neg_add_imm_64 has a +2^64 but then mods by 2^64 so this addition is the same as +0, so it’s redundant. Unless there’s a typo
I don’t define negative modulo; this ensures the modulo is positive.
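The point generalizes: in languages where `%` keeps the sign of the dividend, adding 2^64 first keeps the operand non-negative before reducing. A minimal sketch (Python's `%` is already non-negative, so here the +2^64 is a no-op, which is exactly the observation in the question):

```python
U64 = 2**64

def neg_add_imm_64(w_b: int, v_x: int) -> int:
    # Computes v_x - w_b as an unsigned 64-bit value, written as
    # (v_x + 2^64 - w_b) mod 2^64. The +2^64 keeps the left operand
    # non-negative, since the GP does not define negative modulo.
    return (v_x + U64 - w_b) % U64
```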
# 2025-02-24 12:07 gav: It is correct. Check the definition of calligraphic B.
(edited)
# 2025-02-24 12:08 gav: > <@ascriv:matrix.org> Also, shouldn’t we be doing Z inverse on the result of the max (227) and min (229) instructions? To convert back to unsigned before storing in the register
Yes. If you’re in the mood feel free to submit a PR.
# 2025-02-24 13:23 0xjunha: > <@ascriv:matrix.org> Also, shouldn’t we be doing Z inverse on the result of the max (227) and min (229) instructions? To convert back to unsigned before storing in the register
Actually this change is merged into main - probably will be included in the next release?
https://github.com/gavofyork/graypaper/pull/228
# 2025-02-25 15:20 jay_ztc: Hi team 👋 Quick question about the Merkle function in D.6-> the branch splitting conditional implies that the key should be left shifted one bit before each recursion. Am I interpreting this correctly? The tests & gh consensus suggests that the keys shouldn't be rotated at each recursion, but rather that the $d'th bit should be used in the splitting conditional at recursion depth $d. I'm happy to open a PR to the GP repo if appropriate.
https://graypaper.fluffylabs.dev/#/5f542d7/391b01391c01
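The consensus reading described in the question (test bit d of the unshifted key at recursion depth d) can be sketched as follows; the MSB-first byte/bit ordering within the key is an assumption of this sketch, not taken from the chat:

```python
def split_at_depth(kvs, depth):
    # Partition key-value pairs on bit `depth` of each key. Keys are
    # never shifted or rotated between recursion levels; only the bit
    # index changes as the recursion descends.
    left, right = [], []
    for key, value in kvs:
        bit = (key[depth // 8] >> (7 - depth % 8)) & 1  # MSB-first
        (right if bit else left).append((key, value))
    return left, right
```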
# 2025-02-25 16:23 sourabhniyogi: In order to compare large amounts of PVM traces between teams precisely, I would like a formula to hash the registers with the PVM paged memory for teams to know they ended up with same answer at the end, and if they did not, be able to quickly determine which line in some PVM trace of PC they differed in results.
It's not hard to come up with a procedure, but does a ready-made answer exist within the PolkaVM repo, or is there some public algorithm for this kind of operation so we don't reinvent the wheel needlessly?
(edited)
# 2025-02-25 16:53 jaymansfield: Hey! Hoping to get a clarification on the justifications for CE-138 (audit shard request). It mentions "The assurer should construct this by appending the corresponding segment shard root to the justification received via CE 137.".
What is the segment shard root corresponding to, exactly, when it's a request about a work package shard?
(edited)
# 2025-02-25 21:26 sourabhniyogi: The Parity Service trait definition for accumulate here returns an Option<Hash>
fn accumulate(_slot: Slot, _id: ServiceId, items: Vec<AccumulateItem>) -> Option<Hash>
but it appears there are TWO ways to provide a Some for accumulate:
(1) if $\omega_8=32$, then the B.12 ${\bf o} \in \mathbb{H}$ condition applies
(2) the yield host function
As B.12 is written, (1) takes precedence over (2), but the new (2) yield is a cleaner solution; otherwise why was it added? Now that it's been added, I believe we don't need (1). Having both is not needed since whatever 32-byte optional yield could go through output (1) OR (2), so perhaps we can eliminate (1).
Nitpick check: is $\omega_8 = 32$ a sufficient criterion for (1) to take precedence over (2)? What if $\omega_8 > 32$? What if $\omega_8 < 32$?
Related nitpick check: is there a way to change the C notation in eq 4.7 vs 4.17 to eliminate the appearance of dependency loops? We are pretty sure the C in 4.7 is from the previous state's 4.17 but seek confirmation.
(edited)
# 2025-02-26 01:53 ascriv: Should (A.43) have x’ instead of x? For clarity that it’s the x after the host call
# 2025-02-26 01:55 ascriv: And should the type of the gas in the return for (A.42) be signed (Zg) to handle e.g. when the host call returns out of gas?
# 2025-02-26 13:44 jaymansfield: Hey! Question about the state transition dependency graph 4.2.1. Should the calculation of β′ be moved further down since it depends on the commitment map C which doesn't exist yet, or does it use the commitment map from the previous block?
# 2025-02-26 13:54 gav: > <@sourabhniyogi:matrix.org> In order to compare large amounts of PVM traces between teams precisely, I would like a formula to hash the registers with the PVM paged memory for teams to know they ended up with same answer at the end, and if they did not, be able to quickly determine which line in some PVM trace of PC they differed in results.
>
> Its not hard to come up with a procedure, but does a ready made answer exist within the PolkaVM repo or is there some public algorithm to do this kind of operation so we don't reinvent the wheel needlessly?
No, there’s no canonical PVM state serialization. Registers are trivial, but for memory we would need to have some definition on how to encode the pages and their accessibility.
# 2025-02-26 19:16 sourabhniyogi: https://hackmd.io/@sourabhniyogi/pvmhash is a first try; hopefully a couple of us will try to converge on something as we get our host function implementations and PVM interpreter implementations correct.
I am wondering why R/W/0 page accessibility came to your mind right away (as opposed to the X/Y contexts, which have most of the immediate debugging problems) -- I must be missing something, since any discrepancy in internal representations of page accessibility would be visible via some load/store instruction's effect (or lack thereof) on a particular page -- encoding this page accessibility would let 2 teams reason about the contents of the memory after they saw the memory affected/not affected based on R/W/0 page accessibility bits -- if you anticipate this to be quite important early, I would like to put it in early in a "v1" (like in a page) -- is it?
Related question maybe: Was the W_G=4104 segment size chosen for segments to match a 4096-byte page size plus a specific 8-byte encoding of the page -- a page number and some specific metadata, specifically the accessibility bits? If so, we might as well get the "dump a page" to map into the 4104 encoding imagined for the CoreVM service with the desired 8-byte encoding for pages, maybe?
(edited)
# 2025-02-27 04:59 gav: The context is not PVM state - the whole point is that it's external.
(edited)
# 2025-02-27 05:00 gav: Of course you'll likely still want to test it, but I don't think there's any reason to check it before the context is collapsed into the final result from Phi_A
(edited)
# 2025-02-27 10:59 sourabhniyogi: Then the v2 "hash" intends to capture the PVM state AND both contexts so as to support debugging of incorrect host function implementations. What should this be called?
Since a Phi_A result may have many host function calls to get at its result (with many intermediate X + Y contexts), we do have a reason to check this v2 "hash", to see if an intermediate value of the v2 "hash" from one implementation matches another after some of those host function calls [which affect the X (or Y) context] complete, one (or maybe both) of which is incorrect. Does that make sense?
(edited)
# 2025-02-26 14:02 gav: > <@sourabhniyogi:matrix.org> The Parity Service trait definition for accumulate here returns an Option<Hash>
> fn accumulate(_slot: Slot, _id: ServiceId, items: Vec<AccumulateItem>) -> Option<Hash>
> but it appears there are TWO ways to provide a Some for accumulate:
> (1) if $\omega_8=32$, then the B.12 ${\bf o} \in \mathbb{H}$ condition applies
> (2) the yield host function
> As B.12 is written, (1) takes precedence over (2), but the new (2) yield is a cleaner solution otherwise why was it added? Now that its been added, I believe we don't need (1). Having both is not needed since whatever 32-byte optional yield could go through output (1) OR (2), so perhaps we can eliminate (1).
> Nitpick check: is $omega8 = 32$ a sufficient criteria for (1) to take precedence over (2) ? What if $omega8 > 32$? What if $omega8 < 32$?
>
> Related nitpick check: is there a way to change the C notation in eq 4.7 vs 4.17 to eliminate the appearance of dependency loops. We are pretty sure the C in 4.7 is from the previous states 4.17 but seek confirmation?
We can consider removing (1) at a later stage. Host calls are not especially cheap and returning data is a more natural pattern than relying on the side-effect of a host call.
# 2025-02-27 05:03 gav: > <@jaymansfield:matrix.org> Hey! Question about the state transition dependency graph 4.2.1. Should the calculation of β′ be moved further down since it depends on the commitment map C which doesn't exist yet, or does it use the commitment map from the previous block?
I've kept the (different variations of the) state components together rather than trying to keep any "execution order". Indeed, rather the point of this dependency graph is to demonstrate that there exists no specific order, since it's a partially parallel rather than a fully serial system.
# 2025-02-27 08:28 dakkk: > <@gav:polkadot.io> I've kept the (different variations of the) state components together rather than trying to keep any "execution order". Indeed, rather the point of this dependency graph is to demonstrate that there exists no specific order, since it's a partially parallel rather than a fully serial system.
sourabhniyogi: I think this answered your doubt. C used by beta' is the result from accumulation process
# 2025-02-27 05:09 gav: > Nitpick check: is $\omega_8 = 32$ a sufficient criterion for (1) to take precedence over (2)? What if $\omega_8 > 32$? What if $\omega_8 < 32$?
Not sure what you mean by $\omega_8$, but assuming you mean selecting between the latter 2 variants of B.12, then it's pretty clear: you use the returned value IFF it is a 32-byte sequence (blackboard H). If it's anything other than this (e.g. a 31-byte sequence or a 33-byte sequence), then you fall back to the _otherwise_ condition of using the (success) context's yield.
(edited)
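The B.12 selection rule as stated in this answer can be sketched directly (a minimal illustration, with hypothetical function and argument names):

```python
from typing import Optional

def accumulate_output(returned: Optional[bytes],
                      context_yield: Optional[bytes]) -> Optional[bytes]:
    # Use the value returned from the VM IFF it is exactly a 32-byte
    # sequence; anything else (None, 31 bytes, 33 bytes, ...) falls
    # through to the (success) context's yield.
    if returned is not None and len(returned) == 32:
        return returned
    return context_yield
```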
# 2025-02-28 03:17 ascriv: Thanks. Should the returned gas value in (A.34) be signed?
# 2025-02-28 04:09 jan: I'm confused. There's nothing wrong with the branch_eq instruction at pc 466 in rv64ui_add test and it certainly doesn't branch into the middle of a basic block?
448: 01 fallthrough
: @20
449: 33 00 0d r0 = 0xd
452: 33 01 0b r1 = 0xb
455: c8 10 0b r11 = r0 + r1
458: 64 b3 r3 = r11
460: 95 aa 01 r10 = r10 + 0x1
463: 33 02 02 r2 = 0x2
466: ab 2a ef jump 449 if r10 != r2
# 2025-02-28 04:21 jay_ztc: You're right, this is my mistake... embarrassingly small bug on my end, should have reviewed the target more thoroughly before posting as well (different debugging scope & missed that fallthrough is a bb terminator) 🤦♂️ Thanks for your response.
(edited)
# 2025-03-01 13:47 sourabhniyogi: Can a parent VM determine which PVM memory pages have been modified by an "invoked" child VM?
CoreVM-type services aim to extend JAM computation across multiple work packages by exporting PVM memory pages as segments at the end of a task and retrieving them at the start of the next.
While the memcpy/memset host call can support data transfer between parent and child VMs (an imported segment copied into some child VM page index), the parent needs a way to identify all mutated memory pages—not just those explicitly copied via memcpy—to ensure they are correctly included in the exported segments.
One approach could be extending M to track modified page indexes and providing an explicit mechanism for the parent VM to access this set, supporting efficient page/segment export.
Is this a good idea or is there a better approach?
# 2025-03-02 04:27 gav: > <@ascriv:matrix.org> Thanks. Should the returned gas value in (A.34) be signed?
Yes. Already merged.
# 2025-03-02 04:49 ascriv: A lot of the host functions (eg solicit, forget, yield, etc) use Z_o…+32 (for example). I think these can never be negative since they’re register values? So N is a bit more clear
# 2025-03-02 05:08 gav: > <@ascriv:matrix.org> A lot of the host functions (eg solicit, forget, yield, etc) use Z_o…+32 (for example). I think these can never be negative since they’re register values? So N is a bit more clear
Yes, feel free to make a PR.
# 2025-03-02 14:23 ascriv: Is there always an implicit mod 2^32 when inspecting or mutating the ram given a register (64 bit unsigned)? we do this a lot and wondering how it should be handled
# 2025-03-02 22:01 ascriv: According to the GP, we only handle memory access/modification exceptions in the single-step function, but presumably we'd want to catch such exceptions in, e.g., host calls as well, no?
# 2025-03-02 22:17 jay_ztc: I believe a mem fault in a nested PVM is returned to the parent PVM instance for the PVM program to handle; someone correct me if I'm wrong here
# 2025-03-02 22:24 ascriv: Yes, but if for example the 8th register is not a valid memory location, then when deserializing to construct g, we should fault, right? Before the pvm function is called
# 2025-03-02 22:29 jay_ztc: Good question. I would assume write to that location from an 'outer-shell' implementation perspective, and if the pvm program attempts to access it, then it would trigger a page fault. Interested to hear other folks thoughts on this though.
(edited)
# 2025-03-03 02:45 gav: > <@ascriv:matrix.org> Is there always an implicit mod 2^32 when inspecting or mutating the ram given a register (64 bit unsigned)? we do this a lot and wondering how it should be handled
No. Everything is explicit.
# 2025-03-03 02:46 gav: They are handled explicitly. If you believe there is an instance where unarmed memory may be addressed, please report.
(edited)
# 2025-03-03 02:49 gav: > <@danicuki:matrix.org> I have a doubt about the work package execution formula: https://graypaper.fluffylabs.dev/#/5f542d7/1a48021a5d02
>
> it says that I(p,j) = (r, e) when |e| = we, but what if r is an error and |e| = we? Shouldn't it be
>
> - (r, [G0, G0, ...]) if r not binary, first
> - (r, e) if |e| = we, second
>
> ?
No. GP is correct.
# 2025-03-03 02:50 gav: > <@ascriv:matrix.org> Yes, but if for example the 8th register is not a valid memory location, then when deserializing to construct g, we should fault, right? Before the pvm function is called
See the second line there. The range beginning with o is ensured to be in the set of valid memory addresses.
# 2025-03-03 04:53 ascriv: In export, since we’re reading indices p…+z wrapped, should we also be checking if Np…+z (mod ram size) is in Vu ?
# 2025-03-03 08:11 0xjunha: I have several questions/comments regarding the historical lookup and related constants:
1. The constant D has been updated from 28,800 slots (48 hrs) to 4,800 slots (8 hrs). I opened a PR to update it in appendix I too: https://github.com/gavofyork/graypaper/pull/260
2. Probably the constant value L should be reduced too? D seems to be introduced to prevent a preimage datum from being removed while it could still be referenced during auditing, so D should be larger than L. However, the current value of L is 14,400 (24 hrs): (https://graypaper.fluffylabs.dev/#/5f542d7/417000417000 and https://graypaper.fluffylabs.dev/#/5f542d7/0c9f000c9f00) which hasn't been updated since the initial commit, while D was updated as mentioned above.
3. https://graypaper.fluffylabs.dev/#/5f542d7/113b00113b00 Regarding the brief definition of the historical lookup function, should the constant C_D be D instead? I wonder if this is a typo or I'm missing something. Also, while the function is designed to be called off-chain, should we interpret the H_t here as "The timeslot index of the last finalized block header that an auditor sees at the point of auditing"?
# 2025-03-03 09:03 gav: 1./2. Yes there's an issue for this now; the provided PR may not be quite right.
(edited)
# 2025-03-03 09:04 gav: 3. Yes indeed, C_D should be D; will be fixed in 0.6.4.
(edited)
# 2025-03-03 09:06 gav: Big-Lambda is defined fully at 9.7 and this proper definition does not use H_t.
# 2025-03-03 09:07 gav: 9.5 is only provided to help the reader understand what problem the function is attempting to solve.
(edited)
# 2025-03-03 09:10 gav: In addition to many corrections and clarifications, there are several small but important functional alterations; pay attention to the first 6 items in the changelog.
(edited)
# 2025-03-03 11:17 yu2c: Does anyone know why the PDF file for release v0.6.3 is 111 MB, while the previous version v0.6.2 was only 4.22 MB? 🧐
# 2025-03-03 14:19 gav: > <@yu2c:matrix.org> Does anyone know why the PDF file for release v0.6.3 is 111 MB, while the previous version v0.6.2 was only 4.22 MB? 🧐
Good point!:) I’ll look into getting a smaller rendering.
# 2025-03-03 16:58 yu2c: And a small suggestion: add $\mathbb{N}_{R}$, defined in (4.23), to Appendix I / Sets / Regular Notions
# 2025-03-03 19:34 gav: > <@yu2c:matrix.org> And small suggestion: Add $\mathbb{N}_{R}$ defined in
(4.23) in the Appendix I / Sets / Regular Notions
PR?:)
# 2025-03-03 20:50 danicuki: ΨR call here passes 11 parameters: (wc,wg ,ws,h,wy ,px, pa,o,S(w,l),X(w),ℓ) (
https://graypaper.fluffylabs.dev/#/85129da/1a9b021aad02?v=0.6.3)
But ΨR defined still has only 5 (i,p,o,i,ς):
https://graypaper.fluffylabs.dev/#/85129da/2d65002d9300?v=0.6.3
# 2025-03-03 21:12 ascriv: > <@gav:polkadot.io> Is there any particular reason you think we should be more lenient?
In poke for example, we read from the ram with wrapping, which means we want to allow for the case where z > ram size, in which case we wrap back to 0. But in that case, Ns…z will not be in Vu, so we will panic
# 2025-03-03 21:12 ascriv: If we don’t want to handle the case where z > ram size, then we are wrapping unnecessarily
# 2025-03-04 05:26 gav: > <@danicuki:matrix.org> ΨR call here passes 11 parameters: (wc,wg ,ws,h,wy ,px, pa,o,S(w,l),X(w),ℓ) (
https://graypaper.fluffylabs.dev/#/85129da/1a9b021aad02?v=0.6.3)
>
> But ΨR defined still has only 5 (i,p,o,i,ς):
https://graypaper.fluffylabs.dev/#/85129da/2d65002d9300?v=0.6.3
Yes indeed:
https://github.com/gavofyork/graypaper/pull/273
# 2025-03-04 05:29 gav: As I've said countless times now, the GP does not dictate the specifics of data logistics, only observable behaviour. fetch requires implementations to return the correct extrinsic data by virtue of the constraints placed on the return value from fetch. How it gets the extrinsic data is entirely implementation-specific and left as an exercise for the reader. Probably it will be supplied along with the rest of the WP by the builder, but the GP does not define this since it is not *observable behaviour*.
(edited)
# 2025-03-04 05:32 gav: Your question betrays a presupposition that the formalisms in the GP are 1:1 mappable to some implementation code. That may sometimes be the case but certainly not always.
(edited)
# 2025-03-04 05:33 gav: e.g. We are able to use formalisms in mathematics such as let H(return_value) = input_value. This does not map 1:1 with (procedural) code, because it would imply the ability to make a reverse hash, for which no general solution is known.
(edited)
# 2025-03-04 05:36 gav: It works in the GP because it is describing
what we wish to see, not
how it must be delivered.
# 2025-03-04 05:36 gav: In the case of implementing the "impossible" reverse hash function, it is fine because implementations are allowed to "cheat" and use external knowledge (such as DA contents or data arriving over the network) in order to arrive at the answer.
# 2025-03-04 05:40 gav: Implementing the GP is a puzzle and intentionally so. There may be different ways of solving the puzzle. This diversity can help deliver a resilient, even anti-fragile, network. Don't expect a perfectly described path to implementation. You'll need to use your brain.
# 2025-03-04 05:41 gav: Ahh, in fact the wrapping is superfluous there; I'll accept a PR which removes it from the host functions.
(edited)
# 2025-03-04 05:43 gav: > <@yu2c:matrix.org> And small suggestion: Add $\mathbb{N}_{R}$ defined in
(4.23) in the Appendix I / Sets / Regular Notions
PR welcome:)
# 2025-03-04 09:19 clearloop: hi there, I'm currently confused about the fallback keys (sealing) in grandpa:
while blocks sealed with fallback keys are placeholders for contingency,
https://graypaper.fluffylabs.dev/#/85129da/1fc0001fc400?v=0.6.3 seems to mean that headers with fallback keys are not eligible to be selected as best headers, which would make blocks with fallback keys meaningless. If I'm not mistaken, blocks with fallback keys should be selected as the best headers when there are no ticket-sealed blocks at the end of slots; plz correct me if I'm wrong 🙏
(edited)
# 2025-03-04 13:20 gav: Such blocks may be the only way of extending the chain and getting to the point of having regular ticketed blocks which increase the best chain score.
(edited)
# 2025-03-05 01:34 ascriv: In New, should we panic if the 8th register is not in N-2^32? Because then (c,l) will not be a valid key for the l component of the new service account
# 2025-03-05 02:40 clearloop: ~~so the case of finalizing headers with fallback keys is: we just confirmed a ticketed block is on the best chain, and we are requesting the ancestors of the best head ( blocks with the fallback keys in the headers could be here ) ?~~
clear about it now: the best chain is the one voted by blocks with the most ancestor blocks, and headers with fallback keys can be part of them
(edited)
# 2025-03-05 02:44 ascriv: > <@ascriv:matrix.org> In New, should we panic if the 8th register is not in N-2^32? Because then (c,l) will not be a valid key for the l component of the new service account
Also in new, a is missing the p component which should presumably be {}
# 2025-03-06 05:05 gav: Well, it’s not that it’s not “valid” (don’t forget we’re not using typed logic here, just basic set formalisms). Rather that there can be early certainty that if l is not in N_2^32, then there can be no item (c, l) in the set (H, N_2^32). This implies no key can exist.
(edited)
# 2025-03-06 15:24 dakkk: gav: in your latest lecture in Taipei about JAM, you said that your implementation achieved ~**% of native code speed; would it be possible to have the program you used for this benchmark? I'd love to test jampy PVM against different improvements I have up my sleeve
# 2025-03-06 16:00 dakkk: Jan Bujak: I'm calling
bash guest-programs/build-benchmarks.sh
in order to create pvm and native binaries: it correctly creates PVM binaries, but the x86_64 contains only shared objects. Am I missing a step?
# 2025-03-06 17:28 dakkk: I'm giving up since generated files are elf or .polkavm format, and I do not want to read other implementation's code. I'll write my own benchmark program
# 2025-03-07 20:23 ascriv: For some cases where the lhs and rhs types are different, it’s clear what to do; for example, in set inclusion we can safely interpret it as evaluating to false. But in other cases where the lhs and rhs are different types, like the building of the set that is the new service account, it becomes too ambiguous. One interpretation is that the l component in the new service account should be empty. Another is that a should be \error . I’ve made a PR with the second interpretation
http://github.com/gavofyork/graypaper/pull/279 (edited)
# 2025-03-08 08:15 gav: you've highlighted regular l, when the preimage dictionary is bold-l.
# 2025-03-08 08:18 gav: No idea what you're talking about. s\* and s are both in
\N_S
. the comparison is just a simple numeric one. If you work the logic through, it would be taken if
omega_7 in { s, 2^64-1 }
.
(edited)
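The branch condition gav states can be written down directly (omega_7 and s treated as plain integers; a sketch of the stated logic, not implementation guidance):

```python
U64_MAX = 2**64 - 1

def branch_taken(omega_7: int, s: int) -> bool:
    # Taken iff omega_7 is the service index s itself or 2^64 - 1,
    # exactly as stated above; both values live in N_S as simple numbers.
    return omega_7 in (s, U64_MAX)
```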
# 2025-03-08 14:45 jay_ztc: Thanks for the response! I'm not sure I'm following though? 't' is referencing an account, and I only see 'l' used in the accounts context when referring to the lookup dict? It would make sense if the metadata returned by the info host call contains some metadata about preimages, maybe just the keys even -> but I'm assuming that would be represented as K{t}_l in the spec if that were the case...
# 2025-03-08 15:02 jay_ztc: Thanks for clarifying! Wasn't sure if an account being able to selectively target its
s vs.
d context was a valid use case. Also was getting hung up on this being a more verbose way to get to 'a' than in the previous host call (lookup) -> but I see the intermediate value used when querying the storage, so the verbosity makes sense there.
# 2025-03-08 20:35 jaymansfield: Hello. For the bless and assign host calls, what should happen if a non-privileged service tries to execute it? Panic or HUH? The GP doesn’t really specify.
(edited)
# 2025-03-10 13:25 jay_ztc: Happy Monday folks. In the 'new' host function-> the GP doesn't specify what should happen if the preimage blob length is above the valid range required by the account spec (N_l = 2^32). There is a related case where this type of behavior is specified-> In the bless host call the GP specifies a 'WHO' exit when the values aren't within the valid N_s range. Curious to hear what the guidance is here, happy to open a PR if needed.
https://graypaper.fluffylabs.dev/#/5f542d7/31db0031e600?v=0.6.2
# 2025-03-11 17:55 gav: Exactly what the GP specifies. It returns OK but has no effect on the system's state.
(edited)
# 2025-03-11 17:59 gav: Ahh yes,
t_l
should actually be
t_o
. Feel free to post an issue or make a PR:)
(edited)
# 2025-03-11 18:02 gav: This was previously reported and is addressed in
main
branch.
# 2025-03-11 18:11 jaymansfield: > <@gav:polkadot.io> Exactly what the GP specifies. It returns OK but has no effect on the system's state.
Thanks!
# 2025-03-11 18:13 ascriv: > <@jay_ztc:matrix.org> Happy Monday folks. In the 'new' host function-> the GP doesn't specify what should happen if the preimage blob length is above the valid range required by the account spec (N_l = 2^32). There is a related case where this type of behavior is specified-> In the bless host call the GP specifies a 'WHO' exit when the values aren't within the valid N_s range. Curious to hear what the guidance is here, happy to open a PR if needed.
>
>
https://graypaper.fluffylabs.dev/#/5f542d7/31db0031e600?v=0.6.2
I made a change which results in a panic if that happens
# 2025-03-11 22:05 ascriv: I see in the on transfer and accumulate invocations we check if a service account’s code hash is “without value” but according to the service account type the code hash must have a value. Is the service account type wrong or are these checks wrong?
(edited)
# 2025-03-12 12:19 ascriv: > <@gav:polkadot.io> what makes you think that the code hash must have a value?
In 9.3 the code hash is of type H, I think if we want it to be able to be valueless it should be of type “H?” ?
# 2025-03-12 15:41 gav: ahh right, sure the hash is non-empty. but there may not be a preimage.
(edited)
# 2025-03-12 15:43 gav: Yes indeed. I suspect this may have been incorrectly corrected recently.
# 2025-03-12 15:50 ascriv: > <@gav:polkadot.io> regular c and bold c are not the same.
Yep, misread as regular c. Thanks
# 2025-03-12 16:49 subotic: Sure: should C return 0 in case of negative gas, or should it change to Z_G? If it changes to Z_G, then this will also spill over into Psi_A and the accumulation chapter, where N_G is expected.
# 2025-03-13 12:17 gav: Yes, in fact it was confusing the gas remaining (which is the PVM gas counter and can be negative in the case of an underrun) with gas used (which is the value used by the higher level accumulation functions and cannot be negative). Should make more sense with
https://github.com/gavofyork/graypaper/pull/288. (edited)
# 2025-03-15 19:11 ascriv: In the outer accumulation function, is it intended that the free accumulation services dict be zeroed out after the first iteration? Couldn’t there be work reports in the remaining ones which was made by a free accumulation service?
# 2025-03-17 05:22 0xjunha: My understanding is that the always-accumulate services should be processed first, followed by other services. And there is enough gas to run all the always-accumulate services in the initial round, so they would never end up being accumulated later than the initial round.
(edited)
# 2025-03-17 04:40 clearloop: hey teams, curious about your implementations: after genesis, can you ensure there will be no empty slots in the local testnet? e.g. there are valid blocks in each of the time slots and all of them get fully finalized by all of the nodes within 6 seconds
(edited)
# 2025-03-17 10:04 prematurata: I think I've found something interesting when handling the
ecalli
pvm fn. According to the gp
ΨH|A.34
calls
Ψ()|A.1
and handles the
ε′ = hxh
.
The
ecalli
fn (
https://graypaper.fluffylabs.dev/#/85129da/25ff0025ff00?v=0.6.3 ) is set to modify just
ε
leaving the default
ı′ = ı + 1 + skip(ı)
defined in A.7 in place.
Then the host call gets executed inside
ΨH|A.34
. If all goes ok and
f
returns (▸, ....) then we should recursively call
ΨH
but with
ı′′
which is defined as
ı′′ = ı′ + 1 + skip(ı′)
. Now since
i'
is what is being returned by
Ψ
and is already skipping an instruction, then
ΨH
gets called by skipping another instruction (due to the 2x skips being applied).
Now I think this is either a misinterpretation of mine or an error in the graypaper, as from my point of view it makes no sense to skip an instruction after a successful hostcall execution via ecalli. Multiple implementations seem to agree with this, considering there are multiple implementors passing the duna testnet.
(edited)
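The double-advance concern can be modelled with a toy step function. The skip encoding below is a made-up placeholder (the real skip is defined in GP A.7): Psi already returns ı′ pointing past the ecalli, so advancing again before re-entering would jump over the instruction that follows.

```python
def skip(i: int, code: bytes) -> int:
    """Placeholder operand-length rule for illustration only:
    high nibble of the opcode byte = number of operand bytes."""
    return code[i] >> 4

def step(i: int, code: bytes) -> int:
    # i' = i + 1 + skip(i), as in GP A.7
    return i + 1 + skip(i, code)

# Toy program: "ecalli" at offset 0 with one operand byte,
# so the next instruction sits at offset 2.
code = bytes([0x10, 0x00, 0x20, 0x00, 0x00])
i_after_psi = step(0, code)         # 2: Psi has already advanced past ecalli
i_double = step(i_after_psi, code)  # 5: advancing again skips offset 2
```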
# 2025-03-17 14:51 prematurata: yes. i didnt see this issue daaamn. but essentially yeah its the same as i am reporting
(edited)
# 2025-03-17 19:54 jaymansfield: Hey, I have a question relating to guaranteeing/auditing timing. After a work package is guaranteed and included on-chain, assurers will normally start assuring its availability in the next slot. Is there any reason to wait for these assurances to be posted, or can the auditors (if they are assigned to that specific core) just immediately request shards from the original guarantors? My implementation currently doesn't wait for the assurances and just wanted to see if thats okay, or if i should change the timing around. What are the best practices here?
(edited)
# 2025-03-18 02:30 shwchg:
https://graypaper.fluffylabs.dev/#/85129da/1b6d001b8800?v=0.6.3
Is equation (14.14) using J_x as defined in (E.5) or the PagedProof P described in (14.10)?
From the preceding equations, it appears to use J_x, but the paragraph stating that “such a vast amount of data is not generally needed as the justification can be derived through a single PagedProof” suggests that using P could also be reasonable. Could we get a clarification on this?
# 2025-03-18 17:33 vinsystems: What does
t_o
mean in the info host\_call function?
In the graypaper reader version 3 March 2025 this term is
t_l
, but in the latest version of the GP 0.6.3 March 13 2025 this term is
t_o
.
(edited)
# 2025-03-18 18:26 gav: Not sure what you mean: see section 17.3; audit selection happens on all work-reports "pending which have just become available".
# 2025-03-18 18:26 gav: So you'll never be self-selecting to audit a work-report until you, at least, believe it is available.
# 2025-03-18 18:30 gav: 14.14 defines regular-J, using caligraphic-J (subscripted with 0), itself defined in E.5.
(edited)
# 2025-03-18 18:33 gav: The formulation of 14.14 does not use the paged-proof(regular-P, 14.10) formulation at all.
(edited)
# 2025-03-18 18:34 gav: Now, please remember as I've now said countless times, the Gray Paper defines _observable behaviour_. It doesn't necessarily tell you how to create that behaviour.
(edited)
# 2025-03-18 18:36 gav: In this case, as the text states, you'll need to combine this formulation with the page-proof formulation and see that you can create the correct behaviour without requiring all of the data in the segment tree.
(edited)
# 2025-03-18 18:38 gav: That's indeed the whole point of the paged-proofs. They're a very efficient means of storing proofs across nodes for showing the correctness of exported data segments when the time comes to import them.
(edited)
# 2025-03-18 18:42 gav: So you'll need to study (14.10) and understand that for any export, a relevant proof-page can be fetched from the Segments DA and information from within the two parts of it can be (quite easily) selected and combined in order to produce a full import-justification of the form needed by (14.14 J).
(edited)
# 2025-03-18 19:25 gav: Activity statistics should help us understand what our JAM instances are actually doing and would be a nice thing to monitor and visualise on a web/cli tool if anyone's making any.
# 2025-03-18 21:11 sourabhniyogi: Is it reasonable to use
CE139 to fetch the relevant-proof page (or rather shards of the proof page, so as to reconstruct the proof page) from "Segments DA", where the
Segment Index
of CE139 is _greater_ than the export count?
(edited)
# 2025-03-18 21:26 eclesiomelo: hey guys! I have a question regards the test service in the Accumulation test vector, we have tried to parse it using the definition A.37. I would like to confirm that this PVM blob is formatted according to this definition and not simply A.2 (deblob), is this correct?
# 2025-03-19 14:44 eclesiomelo: okay, so given the definition A.37, the first 3 bytes encode |o|; however, the first 3 bytes in the test service are 0x47000c, which decodes to 786503 in decimal, greater than the test service's total byte count
(edited)
# 2025-03-19 15:00 eclesiomelo: I am using the inverse of the integer encoding, definition C.5, where I pass [47, 00, 0c] and get 786503 as output, so I think I am missing something.
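For what it's worth, the number in the question is exactly what a plain little-endian 24-bit read of those bytes gives (this doesn't settle which decoder A.37 intends, only that the arithmetic is self-consistent):

```python
raw = bytes([0x47, 0x00, 0x0c])
value = int.from_bytes(raw, "little")  # 0x0c0047 = 12*65536 + 71
```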
# 2025-03-19 10:01 gav: W* is the
accumulatable WRs; we don't accumulate all of them if the gas limit doesn't allow.
# 2025-03-19 10:03 gav: As for always-accumulate services, see (12.20) - there's always enough gas to accumulate those in addition to the usual amount for each core.
(edited)
# 2025-03-19 10:08 gav: There's an additional note placed on G_T in definitions advising to account for the always-accumulate services, but because of (12.20) G_T formally only places a lower-limit on the amount of gas used; if there's more always-accumulate gas than would fit into G_T, we run "over" G_T and honour the always-accumulate gas.
# 2025-03-19 11:29 boymaas: Thank you, gav. I am going to take another look later today. I was aware of 12.20 with the addition of the always-accumulate services. I was imagining a scenario where we could get a set of accumulatable WRs whose total gas consumption would be more than Gₐ * C, where, for example, an immediate WR would resolve a bunch of queued WRs. In a tiny network setup with 2 cores for example. Since g is used to select a subset of the accumulatable WRs, I thought it could eat into the reserve for the always-accumulate services.
# 2025-03-19 14:42 gav: Always-accumulate services should be in the first "batch" of WRs accumulated (i.e. among the first items of W*) so they will, indeed, always accumulate.
# 2025-03-19 14:44 gav: Not at all clear what you're pointing at or the reasoning behind your suggestion.
(edited)
# 2025-03-19 14:57 vinsystems:
Eq B.6 defines the result context
X
of the accumulate invocation. The last term of this equation is
y ∈ ¿H
. Should this term be
y ∈ H?
instead of
y ∈ ¿H
since the operator
¿
is used for serializable terms?
(edited)
# 2025-03-19 17:12 sourabhniyogi: C(13) in Eq D.2 needs to get the current / last validator stats (pi\_V, pi\_L) back in (from 13.1/13.2)
(edited)
# 2025-03-19 17:38 sourabhniyogi: The subscripts in C(13) in D.2 only have { C, S, L } but should have { C, S, V, L } to cover both core/service activity (the new elements of C(13)) and validator statistics activity (the old elements of C(13)), that's all.
(edited)
# 2025-03-19 17:43 gav: > The subscripts in C(13) in D.2 only have { C, S }
(They actually had C, L and S, but were indeed missing V as the typo had C twice.)
(edited)
# 2025-03-19 18:59 sourabhniyogi: How did you end up with W\_M = 2048 (The maximum number of imports and exports in a work-package.) \[or 3072 in recent versions\] which implies at most 32 proof pages and 8.4MB of CoreVM memory -- is this due to bandwidth or networking considerations? With 4104 bytes usable per proof page, you can fit many many more import/export segments than 2048. If we solved the 14.10/14.14 "puzzle", we have only 64\*32+5\*32 byte = 2208 bytes out the 4104 bytes being used [or 2240 in recent versions]. You can definitely fit many many more segments with the unused proof page space of 4104-2208=1896 bytes
(edited)
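The byte budget quoted in the question checks out (all figures taken from the message itself: proof-page size and per-segment proof cost as stated there):

```python
W_G = 4104               # usable bytes per proof page, per the message
used = 64 * 32 + 5 * 32  # bytes consumed by the proofs: 2048 + 160
spare = W_G - used       # unused proof-page space
```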
# 2025-03-19 19:02 dave: I didn't come up with these numbers; Gav did. I think they're based primarily on bandwidth usage, as discussed in the discussion section at the end of the GP
# 2025-03-19 19:08 dave: Re proof segment usage, think one reason for the choice of 64 is that it's a power-of-2 which makes things simple -- you fit a complete subtree in a segment. Could probably fit more proofs in but it would get more complicated as you would have a partial subtree (or multiple subtrees depending on how you look at it). Not sure if there are other reasons.
(edited)
# 2025-03-19 19:21 sourabhniyogi: Did you get "full" erasure coding decoding in the JAM Toaster yet (or some serious fraction of it)? I've been bothered by doing all this networking just to get a measly 12 bytes and was wondering if instead of W\_G=4104 pages we'd like something slightly bigger, 64K or 128K instead (thus getting 16x-32x as much). Then the 2048 could get 16x or 32x as much memory, but of course 16x-32x as much bandwidth usage. The idea would be that you can get your CoreVM loaded with fewer segments and network calls, but maybe someone will want a CoreVM to have not 8.4MB of memory but 134MB or 268MB instead. Not sure if W\_G = 4104 is chosen to be compatible with some RISC-V related considerations?
(edited)
# 2025-03-19 19:25 dave: W_G=4104 for 4k pages for eg CoreVM. Bigger page sizes are not great, as it means way more read/write amplification in the worst case. 4k is already really too big for a number of use cases, we just can't really go smaller as 4k is the min page size on modern HW
# 2025-03-19 19:28 dave: Please note that the 12 bytes is _per segment shard requested_. You should absolutely be batching these requests so that in the ideal case you make ~340 network requests for all of the segment shards you need for a WP. This is still a lot of requests of course. In the full network protocol we may add a fast path for requesting the original data directly from guarantors.
# 2025-03-19 19:32 dave: 12MB is a limit on the number of pages read/written _per WP_. It is _not_ the limit on the CoreVM memory. This can't really be made larger as it is constrained by bandwidth. Remember that all this data needs to be read/written from/to the DA system for every core every slot
# 2025-03-19 19:53 sourabhniyogi: Understood about the batching of
segmentIndex
in JAMNP taking us from 12 bytes to some multiple of that. My mental model is that the majority of segments will come from one primary work package and occasionally a few segments come in from a few other "foreign" WPs, but I understand that's just one use case; my mental model is formed by what I imagine a lazy CoreVM programmer would do (if you tell him it's like normal programs).
By "read/write amplification" I believe you mean the idea that we load up all these pages (yes, in a single WP) from Segments DA and only a tiny % of the pages are updated and only a tiny fraction of each page at that, right? So it looks like a tradeoff between that and the number of networking calls, even with batched
segmentIndex
. Thanks for explaining!
# 2025-03-20 10:10 dave: By read/write amplification I mean even if you want to read or write just 1 byte you still have to import/export an entire page/segment; you end up reading/writing 4000 times as much. Of course that is the worst case, usually it will not be quite as bad
# 2025-03-20 03:43 sourabhniyogi: What is the expected strategy to have the CoreVM service identify which pages in the child VM have been written to (say, marked with a "dirty" bit) so that those pages (and only those pages) are exported to Segment DA by the parent VM?
# 2025-03-20 08:47 gav: Yes, CoreVM will support this via the inner PVM host API. We don't yet have that host call API in place but I believe it's on Jan Bujak 's TODO list.
(edited)
# 2025-03-20 08:56 gav: One additional thing to note for segment reconstruction is that actually fetching each 12 byte piece from a unique set of 342 nodes and reconstructing is the very worst case. There are four better cases: that some or all of the segments are provided to the guarantors by the builder (because it could be the same builder or builder-network whose package created them in the first place). In this case they could have the proof-pages or just be content indexed and passed as extrinsics. Or the guarantor could already have the pages in their own cache because it was they who guaranteed the exporting package. Or the guarantor could fetch them from the exporting guarantor as whole segments. Or that the guarantor batches them because they were exported by the same package as other segments which are also needed for import. And in the worst case, the system wouldn't _break_; it would just mean that packages which exposed this worst-case behaviour would take potentially a little longer to make their way through the pipeline.
(edited)
# 2025-03-20 10:26 danicuki: The encoding for newly created Work Result fields (C.23) does not specify integer sizes for xu, xi, xx, xz, xe. Are they all 1-byte values?
# 2025-03-20 16:32 prematurata: I have a couple of questions/remarks about accumulation. Specifically how jam should handle dependencies when accumulating a service. Let's consider the following scenario:
- WorkReport A contains 2 results: service 1 and service 2
- WorkReport B has a dependency on A
((WB)_x)_p = ((WA)_s)_h
and contains one result of service 3.
So
W!
contains \[1, 2\], lets assume
WQ
is empty... But
W*
contains also 3 so
W* = [1, 2, 3]
.
Now
∆∗
which is called "parallelized accumulation", is being called with W\* on 12.21 (after ∆+)
https://graypaper.fluffylabs.dev/#/68eaa1f/17b50317c103?v=0.6.4 .
question 1: But since there is no ordering enforced (that i can see) when executing
∆∗
, then 3 might be executed before its dependencies.
question 2: let's say I am wrong in the previous question. Let's say Service 1 writes to its storage a key that Service 3 needs to read (via the read hostcall). Service 3 does not seem to receive the updated storage when
∆1
gets called with s=3
I guess both questions/remarks are wrong, and that Service 3 indeed needs to be executed after 1 and 2 (which can be parallelized) and should be accumulated using the updated service accounts from the 1,2 accumulation, but I can't see where this is enforced in the GP.
(edited)
# 2025-03-20 17:08 jan: Indeed, as Gav said, we will most likely add a way for the parent VM to fetch this information in an efficient manner. It's on my TODO list; we just don't want to spec it before implementing it to make sure the performance is good.
# 2025-03-20 18:34 sourabhniyogi: That's totally great. A
page-manifest
host function call reporting both (A) which pages MUST be exported because they were written to (a "dirty" bit per page) and (B) which pages MUST be imported because they were read from (an "accessed" bit per page) appears necessary for programmers to not have to think much about how their child "guest program" uses memory, and the parent CoreVM service will need this manifest to poke and peek pages so the programmer doesn't have to.
But since guest programs will vary widely in terms of how many "accessed" and "dirty" bits they get before they hit some refine gas limit OR the W\_M=3072 (12MB) limit on imported+exported segments which applies across ALL the VMs, parent and child, it seems the builder should be able to run with the whole 2^32 VMs always fully loaded ... and
then use this bookkeeping to dump work packages out right before it runs out of gas or hits the W_M limit, to effectively break up long-running computations into work packages that the guarantors can legitimately handle in accordance with the JAM spec. We would thus want "builder"-mode refining (having tens of thousands, maybe the 1MM, pages) and "guarantor"-mode refining (W\_M) working on just the subset the builder wants. Is that the way to think about designing this host function? How do performance considerations figure in?
(edited)
# 2025-03-20 18:42 gav: 1. WIs for services 1, 2 and 3 would be executed in the same batch. From the perspective of each they would be executed under the same prior state. If #3 inspected the state of #2 it would see its prior state; if #2 inspected #3, it would also see #3's prior state.
2. No. #3 would not (necessarily) see the storage change.
(edited)
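A minimal sketch (hypothetical types and names) of the batch semantics gav describes: every work item in a batch accumulates against the same prior state, and the per-service deltas are merged afterwards, so no service observes a sibling's in-batch writes.

```python
def accumulate_one(prior_state: dict, service_id: int, item: str) -> dict:
    # Hypothetical service logic: write one key derived from our own id.
    # Reads only `prior_state`, never a sibling's in-batch delta.
    return {("storage", service_id): item}

def accumulate_batch(prior_state: dict, items) -> dict:
    """Run every (service_id, work_item) against the SAME prior state,
    then merge the resulting deltas into the new state."""
    deltas = [accumulate_one(prior_state, sid, item) for sid, item in items]
    merged = dict(prior_state)
    for delta in deltas:
        merged.update(delta)  # assumes services touch disjoint keys
    return merged

# Services 1 and 3 in the same batch: 3 never sees 1's write.
state = accumulate_batch({}, [(1, "a"), (3, "c")])
```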
# 2025-03-20 18:43 gav: The thing to understand is that JAM services are very much designed to be only asynchronously interactive, at least at the accumulation stage.
# 2025-03-20 18:44 gav: The prerequisite functionality is there to ensure that a package doesn't get accumulated before another package is known to be accumulatable.
# 2025-03-20 18:44 gav: It is not there to force a total ordering over its constituent work items; and certainly not over multiple services.
(edited)
# 2025-03-20 18:45 gav: This would create a potentially troublesome pattern and may over-extend the queue system and reduce potential parallelisability for accumulate.
(edited)
# 2025-03-20 18:45 gav: A total ordering is possible within a single service; by creating dependent packages, you can be certain that certain WIs will not be accumulated in batches before others. You may end up with a dependency in the same batch, but then that's up to the service code to apply the appropriate ordering.
(edited)
# 2025-03-20 18:47 gav: If you're looking to synchronise between services, then you'll need to use
transfer
s.
(edited)
# 2025-03-20 18:49 gav: Transfers can be combined with co-scheduling at the Refine stage (e.g. sharing the same WP) and it becomes possible to create causal entanglement between the WIs of multiple services which can be enforced at the accumulation stage, so the entangled effects only get integrated into state when both sides are known to be completable.
(edited)
# 2025-03-20 18:53 gav: Accumulation is designed with a view to becoming parallelisable. At some later revision of JAM we may e.g. increase the number of cores to 682 with a requirement that a single service cannot regularly have more accumulation gas than is possible with 341 of them. We can't squeeze more gas into the synchronous pipeline, so this would be made viable through CPU parallelism executing multiple service's simultaneously.
# 2025-03-20 18:54 gav: This model breaks more as work-items between services become orderable and synchronous dependencies - chains of execution - start to become the norm.
# 2025-03-20 18:54 gav: So it's something I'm really trying to avoid with this design.
# 2025-03-20 18:56 gav: In short, cross-service execution dependencies force synchroneity and are evil and the death of scalability. We want to keep them off-chain.
(edited)
# 2025-03-20 19:03 prematurata: perfect gavin. Thanks for this indepth explanation. it was very much needed for me... I was trying to make both the "dependency system"/prerequisite and parallelism work together.
# 2025-03-21 14:34 tvvkk7: Hello, I'm implementing PVM invocations, but I'm curious about how we get the actual service code, or in other words the standard program codes.
Take
on-transfer invocation for example: we pass the service codeHash as an argument. But the service codeHash is a 32-octet value. How do we get the program code from the service codeHash?
# 2025-03-21 14:54 gav: There’s a function, big Lambda. This should define how to derive the hash preimage.
# 2025-03-21 22:19 celadari: Hi guys, small questions:
- Is ε(tₐ, tᵦ, tₜ, tᵧ, tₘ, tₒ, tᵢ) used in the definition of the host call function info Ωᵢ the same encoding as **𝐚 ∼ 𝓔₈(𝐚ᵦ, 𝐚ᵧ, 𝐚ₘ, 𝐚ₒ) ∼ 𝓔₄(𝐚ᵢ)**?
- Perhaps there's something I don't see, but in equation A.43 we define **u = ρ − max(ρ′, 0)**.
If ρ′ is negative, then **u = ρ**, which means the gas doesn't change.
So the service would have run code, but the gas stays the same — is that correct?
If so, is that the intended behavior?
# 2025-03-22 04:56 clearloop: may I ask if this is part of the new PVM tests or the accumulation STF? I see everybody is talking about this, however we haven't run into it yet 😅
(edited)
# 2025-03-22 09:02 celadari: Sorry, actually were you asking about the first question or the second one?
# 2025-03-22 09:06 clearloop: sry I'm just curious about the host call part; I'm now updating our PVM tests but I don't see tests with host calls, so I assume it belongs to the accumulation STF?
# 2025-03-22 05:55 gav: > <@celadari:matrix.org> Hi guys, small questions:
> - Is ε(tₐ, tᵦ, tₜ, tᵧ, tₘ, tₒ, tᵢ) used in the definition of the host call function info Ωᵢ the same encoding as **𝐚 ∼ 𝓔₈(𝐚ᵦ, 𝐚ᵧ, 𝐚ₘ, 𝐚ₒ) ∼ 𝓔₄(𝐚ᵢ)**?
> - Perhaps there's something I don't see, but in equation A.43 we define **u = ρ − max(ρ′, 0)**.
> If ρ′ is negative, then **u = ρ**, which means the gas doesn't change.
> So the service would have run code, but the gas stays the same — is that correct?
> If so, is that the intended behavior?
On the second point, no. u is gas used. By ensuring the second term (gas counter) is never negative we just ensure that the gas used is never greater than the gas limit.
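gav's point about A.43 in one line: clamping the (possibly negative) post-execution gas counter at zero means gas used can equal, but never exceed, the limit.

```python
def gas_used(rho: int, rho_prime: int) -> int:
    # u = rho - max(rho', 0): rho is the gas limit, rho' the post-execution
    # gas counter, which goes negative on an underrun.
    return rho - max(rho_prime, 0)

normal = gas_used(1000, 250)    # 750 gas consumed out of the limit
underrun = gas_used(1000, -40)  # counter negative: full limit counted as used
```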
# 2025-03-22 07:40 celadari: Oh thank you, I hadn't understood that u was the gas used, makes sense
# 2025-03-22 05:58 gav: On the first point, no. The former encoding uses the variable size numeric encodings.
# 2025-03-22 11:21 greywolve: Is the explicit encoding of the
tuple in 5.6 redundant since that's going to just remain an octet sequence after the outer encoding? Or is there something special I'm missing? (i.e is it pretty much the same as the regular serialization in the appendix only the work report replaced with the hashed work report instead)
(edited)
# 2025-03-22 11:56 gav: > <@greywolve:matrix.org> Is the explicit encoding of the
tuple in 5.6 redundant since that's going to just remain an octet sequence after the outer encoding? Or is there something special I'm missing? (i.e is it pretty much the same as the regular serialization in the appendix only the work report replaced with the hashed work report instead)
It is done this way to avoid having to send all guarantees with the header. Merkle proofs can be provided for those which are sent on other channels.
# 2025-03-22 11:58 gav: > <@celadari:matrix.org> Question regarding this:
>
> i'' was removed in the definition of Psi_H in version 6.4 of the GP so my question => ¿ do we advance the counter after ecalli instruction or not when we exit ecalli for the host call ?
>
> Looking at this line looks like we don't advance it
https://graypaper.fluffylabs.dev/#/68eaa1f/246700247200?v=0.6.4
>
> but then it conflicts with idea of using i' (not using i'') during Psi_H call (
https://graypaper.fluffylabs.dev/#/68eaa1f/2b9d012b9d01?v=0.6.4) where it would mean that i' is advanced after ecalli instruction for the host call ?
>
> Thanks in advance for the clarification 🙏
ecalli is no different to other instructions regarding i’: i’ still represents the instruction immediately following and as per the definition of PsiH, we advance to it once the host call is resolved.
# 2025-03-22 13:10 subotic: Ahh, now I understand it. The program counter always advances as per (A.7) and additionally, in the case of ecalli, epsilon is h × v_x instead of ▸ (continue). Thanks!
# 2025-03-24 09:56 dakkk: gav: what is the rationale of having core and service statistics into the protocol? While validators' statistics are useful as explained in the GP, there are no information of the usefulness of core and service statistics, and I'm unable to figure it out by myself
# 2025-03-24 10:10 emilkietzman: Secondary markets of Agile Coretime like RegionX or Lastic - You could check Core utilizations in different projects and sell unused Coretime
# 2025-03-24 23:32 celadari: Question regarding host call function Omega_J (n = reject): if we apply it to an account a of index s => is it supposed to eliminate from the database only the components a_c, a_b, a_g, a_m, a_o, a_i (represented by C(255, s) in the trie), or a_l, a_p, a_s as well?
# 2025-03-25 14:55 jay_ztc: Is the ordering of the preimages extrinsic fully defined in the GP? 12.35 specifies that the preimages extrinsic should be ordered by the (account, preimage) tuples, but doesn't go into further detail. Looking at the tuple & sequence notation sections in section 3, there's not a default tuple ordering defined there either.
https://graypaper.fluffylabs.dev/#/68eaa1f/181001181001?v=0.6.4 (edited)
# 2025-03-25 16:31 sourabhniyogi: With v0.5's PVM 64-bit change (and an irreversible commitment to not supporting 32-bit PVM), is it reasonable to adjust
https://graypaper.fluffylabs.dev/#/68eaa1f/2c15002c1500?v=0.6.4 to have a memory address space beyond 4GB? If not, what are the technical reasons for continuing with this 32-bit layout?
I believe we should extend the "A" here to include "accessed" (i) and "dirty" (e) bits so as to map into what pages must be imported and what pages must be exported, thus treating CoreVM OSes specially. Or at least get a convention on how CoreVM services use the 4104-4096=8 bytes to keep the page number and these additional metadata bits. I understand the JAM protocol may not wish to impose constraints on what these additional 8 bytes contain (though I believe it makes sense to have these i+e bits to support OSes running on JAM), but a question nevertheless worth asking is: are there additional per-segment metadata bit candidates we should consider when designing CoreVM + Coreplay services?
(edited)
# 2025-03-25 20:31 gav: No idea what you're talking about. Please reframe it in specific terms of Omega_J.
# 2025-03-25 20:33 gav: We can't actually handle 4 GB of allocations in terms of gas. Realistically most programs will mostly execute with only 1MB actually accessible (maybe sometimes with 16MB, but only very rarely with more).
(edited)
# 2025-03-25 20:33 gav: Gas will be scaled depending on how much memory you're accessing. It'll become impractical many orders of magnitude below 32-bit.
# 2025-03-25 20:36 jan: > If not, what are the technical reasons for continuing with this 32-bit layout?
Speed, as it makes sandboxing cheaper, and as Gav said you won't be able to use this much memory in practice anyway, so it's pointless to have a 64-bit address space.
# 2025-03-25 20:36 gav: And, for security (auditing) validators will need to be able to execute several refinements concurrently, probably around 10; we'll also need 2 guarantor refinements. If they all used, say, 4 GB of RAM, then validators would need to have 48GB of RAM free before we even start thinking about the state DB and various caches.
# 2025-03-25 20:37 gav: That would probably push minimum requirements to beyond 64 GB per node, which is too much.
# 2025-03-25 20:37 gav: And, in any case, there's no sensible on-chain use-case which would need 64-bit addressability.
# 2025-03-25 21:03 sourabhniyogi: I read David Emett 's comment of "12MB is a limit on the number of pages read/written per WP. It is not the limit on the CoreVM memory." in the following way:
- A CoreVM service user actually really does have a 4GB virtual computer. However, in any given work package, spanning say a few seconds of computation only a small number of pages are accessed (imported) or written to (exported).
- Only W\_M (3072 as of 0.6.4) pages = 12MB of this 4GB virtual computer is realistic to set up with JAM's PVM but it is only _tiny fraction_ of the larger addressable subset of the CoreVM service users's 4GB virtual computer.
- So one work package might access ABC (12MB) the next might access DEF (a different 12MB), the next EFG etc, none of which exceed W\_M _individually_ but in totality across multiple work packages exceed 12MB. In this way, you could totally want much more than 4GB.
- When a builder submits a work package to a guarantor, JAM is basically acting as an audit protocol of what happens in an up-to-W\_M (12MB-sized) sliver of memory of what the larger 4GB computer did. JAM is optimized for trustless OS services.
In previous decades there was a story of "640K ought to be enough for anybody" and maybe "I think there is a world market for about five computers" -- these days 4-8B people all have 4-8GB phones in their pocket and so perhaps the trustless supercomputing equivalent is "12MB/4GB ought to be enough for everyone" and "there is a world market for about 5 trustless supercomputers" =). Could we imagine that all 4-8B people collectively get their Shared World Computer in a 64-bit way so they may all coreplay together, even though any individual work package only references a tiny sliver?
(edited)
# 2025-03-25 21:14 gav: I appreciate the ambition, but the limits are there for a reason.
# 2025-03-25 21:14 gav: We've explained the reasoning. Live with it or come up with a better protocol yourself.
# 2025-03-25 21:16 gav: There's the business of dreaming and the business of building. This channel is for the latter.
# 2025-03-25 21:54 gav: You can answer your own question if you simply phrase it in terms of Omega_J.
# 2025-03-25 21:56 gav: Omega_J has the effect of removing a particular item (d) from the accounts dictionary (delta).
# 2025-03-26 15:25 celadari: Thanks again for your time and answer.
Just to give a bit more context on why I was confused:
I hadn't realized that the condition on d_i = 2 was actually implying that the **storage**, **preimages**, etc. for that account had to be _already empty_, meaning the account must have gone through forget (Omega_F) and write (Omega_W) before being eligible for deletion.
Initially, I thought we were supposed to manually remove these fields by directly deleting down from the partial trie key like C(s, E4(2^32−1)), which would have worked fine for **preImageLookupP** and **storage**, but not for **preImageLookupL** (because of E(l)), and that one had me pulling my hair out 😅
All good now and thanks again :)
(edited)
# 2025-03-25 21:57 gav: And if you see the various data concerning accounts which makes up the state trie (from which the trie root may be derived), they're all defined through the contents of the accounts dictionary delta.
# 2025-03-25 21:58 gav: Therefore if the dictionary no longer contains a particular account, then state trie items such as the the stored data, preimages, &c which were concerning said account will no longer be in place.
# 2025-03-26 07:36 tvvkk7: Hello gav, I'm implementing accumulation invocation. The initializer function I requires eta'_0. Should eta'_0 be input through U? Although it is only needed in Psi_A.
# 2025-03-26 09:21 shawntabrizi: As I recall, compact numbers in JAM are different than they are in SCALE and Polkadot today. Can someone write a small description of the new compact number format? ❤️
# 2025-03-26 09:31 xlchen: I gave my code to ChatGPT and this is the description from it:
This encoding converts an unsigned integer into a sequence of bytes using just as many bytes as needed. Here’s how it works in simple terms:
1. Zero Handling:
If the integer is 0, it simply outputs one byte with the value 0.
2. Determining Byte Count:
For nonzero numbers, the encoder figures out how many extra bytes are needed by finding the smallest number l (from 0 to 7) such that the value fits within 7 \times (l+1) bits. If none of these work, it defaults to using 8 bytes in total.
3. Control Byte Creation:
The first byte (control byte) combines a prefix that indicates how many extra bytes follow and part of the number itself. The prefix is calculated as:
\text{prefix} = 256 - (1 \ll (8 - l))
The control byte is then formed by adding this prefix to the most significant bits of the number.
4. Appending Remaining Bytes:
After the control byte, the remaining bytes (if any) represent the lower parts of the number, each taking an 8-bit chunk.
In summary, the format starts with a control byte that tells you how many additional bytes there are and includes part of the data, followed by the extra bytes that complete the full representation of the integer.
# 2025-03-26 09:24 jan: One prefix byte plus payload. The prefix byte determines the length through the number of 1s before the first 0. The unused bits in the first byte are used for the payload. The first byte always contains the most significant bits. The rest of the bytes are always written in little-endian order. Can encode at most 64-bit numbers.
At most 7bit - 0xxxxxxx
At most 14bit - 10xxxxxx xxxxxxxx
At most 21bit - 110xxxxx xxxxxxxx xxxxxxxx
At most 28bit - 1110xxxx xxxxxxxx xxxxxxxx xxxxxxxx
At most 35bit - 11110xxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
At most 42bit - 111110xx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
At most 49bit - 1111110x xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
At most 56bit - 11111110 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
At most 64bit - 11111111 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
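A sketch of that scheme in Python (hypothetical function names; this follows the prefix rules jan lists above, with the prefix formula `256 - (1 << (8 - l))` from the earlier description):

```python
def encode_compact(x: int) -> bytes:
    """Encode a natural number (< 2**64) with the prefix-byte format."""
    assert 0 <= x < 2**64
    if x == 0:
        return b"\x00"
    for l in range(8):
        if x < 1 << (7 * (l + 1)):
            prefix = 256 - (1 << (8 - l))          # l leading 1-bits then a 0
            first = prefix + (x >> (8 * l))         # high bits ride in byte 0
            low = x & ((1 << (8 * l)) - 1)
            return bytes([first]) + low.to_bytes(l, "little")
    return b"\xff" + x.to_bytes(8, "little")        # full 64-bit case

def decode_compact(data: bytes) -> tuple[int, int]:
    """Decode one number; returns (value, bytes consumed)."""
    first = data[0]
    l = 0
    while l < 8 and first & (0x80 >> l):            # count leading 1-bits
        l += 1
    if l == 8:                                      # 0xff: 8 payload bytes
        return int.from_bytes(data[1:9], "little"), 9
    high = first & ((1 << (7 - l)) - 1)             # payload bits in byte 0
    low = int.from_bytes(data[1:1 + l], "little")
    return (high << (8 * l)) | low, l + 1
```

For example, 127 fits in one byte (`0x7f`) while 128 needs the two-byte `10xxxxxx` form (`0x80 0x80`).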
# 2025-03-26 09:24 knight1205: is there any specification for work package builders, yet?
# 2025-03-26 09:33 jan: Not sure I understand your question. That program contains only static jumps and it doesn't require a jump table.
# 2025-03-26 09:35 clearloop: oh I got you, so there could be problems in my djump implementation, I'm referencing the jump table in all jump instructions
# 2025-03-26 09:36 jan: The only dynamic jump in that program is the jump that goes to the hardcoded special address which halts the program; no other jumps there use the jump table.
# 2025-03-26 09:37 jan: Non-dynamic jumps definitely shouldn't do anything with the jump table.
# 2025-03-26 09:37 clearloop: I may need to fix more problems in my code, since I can pass all test_instr_* with my current implementation 😂
# 2025-03-26 09:43 gav: There may be conventions, published APIs and/or SDKs to help create package builders.
# 2025-03-26 09:44 gav: But the JAM protocol doesn't prescribe any means of building packages any more than it prescribes how you should create your service logic.
(edited)
# 2025-03-26 09:45 gav: In terms of data logistics, ideally builders will connect to the JAM network via inbuilt nodes (light or full, depending on the use-case and circumstances) and use internal node APIs to inject new packages. Nodes would then identify the right guarantors and send the package to them.
# 2025-03-26 09:48 gav: However, in the early days, we'll probably see RPCs being used by builder executables to deliver packages to nodes on testnets. It's definitely not something I'd want to see in production, but running & synching a full-node to insert a single work-package is plainly suboptimal and we don't have any light-clients yet.
# 2025-03-26 09:50 gav: That's not a question the GP (or I) can answer. How you get eta' into the appropriate place in your code for it to be able to calculate the initial machine state is entirely an implementation detail.
(edited)
# 2025-03-26 09:50 xlchen: are validators expected to open connections with any builder nodes? obviously limits needs to be enforced on validators side but that can still be a DoS vector?
# 2025-03-26 09:51 gav: And even if it were somehow constrained, validators have no idea who they are.
# 2025-03-26 09:51 gav: Whereas validator IDs are well-known to anyone with an up to date JAM state.
# 2025-03-26 09:52 gav: Strategically, validators need to find work-packages they can guarantee in order to make rewards. But they must balance this with the possibility of being DoSed/attacked.
(edited)
# 2025-03-26 09:53 gav: So there will be some need for implementors to create guarantor strategies which balance these two opposing forces. There's not really a right answer here and it's the sort of thing which should be discussed at JAM0.
# 2025-03-26 09:54 gav: Realistically I'd expect validators to have several dozen nodes connected, other than fellow validators.
# 2025-03-26 09:55 xlchen: I see. something to be figured out. worst case we just have tx pool and some package gossip protocol + peer reputation. ie something like what we have today
# 2025-03-26 09:55 gav: And to actively churn through nodes, keeping ones who tend to give them good packages.
# 2025-03-26 09:56 gav: Connection could come with a promise to give packages adhering to a set of authorizers. Failure to provide a package on a core with one of those authorizers in the pool could result in booting.
# 2025-03-26 09:57 gav: Obviously if bad packages are received, then this would also result in booting.
# 2025-03-26 10:12 xlchen: so for a builder to be able to consistently deliver work packages, it needs to work with all sorts of services. It certainly needs a pool and some way to collect the packages. this is a big chunk of work
# 2025-03-26 10:15 gav: Builders will almost certainly be private enterprises and specialised to a particular service or service-type.
# 2025-03-26 10:16 gav: E.g. for parachains, it could be that every parachain will have its own builder network (aka collator network). Though with the Omninode, we'll probably see generic parachain builder networks.
(edited)
# 2025-03-26 10:16 gav: But still, they'll only build for the one Parachains service.
# 2025-03-26 10:17 gav: It will be up to the builders to convince guarantors on cores which their packages are capable of running that they can furnish them with packages.
# 2025-03-26 10:18 gav: Thankfully this need not be done blindly; IsAuthorized is designed to run independently and cheaply.
# 2025-03-26 10:19 gav: And once IsAuthorized executes successfully, the guarantor knows that the builder can reasonably supply a package worth refining/guaranteeing.
# 2025-03-26 10:35 knight1205: so the strategy to build connection and accept work packages will be fixed for each implementation, for consistency, or will there be different strategies? If fixed, will that be provided in JAM-NP?
# 2025-03-26 10:36 gav: JAM-SNP already contains network messages for provision/sharing of work-packages (and preimages)
# 2025-03-26 10:37 knight1205: but that is just protocol for connection setup. will there be any strategy/requirements for acceptance or just we have to validate author hash or work package on our own and then perform computations?
# 2025-03-26 10:38 gav: > So there will be some need for implementors to create guarantor strategies which balance these two opposing forces. There's not really a right answer here and it's the sort of thing which should be discussed at JAM0
# 2025-03-26 10:39 gav: (For the purposes of M2 conformance testing we'll have idealised connections and implementations will not need to concern themselves with the possibility of DoS.)
# 2025-03-26 10:41 gav: As per the security audit (M5), implementations will need to demonstrate a resilience against DoS, including attacks by peers. But of course, over-conservative nodes which sacrifice too many rewards may find that fewer validators are willing to run them.
(edited)
# 2025-03-26 10:43 gav: Again, no right answers, and I do expect (and hope for!) some differences between node strategies, but our implementor conferences are meant for brainstorming and sharing insights into such things.
# 2025-03-26 11:01 dave: SNP currently allows builders to identify themselves at connection time by adding /builder to the protocol advertised during ALPN, see
https://github.com/zdave-parity/jam-np/blob/main/simple.md#alpn. To some extent how validators treat these connections is a strategy thing and it isn't necessary for all implementations to behave the same. A reasonable strategy might be to grant a peer connecting with the /builder suffix a special builder connection slot (subject to availability), but require the peer to submit a valid work-package within a few seconds after connecting in order to keep the slot and not lose reputation.
# 2025-03-28 08:38 gav: s\_{bold c} is the preimage (s\_{regular c} is the hash) - It is defined in (9.4), and doesn't rely on the preimage lookup function.
(edited)
# 2025-03-28 08:58 gav: Yes it's a moot point; I'm happy to take a PR which simplifies it, though I'm not sure if that's necessarily easy.
# 2025-03-30 08:21 celadari: Hi everyone,
I have a question regarding the program metadata introduced in GP 6.3.
If an extrinsic E_P includes a pre_image that does not conform to the expected encoding Epsilon(double_arrow Epsilon(a_m), a_c) (as specified here:
https://graypaper.fluffylabs.dev/#/68eaa1f/106c01107101?v=0.6.4):
Should we:
- Consider the entire block invalid?
OR
- *Accept the block*, and allow the service lookup dictionaries to include these entries, with the understanding that invocations of Psi_A, Psi_R, Psi_I, Psi_T for the service of this pre_image would simply fail (by failing I mean that invocations panic and thus don't change state)?
(edited)
# 2025-03-30 11:55 gav: The preimage is determined solely by data encoded as per the GP specification.
# 2025-03-30 11:56 gav: Either it is requested or it is not. If it is not, then the block is invalid. There’s no room for guesswork here.
# 2025-03-30 14:12 celadari: Let me use an example to explain my question more clearly:
Suppose we receive an incoming block with some extrinsics, among which are E_P extrinsics (preimages). Let's assume one of these preimages is for service s and is encoded as a vertical-double-array p. From the first byte, we determine the length of p, extract the corresponding bytes, and treat that as the preimage.
This preimage p is expected to represent an Epsilon(double_arrow Epsilon(a_m), a_c) structure (https://graypaper.fluffylabs.dev/#/68eaa1f/106c01107101?v=0.6.4).
Now, say p = [129, 2, 4, 5, 6]. Interpreting this: the metadata slice is supposed to be of length 129, starting right after the first byte. But the total array doesn't even contain 129 elements, so this is clearly an incorrectly encoded preimage.
My question is:
- Should we **reject the entire block** due to this malformed preimage?
OR
- Should we **accept the block**, include the badly-formed preimage in the lookup for service s, and simply let Psi_A panic at execution time (and thus not update anything) for this service s?
# 2025-03-30 14:50 gav: Preimage extrinsic (E_P) is a sequence of pairs (service index with blob).
# 2025-03-30 14:52 gav: Each entry in the preimage extrinsic must be a valid request as per the prior state.
# 2025-03-30 14:53 gav: If you are still confused, I suggest you rephrase your query in those terms.
# 2025-03-30 14:54 gav: I'm not sure what you're really asking, but if the question is "if I receive a block which doesn't correctly decode but from which I could make a best guess at imagining some underlying meaning, should I import it as though it was really an encoding of this best guess?" then the answer is OF COURSE NOT!
(edited)
# 2025-03-30 14:55 gav: Again, this is a consensus protocol. There is no room for error or guesswork.
# 2025-03-30 17:31 dave: If I understand correctly, what you're asking is: if a preimage is requested which will be used as the code blob for a service, must the preimage be a valid "code blob" for it to be includable in a block and integrated into the service storage? Pretty sure the answer to this is no: as long as a preimage has been requested then it can be included in a block. If a service requests a preimage that cannot be decoded or used as a code blob for whatever reason, then attempts to use it as such will fail at the point of use. I'm not sure it can really work any other way as there is no type/format/whatever associated with preimages in the state; they are opaque binary blobs.
# 2025-03-30 18:26 rustybot: Codec vectors only exercise the codec. Payload is mostly just random data
(edited)
# 2025-03-30 18:28 celadari: I agree but since we are talking about encoding I wasn't sure to what extent we were supposed to verify or not.
But thanks anyway ✌️
# 2025-03-30 18:30 gav: If there was a need to verify, then it would state as much in the Gray Paper. It doesn't.
# 2025-03-31 15:39 ascriv: Has anyone yet taken a serious look at if size-synchrony antagonism has been formalized mathematically? Would be nice to have further validation e.g. that what we’re doing is somewhat optimal
# 2025-03-31 15:55 gav: AI answer:
> One relevant concept is "complexity theory," which posits that as systems grow in size and complexity, the potential for disorder and misalignment among components increases. This can lead to difficulties in achieving coherence. Larger systems may have more diverse elements, which can result in varying goals, behaviors, and interactions that can disrupt overall coherence.
> Another related idea is "Ashby's Law of Requisite Variety," which states that for a system to effectively manage its environment, it must be as diverse as the environment it operates in. In larger systems, the variety of components and interactions can lead to challenges in maintaining coherence unless there are effective mechanisms for integration and coordination.
# 2025-03-31 15:56 gav: However the strict trilemma of Scale, Speed and Coherence doesn't seem to be established.
(edited)
# 2025-03-31 15:57 gav: It seems to me, at least, quite demonstrable given real systems have causality bound by speed and component-distances.
# 2025-03-31 15:58 jay_ztc: CAP comes to mind, sort of a cousin principle if you will
# 2025-03-31 16:00 gav: Coherence -> degree of causality across all pairwise pieces of system state
Speed -> bound as the time that it would take light to effect a causal resolution across the two most distant causally entangled parts of state
Size -> Given some maximal density of system state, the maximum distance between causally entangled state-components of the system
(edited)
# 2025-03-31 16:00 gav: It seems pretty trivial to show that if you increase any of these you must reduce one or both of the others.
# 2025-03-31 16:02 gav: So if you make a system bigger (add more state components and therefore make things farther apart) you either need to accept causal resolution will be slower because at least some portions of state are farther apart (and light only travels at a certain speed) or you have to limit what parts are causally entangled, limiting distances travelled for resolution and thus reduce coherence.
(edited)
# 2025-03-31 16:03 gav: CAP is somewhat related, but it's binary (select any two). It also doesn't deal with size or speed but only "correctness" properties.
(edited)
# 2025-03-31 16:04 gav: But yes, it is a related trilemma/antagonism applicable to (distributed) systems.
# 2025-03-31 16:54 ascriv: > <@gav:polkadot.io> So if you make a system bigger (add more state components and therefore make things farther apart) you either need to accept causal resolution will be slower or you have to limit what parts are causally entangled and thus reduce coherence.
distance = rate * time in some sense?
# 2025-03-31 17:00 ascriv: Distance ~ size
Rate ~ speed
Time ~ coherence
So roughly size = speed*coherence
(edited)
# 2025-03-31 17:21 emielsebastiaan: If a single global coherent state is the design goal (which it is) you can play/design around different types of decoherence. E.g. spatial decoherence (shards), temporal decoherence (ordered accumulation / asynchrony). You can allow for certain types of decoherence and still have a fully coherent global state sufficiently often to allow for the emergent abstraction of the Cloud layer.
(edited)
# 2025-04-01 11:07 gav: you can make a system go fast, go big, or stay fully coherent but not all of them.
# 2025-04-01 11:10 gav: so, if you keep a system small (in order to keep it fast and coherent we might presume), you'll not be able to decentralise nor will you be able to scale out.
(edited)
# 2025-04-01 11:11 gav: i'd argue that by introducing such decoherence you do not have a fully coherent state.
# 2025-04-01 11:13 gav: however there may be ways to make the system apparently coherent, or dynamically rebalance the speed and/or coherence in order to optimise all three at any given time.
# 2025-04-01 11:35 ascriv: > <@gav:polkadot.io> or speed
size coherence = 1
Or size*coherence ~ speed, with more speed of info travel you can get bigger or more coherent, no?
# 2025-04-01 11:37 gav: there's the physical limit of speed (speed of light), and the overall speed of the system (one over time to causal resolution)
# 2025-04-01 11:38 gav: it probably isn't sensible to call both things "speed".
(edited)
# 2025-04-01 11:41 gav: the speed of light determines the upper limit of causal resolution - no system could ever process causal interactions faster than this. but it doesn't account for keeping a complex and arbitrary system in coherence. as coherent systems become bigger and more complex, the speed of their overall causality diverges from this universal physical limit.
(edited)
# 2025-04-01 11:45 gav: at a basic level, as a coherent system grows, even if all of its internal causality happened at the speed of light, it would still take longer to step through its state transitions because it would take light longer to get from the corners of the system to interact and resolve.
# 2025-04-01 11:46 gav: so the system - in terms of state transitions per second - would be slower.
# 2025-04-01 11:47 gav: this is compounded by complexity, meaning that internal causal entanglements probably resolve slower as the system grows more complex and arbitrary.
# 2025-04-01 11:48 gav: of course our systems have a long way to go before the speed of light becomes too important. but still, the principle can serve us well now.
# 2025-04-01 11:58 gav: basically T = ZX/C
where:
- T is the time to causal resolution (s; the inverse of the system's operating speed),
- Z is the size of the system (m; the diameter of the system's bounding sphere),
- X is the complexity factor of the system (no units, but a factor of at least 1 which describes the number of times light must travel back across the diameter of the bounding sphere in order to guarantee a causal resolution of state-transition),
- C is the speed of light
(edited)
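A back-of-the-envelope illustration of T = ZX/C (illustrative numbers only, not protocol parameters): a fully coherent Earth-spanning system with a trivial complexity factor of 1 is bounded at roughly 23 causally-resolved state transitions per second.

```python
C = 299_792_458.0  # speed of light, m/s

def causal_resolution_time(Z: float, X: float) -> float:
    """T = Z*X/C: lower bound on the time for one coherent state
    transition of a system with bounding-sphere diameter Z (metres)
    and complexity factor X (>= 1)."""
    return Z * X / C

# Earth's diameter as the bounding sphere, idealised complexity X = 1.
T = causal_resolution_time(1.2742e7, 1.0)
print(f"T = {T * 1000:.1f} ms, at most {1 / T:.1f} transitions/s")
```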
# 2025-04-01 12:06 gav: a totally trivial system would be a single laser switch in a vacuum with one light emitter transmitting a light signal to some light receiver. in this case Z would be the distance between the emitter and receiver, X would be close to one and T would therefore amount to the time it took light to travel between them.
# 2025-04-01 12:07 gav: as we introduce the capability of data processing, X increases since the round trip of light is much higher as it passes through more gates and is routed around; and as we introduce state (whether intra-transition or inter-transition), Z increases as we need to cover a greater space to hold more information (also a fundamental physical principle as well as intuitively correct).
(edited)
# 2025-04-01 12:27 ascriv: That seems like a good model. as a cool aside maximum info scales with the surface area (not volume) of the bounding sphere, given by the bekenstein bound which black holes are believed to saturate
(edited)
# 2025-04-01 12:59 emielsebastiaan: My team and I have put some thought into this. I’ll try to digest it into something presentable for our little Lisbon meetup.
# 2025-04-01 18:09 dakkk: > <@boymaas:matrix.org> Have we considered, as a thought experiment,
https://en.wikipedia.org/wiki/Quantum_entanglement as a means to get instant coherence in distributed systems? Going beyond the speed of light ... 😃
You can't communicate any information faster than the speed of light; quantum entanglement doesn't do that
# 2025-04-01 18:13 boymaas: Too bad, reading it now indeed, would have been an interesting case.
# 2025-04-01 18:42 jay_ztc: is sbrk here to stay? I noticed its being used in the accumulate testvectors.
# 2025-04-02 08:26 xlchen: minimum. you are not wrong if retained for one more day, but not the case otherwise
# 2025-04-04 16:16 yuchun: Hey there,
I have a question regarding the available work-reports.
The **W** available work-reports (defined in equation (11.16)) are extracted from rhoDagger using the core index. As I understand it, each core should correspond to only one work-report, is that correct?
However, I'm a bit confused about equation (13.10). It sums the work-reports for a specific core from the set of available work-reports. Does this imply that the same core might appear multiple times in the available work-reports?
Please feel free to let me know if I've misunderstood anything.
Thanks
# 2025-04-04 18:49 gav: (In the case of 13.10, sum was simply to ensure that we get zero if the core is empty)
# 2025-04-08 20:08 prematurata: I have a question: technically speaking, is there something preventing the same service from being executed multiple times in the same block?
(edited)
# 2025-04-08 20:17 gav: And it can happen due to queuing and earlier work tranches using up all their allotted gas.
# 2025-04-14 17:06 charliewinston14: Maybe I missed it, but is there anything in the GP that mentions how to validate a justification received via CE 137? I can generate one using the trace function but am not sure how a receiving node verifies the shard is correct using it.
# 2025-04-14 17:12 dave: In general the GP doesn't say how to implement anything, it just defines the required behaviour. The justification is a Merkle proof and can be verified in the usual way. There are lots of blogposts explaining the concept that you can find by googling "merkle proof"
# 2025-04-15 14:28 charliewinston14: Ok that helped, I have the general idea now. One question to everyone: given each value in a justification, how do you know if it represents a left or right node, to be able to calculate the right hash?
# 2025-04-15 15:15 dave: You can determine the "path" (series of lefts/rights) from the shard index
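A generic sketch of what dave describes, assuming a simple power-of-two binary tree with SHA-256 (the GP's justifications use its own well-balanced Merkle functions and hash, so treat this purely as an illustration of deriving the left/right path from the index bits):

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def verify_proof(leaf: bytes, proof: list[bytes], index: int, root: bytes) -> bool:
    """Walk from leaf to root. At each level, the next bit of `index`
    says whether our node is the left (0) or right (1) child, and thus
    which side the sibling hash is concatenated on."""
    node = h(leaf)
    for sibling in proof:  # siblings ordered bottom-up
        node = h(sibling + node) if index & 1 else h(node + sibling)
        index >>= 1
    return node == root
```

So the verifier never needs left/right flags in the justification itself: the shard index fully determines the path.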
# 2025-04-15 15:13 greywolve: I think some variant of this question has been asked but I didn't see a clear answer.
Assuming I want to submit a ticket extrinsic that will be included in the first block of a new epoch. At that point I have the last state of the previous epoch.
1. Do I use eta\_1 from the old state for the ring signature context? Given this will become eta\_2 in the next epoch when the ticket will be verified.
2. What about the y\_z, I only have access to the old y\_z. I don't see a way to compute the new one yet. Do I just sign with the old y\_z (and y_k) and hope it doesn't change?
3. When importing this first block in the epoch, would I use the prior y\_z or new y\_z to verify tickets?
6.29 seems to indicate the prior. (edited)
# 2025-04-16 07:52 gav: 1. Yes.
2. I think this is a typo (@davide - wdyt?) - it should read γ'z, not γz. I'll change this.
3. You'll need to compute it - you have all the information necessary from the prior state.
(edited)
# 2025-04-18 15:05 jay_ztc: Just to confirm my understanding- the PVM entry pc is NOT required to be the start of a basic block, correct? Jan Bujak
# 2025-04-18 15:11 jan: Currently, yes (because we need the ability to resume execution after a page fault).
(edited)
# 2025-04-18 15:17 jimboj21: I would think the non-zero cases would have to be jumps right?
# 2025-04-18 15:20 jan: I suppose technically the outer PVM's entry points could end up in the middle of a basic block if you'd build a particularly cursed blob.
# 2025-04-18 15:21 jay_ztc: No worries, thanks for clarifying. It sounds like it's acceptable for an outer entrypoint to be into the middle of a basic block. My apologies for @ing after hours.
# 2025-04-18 15:23 jan: I have considered disallowing such entry points in the past (basically allow only start-of-basic-block entry points and those needed for hostcall + page fault resumption), but we haven't made such change to the GP yet.
# 2025-04-18 15:24 jay_ztc: That's how I approached it in my reasoning as well. Do you think this change is likely? No worries if it's too early to tell - just curious. Might end up tabling this in my impl.
(edited)
# 2025-04-18 15:30 jan: Can't give you a 100% answer at this point, but it is possible. In general the plan for the final gas cost model is to charge gas only at the beginning of the basic blocks (because charging per instruction is very inefficient), so now if you allow entry points anywhere this potentially complicates the gas metering implementation. Unfortunately disallowing them doesn't necessarily alleviate the problem because you still need to allow entry after page faults, and since memory accesses are so common we don't want to make memory access instructions into basic block terminators.
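The per-block charging idea jan mentions can be sketched as follows; the block structure and per-instruction costs here are invented for illustration and are not the GP's gas model:

```python
def charge_blocks(blocks, gas):
    """blocks: a list of basic blocks, each a list of per-instruction gas costs.
    Charge each block's total once on entry instead of decrementing per
    instruction. Returns (remaining_gas, completed)."""
    for costs in blocks:
        total = sum(costs)  # in a real implementation this sum is precomputed
        if gas < total:
            return gas, False  # out of gas at block entry; block doesn't run
        gas -= total
    return gas, True

# Two blocks costing 3 and 3 octets of gas: 10 gas is plenty.
assert charge_blocks([[1, 2], [3]], 10) == (4, True)
```

The complication described above is visible here: an entry point landing mid-block would need a partial-block cost, which is exactly what charging only at block boundaries tries to avoid.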
# 2025-04-18 15:35 jay_ztc: Thanks for clarifying, this is really helpful. I don't have enough context yet to form an opinion on the tradeoffs between allowing inner-instances being able to resume right after pageFault vs requiring them to restart/rollback to the start of a basic block.
(edited)
# 2025-04-18 15:37 jan: Although we need some limitation when it comes to entry points, because without any restrictions you could e.g. jump into the middle of an instruction, and depending on the particular bytes used that might actually be a valid instruction. Supporting this in practice would be a nightmare in any implementation that isn't a naive slow interpreter, so at the very least jumping into the middle of instructions is something that we definitely do not want to support.
# 2025-04-18 15:38 jay_ztc: *context as far as the programs/usage of programs on top of JAM
(edited)
# 2025-04-18 15:39 jan: Unfortunately it's impossible to require a restart at the start of a basic block as that'd screw up the program state.
# 2025-04-18 15:39 jan: The already executed part of the basic block might have modified memory or registers in a way that is irreversible.
# 2025-04-18 15:40 jan: So a rollback is not possible without taking a snapshot at the start of every basic block which might page fault, and we do not want that as it'd be abysmally slow.
# 2025-04-18 15:40 jay_ztc: the outer pvm instance would have access to gas host function and could potentially use its own memory as a backup before any unsafe calls right? Although, this gets way complicated fast
# 2025-04-18 15:42 jay_ztc: But at the least, the outer pvm could do gas check to determine if it wants to 'risk' another inner invocation right? Although being turing complete makes this pretty hard- unless there's some well-defined gas-estimate API contract between nested invocations
# 2025-04-18 15:43 jan: That's infeasible; if you could do that you'd become a very rich person as then it'd mean you solved the halting problem. :P
# 2025-04-18 15:44 jay_ztc: good point lol, suppose that api idea would be practically useless due to such a small scope of applicability...
# 2025-04-18 15:46 jay_ztc: This discussion has provided a lot of clarity, and given me a few things to think about... Many thanks for your time Jan Bujak 🙏
# 2025-04-18 17:07 jimboj21: Given the return signature: When 12.17 is called shouldn't the assignment order be o*, t*, b*, u* ?
# 2025-04-19 16:18 ycc3741: I just want to confirm something about the STF. Should the disputes transition (updating ψ′) happen before Safrole? Because based on what I see in this link, updating gamma_k' requires ψ′_o.
# 2025-04-19 17:23 gav: > <@jimboj21:matrix.org> Given the return signature: When 12.17 is called shouldnt the assignment order be o*, t*, b*, u* ?
Yes. I think this is already fixed in main.
# 2025-04-19 17:26 gav: Yes pretty much. Order is technically an implementation detail - some languages don’t have the concept of ordering - so I’m not going to tell you any order per se, but your reading is correct - key rotation is dependent on disputes.
(edited)
# 2025-04-20 14:21 gav: Yes your reading is correct. It is in order to get as much information as possible in statistics, not just reported stuff (which represents the most recent computation work done on cores by guarantors) but also what data - required by each core - has recently been made available through assurers
# 2025-04-21 21:25 ascriv: I think the GP is not clear on how to serialize elements of N? And therefore, e.g. state serialization (D.2) is not clear on how to serialize the validator statistics component C, which has components in N
# 2025-04-21 21:53 ascriv: Makes sense, wanted to make sure we’re intentionally using the general natural serialization here instead of accidentally not specifying the subscript. Since usually we only use the general one for the length discriminator
# 2025-04-21 21:55 ascriv: It’s just not very consistent. Also in C(13) we are clear to use E_4 for pi_V and pi_L which have components in bold N
# 2025-04-21 21:55 danicuki: > <@ascriv:matrix.org> Makes sense, wanted to make sure we’re intentionally using the general natural serialization here instead of accidentally not specifying the subscript. Since usually we only use the general one for the length discriminator
Yes. It is used for the first time in latest version of GP
# 2025-04-21 21:56 danicuki: It is used to save storage space, as the statistics numbers can grow indefinitely (or not).
(edited)
# 2025-04-21 21:58 ascriv: But couldn’t pi_V and _L also? Yet we implicitly limit them based on the usage of E_4 serializing them
# 2025-04-22 11:25 gav: Yes, the encoding of integers is not presently entirely uniform. Some are encoded for size savings (statistics, where there's relatively a lot of data in a place where bandwidth is very tight), others for the ability to swiftly/efficiently decode (e.g. in PVM I/O). It will be reviewed during the 0.6 series under this issue
https://github.com/gavofyork/graypaper/issues/293.
# 2025-04-22 11:26 gav: The encoding used for pi\_V and pi\_L might yet be changed, or possibly pi\_C/pi\_S.
(edited)
# 2025-04-22 11:29 gav: It's mostly corrections, with two small protocol alterations:
- There's now a gas limit in the accumulation operand tuple.
- There's a new host-call to allow services to provide preimages to other services directly without going through the regular off-chain preimage process.
# 2025-04-22 12:34 jay_ztc: Got a few small questions about PVM edge cases, to confirm my understanding->
1. If c\[0\] isn't a valid opcode, this would result in a panic, _if and only if_ the program attempts to execute c\[0\], correct?
2. If an instruction (this time corresponding to an index from the bitmask, k) contains an invalid opcode, this would result in a panic, _if and only if_ the program attempts to execute it, correct?
3. if the skip length function doesn't find any bitmask-marked opcodes within the subsequent 24 octets, this bumps the pc to current+24. So if there happened to be a valid opcode & args at that new pc, it would continue executing at the new pc in the same manner as if it were marked by the instruction bitmask, correct?
Jan Bujak
(edited)
# 2025-04-22 13:00 jan: Yes, currently "invalid" instructions don't make the program invalid, and only have an effect if executed (they're effectively treated as a trap). Yes, if there's no 1 found in the bitmask then the skip is assumed to be 24. Note that IIRC currently even if the next instruction has 0 in its opcode bitmask bit (i.e. the next bit after the last bit that the bitmask scan checks) the instruction will also be executed, but this is not intended behavior and it is on my TODO list to add to the GP that every instruction must have a 1 in the bitmask to be considered valid (otherwise we'll run into some nasty corner cases).
(edited)
# 2025-04-22 13:04 jay_ztc: Many thanks for the quick response 🙏, your insight on the TODO is much appreciated. enjoy your evening.
# 2025-04-22 13:08 jan: Also, while we're at the topic parsing, (this is going to be relevant to people aiming for M3 and M4) in case you're wondering why the limit is 24 - it was deliberately picked to allow for fast parsing. To parse a PVM instruction at a given position you need to read only two values from memory: a 128-bit integer from the instructions slice, and a 32-bit integer from the bitmask slice, and then you can easily parse it in an efficient manner with bitshifts etc. (for the bitmask one can use the "leading zeros" intrinsic/method which is a single assembly instruction on modern CPUs to cheaply get the skip to get to the next instruction, hence the maximum is 24)
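The skip computation jan describes might look like the sketch below (hedged: the intrinsic he mentions counts zeros in a register holding the bitmask window to find the next set bit; here arbitrary-precision integers stand in for the 32-bit read, and the exact bitmask semantics are defined by the GP, not by this code):

```python
def skip(bitmask, pc):
    """bitmask: an int whose bit i is set iff an instruction starts at octet i.
    Returns the number of argument octets after pc, i.e. the distance from
    pc+1 to the next marked octet, capped at 24 as in the GP."""
    window = (bitmask >> (pc + 1)) & 0xFFFFFFFF  # the 32-bit bitmask read
    if window == 0:
        return 24
    # (window & -window) isolates the lowest set bit; its position is the
    # count of trailing zeros, i.e. the offset of the next instruction.
    trailing_zeros = (window & -window).bit_length() - 1
    return min(trailing_zeros, 24)

# Instructions at octets 0 and 3: the opcode at 0 has two argument octets.
assert skip(0b1001, 0) == 2
```

With a 128-bit read of the instruction bytes alongside this, a whole instruction can be decoded from just two memory loads, which is the point of the 24-octet cap.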
# 2025-04-22 13:10 jay_ztc: corner case of the corner case might be a 'greater than 24 octet' break in instructions after a valid basic block termination instruction-> (ie is the subsequent 'valid' instruction also a 'valid' jump target, or is the jump target the 'invalid' 24th octet)
edit: 'program termination' -> 'basic block' (brain fart)
(edited)
# 2025-04-22 13:12 jan: IIRC the jumps actually check that the previous instruction has the 1 set in its bitmask (because you can only jump either to offset=0 or after block terminators)
# 2025-04-22 13:14 jan: So with the current way things are you might get a paradoxical block terminator that would execute when you arrive at it from the previous instruction, but the next instruction wouldn't be a valid target for a jump.
(edited)
# 2025-04-22 13:14 jan: (again, I want to require all instructions to have 1 set in its bitmask to prevent such potential corner cases)
# 2025-04-22 13:20 jay_ztc: jumps target indexes in w (basic block index set), which is calculated by applying the skip-length function to each opcode index in the k bitmask\*-> so this would result in a jump target of the 24th octet (our 'invalid' instruction).
edit: *additionally, k always includes offset=0 (c[0])
(edited)
# 2025-04-22 13:30 jay_ztc: actually, looks like both branch & jump use the 'beginning of basic blocks' collection, w (which itself uses the skip-length).
# 2025-04-22 13:37 jay_ztc: Didn't intend for this chat to continue into your off-hours, my apologies. Don't worry about looking at this tonight especially given its a non-urgent corner case.
# 2025-04-22 14:32 knight1205: Here:
https://graypaper.fluffylabs.dev/#/68eaa1f/1bcb011b1702?v=0.6.4
In Eqn 14.14 and 14.15 we are using s (now b), which is defined as all the segments exported by all work packages,
and then we are calculating its merkle root and comparing it with the segment root of some previously exported segments of any work package.
Is this what it implies? As segments exported from a work package lead to different segment roots, in that case how can it be the same as the segment root of all the segments (from all wps)?
(edited)
# 2025-04-23 16:50 dave: In S and J, M(s) = L(r) means s is the sequence of segments with the specified root
# 2025-04-23 16:53 knight1205: but it is specified as all the segments exported by all the work packages exporting a segment. is this right? or this?
> s is the sequence of segments with the specified root
# 2025-04-23 16:54 knight1205:
Note that while S and J are both formulated using the
inner term b (all segments exported by all work-packages
exporting a segment to be imported)
# 2025-04-23 16:59 dave: Hmm not sure about the wording of that sentence. The point there is that if a work-package A is importing a single segment exported by work-package B, you do not need to fetch _all_ of the segments exported by B
# 2025-04-23 17:00 dave: You only need to fetch the segment you care about, plus the corresponding proof segment
# 2025-04-23 17:00 knight1205: got it. thanks for the clarification. Though I am still not sure about the wordings.
# 2025-04-23 19:42 mkchung: For CE140 justification: j++[b]++T(s,i,H), what exactly does the s in T(s, i, H) denote? Is it really the raw exported segment shards for the entire work-package at a given shard index? Without hashing the s term, the co-path in T(s, i, H) would grow linearly as the number of exports increases & each DA would need to store its sibling's entire "raw exported segment shards" as part of the co-path proof?
# 2025-04-23 19:44 mkchung: Let's say there are 100 exported segments for a given workpackage, can you provide an estimated justification size for requesting just one segment? I'm estimating the justification to be somewhere around 66~410435 byte in tiny setting(where validator=6, W_P=1026) and around 297~3866 bytes in full setting(where validator=1023, W_P=6). Does my estimate look reasonable to you?
# 2025-04-23 20:33 jaymansfield: "s is the full sequence of segment shards with the given shard index."
# 2025-04-23 20:33 jaymansfield: The CE140 justification basically allows you to first calculate the segment root for a given segment index, which is then used to validate the erasure root
(edited)
# 2025-04-23 22:02 mkchung: That's for the recipient/consumer of a CE140 justification. But the burden of producing such a proof seems to fall on DA?
If you were the "generator" of a CE140 justification (i.e. someone requested the justification from you), wouldn't you also need to fetch the raw "s" from your siblings just to provide a co-path to the segment shard that you are responsible for storing?
# 2025-04-23 22:03 mkchung: I think I'm confused as to why "S" is not being hashed before building the segments root
# 2025-04-24 16:52 jaymansfield: No you only need to know what was returned from CE137 originally. You are providing a path to a root for a list of shards with an index, not for all shards of a given segment. Refer back to s\_clubs in your availability specifier.
(edited)
# 2025-04-28 15:46 dave: Sorry, missed these messages earlier. s is the full sequence of segment shards with the given shard index, yes. These are all provided in CE 137 (the [Segment Shard]), you should not need to request data from another node to handle a CE 140 request.
(edited)
# 2025-04-28 15:47 dave: FWIW the segments root is _not_ involved here, the justifications returned by CE 140 are justifications from the returned segment shards to the erasure root
# 2025-05-09 10:35 knight1205: Hi David Emett ,
Sorry to disturb you again.
Here, in the S function we are fetching all the segments whose root / wp hash is present in imports. Is this function returning segments grouped by segment root, or simply a list of segments?
Because the current definition implies it's just a list of segments, however in the Refine invocation we are passing a list of lists of segments here.
If the latter is true, then I think the equation itself should be changed for more clarity.
Please correct me if I understand this wrong.
# 2025-05-09 11:09 dave: Bold b is the complete list of segments with the given root, or exported by the work-package with the given hash
# 2025-05-09 11:32 knight1205: that I got, but after that we are picking the nth segment from b and building a list, whereas in psi_R a list of lists of segments is expected.
# 2025-05-09 12:05 dave: Ah I see what you mean now, sorry. The refine call for each work item has access to the imports for _all_ work items. So the list-of-list-of-segments is [S(w) for w in p_w]
# 2025-05-09 12:06 dave: I believe there is an error in the definition of I(p,j); this says S(w) is passed in rather than the above
# 2025-04-22 18:33 sourabhniyogi: We have found our development life has improved with the "metadata" attached to preimages, specifically when the preimage is for service code, because we can then have tools show that metadata as strings like "fib" and "gameoflife". Can we get the same for workpackages+bundles so that we can attach metadata like "fib(93)", "gameoflife(314)", or should that be in the payload as a mere convention?
# 2025-04-22 19:56 sourabhniyogi: Another quick request: Would you mind assigning provide a number like 17 or 27, and perhaps starting historical\_lookup at 32 or 64 or 128? That would leave a bit of space between the two big groups of host functions for "one more".
(edited)
# 2025-04-23 13:24 jay_ztc: Jan Bujak: I found this very helpful post of yours in a thread from last September. Do you think holding off on implementing sbrk is still relevant advice? Do you have any new thoughts since this message was posted that you can share?
# 2025-04-23 13:27 jay_ztc: reposting below for context::
https://matrix.to/#/!ddsEwXlCWnreEGuqXZ:polkadot.io/$\_RkIlMDNZrROw\_6WDXpbllO2VSbjY1FNTIfDjVZhhdw?via=polkadot.io&via=matrix.org&via=parity.io
Memory allocation/deallocation handling is still a work-in-progress, and it's possible the sbrk instruction will get modified and/or removed. I'd suggest you temporarily skip it and focus on other parts of JAM and/or PVM.
If you're interested in some history as to why sbrk is there then let me give you some background.
Historically I designed PolkaVM (on which the PVM in the GP is based) to be a VM which is as "powerful" as WASM VMs (so it can completely replace our current WASM-based executor in Polkadot 1.0 and our WASM-based smart contracts VM) while being as simple as possible to implement, and without sacrificing any performance.
So this is where the idea for the sbrk came from (which is similar to what WASM has): the VM maintains a heap pointer, and the guest program can use sbrk to query that pointer and/or to bump it up. And every time it crosses a page boundary the VM allocates new memory for the program.
So this design has numerous benefits. First, it's very simple to use as a guest program (pseudo code):
// Get a pointer to the new allocation.
let pointer = sbrk(0);
// Actually allocate it.
if sbrk(size) != 0 {
    // Allocation succeeded.
    // Now `pointer` points to `size` bytes you can use.
}
This is also great for use cases like e.g. tiny smart contracts which can use this directly as an allocator without having to bring a heavyweight allocator of their own (which would consume a lot of space).
Secondly, it's simple to implement in the VM, something like that (pseudo code again):
fn sbrk(size) -> Pointer {
    if size == 0 {
        // The guest wants to know the current heap pointer.
        return current_heap_pointer;
    }
    // The guest wants to allocate.
    let new_heap_pointer = current_heap_pointer + size;
    if new_heap_pointer > max_heap_pointer {
        // Allocation failed.
        return 0;
    }
    let next_page_boundary = align_to_page_size(current_heap_pointer);
    if new_heap_pointer > next_page_boundary {
        allocate_new_pages(next_page_boundary..align_to_page_size(new_heap_pointer));
    }
    current_heap_pointer += size;
    return current_heap_pointer;
}
And this (along with the memory map I came up with, which is what we now call "standard program initialization") also makes it very easy to write an interpreter for this, because when handling loads/stores from memory you only have to do something like this:
fn load_value32(address) -> value {
    if address >= stack_address && address + 4 <= stack_address_end {
        return stack[address - stack_address];
    } else if address >= rw_data_address && address + 4 <= align_to_page_size(current_heap_pointer) {
        return rw_data[address - rw_data_address];
    } else if address >= ro_data_address && address + 4 <= ro_data_address_end {
        return ro_data[address - ro_data_address];
    } else {
        // Address is inaccessible.
        return Err;
    }
}
It's cheap, fast, and doesn't require any crazy data structures and doesn't require any handling of corner cases (for example, accesses which could read both from the stack and from RW data don't have to be handled, because they're impossible by definition; the interpreter can just keep them in separate arrays, and call it a day).
So that's how (and why) it was originally designed, but then came JAM and changed things. (: (Again, remember, I started working on this before JAM, and some things were just grandfathered into JAM.)
What JAM introduces is a concept of inner VMs (see machine, peek, poke, invoke and expunge host functions in section B.8 of the GP) where one VM can spawn another VM, and as it is currently designed those inner VMs are extremely flexible and have completely free-form memories and are dynamically paged.
What this essentially means is that all of those nice properties of sbrk that I've listed - simple and easy to implement, fast, doesn't require fancy data structures - they all now go out of the window!
So we will probably be replacing sbrk with something else that's more appropriate for the more flexible inner VM model. And unfortunately also most likely orders of magnitude harder to implement (at least if you want to reach at least the half-speed milestone), but it is what it is. I'm still finishing some other stuff up, but I'll most likely be working on this soon-ish. (If any of you have any good and/or crazy ideas feel free to message me!)
(edited)
# 2025-04-23 15:22 celadari: Hello,
I have a question regarding the eject host function, specifically about the condition described here:
https://graypaper.fluffylabs.dev/#/68eaa1f/328e03329103?v=0.6.4.
I want to confirm my understanding:
Does it mean that in order for a caller s to successfully eject account d, the codeHash of account d must be equal to the (hash) service index of the caller s ?
Thanks in advance for the clarification!
(edited)
# 2025-04-24 13:47 haikoschol: sourabhniyogi: was that reaction a "yes" or a "i'm wondering about that too" or a "yay, equations!!1!"? 🤔
# 2025-04-24 15:12 haikoschol: KwickBit - Charles-Edouard LADARI: it's not the hash though, it's the (32 octet) encoding of the service index, right?
# 2025-04-24 15:14 celadari: By 32 octets, do you mean we add 0s before the last 4 octets of the service index? 🤔
# 2025-04-24 15:45 haikoschol: it won't incidentally turn out to be the hash of some code, so the code hash field needs to have been set to this value with upgrade beforehand, I reckon
# 2025-04-23 18:58 charliewinston14: Hello. Two questions about availability assurances.
1. Should these be sent out every slot even if the bitfield didn’t change from the previous distribution?
2. For an assurer to say a core is available in their assurance bitfield, does that mean they have access to just the shards matching their validator index, or are they saying they have access to ALL shards for a given core?
(edited)
# 2025-04-23 19:02 sourabhniyogi: 1. Yes, it's critical that they are, since assurances are anchored to the immediate parent header hash. The only exception is if the entire bitfield is zero, in which case I don't believe there is a point to submitting an assurance unless some reward exists for liveness when all cores are idle (which would be sad ... but possible!)
2. Just their shards matching their validator index. If they had access to ALL the shards, that would be too much in the large! But of course they can do whatever they want, and they might be one of the guarantors.
(edited)
# 2025-04-23 19:04 vinsystems: Hi, question about the accumulate transfer = 11 function.
When Service A sends an amount X to Service B, a deferred transfer (i.e. s: A, d: B, a: X, m: ..., g: ...) is created and the amount to be sent is subtracted from Service A's balance.
But I cannot find in the GP where it says to update the balance of the receiver (service B).
I thought that it was done when all the deferred effects of the transfers are applied, but the on_transfer functions only modify the subject's account storage.
(edited)
# 2025-04-24 04:40 jan: We are going to remove the sbrk instruction and replace it with a hostcall soon.
# 2025-04-24 17:28 ascriv: How is the serialization of the storage dictionary (a_s) of a service account in (D.2) not lossy? The key (I thought) is not the hash of the value unlike the preimage lookup. I’m definitely missing something or we are losing the last 4 bytes of the keys
(edited)
# 2025-04-24 17:53 prasad-kumkar: I think the point is we don’t need to store the full a_s keys, they’re only used to look up values when the key is already known
# 2025-04-24 18:30 erin: hello all, I've created hosted archives of the JAM and graypaper chats with plaintext versions also available at
https://paritytech.github.io/matrix-archiver/
there are a few quality of life improvements still to be done but any feedback or comments are welcome if you find this useful. This is also now linked on the jamcha.in site.
These are updated daily at ~3am UTC.
Other JAM-related channels are welcome to be archived - they need to be unencrypted and world-readable (history available to "Anyone"). Please open an issue here with the internal room ID if you wish to archive a channel.
(edited)
# 2025-04-24 18:41 ascriv: > <@prasad-kumkar:matrix.org> I think the point is we don’t need to store the full a_s keys, they’re only used to look up values when the key is already known
I don’t completely follow, should we just be comparing first 28 bytes when doing lookups? What about when two keys only differ in the last 4 bytes?
# 2025-04-24 18:46 tomusdrw: erin: would you consider adding <a name="{element-msg-id}"></a> to the <time> in html mode? That would allow linking to a specific message, which I think would be pretty cool.
(edited)
# 2025-04-24 18:47 erin: others may include "jump to bottom" and perhaps pagination, though i kinda like the big raw log style (feedback again welcome here).
# 2025-04-24 18:54 tomusdrw: afaict it's 220kB gzipped for 1y+ worth of content. My guess is that adding JS pagination to this with a modern framework would start to pay off only after 2 more years :D
(edited)
# 2025-04-24 18:55 erin: it's just static html generated by a python script at the moment
# 2025-04-25 13:49 sourabhniyogi: Cool, if we had discord rooms with content, should we put them in this repo so it's not a matrix-archiver but a gp-archive?
# 2025-04-25 16:04 erin: there should be an automated script to grab the content daily unless they're dead/archived channels. the script/rendering right now is very matrix-specific. it would be best to have a matrix bridge or something regardless
# 2025-04-24 18:54 ascriv: > <@dakkk:matrix.org> keys are hash, a collision is very unlikely
So state serialization being lossless isn’t important, it just needs to be extremely unlikely to collide
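That intuition can be checked with a back-of-the-envelope union bound (a rough estimate, not anything specified in the GP): truncating 32-octet hash keys to 28 octets leaves 224 bits, so among n uniformly random keys the probability of any collision is at most C(n,2)/2^224.

```python
from math import comb

def collision_bound(n, bits):
    """Union-bound estimate of any collision among n uniform `bits`-bit keys."""
    return comb(n, 2) / 2 ** bits

# Even a trillion storage keys leave the bound astronomically small.
assert collision_bound(10**12, 224) < 1e-40
```

So losing 4 octets of each key costs essentially nothing as long as the keys really are (outputs of) a cryptographic hash.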
# 2025-04-25 13:02 charliewinston14: Hi, I have a timing question. Segments are kept in the DA for 28 days, but it seems SegmentRootLookupItems in work package bundles must be recent (exist in beta or extrinsic). Does that mean we store segments for 28 days, but they can only be used as import segments for 8 days?
# 2025-04-25 13:03 dave: The segment root lookup stuff is for when you import using a work-package hash
# 2025-04-25 13:03 dave: This is only supported for recent work-packages, for older packages you should use the segment-root directly
# 2025-04-25 13:08 charliewinston14: Thank you. Didn't realize an ImportSpec could be an export root OR a work package hash.
# 2025-04-25 16:02 knight1205: Does anyone know how the shard index is determined from the validator index? Like which particular validator requests which particular shard index, as in CE 137?
Here in the GP it is just mentioned that:
> chunks are distributed to each validator whose keys are together with similarly corresponding chunks for imported, extrinsic and exported segments data, such that
> each validator can justify completeness according to the
> work-report’s erasure-root.
I am not sure what it is trying to imply.
# 2025-04-26 17:40 ascriv: For the function definition (D.4), it seems like k could have (much) fewer than 248 bits, e.g. in the case where there are many elements of the preimage lookup for any service account. Since in this case the keys for all of these in the serialized state will have the same first 8 bytes, so by the time we are computing the leaf value for one of these keys we will have cleaved off at least 64 bits, leaving fewer than 192 bits for k, if I understand correctly
# 2025-04-26 17:40 ascriv: Should we be padding with zeroes in such cases to compute bits(k)…248?
# 2025-04-26 17:47 ascriv: Ah, nevermind. This is why we also have the key in the rhs of the map, so we remember the original non-cleaved key
# 2025-04-28 09:31 gav: The state serialisation for the latter two components is the general (variable size integer) encoding.
(edited)
# 2025-04-28 16:07 prasad-kumkar: Should argument a from the argument invocation function be encoded with p before being passed to the program initialization function, given that it is decoded from p as described in A.37? As noted:
> Given some p which is appropriately encoded together with some argument data a, we can define program code c, registers ω and ram μ through the standard initialization decoder function Y
(edited)
# 2025-04-28 19:14 ascriv: Should we expect to be able to handle bad blocks for M1? Or are all blocks presumed valid?
# 2025-04-28 21:07 xlchen: IMO M1 is just the STF: state in, new state out. There is no concept of a blockchain to be considered
# 2025-04-28 21:10 xlchen: so you should detect if a block is bad (purely using the provided input state), but no need to care about forks because it is stateless
# 2025-04-28 21:09 davxy: > <@ascriv:matrix.org> Should we expect to be able to handle bad blocks for M1? Or are all blocks presumed valid?
Be prepared for bad blocks. A lot of bad blocks :)
# 2025-04-30 02:45 ascriv: I have a question also about state serialization:
For pending reports serialization, is it the case that the segment count in (11.5) should be serialized using the general natural number encoding? In fact, since E is not subscripted in C(10), shouldn’t all naturals in the work report be encoded in the general way?
# 2025-04-30 02:51 ascriv: But, if we’re going with naturals that are subscripted (like gas values) should always be encoded in the non general way (that seems to be convention?), does that mean the gas values in core and service statistics should not be generally encoded?
(edited)
# 2025-04-30 10:01 jimboj21: I see in 12.20 that P is a return value. From what I can see it is not defined here but is returned by the accumulation invokable PVM instance. However it does not appear that the psi_a function signature has been updated to reflect this
# 2025-04-30 12:58 erin: It's now fixed - sorry about that. Looks like the archival process timed out, could have been due to a slow github runner. I've increased the timeout to something quite high now so it should (hopefully) not happen again.
# 2025-04-30 13:15 jay_ztc: <del>Validators won't be able to get the preimage hash itself from the serialized state-> Instead they will need to refer to the block extrinsics right? (for any newly-requested, not yet provided preimage).</del>
What if instead of hashing, we stack the length octets on top of our preimage hash octets? Basically instead of concatenating bytes, we add each of the 4 length octets to each of the first 4 hash octets, and use the result as our state-key?
I'm thinking if we can avoid the hash operation while meeting the same requirements- we can lessen our perf overhead
edit: wording (duplication)
edit: strikethrough not-really-relevant edge case of validator requesting state from another validator for reconstruction
(edited)
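For what it's worth, the proposed scheme can be sketched like this (a hypothetical `proposed_key` in Python, assuming a 32-byte preimage hash and a 4-byte little-endian length). Note how easily two distinct (hash, length) pairs map to the same key:

```python
def proposed_key(preimage_hash: bytes, length: int) -> bytes:
    """Hypothetical state-key construction: add each of the 4 length
    octets to the first 4 hash octets (mod 256) instead of hashing
    the concatenation of hash and length."""
    assert len(preimage_hash) == 32
    len_octets = length.to_bytes(4, "little")
    head = bytes((h + l) % 256 for h, l in zip(preimage_hash[:4], len_octets))
    return head + preimage_hash[4:]

# Two different (hash, length) pairs collide trivially: bump the
# length by one, drop the hash's first octet by one.
h1 = bytes([5, 0, 0, 0]) + bytes(28)
h2 = bytes([4, 0, 0, 0]) + bytes(28)
assert proposed_key(h1, 1) == proposed_key(h2, 2)
```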
# 2025-04-30 13:16 jay_ztc: For context, this was the post from David Emett in Oct 2024:
2024-10-31 22:29 dave: The preimage hash is passed in to eg solicit directly, it's trivial to pass in almost-colliding hashes. As the trie key construction function doesn't preserve all hash bits, these almost-colliding hashes could actually collide in the trie if they are not hashed beforehand
# 2025-04-30 13:23 dave: > <@jay_ztc:matrix.org> Validators won't be able to get the preimage hash itself from the serialized state-> Instead they will need to refer to the block extrinsics right? (for any newly-requested, not yet provided preimage).
>
> What if instead of hashing, we stack the length octets on top of our preimage hash octets? Basically instead of concatenating bytes, we add each of the 4 length octets to each of the first 4 hash octets, and use the result as our state-key?
>
> I'm thinking if we can avoid the hash operation while meeting the same requirements- we can lessen our perf overhead
>
> edit: wording (duplication)
Maybe I'm not understanding but it seems trivial to generate collisions with this scheme. The requested hash and length are almost unconstrained (the length is somewhat constrained by required deposit I think)
# 2025-04-30 14:20 luke_fishman: has anyone passes the latest accumulate test vector
same_code_different_services-1
in the
pre-state
there are 2 services: 1729, 1730
1730 has no "blob" i.e service code
in the
post-state
service 1730 is removed, only service 1729 remains
i am currently clueless as to how that might have happend
- it could not have ejected itself - since it has no code
- as far as i can tell, it is not removed from
d'
in
12.17
how else can it be removed?
davxy could you advise?
(edited)
# 2025-04-30 14:21 jimboj21: Luke | Jamixir: I am also stuck trying to get this working currently
# 2025-04-30 18:04 sourabhniyogi: The new 0.6.5 gas parameter x\_g in the operand is serialized in C.29 not with E\_8 but with C.6, unlike the others that use E\_8 (see C.23, C.26, C.28) -- is this a typo or intended? Why not make them consistent in one direction or the other?
(edited)
# 2025-04-30 19:28 gav: All these encodings will be revisited in due course before 0.7.0 (there’s an issue on the 0.6 milestone page) but until then assume your reading is correct.
# 2025-04-30 20:21 davxy: > <@luke_fishman:matrix.org> has anyone passed the latest accumulate test vector same_code_different_services-1?
> in the pre-state there are 2 services: 1729, 1730
> 1730 has no "blob" i.e. service code
> in the post-state service 1730 is removed, only service 1729 remains
> i am currently clueless as to how that might have happened
> - it could not have ejected itself - since it has no code
> - as far as i can tell, it is not removed from d' in 12.17
> how else can it be removed?
> davxy could you advise?
>
I'll have a look
# 2025-05-01 04:23 clw0908: Some questions about A.36, A.37, and A.43:
In A.43, **p** is the program blob of a service, and **a** is the serialized arguments from refine or accumulate (on-transfer).
So how can **p** include **a** in A.37?
Since A.36 only checks whether there exists a **c**, **o**, **w**, z, s (excluding **a**) that satisfies A.37, does this mean we don't have to care whether **a** exists in **p**?
# 2025-05-01 08:03 gav: Hmm. Looks like bold a getting added into that concatenation is a typo.
# 2025-05-01 08:06 prasad-kumkar: then shall **a** be passed as a function argument to Y?
# 2025-05-01 20:01 ascriv: For instruction 104, doesn’t this actually get the trailing (not leading) zeros as it’s defined? For example if wA is 0x1, B_8 will be 1 followed by 0s, and so the formula will evaluate to 0, when it should be 63
(edited)
# 2025-05-01 20:06 ascriv: We should be using the big endian representation of wA in 104-107 unless I’m mistaken
# 2025-05-01 22:48 ascriv: Made a pr www.github.com/gavofyork/graypaper/pull/357
(edited)
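The point about instruction 104 can be checked directly: scanning the little-endian serialization B_8 from its first bit gives the count of *trailing* zeros of the value, whereas the intended result is the leading-zero count over the big-endian bit string. A quick sketch, assuming 64-bit operands:

```python
def clz64(x: int) -> int:
    """Leading zeros of x viewed as a 64-bit big-endian bit string."""
    return 64 - x.bit_length()

def ctz64(x: int) -> int:
    """Trailing zeros of a 64-bit x: what scanning the little-endian
    serialization from its first bit effectively counts."""
    return (x & -x).bit_length() - 1 if x else 64

assert clz64(0x1) == 63   # what instruction 104 should return
assert ctz64(0x1) == 0    # what the formula as written yields
```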
# 2025-05-02 11:19 gav: segments are 4104 bytes and can be reconstructed by 342 validators each presenting their 12 byte shard (and its index)
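The arithmetic behind those numbers, as a back-of-envelope sketch (assuming the full 1023-validator set and a one-third recovery threshold; the tiny-preset figure below assumes the same ratio is kept):

```python
SEGMENT_SIZE = 4104   # bytes per segment
VALIDATORS = 1023     # full-size validator set
THRESHOLD = 342       # shards needed to reconstruct (just over 1/3)

shard_size = SEGMENT_SIZE // THRESHOLD
assert shard_size == 12                         # each validator holds 12 bytes
assert THRESHOLD * shard_size == SEGMENT_SIZE   # any 342 shards rebuild it

# every validator stores one shard, so total stored data is ~3x the segment:
assert VALIDATORS * shard_size == 12276

# tiny testnet (6 validators): the threshold scales to 2 validators,
# so each shard would be 4104 / 2 = 2052 bytes
assert SEGMENT_SIZE // 2 == 2052
```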
# 2025-05-02 12:29 danicuki: > <@gav:polkadot.io> segments are 4104 bytes and can be reconstructed by 342 validators each presenting their 12 byte
shard (and its index)
Thanks. Then in tiny testnets (6 nodes, 2 cores) each shard has 2052 bytes? So shard size is 4104 / [number of cores]?
# 2025-05-02 15:52 sourabhniyogi: You might find the above useful (if you spot an error, please advise!)
# 2025-05-02 17:21 jaymansfield: Hey all. With having M2 pretty much complete at this point I'm now trying to wrap my head around how to build the required recompiler. Based on my research so far, it looks like I could generate RISC-V machine code, add an ELF header, then use a linker to produce the final ELF file, and finally execute that ELF (somehow) as a background process. Am I heading in the right direction, or am I way off and should just wait for Jan’s talk?
# 2025-05-02 17:23 jan: Nope. My talk next week should contain the bare basics that should let you get started on writing one.
# 2025-05-02 17:25 jaymansfield: Alright I'll hold off for now. I'm no longer going to be able to make it to jam experience so I really hope this is available to watch online afterwards
# 2025-05-02 17:26 jan: I will also share the slides so even if you can't watch the talk they should be useful.
# 2025-05-03 17:09 ascriv: State serialization is lossy, specifically in regards to the keys of the storage dictionary of service accounts. I’m a bit confused why this isn’t important. Are we not expected to be able to fully recover a serialized state in the case that the storage dictionary is non empty? State serialization is lossless everywhere else, so it seems odd.
(edited)
# 2025-05-03 17:47 jaymansfield: I don’t think it’s something really meant to be reversed. Also storage items are not the only thing suffering from this. There are cases where preimage lookups are also not recoverable (if the preimage isn’t known). If it’s just been solicited through a host call and not included in a block yet you wouldn’t know the full hash of it either by looking at the lookup state key
(edited)
# 2025-05-03 20:02 davxy: As I wrote in the GH discussion: storage keys are lossy from the host perspective. When a service wants to read (or write) a storage key, it gives the host the full unhashed key. The host then computes the state key (as per D.2) and returns (or writes) the data. The host doesn't need to keep track of the full key as it doesn't need it
(edited)
# 2025-05-03 20:19 jaymansfield: I understand the process you suggested, but doesn't it go against the GP? It defines the preimage and storage lookup dictionaries as having 32-byte keys with a key format of H(E4(s∗)⌢ µko ⋅⋅⋅+kz). If we were to support state keys instead for these, then since they are not all reversible, the logic in the GP is invalid in a few spots
# 2025-05-03 20:24 ascriv: I think Jason is talking about the definition in the service account (9.3), but I think it’s still correct, however implementations will likely not have an actual struct which maps 32 octet strings to blobs, and instead opt for 31 octet strings (the state key)
# 2025-05-03 20:25 ascriv: But even that implementation would still follow gp, functionally
# 2025-05-03 20:26 davxy: To process our vectors you need to support construction of the state using state keys
# 2025-05-03 22:30 jaymansfield: Ok thank you. Will make the updates to be able to support this
# 2025-05-03 20:31 ascriv: That’s still true; it’s just that to process davxy's test vectors you can’t have an implementation which works with the full storage keys, and must instead work with the state-key versions of the storage keys. GP doesn’t say your internal representation of service accounts must match the one in (9.3)
(edited)
# 2025-05-03 20:32 ascriv: An implementation might have a_s mapping 31-byte strings to blobs and still be faithful to gp
# 2025-05-03 20:33 ascriv: Just need to make sure it serializes correctly, does key existence checks correctly, etc
# 2025-05-03 20:43 davxy: I see your problem. IIUC you are talking about constructing the Dictionaries with the preimage full hash from the raw KV state.
- For the preimage you can. Just hash the value.
- For the preimage lookup dictionary you can only if you already have the corresponding preimage in the KV
- For services you can't recover the unhashed service key
Yeah, the vectors assume that you use the state keys as your dictionary keys
(edited)
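The three bullets above can be restated as a sketch (Python, assuming H is Blake2b-256 as in the GP):

```python
import hashlib

def blake2b_256(data: bytes) -> bytes:
    """The GP's H, assumed here to be Blake2b with a 32-byte digest."""
    return hashlib.blake2b(data, digest_size=32).digest()

# The raw KV state only stores derived state keys, but:
# 1) a preimage's full hash is recoverable by hashing its value...
preimage = b"example service code blob"
full_hash = blake2b_256(preimage)

# 2) ...so the lookup-dictionary key (hash, length) is recoverable
#    exactly when the corresponding preimage is present in the KV.
lookup_key = (full_hash, len(preimage))

# 3) unhashed service/storage keys are NOT recoverable: their state
#    keys are derived one-way, so the vectors index by state key.
```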
# 2025-05-03 20:58 cisco: I'm also really confused by this. I guess the encodings will be revised in 0.7.0 but I would like to know how they are right now if possible
# 2025-05-03 21:04 rustybot: > <@davxy:matrix.org> I see your problem. IIUC you are talking about constructing some Dictionary with the preimage full hash from the raw KV state.
> - For the preimage you can. Just hash the value.
> - For the preimage lookup dictionary you can only if you already have the corresponding preimage in the KV
> - For services you can't recover the unhashed service key
>
> Yeah, the vectors assume that you use the state keys as your dictionary keys
I’d also like to add that, to support warp sync (i.e., syncing from a finalized state rather than from genesis), you'll receive the raw key-value state keys. Not sure if this is already specified by JAM SNP
# 2025-05-03 21:21 ascriv: > <@davxy:matrix.org> I see your problem. IIUC you are talking about constructing some Dictionary with the preimage full hash from the raw KV state.
> - For the preimage you can. Just hash the value.
> - For the preimage lookup dictionary you can only if you already have the corresponding preimage in the KV
> - For services you can't recover the unhashed service key
>
> Yeah, the vectors assume that you use the state keys as your dictionary keys
I think (very) technically speaking, an implementation which relies on state keys for storage does not actually implement the gp, because of the extraordinarily unlikely outcome of a state key collision when checking existence of a key. this is probably not even worth mentioning due to the likelihood but thought it was interesting
# 2025-05-04 09:33 greywolve: In the last point, do you mean for storage keys you can't recover the unhashed storage key? It's only the storage dict that would have state keys? (since we can recover the rest)
(edited)
# 2025-05-04 13:09 ascriv: > <@cisco:parity.io> I'm also really confused by this. I guess the encodings will be revised in 0.7.0 but I would like to know how they are right now if possible
If you look at (C.24) it explains how to serialize work reports which I previously overlooked
# 2025-05-05 09:57 gav: The segment-count field (in the availability specifier set) is encoded as a 2-byte fixed length integer.
# 2025-05-05 10:01 gav: Nowhere in the GP does it tell you (how) to store the state.
(edited)
# 2025-05-05 10:02 gav: And in fact this is done via the Merkle root, which (under reasonable assumptions) identifies a single state. But (obviously) it's lossy in so much as it is impossible to reconstruct a potentially multi-gigabyte state from a 32-byte quantity.
(edited)
# 2025-05-05 10:03 gav: Implementations are free to store state however they choose. GP takes pains not to specify this; it's one of the very many implementation details. GP only specifies how to recognise a valid block.
(edited)
# 2025-05-05 10:06 gav: Of course when we want to provide test vectors for incomplete portions of the protocol, then we may need to make judgement calls on exactly what data an implementation has available. That's a bit unfortunate and may coerce implementations into particular design patterns. Note that __implementations need not be able to pass the test vectors__. They must only pass the conformance tests, which will test in line with the GP not any internal data structures we might assume implementations have within.
# 2025-05-05 10:07 gav: Test vectors are provided only in so much as they might help teams.
# 2025-05-05 10:08 gav: If you're talking about a possible collision of the 32nd byte of a hash (or even the 27th byte), then the GP implicitly assumes it is impossible.
(edited)
# 2025-05-05 10:10 gav: Always remember: GP only specifies behaviour, not mechanism.
(edited)
# 2025-05-05 10:12 gav: The change made to characterise state keys as 31 bytes in the GP should make no practical difference to behaviour; it's there to ensure that W3F's test vectors have something concrete to point at in the GP.
# 2025-05-05 11:21 rustybot: Since GP defines the service storage dictionary key as the full hash, an implementation is free to use the full hash (9.3) as the dictionary key. If this implementation supports warp sync, as outlined in CE129, it receives a sequence of KV items with 31-byte state keys (as per D.1). But now this implementation cannot construct its service dictionary properly (as it uses full hashes), and may fail to process subsequent blocks. This happens because host calls will pass full hashes, but the implementation only has the padded keys it received from another node, causing mismatches. The likelihood of such a mismatch is pretty high.
Doesn't this suggest that implementations aiming to support warp sync need to use 31-byte state keys for their dictionaries?
(edited)
# 2025-05-05 11:41 ascriv: This is my understanding as well, but interested to hear others opinions
(edited)
# 2025-05-05 11:43 ascriv: The last few bytes of the keys are not present in the serialized state, so warp sync and full-storage-key implementations cannot be compatible
(edited)
# 2025-05-05 14:09 gav: Yes, warp-sync is not (yet) within the behavioural description of the GP.
(edited)
# 2025-05-06 00:56 jaymansfield: There might be a typo in I(p,j) where it checks the output size of the refine calls. I'm assuming it should be checking if the size is greater than WR rather than smaller
# 2025-05-05 14:17 gav: There are a few small changes here in line with the 0.6 milestone, but the most impactful is the alteration to
fetch
, which now works across the different invocations and averts the otherwise unavoidable unbounded RAM allocations.
(edited)
# 2025-05-05 14:18 gav:
fetch
also changed to allow on-chain entropy to be inspected in Accumulate/OnTransfer, and, in the future, off-chain entropy to be inspected in Refinement.
# 2025-05-05 14:18 gav: It is now possible to introspect the chain's parameters, allowing for chain-agnostic PVM code.
# 2025-05-07 17:30 sigsigsigsigsig: coming back to jam, struggling to find the black on white version of the updated pdf could anyone help?
# 2025-05-07 17:33 sigsigsigsigsig: will think of you while i'm enjoying reading it in the uk sun 😁 💕
# 2025-05-08 14:32 ascriv: The fifth component of the yield invocation in (B.2) should be a hash but it’s a work package. Assuming that’s a typo?
# 2025-05-09 16:14 ascriv: The third state key constructor in (D.1) only uses the first 27 bytes of h, but each usage of that constructor sends 31 bytes. I think e.g. for the storage dictionary we should use k0…23, and similarly for the next two. This wouldn’t change functionality but as far as I can tell we are sending 4 extraneous bytes in each of them
(edited)
# 2025-05-10 01:15 ascriv: also I saw that we changed the arguments for the argument invocation in the accumulation definition (B.9) to include just the length of operand tuples array, instead of the full array. is this intentional? is it also intentional that we've left off any subscript for E, and thus we use the general natural encoding for t and s?
# 2025-05-11 17:10 gav: > <@ascriv:matrix.org> also I saw that we changed the arguments for the argument invocation in the accumulation definition (B.9) to include just the length of operand tuples array, instead of the full array. is this intentional? is it also intentional that we've left off any subscript for E, and thus we use the general natural encoding for t and s?
All intentional
# 2025-05-11 17:11 gav: Passing the whole vector will become unsound when gas cost inflates with accessible memory. The point of these changes is to support sound service code and thus ensure that baseline gas costs are fixed over all inputs. Memos and work results must now be
fetch
ed as needed.
(edited)
# 2025-05-11 19:08 jan: To expand on the rationale here - we've been doing research/running experiments on how to come up with a secure gas cost model we can use in a permissionless environment like JAM, and it has become increasingly clear that minimizing the amount of memory that is accessible to the programs is important to keep the gas costs of the memory instructions in check. In general, the more memory you have accessible the easier it gets to trigger cache misses, which makes the worst case, well, worse for memory access instructions: potentially up to several hundred times more costly (compared to the average case) if you can access the whole ~4GB worth of address space. So if we can do something while needing less memory, even if that ends up being slightly more computationally inefficient, it will still be a net win considering the gas costs.
# 2025-05-12 19:56 ascriv: For this message, can we assume that (w_s)_l is added for each r in w_r? So if len(w_r) = 2, the b component for R(c) is 2 * (w_s)_l?
# 2025-05-12 21:22 gav: No, this is an oversight. (w_s)_l should only be counted once. Feel free to open an issue.
# 2025-05-13 10:35 greywolve: Also applies to the preimages/preimage lookups, 4 bytes less
# 2025-05-14 14:06 decentration: Are there existing conformance vectors available for validating the entire serialized state, including all chapters, after it has been merkleized into a single state root?
# 2025-05-15 13:19 ascriv: Yes I find they are not up to date with 0.6.6 yet @sourabhniyogi FYI
# 2025-05-15 18:44 dakkk: Reading the GP, a fallthrough instruction is like a nop, right? It does nothing and the program counter just goes to the next instruction?
# 2025-05-15 20:35 jan: The primary function of the
fallthrough
instruction is to allow jumps into code which otherwise you wouldn't be able to jump into, as it starts a new basic block, and only jumps to the beginning of basic blocks are allowed. But otherwise yes, it just goes to the next instruction.
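Jan's explanation can be sketched like this (Python; a toy instruction stream and a hypothetical terminator set, not the GP's actual opcode list):

```python
# Hedged sketch: jumps may only target the start of a basic block, and
# block-terminating instructions end a block, so `fallthrough` creates
# a legal jump target without otherwise changing control flow.
TERMINATORS = {"jump", "branch", "fallthrough", "trap"}

def basic_block_starts(program):
    """Indices that are legal jump targets: index 0, plus every
    instruction immediately following a block terminator."""
    starts = {0}
    for i, op in enumerate(program):
        if op in TERMINATORS and i + 1 < len(program):
            starts.add(i + 1)
    return starts

prog = ["add", "fallthrough", "mul", "add", "jump"]
assert basic_block_starts(prog) == {0, 2}   # the "mul" is now jumpable-to
```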
# 2025-05-16 21:05 greywolve: In the state constructor function for service account state keys, wouldn't a representation like \[i, 0, n0, 0, n1, 0, n2, 0, n3, 0, 0 ... \] make identifying the different types of keys more robust? In that case the second byte would be unique for each of them (regular, service, preimage, preimage lookup, storage).
(note: assuming preimage lookup lengths are > 0 and \< max(uint32) - 1, but it's still unique for the other four at least)
(edited)
# 2025-05-20 09:19 gav: Not sure how you mean "more robust" - could you give an example of it not being robust?
(edited)
# 2025-05-20 20:28 greywolve: Maybe "more robust" was the wrong way to phrase it; more like making a collision impossible, though that's just a nit since right now it's quite unlikely there would ever be a collision.
I just meant that with that representation, service state keys and preimage meta state keys have the second byte always 0, and always > 1 respectively, so they can never collide.
E.g., where x can be any byte:
[255, 0 ...] // service keys
[x, >1 ...] // preimage meta keys
But this probably doesn't make much difference in practice I guess.
(edited)
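The suggested layout can be sketched like this (Python; `service_key` is hypothetical, assuming i = 255 for service keys, n the four octets being interleaved, and a 31-byte state-key width):

```python
def service_key(i: int, n: bytes) -> bytes:
    """Suggested layout [i, 0, n0, 0, n1, 0, n2, 0, n3, 0, 0 ...],
    zero-padded to 31 bytes. Byte 1 is always 0 here, so it acts as
    a discriminant against preimage-meta keys (byte 1 > 1)."""
    assert len(n) == 4
    body = bytes([i]) + b"".join(bytes([0, b]) for b in n)
    return body.ljust(31, b"\x00")

k = service_key(255, bytes([1, 2, 3, 4]))
assert len(k) == 31
assert k[1] == 0   # can never collide with a byte-1 value > 1
```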
# 2025-05-23 22:24 ascriv: for sbrk, if wA happens to be 0, then N_x..+wA = {}, which is a subset of every set. so there is no x which satisfies the conditions, meaning sbrk is undefined
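The empty-range observation can be checked directly; a tiny sketch (Python), modeling N_{x..+n} as {x, ..., x+n-1}:

```python
def N(x: int, n: int) -> set:
    """The GP's N_{x..+n} = {x, x+1, ..., x+n-1}."""
    return set(range(x, x + n))

# With wA = 0 the range is empty, and the empty set is a subset of
# every set, so the subset condition in the sbrk definition becomes
# vacuous and no longer pins down a particular x.
assert N(100, 0) == set()
assert N(100, 0) <= {1, 2, 3}   # vacuously true for ANY set
# (a conventional sbrk treats a zero-size request as "return the
# current heap end unchanged" -- an assumption, not GP text)
```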