#graypaper:polkadot.io
last updated 2025-05-24 03:29 UTC
# 2024-04-17 20:29 syed:
hmm, it requires that each shard is divisible by 64, not the number of available shards.
# 2024-04-17 20:32 syed: If we want to decode shards/segments that are smaller than 64 bytes then we have a problem with using SIMD, I think, because neighbouring bytes are not encoded in the same polynomial.
# 2024-04-17 20:33 syed: so 64 bytes are encoded in 32 polynomials, but each recovers bytes with a 32-byte gap between them.
# 2024-04-17 20:34 syed: obviously we can rearrange the bytes before encoding, but that kills the whole purpose of the SIMD.
# 2024-04-17 20:37 syed: If we insist on recovering less than 64 bytes at a time then maybe we shouldn't use SIMD, or perhaps we should use smaller registers.
# 2024-04-18 06:38 gav: I just don't see how you can have 341 of 1023 validators recover 64 bytes.
# 2024-04-18 06:39 gav: how many bytes would they each store to recover those 64 bytes?
# 2024-04-18 06:50 syed: They could get only two bytes out of those 64 and recover 1024*2
# 2024-04-18 07:05 gav: ok so this is the minimal size you can recover at a time? 4*1024
# 2024-04-18 08:42 syed: It is just that for up to 32 consecutive bytes you have to recover double the amount, but you get the next consecutive bytes for free.
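To make the layout syed describes concrete, here is a hedged sketch (the polynomial count of 32 and the mapping come only from the messages above; the real codec may differ): byte i of a 64-byte chunk lands in polynomial i mod 32, so neighbouring bytes sit in different polynomials while bytes 32 apart share one.

```python
# Hypothetical illustration of the interleaving discussed above: 64
# consecutive bytes spread over 32 polynomials, bytes i and i+32 forming
# the two halves of polynomial i's symbol (a 32-byte gap between them).

def polynomial_of(byte_index: int, n_polys: int = 32) -> int:
    """Which polynomial a byte of a 64-byte chunk lands in (sketch only)."""
    return byte_index % n_polys

# Neighbouring bytes never share a polynomial...
assert polynomial_of(0) != polynomial_of(1)
# ...but a byte and the one 32 positions later do share one.
assert polynomial_of(0) == polynomial_of(32)
```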
# 2024-04-19 18:40 gav: the final metering spec is still being determined. it is heavily dependent on practical implementation speed on realistic hardware, so will necessarily be in flux for some time as implementations arrive and evolve and hardware requirements are decided.
(edited)
# 2024-04-19 18:40 gav: (for ethereum it was the very last protocol element we finalized, and did so through performance analysis of the 3 impls; i expect we'll do something fairly similar here, but with some research over how memory usage interacts with instruction speed)
(edited)
# 2024-04-19 18:42 gav: (and it's not a reference implementation - it just happens to be the first:)
# 2024-04-28 06:39 aedigix: Is there any concerns with endianness for the choice of VM? EVM is big-endian (not really sure the justification for this as I can't find it in the YP)
# 2024-04-29 16:21 dave: Any concern about services intentionally inserting keys into the state trie whose hashes have a long common prefix? With the trie as specified this will result in long trie paths, with a bunch of branch nodes which have one half empty. Possibly could change the definition of M so that instead of always branching on b0 in the branch case you branch on the first bit such that l and r are non-empty? This would of course make the trie implementation more complicated and possibly ruin some optimisations.
# 2024-04-29 16:57 alistair: That doesn't really solve the problem because if you can use brute force to find two hashes which agree on n bits then you can also with an extra 50% hash computations in expectation, find hashes with 1,2,3,..,n-1 bits of prefix agreeing with the first two hashes. Now you still have a depth n trie. The only downside is that you now have to get n+1 hashes included instead of 2.
(edited)
# 2024-04-29 17:26 dave: No it doesn't solve the problem, but it reduces it to the same problem that currently exists in polkadot, so at least it's not making it worse
# 2024-04-29 17:28 dave: Maybe not the "same" problem as services don't exist in polkadot, but you can do similar things with eg account keys
# 2024-04-29 17:45 jeff: I'd expect scale covers this.
Afaik big endian, aka network byte order, is used in network protocols because it makes testing more robust: if you omit an endianness conversion in only one place then protocol implementations typically fail in testing on little-endian machines. The bad situation this avoids is going into production with code that doesn't work when ported to a big-endian machine. There is no reason to use network byte order if everyone uses some library that enforces a particular convention, like say protobuf, which likely covers our case more or less. Also, you could skip network byte order when you have a very tightly specified protocol: djb's crypto, like ed25519, uses little endian because he's fighting much bigger implementor risks, while older crypto specifications like secp256k1 often used big endian, just because they didn't understand implementation risks, so they addressed only minor ones.
(edited)
# 2024-04-29 20:02 arkadiy: (127)
The C(s,h) definition interleaves the hash of the key with the service id to create the trie key, so that a service won't be able to create common-prefix chains in other services' subtries with this kind of attack.
(edited)
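As an illustration only (the authoritative definition of C(s, h) is equation (127) in the GP; the byte-level details below are invented for this sketch), interleaving the service id with the key hash means two services targeting the same hash prefix still get diverging trie keys:

```python
def interleaved_trie_key(service_id: bytes, key_hash: bytes) -> bytes:
    """Illustrative interleaving of service-id bytes with key-hash bytes.
    Not the GP's actual C(s, h); see equation (127) for the real definition."""
    out = bytearray()
    for s_byte, h_byte in zip(service_id, key_hash):
        out += bytes([s_byte, h_byte])
    return bytes(out)

sid_a, sid_b = bytes([1, 0, 0, 0]), bytes([2, 0, 0, 0])
h = bytes(range(4))  # a hash prefix two services might both grind toward
# The interleaved keys diverge at byte 0, so a shared hash prefix no
# longer yields a shared trie prefix across different services.
assert interleaved_trie_key(sid_a, h)[0] != interleaved_trie_key(sid_b, h)[0]
```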
# 2024-04-30 00:39 eclesiomelo: just saw that the definition of the header judgement marker is not accurate, given that
t != 0
will include valid judgements also, so (107)
Hj ≡ [r | (r, t) <− J, t ≠ ⌊2/3V⌋ + 1]
looks correct, right?
(edited)
# 2024-05-02 17:20 gav: Yes thanks - well spotted! that'll be corrected in the next revision:)
(edited)
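The corrected marker definition above reads directly as a list comprehension; a minimal sketch, with J as (report-hash, vote-total) pairs and V the validator count:

```python
def header_judgement_marker(J, V):
    """Hashes r of reports whose vote total t is not the 2/3V + 1
    supermajority, mirroring H_j = [r | (r, t) <- J, t != floor(2V/3) + 1]."""
    threshold = (2 * V) // 3 + 1
    return [r for (r, t) in J if t != threshold]

# e.g. with V = 6 validators the threshold is 5, so only "b" is kept:
assert header_judgement_marker([("a", 5), ("b", 3)], 6) == ["b"]
```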
# 2024-05-08 09:18 xlchen: this is the only other reference of C that I can find
(edited)
# 2024-05-08 09:25 karim: Maybe this?
\mathbf{C}
in the LaTeX for easier search / mapping.
# 2024-05-08 09:19 xlchen: so it is just an intermediate variable? some output from
(EA,ρ′,δ†,χ,ι,φ)
?
# 2024-05-10 03:43 xlchen: is this a typo? because the total length should be 336 bytes?
# 2024-05-11 10:14 gav: > <@xlchen:matrix.org> so it is just an intermediate variable? some output from
(EA,ρ′,δ†,χ,ι,φ)
?
Exactly.
# 2024-05-11 10:15 gav: > <@xlchen:matrix.org> is this a typo? because the total length should be 336 bytes?
yes - thanks!
# 2024-05-23 08:42 helikon: Hi, at the bottom of the left column of page 6, the set of all dictionaries, D, is defined as:
# 2024-05-23 08:42 helikon: And on the right column of the same page, D is defined as:
# 2024-05-23 11:13 dave: D excludes sets with duplicate keys, so seems like subset is accurate
# 2024-05-23 11:44 helikon: Ah, got it now thanks, I assumed uniqueness of keys. Actually the following (4) just explains that.
(edited)
# 2024-05-23 20:12 oliver.tale-yazdi: Will there be some more explanation on the 2/3+1 honesty assumption?
What time-window is to be considered when evaluating this? Current session? Or something like current session + past session?
# 2024-05-23 20:53 jeff: I'd think the jam paper could just say "byzantine assumptions, including 2/3rd honest, for the current and previous sessions".
It'll be the (machine) elves paper that clarifies our assumptions, but maybe not something jam should discuss to precisely, due to some technicalities:
We make "synchronous" byzantine assumptions for the current session and all previous sessions, which includes that less than 1/3 of validators were misbehaving, including being offline. We use the "network synchrony" assumption only narrowly, and make a serious effort (no-shows) to mitigate its weakness, which makes it less bad than it sounds.
Arguably, we only require these assumptions for the current and immediately previous session, so long as all parachains are somehow appropriately publicly monitored. We should not discuss this however, since we cannot define this monitoring, and we'll do things like elections on parachains, which probably do not satisfy such a monitoring condition. And non-turnstiled zk parachains are likely eventually too. It's not worth discussing this.
(edited)
# 2024-05-27 07:27 xlchen: The formula doesn't appear to be matching with the description?
# 2024-05-27 08:57 gav: the formula is correct; the text needs to be reworded slightly.
(edited)
# 2024-05-27 08:58 gav: The intention was that
a
always has 3 elements, but the last element, and only the last, may potentially be None.
(edited)
# 2024-05-27 09:02 xlchen: so it is something like
((Sig, Idx), (Sig, Idx), Optional<(Sig, Idx)>)
?
# 2024-06-04 10:06 xlchen: Just to confirm, the Bandersnatch public key size is 32 bytes?
# 2024-06-04 10:17 syed: > <@xlchen:matrix.org> Just to confirm, the Bandersnatch public key size is 32 bytes?
Yes
# 2024-06-04 10:19 syed: We are moving to twisted Edwards form partially to save that one byte
# 2024-06-06 08:16 xlchen: it said there are 3 items in the judgement state but the formula have 4?
# 2024-06-06 08:17 xlchen: and
λ: The validator keys and metadata which were active in the prior epoch.
# 2024-06-06 09:10 gav: Yeah that's old - the stuff about psi\_k should have been removed.
(edited)
# 2024-06-11 20:49 purpletentacle: While this is a minor point, I think it is aligned with the intention of maintaining the formal rigor intended in the graypaper.
I feel there is some inconsistency in terminology concerning "integer" versus "natural numbers." Specifically, when adhering to Section 3.4 (numbers), I believe that the text in sections like 3.6 should use "natural" instead of "integer" to be consistent with the corresponding math expressions ($\\N$, etc.). This happens in various other sections as well, where talking about natural numbers or positive integers (unsigned?) would be more appropriate than just "integers".
Generally, the mathematical expressions are accurate, but the narrative often uses "integer" in a broader sense. I think this is due to how the term is used in programming (unsigned integers, etc.) rather than math.
This inconsistency becomes particularly evident in Appendix C.1.2, where the encoding is defined solely for natural numbers, and there is no mention at all of how negative values should be handled. Also, I don't see any real need for serializing negative integers. Probably this section could be restricted to only encoding natural numbers, and this should be enough.
If useful, I am glad to review and make a PR with suggestions addressing this along the text.
(edited)
# 2024-06-11 21:09 purpletentacle: ...
I have also a few of questions about C.2. .. (262) (263)
- why is H\_j not included when serializing the header? is that correct or an unintentional omission?
- why there is a claim that E\_U(H) has "no inverse"? wouldn't that imply that there is no way to decode a serialized header? (In C.1. "...We define the deserialization function E−1 = E−1 as the inverse of E and able to decode some sequence into the original value...."
(edited)
# 2024-06-11 21:21 purpletentacle: ....
5. Header
"... Excepting the Genesis header, all block headers H have an associated parent header, whose hash is H\_p..."
but expression (263) for header serialization does not indicate optionality (?H\_p). So should there be some exception or fixed value for the genesis case? For instance, H\_p for H^0 defined as some fixed pre-established value?
or instead, should the serialization of H^0 be a special case, with C.2 explicitly indicating that H\_p is not encoded for H^0?
(edited)
# 2024-06-11 21:40 purpletentacle: in the case of 38, the hash is done over the encoded Extrinsic... H(E(E)), which makes sense..
I believe that 36 is missing the encoding function E to be consistent.. so... H_p = H(E(P(H)))
otherwise, maybe it would be worth indicating somewhere that hashing implies encoding?
(edited)
# 2024-06-12 04:20 gav: - H\_j is not serialized but implied by E\_J.
- where exactly does it say E\_U(H) cannot have an inverse?
(edited)
# 2024-06-12 04:22 gav: Yeah. It’s not optional; genesis is a very special case and it makes no sense to complicate the protocol due to it. It will likely just be defined as zeroes. The genesis header and state are not yet defined in the protocol.
(edited)
# 2024-06-12 04:25 gav: > <@purpletentacle:matrix.org> in the case of 38, the hash is done over the encoded Extrinsic... H(E(E)), which makes sense..
> I believe that 36 is missing the encoding function E to be consistent.. so... H_p = H(E(P(H)))
> otherwise, maybe it would be worth indicating somewhere that hashing implies encoding?
I believe it is mentioned previously that when hashing, values are assumed to be encoded with the regular serialisation function if they are not explicitly so.
# 2024-06-12 04:27 gav: I think I’ve generally made it explicit but as it is unambiguous I gave the option for omission particularly to help improve readability of some longer formulae.
# 2024-06-16 12:53 ltfschoen: it says here
https://jam.web3.foundation/rules that "Clean-room implementation using the Graypaper and public implementor chat channels as the only resources.".
it is my understanding that the Graypaper chat channel is
https://matrix.to/#/#graypaper:polkadot.io
where are the "public implementer" chat channels?
and is there a reason why it doesn't include the Jam chat channel here
https://matrix.to/#/#jam:polkadot.io?
also, it says "Each team is only allowed to work on one implementation", but in the application form it says "What programming language(s) are you using and which language set are you applying for? e.g. "Rust, set B"", so i don't understand why it's asking us "What programming language(s) are you using". is it just wanting to know if we'll be using multiple languages in our single implementation (e.g. if we'll be doing an implementation in Rust and using FFI from Ruby and Python then we'd answer
Rust, Ruby, Python, set B
?
i was actually going to try and work on multiple implementations using multiple language sets (e.g. Rust, Swift, Ruby, Python, TS) in parallel as a contingency in case i got stuck and couldn't get support with one of them. after submission of our initial application form, will it be possible for us to later change the programming language(s) and language set that we'll be using for the one implementation that could be eligible for the JAM prize?
(edited)
# 2024-06-16 14:48 oliver.tale-yazdi: Definitions 284 and 291 both define function M_S, just with different arity. The definition section also mentions it twice:
# 2024-06-16 14:50 oliver.tale-yazdi: Is it supposed to be disambiguated by their arity, or is it a name clash?
# 2024-06-16 16:07 gav: They’re meant to be different functions. The name clash is an oversight.
# 2024-06-18 03:07 gav: > <@aedigix:matrix.org> Is there any concerns with endianness for the choice of VM? EVM is big-endian (not really sure the justification for this as I can't find it in the YP)
AFAIK there wasn't an especially good reason for it other than it made sense from a mathematician's perspective.
# 2024-06-18 03:08 gav: In any case modern architectures are either natively LE or have LE-compatibility modes, so it makes sense on the VM side to stick with LE.
# 2024-06-18 03:09 gav: As for serialization, it'll be SCALE, so LE also. This can already be seen via the definitions for the encode function.
(edited)
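For illustration, a fixed-width little-endian encoding in the SCALE style (the GP's own codec is defined by its encode function; this only demonstrates the LE byte order being referred to):

```python
def encode_u32_le(x: int) -> bytes:
    # Little-endian fixed-width encoding, as used for SCALE's u32.
    return x.to_bytes(4, "little")

# The least significant byte comes first on the wire.
assert encode_u32_le(1) == b"\x01\x00\x00\x00"
assert encode_u32_le(0x12345678) == b"\x78\x56\x34\x12"
```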
# 2024-06-18 03:15 mkalohood: missed deleting. Big endian and little endian involve two related concepts: 1. Host order: different CPUs process data with different byte orders. Intel x86 CPUs are little endian; MIPS-architecture CPUs are big endian.
2. Network order: the network transmits data as a byte stream, and the byte order used during transmission is called network order. It is independent of the CPU/OS of the specific devices, ensuring data can be parsed correctly when transferred between different devices; it defines the byte order as big endian.
So basically big endian is used before network data is sent.
But I don't think big endian is the deciding factor in VM selection. I think we can take it apart: the VM communication layer and data operation management can be separated.
# 2024-06-18 03:48 gav: > <@purpletentacle:matrix.org> ...
> I have also a few of questions about C.2. .. (262) (263)
>
> - why is H\_j not included when serializing the header? is that correct or an unintentional omission?
> - why there is a claim that E\_U(H) has "no inverse"? wouldn't that imply that there is no way to decode a serialized header? (In C.1. "...We define the deserialization function E−1 = E−1 as the inverse of E and able to decode some sequence into the original value...."
On the first point, H_j is not serialized - it's auxiliary data implied through E_J.
# 2024-06-18 03:50 gav: > <@gav:polkadot.io> - H\_j is not serialized but implied by E\_J.
> - where exactly does it say E\_U(H) cannot have an inverse?
On the second point "the latter has no inverse", it was merely meant to state that no function
D_U(Y) -> H
was defined explicitly. However, I removed it as it's clearly misleading.
# 2024-06-18 03:53 gav: > <@gav:polkadot.io> I believe it is mentioned previously that when hashing, values are assumed to be encoded with the regular serialisation function if they are not explicitly so.
I made this explicit now.
# 2024-06-19 20:15 purpletentacle: sorry to insist on this point.. but it is not 100% clear to me
(36) defines the header as including H_j
but later it is not included
(edited)
# 2024-06-19 20:17 purpletentacle: and when deserializing, it should be kept as ?H\_j = None (optional) if the corresponding extrinsic E\_J is not available?
(edited)
# 2024-06-19 20:18 purpletentacle: I find it a bit odd that H is defined as "containing" H\_j when it is actually always external to it
(edited)
# 2024-06-19 20:18 purpletentacle: Btw, I see the definition later at 10.3 but I am still a bit confused and unsure about this value being part of the header or not..
is it fair to assume that the seal applies to the encoded header, and as a consequence, it will never include H\_j ?
if that is the case, when is H\_j relevant or useful?
(edited)
# 2024-06-19 23:55 gav: Yes it’s not especially clear and I’ll clarify it for the next minor revision. As of 0.2.1, it doesn’t get serialised or deserialised but is just defined as an equivalence based on the value of E_J. It is used in the definition of Safrole, equation 56. However, in future versions of the spec this may change and it may be featured in the encoding of the header.
(edited)
# 2024-06-21 08:37 purpletentacle:
The judgements state includes three items, an allow-set ($\psi_\mathbf{a}$), a ban-set ($\psi_\mathbf{b}$) and a punish-set ($\psi_\mathbf{p}$). The allow-set contains the hashes of all work-reports which were disputed and judged to be accurate. The ban-set contains the hashes of all work-reports which were disputed and whose accuracy could not be confidently confirmed. The punish-set is a set of keys of Bandersnatch keys which were found to have guaranteed a report which was confidently found to be invalid.
\begin{equation}
\psi \equiv \tup{\psi_\mathbf{a}, \psi_\mathbf{b}, \psi_\mathbf{p}, \psi_\mathbf{k}}
\end{equation}
`\subsection{Extrinsic}
it looks like a misplaced backtick.. I would say it can be ignored
# 2024-06-21 09:53 gav: > <@qiwei:matrix.org> something missing here for 0.2.1
Yes there’s an error there, the latter component should have been removed.
# 2024-06-21 09:53 gav: It will be fixed in the next revision but feel free to place an issue.
# 2024-06-21 09:55 gav: > <@purpletentacle:matrix.org>
> The judgements state includes three items, an allow-set ($\psi_\mathbf{a}$), a ban-set ($\psi_\mathbf{b}$) and a punish-set ($\psi_\mathbf{p}$). The allow-set contains the hashes of all work-reports which were disputed and judged to be accurate. The ban-set contains the hashes of all work-reports which were disputed and whose accuracy could not be confidently confirmed. The punish-set is a set of keys of Bandersnatch keys which were found to have guaranteed a report which was confidently found to be invalid.
> \begin{equation}
> \psi \equiv \tup{\psi_\mathbf{a}, \psi_\mathbf{b}, \psi_\mathbf{p}, \psi_\mathbf{k}}
> \end{equation}
>
> `\subsection{Extrinsic}
>
>
>
>
> it looks like a misplaced backtick.. I would say it can be ignored
Again, will be fixed in next revision.
# 2024-06-23 13:40 sourabhniyogi: For Appendix B.6 (General), B.7 (Accumulate) and Appendix B.8 (Refine) functions:
1. For ω parameter inputs, I did not understand why the number of parameters on the left hand side did not equal that of the right hand side:
- lookup \[h\_o, b\_0, b\_z\] = ω\_{1..4} (3 vs 4) but then ω\_0 is used so ... ok
- read \[k\_o, k\_z, b\_o, b\_z\] = ω\_{1..5} (4 vs 5) but then w\_0 is used so .. ok
- write \[k\_o, k\_z, v\_o, v\_z\] = ω\_{0..4} (4 vs 5) but there is no w\_4 ... huh?
- new \[o,l,g\_l, g\_h, m\_l, m\_h\] = ω\_{0..6} (6 vs 7) ... huh?
...
- machine \[p\_o, p\_z, i\] = ω\_{0..3} (3 vs 4) ... huh?
- peek \[n,a,b,l\] = ω\_{0..4} (4 vs 5) ... huh?
... and so on
Then you use ω\_0' as a return parameter which has nothing to do with the input ω\_0, huh. ω isn't really doing that much for you it seems except to group the inputs, maybe just describe the inputs and output?
2. At least a line or two describing each function, specifically referencing these ω parameter inputs, would help a lot! Whereas PVM opcode semantics are quite commonplace and need little additional explication, these functions are the heart of JAM and the additional explication would increase speed of comprehension and reduce guesswork.
3. I believe most of the functions probably deserve at least a passing reference in the main body, or solid exposition around present (242), (246), (252). We can guess import+export+historical\_lookup but nothing references what peek, poke, machine, assign, delegate, quit (?), ... do yet.
4. invoke = 13 is a copy of solicit and surely deserves a different name to the invoke = 20, like invoke\_accumulate vs invoke\_refine.
5. Not clear what's going on with numbers:
- new's bump function 42, 9
- designate's 176
- invoke=20's 13 and 60
6. Not clear why all these 64-bit parameters (g, a, m) have to be split into 2 32-bit and joined back together?
7. A diagram for the whole DA system would be worth a thousand words. I imagine you have them in your JAM slides.
8. If you don't want to add more exposition because you want hyper compactness ok I get it but maybe using
https://en.wikibooks.org/wiki/LaTeX/Macros will allow implementers to disambiguate notation by just reading latex (?!) -- especially for any overloaded notation (E, W, s, C, c, t, ...) that is used to reference more than one concept, implementers can go look at the LaTeX source, where the macros would be unambiguous.
(Feel free to ignore most of the above, not very confident, just learning =). Will of course take any edits you make and follow up with deep look to check if I understand in our stubs)
(edited)
# 2024-06-24 01:55 gav: Regarding the use of omega, did you understand that the range is inclusive on the lower bound and exclusive on the upper?
# 2024-06-24 01:59 sourabhniyogi: Yes, for the cases that appear to be ranges. Not all your ω parameters are ranges though.
# 2024-06-24 02:23 sourabhniyogi: Got it, it's clear. [Normies will think X_{1..4} will have 4 elements. You obviously think it should have just 3]
# 2024-06-24 02:29 gav: I would draw your attention to Section 3 Notational Conventions:
# 2024-06-24 02:29 gav: A range may be denoted using an ellipsis, for example: \[0, 1, 2, 3\]\_{⋅⋅⋅2} = \[0, 1\] and \[0, 1, 2, 3\]\_{1⋅⋅⋅+2} = \[1, 2\]
(edited)
# 2024-06-24 02:30 gav: I believe it is not uncommon in computer programming to use (inclusive...exclusive) ranges. Rust, notably, does this.
(edited)
# 2024-06-24 02:31 gav: In any case, regardless of convention I have endeavoured to make my notation clear. But it is very important that anyone serious about interpreting the GP thoroughly read and understand Section 3.
(edited)
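Python's half-open slices happen to match this inclusive...exclusive convention exactly:

```python
# The GP's ellipsis ranges map directly onto half-open slicing.
xs = [0, 1, 2, 3]
assert xs[:2] == [0, 1]        # x_{⋅⋅⋅2}: first two elements
assert xs[1:1 + 2] == [1, 2]   # x_{1⋅⋅⋅+2}: two elements starting at index 1
```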
# 2024-06-24 02:03 sourabhniyogi: My point is that people would expect ω_0 and ω_0' to be related, of the same rough type -- you take pains in most other cases to have them be the same type. So, it's an expectation violation.
# 2024-06-24 02:09 gav: I see no such violation. Perhaps you were misinterpreting my intentions.
(edited)
# 2024-06-24 02:08 gav: I have intentionally avoided attempting to document the host functions in the GP. They will inevitably get documented eventually as a user-resource. But at present the host function definitions are explicit and unambiguous, the primary point of the GP's appendix. Defining them in English as well might lead to people who are less well able to read maths instead relying solely on the English description which will inevitably be more ambiguous and less well defined, increasing the speed of miscomprehension.
(edited)
# 2024-06-24 02:12 gav: > Not clear why all these 64-bit parameters (g, a, m) have to be split into 2 32-bit and joined back together?
How else do you expect to be able to represent a 64-bit value across 32-bit registers?
(edited)
# 2024-06-24 03:47 sourabhniyogi: I expected the caller to do this nitty-gritty mapping -- I see, you mean 100% of ω_{i...j} are 32-bit registers; now ω_0' vs ω_0 being unrelated types makes sense.
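A minimal sketch of the 32-bit register split being discussed (the helper names are made up; this just shows the low/high halving):

```python
MASK32 = 0xFFFF_FFFF

def split_u64(v: int) -> tuple:
    """Split a 64-bit value into (low, high) 32-bit register halves."""
    return v & MASK32, (v >> 32) & MASK32

def join_u64(lo: int, hi: int) -> int:
    """Recombine the two 32-bit register halves into the 64-bit value."""
    return (hi << 32) | lo

lo, hi = split_u64(0x0123_4567_89AB_CDEF)
assert (lo, hi) == (0x89AB_CDEF, 0x0123_4567)
assert join_u64(lo, hi) == 0x0123_4567_89AB_CDEF
```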
# 2024-06-24 02:13 gav: Macros have been used on occasion (see e.g. preamble.tex). I will likely increase the usage in time.
# 2024-06-24 02:27 gav: > invoke = 13 is a copy of solicit and surely deserves a different name to the invoke = 20, like invoke\_accumulate vs invoke\_refine.
Unintended.
invoke = 13
will be removed in the next revision. Thanks for reporting this.
(edited)
# 2024-06-26 13:27 danicuki: "Boolean values. B_s denotes the set of Boolean strings of length s, thus B_s = ⟦{⊥, ⊺}⟧_s. When dealing with Boolean values we may assume an implicit equivalence mapping to a bit whereby ⊺ = 1 and ⊥ = 0, thus B_□ = ⟦N_2⟧_□. We use the function bits(Y) ∈ B_□ to denote the sequence of bits, ordered with the least significant first, which represent the octet sequence Y, thus bits([5, 0]) = [1, 0, 1, 0, 0, ...]. "
I didn't understand why bits([5, 0]) and not simple bits(5)? What this 0 represent in the [5, 0] sequence?
(edited)
# 2024-06-26 13:49 dave: bits takes an octet (=byte) _sequence_, not just a single byte. 0 is simply the second byte in the sequence.
# 2024-06-26 15:48 danicuki: So bits([5]) would be [1,0,1,0,0,0,0,0] and bits([5,0]) would be [1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0]?
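A minimal transcription of the definition quoted above (LSB-first per octet), which confirms both of those expansions:

```python
def bits(octets: bytes) -> list:
    """LSB-first bit sequence of an octet sequence, per the GP's bits(Y)."""
    return [(byte >> i) & 1 for byte in octets for i in range(8)]

# 5 = 0b101, least significant bit first:
assert bits(bytes([5])) == [1, 0, 1, 0, 0, 0, 0, 0]
# A trailing zero byte contributes eight more zero bits:
assert bits(bytes([5, 0])) == [1, 0, 1, 0, 0, 0, 0, 0] + [0] * 8
```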
# 2024-06-27 01:26 sergei_astapov: > <@danicuki:matrix.org> what is C here?
> gav
It’s the beefy commitment set.
It should be mentioned in the definitions section.
It’s defined at the end of the accumulation definitions.
Section 14 iirc.
# 2024-06-27 19:56 sourabhniyogi: Potential issues in Appendix A:
1.
sub_imm
is missing in A.5.9 - not sure what the opcode is (
add_imm
is 2) but see:
https://github.com/koute/jamtestvectors/blob/master\_pvm\_initial/pvm/programs/inst\_sub\_imm.json
vs
https://github.com/koute/jamtestvectors/blob/master\_pvm\_initial/pvm/programs/inst\_add\_imm.json
2.
cmov_imm_iz
appears here as Opcode 85 but is missing in Appendix A
https://github.com/koute/jamtestvectors/blob/master\_pvm\_initial/pvm/programs/inst\_cmov\_if\_zero\_imm\_ok.json
3. the family of "branch" opcodes { 24, 30, 47, 48, 41, 43 } are repeated in A.5.9 and A.5.10 -- is this intended?
4. The "above condition" referenced in "In the case that the above condition is not met, then the instruction is considered invalid, and it results in a panic" after (224-227) is not clear to me.
5.
mov_reg
(op code 82) and
sbrk
(op code 87) seem to need B and D swapped in some way, either in 223 or in the Mutations, whatever works \[but not sure what
sbrk
does though\].
6. You have to finish the sentence around (213) "This allows for compact representation of both positive and negative encoded values, important as ."
7. Based on
https://github.com/koute/jamtestvectors/blob/master\_pvm\_initial/pvm/programs/inst\_add.json (and many "ALU" operations I think you have a mix up of A, B, D (the
inst_add.json
has 1+2 => 3 r\_A = 7, r\_D = 9, r\_B = 8 where w\[r\_A\] = 1, w\[r\_B\] = 2, w\[r\_D\] = 3). Not sure if its the test vectors that are incorrect or the GP here.
8. The mutations of
cmov_iz
(opcode 85) and
cmov_nz
(opcode 84) appear to be swapped in A.5.11.
(edited)
# 2024-06-28 07:01 gav: 2. Will be fixed.
3. Will be fixed.
4. This refers to the implied decode of the immediate value which could conceivably fail. If the condition cannot fail, then the statement is redundant and you need not pay it any attention.
5. Will be fixed.
6. Will be sorted in the next revision.
7. Not sure what the issue is here. The example you gave seems in line with the GP.
8. Will be fixed.
(edited)
# 2024-06-28 07:24 sourabhniyogi: Ok, for 7, the test vector for
add
is computing 1+2 = 3 with inputs w_7 (1) + w_8 (2) going into w_9 (3) as is easily seen here:
https://github.com/koute/jamtestvectors/blob/master_pvm_initial/pvm/programs/inst_add.json#L28
but looking closely at the test vector code:
(a) decimal 121 is hexadecimal 0x79 (so w_7 are the high order 4 bits INPUT, while the
OUTPUT w_9 is the low order 4 bits)
(b) the byte following (a) is "8" which is the other INPUT
Basically r_A+r_D are together in one byte with r_B following.
In contrast, the GP in (227) has both r_A+r_B put together in one byte and with a byte holding r_D following that byte.
The fix, I believe, is to adjust (227) subscripts from
_A, _B, _D
to
_D, _A, _B
# 2024-06-28 07:34 gav: I see - for this the tests will be altered (GP stays as is)
(edited)
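Per the resolution above (the GP's (227) layout stands: r_A and r_B share one byte, with r_D in the following byte), a hedged decoding sketch; which nibble holds which register is an assumption here, so consult (227) for the authoritative assignment:

```python
def decode_regs(operands: bytes) -> tuple:
    """Sketch of three-register operand decoding per the (227) layout:
    r_A and r_B packed one nibble each into the first byte (nibble order
    assumed, not confirmed by the chat), r_D in the second byte."""
    r_a = operands[0] & 0x0F          # assumed: low nibble
    r_b = (operands[0] >> 4) & 0x0F   # assumed: high nibble
    r_d = operands[1]
    return r_a, r_b, r_d

# With r_A = 7 and r_B = 8 packed into 0x87, and r_D = 9 following:
assert decode_regs(bytes([0x87, 0x09])) == (7, 8, 9)
```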
# 2024-06-30 19:55 purpletentacle: This is more a question than a specific correction.. but I feel there are some information gaps that I am not able to fill, and probably others may find themselves in the same situation. Hopefully it is not purely because of my lack of knowledge.
I am confused by how the Bandersnatch ring root function O is defined in 3.0 and/or Appendix G.
This is required to complete the update of the key root in safrole.
...
\where z &= \mathcal{O}([k_b \mid k \orderedin \gamma'_\mathbf{k}]) \\
...
The appendix defines:
O(⟦H_B⟧) ≡ PCS_commitment(⟦H_B⟧)
GP points to
https://github.com/davxy/bandersnatch-vrfs-spec/blob/main/specification.pdf (which seems to be a work in progress; a few TODOs, etc.). The document briefly defines
prove
and
verify
. Proving requires a secret key, so it could not be applicable to the key root update. Verify does not seem to be applicable either.
- I cannot link with confidence what is PCS_Commitment in Galassi's document.
- It is allowed to use external code (FFI) for bandersnatch VRF.
Would it be possible to provide more information about this step in Safrole, expand the appendix about Bandersnatch VRF a bit more, or provide some reference on how the test vectors used the reference crate that was used for this purpose?
# 2024-07-01 00:51 sourabhniyogi: davxy: gav Some freshman questions on Safrole:
1. For your Safrole test vectors ("tiny"), what are the values of
$jam\_entropy (was BYTES( "sassafras\_randomness"))
$jam\_ticket\_seal (was BYTES("sassafras\_ticket\_seal"))
$jam\_fallback\_seal (was BYTES("sassafras\_fallback\_seal"))
and what term in the GP does "entropy" (documented as "Per block entropy (originated from block entropy source VRF)") refer to, from the test vector
https://github.com/w3f/jamtestvectors/blob/master/safrole/publish-tickets-no-mark-6.json#L4
?
2. I see "attempt" and "attempts\_number" were renamed to "entry index" and "N" in the GP, but what happened to the "ticket threshold" and "redundancy\_factor" - are those \[what I thought were key\] concepts gone (in which case, how?) or still to be documented?
3. Since a high level goal of JAM Implementation is to get NON-Rust implementations for Safrole almost everyone will surely use FFI into your recommended crypto package. Can you ( @davxy) provide a single working test case like say this one
https://github.com/w3f/jamtestvectors/blob/master/safrole/publish-tickets-no-mark-6.json
(or the scale equivalent) to set up the ring verifier, get the ring vrf output, and get the entropy buffer updated? With a single well-worked-out test case in Rust handling the "Bare VRF" and "Ring VRF" (specifically making the specific flavors exceptionally clear to "I'm not a cryptographer, I just use cryptography" engineers), I'll bet everyone can use that to set up their
extern "C"
type FFI and pass most of the cases in rapid order. Is this possible?
4. For the "tickets\_verifier\_key" eg
https://github.com/w3f/jamtestvectors/blob/master/safrole/publish-tickets-no-mark-6.json#L210
they are 384 bytes and it is labeled as "gamma\_z: The Bandersnatch ring root." in
safrole.asn
. However, in the GP it is documented as being element Y\_R in (47), yet I.1.2 says it is 144 bytes. Can you explain what this discrepancy could be due to?
(edited)
# 2024-07-01 09:01 davxy: sourabhniyogi:
1. For Safrole test vectors the only thing we use is $jam\_ticket\_seal.
In particular this is used for the ring-vrf input construction (context in the GP) to obtain the ticket score (aka ticket-id) during the candidate ticket verification procedure.
The value of $jam\_ticket\_seal is constant, defined as the "jam\_ticket\_seal" ASCII string.
This applies in general to values starting with $ (e.g. $foo in the GP => the "foo" ASCII string).
(Thank you BTW as I've just spotted that I was using "jam\_seal" instead of "jam\_ticket\_seal", I'll add the fix to the upcoming vectors PR)
The other constant strings ($jam\_fallback\_seal and $jam\_entropy) are used for block verification and per-block entropy production (which is passed as input to Safrole).
NOTE: The actual value of per-block entropy used by the Safrole test vectors need not have been really produced using the signature in the header. Here we abstract away from the value's origin; we don't really care for the sake of the Safrole test vectors and the Safrole STF. In this specific case I've used:
- entropy\_0 = blake2b(\[42\_u8; 32\])\[..32\]
- entropy\_i+1 = blake2b(entropy\_i)\[..32\]
2. If you take the Sassafras RFC as a reference then there are some differences.
One of these is the attempts number and redundancy factor, which in the GP are in practice simplified to a single thing (the attempts number).
Threshold is gone.
Even though reading Sassafras RFC can help (as it is a quite similar protocol), always take the GP as the source of truth for JAM.
3. Sure thing. I'll post here as soon as it is ready.
4. This is a very interesting observation. Current implementation serializes 3 extra fields (part of the SNARK SRS).
Serialization of these fields may be important **in a general application**, but here these values are constant.
I will definitely get rid of these from serialized data (I'm on it). The final size will be 144 (i.e. the last 144 bytes of what you see right now)
(edited)
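davxy's entropy chain above is easy to reproduce. A minimal Python sketch, assuming (per the later correction in this thread) blake2b-256 rather than blake2b-512 truncated to 32 bytes; the helper names are ours, not from the GP:

```python
import hashlib

def blake2b_256(data: bytes) -> bytes:
    # blake2b with a native 32-byte digest (not a truncated 64-byte one).
    return hashlib.blake2b(data, digest_size=32).digest()

def entropy_chain(n: int) -> list[bytes]:
    # entropy_0 = blake2b([42u8; 32]); entropy_{i+1} = blake2b(entropy_i)
    e = blake2b_256(bytes([42] * 32))
    chain = [e]
    for _ in range(n - 1):
        e = blake2b_256(e)
        chain.append(e)
    return chain
```

As the thread notes, the values' origin is irrelevant for the Safrole STF vectors; this just reproduces the chaining rule.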
# 2024-07-01 09:16 oliver.tale-yazdi: > <@davxy:matrix.org> sourabhniyogi:
> [...]
> - entropy\_0 = blake2b(\[42\_u8; 32\])\[..32\]
> - entropy\_i+1 = blake2b(entropy\_i)\[..32\]
> [...]
I think the [..32] after the blake2b is not correct, see https://github.com/w3f/jamtestvectors/issues/6
# 2024-07-01 09:18 davxy: I've just seen your issue. Yeah truncated Blake2b-512 has been used for the test vectors. Need to change to blake2b 256. Thank you
(edited)
# 2024-07-01 11:06 purpletentacle: > Blake2b-512 has been used for the test vectors. Need to change to blake2b 256.
> "jam_seal" instead of "jam_ticket_seal", I'll add the fix to the upcoming vectors PR)
Should we then assume that the current test vectors are incorrect and wait for the next release?
As @sourabhniyogi explained, having access to something like 3) as primitives could be very useful, because it can also help to detect these subtle issues in the test vectors themselves
Thank you again for the great responsiveness
# 2024-07-01 11:56 sourabhniyogi: > <@davxy:matrix.org> sourabhniyogi:
> [...]
Thank you!
# 2024-07-01 16:41 davxy: Alright, we're getting there.
I've updated the PR with all the fixes we've discussed so far. Check the description for the changes.
(If you find / suspect something else, please tell.)
In order to reduce the ring keys commitment to 144 bytes (as per the GP) I had to patch an upstream dependency, so I'll temporarily point ark-ec-vrfs to the patched ring-proof for the moment (not pushed yet, I'll notify here when ready).
Juan Leni | zondax.ch: Yeah. Unfortunately there was the blake2b issue that invalidated the tickets data, plus some pseudo-randomly constructed things that depended on the previously used hash.
Tomorrow you'll have a clear example of how to use the ring VRF (maybe I'll put it in the bandersnatch-specs repo)
(edited)
# 2024-07-02 21:52 davxy: It is quite easy. The signature you see in the extrinsic is a ring signature.
You can deserialize it into the RingSignature struct you see in the example (the one that contains the output + ring proof).
Once deserialized you can:
- validate the proof by constructing the Verifier as the example shows
- use the struct's output entry to actually generate the VRF output hash
Does this help?
To be more concrete, I'll add an example as you suggested
(edited)
# 2024-07-02 22:30 sourabhniyogi: > <@davxy:matrix.org> It is quite easy. The signature you see in the extrinsic is a ring signature.
> [...]
> To be more concrete, I'll add an example as you suggested
Yes that final example should do it, wonderful
# 2024-07-03 16:36 purpletentacle: is 20-24 secs the typical time for ring\_prove\_verify? Running on a MacBook with an M3... wow.
(edited)
# 2024-07-03 17:34 davxy: Not really. That's a bit too much :-)
❯ cargo run --release
Compiling ark-ec-vrfs-bandersnatch-example v0.1.0 (/mnt/ssd/develop/bandersnatch-vrfs-spec/example)
Finished `release` profile [optimized] target(s) in 1.48s
Running `target/release/ark-ec-vrfs-bandersnatch-example`
* Time taken by ring-vrf-sign: 629.248169ms
Ring signature verified
vrf-output-hash: 6b260bfda2e3ef118c529f30b60dfa4678fbeef3682b55ba002aa8633f1b0364
* Time taken by ring-vrf-verify: 6.105183ms
* Time taken by ietf-vrf-sign: 394.206µs
Ietf signature verified
vrf-output-hash: 6b260bfda2e3ef118c529f30b60dfa4678fbeef3682b55ba002aa8633f1b0364
* Time taken by ietf-vrf-verify: 599.059µs
I have a beefy threadripper 3970X, but I don't expect 24s
See here latest benchmarks:
https://github.com/davxy/crypto-benches/blob/main/vrf/README.md#verify
# 2024-07-03 17:38 davxy: Proving is expected to be slow (well, not that slow). But proving is done once per epoch, offchain, and by sufficiently powerful candidate validators
# 2024-07-03 17:46 davxy: Verification is more critical. Number of tickets per block is limited to 16 (by the GP) and multiple tickets verification can be done in parallel.
# 2024-07-03 17:50 davxy: Oh. I also forgot to add the "parallel" feature to ark-ec-vrfs.
These are the timings (as you can see, proving is a lot faster):
❯ cargo run --release
* Time taken by ring-vrf-sign: 156.818104ms
Ring signature verified
vrf-output-hash: 6b260bfda2e3ef118c529f30b60dfa4678fbeef3682b55ba002aa8633f1b0364
* Time taken by ring-vrf-verify: 6.4541ms
* Time taken by ietf-vrf-sign: 412.089µs
Ietf signature verified
vrf-output-hash: 6b260bfda2e3ef118c529f30b60dfa4678fbeef3682b55ba002aa8633f1b0364
* Time taken by ietf-vrf-verify: 613.725µs
(edited)
# 2024-07-23 00:46 sourabhniyogi: The key insight, we found, is from RFC 26 \[which we're not supposed to read and get too confused by because Safrole != Sassafras, but is an excellent introduction nevertheless \] in these:
(a)
ietf_vrf_output (Non anonymous VRF): vrf_output(secret, input) == vrf_signed_output(signature)
(b)
ring_vrf_output (Anonymous VRF): vrf_signed_output(signature) == ring_vrf_signed_output(ring_signature);
The (a) case has the block author reveal himself through the signature in block authoring; you can consider that a "non-anonymous VRF".
The (b) case does NOT reveal the ticket submitter; you can consider that an "anonymous VRF", which is the beauty of ring VRFs, an important invention.
The key AHA I think for people is you will put (a)+(b) together with the same Bandersnatch key like so:
vrf\_output(secret, input) == vrf\_signed\_output(signature) == ring\_vrf\_signed\_output(ring\_signature)
and this basically maps onto this line here from davxy :
https://github.com/davxy/bandersnatch-vrfs-spec/blob/1ec75e9a3af3a2be7dbca5090171c01e29ac5854/example/src/main.rs#L268C33-L268C48
Once you understand that, all the GP notation will fall into place for you, I hope. The ticket submitter uses (b) _anonymously_, and once it's time to author blocks (assuming the non-fallback case, and that his ticket submission is among the lowest), he uses (a) _non-anonymously_. The ticket id is the VRF output common to both (a) and (b).
(edited)
# 2024-07-03 00:08 sourabhniyogi: Perfect, thank you! I think most non-cryptographer non-Rust engineers can jump into this and execute now. [We will proceed to try to pass our first few tiny tests with a FFI, hope others will do the same]
(edited)
# 2024-07-04 10:28 davxy: Test vectors PR updated to make use of a not "random" SRS (the stuff used to construct the RingContext).
Check the README for some pointers and update ark-ec-vrfs for new constructor.
# 2024-07-04 12:34 sourabhniyogi: Question: In Section 15, Guaranteeing: "With two guarantor signatures, the work-report may be distributed to the forthcoming Jam chain block author in order to be used in the E_G, which leads to a reward for the guarantors." But how could a work-report be distributed to a _forthcoming_ block author if, via Safrole, block authors are only knowable when they reveal themselves by actually authoring a block?
# 2024-07-04 12:52 gav: By distributing to all possible block authors (i.e. all validators).
(edited)
# 2024-07-04 12:59 purpletentacle: I know networking is still not fully defined, but I have a quick follow-up question on that: the idea in JAM is that every node will connect to every other node and there will be no gossip at all, right?
(edited)
# 2024-07-04 16:52 gav: > <@purpletentacle:matrix.org> I know networking is still not fully defined.. but I have a quick follow up question on that.. the idea in jam is that every node will connect to every other node and there will be no gossip at all, right?
Yes indeed.
# 2024-07-04 16:55 gav: There will likely be two revisions of the protocol - one initial JAMSNP focusing on simplicity and speed of implementation and a second JAMNP focussed on optimisation and security.
(edited)
# 2024-07-04 22:59 sourabhniyogi: Thank you for this sketch, it's very helpful! Is it reasonable to have a "tiny" setup (say 6 validators; a single team, _maybe_ with one other team; enabling a Milestone 1 "solo" networking PoC that is easy to simulate and then to run with real QUIC) and a "medium" setup (say 16-32 validators; enabling a Milestone 2 multi-team networking PoC with 4-5 teams; possible for a single team to simulate and run for real with basic resources) WELL before the 1023-validator setup (which few teams will have the resources for; not sure it's even possible to simulate)? I am wondering whether a "lite Appendix H" erasure coding (not in the GP, but suitable for Milestones 1+2) would be worth it to support the tiny and medium cases for simplicity / speed of implementation, and whether it could have significant utility over the next 6-9 months?
(edited)
# 2024-07-05 00:24 xlchen: those constants should be configurable and in theory all we need to do is config the node with corresponding values
# 2024-07-05 00:25 xlchen: (I made the mistake of hardcoding the constants and later needed to refactor the code to make them configurable)
# 2024-07-05 02:41 sourabhniyogi: Absolutely! I mean to ask if before 341\*3=1023 (JAM Toaster sized) we can have
- tiny C=2 x 3 = 6 validators
- xsmall C = 4 x 3 = 12 validators
- small C = 6 x 3 = 18 validators
- medium C = 10 x 3 = 30 validators
- large C = 30 x 3 = 90 validators
- xlarge C = 50 x 3 =150 validators
- xxlarge C = 100 x 3 = 300 validators
- The Toaster C = 341 x 3 = 1023 validators
where we can just have different RS codes in the various JAM configurations to match. I believe we only need 3 configurations (one for the first 3 milestones), but I think the RS code of 342:1026 is "impedance matched" to C=341+1, so probably some values of C are more natural than others for smaller C?
Is the concept of a "bootnode" gone? Where does the 18 byte per validator directory live?
(edited)
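The scaling ladder above, together with xlchen's advice to keep the constants configurable, might be sketched as a small preset table. The names and the validators = 3 x cores relation come from the list above; nothing here is GP-specified except the full 1023-validator size:

```python
# Illustrative network-size presets mirroring the proposal above.
# Only "toaster" (C = 341, 1023 validators) matches the full GP spec.
PRESETS = {
    "tiny": 2, "xsmall": 4, "small": 6, "medium": 10,
    "large": 30, "xlarge": 50, "xxlarge": 100, "toaster": 341,
}

def validator_count(cores: int) -> int:
    # Each core is backed by 3 validators.
    return 3 * cores
```

Keeping such values in a config (rather than hardcoded constants) avoids the refactor xlchen describes.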
# 2024-07-05 02:44 xlchen: but I guess we still need some bootnodes; it is just that technically the network doesn't need them to stay alive, so they don't need to be specified?
(edited)
# 2024-07-05 02:45 xlchen: we can still have some implementation-specific genesis config that includes some bootnodes, and make it such that all implementations support that format. It is just that this doesn't need to be defined in the GP
# 2024-07-05 04:45 tomusdrw: > <@gav:polkadot.io> There will likely be two revisions of the protocol - one initial JAMSNP focusing on simplicity and speed of implementation and a second JAMNP focussed on optimisation and security.
What's the plan for in-browser light clients? It would be good to have something WebSocket-based (or maybe even REST 🤪) since nothing fancier can be initiated from the browser context AFAIR.
Alternatively I can imagine a websocket<>jam gateway, but if it's not part of the major client implementations we risk heavy centralization and/or a monoculture of the implementations one can connect to.
# 2024-07-05 11:21 gav: bootnodes are always a painpoint. JAM won't address them directly, but we might use something like IPFS or WebTorrent to distribute them in a resilient way.
(edited)
# 2024-07-05 11:24 gav: yes indeed - this should be figured out and standardised fairly early on, but it's not crucial for the (validator) protocol per se.
(edited)
# 2024-07-06 13:55 shwchg: Hi everyone! In Appendix H of the Gray Paper, we are trying to understand what GF(16) means. The latest version mentions that 16-bit GF points are selected, so we believe that GF(16) here refers to GF(2^16) -- do we understand this correctly?
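For intuition on what GF(2^16) arithmetic looks like: elements are 16-bit values, addition is XOR, and multiplication is carry-less polynomial multiplication reduced modulo a degree-16 polynomial. A toy Python sketch; note the reduction polynomial below is an arbitrary illustrative choice, not the Cantor-basis construction of Lin, Chung, and Han that the GP's Appendix H builds on:

```python
POLY = 0x1100B  # x^16 + x^12 + x^3 + x + 1; illustrative choice only

def gf16_add(a: int, b: int) -> int:
    # Addition in GF(2^16) is bitwise XOR.
    return a ^ b

def gf16_mul(a: int, b: int) -> int:
    # Carry-less (polynomial) multiplication, reducing modulo POLY
    # whenever the running product exceeds degree 15.
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10000:
            a ^= POLY
    return r & 0xFFFF
```

Real implementations use precomputed log/exp tables or SIMD carry-less multiply instead of this bit loop.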
# 2024-07-07 10:53 oliver.tale-yazdi: I don't understand how the selection of fallback keys works here. It looks like k is being indexed with a random index (I assume u32). But k only has 600 elements, so it will try to access invalid indices? Is there a hidden modulo somewhere?
(edited)
# 2024-07-08 05:36 bhcme: Hi everyone! 👋
I wonder why the rate of 342:1026 is used for erasure coding, as stated in the opening sentence of Appendix H, Erasure Coding: “The foundation of the data-availability and distribution system of Jam is a systematic Reed-Solomon erasure coding function in gf(16) of rate 342:1026, as defined by Lin, Chung, and Han 2014.”
In particular, the precondition for the efficient implementation of the Reed-Solomon encoding and erasure decoding algorithm mentioned in the above paper only applies for (n = 2^r, k)
# 2024-07-08 06:41 sourabhniyogi: Can we get PVM Host function test cases soon? Here is my wish:
https://github.com/w3f/jamtestvectors/pull/3#issuecomment-2212706605
Some candidate typo fixes for Appendix B: \[for GP v0.2.3\]
1. For info, it has b_o instead of o on the 4th mutation line
2. For designate, does the 176 (144+32 I gather?) need to be 336 to match (51)-(55)? If not, what is the content of the 176 bytes per validator?
3. For peek, it seems to need l instead of i on the 2nd and 3rd mutation lines
4. Similarly for poke, it seems to need l instead of i on the 3rd and 4th mutation lines
(edited)
# 2024-07-08 08:37 gav: Appendix A and B are rather different things so it makes little sense to comment on that PR.
(edited)
# 2024-07-08 08:44 gav: > <@sourabhniyogi:matrix.org> Can we get PVM Host function test cases soon? Here is my wish:
> [...]
Test vectors will arrive when they’re ready and in the order that they’re ready. I’d caution you not to be pushy about it. Doing so will not aid their swift arrival.
# 2024-07-12 14:34 danicuki: here is the source code for it (generate the image at https://yuml.me/)
// greek letters
// -----------------------
[τ'] -> [H]
[β'] -> [H]
[β'] -> [EG]
[β'] -> [C]
[γ'] -> [H]
[γ'] -> [τ]
[γ'] -> [ET]
[γ'] -> [γ]
[γ'] -> [ι]
[γ'] -> [η']
[γ'] -> [κ']
[η'] -> [H]
[η'] -> [η]
[η'] -> [τ]
[κ'] -> [H]
[κ'] -> [τ]
[κ'] -> [κ]
[κ'] -> [γ]
[κ'] -> [ψ']
[λ'] -> [H]
[λ'] -> [τ]
[λ'] -> [λ]
[λ'] -> [κ]
[ψ'] -> [EJ]
[ψ'] -> [ψ]
[δ†] -> [EP]
[δ†] -> [δ]
[δ†] -> [τ']
[ρ†] -> [EJ]
[ρ†] -> [ρ]
[ρ‡] -> [EA]
[ρ‡] -> [ρ†]
[ρ'] -> [EG]
[ρ'] -> [ρ‡]
[ρ'] -> [κ]
[ρ'] -> [τ']
[δ' χ' ι'] -> [EA]
[δ' χ' ι'] -> [ρ']
[δ' χ' ι'] -> [δ†]
[δ' χ' ι'] -> [χ]
[δ' χ' ι'] -> [ι]
[δ' χ' ι'] -> [φ]
[φ'] -> [EA]
[φ'] -> [ρ']
[φ'] -> [δ†]
[φ'] -> [χ]
[φ'] -> [ι]
[φ'] -> [φ]
[C] -> [EA]
[C] -> [ρ']
[C] -> [ρ†]
[C] -> [χ]
[C] -> [ι]
[C] -> [φ]
[α'] -> [EG]
[α'] -> [φ']
[α'] -> [α]
[π'] -> [EG]
[π'] -> [EP]
[π'] -> [EA]
[π'] -> [ET]
[π'] -> [τ']
[π'] -> [τ]
[π'] -> [π]
(edited)
# 2024-07-12 14:33 danicuki: I have created a diagram mapping all state transition components dependencies. I hope it is useful.
# 2024-07-13 19:19 celadari: Hello everyone,
I have some questions, I don't know if I should post them here or on the Jam Chat channel but 🤷♂️ here we go :
### Appendix G:
1. **Signature Set and Key Generation:**
- To confirm, is ( F\_{m,k} ) the set of signatures using the IETF VRF (as per Goldberg 2023) where keys are generated from the Bandersnatch curve?
2. **Size of Y:**
- Why is ( Y ) of size 96 bytes? We perform decode(x:32) and decode(x32:). Does this mean that the first 32 bytes are the public key and the rest is additional data?
3. **Merkle Root and Public Keys:**
- To confirm, is ( O(\[H\_B\]) ) the Merkle root of the Merkle tree where the leaves are the authors' public keys?
4. **Use of RingVRF:**
- To confirm, do we use the ring VRF from the Jeffrey 2023 paper, with the first ring VRF construction (part 4 of the paper) and with \( \text{Com}^*.\text{Commit}(\text{ring}) = O([H_B]) \)
5. **Size of Y (second instance):**
- Why is ( Y ) of size 784 bytes? We perform decode(x:32) and decode(x32:) again. Does this mean that the first 32 bytes are the public key and the rest is additional data?
(edited)
# 2024-07-13 19:20 celadari: ### Appendix I.4.5:
1. **$jam_fallback_seal:**
- For $jam_fallback_seal, do we create an output with the Bandersnatch VRF as mentioned in the questions above? If yes, what arguments, context, and additional data are required?
2. **$jam_ticket_seal:**
- For $jam_ticket_seal, do we use the ring VRF? If yes, what arguments, context, and additional data are required?
3. **$jam_entropy:**
- For $jam_entropy, related to on-chain entropy generation, what does it mean, and how do we compute it?
### Equation (59):
1. **Understanding Components:**
- What are \( i_y \), \( i_r \), \( H_a \), \( H \), and \( E_U(H) \)?
### Equation (60):
1. **Defining \( H_a \):**
- Do we define \( H_a \) with the \( i \) from the equation "let \( i = \gamma_s'[H_t]^{\circlearrow}"?
2. **Understanding \( H \) and \( E_U(H) \):**
- What is \( H \) and \( E_U(H) \)?
# 2024-07-13 19:20 celadari: Let me know if I should move them to other channel or if you need more precisions
Thank you in advance
# 2024-07-14 06:52 gav: > <@celadari:matrix.org> Hello everyone,
> [...]
Your interpretation of the use of the colon in subscripting is correct.
# 2024-07-14 06:55 gav: > <@celadari:matrix.org> ### Appendix I.4.5:
> [...]
Appendix I is the index. Maybe ask your questions in terms of the actual protocol specification?
# 2024-07-14 07:00 gav: Re 59, you can find all the information in the text around it. E.g. i is determined from subscripting gamma’_s, a member of the sequence of tickets, telling you that it’s a ticket and thus that i_y is a VRF output, used as a ticket identifier (see (50) and text around it).
# 2024-07-14 07:00 gav: Don’t expect to be spoon fed this stuff. It’s not playschool. You need to read carefully.
# 2024-07-14 07:05 celadari: > <@gav:polkadot.io> Appendix I is the index. Maybe ask your questions in terms of the actual protocol specification?
I'm not sure I understand what you mean by "Appendix I is the index" 🤔
At least for the term $jam_entropy => can you give a pointer to this term please?
# 2024-07-14 07:13 gav: I’m not sure why you think it reasonable to ask for someone else to do a PDF text search for you like they’ve nothing better to be doing Sunday morning.. anyway (62) X_E = $jam_entropy
(edited)
# 2024-07-14 07:14 gav: As i say do not expect to be spoon fed. Any more such questions may elicit little response.
# 2024-07-15 10:50 xlchen: just to confirm, a validator key can be selected multiple times right
# 2024-07-15 13:40 dakkk: Reading the integer encoding in Appendix C, I'm unable to understand how x=0 is handled by the general natural serialization rule 274; if x is 0, I can't apply the first if, since (2**0 == 1) > 0 (so no valid l exists), but applying the second if doesn't seem correct either.
Am I missing something, or is the case x=0 just not covered?
(edited)
# 2024-07-15 14:01 celadari: > <@dakkk:matrix.org> sent an image.
then in this case, 0 < 2**64 no ?
# 2024-07-15 14:04 dakkk: yes, but following this path, 0 is encoded as \xff\x00\x00\x00\x00\x00\x00\x00\x00; that doesn't seem right to me
# 2024-07-15 14:10 celadari: what happens when you generate several elements from 1 to 2**7?
# 2024-07-15 14:13 gav: > <@dakkk:matrix.org> yes, but following this path, 0 is encoded as \xff\x00\x00\x00\x00\x00\x00\x00\x00; it doesn't seem right to me
yeah, zero is a special case and should be in the first branch
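The discussion above can be made concrete. Below is a sketch of the general natural-number codec as we read Appendix C (our reading, not normative), with x = 0 falling into the first branch as gav indicates:

```python
def encode_natural(x: int) -> bytes:
    # Variable-length encoding of a natural x < 2^64, per our reading of
    # GP Appendix C: for x in [2^(7l), 2^(7(l+1))) the prefix byte is
    # 256 - 2^(8-l) + floor(x / 2^(8l)), followed by l little-endian
    # bytes of x mod 2^(8l); x >= 2^56 gets a 0xFF prefix plus 8 bytes.
    # x = 0 lands naturally in the l = 0 (single-byte) branch.
    assert 0 <= x < 2**64
    for l in range(8):
        if x < 2**(7 * (l + 1)):
            prefix = 256 - 2**(8 - l) + (x >> (8 * l))
            return bytes([prefix]) + (x & (2**(8 * l) - 1)).to_bytes(l, "little")
    return bytes([255]) + x.to_bytes(8, "little")
```

With zero in the first branch, 0 encodes as the single byte 0x00 rather than the nine-byte \xff... sequence dakkk was getting.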
# 2024-07-16 18:07 sourabhniyogi: Right sorry -- 684 is still not a multiple of 64 or power of 2.
# 2024-07-17 13:21 gav: No, it's 342 x 2 bytes.
It's therefore the minimum reconstructible datum.
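The arithmetic here can be spelled out as a back-of-envelope sketch; the helper name is ours, while the 342:1026 rate and the 2-byte GF(2^16) points come from Appendix H:

```python
K, N = 342, 1026   # rate 342:1026: any 342 of the 1026 shards reconstruct
POINT_BYTES = 2    # one GF(2^16) point is 2 bytes

def min_reconstructible_datum() -> int:
    # One point per original shard: 342 * 2 = 684 bytes. This is why 684
    # need not be a multiple of 64 or a power of two.
    return K * POINT_BYTES
```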
# 2024-07-18 03:38 sourabhniyogi: Right, we understand that is what the GP says but that's not what cheme's PR test vectors are doing. We have our encoder + decoder FFI using the Rust libraries (and "passing" for arbitrary blob sizes), but the process by which cheme's erasure coding test vectors are generated are clearly on a different page than the GP's Appendix H.
(edited)
# 2024-07-18 07:13 gav: > <@sourabhniyogi:matrix.org> Right, we understand that is what the GP says but that's not what cheme's PR test vectors are doing. We have our encoder + decoder FFI using the Rust libraries (and "passing" for arbitrary blob sizes), but the process by which cheme's erasure coding test vectors are generated are clearly on a different page than the GP's Appendix H.
Ok. Is it possible that “Parity implementation (serialized)” is just 6 of the graypaper ECs concatenated?
# 2024-07-16 09:22 xlchen: I don't get this. What does "the lowest items of the sorted union" mean?
# 2024-07-16 09:23 xlchen: is the tickets accumulator a sorted array by the ticket id?
# 2024-07-16 09:34 xlchen: ok, after reading the Sassafras RFC I think I get it. Please confirm: this is basically a PoW-ish mining process; validators try to mine tickets with the lowest VRF outputs, they have a limited submission allowance, and only EPOCH_LENGTH tickets will be accepted
# 2024-07-16 09:35 gav: > <@xlchen:matrix.org> ok after read the sassafras RFC, I think I get it. [...]
That's about right.
# 2024-07-16 09:36 gav: "union" => combining two sets. "sorted" => sorting that. "lowest items" => take only the first N items.
# 2024-07-16 09:38 xlchen: I see. So the ticket extrinsic shouldn't contain invalid tickets (when the pool is full and the VRF output is high), otherwise the block becomes invalid?
(edited)
# 2024-07-16 09:40 oliver.tale-yazdi: > <@xlchen:matrix.org> I see. So the ticket extrinsic shouldn't contain invalid ticket (when pool is full and high VRF output) otherwise the block become invalid?
or incorrectly sorted. The conformance tests cover these cases
# 2024-07-16 09:43 gav: Yes, broadly speaking we always put the onus on the block-producer to make blocks which don't waste the block importer's time.
# 2024-07-16 12:12 gav: Some fixes just went in, especially for disputes/judgements/verdicts; if you're working on that, best to use main.
(edited)
# 2024-07-17 09:19 xlchen: for serialization of numbers, what is the default serialization method?
# 2024-07-17 09:20 xlchen: I guess when unspecified, it is the general natural number serialization method, which is variable-length?
# 2024-07-17 11:07 gav: It was used in an older version of the PVM. It might be used again in the future, but right now it's superfluous.
# 2024-07-17 14:33 celadari: equation (59) is for the normal lottery case (header "anonymously" signed with the ring VRF), right?
so shouldn't it be
$$\mathbf{H}_s \in \bar{\mathbb{F}}_{\gamma_z}^{[]}\langle X_T \frown \eta_3' ++ i_r \rangle$$ ?
(edited)
# 2024-07-17 15:09 gav: No - neither case is anonymous at the point of seal verification. It’s only anonymous at the point of ticket submission.
(edited)
# 2024-07-17 16:11 subotic: Probably just a minor detail: two lines after (214), "The PVM exit reason r". Shouldn't this be $\varepsilon$?
# 2024-07-17 17:24 gav: > <@subotic:matrix.org> Probably just a minor detail, two lines after (214): The PVM exit reason r. Shouldn't this be $\varepsilon$?
yes - typo. will be fixed in the next release. thanks
# 2024-07-17 20:06 proxy720: Hi @gav, I'm looking at the latest version of the paper (DRAFT 0.3.0). I think there is a typo in a couple of equations in appendix C (the scale encoder):
- (273): first case: ... ^ E_l(x mod 2^{8l}), should it be E_{l-1}?
- (273): second case: ... ^ E_3, I think it wants to be E_2
- (274): the same, I think the indices should be decreasing
But maybe I'm wrong...
# 2024-07-17 20:14 proxy720: > <@proxy720:matrix.org> Hi @gav, I'm looking at the latest version of the paper (DRAFT 0.3.0). I think there is a typo in a couple of equations in appendix C (the scale encoder):
> - (273): first case: ... ^ E_l(x mod 2*8l), should it be E_{l-1} ?
> - (273): second case: ... ^ E_3, I think it wants to be E_2
> - (274): the same, I think the indices should be decreasing
> But maybe I'm wrong...
Yes, I am wrong; the expressions are not recursive but refer to (272). Sorry, my bad.
# 2024-07-17 23:27 xlchen: A small annoyance: all the formulas are numbered, which is obviously very useful for many reasons. However, the numbers are not fixed and are subject to change between versions. This makes referring to a particular formula from code a bit harder, as the reference also needs to be fixed to a specific GP version; eventually the repo will contain references to multiple GP versions, which just makes things harder. I don't know if there is a solution to this, but it would be great if we had a good way to reference the GP from code comments / docs / chats
# 2024-07-18 07:09 gav: > <@xlchen:matrix.org> A small annoyance: all the formulas are numbered, which is obviously very useful for many reasons. However, the numbers are not fixed and are subject to change between versions. This makes referring to a particular formula from code a bit harder, as the reference also needs to be fixed to a specific GP version; eventually the repo will contain references to multiple GP versions, which just makes things harder. I don't know if there is a solution to this, but it would be great if we had a good way to reference the GP from code comments / docs / chats
Once there is a 1.0 then this should be better. Until then if you want a fixed reference you could use the latex label, where there is one (and if there isn’t, then let me know or make a pr)
# 2024-07-18 08:31 sourabhniyogi: Cheme's erasure coding process (now ours, via FFI into the same Rust libs) is as follows:
Encoding:
- blob is split into 4096-byte segments
- each 4096-byte segment is erasure coded into 12-byte subshards shared among 1026 (342 x 3) validators, ie 12,312 bytes
Decoding:
- available 12-byte subshards in a 12,312 byte encoding (only 342 of which are needed for reconstruction) are used to reconstruct the 4096-byte segment
- each of the 4096-byte segments are put together to reconstruct the blob
GP wants 4104-byte segments (W\_C = 684 x W\_S = 6) instead of 4096. This is a real discrepancy.
(edited)
# 2024-07-18 09:03 gav: Ok - yeah cheme's current system is incorrect and the test vectors shouldn't have been generated.
# 2024-07-18 09:04 gav: If you're confident you have the GP properly implemented, feel free to create your own test vectors and submit them as a PR.
# 2024-07-18 09:06 gav: The lowest reconstructible datum of EC should be 684 bytes, reconstructible from 342 parties out of the total of 1026 generated during EC, each presenting 2 bytes.
# 2024-07-18 09:08 gav: So 342 byte-pairs gets expanded to 1026 byte-pairs and from this any 342 byte-pairs can be used to reconstruct the original 342 byte-pairs.
For ECing segments (4104 bytes, 2052 byte-pairs), then we basically do the ECing above 6 times in parallel and concatenate.
# 2024-07-18 09:09 gav: For ECing the Work-Bundle, then we do the above N times in parallel and concatenate, where N is
CEIL(WORK_BUNDLE_SIZE / 684)
(edited)
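The byte-pair arithmetic gav lays out above can be sketched as a toy layout calculation (the helper names are illustrative, not from the GP; this computes sizes only, it is not an erasure codec):

```python
import math

# Toy sketch of the EC layout described above: a work-bundle is split into
# 684-byte pieces, and each piece is independently expanded from 342
# byte-pairs to 1026 byte-pairs (rate 1/3).

PIECE_BYTES = 684    # smallest independently-reconstructible datum
PAIRS_IN = 342       # original byte-pairs per piece
PAIRS_OUT = 1026     # byte-pairs after expansion

def piece_count(bundle_size: int) -> int:
    """N = CEIL(WORK_BUNDLE_SIZE / 684)."""
    return math.ceil(bundle_size / PIECE_BYTES)

def encoded_size(bundle_size: int) -> int:
    """Total bytes produced by ECing all pieces of a work-bundle."""
    return piece_count(bundle_size) * PAIRS_OUT * 2

# A 4104-byte segment is exactly 6 pieces glued together (4104 = 684 * 6),
# so its encoding is 6 * 1026 * 2 = 12312 bytes.
assert piece_count(4104) == 6
assert encoded_size(4104) == 12312
```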
# 2024-07-18 09:17 sourabhniyogi: Understood! But I think we're up against the fact that Lin, Chung, and Han 2014, referenced in the GP, only works with powers of 2, and we need to get down into "what package can do _that_" nitty-gritty.
# 2024-07-18 09:30 sourabhniyogi: This paper (which Han pointed one of our team to) appears to get over the "powers of 2" limitation and gets to what the GP wants:
https://ieeexplore.ieee.org/document/8955804
but I don't believe it's reasonable for teams to be implementing it independently. IMHO every team should have an FFI into a very widely used erasure coding package, or (similar to the davxy Bandersnatch repo, I imagine) we should all just use the same one, implemented once, well?
(edited)
# 2024-07-18 09:48 gav: Agreed, implementation of the underlying RS EC is not something which is needed for the prize. I'd be very happy to see a high-quality treasury-funded implementation for use across client implementations.
(edited)
# 2024-07-18 10:52 gav: 684 bytes split out into 1026 \* 2 bytes, any 342 of which can be used to reconstruct.
(edited)
# 2024-07-18 10:53 gav: once you've got this to match, the rest should be straightforward.
# 2024-07-18 11:36 alistair: > <@gav:polkadot.io> 684 bytes split out into 1026 \* 2 bytes, any 342 of which can be used to reconstruct.
This could be 1024*2.
(edited)
# 2024-07-18 11:59 oliver.tale-yazdi: > <@gav:polkadot.io> once you've got this to match, the rest should be straightforward.
yep i can recover it from a random 342 subset of these
# 2024-07-18 12:02 shwchg: I'll briefly explain the method from the paper mentioned in GP Appendix H, using 256:1024 as an example:
1. Use the original 256 GF points as the values at the evaluation points, and then reverse-generate a polynomial expression, as defined by the authors.
2. Therefore, substituting the 0th to 255th elements into this polynomial will give us the original data we need.
3. Next, substitute the 256th to 1023rd elements sequentially into the polynomial to obtain the redundancy data we need.
This method for quickly computing the values at the evaluation points is a recursive formula; the inverse transform is a recursive formula too.
(edited)
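The systematic construction shwchg outlines can be illustrated with a toy codec using Lagrange interpolation over the prime field GF(65537), standing in for the GP's GF(2^16)/Cantor-basis FFT (an assumed simplification for clarity; the real scheme is vastly faster but algebraically analogous):

```python
# Toy systematic Reed-Solomon over the prime field GF(65537): the first k
# codeword symbols ARE the data (evaluations at points 0..k-1 of the unique
# degree < k polynomial through them); parity symbols are evaluations at
# points k..n-1. Any k symbols recover the polynomial, hence the data.

P = 65537  # prime modulus; symbols are 16-bit values

def interpolate_eval(points, x):
    """Evaluate at x the unique polynomial through `points` (Lagrange)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def encode(data, n):
    """Systematic encode: codeword[0:k] = data, the rest are parity evals."""
    k = len(data)
    pts = list(enumerate(data))
    return data + [interpolate_eval(pts, x) for x in range(k, n)]

def recover(shards, k):
    """Recover the original k symbols from any k (index, value) shards."""
    pts = shards[:k]
    return [interpolate_eval(pts, x) for x in range(k)]

data = [7, 42, 1000, 65535]                  # k = 4 two-byte symbols
code = encode(data, 12)                      # n = 12, rate 1/3 like 342:1026
survivors = [(i, code[i]) for i in (2, 5, 9, 11)]  # any 4 shards suffice
assert recover(survivors, 4) == data
```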
# 2024-07-18 12:04 shwchg: Could you explain more clearly how you generated this vector? No matter how I look at it, it doesn't seem like it can be used for 342:1026.
# 2024-07-18 12:10 shwchg: The provided test vector contains original data with 1026 bytes, which is 513 octet pairs (Y_2). According to GP, every 342 octet pairs will be packed into 1 "segment" for encoding. Therefore, in this case, we should get 2 "segments" of codewords, totaling 1026 * 2 = 2052 octet pairs.
However, this provided test vector only contains 1 segment (1026 octet pairs) of codewords.
Please let me know if I got anything wrong.
# 2024-07-18 12:25 gav: The data I provided in the gist 9c07...a574 is 1368 nibbles, which is 684 bytes (342 byte-pairs).
(edited)
# 2024-07-18 12:26 gav: This gets ECed into 1026 byte-pairs (segment_ec) - concatenating the first 342 results in the data. The other 684 byte-pairs are the additional redundancy.
(edited)
# 2024-07-18 12:29 gav: Once you can do this, then you just need to glue 6 of them together to do segment encoding (4104 = 684 * 6).
# 2024-07-18 12:39 shwchg: Sorry for that. I originally thought it was base64 (because the other vectors are encoded using base64), but this doesn't answer my question. The paper has the strict limitation that (n, k) should be 2^r, so how did you apply it to generate this vector?
# 2024-07-18 12:46 gav: For now just assume it's a normal systematic RS EC in GF(2^16)
# 2024-07-18 12:48 gav: We have found that the crate parity uses for EC can parallelise effectively if you're careful about which shards are which and you ensure indices are equivalent for all shards.
(edited)
# 2024-07-18 12:49 gav: We'll probably improve this in the future to allow for 32x parallelism using 4 indices (8 shards per index), which should get very close to the theoretical maximum performance without necessitating a fresh implementation.
# 2024-07-18 12:49 gav: However, for M1/M2 performance is not especially relevant - you just need to be able to get the right EC results.
(edited)
# 2024-07-18 12:51 gav: As for hex/base64, yeah I switched it - I'm not sure why base64 was ever used as it makes it largely unreadable to human eyes. I'll push people to stick to hex so it's trivial to see things like string lengths right in a text editor and not worry about weird padding artefacts.
(edited)
# 2024-07-18 13:19 shwchg: So in the future, it is possible that there will be changes in the encoding rate and data segmentation, right? When will this be confirmed, and what would you recommend we do for our implementation in the meantime?
# 2024-07-18 13:29 shwchg: Regarding the recovery of the test vector you provided earlier, Can you provide us with more relevant implementation details?
I want to make sure that our approach is correct.
# 2024-07-18 14:01 gav: No changes are planned and I don't view any changes as likely.
(edited)
# 2024-07-18 14:21 gav: I believe this is the paper describing the underlying EC schema: D. G. Cantor, "On arithmetical algorithms over finite fields",
Journal of Combinatorial Theory, Series A, vol. 50, no. 2, pp. 285-300, 1989.
Systematic Reed Solomon Finite-Field Erasure Codec, GF(2^16), Cantor basis 2
(edited)
# 2024-07-18 14:56 prematurata: hello, when trying to implement the codec for the dictionary encoding "method 1", or this formal definition, I find it hard to write a generic codec. Let me explain better.
The notation seems to indicate that when encoding a dictionary, the following steps should be taken:
- encode the number of key/value pairs (formula 275)
- then for each key/value pair:
- concatenate the encoded key
- concatenate the encoded corresponding value
If my understanding is correct, then let's think about a dict with different key types and value types, each with variable length. There is no explicit mechanism in the formal specification that would allow an implementation to properly encode & decode the key/value pairs.
I would've expected to also see some kind of E(k*) = E(|k|) concatenated with E(k). Same for d[k].
(edited)
# 2024-07-18 14:58 prematurata: unless i am misinterpreting the formula (which is a strong possibility here)
# 2024-07-18 15:01 gav: No dictionary with differing key/value types uses this encode function.
# 2024-07-18 15:03 prematurata: > <@gav:polkadot.io> No dictionary with differing key/value types use this encode function.
thanks. maybe it's worth mentioning in the graypaper?
# 2024-07-18 15:05 prematurata: also for future reference, I think it might be a good idea. In case JAM ends up needing to use it, we might want to be sure that we only use it when that condition is satisfied.
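For reference, the length-prefix scheme prematurata sketches (E(k*) = E(|k|) ++ E(k)) might look like this in toy form. The helpers are hypothetical, not the GP's encoder, and per the reply above the GP's dictionary serializer never needs this, since the key/value types there are fixed-size:

```python
# Hypothetical length-prefixed encoding for variable-length keys/values.
# encode_len is a toy stand-in for the GP's natural-number encoding (here:
# a single length byte, so items must be under 256 bytes).

def encode_len(n: int) -> bytes:
    assert n < 256
    return bytes([n])

def encode_var(blob: bytes) -> bytes:
    """E(k*) = E(|k|) ++ E(k): prefix the item with its length."""
    return encode_len(len(blob)) + blob

def decode_var(stream: bytes) -> tuple[bytes, bytes]:
    """Inverse: split one prefixed item off the front, return (item, rest)."""
    n, rest = stream[0], stream[1:]
    return rest[:n], rest[n:]

wire = encode_var(b"key") + encode_var(b"value")
k, wire = decode_var(wire)
v, wire = decode_var(wire)
assert (k, v, wire) == (b"key", b"value", b"")
```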
# 2024-07-18 16:28 sourabhniyogi: Oliver Tale-Yazdi: Which package did you use to decode the
https://gist.github.com/gavofyork/4cde92327cd91eff3e2ab91d316cb83a "normal systematic RS EC in GF(2^16)"? Can you post your (I presume Rust) decoder in a similar gist? An encode-decode combo?
Happy to submit a PR with the same shape of test vectors as cheme's PR after we succeed, have you review it, based on the above.
(edited)
# 2024-07-19 17:14 sourabhniyogi: gav: Do you have wire format preferences for JAMSNP
https://hackmd.io/@polkadot/jamsnp
For speed, I'd like to suggest we start with JSON and end with scale or protobuf. Not arguing for protobuf over scale here ... but even though JSON is something engineers tend to snicker at, it usually makes debugging life easier, between individuals and especially between teams, at the expense of a bunch of (I suggest, temporary) serialization/deserialization. Or perhaps JAMSNP can support more than 1 wire format.
Are the details of JAMSNP appropriate to develop by implementers and then turned into a GP Appendix after we get a couple "serious" implementations in to code complete form, or do you (+JAM protocol engineers) own this problem?
If you don't own this, should implementers put something together into a polkadot-fellows RFC and then GP compatible notation?
(edited)
# 2024-07-19 20:32 gav: SCALE, to the degree it matters. Most items will have a standard binary representation anyway.
# 2024-07-21 20:07 sourabhniyogi: Ok -- maybe C.2 should be expanded to cover all the SCALE encodings, not just for blocks but for JAMSNP stream objects? Here are my notes:
- CE: Ticket submission - Ticket is an _element_ of ${\\bf E}\_T$ (Section 6.7 Equation 73) -- serialization of elements of $\\mathbb{C}$ in C.2 (Equation 288)
- CE: Work Report publication - Guaranteed Work Report is an _element_ of ${\\bf E}\_G$ (Section 11.4 Equation 136) -- serialization of $\\mathbb{W}$ is C.2 (Equation 286)
- CE: Assurance publication - Assurance is an _element_ of ${\\bf E}\_A$ (Section 11.2 Equation 123) -- need serialization in C.2 and a set definition in I.1.1.
- CE: Judgement publication - Dispute contains _components_ of ${\\bf E}\_D$ (Section 10.2 Equation 97) -- need serialization in C.2, though ${\\bf B}$ covers ${\\bf c}, {\\bf v}, {\\bf f}$ components.
- CE: Preimage publication - Preimage is an _element_ of ${\\bf E}\_P$ (Section 11.2 Equation 153) -- need serialization in C.2 and set definition in I.1.1.
- CE: Block publication - Block is Section 4.1 Equation 13; serialization of ${\\bf B}$ is thorough in C.2 (Equations 280-282).
- CE: Work Package Submission and Sharing - Work Package is Section 14.3 Equation 174; serialization of elements of $\\mathbb{P}$ in C.2 (Equation 287).
- CE: Audit-announcement - Announcement - see Section 17.3 Equation 196. Need serialization in C.2 and set definition in I.1.1.
- CE: AuditDA query AuditDAQuery & response AuditDAResponse - need serialization in C.2 and maybe set definitions in I.1.1.
- CE: ImportDA query ImportDAQuery & response ImportDAResponse - need serialization in C.2 and maybe set definitions in I.1.1.
- CE: Public ImportDA reconstruction ImportDAReconstructionQuery & response ImportDAReconstructionResponse - need serialization in C.2 and maybe set definitions in I.1.1.
Beyond the above encoding nitty-gritty, it would be powerful for JAMSNP to include, for each stream, a short description of when the sender is typically expected to send and the expected behavior of the receiver when receiving each of the above. You could leave it as an exercise for JAM implementers, but why?
Pedantic question: Why is V=1023, C=341 and not V=1026, C=342 where the latter matches Appendix H's 1026?
Do you have a recommendation on how implementers get started with a "tiny" C, V, W\_C, W\_S?
Above is relative to 0.3.1 (7/17/24) -- In addition, the start of section 14 is messed up.
(edited)
# 2024-07-22 12:36 celadari: I have a question regarding the safrole algorithm. Since the author of the block "gives away who he is," shouldn't the header also include proof that they are entitled to the ticket? From my understanding, $$H\_s$$ is a Bandersnatch signature of $$E\_U(H)$$ with same context as ticket, but I don't see how it is sufficient to prove ownership of the ticket.
Shouldn't we also include opring, using the notation from the 2023 paper by Jeffrey Burdges, where opring contains a NARK.Proof(comring, pk, ring)?
I also reviewed the code in the Bandersnatch VRFs spec example. I noticed that the code assumes the prover\_key\_index is the correct one by using the same variable prover_key_index for both signing and verification.
I feel there is something obvious that I am not seeing :)
(edited)
# 2024-07-22 12:58 oliver.tale-yazdi: >gives away who he is," shouldn't the header also include proof that they are entitled to the ticket
IIUC: the header seal signature is the proof. It is a bandersnatch sig (not a bandersnatch ring sig).
This sig can only be created by the holder of the key for the slot of that block - which was determined in advance by the ticket contest.
So by issuing a standard bandersnatch signature, they give away that they control the respective authoring key.
# 2024-07-22 13:12 celadari: I agree that it is a bandersnatch signature, but the ticket itself is generated using a bandersnatch ring signature.
How do you "relate" the holder of the key with the ticket seal in $$\\gamma\_s$$?
$$i$$ is known ($$\\gamma\_s\[H\_t\]$$) - and so is the expected Y(H\_s) - so what prevents me from signing E\_U(H) with my own key, signing \[\] with context $$X_E + Y(H_v)$$ as well (since I know the expected Y(H\_s)), and then pretending I am the owner of the ticket?
(edited)
# 2024-07-22 13:19 oliver.tale-yazdi: Not sure if I understand what you are asking, but if you just sign Y(H_v) with a different key, then it won't verify. Y(H_v) is passed into the signature verification function vrf_verify alongside the public key. Only the author key can verify for that Y(H_v).
So you cannot just "steal" the ticket by copying the Y(H_v).
# 2024-07-22 13:29 celadari: What do you mean by "sign Y(H_v)" ? H_v and H_s are both headers provided by the one submitting the block.
The vrf value/ticket is known (looking at $$\gamma_s$$), that is, we know what $$Y(H_s)$$ is supposed to be. So we know the context of the $$H_v$$ signature. So I can sign \[\] with context X_E+Y(H_s) using my own key and sign E_U(H) with context X_T+eta_3+i_r (I can try i_r equal to 0 or 1); I don't see what is going to fail here
# 2024-07-22 19:19 gav: > Pedantic question: Why is V=1023, C=341 and not V=1026, C=342 where the latter matches Appendix H's 1026?
Keeping the number no greater than 1024 is (or at least was) helpful for ensuring the speed of a few algorithms, including EC and SNARKs.
(edited)
# 2024-07-22 20:16 gav: The numbers are correct. There is nothing to resolve. There are 1023 validators and 342 are required for EC reconstruction.
(edited)
# 2024-07-22 23:18 sourabhniyogi: Got it. Clarifying that chunks for "1023..1024..1025" in Appendix H's 342:1026 don't actually get distributed to anyone might be useful, in a footnote in H.1 or 14.2. This would preempt others like me who perceive a discrepancy.
(edited)
# 2024-07-23 09:47 gav: In fact the latest GP (main branch) revises this to a rate of 342:1023, thereby removing this weirdness.
(edited)
# 2024-07-22 19:20 gav: > Do you have a recommendation on how implementers get started with a "tiny" C, V, W_C, W_S?
Don't know what you mean.
# 2024-07-22 19:36 sourabhniyogi: I mean that those of us who want to just start producing blocks with a single machine simulating V nodes will choose a tiny V, like 6 (C=2). But the EC parameters have to match. Ideally we'd have "tiny" test vectors with very low V to match the "full" test vectors. Safrole test vectors have tiny V=6 already and V=1024. So the recommendation request is how to adjust the other parameters W\_C, W\_S, ... and the EC procedure such that all teams can do basically the same thing.
For me this low-V situation (V=6 or 9 or 12) enables a quick way for 2-4 teams to test against each other, in a ZombieNet-like way, instead of the impractically large V=1023 or V=1026 situation (not for mainnet, just for a single team to get all components working together). The EC component is the most glaring, so we'll just use some other Reed-Solomon parameters in the low-V case, but I'm sure you will have better parameter selection?
(edited)
# 2024-07-22 20:20 gav: > <@sourabhniyogi:matrix.org> I mean that for those of us who want to just start doing producing blocks with a single machine simulating V nodes will choose a tiny V, like 6 (C=2). But the EC parameters have to match. Ideally we'd have "tiny" test vectors with very low V to match the "full" test vectors. Safrole test vectors have tiny V=6 already and V=1024. So the recommendation request is how to adjust other parameters W\_C, W\_S, ... and the EC procedure such that all teams can do basically the same thing.
>
> For me this low V situation (V=6 or 9 or 12) enables a quick way for 2-4 teams to test against each other, in a ZombieNet like way, instead of the impractically large V=1023 or V=1026 situation (not for mainnet, just for a single team to get all components working together). The EC component is most glaring so we'll just use some other Reed-Solomon parameters in the low V case but I'm sure you will have better parameter selection?
Safrole should have test vectors for V=1023.
@davxy:matrix.org please confirm.
# 2024-07-22 19:21 gav: > In addition, the start of section 14 is messed up.
Don't know what you mean.
# 2024-07-22 19:29 sourabhniyogi: see final pdf. Not a big deal, but maybe you had some content there.
# 2024-07-22 20:17 gav: > <@sourabhniyogi:matrix.org> sent an image.
No idea how you generated that. My pdf (and that uploaded to github) is fine.
# 2024-07-22 19:23 gav: > but I don't see how it is sufficient to prove ownership of the ticket.
See the top line of (60)
# 2024-07-22 19:23 gav: The ticket ID (VRF output) is required to be the same. This guarantees the sealer is the ticket owner.
# 2024-07-22 19:28 celadari: But how do we go from a bandersnatch signature (H\_s) to a bandersnatch ringVRF? :)
(The only obvious way I see is to have opring.)
When I look at the ringVRF construction from 2023 Jeffrey Burdges I don't see how we can do this (aside from using opring)?
And the ticket is signed with message/additional data equal to empty \[\], and H\_s is supposed to be on E\_U: I think the signature changes if the additional data/message changes
(edited)
# 2024-07-22 20:18 gav: > <@celadari:matrix.org> But how do we go from bandersnatch signature (H\_s) to bandersnatch ringVRF ? :)
> (The only obvious way I see is to have opring)
>
> When I look at the ringVRF construction from 2023 Jeffrey Burdges I don't see how we can do this (aside from using opring) ?
>
> And the ticket is signed with message/additional data equal to empty \[\] and H\_s is supposed to be on E\_U : I think the signature changes if additional data/message changes
Will need to defer to @davxy:matrix.org for the RingVRF/Bandersnatch specifics.
# 2024-07-22 20:21 gav: I have no plans to alter the EC; the tiny Safrole test vectors were given only as a convenience since proof generation can be quite slow. The same is not true for EC.
# 2024-07-22 20:23 gav: V=6 seems fair for a testnet but it’s totally up to you. Experiment.
(edited)
# 2024-07-23 06:14 davxy: > <@gav:polkadot.io> Safrole should have test vectors for V=1023.
@davxy:matrix.org please confirm.
I confirm that "full" vectors are generated with V=1023
# 2024-07-23 06:24 davxy: > <@gav:polkadot.io> Will need to defer to @davxy:matrix.org for the RingVRF/Bandersnatch specifics.
@sourabhniyogi's reply is quite insightful :-) @celadari I'll add something later (currently afk)
# 2024-07-23 10:33 cisco: On another topic, I think some maxes in appendix A should be mins, since they are used to disallow "out of bounds" access
# 2024-07-23 14:10 celadari: Thank you Davide 🙌. After checking the gist I realized that the ietfVRF output and ringVRF output are the same: obvious when looking at the 2024 paper but not obvious when looking at the 2023 paper.
So an ietfVRF proof on an output made by ringVRF is still valid.
So even if the ticket in gamma_s was made using ringVRF, using ietfVRF.Verify with H_s as signature, H_a as public key and E_U(H) as aux_data should produce the same output (since the output is independent of aux_data)
# 2024-07-29 13:20 jay-chrawnna: More lectures and resources will be added over time. Please DM me directly if you have a request to make it more useful!
# 2024-07-29 14:36 celadari: > <@davxy:matrix.org> sourabhniyogi:
>
> 1. For Safrole test vectors the only thing we use is $jam\_ticket\_seal.
> In particular this is used for the ring-vrf input construction (context in the GP) to obtain the ticket score (aka ticket-id) during the candidate ticket verification procedure.
> The value of $jam\_ticket\_seal" is constant and defined as "jam\_ticket\_seal" ASCII string.
> This is a thing that applies in general for values starting with $ (e.g. $foo in the GP => "foo" ascii string).
> (Thank you BTW as I've just spotted that I was using "jam\_seal" instead of "jam\_ticket\_seal", I'll add the fix to the upcoming vectors PR)
> The other constant strings ($jam\_fallback\_seal and $jam\_entropy) are used for block verification and per-block entropy production (which is passed as input to Safrole).
> NOTE: It is not relevant whether the per-block entropy used by the Safrole test vectors was really produced using the signature in the header. Here we abstract away from the value's origin; we don't really care for the sake of the Safrole test vectors and Safrole STF. In the specific case I've used:
>
> - entropy\_0 = blake2b(\[42\_u8; 32\])\[..32\]
> - entropy\_i+1 = blake2b(entropy\_i)\[..32\]
> 2. If you take as a reference the Sassafras RFC then there some differences.
> One of these is the attempts number and redundancy factor, which in the GP are in practice simplified to one single thing (attempts number).
> Threshold is gone.
> Even though reading Sassafras RFC can help (as it is a quite similar protocol), always take the GP as the source of truth for JAM.
> 3. Sure thing. I'll post here as soon as it is ready.
> 4. This is a very interesting observation. Current implementation serializes 3 extra fields (part of the SNARK SRS).
> Serialization of these fields may be important **in a general application**, but here these values are constant.
> I will definitely get rid of these from serialized data (I'm on it). The final size will be 144 (i.e. the last 144 bytes of what you see right now)
Hi Davide,
I'm quoting this old message as it is related to my doubt. I have checked messages on the chat here and there and I just wanted to confirm my understanding of the test vectors regarding safrole.
- Just to confirm, in the test vectors "input.entropy" refers here to Y(H_v) of the GP, right?
- Thus, in the test vectors we don't verify equations (59) and (60) (parts related to ark-ec-vrfs that I asked you about last week), right?
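The entropy chain quoted above (entropy_0 = blake2b([42_u8; 32]), entropy_{i+1} = blake2b(entropy_i)) can be reproduced with a stock Blake2b, assuming `hashlib`'s blake2b with a 32-byte digest matches the hash used for the vectors:

```python
import hashlib

def blake2b_256(data: bytes) -> bytes:
    """Blake2b with a 32-byte digest."""
    return hashlib.blake2b(data, digest_size=32).digest()

# entropy_0 = blake2b([42_u8; 32])[..32]
entropy = [blake2b_256(bytes([42] * 32))]
# entropy_{i+1} = blake2b(entropy_i)[..32]
for _ in range(3):
    entropy.append(blake2b_256(entropy[-1]))

assert all(len(e) == 32 for e in entropy)
```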
# 2024-07-29 17:42 davxy: - As that old message reports, the input.entropy **for the test vectors** is generated in some "arbitrary" manner (the quoted message reports how I generated it). In a real scenario YES, it is generated as per equation (66) of the last GP release (DRAFT 0.3.1 Jul 17). In the context of the safrole test vectors, it is not really important how we generate it, as H_v is not present at all.
- In the safrole test vectors **we don't verify** the validity of the signatures contained within the header (i.e. H\_v and H\_s) because they are not part of Safrole's specific STF. These will probably be verified in some upcoming test vectors to assess the block header's validity
(edited)
# 2024-07-29 18:36 celadari: Regarding "recent history", why do we assign H\_r to beta\[0\] instead of assigning it to beta\[|beta| - 1\] ?
Since we produce beta' by appending the new elements to the end and taking the last H elements of the array (equation 81) I would have thought the last element of beta should be used for equation (80)
(edited)
# 2024-07-30 04:33 xlchen: not 100% sure what the modulo subscription operator is doing here. Does it mean the memory access is never out of bounds but wrapped?
# 2024-07-30 10:37 gav: > <@xlchen:matrix.org> not 100% sure what the modulo subscription operator is doing here. Does it mean the memory access is never out of bounds but wrapped?
Yes.
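A minimal reading of that answer, as a sketch of wrap-around indexing only (not the PVM's full memory model):

```python
# Modulo subscription: index memory modulo its length, so an access past
# the end wraps back to the start instead of trapping out-of-bounds.

def read_wrapped(memory: bytes, addr: int, count: int) -> bytes:
    n = len(memory)
    return bytes(memory[(addr + i) % n] for i in range(count))

mem = bytes(range(8))                                    # 0,1,2,...,7
assert read_wrapped(mem, 6, 4) == bytes([6, 7, 0, 1])    # wraps past the end
```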
# 2024-07-31 09:01 prematurata: am I wrong or is there no definition of what Xv is? It's used in 98 first and then 101. I have the feeling 99 might be missing something?
# 2024-07-31 09:18 gav: > <@prematurata:matrix.org> am I wrong or there is no definition of what Xv is? being used in 98 first and 101. I have the feeling 99 might miss something?
v is either true or false. Both possibilities are defined in (99)
# 2024-07-31 09:20 prematurata: thanks for the answer. I might be missing the link between v and (99) then. but thanks for confirming
(edited)
# 2024-07-31 10:04 oliver.tale-yazdi: that
T
is true and the upside down
⟂
is false.
(edited)
# 2024-07-31 10:05 philip.poloczek: > <@danicuki:matrix.org> I think there is a bug in the version without background:
>
https://graypaper.com/graypaper_no_background.pdf
Thanks for reporting this. It's fixed now. Sorry for the inconvenience.
I've added a disclaimer to the site that the white version is just for convenience and the gray version/GitHub release is the decisive one
# 2024-07-31 14:56 prematurata: let me know if that's correct i will gladly open a PR with the fix
# 2024-07-31 14:57 celadari: since we append new elements at the end and only take the H last elements:
(edited)
# 2024-07-31 15:20 gav: > <@celadari:matrix.org> sorry for insisting but shouldn't we use ?
yes that looks to be a typo...
# 2024-08-02 11:19 prematurata: what is this sentence referring to? I tried to find the above condition being referred to there but I am unsure
# 2024-08-02 11:26 subotic: > <@prematurata:matrix.org> what is this sentence referring to? I tried to find the above condition being referred to there but I am unsure
maybe it should say above conditions are not met? This is how I understood it at least.
# 2024-08-02 11:27 prematurata: oh that might also make more sense. but I don't see any conditions there; the 3 above elements are just definitions, which is part of the reason why I'm confused
(edited)
# 2024-08-02 11:32 subotic: > <@prematurata:matrix.org> oh that might also make more sense. but i dont see any conditions (there) the 3 above elements are just definitions, that's part of the reason why i'm confused
hmm, good point.
# 2024-08-02 12:32 cisco: I interpreted it as if the definitions can't be computed
# 2024-08-02 12:35 gav: > <@cisco:parity.io> I interpreted it as if the definitions can't be computed
Yes
# 2024-08-02 12:36 prematurata: > <@cisco:parity.io> I interpreted it as if the definitions can't be computed
I thought about that as well ... as far as I can think of, there is only one case where that cannot be computed, and that's if the next instruction is right after the one we're evaluating
# 2024-08-02 12:36 gav: Because of its inverse encoding (E^-1), there's a possibility it can fail (i.e. that there is no v_x which can satisfy the condition)
(edited)
# 2024-08-02 12:37 gav: The other conditions (equalities) can always be met.
(edited)
# 2024-08-02 12:39 gav: then that condition would obviously not be possible to be met. No value exists which is both greater than 1 and less than 0.
# 2024-08-02 12:39 gav: It's not (always) about being evaluatable or computable. Of course practical implementations must evaluate/compute, but formal specs only need specify relations and conditions.
(edited)
# 2024-08-02 12:40 gav: Therefore we say "if the condition cannot be met" rather than "if you can't compute the value".
# 2024-08-02 12:41 gav: There are various reasons why a value "may not be computable" - perhaps you don't have the operands, perhaps you don't know an algorithm to do it, perhaps the CPU is broken... these are practical concerns, since computing is a practical endeavour. We wouldn't want to define the protocol in terms of practical concerns.
(edited)
# 2024-08-02 12:41 gav: Meeting a condition is not a practical endeavour. It's purely theoretical.
# 2024-08-02 12:56 prematurata: i am still trying to wrap my head around that. so i can't really find a practical l_x for which E_lx is not computable (not considering the case where the argument's input size is zero, or formally ζı+1 ∈ ϖ).
I understand the reasoning about condition vs computable but, feel free to correct me: since E^-1 is the inverse of E, and there is no input for which E fails to produce output... then I'd expect (always mathematically speaking) that if the inverse function is "called" with a member of the output set of E, which is **Y**l (aka a sequence of l octets), then, unless otherwise specified, there would be no reason to believe the inverse fails for some inputs.
# 2024-08-02 12:58 prematurata: So if my reasoning is correct, then I'd propose that the disclaimer be global: if, for some reason, the defined values cannot be computed then a panic should occur
(edited)
# 2024-08-02 15:12 gav: Well, there are sequences of octets which when passed into E^-1 do not provide a value in N_(2^32), right?
(edited)
# 2024-08-02 15:18 prematurata: yes you're right. but if you mean when the input sequence is longer than 4 octets... well, that should be enforced by the min in the l_x definition.
(edited)
# 2024-08-02 15:37 prematurata: I don't see how little endian encoding which is defined in 273 has a way for its inverse to produce a value **not** in N_(2^32), when subscripted with x being 0<=x<=4
(edited)
# 2024-08-02 17:51 gav: (Until 0.3 series with the PVM changes that equation was originally the decode function without the subscript, implying that it could fail to decode. That's no longer the case so I'll remove the now-superfluous disclaimer.)
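To make the resolution concrete, here is a minimal sketch (mine, not from the GP) of a fixed-length little-endian codec: the inverse is total on 4-octet inputs, always yielding a value in N_(2^32), which is why the failure disclaimer became superfluous.

```python
def se(x: int, l: int) -> bytes:
    """E_l: fixed-length little-endian encoding of a natural x < 2**(8*l)."""
    return x.to_bytes(l, "little")

def se_inv(octets: bytes) -> int:
    """E_l^-1: total on any l-octet input; every 4-octet sequence
    decodes to a value in N_(2**32), so the decode cannot fail."""
    return int.from_bytes(octets, "little")
```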
# 2024-08-04 10:17 gav: Heads-up: 0.4 will likely include a formalisation of #57 (dependent on final review).
# 2024-08-05 03:23 jay-chrawnna: we've released the clips of the Belgium lecture for those using the YouTube playlist. Because the Gray Paper is a living project, we opt to title & number the clips based on the version of the graypaper at time of lecture.
You'll find them on graypaper.com soon™️
# 2024-08-05 04:35 xlchen: A minor inconsistency: e is used as epoch index for Safrole, but m is used as epoch index for validator activity statistics
# 2024-08-05 06:06 xlchen: is this a typo? (this part is super confusing because judgements are now renamed to verdicts but there is still a variable j used in verdicts)
# 2024-08-05 06:13 xlchen: I am also unsure what exactly the squared part means here. It reads like epoch index minus 2? But it doesn't make much sense to me here
# 2024-08-05 06:15 xlchen: I guess it means that value a must be either the current epoch index or the previous one?
# 2024-08-05 09:09 xlchen: thanks. this reminds me I should finish watching the jam lectures
# 2024-08-05 06:23 subotic: In 226, shouldn't the iota in the first case be i + 1 + skip(i)?
(edited)
# 2024-08-05 07:52 xlchen: > Specifically, the machine does not halt, the instruction counter increments by one
# 2024-08-05 07:52 xlchen: we don't want to update it twice, or make an exception case here
# 2024-08-05 08:05 subotic: so this means that the first case is covered through (220). Got it. Thanks!
# 2024-08-05 09:49 celadari: > <@xlchen:matrix.org> we don't want to update it twice, or make an exception case here
Yes but following the formalism if we go to b we execute it first. We don't do b + 1 + skip(b), we first go to b and execute it. So I think what Ivan said makes sense 🤔
# 2024-08-05 23:13 xlchen: and the usual steps continues. i.e. increment the counter and execute the next instruction
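The counter rule under discussion can be sketched like this (an illustrative model; the opcode-to-argument-length table is invented, not from the GP). A branch sets the counter directly to its target b; the instruction at b is executed, and only then does the usual i + 1 + skip(i) advance apply again.

```python
# Hypothetical mapping from opcode to the number of argument bytes it
# carries; in the GP this is derived from the instruction mask.
ARG_BYTES = {0: 0, 1: 2, 2: 4}

def skip(program: bytes, i: int) -> int:
    """Number of argument bytes of the opcode at index i (illustrative)."""
    return ARG_BYTES.get(program[i], 0)

def advance(program: bytes, i: int) -> int:
    """After executing the non-branching instruction at i, the counter
    moves past the opcode byte and its skip(i) argument bytes."""
    return i + 1 + skip(program, i)
```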
# 2024-08-05 10:49 gav: > <@xlchen:matrix.org> is this a typo? (this part is super confusing because judgements now renamed to verdicts but there is still a variable j used in verdicts)
Yes. I’ve cleared up the naming.
# 2024-08-05 10:49 gav: Disputes includes verdicts (a final aggregated decision on whether a WR was bad) and offenders (proofs of those validators who issued judgements against such a verdict).
(edited)
# 2024-08-05 10:50 gav: Verdicts are made up of individual judgements (which is a vote and a validator's signature).
(edited)
# 2024-08-05 10:51 gav: (In previous versions, we did not have offenders, and both judgements and verdicts were called judgements, confusingly)
(edited)
# 2024-08-06 09:21 tomusdrw: Hey there! The last couple of days I've been working on a tool that would help to read and annotate the Gray Paper. Looking for some feedback:
https://graypaper.fluffylabs.dev/
On the top of the priority list to implement is to display the notes within the document and migrate them between versions. I found myself having a lot of notes on a printed version, but they are hard to maintain when the document keeps changing, hence the website.
Another important feature for me is the ability to share a link to a selection - I've noticed here that many people share screenshots (which I believe is still good), but if the screenshots were augmented with such a link it's easier for others to find the context of the discussion (and the graypaper version the quote is coming from).
Curious to hear what would be important for you or what you'd like to see improved. The (pretty hacky) code is on github, so feel free to file an issue or a PR there.
# 2024-08-06 09:44 oliver.tale-yazdi: The linking feature is awesome! I tried a few things to have links to a specific formula but this is top. Does it search for the text or how does it resolve a link to a spot in the paper?
# 2024-08-06 09:56 tomusdrw: > <@oliver.tale-yazdi:parity.io> The linking feature is awesome! I tried a few things to have links to a specific formular but this is top. Does it search for the text or how does it resolve a link to a spot in the paper?
As mentioned earlier it's pretty hacky :D The PDF is converted to HTML and it has a very specific structure. The link saves the HTML nodes that are selected and then searches for these exact nodes when loaded. I'm still trying to figure out how to best "migrate" the selection from one version to another, but I'm positive I'll figure something out, given I have full information about the selection (including page, section, subsection, gp version and the text content)
# 2024-08-06 09:57 tomusdrw: The migration to newer version will most likely need some fuzzy searching if the exact text is not found, but at least we can limit that to some particular section.
# 2024-08-06 09:58 tomusdrw: We are also working on using synctex to map PDF/HTML selection to raw latex sources as another option.
# 2024-08-06 10:07 kianenigma: This is super useful, thanks for sharing! will start using it and share feedback :)
# 2024-08-06 10:28 cisco: Awesome tool! I also took a lot of notes on a printed version but all the changes make my notes harder to find.
You could maybe show the visual diffs between the different versions and associate notes to a particular version if stuff changed
# 2024-08-06 12:14 prematurata: ~~Hello, i believe there is no formalism on how to properly encode/decode workresults. The o term is either byte array or enum. (123 and 124).~~ ~~considering the Eg extrinsic is a tuple containing workreport 138 and 119, and that there is no specific formalism for term o in 123 on appendix C, i believe that note 20 at page 49 might also apply here?~~
(edited)
# 2024-08-06 13:15 prematurata: is (284) correct? it misses the segment-root from the S set
# 2024-08-07 13:01 tomusdrw: AFAICT the PVM program should run up until it reaches an invalid instruction. What's the reason for that? Couldn't programs with invalid instructions be cheaply rejected during some initial (linear) validation based on the instruction mask? We also have well-defined trap instruction to terminate the program, so no need for that to panic (like there was in EVM afair).
The only reasons I could think of is:
1. Avoid the need of pre-validation completely (i.e. don't assume there is anything like that in GP)
2. Support some future extensions?
Is there anything I'm missing here? Perhaps the compilers generate a code with invalid instructions in case of panic?
# 2024-08-07 14:22 luke_f: hello.
i think there is a mistake in recent history - equation 83
**the subscript s is supposed to be subscript x**
since _s_ is an availability specification, that does not have a _p_ sub-component (eq 122)
_x_, however, is a refinement context, that does have a _p_ component that is a hash (eq 121)
or maybe it refers to S subscript _h_? the work package hash?
(edited)
# 2024-08-07 17:18 gav: It refers to the work package hash in the specification component.
# 2024-08-08 01:58 luke_f: > <@gav:polkadot.io> It refers to the work package hash in the specification component.
thank you
# 2024-08-08 07:26 tomusdrw: gav: any comments on this one? My current read of the GP is that it's fine to have programs with invalid instructions, since they obviously might never be reached in code and we still need to panic only if that particular instruction is reached during execution. And obviously that's paid for.
However I'm thinking if it would be "less wasteful" to disallow programs with invalid instructions to be executed at all. But perhaps it's an overcomplication or not practical for some reasons yet unknown to me?
(edited)
# 2024-08-08 08:52 prematurata: and maybe this is also a typo... the posterior m can be greater than Y, but then |ya| would not be equal to E... what I think we want to make sure is that the current m is greater than Y, so that we know the lottery "time" is over
(edited)
# 2024-08-08 10:00 gav: > <@tomusdrw:matrix.org> AFAICT the PVM program should run up until it reaches an invalid instruction. What's the reason for that? Couldn't programs with invalid instructions be cheaply rejected during some initial (linear) validation based on the instruction mask? We also have well-defined trap instruction to terminate the program, so no need for that to panic (like there was in EVM afair).
>
> The only reasons I could think of is:
> 1. Avoid the need of pre-validation completely (i.e. don't assume there is anything like that in GP)
> 2. Support some future extensions?
>
> Is there anything I'm missing here? Perhaps the compilers generate a code with invalid instructions in case of panic?
Point 1.
# 2024-08-08 10:01 gav: The PVM is designed to be linear complexity in terms of execution steps, not program size.
(edited)
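gav's point - validity is checked as instructions are reached, so cost scales with executed steps rather than program size - can be sketched as a toy interpreter loop (illustrative only; the opcodes and exit labels are invented, not GP-normative):

```python
# Minimal step-by-step interpreter sketch: an illegal opcode causes a
# panic only when it is actually reached, so no pre-validation pass
# over the whole program is needed.
VALID_OPCODES = {0: "trap", 1: "add", 2: "ecalli"}  # illustrative subset

def run(program: list, max_steps: int) -> str:
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            return "panic"
        op = program[pc]
        if op not in VALID_OPCODES:
            return "panic"          # illegal instruction: panic on reach
        if VALID_OPCODES[op] == "trap":
            return "halt"           # regular termination
        pc += 1
    return "out-of-gas"
```

Note how a program carrying an invalid opcode after a trap still halts normally, since the bad instruction is never reached.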
# 2024-08-08 10:02 gav: Correct. Will be in next round of corrections thanks
(edited)
# 2024-08-08 10:04 gav: Yeah that is already corrected to the prior m on the latest GP.
(edited)
# 2024-08-08 12:23 prematurata: > <@gav:polkadot.io> Yeah that is already corrected to the prior m on the latest GP.
oops sorry. i must have missed one release
# 2024-08-08 12:25 gav: > <@tomusdrw:matrix.org> AFAICT the PVM program should run up until it reaches an invalid instruction. What's the reason for that? Couldn't programs with invalid instructions be cheaply rejected during some initial (linear) validation based on the instruction mask? We also have well-defined trap instruction to terminate the program, so no need for that to panic (like there was in EVM afair).
>
> The only reasons I could think of is:
> 1. Avoid the need of pre-validation completely (i.e. don't assume there is anything like that in GP)
> 2. Support some future extensions?
>
> Is there anything I'm missing here? Perhaps the compilers generate a code with invalid instructions in case of panic?
(There are actually a number of exit conditions, an illegal instruction is only one of them and I wouldn't expect it to happen in production.)
# 2024-08-09 14:11 danicuki: In the serialization definitions, there is the variable-size prefix 29-bit natural serialization function \se_{4*}. But I don't see it being used anywhere. I see only mentions of \se_4 (without the *). Should we use \se_{4*} by default when we see \se_4?
# 2024-08-09 15:51 gav: > <@danicuki:matrix.org> In the serialization definitions, there is the variable-size prefix 29-bit natural serialization function \se_{4*}. But I don't see it being used anywhere. I see only mentions of \se_4 (without the *). Should we use \se_{4*} by default when we see \se_4?
No. When you see \se_4, use \se_4. The variable-length encoding functions are not presently used. They might be used in the future but for now you obviously don’t need to implement them to build a conformant implementation.
# 2024-08-11 17:19 finsig: For integer encoding, eq275, there are values of x that satisfy 2^(7l) <= x < 2^(7(l+1)) but do not correctly encode with length l. For example, 2^14-1 satisfies the criteria for length 1, 2^7 <= 2^14-1 < 2^14, and encodes to bytes [255]. However, it requires length 2 bytes [255,63] = 2^14-1. Should the criteria instead be: 2^(7(l-1)) <= x < 2^(7l)?
# 2024-08-11 17:40 gav: I’m not sure what you mean. It satisfies the criteria for l=1, which implies it needs two bytes: the discriminator summed with the most-significant bits in the first byte (128 + 63 = 191), together with least significant byte in the second (255), giving [191,255]
(edited)
# 2024-08-12 01:59 finsig: > <@gav:polkadot.io> I’m not sure what you mean. It satisfies the criteria for l=1, which implies it needs two bytes: the discriminator summed with the most-significant bits in the first byte (128 + 63 = 191), together with least significant byte in the second (255), giving [191,255]
sorry, my mistake. I understand now.
# 2024-08-12 10:04 oliver.tale-yazdi: The text in the guarantees extrinsic and the Block Serialization mention a core-index, but it looks like it was moved into the work-report. So just the text and encoding logic needs fixing?
# 2024-08-14 02:31 sourabhniyogi: davxy: Do you have a snippet of code that represents the above computation? Would like to check our assumptions. This "ring root" is used in a new epoch as per:
(edited)
# 2024-08-14 22:57 sourabhniyogi: gav: Doesn't host function write need a "s" register input like read has in \omega_0?
(edited)
# 2024-08-14 23:12 sourabhniyogi: For import, GP says "This process may in fact be lazy as the Refine function makes no usage of the data until the import hostcall is made." -- does this imply that PVM interpreters should "page fault" when hitting an "import" instruction? Is this the only host function having a lazy evaluation / page fault or would there be others (say, export [though I imagine it would be the last thing being done]), in interacting with the DA systems?
(edited)
# 2024-08-14 23:17 sourabhniyogi: For Epoch Markers, at the start of Section 5.1 you have "If not ∅, then the epoch marker specifies key and entropy relevant to the following epoch in case the ticket contest does not complete adequately (a very much unexpected eventuality)" but then in equation (72) you have something that appears contradictory:
# 2024-08-14 23:19 sourabhniyogi: Does the epoch marker appear on rare occasions (only when the ticket contest does not complete adequately) or on every new epoch (what (72) implies)
# 2024-08-15 06:41 gav: It is merely noting that not all import data may need to be fetched ahead of the start of execution.
(edited)
# 2024-08-15 06:45 gav: It’s obvious, really, but I wanted to point it out as for M4 it will be necessary to parallelise execution with import segment fetching.
# 2024-08-15 06:52 davxy: > relevant to the following epoch in case the ticket contest does not complete adequately (a very much unexpected eventuality)"
It is rare that the ticket contest doesn't complete successfully. If the contest is successful, then the previously announced epoch marker becomes irrelevant for the sake of slot assignments to validators.
# 2024-08-15 11:35 oliver.tale-yazdi: To me it looks like the types of the sets in definition 101 are incompatible: Lambda U Kappa being a set of validator key tuples and Psi-o a set of Ed25519 keys:
# 2024-08-15 12:46 gav: > <@oliver.tale-yazdi:parity.io> To me it looks like the types of the sets in definition 101 are incompatible: Lambda U Kappa being a set of validator key tuples and Psi-o a set of Ed25519 keys:
yeah - it should be the Ed25519 portions of lambda and kappa.
# 2024-08-16 08:03 cisco: Looking at the PVM's standard program initialization. What is that z? I've identified all others and even read compiled pvm files successfully without mentioning it
# 2024-08-17 00:53 sourabhniyogi: For the historical_lookup host function, the $H$ in $H({\bf a},t,h)$ should be a $\Lambda$ to match 92+94 of Section 9.
Any reason why you don't just assign some opcodes to all the host functions? Can just add 100 or 128 to the existing ones, 13 becomes 113 or 141?
(edited)
# 2024-08-17 08:26 gav: > <@sourabhniyogi:matrix.org> For the historical_lookup host function, the $H$ in $H({\bf a},t,h)$ should be a $\Lambda$ to match 92+94 of Section 9.
>
> Any reason why you don't just assign some opcodes to all the host functions? Can just add 100 or 128 to the existing ones, 13 becomes 113 or 141?
Then each of the contexts would need to be integrated and defined as part of the basic PVM, which is obviously a terrible idea.
# 2024-08-17 08:27 gav: > <@sourabhniyogi:matrix.org> For the historical_lookup host function, the $H$ in $H({\bf a},t,h)$ should be a $\Lambda$ to match 92+94 of Section 9.
>
> Any reason why you don't just assign some opcodes to all the host functions? Can just add 100 or 128 to the existing ones, 13 becomes 113 or 141?
On the first point, yes indeed - refactoring error and will be corrected in 0.3.5.
# 2024-08-17 14:15 gav: I’m not really sure what you’re proposing, but if you mean to use opcodes which are not yet in use in place of host functions I would caution against it. Opcodes which are not in use should panic as an illegal instruction. If they do anything else then it will fail test vectors.
# 2024-08-18 02:12 sourabhniyogi: https://github.com/koute/polkavm/pull/156 has the idea. Understand the caution, but not sure how else implementers can tackle implementing 23 host functions (8 or so critical to get the basics of DA + a battery of refine-accumulate code for "tiny" JAM services: import, export, solicit, forget, historical_lookup, read, write, lookup) without something along these lines.
(edited)
# 2024-08-18 07:13 mkchung: In eq(93), the preimage appears to be of arbitrary size - which is then erasure encoded & distributed in segments of size wc*ws to DA. But how is the E_p (lookup extrinsic) in section 12.1 (eq 155-158) able to know the length of the preimage? Without knowing the length, we probably cannot remove the padded zeros from the wc * ws segments. What am I missing?
# 2024-08-18 10:07 gav: > <@sourabhniyogi:matrix.org>
https://github.com/koute/polkavm/pull/156
> has the idea. Understand the caution, but not sure how else implementers can tackle implementing 23 host functions (8 or so critical to get the basics of DA+ a battery of refine-accumulate code for "tiny" JAM services: import, export, solicit, forget, historical\_lookup, read, write, lookup) without something along these lines.
you'll need to implement the host functions regardless. you'll need to implement PVM too regardless.
# 2024-08-18 10:07 gav: i don't understand why you're trying to use polkavm given that even for M1 you'll need to write your own.
(edited)
# 2024-08-18 10:59 sourabhniyogi: We are implementing our own PVM interpreter of course! We have already done so for the Appendix A instruction set and covered the tests of https://github.com/w3f/jamtestvectors/pull/3. We just need to have some similar byte code for Appendix B. To do that, we have to assemble some byte code that has host functions, and so we only use koute's assembler to map some assembly code (the PR has 23 of them) into PVM byte code to test that assembler.
We are not FFIing into koute/polkavm. We are only using it as an assembler to generate byte code to implement 23 host functions of Appendix B, following koute's assemble/disassemble tool.
https://github.com/w3f/jamtestvectors/pull/3#issuecomment-2257688558
# 2024-08-18 11:01 gav: but why not just use ecalli along with the relevant host call index?
(edited)
# 2024-08-18 11:07 oliver.tale-yazdi: you can call this with the universal interpreter of that project with something like
pvme call test.pvm entry 42 69 --host-functions "get_third_number:100"
and then compare the result to your interpreter
# 2024-08-18 11:11 sourabhniyogi: > <@gav:polkadot.io> but why not just use ecalli along with the relevant host call index?
AHA! Is there some line in the GP that explains ecalli -- This is the missing bit of info.
# 2024-08-18 11:16 sourabhniyogi: Well, I missed this "link" between Appendix A and Appendix B -- where is it documented?
# 2024-08-18 11:27 sourabhniyogi: Ok -- I got it, thank you! (Closed my PR for now, it no longer makes sense) Any reason why assign and new don't get a host index in GP?
# 2024-08-18 10:09 gav: and in using it, rather than altering the code to introduce new non-standard instructions in order to empower ArchVisitor, you should obviously just invoke ArchVisitor using the ecall.
(edited)
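The resolution of this thread - host calls go through the single ecalli instruction with an immediate index, not through new opcodes - can be sketched as follows (indices and handler names are illustrative, not GP-normative):

```python
# Hedged sketch: the interpreter decodes `ecalli index` and hands the
# index to an embedder-supplied, context-specific dispatch table, so
# the basic PVM stays ignorant of any particular host-call set.
def make_dispatcher(host_calls: dict):
    def on_ecalli(index: int, state: dict):
        handler = host_calls.get(index)
        if handler is None:
            return "panic"  # unknown host-call index
        return handler(state)
    return on_ecalli

# e.g. a refine-context table (entries are stand-ins, not real indices)
refine_table = {
    0: lambda s: s.setdefault("imports", []).append("segment") or "ok",
    1: lambda s: "ok",  # stand-in for `export`
}
dispatch = make_dispatcher(refine_table)
```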
# 2024-08-18 10:18 gav: > <@mkchung:matrix.org> In eq(93), the preimage appears to be of arbitrary size - which is then erasure encoded & distributed in segments of size wc*ws to DA. But how does the E_p (lookup extrinsic) in section 12.1 (eq 155-158) able to know the length of the preimage? Without knowing the length, we probably cannot remove the padded zeros from the wc * ws segments. What am I missing?
93 describes account state storage.
# 2024-08-18 10:18 gav: this is not erasure-coded. it is stored directly in state as a byte-sequence (with an implied length).
(edited)
# 2024-08-18 10:21 gav: (157) shows exactly how the length can be verified: it is stored, along with the hash, as a key in δ[s]_l
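A toy illustration of gav's remark (my own data structure, assuming Blake2b-256 as the hash): because the lookup dictionary is keyed by (hash, length), the length of a solicited preimage is known before the data itself arrives, and a provided blob can be checked against both.

```python
import hashlib

def solicit(lookup_meta: dict, blob_hash: bytes, length: int) -> None:
    """Record a request: an empty availability list marks 'requested'."""
    lookup_meta.setdefault((blob_hash, length), [])

def provide(lookup_meta: dict, preimages: dict, blob: bytes, timeslot: int) -> bool:
    """Accept a preimage only if (hash, length) was solicited and is
    still unprovided; store it and mark the timeslot it became available."""
    h = hashlib.blake2b(blob, digest_size=32).digest()
    key = (h, len(blob))
    if key in lookup_meta and lookup_meta[key] == []:
        preimages[h] = blob
        lookup_meta[key] = [timeslot]
        return True
    return False
```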
# 2024-08-18 11:56 gav: 0.3.5 is released. Mostly trivial corrections, but there's also some small alterations to the work bundle serialisation format.
# 2024-08-19 09:14 danicuki: I have a doubt about state encoding, more specifically:
C(12) ↦ E4(χ)
C(13) ↦ E4(π)
How can we E4 here if χ and π are not actual integers? Would it be just E(χ) and E(π)?
# 2024-08-19 14:59 oliver.tale-yazdi: I also have a Q about what r is in definition 111. It must be a work-report hash since it comes from V, but rho does not contain work-report hashes but full work reports?
Do we have to compute the hashes and then remove them?
# 2024-08-19 16:31 gav: > <@danicuki:matrix.org> I have a doubt about state encoding, more specifically:
>
> C(12) ↦ E4(χ)
> C(13) ↦ E4(π)
>
> How can we E4 here if χ and π are not actual integers? Would it be just E(χ) and E(π)?
chi is just a tuple of numbers, so each element should be encoded accordingly.
# 2024-08-19 16:33 gav: For tuples and sequences, E_4 is the same as E: totally transparent.
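A sketch of that transparency (my own illustration): applied to a tuple of naturals, E_4 simply encodes each element in turn and concatenates.

```python
def se4(x: int) -> bytes:
    """E_4: fixed 4-byte little-endian natural encoding."""
    return x.to_bytes(4, "little")

def se4_tuple(xs) -> bytes:
    """On tuples/sequences E_4 is transparent: elementwise encoding,
    concatenated in order."""
    return b"".join(se4(x) for x in xs)
```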
# 2024-08-19 16:34 gav: > <@oliver.tale-yazdi:parity.io> I also have a Q about what r is in definition 111. It must be a work-report hash since it comes from V, but rho does not contain work-report hashes but full work reports?
> Do we have to compute the hashes and then remove them?
Yes - indeed this should have been placed in a hash function H(…)
# 2024-08-19 21:23 mkchung: > <@gav:polkadot.io> (157) shows exactly how the length can be verified: it is stored, along with the hash, as a key in δ[s]_l
Is E_P the only way to get a preimage into the on-chain state δ? In other words, is it solely triggered by the solicit host func (Ωs) at accumulate? Presumably the block author would need to retrieve these preimages off chain from DA in order to include them in the block as E_P?
# 2024-08-19 22:37 sourabhniyogi: How does code get deployed?
As per 14.3, a Work Package has codehash $c$ (but not a length) -- _must_ code deployment go through (i) export, (ii) solicit in accumulate, resulting in (iii) E_P-driven state trie updates, as stated by Eq (91), or is there some other route?
More precisely, how does a service's FIRST code hash get deployed? It's not happening in host function new... so is there some privileged service that service creators have to use to get started that does (i)-(iii)?
(edited)
# 2024-08-19 23:05 sourabhniyogi: In 14.3 right before Eq (177) you have "a sequence of hashed of blob hashes and lengths" which I think should be ungarbled as something like "a sequence of extrinsic hashes and lengths" -- but shortly thereafter you have "We make an assumption that the preimage to each extrinsic hash in each work-item is known by the guarantor. In general this data will be passed to the guarantor alongside the work-package." _These_ preimages have nothing to do with the E_P-driven state trie updates from solicit, correct?
By "In general this data will be passed to the guarantor", do you mean the Extrinsic preimage _via JAMNP_, like what you have as --> Vec<Extrinsic> ++ Vec<JustifiedImport>, or something else?
What does JustifiedImport refer to in GP?
(edited)
# 2024-08-20 00:45 sourabhniyogi: In B.3 "The Refine invocation also ... explicitly accepts the work payload, ${\bf y}$, ..., the import and extrinsic data blobs (both just concatenated segments) as dictated by the work-item, ${\bf i}$ and ${\bf x}$" -- which refers to 14.3's (177) (carried through to (184) and then (253) but then _nowhere_ in the refine-specific host functions of (254)), correct?
Assuming the above understanding is correct, I am expecting some refine-specific host functions analogous to import (which gets at ${\bf i}$) that get at the payload ${\bf y}$ (which might be called import_y) and the extrinsic "raw" preimage data ${\bf x}$ (which might be called import_x). But I don't see anything -- so which host functions read ${\bf x}$ and ${\bf y}$? Is there some setup of ${\bf x}$ and ${\bf y}$ in memory that we missed?
(edited)
# 2024-08-20 05:29 gav: As you postulate. The first service (which will appear in the Genesis block) will include functionality for creating new services permissionlessly.
(edited)
# 2024-08-20 05:32 gav: > <@sourabhniyogi:matrix.org> In 14.3 right before Eq (177) you have "a sequence of hashed of blob hashes and lengths" which I think should be ungarbled as something like "a sequence of extrinsic hashes and lengths" -- but shortly thereafter you have "We make an assumption that the preimage to each extrinsic hash in each work-item is known by the guarantor. In general this data will be passed to the guarantor alongside the work-package." _These_ preimages have nothing to do with the E_P-driven state trie updates from solicit, correct?
>
> By "In general this data will be passed to the guarantor", do you mean the Extrinsic preimage _via JAMNP_, like what you have as --> Vec<Extrinsic> ++ Vec<JustifiedImport>, or something else?
>
> What does JustifiedImport refer to in GP?
On the first two points, yes.
# 2024-08-20 05:35 gav: > <@sourabhniyogi:matrix.org> In B.3 "The Refine invocation also ... explicitly accepts the work payload, ${\bf y}$, ..., the import and extrinsic data blobs (both just concatenated segments) as dictated by the work-item, ${\bf i}$ and ${\bf x}$" -- which refers to 14.3's (177) (carried through to (184) and then (253) but then _nowhere_ in the refine-specific host functions of (254)), correct?
>
> Assuming the above understanding is correct, I am expecting some refine-specific host functions analogous to import (which gets at ${\bf i}$) that get at the payload ${\bf y}$ (which might be called import_y) and the extrinsic "raw" preimage data ${\bf x}$ (which might be called import_x). But I don't see anything -- so which host functions read ${\bf x}$ and ${\bf y}$? Is there some setup of ${\bf x}$ and ${\bf y}$ in memory that we missed?
It should not read "both just concatenated segments"; only imports are segments. Extrinsic data are (arbitrary length) blobs.
# 2024-08-20 05:36 gav: Extrinsic data are passed into the refine function directly. Imported segments are read through the import host call.
# 2024-08-20 05:51 sourabhniyogi: Thank you for explaining -- still unclear though -- which refine instructions / host functions can read/copy extrinsic data ${\bf x}$ (I understand, passed into refine function directly) into RAM? What about the payload ${\bf y}$?
(edited)
# 2024-08-20 06:29 gav: > <@sourabhniyogi:matrix.org> Thank you for explaining -- still unclear though -- which refine instructions / host functions can read/copy extrinsic data ${\bf x}$ (I understand, passed into refine function directly) into RAM? What about the payload ${\bf y}$?
> Extrinsic data are passed into the refine function directly.
See (251)
# 2024-08-20 06:29 gav: Arguments to Refine are encoded into a; this includes all extrinsic data.
# 2024-08-20 06:30 gav: (NOTE: "Extrinsic" is used for two separate and very different concepts in the GP. That's not ideal. I'll likely change this use of "extrinsic" to "argument" in a later draft.)
(edited)
# 2024-08-20 10:32 sourabhniyogi: Understood -- I am postulating that PVM programs must have host functions in B.8 that copy
(a) ${\bf x}$ extrinsic data encoded into the $a$ argument of Refine, call it import_x, to copy $a_{\bf x}$ into RAM, and
(b) ${\bf y}$ payload data encoded into the $a$ argument of Refine, call it import_y, to copy $a_{\bf y}$ into RAM.
I don't see any such way within { historical_lookup, import, export, machine, peek, poke, invoke, expunge } to access elements of $a$.
It seems I am missing a concept or there needs to be 2 more host functions, import_x and import_y.
(edited)
# 2024-08-20 10:50 sourabhniyogi: The postulated import_x and import_y would be like import here:
# 2024-08-20 10:53 sourabhniyogi: except instead of copying ${\bf i}_{\omega_0}$ it would copy $a_{{\bf x}_{\omega_0}}$ or $a_{\bf y}$ into RAM (and no need for the min(\omega_2, W_C W_S), no index $\omega_0$ for payload ${\bf y}$). Perhaps names like extrinsic (?) and payload would be more appropriate than import_x and import_y -- What do you think?
(edited)
# 2024-08-21 16:48 dave: Time as defined in the GP seems to be TAI-based. That is, it is defined simply as "seconds passed since the beginning of the JAM Common Era". Practically speaking most systems are UTC-based. The difference is that UTC is adjusted by leap seconds, whereas TAI is not. Sticking with the current definition seems good, as it's conceptually simple. It's probably worth explicitly talking about this in the GP though, as just doing the natural thing (current UNIX timestamp minus epoch timestamp) will on most systems not be correct. Well, it might be, if there are no more leap seconds. AIUI they will be abolished by 2035 but there may be some before then. I'm not sure if there's a "standard" way of dealing with this. I believe it is possible in Linux to make the system clock follow TAI, however I don't know if this is a good idea or not.
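A sketch of the pitfall dave describes (all constant values here are illustrative assumptions, not from the GP): a naive `unix_now - jam_epoch_unix` is UTC-based and drifts from a TAI-like clock by any leap seconds inserted after the epoch.

```python
# Illustrative constants: a hypothetical JAM Common Era epoch as a
# Unix timestamp, and the TAI-UTC offset in force at that epoch.
JAM_EPOCH_UNIX = 1_735_732_800
TAI_MINUS_UTC_AT_EPOCH = 37

def jam_time(unix_now: float, tai_minus_utc_now: int) -> float:
    """TAI-like seconds since the epoch: the naive UTC difference plus
    the leap seconds inserted since the epoch."""
    return (unix_now - JAM_EPOCH_UNIX) + (tai_minus_utc_now - TAI_MINUS_UTC_AT_EPOCH)
```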
# 2024-08-22 11:17 lucasvo: I've been working through the paper and one thing I'm unsure about is how the ordering of work packages is defined, and in practice how is it determined?
# 2024-08-22 12:49 gav: It’s not defined except through the prerequisite field. This currently doesn’t give a strong ordering for accumulation (as availability may complete out of order) but from 0.4 onwards it will.
(edited)
# 2024-08-22 20:35 sourabhniyogi: TAI makes sense in a final ratification. Since most will create some abstraction to get the "JCE", it seems hardly problematic to address the leap seconds within that abstraction.
Are you taking the lead on the next JAMNP writeup?
We are wondering if there really is no gossiping of Tickets, Blocks, etc. within JAMNP, as it seems pretty wild for each validator to broadcast to all V-1 fellow validators. Not only that, if each ticket is sent to V-1 validators, all the cool anonymity will be cancelled by QUIC. Is there some detail of how QUIC will be used for gossiping objects quickly in JAMNP? Or some expectation that JAMNP will support gossiping soon enough?
(edited)
# 2024-08-22 20:39 dave: Tickets are to be sent via a proxy; the proxy will know the sender's identity obviously but is trusted to not reveal it. If you assume 2/3 honest validators then most of the tickets will be anonymous
# 2024-08-22 20:40 dave: The plan is for new block distribution to be done via the availability system
# 2024-08-23 01:59 xlchen: to my understanding, those are unordered sets; in that case, what order should be used for serialization? or are they just FIFO arrays without duplications?
(edited)
# 2024-08-23 08:26 gav: Actually not via the “availability system” but through a separate distribution (and redistribution) of erasure coded block pieces. It’ll be its own distribution system but will reuse the erasure coding logic.
(edited)
# 2024-08-23 09:01 gav: They're values which can be derived from the maps (so not strictly needed in the DB) but determining the total size/number of items directly from the Merkle tree is non-trivial. So they're in there to facilitate implementations' tracking of the changes rather than rebuilding each block (or having a separate database to track them) both of which add a lot of implementation bloat and/or complexity.
(edited)
# 2024-08-23 09:01 sourabhniyogi: > <@gav:polkadot.io> Actually not via the “availability system” but through a separate distribution (and redistribution) of erasure coded block pieces. It’ll be its own distribution system but will reuse the erasure coding logic.
Makes sense for "big" Block, but wouldn't tiny JAM objects (Guarantee, Assurance, Dispute, PreimageLookup) benefit from libp2p style gossiping in speed over erasure coding?
# 2024-08-23 17:10 dave: We require that the code hashes in the guarantees extrinsic match the current code hashes for the relevant services. This may not be the case by the time the WRs are accumulated though? Is the intention that accumulation code should protect itself against this if necessary, eg by including a version in the refine output?
# 2024-08-23 21:02 danicuki: Why is (h4:) negated here? Is this correct? Why not simply h4:?
# 2024-08-23 21:46 xlchen: I don’t exactly know but I think it is to protect from hash collision
# 2024-08-24 16:56 gav: > <@xlchen:matrix.org> I don’t exactly know but I think it is to protect from hash collision
Exactly.
# 2024-08-23 22:56 sourabhniyogi: gav: Some basic questions on Assurances.
1. From (185)+(186) _Availability Specifier_ $s$ (with h, l, u, e) ALONE (included in a Work Report), will an assurer/validator be able to reconstruct the _auditable_ work package (by querying fellow validators) after getting their chunk and a proof of inclusion using $s$ ALONE? If not, what else does a [non-guarantor] assurer need other than $s$ from a Guarantee to reconstruct the _auditable_ work package (by querying fellow validators)?
2. If the answer to (1) is "YES, $s$ alone is sufficient for work package reconstruction (by querying fellow validators)", we need a _test case_ of a work package ${\bf p}$ and some pretend exported items with this "Availability Specifier" $s$ computed (which is equivalent to an Auditable Work Package and sufficient for assuring, if we understand correctly). When can we get a ${\bf p}, {\bf e}, s$ test case? Having this will be essential for JAM implementers to have proper $E_G$ verifiable work report generation and to get through a full $E_A$ Assurance generation.
3. Given the proof referenced in Section 16 "... Firstly, their erasure-coded chunk for this report. The validity of this chunk can be trivially proven through the work-report’s work-package erasure-root and a Merkle proof of inclusion in the correct location. The proof should be included from the guarantor." ... which QUIC message covers this proof _submission_ by guarantors to all other validators in JAMNP?
4. In section 16 "Availability Assurance", you refer to "provided" and "required" manifests, seemingly for the first time, which appear to be "inside" the availability specifier $s$. Are these ${\bf b}^\clubsuit$ (package, imported items, extrinsics related) and ${\bf s}^\clubsuit$ (exported items) -- which one is "provided" and which one is "required"? (The 14.3 paragraph "Guarantors are required to erasure-code and distribute two data sets: ..." is similar enough to cause us to believe section 16 could be folded into 14.3?)
(edited)
# 2024-08-23 23:09 sourabhniyogi: 5. What is ${\bf s}$ vs $n$ within ${\bf s}[n]$ in 185? The meaning of ${\bf s}$ flips between imported and exported items in this section so we'd like to confirm.
(edited)
# 2024-08-24 16:54 gav: > <@dave:parity.io> We require that the code hashes in the guarantees extrinsic match the current code hashes for the relevant services. This may not be the case by the time the WRs are accumulated though? Is the intention that accumulation code should protect itself against this if necessary, eg by including a version in the refine output?
For the few blocks where a service is undergoing an upgrade, you’d probably want to avoid sending reports to it since it’s fundamentally impossible to predict which code is current come Accumulation.
# 2024-08-24 16:59 gav: 1. They'll be able to ensure that the chunks they get from each validator are correct and that the bundle/work-package itself is correct. The bundle is all that is needed for a validator to audit.
2. As I’ve said before, test cases will be provided once they are ready. I won’t repeat myself further.
(edited)
# 2024-08-24 17:02 gav: 3. I don’t know what you mean by “proof of submission”. In any case JAMSNP doesn't presently include any message/stream type for providing DA chunks. It will be included in due course.
(edited)
# 2024-08-24 17:38 gav: > <@sourabhniyogi:matrix.org> 5. What is ${\bf s}$ vs $n$ within ${\bf s}[n]$ in 185? The meaning of ${\bf s}$ flips between imported and exported items in this section so we'd like to confirm.
I presume you mean:
# 2024-08-24 17:40 gav: Here, M(s) is the segment root of the exporting work-package and n is the index of a segment exported by it.
(edited)
# 2024-08-24 17:40 gav: s is therefore the sequence of exported segments from said exporting work-package.
(edited)
# 2024-08-24 18:25 sourabhniyogi: By proof submission, I mean "The proof should be included from the guarantor." I believe you stubbed JustifiedImport in as --> Vec<Extrinsic> ++ Vec<JustifiedImport> in https://hackmd.io/@polkadot/jamsnp , so we'll run with that for now.
(edited)
# 2024-08-24 18:30 gav: > <@sourabhniyogi:matrix.org> By proof submission, I mean "The proof should be included from the guarantor." I believe you stubbed JustifiedImport in as --> Vec<Extrinsic> ++ Vec<JustifiedImport> in https://hackmd.io/@polkadot/jamsnp , so we'll run with that for now.
That's only for sharing of Work Packages between guarantors on the same core.
# 2024-08-24 18:30 gav: There is not yet an instruction for sharing (justified) DA chunks.
(edited)
# 2024-08-24 18:33 gav: It'll likely just be Vec<Hash> ++ Blob ++ Vec<Hash> ++ Vec<SegmentChunk> ++ Vec<Hash>.
(edited)
# 2024-08-24 18:36 gav: The Vec<Hash>s will just be complementary Merkle-node-hashes from leaf to root. The first will contain hashes for the blob-subtree, the second for the segments subtree and the third for the super-tree.
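A minimal sketch (Python, illustrative only: the real node-combining hash and sibling ordering are fixed by the GP, not by this code) of how a verifier typically folds such a complementary-hash path from leaf to root:

```python
import hashlib

def node_hash(left: bytes, right: bytes) -> bytes:
    # Placeholder combining rule; the GP specifies the actual hash and framing.
    return hashlib.blake2b(left + right, digest_size=32).digest()

def verify_path(leaf: bytes, index: int, siblings: list[bytes], root: bytes) -> bool:
    # Fold the leaf with each complementary hash; at every level the low bit
    # of the index says whether the sibling sits on the left or the right.
    acc = leaf
    for sib in siblings:
        acc = node_hash(sib, acc) if index & 1 else node_hash(acc, sib)
        index >>= 1
    return acc == root
```

Under that shape, the three Vec<Hash>es would just be three such sibling lists applied in sequence: blob-subtree, then segments subtree, then super-tree.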
# 2024-08-26 18:47 noahjoeris: Heya. Isn't the posterior authorizer pool also dependent on the header?
# 2024-08-26 23:58 xlchen: The return type for the merkle justification generation function is wrong
# 2024-08-27 01:26 xlchen: this is also using log2(|v|) which will be undefined when v is empty
# 2024-08-27 01:28 xlchen: it requires the length of v to be no more than 2^x so log2(|v|) - x is always <= 0?
# 2024-08-27 14:00 clw0908: Hello everyone,
When storing storage or preimage, we will specify the service identifier and the storage or preimage key at the same time, but we do not additionally store the mapping between the service identifier and the storage or preimage key.
# 2024-08-28 02:37 xlchen: my understanding is that this is used to construct the state root but it doesn't mean you also have to store the state into the db in exactly this format
# 2024-08-28 02:38 xlchen: so you can/should store the keys in whatever format you want
# 2024-08-27 14:00 clw0908: So assuming we have a service account identifier (s), how do we retrieve all of (s)'s storage(bold s) keys or preimage(bold p) keys?
# 2024-08-28 01:40 mkchung: In host function new=9, what is the x_i used for creating the service account?
Additionally, for l∶{(c, l)↦[]} portion, are we writing this a_l on-chain, similar to the (h,z)=[] in the solicit=13 case? This should also trigger E_P, correct?
# 2024-08-28 06:28 sourabhniyogi: Basic questions on Is-Authorized.
1. Why is the authorizer not a 4th entry point alongside refine/accumulate/on\_transfer?
2. In (182) you use blob-hood of ${\bf o}$ (being an element of Y) to determine if work is authorized, but how does the PVM code actually return and "set" ${\bf o}$ -- not clear from (248) how Is-Authorized can return a blob, e.g. a public key or recovered address. Is it through \omega_10 + \omega_11 pointing to where the blob is returned in memory? If so, what is the "not authorized" return mechanism?
3. In the 8.2 GP lecture you mention validating signatures, e.g. validating ${\bf j}$ right at the start of the work package. Shall we add ECRECOVER/... signature verification host functions for this purpose?
4. Why did you not support access to historical_lookup in Is-Authorized, to get at service-specific state?
5. It seems clear we will want "opinionated" inclusion of cryptographic primitives in ecalli host functions (BLS, Keccak/Blake2b/xx hash functions, Bandersnatch, Edwards, etc) due to the usual "interpreted code is Nx slower than compiled, we must have this primitive!" concerns. How is this problem NOT going to immediately reveal itself on the first big JAM service (parachain validation) which uses all these must have primitives?
(edited)
# 2024-08-29 13:35 dave: Re (1) AIUI authorisation is for usage of a core, this usage is not tied to a particular service
# 2024-09-09 08:04 gav: (2) Should be obvious from the PVM definition. If you panic then the result will not be in Y.
# 2024-09-09 08:04 gav: (3) PVM should really be fast enough. This isn't Ethereum.
(edited)
# 2024-09-09 08:04 gav: (4). IsAuthorized is intended to be as lightweight as possible to minimize the possibility of validator-griefing. This could potentially change in the future once we start actually writing code on JAM prototypes and understand the kinds of use-cases better.
# 2024-08-28 07:40 prematurata: it looks to me that appendix F (304) is wrong.
4
should be added and i' mont sure about the underlined red text. look like it does not make muc sense
# 2024-08-28 08:54 luke_f: Hello
I'm having a hard time understanding Equation 141
Specifically R is a constant as far as I can tell, but it is used as a function here?
(edited)
# 2024-08-28 09:01 celadari: I also had the same question but I assumed in the end that it meant "multiply" R by (floor(tau/R) - 1)
# 2024-08-28 08:56 cisco: How is it that the PVM argument invocation can return a blob? I guess the memory contents on a success are the blob.
Here the return type is ( (N_G, Y) U {panic, oom}, X ) but the inner function R seems to return ({panic, oom} U N_G, X U Y)
(edited)
# 2024-08-28 14:51 sourabhniyogi: Ok, we'll run with your interpretation for now 😃, returning a "panic" being "authorization failed" if a new host function that we added to verify a signature against a hardcoded public key (or read via historical_lookup) doesn't pass verification.
(edited)
# 2024-08-30 01:21 sourabhniyogi: We realized (again) that passing in \omega_10 + \omega_11 is to be the way to send in "a=arguments" (refine: authorization hash, accumulate: wrangled results, transfer: transfer memos) and ALSO return a blob.
# 2024-08-28 23:52 sourabhniyogi: General JAM Service Invocation/Interaction-related questions:
1. How are the refine, accumulate and on_transfer entry points (described after (90)) specified in the jump table ${\bf j}$ of Eq (213)? We are currently just using pub @refine:, pub @accumulate: and pub @on_transfer in koute's assembler, but this appears underspecified in GP except for some 0/1/2 entry points
2. Given that refine can ONLY access on-chain state from historical_lookup host function calls (through accumulate's solicit host function calls that result in $a_p$ preimage writes through $E_P$ extrinsics) ... I am struggling to see how accumulate can drive refine inputs -- _except_ via some observer who "sees" the finalized post-accumulate state and submits a new work package based on that state in the form of extrinsics and payloads. Why can't accumulate also have access to the export host function such that refine can import segments exported by accumulate?
3. Why can't refine have access to the read host function to get at state written to by accumulate's write host function calls? I imagine it would be historical_read, analogous to historical_lookup.
4. How can there be a refine-less Work Package / Service, _not_ needing Assurances to trigger an accumulate, that is purely accumulate based?
5. What is the _idiomatic_ way (via a specific instruction) to "panic" for authorization code to return "Not Authorized"? We are doing ecalli 42 as a hack right now on failed authorizations.
6. For multiple services going through accumulate, what is the ordering of ${\bf S}$ in 157? Assuming serial execution (which is sort of implied by new [building up ${\bf x}_n$] and transfer [building up ${\bf x}_{\bf s}$]), is the context ${\bf x}$ (of Eq 253) carried over from one service's accumulate to the next?
7. From upgrade it appears to us that the invocation context ${\bf x}$ in ${\bf X}$ needs its ${\bf s}$ to be a dictionary like ${\bf x}_{\bf n}$, but right now it's just the ${\bf s}$ of ${\cal A}$?
8. The host function service account lookup ${\bf d}$ needs an explanation before B.5 On-transfer. It is used in lookup, read, info but we're not sure whether it's in ${\bf x}$ and how it's initialized.
9. Why did you keep machine/invoke/peek/poke/expunge solely for refine and not also accumulate?
10. $\bar{x}$, the full extrinsic data (not hash-len combo) given to the guarantors alongside the work package, needs to be put into Audit DA, since re-execution of refine by auditors requires it. This step is not described, but is of course necessary, correct?
(edited)
# 2024-08-29 13:48 dave: Re (4), AIUI the blessed services are always accumulated every block even if there are no work reports for them
# 2024-08-29 15:38 sourabhniyogi: Yipes, I read (28) + Sect 11 "After enough assurances the work-report is considered available, and the work outputs transform the state of their associated ser- vice by virtue of accumulation, covered in section 12. The report may also be timed-out, implying it may be replaced by another report without accumulation." + (162) input of refine results M(s) as all implying that accumulation REQUIRES refine.
# 2024-08-29 15:50 dave: For most services that is true -- accumulate will not be called unless some work reports have become available, the 3 blessed services are an exception to this. See (159)
# 2024-09-09 08:28 gav: 1. There is no explicit jump table - only an entry point. These are now set to 0 (isAuthorized), 5 (Refine), 10 (Accumulate) and 15 (OnTransfer).
(edited)
# 2024-09-09 08:30 gav: 2. Regarding JAM, Refine -> Accumulate is a one-way street and as such sensibly designed services must be able to perform Refine with some degree of asynchrony to Accumulate. This is why I write that JAM is "mostly coherent". Getting any state-changes resulting from Accumulation as inputs to a later Refine would imply making a state proof, which introduces synchrony and as such a degree of latency.
Re: "Why can't...": Lazy and Dumb Question. If you think you can add such a feature without blowing up the protocol's complexity, submit your PR to the GP repo.
(edited)
# 2024-09-09 08:31 gav: 3. Lazy and Dumb Question. If you think you can add such a feature without blowing up the protocol's complexity, submit your PR to the GP repo.
(edited)
# 2024-09-09 08:35 gav: sourabhniyogi: This is not a forum to request your (IMO totally unachievable) product dreams. If you have a serious suggestion to improve the JAM protocol, make a PR to the GP repo. Be ready to defend its implications against some of the best minds of the industry. Beware that designing and writing a high-performance secure decentralized protocol is nontrivial and you can't just wish features to exist.
# 2024-09-09 08:37 gav: 4. With the particular exception of the privileged services, a service cannot be accumulated without at least one refine result. So in short, there generally cannot.
I've covered this many, many times in basically every talk I've given. Please familiarise yourself with this content. I'm not here to be your personal oracle, spoon-feeding you with material at your own pace.
(edited)
# 2024-09-09 08:39 gav: 5. Any kind of panic works, but if you want a convention, I'd use trap.
(edited)
# 2024-09-09 09:03 gav: 8. There is indeed a missing argument for Refine's lookup case. This will be fixed in the next release. Refine's read and info cases properly supply this parameter.
(edited)
# 2024-09-09 09:05 gav: 9. They are quite heavy-weight facilities and it is far from clear that they can be used effectively in the <10ms of gas which Accumulate is given. This is something which might change as we begin writing prototype services.
# 2024-09-10 09:07 sourabhniyogi: You already have service users submitting a refinement context within the work package, which includes an anchor block's state root but not an unfinalized accumulate state root. With Safrole, assuming very high liveness, you also win mostly forkless state roots.
So my lazy dumb strawman is that you can win historical_read parallel to historical_lookup in refine by modifying the "anchor block" centric refinement context included in work packages to have state roots not
(a) relative to an anchor block
but instead
(b) whatever the work package submitter has observed to be a recent [potentially unfinalized] state root
The user already has to submit their work packages relative to (a), so what is lost with (b)?
I am not making a request for product dreams here 😅, I am asking why an observed unfinalized state root from the user is an insufficient state proof.
If 2 or 3 Guarantors refine a work package according to some unfinalized state root but it doesn't get finalized, then the work report doesn't get assured, it doesn't get audited, and so it just times out.
The claim is that if this timeout is quite rare in practice, the experience for users improves tremendously with historical_read and a streamlined "accumulate=>refine".
The lazy/dumb GP modification would be to extend what you have in 9.2 "By retaining historical information on its availability, we become confident that any validator with a recently finalized view of the chain is able to determine whether any given preimage was available at any time within the period where auditing may occur." from preimages alone to what is done with write operations. Still asynchronous post-write here, but with less latency than (a).
The new requirement is that any service's aggressive work results (with aggressive (b)) would only be able to affect accumulate if the refinement context's state root in conception (b) was finalized.
I understand this is in the "better to implement JAM to safely crawl before JAM walks/runs" region here. But you claim running with (b)'s aggressive refinement context is unachievable because ... what?
(edited)
# 2024-09-10 09:09 sourabhniyogi: Typo: (252) should have "historical_lookup" instead of "lookup"
# 2024-09-10 20:47 gav: JAM (and Polkadot before it) is secure only because of Elves. Elves requires that all validators are able to audit all Work Reports *regardless of whether they’re synced to the same fork or not*. All host functions available to Refine must return exactly the same result for any node on any fork beyond the lookup-anchor. Since we cannot assume any particular state is known (even the lookup anchor block’s state, which could be very old and is likely pruned) then our host functions must be pretty much stateless. Historical lookup works only because we know, at any block up to 24 hours later, whether a preimage was known at the time and if so what it was. We manage this only with specialised data structures (in accounts) and limiting the rate at which a preimage may be supplied, removed and supplied again to avoid state blowup. This can all be discovered by a thorough read of the GP. The design would not cleanly apply to more arbitrary and general state changes such as service storage.
(edited)
# 2024-09-12 14:00 sourabhniyogi: Thank you very much for this explanation. We will properly understand ELVES=>JAM constraints to see if there is any way we could improve the JAM service developer/user experience.
# 2024-08-29 11:37 sourabhniyogi: (160) should have $l: {\bf r}_l$ to match 11.1.4's eq (121) for the hash of the payload.
(252) should use historical_lookup instead of lookup
(edited)
# 2024-09-04 03:33 shwchg: some questions about dispute and audit:
Will disputes enter a special voting phase if someone casts a false judgement during each tranche settlement? I would like to know the details of how E_D is formed.
(edited)
# 2024-09-04 09:28 dave: AIUI H_V, like the other markers, was (intentionally) redundant. It simply provided verdict information for those downloading only headers, not block extrinsics. All validators need to download full blocks, so its removal makes no difference to them (other than not having to generate it etc obviously)
# 2024-09-04 09:32 shwchg: Sorry, I might not have been clear. H_V still exists in the generation of s, so is it simply being removed from the function?
# 2024-09-04 09:35 dave: Ah sorry think there is some confusion. H_j is what was removed
# 2024-09-04 09:41 shwchg: my bad, that H_j is header verdict XD
Regarding dispute voting, what are your thoughts?
# 2024-09-04 09:45 dave: Re (2) if a node sees a negative judgement for a report, then it should start auditing it if it hasn't already. I'm not sure if it's supposed to send out an announcement in this case; I would guess this is not necessary, but probably also harmless.
# 2024-09-04 09:49 shwchg: got it, so next is to convert the judgments into Verdict and move on to the logic in Chapter 10. Thank you!
# 2024-09-04 09:52 dave: > <@dave:parity.io> Re (2) if a node sees a negative judgement for a report, then it should start auditing it if it hasn't already. I'm not sure if it's supposed to send out an announcement in this case; I would guess this is not necessary, but probably also harmless.
In the case that there is a negative judgement, it is thus expected that all validators will produce a judgement. If a block author has seen enough judgements to build a verdict then they will do this.
# 2024-09-04 20:07 mateuszsikora: Hey, correct me if I am wrong but there is no specification in the GP of how to handle out of memory in the PVM. We have 2 cases:
1. A generic program without memory segmentation. In this case we start from empty memory and we can call sbrk until the memory exceeds 2^32. What should happen then?
2. A "standard program initialization" program. In this case we have the heap between 2Z_Q + Q(|o|) and 2^32 - 2Z_Q - Z_I - P(s) (the beginning of the stack segment). What should happen when the memory we have allocated exceeds the heap segment?
I guess in both cases it should be a page fault but what should be the address then? We don't pass any address to sbrk so there is no strict point where this fault could happen
# 2024-09-05 17:38 jan: Memory allocation/deallocation handling is still a work-in-progress, and it's possible the sbrk instruction will get modified and/or removed. I'd suggest you temporarily skip it and focus on other parts of JAM and/or PVM.
---
If you're interested in some history as to why sbrk is there then let me give you some background.
Historically I designed PolkaVM (on which PVM in the GP is based) to be a VM which is as "powerful" as WASM VMs (so it can completely replace our current WASM-based executor in Polkadot 1.0 and our WASM-based smart contracts VM) while being as simple as possible to implement, and without sacrificing any performance.
So this is where the idea for sbrk came from (which is similar to what WASM has): the VM maintains a heap pointer, and the guest program can use sbrk to query that pointer and/or to bump it up. And every time it crosses a page boundary the VM allocates new memory for the program.
So this design has numerous benefits. First, it's very simple to use as a guest program (pseudo code):
// Get a pointer to the new allocation.
let pointer = sbrk(0);
// Actually allocate it.
if sbrk(size) != 0 {
    // Allocation succeeded.
    // Now `pointer` points to `size` bytes you can use.
}
This is also great for use cases like e.g. tiny smart contracts which can use this directly as an allocator without having to bring a heavyweight allocator of their own (which would consume a lot of space).
Secondly, it's simple to implement in the VM, something like this (pseudo code again):
fn sbrk(size) -> Pointer {
    if size == 0 {
        // The guest wants to know the current heap pointer.
        return current_heap_pointer;
    }
    // The guest wants to allocate.
    let new_heap_pointer = current_heap_pointer + size;
    if new_heap_pointer > max_heap_pointer {
        // Allocation failed.
        return 0;
    }
    let next_page_boundary = align_to_page_size(current_heap_pointer);
    if new_heap_pointer > next_page_boundary {
        allocate_new_pages(next_page_boundary..align_to_page_size(new_heap_pointer));
    }
    current_heap_pointer += size;
    return current_heap_pointer;
}
And this (along with the memory map I came up with, which is what we now call "standard program initialization") also makes it very easy to write an interpreter for this, because when handling loads/stores from memory you only have to do something like this:
fn load_value32(address) -> value {
    if address >= stack_address && address + 4 <= stack_address_end {
        return stack[address - stack_address];
    } else if address >= rw_data_address && address + 4 <= align_to_page_size(current_heap_pointer) {
        return rw_data[address - rw_data_address];
    } else if address >= ro_data_address && address + 4 <= ro_data_address_end {
        return ro_data[address - ro_data_address];
    } else {
        // Address is inaccessible.
        return Err;
    }
}
It's cheap, fast, and doesn't require any crazy data structures and doesn't require any handling of corner cases (for example, accesses which could read both from the stack and from RW data don't have to be handled, because they're impossible by definition; the interpreter can just keep them in separate arrays, and call it a day).
So that's how (and why) it __was__ originally designed, but then came JAM and changed things. (: (Again, remember, I started working on this **before** JAM, and some things were just grandfathered into JAM.)
What JAM introduces is a concept of inner VMs (see the machine, peek, poke, invoke and expunge host functions in section B.8 of the GP) where one VM can spawn another VM, and as it is currently designed those inner VMs are extremely flexible and have completely free-form memories and are dynamically paged.
What this essentially means is that all of those nice properties of sbrk that I've listed - simple and easy to implement, fast, doesn't require fancy data structures - they all now go out of the window!
So we will probably be replacing sbrk with something else that's more appropriate for the more flexible inner VM model. And unfortunately also most likely orders of magnitude harder to implement (at least if you want to reach at least the half-speed milestone), but it is what it is. I'm still finishing some other stuff up, but I'll most likely be working on this soon-ish. (If any of you have any good and/or crazy ideas feel free to message me!)
(edited)
# 2024-09-09 06:54 mateuszsikora: Thank you Jan Bujak, that clarifies a lot. It would be nice to include a note in the GP that sbrk might be modified or removed, and that it is advisable not to implement it yet
# 2024-09-06 00:49 xlchen: This appears to be the original version described in wikipedia, not the modern version?
# 2024-09-06 08:53 celadari: Looks like it ! Modern version looks a bit simpler to implement as well
# 2024-09-06 08:53 celadari: Modern version requires random variables j in [0, i] in the loop, and function Q_l (equation 306) is said to be constrained to [0, l]. So it should be possible to use the modern version.
PS: I don't see why the Q_l result is contained in [0, l] but for now I'll assume this hypothesis (I don't understand the 4i modulo 32 to be honest).
# 2024-09-06 03:22 xlchen: I am trying to understand this, which doesn't make much sense to me
# 2024-09-06 08:34 celadari: I think R here refers to the rotation period 10 (appendix I.4.4) and it means multiply R by (floor(tau'/R) - 1)
(edited)
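A quick numeric check of that reading (Python sketch; assumes R is the rotation period of 10 timeslots from appendix I.4.4):

```python
R = 10  # rotation period in timeslots (appendix I.4.4)

def prev_rotation_start(tau: int) -> int:
    # R * (floor(tau / R) - 1): the first timeslot of the rotation
    # immediately before the one containing tau.
    return R * (tau // R - 1)

# e.g. tau = 25 lies in the rotation starting at 20, so this yields 10
```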
# 2024-09-06 03:24 xlchen: if it is referring to this R (in that case the font isn't right), it takes two arguments
# 2024-09-06 03:26 xlchen: and the last part just isn't making sense to me. the only logical interpretation I can come up with is that it is a typo that is supposed to limit some t passed to the function R as the second argument?
# 2024-09-06 20:17 danicuki: Hi. I have a doubt about the definition of erasure-coding function. Is it a recursive function? If so, what is the base case?
# 2024-09-09 07:45 gav: > <@xlchen:matrix.org> it requires the length of v to be no more than 2^x so log2(|v|) - x is always <= 0?
The outcome is that it skips the final $x$ items of the proof.
# 2024-09-09 07:50 gav: The ceil(log_2(|v|)) just gives the (generally maximum, but here constant) number of nodes from root to leaf. Subtracting x would result in a negative number in the case that there are fewer than 2^x leaves. So we clamp it to zero and take only the first such proof items.
# 2024-09-09 07:52 gav: This is useful since we know we'll get a well-aligned subtree of 2^x data items (the "page") and thus need only proof data to get it from the root to the sub-tree root.
(edited)
# 2024-09-09 07:52 gav: Consequently, if there are no more items in total than the items on our page, our proof can be empty (i.e. the subtree root is equal to the root). This is where the max(0, ...) comes from.
(edited)
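A small numeric sketch of that clamped proof length (Python; `x` is the page-size exponent, so pages hold 2^x items):

```python
from math import ceil, log2

def proof_len(n_leaves: int, x: int) -> int:
    # Hashes needed to link a well-aligned 2^x-item page's subtree root
    # to the overall root; zero when the whole tree fits on one page.
    return max(0, ceil(log2(n_leaves)) - x)

# 1024 leaves with 64-item pages: 10 - 6 = 4 proof items
# 50 leaves with 64-item pages: the tree fits on one page, so 0 items
```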
# 2024-09-09 07:55 gav: It is never required to enumerate a service's storage keys or preimages.
(edited)
# 2024-09-09 09:27 gav: > <@danicuki:matrix.org> Hi. I have a doubt about the definition of erasure-coding function. Is it a recursive function? If so, what is the base case?
It is not.
# 2024-09-09 09:31 gav: > <@danicuki:matrix.org> The only place I see C defined is here:
C_k is the "true" definition, and it assumes an input of data in multiples of 684 bytes, and this multiple is the number of "chunks", k. The k subscript may generally be elided since the number of chunks is implied by the input, but sometimes I put it in regardless to make things clearer.
# 2024-09-09 09:35 gav: We use a 256-bit hash-sequence to create a 32-bit integer sequence. It would be wasteful to compute a full 256-bit hash for just 32 bits of entropy. So we do a hash only every 8th item and otherwise take the 256 bits of the hash and split them into 8 32-bit integers.
(edited)
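A sketch of that expansion (Python; the hash function, counter framing and byte order here are placeholder assumptions, not the GP's actual definitions):

```python
import hashlib
import struct

def u32_sequence(seed: bytes, count: int) -> list[int]:
    # Each 256-bit hash is split into eight 32-bit integers, so we only
    # need to hash once per eight outputs.
    out: list[int] = []
    counter = 0
    while len(out) < count:
        digest = hashlib.blake2b(seed + struct.pack("<I", counter),
                                 digest_size=32).digest()
        out.extend(struct.unpack("<8I", digest))
        counter += 1
    return out[:count]
```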
# 2024-09-09 19:37 danicuki: this formula is a little bit confusing because pc has two meanings here and the letters "c" are almost identical.
(edited)
# 2024-09-10 05:15 xlchen: I don't fully get this part. Firstly I think the code hash of a service can never be null. And when o is empty, it returns what? The same service account? But what about other fields in the return context?
# 2024-09-10 05:16 xlchen: and it is not invoking the PVM when o is an empty array, which I am not sure is correct. the only case this can be triggered is for privileged services and it should still invoke the PVM when there is no work result to accumulate?
# 2024-09-10 05:35 xlchen: I also need some clarification about this regards to privileged services without work report
# 2024-09-10 05:36 xlchen: when there are two work results for a service, does it receive 2 minimal accumulation gas + (gas ratio * remaining gas)? or 1 minimal gas?
# 2024-09-10 05:36 xlchen: how about privileged services without work results? they should still receive the minimal accumulation gas right?
# 2024-09-10 06:40 gav: > <@danicuki:matrix.org> this formula is a little bit confusing because pc has two meanings here and the letters "c" are almost identical.
It only has one meaning; the two lines are complementary. One defines the set it belongs to (i.e. its "type") and the other defines the value within this set.
# 2024-09-10 07:06 gav: > <@xlchen:matrix.org> I don't fully get this part. Firstly I think the code hash of a service can never be null. And when o is empty, it returns what? The same service account? But what about other fields in the return context?
This is not comparing the hash but the code itself, which can be null in the case that the service doesn't currently host the preimage of its code hash.
# 2024-09-10 07:09 gav: Yes, the function implies this, since it is taking the sum over all work results attributable to the given service, and each element of that sum includes the minimum accumulation gas together with its share of the remainder of the core.
(edited)
# 2024-09-10 07:15 xlchen: how about the minimum accumulation gas from privileged services if they don't have any work results?
# 2024-09-10 07:18 gav: > <@xlchen:matrix.org> how about the minimum accumulation gas from privileged services if they don't have any work results?
At present, that's zero according to the formula.
# 2024-09-10 07:19 gav: > <@xlchen:matrix.org> the remaining gas should also subtract those?
?
# 2024-09-10 07:20 xlchen: so if the gas is zero for privileged services without work results, then why include them in S? invoking those with zero gas surely wouldn’t yield anything other than OOG?
# 2024-09-10 07:24 gav: Yeah, it's not final, I'm just describing correct behaviour at present.
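The apportioning gav describes (each work result gets the minimum accumulation gas plus its share of the core's remainder, and a service with no results gets nothing) can be sketched as follows — a hedged illustration with hypothetical names, not the GP's exact formula; the per-result shares are assumed to be precomputed:

```python
def service_accumulate_gas(result_shares: list[int], g_min: int) -> int:
    """Total accumulation gas for one service: the sum over all of its
    work results, where each result contributes the minimum accumulation
    gas plus its (precomputed, hypothetical) share of the core's remainder.

    A service with no work results gets an empty sum, i.e. zero --
    matching "at present, that's zero according to the formula" above.
    """
    return sum(g_min + share for share in result_shares)
```

So two work results yield two copies of the minimum gas (xlchen's question above), e.g. `service_accumulate_gas([10, 5], 100)` gives 215, while `service_accumulate_gas([], 100)` gives 0.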
# 2024-09-11 04:36 gav: 0.3.7 is released:
https://github.com/gavofyork/graypaper/releases/tag/v0.3.7
In addition to the usual corrections/clarifications, this contains a couple of modest but important changes:
- Erasure-coding is now validator-major not chunk-major: this optimizes the happy case (where you have the first 342 validators' shares) and doesn't really affect the general case.
- There's a new item in chi (privileged services) for managing always-accumulate services and how much gas they get. The three privileged services *no longer always-accumulate implicitly*.
# 2024-09-11 07:34 dakkk: I've found a possible discrepancy in single-step state transition of PVM:
In 214 psi_1 receives (c,k,j) + pvm_state, while in 218 psi_1 receives (c,j) + pvm_state
# 2024-09-11 15:14 tomusdrw: Question regarding BitSequence codec:
Am I getting this right, that in case of a variable-length bit sequence encoding, one should prefix it with the length of the bit sequence itself, not with the length of its packed representation?
# 2024-09-11 20:20 xlchen: it has to be the bit length, otherwise how would you figure it out? but I don’t think variable-length bit sequences are used atm, so no need to implement it
# 2024-09-11 15:17 tomusdrw: Also IMHO it would be good to only have one canonical representation of some encoding, so GP should strictly define that:
1. The remaining bits (i.e. the remaining bitLength % 8) should be set to 0 (and decoding should fail in the other case)
2. The boolean discriminator can be only 0 or 1, so any other number should fail the decoding.
# 2024-09-11 15:24 emielvanderhoek: Bitsequence with length 3 for octet 249 and octet 1 both decode as [True,False,False]. With encoding, the bits outside of the sequence matter to get the right octet back.
(edited)
# 2024-09-12 00:31 gav: 1. This is already the implication. (We define only the encoding function, which implies that the remaining bits are set to zero. Decoding is just the inverse of the encoding function and so would naturally be invalid if these bits happened to be set, since there would be no valid operand to the encoder function which could produce that output.)
2. What do you mean by “Boolean discriminator”?
(edited)
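A minimal sketch of such a codec (function names hypothetical; the LSB-first bit order follows the summation form b_i · 2^i discussed in this thread). Since the decoder is defined as the inverse of the encoder, any set padding bit makes the input non-canonical and decoding fails:

```python
def encode_bits(bits: list) -> bytes:
    """Pack a bit sequence into octets, up to 8 bits per octet, LSB-first:
    each octet is the sum b_i * 2**i over its bits (missing bits are 0)."""
    out = []
    for i in range(0, len(bits), 8):
        out.append(sum(b << j for j, b in enumerate(bits[i:i + 8])))
    return bytes(out)

def decode_bits(data: bytes, n: int) -> list:
    """Inverse of encode_bits for a bit sequence of known length n.

    Rejects non-canonical input: if any padding bit beyond n is set,
    no bit sequence could have encoded to `data`, so decoding fails.
    """
    bits = [(data[i // 8] >> (i % 8)) & 1 for i in range(n)]
    if encode_bits(bits) != bytes(data):
        raise ValueError("non-canonical encoding: padding bits are set")
    return bits
```

For example, `[1, 0, 0]` encodes to `b"\x01"`, and `b"\xf9"` (padding bits set) is rejected when decoded at length 3.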
# 2024-09-12 06:27 tomusdrw: 1. Okay, fair point, I didn't think about the implication like that. What I had in mind about being more explicit with bits is to change the sum limits to be the full 0..8 and then define b_i * 2^i if i < |b|, 0 otherwise
(edited)
# 2024-09-12 06:30 tomusdrw: > <@gav:polkadot.io> 1. This is already the implication. (We define only the encoding function, which implies that the remaining bits are set to zero. Decoding is just the inverse of the encoding function and so would naturally be invalid in if these bits happened to be set, since there would be no valid operand to the encoder function which could produce that output.)
> 2. What do you mean by “Boolean discriminator”?
2. Sorry, I should have been more precise: I meant the discriminator for a set in union with the empty set. But given your other explanation about just providing the canonical encoding it's pretty clear that any number other than 0 and 1 should be rejected.
# 2024-09-12 07:06 emielvanderhoek: This also implies that the only PVM testvectors that are currently valid are the ones where |c| is accidentally a multiple of 8.
# 2024-09-12 07:08 emielvanderhoek: We can work around this for now by not having a ‘strict’ decoding of a bitsequence. I.e. simply ignoring the bits outside of the bitmask.
# 2024-09-12 07:27 gav: I think this would only confuse matters. It’s pretty clear currently: using a summation to define each octet, and totally ignoring the fact that the octets in the resultant sequence could hypothetically be expressed in bits, some of which may have no correspondence with bits in the bit sequence.
(edited)
# 2024-09-12 07:28 gav: It is bits in, octets (values between 0 and 255 inclusive) out. There’s absolutely no need to bring “unused bits” into it. Concerns specific to particular low-level implementations are not generally going to be made explicit in the GP.
(edited)
# 2024-09-12 09:14 tomusdrw: Omg, sorry I'm so dumb. Only now I've realized it's an actual sum, so obviously the other bits don't need any special treatment
# 2024-09-12 08:24 jan: The remaining bits in the PVM code blob's bitmask (in case the size of the instructions' slice is not divisible by 8) don't really matter and will produce the same result (as all out-of-bounds instructions are assumed to be traps, and regardless of what skip they have they will decode all the same), except the very first "extra" out-of-bounds bit, which nominally should be 1, because that bit determines the skip of the last instruction. So just assuming they're all zeros would screw up decoding of the last instruction in case the instructions' slice is divisible by 8.
But you're right that there's a discrepancy here between the GP and the test vectors. It essentially boils down to this: what should we do if the number of instructions is not divisible by 8? Naively reading the GP, the |c| = |k| would suggest to me that the number of bits in the bitmask should always be equal to the number of instruction bytes, but that indeed is not the case for our current test vectors.
So how do we fix this discrepancy? First, even if the number of instruction bytes is divisible by 8 we need to assume that out-of-bounds bits are 1 (only the very first one needs to be, but it's simpler to assume all of them are), otherwise the decoding of the last instruction will be broken (alternatively, skip can be defined as "the number of 0s until the next 1, or until the end of the bitmask"; then conceptually the out-of-bounds bits can be 0).
Second, decide what to do when the number of instruction bytes is not divisible by 8:
a) change the |c| = |k| to say that |c| should be rounded up to the nearest multiple of 8 (so, essentially, "round_up_to_nearest(8, instruction_bytes) == bits_in_bitmask"); essentially, allow the number of instructions and the number of bits in the bitmask not to match, but enforce that the bitmask can have at most 7 "extra" bits of padding (of which, as I've previously explained, only the first bit matters)
b) delete the |c| = |k| (since AFAIK we always know the length of the whole **p** anyway, so the only purpose this equation serves is to add an additional constraint); essentially, allow the number of instructions and the number of bits in the bitmask not to match, and not enforce anything about the bitmask's length
c) force the number of instruction bytes to always be divisible by 8, in which case we could change the encoding of the **p** code blob to something like p = E(|j|) ⌢ E_1(z) ⌢ E_z(j) ⌢ E(c) ⌢ E(k) (basically remove E(|c|), since, again, the lengths here can be implicitly calculated from the length of **p**)
So I think (b) is probably not a great idea. (a) is what I currently have implemented in PolkaVM, and I don't dislike the (c) option (it can waste up to 6 bytes, but it doesn't need any extra validation of the sizes, so one less place for the implementations to diverge I guess). gav
# 2024-09-12 08:30 gav: Firstly I don’t understand the problem of instructions not being divisible by 8. Bit strings need not be divisible by 8 either.
# 2024-09-12 08:36 jan: > <@gav:polkadot.io> Firstly I don’t understand the problem of instructions not being divisible by 8. Bit strings need not be divisible by 8 either.
We have a bit in the bit mask for every instruction, right? But we can't just encode singular bits; we need to physically encode them as bytes. So if we have, say, an instructions blob which only takes 2 bytes of space, then sure, theoretically we only need 2 bits in the bit mask, but practically we must encode a full 8 bits because 8 bits is the lowest granularity we can store. So this is what I mean by "instructions not being divisible by 8".
# 2024-09-12 09:03 jan: The implicit "remaining bits of the bitmask encoded as bytes must be zero" of GP's bit string encoding has two issues here as far as I can see:
1. this being an implicit requirement can be confusing for the implementers (as we've seen from the questions here; it might not be entirely obvious to everyone that those should be zeros, nor whether this should be validated), so it could be worthwhile to add a clarifying footnote or something explicitly saying this so that there's no room for confusion,
2. for the PVM code blob specifically this breaks the decoding of the very last instruction's skip value, _unless_ we say that the skip calculation goes only as far as the end of the bitmask (so effectively the bitmask would implicitly have a 1 in there, because it would behave as if there actually was a 1 encoded there, even though we'd physically require encoding 0s there!)
(edited)
# 2024-09-12 09:05 jan: Hmm, but okay, another way of dealing with this could maybe be inverting the mask, so that operands get a 1 and instruction opcodes get a 0 in the bitmask - then the zero padding would work.
# 2024-09-12 09:19 emielvanderhoek: Rereading GP-0.3.6-eq:276 (encoding of bitsequence), to me it seems to explicitly force leading zeros when encoding to an octet.
Example: Value [T,F,F] leads to value 1 or 0x01 (0-filled). And no other value.
(edited)
# 2024-09-12 09:29 gav: > We have a bit in the bit mask for every instruction, right? But we can't just encode singular bits; we need to physically encode them as bytes
The encoding is irrelevant from the perspective of the PVM spec.
(edited)
# 2024-09-12 09:32 gav: I say again: the length of any sequence (including a sequence of bits) does not need to be divisible by 8. From the perspective of the spec there is absolutely nothing special about bit-sequences compared to any other kind of sequence.
# 2024-09-12 09:34 gav: There is a perfectly well-defined serialization codec. It is independent of the business logic and does not prejudice it at all.
(edited)
# 2024-09-12 09:37 gav: Reading what you're saying, the only thing I can think of which might need a tweak is the skip function.
# 2024-09-12 09:40 gav: Since the final instruction's opcode bitmask could not be followed by a 1 as that would imply the opcode bitmask sequence as being longer than the instruction data sequence.
# 2024-09-12 09:40 jan: Sure, but the bits are physically there, so everyone needs to agree to handle them in the same way. (:
So, again, if the implicit assumption is that those extra physical bits in a bitstream must be zero then that doesn't work for the PVM as currently defined, and we need to fix it somehow.
One simple way we could fix it is to invert the bitmask (so change the 1 to 0 in the skip equation) at a tiny cost to performance (~0.006% worse compilation speed, as I just implemented and measured it); if that's fine with you then we can go with that.
# 2024-09-12 09:41 gav: > Sure, but the bits are physically there, so everyone needs to agree to handle them in the same way. (:
Not necessarily.
# 2024-09-12 09:42 gav: Maybe someone implements the bit sequence as Vec<bool>. Maybe they're in C++ and it's std::vector<bool>. Maybe they're in Scheme or Haskell and it's a linked list of bools.
# 2024-09-12 09:42 gav: The serialization format and the internal business logic are NOT the same.
# 2024-09-12 09:43 gav: AFAICT nobody has brought up any valid concern related to bit-sequences. The spec is 100% clear.
(edited)
# 2024-09-12 09:44 gav: Bit sequences are arbitrary in length and anywhere spec subscripts by an index >= length is undefined regardless of what might happen to be in any particular place in a machine's physical RAM.
(edited)
# 2024-09-12 09:46 gav: You seem to be conflating some element of (perfectly well defined) business logic with the fact that the serialization of a bit sequence of type B_8 and value [1, 1, 1, 1, 1, 1, 1, 0] happens to give the same octet sequence as a bit sequence of type B_7 and value [1, 1, 1, 1, 1, 1, 1].
(edited)
# 2024-09-12 09:48 gav: As I say, at present the skip function appears to be in part undefined. This is the only alteration I see the need for. And it's just a broken function - it has nothing to do with the serialization format of sequences, or subscripting into bit sequences.
(edited)
# 2024-09-12 10:03 jan: Okay, sure, but the bits are physically there, so it needs to be defined somehow (implicitly or explicitly) how they are handled. Are they validated (enforced to be zero) or are they ignored? The protocol will physically transmit those bytes, so it needs to define how they're handled (even if, again, this definition is only implicit and not explicitly written out). So I disagree that this is in any way an implementation-only concern. It's just a matter of whether the spec defines what happens explicitly or implicitly.
Like, for example, let's take this pseudo code as an example:
a = spawn_pvm(instructions = [1,2,3,4,5,6,7], bitmask = [0b11111110])
b = spawn_pvm(instructions = [1,2,3,4,5,6,7], bitmask = [0b11111111])
We need the behaviour of this snippet to be the same for every implementation, hence the spec must define what happens here. Sure, the spec doesn't explicitly have to concern itself with whether the last bit is there, but it needs to at least implicitly define (as a consequence of what it does explicitly define) exactly what happens, whether that be just ignoring the extra bits, or checking whether the extra bits are zero and if not then rejecting the program. If you want to say "this concern is too low level for the spec to explicitly define, and we will only define it implicitly" then okay, fair enough.
Anyway, so, can we change the bitmask for the skip to be the other way around - 0s for instructions and 1s for the arguments? That should resolve the issue with only a minimal hit to the performance of any potential implementations and not require any extra modifications.
# 2024-09-12 10:06 gav: > but the bits are physically there
Not in the spec they're not.
(edited)
# 2024-09-12 10:08 gav: It sounds as though your implementation of the spec is probably based on non-conformant assumptions about memory layouts.
# 2024-09-12 10:08 gav: And that you're conflating these non-conformant assumptions with the spec's serialization format for bit sequences.
# 2024-09-12 10:10 gav: Now I don't think there's any issue with altering the skip function in the way I state above.
# 2024-09-12 10:11 gav: Implementations are free to include a bounds-check, but they can also just place extra bits (of value 1) on the end of their (bit sequence) k, and thus allow writing a conformant skip function which needs no explicit bounds check, as long as its argument is properly constrained (and we know it should be by virtue of how it's used in the spec).
(edited)
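This sentinel trick can be sketched as follows (an illustrative Python version with hypothetical names; the real skip is defined by the GP's equation, and appending 1-bits is purely an internal-representation choice):

```python
def with_sentinel(k: list, extra: int = 8) -> list:
    """Internal-representation trick: append `extra` 1-bits to the bitmask
    so a naive skip scan always finds a terminating 1 without any explicit
    bounds check, for every index within the rightful length of k."""
    return list(k) + [1] * extra

def skip_unchecked(i: int, k1: list) -> int:
    # k1 must include the sentinel 1-bits; the scan then needs no
    # length check for any opcode index i within the original bitmask.
    j = 0
    while k1[i + 1 + j] == 0:
        j += 1
    return j
```

For example, with k = [1, 0, 0, 1, 0], `skip_unchecked(0, with_sentinel(k))` is 2 and `skip_unchecked(3, with_sentinel(k))` is 1 (the last instruction's scan terminates on the sentinel).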
# 2024-09-12 10:13 gav: Inverting the meaning of bits in k is not going to help unless you presume (generally incorrectly, but perhaps by design in your implementation) that there are accessible trailing zero bits in whatever your internal representation of sequence k is.
(edited)
# 2024-09-12 10:14 gav: > It's just a matter of whether the spec defines what happens explicitly or implicitly.
I've no idea what you're talking about.
# 2024-09-12 10:15 gav: Beyond any as-yet unknown errors, the spec defines correct behaviour perfectly.
# 2024-09-12 10:15 gav: There is nothing implicit about the serialization format nor about k, nor about what happens when you subscript into k.
# 2024-09-12 10:19 jan: Okay, can I ask an offtopic question?
Leaving the spec-land, can you clarify what exactly are implementations supposed to do with the extra bits in the real world to be conformant with the spec? Are they supposed to ignore it? Or validate that they are zeros? Since the spec doesn't define what to do with them and doesn't concern itself with them am I correct in assuming that they should be ignored?
# 2024-09-12 10:22 jan: Okay, so they should be ignored by the implementations. Got it.
# 2024-09-12 10:22 gav: There is a definition of how to serialize a sequence of N_2 values ("bits", booleans, whatever you like to call it).
# 2024-09-12 10:22 gav: This implies a deserialization function definition which requires canonical serialization.
(edited)
# 2024-09-12 10:24 gav: As for internal (in-RAM) representations: it's purely an implementation concern.
# 2024-09-12 10:25 gav: The GP specifies correctness through the serialization of blocks only. It does not EVER specify how a machine should represent any particular datum in physical memory. For all the GP knows, it could be a pen-and-paper block execution.
(edited)
# 2024-09-12 10:26 gav: As such there is simply no concern of "extra bits" with regards to correctness.
# 2024-09-12 10:29 gav: If I might rephrase your question (possibly incorrectly) to "I plan to implement a bit sequence in Rust as a Vec<u8> and thus will naturally have access to a multiple of 8 bits at a time. If the bit sequence is a non-multiple of 8 in length and I attempt to dereference a bit whose index falls into those bits beyond the rightful length but still within the final byte which I am able to access, how should I proceed?"
(edited)
# 2024-09-12 10:30 jan: (Although the question would be the same regardless of the programming language, as long as we're implementing JAM on a computer.)
# 2024-09-12 10:31 gav: The answer would be: "Either your implementation is incorrect (because the GP does not subscript at that index) or the GP is incorrect (because it does subscript at that index and thus includes an undefined term)."
(edited)
# 2024-09-12 10:32 gav: Again not true. This question can only be framed with the presumption that there is accessible capacity beyond the rightful length, which seems pretty specific to the presumptions inherent in your implementation.
(edited)
# 2024-09-12 10:33 jan: Okay, so the implementation should not access those bytes, and if it does this should not change any observable behavior, so any practical implementations of JAM on a computer will have to ignore those bits.
# 2024-09-12 10:34 gav: > <@jan:parity.io> Okay, so the implementation should not access those bytes, and if it does this should not change any observable behavior, so any practical implementations of JAM on a computer will have to ignore those bits.
It's not as simple as that.
# 2024-09-12 10:34 gav: Implementations are totally free to access whatever parts of RAM they want in whatever ways they want to.
# 2024-09-12 10:35 gav: As long as their behaviour is in line with the GP it really doesn't matter.
# 2024-09-12 10:37 gav: From the GP's perspective, there are no "extra bits", or more generally stated, there is no capacity beyond the length. For the sequence s = [0, 1], s[2] is undefined. If the spec ever tries to evaluate it, there is a mistake in the spec. Maybe a corresponding operation in some implementations would define some result. That's a potential avenue for an implementation-specific optimisation perhaps, but it's irrelevant from the perspective of asking about correct behaviour.
(edited)
# 2024-09-12 10:40 gav: To help illustrate this a bit further: you'd probably want to implement an efficient bit-sequence as a Vec<usize>, implying that you'd have up to 31 "extra bits" on 32-bit architectures and 63 "extra bits" on 64-bit architectures. Clearly the spec cannot care about what architecture any given impl instance is on. Therefore it must surely be that the GP cannot possibly consider the existence of any such "extra bits".
(edited)
# 2024-09-12 10:41 jan: Hm... but isn't there at least one part of the spec that accesses those "extra" bits? Namely, the machine hostcall defines the program as being p_z bytes long, so doesn't this imply that there will be extra bits in there?
# 2024-09-12 10:43 gav: Yes, this draws upon the exact same (deterministic) codec as before.
# 2024-09-12 10:44 jan: Okay, fair enough. So in this case the bits should be zero to correctly decode, otherwise an error is returned?
# 2024-09-12 10:45 gav: The point where wire-format (the p which is passed in machine and passed into Phi) changes into business logic is the E function.
# 2024-09-12 10:45 dakkk: In appendix A, why is the gas delta always 0 for every instruction?
# 2024-09-12 10:45 gav: Before, one can argue that there may be "extra bits" (though I'd say it's unhelpful and unnecessary to frame it in that way). Afterwards there are most certainly not. As I say, some implementations will likely use optimized language primitives like C++'s vector<bool> in order to represent this data and there will be no such accessible-beyond-rightful-length data.
(edited)
# 2024-09-12 10:47 gav: We have not yet defined sensible gas costs - it'll be one of the last things we do.
(edited)
# 2024-09-12 10:48 gav: > <@jan:parity.io> Okay, fair enough. So in this case the bits should be zero to correctly decode, otherwise an error is returned?
That's the implication of the math, yeah.
# 2024-09-12 10:48 dakkk: > <@gav:polkadot.io> We have not yet defined sensible gas costs - it'll be one of the last things we do.
okiedokie; so I assume in koute pvm test vectors the default value is 1
# 2024-09-12 13:26 jan: > <@gav:polkadot.io> That's the implication of the math, yeah.
Okay, thank you for clarifying everything. Now I'm clear on the intended behavior. Sorry for being so dense. (:
So the skip equation should be fixed, as it currently does access out-of-bounds elements of k as far as I understand: j∈N∶ k_{i+1+j} = 1 (since j is ∈N, so it goes up to infinity; but even without this it would do an out-of-bounds access for the very last instruction due to the +1)
And we might consider deleting/changing this sentence which precedes the skip equation, as it explicitly talks about the extra padding. I'm not saying it's incorrect (logically it's true), but in the context of our "there are no extra bits" conversation it may be confusing when (outside of the serialization codec section) you have a sentence which explicitly says that those bits do in fact exist:
> We assert that the length of the bitmask is no smaller than the length of the instruction blob (and in fact is simply rounded to the nearest multiple of eight for ease of octet-encoding).
So, let me quickly summarize the main options that I can see:
a) change skip to not read out of bounds,
b) keep skip as-is, define out-of-bounds reads to be 1 (after all, we already do something similar with the instructions blob),
c) keep skip as-is, define out-of-bounds reads to be 1, and use a dedicated serialization codec for this bit mask which requires that |k| mod 8 = 0 and ceil(|c| / 8) = |k| / 8 (so this would make the business logic explicitly use the "extra" bits, which I now know you don't want)
I can make a PR to the GP. I'm guessing you'd like to go with (a), correct?
(For reference, I initially implemented (c) in PolkaVM because that was marginally the fastest option which didn't require adding any extra unnecessary padding.)
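Option (a), a bounds-checked skip, can be sketched like this (an illustrative Python version with hypothetical names; the authoritative definition is the GP's skip equation, and treating the end of the bitmask as an implicit 1 is precisely the fix under discussion):

```python
def skip(i: int, k: list) -> int:
    """Number of argument bytes of the instruction whose opcode is at i.

    Scans the bitmask k for the next 1 after position i, but never reads
    out of bounds: the end of the bitmask acts as an implicit 1, so the
    last instruction's skip is still well-defined.
    """
    j = 0
    while i + 1 + j < len(k) and k[i + 1 + j] == 0:
        j += 1
    return j
```

For example, with k = [1, 0, 0, 1, 0] (opcodes at 0 and 3), `skip(0, k)` is 2 and `skip(3, k)` is 1, the latter terminating at the end of the bitmask rather than reading past it.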
# 2024-09-12 13:33 jan: > <@dakkk:matrix.org> okiedokie; so I assume in koute pvm test vectors the default value is 1
Correct. The test vectors currently assume that every instruction costs 1 gas, but this is just a strictly temporary measure so that people can implement their gas metering machinery without needing the final gas cost model to be defined (since gas is a core part of JAM you must have at least some gas cost model, and assuming every instruction costs only 1 gas is the simplest one you can have). We will be defining a proper gas cost model in the future and updating the vectors.
# 2024-09-12 15:56 gav: > <@jan:parity.io> Okay, thank you for clarifying everything. Now I'm clear on the intended behavior. Sorry for being so dense. (:
>
> So the skip equation should be fixed, as it currently does access out-of-bounds elements of k as far as I understand: j∈N∶ k_{i+1+j} = 1 (since j is ∈N, so it goes up to infinity; but even without this it would do an out-of-bounds access for the very last instruction due to the +1)
>
> And we might consider deleting/changing this sentence which precedes the skip equation, as it explicitly talks about the extra padding. I'm not saying it's incorrect (logically it's true), but in the context of our "there are no extra bits" conversation it may be confusing when (outside of the serialization codec section) you have a sentence which explicitly says that those bits do in fact exist:
>
> > We assert that the length of the bitmask is no smaller than the length of the instruction blob (and in fact is simply rounded to the nearest multiple of eight for ease of octet-encoding).
>
> So, let me quickly summarize the main options that I can see:
>
> a) change skip to not read out of bounds,
> b) keep skip as-is, define out-of-bounds reads to be 1 (after all, we already do something similar with the instructions blob),
> c) keep skip as-is, define out-of-bounds reads to be 1, and use a dedicated serialization codec for this bit mask which requires that |k| mod 8 = 0 and ceil(|c| / 8) = |k| / 8 (so this would make the business logic explicitly use the "extra" bits, which I now know you don't want)
>
> I can make a PR to the GP. I'm guessing you'd like to go with (a), correct?
>
> (For reference, I initially implemented (c) in PolkaVM because that was marginally the fastest option which didn't require adding any extra unnecessary padding.)
(a) sure.
# 2024-09-12 15:57 gav: And indeed that parenthesised text should be removed as it is neither useful nor commensurate with everything else.
# 2024-09-12 15:57 jan: > <@gav:polkadot.io> (a) sure.
Got it. I'll make a PR to the GP and update the test vectors with the fixed paddings then.
# 2024-09-13 06:53 gav: sourabhniyogi: Before you or your team invest too much effort into attempting to redevelop the JAM protocol, please note that following the resolution of Ordered Accumulations I do not anticipate any significant changes beyond tweaks, corrections and high-value-low-impact optimizations. Following this issue I consider JAM essentially feature complete and it will be very difficult to convince me to merge significant, novel feature additions into the GP. As a rough timeline I would like to have PolkaJAM entering audit in Q2 next year, which pretty much implies a security audit of the GP starting in Q1 and thus a spec freeze by EOY.
(edited)
# 2024-09-13 08:15 prematurata: so to avoid confusion, may I suggest removing the _X from the omega?
# 2024-09-15 08:59 vinsystems: Yes, it's the correct link. Should l be determined implicitly by the number of octets of x? Or is it explicitly passed as an argument to the encode function?
# 2024-09-15 09:18 vinsystems: Is there any case in which l calculated from the encoded number is 0?
(edited)
# 2024-09-16 06:36 jan: To make this equation a little more clear, here's a visualization of what it does. It essentially encodes numbers as varints in the following way:
At most 7bit - 0xxxxxxx
At most 14bit - 10xxxxxx xxxxxxxx
At most 21bit - 110xxxxx xxxxxxxx xxxxxxxx
...
At most 56bit - 11111110 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
At most 64bit - 11111111 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
So the first encoded byte tells you how long the encoded number is by the number of 1s either until the first 0 (starting at the most significant bit) or until the end of the byte. If there's "free space" left after the first 0 then part of the number is packed in there (the xs in the first byte). And then the rest of the bits of the encoded number are appended (the xs after the first byte). So as you can see the number of x bits is always divisible by 7 (hence the 7l in the GP equation), except in the very last case, where we don't have to encode the 0 in the first byte anymore because we'll hit the end of the byte when counting the 1s when decoding anyway (hence the "otherwise if x < 2^64" part in the GP).
Here are some example numbers encoded using this:
number -> encoded as
01111111 -> 01111111 (127 -> [127])
10000000 -> 10000000, 10000000 (128 -> [128, 128])
00111111_11111111 -> 10111111, 11111111 (16383 -> [63, 255])
So if you're maybe familiar with the "standard" uleb128 varint serialization scheme - this is essentially similar, except instead of putting a continuation bit in every byte it packs all of the continuation bits into the first byte and limits the number to at most 64 bits. (The rationale for this is that it is more efficient to decode on modern CPUs, because you only need to look at the first byte to know the length of the varint, instead of having to check the most significant bit of every byte.)
Hopefully this makes it a little clearer.
# 2024-09-19 15:09 vinsystems: In the last example
00111111_11111111 -> 10111111, 11111111 (16383 -> [63, 255])
shouldn't it be 16383 -> [191, 255]?
In this case, l = 1.
According to (272): 16383 -> 2⁸ - 2⁷ + (16383 / 2⁸) = 191, concatenated with (16383 mod 2⁸) = 255.
# 2024-09-19 15:12 jan: Yes, you're correct. That was a copy-paste error on my part when I converted binary to decimal. 10111111 is, of course, 191 in decimal. Sorry for the confusion.
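Putting jan's description and the correction together, the whole scheme can be sketched as follows — a hedged reconstruction from this discussion (function names hypothetical, not copied from the GP), with 16383 encoding to [191, 255] as established above:

```python
def encode_varint(x: int) -> bytes:
    """Encode 0 <= x < 2**64; the first byte's run of leading 1s gives the length."""
    assert 0 <= x < 2 ** 64
    for l in range(8):
        if x < 2 ** (7 * (l + 1)):
            # prefix = l leading 1s, then a 0, then the top bits of x
            prefix = 2 ** 8 - 2 ** (8 - l) + (x >> (8 * l))
            return bytes([prefix]) + (x & ((1 << (8 * l)) - 1)).to_bytes(l, "little")
    return bytes([0xFF]) + x.to_bytes(8, "little")  # full 64-bit case

def decode_varint(data: bytes) -> int:
    """Inverse of encode_varint; only the first byte determines the length."""
    l = 0
    while l < 8 and data[0] & (0x80 >> l):
        l += 1
    if l == 8:
        return int.from_bytes(data[1:9], "little")
    top = data[0] - (2 ** 8 - 2 ** (8 - l))
    return (top << (8 * l)) | int.from_bytes(data[1:1 + l], "little")
```

This reproduces the worked examples: 127 → [127], 128 → [128, 128], and 16383 → [191, 255].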
# 2024-09-15 09:42 vinsystems: > <@gav:polkadot.io> If 1 <= x < 128 then l = 0.
Ok, I thought l = the number of octets of x. Thanks.
# 2024-09-17 13:08 celadari: Hi, I have a question regarding state Merklization. We say we use a Patricia trie, but looking at equation 295, it seems there are no extension nodes for common prefixes, which looks more like a regular prefix trie with Merkle hashing.
Could you confirm if my understanding is correct? If so (and we want a Patricia trie), wouldn't we need to define l and r using b_t1 and b_t0, where t1 and t0 are the largest common prefixes among paths starting with 1 and 0, respectively?
# 2024-09-17 15:17 dave: It's intentional that there are no extension nodes. The keys are all hashes and so it isn't expected that there will be long common prefixes.
(edited)
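The structure dave describes — a binary trie with Merkle hashing and no extension nodes, viable because the keys are hashes with no long common prefixes — can be sketched like this (an illustrative toy, assuming 32-byte keys, SHA-256 in place of the GP's hash, and made-up leaf/branch domain tags; it is not GP equation 295):

```python
import hashlib

def merklize(items: list, depth: int = 0) -> bytes:
    """Root hash of a binary trie over (32-byte key, value) pairs.

    No extension nodes: at each level we simply split on bit `depth` of
    the key (MSB-first here, an assumption) and hash the two subtrees.
    """
    if not items:
        return b"\x00" * 32  # hypothetical empty-subtree marker
    if len(items) == 1:
        (key, value), = items
        return hashlib.sha256(b"leaf" + key + value).digest()
    zero = [(k, v) for k, v in items if not (k[depth // 8] >> (7 - depth % 8)) & 1]
    one = [(k, v) for k, v in items if (k[depth // 8] >> (7 - depth % 8)) & 1]
    return hashlib.sha256(merklize(zero, depth + 1) + merklize(one, depth + 1)).digest()
```

Since keys are uniformly distributed hashes, the expected depth is around log2(n) rather than the full key length, which is why extension nodes buy little here.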
# 2024-09-17 15:29 celadari: I see! I'm definitely not a database expert, but doesn't it mean we'll have at least 512 jumps in the db?
# 2024-09-18 01:07 gav: > <@celadari:matrix.org> I see! I'm definitely not a database expert, but doesn't it mean we'll have at least 512 jumps in the db?
Only if it is implemented in the least optimal way.
# 2024-09-17 16:08 clw0908: Hello, a question about the 4 invocation function entry points:
As the GP mentions, the four entry points (0 (isAuthorized), 5 (Refine), 10 (Accumulate) and 15 (OnTransfer)) all pass through Ψ_M -> Ψ_H -> Ψ -> Ψ_1.
My question is: what is the relationship between these four entry points and the PVM's instruction counter? Will these different entry points affect the PVM's instruction counter?
(edited)
# 2024-09-18 02:50 clw0908: > <@gav:polkadot.io> The instruction counter is initialised with the entry point.
Why initialize the instruction counter with the entry point?
Does it mean that different invocation functions execute different parts of the instruction data (bold C in GP(218))?
# 2024-09-18 02:51 jan: e.g. for refine you set the instruction pointer to 5 and start execution from there
# 2024-09-18 02:57 jan: To give more background on this: the way this will work in practice is that the program will have 4 unconditional jump instructions at the very start, and the first 3 of those instructions will be encoded in such a way as to be padded and always take up 5 bytes of space. Hence the entry points of 0, 5, 10 and 15. This design allows for hardcoded addresses for each of the entry points without having to specify them dynamically anywhere in the protocol.
# 2024-09-18 03:03 jan: (And just to be clear in case it isn't obvious: as a PVM implementation you don't and shouldn't care about this, and this shouldn't be handled in any special way on the VM level. You're supposed to set your instruction counter to the hardcoded address of the entry point, and just start execution from there, regardless of what exactly happens to be there. The fact that there will be usually unconditional jump instructions there is just an implementation detail, and is not in any way required by the GP.)
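The fixed-offset dispatch described above can be sketched like this. Only the numeric offsets 0, 5, 10 and 15 come from the discussion; all names and the prologue layout are hypothetical illustrations:

```python
# Hypothetical mnemonics; the GP fixes only the numeric entry offsets.
# Conventionally each entry slot holds an unconditional jump, padded so the
# first three occupy exactly 5 bytes each.
PROLOGUE = [
    (0,  "jump is_authorized_body"),
    (5,  "jump refine_body"),
    (10, "jump accumulate_body"),
    (15, "jump on_transfer_body"),
]

ENTRY_POINTS = {"is_authorized": 0, "refine": 5, "accumulate": 10, "on_transfer": 15}

def initial_instruction_counter(entry: str) -> int:
    # A conforming PVM host simply initializes the instruction counter with
    # the hardcoded offset and starts executing whatever happens to be there.
    return ENTRY_POINTS[entry]
```

As jan notes, the VM itself treats these offsets like any other starting address; the jump-prologue convention is a guest-program detail, not a VM rule.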
# 2024-09-18 13:28 prematurata: > <@gav:polkadot.io> Yes it should read blackboard A
what about the subscripted t?
# 2024-09-19 09:45 prematurata: is m of deferred transfer Y_64 or Y_128? i ask because in 12.4 the text reads "a memo component m of 64 octets" but then it's defined as m ∈ Y_M (161) and M is equal to 128 in the appendix.
What is even more interesting is that the Ω_T fn on p.45 says we should deserialize an M-long octet sequence into a T value. So my guess is that (161) is wrong.
Assuming m is Y_64 then deserializing T means that 128-64 octets must be used to get the other elements in T (s, d, a, g). but even if all of them were 8 octets that would mean that we're left with 32 extra octets. so I'm a bit lost
(edited)
# 2024-09-20 02:24 sourabhniyogi: How does Coreplay, with its stop-freeze-registers-resume pattern fit into JAM?
This would be the major "feature request", in addition to:
- connecting JAM back into Polkadot's { Staking, Coretime, and Asset Hub } system chains
- getting complete about the rewards/punishment for guaranteeing/auditing/...
- JAMNP
- getting clear about how the (refineless) privileged services work from a user point of view
I hope by Strongly Web3's "equidistant from all teams" principle, we can have multiple teams (3+) undergoing audits and not just PolkaJAM in the Q2 timeframe =).
(edited)
# 2024-09-20 07:50 gav: sourabhniyogi: I am happy to field specific questions on interpretation of the Gray paper. I am not happy to be managed.
# 2024-09-20 08:04 gav: > <@prematurata:matrix.org> is m of deferred transfer Y_64 or Y_128? i ask because in 12.4 the text reads "a memo component m of 64 octets" but then it's defined as m ∈ Y_M (161) and M is equal to 128 in the appendix.
>
> What is even more interesting is that the Ω_T fn on p.45 says we should deserialize an M-long octet sequence into a T value. So my guess is that (161) is wrong.
>
> Assuming m is Y_64 then deserializing T means that 128-64 octets must be used to get the other elements in T (s, d, a, g). but even if all of them were 8 octets that would mean that we're left with 32 extra octets. so I'm a bit lost
That was a typo in the text: where it reads 64 should read 128 and is fixed in the next release. The other points are invalid. The deserialization you refer to is moot (I have since removed the function as it's a no-op here), and "deserializes" only the memo data. The other fields of the record are provided elsewhere.
# 2024-09-20 13:50 prematurata: > <@gav:polkadot.io> That was a typo in the text: where it reads 64 should read 128 and is fixed in the next release. The other points are invalid. The deserialization you refer to is moot (I have since removed the function as it's a no-op here), and "deserializes" only the memo data. The other fields of the record are provided elsewhere.
thanks
# 2024-09-22 21:19 danicuki: What is the meaning of av[c] here? As far as I understood, av is an integer.
I interpret it as the core associated with validator in index av. But I am not sure the formula is accurate to the meaning.
# 2024-09-22 21:32 danicuki: I think I found the answer: it should be af, not av, right?
# 2024-09-23 08:49 danicuki: Should Formula (128)
af[c] ⇒ ρ†[c]≠∅
be
af[c] ⇔ ρ†[c]≠∅
?
# 2024-09-23 09:25 danicuki: Another doubt: What does H(Hp, af) mean in formula 126? The H function should take only one argument, no?
# 2024-09-23 10:49 gav: > <@danicuki:matrix.org> Should Formula (128)
>
> af[c] ⇒ ρ†[c]≠∅
>
> be
>
> af[c] ⇔ ρ†[c]≠∅
>
> ?
No.
# 2024-09-23 10:50 gav: H is assumed to encode any arguments given prior to hashing. Multiple arguments are treated as tuples.
(edited)
# 2024-09-23 14:17 danicuki: > <@gav:polkadot.io> H is assumed to encode any arguments given prior to hashing. Multiple arguments are treated as tuples.
Just to make sure I understood correctly:
af is a binary string of 341 elements (e.g. represented in programming languages as an array of integers). It is not clear to me how we should hash this tuple. There is no clear definition of how to hash a tuple in the GP, only H(m ∈ Y).
Should I assume: concat the Hp binary with the 0 and 1 af array and then hash the result of this concatenation?
# 2024-09-23 17:58 vinsystems: After implementing the codec functions, I noticed that there is already a crate
parity-scale-codec. Does this crate have the same encode implementations as the GP codec functions?
(edited)
# 2024-09-23 19:59 tomusdrw: There are some similarities, but nope. AFAIR SCALE was initially mentioned in the Gray Paper (and is mentioned on the JAM prize website), but it is no longer. Main difference is the variable-length encoding of numbers.
# 2024-09-23 22:47 gav: FWIW the only difference between SCALE and JAM's serialization codec is in treatment of compact integers.
# 2024-09-23 22:36 gav: There's a lot of programming languages in the world. I'd avoid trying to make blanket statements about them.
(edited)
# 2024-09-23 22:38 gav: As I wrote:
> H is assumed to encode any arguments given prior to hashing. Multiple arguments are treated as tuples.
# 2024-09-23 22:39 gav: The GP defines how to encode tuples and how to hash octet-sequences.
# 2024-09-24 02:34 gav: To avoid further confusion I'll make it even more explicit with the sentence:
> The inputs of a hash function should be expected to be passed through our serialization codec $\mathcal{E}$ to yield an octet sequence to which the cryptography may be applied.
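Under that sentence, hashing a multi-argument call reduces to hashing the serialization of the tuple, which is the concatenation of the serialized elements. A minimal sketch, assuming Blake2b-256 for H and arguments that have already been passed through the codec E (`gp_hash` is a hypothetical name):

```python
import hashlib

def gp_hash(*encoded_args: bytes) -> bytes:
    """Sketch: H applied to a tuple of already-E-encoded arguments.
    Assumes Blake2b-256 as the underlying hash; a tuple serializes as the
    concatenation of its elements' encodings."""
    return hashlib.blake2b(b"".join(encoded_args), digest_size=32).digest()
```

So for the H(Hp, af) case discussed above, one would hash E(Hp) concatenated with E(af), exactly the "concat then hash" reading danicuki suggests.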
# 2024-09-24 02:58 xlchen: GP defines BLS pubkey to be 144 bytes but it seems like the BLS pubkey is usually 48 bytes?
# 2024-09-24 03:04 xlchen: seems like the raw type (blst_p1) is 144 bytes but that should be internal details
# 2024-09-24 08:59 davxy: The public key for the scheme consists of two public elements, each corresponding to one of the two groups used by BLS for pairing: G1 and G2. Specifically, it is represented as (g1^sk, g2^sk), where g1 is the generator of G1 and g2 is the generator of G2. Each point in G1 is compressed into 48 bytes, while each point in G2 is compressed into 96 bytes. This scheme allows for fast verification of aggregate signatures (which are very useful for bridges). In particular the G1 component allows for aggregate signature correctness verification without the need for N expensive pairings, with N the number of aggregated signatures. The full scheme is detailed in this paper:
https://eprint.iacr.org/2022/1611 and is implemented by this library:
https://github.com/w3f/bls. The authors of the paper are Jeff, Syed, Alistair and Oana Ciobotaru
(edited)
# 2024-09-24 15:07 dakkk: I have a doubt; in A.1, we define the program blob p (octets containing instructions c, bitmask k and jump table j).
Then in A.7 we say "We thus define the standard program code format p, which includes not only the instructions and jump table (previously represented by the term c), but ..."
This I think is wrong since we already defined p as the combination of c, k and j; the term c previously indicates only the instruction data.
# 2024-09-24 16:45 dave: Would it make sense to maintain an extended recent block history for checking prerequisite WP hashes? As it stands, a work package can have a recent enough anchor block to in theory be reportable, but not actually be reportable because the reporting of its prerequisite WP has been forgotten about. A WP with a prerequisite essentially has a shorter lifetime than other WPs because of this.
# 2024-09-25 02:18 xlchen: isn't this a circular dependency? The sealing key depends on unsigned block, but it includes Hv, and Hv depends on the VRF output of Hs?
# 2024-09-25 06:17 celadari: I think you first generate the anonymous VRF (with ring root) as a ticket submitter - you don't need aux for that: only context.
This anonymous VRF happens to be equal to Y(H_s). You compute a signature H_v and then you compute H_s.
Everything I said is you as a block author.
(edited)
# 2024-09-25 09:04 celadari: I think you generate Y(H\_s) as a normal VRF using the context X\_F ++ eta\_3: anonymous and non-anonymous method give same VRF so here you can use whichever you want. You don't need aux for VRF output: just the context.
You then compute signature H\_v and then you compute signature H\_s.
Again, everything I said is you as a block author.
If someone can confirm what I said it would be nice
(edited)
# 2024-09-25 02:20 xlchen: or does the message (encoded unsigned block) won't impact the VRF output? so I can pass empty data when calculating the VRF output?
# 2024-09-25 15:41 davxy: I confirm.
Y(H_s) can also be generated independently from the signature (you just need the context). That is what breaks the cyclic dep.
# 2024-09-28 02:02 gav: 1. Yes.
2. Yes.
3. No - the GP describes the signed material, but other data may be passed in addition for context. The signed material itself doesn’t need the block hash.
(edited)
# 2024-09-30 14:33 dave: Re (3), because the block's hash is not included in the signed data, it doesn't seem unlikely that an announcement intended for a block on one fork could be fiddled with and then used in the context of a block on a different fork? Maybe this is not a problem but that is not clear to me
# 2024-09-30 16:40 dave: One more question on this... AFAICT, as currently specified, auditing of a block is performed by the prior validator set rather than the posterior validator set. This seems a bit odd, given that eg the availability assurance stuff uses the posterior validator set. Wondering if this is intentional or if it was simply missed when lots of things were changed to use posterior state?
# 2024-09-30 22:45 gav: They’re two very different processes - auditing is entirely off-chain, recorded publicly only through grandpa; assurance is recorded on-chain - so I wouldn’t draw any conclusions about one from the other.
(edited)
# 2024-09-30 22:47 dave: Well auditors may request shards from assurers. If they're always the same set you might for example only service shard requests if you have seen an appropriate audit announcement first (not sure if this would be sensible)
# 2024-09-30 04:37 gav: Note that the syntax in the preceding portion is the little-used numeric tuple subscript (the subscript “2”). I intend to remove this syntax in an upcoming revision. Instead the entire term “(x_s)_l[h,z]_2” can simply be replaced with “w”.
# 2024-09-30 07:36 emielsebastiaan: Graypaper releases with black backgrounds remain a pain point for me personally. Does anyone have a convenience download of the 0.3.8 release without the background and with black text? For a while these were available for download on the Graypaper site in the resources section.
# 2024-09-30 22:39 gav: But regardless, even if they were not it would not alter anything in the spec.
# 2024-09-30 22:49 gav: David Emett: There is not really the prior/posterior difference for auditing as it's a fully off-chain process. With assurance, you actually check signatures on-chain, so you need to select which of the two sets those sigs refer to. Auditing doesn't happen inside a block. No audit signatures are routinely checked on-chain (judgements are the exception, and these are intentionally drawn from the prior set). Auditing happens _between_ blocks. So there really isn't a prior or posterior. There's just the "current" validator key set. I suppose that could be phrased as the posterior of the block associated with the timeslot at the point that the auditing process began.
(edited)
# 2024-09-30 22:49 gav: The slight annoyance for clients is that there may be several forks.
# 2024-09-30 22:50 gav: So clients need to track all forks and do all audits of those forks where they are in the resultant (i.e. posterior) set.
(edited)
# 2024-09-30 23:14 dave: AIUI auditing happens in the context of a block, and this block has a prior validator set and a posterior validator set. We presumably need to pick one of these to be the validator set that is expected to audit the block? I don't understand how you could just use the "current" validator set as (a) there is no consensus over this and (b) auditing is stateful and thus we presumably need the same validator set throughout the whole auditing process for a block?
# 2024-09-30 23:18 gav: So both auditing and block STF happen in the context of some pre-existing chain-head and its implied (posterior) state. In the context of the block STF (which essentially validates a candidate child-block and determines the implied (posterior) state), then we now have two states - the old implied state and the new implied state. I call these two the prior and the posterior, since one is the state before the STF and the other is after the effects of the STF.
(edited)
# 2024-09-30 23:20 gav: Now, depending on how you argue, this could be named the prior or the posterior. But the reality is that neither name is sensible.
(edited)
# 2024-09-30 23:23 gav: AFAIK I never actually specify either term in the section on auditing (nor should I have). I guess the confusion comes from the fact that I didn't suffix kappa with a prime. But this wouldn't make sense since there is no concept of a non-prime kappa here.
# 2024-09-30 23:26 dave: You state here that the terms are in the context of the block that is being audited, so sigma prime is the posterior state for example
# 2024-09-30 23:28 dave: Particularly given that you use for example bold W to refer to the accumulated reports, which is defined in terms of prior state and the extrinsic
# 2024-09-30 23:29 gav: So basically replacing
> assume ourselves focused on some block B with other terms corresponding, so σ′ is said block’s posterior state, H is its header &c.
with
> assume ourselves focused on the most recent implied state of the chain, σ. The header of the most recent block of the chain is assumed as H.
(edited)
# 2024-09-30 23:29 dave: It would seem odd to say that "non-prime" names actually refer to items in the posterior state of the block
(edited)
# 2024-09-30 23:33 gav: But using prime doesn't make sense here for the reasons I mention above.
# 2024-09-30 23:42 dave: Not sure I agree. But in any case, if we agree that the validator set which should audit a block is the validator set in the state after the block has been executed, then I'm happy
(edited)
# 2024-09-30 23:54 gav: Ok - I notice that we need to draw upon the prior of rho in order to describe work-reports which became available during the block. So it's going to be convenient (indeed, pretty much necessary) to have both prior and posterior even though auditing _per se_ doesn't impact their relationship.
(edited)
# 2024-09-30 23:55 gav: So we'll keep it as-is and I'll add some clarifying text and a prime to kappa.
# 2024-10-01 00:35 dave: LGTM. Getting back to the original question, should the \kappa bits be \kappa'?
# 2024-10-01 03:48 gav: I think kappa’ makes more sense yes. That's now in the PR.
(edited)
# 2024-10-04 11:06 prematurata: > <@gav:polkadot.io> Shouldn't what be Y?
sorry, (181) the return should be a sequence of Y imho as the output of zeroPad(P) does not guarantee the output size
# 2024-10-07 07:27 gav: The zeropad function guarantees that the output length is an integer multiple of the subscript (l = W_S·W_E, the length of a single segment in bytes). In this case the zeropad function just has the effect of padding the segment to exactly this size (blackboard Y_l === blackboard G) since its input is always greater than zero and never bigger than a segment.
(edited)
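The padding behaviour discussed above can be sketched as follows. `zero_pad` is a hypothetical helper standing in for the GP's subscripted zeropad, assumed to append zero octets up to the next integer multiple of n:

```python
def zero_pad(data: bytes, n: int) -> bytes:
    """Sketch of zeropad_n: append zero octets until the length is an
    integer multiple of n (a no-op if it already is)."""
    return data + bytes(-len(data) % n)
```

When the input is non-empty and no larger than one segment, padding to a multiple of the segment length yields exactly one full segment, which is the case gav describes.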
# 2024-10-08 13:01 gav: The bits function for building tree nodes is now MSB-first and the PVM spec has several fixes as well as a fair amount of restructuring needed for OrdAcc.
(edited)
# 2024-10-08 13:06 gav: > <@dakkk:matrix.org> that "i" should be "l"?
Please link to GPreader - images don't seem to work on this server
# 2024-10-09 09:04 prematurata: was looking at 0.4.0 i noticed that the isAuthorized invocation's F function at 267 should "return" 2 registers.
it was like this even in 0.3.8 but i believe it is a typo. F should return **all** registers considering it is being used by Ψ_M and Ψ_H, which use F to build ω''
(edited)
# 2024-10-09 09:05 prematurata: same goes for the number of input registers. shouldn't that be 13?
# 2024-10-09 09:06 prematurata: side note: if i am wrong then what is the expected output if n != gas? i see w_8 is now being used
# 2024-10-09 09:28 gav: Planned changes for 0.5 will be comparatively minor and very much directed towards tweaking and optimization:
- Post-Accumulate Preimage Integration
- Extrinsic Commitment should be hash of hashes
- 64-bit PVM
# 2024-10-09 09:43 gav: > <@prematurata:matrix.org> same goes for the number of input registers. shouldnt that be 13?
Yes, this is an old typo and will be corrected in 0.4.1
# 2024-10-09 09:44 prematurata: > <@gav:polkadot.io> Yes, this is an old typo and will be corrected in 0.4.1
thanks for confirming, then... is w_8 value to be placed in w_1 and the rest set to zero?
# 2024-10-09 09:51 gav: w\_1 isn't used any more. (it was also a typo from when the register sets were limited to the argument/return value registers)
(edited)
# 2024-10-09 13:09 prematurata: since you might be there gav you might also want to check the first case of C (279) and (280) > check fn
(edited)
# 2024-10-09 15:53 prematurata: quick question, initially i thought it was a typo but I am now sure i am missing something, what is _t in relation to a service account? for example in the new accumulate function both bold s and bold a reference this _t
(edited)
# 2024-10-09 15:56 prematurata: as well as (X_{bold s})_t in the first matching case of new
# 2024-10-09 19:18 tomusdrw: hi there! we've just deployed a new version of the GP Reader. It has much shorter links and displays notes as annotations on the document itself. If you run into any issues please let us know on github. The GP is also updated to the latest 0.4.1 released earlier today.
# 2024-10-10 06:50 dakkk: gav: in your opinion what are the parts of the paper that you consider almost "written on stone" (or at least enough solid to be considered almost ready)?
# 2024-10-10 06:53 gav: Nothing is likely to change very much now. Safrole might not change at all.
# 2024-10-10 06:54 gav: It’s hard to see what will need to be tweaked until we begin actually testing and benchmarking.
# 2024-10-10 06:55 gav: But beyond the issues already in the GP repo and its readme, there’s nothing on my mind which needs altering.
# 2024-10-10 12:57 dvladco: Hi everyone, we noticed that the polkavm rust implementation charges 1 extra gas when load/store fails with out of bounds error, however we couldn’t find this described in the graypaper, is this expected behaviour?
# 2024-10-10 13:01 jan: Do you mean that there is extra gas charged on top of the 1 gas that the instruction costs normally?
If so then that's a bug. Nevertheless, it doesn't really matter what PolkaVM does here. PolkaVM is not a reference implementation of the GP; it's just an implementation, so you need to remember that just because PolkaVM does something it doesn't mean it's correct. :P
# 2024-10-10 13:18 dvladco: I understand, I was asking this because there is a PR for the jamtestvectors and as I can see these are based on the PolkaVM implementation, so we are also trying to validate our implementation against these tests
# 2024-10-10 13:20 dvladco: in this case I assume a more appropriate place to ask this would be in the PR itself :)
(edited)
# 2024-10-10 13:21 jan: The gas cost model in general is still a work-in-progress, so this is not specced in the GP yet (nor it should be considered final behavior), however the reason you're seeing 1 extra gas being charged there is because we charge gas on an entry to a basic block, and not per instruction.
# 2024-10-10 13:22 jan: Basically, for efficiency's sake what we do is we calculate the gas cost for the whole basic block, and charge the whole cost at the start.
# 2024-10-10 13:22 jan: And in that particular test case you have a load/store instruction in the middle of a basic block, so the execution gets interrupted.
# 2024-10-10 13:23 jan: But since we've already entered the basic block the gas was already charged, as if the _whole_ basic block was executed.
(edited)
# 2024-10-10 13:25 jan: So charging the gas at the start of basic blocks allows us to do two things: 1) be more efficient (charging gas on each executed instruction is expensive), and 2) it should allow us to have a better gas cost model which will also take the dependencies between the instructions into account instead of only assigning static costs to each instruction in isolation
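The per-basic-block charging scheme jan outlines can be sketched as follows. The representation (a program as a list of basic blocks, each a list of (static_cost, execute_fn) pairs) is hypothetical, not PolkaVM's actual implementation:

```python
class OutOfGas(Exception):
    pass

def run_blocks(blocks, gas):
    """Sketch of basic-block gas metering: the whole block's precomputed
    cost is charged on entry, before any of its instructions execute."""
    for block in blocks:
        cost = sum(c for c, _ in block)  # static cost of the entire block
        if gas < cost:
            raise OutOfGas
        gas -= cost
        for _, execute in block:
            # An instruction may trap mid-block (e.g. an out-of-bounds
            # load/store), but the full block cost has already been paid.
            execute()
    return gas
```

This reproduces the observed behaviour: a load/store trapping in the middle of a block still "costs" the instructions after it, because charging happened at block entry.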
# 2024-10-10 13:34 dvladco: oh I see, thanks for explaining, that's what I needed to know. So charging for the whole block even though it is probably not going to be fully executed is what we actually want, is this behaviour going to be described in the gray paper eventually?
# 2024-10-10 13:52 gav: The behaviour will be fully described, but for now we will be removing these non-concordant test instances.
# 2024-10-11 08:02 luke_f: Hello
Could anyone please help me understand Eq. 159?
https://graypaper.fluffylabs.dev/#/c71229b/152702152702
Is the symbol ⋃ defined anywhere? I couldn't find a definition.
I’m interpreting it as the union of all dictionaries in Eq. 158, but I’m not sure what exactly the union of dictionaries means in this context. Could someone explain?
Thank you!
# 2024-10-11 15:01 prematurata: > <@dakkk:matrix.org> I'm a bit confused about this list indexing notation, where a dot is used. I've found this notation only in zip/lace functions of appendix H. I'm not sure what it means, and I can't find a reference in the Notation chapter
>
>
https://graypaper.fluffylabs.dev/#/c71229b/382700382700
I 'm interested as well. i assumed
times
but didnt explore any further
# 2024-10-14 12:32 dave: A question concerning audit announcements. If we see a negative judgment for a work-report from some other auditor, we are required to audit the work-report ourselves. The GP states that we must announce our intention to audit this work-report ("In all cases, we publish a signed statement of which of the cores we believe we are required to audit"). Currently this isn't really possible in JAM-SNP as the "evidence" for requirement to audit a work-report only covers the no-shows case. I can extend the announcement message format to support this but I'm not sure what the reason is for announcing in this case, is it actually necessary?
# 2024-10-15 02:21 stanleyli: Hi David Emett , I have read your wonderful JAMNP-S simple.md. I have a few questions and was wondering if you could answer them.
(1)Are the [Segment Shard] of CE139/140 the same as those of CE137?
(2) in CE 139, you mentioned that "Guarantors should initially use protocol 139 to fetch segment shards. If a reconstructed import segment is inconsistent with its reconstructed proof". What exactly is this "reconstructed proof" when CE 139 only asked for [Segment Shard]? Perhaps you meant Guarantor will use [Segment Shard] retrieved from Assurer to verify against its own pageProof that it sent out during CE 137?
(3) You mentioned that guarantors will use CE139/CE140 to request import segment shards from assurers. However, Builder specify Imported segments using the combination of ({tree_root, index})[
https://github.com/w3f/jamtestvectors/blob/master/codec/data/work_package.json#L22-L35] in CE 133 and not [Erasure Root ++ Shard Index ++ len++[Segment Index]]. Should CE 139 have some alternative request format using only [segment_root ++ index]?
(4) I need clarification on how pageProof is being used in CE139/CE140? Our understanding is that the page proof is quite large as it's encoded with Justification & ↕s_{i⋅⋅⋅+64} in
https://graypaper.fluffylabs.dev/#/c71229b/1a15001a1500. So it's probably possible to recover exported segments from the page proof alone;
Do I understand this wrong?
# 2024-10-15 11:09 dave: > Are the [Segment Shard] of CE139/140 the same as those of CE137?
Yes. In CE137 the assurer receives a Segment Shard sequence, the CE139/140 Segment Index is an index into this sequence.
# 2024-10-15 11:19 dave: > What exactly is this "reconstructed proof" when CE 139 only asked for [Segment Shard]
The segments that are erasure coded consist of the segments exported by the work package _plus_ a sequence of "proof" pages, essentially containing Merkle proofs from the segment root to the exported segments. When you fetch an exported segment, you also need to fetch the corresponding proof to the segment root, so that you can convince yourself and auditors that the segment is correct.
# 2024-10-15 11:23 dave: In the full network protocol these proof pages might be implicitly returned, but with SNP you will need to explicitly ask for them.
# 2024-10-15 11:39 dave: > You mentioned that guarantors will use CE139/CE140 to request import segment shards from assurers. However, Builder specify Imported segments using the combination of ({tree\_root, index})\[
https://github.com/w3f/jamtestvectors/blob/master/codec/data/work\_package.json#L22-L35\] in CE 133 and not \[Erasure Root ++ Shard Index ++ len++\[Segment Index\]\]. Should CE 139 have some alternative request format using only \[segment\_root ++ index\]
Guarantors will need to maintain a map from Segment Root to Erasure Root. This can be built from the work-reports placed on-chain / distributed via CE135. Note that an individual assurer cannot in general prove the relationship between a segment shard and the segment root, only between the shard and the erasure root. This is why for CE139/140 an erasure root is sent rather than a segment root. Note also that any segment root --> erasure root mapping constructed based on reports included on-chain may have incorrect entries, as reports included on-chain are not necessarily correct. Guarantors can conclude that an erasure root was incorrect if after fetching all shards with CE 140 and verifying all the justifications, the reconstructed segment and proof do not match up.
(edited)
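The bookkeeping dave describes could look roughly like this. The helper names are hypothetical; neither the GP nor JAMNP mandates any particular data structure:

```python
class SegmentRootIndex:
    """Sketch of a guarantor's segments-root -> erasure-root map, built from
    work-reports seen on-chain or distributed via CE135, so that CE139/140
    requests can be keyed by erasure root."""

    def __init__(self):
        self._by_segments_root = {}

    def record_report(self, segments_root, erasure_root):
        # Entries may later prove incorrect, since included reports are not
        # necessarily correct; callers verify via fetched justifications.
        self._by_segments_root[segments_root] = erasure_root

    def erasure_root_for(self, segments_root):
        return self._by_segments_root.get(segments_root)
```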
# 2024-10-15 11:51 dave: > I need clarification on how is pageProof being used in CE139/CE140? Our understanding is that page proof is quite large as it's encoded with Justification & ↕si⋅⋅⋅+64 in
https://graypaper.fluffylabs.dev/#/c71229b/1a15001a1500. So it's probably possible to recover exported segments from pageproof alone;
The proof pages contain Merkle proofs from the segment root to the exported segments. Each proof page/segment contains proofs for 64 exported segments. You certainly can't reconstruct the exported segments themselves from these proof pages. CE139/140 is intended to be used to fetch proof page shards as well as exported segment shards. You need to explicitly request proof page shards by passing the appropriate segment indices.
# 2024-10-17 23:22 stanleyli: Hi gav: is there a typo in the Page proof function here?
https://graypaper.fluffylabs.dev/#/293bf5a/1a4f001a5100
In eq 196, is the ↕s referring to the actual segment or the segment hash? If it’s the segment, then each element within the page proof (encoded as 6Hash + G) would make this page proof extremely large (up to 64 G) ?
(edited)
# 2024-10-18 03:27 clw0908: Hello everyone:
Based on the GP's definition, I speculate the following, (not certain if it's correct):
(1) The genesis state will have privileged service code to obtain preimages.
(2) Users will use (1) to get their code into on-chain preimages.
My question is: How will the privileged service get the code into the system?
# 2024-10-18 11:23 gav: > <@stanleyli:matrix.org> Hi gav: is there a typo in the Page proof function here?
https://graypaper.fluffylabs.dev/#/293bf5a/1a4f001a5100
>
> In eq 196, is the ↕s referring to the actual segment or the segment hash? If it’s the segment, then each element within the page proof (encoded as 6Hash + G) would make this page proof extremely large (up to 64 G) ?
Yes, as David Emett suggested, there should be a
H^#(...)
around the
s_{i...+64}
# 2024-10-21 03:08 sourabhniyogi: I am likely missing a crucial point or important detail concerning the idea that, in order for guarantors to use CE139/CE140, every validator must maintain a map from every Segment Root to Erasure Root -- this now requires every validator to become an indexer of pretty much a month's worth of work packages' work reports to be able to be a guarantor. No problem in indexing a month's worth of ImportDA activity, but validators need to take every single work report across all of JAM, and index ALL the _potential_ segment roots to Erasure Roots just to be able to _maybe_ fetch some import segment in some future work package's work items -- Could this be?? If so, what is the simplest process to map an on-chain work report to all its segment roots _without_ auditing the work report?
As of 0.4.x, imported segments can now also be specified in a SECOND way with $H^\boxplus$ as (work package hash, segment index) combinations, and this further requires that validators build a lookup dictionary mapping these into segment roots. But given CE139 has
[Erasure-Root ++ Shard Index ++ len++[Segment Index]]
as the request key, it seems that this second way to specify imported segments is (a) sufficient for builders to specify import segments ... and (b) has far lower indexing requirements because validators just need to store a month's worth of workpackagehash => erasure-Root mappings to use CE139 (which are just sitting in the work report!). Would it be reasonable to drop the first "older" way with segmentroot and just keep this SECOND method as the sole method to specify imported segments in a work package? If not, why not?
I must have missed something crucial here... Thank you for your help!
(edited)
# 2024-10-21 09:10 dave: Each work report has only one segment root, this is in the availability specifier
# 2024-10-21 09:15 dave: Re the WP hash to segment root mapping, this only needs to be tracked for ~1 epoch, as the chain can only check mappings going this far back
# 2024-10-21 09:21 dave: This is also a reason for still permitting the "older" way: the "newer" way using WP hashes will only work for referencing segments exported by WPs reported in the last ~hour, whereas the "older" way should work for any segments exported in the last ~28 days
# 2024-10-21 09:21 dave: > <@dave:parity.io> Each work report has only one segment root, this is in the availability specifier
Probably this is what you are missing?
# 2024-10-21 09:23 dave: It should be possible to build this index using only the last 28 days worth of blocks
# 2024-10-21 20:29 sourabhniyogi: Alright, thank you very much for clarifying the "one segment root", there are so many different things "segments root" and "segment roots" could mean I'll make a point to use singular "segment-root (e)" based on your design and my now clear understanding of what is going on!
# 2024-10-21 20:36 sourabhniyogi: One remaining nitpick question is this: Are you saying that a work item of a work package is _invalid_ if it uses a (workpackagehash, index) combination to specify an import segment when the workpackagehash is more than an epoch old? Is this implied/stated in the GP somewhere? If not, should it be added or made more explicit?
(edited)
# 2024-10-21 21:31 dave: > <@sourabhniyogi:matrix.org> Alright, thank you very much for clarifying the "one segment root", there are so many different things "segments root" and "segment roots" could mean I'll make a point to use singular "segment-root (e)" based on your design and my now clear understanding of what is going on!
FWIW I changed the SNP doc to use "segments-root" as opposed to "segment root"
# 2024-10-21 23:02 dave: > <@sourabhniyogi:matrix.org> One remaining nitpick question is this: Are you saying that a work item of a work package is _invalid_ if it uses a (workpackagehash, index) combination to specify an import segment if the workpackagehash is more than an epoch old? Is this implied/stated in the GP somewhere? If not, it should be added or made more explicit?
Well, "more than an epoch old" is not exact. The GP is quite explicit about what work reports are valid for inclusion in a block and what happens to them after this point. The WP hash -> segment root mapping is checked at accumulation time, against the history of accumulated work packages, lower-case xi. This happens in the E function with the "x u w_l = w_l u x" check. The history is a circular buffer of length E (= timeslots per epoch), with one entry pushed every block. So it can store _at least_ one epoch's worth of history but possibly more if there are "missing" blocks
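As a rough sketch of the circular buffer dave describes (Python, with illustrative names and types; `E = 600` is the full-JAM timeslots-per-epoch value, and storing one set of accumulated work-package hashes per block is an assumption for demonstration, not the GP's exact encoding):

```python
from collections import deque

E = 600  # timeslots per epoch (full JAM value; test configs use smaller E)

class AccumulationHistory:
    """Toy model of the accumulated-work-package history (lower-case xi)."""

    def __init__(self, epoch_len=E):
        # deque with maxlen behaves as a circular buffer: pushing a new
        # entry when full evicts the oldest one.
        self.buf = deque(maxlen=epoch_len)

    def push_block(self, accumulated_wp_hashes):
        # One entry pushed per block.
        self.buf.append(set(accumulated_wp_hashes))

    def contains(self, wp_hash):
        # The WP hash -> segments-root mapping can only be checked against
        # entries still in the buffer.
        return any(wp_hash in s for s in self.buf)
```

With "missing" blocks fewer entries are pushed per wall-clock epoch, which is why the buffer can cover somewhat more than one epoch of real time.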
# 2024-10-21 23:44 sourabhniyogi: No debate on that, thanks for explaining. But on this last nitpick I mean the work items in the work package, as of 0.4.x have this $H^\boxplus$ source here:
https://graypaper.fluffylabs.dev/#/293bf5a/199e0019a500
As far as I can tell, I could have an old work package hash from a long time ago -- "whereas a value drawn from H⊞ implies the hash value is
the hash of the exporting work-package. In the latter case, it must be converted into a segment-root by the guarantor and this conversion reported in the work-report for on-chain validation." If you believe this last statement is only applied for around an epoch (with the circular buffer mechanics, as you mention) then we should ensure this is clear in GP, don't you think?
# 2024-10-22 10:37 gav: > <@sourabhniyogi:matrix.org> No debate on that, thanks for explaining. But on this last nitpick I mean the work items in the work package, as of 0.4.x have this $H^\boxplus$ source here:
>
https://graypaper.fluffylabs.dev/#/293bf5a/199e0019a500
> As far as I can tell, I could have an old work package hash from a long time ago -- "whereas a value drawn from H⊞ implies the hash value is
> the hash of the exporting work-package. In the latter case, it must be converted into a segment-root by the guarantor and this conversion reported in the work-report for on-chain validation." If you believe this last statement is only applied for around an epoch (with the circular buffer mechanics, as you mention) then we should ensure this is clear in GP, don't you think?
The guarantor will not be punished for introducing a WR with an old/unvalidatable SR lookup entry. They might just not get rewarded.
And it is beyond the scope of the GP (at least currently) to go deep into the optimum strategies across all off-chain behaviour.
# 2024-10-21 04:12 ksc85pwpj5: Excuse me, I would like to ask what the difference is between x ∪ wl and wl ∪ x, and what is the meaning of x ∪ wl = wl ∪ x.
(edited)
# 2024-10-21 06:11 prematurata: If I remember properly, the left or right side takes priority in case of keys existing in both elements. This means that a U b = b U a effectively means there are no conflicting keys, or if there are, their values are the same
# 2024-10-21 11:24 gav: Indeed. I was a bit torn over how to express this in the paper.
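In Python dict terms the rule above reads as: the union operator lets one side win on conflicting keys, so a U b = b U a holds exactly when any shared keys map to equal values. A minimal illustration (Python 3.9+ `|` operator; not the GP's notation, just an analogy):

```python
def union_commutes(a: dict, b: dict) -> bool:
    # With dict union, the right-hand operand wins on conflicting keys,
    # so the two unions differ exactly when some shared key has
    # differing values.
    return (a | b) == (b | a)
```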
# 2024-10-21 12:22 gav: It is the third item of (x\_s)\_l\[h, z\] with the condition that it's a three-item array (notice it's behind the "if" condition)
(edited)
# 2024-10-21 12:28 gav: It wasn't especially clear that the value was used as it was only referenced with the "_2" subscript prior to the "if". This will be improved in 0.4.3.
# 2024-10-22 11:14 dvladco: Hello, I am struggling to understand appendix A.7. More specifically, I can't find the explanation for a bunch of variables like o, w, z, s in formula 260 and later a, W, R, V, A in formula 264. I have managed to decode RAM, mainly by looking at the polkavm rust implementation, however I don't understand how to decode the registers from the program code p
# 2024-10-22 11:27 jan: The formula (264) defines the memory map of the standard program initialization (that is: what is put at a given memory address, whether it's initialized, whether it's read-only/read-write, etc.). The formula (260) is essentially the standard JAM "program blob". You don't decode the registers from **p** (260); the initial values of the registers are given in (265).
# 2024-10-22 11:54 dvladco: Yeah, that would make sense, however I got confused by this sentence:
"We thus define the standard program code format p, which includes not only the instructions and jump table (previously represented by the term c), but also information on the state of the ram and registers at program start"
# 2024-10-22 11:56 dvladco: here it says that we get registers from program code p, and later in formula 259 we have this: Y(p) -> (c, ω, μ)?
(edited)
# 2024-10-22 12:19 jan: The wording here might be a little confusing as-is. The initial registers are defined as part of the standard program initialization (eq. 265), but they're
not explicitly part of the program blob (eq. 260). The Y in eq. 259 is meant to represent the standard program initialization, but it doesn't necessarily mean that the values of the registers are directly extracted from **p**.
# 2024-10-22 12:46 dvladco: Thank you, this explanation makes sense. Also, formula 266 shows a as separate from p. I guess formula 259 should be something like this then: Y(p, a) -> (c, ω, μ)?
(edited)
# 2024-10-22 13:19 dvladco: one more thing: formula 265 describes what the initial registers are, but I can't find defined anywhere how to convert a, which is a series of octets with at most Z_I elements, to registers [N_R]_13
# 2024-10-22 13:53 gav: It is |a|, the number of items in a, which should easily fit in register index 8.
# 2024-10-22 14:35 dvladco: I think I understand now, so when we refer to data arguments here it means a slice of bytes that can later be decoded when executing the program. I thought there would be a way to pass arguments as register values directly when invoking the PVM
# 2024-10-22 16:07 gav: It is perhaps not super clear, but it's intended to serialize all of the inner elements as E_4
# 2024-10-24 08:06 prematurata: it's not the first time i ask about this derived terms. my memory is not working properly :)
# 2024-10-24 13:02 luke_fishman: Hello gav
I think there is a mistake in Equation 183:
E(WQ, W∗ ...n) if i = 0
the second argument is of the type ⟦W⟧ (list of work reports), but the second argument that function E expects is D⟨H → H⟩ (Equation 164)
(edited)
# 2024-10-24 13:03 dave: > <@luke_fishman:matrix.org> Hello gav
> I think there is a mistake in Equation 183:
> E(WQ, W∗ ...n) if i = 0
> the second argument is of the type ⟦W⟧ (list of work reports), but the second argument that function E expects is D⟨H → H⟩ (Equation 164)
There is a PR open to fix this:
https://github.com/gavofyork/graypaper/pull/112
# 2024-10-24 21:13 danicuki: In formula 268, psi\_i definition:
https://graypaper.fluffylabs.dev/#/439ca37/29f80229fc02
↦ r where (g, r, ∅) = ΨM (pc, 0, GI , E(p, c), F, ∅)
What is pc? I believe there is a typo in the argument, since work packages do not have a field 'c'
(edited)
# 2024-10-24 21:25 emielsebastiaan: I believe a correction is needed in GP in the State Transition Dependency Graph section for gamma and kappa.
Suggested change: Remove ψ' from GP-0.4.3-eq:21 and add ψ' to GP-0.4.3-eq:19.
Likely an unchanged bit from an earlier version of GP.
Details here:
https://github.com/gavofyork/graypaper/pull/118 (edited)
# 2024-10-25 09:10 gav: > <@danicuki:matrix.org> In formula 268, psi\_i definition:
>
>
https://graypaper.fluffylabs.dev/#/439ca37/29f80229fc02
>
> ↦ r where (g, r, ∅) = ΨM (pc, 0, GI , E(p, c), F, ∅)
>
> What is pc? I believe there is a typo in the argument, since work packages do not have a field 'c'
Bold c is defined as the preimage of regular c.
# 2024-10-25 12:16 dave: The state keys as currently defined in the State Merklization section can collide for different bits of state. e.g. The keys for general service storage and for service preimages are calculated identically. It seems like a_l keys can be made to collide with almost anything, as both h and l are completely controllable by the code calling solicit.
# 2024-10-25 16:10 tomusdrw: > <@danicuki:matrix.org> What does \[WHAT, ω8, . . . \] mean here? What do I put on ...? I assume to pass through ω2, ω3, ... ?
>
https://graypaper.fluffylabs.dev/#/439ca37/2a60002a6000
That's how I understand it as well. You assign ω0=WHAT, ω1 = ω8, and the rest stays the same. The same notation is used later as parameters pass-through (e.g. Omega_G).
# 2024-10-25 16:33 sourabhniyogi: If you all have a precise idea of what a "null authorizer" should be in our JAM implementations [disconnected from the CoreTime system chain], like down to pvm assembly code included in genesis state, kindly share --I think we can converge on something very quickly?
# 2024-10-26 11:12 gav: (They're not going to be especially meaningful yet as we don't have a sensible gas model yet)
# 2024-10-26 11:51 gav: Note that it is omega_7 which gets set to the error code WHAT. omega otherwise remains the same.
# 2024-10-26 20:18 gav: Extrinsic preimages are expected to be passed alongside the work package by the block builder.
(edited)
# 2024-10-26 20:19 gav: Import segments are expected to be retrieved by the guarantor through erasure-code reconstruction from pieces from the validators collected via the network.
(edited)
# 2024-10-27 12:22 gav: But yeah, it's a provision from an older revision where it was desirable to ensure that equivalent keys in different services wouldn't result in similar trie paths.
# 2024-10-27 12:23 gav: It's less important with the current trie as there's already a provision to split it based on service ID. It's still more optimal to hash in order to uniformally distribute tree paths, and generally more secure to "salt" the hash with the service ID.
(edited)
# 2024-10-28 16:43 prematurata: Something i might be missing.
https://graypaper.fluffylabs.dev/#/439ca37/2bb6002bb800 here the preimage code (I guess) is provided to the invocation argument (d\[s\]\_bold\_c),
but the preimage section does not seem to mention any mechanism of storing/fetching the preimage for the service. I marked my code with a TODO and now I am fixing all the leftovers.
omega_n (the new service host call) does not seem to have access to such a preimage; rather, the codehash is provided directly in memory... so I'm a bit lost as to where the preimage is being handled
(edited)
# 2024-10-28 23:15 gav: > <@tomusdrw:matrix.org>
https://graypaper.fluffylabs.dev/#/439ca37/2e70002e7100 Should this d that's being subscripted here be just \mi? I'm having a hard time figuring out where the data necessary to build that dictionary is coming from. In other contexts d is just a dictionary indexed by service id afaict.
Yes. Will be corrected in next revision
# 2024-10-30 01:42 charliewinston14: Hello. I asked this in the Jam chat and didn't get a response so trying here (if this is not allowed let me know ill delete this post). Question about erasure coding in JAM. I’ve broken the blob into pieces, and then those pieces into the octet pairs. The pairs were then converted to 16 bits. I’m now trying to figure out the field element formula. It looks to be the summation of each bit multiplied by “vj” and this is where I’m not sure what value to use. If j = 7, then vj = α14 +α4 +α. What is α here?
# 2024-10-30 07:49 gav: Version 0.4.4 is out. This includes several corrections but two important protocol alterations. One change to the state merklisation, and one to the way that segment root lookup dictionary validations are done.
# 2024-10-30 07:50 gav: I’ll probably be pushing on with 0.5 mostly now with a number of smaller protocol tweaks. See the milestone in the GP report if you want to know what to expect.
# 2024-10-30 10:50 syed: The From_V function is basically computing the sum you are pointing at, but in real life you shouldn't do that: you should directly compute the code polynomial from the (m_0, m_1, ..., m_15) vectors without the need to convert to the standard field element representation.
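For the earlier question about α: it is the class of x, i.e. the element 0b10, in GF(2^16) under whatever reduction polynomial/basis the erasure code fixes. A toy Python sketch of that arithmetic follows; the pentanomial used here is a hypothetical stand-in chosen for illustration, NOT necessarily the GP's actual basis, so consult the spec/test vectors for the real one:

```python
# Assumed reduction polynomial x^16 + x^5 + x^3 + x + 1 (illustrative only).
MOD = (1 << 16) | (1 << 5) | (1 << 3) | (1 << 1) | 1

def gf_mul(a: int, b: int) -> int:
    # Carry-less ("Russian peasant") multiplication with reduction.
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << 16):
            a ^= MOD
    return r

def gf_pow(a: int, n: int) -> int:
    r = 1
    while n:
        if n & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        n >>= 1
    return r

ALPHA = 0b10  # alpha is the class of x
# v_7 = alpha^14 + alpha^4 + alpha, as in the question above
v7 = gf_pow(ALPHA, 14) ^ gf_pow(ALPHA, 4) ^ ALPHA
```

The field element for an octet pair is then the XOR of the v_j for each set bit j, which is exactly the summation the formula describes.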
# 2024-10-30 11:48 davxy: Currently, JAMNP does not incorporate any Beefy-related messaging. This gadget will need to be implemented on top of GRANDPA, which itself is not described by JAMNP yet.
I would tentatively assume that the protocol won’t diverge _significantly_ from the one described for Polkadot. Of course, there are some differences, such as constructing the MMR with accumulation outputs, and using only BLS instead of BLS or ECDSA for signatures. However, for context, there are numerous introductory resources on these concepts available.
Regarding the state of the "BEEFY Distribution" section in GP, it currently suggests signing each finalized block B. I'm not sure if this will later shift to a "sign every X blocks" approach similar to Beefy on Polkadot.
The signatures are for sure BLS signatures, which can be later aggregated for efficient light client verification. Aggregated signatures will necessitate an aggregated key, and APK proofs might be introduced to verify the validity of these aggregated keys (there is at least a talk and one excellent tutorial from Syed online).
Who is in charge of aggregation of BLS keys in JAM? Who is in charge of maintaining the keys commitment eventually required for APK proofs? Probably this aspect is outside of the core JAM protocol per se.
These aspects likely warrant further discussion, and it is probably not worth you (or anyone else) focusing on this right now, given that GRANDPA is not in M1, and potentially not even in M2 either, meaning BEEFY is further down the line (it may be the last component indeed). I believe further details will be provided when the time is appropriate.
(edited)
# 2024-10-30 18:40 davxy: Yeah. I think this walkthrough is very helpful for understanding some low level aspects underlying APK (aggregated public key) proofs. It may also serve as a beneficial read before diving into the ring-proof primitive we're using for Safrole (technical spec draft available here), as they share some similarities. However, unless you're interested for your own learning, it may not be essential to review this material. Regarding your JAM impl, you might prefer to just use FFI
(edited)
# 2024-10-31 10:48 sourabhniyogi: Below are some basic questions on Ordered Accumulation and Prerequisites.
*Background*. Let's say we wish to compute $C[n + 1] = A[n] + B[n] + C[n]$ (some kind of recurrence relationship, say Fibonacci, Tribonacci, and Quadranacci in a toy example, all of $A$, $B$, $C$ trying to write their value into service storage of key 0) using the refine work results of services $A[n]$, $B[n]$, $C[n]$ coming into accumulate. We would like service $C$ to be able to read the results of $A, B, C$ and write out $C[n+1]$ to the service storage of $C$ in $C$'s accumulate code.
Questions:
1. If we have a _single_ work package $p$ with 3 work items from service $A$, $B$, $C$ (with no prerequisites), there is **NO** way to have $C$'s write _guaranteed_ to happen **AFTER** $A$ and $B$. This is because $\Delta_{+}$ will initialize _parallelized_ execution of $\Delta_1$ for all 3 services through $\Delta_*$. Can you confirm this?
2. If we have _three_ work packages $p_A, p_B, p_C$ with 1 work item each ($p_A$ with refine work results from service $A$, $p_B$ from service $B$, $p_C$ from $C$), we **CAN** have $C$'s write happen after $A$ and $B$, by specifying BOTH $p_A$ and $p_B$ as _prerequisite_ work packages of $p_C$. This SET of TWO prereqs was not possible until GP 0.4.5 supporting a _set_ of prerequisites. Can you confirm this?
3. If (2) is correct, for the $C[n + 1] = A[n] + B[n] + C[n]$ write to happen _at the same time slot_ for some n, then another to happen within 2 timeslots for the next n, we **REQUIRE** _three_ cores for _three_ work packages, one core for each work package. In a tiny $V=6,C=2$ test configuration, we CANNOT achieve this, but in a small configuration with $V=9,C=3$, we CAN. Can you confirm this?
4. Actually, the answer to (3) is more nuanced! We could actually solve this ordered accumulation $C[n + 1] = A[n] + B[n] + C[n]$ with only TWO cores working on TWO packages: one package $p_{A,B}$ with 2 work items using services $A$ and $B$, and then another package $p_C$ with ONE prerequisite, $p_{A,B}$. Then with a tiny $V=6,C=2$ test configuration, we CAN achieve the read of $C$ based on the writes of $A$ and $B$, all in one accumulate at the same time slot. Can you confirm this?
5. With a tiny testnet of $V=6,C=2$, it IS possible to test (4) using $\Delta_+$ with $i$ splitting ${\bf w}$ into two pieces that test $\Delta_*$ and the tail recursion of $\Delta_+$. To observe $N$ calls to $\Delta_+$, you need $N-1$ cores though, and need to figure out how to manage $g$ in $N$ splits. Can you confirm this?
[If this is wrong, please advise how we should test $\Delta_+$]
6. Setting aside gas limits for a while (because at present every operation is gas 1 or so, and the accumulation limit is 5-6 orders of magnitude higher), we can have 25 cores working on 25 work packages like this all in one accumulate of $Z[n+1]$, with each work package having 2 prerequisites (except for $p_{A,B}$, which has none):
* $C[n + 1] = A[n] + B[n] + C[n]$ ($p_C$ depending on $p_{A,B}$ using 2 cores)
* $D[n + 1] = B[n] + C[n+1] + D[n]$ ($p_D$ depending on $p_{A,B}$ and $p_{C}$ using 1 more core for $D$)
* $E[n + 1] = C[n+1] + D[n+1] + E[n]$ ($p_E$ depending on $p_{C}$ and $p_{D}$ using 1 more core for $E$)
* $F[n + 1] = D[n+1] + E[n+1] + F[n]$ ($p_F$ depending on $p_{D}$ and $p_{E}$ using 1 more core for $F$)
* ... and so on until ...
* $Z[n + 1] = X[n+1] + Y[n+1] + Z[n]$ ($p_Z$ depending on $p_{X}$ and $p_{Y}$ using 1 more core for $Z$)
Assuming all 25 cores can complete their 25 refines and get 25 work reports guaranteed and assured in a medium configuration ($V=120,C=40$), we can have the entire ordered accumulation done in ONE timeslot. Can you confirm this?
Because there is a lot of formatting that github treats at least somewhat better, also putting it here:
https://github.com/gavofyork/graypaper/issues/129 -- is that better?
# 2024-10-31 13:48 gav: 2. The intention was only to give the same guarantee as in (1). However the way the queuing works at present will, I think, allow this.
(edited)
# 2024-10-31 13:54 gav: The invariant you can definitely rely on is that work packages will not be accumulated any later than their dependencies.
(edited)
# 2024-10-31 13:55 gav: In reality the order of item accumulation of any single service may come down to that service’s Accumulate code.
(edited)
# 2024-10-31 13:57 gav: For dependent WPs with items in different services, you’ll just be guaranteed that the dependency service doesn’t see the dependent service’s changes before it’s own happen.
# 2024-10-31 13:58 gav: > <@sourabhniyogi:matrix.org> Below are some basic questions on Ordered Accumulation and Prerequisites.
> [...]
Sounds right, yes.
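The one invariant stated above (a work package is never accumulated later than its dependencies) amounts to a topological ordering over the prerequisite graph. A toy Python sketch, with illustrative names and none of the GP's actual $\Delta_+$ queuing machinery:

```python
from graphlib import TopologicalSorter

def accumulation_order(deps: dict) -> list:
    # deps maps a work-package id to the set of its prerequisite ids.
    # static_order yields every node after all of its predecessors,
    # i.e. no package before its dependencies.
    return list(TopologicalSorter(deps).static_order())
```

For example, the $p_{A,B}$ / $p_C$ / $p_D$ chain from question 6 yields an order with $p_{A,B}$ first, then $p_C$, then $p_D$.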
# 2024-10-31 14:00 jaymansfield: Where can I find function C defined? I don’t see it in section 12 and in the appendix under C it says “see equation ??”
(edited)
# 2024-10-31 14:05 gav: > <@jaymansfield:matrix.org> Where can I find function C defined? I don’t see it in section 12 and in the appendix under C it says “see equation ??”
Please provide link with GP reader.
# 2024-10-31 14:11 jaymansfield: Looking for function C to be able to generate the accumulate root
# 2024-10-31 14:13 gav: and as the GP says right above your selection, it's defined in section 12.
(edited)
# 2024-10-31 14:15 jaymansfield: > <@gav:polkadot.io> and as the GP says right above your selection, it's defined in section 12.
Thanks I’ll take a look. Confusing though as there is no mention of C in 177 either.
# 2024-10-31 14:19 jaymansfield: Oh thanks got it now! I should have specified which GP version. In v0.4.5 I found it as 182.
# 2024-10-31 14:22 gav: I was just referencing the GP you used in your link, but yeah, using numbers is a bad idea :)
(edited)
# 2024-10-31 14:28 gav: > <@gav:polkadot.io> Sounds right, yes.
Answered in the issue.
# 2024-10-31 18:04 dvladco: Hello, I have a few questions about the solicit host call for the accumulation host functions
1: This condition:
if h ≠ ∇ ∧ (h, z) ∉ (x_s)l
I assume should be
if h ≠ ∇ ∧ (h, z) ∉ K((x_s)l)
since l is a dictionary, so here we are trying to verify whether the key is in the dictionary
2: one line below, here:
if (x_s)l[(h, z)] = [x, y]
I can't figure out where the _x_ and _y_ (italic) come from? (edit: and _t_ as well)
(edited)
# 2024-10-31 18:11 gav: 1. When it’s unambiguous then I don’t bother writing the extra K() since it clutters and adds no clarity. This is often the case in dictionaries when we care about key inclusion.
# 2024-10-31 18:12 gav: 2. They are free variables defined by the fact that the value in question has two items.
(edited)
# 2024-10-31 18:14 dvladco: nevermind about t, I see in the latest version it's changed :)
# 2024-10-31 18:16 dvladco: So basically, if l = [x, y] just means we check that there are two items
# 2024-10-31 22:29 dave: The preimage hash is passed in to eg solicit directly, it's trivial to pass in almost-colliding hashes. As the trie key construction function doesn't preserve all hash bits, these almost-colliding hashes could actually collide in the trie if they are not hashed beforehand
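The concern above can be illustrated with a toy model: if a trie key keeps only some bits of an attacker-chosen hash h, two solicited hashes differing only in the dropped bits collide; hashing h salted with the service ID first restores collision resistance. The 27-byte truncation and blake2b salt below are illustrative stand-ins, not the GP's actual key construction:

```python
from hashlib import blake2b

def naive_key(h: bytes) -> bytes:
    # Keeps only the first 27 bytes of attacker-controlled input,
    # discarding 5 bytes entirely.
    return h[:27]

def hashed_key(service_id: int, h: bytes) -> bytes:
    # Hashing (service_id ++ h) first means every bit of h influences
    # the key, so near-collisions in h no longer collide.
    return blake2b(service_id.to_bytes(4, "little") + h,
                   digest_size=27).digest()

h1 = bytes(32)
h2 = bytes(27) + b"\x01" * 5  # differs from h1 only in the dropped bytes
```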
# 2024-11-04 10:47 dvladco: Hello, I would like to confirm whether the gas calculation here is correct; it seems to me that ω8 contains the memory address for memo
# 2024-11-04 16:50 dakkk: phi is called authorizer queue in the graypaper
EDIT: I edited my question to match the GP
(edited)
# 2024-11-04 16:52 gav: Where does the “maximum number of items” come from? (The formalism is correct, it is the exact number not the maximum)
# 2024-11-04 16:54 gav: Ok that should not have the word “maximum” in there.
(edited)
# 2024-11-06 13:16 dakkk: gav: is it possible to modify the formula numbering system within the graypaper? Currently, when formulas are added or removed, subsequent formula numbers shift, which can be problematic when referencing formulas from code.
A potential solution to this issue is to implement section-based numbering. For instance, the third formula in Chapter 10 would be labeled as 10.3. This approach would also enhance readability when formula numbers are referenced within the paper itself, as it would immediately indicate the formula's location.
# 2024-11-06 13:36 prematurata: > <@dakkk:matrix.org> gav: is it possible to modify the formula numbering system within the graypaper? Currently, when formulas are added or removed, subsequent formula numbers shift, which can be problematic when referencing formulas from code.
>
> A potential solution to this issue is to implement section-based numbering. For instance, the third formula in Chapter 10 would be labeled as 10.3. This approach would also enhance readability when formula numbers are referenced within the paper itself, as it would immediately indicate the formula's location.
I'd love this. I do reference formulas in my code as well for future reference. I just recently added graypaper v number next to the formula "Id" but this suggestion is much more robust
# 2024-11-06 13:38 rick: > <@prematurata:matrix.org> I'd love this. I do reference formulas in my code as well for future reference. I just recently added graypaper v number next to the formula "Id" but this suggestion is much more robust
Permanent formula ids would be very helpful
# 2024-11-06 14:52 qiwei: in the quit host-call, if it halts with a transfer, the service account balance will be below the threshold balance; just wondering, is the account supposed to be removed from state afterwards? I did not find such logic (maybe I missed it..)
# 2024-11-06 17:10 gav: > <@qiwei:matrix.org> in the quit host-call, if halt with a transfer, the service account balance will be below threshold balance, just wondering is the account supposed to be removed from state afterwards? I did not find such logic (maybe i missed it..)
I'm not sure what you mean. The quit hostcall would delete the account in question.
# 2024-11-07 15:14 dvladco: Hey, is d here taken from u? I don't see it being explicitly defined anywhere
(edited)
# 2024-11-07 18:33 dave: Is it intended that reports can sit around in rho essentially forever? Seems like some shenanigans might be possible if someone in control of a core leaves a report sitting there long enough for auditors to no longer be able to audit it
# 2024-11-07 18:35 celadari: I think there is a timeout: if a new report comes in and the timeout has passed, the old report is flushed and the new report takes its place in rho
# 2024-11-07 18:45 dave: There is a timeout but the report is only discarded if a new one comes along to replace it. Someone in control of the core could ensure no other report comes along. Might not be abusable but doesn't seem impossible to me and would be easy to fix by forcibly clearing reports after a certain amount of time
# 2024-11-07 18:52 sourabhniyogi: What does it mean to be "in control of the core" if validators are rotated within an epoch at high frequency?
# 2024-11-07 18:54 sourabhniyogi: If you mean the coretime owner, and they aren't using their core at a high frequency, that's not really a concern, I believe.
# 2024-11-07 20:10 dave: Yes I mean the coretime owner. I'm not concerned about the core not getting used, I'm concerned about an invalid report potentially getting accumulated because noone audits it. The core not getting used would be a cost to execute this attack 😅
# 2024-11-08 11:17 gav: I don't think it matters - the only reason it wouldn't go into accumulation is because it's not yet available and if it happens to stick around for ages before 2/3+1 of validators finally all decide it is available, then there's no great harm in initiating the auditing from that point.
(edited)
# 2024-11-08 11:22 gav: Also, it's not the coretime owner who can prevent its becoming available: it's only the guarantors, and they do a switcheroo every 10 blocks.
(edited)
# 2024-11-08 11:25 gav: If two guarantors and the coretime owner coordinated, then they could keep an unavailable WP in rho indefinitely and only choose to make it available at some arbitrary late stage (assuming the coretime owner has indefinite funds). But again, auditing only begins once available is assured.
# 2024-11-08 11:40 dave: > <@gav:polkadot.io> If two guarantors and the coretime owner coordinated, then they could keep an unavailable WP in rho indefinitely and only choose to make it available at some arbitrary late stage (assuming the coretime owner has indefinite funds). But again, auditing only begins once available is assured.
Yes this is the case I was imagining. Of course this assumes it is _possible_ to make the WP available at an arbitrary later point. It's not obvious to me that this wouldn't be possible. If you wait long enough to do this then auditors will be able to assemble the bundle but may not be able to check the report because they have discarded necessary state. What then?
# 2024-11-08 11:43 dave: > <@gav:polkadot.io> I don't think it matters - the only reason it wouldn't go into accumulation is because it's not yet available and if it happens to stick around for ages before 2/3+1 of validators finally all decide it is available, then there's no great harm in initiating the auditing from that point.
My point is that I think it _is_ harmful to kick off auditing so late as we don't keep eg preimages forever
# 2024-11-08 11:44 dave: > <@gav:polkadot.io> What state might they have discarded?
Forgotten preimages for example
# 2024-11-08 11:48 gav: I think the tolerances are pretty high on that, but it’s a lot easier to reason about correctness if it’s prevented outright. We can disable availability for WRs which should have timed out. This should avert any possibility of misuse.
(edited)
# 2024-11-08 11:50 dakkk: > <@dakkk:matrix.org> gav: is it possible to modify the formula numbering system within the graypaper? Currently, when formulas are added or removed, subsequent formula numbers shift, which can be problematic when referencing formulas from code.
>
> A potential solution to this issue is to implement section-based numbering. For instance, the third formula in Chapter 10 would be labeled as 10.3. This approach would also enhance readability when formula numbers are referenced within the paper itself, as it would immediately indicate the formula's location.
If anyone is interested in using this type of numbering in their project, it is sufficient to put this line at the beginning of graypaper.tex:
```
\numberwithin{equation}{section}
```
# 2024-11-10 19:38 basedafdev: in the latest GP, the state-key constructor for (service, hash) in the second term exceeds 32 bytes. E_4(l) ~ H(h) = 36 bytes. Unless i'm missing something here
# 2024-11-14 10:03 dakkk: davxy: in your safrole test vectors, is it possible to know the validators' secret keys?
# 2024-11-14 10:35 davxy: dakkk | JamPy: the i-th key is generated as follows:
1. Interpret i as a 32-bit unsigned integer.
2. Encode it in LE.
3. Repeat the encoded value 8 times (you get a 32-byte data).
- Ed25519 secret: is data
- Bandersnatch secret: use data to call Secret::from_seed in ark-ec-vrfs
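The seed construction davxy describes can be sketched as follows; only the 32-byte seed is shown, and the Bandersnatch step via ark-ec-vrfs is not reproduced here:

```python
import struct

def validator_seed(i: int) -> bytes:
    """32-byte test-vector seed: the 4-byte LE encoding of i, repeated 8 times."""
    return struct.pack("<I", i) * 8

# The Ed25519 secret is this seed directly; the Bandersnatch secret comes from
# passing it to Secret::from_seed in ark-ec-vrfs (not reproduced here).
```

For example, `validator_seed(0)` is 32 zero bytes, and `validator_seed(1)` is the byte pattern `01 00 00 00` repeated 8 times.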
# 2024-11-14 10:47 dakkk: > <@davxy:matrix.org> dakkk | JamPy: the i-th key is generated as follows:
> 1. Interpret i as a 32-bit unsigned integer.
> 2. Encode it in LE.
> 3. Repeat the encoded value 8 times (you get a 32-byte data).
>
> - Ed25519 secret: is data
> - Bandersnatch secret: use data to call Secret::from_seed in ark-ec-vrfs
thank you (y)
# 2024-11-14 11:33 prematurata: > <@davxy:matrix.org> dakkk | JamPy: the i-th key is generated as follows:
> 1. Interpret i as a 32-bit unsigned integer.
> 2. Encode it in LE.
> 3. Repeat the encoded value 8 times (you get a 32-byte data).
>
> - Ed25519 secret: is data
> - Bandersnatch secret: use data to call Secret::from_seed in ark-ec-vrfs
not sure who's behind jamcha.in but I'd suggest we all use this to generate keys
# 2024-11-14 12:17 jaymansfield: > <@davxy:matrix.org> dakkk | JamPy: the i-th key is generated as follows:
> 1. Interpret i as a 32-bit unsigned integer.
> 2. Encode it in LE.
> 3. Repeat the encoded value 8 times (you get a 32-byte data).
>
> - Ed25519 secret: is data
> - Bandersnatch secret: use data to call Secret::from_seed in ark-ec-vrfs
Thank you for this! Needed it as well.
# 2024-11-14 15:11 gav: This (sans-serif) R is a constant; it's a simple product.
(edited)
# 2024-11-14 21:27 gav: > <@cisco:parity.io> The prerequisites of a work report are already a set, so shouldn't the D function just do w_x_p U ...?
Yes. Will be fixed in next release
# 2024-11-15 08:44 dakkk: Your assumption seems correct, but the state serialization is not intended to be "reversible"
(edited)
# 2024-11-15 08:47 prematurata: > <@dakkk:matrix.org> Your assumption seems correct, but the state serialization is not intended to be "reversible"
I agree with you, but according to my (bad) memory this is the only case where we serialize something but are unable to deserialize it... I wonder if we really want to introduce a first here
# 2024-11-15 08:49 dakkk: > <@prematurata:matrix.org> I agree with you, but according to my (bad) memory this is the only case where we serialize something but are unable to deserialize it... I wonder if we really want to introduce a first here
it's not the first case; check the preimage metadata serialization key: we are hashing a hash, and that's not reversible
# 2024-11-15 08:51 prematurata: > <@dakkk:matrix.org> it's not the first case; check the preimage metadata serialization key: we are hashing a hash, and that's not reversible
I knew my bad memory failed me again =) tkz
# 2024-11-15 08:54 dakkk: > <@prematurata:matrix.org> I knew my bad memory failed me again =) tkz
No problem; btw it would be an interesting feature to have a deserializable encoding of the state, so for instance we can use it for storing on disk. But as mentioned earlier it is not the main purpose of this
# 2024-11-15 09:48 xlchen: we don't really need the ability to decode state keys for node usage. so it is mostly useful for dev / debug / indexing needs. the simple solution is just to save the preimage in an aux store next to the state store
# 2024-11-15 09:59 dave: We need to be able to decode state values for warp sync so this does seem like an oversight. In any case if there is an ambiguous encoding that results in two different states having the same state root this is at the least not ideal...
# 2024-11-15 10:02 dave: In general it should be possible to execute a block given just the initial state trie without needing a "real" state to deal with ambiguity
# 2024-11-15 10:09 prematurata: since 0.5.0 is merged do you think it would be better to have formula numbers follow the subchapters as well? ex...
when reading chapter 12.2, the current 0.5.0 first formula is 12.13. what about (not sure it's possible in latex) renaming 12.13 to 12.2.1?
# 2024-11-15 10:10 gav: I’d avoid paying too much attention to the versions of main branch.
# 2024-11-15 10:11 prematurata: yeah i should have better explained myself. nevertheless what do you htink about changing the formula numbering a bit further?
# 2024-11-15 10:12 prematurata: i think it would be beneficial to encode the subsection in the formula numbering as well
# 2024-11-15 11:55 gav: > <@prematurata:matrix.org> since 0.5.0 is merged do you think it would be better to have formula numbers follow the subchapters as well? ex...
>
> when reading chapter 12.2, the current 0.5.0 first formula is 12.13. what about (not sure it's possible in latex) renaming 12.13 to 12.2.1?
To be honest, the bigger size of formula numbers is already screwing up the layouts.
# 2024-11-15 19:58 gav: It avoids the need for the code to be fetched (and fetchable) at time of service creation.
# 2024-11-15 20:01 gav: It’s not like you can’t put a huge preimage into a service’s storage if you’re willing to pay.
# 2024-11-15 20:01 tomusdrw: > <@gav:polkadot.io> It avoids the need for the code to be fetched (and fetchable) at time of service creation.
not sure I get it. If I understand correctly, l is the code length, and the new host call could just end with some error code if l >= W_C.
# 2024-11-15 20:02 gav: But (assuming we keep the check where it’s actually used) then it’s one extra piece of math that’s not really needed
# 2024-11-15 20:03 tomusdrw: I see. Though, you could still have some genesis services with code larger than W_C if it's checked only in new.
(edited)
# 2024-11-15 20:04 gav: Yeah. Though then it’d be the chain publishers fault if it screwed the PVM
# 2024-11-15 20:05 gav: Realistically, implementations will likely need to limit the PVM code size, so if the protocol theoretically supports unlimited blob sizes when added in genesis, it's a bit problematic
# 2024-11-15 20:06 tomusdrw: Well, if it's not explicit in GP it could lead to some cross-implementation issues. My curiosity is satisfied though. So up to you where the check should be :)
(edited)
# 2024-11-17 11:16 stanleyli: Hello guys, I have a question about the page proof. According to the definition in the GP, constructing a Merkle tree appears to follow the diagram provided. In this context, B(1,2) represents the branch of leaf 1 and leaf 2, and its calculation method is described as H($node ⌢ N(v\_{...|v|/2}, H) ⌢ N(v\_{|v|/2...}, H)).
https://graypaper.fluffylabs.dev/#/364735a/354e0135ac01 (edited)
# 2024-11-17 11:18 stanleyli: The method doesn't have any problem, but it leads to a final result formatted as [hash, hash, ..., hash, blob], where the earlier elements are hashes but the last becomes a blob (length = G = 4104). This results in a significantly longer output overall.
Would it make sense to modify the N() function so that it returns H(v) when |v|=1? This way, every node would consistently be a hash, and each element in the trace path would have a uniform hash length.
Or perhaps I've misunderstood something?
(edited)
# 2024-11-18 11:29 gav: > <@stanleyli:matrix.org> The method doesn't have any problem, but it leads to a final result formatted as [hash, hash, ..., hash, blob], where the earlier elements are hashes but the last becomes a blob (length = G = 4104). This results in a significantly longer output overall.
>
> Would it make sense to modify the N() function so that it returns H(v) when |v|=1? This way, every node would consistently be a hash, and each element in the trace path would have a uniform hash length.
>
> Or perhaps I've misunderstood something?
The justification function curly-J uses the constancy function _C_, which basically hashes all data items before passing them to the trace function _T_.
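The effect of pre-hashing the leaves can be sketched with a toy binary Merkle tree; this is not the GP's exact N/T/C functions, and the odd-width padding rule and prefixes here are illustrative assumptions:

```python
import hashlib

def H(b: bytes) -> bytes:
    return hashlib.blake2b(b, digest_size=32).digest()

def root(items):
    # Pre-hash every data item (the role of the constancy preprocessor):
    # after this step the tree only ever handles 32-byte values, so every
    # element of a trace/justification path has uniform hash length.
    nodes = [H(b"$leaf" + it) for it in items]
    while len(nodes) > 1:
        if len(nodes) % 2:
            nodes.append(nodes[-1])  # assumed padding rule for odd widths
        nodes = [H(b"$node" + nodes[i] + nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
    return nodes[0]
```

Without the pre-hash, the sibling supplied at the last trace step would be a raw data blob (e.g. a full 4104-byte segment) rather than a 32-byte hash, which is exactly the non-uniformity stanleyli observed.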
# 2024-11-19 15:50 gav: Graypaper version 0.5.0 is out. No vectors yet but they should be coming soon. There are several important changes, chiefly to the PVM, which is now 64-bit.
# 2024-11-20 11:48 qiwei: In Appendix I, there is "ZG = 2^14: The standard pvm program initialization page size. See section A.7.", but A.7 is still using ZP; maybe it needs updating to use ZG?
# 2024-11-20 15:35 dakkk: Is it correct to deduct that in order to complete Milestone 1 of JAM prize we need to wait version 1.0 of the graypaper? Or can we deliver an 0.5 implementation?
(edited)
# 2024-11-21 17:43 charliewinston14: Good afternoon everyone. I was hoping someone could shed some light on how to generate the accumulate root that's in block history? Not sure how it differs from the MMR peaks I already generated in the history items.
(edited)
# 2024-11-24 15:41 celadari: Hey guys, in the refinement context, what's the difference between the anchor (a) and the lookup-anchor (l)?
# 2024-11-24 17:24 gav: In summary, the anchor is used to anchor the WP (and thus WR) to a particular block which must (still) be in recent history when reported, ensuring that the WP is recent. The lookup-anchor need not be very recent at all (it can be quite old), but must be finalised; this is used to ensure that the point at which the historical_lookup (in refine) operates is both finalised and within the amount of lookup-history which we maintain on-chain.
(edited)
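The two constraints gav distinguishes can be illustrated as a simple slot-arithmetic check; the names, window sizes, and slot arithmetic below are assumptions for the sketch, not graypaper symbols:

```python
# Illustrative validity check for a refinement context: the anchor must be
# recent; the lookup-anchor may be much older but must be finalized and
# inside the on-chain lookup-history window.
def context_valid(anchor_slot: int, lookup_slot: int,
                  current_slot: int, finalized_slot: int,
                  recent_window: int = 8, lookup_window: int = 14_400) -> bool:
    anchor_recent = current_slot - anchor_slot <= recent_window
    lookup_final = lookup_slot <= finalized_slot
    lookup_in_history = current_slot - lookup_slot <= lookup_window
    return anchor_recent and lookup_final and lookup_in_history
```

Note the asymmetry: a very old anchor fails, while an equally old lookup-anchor passes as long as it is finalised and still inside the lookup-history window.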
# 2024-11-24 17:27 gav: You can find out more by reading the section on Reporting (for anchor) and the historical_lookup and its dependent function (for lookup-anchor)
# 2024-11-25 08:11 gav: However specifically and only for standard memory initialization, we use an additional two initialization "page" sizes (2^14 and 2^16) in the math.
(edited)
# 2024-11-25 16:40 dakkk: Got it; so the always accumulated services and queuing would have the majority of the available gas, 341000000 - (341\*100000) = 306900000 (~90%); is it the correct order of magnitude?
(edited)
# 2024-11-25 17:54 davxy: The doubt is mostly about the big 🍰 reserved for queued and always accumulate services
# 2024-11-25 18:52 gav: > <@dakkk:matrix.org> Got it; so the always accumulated services and queuing would have the majority of the available gas, 341000000 - (341\*100000) = 306900000 (~90%); is it the correct order of magnitude?
Well no not necessarily but the fact that always-accumulate exists at all means they can’t necessarily be equivalent.
# 2024-11-26 00:22 mkchung:
https://graypaper.fluffylabs.dev/#/911af30/36cb02362803
Can state-key constructor functions C be enhanced with a "type" identifier? something like:
(i ∈ N28, t ∈ N2) ↦ [i, 0, 0, . . . ,t]
(i, s ∈ NS, t ∈ N2) ↦ [i, n0, 0, n1, 0, n2, 0, n3, 0, 0, . . . ,t] where n = E4(s)
(s, h, t) ↦ [n0, h0, n1, h1, n2, h2, n3, h3, h4, h5, . . . , h26, t] where n = E4(s)
so that we can determine the expected type (δ, a\_s, a\_p, a\_l) from the C key alone?
(edited)
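The proposed constructors above can be sketched directly in code; note this is mkchung's proposal under discussion (with a trailing type octet t), not the graypaper's actual state-key constructor C:

```python
# Two of the proposed 32-byte keys with a trailing type octet t.
def key_chapter(i: int, t: int) -> bytes:
    # (i, t) -> [i, 0, 0, ..., t]
    return bytes([i]) + bytes(30) + bytes([t])

def key_service(i: int, s: int, t: int) -> bytes:
    # (i, s, t) -> [i, n0, 0, n1, 0, n2, 0, n3, 0, 0, ..., t] where n = E4(s)
    n = s.to_bytes(4, "little")
    body = bytes([i]) + b"".join(bytes([b, 0]) for b in n)  # 9 bytes
    return body + bytes(31 - len(body)) + bytes([t])
```

The type octet lets a reader of the raw trie classify a key without context, at the cost of one byte of key space, which is precisely the trade-off gav objects to for the information-dense third constructor.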
# 2024-11-26 06:49 mkchung: "T" is intended to be a type identifier (octet) so that it can potentially support up to 255 different types in state merklelization.
C1-C15 and account related states (i.e., δ, a_s, a_p, a_l) are all different types. I think it's probably beneficial to have one octet from C key as a way to "validate" what encoded struct we are expecting (if collision is not a concern)?
(edited)
# 2024-11-26 06:55 mkchung: Right now it's quite difficult to differentiate account related states (i.e., δ, a_s, a_p, a_l) from C key alone without this "type identifier".
# 2024-11-26 07:23 gav: The first two key compositions should be easy to spot given the zeroes. The last one is harder, sure because of the information density, but precisely because of this we can’t really afford to waste a byte as that would further reduce the cryptographic security.
# 2024-11-26 21:35 dave: FWIW it is possible for a node to maintain the "original" state keys if it executes every block from genesis, but in the case of warp sync this is _not_ possible because not all bits of the original state keys influence the state root and so they cannot all be proven
# 2024-11-26 21:37 dave: One further issue is that in the case of the preimage lookup dictionary l, the key transformation involves a hash and so cannot be reversed
# 2024-11-26 21:39 dave: In that case you could track the original key and prove this, but if we really want to address this issue I think the simpler solution is to simply include the original key in the transformed value
# 2024-11-26 21:43 dave: We could possibly do a similar thing for the discarded key bits
# 2024-11-26 21:43 dave: Of course this adds complexity, which I think is the reason Gav hasn't done it (not that I can speak for him)
# 2024-11-26 23:04 xlchen: there is no reason why we can't have an extra aux store to store the preimages / hash mappings to support reverse lookup. this is something like archive node feature that's optional but nice to have thing. I don't think it needs be spec'ed
# 2024-11-27 01:36 sourabhniyogi: Since CE129 is not intended for debug purposes but for warp-sync, this issue is not relevant, thank you. Closed the issue =). But the general need for a state representation for STF testing (and debugging state) is still there.
For the syncing problem, we do need a clear understanding of how implementations should do the warp sync with CE129 and how AuditDA (1 hour) and ImportDA (28 days) should work at similar timescales, right? We imagine syncing could be skipped for AuditDA but ImportDA cannot. We doubt teams need to implement it until getting through M1+M2, right?
We should spec out the RPC endpoints (with JSON request/responses) to cover the debugging case of state sharing without regard to CE 129 warp syncing... and the top 10 others.
(edited)
# 2024-11-27 01:46 dave: Don't understand what you mean by skipping sync for DA? Could you elaborate?
# 2024-11-27 03:35 sourabhniyogi: A new validator who joins a JAM network has to get all chunks associated with its validator index -- for both AuditDA (1 hour) and ImportDA (28 days). For Audit DA, if there is some upper bound on how many new validators can join (not sure where this will happen) then within an hour of participating it's up to date -- so a new validator could "skip" this Audit DA syncing operation since there is so much redundancy. Or it could look through the last hour's worth of work reports and ask the guarantors for its chunks.
But for Import DA, the new validator should have some fast way of getting all the chunks it is responsible for holding. It could look through the last 28 days of work reports and ask the guarantors for its chunks... but I am not sure that it can.
# 2024-11-27 09:16 dave: Where did you get that from? Once enough validators claim receipt of their shards I do not believe there is any requirement for the guarantors of a report to send out more shards. There is also no mechanism for validators to report that they have obtained the shards after this point, although you could argue that isn't really necessary.
# 2024-11-27 16:47 sourabhniyogi: Consider the situation where on Dec 1 there is a segment exported in a V=1024 network. Under normal conditions, up to 2/3 of the network can be dead and the segment will be available in ImportDA.
BUT if every day from Dec 1 to 11, 100 validators leave and 100 new join to take their place, in such a way that the entire network has been completely replaced, but
(a) no one has the data to reconstruct the segment
(b) no one in the new set has used any JAMNP method to fetch their chunk from the guarantor,
what happens?
I think the present answer is "there is no plan".
If the plan is "oh we have 28 day unbonding periods", what happens with unbonding queues that are way shorter than that?
This sets my expectation that we need a way to have ImportDA syncs for a changing of the guard, where one validator takes the place of another at a specific validator index.
(edited)
# 2024-11-27 16:54 dave: There is a question of whether there are sufficient incentives for old validators to maintain import DA data and make it available to the current validator set for the full 28 days. I don't know the answer to that.
# 2024-11-27 16:56 dave: AFAIK though there is no mechanism to "hand over" import DA data to new validators
# 2024-11-27 16:57 gav: Indeed there is not. Thus far the assumption is that it doesn’t matter.
# 2024-11-27 17:06 sourabhniyogi: Concerning the accumulation queue, we are struggling to come up with a test case where we have 12.3
https://graypaper.fluffylabs.dev/#/911af30/167100167100 filled with anything because 11.38
https://graypaper.fluffylabs.dev/#/911af30/156001156001 makes it difficult to impossible to do so.
Concretely, if package M has package F as a prerequisite, because of 11.38 requirement, M can't even get a work report in for 12.3 to matter. Put another way, 11.38 has prereqs constraining refine of M before 12.3 accumulation can matter.
How can we get a good test case for accumulation queue then?
We must be misunderstanding something here.
# 2024-11-27 17:09 dave: AIUI the dependencies stuff in the backend is to deal with availability of work reports happening out of order
# 2024-11-27 17:10 dave: So say you have two WPs A and B, with B depending on A. A and B can be reported in the same block. If B becomes available before A, then it will enter the "ready" queue but must wait for A to be made available and accumulated before it can be accumulated itself.
(edited)
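The ordering behaviour dave describes can be modelled with a toy ready queue; the function and names are illustrative, not graypaper definitions:

```python
# Toy model: a report enters the ready queue when it becomes available, and
# is accumulated only once every prerequisite has itself been accumulated.
def accumulation_order(availability_order, deps):
    accumulated, ready = [], []
    for wp in availability_order:
        ready.append(wp)
        progress = True
        while progress:
            progress = False
            for r in list(ready):
                if all(d in accumulated for d in deps.get(r, ())):
                    accumulated.append(r)
                    ready.remove(r)
                    progress = True
    return accumulated

# B depends on A but becomes available first, so B waits in the queue:
accumulation_order(["B", "A"], {"B": ["A"]})  # → ["A", "B"]
```

With in-order availability (`["A", "B"]`) the queue never holds anything, which matches sourabhniyogi's observation that delayed assurances are needed to exercise it.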
# 2024-11-27 17:15 dave: That's a requirement for reporting. Availability happens after this
# 2024-11-27 17:17 sourabhniyogi: If B doesn't get reported until A is reported, there is no way for B to become available before A, right? No "If B becomes available before A" possibility exists then, thus no way for the accumulation queue to be filled with B.
(edited)
# 2024-11-27 17:19 dave: Why do you say that? The availability of the two packages is pretty independent
# 2024-11-27 17:20 dave: For one if the guarantors for A go offline then A just won't become available at all
# 2024-11-27 17:31 sourabhniyogi: 1. Guarantors can assure the data as soon as they have generated the work report.
2. For non-guarantors, they will only be able to assure the data based on observing a guarantee extrinsic that includes a work report.
The majority of the network (all but 2 or 3 validators out of V) is in 2.
So for the B-depends-on-A:
* By 11.38 requirement on reporting/guarantees, there is no way to generate a work report for B until A has a work report
* Because of 2, the majority of the network cannot provide an assurance
* Because the above "the majority of the network cannot provide an assurance", we are having difficulty generating any situation where 12.3 accumulation queue is filled with anything.
What are we getting wrong in the above chain of logic? Is there a test case where 12.3 accumulation queue is filled with anything?
Note that we ARE able to generate a test case like this:
(1) |E_G|=2, where refine of A+B are executed _in the same slot_
(2) |E_A|=V, where all V validators assure both A+B and then due to the "B depending on A" first A is accumulated (having no dependencies) then B (having a dependency on A)
# 2024-11-27 17:35 sourabhniyogi: and what we have to do to get a test case working is:
(1) |E_G|=2, where refine of A+B are executed in the same slot (that is, 11.38 is NOT a problem)
then simulate a forced delay of assurances in order to get our accumulation queue filled. Is this the only way?
# 2024-11-27 17:36 dave: What do you mean by "Because of 2, the majority of the network cannot provide an assurance"
# 2024-11-27 17:39 dave: If in your test case all reports are assured immediately then no you probably won't get anything in the queue. So yes your test needs to simulate assurances being delayed. Not sure why this is impossible?
# 2024-11-27 17:40 sourabhniyogi: Not impossible -- but our belief is that this is the only way. Can you think of another?
# 2024-11-27 17:41 dave: > <@dave:parity.io> AIUI the dependencies stuff in the backend is to deal with availability of work reports happening out of order
Not aware of one. As I said, I believe the entire point of the dependencies stuff is to deal with this happening.
# 2024-11-27 17:41 sourabhniyogi: ok, great, we don't have a misconception, thank you!
# 2024-12-02 08:54 vinsystems: Is it possible to have different epoch judgments in the same dispute extrinsic?
(edited)
# 2024-12-05 11:05 luke_fishman: Hello, guess I am missing something.
In Refine Invocation we only ever call the Argument Invocation with **x = (∅, \[\])**.
This then passes on through ΨM and ΨH, and eventually one of the Omegas in B.8 is getting called with this pair as an argument.
So if we only ever pass an "empty" pair, how will the internal Omegas ever operate on a non-empty argument?
(edited)
# 2024-12-05 11:17 luke_fishman: ok I think I can answer myself (sorry for the spam):
by using a series of host calls I can progressively "fill up" the context pair.
For example, I could use machine => poke => peek in order to create a machine in the **m** dictionary, write some program into it, then verify it is there using peek
# 2024-12-06 08:05 gav: > <@amritj:matrix.org> We use the paged proof hashed segments for building segment justification.
>
> The segment root is built with a constancy preprocessor that prepends "$leaf" to the data.
>
> Shouldn't we also prepend "$leaf" here?
>
>
https://graypaper.fluffylabs.dev/#/911af30/1a18011a1f01
Yes in the present system it would be needed. I might alter things so that for the pages proofs tree we don’t bother with the leaf/node prefixes at all since they’re redundant here.
# 2024-12-06 15:05 amritj: One more suggestion: I think we should also check that the export count in the work-package items equals the number of actually exported segments.
Otherwise, a work item can export more segments than Wm (2\*\*11) and I think may even overwrite the export-segment index of the next work item
https://graypaper.fluffylabs.dev/#/911af30/1a0c021a0c02 (edited)
# 2024-12-06 16:33 gav: > <@amritj:matrix.org> One more suggestion, I think we should also check the export count in the work package items to be equal to the length of the actually exported segments.
>
> Otherwise, work item can export more segments than Wm (2\*\*11) and I think may even overwrite index of the export segment of next work item
>
>
https://graypaper.fluffylabs.dev/#/911af30/1a0c021a0c02
It's technically impossible to validly export more items than W_M already, due to the condition in the export host-call (the only way to validly append an export)
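The bound gav refers to can be sketched as a guard on the only append path; this is an illustrative model of the check, not the host-call's actual signature or semantics:

```python
W_M = 2 ** 11  # maximum exported segments per work package

def export_segment(exported: list, segment: bytes) -> bool:
    """Append a segment if the per-package limit allows it; return success."""
    if len(exported) >= W_M:
        return False  # export host-call refuses: list is already full
    exported.append(segment)
    return True
```

Since every valid export goes through this one gate, the total can never exceed W_M regardless of what counts the builder declares, which is why gav doubts a misreported count can hurt anyone but the builder.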
# 2024-12-06 17:06 amritj: Yes, we are checking whether the export segment list is full, i.e. whether the export segment offset + current work item export count exceeds W\_M (2\*\*11), but the export segment offset is calculated by summing up the export counts of the work items stated in the work package built by the builder.
But we are not validating that the export count equals the number of segments actually exported by the work items after refine.
Maybe I am missing something?
(edited)
# 2024-12-06 17:22 gav: > <@amritj:matrix.org> Yes, we are checking if the export segment list is full if the export segment offset + current work item export count is more than W\_M (2\*\*11) , but the export segment offset is calculated by summing up the export counts of work items mentioned in the work package built by the builder
>
> But we are not validating if the export count is equal to no of segments actually exported by the work items after refine
>
> Maybe I am missing something?
https://github.com/gavofyork/graypaper/pull/160
# 2024-12-06 17:24 gav: I'm not (yet) totally convinced that the builder can really break anything for anyone other than themselves by misreporting these counts.
(edited)
# 2024-12-07 10:58 amritj: Question:
In auditing, we consider the report audited if the report has no negative judgement and there exists some tranche where all validators required to audit have a positive judgement about the report.
In case some validator announced an audit but failed to deliver the judgement, new audits are announced to cover that.
So the new tranche A_n will be = old tranche auditors + new auditors just announced - auditors who failed to deliver a judgement?
https://graypaper.fluffylabs.dev/#/911af30/1eab001eae00 (edited)
# 2024-12-07 11:26 gav: Once it has been determined by the auditor that they will audit the new tranche, the VRF is used to determine which cores (and thus which WRs) should be audited. These are announced and by doing so the node is obliged to publish a judgement. They receive corresponding announcements and, later, judgements, from other auditors and will audit a new tranche only if by the tranche's time limit they have not received a judgement for each announcement.
(edited)
# 2024-12-07 11:37 amritj: In the new tranche, auditors will only audit reports that were announced to be audited in the previous tranche but whose judgement was not received, right?
(edited)
# 2024-12-07 11:46 amritj: Ahh, gotcha: the new audit tranche only contains reports that did not receive all judgements; other reports that did receive all judgements are excluded from the new tranche list because they are already validated. Thanks, I got a bit confused!
(edited)
# 2024-12-07 11:58 gav: So we perceive m_n missing audits at tranche n: those two formulae determine which of the work-reports (w) we will be required to audit this tranche.
# 2024-12-07 11:58 gav: For any tranche, as the number of perceived missing-judgements increases, we audit a greater selection of the overall work-reports.
(edited)
# 2024-12-07 11:59 gav: We don't (necessarily) audit the report(s) with the missing judgement(s).
(edited)
# 2024-12-07 12:00 gav: And we _always_ pass a judgement on any audits we have announced ourselves.
(edited)
# 2024-12-07 12:03 amritj: Thanks, I was confused about rechecking that all previous-tranche WRs are present in J\_T to consider this WR audited, but as per eq 17.16 we only include in the new tranche those work reports that did not have all announced judgements, so there is no need to check work reports that are not included in the next tranche again!
(edited)
# 2024-12-07 19:04 gav: > <@amritj:matrix.org> Yes, we are checking if the export segment list is full if the export segment offset + current work item export count is more than W\_M (2\*\*11) , but the export segment offset is calculated by summing up the export counts of work items mentioned in the work package built by the builder
>
> But we are not validating if the export count is equal to no of segments actually exported by the work items after refine
>
> Maybe I am missing something?
https://github.com/gavofyork/graypaper/pull/160/commits/184515d80354f96f0410b48cea950ed7c9e4f03f
# 2024-12-07 19:05 gav: This will be updated to a new system; WPs should not be invalidated by a bad export count.
# 2024-12-08 03:47 amritj: I asked this question in Jam room, not solved there so asking here too:
Question:
The segments are erasure-coded and distributed to the validators. Each validator receives a single shard out of the 1,023 shards, determined by their validator index. Therefore, the shard index of the shard they receive equals to their validator index.
According to the JAMNP protocol, to fetch data from the assurers, we need to provide each assurer with the shard index we require from them.
The exported segments are expected to remain available for 28 days. During this period, it is assumed that more than 341 validators will remain consistent. However, there is still a significant likelihood that validator indexes may change during this time due to the addition or removal of validators.
I asked if we have to remember historical validator sets to know who to request shards from, which David Emett confirmed
But we are not able to find any way to distribute this data to a new validator added to the network, and without this data he won't be able to compute work reports.
I proposed not requiring the shard index in the request to fetch data from an assurer; instead, the assurer himself gives us the shard index, which we can of course verify. Then we only need the erasure root and segment index. But as new validators are added to the system, if we randomly choose 342 assurers to request data from, some of the responses will be empty. So either we request from ~400-500 validators (the churn rate, as confirmed by sourabhniyogi, is low, so we will most probably receive more than 342 shards), OR we somehow store which validators were expected to be active in the requested WR segment's epoch, by storing a status component like we do for preimages in the validator metadata, which I am not so sure about
The above solution is only for new validators, old validators will use the normal method as they know which validators have which shard
Is this a good solution to the problem or is there a better answer we can't see?
(edited)
# 2024-12-08 04:54 amritj: Btw, a highly unlikely attack vector, but I'm still curious if it's possible:
For a report to be marked valid, it goes through two checks: first, guaranteeing, and second, auditing.
To pass the guaranteeing stage, we need two guarantors to sign. To pass the auditing stage, we need all 10 randomly selected auditors to give a positive judgment about the report.
Let's say one of the guarantors is a malicious actor and tries to bribe their fellow guarantor to sign an invalid report. The bribe would need to be large enough to offset the guarantor's potential loss if they are found guilty. Since only one more actor needs to accept the risk and the bribe, let's assume the other guarantor accepts the bribe.
After guaranteeing, we have to somehow pass the auditing stage. To do this, the attacker sets up an off-chain bribe market offering the auditors of this report a deal: if all of them announce nothing, and prove they are auditors of the report after the invalid report is marked valid, they will receive a huge sum of money.
The auditors face no risk even if they are the only one remaining silent, as they aren't passing a positive judgment. If all 10 auditors stay silent and no one announces anything, the report will automatically be marked valid.
(edited)
# 2024-12-08 05:30 prematurata: A couple of bugs I found in 0.5.2:
- The WorkItem's new a field (accumulationGasLimit) is not taken care of in the codec defined in C.26
- A.34 has been changed and the Q fn in A.32 renamed to Z... yet some parts of the graypaper still report the old nomenclature (paragraph after sbrk & A.33)
# 2024-12-08 08:35 gav: > <@prematurata:matrix.org> A couple of bugs I found in 0.5.2:
> - The WorkItem's new a field (accumulationGasLimit) is not taken care of in the codec defined in C.26
> - A.34 has been changed and the Q fn in A.32 renamed to Z... yet some parts of the graypaper still report the old nomenclature (paragraph after sbrk & A.33)
Thanks will be fixed in next revision
# 2024-12-08 08:38 gav: > <@amritj:matrix.org> I asked this question in Jam room, not solved there so asking here too:
>
> Question:
>
> The segments are erasure-coded and distributed to the validators. Each validator receives a single shard out of the 1,023 shards, determined by their validator index. Therefore, the shard index of the shard they receive equals to their validator index.
>
> According to the JAMNP protocol, to fetch data from the assurers, we need to provide each assurer with the shard index we require from them.
>
> The exported segments are expected to remain available for 28 days. During this period, it is assumed that more than 341 validators will remain consistent. However, there is still a significant likelihood that validator indexes may change during this time due to the addition or removal of validators.
>
> I asked if we have to remember historical validator sets to know who to request shards from, which David Emett confirmed
>
> But we are not able to find any way to distribute this data to a new validator added to the network, and without this data he won't be able to compute work reports.
>
> I proposed to not require shard index in the request to fetch data from assurer but assurer himself give us the shard index which we can ofcourse verify is correct or not, so we only need the erasure root and segment index, but as some new validators added to the system if we randomly choose 342 assurers to request data from some of the responses will be empty so either we can request from ~400-500 validators (churn rate as confirmed by sourabhniyogi is low, so we will most probably receive more than 342 shards) OR somehow store which validators were expected to be active in the requested WR segment epoch by storing status component like we do in preimages in the validator metadata which I am not much sure about
>
> The above solution is only for new validators, old validators will use the normal method as they know which validators have which shard
>
> Is this a good solution to the problem or is there a better answer we can't see?
We may consider storing historical (28 days worth) of validator keys in state.
# 2024-12-08 08:43 gav: Honest Non-auditors would see the bribe market and realise they can profit by speculatively evaluating the report, finding it is invalid and making the judgement (honest participants in a dispute get a small reward). Once there is a single negative judgement in the system, all validators must audit. Expected loss of the guarantors is therefore huge and it becomes (wildly) unprofitable to attempt the attack.
(edited)
# 2024-12-08 09:48 amritj: Yeah, that's the way I thought it would fail, but what if only the auditors of that work report know about the bribe?
One way: after the auditors announce that they are going to audit the report, the attacker may reach all the WR auditors to pass a positive judgement.
But in this way, even if an auditor is willing to accept the bribe, they don't know if the others will do the same, so they carry a huge risk.
But if someone has built the logic to accept the bribe they are not a good auditor, and all these bad auditors could also, before announcing to everyone which report they are going to audit, tell the bribe market about it; if the bribe market finds ~30 such auditors working on the same report it may ask them not to send the announcement and let the report pass, otherwise everything goes normally. This could go on for years, waiting for the perfect moment.
Kinda filmy, and highly unlikely
(edited)
# 2024-12-08 10:00 amritj: There could be some hacker group that over the years injected this kind of virus into some of the validators' machines, working in the background without the validators knowing about it, waiting for the perfect chance
(edited)
# 2024-12-08 10:24 gav: It’s an independent VRF per validator, stemming from both an on-chain seed and their secret key. There’s no way of knowing which validator will self-select before they announce.
(edited)
# 2024-12-08 10:25 gav: And once the announcement happens, then it’s over as the escalation will begin if there’s no judgement.
(edited)
# 2024-12-08 10:26 gav: As the attacker, you would need to know that the auditors were all compromised before committing to guaranteeing the invalid report. And this is impossible as it comes from 1023 VRFs whose entropy is secret (because at least some are not compromised).
(edited)
# 2024-12-08 10:28 gav: Of course if you knew enough (> 98%) of the validator secret keys you might be able to pull this off with non-negative expected profit, but we already assume <33% compromised nodes.
(edited)
# 2024-12-08 10:52 amritj: Yeah, I calculated the probability, and the chance of all 30 auditors being in the compromised set, even if 33% of the validators are compromised, is 2.01×10^-15. That's extremely low.
To have a ~50% chance, ~98% of the validators would need to be compromised.
I think I must work on my math skills before asking these questions 😅
(edited)
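The figure can be checked with a hypergeometric draw over the numbers in the message above (1023 validators, a 341-validator compromised set, 30 auditors):

```python
from math import comb

VALIDATORS, COMPROMISED, AUDITORS = 1023, 341, 30

# Probability that all 30 initially self-selected auditors fall inside the
# compromised third, drawing without replacement (hypergeometric):
p = comb(COMPROMISED, AUDITORS) / comb(VALIDATORS, AUDITORS)
print(f"{p:.2e}")  # on the order of 2e-15, in line with the figure above
```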
# 2024-12-08 11:09 gav: All you know is that every (honest) validator has a roughly 1/100 chance of self-selecting for any given core/WR.
(edited)
# 2024-12-08 11:11 gav: i.e. ~99% chance of not, so assume that you compromise the secret keys of >90% of validators, that means there's still ~100 validators who are honestly self-selecting.
# 2024-12-08 11:11 gav: the chance that none of them unexpectedly audits your bad WR is therefore 0.99^100 = 36%, so you've around 2/3 chance of being discovered.
(edited)
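The 36% figure follows directly, assuming ~100 honest validators each with an independent ~1/100 self-selection chance:

```python
# Chance that none of ~100 honest validators self-selects to audit the bad
# work-report, each having an independent ~1/100 chance:
p_silent = 0.99 ** 100
print(f"{p_silent:.2f}")  # ≈ 0.37, i.e. roughly a 2/3 chance of discovery
```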
# 2024-12-08 11:12 gav: Once discovered, we can expect even the self-interested nodes acting under bribes to act honestly unless they're already bribed with the auditor slash amount (~10% of stake).
(edited)
# 2024-12-08 11:17 gav: So you have to totally cover the slash of 2 guarantors (2x 100% stake), probably some of the third guarantor (who would have elevated chances of self-selecting out of spite of being left out of the guaranteeing process), and still have such a great hack that you compromise 90% of the secret keys (which in due course we can assume will be technically impractical owing to multiple implementations and hardware key support) or somehow bribe 90% to voluntarily give up their secret keys knowing that they could get slashed if the secret is misused (e.g. as a guarantor).
# 2024-12-08 11:18 gav: And even then your attack would need to give you _guaranteed_ 2x revenue on that total cost just for it to have expected break-even
(edited)
# 2024-12-08 11:25 gav: In any case I think all these arguments are made in a more rigorous format in the ELVES paper, in case you want to dive deeper into it
# 2024-12-08 11:28 amritj: I understand it now, thanks man for answering my dumb questions with such great explanations, really appreciate it.
My mind was stuck on the idea that we only need to disrupt the network between validators at the exact time of the audit announcement, making A_n empty, and so pass the audit test.
But the number of validators required to pull this off is huge; the only (next to impossible) case I can think of is if more than 95% of the validators use the same Internet Provider, and the provider performed a huge targeted attack on the JAM protocol validators, stopping their communication at the exact time. Not much sure about this either.
(reading the ELVES paper now)
(edited)
# 2024-12-08 11:30 gav: Good that there's some effort to comprehend the underlying game theory! Less trust:)
# 2024-12-08 18:51 gav: > <@amritj:matrix.org> I asked this question in Jam room, not solved there so asking here too:
>
> Question:
>
> The segments are erasure-coded and distributed to the validators. Each validator receives a single shard out of the 1,023 shards, determined by their validator index. Therefore, the shard index of the shard they receive equals to their validator index.
>
> According to the JAMNP protocol, to fetch data from the assurers, we need to provide each assurer with the shard index we require from them.
>
> The exported segments are expected to remain available for 28 days. During this period, it is assumed that more than 341 validators will remain consistent. However, there is still a significant likelihood that validator indexes may change during this time due to the addition or removal of validators.
>
> I asked if we have to remember historical validator sets to know who to request shards from, which David Emett confirmed
>
> But we are not able to find any way to distribute this data to a new validator added to the network, and without this data he won't be able to compute work reports.
>
> I proposed to not require shard index in the request to fetch data from assurer but assurer himself give us the shard index which we can ofcourse verify is correct or not, so we only need the erasure root and segment index, but as some new validators added to the system if we randomly choose 342 assurers to request data from some of the responses will be empty so either we can request from ~400-500 validators (churn rate as confirmed by sourabhniyogi is low, so we will most probably receive more than 342 shards) OR somehow store which validators were expected to be active in the requested WR segment epoch by storing status component like we do in preimages in the validator metadata which I am not much sure about
>
> The above solution is only for new validators, old validators will use the normal method as they know which validators have which shard
>
> Is this a good solution to the problem or is there a better answer we can't see?
David Emett: might have an opinion on this; personally I'd be tempted to store 28 days' worth of historical validator key sets on-chain (full key 336 bytes, 10 validators change per epoch and 4 bytes to redirect to last change = 9 MB), as well as 28 days' worth of entropy (32 bytes per 6 seconds = 13 MB)
# 2024-12-09 13:24 dave: Is historical entropy needed? Not clear to me what for. Re historical validator info, my preference was to just expand the epoch mark to include Ed25519 keys and metadata, as this seems simpler (from a spec perspective at least) and the chain history will be required anyway to construct the SR->ER map
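As a sanity check on the 13 MB entropy figure quoted above, assuming one 32-byte entropy value per 6-second slot:

```python
# Sanity check of the "28 days of entropy = 13 MB" arithmetic above,
# assuming one 32-byte entropy value per 6-second slot:
slots = 28 * 24 * 60 * 60 // 6     # 403,200 slots in 28 days
entropy_mb = slots * 32 / 1e6
print(f"{entropy_mb:.1f} MB")      # 12.9 MB
```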
# 2024-12-08 18:52 gav: Then, you can select the shards as desired and find (or cross-reference) the validator's info fairly easily.
# 2024-12-08 18:55 gav: > validator indexes may change during this time due to the addition or removal of validators
Index churn is somewhat problematic when considering our current EC reconstruction algorithms. The staking chain would need to be aware of this and enforce index-validator affinity, so it wouldn't assign a different index to the same validator from hour to hour without good cause.
(edited)
# 2024-12-08 19:00 gav: > old validators will use the normal method as they know which validators have which shard
Only as long as they know which era/block the segments-root belongs to and they know the validator information for that time?
(edited)
# 2024-12-09 04:48 amritj: > <@gav:polkadot.io> > old validators will use the normal method as they know which validators have which shard
>
> Only as long as they know which era/block the segments-root belongs to and they know the validator information for that time?
Yeah, old validators need to maintain a mapping for historical validators.
And new validators can now easily too if they can fetch historical validator set data from other nodes and verify it with epoch mark
(edited)
# 2024-12-11 20:57 danicuki: I have a doubt about this formula:
(7.2) β† ≡ β except β†[|β| − 1]_s = H_r
What is the correct interpretation:
1. β† is a copy of β except that in the last element of β† we replace s for the value of Hr
2. β† is a copy of β only if the value of s in the last element is equal to Hr, otherwise β† is nil
https://graypaper.fluffylabs.dev/#/5b732de/0fd5010fde01 (edited)
# 2024-12-11 20:59 dave: It's just putting the correct state root in for the last block, which will be 0 in beta (prior state)
# 2024-12-11 21:00 danicuki: So, in this case, wouldn't it be more precise to put β†[|β| − 1]_s ≡ H_r
(edited)
# 2024-12-11 21:02 danicuki: and then comes another question: in the reports vectors, the prior_state_root (Hr) is not given in the header, which means it is null. Should I replace the existing value with null? Because the post_state vectors don't do this
# 2024-12-11 21:06 celadari: I don't understand the second part of the question regarding reports. Can you tell which equations you're refering to ?
# 2024-12-11 21:12 danicuki: If I have to replace β† last element, Hr should not be null in the vector.
# 2024-12-11 21:14 celadari: Hr is not null,
But beta[|beta|-1] is null (more specifically 32 bytes of zeros)
# 2024-12-11 21:29 celadari: Yeah, I wasn't super precise but yeah Hr is part of the header
# 2024-12-11 21:29 danicuki: My interpretation is:
1. take the last element of history and put H_r in s (which was zero in the last iteration) (7.2)
2. create a new element for history (7.3)
3. add this newly created element at the end of the list, removing the first element if the list is bigger than the constant H (7.3)
(edited)
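The three steps above can be sketched as follows; the field and function names are made up for illustration, and a history bound of H = 8 entries is assumed:

```python
from dataclasses import dataclass, replace

H = 8  # assumed bound on recent-history length

@dataclass(frozen=True)
class HistoryItem:
    header_hash: bytes
    mmr: tuple         # accumulation-result MMR (simplified)
    state_root: bytes  # the 's' component, zeroed when first appended
    reported: tuple    # work-report dict (simplified)

def update_recent_history(beta, prior_state_root, new_item):
    # step 1 (7.2): beta-dagger is a copy of beta, except the last item's
    # state root is replaced by H_r (it was zeroed when appended)
    beta_dagger = list(beta)
    if beta_dagger:
        beta_dagger[-1] = replace(beta_dagger[-1], state_root=prior_state_root)
    # steps 2-3 (7.3): append the new item and keep at most H entries
    beta_dagger.append(new_item)
    return beta_dagger[-H:]
```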
# 2024-12-11 21:32 danicuki: actually, my main doubt is: are these vectors affecting the recent history state component? I think they are not
# 2024-12-11 21:33 danicuki: I have the same question for the authorizer pool (I think someone already asked this, I didn't see the answer)
# 2024-12-11 21:35 danicuki: My local tests were not passing because I was trying to match my post state recent history with the ones in the vectors, which actually are not correct, they are supposed to be ignored in these tests
# 2024-12-11 21:48 davxy: No. Recent block history last block state root is mutated by the recent block history STF.
(edited)
# 2024-12-11 21:49 davxy: > <@danicuki:matrix.org> I have the same question for the authorizer pool (I think someone already asked this, I didn't see the answer)
Authorizer pool is now mutated by the "authorizers" STF. Vectors not published yet.
# 2024-12-11 21:50 danicuki: But I see authorizers (alpha) being changed in reports vectors, no?
# 2024-12-15 18:43 dave: Not sure what you mean by decode? This stuff is for calculating the state root from the state, which is a one-way operation
# 2024-12-16 21:50 jaymansfield: Is there a target date for having version 1.0 of the GP ready?
# 2024-12-16 22:22 jaymansfield: Thank you. I’m assuming M1 submissions can’t be before this date then? Or will the conformance tool come before this?
# 2024-12-17 08:34 gav: > <@jaymansfield:matrix.org> Thank you. I’m assuming M1 submissions can’t be before this date then? Or will the conformance tool come before this?
Correct. From the published rules:
> Prizes are paid no earlier than the ratification by the Polkadot Fellowship of version 1.0 of the JAM protocol. Payment of the prize by the Web3.0 Foundation is conditional upon the successful completion of all KYC/AML processes
# 2024-12-18 00:54 xlchen: I am a bit confused about what is a work package bundle. I can't find a formal definition of it
# 2024-12-18 01:06 xlchen: I see. Thanks. It would be better if it were a bit more explicit
# 2024-12-18 14:58 prematurata: davxy afk: since you're the bandersnatch god :) I have a question...
`Hs` has a direct dependency on `Hv`, which is included in the context to sign as its last element (6.16 and 6.15).
`Hv` seems to also depend on `Hs`.... well, it depends on `Y(Hs)` (see 6.17). While i know it needs to match the ticket in case we're not in fallback mode, i believe there is some cryptography magic i didnt fully understand to know what `Y(Hs)` would be without really having `Hs`.... otherwise we would have a circular dependency
(edited)
# 2024-12-18 15:01 prematurata: If i understand properly this magic should be using G.4 with the same message, privkey and **empty context** to get a RingVRF proof on which i could then apply G.5 and get the same result
(edited)
# 2024-12-18 15:10 alxmirap: I may be able to answer that, I believe. The encoding in equations 6.15/6.16 uses the function \Epsilon_U(H), defined in Appendix C as being the full \Epsilon(H) without the H_s term (equations C.19 and C.20).
I believe this encoding function was created precisely for the case of omitting this field to avoid the circular dependency.
# 2024-12-18 15:13 alxmirap: This function is introduced in section 5, just before Eq 5.1.
# 2024-12-18 15:20 dave: The VRF output is independent of the additional context, so there isn't really a circular dependency
# 2024-12-18 15:21 prematurata: i think i didnt explain myself
- \Epsilon_U(H) contains `Hv`
- `Hs` uses \Epsilon_U(H)
- `Hv` uses `Y(Hs)`
# 2024-12-18 18:12 yu2c: Hello, question here:
\mathbb{H}^0 = [0]_32 is called the "zero-hash" in Ch 3.8.1.
However, \mathbb{H}_0 is called the "zero hash".
Is there any difference between these terms?
Does "zero hash" also refer to [0]_32 or does it take 0 as the input of the hash func?
# 2024-12-18 21:25 gav: Both terms are equivalent in meaning but one is a typo and will be corrected to the other. For now please treat them as equal.
(edited)
# 2024-12-18 22:55 basedafdev: In appendix D.2, the state serialization spec for $\pi$ is a bit ambiguous to me. \pi is defined as a tuple of prior stats for all validators and current stats for all validators; how is E_4(\pi) possible?
# 2024-12-19 00:06 gav: But yeah I can look into reformulating this as someone else had a similar query.
# 2024-12-19 13:29 luke_fishman: OK then i must be missing something.
g (of the same column) is ω9
g (gas cost) = 10 + ω8 + 2^32 ⋅ ω9 (clearly bigger than ω9)
so:
if Q < ω9 (the case to return HIGH) => Q < g (gas cost) => not enough gas => fallback to Eq. B.20 => registers unchanged
so we can never get ω7' = HIGH
(edited)
# 2024-12-19 19:24 gav: The gas cost for transfer is only a placeholder. I’ll make it more sensible soon.
# 2024-12-19 19:25 gav: But the g it is referring to is not the instruction gas cost on the left but the let expression above.
# 2024-12-20 09:26 gav: @room Version 0.5.3 is released, containing quite a few corrections and clarifications. No really major changes this time, though we now have *active removal of reports on timeout*.
(edited)
# 2024-12-21 18:54 clearloop: **9.3 Account Footprint and Threshold Balance - formula (9.8)**
What about changing the `a_l` here to `a_o`, it looks too similar to `a_\mathbf{l}`
(edited)
# 2024-12-21 19:50 sourabhniyogi: When we refer to JAM's DA capabilities, does that refer to the preimages (supported by accumulate host functions solicit/forget) as well as the ImportDA + AuditDA (supported by refine host functions import/export), or only the latter?
For rollups utilizing JAM to store blocks and headers, should they be using one or the other or both in designs intended for max scalability for a whole ecosystem (e.g. OP Stack)?
(edited)
# 2024-12-22 17:05 sourabhniyogi: Which system is the CoreChains service going to write blocks and headers to ? For zk rollup ecosystems using JAM DA vs preimages, where should proofs be stored -- is there a design judgment on this that should emulate that of the CoreChains service?
(edited)
# 2024-12-22 20:05 xlchen: If I am building it, blocks will just be the body of work item, headers stored in service storage, parachain code blob in preimage, chain state in DA
(edited)
# 2024-12-22 20:44 gav: > <@sourabhniyogi:matrix.org> Which system is the CoreChains service going to write blocks and headers to ? For zk rollup ecosystems using JAM DA vs preimages, where should proofs be stored -- is there a design judgment on this that should emulate that of the CoreChains service?
Feel free to design your services exactly as you want.
# 2024-12-26 13:51 yu2c: Hello, I have a question about the structure of the recent history $\beta$ in the GP.
(7.1) is (header hash, accumulation-result MMR, state root, work-report dict.), however in
(7.3) the item $n$ is (work-report dict., header hash, accumulation-result MMR, state root).
The order seems inconsistent. Which one should I follow? I assume this is a tuple, so the order is important for implementation because they will be concatenated at the end
# 2024-12-26 15:46 gav: Tuple item order is irrelevant. It only matters for serialisation and that is well-defined in the appendix.
(edited)
# 2024-12-28 18:56 jaymansfield: Was it intentional that CE-130 was skipped in the networking specs?
# 2024-12-29 07:06 sisco0: In the graypaper, it is stated that "Size-Synchrony Antagonism" is not a known concept in the literature. However, the following concepts might be related: the CAP Theorem and the Blockchain Trilemma (link). Would not the "Size-Synchrony Antagonism" term be part of any of the cited ones?
Citing the Blockchain Trilemma research paper:
1. Adding more nodes increases communication overhead
2. As 𝑁 increases, 𝐿 grows at least logarithmically, potentially linearly or quadratically, depending on protocol specifics.
(edited)
# 2024-12-29 08:21 gav: CAP theorem comes closer since what it calls “consistency” is similar in nature to what i termed “coherency”. However each of its three concepts is a binary condition and the proof simply states that no system can fulfill all three. Furthermore it doesn’t conceptualise the size of the system directly (how could it under a binary condition?), instead going for the antagonist concepts “availability” and “partition-tolerance”. As such I’d argue CAP theorem is a more concrete trilemma on specifically replicated data systems.
(edited)
# 2024-12-29 08:21 gav: The blockchain trilemma does conceptualise the size of the system but does not involve the coherency at all. It simply states that as a (blockchain) system scales then it becomes either centralised or insecure. This trilemma implicitly assumes total coherence.
(edited)
# 2024-12-30 04:01 yu2c: Sorry, I'm a bit confused about the definition of the serialization of the guarantee extrinsic $Eg$.
In GP 5.6, $\mathbf{g}$ is defined.
However, in appendix C.16, there's already a definition for the serialization of $Eg$.
Is $\mathbf{g}$ specifically defined for $H_x$? Or which definition should we follow? Thanks
# 2024-12-30 07:51 prematurata: As of now 2 different serializations are needed in 2 different contexts
# 2024-12-30 20:53 charliewinston14: Question about CE-137. In the specs a bundle shard is defined as [u8]. Is there a fixed size for them? There is no reference to a "bundle shard" in the GP itself and not sure if it goes by a different name in it.
# 2024-12-31 11:57 dave: A bundle shard is one of the pieces you get back after erasure coding a WP bundle. Size is not fixed as it depends on the size of the original bundle
# 2024-12-31 09:58 luke_fishman: a doubt about the encoding of service storage and pre-image lookups in D.2:
https://graypaper.fluffylabs.dev/#/6e1c0cd/373302377802
if i am reading this correctly, after encoding it would be impossible to reconstruct the original hash keys when decoding
∀(s ↦ a) ∈ δ, (k ↦ v) ∈ a_s ∶ C(s, E_4(2^32 − 1) ⌢ k_{0...28}) ↦ v
am i missing something?
thank you and a happy new year
# 2024-12-31 10:02 gav: > <@luke_fishman:matrix.org> a doubt about the encoding of service storage and pre-image lookups in D.2:
> https://graypaper.fluffylabs.dev/#/6e1c0cd/373302377802
> if i am reading this correctly, after encoding it would be impossible to reconstruct the original hash keys when decoding
> ∀(s ↦ a) ∈ δ, (k ↦ v) ∈ a_s ∶ C(s, E_4(2^32 − 1) ⌢ k_{0...28}) ↦ v
> am i missing something?
> thank you and a happy new year
You’re not missing anything. HNY:)
# 2024-12-31 10:03 luke_fishman: and this is intended? decode(encode(service_account)) != service_account
(edited)
# 2024-12-31 10:07 gav: Obviously, since decode is defined only as the inverse of encode, that is not true unless service account has more degrees of freedom than its encoding. This should not be the case.
# 2024-12-31 10:10 luke_fishman: right, obviously should not be the case.
but you have confirmed above the i am not missing anything -> impossible to reconstruct the hash keys of storage dictionary s, preimage lookup dictionaries p and l.
so... here i get stuck? where is the missing information to allow to reconstruct those hashes?
# 2024-12-31 10:12 gav: the mappings are not serialised alongside the rest of the service account.
(edited)
# 2024-12-31 10:17 gav: The trie root is a commitment to them. In the protocol there’s no facility or direct need to enumerate the mappings’ keys thus no need to actually store them, only to be able to query them. It’s basically up to the implementation exactly how they’re stored, but good ones will need to take into account the commitment scheme (ie trie) when designing for it.
(edited)
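A tiny sketch of why the keys are unrecoverable. The helper name is made up and the real C(s, ...) construction in the GP may arrange bytes differently; only the 28-byte truncation matters here:

```python
import hashlib

# Hypothetical helper: concatenates the service index, E_4(2^32 - 1) and the
# first 28 bytes of the 32-byte storage key k (the truncation is the point).
def storage_state_key(service: int, k: bytes) -> bytes:
    assert len(k) == 32
    chapter = (2**32 - 1).to_bytes(4, "little")  # E_4(2^32 - 1)
    return service.to_bytes(4, "little") + chapter + k[:28]

k = hashlib.blake2b(b"some storage key", digest_size=32).digest()
key = storage_state_key(7, k)
# Only k's first 28 bytes survive: the final 4 bytes are discarded, so the
# serialization alone cannot enumerate the original 32-byte hash keys.
assert len(key) == 36 and key.endswith(k[:28])
```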
# 2025-01-02 10:05 gav: v_x belongs to N_32 - it’s constructed from at most 4 octets from the instruction data. (l_x is min(…, 4))
(edited)
# 2025-01-02 19:56 tomusdrw: I'm wondering about an edge case with arguments parsing in PVM. Some instructions simply use `ζı+1` (like here), but for immediates/offsets we use `ℓ` which is mask dependent (like here).
ζ is zero-padded so it's fine if we go beyond the program length, but what if `ζı+1` is actually the next instruction?
AFAICT from GP we should treat the instruction byte as argument (and read registers from it) and in the next step just move according to the mask (i.e. execute that instruction).
Is that expected or should there be some special casing for that in the GP?
# 2025-01-02 20:04 tomusdrw: In an edge case to that edge case we may have a program that is just instructions (i.e. all bits in the mask are set), but we are still executing every instruction correctly, because we read the next instruction bytes as arguments to the current instruction (or 0s in case we go beyond the program length). I just want to confirm that I'm understanding it correctly.
(edited)
# 2025-01-02 23:57 subotic: Not sure if I'm correct, but I understand the definitions in the GP as how the program must be encoded. So if there is no argument present where one should be as per GP, then the program is incorrect and we need to panic.
# 2025-01-03 00:50 gav: > <@tomusdrw:matrix.org> I'm wondering about an edge case with arguments parsing in PVM. Some instructions simply use `ζı+1` (like here), but for immediates/offsets we use `ℓ` which is mask dependent (like here).
> ζ is zero-padded so it's fine if we go beyond the program length, but what if `ζı+1` is actually the next instruction?
> AFAICT from GP we should treat the instruction byte as argument (and read registers from it) and in the next step just move according to the mask (i.e. execute that instruction).
> Is that expected or should there be some special casing for that in the GP?
It’s perfectly fine to concoct programs which reuse a previous instruction’s argument as the next instruction’s opcode.
# 2025-01-03 00:55 jan: Basically what Gav said. The program code blob is just that, a blob of bytes. You have the instruction pointer which tells you at which position to decode an instruction, and you have the skip value which tells you how to increment the instruction pointer after that instruction is executed. And the skip value doesn't necessarily have to be the same as the "length" of the instruction.
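A toy model of what jan describes. The opcodes, the one-byte argument read and the mask layout are invented; only the skip mechanism mirrors the discussion, showing that an argument byte can overlap the next opcode or run past the end of the code:

```python
# Toy program: 0x07 sits at the next instruction boundary candidate but is
# consumed as 0x10's immediate; the mask says instructions start at 0 and 2.
code = bytes([0x10, 0x07, 0x20])
mask = [True, False, True]

def skip(i):
    # distance to the next set bit in the mask (simplified, no upper cap)
    j = i + 1
    while j < len(mask) and not mask[j]:
        j += 1
    return j - i

trace = []
i = 0
while i < len(code):
    opcode = code[i]
    # zero-padded read of one argument byte, which may overlap the next
    # opcode or lie beyond the program length
    arg = code[i + 1] if i + 1 < len(code) else 0
    trace.append((opcode, arg))
    i += skip(i)

print(trace)  # [(16, 7), (32, 0)]
```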
# 2025-01-03 21:35 charliewinston14: Hi. Question about block sealing (by ticket). Maybe I am missing something, but tickets are generated using a ring signature, so there doesn't seem to be a way to determine which ring member created one. But then, when a block is received, how do we validate that it was created by the correct author matching the ticket? Couldn't any other member of the ring just create the same block and set themselves as the author (pretending the ticket was theirs)?
(edited)
# 2025-01-03 21:46 davxy: The block author signs the block header using a "standard" Bandersnatch VRF signature. The VRF output generated from this signature matches the Ring VRF output, which serves as the ticket ID. This proves the block author's ownership of the ticket, as they are the only one capable of producing this specific output. For further details, please refer to the Bandersnatch VRF specification paper and example
https://github.com/davxy/bandersnatch-vrfs-spec. (edited)
# 2025-01-05 09:10 luke_fishman: a couple of questions please regarding chapter 14.4 (computation of work results)
1. in Eq. 14.11 => definition of I(p,j) => invocation of refine => the function S (import segment data) is being called with two arguments, but in 14.14 it is defined with only a single (work item) argument. which one is it?
2. maybe related, who/what is the bold face **s** used by the functions J and S in 14.14? where does it come from?
Thank you
# 2025-01-05 11:15 gav: 1. You can disregard the bold-l parameter; it is assumed to be “part of the environment” in each of the functions which utilise L and is not passed explicitly into it.
(edited)
# 2025-01-05 11:17 gav: 2. Bold-s is constrained as the correct operand to the merkle root equality.
# 2025-01-05 11:29 luke_fishman: understood, thank you.
follow up question, could you shed some light on how to tell the difference between H and H⊞?
i saw your answer in the other room about the encoding in C.29 but that didn't clear it up for me.
are they both Y_32 or is there any "mark" to tell them apart?
(edited)
# 2025-01-05 11:42 gav: One is drawn from the set Y_32 and the other from a bijective mapping, denoted by square-plus (that little window symbol)
# 2025-01-05 11:52 luke_fishman: yes indeed there is the window thingy mark. that's not what i meant.
maybe it's a silly question.
the intention was: in code, how do i tell them apart?
if they are both 32-octet-long binaries, how could i tell if i'm looking at an H or an H⊞?
(edited)
# 2025-01-05 11:58 gav: Well that all depends on what language you’re using. In general, you’ll need to use some extra memory to store whether it’s with or without the mark.
# 2025-01-06 06:44 luke_fishman: gav: follow up question please, about the construction of the segment-root dictionary l.
initially we have:
K(l) ≡ {h ∣ w ∈ p_w, (h⊞, n) ∈ w_i}, ∣l∣ ≤ 8
this is clear enough: from the work items in a work package we take up to 8 hashes, of the kind that is the work-package hash. okie dokie.
then we come to:
∃p, c ∈ P, N_C ∶ H(p) = h in 14.13
which means to me one of two things:
1. we have here knowledge of "all" the work packages (all in what scope?) or some subset/list of work packages, and we go through it to find the work-packages that the work-items refer to
2. we only keep in the segment-root dictionary the hashes that refer to the work-package passed into the work result computation function Ξ. so all the keys will be identical, H(p), but since s as far as i can see is not dependent on the core index c, we would end up with a dictionary of all identical key -> value pairs
option 2 doesn't make much sense, which brings me back to option 1, but i see no reference to any list or set of work-packages
(edited)
# 2025-01-06 06:50 luke_fishman: is there a video lecture of chapter 14 as part of the JAM tour? i could not find any
# 2025-01-06 10:30 gav: Yes, as the guarantor you'll need to have seen the WRs of any WP hashes mentioned in the SR Lookup, so you can be sure that it's correct otherwise you might end up guaranteeing WP with a WR which won't be accumulated.
(edited)
# 2025-01-06 10:35 gav: Basically just what will be in the recent blocks by the time of becoming available.
(edited)
# 2025-01-06 10:39 gav: This mechanism is designed to allow pipelining when sequences of unidependent work packages are executed on the same core.
# 2025-01-06 10:41 gav: Without it, it would be hard to reference data in the DA without figuring out the SR manually which may not necessarily be known at the time of authoring the importing package.
# 2025-01-07 10:52 0xjunha: I have a question regarding the leaf node encoding function (
https://graypaper.fluffylabs.dev/#/911af30/37f402371103) for state merklization.
In a Merkle trie, a leaf node will be typically placed at a depth less than 256, implying that it should represent the remaining, unconsumed bits of the
state_key
after navigating to that point - so that we can guess which
state_key
that leaf node represents.
When considering what the leaf node holds as the encoded state key (
bits(k)...248
), should this refer to:
a) "The first 248 bits of the full 256-bit state key"? Or
b) "The first 248 bits of the unconsumed path bits"?
For example, let's say we have a state key like
0b1101_1010_1001...
and a leaf node is at depth 10, meaning we've navigated the trie using the path
0b1101_1010_10
to reach this node, should the leaf node:
a) Store the first 248 bits of
0b1101_1010_1001...
? Or
b) Only store the first 248 bits of
0b01...
, skipping the first 10 bits, thus excluding the part already used for navigation (
0b1101_1010_10
)? Actually in this case the remaining part is only 246 bits, so we have 2 bits of free space.
From my interpretation of the formalism, it seems the encoding takes the "first 248 bits of the full 256-bit state key" regardless of the leaf node's position (depth), so option a), but just wanted to confirm this understanding.
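Under reading a), the stored key bits can be computed without any reference to depth; a minimal sketch (helper name hypothetical):

```python
def leaf_key_bits(state_key: bytes) -> bytes:
    # Reading (a): a leaf stores the first 248 bits (31 bytes) of the
    # FULL 256-bit state key, regardless of the leaf's depth in the trie.
    assert len(state_key) == 32
    return state_key[:31]
```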
# 2025-01-08 19:54 danicuki: Getting back to an old question: I still didn't figure out what C(p) means in this formula. Here:
d ↦ [join(c) | c ← ᵀ[C(p) | p ← unzip_684(d)]]
d is a binary with size multiple of both k and 684
I unzip684 d, so I get k binaries p of size 684
then I take each one of this binaries p and apply C(p) - what is C(p) ?
# 2025-01-10 18:08 danicuki: I got it. Thanks for sharing this. I tried to use this library, but it accepts only shards with size that are multiple of 64 bytes. In case of JAM, shards are 684 bytes. Do you know how we could deal with this?
Another question is: 684 / 1023 configuration would be for production network. What would be the testnet / tiny / small network configuration?
# 2025-01-10 20:56 danicuki: About the Erasure Code definition, if C ∶ ⟦Y2⟧342 → ⟦Y2⟧1023, and p ∈ ⟦Y684⟧, is there a formal specification missing in formula H.6 to transform p ∈ ⟦Y684⟧ into ⟦Y2⟧342 ?
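One way to read the shape-plumbing in H.6 (with a stand-in for the erasure code C, since the real C is the Reed-Solomon extension of 342 two-byte elements to 1023): a 684-byte piece already is [Y2]_342, so no extra transform is needed, only reinterpretation.

```python
def unzip(d: bytes, n: int):
    # split d into consecutive n-byte pieces
    assert len(d) % n == 0
    return [d[i:i + n] for i in range(0, len(d), n)]

def C_stub(p: bytes):
    # Stand-in for C : [Y2]_342 -> [Y2]_1023. A 684-byte piece IS 342
    # two-byte elements; the real C extends them to 1023 via Reed-Solomon.
    words = [p[i:i + 2] for i in range(0, len(p), 2)]
    return words + [b"\x00\x00"] * (1023 - len(words))  # fake parity

def encode(d: bytes):
    rows = [C_stub(p) for p in unzip(d, 684)]  # k rows of 1023 words
    cols = list(zip(*rows))                    # transpose: 1023 columns
    return [b"".join(c) for c in cols]         # join each column -> one shard

shards = encode(bytes(684 * 2))  # 1023 shards, 2 bytes per 684-byte piece
```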
# 2025-01-11 03:40 amritj: I am also facing this problem right now. I have temporarily padded my data to support 64 bytes and will fix it later.
There is also a discussion about this issue on GitHub here:
https://github.com/w3f/jamtestvectors/pull/4
Maybe someone else has a concrete solution.
I believe the configuration for testnet, tiny, or small networks will remain the same because the data size will still be 684k. In all these networks, the same validator holds multiple pieces of the 1023 chunks.
(edited)
# 2025-01-11 17:32 weigen: Hi, I have a question about disputes. Based on my understanding, the faults (f) in the extrinsic include validators who issued invalid judgments on a work report. Since the work report is actually valid, it should belong to the good set (Psi_g).
This logic aligns with the behavior seen in this test vector (
https://github.com/davxy/jam-test-vectors/blob/polkajam-vectors/disputes/tiny/progress_with_verdicts-4.json#L82, https://github.com/davxy/jam-test-vectors/blob/polkajam-vectors/disputes/tiny/progress_with_verdicts-4.json#L191), where work reports included in faults (f) are added to the good set of the posterior Psi. However, according to the Gray Paper (
https://graypaper.fluffylabs.dev/#/579bd12/128a01128a01), the reports in faults (f) should instead be placed in the bad set (Psi_b) and not in the good set.
This creates a conflict: if the work reports in faults (f) are valid and should be part of the good set, why are they assigned to the bad set instead? My understanding is that validators who issued invalid judgments on these valid reports are placed in faults (f), but the reports themselves remain valid.
Am I interpreting this correctly, or have I missed something?
# 2025-01-12 10:44 dave: So eg if a WR is found to be bad, then validators who supported the WR (ie produced a positive judgment) can be reported via faults
# 2025-01-13 07:43 danicuki: > <@clearloop:matrix.org> can't we just use https://github.com/paritytech/erasure-coding/tree/main directly? for the testvector, I think it is incomplete or we'd better do it after the accumulation part, if I'm not mistaken, the testvectors required a lot of external logic for assembling the chunks
This parity example just takes a generic byte string and splits it into chunks. It still uses the Reed Solomon library, that only accepts strings with 64 bytes multiple size
# 2025-01-16 05:02 clearloop: thanks! I have skipped the types specified in erasure_coding after seeing parity/erasure_coding, will get it back these days!
# 2025-01-13 12:24 gav: If invoke returns zero in omega_7, then it unambiguously means Halt.
# 2025-01-13 12:25 gav: You’re probably trying to associate a _type_ with omega_7. This would be an incorrect approach.
# 2025-01-13 16:56 ycc3741: gav: Does the official source provide any test data for verifying whether each class has been correctly serialized according to the specifications in Appendix C.2 before performing the hash operation?
# 2025-01-13 16:57 ycc3741: We are currently encountering some issues and would like to verify whether the serialization is correct. Having the serialized results available would greatly facilitate development. Thank you!
# 2025-01-13 17:08 jaymansfield: > <@ycc3741:matrix.org> We are currently encountering some issues and would like to verify whether the serialization is correct. Having the serialized results available would greatly facilitate development. Thank you!
Just a tip in the meantime until there are official full block/state serialization vectors.. I found switching to parsing all of the *.bin vectors rather than the json equivalents helped me catch a few encoding issues.
# 2025-01-15 14:52 prasad-kumkar: For state merklization, is there a recommended approach for data ordering before constructing the Merkle tree? While I noticed the state key construction function C appears to provide some ordering through key generation, I'm unsure if:
1. This is indeed meant to determine the ordering before Merkleization
2. If so, would service indices (s) create well-distributed keys since they are sparsely distributed 32-bit integers?
I initially thought of splitting data in half recursively (like traditional Merkle trees), but noticed JAM's approach might be different.
# 2025-01-15 16:02 gav: This is a Merkle trie, so the keys (given by C) determine exactly the node structure of the tree.
# 2025-01-15 16:03 gav: You don’t get to decide. “Ordering” is moot as we never iterate. But if you really want an order (eg for RPC) then you can apply dictionary ordering to the keys.
# 2025-01-15 16:05 gav: Service indices are sparse, as are keys generally: the function C is designed to be sparse and mostly uniform. The tree’s implied node structure (by virtue of the commitment scheme) should be able to manage this perfectly well.
# 2025-01-15 16:07 gav: One reasonable question might be whether (or to what degree) keys/values should be kept in-memory, and whether (or to what degree) the nodes of the tree should be persisted and whether that persistence should be in-memory or on-disk.
# 2025-01-15 16:08 gav: For now I’d leave this for implementers to decide. I expect that getting to M4 or M5 will almost certainly need aggressive use of RAM to store/memoize one or both of these databases.
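The point that the keys alone fix the node structure can be illustrated with a toy binary trie (not the GP node encoding; names hypothetical, assumes distinct 32-byte keys):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.blake2b(data, digest_size=32).digest()

def bit(key: bytes, i: int) -> int:
    # i-th bit of the key, most significant bit first
    return (key[i // 8] >> (7 - i % 8)) & 1

def merklize(items, depth=0) -> bytes:
    # Toy binary Merkle trie: split on the current key bit until one
    # item remains, so the key set alone determines the node structure.
    if not items:
        return bytes(32)
    if len(items) == 1:
        key, value = items[0]
        return H(b"leaf" + key + value)
    left = [kv for kv in items if bit(kv[0], depth) == 0]
    right = [kv for kv in items if bit(kv[0], depth) == 1]
    return H(b"branch" + merklize(left, depth + 1) + merklize(right, depth + 1))
```

Since the shape is implied by the keys, the root is independent of any insertion order.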
# 2025-01-15 19:52 jaymansfield: A suggestion for CE-132.. it might be a good idea to allow multiple tickets to be submitted by a proxy in a single stream. The current specification works well for the tiny chain spec but it may not be the best performance-wise when using the full chain spec. The full chain spec will have 2046 tickets to be distributed to everyone within approx. 22 minutes (half of the lottery time and allowing 3 min for connectivity changes). This will result in 90+ incoming streams per minute for each validator just to consume tickets.
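The arithmetic behind that estimate, for reference:

```python
# Full chain spec: 2046 tickets distributed within ~22 minutes; with one
# stream per ticket that is ~93 incoming streams per minute per validator.
tickets, minutes = 2046, 22
streams_per_minute = tickets / minutes  # 93.0
```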
# 2025-01-15 19:56 dave: Might happen for the full network protocol, don't think it's particularly important though
# 2025-01-15 20:23 mkchung: For CE-134 "Work-package sharing", why is "slot" not included as part of the msg?
Guarantor -> Guarantor
--> Core Index ++ Segments-Root Mappings
--> Work-Package Bundle
--> FIN
<-- Work-Report Hash ++ Ed25519 Signature
<-- FIN
Perhaps this can be something like
--> Core Index ++ slot ++ Segments-Root Mappings
?
# 2025-01-15 21:27 dave: Not really necessary as it stands, the recipient can just accept or reject based on whether there is an appropriate core assignment or not. What an appropriate assignment is probably needs to be specified in more detail to ensure different implementations work well together, but will likely just be based on the current time and state at the head of the chain
# 2025-01-17 18:15 gav: > <@cisco:parity.io> If this function returns B_{8n}, shouldn't it say "forall i in N_{8n}" instead of "...in N_{2^{8n}}"?
Yes. Will be fixed in next revision
# 2025-01-19 10:18 vinsystems: Question about the schema.asn of PVM test programs: "expected-status" is described as "the status code of the execution, i.e. the way the program is supposed to end".
1.- Is this the GP exit reason? (halt, panic, out of gas, page fault, hostcall fault)
2.- If (1) is yes, should "trap" be replaced by "panic" and the other exit reasons be added to "expected-status"?
(edited)
# 2025-01-20 09:57 vinsystems: It's a bit confusing because "trap" is an instruction and "panic" is an exit reason.
What do you do when a page fault occurs? Do you store the lowest address of the access in a status register before executing the trap instruction to switch from PVM mode to JAM kernel mode?
# 2025-01-20 12:51 jan: I will align the naming in the next version of the test vectors to make it less confusing.
# 2025-01-20 12:55 jan: For toplevel PVMs a page fault is no different than executing a "trap" instruction or any other condition which would result in a "panic" exit reason, and for those you don't even need to know the address of the page that faulted.
For inner PVMs (those are PVMs triggered with the invoke hostcall) a page fault will interrupt the execution of the inner PVM and should return the address of the page which triggered the fault.
# 2025-01-20 13:00 jan: And in case you're wondering, the store instructions are atomic, so for example if the inner PVM tries to write 4 bytes into the memory, and the first two bytes end up at the 2 last bytes of page N (which was already faulted) and the last two bytes end up at the 2 first bytes of page N +1 (which was not faulted) then such a write will
only trigger a page fault with the address of the N + 1 page and memory will not be modified.
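A sketch of that atomicity rule as described (page size and names hypothetical): check every page the write touches before writing; if any is inaccessible, report the faulting page's address and leave memory untouched.

```python
PAGE = 4096  # hypothetical page size, for illustration only

def try_store(memory: bytearray, accessible: set, addr: int, data: bytes):
    # Stores are atomic: check all touched pages first, write only if
    # every page is accessible; otherwise fault without modifying memory.
    first, last = addr // PAGE, (addr + len(data) - 1) // PAGE
    for page in range(first, last + 1):
        if page not in accessible:
            return ("page-fault", page * PAGE)  # memory not modified
    memory[addr:addr + len(data)] = data
    return ("ok", None)
```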
# 2025-01-20 18:01 sourabhniyogi: Small clarification question about preimages included in the genesis state (example): To be conformant to GP, should it include a matching a_l in the state trie _even though no preimage was ever "requested"_? If so, what would (9.7) (GP link) prescribe as its $[x]$ value?
(edited)
# 2025-01-21 12:29 sourabhniyogi: Question on JAM DA throughput: How does the model referenced in Sec 20 arrive at the distributed availability of 852MB/s? Simplest model based on W_B=12MB (max encoded work package size, derived from bandwidth considerations I think?) with 6s on guarantee and 6s on assurance yields 1MB/s per core x 341 cores = 341MB/s -- so what accounts for the difference?
# 2025-01-21 13:24 dave: JAM is pipelined so peak throughput is 341 WPs per block, not per 2 blocks. Don't know exactly where the 852MB/s number comes from, but the bundle size does not include exported segments so possibly that is the other missing bit
(edited)
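For reference, the pipelined back-of-envelope figure under these assumptions (12MB bundle per work package, one package per core per 6s block, 341 cores):

```python
W_B_MB, slot_s, cores = 12, 6, 341
# With pipelining, each core handles one bundle per block, not per 2 blocks.
pipelined_MB_s = W_B_MB / slot_s * cores  # 682.0
# Still short of 852 MB/s; the gap is plausibly the exported segments,
# which are not counted in the bundle size.
```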
# 2025-01-22 04:52 sourabhniyogi: Ok, got it -- this 2x pipelining "easter egg" is not obvious from GP or JAMNP. We now see the conditions for it are carefully enabled through the ordering of assurances and guarantees on "rho". Is the idea that this performance optimization is optional for M3 "Kusama performance" but practically required for M4 "Polkadot performance" -- yet not really part of the JAM protocol per se and thus outside of GP? Does a hint in CE133/134 to achieve this factor of 2x make sense?
# 2025-01-23 08:47 gav: Implementations are expected to author blocks in a reasonably efficient manner. Asynchrony which is possible should be exploited. This will likely be required as early as M2.
# 2025-01-23 08:50 gav: We will work to provide some basic M2 test vectors demonstrating this expectation.
# 2025-01-23 09:11 gav: For the inner aspects of state transition (block execution), asynchrony is perhaps not quite so relevant as with block authoring.
(edited)
# 2025-01-23 09:12 gav: There will likely need to be some to hit M3 and M4 (e.g. signature checking, inter-block Merkle root calculation).
# 2025-01-23 09:12 gav: But it's up to teams how the optimise, given the correctness-constraints of the GP.
# 2025-01-23 09:14 gav: For block authoring (M2), skipping a block for either guaranteeing or assuring will be considered incorrect behaviour.
# 2025-01-23 09:14 gav: Nodes are expected to provide (and use) information in a timely fashion; this goes for all aspects of block authoring and production.
(edited)
# 2025-01-24 10:06 dvladco: Hi, in the GP v0.5.4 the instructions rot_l_32 and rot_r_32 have 3 registers but ω_B is never used, is this a typo? and should we rotate by ω_B instead?
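If the intended semantics is indeed to rotate by ω_B, a sketch of rot_r_32 following the usual 32-bit recipe (truncate inputs, operate, sign-extend the result) might look like this; this is a guess at the intent, not confirmed by the GP text quoted above:

```python
MASK32 = 0xFFFF_FFFF

def rot_r_32(value: int, amount: int) -> int:
    # Hypothetical semantics: rotate the low 32 bits of value right by
    # (amount mod 32), then sign-extend to 64 bits (unsigned representation).
    v = value & MASK32
    s = amount % 32
    r = ((v >> s) | (v << (32 - s))) & MASK32
    return r if r < 0x8000_0000 else r | 0xFFFF_FFFF_0000_0000
```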
# 2025-01-25 14:48 dave: Serialisation is defined in appendix C. In particular, see C.22
# 2025-01-26 12:07 weigen: Maybe the constraint of 𝑛∈𝑁_16 is needed, since in C.22 the input of ε_2 should be N_16
# 2025-01-26 14:22 gav: It’s not strictly needed, though if N \ N_{2^16} were ever fed into E_2 the result would be undefined.
(edited)
# 2025-01-26 14:26 gav: Since it is deserialised with E_2, any foreign-born availability specification will always be in range for correct reserialisation. And though your node could create an availability spec with an out-of-range value, it would only break your own node, since it could not be encoded and this would be needed for it to be sent to another node.
(edited)
# 2025-01-26 17:18 gav: It’s a bit fiddly but it’s so that they can be separately distributed and individual items concisely proven to be correct.
(edited)
# 2025-01-26 17:25 sourabhniyogi: With
tiny
having
rotation_period: 4
(which is not all that different from
full
having
rotation_period: 10
) we have found in our "run ~12MB work packages back-to-back" tests that work packages are reasonable for one validator to start towards the end of one rotation but by the time a guarantee is signable by one of the three, it is the case that one or two of them have rotated out of the core. Ok, so ... what is a good mode of operation for
(a) don't start the work if you're towards the end according to rule R, because you won't be rewarded for it!
(b) finish it because you'll be rewarded for it!
We do believe its valuable to have this rule R or reward process specified in more detail to ensure different implementations work well together -- can we come up with a good Schelling point at this point?
Not getting the slot in the "tiny" network (where C is just 2, R is just 4) with "big" work packages (close to 12MB) makes this issue quite prominent in regression tests.
(edited)
# 2025-01-26 18:40 dave: Agree this will probably need to be specified at some point for different impls to work well together. I think this rule will need to be informed by the performance characteristics of a full 1000-validator network though, with real builder nodes, and we aren't in a position to run such a network yet. I can say that at the moment our node follows a pretty simple rule: as a validator, accept a package for core C if we will be "assigned" to that core in the next slot or the slot after that. Note that it is possible for block authors to include packages using assignments from the previous rotation, so there is quite a bit of leeway.
# 2025-01-26 18:43 dave: Given this leeway perhaps a better rule would be to allow if we're currently assigned or will be assigned in the next slot. In any case I would recommend making this sort of thing easy to tweak in your implementation!
# 2025-01-26 18:46 dave: The rule should also probably be a bit more lenient for work packages received indirectly from another guarantor on the same assignment, to avoid a situation where the "primary" guarantor _just_ accepts a package but the guarantors it then shares it with do not.
# 2025-01-26 18:51 dave: Of course at the end of the day none of this behaviour will be required; nodes will be free to accept or reject packages however they think will maximise their profit. A specified rule seems like a good starting point though
# 2025-01-26 22:45 sourabhniyogi: Alright - we'll do "to allow if we're currently assigned or will be assigned in the next slot." for now sure thing thank you
# 2025-01-29 16:46 vinsystems: If a panic occurs when 80 -> load_imm_jump executes the branch function, ω_A shouldn't be changed, right?
# 2025-01-29 16:49 jan: It's always changed, regardless of whether the jump fails or not.
# 2025-01-29 16:49 jan: (This is the equivalent of RISC-V's call instruction; the usual use of this instruction is to load the return address and jump to another function.)
(edited)
# 2025-01-30 13:18 subotic: Regarding page-fault and this paragraph https://graypaper.fluffylabs.dev/#/579bd12/243c00245500, I understand that when trying to write to writable memory and I cannot, I emit a page-fault. What is not so clear to me based on this definition is that if I want to write to read-only memory, I should emit a panic. I only know that because of the test-vectors. Or did I miss a place in the GP where this is defined?
# 2025-01-30 13:25 subotic: Great, thanks for the link. As always, things are more complicated than they seem.
# 2025-01-30 14:22 gav: There's a few changes in this from 0.5.4 all centred around the PVM
# 2025-01-30 14:26 gav: - import host call has been removed in favour of a new fetch hostcall; this reduces the amount of data placed in PVM memory up-front and provides a means of extracting data beyond just that concerning the current work-item but for other work-items too.
- All data providing host-calls now accept an offset parameter to allow any contiguous subportion of the data to be read.
- OOB has been (almost) removed. When the outer PVM has a host-call in which it is passed a memory address it cannot access, then it panics irrecoverably.
# 2025-01-30 14:27 gav: The page faulting specification has also been updated and formalised.
# 2025-01-30 14:28 gav: This release signals the end of the 0.5 series and, potentially (but probably not), the final protocol revision (not including the obvious stuff which we know will need doing before 1.0 such as gas pricing).
(edited)
# 2025-01-30 14:30 gav: 0.6 series should contain primarily cleanups, finesse, formatting, discussion and corrections.
# 2025-01-30 14:30 gav: 0.7 and 0.8 will be any important tweaks or optimisations brought on through service-prototyping and Toaster-testing.
(edited)
# 2025-02-02 11:16 gav: Version 0.6.1 is released.
This is just a few small corrections and a simplification of fetch.
(edited)
# 2025-02-03 09:49 jan: Yes, since X\_4 requires the input to be N\_32 there should be a modulo there, otherwise the result would be undefined.
Side note, since I already had questions regarding this: you can think of every 32-bit instruction variant as _always_ taking a modulo of its inputs, even though sometimes in the GP we omit this modulo to make the equations simpler if the result would be equivalent anyway; for example, in the case of add_32 the result gets truncated anyway, so truncating the inputs is unnecessary. The high-level intent is to allow every 32-bit instruction variant to be implemented as follows (to allow for an efficient recompiler implementation): 1) truncate all inputs to 32-bit, 2) do the operation, 3) sign-extend to 64-bit, so if you find any equation in the GP for 32-bit instruction variants that would have given a result that doesn't match this, please ping me and we'll correct it.
(edited)
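That three-step recipe, sketched for add_32:

```python
MASK32 = 0xFFFF_FFFF

def sext32(x: int) -> int:
    # step 3: sign-extend a 32-bit value to 64 bits (unsigned representation)
    return x if x < 0x8000_0000 else x | 0xFFFF_FFFF_0000_0000

def add_32(a: int, b: int) -> int:
    # steps 1+2: truncate both inputs to 32 bits, then add modulo 2^32
    return sext32(((a & MASK32) + (b & MASK32)) & MASK32)
```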
# 2025-02-03 11:11 emielsebastiaan: Related to an issue we found in testvectors for PVM instruction 206.
Does GP allow for negative output of a modulo operation?
Since this is not explicitly mentioned in section 3, I assume the answer is no.
If the answer is no, then PolkaVM probably has an incorrect implementation of GP.
https://github.com/gavofyork/graypaper/issues/222
# 2025-02-03 12:05 emielsebastiaan: Yes in that case GP should be adjusted to remove any ambiguity.
GP should explicitly state that the modulo operator on a negative number yields a negative number, and not a positive number as expected by 'Maths'.
# 2025-02-04 11:31 davxy: > <@carlos-romano:matrix.org> anyone please? 🙏
I'll take a look. Could you please specify one particular test vector so I can review it directly?
# 2025-02-04 19:28 davxy: The "reports" STF exercised by these vectors has been modified to not change the content of the auth queues. The content of the queues is changed by the "authorizations" test vectors. I'll add a note to the readme to make this explicit
(edited)
# 2025-02-05 07:06 carlos-romano: ahh ok thanks, so then we don't need to check them against the STF test vectors post state right? Thanks a lot 🙏
# 2025-02-04 12:03 gav: > <@emielsebastiaan:matrix.org> Yes in that case GP should be adjusted to remove any ambiguity.
> GP should explicitly state that the modulo operator on a negative number yields a negative number, and not a positive number as expected by 'Maths'.
This is corrected/clarified in main.
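Concretely, the clarified rule is C-style (truncated) remainder, where the sign follows the dividend; note that Python's built-in % follows the divisor instead:

```python
def trunc_rem(a: int, b: int) -> int:
    # C-style (truncated) remainder: quotient rounds toward zero,
    # so the remainder takes the sign of the dividend a.
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return a - b * q
```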
# 2025-02-04 16:28 prematurata: I've a question about 14.13.
bold_l is being constructed by essentially ensuring that all the work-package hashes (the special hash) in the work items' import segments i have a key in bold_l (14.11), and that the "pointing" value in the bold_l dictionary is the **previously** computed "segmentRoot" of the Availability specification (14.13).
- If this is correct then it means guarantors need to maintain a datastore containing a dictionary of previously computed WorkResults, correct?
- I don't see any limitation on imported segments referencing work-package hashes which generated a report on the same core as the one we're trying to compute. This means that a work package p could have work items w whose import-segments reference a work-package hash w which was computed on another core
(edited)
# 2025-02-05 09:05 gav: 1. Correct
2. Nodes need only keep fairly recent WPH->SR history, since it is known that the resultant report must pass the on-chain WPH->SR lookup whose history is limited. Ultimately guarantors are free to ignore work packages whose imports they deem unreasonable, unknowable or unlikely to result in a profitable endeavour. To pass the conformance tests nodes need only be able to make guarantees under reasonable conditions.
(edited)
# 2025-02-05 09:23 prematurata: Thanks for this. Will it ever be specified in graypaper? Especially the "fairly recent" or "reasonable conditions"?
# 2025-02-05 09:24 prematurata: I think the same might be applied on 14.14 which basically is implying that there is a datastore from the MerkleTreeRoot and its "preimage"
# 2025-02-05 09:40 gav: > <@prematurata:matrix.org> Thanks for this. Will it ever be specified in graypaper? Especially the "fairly recent" or "reasonable conditions"?
Guaranteeing is a strategic endeavour, and the Gray Paper generally avoids dictating strategy.
# 2025-02-05 09:43 gav: However the aspect of “fairly recent” is well-specified in terms of on-chain behaviour. See equations 12.4-12.8
# 2025-02-05 09:43 gav: Of course a guarantor would need to guess when their guaranteed work-report would likely make it to accumulation in order to apply this limit to the strategy of guaranteeing.
# 2025-02-05 11:34 gav: But given that guarantors are responsible for the same core for 10 blocks at a time and receive all guarantees across (even across other cores), then it's quite possible to make a pretty decent guess on the quality and size of the core's backlog and how long it might be before the new report would make it on-chain.
(edited)
# 2025-02-05 14:17 prematurata: Thanks gavin. I also have another question about the new fetch.
- Is it intended for μ′\_{o⋅⋅⋅+l} to be updated even in case of a panic? Ex when w9 is 5 and memory between w10 and +32 is readable.
- also what is bold x?
(edited)
# 2025-02-05 14:41 gav: In the case of a panic, memory remains the same (though since panic at the top level is unrecoverable, it doesn’t really matter)
# 2025-02-05 14:43 gav: There is a superfluous condition there - the additional omega_9 term for a panic. This will be removed in the next revision. This will then make it very clear that memory does not change in the case of a panic.
# 2025-02-06 08:42 prematurata: no, I have no idea. my first guess would have been A.7 but according to that it seems bold x woudld then be a set containing numbers \in N_{2^32} and that would make no sense in the fetch definition
# 2025-02-06 11:50 gav: Bold x is any value which satisfies the various conditions on it. You'll find that practically speaking, there's only one which anyone could reasonably know.
(edited)
# 2025-02-06 11:54 gav: It's used specifically for extrinsics, where an extrinsic is specified in the WP as a hash/len. We assert that we know the preimage since we would not get to this part of the guarantee pipeline without knowing.
# 2025-02-06 11:55 gav: Alternatively it could have been made explicit with an additional term being passed in to the Refine function and its context, but that would have just complicated the formulation.
# 2025-02-06 11:56 gav: The GP isn't about describing a node's internal _data-logistics_; that's an implementation-specific consideration. It only concerns outward *behaviour*.
# 2025-02-06 16:02 emielsebastiaan: > <@gav:polkadot.io> @room : v0.6.2 is out - contains all the latest corrections.
3 PRs with pvm instruction modifications are pending review at your convenience.
# 2025-02-06 16:05 gav: Merged one - will wait for Jan Bujak to take a look at the others.
# 2025-02-06 16:09 jan: Yeah, since in the GP we "store" the values unsigned those were definitely missing the conversions back, LGTM.
# 2025-02-07 15:09 gav: In the case of void, the pages must all be accessible because we're making them inaccessible.
# 2025-02-07 15:10 gav: We wouldn't want exactly that condition for zero, whose job is to initialize pages to being accessible and zero.
# 2025-02-07 15:11 gav: We could introduce a condition to require them to be previously inaccessible, but currently we don't. This was intended, but maybe it should be changed if requiring it brings us more performant impls ( Jan Bujak ?)
# 2025-02-07 15:30 jan: The lack of a requirement in zero that the pages are inaccessible is indeed intended, so that e.g. this host call can be used to clear/reinitialize memory without first having to void it nor track what is already allocated. Requiring it wouldn't really change anything performance-wise since you have to (or the OS has to) iterate over the page map to find the holes to fill anyway. (And in practice not requiring this check can make things simpler because you can just ask the OS to zero-allocate an address range in bulk instead of having to do this yourself.)
Hm, but now that I think about it we probably should make void not require the pages to be accessible either, as that could simplify its usage in certain cases (and, same as with zero, the page map has to be iterated over anyway by the implementation, whether it returns an error or not). Basically have it take a range of pages we want it to free, and when it returns it'll guarantee that the whole range is now free. So the HUH error branch could just be deleted altogether (since voiding the first 64k or out-of-bounds would be a no-op, as there can never be anything allocated there anyway).
# 2025-02-09 02:00 ascriv: Is anyone aware of any non-rust libraries which implement bandersnatch vrf signatures? Seems the main implementations are in Rust for now
(edited)
# 2025-02-09 17:08 jay_ztc: Not aware of any other than davxys work on this. Have it on my list to investigate as a potential SPOF for the network.
# 2025-02-09 18:55 davxy: AFAIK, arc-ec-vrfs is currently the only implementation available. This is likely because the scheme is not standardized, thus any existing implementation is likely to be tied to JAM.
For those interested in implementing it, the details of Bandersnatch VRFs are specified here: https://github.com/davxy/bandersnatch-vrfs-spec. Implementing the "plain" VRF is **relatively** straightforward, especially with the support of a "bigint" library, making it a manageable task.
However, the complexity increases significantly when dealing with the ring-VRF variant. Even though it is thoroughly specified here: https://github.com/davxy/ring-proof-spec , implementing it requires a library that supports the underlying SNARK framework it relies on. Since there are no official standards (only de facto ones, at best), you'll likely need to build many things from scratch.
We have been using arkworks for this, as it provides **most** of the tools necessary. That said, some additional components were developed by the W3F team and ourselves to fill in the gaps anyway.
In summary, if someone is inclined to implement any of these components (also by building over arkworks, for example), I'm available to provide some support. Additionally, test vectors are available for validating conformance.
(edited)
# 2025-02-09 19:20 ascriv: Since implementing it from scratch is so challenging, it seems essentially everyone will rely on the one implementation that is available (using something like go’s FFI if they’re not implementing jam in rust), which becomes a redundancy and a failure point. It’s probably well audited and as a single point of failure still probably ok though.
# 2025-02-09 19:21 ascriv: I would love if people reimplement it of their own volition in their chosen language but I think we might need additional incentives if we do think relying on just this one implementation is in fact an issue
# 2025-02-10 18:35 vinsystems: 1.- In the preimages extrinsic, do the service-data pairs have to be ordered _only_ by service id? Or do we also have to consider the order of the data blob?
2.- Is the R function of eq (12.30) the one defined in eq (12.23)?
(edited)
# 2025-02-10 19:42 gav: 1. Both, primarily service ID and then data blob. Same goes for any tuple.
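That rule is plain lexicographic tuple ordering; e.g.:

```python
# Order primarily by service id, then by data blob: exactly the
# lexicographic order Python gives tuples by default.
pairs = [(2, b"\x01"), (1, b"\xff"), (1, b"\x00")]
ordered = sorted(pairs)  # [(1, b"\x00"), (1, b"\xff"), (2, b"\x01")]
```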
# 2025-02-10 19:44 gav: 2. No. R in 12.30 and 12.31 are the same. R in 12.23 and 2.24 are the same. I might rename one of them to avoid the confusion.
# 2025-02-10 19:58 leonidas_m: The GP assumes that we know the preimage data because there's only one value that satisfies the necessary conditions. However, since the Refine function neither allows passing the preimage explicitly as a parameter nor permits querying the state, it's unclear where this data should come from.
Should the preimage be retrieved from a node's internal data store (local database), or is it expected to be fetched from an external source (e.g., network)?
# 2025-02-10 21:23 gav: Validator nodes are expected to be sent solicited preimages directly from external sources.
(edited)
# 2025-02-12 00:39 charliewinston14: Hello I have a few questions I was hoping someone could help with.
1. Is there a formal definition of a work package bundle other than the textual description in 14.4.1? It mentions the attributes but not the data types of each attribute.
2. The extrinsic data in a work package bundle, is it a list of hashes and lengths? Or is this the full preimages themselves?
3. The exported segments in a work package bundle, are they the actual segments of length 4104? Or are they a list of root and indexes?
4. Is the extrinsic that’s passed in CE133 the same as what’s in a work package bundle? If so why is it included in CE133 if it can just be calculated based off the work package by the receiver using X(w) in 14.14
# 2025-02-12 15:06 gav: > <@charliewinston14:matrix.org> Hello I have a few questions I was hoping someone could help with.
>
> 1. Is there a formal definition of a work package bundle other than the textual description in 14.4.1? It mentions the attributes but not the data types of each attribute.
> 2. The extrinsic data in a work package bundle, is it a list of hashes and lengths? Or is this the full preimages themselves?
> 3. The exported segments in a work package bundle, are they the actual segments of length 4104? Or are they a list of root and indexes?
> 4. Is the extrinsic that’s passed in CE133 the same as what’s in a work package bundle? If so why is it included in CE133 if it can just be calculated based off the work package by the receiver using X(w) in 14.14
>
1. It is the second argument to A in 14.15
# 2025-02-12 15:08 gav: 3. A bundle has the actual segments and justifications. See 14.14 S and J.
# 2025-02-12 15:09 gav: 4. 14.14 X assumes knowledge of the relevant hash preimages. The receiver may not have such knowledge therefore it makes sense to provide it. Theoretically we could make it an on-demand thing later.
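Putting these answers together, a bundle could be modeled roughly as below. This is only a sketch of the shape described in the thread (second argument to A in 14.15; actual segments and justifications per 14.14 S and J; full extrinsic preimages); the field names are illustrative, not the GP's:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class WorkPackageBundle:
    # The encoded work package itself.
    work_package: bytes
    # Full extrinsic preimages, not merely (hash, length) pairs.
    extrinsics: List[bytes] = field(default_factory=list)
    # The actual imported segments (14.14 S), not (root, index) pairs.
    segments: List[bytes] = field(default_factory=list)
    # Merkle justifications for each imported segment (14.14 J).
    justifications: List[List[bytes]] = field(default_factory=list)
```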
# 2025-02-14 11:49 sourabhniyogi: How is the strangely odd number 81 in the a_o service storage size formula derived? The 32 within the same formula is the storage for the key, but 81 is 17 more bytes than 64...
(edited)
# 2025-02-14 19:27 ascriv: For initializing the ring context for bandersnatch ring vrf stuff, there must be a common seed we’ll all be using? So that it remains deterministic
# 2025-02-16 18:27 ascriv: I think I have a correction for the state transition dependency graph: lambda prime needs to be added to the inputs for the state transition for the validator statistics, since we need to compute the reporters set which requires G* which requires lambda prime
# 2025-02-17 08:36 celadari: Hi everyone,
I have a question about accumulation regarding the call to the accumulate function.
We call Ψ_A in equation (12.19) [GP version 0.6.1] at the accumulation stage.
I'm following the formalism of the white paper.
Should this function call alter δ, or does it return a new account (like the Ω functions) that needs to be accumulated/saved later in equation (12.21)?
Am I clear? 😅
# 2025-02-17 08:41 gav: The accumulate function is used in the final definition of posterior delta. It is up to implementations to determine at what point they alter any particular internal data structure(s) which may represent delta or some partial/intermediate value of it.
(edited)
# 2025-02-17 08:43 gav: One thing which will be very helpful to know is that no two accumulate functions which both contribute to the same “wave” of accumulations will have contradictory changes.
# 2025-02-18 04:38 sourabhniyogi: We believe the notation ${\bf p}_{\bf c}$ used in Eq B.1 here needs to be adjusted to accommodate service $h$ and authorization code hash $u$ of 14.2 here -- that the preimage $u$ of service $h$ is the authorization code input to $\Psi_M$ of B.2. Can someone confirm this interpretation is correct?
Specifically, the genesis state with bootstrap service 0 will have a null authorizer code hash (say, 0x12344321...) and the very first work package ${\bf p}$ (to create a new service) will have $h=0$ and $u=0x12344321...$ to reference this null authorizer.
Does that make sense?
(edited)
# 2025-02-18 05:07 sourabhniyogi: Sorry -- how does Eq B.1's first input of $\Psi_M$ (which is ${\bf p}_{\bf c}$) get at a work package's authorization code?
(edited)
# 2025-02-18 13:26 leonidas_m: Hey, I've noticed that some PVM tests (eg inst_load_u8_nok, inst_store_u8_trap_inaccessible) charge more than 1 extra gas when a page fault occurs even though only a single instruction is executed. In previous commits, similar tests were removed because the cost model was based on polkaVM and wasn't specified in the GP. Are these tests facing the same issue now or am I misunderstanding something?
# 2025-02-19 07:58 leonidas_m: Looks like the same issue also affects the following test vectors:
inst_store_imm_indirect_u16_with_offset_nok
inst_store_imm_indirect_u32_with_offset_nok
inst_store_imm_indirect_u64_with_offset_nok
inst_store_imm_indirect_u8_with_offset_nok
inst_store_imm_u8_trap_inaccessible
inst_store_indirect_u16_with_offset_nok
inst_store_indirect_u32_with_offset_nok
inst_store_indirect_u64_with_offset_nok
inst_store_indirect_u8_with_offset_nok
# 2025-02-19 12:37 celadari: Hi guys,
Just some question related to safrole:
Equations (6.15), (6.16), (6.17), (6.18), (6.19), (6.20) - GP version 0.6.2 - refer to gamma_s' and eta_3' => so we should update gamma_s and eta_3 first? Therefore, we should compute equations (6.23), (6.24) before checking equations (6.15)-(6.20)?
# 2025-02-19 13:28 oliver.tale-yazdi: I had a similar Q and that is what I ended up doing. It seems that the gamma states cannot be updated all at once anymore but need to be done in two steps
# 2025-02-19 13:33 celadari: Can you elaborate please ? 🙃
Should we update gamma then before making the header check of equation (6.15)-(6.20) ?
# 2025-02-19 14:57 oliver.tale-yazdi: We first do gamma_k & gamma_z, then check entropy marker H_e and then gamma_s & gamma_a.
Not sure if that is optimal, it was just what i coded up first 🤷
# 2025-02-19 14:21 ascriv: That’s how I interpret the equations, since they involve posterior variables, the posterior variables must be computed already
# 2025-02-20 12:50 gav: Its result can easily be fed into the serialization function.
(edited)
# 2025-02-20 14:31 gav: Cool - looks like you’ve already done most of the work for a PR to sort this - want to submit one?
# 2025-02-20 18:09 sourabhniyogi: If someone else agrees with it, we will give it a shot!
(edited)
# 2025-02-22 00:16 charliewinston14: Hello.
I’m having some difficulty understanding the erasure root formula in the GP, specifically calculating “s♣” in 14.16.
Hoping someone can point me in the right direction.
I had no problem calculating “b♣” and have my erasure coding function C and paged-proof generation function P already.
The s♣ formula has C#_6(s ⌢ P(s)), where s is an array and P(s) is an array as well. Does that mean to concatenate them both together and then pass them to the chunking function? I think it's the # that is confusing me, as that normally means apply to each of the sub-items. There is also a # on the binary merkle call, so I'm assuming that I need to call the merkle function multiple times and not just once with the results of the erasure encoding, but I'm not understanding the formula at all. Can someone give me a tip on how to proceed with it?
https://graypaper.fluffylabs.dev/#/5f542d7/1b4c011b5701
# 2025-02-22 01:57 ascriv: Is there a very rough estimate for v1.0.0? Trying to think if I can make milestone 1 by that time :v
# 2025-02-22 02:45 gav: Latest estimate is by end of Q3. But will depend a lot on the outcome of Toaster and initial service development.
# 2025-02-22 03:47 ymcsabo: Hi, in the gray paper section 15.2, it mentions advanced nodes and naive nodes. What are some of the examples of those two types of nodes?
# 2025-02-22 07:24 gav: There is no clear delineation. It’s about *strategy*. Some implementations (or node configurations) may use a more sophisticated strategy for predicting the best work package to execute and guarantee. This will allow their operators to take greater rewards under some circumstances. But again, this is strategy and therefore largely out of scope for the GP.
(edited)
# 2025-02-22 07:25 luke_fishman: I need some clarification regarding the advancement of the instruction counter i in the PVM.
Reading A.1 and A.7, I understand the counter i' will always advance to the next instruction unless the exit reason is panic or halt.
So if we have a program like:
ecalli ..
op1 ...
op2 ...
host call fail => continue from op1
host call succeed => continue from op2 (due to the extra skip in [A.33] (https://graypaper.fluffylabs.dev/#/5f542d7/2b70012b7001))
However, reading the text below A.34, I understand that i' is:
exit reason == continue => i + 1 + skip
out of gas => i
panic or halt => 0
page fault => i
host call => i
but this makes [A.33] (https://graypaper.fluffylabs.dev/#/5f542d7/2b70012b7001) not make sense, as now we will have:
host call fail => counter stays => reinvoke into the failing host call
# 2025-02-22 07:34 gav: If the hostcall succeeds (where you pointed) then i’’ is used as the new instruction counter for the invocation of Phi_H which effectively skips past the ecalli instruction. If the hostcall results in anything other than a continue (the last condition) then Phi_H is not invoked again anyway.
(edited)
# 2025-02-22 07:43 luke_fishman: right. my code does just that. no issue
so let's talk about the case where the host call succeeds
we start with Phi_1 which returned ecalli, and i' = i + 1 + skip (i.e. pointing to the instruction after the ecalli)
and so we don't need to advance again after the host call has succeeded, since we already point to the next instruction
or, Phi_1 should not advance the counter on an ecalli exit reason, and after the host call finishes with success then the counter advances
basically, why does the text below A.34 say that i' points to the host call, when it seems to me from A.7 that it has already progressed beyond it
(edited)
# 2025-02-22 08:06 gav: I see your point, yes. i’’ needs not be defined; i’ should be used instead. Feel free to make a PR if i don’t get to it first.
# 2025-02-22 08:07 luke_fishman: yep. that's what i thought , i'' is not needed
Thank you for confirming
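The conclusion of the thread (i'' is unnecessary; i' already points past the ecalli) can be restated as a toy sketch; all names here are illustrative, not from the GP:

```python
def after_ecalli(i: int, skip: int, host_call_continues: bool):
    # Phi_1 exits on an ecalli with the counter already advanced:
    # i' = i + 1 + skip, i.e. the instruction after the ecalli.
    i_prime = i + 1 + skip
    if host_call_continues:
        # Re-invoke the machine at i' directly; no second advance needed.
        return i_prime
    # Any non-continue host-call result: the machine is not re-invoked.
    return None
```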
# 2025-02-22 11:48 luke_fishman: regarding i in B.9:
i = check((E_4^{-1}(H(E(s, η0′, Ht))) mod (2^32 − 2^9)) + 2^8)
does the decode_4 imply that only the first (last?) 4 bytes of the hash are to be taken?
(edited)
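One possible reading of the formula, sketched below. Treating E_4^{-1} as "decode the first 4 bytes of the hash, little-endian" is exactly the assumption being asked about, blake2b-256 stands in for H, and the GP's `check` (bumping to an unused index) is omitted:

```python
import hashlib

def derive_index(encoded: bytes) -> int:
    # Sketch of B.9: i = check((E_4^-1(H(...)) mod (2^32 - 2^9)) + 2^8).
    # ASSUMPTION under discussion: E_4^-1 reads the FIRST 4 bytes of the
    # 32-byte hash, little-endian.
    h = hashlib.blake2b(encoded, digest_size=32).digest()
    raw = int.from_bytes(h[:4], "little")
    return (raw % (2**32 - 2**9)) + 2**8
```

Whatever the byte-selection convention turns out to be, the result always lands in the range [2^8, 2^32 − 2^8).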
# 2025-02-22 15:19 sourabhniyogi: Question on what the first value of a_t is in new here, which is defined in 9.8:
Assume the preimage of the code is 1149 bytes, and recall GP constants: B_S = 100, B_I = 10, B_L = 1.
What is the value of a_i and a_o and thus a_t:
(1): a_i = 0, a_o = 0 ==> a_t = 100
(2): a_i = 2, a_o = 81 + 1149 = 1230 ==> a_t = 100 + 10 * 2 + 1 * 1230 = 1350
The order of operations in new is not clear, especially with the a_t and l "happening in the same line" here:
https://graypaper.fluffylabs.dev/#/5f542d7/31b90231b902
# 2025-02-22 15:30 gav: a_t is a dependent variable whose value is implied through the (non-negotiable) definition of bold-l, which is fully defined as c and l are both fixed values.
(edited)
# 2025-02-22 15:31 gav: balance is required to be equal to this (dependent) variable. There exists only one solution to this statement.
(edited)
# 2025-02-22 15:33 gav: Ordering is a point of implementation strategy, for implementation languages which require the practitioner to specify it (ie imperative ones).
(edited)
# 2025-02-22 15:36 gav: Plenty of languages, like formal logic, don’t generally insist on specifying a solution in terms of ordered mutations.
# 2025-02-22 15:37 gav: If this is a new concept, I’d suggest reading some undergrad computer science texts such as “Structure and Interpretation of Computer Programs”.
(edited)
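For what it's worth, the dependent-variable reading can be checked numerically with the constants quoted in the question above (a hypothetical sketch, not GP code):

```python
# Threshold balance a_t = B_S + B_I * a_i + B_L * a_o, using the
# constants quoted in the question (B_S = 100, B_I = 10, B_L = 1).
B_S, B_I, B_L = 100, 10, 1

def a_t(a_i: int, a_o: int) -> int:
    # a_t is fully determined once the item count a_i and octet
    # count a_o are fixed; there is nothing to "order".
    return B_S + B_I * a_i + B_L * a_o
```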
# 2025-02-23 03:51 luke_fishman: good morning everyone
very small specific question about encoding
looking at the encoding in the calculation of i in B.9: E(s, η0′, Ht)
for me the symbol E means general encoding (C.6)
But the text under C.6 says
_"Note that at present this is utilized only in encoding the length prefix of variable-length sequences."_
which would imply:
- service index is encoded as 4 bytes (refs 9.1, C.23)
- timeslot is encoded as 4 bytes as well (refs I.1.1, C.16, C.20, C.22)
(edited)
# 2025-02-23 12:04 gav: I generally prefer not using a prime unless the plain (non-prime) term is also used.
# 2025-02-23 18:32 ascriv: I assume that when inspecting memory during the sbrk instruction, this should not cause a memory-access exception, right? Also, is it right to interpret the math as saying “find the earliest inaccessible contiguous memory segment starting at or above h of length wa, and set it as mutable”?
# 2025-02-23 20:49 ascriv: neg_add_imm_64 has a +2^64 but then mods by 2^64 so this addition is the same as +0, so it’s redundant. Unless there’s a typo
# 2025-02-23 21:06 ascriv: rot_r_64_imm performs a left shift as written (ith bit of w’a = i+vx bit of wb), but the name suggests a right shift. Is this correct?
# 2025-02-23 23:10 ascriv: Also, shouldn’t we be doing Z inverse on the result of the max (227) and min (229) instructions? To convert back to unsigned before storing in the register
# 2025-02-24 02:58 charliewinston14: Morning all. Are the "EC shards" referenced in CE137 the same as the "bundle shards" referenced in CE138? What is the difference between these two APIs? Are they essentially the same except CE137 also returns segment shards?
# 2025-02-24 12:03 gav: > <@shwchg:matrix.org> Hi Dr.Wood
>
https://github.com/gavofyork/graypaper/pull/248
> Can we use beta_dagger as the only reference (instead of beta + beta_dagger) for all processes in Section 11?
>
The two should give equal effects in Section 11 as the only difference is the placement of the beefy root.
# 2025-02-24 12:04 gav: > <@ascriv:matrix.org> neg_add_imm_64 has a +2^64 but then mods by 2^64 so this addition is the same as +0, so it’s redundant. Unless there’s a typo
I don’t define negative modulo; this ensures the modulo is positive.
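The point generalizes: in languages where `%` keeps the sign of the dividend, adding 2^64 first keeps the operand non-negative before reducing. A minimal sketch (Python's `%` is already non-negative, so here the +2^64 is a no-op, which is exactly the observation in the question):

```python
U64 = 2**64

def neg_add_imm_64(w_b: int, v_x: int) -> int:
    # Computes v_x - w_b as an unsigned 64-bit value, written as
    # (v_x + 2^64 - w_b) mod 2^64. The +2^64 keeps the left operand
    # non-negative, since the GP does not define negative modulo.
    return (v_x + U64 - w_b) % U64
```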
# 2025-02-24 12:07 gav: It is correct. Check the definition of calligraphic B.
(edited)
# 2025-02-24 12:08 gav: > <@ascriv:matrix.org> Also, shouldn’t we be doing Z inverse on the result of the max (227) and min (229) instructions? To convert back to unsigned before storing in the register
Yes. If you’re in the mood feel free to submit a PR.
# 2025-02-24 13:23 0xjunha: > <@ascriv:matrix.org> Also, shouldn’t we be doing Z inverse on the result of the max (227) and min (229) instructions? To convert back to unsigned before storing in the register
Actually this change is merged into main - probably will be included in the next release?
https://github.com/gavofyork/graypaper/pull/228
# 2025-02-25 15:20 jay_ztc: Hi team 👋 Quick question about the Merkle function in D.6-> the branch splitting conditional implies that the key should be left shifted one bit before each recursion. Am I interpreting this correctly? The tests & gh consensus suggests that the keys shouldn't be rotated at each recursion, but rather that the $d'th bit should be used in the splitting conditional at recursion depth $d. I'm happy to open a PR to the GP repo if appropriate.
https://graypaper.fluffylabs.dev/#/5f542d7/391b01391c01
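The consensus reading described in the question (test bit d of the unshifted key at recursion depth d) can be sketched as follows; the MSB-first byte/bit ordering within the key is an assumption of this sketch, not taken from the chat:

```python
def split_at_depth(kvs, depth):
    # Partition key-value pairs on bit `depth` of each key. Keys are
    # never shifted or rotated between recursion levels; only the bit
    # index changes as the recursion descends.
    left, right = [], []
    for key, value in kvs:
        bit = (key[depth // 8] >> (7 - depth % 8)) & 1  # MSB-first
        (right if bit else left).append((key, value))
    return left, right
```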
# 2025-02-25 16:23 sourabhniyogi: In order to compare large amounts of PVM traces between teams precisely, I would like a formula to hash the registers with the PVM paged memory for teams to know they ended up with same answer at the end, and if they did not, be able to quickly determine which line in some PVM trace of PC they differed in results.
It's not hard to come up with a procedure, but does a ready-made answer exist within the PolkaVM repo, or is there some public algorithm for this kind of operation so we don't reinvent the wheel needlessly?
(edited)
# 2025-02-25 16:53 jaymansfield: Hey! Hoping to get a clarification on the justifications for CE-138 (audit shard request). It mentions "The assurer should construct this by appending the corresponding segment shard root to the justification received via CE 137.".
What is the segment shard root corresponding to, exactly, when it's a request about a work package shard?
(edited)
# 2025-02-25 21:26 sourabhniyogi: The Parity Service trait definition for accumulate here returns an Option<Hash>
fn accumulate(_slot: Slot, _id: ServiceId, items: Vec<AccumulateItem>) -> Option<Hash>
but it appears there are TWO ways to provide a Some for accumulate:
(1) if $\omega_8=32$, then the B.12 ${\bf o} \in \mathbb{H}$ condition applies
(2) the yield host function
As B.12 is written, (1) takes precedence over (2), but the new (2) yield is a cleaner solution; otherwise why was it added? Now that it's been added, I believe we don't need (1). Having both is not needed since whatever 32-byte optional yield could go through output (1) OR (2), so perhaps we can eliminate (1).
Nitpick check: is $\omega_8 = 32$ a sufficient criterion for (1) to take precedence over (2)? What if $\omega_8 > 32$? What if $\omega_8 < 32$?
Related nitpick check: is there a way to change the C notation in eq 4.7 vs 4.17 to eliminate the appearance of dependency loops? We are pretty sure the C in 4.7 is from the previous state's 4.17 but seek confirmation.
(edited)
# 2025-02-26 01:53 ascriv: Should (A.43) have x’ instead of x? For clarity that it’s the x after the host call
# 2025-02-26 01:55 ascriv: And should the type of the gas in the return for (A.42) be signed (Zg) to handle e.g. when the host call returns out of gas?
# 2025-02-26 13:44 jaymansfield: Hey! Question about the state transition dependency graph 4.2.1. Should the calculation of β′ be moved further down since it depends on the commitment map C which doesn't exist yet, or does it use the commitment map from the previous block?
# 2025-02-26 13:54 gav: > <@sourabhniyogi:matrix.org> In order to compare large amounts of PVM traces between teams precisely, I would like a formula to hash the registers with the PVM paged memory for teams to know they ended up with same answer at the end, and if they did not, be able to quickly determine which line in some PVM trace of PC they differed in results.
>
> Its not hard to come up with a procedure, but does a ready made answer exist within the PolkaVM repo or is there some public algorithm to do this kind of operation so we don't reinvent the wheel needlessly?
No, there’s no canonical PVM state serialization. Registers are trivial, but for memory we would need to have some definition on how to encode the pages and their accessibility.
# 2025-02-26 19:16 sourabhniyogi: https://hackmd.io/@sourabhniyogi/pvmhash is a first try; hopefully a couple of us will try to converge on something as we get our host function implementations and PVM interpreter implementations correct.
I am wondering why R/W/0 page accessibility came to your mind right away (as opposed to the X/Y contexts, which have most of the immediate debugging problems) -- I must be missing something, since any discrepancy in internal representations of page accessibility would be visible via some load/store instruction's effect (or lack thereof) on a particular page -- encoding this page accessibility would let 2 teams reason about the contents of the memory after they saw the memory affected/not affected based on R/W/0 page accessibility bits -- if you anticipate this to be quite important early, I would like to put it in early in a "v1" (like in a page) -- is it?
Related question maybe: Was the W_G=4104 segment size chosen for segments to match a 4096-byte page size plus a specific 8-byte encoding of the page -- a page number and some specific metadata, specifically the accessibility bits? If so, we might as well get the "dump a page" to map into the 4104 encoding imagined for the CoreVM service with the desired 8-byte encoding for pages, maybe?
(edited)
# 2025-02-27 04:59 gav: The context is not PVM state - the whole point is that it's external.
(edited)
# 2025-02-27 05:00 gav: Of course you'll likely still want to test it, but I don't think there's any reason to check it before the context is collapsed into the final result from Phi_A
(edited)
# 2025-02-27 10:59 sourabhniyogi: Then the v2 "hash" intends to capture the PVM state AND both contexts so as to support debugging of incorrect host function implementations. What should this be called?
Since a Phi_A result may have many host function calls to get at its result (with many intermediate X + Y contexts), we do have a reason to check this v2 "hash", to see if an intermediate value of the v2 "hash" from one implementation matches another after some of those host function calls [which affect the X (or Y) context] complete, one (or maybe both) of which is incorrect. Does that make sense?
(edited)
# 2025-02-26 14:02 gav: > <@sourabhniyogi:matrix.org> The Parity Service trait definition for accumulate here returns an Option<Hash>
> fn accumulate(_slot: Slot, _id: ServiceId, items: Vec<AccumulateItem>) -> Option<Hash>
> but it appears there are TWO ways to provide a Some for accumulate:
> (1) if $\omega_8=32$, then the B.12 ${\bf o} \in \mathbb{H}$ condition applies
> (2) the yield host function
> As B.12 is written, (1) takes precedence over (2), but the new (2) yield is a cleaner solution otherwise why was it added? Now that its been added, I believe we don't need (1). Having both is not needed since whatever 32-byte optional yield could go through output (1) OR (2), so perhaps we can eliminate (1).
> Nitpick check: is $omega8 = 32$ a sufficient criteria for (1) to take precedence over (2) ? What if $omega8 > 32$? What if $omega8 < 32$?
>
> Related nitpick check: is there a way to change the C notation in eq 4.7 vs 4.17 to eliminate the appearance of dependency loops. We are pretty sure the C in 4.7 is from the previous states 4.17 but seek confirmation?
We can consider removing (1) at a later stage. Host calls are not especially cheap and returning data is a more natural pattern than relying on the side-effect of a host call.
# 2025-02-27 05:03 gav: > <@jaymansfield:matrix.org> Hey! Question about the state transition dependency graph 4.2.1. Should the calculation of β′ be moved further down since it depends on the commitment map C which doesn't exist yet, or does it use the commitment map from the previous block?
I've kept the (different variations of the) state components together rather than trying to keep any "execution order". Indeed, rather the point of this dependency graph is to demonstrate that there exists no specific order, since it's a partially parallel rather than a fully serial system.
# 2025-02-27 08:28 dakkk: > <@gav:polkadot.io> I've kept the (different variations of the) state components together rather than trying to keep any "execution order". Indeed, rather the point of this dependency graph is to demonstrate that there exists no specific order, since it's a partially parallel rather than a fully serial system.
sourabhniyogi: I think this answered your doubt. C used by beta' is the result from accumulation process
# 2025-02-27 05:09 gav: > Nitpick check: is $\omega_8 = 32$ a sufficient criterion for (1) to take precedence over (2)? What if $\omega_8 > 32$? What if $\omega_8 < 32$?
Not sure what you mean by $\omega_8$, but assuming you mean selecting between the latter 2 variants of B.12, then it's pretty clear: you use the returned value IFF it is a 32-byte sequence (blackboard H). If it's anything other than this (e.g. a 31-byte sequence or a 33-byte sequence), then you fall back to the _otherwise_ condition of using the (success) context's yield.
(edited)
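The B.12 selection rule as stated in this answer can be sketched directly (a minimal illustration, with hypothetical function and argument names):

```python
from typing import Optional

def accumulate_output(returned: Optional[bytes],
                      context_yield: Optional[bytes]) -> Optional[bytes]:
    # Use the value returned from the VM IFF it is exactly a 32-byte
    # sequence; anything else (None, 31 bytes, 33 bytes, ...) falls
    # through to the (success) context's yield.
    if returned is not None and len(returned) == 32:
        return returned
    return context_yield
```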
# 2025-02-28 03:17 ascriv: Thanks. Should the returned gas value in (A.34) be signed?
# 2025-02-28 04:09 jan: I'm confused. There's nothing wrong with the branch_eq instruction at pc 466 in rv64ui_add test and it certainly doesn't branch into the middle of a basic block?
448: 01 fallthrough
: @20
449: 33 00 0d r0 = 0xd
452: 33 01 0b r1 = 0xb
455: c8 10 0b r11 = r0 + r1
458: 64 b3 r3 = r11
460: 95 aa 01 r10 = r10 + 0x1
463: 33 02 02 r2 = 0x2
466: ab 2a ef jump 449 if r10 != r2
# 2025-02-28 04:21 jay_ztc: You're right, this is my mistake... embarrassingly small bug on my end, should have reviewed the target more thoroughly before posting as well (different debugging scope & missed that fallthrough is a bb terminator) 🤦♂️ Thanks for your response.
(edited)
# 2025-03-01 13:47 sourabhniyogi: Can a parent VM determine which PVM memory pages have been modified by an "invoked" child VM?
CoreVM-type services aim to extend JAM computation across multiple work packages by exporting PVM memory pages as segments at the end of a task and retrieving them at the start of the next.
While the memcpy/memset host call can support data transfer between parent and child VMs (an imported segment copied into some child VM page index), the parent needs a way to identify all mutated memory pages—not just those explicitly copied via memcpy—to ensure they are correctly included in the exported segments.
One approach could be extending M to track modified page indexes and providing an explicit mechanism for the parent VM to access this set, supporting efficient page/segment export.
Is this a good idea or is there a better approach?
# 2025-03-02 04:27 gav: > <@ascriv:matrix.org> Thanks. Should the returned gas value in (A.34) be signed?
Yes. Already merged.
# 2025-03-02 04:49 ascriv: A lot of the host functions (eg solicit, forget, yield, etc) use Z_o…+32 (for example). I think these can never be negative since they’re register values? So N is a bit more clear
# 2025-03-02 05:08 gav: > <@ascriv:matrix.org> A lot of the host functions (eg solicit, forget, yield, etc) use Z_o…+32 (for example). I think these can never be negative since they’re register values? So N is a bit more clear
Yes, feel free to make a PR.
# 2025-03-02 14:23 ascriv: Is there always an implicit mod 2^32 when inspecting or mutating the ram given a register (64 bit unsigned)? we do this a lot and wondering how it should be handled
# 2025-03-02 22:01 ascriv: According to the GP, we only handle memory access/modification exceptions in the single-step function, but presumably we'd want to catch such exceptions in, e.g., host calls as well, no?
# 2025-03-02 22:17 jay_ztc: I believe a mem fault in a nested PVM is returned to the parent PVM instance for the PVM program to handle; someone correct me if I'm wrong here
# 2025-03-02 22:24 ascriv: Yes, but if for example the 8th register is not a valid memory location, then when deserializing to construct g, we should fault, right? Before the pvm function is called
# 2025-03-02 22:29 jay_ztc: Good question. I would assume write to that location from an 'outer-shell' implementation perspective, and if the pvm program attempts to access it, then it would trigger a page fault. Interested to hear other folks thoughts on this though.
(edited)
# 2025-03-03 02:45 gav: > <@ascriv:matrix.org> Is there always an implicit mod 2^32 when inspecting or mutating the ram given a register (64 bit unsigned)? we do this a lot and wondering how it should be handled
No. Everything is explicit.
# 2025-03-03 02:46 gav: They are handled explicitly. If you believe there is an instance where unarmed memory may be addressed, please report.
(edited)
# 2025-03-03 02:49 gav: > <@danicuki:matrix.org> I have a doubt about the work package execution formula: https://graypaper.fluffylabs.dev/#/5f542d7/1a48021a5d02
>
> it says that I(p,j) = (r, e) when |e| = we, but what if r is an error and |e| = we? Shouldn't it be
>
> - (r, [G0, G0, ...]) if r not binary, first
> - (r, e) if |e| = we, second
>
> ?
No. GP is correct.
# 2025-03-03 02:50 gav: > <@ascriv:matrix.org> Yes, but if for example the 8th register is not a valid memory location, then when deserializing to construct g, we should fault, right? Before the pvm function is called
See the second line there. The range beginning with o is ensured to be in the set of valid memory addresses.
# 2025-03-03 04:53 ascriv: In export, since we’re reading indices p…+z wrapped, should we also be checking if Np…+z (mod ram size) is in Vu ?
# 2025-03-03 08:11 0xjunha: I have several questions/comments regarding the historical lookup and related constants:
1. The constant D has been updated from 28,800 slots (48 hrs) to 4,800 slots (8 hrs). I opened a PR to update it in appendix I too: https://github.com/gavofyork/graypaper/pull/260
2. Probably the constant value L should be reduced too? D seems to be introduced to prevent a preimage datum from being removed while it could still be referenced during auditing, so D should be larger than L. However, the current value of L is 14,400 (24 hrs): (https://graypaper.fluffylabs.dev/#/5f542d7/417000417000 and https://graypaper.fluffylabs.dev/#/5f542d7/0c9f000c9f00) which hasn't been updated since the initial commit, while D was updated as mentioned above.
3. https://graypaper.fluffylabs.dev/#/5f542d7/113b00113b00 Regarding the brief definition of the historical lookup function, should the constant C_D be D instead? I wonder if this is a typo or I'm missing something. Also, while the function is designed to be called off-chain, should we interpret the H_t here as "The timeslot index of the last finalized block header that an auditor sees at the point of auditing"?
# 2025-03-03 09:03 gav: 1./2. Yes there's an issue for this now; the provided PR may not be quite right.
(edited)
# 2025-03-03 09:04 gav: 3. Yes indeed, C_D should be D; will be fixed in 0.6.4.
(edited)
# 2025-03-03 09:06 gav: Big-Lambda is defined fully at 9.7 and this proper definition does not use H_t.
# 2025-03-03 09:07 gav: 9.5 is only provided to help the reader understand what problem the function is attempting to solve.
(edited)
# 2025-03-03 09:10 gav: In addition to many corrections and clarifications, there are several small but important functional alterations; pay attention to the first 6 items in the changelog.
(edited)
# 2025-03-03 11:17 yu2c: Does anyone know why the PDF file for release v0.6.3 is 111 MB, while the previous version v0.6.2 was only 4.22 MB? 🧐
# 2025-03-03 14:19 gav: > <@yu2c:matrix.org> Does anyone know why the PDF file for release v0.6.3 is 111 MB, while the previous version v0.6.2 was only 4.22 MB? 🧐
Good point!:) I’ll look into getting a smaller rendering.
# 2025-03-03 16:58 yu2c: And a small suggestion: add $\mathbb{N}_{R}$, defined in (4.23), to Appendix I / Sets / Regular Notions
# 2025-03-03 19:34 gav: > <@yu2c:matrix.org> And small suggestion: Add $\mathbb{N}_{R}$ defined in
(4.23) in the Appendix I / Sets / Regular Notions
PR?:)
# 2025-03-03 20:50 danicuki: ΨR call here passes 11 parameters: (wc,wg ,ws,h,wy ,px, pa,o,S(w,l),X(w),ℓ) (
https://graypaper.fluffylabs.dev/#/85129da/1a9b021aad02?v=0.6.3)
But ΨR defined still has only 5 (i,p,o,i,ς):
https://graypaper.fluffylabs.dev/#/85129da/2d65002d9300?v=0.6.3
# 2025-03-03 21:12 ascriv: > <@gav:polkadot.io> Is there any particular reason you think we should be more lenient?
In poke for example, we read from the ram with wrapping, which means we want to allow for the case where z > ram size, in which case we wrap back to 0. But in that case, Ns…z will not be in Vu, so we will panic
# 2025-03-03 21:12 ascriv: If we don’t want to handle the case where z > ram size, then we are wrapping unnecessarily
# 2025-03-04 05:26 gav: > <@danicuki:matrix.org> ΨR call here passes 11 parameters: (wc,wg ,ws,h,wy ,px, pa,o,S(w,l),X(w),ℓ) (
https://graypaper.fluffylabs.dev/#/85129da/1a9b021aad02?v=0.6.3)
>
> But ΨR defined still has only 5 (i,p,o,i,ς):
https://graypaper.fluffylabs.dev/#/85129da/2d65002d9300?v=0.6.3
Yes indeed:
https://github.com/gavofyork/graypaper/pull/273
# 2025-03-04 05:29 gav: As I've said countless times now, the GP does not dictate the specifics of data logistics, only observable behaviour. fetch requires implementations to return the correct extrinsic data by virtue of the constraints placed on the return value from fetch. How it gets the extrinsic data is entirely implementation-specific and left as an exercise for the reader. Probably it will be supplied along with the rest of the WP by the builder, but the GP does not define this since it is not *observable behaviour*.
(edited)
# 2025-03-04 05:32 gav: Your question betrays a presupposition that the formalisms in the GP are 1:1 mappable to some implementation code. That may sometimes be the case but certainly not always.
(edited)
# 2025-03-04 05:33 gav: e.g. We are able to use formalisms in mathematics such as let H(return_value) = input_value. This does not map 1:1 with (procedural) code, because it would imply the ability to make a reverse hash, for which no general solution is known.
(edited)
# 2025-03-04 05:36 gav: It works in the GP because it is describing
what we wish to see, not
how it must be delivered.
# 2025-03-04 05:36 gav: In the case of implementing the "impossible" reverse hash function, it is fine because implementations are allowed to "cheat" and use external knowledge (such as DA contents or data arriving over the network) in order to arrive at the answer.
# 2025-03-04 05:40 gav: Implementing the GP is a puzzle and intentionally so. There may be different ways of solving the puzzle. This diversity can help deliver a resilient, even anti-fragile, network. Don't expect a perfectly described path to implementation. You'll need to use your brain.
# 2025-03-04 05:41 gav: Ahh, in fact the wrapping is superfluous there; I'll accept a PR which removes it from the host functions.
(edited)
# 2025-03-04 05:43 gav: > <@yu2c:matrix.org> And small suggestion: Add $\mathbb{N}_{R}$ defined in
(4.23) in the Appendix I / Sets / Regular Notions
PR welcome:)
# 2025-03-04 09:19 clearloop: hi there, I'm currently confused about the fallback keys (sealing) in grandpa:
while blocks sealed with fallback keys are placeholders for contingency,
https://graypaper.fluffylabs.dev/#/85129da/1fc0001fc400?v=0.6.3 seems to mean that headers with fallback keys are not eligible to be selected as best headers, which would make blocks with fallback keys meaningless. If I'm not mistaken, blocks with fallback keys should be selected as the best headers when there are no ticket-sealed blocks at the end of slots; plz correct me if I'm wrong 🙏
(edited)
# 2025-03-04 13:20 gav: Such blocks may be the only way of extending the chain and getting to the point of having regular ticketed blocks which increase the best chain score.
(edited)
# 2025-03-05 01:34 ascriv: In New, should we panic if the 8th register is not in N-2^32? Because then (c,l) will not be a valid key for the l component of the new service account
# 2025-03-05 02:40 clearloop: ~~so the case of finalizing headers with fallback keys is: we just confirmed a ticketed block is on the best chain, and we are requesting the ancestors of the best head ( blocks with the fallback keys in the headers could be here ) ?~~
clear about it now: the best chain is the one voted by blocks with the most ancestor blocks, and headers with fallback keys can be part of them
(edited)
# 2025-03-05 02:44 ascriv: > <@ascriv:matrix.org> In New, should we panic if the 8th register is not in N-2^32? Because then (c,l) will not be a valid key for the l component of the new service account
Also in new, a is missing the p component which should presumably be {}
# 2025-03-06 05:05 gav: Well, it’s not that it’s not “valid” (don’t forget we’re not using typed logic here, just basic set formalisms). Rather that there can be early certainty that if l is not in N_2^32, then there can be no item (c, l) in the set (H, N_2^32). This implies no key can exist.
(edited)
# 2025-03-06 15:24 dakkk: gav: in your latest lecture in Taipei about JAM, you said that your implementation achieved ~**% of native code speed; would it be possible to have the program you used for this benchmark? I'd love to test jampy PVM against different improvements I have up my sleeve
# 2025-03-06 16:00 dakkk: Jan Bujak: I'm calling
bash guest-programs/build-benchmarks.sh
in order to create pvm and native binaries: it correctly creates PVM binaries, but the x86_64 contains only shared objects. Am I missing a step?
# 2025-03-06 17:28 dakkk: I'm giving up since generated files are elf or .polkavm format, and I do not want to read other implementation's code. I'll write my own benchmark program
# 2025-03-07 20:23 ascriv: For some cases where the lhs and rhs types are different, it’s clear what to do; for example, in set inclusion we can safely interpret it as evaluating to false. But in other cases where the lhs and rhs are different types, like the building of the set that is the new service account, it becomes too ambiguous. One interpretation is that the l component in the new service account should be empty. Another is that a should be \error . I’ve made a PR with the second interpretation
http://github.com/gavofyork/graypaper/pull/279 (edited)
# 2025-03-08 08:15 gav: you've highlighted regular l, when the preimage dictionary is bold-l.
# 2025-03-08 08:18 gav: No idea what you're talking about. s\* and s are both in
\N_S
. the comparison is just a simple numeric one. If you work the logic through, it would be taken if
omega_7 in { s, 2^64-1 }
.
(edited)
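The branch condition gav states can be written down directly (omega_7 and s treated as plain integers; a sketch of the stated logic, not implementation guidance):

```python
U64_MAX = 2**64 - 1

def branch_taken(omega_7: int, s: int) -> bool:
    # Taken iff omega_7 is the service index s itself or 2^64 - 1,
    # exactly as stated above; both values live in N_S as simple numbers.
    return omega_7 in (s, U64_MAX)
```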
# 2025-03-08 14:45 jay_ztc: Thanks for the response! I'm not sure I'm following though? 't' is referencing an account, and I only see 'l' used in the accounts context when referring to the lookup dict? It would make sense if the metadata returned by the info host call contains some metadata about preimages, maybe just the keys even -> but I'm assuming that would be represented as K{t}_l in the spec if that were the case...
# 2025-03-08 15:02 jay_ztc: Thanks for clarifying! Wasn't sure if an account being able to selectively target its
s vs.
d context was a valid use case. Also was getting hung up on this being a more verbose way to get to 'a' than in the previous host call (lookup) -> but I see the intermediate value used when querying the storage, so the verbosity makes sense there.
# 2025-03-08 20:35 jaymansfield: Hello. For the bless and assign host calls, what should happen if a non-privileged service tries to execute it? Panic or HUH? The GP doesn’t really specify.
(edited)
# 2025-03-10 13:25 jay_ztc: Happy Monday folks. In the 'new' host function-> the GP doesn't specify what should happen if the preimage blob length is above the valid range required by the account spec (N_l = 2^32). There is a related case where this type of behavior is specified-> In the bless host call the GP specifies a 'WHO' exit when the values aren't within the valid N_s range. Curious to hear what the guidance is here, happy to open a PR if needed.
https://graypaper.fluffylabs.dev/#/5f542d7/31db0031e600?v=0.6.2
# 2025-03-11 17:55 gav: Exactly what the GP specifies. It returns OK but has no effect on the system's state.
(edited)
# 2025-03-11 17:59 gav: Ahh yes,
t_l
should actually be
t_o
. Feel free to post an issue or make a PR:)
(edited)
# 2025-03-11 18:02 gav: This was previously reported and is addressed in
main
branch.
# 2025-03-11 18:11 jaymansfield: > <@gav:polkadot.io> Exactly what the GP specifies. It returns OK but has no effect on the system's state.
Thanks!
# 2025-03-11 18:13 ascriv: > <@jay_ztc:matrix.org> Happy Monday folks. In the 'new' host function-> the GP doesn't specify what should happen if the preimage blob length is above the valid range required by the account spec (N_l = 2^32). There is a related case where this type of behavior is specified-> In the bless host call the GP specifies a 'WHO' exit when the values aren't within the valid N_s range. Curious to hear what the guidance is here, happy to open a PR if needed.
>
>
https://graypaper.fluffylabs.dev/#/5f542d7/31db0031e600?v=0.6.2
I made a change which results in a panic if that happens
# 2025-03-11 22:05 ascriv: I see in the on transfer and accumulate invocations we check if a service account’s code hash is “without value” but according to the service account type the code hash must have a value. Is the service account type wrong or are these checks wrong?
(edited)
# 2025-03-12 12:19 ascriv: > <@gav:polkadot.io> what makes you think that the code hash must have a value?
In 9.3 the code hash is of type H, I think if we want it to be able to be valueless it should be of type “H?” ?
# 2025-03-12 15:41 gav: ahh right, sure the hash is non-empty. but there may not be a preimage.
(edited)
# 2025-03-12 15:43 gav: Yes indeed. I suspect this may have been incorrectly corrected recently.
# 2025-03-12 15:50 ascriv: > <@gav:polkadot.io> regular c and bold c are not the same.
Yep, misread as regular c. Thanks
# 2025-03-12 16:49 subotic: Sure: should C return 0 in case of negative gas, or should it change to Z_G? If it changes to Z_G, then this will also spill over into Psi_A and the accumulation chapter, where N_G is expected.
# 2025-03-13 12:17 gav: Yes, in fact it was confusing the gas remaining (which is the PVM gas counter and can be negative in the case of an underrun) with gas used (which is the value used by the higher level accumulation functions and cannot be negative). Should make more sense with
https://github.com/gavofyork/graypaper/pull/288. (edited)
# 2025-03-15 19:11 ascriv: In the outer accumulation function, is it intended that the free accumulation services dict be zeroed out after the first iteration? Couldn’t there be work reports in the remaining ones which was made by a free accumulation service?
# 2025-03-17 05:22 0xjunha: My understanding is that the always-accumulate services should be processed first, followed by other services. And there is enough gas to run all the always-accumulate services in the initial round, so they would never end up being accumulated later than the initial round.
(edited)
# 2025-03-17 04:40 clearloop: hey teams, curious about your implementations: after genesis, can you ensure there will be no empty slots in the local testnet? e.g. there are valid blocks in each of the time slots and all of them get fully finalized by all of the nodes within 6 seconds
(edited)
# 2025-03-17 10:04 prematurata: I think I've found something interesting when handling the
ecalli
pvm fn. According to the gp
ΨH|A.34
calls
Ψ()|A.1
and handles the
ε′ = hxh
.
The
ecalli
fn (
https://graypaper.fluffylabs.dev/#/85129da/25ff0025ff00?v=0.6.3 ) is set to modify just
ε
leaving the default
ı′ = ı + 1 + skip(ı)
defined in A.7 in place.
Then the host call gets executed inside
ΨH|A.34
. If all goes ok and
f
returns (▸, ....) then we should recursively call
ΨH
but with
ı′′
which is defined as
ı′′ = ı′ + 1 + skip(ı′)
. Now since
i'
is what is being returned by
Ψ
and is already skipping an instruction, then
ΨH
gets called by skipping another instruction (due to the 2x skips being applied).
Now I think this is either a misinterpretation of mine or an error in the graypaper, as from my point of view it makes no sense to skip an instruction after a successful hostcall execution via ecalli. Multiple implementations seem to agree with this, considering there are multiple implementors passing the duna testnet.
(edited)
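The double-advance concern can be modelled with a toy step function. The skip encoding below is a made-up placeholder (the real skip is defined in GP A.7): Psi already returns ı′ pointing past the ecalli, so advancing again before re-entering would jump over the instruction that follows.

```python
def skip(i: int, code: bytes) -> int:
    """Placeholder operand-length rule for illustration only:
    high nibble of the opcode byte = number of operand bytes."""
    return code[i] >> 4

def step(i: int, code: bytes) -> int:
    # i' = i + 1 + skip(i), as in GP A.7
    return i + 1 + skip(i, code)

# Toy program: "ecalli" at offset 0 with one operand byte,
# so the next instruction sits at offset 2.
code = bytes([0x10, 0x00, 0x20, 0x00, 0x00])
i_after_psi = step(0, code)         # 2: Psi has already advanced past ecalli
i_double = step(i_after_psi, code)  # 5: advancing again skips offset 2
```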
# 2025-03-17 14:51 prematurata: yes. i didnt see this issue daaamn. but essentially yeah its the same as i am reporting
(edited)
# 2025-03-17 19:54 jaymansfield: Hey, I have a question relating to guaranteeing/auditing timing. After a work package is guaranteed and included on-chain, assurers will normally start assuring its availability in the next slot. Is there any reason to wait for these assurances to be posted, or can the auditors (if they are assigned to that specific core) just immediately request shards from the original guarantors? My implementation currently doesn't wait for the assurances and just wanted to see if thats okay, or if i should change the timing around. What are the best practices here?
(edited)
# 2025-03-18 02:30 shwchg:
https://graypaper.fluffylabs.dev/#/85129da/1b6d001b8800?v=0.6.3
Is equation (14.14) using J_x as defined in (E.5) or the PagedProof P described in (14.10)?
From the preceding equations, it appears to use J_x, but the paragraph stating that “such a vast amount of data is not generally needed as the justification can be derived through a single PagedProof” suggests that using P could also be reasonable. Could we get a clarification on this?
# 2025-03-18 17:33 vinsystems: What does
t_o
mean in the info host\_call function?
In the graypaper reader version 3 March 2025 this term is
t_l
, but in the latest version of the GP 0.6.3 March 13 2025 this term is
t_o
.
(edited)
# 2025-03-18 18:26 gav: Not sure what you mean: see section 17.3; audit selection happens on all work-reports "pending which have just become available".
# 2025-03-18 18:26 gav: So you'll never be self-selecting to audit a work-report until you, at least, believe it is available.
# 2025-03-18 18:30 gav: 14.14 defines regular-J, using caligraphic-J (subscripted with 0), itself defined in E.5.
(edited)
# 2025-03-18 18:33 gav: The formulation of 14.14 does not use the paged-proof(regular-P, 14.10) formulation at all.
(edited)
# 2025-03-18 18:34 gav: Now, please remember as I've now said countless times, the Gray Paper defines _observable behaviour_. It doesn't necessarily tell you how to create that behaviour.
(edited)
# 2025-03-18 18:36 gav: In this case, as the text states, you'll need to combine this formulation with the page-proof formulation and see that you can create the correct behaviour without requiring all of the data in the segment tree.
(edited)
# 2025-03-18 18:38 gav: That's indeed the whole point of the paged-proofs. They're a very efficient means of storing proofs across nodes for showing the correctness of exported data segments when the time comes to import them.
(edited)
# 2025-03-18 18:42 gav: So you'll need to study (14.10) and understand that for any export, a relevant proof-page can be fetched from the Segments DA and information from within the two parts of it can be (quite easily) selected and combined in order to produce a full import-justification of the form needed by (14.14 J).
(edited)
# 2025-03-18 19:25 gav: Activity statistics should help us understand what our JAM instances are actually doing and would be a nice thing to monitor and visualise on a web/cli tool if anyone's making any.
# 2025-03-18 21:11 sourabhniyogi: Is it reasonable to use
CE139 to fetch the relevant-proof page (or rather shards of the proof page, so as to reconstruct the proof page) from "Segments DA", where the
Segment Index
of CE139 is _greater_ than the export count?
(edited)
# 2025-03-18 21:26 eclesiomelo: hey guys! I have a question regards the test service in the Accumulation test vector, we have tried to parse it using the definition A.37. I would like to confirm that this PVM blob is formatted according to this definition and not simply A.2 (deblob), is this correct?
# 2025-03-19 14:44 eclesiomelo: okay, so given the definition A.37, the first 3 bytes encode |o|; however, the first 3 bytes in the test service are 0x47000c, which decodes to 786503 in decimal, greater than the test service's total byte count
(edited)
# 2025-03-19 15:00 eclesiomelo: I am using the inverse of the integer encoding, definition C.5, where I pass [47, 00, 0c] and get 786503 as output, so I think I am missing something.
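For what it's worth, the number in the question is exactly what a plain little-endian 24-bit read of those bytes gives (this doesn't settle which decoder A.37 intends, only that the arithmetic is self-consistent):

```python
raw = bytes([0x47, 0x00, 0x0c])
value = int.from_bytes(raw, "little")  # 0x0c0047 = 12*65536 + 71
```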
# 2025-03-19 10:01 gav: W* is the
accumulatable WRs; we don't accumulate all of them if the gas limit doesn't allow.
# 2025-03-19 10:03 gav: As for always-accumulate services, see (12.20) - there's always enough gas to accumulate those in addition to the usual amount for each core.
(edited)
# 2025-03-19 10:08 gav: There's an additional note placed on G_T in definitions advising to account for the always-accumulate services, but because of (12.20) G_T formally only places a lower-limit on the amount of gas used; if there's more always-accumulate gas than would fit into G_T, we run "over" G_T and honour the always-accumulate gas.
# 2025-03-19 11:29 boymaas: Thank you, gav. I am going to take another look later today. I was aware of 12.20 with the addition of the always-accumulate services. I was imagining a scenario where we could get a set of accumulatable WRs whose total gas consumption would be more than Gₐ * C, where, for example, an immediate WR would resolve a bunch of queued WRs. In a tiny network setup with 2 cores for example. Since g is used to select a subset of the accumulatable WRs, I thought it could eat into the reserve for the always-accumulate services.
# 2025-03-19 14:42 gav: Always-accumulate services should be in the first "batch" of WRs accumulated (i.e. among the first items of W*) so they will, indeed, always accumulate.
# 2025-03-19 14:44 gav: Not at all clear what you're pointing at or the reasoning behind your suggestion.
(edited)
# 2025-03-19 14:57 vinsystems:
Eq B.6 defines the result context
X
of the accumulate invocation. The last term of this equation is
y ∈ ¿H
. Should this term be
y ∈ H?
instead of
y ∈ ¿H
since the operator
¿
is used for serializable terms?
(edited)
# 2025-03-19 17:12 sourabhniyogi: C(13) in Eq D.2 needs to get the current / last validator stats (pi\_V, pi\_L) back in (from 13.1/13.2)
(edited)
# 2025-03-19 17:38 sourabhniyogi: The subscripts in C(13) in D.2 only have { C, S, L } but should have { C, S, V, L } to cover both core/service activity (the new elements of C(13)) and validator statistics activity (the old elements of C(13)), that's all.
(edited)
# 2025-03-19 17:43 gav: > The subscripts in C(13) in D.2 only have { C, S }
(They actually had C, L and S, but were indeed missing V as the typo had C twice.)
(edited)
# 2025-03-19 18:59 sourabhniyogi: How did you end up with W\_M = 2048 (The maximum number of imports and exports in a work-package.) \[or 3072 in recent versions\] which implies at most 32 proof pages and 8.4MB of CoreVM memory -- is this due to bandwidth or networking considerations? With 4104 bytes usable per proof page, you can fit many many more import/export segments than 2048. If we solved the 14.10/14.14 "puzzle", we have only 64\*32+5\*32 byte = 2208 bytes out the 4104 bytes being used [or 2240 in recent versions]. You can definitely fit many many more segments with the unused proof page space of 4104-2208=1896 bytes
(edited)
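The byte budget quoted in the question checks out (all figures taken from the message itself: proof-page size and per-segment proof cost as stated there):

```python
W_G = 4104               # usable bytes per proof page, per the message
used = 64 * 32 + 5 * 32  # bytes consumed by the proofs: 2048 + 160
spare = W_G - used       # unused proof-page space
```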
# 2025-03-19 19:02 dave: I didn't come up with these numbers; Gav did. I think they're based primarily on bandwidth usage, as discussed in the discussion section at the end of the GP
# 2025-03-19 19:08 dave: Re proof segment usage, think one reason for the choice of 64 is that it's a power-of-2 which makes things simple -- you fit a complete subtree in a segment. Could probably fit more proofs in but it would get more complicated as you would have a partial subtree (or multiple subtrees depending on how you look at it). Not sure if there are other reasons.
(edited)
# 2025-03-19 19:21 sourabhniyogi: Did you get "full" erasure coding decoding in the JAM Toaster yet (or some serious fraction of it)? I've been bothered by doing all this networking just to get a measly 12 bytes and was wondering if instead of W\_G=4104 pages we'd like something slightly bigger, 64K or 128K instead (thus getting 16x-32x as much). Then the 2048 could get 16x or 32x as much memory, but of course 16x-32x as much bandwidth usage. The idea would be that you can get your CoreVM loaded with fewer segments and network calls, but maybe someone will want a CoreVM to have not 8.4MB of memory but 134MB or 268MB instead. Not sure if W\_G = 4104 is chosen to be compatible with some RISC-V related considerations?
(edited)
# 2025-03-19 19:25 dave: W_G=4104 for 4k pages for eg CoreVM. Bigger page sizes are not great, as it means way more read/write amplification in the worst case. 4k is already really too big for a number of use cases, we just can't really go smaller as 4k is the min page size on modern HW
# 2025-03-19 19:28 dave: Please note that the 12 bytes is _per segment shard requested_. You should absolutely be batching these requests so that in the ideal case you make ~340 network requests for all of the segment shards you need for a WP. This is still a lot of requests of course. In the full network protocol we may add a fast path for requesting the original data directly from guarantors.
# 2025-03-19 19:32 dave: 12MB is a limit on the number of pages read/written _per WP_. It is _not_ the limit on the CoreVM memory. This can't really be made larger as it is constrained by bandwidth. Remember that all this data needs to be read/written from/to the DA system for every core every slot
# 2025-03-19 19:53 sourabhniyogi: Understood about the batching of
segmentIndex
in JAMNP taking us from 12 bytes to some multiple of that. My mental model is that the majority of segments will come from one primary work package and occasionally a few segments come in from a few other "foreign" WPs, but I understand that's just one use case; my mental model is formed by what I imagine a lazy CoreVM programmer would do (if you tell him it's like normal programs).
By "read/write amplification" I believe you mean the idea that we load up all these pages (yes, in a single WP) from Segments DA and only a tiny % of the pages are updated and only a tiny fraction of each page at that, right? So it looks like a tradeoff between that and the number of networking calls, even with batched
segmentIndex
. Thanks for explaining!
# 2025-03-20 10:10 dave: By read/write amplification I mean even if you want to read or write just 1 byte you still have to import/export an entire page/segment; you end up reading/writing 4000 times as much. Of course that is the worst case, usually it will not be quite as bad
# 2025-03-20 03:43 sourabhniyogi: What is the expected strategy to have the CoreVM service identify which pages in the child VM have been written to (say, marked with a "dirty" bit) so that those pages (and only those pages) are exported to Segment DA by the parent VM?
# 2025-03-20 08:47 gav: Yes, CoreVM will support this via the inner PVM host API. We don't yet have that host call API in place but I believe it's on Jan Bujak 's TODO list.
(edited)
# 2025-03-20 08:56 gav: One additional thing to note for segment reconstruction is that actually fetching each 12 byte piece from a unique set of 342 nodes and reconstructing is the very worst case. There are four better cases: that some or all of the segments are provided to the guarantors by the builder (because it could be the same builder or builder-network whose package created them in the first place). In this case they could have the proof-pages or just be content indexed and passed as extrinsics. Or the guarantor could already have the pages in their own cache because it was they who guaranteed the exporting package. Or the guarantor could fetch them from the exporting guarantor as whole segments. Or that the guarantor batches them because they were exported by the same package as other segments which are also needed for import. And in the worst case, the system wouldn't _break_; it would just mean that packages which exposed this worst-case behaviour would take potentially a little longer to make their way through the pipeline.
(edited)
# 2025-03-20 10:26 danicuki: The encoding for newly created Work Result fields (C.23) does not specify integer sizes for xu, xi, xx, xz, xe. Are they all 1-byte values?
# 2025-03-20 16:32 prematurata: I have a couple of questions/remarks about accumulation. Specifically how jam should handle dependencies when accumulating a service. Let's consider the following scenario:
- WorkReport A contains 2 results: service 1 and service 2
- WorkReport B has a dependency on A
((WB)_x)_p = ((WA)_s)_h
and contains one result of service 3.
So
W!
contains \[1, 2\], lets assume
WQ
is empty... But
W*
contains also 3 so
W* = [1, 2, 3]
.
Now
∆∗
which is called "parallelized accumulation", is being called with W\* on 12.21 (after ∆+)
https://graypaper.fluffylabs.dev/#/68eaa1f/17b50317c103?v=0.6.4 .
question 1: But since there is no ordering enforced (that i can see) when executing
∆∗
, then 3 might be executed before its dependencies.
question 2: let's say I am wrong in the previous question. Let's say Service 1 writes to its storage a key that Service 3 needs to read (via the read hostcall). Service 3 does not seem to receive the updated storage when
∆1
gets called with s=3
I guess both questions/remarks are wrong, and that Service 3 indeed needs to be executed after 1 and 2 (which can be parallelized) and should be accumulated using the updated service accounts from the 1,2 accumulation, but I can't see where this is enforced in the GP.
(edited)
# 2025-03-20 17:08 jan: Indeed, as Gav said, we will most likely add a way for the parent VM to fetch this information in an efficient manner. It's on my TODO list; we just don't want to spec it before implementing it to make sure the performance is good.
# 2025-03-20 18:34 sourabhniyogi: That's totally great. A
page-manifest
host function call reporting both (A) which pages MUST be exported because they were written to (a "dirty" bit per page) and (B) which pages MUST be imported because they were read from (an "accessed" bit per page) appears necessary for programmers to not have to think much about how their child "guest program" uses memory, and the parent CoreVM service will need this manifest to poke and peek pages so the programmer doesn't have to.
But since guest programs will vary widely in terms of how many "accessed" and "dirty" bits they get before they hit some refine gas limit OR the W\_M=3072 (12MB) limit on imported+exported segments which applies across ALL the VMs, parent and child, it seems the builder should be able to run with the whole 2^32 VMs always fully loaded ... and
then use this bookkeeping to dump work packages out right before it runs out of gas or hits the W_M limit, to effectively break up long-running computations into work packages that the guarantors can legitimately handle in accordance with the JAM spec. We would thus want "builder"-mode refining (having tens of thousands, maybe the 1MM, pages) and "guarantor"-mode refining (W\_M) working on just the subset the builder wants. Is that the way to think about designing this host function? How do performance considerations figure in?
(edited)
# 2025-03-20 18:42 gav: 1. WIs for services 1, 2 and 3 would be executed in the same batch. From the perspective of each they would be executed under the same prior state. If #3 inspected the state of #2 it would see its prior state; if #2 inspected #3, it would also see #3's prior state.
2. No. #3 would not (necessarily) see the storage change.
(edited)
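A minimal sketch (hypothetical types and names) of the batch semantics gav describes: every work item in a batch accumulates against the same prior state, and the per-service deltas are merged afterwards, so no service observes a sibling's in-batch writes.

```python
def accumulate_one(prior_state: dict, service_id: int, item: str) -> dict:
    # Hypothetical service logic: write one key derived from our own id.
    # Reads only `prior_state`, never a sibling's in-batch delta.
    return {("storage", service_id): item}

def accumulate_batch(prior_state: dict, items) -> dict:
    """Run every (service_id, work_item) against the SAME prior state,
    then merge the resulting deltas into the new state."""
    deltas = [accumulate_one(prior_state, sid, item) for sid, item in items]
    merged = dict(prior_state)
    for delta in deltas:
        merged.update(delta)  # assumes services touch disjoint keys
    return merged

# Services 1 and 3 in the same batch: 3 never sees 1's write.
state = accumulate_batch({}, [(1, "a"), (3, "c")])
```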
# 2025-03-20 18:43 gav: The thing to understand is that JAM services are very much designed to be only asynchronously interactive, at least at the accumulation stage.
# 2025-03-20 18:44 gav: The prerequisite functionality is there to ensure that a package doesn't get accumulated before another package is known to be accumulatable.
# 2025-03-20 18:44 gav: It is not there to force a total ordering over its constituent work items; and certainly not over multiple services.
(edited)
# 2025-03-20 18:45 gav: This would create a potentially troublesome pattern and may over-extend the queue system and reduce potential parallelisability for accumulate.
(edited)
# 2025-03-20 18:45 gav: A total ordering is possible within a single service; by creating dependent packages, you can be certain that certain WIs will not be accumulated in batches before others. You may end up with a dependency in the same batch, but then that's up to the service code to apply the appropriate ordering.
(edited)
# 2025-03-20 18:47 gav: If you're looking to synchronise between services, then you'll need to use
transfer
s.
(edited)
# 2025-03-20 18:49 gav: Transfers can be combined with co-scheduling at the Refine stage (e.g. sharing the same WP) and it becomes possible to create causal entanglement between the WIs of multiple services which can be enforced at the accumulation stage, so the entangled effects only get integrated into state when both sides are known to be completable.
(edited)
# 2025-03-20 18:53 gav: Accumulation is designed with a view to becoming parallelisable. At some later revision of JAM we may e.g. increase the number of cores to 682 with a requirement that a single service cannot regularly have more accumulation gas than is possible with 341 of them. We can't squeeze more gas into the synchronous pipeline, so this would be made viable through CPU parallelism executing multiple service's simultaneously.
# 2025-03-20 18:54 gav: This model breaks more as work-items between services become orderable and synchronous dependencies - chains of execution - start to become the norm.
# 2025-03-20 18:54 gav: So it's something I'm really trying to avoid with this design.
# 2025-03-20 18:56 gav: In short, cross-service execution dependencies force synchroneity and are evil and the death of scalability. We want to keep them off-chain.
(edited)
# 2025-03-20 19:03 prematurata: perfect gavin. Thanks for this indepth explanation. it was very much needed for me... I was trying to make both the "dependency system"/prerequisite and parallelism work together.
# 2025-03-21 14:34 tvvkk7: Hello, I'm implementing PVM invocations, but I'm curious about how we get the actual service code, or in other words the standard program codes.
Take
on-transfer invocation for example: we pass the service codeHash as an argument. But the service codeHash is a 32-octet value. How do we get the program code from the service codeHash?
# 2025-03-21 14:54 gav: There’s a function, big Lambda. This should define how to derive the hash preimage.
# 2025-03-21 22:19 celadari: Hi guys, small questions:
- Is ε(tₐ, tᵦ, tₜ, tᵧ, tₘ, tₒ, tᵢ) used in the definition of the host call function info Ωᵢ the same encoding as **𝐚 ∼ 𝓔₈(𝐚ᵦ, 𝐚ᵧ, 𝐚ₘ, 𝐚ₒ) ∼ 𝓔₄(𝐚ᵢ)**?
- Perhaps there's something I don't see, but in equation A.43 we define **u = ρ − max(ρ′, 0)**.
If ρ′ is negative, then **u = ρ**, which means the gas doesn't change.
So the service would have run code, but the gas stays the same — is that correct?
If so, is that the intended behavior?
# 2025-03-22 04:56 clearloop: may I ask if this is part of the new PVM tests or the accumulation STF? I see everybody is talking about this, however we haven't run into it yet 😅
(edited)
# 2025-03-22 09:02 celadari: Sorry, actually were you asking about the first question or the second one?
# 2025-03-22 09:06 clearloop: sry I'm just curious about the host call part; I'm now updating our PVM tests but I don't see tests with host calls, so I assume it belongs to the accumulation STF?
# 2025-03-22 05:55 gav: > <@celadari:matrix.org> Hi guys, small questions:
> - Is ε(tₐ, tᵦ, tₜ, tᵧ, tₘ, tₒ, tᵢ) used in the definition of the host call function info Ωᵢ the same encoding as **𝐚 ∼ 𝓔₈(𝐚ᵦ, 𝐚ᵧ, 𝐚ₘ, 𝐚ₒ) ∼ 𝓔₄(𝐚ᵢ)**?
> - Perhaps there's something I don't see, but in equation A.43 we define **u = ρ − max(ρ′, 0)**.
> If ρ′ is negative, then **u = ρ**, which means the gas doesn't change.
> So the service would have run code, but the gas stays the same — is that correct?
> If so, is that the intended behavior?
On the second point, no. u is gas used. By ensuring the second term (gas counter) is never negative we just ensure that the gas used is never greater than the gas limit.
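gav's point about A.43 in one line: clamping the (possibly negative) post-execution gas counter at zero means gas used can equal, but never exceed, the limit.

```python
def gas_used(rho: int, rho_prime: int) -> int:
    # u = rho - max(rho', 0): rho is the gas limit, rho' the post-execution
    # gas counter, which goes negative on an underrun.
    return rho - max(rho_prime, 0)

normal = gas_used(1000, 250)    # 750 gas consumed out of the limit
underrun = gas_used(1000, -40)  # counter negative: full limit counted as used
```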
# 2025-03-22 07:40 celadari: Oh thank you, I hadn't understood that u was the gas used, makes sense
# 2025-03-22 05:58 gav: On the first point, no. The former encoding uses the variable size numeric encodings.
# 2025-03-22 11:21 greywolve: Is the explicit encoding of the
tuple in 5.6 redundant since that's going to just remain an octet sequence after the outer encoding? Or is there something special I'm missing? (i.e is it pretty much the same as the regular serialization in the appendix only the work report replaced with the hashed work report instead)
(edited)
# 2025-03-22 11:56 gav: > <@greywolve:matrix.org> Is the explicit encoding of the
tuple in 5.6 redundant since that's going to just remain an octet sequence after the outer encoding? Or is there something special I'm missing? (i.e is it pretty much the same as the regular serialization in the appendix only the work report replaced with the hashed work report instead)
It is done this way to avoid having to send all guarantees with the header. Merkle proofs can be provided for those which are sent on other channels.
# 2025-03-22 11:58 gav: > <@celadari:matrix.org> Question regarding this:
>
> i'' was removed in the definition of Psi_H in version 6.4 of the GP so my question => ¿ do we advance the counter after ecalli instruction or not when we exit ecalli for the host call ?
>
> Looking at this line looks like we don't advance it
https://graypaper.fluffylabs.dev/#/68eaa1f/246700247200?v=0.6.4
>
> but then it conflicts with idea of using i' (not using i'') during Psi_H call (
https://graypaper.fluffylabs.dev/#/68eaa1f/2b9d012b9d01?v=0.6.4) where it would mean that i' is advanced after ecalli instruction for the host call ?
>
> Thanks in advance for the clarification 🙏
ecalli is no different to other instructions regarding i’: i’ still represents the instruction immediately following and as per the definition of PsiH, we advance to it once the host call is resolved.
# 2025-03-22 13:10 subotic: Ahh, now I understand it. The program counter always advances as per (A.7) and additionally, in the case of ecalli, epsilon is h × v_x instead of ▸ (continue). Thanks!
# 2025-03-24 09:56 dakkk: gav: what is the rationale of having core and service statistics into the protocol? While validators' statistics are useful as explained in the GP, there are no information of the usefulness of core and service statistics, and I'm unable to figure it out by myself
# 2025-03-24 10:10 emilkietzman: Secondary markets of Agile Coretime like RegionX or Lastic - You could check Core utilizations in different projects and sell unused Coretime
# 2025-03-24 23:32 celadari: Question regarding host call function Omega_J (n = reject): if we apply it to an account a of index s => is it supposed to eliminate from the database only the components a_c, a_b, a_g, a_m, a_o, a_i (represented by C(255, s) in the trie), or a_l, a_p, a_s as well?
# 2025-03-25 14:55 jay_ztc: Is the ordering of the preimages extrinsic fully defined in the GP? 12.35 specifies that the preimages extrinsic should be ordered by the (account, preimage) tuples, but doesn't go into further detail. Looking at the tuple & sequence notation sections in section 3, there's not a default tuple ordering defined there either.
https://graypaper.fluffylabs.dev/#/68eaa1f/181001181001?v=0.6.4 (edited)
# 2025-03-25 16:31 sourabhniyogi: With v0.5's PVM 64-bit change (and an irreversible commitment to not supporting 32-bit PVM), is it reasonable to adjust
https://graypaper.fluffylabs.dev/#/68eaa1f/2c15002c1500?v=0.6.4 to have a memory address space beyond 4GB? If not, what are the technical reasons for continuing with this 32-bit layout?
I believe we should extend the "A" here to include "accessed" (i) and "dirty" (e) bits so as to map into what pages must be imported and what pages must be exported, thus treating CoreVM OSes specially. Or at least get a convention on how CoreVM services use the 4104-4096=8 bytes to keep the page number and these additional metadata bits. I understand the JAM protocol may not wish to impose constraints on what these additional 8 bytes contain (though I believe it makes sense to have these i+e bits to support OSes running on JAM), but a question nevertheless worth asking is: are there additional per-segment metadata bit candidates we should consider when designing CoreVM + Coreplay services?
(edited)
# 2025-03-25 20:31 gav: No idea what you're talking about. Please reframe it in specific terms of Omega_J.
# 2025-03-25 20:33 gav: We can't actually handle 4 GB of allocations in terms of gas. Realistically most programs will mostly execute with only 1MB actually accessible (maybe sometimes with 16MB, but only very rarely with more).
(edited)
# 2025-03-25 20:33 gav: Gas will be scaled depending on how much memory you're accessing. It'll become impractical many orders of magnitude below 32-bit.
# 2025-03-25 20:36 jan: > If not, what are the technical reasons for continuing with this 32-bit layout?
Speed, as it makes sandboxing cheaper, and as Gav said you won't be able to use this much memory in practice anyway, so it's pointless to have a 64-bit address space.
# 2025-03-25 20:36 gav: And, for security (auditing) validators will need to be able to execute several refinements concurrently, probably around 10; we'll also need 2 guarantor refinements. If they all used, say, 4 GB of RAM, then validators would need to have 48GB of RAM free before we even start thinking about the state DB and various caches.
# 2025-03-25 20:37 gav: That would probably push minimum requirements to beyond 64 GB per node, which is too much.
# 2025-03-25 20:37 gav: And, in any case, there's no sensible on-chain use-case which would need 64-bit addressability.
# 2025-03-25 21:03 sourabhniyogi: I read David Emett 's comment of "12MB is a limit on the number of pages read/written per WP. It is not the limit on the CoreVM memory." in the following way:
- A CoreVM service user actually really does have a 4GB virtual computer. However, in any given work package, spanning say a few seconds of computation only a small number of pages are accessed (imported) or written to (exported).
- Only W\_M (3072 as of 0.6.4) pages = 12MB of this 4GB virtual computer is realistic to set up with JAM's PVM but it is only _tiny fraction_ of the larger addressable subset of the CoreVM service users's 4GB virtual computer.
- So one work package might access ABC (12MB) the next might access DEF (a different 12MB), the next EFG etc, none of which exceed W\_M _individually_ but in totality across multiple work packages exceed 12MB. In this way, you could totally want much more than 4GB.
- When a builder submits a work package to a guarantor, JAM is basically acting as an audit protocol of what happens in an up-to-W\_M (12MB-sized) sliver of memory of what the larger 4GB computer did. JAM is optimized for trustless OS services.
In previous decades there was a story of "640K ought to be enough for anybody" and maybe "I think there is a world market for about five computers" -- these days 4-8B people all have 4-8GB phones in their pocket and so perhaps the trustless supercomputing equivalent is "12MB/4GB ought to be enough for everyone" and "there is a world market for about 5 trustless supercomputers" =). Could we imagine that all 4-8B people collectively get their Shared World Computer in a 64-bit way so they may all coreplay together, even though any individual work package only references a tiny sliver?
(edited)
# 2025-03-25 21:14 gav: I appreciate the ambition, but the limits are there for a reason.
# 2025-03-25 21:14 gav: We've explained the reasoning. Live with it or come up with a better protocol yourself.
# 2025-03-25 21:16 gav: There's the business of dreaming and the business of building. This channel is for the latter.
# 2025-03-25 21:54 gav: You can answer your own question if you simply phrase it in terms of Omega_J.
# 2025-03-25 21:56 gav: Omega_J has the effect of removing a particular item (d) from the accounts dictionary (delta).
# 2025-03-26 15:25 celadari: Thanks again for your time and answer.
Just to give a bit more context on why I was confused:
I hadn't realized that the condition on d_i = 2 was actually implying that the **storage**, **preimages**, etc. for that account had to be _already empty_, meaning the account must have gone through forget (Omega_F) and write (Omega_W) before being eligible for deletion.
Initially, I thought we were supposed to manually remove these fields by directly deleting down from the partial trie key like C(s, E4(2^32−1)), which would have worked fine for **preImageLookupP** and **storage**, but not for **preImageLookupL** (because of E(l)), and that one had me pulling my hair out 😅
All good now and thanks again :)
(edited)
# 2025-03-25 21:57 gav: And if you see the various data concerning accounts which makes up the state trie (from which the trie root may be derived), they're all defined through the contents of the accounts dictionary delta.
# 2025-03-25 21:58 gav: Therefore if the dictionary no longer contains a particular account, then state trie items such as the the stored data, preimages, &c which were concerning said account will no longer be in place.
# 2025-03-26 07:36 tvvkk7: Hello gav, I'm implementing accumulation invocation. The initializer function I requires eta'_0. Should eta'_0 be input through U? Although it is only needed in Psi_A.
# 2025-03-26 09:21 shawntabrizi: As I recall, compact numbers in JAM are different than they are in SCALE and Polkadot today. Can someone write a small description of the new compact number format? ❤️
# 2025-03-26 09:31 xlchen: I gave my code to ChatGPT and this is the description from it:
This encoding converts an unsigned integer into a sequence of bytes using just as many bytes as needed. Here’s how it works in simple terms:
1. Zero Handling:
If the integer is 0, it simply outputs one byte with the value 0.
2. Determining Byte Count:
For nonzero numbers, the encoder figures out how many extra bytes are needed by finding the smallest number l (from 0 to 7) such that the value fits within 7 \times (l+1) bits. If none of these work, it defaults to using 8 bytes in total.
3. Control Byte Creation:
The first byte (control byte) combines a prefix that indicates how many extra bytes follow and part of the number itself. The prefix is calculated as:
\text{prefix} = 256 - (1 \ll (8 - l))
The control byte is then formed by adding this prefix to the most significant bits of the number.
4. Appending Remaining Bytes:
After the control byte, the remaining bytes (if any) represent the lower parts of the number, each taking an 8-bit chunk.
In summary, the format starts with a control byte that tells you how many additional bytes there are and includes part of the data, followed by the extra bytes that complete the full representation of the integer.
# 2025-03-26 09:24 jan: One prefix byte plus payload. The prefix byte determines the length through the number of 1s before the first 0. The unused bits in the first byte are used for the payload. The first byte always contains the most significant bits. The rest of the bytes are always written in little-endian order. Can encode at most 64-bit numbers.
At most 7bit - 0xxxxxxx
At most 14bit - 10xxxxxx xxxxxxxx
At most 21bit - 110xxxxx xxxxxxxx xxxxxxxx
At most 28bit - 1110xxxx xxxxxxxx xxxxxxxx xxxxxxxx
At most 35bit - 11110xxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
At most 42bit - 111110xx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
At most 49bit - 1111110x xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
At most 56bit - 11111110 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
At most 64bit - 11111111 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
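A sketch of that scheme in Python (hypothetical function names; this follows the prefix rules jan lists above, with the prefix formula `256 - (1 << (8 - l))` from the earlier description):

```python
def encode_compact(x: int) -> bytes:
    """Encode a natural number (< 2**64) with the prefix-byte format."""
    assert 0 <= x < 2**64
    if x == 0:
        return b"\x00"
    for l in range(8):
        if x < 1 << (7 * (l + 1)):
            prefix = 256 - (1 << (8 - l))          # l leading 1-bits then a 0
            first = prefix + (x >> (8 * l))         # high bits ride in byte 0
            low = x & ((1 << (8 * l)) - 1)
            return bytes([first]) + low.to_bytes(l, "little")
    return b"\xff" + x.to_bytes(8, "little")        # full 64-bit case

def decode_compact(data: bytes) -> tuple[int, int]:
    """Decode one number; returns (value, bytes consumed)."""
    first = data[0]
    l = 0
    while l < 8 and first & (0x80 >> l):            # count leading 1-bits
        l += 1
    if l == 8:                                      # 0xff: 8 payload bytes
        return int.from_bytes(data[1:9], "little"), 9
    high = first & ((1 << (7 - l)) - 1)             # payload bits in byte 0
    low = int.from_bytes(data[1:1 + l], "little")
    return (high << (8 * l)) | low, l + 1
```

For example, 127 fits in one byte (`0x7f`) while 128 needs the two-byte `10xxxxxx` form (`0x80 0x80`).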
# 2025-03-26 09:24 knight1205: is there any specification for work package builders, yet?
# 2025-03-26 09:33 jan: Not sure I understand your question. That program contains only static jumps and it doesn't require a jump table.
# 2025-03-26 09:35 clearloop: oh I got you, so there could be problems in my djump implementation, I'm referencing the jump table in all jump instructions
# 2025-03-26 09:36 jan: The only dynamic jump in that program is the jump that goes to the hardcoded special address which halts the program; no other jumps there use the jump table.
# 2025-03-26 09:37 jan: Non-dynamic jumps definitely shouldn't do anything with the jump table.
# 2025-03-26 09:37 clearloop: I may need to fix more problems in my code, since I can pass all test_instr_* with my current implementation 😂
# 2025-03-26 09:43 gav: There may be conventions, published APIs and/or SDKs to help create package builders.
# 2025-03-26 09:44 gav: But the JAM protocol doesn't prescribe any means of building packages any more than it prescribes how you should create your service logic.
(edited)
# 2025-03-26 09:45 gav: In terms of data logistics, ideally builders will connect to the JAM network via inbuilt nodes (light or full, depending on the use-case and circumstances) and use internal node APIs to inject new packages. Nodes would then identify the right guarantors and send the package to them.
# 2025-03-26 09:48 gav: However, in the early days, we'll probably see RPCs being used by builder executables to deliver packages to nodes on testnets. It's definitely not something I'd want to see in production, but running & synching a full-node to insert a single work-package is plainly suboptimal and we don't have any light-clients yet.
# 2025-03-26 09:50 gav: That's not a question the GP (or I) can answer. How you get eta' into the appropriate place in your code for it to be able to calculate the initial machine state is entirely an implementation detail.
(edited)
# 2025-03-26 09:50 xlchen: are validators expected to open connections with any builder nodes? obviously limits needs to be enforced on validators side but that can still be a DoS vector?
# 2025-03-26 09:51 gav: And even if it were somehow constrained, validators have no idea who they are.
# 2025-03-26 09:51 gav: Whereas validator IDs are well-known to anyone with an up to date JAM state.
# 2025-03-26 09:52 gav: Strategically, validators need to find work-packages they can guarantee in order to make rewards. But they must balance this with the possibility of being DoSed/attacked.
(edited)
# 2025-03-26 09:53 gav: So there will be some need for implementors to create guarantor strategies which balance these two opposing forces. There's not really a right answer here and it's the sort of thing which should be discussed at JAM0.
# 2025-03-26 09:54 gav: Realistically I'd expect validators to have several dozen nodes connected, other than fellow validators.
# 2025-03-26 09:55 xlchen: I see. something to be figured out. worst case we just have tx pool and some package gossip protocol + peer reputation. ie something like what we have today
# 2025-03-26 09:55 gav: And to actively churn through nodes, keeping ones who tend to give them good packages.
# 2025-03-26 09:56 gav: Connection could come with a promise to give packages adhering to a set of authorizers. Failure to provide a package on a core with one of those authorizers in the pool could result in booting.
# 2025-03-26 09:57 gav: Obviously if bad packages are received, then this would also result in booting.
# 2025-03-26 10:12 xlchen: so for a builder to be able to consistently deliver work packages, it needs to work with all sorts of services. It certainly needs a pool and some way to collect the packages. this is a big chunk of work
# 2025-03-26 10:15 gav: Builders will almost certainly be private enterprises and specialised to a particular service or service-type.
# 2025-03-26 10:16 gav: E.g. for parachains, it could be that every parachain will have its own builder network (aka collator network). Though with the Omninode, we'll probably see generic parachain builder networks.
(edited)
# 2025-03-26 10:16 gav: But still, they'll only build for the one Parachains service.
# 2025-03-26 10:17 gav: It will be up to the builders to convince guarantors on cores which their packages are capable of running that they can furnish them with packages.
# 2025-03-26 10:18 gav: Thankfully this need not be done blindly; IsAuthorized is designed to run independently and cheaply.
# 2025-03-26 10:19 gav: And once IsAuthorized executes successfully, the guarantor knows that the builder can reasonably supply a package worth refining/guaranteeing.
# 2025-03-26 10:35 knight1205: so the strategy to build connection and accept work packages will be fixed for each implementation, for consistency, or will there be different strategies? If fixed, will that be provided in JAM-NP?
# 2025-03-26 10:36 gav: JAM-SNP already contains network messages for provision/sharing of work-packages (and preimages)
# 2025-03-26 10:37 knight1205: but that is just protocol for connection setup. will there be any strategy/requirements for acceptance or just we have to validate author hash or work package on our own and then perform computations?
# 2025-03-26 10:38 gav: > So there will be some need for implementors to create guarantor strategies which balance these two opposing forces. There's not really a right answer here and it's the sort of thing which should be discussed at JAM0
# 2025-03-26 10:39 gav: (For the purposes of M2 conformance testing we'll have idealised connections and implementations will not need to concern themselves with the possibility of DoS.)
# 2025-03-26 10:41 gav: As per the security audit (M5), implementations will need to demonstrate a resilience against DoS, including attacks by peers. But of course, over-conservative nodes which sacrifice too many rewards may find that fewer validators are willing to run them.
(edited)
# 2025-03-26 10:43 gav: Again, no right answers, and I do expect (and hope for!) some differences between node strategies, but our implementor conferences are meant for brainstorming and sharing insights into such things.
# 2025-03-26 11:01 dave: SNP currently allows builders to identify themselves at connection time by adding /builder to the protocol advertised during ALPN, see
https://github.com/zdave-parity/jam-np/blob/main/simple.md#alpn. To some extent how validators treat these connections is a strategy thing and it isn't necessary for all implementations to behave the same. A reasonable strategy might be to grant a peer connecting with the /builder suffix a special builder connection slot (subject to availability), but require the peer to submit a valid work-package within a few seconds after connecting in order to keep the slot and not lose reputation.
# 2025-03-28 08:38 gav: s\_{bold c} is the preimage (s\_{regular c} is the hash) - It is defined in (9.4), and doesn't rely on the preimage lookup function.
(edited)
# 2025-03-28 08:58 gav: Yes it's a moot point; I'm happy to take a PR which simplifies it, though I'm not sure if that's necessarily easy.
# 2025-03-30 08:21 celadari: Hi everyone,
I have a question regarding the program metadata introduced in GP 6.3.
If an extrinsic E_P includes a pre_image that does not conform to the expected encoding Epsilon(double_arrow Epsilon(a_m), a_c) (as specified here:
https://graypaper.fluffylabs.dev/#/68eaa1f/106c01107101?v=0.6.4):
Should we:
- Consider the entire block invalid?
OR
- *Accept the block*, and allow the service lookup dictionaries to include these entries, with the understanding that invocations of Psi_A, Psi_R, Psi_I, Psi_T for the service of this pre_image would simply fail (by failing I mean that invocations panic and thus don't change state)?
(edited)
# 2025-03-30 11:55 gav: The preimage is determined solely by data encoded as per the GP specification.
# 2025-03-30 11:56 gav: Either it is requested or it is not. If it is not, then the block is invalid. There’s no room for guesswork here.
# 2025-03-30 14:12 celadari: Let me use an example to explain my question more clearly:
Suppose we receive an incoming block with some extrinsics, among which are E_P extrinsics (preimages). Let's assume one of these preimages is for service s and is encoded as a vertical-double-array p. From the first byte, we determine the length of p, extract the corresponding bytes, and treat that as the preimage.
This preimage p is expected to represent an Epsilon(double_arrow Epsilon(a_m), a_c) structure (https://graypaper.fluffylabs.dev/#/68eaa1f/106c01107101?v=0.6.4).
Now, say p = [129, 2, 4, 5, 6]. Interpreting this: the metadata slice is supposed to be of length 129, starting right after the first byte. But the total array doesn't even contain 129 elements, so this is clearly an incorrectly encoded preimage.
My question is:
- Should we **reject the entire block** due to this malformed preimage?
OR
- Should we **accept the block**, include the badly-formed preimage in the lookup for service s, and simply let Psi_A panic at execution time (and thus not update anything) for this service s?
# 2025-03-30 14:50 gav: Preimage extrinsic (E_P) is a sequence of pairs (service index with blob).
# 2025-03-30 14:52 gav: Each entry in the preimage extrinsic must be a valid request as per the prior state.
# 2025-03-30 14:53 gav: If you are still confused, I suggest you rephrase your query in those terms.
# 2025-03-30 14:54 gav: I'm not sure what you're really asking, but if the question is "if I receive a block which doesn't correctly decode but from which I could make a best guess at imagining some underlying meaning, should I import it as though it was really an encoding of this best guess?" then the answer is OF COURSE NOT!
(edited)
# 2025-03-30 14:55 gav: Again, this is a consensus protocol. There is no room for error or guesswork.
# 2025-03-30 17:31 dave: If I understand correctly, what you're asking is: if a preimage is requested which will be used as the code blob for a service, must the preimage be a valid "code blob" for it to be includable in a block and integrated into the service storage? Pretty sure the answer to this is no: as long as a preimage has been requested then it can be included in a block. If a service requests a preimage that cannot be decoded or used as a code blob for whatever reason, then attempts to use it as such will fail at the point of use. I'm not sure it can really work any other way as there is no type/format/whatever associated with preimages in the state; they are opaque binary blobs.
# 2025-03-30 18:26 rustybot: Codec vectors only exercise the codec. Payload is mostly just random data
(edited)
# 2025-03-30 18:28 celadari: I agree but since we are talking about encoding I wasn't sure to what extent we were supposed to verify or not.
But thanks anyway ✌️
# 2025-03-30 18:30 gav: If there was a need to verify, then it would state as much in the Gray Paper. It doesn't.
# 2025-03-31 15:39 ascriv: Has anyone yet taken a serious look at if size-synchrony antagonism has been formalized mathematically? Would be nice to have further validation e.g. that what we’re doing is somewhat optimal
# 2025-03-31 15:55 gav: AI answer:
> One relevant concept is "complexity theory," which posits that as systems grow in size and complexity, the potential for disorder and misalignment among components increases. This can lead to difficulties in achieving coherence. Larger systems may have more diverse elements, which can result in varying goals, behaviors, and interactions that can disrupt overall coherence.
> Another related idea is "Ashby's Law of Requisite Variety," which states that for a system to effectively manage its environment, it must be as diverse as the environment it operates in. In larger systems, the variety of components and interactions can lead to challenges in maintaining coherence unless there are effective mechanisms for integration and coordination.
# 2025-03-31 15:56 gav: However the strict trilemma of Scale, Speed and Coherence doesn't seem to be established.
(edited)
# 2025-03-31 15:57 gav: It seems to me, at least, quite demonstrable given real systems have causality bound by speed and component-distances.
# 2025-03-31 15:58 jay_ztc: CAP comes to mind, sort of a cousin principle if you will
# 2025-03-31 16:00 gav: Coherence -> degree of causality across all pairwise pieces of system state
Speed -> bound as the time that it would take light to effect a causal resolution across the two most distant causally entangled parts of state
Size -> Given some maximal density of system state, the maximum distance between causally entangled state-components of the system
(edited)
# 2025-03-31 16:00 gav: It seems pretty trivial to show that if you increase any of these you must reduce one or both of the others.
# 2025-03-31 16:02 gav: So if you make a system bigger (add more state components and therefore make things farther apart) you either need to accept causal resolution will be slower because at least some portions of state are farther apart (and light only travels at a certain speed) or you have to limit what parts are causally entangled, limiting distances travelled for resolution and thus reduce coherence.
(edited)
# 2025-03-31 16:03 gav: CAP is somewhat related, but it's binary (select any two). It also doesn't deal with size or speed but only "correctness" properties.
(edited)
# 2025-03-31 16:04 gav: But yes, it is a related trilemma/antagonism applicable to (distributed) systems.
# 2025-03-31 16:54 ascriv: > <@gav:polkadot.io> So if you make a system bigger (add more state components and therefore make things farther apart) you either need to accept causal resolution will be slower or you have to limit what parts are causally entangled and thus reduce coherence.
distance = rate * time in some sense?
# 2025-03-31 17:00 ascriv: Distance ~ size
Rate ~ speed
Time ~ coherence
So roughly size = speed*coherence
(edited)
# 2025-03-31 17:21 emielsebastiaan: If a single global coherent state is the design goal (which it is) you can play/design around different types of decoherence. E.g. spatial decoherence (shards), temporal decoherence (ordered accumulation / asynchrony). You can allow for certain types of decoherence and still have a fully coherent global state sufficiently often to allow for the emergent abstraction of the Cloud layer.
(edited)
# 2025-04-01 11:07 gav: you can make a system go fast, go big, or stay fully coherent but not all of them.
# 2025-04-01 11:10 gav: so, if you keep a system small (in order to keep it fast and coherent we might presume), you'll not be able to decentralise nor will you be able to scale out.
(edited)
# 2025-04-01 11:11 gav: i'd argue that by introducing such decoherence you do not have a fully coherent state.
# 2025-04-01 11:13 gav: however there may be ways to make the system apparently coherent, or dynamically rebalance the speed and/or coherence in order to optimise all three at any given time.
# 2025-04-01 11:35 ascriv: > <@gav:polkadot.io> or speed
size coherence = 1
Or size*coherence ~ speed, with more speed of info travel you can get bigger or more coherent, no?
# 2025-04-01 11:37 gav: there's the physical limit of speed (speed of light), and the overall speed of the system (one over time to causal resolution)
# 2025-04-01 11:38 gav: it probably isn't sensible to call both things "speed".
(edited)
# 2025-04-01 11:41 gav: the speed of light determines the upper limit of causal resolution - no system could ever process causal interactions faster than this. but it doesn't account for keeping a complex and arbitrary system in coherence. as coherent systems become bigger and more complex, the speed of their overall causality diverges from this universal physical limit.
(edited)
# 2025-04-01 11:45 gav: at a basic level, as a coherent system grows, even if all of its internal causality happened at the speed of light, it would still take longer to step through its state transitions because it would take light longer to get from the corners of the system to interact and resolve.
# 2025-04-01 11:46 gav: so the system - in terms of state transitions per second - would be slower.
# 2025-04-01 11:47 gav: this is compounded by complexity, meaning that internal causal entanglements probably resolve slower as the system grows more complex and arbitrary.
# 2025-04-01 11:48 gav: of course our systems have a long way to go before the speed of light becomes too important. but still, the principle can serve us well now.
# 2025-04-01 11:58 gav: basically T = ZX/C
where:
- T is the time to causal resolution (s; the inverse of the system's operating speed),
- Z is the size of the system (m; the diameter of the system's bounding sphere),
- X is the complexity factor of the system (no units, but a factor of at least 1 which describes the number of times light must travel back across the diameter of the bounding sphere in order to guarantee a causal resolution of state-transition),
- C is the speed of light
(edited)
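A back-of-the-envelope illustration of T = ZX/C (illustrative numbers only, not protocol parameters): a fully coherent Earth-spanning system with a trivial complexity factor of 1 is bounded at roughly 23 causally-resolved state transitions per second.

```python
C = 299_792_458.0  # speed of light, m/s

def causal_resolution_time(Z: float, X: float) -> float:
    """T = Z*X/C: lower bound on the time for one coherent state
    transition of a system with bounding-sphere diameter Z (metres)
    and complexity factor X (>= 1)."""
    return Z * X / C

# Earth's diameter as the bounding sphere, idealised complexity X = 1.
T = causal_resolution_time(1.2742e7, 1.0)
print(f"T = {T * 1000:.1f} ms, at most {1 / T:.1f} transitions/s")
```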
# 2025-04-01 12:06 gav: a totally trivial system would be a single laser switch in a vacuum with one light emitter transmitting a light signal to some light receiver. in this case Z would be the distance between the emitter and receiver, X would be close to one and T would therefore amount to the time it took light to travel between them.
# 2025-04-01 12:07 gav: as we introduce the capability of data processing, X increases since the round trip of light is much higher as it passes through more gates and is routed around; and as we introduce state (whether intra-transition or inter-transition), Z increases as we need to cover a greater space to hold more information (also a fundamental physical principle as well as intuitively correct).
(edited)
# 2025-04-01 12:27 ascriv: That seems like a good model. as a cool aside maximum info scales with the surface area (not volume) of the bounding sphere, given by the bekenstein bound which black holes are believed to saturate
(edited)
# 2025-04-01 12:59 emielsebastiaan: My team and I have put some thought into this. I’ll try to digest it into something presentable for our little Lisbon meetup.
# 2025-04-01 18:09 dakkk: > <@boymaas:matrix.org> Have we considered, as a thought experiment,
https://en.wikipedia.org/wiki/Quantum_entanglement as a means to get instant coherence in distributed systems? Going beyond the speed of light ... 😃
You can't communicate any information faster than the speed of light; quantum entanglement doesn't do that
# 2025-04-01 18:13 boymaas: Too bad, reading it now indeed, would have been an interesting case.
# 2025-04-01 18:42 jay_ztc: is sbrk here to stay? I noticed its being used in the accumulate testvectors.
# 2025-04-02 08:26 xlchen: minimum. you are not wrong if retained for one more day, but not the case otherwise
# 2025-04-04 16:16 yuchun: Hey there,
I have a question regarding the available work-reports.
The **W** available work-reports (defined in equation (11.16)) are extracted from rhoDagger using the core index. As I understand it, each core should correspond to only one work-report, is that correct?
However, I'm a bit confused about equation (13.10). It sums the work-reports for a specific core from the set of available work-reports. Does this imply that the same core might appear multiple times in the available work-reports?
Please feel free to let me know if I've misunderstood anything.
Thanks
# 2025-04-04 18:49 gav: (In the case of 13.10, sum was simply to ensure that we get zero if the core is empty)
# 2025-04-08 20:08 prematurata: I have a question: technically speaking, is there something preventing the same service from being executed multiple times in the same block?
(edited)
# 2025-04-08 20:17 gav: And it can happen due to queuing and earlier work tranches using up all their allotted gas.
# 2025-04-14 17:06 charliewinston14: Maybe I missed it, but is there anything in the GP that mentions how to validate a justification received via CE 137? I can generate one using the trace function but am not sure how a receiving node verifies the shard is correct using it.
# 2025-04-14 17:12 dave: In general the GP doesn't say how to implement anything, it just defines the required behaviour. The justification is a Merkle proof and can be verified in the usual way. There are lots of blogposts explaining the concept that you can find by googling "merkle proof"
# 2025-04-15 14:28 charliewinston14: Ok that helped, I have the general idea now. One question to everyone: given each value in a justification, how do you know if it represents a left or right node, to be able to calculate the right hash?
# 2025-04-15 15:15 dave: You can determine the "path" (series of lefts/rights) from the shard index
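A generic sketch of what dave describes, assuming a simple power-of-two binary tree with SHA-256 (the GP's justifications use its own well-balanced Merkle functions and hash, so treat this purely as an illustration of deriving the left/right path from the index bits):

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def verify_proof(leaf: bytes, proof: list[bytes], index: int, root: bytes) -> bool:
    """Walk from leaf to root. At each level, the next bit of `index`
    says whether our node is the left (0) or right (1) child, and thus
    which side the sibling hash is concatenated on."""
    node = h(leaf)
    for sibling in proof:  # siblings ordered bottom-up
        node = h(sibling + node) if index & 1 else h(node + sibling)
        index >>= 1
    return node == root
```

So the verifier never needs left/right flags in the justification itself: the shard index fully determines the path.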
# 2025-04-15 15:13 greywolve: I think some variant of this question has been asked but I didn't see a clear answer.
Assuming I want to submit a ticket extrinsic that will be included in the first block of a new epoch. At that point I have the last state of the previous epoch.
1. Do I use eta\_1 from the old state for the ring signature context? Given this will become eta\_2 in the next epoch when the ticket will be verified.
2. What about the y\_z, I only have access to the old y\_z. I don't see a way to compute the new one yet. Do I just sign with the old y\_z (and y_k) and hope it doesn't change?
3. When importing this first block in the epoch, would I use the prior y\_z or new y\_z to verify tickets?
6.29 seems to indicate the prior. (edited)
# 2025-04-16 07:52 gav: 1. Yes.
2. I think this is a typo (@davide - wdyt?) - it should read γ'z, not γz. I'll change this.
3. You'll need to compute it - you have all the information necessary from the prior state.
(edited)
# 2025-04-18 15:05 jay_ztc: Just to confirm my understanding- the PVM entry pc is NOT required to be the start of a basic block, correct? Jan Bujak
# 2025-04-18 15:11 jan: Currently, yes (because we need the ability to resume execution after a page fault).
(edited)
# 2025-04-18 15:17 jimboj21: I would think the non-zero cases would have to be jumps right?
# 2025-04-18 15:20 jan: I suppose technically the outer PVM's entry points could end up in the middle of a basic block if you'd build a particularly cursed blob.
# 2025-04-18 15:21 jay_ztc: No worries, thanks for clarifying. It sounds like it's acceptable for an outer entrypoint to be into the middle of a basic block. My apologies for @ing after hours.
# 2025-04-18 15:23 jan: I have considered disallowing such entry points in the past (basically allow only start-of-basic-block entry points and those needed for hostcall + page fault resumption), but we haven't made such change to the GP yet.
# 2025-04-18 15:24 jay_ztc: That's how I approached it in my reasoning as well. Do you think this change is likely? No worries if it's too early to tell - just curious. Might end up tabling this in my impl.
(edited)
# 2025-04-18 15:30 jan: Can't give you a 100% answer at this point, but it is possible. In general the plan for the final gas cost model is to charge gas only at the beginning of the basic blocks (because charging per instruction is very inefficient), so now if you allow entry points anywhere this potentially complicates the gas metering implementation. Unfortunately disallowing them doesn't necessarily alleviate the problem because you still need to allow entry after page faults, and since memory accesses are so common we don't want to make memory access instructions into basic block terminators.
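The per-block charging idea jan mentions can be sketched as follows; the block structure and per-instruction costs here are invented for illustration and are not the GP's gas model:

```python
def charge_blocks(blocks, gas):
    """blocks: a list of basic blocks, each a list of per-instruction gas costs.
    Charge each block's total once on entry instead of decrementing per
    instruction. Returns (remaining_gas, completed)."""
    for costs in blocks:
        total = sum(costs)  # in a real implementation this sum is precomputed
        if gas < total:
            return gas, False  # out of gas at block entry; block doesn't run
        gas -= total
    return gas, True

# Two blocks costing 3 and 3 octets of gas: 10 gas is plenty.
assert charge_blocks([[1, 2], [3]], 10) == (4, True)
```

The complication described above is visible here: an entry point landing mid-block would need a partial-block cost, which is exactly what charging only at block boundaries tries to avoid.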
# 2025-04-18 15:35 jay_ztc: Thanks for clarifying, this is really helpful. I don't have enough context yet to form an opinion on the tradeoffs between allowing inner-instances being able to resume right after pageFault vs requiring them to restart/rollback to the start of a basic block.
(edited)
# 2025-04-18 15:37 jan: Although we need some limitation when it comes to entry points, because without any restrictions you could e.g. jump into the middle of an instruction, and depending on the particular bytes used that might actually be a valid instruction. Supporting this in practice would be a nightmare in any implementation that isn't a naive slow interpreter, so at the very least jumping into the middle of instructions is something that we definitely do not want to support.
# 2025-04-18 15:38 jay_ztc: *context as far as the programs/usage of programs on top of JAM
(edited)
# 2025-04-18 15:39 jan: Unfortunately it's impossible to require a restart at the start of a basic block as that'd screw up the program state.
# 2025-04-18 15:39 jan: The already executed part of the basic block might have modified memory or registers in a way that is irreversible.
# 2025-04-18 15:40 jan: So a rollback is not possible without taking a snapshot at the start of every basic block which might page fault, and we do not want that as it'd be abysmally slow.
# 2025-04-18 15:40 jay_ztc: the outer pvm instance would have access to gas host function and could potentially use its own memory as a backup before any unsafe calls right? Although, this gets way complicated fast
# 2025-04-18 15:42 jay_ztc: But at the least, the outer pvm could do gas check to determine if it wants to 'risk' another inner invocation right? Although being turing complete makes this pretty hard- unless there's some well-defined gas-estimate API contract between nested invocations
# 2025-04-18 15:43 jan: That's infeasible; if you could do that you'd become a very rich person as then it'd mean you solved the halting problem. :P
# 2025-04-18 15:44 jay_ztc: good point lol, suppose that api idea would be practically useless due to such a small scope of applicability...
# 2025-04-18 15:46 jay_ztc: This discussion has provided a lot of clarity, and given me a few things to think about... Many thanks for your time Jan Bujak 🙏
# 2025-04-18 17:07 jimboj21: Given the return signature: When 12.17 is called shouldn't the assignment order be o*, t*, b*, u* ?
# 2025-04-19 16:18 ycc3741: I just want to confirm something about the STF. Should the disputes transition (updating ψ′) happen before Safrole? Because based on what I see in this link, updating gamma_k' requires ψ′_o.
# 2025-04-19 17:23 gav: > <@jimboj21:matrix.org> Given the return signature: When 12.17 is called shouldnt the assignment order be o*, t*, b*, u* ?
Yes. I think this is already fixed in main.
# 2025-04-19 17:26 gav: Yes pretty much. Order is technically an implementation detail - some languages don’t have the concept of ordering - so I’m not going to tell you any order per se, but your reading is correct - key rotation is dependent on disputes.
(edited)
# 2025-04-20 14:21 gav: Yes your reading is correct. It is in order to get as much information as possible in statistics, not just reported stuff (which represents the most recent computation work done on cores by guarantors) but also what data - required by each core - has recently been made available through assurers
# 2025-04-21 21:25 ascriv: I think the GP is not clear on how to serialize elements of N? And therefore, e.g. state serialization (D.2) is not clear on how to serialize the validator statistics component C, which has components in N
# 2025-04-21 21:53 ascriv: Makes sense, wanted to make sure we’re intentionally using the general natural serialization here instead of accidentally not specifying the subscript. Since usually we only use the general one for the length discriminator
# 2025-04-21 21:55 ascriv: It’s just not very consistent. Also in C(13) we are clear to use E_4 for pi_V and pi_L which have components in bold N
# 2025-04-21 21:55 danicuki: > <@ascriv:matrix.org> Makes sense, wanted to make sure we’re intentionally using the general natural serialization here instead of accidentally not specifying the subscript. Since usually we only use the general one for the length discriminator
Yes. It is used for the first time in latest version of GP
# 2025-04-21 21:56 danicuki: It is used to save storage space, as the statistics numbers can grow indefinitely (or not).
(edited)
# 2025-04-21 21:58 ascriv: But couldn’t pi_V and _L also? Yet we implicitly limit them based on the usage of E_4 serializing them
# 2025-04-22 11:25 gav: Yes, the encoding of integers is not presently entirely uniform. Some are encoded for size savings (statistics, where there's relatively a lot of data in a place where bandwidth is very tight), others for the ability to swiftly/efficiently decode (e.g. in PVM I/O). It will be reviewed during the 0.6 series under this issue
https://github.com/gavofyork/graypaper/issues/293.
# 2025-04-22 11:26 gav: The encoding used for pi\_V and pi\_L might yet be changed, or possibly pi\_C/pi\_S.
(edited)
# 2025-04-22 11:29 gav: It's mostly corrections, with two small protocol alterations:
- There's now a gas limit in the accumulation operand tuple.
- There's a new host-call to allow services to provide preimages to other services directly without going through the regular off-chain preimage process.
# 2025-04-22 12:34 jay_ztc: Got a few small questions about PVM edge cases, to confirm my understanding->
1. If c\[0\] isn't a valid opcode, this would result in a panic, _if and only if_ the program attempts to execute c\[0\], correct?
2. If an instruction (this time corresponding to an index from the bitmask, k) contains an invalid opcode, this would result in a panic, _if and only if_ the program attempts to execute it, correct?
3. if the skip length function doesn't find any bitmask-marked opcodes within the subsequent 24 octets, this bumps the pc to current+24. So if there happened to be a valid opcode & args at that new pc, it would continue executing at the new pc in the same manner as if it were marked by the instruction bitmask, correct?
Jan Bujak
(edited)
# 2025-04-22 13:00 jan: Yes, currently "invalid" instructions don't make the program invalid, and only have an effect if executed (they're effectively treated as a trap). Yes, if there's no 1 found in the bitmask then the skip is assumed to be 24. Note that IIRC currently even if the next instruction has 0 in its opcode bitmask bit (i.e. the next bit after the last bit that the bitmask scan checks) the instruction will also be executed, but this is not intended behavior and it is on my TODO list to add to the GP that every instruction must have a 1 in the bitmask to be considered valid (otherwise we'll run into some nasty corner cases).
(edited)
# 2025-04-22 13:04 jay_ztc: Many thanks for the quick response 🙏, your insight on the TODO is much appreciated. enjoy your evening.
# 2025-04-22 13:08 jan: Also, while we're at the topic parsing, (this is going to be relevant to people aiming for M3 and M4) in case you're wondering why the limit is 24 - it was deliberately picked to allow for fast parsing. To parse a PVM instruction at a given position you need to read only two values from memory: a 128-bit integer from the instructions slice, and a 32-bit integer from the bitmask slice, and then you can easily parse it in an efficient manner with bitshifts etc. (for the bitmask one can use the "leading zeros" intrinsic/method which is a single assembly instruction on modern CPUs to cheaply get the skip to get to the next instruction, hence the maximum is 24)
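The skip computation jan describes might look like the sketch below (hedged: the intrinsic he mentions counts zeros in a register holding the bitmask window to find the next set bit; here arbitrary-precision integers stand in for the 32-bit read, and the exact bitmask semantics are defined by the GP, not by this code):

```python
def skip(bitmask, pc):
    """bitmask: an int whose bit i is set iff an instruction starts at octet i.
    Returns the number of argument octets after pc, i.e. the distance from
    pc+1 to the next marked octet, capped at 24 as in the GP."""
    window = (bitmask >> (pc + 1)) & 0xFFFFFFFF  # the 32-bit bitmask read
    if window == 0:
        return 24
    # (window & -window) isolates the lowest set bit; its position is the
    # count of trailing zeros, i.e. the offset of the next instruction.
    trailing_zeros = (window & -window).bit_length() - 1
    return min(trailing_zeros, 24)

# Instructions at octets 0 and 3: the opcode at 0 has two argument octets.
assert skip(0b1001, 0) == 2
```

With a 128-bit read of the instruction bytes alongside this, a whole instruction can be decoded from just two memory loads, which is the point of the 24-octet cap.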
# 2025-04-22 13:10 jay_ztc: corner case of the corner case might be a 'greater than 24 octet' break in instructions after a valid basic block termination instruction-> (ie is the subsequent 'valid' instruction also a 'valid' jump target, or is the jump target the 'invalid' 24th octet)
edit: 'program termination' -> 'basic block' (brain fart)
(edited)
# 2025-04-22 13:12 jan: IIRC the jumps actually check that the previous instruction has the 1 set in its bitmask (because you can only jump either to offset=0 or after block terminators)
# 2025-04-22 13:14 jan: So with the current way things are you might get a paradoxical block terminator that would execute when you arrive at it from the previous instruction, but the next instruction wouldn't be a valid target for a jump.
(edited)
# 2025-04-22 13:14 jan: (again, I want to require all instructions to have 1 set in its bitmask to prevent such potential corner cases)
# 2025-04-22 13:20 jay_ztc: jumps target indexes in w (basic block index set), which is calculated by applying the skip-length function to each opcode index in the k bitmask\*-> so this would result in a jump target of the 24th octet (our 'invalid' instruction).
edit: *additionally, k always includes offset=0 (c[0])
(edited)
# 2025-04-22 13:30 jay_ztc: actually, looks like both branch & jump use the 'beginning of basic blocks' collection, w (which itself uses the skip-length).
# 2025-04-22 13:37 jay_ztc: Didn't intend for this chat to continue into your off-hours, my apologies. Don't worry about looking at this tonight especially given its a non-urgent corner case.
# 2025-04-22 14:32 knight1205: Here:
https://graypaper.fluffylabs.dev/#/68eaa1f/1bcb011b1702?v=0.6.4
In Eqn 14.14 and 14.15 we are using s (now b), which is defined as all the segments exported by all work packages,
and then we are calculating its merkle root and comparing it with the segment root of some previously exported segments of any work package.
Is this what it implies? As segments exported from a work package lead to different segment roots, in that case how can it be the same as the segment root of all the segments (from all wps)?
(edited)
# 2025-04-23 16:50 dave: In S and J, M(s) = L(r) means s is the sequence of segments with the specified root
# 2025-04-23 16:53 knight1205: but it is specified as all the segments exported by all the work packages exporting a segment. is this right? or this?
> s is the sequence of segments with the specified root
# 2025-04-23 16:54 knight1205:
Note that while S and J are both formulated using the
inner term b (all segments exported by all work-packages
exporting a segment to be imported)
# 2025-04-23 16:59 dave: Hmm not sure about the wording of that sentence. The point there is that if a work-package A is importing a single segment exported by work-package B, you do not need to fetch _all_ of the segments exported by B
# 2025-04-23 17:00 dave: You only need to fetch the segment you care about, plus the corresponding proof segment
# 2025-04-23 17:00 knight1205: got it. thanks for the clarification. Though I am still not sure about the wordings.
# 2025-04-23 19:42 mkchung: For CE140 justification: j++[b]++T(s,i,H), what exactly does the s in T(s, i, H) denote? Is it really the raw exported segment shards for the entire work-package at a given shard index? Without hashing the s term, the co-path in T(s, i, H) would grow linearly as the number of exports increases & each DA would need to store its sibling's entire "raw exported segment shards" as part of the co-path proof?
# 2025-04-23 19:44 mkchung: Let's say there are 100 exported segments for a given workpackage, can you provide an estimated justification size for requesting just one segment? I'm estimating the justification to be somewhere around 66~410435 byte in tiny setting(where validator=6, W_P=1026) and around 297~3866 bytes in full setting(where validator=1023, W_P=6). Does my estimate look reasonable to you?
# 2025-04-23 20:33 jaymansfield: "s is the full sequence of segment shards with the given shard index."
# 2025-04-23 20:33 jaymansfield: The CE140 justification basically allows you to first calculate the segment root for a given segment index, which is then used to validate the erasure root
(edited)
# 2025-04-23 22:02 mkchung: That's for the recipient/consumer of a CE140 justification. But the burden of producing such a proof seems to fall on DA?
If you were the "generator" of a CE140 justification (i.e. someone requested the justification from you), wouldn't you also need to fetch the raw "s" from your siblings just to provide a co-path to the segment shard that you are responsible for storing?
# 2025-04-23 22:03 mkchung: I think I'm confused as to why "S" is not being hashed before building the segments root
# 2025-04-24 16:52 jaymansfield: No you only need to know what was returned from CE137 originally. You are providing a path to a root for a list of shards with an index, not for all shards of a given segment. Refer back to s\_clubs in your availability specifier.
(edited)
# 2025-04-28 15:46 dave: Sorry, missed these messages earlier. s is the full sequence of segment shards with the given shard index, yes. These are all provided in CE 137 (the [Segment Shard]), you should not need to request data from another node to handle a CE 140 request.
(edited)
# 2025-04-28 15:47 dave: FWIW the segments root is _not_ involved here, the justifications returned by CE 140 are justifications from the returned segment shards to the erasure root
# 2025-05-09 10:35 knight1205: Hi David Emett ,
Sorry to disturb you again.
Here, in the S function we are fetching all the segments whose root / wp hash is present in imports. Is this function returning segments grouped by segment root, or simply a list of segments?
Because the current definition implies it's just a list of segments, however in the Refine invocation we are passing a list of lists of segments here.
If the latter is true, then I think the equation itself should be changed for more clarity.
Please correct me if I understand this wrong.
# 2025-05-09 11:09 dave: Bold b is the complete list of segments with the given root, or exported by the work-package with the given hash
# 2025-05-09 11:32 knight1205: that I got, but after that we are picking the nth segment from b and building a list, whereas in psi_R a list of lists of segments is expected.
# 2025-05-09 12:05 dave: Ah I see what you mean now, sorry. The refine call for each work item has access to the imports for _all_ work items. So the list-of-list-of-segments is [S(w) for w in p_w]
# 2025-05-09 12:06 dave: I believe there is an error in the definition of I(p,j); this says S(w) is passed in rather than the above
# 2025-04-22 18:33 sourabhniyogi: We have found our development life has improved with the "metadata" attached to preimages, specifically when the preimage is for service code, because we can then have tools show that metadata as strings like "fib" and "gameoflife". Can we get the same for workpackages+bundles so that we can attach metadata like "fib(93)", "gameoflife(314)", or should that be in the payload as a mere convention?
# 2025-04-22 19:56 sourabhniyogi: Another quick request: Would you mind assigning provide a number like 17 or 27, and perhaps starting historical\_lookup at 32 or 64 or 128? That would leave a bit of space between the two big groups of host functions for "one more".
(edited)
# 2025-04-23 13:24 jay_ztc: Jan Bujak: I found this very helpful post of yours in a thread from last September. Do you think holding off on implementing sbrk is still relevant advice? Do you have any new thoughts since this message was posted that you can share?
# 2025-04-23 13:27 jay_ztc: reposting below for context::
https://matrix.to/#/!ddsEwXlCWnreEGuqXZ:polkadot.io/$\_RkIlMDNZrROw\_6WDXpbllO2VSbjY1FNTIfDjVZhhdw?via=polkadot.io&via=matrix.org&via=parity.io
Memory allocation/deallocation handling is still a work-in-progress, and it's possible the sbrk instruction will get modified and/or removed. I'd suggest you temporarily skip it and focus on other parts of JAM and/or PVM.
If you're interested in some history as to why sbrk is there then let me give you some background.
Historically I designed PolkaVM (on which the PVM in the GP is based) to be a VM which is as "powerful" as WASM VMs (so it can completely replace our current WASM-based executor in Polkadot 1.0 and our WASM-based smart contracts VM) while being as simple as possible to implement, and without sacrificing any performance.
So this is where the idea for the sbrk came from (which is similar to what WASM has): the VM maintains a heap pointer, and the guest program can use sbrk to query that pointer and/or to bump it up. And every time it crosses a page boundary the VM allocates new memory for the program.
So this design has numerous benefits. First, it's very simple to use as a guest program (pseudo code):
// Get a pointer to the new allocation.
let pointer = sbrk(0);
// Actually allocate it.
if sbrk(size) != 0 {
    // Allocation succeeded.
    // Now `pointer` points to `size` bytes you can use.
}
This is also great for use cases like e.g. tiny smart contracts which can use this directly as an allocator without having to bring a heavyweight allocator of their own (which would consume a lot of space).
Secondly, it's simple to implement in the VM, something like that (pseudo code again):
fn sbrk(size) -> Pointer {
    if size == 0 {
        // The guest wants to know the current heap pointer.
        return current_heap_pointer;
    }
    // The guest wants to allocate.
    let new_heap_pointer = current_heap_pointer + size;
    if new_heap_pointer > max_heap_pointer {
        // Allocation failed.
        return 0;
    }
    let next_page_boundary = align_to_page_size(current_heap_pointer);
    if new_heap_pointer > next_page_boundary {
        allocate_new_pages(next_page_boundary..align_to_page_size(new_heap_pointer));
    }
    current_heap_pointer += size;
    return current_heap_pointer;
}
And this (along with the memory map I came up with, which is what we now call "standard program initialization") also makes it very easy to write an interpreter for this, because when handling loads/stores from memory you only have to do something like this:
fn load_value32(address) -> value {
    if address >= stack_address && address + 4 <= stack_address_end {
        return stack[address - stack_address];
    } else if address >= rw_data_address && address + 4 <= align_to_page_size(current_heap_pointer) {
        return rw_data[address - rw_data_address];
    } else if address >= ro_data_address && address + 4 <= ro_data_address_end {
        return ro_data[address - ro_data_address];
    } else {
        // Address is inaccessible.
        return Err;
    }
}
It's cheap, fast, and doesn't require any crazy data structures and doesn't require any handling of corner cases (for example, accesses which could read both from the stack and from RW data don't have to be handled, because they're impossible by definition; the interpreter can just keep them in separate arrays, and call it a day).
So that's how (and why) it was originally designed, but then came JAM and changed things. (: (Again, remember, I started working on this before JAM, and some things were just grandfathered into JAM.)
What JAM introduces is a concept of inner VMs (see machine, peek, poke, invoke and expunge host functions in section B.8 of the GP) where one VM can spawn another VM, and as it is currently designed those inner VMs are extremely flexible and have completely free-form memories and are dynamically paged.
What this essentially means is that all of those nice properties of sbrk that I've listed - simple and easy to implement, fast, doesn't require fancy data structures - they all now go out of the window!
So we will probably be replacing sbrk with something else that's more appropriate for the more flexible inner VM model. And unfortunately also most likely orders of magnitude harder to implement (at least if you want to reach at least the half-speed milestone), but it is what it is. I'm still finishing some other stuff up, but I'll most likely be working on this soon-ish. (If any of you have any good and/or crazy ideas feel free to message me!)
(edited)
# 2025-04-23 15:22 celadari: Hello,
I have a question regarding the eject host function, specifically about the condition described here:
https://graypaper.fluffylabs.dev/#/68eaa1f/328e03329103?v=0.6.4.
I want to confirm my understanding:
Does it mean that in order for a caller s to successfully eject account d, the codeHash of account d must be equal to the (hash) service index of the caller s ?
Thanks in advance for the clarification!
(edited)
# 2025-04-24 13:47 haikoschol: sourabhniyogi: was that reaction a "yes" or a "i'm wondering about that too" or a "yay, equations!!1!"? 🤔
# 2025-04-24 15:12 haikoschol: KwickBit - Charles-Edouard LADARI: it's not the hash though, it's the (32 octet) encoding of the service index, right?
# 2025-04-24 15:14 celadari: By 32 octets, do you mean we add 0s before the last 4 octets of the service index? 🤔
# 2025-04-24 15:45 haikoschol: it won't incidentally turn out to be the hash of some code, so the code hash field needs to have been set to this value with upgrade beforehand, I reckon
# 2025-04-23 18:58 charliewinston14: Hello. Two questions about availability assurances.
1. Should these be sent out every slot even if the bitfield didn’t change from the previous distribution?
2. For an assurer to say a core is available in their assurance bitfield, does that mean they have access to just the shards matching their validator index, or are they saying they have access to ALL shards for a given core?
(edited)
# 2025-04-23 19:02 sourabhniyogi: 1. Yes, it's critical that they are, since assurances are anchored to the immediate parent header hash. The only exception is if the entire bitfield is zero, in which case I don't believe there is a point to submitting an assurance unless some reward exists for liveness when all cores are idle (which would be sad ... but possible!)
2. Just their shards matching their validator index. If they had access to ALL the shards, that would be too much in the large! But of course they can do whatever they want, and they might be one of the guarantors.
(edited)
# 2025-04-23 19:04 vinsystems: Hi, question about the accumulate transfer = 11 function.
When Service A sends an amount X to Service B, a deferred transfer (i.e. s: A, d: B, a: X, m: ..., g: ...) is created and the amount to be sent is subtracted from Service A's balance.
But I cannot find in the GP where it says to update the balance of the receiver (service B).
I thought that it was done when all the deferred effects of the transfers are applied, but the on_transfer functions only modify the subject's account storage.
(edited)
# 2025-04-24 04:40 jan: We are going to remove the sbrk instruction and replace it with a hostcall soon.
# 2025-04-24 17:28 ascriv: How is the serialization of the storage dictionary (a_s) of a service account in (D.2) not lossy? The key (I thought) is not the hash of the value unlike the preimage lookup. I’m definitely missing something or we are losing the last 4 bytes of the keys
(edited)
# 2025-04-24 17:53 prasad-kumkar: I think the point is we don’t need to store the full a_s keys, they’re only used to look up values when the key is already known
# 2025-04-24 18:30 erin: hello all, I've created hosted archives of the JAM and graypaper chats with plaintext versions also available at
https://paritytech.github.io/matrix-archiver/
there are a few quality of life improvements still to be done but any feedback or comments are welcome if you find this useful. This is also now linked on the jamcha.in site.
These are updated daily at ~3am UTC.
Other JAM-related channels are welcome to be archived - they need to be unencrypted and world-readable (history available to "Anyone"). Please open an issue here with the internal room ID if you wish to archive a channel.
(edited)
# 2025-04-24 18:41 ascriv: > <@prasad-kumkar:matrix.org> I think the point is we don’t need to store the full a_s keys, they’re only used to look up values when the key is already known
I don’t completely follow, should we just be comparing first 28 bytes when doing lookups? What about when two keys only differ in the last 4 bytes?
# 2025-04-24 18:46 tomusdrw: erin: would you consider adding <a name="{element-msg-id}"></a> to the <time> in html mode? That would allow linking to a specific message, which I think would be pretty cool.
(edited)
# 2025-04-24 18:47 erin: others may include "jump to bottom" and perhaps pagination, though i kinda like the big raw log style (feedback again welcome here).
# 2025-04-24 18:54 tomusdrw: afaict it's 220kB gzipped for 1y+ worth of content. My guess is that adding JS pagination to this with a modern framework would start to pay off only after 2 more years :D
(edited)
# 2025-04-24 18:55 erin: it's just static html generated by a python script at the moment
# 2025-04-25 13:49 sourabhniyogi: Cool, if we had discord rooms with content, should we put them in this repo so it's not a matrix-archiver but a gp-archive?
# 2025-04-25 16:04 erin: there should be an automated script to grab the content daily unless they're dead/archived channels. the script/rendering right now is very matrix-specific. it would be best to have a matrix bridge or something regardless
# 2025-04-24 18:54 ascriv: > <@dakkk:matrix.org> keys are hash, a collision is very unlikely
So state serialization being lossless isn’t important, it just needs to be extremely unlikely to collide
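That intuition can be checked with a back-of-the-envelope union bound (a rough estimate, not anything specified in the GP): truncating 32-octet hash keys to 28 octets leaves 224 bits, so among n uniformly random keys the probability of any collision is at most C(n,2)/2^224.

```python
from math import comb

def collision_bound(n, bits):
    """Union-bound estimate of any collision among n uniform `bits`-bit keys."""
    return comb(n, 2) / 2 ** bits

# Even a trillion storage keys leave the bound astronomically small.
assert collision_bound(10**12, 224) < 1e-40
```

So losing 4 octets of each key costs essentially nothing as long as the keys really are (outputs of) a cryptographic hash.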
# 2025-04-25 13:02 charliewinston14: Hi, I have a timing question. Segments are kept in the DA for 28 days, but it seems SegmentRootLookupItems in work package bundles must be recent (exist in beta or extrinsic). Does that mean we store segments for 28 days, but they can only be used as import segments for 8 days?
# 2025-04-25 13:03 dave: The segment root lookup stuff is for when you import using a work-package hash
# 2025-04-25 13:03 dave: This is only supported for recent work-packages, for older packages you should use the segment-root directly
# 2025-04-25 13:08 charliewinston14: Thank you. Didn't realize an ImportSpec could be an export root OR a work package hash.
# 2025-04-25 16:02 knight1205: Does anyone know how the shard index is determined from the validator index? Like which particular validator requests which particular shard index, as in CE 137?
Here in the GP it is just mentioned that:
> chunks are distributed to each validator whose keys are together with similarly corresponding chunks for imported, extrinsic and exported segments data, such that
> each validator can justify completeness according to the
> work-report’s erasure-root.
I am not sure what it is trying to imply.
# 2025-04-26 17:40 ascriv: For the function definition (D.4), it seems like k could have (much) fewer than 248 bits, e.g. in the case where there are many elements of the preimage lookup for any service account. Since in this case the keys for all of these in the serialized state will have the same first 8 bytes, so by the time we are computing the leaf value for one of these keys we will have cleaved off at least 64 bits, leaving fewer than 192 bits for k, if I understand correctly
# 2025-04-26 17:40 ascriv: Should we be padding with zeroes in such cases to compute bits(k)…248?
# 2025-04-26 17:47 ascriv: Ah, nevermind. This is why we also have the key in the rhs of the map, so we remember the original non-cleaved key
# 2025-04-28 09:31 gav: The state serialisation for the latter two components is the general (variable size integer) encoding.
(edited)
# 2025-04-28 16:07 prasad-kumkar: Should argument a from the argument invocation function be encoded with p before being passed to the program initialization function, given that it is decoded from p as described in A.37? As noted:
> Given some p which is appropriately encoded together with some argument data a, we can define program code c, registers ω and ram μ through the standard initialization decoder function Y
(edited)
# 2025-04-28 19:14 ascriv: Should we expect to be able to handle bad blocks for M1? Or are all blocks presumed valid?
# 2025-04-28 21:07 xlchen: IMO M1 is just the STF: state in, new state out. There is no concept of a blockchain to be considered
# 2025-04-28 21:10 xlchen: so you should detect if a block is bad (purely using the provided input state), but no need to care about forks because it is stateless
# 2025-04-28 21:09 davxy: > <@ascriv:matrix.org> Should we expect to be able to handle bad blocks for M1? Or are all blocks presumed valid?
Be prepared for bad blocks. A lot of bad blocks :)
# 2025-04-30 02:45 ascriv: I have a question also about state serialization:
For pending reports serialization, is it the case that the segment count in (11.5) should be serialized using the general natural number encoding? In fact, since E is not subscripted in C(10), shouldn’t all naturals in the work report be encoded in the general way?
# 2025-04-30 02:51 ascriv: But, if we’re going with naturals that are subscripted (like gas values) should always be encoded in the non general way (that seems to be convention?), does that mean the gas values in core and service statistics should not be generally encoded?
(edited)
# 2025-04-30 10:01 jimboj21: I see in 12.20 that P is a return value. From what I can see it is not defined here but is returned by the accumulation invokable PVM instance. However it does not appear that the psi_a function signature has been updated to reflect this
# 2025-04-30 12:58 erin: It's now fixed - sorry about that. Looks like the archival process timed out, could have been due to a slow github runner. I've increased the timeout to something quite high now so it should (hopefully) not happen again.
# 2025-04-30 13:15 jay_ztc: <del>Validators won't be able to get the preimage hash itself from the serialized state-> Instead they will need to refer to the block extrinsics right? (for any newly-requested, not yet provided preimage).</del>
What if instead of hashing, we stack the length octets on top of our preimage hash octets? Basically instead of concatenating bytes, we add each of the 4 length octets to each of the first 4 hash octets, and use the result as our state-key?
I'm thinking if we can avoid the hash operation while meeting the same requirements- we can lessen our perf overhead
edit: wording (duplication)
edit: strikethrough not-really-relevant edge case of validator requesting state from another validator for reconstruction
(edited)
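For what it's worth, the proposed scheme can be sketched like this (a hypothetical `proposed_key` in Python, assuming a 32-byte preimage hash and a 4-byte little-endian length). Note how easily two distinct (hash, length) pairs map to the same key:

```python
def proposed_key(preimage_hash: bytes, length: int) -> bytes:
    """Hypothetical state-key construction: add each of the 4 length
    octets to the first 4 hash octets (mod 256) instead of hashing
    the concatenation of hash and length."""
    assert len(preimage_hash) == 32
    len_octets = length.to_bytes(4, "little")
    head = bytes((h + l) % 256 for h, l in zip(preimage_hash[:4], len_octets))
    return head + preimage_hash[4:]

# Two different (hash, length) pairs collide trivially: bump the
# length by one, drop the hash's first octet by one.
h1 = bytes([5, 0, 0, 0]) + bytes(28)
h2 = bytes([4, 0, 0, 0]) + bytes(28)
assert proposed_key(h1, 1) == proposed_key(h2, 2)
```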
# 2025-04-30 13:16 jay_ztc: For context, this was the post from David Emett in Oct 2024:
2024-10-31 22:29 dave: The preimage hash is passed in to eg solicit directly, it's trivial to pass in almost-colliding hashes. As the trie key construction function doesn't preserve all hash bits, these almost-colliding hashes could actually collide in the trie if they are not hashed beforehand
# 2025-04-30 13:23 dave: > <@jay_ztc:matrix.org> Validators won't be able to get the preimage hash itself from the serialized state-> Instead they will need to refer to the block extrinsics right? (for any newly-requested, not yet provided preimage).
>
> What if instead of hashing, we stack the length octets on top of our preimage hash octets? Basically instead of concatenating bytes, we add each of the 4 length octets to each of the first 4 hash octets, and use the result as our state-key?
>
> I'm thinking if we can avoid the hash operation while meeting the same requirements- we can lessen our perf overhead
>
> edit: wording (duplication)
Maybe I'm not understanding but it seems trivial to generate collisions with this scheme. The requested hash and length are almost unconstrained (the length is somewhat constrained by required deposit I think)
# 2025-04-30 14:20 luke_fishman: has anyone passes the latest accumulate test vector
same_code_different_services-1
in the
pre-state
there are 2 services: 1729, 1730
1730 has no "blob" i.e service code
in the
post-state
service 1730 is removed, only service 1729 remains
i am currently clueless as to how that might have happend
- it could not have ejected itself - since it has no code
- as far as i can tell, it is not removed from
d'
in
12.17
how else can it be removed?
davxy could you advise?
(edited)
# 2025-04-30 14:21 jimboj21: Luke | Jamixir: I am also stuck trying to get this working currently
# 2025-04-30 18:04 sourabhniyogi: The new 0.6.5 gas parameter x\_g in the operand is serialized in C.29 not with E\_8 but with C.6, unlike the others that use E\_8 (see C.23, C.26, C.28) -- is this a typo or intended? Why not make them consistent in one direction or the other?
(edited)
# 2025-04-30 19:28 gav: All these encodings will be revisited in due course before 0.7.0 (there’s an issue on the 0.6 milestone page) but until then assume your reading is correct.
# 2025-04-30 20:21 davxy: > <@luke_fishman:matrix.org> has anyone passed the latest accumulate test vector same_code_different_services-1?
> in the pre-state there are 2 services: 1729, 1730
> 1730 has no "blob" i.e. service code
> in the post-state service 1730 is removed, only service 1729 remains
> i am currently clueless as to how that might have happened
> - it could not have ejected itself - since it has no code
> - as far as i can tell, it is not removed from d' in 12.17
> how else can it be removed?
> davxy could you advise?
>
I'll have a look
# 2025-05-01 04:23 clw0908: Some questions about A.36, A.37, and A.43:
In A.43, **p** is the program blob of a service, and **a** is the serialized arguments from refine or accumulate (on-transfer).
So how can **p** include **a** in A.37?
Since A.36 only checks whether there exists a **c**, **o**, **w**, z, s (excluding **a**) that satisfies A.37, does this mean we don't have to care whether **a** exists in **p**?
# 2025-05-01 08:03 gav: Hmm. Looks like bold a getting added into that concatenation is a typo.
# 2025-05-01 08:06 prasad-kumkar: then shall **a** be passed as a function argument to Y?
# 2025-05-01 20:01 ascriv: For instruction 104, doesn’t this actually get the trailing (not leading) zeros as it’s defined? For example if wA is 0x1, B_8 will be 1 followed by 0s, and so the formula will evaluate to 0, when it should be 63
(edited)
# 2025-05-01 20:06 ascriv: We should be using the big endian representation of wA in 104-107 unless I’m mistaken
# 2025-05-01 22:48 ascriv: Made a pr www.github.com/gavofyork/graypaper/pull/357
(edited)
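The point about instruction 104 can be checked directly: scanning the little-endian serialization B_8 from its first bit gives the count of *trailing* zeros of the value, whereas the intended result is the leading-zero count over the big-endian bit string. A quick sketch, assuming 64-bit operands:

```python
def clz64(x: int) -> int:
    """Leading zeros of x viewed as a 64-bit big-endian bit string."""
    return 64 - x.bit_length()

def ctz64(x: int) -> int:
    """Trailing zeros of a 64-bit x: what scanning the little-endian
    serialization from its first bit effectively counts."""
    return (x & -x).bit_length() - 1 if x else 64

assert clz64(0x1) == 63   # what instruction 104 should return
assert ctz64(0x1) == 0    # what the formula as written yields
```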
# 2025-05-02 11:19 gav: segments are 4104 bytes and can be reconstructed by 342 validators each presenting their 12 byte shard (and its index)
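The arithmetic behind those numbers, as a back-of-envelope sketch (assuming the full 1023-validator set and a one-third recovery threshold; the tiny-preset figure below assumes the same ratio is kept):

```python
SEGMENT_SIZE = 4104   # bytes per segment
VALIDATORS = 1023     # full-size validator set
THRESHOLD = 342       # shards needed to reconstruct (just over 1/3)

shard_size = SEGMENT_SIZE // THRESHOLD
assert shard_size == 12                         # each validator holds 12 bytes
assert THRESHOLD * shard_size == SEGMENT_SIZE   # any 342 shards rebuild it

# every validator stores one shard, so total stored data is ~3x the segment:
assert VALIDATORS * shard_size == 12276

# tiny testnet (6 validators): the threshold scales to 2 validators,
# so each shard would be 4104 / 2 = 2052 bytes
assert SEGMENT_SIZE // 2 == 2052
```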
# 2025-05-02 12:29 danicuki: > <@gav:polkadot.io> segments are 4104 bytes and can be reconstructed by 342 validators each presenting their 12 byte
shard (and its index)
Thanks. Then in tiny testnets (6 nodes, 2 cores) each shard has 2052 bytes? So shard size is 4104 / [number of cores]?
# 2025-05-02 15:52 sourabhniyogi: You might find the above useful (if you spot an error, please advise!)
# 2025-05-02 17:21 jaymansfield: Hey all. With having M2 pretty much complete at this point I'm now trying to wrap my head around how to build the required recompiler. Based on my research so far, it looks like I could generate RISC-V machine code, add an ELF header, then use a linker to produce the final ELF file, and finally execute that ELF (somehow) as a background process. Am I heading in the right direction, or am I way off and should just wait for Jan’s talk?
# 2025-05-02 17:23 jan: Nope. My talk next week should contain the bare basics that should let you get started on writing one.
# 2025-05-02 17:25 jaymansfield: Alright I'll hold off for now. I'm no longer going to be able to make it to jam experience so I really hope this is available to watch online afterwards
# 2025-05-02 17:26 jan: I will also share the slides so even if you can't watch the talk they should be useful.
# 2025-05-03 17:09 ascriv: State serialization is lossy, specifically in regards to the keys of the storage dictionary of service accounts. I’m a bit confused why this isn’t important. Are we not expected to be able to fully recover a serialized state in the case that the storage dictionary is non empty? State serialization is lossless everywhere else, so it seems odd.
(edited)
# 2025-05-03 17:47 jaymansfield: I don’t think it’s something really meant to be reversed. Also storage items are not the only thing suffering from this. There are cases where preimage lookups are also not recoverable (if the preimage isn’t known). If it’s just been solicited through a host call and not included in a block yet you wouldn’t know the full hash of it either by looking at the lookup state key
(edited)
# 2025-05-03 20:02 davxy: As I wrote in the GH discussion: storage keys are lossy from the host perspective. When a service wants to read (or write) a storage key, it gives the host the full unhashed key. The host then computes the state key (as per D.2) and returns (or writes) the data. The host doesn't need to keep track of the full key as it doesn't need it
(edited)
# 2025-05-03 20:19 jaymansfield: I understand the process you suggested, but doesn't it go against the GP? It defines the preimage and storage lookup dictionaries as having 32-byte keys with a key format of H(E4(s∗)⌢ µko ⋅⋅⋅+kz). If we were to support state keys instead for these, then since they are not all reversible, the logic in the GP is invalid in a few spots
# 2025-05-03 20:24 ascriv: I think Jason is talking about the definition in the service account (9.3), but I think it’s still correct, however implementations will likely not have an actual struct which maps 32 octet strings to blobs, and instead opt for 31 octet strings (the state key)
# 2025-05-03 20:25 ascriv: But even that implementation would still follow gp, functionally
# 2025-05-03 20:26 davxy: To process our vectors you need to support construction of the state using state keys
# 2025-05-03 22:30 jaymansfield: Ok thank you. Will make the updates to be able to support this
# 2025-05-03 20:31 ascriv: That’s still true; it’s just that to process davxy's test vectors you can’t have an implementation which works with the full storage keys, and must instead work with the state-key versions of the storage keys. GP doesn’t say your internal representation of service accounts must match the one in (9.3)
(edited)
# 2025-05-03 20:32 ascriv: An implementation might have a_s mapping 31-byte strings to blobs and still be faithful to gp
# 2025-05-03 20:33 ascriv: Just need to make sure it serializes correctly, does key existence checks correctly, etc
# 2025-05-03 20:43 davxy: I see your problem. IIUC you are talking about constructing the Dictionaries with the preimage full hash from the raw KV state.
- For the preimage you can. Just hash the value.
- For the preimage lookup dictionary you can only if you already have the corresponding preimage in the KV
- For services you can't recover the unhashed service key
Yeah, the vectors assume that you use the state keys as your dictionary keys
(edited)
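The three bullets above can be restated as a sketch (Python, assuming H is Blake2b-256 as in the GP):

```python
import hashlib

def blake2b_256(data: bytes) -> bytes:
    """The GP's H, assumed here to be Blake2b with a 32-byte digest."""
    return hashlib.blake2b(data, digest_size=32).digest()

# The raw KV state only stores derived state keys, but:
# 1) a preimage's full hash is recoverable by hashing its value...
preimage = b"example service code blob"
full_hash = blake2b_256(preimage)

# 2) ...so the lookup-dictionary key (hash, length) is recoverable
#    exactly when the corresponding preimage is present in the KV.
lookup_key = (full_hash, len(preimage))

# 3) unhashed service/storage keys are NOT recoverable: their state
#    keys are derived one-way, so the vectors index by state key.
```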
# 2025-05-03 20:58 cisco: I'm also really confused by this. I guess the encodings will be revised in 0.7.0 but I would like to know how they are right now if possible
# 2025-05-03 21:04 rustybot: > <@davxy:matrix.org> I see your problem. IIUC you are talking about constructing some Dictionary with the preimage full hash from the raw KV state.
> - For the preimage you can. Just hash the value.
> - For the preimage lookup dictionary you can only if you already have the corresponding preimage in the KV
> - For services you can't recover the unhashed service key
>
> Yeah, the vectors assume that you use the state keys as your dictionary keys
I’d also like to add that, to support warp sync (i.e., syncing from a finalized state rather than from genesis), you'll receive the raw key-value state keys. Not sure if this is already specified by JAM SNP
# 2025-05-03 21:21 ascriv: > <@davxy:matrix.org> I see your problem. IIUC you are talking about constructing some Dictionary with the preimage full hash from the raw KV state.
> - For the preimage you can. Just hash the value.
> - For the preimage lookup dictionary you can only if you already have the corresponding preimage in the KV
> - For services you can't recover the unhashed service key
>
> Yeah, the vectors assume that you use the state keys as your dictionary keys
I think (very) technically speaking, an implementation which relies on state keys for storage does not actually implement the gp, because of the extraordinarily unlikely outcome of a state key collision when checking existence of a key. this is probably not even worth mentioning due to the likelihood but thought it was interesting
# 2025-05-04 09:33 greywolve: In the last point, do you mean for storage keys you can't recover the unhashed storage key? It's only the storage dict that would have state keys? (since we can recover the rest)
(edited)
# 2025-05-04 13:09 ascriv: > <@cisco:parity.io> I'm also really confused by this. I guess the encodings will be revised in 0.7.0 but I would like to know how they are right now if possible
If you look at (C.24) it explains how to serialize work reports which I previously overlooked
# 2025-05-05 09:57 gav: The segment-count field (in the availability specifier set) is encoded as a 2-byte fixed length integer.
# 2025-05-05 10:01 gav: Nowhere in the GP does it tell you (how) to store the state.
(edited)
# 2025-05-05 10:02 gav: And in fact this is done via the Merkle root, which (under reasonable assumptions) identifies a single state. But (obviously) it's lossy in so much as it is impossible to reconstruct a potentially multi-gigabyte state from a 32-byte quantity.
(edited)
# 2025-05-05 10:03 gav: Implementations are free to store state however they choose. GP takes pains not to specify this; it's one of the very many implementation details. GP only specifies how to recognise a valid block.
(edited)
# 2025-05-05 10:06 gav: Of course when we want to provide test vectors for incomplete portions of the protocol, then we may need to make judgement calls on exactly what data an implementation has available. That's a bit unfortunate and may coerce implementations into particular design patterns. Note that __implementations need not be able to pass the test vectors__. They must only pass the conformance tests, which will test in line with the GP not any internal data structures we might assume implementations have within.
# 2025-05-05 10:07 gav: Test vectors are provided only in so much as they might help teams.
# 2025-05-05 10:08 gav: If you're talking about a possible collision of the 32nd byte of a hash (or even the 27th byte), then the GP implicitly assumes it is impossible.
(edited)
# 2025-05-05 10:10 gav: Always remember: GP only specifies behaviour, not mechanism.
(edited)
# 2025-05-05 10:12 gav: The change made to characterise state keys as 31 bytes in the GP should make no practical difference to behaviour; it's there to ensure that W3F's test vectors have something concrete to point at in the GP.
# 2025-05-05 11:21 rustybot: Since GP defines the service storage dictionary key as the full hash, an implementation is free to use the full hash (9.3) as the dictionary key. If this implementation supports warp sync, as outlined in CE129, it receives a sequence of KV items with 31-byte state keys (as per D.1). But now this implementation cannot construct its service dictionary properly (as it uses full hashes), and may fail to process subsequent blocks. This happens because host calls will pass full hashes, but the implementation only has the padded keys it received from another node, causing mismatches. The likelihood of such a mismatch is pretty high.
Doesn't this suggest that implementations aiming to support warp sync need to use 31-byte state keys for their dictionaries?
(edited)
# 2025-05-05 11:41 ascriv: This is my understanding as well, but interested to hear others opinions
(edited)
# 2025-05-05 11:43 ascriv: The last few bytes of the keys are not present in the serialized state, so warp sync and full-storage-key implementations cannot be compatible
(edited)
# 2025-05-05 14:09 gav: Yes, warp-sync is not (yet) within the behavioural description of the GP.
(edited)
# 2025-05-06 00:56 jaymansfield: There might be a typo in I(p,j) where it checks the output size of the refine calls. I'm assuming it should be checking if the size is greater than WR rather than smaller
# 2025-05-05 14:17 gav: There are a few small changes here in line with the 0.6 milestone, but the most impactful is the alteration to
fetch
, which now works across the different invocations and averts the otherwise unavoidable unbounded RAM allocations.
(edited)
# 2025-05-05 14:18 gav:
fetch
also changed to allow on-chain entropy to be inspected in Accumulate/OnTransfer, and, in the future, off-chain entropy to be inspected in Refinement.
# 2025-05-05 14:18 gav: It is now possible to introspect the chain's parameters, allowing for chain-agnostic PVM code.
# 2025-05-07 17:30 sigsigsigsigsig: coming back to jam, struggling to find the black on white version of the updated pdf could anyone help?
# 2025-05-07 17:33 sigsigsigsigsig: will think of you while i'm enjoying reading it in the uk sun 😁 💕
# 2025-05-08 14:32 ascriv: The fifth component of the yield invocation in (B.2) should be a hash but it’s a work package. Assuming that’s a typo?
# 2025-05-09 16:14 ascriv: The third state key constructor in (D.1) only uses the first 27 bytes of h, but each usage of that constructor sends 31 bytes. I think e.g. for the storage dictionary we should use k0…23, and similarly for the next two. This wouldn’t change functionality but as far as I can tell we are sending 4 extraneous bytes in each of them
(edited)
# 2025-05-10 01:15 ascriv: also I saw that we changed the arguments for the argument invocation in the accumulation definition (B.9) to include just the length of operand tuples array, instead of the full array. is this intentional? is it also intentional that we've left off any subscript for E, and thus we use the general natural encoding for t and s?
# 2025-05-11 17:10 gav: > <@ascriv:matrix.org> also I saw that we changed the arguments for the argument invocation in the accumulation definition (B.9) to include just the length of operand tuples array, instead of the full array. is this intentional? is it also intentional that we've left off any subscript for E, and thus we use the general natural encoding for t and s?
All intentional
# 2025-05-11 17:11 gav: Passing the whole vector will become unsound when gas cost inflates with accessible memory. The point of these changes is to support sound service code and thus ensure that baseline gas costs are fixed over all inputs. Memos and work results must now be
fetch
ed as needed.
(edited)
# 2025-05-11 19:08 jan: To expand on the rationale here - we've been doing research/running experiments on how to come up with a secure gas cost model we can use in a permissionless environment like JAM, and it has become increasingly clear that minimizing the amount of memory that is accessible to the programs is important to keep the gas costs of the memory instructions in check. In general, the more memory you have accessible the easier it gets to trigger cache misses, which makes the worst case, well, worse for memory access instructions: potentially up to several hundred times more costly (compared to the average case) if you can access the whole ~4GB worth of address space. So if we can do something while needing less memory, even if that ends up being slightly more computationally inefficient, it will still be a net win considering the gas costs.
# 2025-05-12 19:56 ascriv: For this message, can we assume that (w_s)_l is added for each r in w_r? So if len(w_r) = 2, the b component for R(c) is 2 * (w_s)_l?
# 2025-05-12 21:22 gav: No, this is an oversight. (w_s)_l should only be counted once. Feel free to open an issue.
# 2025-05-13 10:35 greywolve: Also applies to the preimages/preimage lookups, 4 bytes less
# 2025-05-14 14:06 decentration: Are there existing conformance vectors available for validating the entire serialized state, including all chapters, after it has been merkleized into a single state root?
# 2025-05-15 13:19 ascriv: Yes I find they are not up to date with 0.6.6 yet @sourabhniyogi FYI
# 2025-05-15 18:44 dakkk: Reading the GP, a fallthrough instruction is like a nop, right? It does nothing and the program counter just goes to the next instruction?
# 2025-05-15 20:35 jan: The primary function of the
fallthrough
instruction is to allow jumps into code which otherwise you wouldn't be able to jump into, as it starts a new basic block, and only jumps to the beginning of basic blocks are allowed. But otherwise yes, it just goes to the next instruction.
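Jan's explanation can be sketched like this (Python; a toy instruction stream and a hypothetical terminator set, not the GP's actual opcode list):

```python
# Hedged sketch: jumps may only target the start of a basic block, and
# block-terminating instructions end a block, so `fallthrough` creates
# a legal jump target without otherwise changing control flow.
TERMINATORS = {"jump", "branch", "fallthrough", "trap"}

def basic_block_starts(program):
    """Indices that are legal jump targets: index 0, plus every
    instruction immediately following a block terminator."""
    starts = {0}
    for i, op in enumerate(program):
        if op in TERMINATORS and i + 1 < len(program):
            starts.add(i + 1)
    return starts

prog = ["add", "fallthrough", "mul", "add", "jump"]
assert basic_block_starts(prog) == {0, 2}   # the "mul" is now jumpable-to
```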
# 2025-05-16 21:05 greywolve: In the state constructor function for service account state keys, wouldn't a representation like \[i, 0, n0, 0, n1, 0, n2, 0, n3, 0, 0 ... \] make identifying the different types of keys more robust? In that case the second byte would be unique for each of them (regular, service, preimage, preimage lookup, storage).
(note: assuming preimage lookup lengths are > 0 and \< max(uint32) - 1, but it's still unique for the other four at least)
(edited)
# 2025-05-20 09:19 gav: Not sure how you mean "more robust" - could you give an example of it not being robust?
(edited)
# 2025-05-20 20:28 greywolve: Maybe "more robust" was the wrong way to phrase it; more like making a collision impossible, though that's just a nit since right now it's quite unlikely there would ever be a collision.
I just meant that with that representation, service state keys and preimage meta state keys have the second byte always 0, and always > 1 respectively, so they can never collide.
E.g., where x can be any byte:
[255, 0 ...] // service keys
[x, >1 ...] // preimage meta keys
But this probably doesn't make much difference in practice I guess.
(edited)
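The suggested layout can be sketched like this (Python; `service_key` is hypothetical, assuming i = 255 for service keys, n the four octets being interleaved, and a 31-byte state-key width):

```python
def service_key(i: int, n: bytes) -> bytes:
    """Suggested layout [i, 0, n0, 0, n1, 0, n2, 0, n3, 0, 0 ...],
    zero-padded to 31 bytes. Byte 1 is always 0 here, so it acts as
    a discriminant against preimage-meta keys (byte 1 > 1)."""
    assert len(n) == 4
    body = bytes([i]) + b"".join(bytes([0, b]) for b in n)
    return body.ljust(31, b"\x00")

k = service_key(255, bytes([1, 2, 3, 4]))
assert len(k) == 31
assert k[1] == 0   # can never collide with a byte-1 value > 1
```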
# 2025-05-23 22:24 ascriv: for sbrk, if wA happens to be 0, then N_x..+wA = {}, which is a subset of every set. so there is no x which satisfies the conditions, meaning sbrk is undefined
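The empty-range observation can be checked directly; a tiny sketch (Python), modeling N_{x..+n} as {x, ..., x+n-1}:

```python
def N(x: int, n: int) -> set:
    """The GP's N_{x..+n} = {x, x+1, ..., x+n-1}."""
    return set(range(x, x + n))

# With wA = 0 the range is empty, and the empty set is a subset of
# every set, so the subset condition in the sbrk definition becomes
# vacuous and no longer pins down a particular x.
assert N(100, 0) == set()
assert N(100, 0) <= {1, 2, 3}   # vacuously true for ANY set
# (a conventional sbrk treats a zero-size request as "return the
# current heap end unchanged" -- an assumption, not GP text)
```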