# room: #jam-conformance:matrix.org
# exported: 2025-09-26 03:25 UTC
2025-08-20 15:56 erin: tomusdrw: i've set up the archiver to archive this also, so hopefully will be up starting tomorrow
2025-08-21 06:33 ascriv: https://github.com/davxy/jam-conformance/issues/26 should we not discourage open source implementations (for now)? [edited]
2025-08-21 10:27 oliver.tale-yazdi: I think open source repos are fine. Other implementors are just not supposed to look at it
2025-08-21 10:28 oliver.tale-yazdi: Maybe in their self-interest it should be private though, since someone else may steal their code and submit it first, but that is up to them to decide IMO [edited]
2025-08-21 10:30 sourabhniyogi: I've never understood when we are required to make everything open-source -- is it after M1 (like, this year =) ), or would it be at the very end? Or is there some "share with the w3feval team" process?
As the [reports](https://github.com/davxy/jam-conformance/tree/main/fuzz-reports) are clear for > 2/3 of active teams now (omg ... yay!), can we move on to 0.7.0 traces for the rest of the month, start the 0.7.0 fuzzing in the first half of September, and then do 0.7.1 traces/fuzzing in the second half? [edited]
2025-08-21 10:32 oliver.tale-yazdi: I think you only need to share the code with the Milestone judges eventually.
2025-08-21 10:45 sourabhniyogi: Does anyone want to see refining added to the fuzz protocol?
2025-08-21 10:47 ascriv: Sure. But doesn’t that require networking for the erasure coding stuff?
2025-08-21 10:56 sourabhniyogi: For an auditable work package bundle, it doesn't have to. We can add this to the fuzzer protocol:
```
--> Work Package Bundle ++ Core Index ++ Segment Root Mappings ++ Timeslot
<-- Work Report
```
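(As a rough sketch, the request/response pair above might carry something like the following; every type and field name here is an illustrative assumption, not part of the fuzz-proto spec:)
```
// Rust sketch. Placeholder aliases stand in for the real codec types;
// nothing here is the actual fuzz-proto wire format.
type WorkPackageBundle = Vec<u8>;
type WorkPackageHash = [u8; 32];
type SegmentRoot = [u8; 32];
type WorkReport = Vec<u8>;

/// Fuzzer -> target: everything needed to run refine locally,
/// without networking or erasure coding.
struct RefineRequest {
    bundle: WorkPackageBundle,
    core_index: u16,
    segment_root_mappings: Vec<(WorkPackageHash, SegmentRoot)>,
    timeslot: u32,
}

/// Target -> fuzzer: the resulting work report, diffed against
/// the fuzzer's own.
struct RefineResponse {
    report: WorkReport,
}
```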
2025-08-21 10:57 jaymansfield: > <@sourabhniyogi:matrix.org> Does anyone want to see refining added to the fuzz protocol?
That would be great
2025-08-21 11:01 sourabhniyogi: Anyone see the need to add more to the above than 1+2, or have a desire to add this in 0.7.0 vs 0.7.1? There's also this ancestors detail; I'm wondering when and how that will be handled
2025-08-21 11:02 ascriv: I’m ok with whatever the evaluators decide, but clarity/a roadmap would be nice
2025-08-21 11:18 clearloop: I hope polkajam can join the discussion, which would make our time more efficient; for example, on the testing data format
2025-08-21 11:26 sourabhniyogi: I think the expectation is that we all become evaluators, and we earn our fellowship stripes by suggesting the changes we want ... like, how do you want to see ancestors in the data
2025-08-21 11:26 davxy: Your time schedule looks realistic and reasonable
2025-08-21 11:36 davxy:
This seems doable. Maybe you could open a PR to extend the fuzzer protocol?
Right now, I am focused on fixing issues in the fuzzer and getting stuff ready for version 0.7.0 (test vectors first).
The refinement extension could realistically land in 0.7.1 instead.
For 0.7.0, the fuzzer protocol should already be extended with:
- supported-features handshake during PeerInfo message exchange
- ancestors set
If the target supports refinement, that capability could be included as part of the features exchange.
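(For illustration only: a supported-features handshake could be as simple as a bitmask intersected during the PeerInfo exchange. The flag values and field names below are assumptions, not the agreed format:)
```
// Rust sketch of a possible features bitmask; values are illustrative.
const FEATURE_ANCESTORS: u32 = 1 << 0; // target maintains an ancestors set
const FEATURE_REFINE: u32 = 1 << 1;    // target supports bundle refinement

struct PeerInfo {
    name: String,
    app_version: (u8, u8, u8),
    jam_version: (u8, u8, u8),
    features: u32,
}

/// Only the features both sides advertise get exercised.
fn negotiated(fuzzer: &PeerInfo, target: &PeerInfo) -> u32 {
    fuzzer.features & target.features
}
```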
2025-08-21 11:37 sourabhniyogi: Is refining successfully part of M1 conformance? M2?
2025-08-21 11:38 sourabhniyogi: I am happy to do a refining extension PR targeting 0.7.1 and try it out with everyone here in Sept =)
2025-08-21 11:40 sourabhniyogi: I'll primarily just extend this README.md here https://github.com/davxy/jam-conformance/blob/main/fuzz-proto/README.md unless you suggest something else?
2025-08-21 11:44 davxy: From a technical standpoint, M1 is doing block importing, so as far as I can tell, the answer is no.
2025-08-21 11:54 sourabhniyogi: Can we consider the 0.6.7 fuzzer "done" (which allows other teams to roll in) or are you going to go through another few waves of 0.6.7 for the next 7-10 days, like the gas accounting wave?
2025-08-21 11:57 sourabhniyogi: For others doing fuzzers, what else should be addressed? I think you had a task list to share maybe
2025-08-21 12:09 jaymansfield: Seems like this should work. I might attempt to build my own refine fuzzer actually as I'm getting close to finishing my recompiler. Will use the same refine arguments when I get to it next.
2025-08-21 16:03 r2rtnl: JAM Prize rules #14 and #15 appear to favor using a public GitHub repo as evidence of organic development.
Rule #7 seems reasonably interpreted as: “don’t look; and if you did look, disclose it publicly.”
2025-08-21 16:16 oliver.tale-yazdi: Rule #16 also explicitly allows private repos. So not sure if either is favoured
2025-08-21 16:20 r2rtnl: Agreed — I’ve just shared an explanation of why TurboJam chose to go the open-source route.
2025-08-21 16:20 ascriv: I’m pretty sure private meant off-GitHub according to gav’s comments last October
2025-08-21 16:21 ascriv: image_D9A1B8B1-C051-4866-9047-C48BBDBB137B_1755793257.png
2025-08-21 16:21 ascriv: But I’m in the process of confirming this with the fellowship, can share once I hear
2025-08-21 16:23 ascriv: “If you don’t want to use GitHub (private or public), add commits timestamped on a blockchain”
2025-08-21 16:23 oliver.tale-yazdi: But it also depends on how one uses it. Just pushing commits to public GitHub won't help; it will need to be Merge Requests, since they carry timestamps.
2025-08-21 16:26 oliver.tale-yazdi: Yea I am doing this to avoid any controversy later on: https://github.com/JamBrains/remark-commit (i already posted this recently but nobody seems to want to use it 😆)
2025-08-21 16:27 ascriv: It’s definitely best to do something like OTS or your remark tool regardless. They should have required everyone to do this no matter if on GitHub or off imo [edited]
2025-08-21 16:28 ascriv: Bc as it’s written and as gav explained, it seems just being on github suffices
2025-08-21 17:41 erin: archive is up: https://paritytech.github.io/matrix-archiver/archive/_21ksYpYHcVftKsUAsdMa_3Amatrix.org/index.html
cc tomusdrw
2025-08-21 17:45 davxy: Two new highly controversial reports: 1755796851 1755796995
2025-08-21 17:46 davxy: I guess after these two we can start with 0.7.0 :-)
2025-08-21 17:46 davxy: https://github.com/davxy/jam-conformance/tree/main/fuzz-reports
2025-08-21 17:49 davxy: (the traces passed by all teams were removed from the table)
2025-08-22 14:10 danicuki: I am trying to match the preimage error because of this: https://matrix.to/#/!ddsEwXlCWnreEGuqXZ:polkadot.io/$ps8N0jq66pBJG1o26Z5XUGyU23QETifiYuVaaiiypIc?via=polkadot.io&via=matrix.org&via=parity.io
It also affects test vector https://github.com/davxy/jam-test-vectors/blob/master/stf/preimages/tiny/preimage_not_needed-1.json
2025-08-22 16:59 rustybot: > <@danicuki:matrix.org> I am trying to match the preimage error because of this: https://matrix.to/#/!ddsEwXlCWnreEGuqXZ:polkadot.io/$ps8N0jq66pBJG1o26Z5XUGyU23QETifiYuVaaiiypIc?via=polkadot.io&via=matrix.org&via=parity.io
>
> It also affects test vector https://github.com/davxy/jam-test-vectors/blob/master/stf/preimages/tiny/preimage_not_needed-1.json
>
Looks like you've got your answer in the other channel?
2025-08-22 17:44 danicuki: If that answer is correct then I still can't understand why some of the fuzzer cases shouldn't raise preimage_not_needed error [edited]
2025-08-22 17:46 danicuki: Cases 1755530896, 1755530728,1755531265, 1755620371
2025-08-22 17:51 ascriv: Can you provide a deeper analysis/argument for why one of these cases should error? It’s always possible we all have the same mind virus
2025-08-22 17:56 danicuki: In all these cases,
d[s]_l[(h,l)] = null,
which according to 12.38 should fail the block.
2025-08-22 17:58 ascriv: I’ll take a look but I think it could be because 12.38 is a definition of R, not a condition that must be true
2025-08-22 17:58 ascriv: If that doesn’t explain it then I’ll look deeper later
2025-08-22 17:59 danicuki: All E_P must satisfy R
2025-08-22 18:01 ascriv: Right, let me look now then
2025-08-22 18:05 ascriv: I checked 1755530896, my service 2494444674 indeed does not have the preimage in the extrinsic but it does have a [] in the historical lookup
2025-08-22 18:05 ascriv: The key for the historical lookup begins with 0x92115b..
2025-08-22 18:06 ascriv: And I see it in the pre state for ….008.json for this test vector
2025-08-22 18:06 ascriv: Value 0x00 which corresponds with empty list
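(The distinction at play here: a lookup entry that is absent entirely vs. one that is present with an empty timeslot list, which encodes as 0x00. A minimal Rust sketch with assumed names:)
```
use std::collections::HashMap;

// Simplified, assumed model of d[s]_l: keyed by (hash, length),
// valued by the list of availability timeslots.
type LookupKey = ([u8; 32], u32);
type LookupMeta = HashMap<LookupKey, Vec<u32>>;

/// A preimage may only be provided if it was solicited: the entry must
/// exist AND be the empty timeslot list. An absent entry (None) is the
/// `preimage_not_needed` case; `Some(&[])` (the 0x00 value) is fine.
fn preimage_needed(meta: &LookupMeta, key: &LookupKey) -> bool {
    matches!(meta.get(key), Some(ts) if ts.is_empty())
}
```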
2025-08-22 18:24 danicuki: I will double check on my side but I think I know what we are doing wrong.
2025-08-23 11:22 davxy: **0.7.0 test vectors are up:**
https://github.com/davxy/jam-test-vectors/pull/90
2025-08-23 17:54 jaymansfield: Another set of 0.7.0 safrole vectors if anyone is interested: https://github.com/javajamio/javajam-trace
2025-08-24 07:31 davxy: ⚠️ https://github.com/davxy/jam-test-vectors/pull/90#issuecomment-3217905803
2025-08-24 07:33 davxy: For tiny we're using G_T = 20_000_000 and not 3_500_000_000 . That is why we call into accumulate twice in preimages/00000008.bin
As I described in the comment, I think a different value for tiny is reasonable to trigger interesting cases
2025-08-24 07:33 davxy: if there are no objections I'll open a PR in jamcha.in and set this as the value for G_T in tiny config
2025-08-24 08:16 davxy: https://github.com/JamBrains/jam-docs/pull/59
2025-08-24 11:13 prematurata: hey @davxy I just found out about the sbrk note in the traces readme https://github.com/davxy/jam-test-vectors/tree/v0.7.0/traces
besides the sbrk implementation the RAM code looks a bit off to me.
For example, in A.42 the Writeable section starts at 2Z_Z + Z(|o|), while in the code there it seems to end there
2025-08-24 11:18 davxy: Was written by [@sourabhniyogi:matrix.org](https://matrix.to/#/@sourabhniyogi:matrix.org) some time ago. Feel free to open a PR
2025-08-24 11:18 prematurata: ah ok maybe at that time the gp was different :)
2025-08-24 11:20 davxy: Honestly I didn't go through that Go code LOL
2025-08-24 11:20 prematurata: npnp thanks for clarifying
2025-08-24 11:27 davxy: I don’t feel like I’ve clarified anything :-) I’ll check that section out
2025-08-24 11:28 prematurata: well it's clear that there MAY be an error :)
2025-08-24 17:05 sourabhniyogi: Yes, it deserves an update -- the 0.6.7 fuzz traces did a thorough job of showing how it's out of date (64k, read vs write OOB) -- I'm not aware of any really great SBRK / heap testing expansion anywhere, and the "s" component [stack] remains elusive and entirely untouched so far as I can tell, in addition to the "h" [heap] of SBRK, which has not been fuzzed much.
It would be ideal if the next 0.7.2+ fuzzing series attacked this fully and somehow incorporated "picoalloc", which I gather is the replacement for SBRK (the "h"). I am wondering if it's useful to really go crazy with fuzz testing of SBRK given it's going to be replaced. We can keep busy with outeraccumulate in 0.7.0 and transfer in 0.7.1 in the meantime =). [edited]
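(For context, the classic sbrk semantics under discussion look roughly like the Rust sketch below; the page size, names, and out-of-memory behavior are illustrative assumptions, not the exact A.42 layout:)
```
const PAGE_SIZE: u64 = 4096; // illustrative; see Z_P in the GP

struct Heap {
    end: u64,   // current heap pointer (the "h" component)
    limit: u64, // top of the heap region, below the stack ("s")
}

impl Heap {
    /// sbrk(n): return the old heap end and grow by n bytes, making any
    /// newly touched pages accessible. Growing past the limit is the
    /// out-of-memory case (behavior here is an assumption).
    fn sbrk(&mut self, n: u64) -> Option<u64> {
        let old = self.end;
        let new = old.checked_add(n)?;
        if new > self.limit {
            return None; // OOM
        }
        // Pages spanning [old, new), rounded out to PAGE_SIZE,
        // would become readable/writable here.
        self.end = new;
        Some(old)
    }
}
```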
2025-08-25 07:52 dakkk: Where can I find the source code of the service used for the tests? My implementation is raising a panic from services/bootstrap-service/src/lib.rs in two traces
↳ 2025-08-25 10:13 oliver.tale-yazdi: I think you can get it with `cargo clone jam-bootstrap-service`, at least that code matched with the logs that we saw when running the test vectors
↳ 2025-08-25 11:17 dakkk: thank you
2025-08-25 14:57 emielsebastiaan: davxy: Potential ASN error: https://github.com/davxy/jam-test-vectors/issues/93
Please review.
2025-08-26 06:50 davxy: https://github.com/davxy/jam-test-vectors/pull/94
2025-08-26 07:02 emielsebastiaan: In the office in an hour. Let me check then. It is simply the reordering. New order: i, x, z, e for both CoreActivityRecord and ServiceActivityRecord. [edited]
2025-08-26 08:24 davxy: @room Could you please check these vectors https://github.com/davxy/jam-test-vectors/pull/94 before I proceed with merging?
STF and traces. Ty
↳ 2025-08-26 08:37 0xjunha: fastroll passes all stf/traces
↳ 2025-08-26 12:10 jaymansfield: Looks good.
2025-08-26 08:32 dakkk: it works for me, and now I pass all the tests as well
2025-08-26 08:45 arjanz: Traces and STF are passing
2025-08-26 08:53 vinsystems: vinwolf passes all stf and traces as well
2025-08-26 09:36 ascriv: Jamzilla passes as well
2025-08-26 10:59 danicuki: Do we already have fuzzer for 0.7.0?
2025-08-26 11:03 danicuki: Jamixir passes all 0.7.0 traces and test vectors.
2025-08-26 11:06 danicuki: We are still struggling with two fuzzer cases on 0.6.7 - Could anyone please provide the PVM traces for 1755531265 and 1755796995 fuzz cases (0.6.7)?
↳ 2025-08-26 16:20 clearloop: 1755531265.log
↳ 2025-08-26 16:20 clearloop: 1755796995.log
↳ 2025-08-26 16:21 clearloop: hope these help ⬆️ [edited]
↳ 2025-08-26 23:49 danicuki: ❤️
↳ 2025-08-27 19:23 danicuki: clearloop | SpaceJam: I have a doubt about your logs:
```
2025-08-27 00:19:42 TRACE stf:accumulate: pos=26517 Ecalli gas=9997503 regs=[804, 4278056528, 17, 16, 4278056271, 206752, 70, 2, 66858, 4, 206752, 57, 8]
2025-08-27 00:19:42 DEBUG stf:accumulate: calling host call 100
2025-08-27 00:19:42 INFO program: Bootstrap Service Accumulate, 0h @8 $18446744073709003931 target="boot"
2025-08-27 00:19:42 TRACE stf:accumulate: pos=26519 Fallthrough gas=9997502 regs=[804, 4278056528, 17, 16, 4278056271, 206752, 70, 0, 66858, 4, 206752, 57, 8]
```
the host call is a log, so it should not change r7 from 2 to 0
the next call is a fallthrough, which should also not change r7 from 2 to 0
So why is your r7 changed to 0?
↳ 2025-08-27 23:25 clearloop: good catch! we have a wrapper around host calls, and 0 means the host call executed successfully 🤦♂️ this could be confusing, but it should not be the root cause, since `r7` is commonly used as the exit status and is actually ignored by the log host call in the program (e.g. it will be overridden by other operations without being loaded)
while you can pass the traces in the test vectors, I believe the bug in the fuzz tests is in `host call` / `memory op` / `encoding stuff`
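(A Rust sketch of the wrapper pattern described above, with assumed names: the dispatch layer unconditionally writes a status into r7, even for `log`, which per the GP leaves the registers untouched:)
```
const R7: usize = 7; // ω_7, commonly used as the status/return register

fn dispatch_host_call(regs: &mut [u64; 13], id: u64) {
    match id {
        100 => {
            // `log` is a no-op on registers in the GP...
            host_log(regs);
            // ...but this wrapper reports "success" in r7 regardless,
            // which is why the trace shows r7 flip from 2 to 0.
            regs[R7] = 0;
        }
        _ => regs[R7] = u64::MAX, // unknown host call (illustrative)
    }
}

fn host_log(_regs: &[u64; 13]) { /* emit the log line */ }
```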
↳ 2025-08-28 10:08 danicuki: I am double checking fetch.
What value do you have for G_A, G_T and G_R?
↳ 2025-08-28 10:11 danicuki: also, from what I know, it uses 0.7.0 spec for fetch, right?
↳ 2025-08-28 15:01 clearloop: the traces were generated by a commit where we hadn't upgraded to 0.7.0 yet; if I'm not mistaken, the fetch call updates are covered by both 0.6.7 and 0.7.0 (registers changed). From what I can recall:
- `1755796995` is related to the threshold of accounts (if it was the last row of the table); you need to check your `info` implementation instead of `fetch` -- the diff is actually in memory, so the register values are not that helpful
- `1755531265` I'm not sure about this one; if you can share which key you failed to match, maybe I can recall something helpful
2025-08-26 11:25 davxy: Polkajam has it; I'll provide some reports in the next days
2025-08-26 14:41 emielsebastiaan: davxy afk -> 28 Aug: https://github.com/w3f/jamtestvectors/pull/55#issuecomment-3224458898
2025-08-26 15:04 emielsebastiaan: https://github.com/paritytech/polkajam-releases/releases/tag/v0.1.25
CoreVM & DOOM included. [edited]
2025-08-26 18:48 prematurata: ~Hello can someone provide tracelog for storage 13 in 0.7.0? I am struggling to pass that one~ [edited]
2025-08-27 14:10 yu2c: davxy afk -> 28 Aug:
I'm curious about some issues with the jam-test-vectors/stf/assurances (v0.7.0)
From the [README](https://github.com/davxy/jam-test-vectors/blob/master/stf/assurances/README.md):
1. "[assurances_for_stale_report-1](https://github.com/davxy/jam-test-vectors/blob/master/stf/assurances/tiny/assurances_for_stale_report-1.json)" is marked with a red dot (which implies an error code exists), but in the file there's no error code; it has ok with reports (which implies a green dot)
2. Why doesn't "[no_assurances_with_stale_report-1](https://github.com/davxy/jam-test-vectors/blob/master/stf/assurances/tiny/no_assurances_with_stale_report-1.json)" have an output? (don't we output timed-out reports?)
> Stale work report assignment is removed (but not returned in the output).
2025-08-28 15:42 davxy: I'll have a look. Please open an issue if you want to be sure I don't forget :-D
2025-08-28 15:50 davxy: Interesting dispute (0.7.0): https://github.com/davxy/jam-conformance/discussions/37
2025-08-28 15:54 prematurata: i will fix the build script for the target tonight but i can give it a spin on this trace beforehand
2025-08-28 21:35 sourabhniyogi: https://github.com/davxy/jam-conformance/discussions/37#discussioncomment-14249768
2025-08-28 21:35 sourabhniyogi: Jason | JavaJAM: What do you think of the above ^^
2025-08-28 21:45 jaymansfield: I did see that as well, but it's hard to interpret whether it's referring to a single parallel accumulation (since it refers to **n** and **m**, which are defined in ∆∗) or to multiple outer accumulations. [edited]
2025-08-28 21:48 sourabhniyogi: The bottom line is that we cannot accumulate any service more than once. If you incorporate this constraint, you should be able to match their state root in this case.
2025-08-28 21:50 jaymansfield: You may be right, it does make sense, but it would be good to have davxy's opinion. [edited]
2025-08-28 21:55 dave: image.png
2025-08-28 21:55 dave: Are you referring to this section of the GP?
2025-08-28 21:56 sourabhniyogi: Yes -- does that constraint get modeled in another equation somehow?
2025-08-28 21:56 dave: This is talking about service modification/creation/deletion. It doesn't have anything to do with accumulating a service multiple times AFAIK (which is permitted AFAIK).
2025-08-28 21:57 sourabhniyogi: I am interpreting "altered" as service modification via accumulation. What blocks the third report from accumulating otherwise? [edited]
2025-08-28 21:58 dave: The point is that the parallel accumulation won't work sensibly if the same service is created/modified/deleted by multiple different services in the same parallel accumulation step
2025-08-28 22:00 dave: As there is no logic to sensibly merge the results
2025-08-28 22:00 sourabhniyogi: Here we have two outer accumulates.
2025-08-28 22:00 dave: Most of that paragraph is merely "informative"; I think the only "normative" bit is that if there _is_ a conflict the block must be considered invalid and discarded.
2025-08-28 22:01 dave: This can only happen if there is a collision amongst the pseudo-randomly generated IDs of newly created services
2025-08-28 22:02 dave: Right, from the perspective of that paragraph the "outer" accumulates should not interact
2025-08-28 22:02 sourabhniyogi: Ok then we need to find the source in the GP which says "If you accumulated service X in one outer accumulate, you cannot accumulate that service X again (in a following outer accumulate) " -- [edited]
2025-08-28 22:02 sourabhniyogi: If there is no source, then Polkajam has a bug, I think.
2025-08-28 22:02 dave: I don't believe there is such a requirement
2025-08-28 22:03 jaymansfield: There might not be one. It could just be a bug. [edited]
2025-08-28 22:04 dave: Yes, quite possibly. I haven't looked into the fuzzer case so I can't comment on what's actually happening there
2025-08-28 22:04 sourabhniyogi: Alright, we'll hold off on publishing a "fix" based on your read =)
2025-08-28 22:09 dave: You shouldn't take my word for it. But I will say that in general most things in the GP are defined formally. You should avoid reading too much into the exact wording of the text surrounding the formal definitions; that is there primarily to aid understanding. Of course if there is a significant mismatch between the two that can and should be fixed...
2025-08-28 22:17 sourabhniyogi: Not sure if this should be in this room or not (you decide!) -- we are trying out running DOOM work packages from the 0.7.0 release of polkajam in a 5+1 polkajam+jamduna testnet and only seeing "null-authorizer" work packages being submitted by key steps 2-3
```
# Create CoreVM.
./jamt vm new ./doom.corevm 1000000000
# Run CoreVM builder (SERVICE_ID is in jamt's output).
./corevm-builder --temp --chain dev --gas 1000000000 SERVICE_ID
# Run CoreVM monitor (SERVICE_ID is in jamt's output).
./corevm-monitor SERVICE_ID
```
where the `corevm-monitor` is showing nothing on the 3rd step and the only work package logging we see is from "null-authorizer" caused by the `corevm-builder`.
Am I missing a key step here or is there something else involved to have us test our JAM implementations against the famous 60fps target? [edited]
↳ 2025-08-29 04:59 clearloop: doesn't CoreVM have a different entrypoint, such that neither refine nor accumulate can be invoked? [edited]
↳ 2025-08-29 05:00 jan: CoreVM payloads cannot be loaded as toplevel services. They use a different blob format than JAM services do, and need hostcalls which are not provided by the JAM client.
↳ 2025-08-29 05:12 clearloop: oh I got it, so in this test, `jamt vm ./doom.corevm` creates a process which runs the CoreVM and keeps submitting work packages to the network
2025-08-28 22:21 dave: I don't think anything is missing there. It does take a little while to get going. `jamtop` should show you the doom service and packages being refined/accumulated for it
2025-08-28 22:22 sourabhniyogi: A little while = how many seconds? Ok, will run `jamtop`
2025-08-28 22:22 dave: Well, nothing is missing assuming you have a JAM network running
2025-08-28 22:22 dave: If you are not starting a JAM network before those steps then not much will happen :P
2025-08-28 22:23 sourabhniyogi: How many "frames" are represented in each doom work package?
2025-08-28 22:28 dave: I guess 360 if it's running at 60fps. Can't confirm if it actually is running at 60fps as I don't see any indication of the fps anywhere...
2025-08-28 22:29 dave: I can confirm though that it's running for me with the above commands, preceded by `polkajam-testnet` to launch a network [edited]
2025-08-28 22:29 dave: Using the latest nightly release (https://github.com/paritytech/polkajam-releases/releases/tag/nightly-2025-08-28)
2025-08-28 22:30 dave: There are a couple of sticking points though. If you run the `jamt` command too soon after starting the network you will get an error like "2025-08-28 23:24:28 Fatal error: ErrorObject { code: ServerError(1000), message: "Chain error: storage access error: core assigments error: invalid epoch 287320, reference epoch is 0", data: None }
Caused by:
ErrorObject { code: ServerError(1000), message: "Chain error: storage access error: core assigments error: invalid epoch 287320, reference epoch is 0", data: None }"
2025-08-28 22:31 dave: I also got a "Step error: Work error: BadCode" from corevm-builder the first time I ran it... I don't know what the explanation for that one is. I reran it and it worked the second time! 😅
2025-08-28 22:40 sourabhniyogi: image.png
2025-08-28 22:40 sourabhniyogi: Ok thank you for making this happen -- super psyched!
2025-08-29 02:41 sourabhniyogi: David Emett: In attempting 0.7.0 work report alignment between polkajam and our jamduna client (now with a lot of jam-conformance hardening!) against the latest release of `polkajam` (2025-08-28), we believe the authorization args ([now just E_2(c)](https://graypaper.fluffylabs.dev/#/38c4e62/2ec2002ec200?v=0.7.0)) are incorrect, because very early in the polkajam logs for the first invocation of the bootstrap service (in `./jamt vm new ./doom.corevm 1000000000`) we see
```
sp = 0xfefdfa58
u64 [0xfefdfff8] = ra = 0xffff0000
u64 [0xfefdfff0] = s0 = 0x0
u64 [0xfefdffe8] = s1 = 0x0
u64 [0xfefdfc68] = a1 = 0x1 <<<<< the argument input length should be 2, not 1 for 0.7.0 Authorization
```
If we change our encoding from E_2(c) to E_1(c) we get the work reports to match (in addition to matching this little "a1") ... except for gas, where we're wondering when polkajam 0.7.0 does "basic block" gas accounting and when it does single-instruction gas accounting. [edited]
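(For reference, E_l in the GP is fixed-length little-endian integer encoding, so E_2(c) for the core index occupies exactly two bytes where E_1(c) occupies one. A minimal sketch; the function name is ours:)
```
/// GP fixed-length encoding E_l(x): x as exactly `l` little-endian bytes.
fn encode_fixed(x: u64, l: usize) -> Vec<u8> {
    x.to_le_bytes()[..l].to_vec()
}

fn main() {
    assert_eq!(encode_fixed(1, 2), vec![0x01, 0x00]); // E_2(1): two bytes, args length 2
    assert_eq!(encode_fixed(1, 1), vec![0x01]);       // E_1(1): one byte, args length 1
}
```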
2025-08-29 04:32 davxy: You can accumulate one service multiple times. I don't understand why you thought this wasn't possible and came to that conclusion.
The paragraph you shared is completely unrelated - as David mentioned, it refers to conflicts in IDs of new services.
I suspect this might just be a bug on our side. I'll investigate it today. [edited]
2025-08-29 05:07 davxy: There's a bug
2025-08-29 10:06 sourabhniyogi: It had a lot to do with this trace 😂 but really ... because service designers want accumulation to bring asynchronously refined data together across multiple cores, synchronously. I definitely have a "GROUP BY" map-reduce view of what I want to see happen from JAM's 'rollup host' purpose: a service designer will want to integrate ALL the work that was done in the last N seconds, every N seconds, and have a single place to compute and write out a SINGLE view of what happened.
While I'm glad no work report is left behind, JAM's accumulate appears to have a depressing G_T problem for its rollup-host purpose: accumulate does NOT accumulate everything every N seconds, not even typically. Yes, you can organize work packages in a chain with every work package having a prereq, but that is pretty lame. [edited]
2025-08-29 10:09 dave: I believe the only case where JAM _won't_ accumulate everything is when there are unsatisfied dependencies
2025-08-29 10:10 dave: It will accumulate these as their dependencies are satisfied, subject to per-block gas limits
2025-08-29 10:11 dave: So I'm not sure I agree with "it won't typically accumulate everything"
2025-08-29 10:15 sourabhniyogi: I mean a service designer would like to see *one* `accumulate` execution across all work reports. Whether it typically does or not depends on what else is happening. It's definitely not the case that a service designer can expect to get all the work reports grouped together. Anyway, we're outside the topic of JAM conformance =)
2025-08-29 10:16 dave: You certainly can't rely on that no
2025-08-29 14:15 boymaas: davxy: I see you are already publishing the import stats. That's great! I was wondering if you are measuring with verbose output enabled, or if you ran the JamZig⚡ node with -vv or without any verbose flags. I compiled these with detailed runtime debugging on, and the best performance is achieved without -vv and with all debug output removed. Let me know if you want a new binary compiled without this.
2025-08-29 14:23 davxy: I'll disable debug then. No worries -- Zig looks fast ;-)
2025-08-29 14:24 davxy: FYI, we're working on extracting some rough numbers to get a basic comparison between the implementations and to provide a bit of extra incentive :-D
Soon we'll publish a nice-looking chart, courtesy of Erin.
The perf reports are based on monitoring your import timings (accounting for the small overhead of the process link). Nothing scientific here -- just estimates.
In practice, we also run polkajam as a target. In this case, the difference between the fuzzer importing the block and the target import is pure overhead. This is the overhead that we considered for your targets as well.
The perf reports are generated from the test vector traces.
I'm scripting the production of these numbers, so if you update your binary, regeneration will be immediate.
↳ 2025-08-29 17:53 clearloop: hi davxy just did some optimizations! hope can try spacejam again! https://github.com/davxy/jam-conformance/issues/13#issuecomment-3237780866
2025-08-29 14:34 clearloop: curious what the system info of the benchmarking machine is; also, would it be possible to provide memory peaks as well? I'm a bit worried about that, since some programs are pretty big [edited]
2025-08-29 14:39 davxy: Perhaps we can run this on a beefier machine at some point.
But for the moment I think these results already give you enough hints
2025-08-29 14:39 jaymansfield: I published a new version of javajam yesterday that included about a 70% import performance improvement for the storage test vectors. Are you able to grab the latest 0.2.7? It looks like the report used 0.2.6.
2025-08-29 14:39 davxy: My machine
AMD Ryzen Threadripper 3970X 32-Core (64) @ 4.55 GHz
2025-08-29 14:40 davxy: > would it be possible to provide memory peaks as well
Not in the current reports; perhaps we will in the future.
2025-08-29 14:46 davxy: done
2025-08-29 14:50 jaymansfield: Thanks!
2025-08-29 14:51 ascriv: Where can I see the import stats? Been afk for a while
2025-08-29 14:53 ascriv: Or are they not published yet
2025-08-29 14:54 clearloop: Jamzilla is super fast
https://github.com/davxy/jam-conformance/blob/import-perf/fuzz-reports/0.7.0/reports/jamzilla/perf/storage.json
2025-08-29 14:57 ascriv: 🫢
2025-08-29 14:59 clearloop: I'm curious, are you all running accumulation in parallel already? why else would it be that fast
2025-08-29 15:00 ascriv: I am doing in parallel yes
2025-08-29 15:00 ascriv: Interpreted pvm for now
2025-08-29 15:00 sourabhniyogi: davxy: Which metric should teams optimize for to get the most fuzzing action?
```
% grep import_p75 */perf/storage.json
boka/perf/storage.json: "import_p75": 6925.978,
jamduna/perf/storage.json: "import_p75": 171.9,
jamzig/perf/storage.json: "import_p75": 39.349,
jamzilla/perf/storage.json: "import_p75": 81.848,
javajam/perf/storage.json: "import_p75": 233.967,
pyjamaz/perf/storage.json: "import_p75": 5186.699,
spacejam/perf/storage.json: "import_p75": 187.597,
turbojam/perf/storage.json: "import_p75": 10.395,
```
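(For context, the import_pNN values are percentiles over per-block import times; the sketch below shows one way such a number could be computed. The exact method used by the report scripts is not specified here:)
```
/// Nearest-rank percentile over per-block import times, in ms.
fn percentile(mut samples: Vec<f64>, p: f64) -> f64 {
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let idx = ((samples.len() as f64 - 1.0) * p / 100.0).round() as usize;
    samples[idx]
}

fn main() {
    let times = vec![9.8, 10.4, 11.0, 10.2, 55.0]; // made-up values
    println!("import_p75 = {}", percentile(times, 75.0)); // 11
}
```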
↳ 2025-08-29 15:02 clearloop: can't believe the performance of turbojam
↳ 2025-08-29 15:04 davxy: You have two ways to make the fuzzer happy:
- run the STF
- hardcode the replies in the binary :-D (given that we run the test vector traces here)
↳ 2025-08-29 15:04 sourabhniyogi: Removing logs and recompiler is 95% of the optimizing
↳ 2025-08-29 15:04 davxy: But the thing is that turbojam has the same perf with a random seed, so it can't be predicting the state roots. Ergo, it looks really fast
↳ 2025-08-29 15:05 davxy: I'll provide perf reports for polkajam in a few mins
↳ 2025-08-29 15:05 davxy: both with interpreter and recompiler
↳ 2025-08-29 15:05 davxy: you can provide optimized binaries. I don't need the logs actually
↳ 2025-08-29 15:07 davxy: @room If you republish optimized bins, please leave a note in your dedicated GH issue.
I won’t re-run them right away, and otherwise I’ll probably forget.
↳ 2025-08-29 15:09 clearloop: I have doubts about the performance of my recompiler... I've tried compiling the programs in the storage set, however they take at least 10s to compile... not sure what the numbers from others are [edited]
↳ 2025-08-29 15:10 jan: 10s per program or in total?
↳ 2025-08-29 15:11 clearloop: per... I must have made some sort of horrible mistake... I haven't implemented the cache yet, so each test triggers a compilation; there may be some memory issues as well
still reorganizing my implementation these days, will take the benchmark with polkavm/bentools soon! [edited]
↳ 2025-08-29 15:14 jan: Note that if your code is optimal enough then you don't actually need a cache. Anyway, 10 seconds sounds very abnormal.
↳ 2025-08-29 23:20 clearloop: Just made my recompiler 20x faster by removing my "sugar" register operations .... compiling storage_light::00000004 went from `11s -> 0.6s`, indeed a horrible mistake! trying to remove more "sugar methods" now XD I was actually handling the instructions with an interpreter mindset! [edited]
2025-08-29 15:01 boymaas: davxy:I pushed a new version for JamZig⚡ with all the debugging mechanics removed. I'm curious about the performance gains! It looks very promising locally. 🤠 The -vv flags on the target do not do anything anymore.
2025-08-29 15:20 boymaas: https://asciinema.org/a/eMIt8geWsubdseBLBn9ygUkiN 🤠
↳ 2025-08-29 15:23 clearloop: this visualization tool looks perfect, can you make a PR to the repo so we can try it as well?
2025-08-29 15:21 boymaas: image.png
2025-08-29 15:22 jan: Would be nice to also have the implementation language next to the name.
2025-08-29 15:24 dakkk: this is using only fallback / safrole, right?
2025-08-29 15:24 boymaas: Safrole indeed
2025-08-29 15:26 dakkk: davxy: I managed to solve the issue that was causing an error on jampy when using the pvm; I've just uploaded a fixed version
2025-08-29 15:38 boymaas: Good idea, let me check if I can add that quickly.
2025-08-29 15:41 sourabhniyogi: Way to go everyone -- this will turbocharge all of us =)
2025-08-29 15:43 boymaas: Does anyone know what language TurboJam uses? I cannot find it that quickly online.
↳ 2025-08-29 15:43 clearloop: C++
2025-08-29 15:44 emielsebastiaan: davxy: Is there a reason why you choose not to include `preimages` & `preimages_light` in the speed benchmark?
2025-08-29 15:44 arjanz: C++ I believe
2025-08-29 15:44 ascriv: The repo is public too
2025-08-29 15:44 davxy: Not really. I can add in the next iteration
2025-08-29 15:48 prematurata: wow nice charts. :)
2025-08-29 15:49 prematurata: do we know all the languages? it would be nice to have them after the names until we get used to it. I, for example, don't know many of the other implementors' languages, other than the ones that contain it in the name
2025-08-29 15:50 dakkk: it could be useful to have a script that generates those charts automatically for every test set
2025-08-29 15:52 boymaas: image.png
2025-08-29 15:53 davxy: Added polkajam and polkajam interpreted
↳ 2025-08-29 16:03 clearloop: fact: polkajam is 2x faster than turbojam = =
2025-08-29 16:00 davxy: Given the data I have provided, you do not need to worry about variations of around +/-3ms.
This is not a dedicated machine, so results may fluctuate a bit.
Still, the data gives good hints on where improvements are needed, and it shows which trace step took more time so you can re-execute it.
2025-08-29 16:00 davxy: https://github.com/davxy/jam-conformance/tree/import-perf/fuzz-reports/0.7.0/reports/perf
2025-08-29 16:04 boymaas: image.png
2025-08-29 16:04 boymaas: Forgive me if I have associated the wrong language; please let me know, and I will correct it.
2025-08-29 16:08 ascriv: Turbojam is c++, not unknown
https://github.com/r2rationality/turbojam
2025-08-29 16:10 sourabhniyogi: Ok, since polkajam is in the top 3 of `reports/perf` (for non-zero values anyway), we can all aim to turbocharge up to `polkajam_perf_int` most easily if we use the polkajam fuzzer target to compare against our own.
Is there a fuzzer option within `polkajam` or is that a separate binary that can be provided?
Not asking for the polkajam fuzzer binary -- just the polkajam fuzzer target.
2025-08-29 16:12 jaymansfield: Will be interesting to see where everyone stands in a few days after working on optimizations
2025-08-29 16:12 jaymansfield: Really useful seeing the numbers
2025-08-29 16:12 boymaas: Thank you ascriv | Jamzilla and arjan | PyJAMaz, will add it.
2025-08-29 16:47 danicuki: Jamixir already has 0.7.0 - would you please include it in this speed report?
2025-08-30 07:02 dakkk: > <@boymaas:matrix.org> Forgive me if I have associated the wrong language; please let me know, and I will correct it.
Can you share the script you're using to create this chart?
2025-08-30 07:40 boymaas: visualize_perf_enhanced.py
2025-08-30 08:32 boymaas: davxy: Feel free to include it in the repo as well.
2025-08-30 10:56 davxy: Can you open a PR ?
2025-08-30 10:56 dakkk: maybe you can also integrate it into the GitHub CI, so every time the perf data is updated, it creates the chart
2025-08-30 10:57 prematurata: this changed from "make a graypaper compliant implementation" to a drag race [edited]
2025-08-30 10:57 prematurata: :)
2025-08-30 10:57 dakkk: image.png
2025-08-30 10:57 dakkk: image.png
2025-08-30 10:57 dakkk: btw, those are from the last update:
2025-08-30 10:58 prematurata: tsjam is TypeScript btw... i see it is marked as unknown [edited]
2025-08-30 11:04 jan: Jamixir is Elixir AFAIK (hence the "ixir" in the name)
2025-08-30 11:06 jan: And it'd make more sense to change "Rust (int)" to "Rust" and "Rust" to "Rust (recomp)" for polkajam (or perhaps have another column for whether the implementation uses a recompiler; not sure if any other implementation besides polkajam uses one currently?) [edited]
2025-08-30 11:07 ascriv: I think jamduna has one?
2025-08-30 11:07 ascriv: @sourabhniyogi
2025-08-30 11:08 jan: Yeah, I know a few of the implementations have one in-progress, but it's unclear if any are used here for these tests. [edited]
2025-08-30 11:09 clearloop: btw I want to confirm whether block-based gas charging matches the tracing tests: we can match the accumulate tests but not the tracing tests with block-based gas charging. it could be caused by the host call being tried once, but the problem doesn't seem that obvious [edited]
2025-08-30 11:10 jan: Block-based gas is not yet part of the GP; not sure if the current traces use per-instruction gas yet (you'd have to ask davxy), but I have implemented per-instruction gas metering in PolkaVM, so the plan is to have the fuzzer/traces use per-instruction gas until the new gas cost model is ready.
2025-08-30 11:11 jan: You can already start implementing a block-based gas cost model to get a head start on the final gas cost model, but for now the official model is per-instruction.
2025-08-30 11:18 davxy: We use per-instruction gas charging. The test vector traces work for both models, since they never fail midway through a block.
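(A minimal sketch of the two charging models, with made-up instruction costs; as noted above, the two only diverge observably when gas runs out midway through a basic block:)
```
/// Per-instruction: charge before each instruction; can halt mid-block.
fn run_per_instruction(gas: &mut i64, costs: &[i64]) -> bool {
    for &c in costs {
        *gas -= c;
        if *gas < 0 {
            return false; // out-of-gas at the exact failing instruction
        }
    }
    true
}

/// Block-based: charge the whole basic block up front. Same observable
/// result unless the block would have run out of gas midway.
fn run_block_based(gas: &mut i64, costs: &[i64]) -> bool {
    *gas -= costs.iter().sum::<i64>();
    *gas >= 0
}
```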
2025-08-30 11:18 clearloop: kk so the problem is in my implementation, will take a deeper look at it then!
2025-08-30 11:55 danicuki: > <@prematurata:matrix.org> this changed from "make a graypaper compliant implementation" to a drag race
Tell me how you measure me and I will tell you how I will behave.
2025-08-30 13:46 davxy: EoW batch: https://github.com/davxy/jam-conformance/pull/41
2025-08-30 13:47 davxy: 0.7.0 table:
https://github.com/davxy/jam-conformance/blob/reports-batch-0.7.0/fuzz-reports/README.md#disputes
2025-08-30 14:55 boymaas: dakkk | JamPy: If you updated the script, please feel free to create the PR for the visualize_perf tool.
2025-08-30 17:03 vinsystems: I started a discussion about trace 1756548916 https://github.com/davxy/jam-conformance/discussions/42 which affects almost all teams
2025-08-30 17:23 davxy: replied
2025-08-31 10:39 danicuki: davxy: how frequently do you update the reports tables (performance and disputes)? On our side, do we need to notify you when we have a new version of our binary, or just put the binary in the same place?
2025-08-31 12:55 davxy: In general, just drop a message in your team's issue.
I usually re-run the scripts when it is worthwhile -- for example, if a team has made significant improvements or fixed some traces.
Of course, do not expect me to re-run them immediately after a ping, since I might be away or focused on something at the time.
Still, given that there are not too many teams, I tend to run the scripts more frequently than expected, as I enjoy seeing things improve and become more resilient to the fuzzer day by day.
2025-08-31 13:32 davxy: Just to be clear - currently - the performance table is mostly meant to provide an indication of how long fuzzing should run to reach the same level of confidence across implementations. This is also to start thinking a bit more about the auditing process duration.
For example, if we take (in the screenshot shared by dakk above) PolkaJam as the baseline, and auditors require 3 days of continuous fuzzing without crashes/diffs then:
- An implementation 300x slower would require 900 days (about 2.5 years)
- If the baseline is instead 10x slower than Polkajam, the same 300x implementation would need 90 days
These numbers are speculative, but that is the point of sharing performance results at this stage: they are not about improving by a few ms to climb the ranking, but about giving a tangible sense of scale for testing.
I will likely update performance numbers less often (about once per week) to reduce noise from micro-optimizations and keep the focus on M1 compliance. Slower implementations that report tangible improvements may get more frequent updates. [edited]
2025-08-31 13:39 jaymansfield: I have started a discussion about 1756548741 : https://github.com/davxy/jam-conformance/discussions/43
2025-08-31 17:37 jaymansfield: Regenerated the rankings:
2025-08-31 17:37 jaymansfield: Screenshot 2025-08-31 at 1.34.03 PM.png
2025-08-31 17:37 jaymansfield: Screenshot 2025-08-31 at 1.35.38 PM.png
2025-08-31 17:37 jaymansfield: Screenshot 2025-08-31 at 1.35.48 PM.png
2025-08-31 17:41 clearloop: seems our perf is not updated XD, we should be at least ~3x faster now
2025-08-31 19:01 sourabhniyogi: Since most of us are deathly afraid of flaming out of fear of being too slow (or have really large egos 🤪🤣 ), why not just state some precise guidelines now that you've got almost all of us into basic "walking" shape.
Concretely:
1. Your implementation _should_ perform no slower than 10x (or 25x or 50x, you pick the number) polkajam_int across all test groups
2. Your interpreter _should_ be able to interpret at 1M gas/s (or 10M gas/s, 50M gas/s, you pick the number).
By stating this clearly, teams who are way ahead of this "should" threshold can proceed without fear, focusing on winning the M3/M4 marathon (refining Doom/.. work packages with a recompiler) rather than an M1 sprint rankings game (which we should save for M3/M4, right?). In the end the timings are dominated by PVM accumulate execution?
What do you think -- does that make sense? [edited]
2025-08-31 19:11 dakkk: > <@sourabhniyogi:matrix.org> Since most of us are deathly afraid of flaming out of fear of being too slow (or have really large egos 🤪🤣 ), why not just state some precise guidelines now that you've got almost all of us into basic "walking" shape.
>
> Concretely:
>
> 1. Your implementation _should_ perform no slower than 10x (or 25x or 50x, you pick the number) polkajam_int across all test groups
> 2. Your interpreter _should_ be able to interpret at 1M gas/s (or 10M gas/s, 50M gas/s, you pick the number).
>
> By stating this clearly, teams who are way ahead of this "should" threshold can proceed without fear, focusing on winning the M3/M4 marathon (refining Doom/.. work packages with a recompiler) rather than an M1 sprint rankings game (which we should save for M3/M4, right?). In the end the timings are dominated by PVM accumulate execution?
>
> What do you think -- does that make sense?
I don't think it would be a good idea for M1 and M2: interpreted languages are slower by design, and they may never achieve certain performances. [edited]
2025-08-31 19:12 sourabhniyogi: image.png
2025-08-31 19:12 sourabhniyogi: That makes sense -- probably need to make Set C + D versions of those guidelines based on the data.
2025-08-31 19:15 sourabhniyogi: Since you (and other Set C folks) know what Python (and other interpreted languages) can achieve, what would you have as guidelines? Key word is "should" not turning into a "must" [edited]
2025-08-31 19:19 dakkk: Idk, I personally prefer not to have constraints on it
2025-08-31 19:19 sourabhniyogi: Using the word "must" would be a constraint. "should" would mean you would be expected to improve your interpreter to get more fuzzing action.
2025-08-31 19:27 davxy: What you said makes sense, but keep in mind:
1. The data we collected is meant to support this goal. Giving guidelines without data would just be speculation.
2. I can't really give you an answer here. You might need to escalate this kind of question to the architect :-D Maybe in jam chat. Not sure.
2025-08-31 19:31 davxy: I see that Gav is not in this channel BTW
2025-08-31 19:32 clearloop: I'd like to share some opinions at this topic as well, not sure if we should pivot this discussion to the let's jam channel first [edited]
2025-08-31 19:32 davxy: So yeah, if your questions can't be answered here, I'd say try with the JAM channel
2025-08-31 19:34 clearloop: could you please (mb) re-post your paragraph in the let's JAM channel sourabhniyogi , and then we can continue on it? [edited]
2025-08-31 20:00 prematurata: any difference in tiny for WA and WB and WC? [edited]
2025-09-01 07:02 prematurata: I want to tell you how I lost 4 hours of my life debugging. but i will share the code. now that I share it you'll instantly find it, but it took me 4 hours: `SERVICECODE_MAX_SIZE = 4_000_0000;`
damn fat fingers [edited]
2025-09-01 07:37 clearloop: I once spent 2 days on an encoding order error = =
2025-09-01 07:39 davxy: Jason | JavaJAM: can you pls open a PR with the latest version of the perf script? I'd like to have it in the jam-conformance `scripts` folder. Ty
2025-09-01 08:09 clearloop: I just made one for Boy Maas | JamZig⚡ with his commit info. would you mind checking it over, or you can make one instead, Boy Maas | JamZig⚡; I want to keep you as the original author of this script in the git history, out of respect https://github.com/davxy/jam-conformance/pull/46 then we can work on updating it together later [edited]
2025-09-01 08:31 boymaas: Thank you clearloop | SpaceJam, I was about to ask you to create one. Thank you for the PR and the attribution; I appreciate it! 😎
2025-09-01 09:47 davxy: Please add your considerations/proposals
2025-09-01 09:47 davxy: https://github.com/davxy/jam-conformance/pull/47
2025-09-01 10:03 sourabhniyogi: Jason | JavaJAM: would you like to add your ideas on how to add refinement to the above? davxy is this ok to do?
2025-09-01 10:05 sourabhniyogi: Are there [Set C/D?] teams who want to work on an FFI into a PVM recompiler? If so, we probably can do a meaningful call together, well beyond intros, at the intersection of everything. [edited]
2025-09-01 10:34 bamzedev: davxy: https://github.com/davxy/jam-test-vectors/blob/master/stf/reports/tiny/banned_validator_guarantee-1.json if I understand correctly, you are expecting that the guarantee is invalid because one of the guarantors is in the offenders list? If that is the case, could you point out the GP equation for that validation? [edited]
2025-09-01 11:54 davxy: 11.21 & 11.22
https://graypaper.fluffylabs.dev/#/1c979cb/155800155800?v=0.7.1
the key is zeroed out if it belongs to the offenders set (as per 6.14)
2025-09-01 11:55 davxy: I'll add it to the features.
2025-09-01 12:03 sourabhniyogi: would you be able to join a call with us (this week or next) to give us some tips on how to do a refine fuzzer that is approximately what you would do? I understand your priorities are on M1 but a little guidance would go a long way
2025-09-01 20:44 sourabhniyogi: https://github.com/davxy/jam-conformance/pull/50
2025-09-02 00:20 r2rtnl: A discussion about 1756548706: https://github.com/davxy/jam-conformance/discussions/51
2025-09-02 09:20 davxy: Reports batch: https://github.com/davxy/jam-conformance/pull/52
2025-09-02 12:03 ascriv: @davxy can you enlighten us on what is left to be done in pre-submission fuzzing before M1 submissions are opened?
2025-09-02 12:07 davxy: 1. finish some open tasks on the fuzzer
2. define/enact fuzz protocol v1 https://github.com/davxy/jam-conformance/pull/47
3. better define baseline timings/requirements/auditors etc. I'll discuss this with the team as soon as I have the opportunity
2025-09-02 12:09 ascriv: thanks!
2025-09-03 12:00 erin: Hello everyone, the conformance performance dashboard is now ready -> https://paritytech.github.io/jam-conformance-dashboard/
Please let me know if anything looks off/wrong. Do note it does use a weighted scoring mechanism (methodology detailed at the bottom of the page) so it may not always look strictly "in order". [edited]
2025-09-03 12:07 erin: Checks for new data are performed hourly.
2025-09-03 12:38 oliver.tale-yazdi: Maybe there could be an option to anchor it to a "fixed baseline" version of PolkaJAM (e.g. 0.7.0)?
Then teams can continuously check if they improved, otherwise it would look like they get slower after PolkaJAM optimized a bit [edited]
↳ 2025-09-03 12:41 erin: good feedback, thanks!
2025-09-03 18:35 boymaas: Very nice dashboard erin . Now I also need to start optimizing 🤠
2025-09-03 19:15 clearloop: wow, this dashboard looks much prettier than I could have imagined [edited]
2025-09-05 11:00 sourabhniyogi: For all of us macro data refiners (you know you want to =)), please check out / comment on https://github.com/davxy/jam-conformance/pull/59 -- this aims to add the ability to trace the causes of work report differences with a new `feature-exports` in addition to the basics of `feature-bundle-refinement`.
2025-09-05 20:30 jaymansfield: https://github.com/davxy/jam-conformance/discussions/64
2025-09-08 10:01 dakkk: davxy: what should we do when we create a new release of our impl? I see other teams tag you in GH issues, both for updating perf and traces. Is that ok for you?
It could be an idea to schedule an automatic rebuild of the reports / perf (once or twice a day)
2025-09-08 11:42 boymaas: image.png
2025-09-08 11:42 boymaas: Good question. I started optimizing yesterday, and now a new version is coming with some promising improvements. It's not done yet. 🤠 [edited]
2025-09-08 16:46 davxy: Feel free to leave a note in your team's issue.
For now, though, I'm not going to regenerate performance reports more than once a week.
Automation would be nice, but I'm not going to invest my time improving my hacky scripts and maintaining some infrastructure.
I realize automation may seem like a "fire and forget" solution, but right now that is not feasible.
On top of that, my DevOps skills are not sharp enough to make the setup fully robust :-D
Last but not least, the focus should stay on M1, not on chasing a tight race for the top spot in the speed rankings.
2025-09-08 17:02 sourabhniyogi: For [Fuzzer Protocol v1](https://github.com/davxy/jam-conformance/pull/47) should we aim for implementing that this week or next amidst v0.7.0 or plan to have that as part of v0.7.1?
2025-09-08 17:03 sourabhniyogi: Probably some timing expectations on when we wrap up 0.7.0 fuzzing and do 0.7.1 would be useful?
2025-09-08 17:07 sourabhniyogi: Not sure if we should be doing { Ancestry support, Simple forking } exploration in 0.7.0 fuzzing -- it may be simplest for everyone to have Fuzzer Protocol v1 synced with 0.7.1?
2025-09-08 17:19 rustybot: I think we are ready to deploy v1. It's just about changing the PeerInfo message, which is trivial. If a target doesn't support a feature, it can turn off the flag
2025-09-08 17:22 rustybot: https://github.com/davxy/jam-conformance/pull/47#issuecomment-3267172222
2025-09-08 17:26 sourabhniyogi: How about finalizing v1 by merging 47 (with or without an Error string) and having a specific date (suggest: 9/11) be the v0=>v1 cutover?
2025-09-08 17:35 davxy: Sounds good. What about the Error message? Any suggestions? I'd like to keep this dead simple
2025-09-08 17:36 davxy: So perhaps the error message (which can be empty) can be included in the fuzz report
2025-09-08 17:42 sourabhniyogi: The situation is that because you give us 1-5 homework disputes every 2-3 days, it makes very little difference, since we just debug each of them by hand. If it were 10-100x greater and we had "I can't replicate what you see" problems, an error string would matter a LOT! But we do not. However, I imagine if you had programmatic fuzzing (of PVM) working at 10-100x scale, these error strings would be 10-100x more useful for fuzzing scalability.
I think a lot of us became SUPER performance-focused in the last 7-10 days because of the "audit" expectations that we will be doing 10-1000x more fuzzing, running for 3 days rather than 3-30s. The moment you have programmatic fuzzing working at higher scale, it becomes valuable. Looking ahead to PVM fuzzing, this error string could summarize the result of "we differ at gas/step/opcode with xyz registers" from the fuzzer probing the target via this feature https://github.com/davxy/jam-conformance/pull/65 -- but we should not have this feature be part of v1 with a v0=>v1 cutover this week. [edited]
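(Purely as an assumption of what such a summary string could carry for PVM-level divergences; nothing below is the v1 format:)
```
// Hypothetical divergence summary for an Error message.
struct PvmDivergence {
    step: u64,
    pc: u64,
    opcode: u8,
    gas_ours: i64,
    gas_theirs: i64,
}

impl PvmDivergence {
    fn to_error_string(&self) -> String {
        format!(
            "diverged at step {} pc {} opcode {:#04x}: gas {} vs {}",
            self.step, self.pc, self.opcode, self.gas_ours, self.gas_theirs
        )
    }
}
```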
↳ 2025-09-09 01:20 clearloop: I got sick last week working on my implementation (after 3~5 hours of sleep per day) re-writing my compiler, and my compiler is now getting even worse 😅
↳ 2025-09-09 02:32 xlchen: this is the fun part of doing optimizations. you never know if your optimization actually optimizes anything until you've spent/wasted a bunch of time on it [edited]
↳ 2025-09-09 19:30 danicuki: > <@xlchen:matrix.org> this is the fun part of doing optimizations. you never know if your optimization actually optimizes anything until you've spent/wasted a bunch of time on it
In practice, the theory is different.
2025-09-08 18:23 sourabhniyogi: In the end, because there is just 1 of you and 15+ of us, we should do whatever makes your life easier so we get over the finish line together faster -- anything to increase the fuzzing level so we get as much conformance as possible, with at least half of us done with M1 before sub0 (ideally) and most of us before xmas. It seems pretty clear from the extremely high speed at which teams do their homework each week that we're all capable and motivated to work fast, but we don't want to stress you out either =) [edited]
2025-09-09 01:10 clearloop: agree with the error message: the current report (at the per-team level) doesn't make a lot of sense, since we can get the diffs anyway by debugging locally (diff of key-values). with error messages embedded, the reports would be more useful, and it would be nice if the fuzzer provided error messages in the report as well, e.g. for the cases where the target can import blocks that should not be imported
from yesterday till today, I have hit some traces that break remotely but work locally, and without the error message from the remote side, I actually have no idea what to fix ... [edited]
2025-09-09 14:45 davxy: https://github.com/davxy/jam-conformance/pull/68
2025-09-09 20:47 vinsystems: https://github.com/davxy/jam-conformance/discussions/71
2025-09-09 20:53 prematurata: damn bernar. you beat me by some minutes. I opened one as well for 771 https://github.com/davxy/jam-conformance/discussions/72
2025-09-09 21:35 davxy: Fuzzer protocol v1 will go into effect on Friday, 12 Sep. Starting on that date, our fuzzer will implement the changes proposed in:
https://github.com/davxy/jam-conformance/pull/47 [edited]
2025-09-09 21:43 sourabhniyogi: Friday 9/12 yes?
2025-09-09 21:50 davxy: Fixed 😅
2025-09-10 10:30 clearloop: (are the fork tests the last set for M1? ^ ^)
2025-09-10 10:41 ascriv: > <@davxy:matrix.org> Fuzzer protocol v1 will go into effect on Friday, 10/11. Starting on that date, our fuzzer will implement the changes proposed in:
> https://github.com/davxy/jam-conformance/pull/47
On import failure due to a jam protocol-defined error condition, it seems like the README says we should both return the 0 hash AND an 'Error' message, but shouldn't it be one or the other?
2025-09-10 10:44 davxy: We should return Error message. I need to remove the 0 hash requirement. That is a leftover
2025-09-10 14:12 clearloop: btw, I want to discuss whether header verification and fork detection should live at the network level. if the received header is not validated, we won't execute it at all (or put it in the pool). tbh the fork part and the header validation break our design: I have to move my validations down from the network layer to the block import layer, and for the fork part, I have to rewrite the logic just for the tests (bcz there is no finalization, no "grandpa", no peer handshake info)
but as a "block importer", header validation & fork detection make sense again ... [edited]
2025-09-10 14:51 dave: Not entirely sure what you mean. From the networking perspective you should verify blocks before announcing them to peers. You should announce and attempt to import all forks. It should be possible to verify blocks significantly faster than executing them. In particular, it is intended to be possible to verify a block without performing any accumulation, so without any PVM execution. There is _one_ validity condition that cannot be checked without performing accumulation; that we don't attempt to create two services with the same ID. It is not intended that this is checked before announcing/distributing a block to peers.
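(Schematically, the split being described, with illustrative names: verification covers signatures and structural checks and needs no PVM execution, while full import runs accumulation afterwards:)
```
// Rust sketch; names and the error type are assumptions.
struct Block;

/// Cheap checks, safe to run before announcing the block to peers:
/// author/seal signatures, extrinsic structure, etc. No accumulation,
/// hence no PVM execution.
fn verify(_b: &Block) -> bool {
    true
}

fn import(b: Block) -> Result<(), String> {
    if !verify(&b) {
        return Err("invalid block".into());
    }
    // Announcing to peers can happen here, before the expensive part.
    accumulate(&b) // the one remaining check: duplicate new-service IDs
}

fn accumulate(_b: &Block) -> Result<(), String> {
    Ok(())
}
```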
2025-09-10 14:53 dave: I think we could probably only require verification of the author and the author's signature before announcing; this isn't what the JAM-NP doc currently says though
2025-09-10 14:57 clearloop: > From the networking perspective you should verify blocks before announcing them to peers
yes, but verifying headers after producing a block or receiving a new one from remote peers happens right at the SNP layer; if it validates, I'll put it into a fork, waiting for the block processor to pick it up and execute it (block import). I meant that **header validation is ahead of block execution**; they are not bound together. also, if we receive a block earlier than the finalized chain, I won't even validate it. so to adapt to the tests, I need to bind them together, but that's fine indeed, since the logic is minor, and the header verification logic does need to be confirmed or it may be tough for the networking part [edited]
2025-09-10 15:02 dave: Still not really sure what you're saying. Are you saying the fuzzing interface is forcing you to change your design, even though your current design is fine for implementing JAM-NP?
2025-09-10 15:04 clearloop: sorry, and yes, I meant that I have to write new logic just to adapt to the tests now (which will not contribute to production). header validation is indeed fine, but testing forks at the current state, I think it may finally go out of control [edited]
2025-09-10 15:14 dave: Nodes are expected to follow all forks, not clear to me why it's an issue for the fuzzer to test forks? What do you mean by "testing fork at the current state"?
2025-09-10 15:20 clearloop: for example, we are now introducing fork tests at this level: https://github.com/davxy/jam-conformance/discussions/71#discussioncomment-14360978 , which simply replace the latest imported block
I believe the approach in production is creating a new fork at the parent, resulting in two fork chains (currently without grandpa or finalization), but with the current fuzzing interface we are more like maintaining a "finalized chain" (or importing from a finalized chain), which conflicts. starting from this, I'm afraid we'll introduce more and more parameters/arguments into the fuzzer until we can cover the entire fork system, and during that process I believe some teams will end up rewriting their implementation 5%, 10%...100%, until they are back to where they were months ago
[edited]
2025-09-10 15:42 dave: Ok, I think davxy's input is needed here. But I would think you probably shouldn't be immediately finalizing blocks that are imported. Ideally the fuzzer won't force the node to be architected in any particular way, but you're naturally going to have to write _some_ code to interface with it; M1 doesn't require JAM-NP so we can't just use that.
2025-09-10 15:44 clearloop: > But I would think you probably shouldn't be immediately finalizing blocks that are imported.
what do you think about the light-node case? If I connect to a node that is present in the validator set, can I import its finalized blocks without verification? [edited]
2025-09-10 15:44 dave: Light nodes will probably follow the finalized chain and so lag slightly
2025-09-10 15:47 dave: Though it really depends on the use case, for some use cases maybe it's good enough to trust instead of verify
2025-09-10 15:48 dave: In the case of a finalized block you can always ask for proof of finality
2025-09-10 15:48 dave: (This isn't in the JAM-NP spec, GRANDPA stuff will be added soon)
2025-09-10 15:49 clearloop: but if a node is present in the validator set and we can't trust its finalized blocks, isn't consensus broken? or would the case be like:
- if we import blocks from it from 0 up to the latest finalized block, we can trust them (maybe with small verifications)
- if we just pick some blocks in the middle, we'll need the proof of finality
hmm, but the rpc may cheat or be broken as well? [edited]
2025-09-10 15:51 dave: What do you mean by "trust its finalized blocks"? Validators are generally fairly trusted but the protocol is designed to work with up to 1/3 byzantine validators. A byzantine validator could lie to you about which blocks have been finalized
2025-09-10 15:52 dave: If you want to be sure a block has been finalized you need to ask for a proof and verify it. A proof will consist of various GRANDPA votes, possibly going all the way back to the genesis block to verify validator set changes and so on
2025-09-10 15:52 dave: RPC is totally different, as the RPC interface is specifically intended to only be used between two processes which trust each other [edited]
2025-09-10 15:53 dave: Probably because they are being run by the same operator
2025-09-10 15:57 danicuki: Sorry if I missed this information in the protocol: when a block should not be imported for any reason, how do we know the expected reason why the block wasn't imported? On test vectors there was an expected error message. Do we have something similar in the fuzzer?
2025-09-10 15:57 clearloop: you got me, I've probably been a bit loose on security in my previous work lol, e.g. importing blocks from an open rpc directly without checks to build centralized databases. the concept of proof of finality makes sense to me! [edited]
2025-09-10 16:01 clearloop: I asked for this before as well, but it seems it's still not included in Fuzzer V1 yet (there is only an error from target to fuzzer, not fuzzer to target); perhaps we can comment on it in https://github.com/davxy/jam-conformance/pull/47 [edited]
2025-09-10 16:33 davxy: Ideally, you should not need to write any special code to interact with the fuzzer.
You only need to implement the communication channel, which is straightforward.
No special handling is required at the block import level.
## Basic Protocol
1. Receive a block
2. Import the block, which yields either success or failure
3. Reply:
- On success: return the resulting state root
- On failure: return the `Error` message (the reason is irrelevant and not specified in the GP)
## Handling Forks
In production, you must support forks that occur above the last finalized block.
However, since there is no finalization rule in this context, you will not finalize anything.
Technically, this means you must be able to handle any kind of fork.
For fuzzing, of course, this is not required.
The only requirement (which matches a correct implementation and does not need special code)
is that you must allow forks at the **last imported block**.
This is necessary because fuzzing involves mutating blocks in several ways,
and you must attempt to import all the mutations (some may fail, some may be successful -> fork)
## Example Session
1. Let `i = 0`
2. Increment `i` and produce block `B_i` with parent `B_(i-1)`
3. Mutate `B_i` into several variants: `B_i1`, `B_i2`, `B_i3`
4. Send these variants in order: `B_i1`, `B_i2`, `B_i3`, and finally `B_i`
5. Repeat from step 2, using `B_i` as the parent for the next batch
## Notes
- Finality is not required: just import the blocks
- Forks created by the mutations can remain in the state
- (optional) You may prune these forks as you progress if you wish [edited]
2025-09-10 16:34 davxy: This goes straight into the fuzz protocol v1 readme
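For illustration, the example session above can be modeled with a toy importer; everything here (the `valid` flag standing in for real validation, the byte-string state) is made up, but it shows why tolerating a fork at the last imported block is the only requirement:
```python
# Toy model: mutations B_i1..B_i3 and the honest B_i all share one parent,
# so the target must accept forks at the last imported block.
from dataclasses import dataclass
import hashlib

@dataclass(frozen=True)
class Block:
    parent: str
    payload: bytes
    valid: bool  # stands in for real GP validation

    @property
    def hash(self) -> str:
        h = hashlib.blake2b(self.parent.encode() + self.payload, digest_size=8)
        return h.hexdigest()

class MiniChain:
    def __init__(self):
        self.states = {"genesis": b""}  # state per block hash; forks coexist
        self.best = "genesis"           # last successfully imported block

    def import_block(self, blk: Block) -> str:
        if blk.parent not in self.states:
            raise ValueError("unknown parent")
        if not blk.valid:
            raise ValueError("import failed")  # would reply with Error
        self.states[blk.hash] = self.states[blk.parent] + blk.payload
        self.best = blk.hash
        return hashlib.blake2b(self.states[blk.hash], digest_size=8).hexdigest()

chain = MiniChain()
parent = chain.best
# three mutations, then the honest block, all with the same parent:
for blk in [Block(parent, b"m1", False), Block(parent, b"m2", True),
            Block(parent, b"m3", False), Block(parent, b"ok", True)]:
    try:
        print("StateRoot:", chain.import_block(blk))
    except ValueError as err:
        print("Error:", err)
# the next batch extends from the last successfully imported block
assert chain.best != parent
```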
2025-09-10 16:39 davxy: The trace and the report. Re-execute to find out why your impl fails.
This is the same as with the jam-test-vectors traces: a dynamic that matches how the system actually works. [edited]
2025-09-10 16:43 clearloop: > The only requirement (which matches a correct implementation and does not need special code)
> is that you must allow forks at the last imported block.
>
> This is necessary because fuzzing involves mutating blocks in several ways,
> and you must attempt to import all the mutations (some may fail some may be successful -> fork)
well, I can accept the current fork system now, since I just realized I have to refactor my test infra anyway..., but I want to confirm: **the current fork system won't be expanded anymore?**
block mutation happens, and we already have pretty large logic handling orphan blocks, invalid blocks, forks, forks of forks; block mutation may happen in any part of them, and we have handled all of it, but to adapt to the fuzzer we still need to write new logic, because our existing system requires more information to bootstrap [edited]
2025-09-10 16:46 davxy: > but want to confirm if the current fork system won't be expanded anymore
I can confirm this. At least according to our fuzzer protocol proposal [edited]
2025-09-10 18:21 danicuki: In this case, https://github.com/davxy/jam-conformance/blob/main/fuzz-reports/0.7.0/traces/1757406441/00000117.json - the post_state is the same as the pre_state. This means the block import should fail for some reason, resulting in the same state root. But based on the trace, we don't know the reason why you are rejecting the block. In the test vectors there is something like `"err": "bad_validator_index"` or other hints about why the block fails.
2025-09-10 19:10 davxy: Hmm interesting... indeed we could include the result of our last block import in the final fuzz report - it might serve as a helpful hint.
However, it should be noted that the evaluation order of some expressions is arbitrary, so with a block that has the potential to trigger multiple errors the result is a bit ambiguous.
Edit: After thinking about it, I agree - I don't see a major issue with adding it to the report. [edited]
2025-09-10 22:04 danicuki: > <@davxy:matrix.org> Hmm interesting... indeed we could include the result of our last block import in the final fuzz report - it might serve as a helpful hint.
>
> However, it should be noted that the evaluation order of some expressions is arbitrary, so with a block with the potential to trigger multiple errors the result is a bit ambiguous.
> It thus also exposes some implementation details, which might not be super ideal.
>
> I'm not against including it, just unsure.
> Perhaps we should add it if it meaningfully speeds up the analysis.
> What do you think, [@dave:parity.io](https://matrix.to/#/@dave:parity.io)
You could list all potential errors in no particular order, or at least reveal the first one. It is very, very hard from our side to know why a particular block is rejected. When we have a state value diff, it is easier to debug and find reasons for the diff. But when the next state doesn't change, we stay clueless and can't even challenge the fuzzer.
2025-09-10 22:08 danicuki: Another idea: fuzzer target binaries are public. This means anyone can download other teams' binaries, run the blocks through them, and inspect the logs for clues. For this reason, since you already run the teams' fuzzer targets, you could share the running logs of all teams in the repo. [edited]
2025-09-11 11:44 davxy: As you are the main stakeholders, could I ask for one more review round before we merge?
The changes are summarized in the PR description: https://github.com/davxy/jam-conformance/pull/47
2025-09-11 13:08 r2rtnl: https://github.com/davxy/jam-conformance/discussions/74
2025-09-11 16:02 charliewinston14: Is it correct that the ancestry set is maintained separately for each fork?
2025-09-11 16:15 davxy: The ancestry is determined using the P() function (see 5.3), which traces the chain backward starting from a specified header.
For a block within one fork, the state in other forks is irrelevant. [edited]
2025-09-11 16:17 davxy: For the sake of fuzzing you can maintain a simple bounded queue... (as we never extend a mutated chain) [edited]
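A minimal sketch of such a bounded queue (the depth of 24 and the helper names are assumptions, not from the protocol):
```python
# Bounded ancestry queue: enough for fuzzing only because the fuzzer never
# extends a mutated fork, so one linear history suffices.
from collections import deque

ancestry: deque[tuple[int, bytes]] = deque(maxlen=24)  # (slot, header_hash)

def on_successful_import(slot: int, header_hash: bytes) -> None:
    ancestry.append((slot, header_hash))

def is_ancestor(header_hash: bytes, slot: int) -> bool:
    # membership check over the bounded window
    return (slot, header_hash) in ancestry
```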
2025-09-11 22:15 r2rtnl: https://github.com/davxy/jam-conformance/discussions/75
2025-09-11 23:55 sourabhniyogi: It would likely be very useful to have v1 .bin/.json ancestors (and maybe forks) traces for us to test our v1 implementations against as we post them in the next 24-72 hours, in examples/v1, with the updated names [edited]
2025-09-12 17:04 davxy: [V1 Enacted](https://github.com/davxy/jam-conformance/pull/47)
2025-09-12 17:38 sourabhniyogi: Can you suggest the proper home for this issue https://github.com/davxy/jam-conformance/issues/76
2025-09-12 17:44 clearloop: I assume this should be under `w3f/jam-test-vectors` or https://github.com/polkadot-fellows/JIPs ? [edited]
2025-09-12 17:46 clearloop: it's pretty much like a standard interface for `PVM-FFI`; I think Jan Bujak may have some input? according to the benchmark tool in polkavm, there is a sort of architecture of Module, compile, various steps, etc. [edited]
2025-09-12 17:48 clearloop: https://github.com/paritytech/polkavm/blob/master/tools/benchtool/src/backend.rs#L14 this is pretty much a universal interface, but I think it's still missing the host calls part to make it PVM-in-JAM focused [edited]
2025-09-12 18:06 sourabhniyogi: Jan Bujak: Can you advise?
2025-09-12 18:56 prematurata: 🥳 should we signal support of our targets somehow? in the ghissue maybe? [edited]
2025-09-13 05:16 jan: Something similar to this but cut down:
https://docs.rs/polkavm/latest/polkavm/struct.RawInstance.html
Basically:
1. A function to turn a raw program blob into a module (i.e. a compilation step for recompilers; for interpreters this will be mostly a nop)
2. A function to instantiate a new VM instance from that module.
3. Then various getters/setters to modify that instantiated VM's state and be able to call into it.
We definitely do not want to have callbacks in the interface. There should be just a single `run` method that you call on the instance; if it happens to trigger a host call, it should return a status code signifying that a hostcall was called (similar to how inner PVM invocation hostcalls work). Same with handling page faults.
The various "read_bytes_8" functions are completely unnecessary; there should be only one function for reading bytes with a given length, and using that to e.g. read a 32-bit number is just a few trivial bitshifts and bitors away.
And of course, most of the stuff in your "Debug and Tracing" section should not be in a standard API at all.
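A rough sketch of the shape jan describes, in pseudo-Python (names are illustrative, not polkavm's actual API):
```python
# Illustrative interface only: compile once, instantiate per run, and a
# single run() that returns a status instead of invoking callbacks.
from enum import Enum, auto

class Status(Enum):
    HALT = auto()
    HOST_CALL = auto()   # run() returns; the embedder services the call
    PAGE_FAULT = auto()  # likewise surfaced as a status, not a callback
    TRAP = auto()
    OUT_OF_GAS = auto()

class Module:
    @staticmethod
    def from_blob(blob: bytes) -> "Module":
        ...  # compilation step for recompilers; mostly a no-op for interpreters

class Instance:
    def __init__(self, module: Module): ...
    # getters/setters on the instantiated VM's state:
    def get_reg(self, idx: int) -> int: ...
    def set_reg(self, idx: int, value: int) -> None: ...
    def read_memory(self, address: int, length: int) -> bytes: ...
    def write_memory(self, address: int, data: bytes) -> None: ...
    def run(self) -> Status: ...  # the only entry point; no callbacks

# e.g. a 32-bit read is just one memory read plus a few shifts and ors:
def read_u32(vm: Instance, address: int) -> int:
    b = vm.read_memory(address, 4)
    return b[0] | (b[1] << 8) | (b[2] << 16) | (b[3] << 24)
```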
2025-09-13 12:12 sourabhniyogi: OK! I will adjust our pvm.h to match that API 100% and shift the callback process to the run method approach, thank you!
2025-09-13 18:21 danicuki: Do we need to maintain fuzzer target accepting multiple versions? Or should we keep only the latest (in this case, now, v1)?
2025-09-13 19:52 davxy: for what concerns our fuzzer, you can switch to v1 only. All new fuzzing sessions will use v1
2025-09-14 15:23 danicuki: Are you planning to add some examples that use ancestors on v1? As far as I saw, none of the 30 examples use the ancestors feature.
2025-09-14 17:06 davxy: Yes, but I've not implemented the feature yet. Indeed during the handshake the flag is turned off in our PeerInfo message
2025-09-14 17:06 davxy: new batch https://github.com/davxy/jam-conformance/pull/78 [edited]
2025-09-14 20:13 prematurata: https://github.com/davxy/jam-conformance/discussions/79
2025-09-15 02:42 clearloop: interesting, each time new traces come out, I can trigger the sum of failures from all other teams
2025-09-15 22:32 r2rtnl: davxy: a question regarding ancestor support and sharing the same implementation of the (11.35) check between a fuzzer target and jam-test-vectors.
In GP 0.7.0, (11.35) requires two things:
1. The lookup_anchor hash must be present in the ancestor set.
2. Its slot must match lookup_anchor_slot.
The issue I'm seeing is that the reports test vectors only provide recent_blocks, which don't include slot information.
This creates a mismatch:
- To pass the reports test vectors, the implementation must skip the slot check and check only against recent_blocks.
- For full ancestor support in the fuzzer target, the slot check must remain.
Would it be feasible to adjust the reports test vectors to include slot information, or otherwise make it possible to construct the ancestor set so that the full (11.35) check can run? [edited]
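For concreteness, the mismatch might be sketched as follows, assuming the ancestor set is stored as (slot, hash) pairs; with recent_blocks alone, only the hash half of (11.35) is checkable:
```python
# Sketch of the two-part (11.35) lookup-anchor check vs. what the reports
# test vectors currently allow. Names are illustrative.

def lookup_anchor_ok(ancestors: set[tuple[int, bytes]],
                     lookup_anchor: bytes, lookup_anchor_slot: int) -> bool:
    # 1. hash present in the ancestor set, and 2. its slot matches
    return (lookup_anchor_slot, lookup_anchor) in ancestors

def recent_blocks_only_ok(recent_blocks: set[bytes], lookup_anchor: bytes) -> bool:
    # reports vectors provide hashes only, so the slot check must be skipped
    return lookup_anchor in recent_blocks
```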
2025-09-16 02:55 ascriv: https://github.com/davxy/jam-conformance/discussions/84
2025-09-16 09:07 davxy: Please try to run your implementation against minifuzz first
https://github.com/davxy/jam-conformance/pull/85
↳ 2025-09-16 10:01 r2rtnl: davxy: Does minifuzz assume that an implementation supports forks? Messages 5 and 7 seem to import a block with the same slot (3).
↳ 2025-09-16 10:39 davxy: In the next few hours I'll share a set with forks and another without forks.
2025-09-16 09:08 davxy: May be a bit buggy :-) feel free to open a PR to improve it
2025-09-16 09:24 davxy: Teams that are already enrolled are encouraged to run it anyway, as I have found some minor issues in the targets (e.g., incorrect error message length, wrong argument order in PeerInfo, etc.).
I will not fuzz your target if you have not successfully completed the first 20 steps in the examples/v1 folder using minifuzz [edited]
2025-09-16 10:21 tomusdrw: Is there any more details on the examples (i.e. expected state, etc)? We are getting a wrong state root at some point, but not really sure how I could debug it.
2025-09-16 10:38 davxy: I added a readme file to the examples folder. However, I'll copy the first 20 steps to the minifuzz folder to avoid misunderstandings
2025-09-16 10:46 clearloop: we have unit tests for `examples/v1`, e.g. decoding/encoding the `*.bin` files and checking that things match; not sure if that's an easier way to test the format. will try the mini-fuzz script anyway
2025-09-16 10:56 tomusdrw: Ah, okay, perfect! I missed that.
2025-09-16 13:01 jaymansfield: davxy: Just a heads up, minifuzz seems to work until the invalid state root in step 29. It then shuts down after seeing the mismatch rather than proceeding with the final get state request.
2025-09-16 13:02 davxy: Yes this is known. See the README in the examples folder.
I'll prepare some other traces for minifuzz testing (with and without forks)
2025-09-16 13:04 davxy: In the examples we intentionally return a wrong root to emulate a bad target, and thus include the GetState message in the examples sequence
2025-09-16 13:04 davxy: But this is causing some confusion (you are the third to ask)
2025-09-16 13:04 davxy: so I'll provide a dedicated folder for the "self test"
2025-09-16 13:07 jaymansfield: This would help. I think the readme is clear that it should fail, but not that minifuzz won't actually perform the get state itself afterwards. That was the missing part. [edited]
2025-09-16 13:10 clearloop: wait, I just realized that `minifuzzer` stopped after 10 pairs on my machine without any error messages; not sure if this is normal
```
==========================================================================
Processing pair 10: 00000009_fuzzer_import_block.bin -> 00000009_target_state_root.bin
TX: import_block
RX: state_root
Stopping after 10 file pairs as requested
``` [edited]
2025-09-16 13:11 jaymansfield: Try adding the argument: --stop-after 30
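Putting the flags seen in this thread together, a run covering all 30 pairs would look something like:
```
python minifuzz/minifuzz.py -d examples/v1 --target-sock /tmp/jam_target.sock --stop-after 30
```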
2025-09-16 13:47 davxy: https://github.com/davxy/jam-conformance/tree/main/fuzz-proto#preliminary-self-testing
2025-09-16 13:47 davxy: I hope it's clear now
2025-09-16 14:12 danicuki: I get an error when trying to run minifuzz:
```
$ python minifuzz/minifuzz.py -d examples/v1/forks --target-sock /tmp/jam_target.sock
Traceback (most recent call last):
File "/Users/danicuki/dev/jam-conformance/fuzz-proto/minifuzz/minifuzz.py", line 9, in
from jam_types.fuzzer import FuzzerMessage
ModuleNotFoundError: No module named 'jam_types'
```
Where can I find the `jam_types` code?
2025-09-16 14:14 clearloop: ```
pip install git+https://github.com/davxy/jam-types-py.git
```
2025-09-16 14:22 clearloop: davxy: I think there is a problem in the fork tests at `00000004`, which proposes `block.slot=1` with the parent hash of the block in `00000003` (`block.slot=1`, imported successfully) and expects import success. if I'm not mistaken, if this is a fork, it should have the parent hash of `block.slot=0`
2025-09-16 14:59 jaymansfield: I'm seeing the parent of 00000004 as the one from the initialize in 00000001
2025-09-16 15:40 danicuki: minifuzzer should not compare message size, since error messages can differ from implementation to implementation, no?
```
Processing pair 7: 00000006_fuzzer_import_block.bin -> 00000006_target_error.bin
TX: import_block
Error decoding target response: Decoding - No more bytes available (needed: 69 / total: 68)
Connection closed
```
I believe `FuzzerMessage(data=scale_bytes).decode()` might have a bug [edited]
2025-09-16 15:43 clearloop: can you get past that? from my logs, `0x7003387fc23f422113a94dd5abc0b4fa31eae0108b4ac9ea6a133ffd14e315fb` is the parent `00000004` wants
```
DEBUG read: message(length): Info
DEBUG write: message(21): Info
DEBUG read: message(length): Initialize(len=21)
DEBUG write: message(33): StateRoot(0xe174c1ea94e21958db28c9ea621f09ae2b360a9ecb6047afdbf94827d216e1db)
DEBUG read: message(length): ImportBlock(slot=1, hash=0x252ff7a523a698b91e872f90f3f046fac90b80869bf9c7cf2000d627f4e7df3e)
DEBUG import: importing block(1)=0x252ff7a523a698b91e872f90f3f046fac90b80869bf9c7cf2000d627f4e7df3e, best block: 0
WARN failed to import block: BadCoreIndex
DEBUG write: message(14): Error(BadCoreIndex)
DEBUG read: message(length): ImportBlock(slot=1, hash=0x7003387fc23f422113a94dd5abc0b4fa31eae0108b4ac9ea6a133ffd14e315fb)
DEBUG import: importing block(1)=0x7003387fc23f422113a94dd5abc0b4fa31eae0108b4ac9ea6a133ffd14e315fb, best block: 0
DEBUG write: message(33): StateRoot(0x25b1bd54e8d0d5c82202ae51de2d378b8dfa13f637fe75bda672484c25e1c6a6)
DEBUG read: message(length): ImportBlock(slot=1, hash=0xe63b5fa906678415f5b15ae6514d44b35e4dafa497f52a1640cbf374811b9c89)
DEBUG import: importing block(1)=0xe63b5fa906678415f5b15ae6514d44b35e4dafa497f52a1640cbf374811b9c89, best block: 1
WARN import: Fallback state to 0
WARN failed to import block: Parent mismatch, expected: 0x7003387fc23f422113a94dd5abc0b4fa31eae0108b4ac9ea6a133ffd14e315fb, got: 0x2bf11dc5e1c7b9bbaafc2c8533017abc12daeb0baf22c92509ad50f7875e5716
``` [edited]
2025-09-16 16:39 davxy: > I think there is a problem in the fork tests at 00000004
I don't see any issue
> I'm seeing the parent of 00000004 to be the one from the initialize in 00000001
Correct
Given the filename number associated to each block, the chain graph of the blocks (attempted to be imported) up to 7 should be:
```
1--+--2
   +--3
   +--4--+--5
         +--6--7
``` [edited]
↳ 2025-09-17 07:09 clearloop: thanks! I found that our impl has sort of bugs on falling back
2025-09-16 16:43 davxy: From the replies, it appears that:
- Blocks **2** and **5** fail to be imported.
- Blocks **3** and **4** are successfully imported (actual fork).
According to the fuzzer protocol, the chain is extended using the **last successfully imported block**.
In other words, you will **never** see blocks 3 and 4 successfully imported and then the chain extended from block 3.
2025-09-16 16:43 davxy: https://github.com/davxy/jam-conformance/tree/main/fuzz-proto#example-session
2025-09-16 20:42 sourabhniyogi: After teams get through "minifuzz" this week and a round or two of ancestry + forks this week and next, can we wrap up 0.7.0 and move collectively onto 0.7.1, say, in the week of Sept 29?
Since there isn't _that_ much happening on 0.7.2, perhaps we can jump from 0.7.0 to [0.7.2](https://github.com/gavofyork/graypaper/releases/tag/v0.7.2)? [edited]
2025-09-17 03:36 ascriv: Should L be 24 for the tiny config? Can't pass 1757862468 otherwise
2025-09-17 04:47 prematurata: > <@ascriv:matrix.org> Should L be 24 for the tiny config? Can't pass 1757862468 otherwise
Yes
2025-09-17 04:48 prematurata: Anyone passing https://github.com/davxy/jam-conformance/discussions/79 able to shed some light on this?
2025-09-17 05:45 danicuki: > <@ascriv:matrix.org> Should L be 24 for the tiny config? Can't pass 1757862468 otherwise
Yes
2025-09-17 08:26 0xjunha: https://github.com/davxy/jam-conformance/discussions/91
2025-09-17 11:38 clearloop: anybody know what's the deal with faulty_000000029? from `000000030`, I found that the trace expects block `0xb13b648e9030118a6bf912aaca95a78b66c86cbd41d112b21393d4b896eaf864` to be in the history, via getting the state of `0x91fcda538898b174da9b61af42c141fb0e1549e4e0dfc1ca8caec4c6185eeea5` (and this is exactly the last block imported in my target), while block `0xb13b6...` does not exist at all in my import process [edited]
2025-09-17 13:01 vinsystems: > <@clearloop:matrix.org> anybody know what's the deal with faulty_000000029? from `000000030`, I found that the trace expects block `0xb13b648e9030118a6bf912aaca95a78b66c86cbd41d112b21393d4b896eaf864` to be in the history, via getting the state of `0x91fcda538898b174da9b61af42c141fb0e1549e4e0dfc1ca8caec4c6185eeea5` (and this is exactly the last block imported in my target), while block `0xb13b6...` does not exist at all in my import process
https://github.com/davxy/jam-conformance/tree/main/fuzz-proto/examples/v1#warning-faulty-session
↳ 2025-09-17 13:02 clearloop: thank you so much! I have just spent hours on this 🤦♂️ [edited]
2025-09-17 14:58 davxy: https://github.com/gavofyork/graypaper/pull/497
2025-09-18 06:46 dakkk: It seems I'm having a similar issue with that trace
2025-09-18 13:40 danicuki: > <@sourabhniyogi:matrix.org> After teams get through "minifuzz" this week and a round or two of ancestry + forks this week and next, can we wrap up 0.7.0 and move collectively onto 0.7.1, say, on week of Sept 29?
>
> Since there isn't _that_ much happening on 0.7.2, perhaps we can jump from 0.7.0 to [0.7.2](https://github.com/gavofyork/graypaper/releases/tag/v0.7.2)?
I agree we should go directly to 0.7.2
2025-09-18 14:12 clearloop: interesting, I just found that in the tracing tests (conformance reports/traces) there are programs with the same code_hash but different initial registers; isn't the code_hash the hash of (metadata + standard program), so the initial registers should be fixed? [edited]
2025-09-18 16:54 boymaas: When running the minifuzz, do other teams also verify that the author_index is determined using https://graypaper.fluffylabs.dev/#/38c4e62/0e1c040e3104?v=0.7.0 in key mode, as part of their header validation? I observe a delta in my implementation **only during the first epoch** of the jamtestvectors traces and the minifuzzer protocol test.
2025-09-18 19:13 sourabhniyogi: Poll: December 2025 In-person JAM Meetup @ PBA Lisbon
https://github.com/davxy/jam-conformance/discussions/93
It would be fabulous if we had as many of the JAM implementer enrolled teams there as we possibly can!
2025-09-18 19:16 emielsebastiaan: Does anyone have specific dates?
2025-09-18 19:49 sourabhniyogi: Hello nikos -- we heard about the PBA JAM course way back in the JAM XP May meetup -- it would be great to have JAM implementers assemble there. Can you tell us about details so we can plan with you a bit?
2025-09-18 19:50 wirednkod: sourabhniyogi: the dates are not final
2025-09-18 19:50 wirednkod: we are recovering from PBA Bali and expecting some updates concerning the toaster - and then decide the final dates
2025-09-18 19:51 wirednkod: I would suggest to avoid polls with specific dates as the one above - but i will keep the JAM community up to date as soon as we have the final dates
2025-09-18 19:51 sourabhniyogi: Are you ok with 20-30 of JAM implementer headcount there?
2025-09-18 19:52 wirednkod: i cannot provide a definite answer to that at the moment as it may sound like a promise
2025-09-18 19:53 wirednkod: again - i would suggest a bit of patience - im sure by the end of Sept we will have many more details to move forward
2025-09-18 19:53 wirednkod: but at the moment neither the date nor the participant numbers or profile are 100% defined
2025-09-18 19:53 sourabhniyogi: Absolutely, will be patient =)
2025-09-18 19:54 wirednkod: thank you
2025-09-19 16:37 shimonchick: Is there something off with the JIP-5 altnames? It's a very basic algorithm, but applying it yields different values for me. Can somebody confirm whether the current JIP-5 altnames in https://docs.jamcha.in/basics/dev-accounts are correct?
2025-09-19 18:09 sourabhniyogi: Hats off to Tianyi | SpaceJam ... with the FIRST JAM Implementation to match/beat Polkajam in performance!
2025-09-19 18:10 sourabhniyogi: As teams develop PVM recompilers, should we aim to have two entries in the dashboard like Polkajam or just one?
2025-09-19 20:30 davxy: First of all, congrats to Spacejam for achieving such impressive results!
That said, we've now entered the sub-ms domain, where my workstation setup isn't deterministic at all.
One run can be faster, and the next might not be.
To avoid encouraging pointless sub-ms races, I think it makes sense to round measurements to millisecond granularity
↳ 2025-09-19 21:48 clearloop: 0.05ms, it's driving me crazy 🫠 agree with not overkilling on this. from my side, I'm feeling exhausted indeed in recent weeks; however, I think the sub-ms on a single dataset is still useful for checking whether there is room to optimize XD for example, by seeing that polkajam is faster in safrole/fallback, I know things can be done better, and I just found something new to optimize in our implementation, though I actually hope I can't find more, since this stuff looks endless XD [edited]
↳ 2025-09-20 03:33 jan: Nice! I'm always happy to see other implementations matching PolkaVM in performance. Congratulations to SpaceJam!
↳ 2025-09-20 03:34 jan: The next frontier will be recompilation speed (along with the new gas cost model which will be a lot more challenging to make fast), which isn't currently very well tested by these performance benchmarks.
2025-09-19 20:37 davxy: For example, one could gain some *pointless* advantage just by removing **all** the logs.
I've seen some implementations keep a few logs, while others completely removed them.
We even spotted implementations removing signature checking ;-) (quickly restored, of course).
Honestly, we don't care much about these micro-optimizations.
What really matters is avoiding spending orders of magnitude more time fuzzing a target compared to the baseline.
2025-09-19 22:01 sourabhniyogi: Were the (now missing) logs useful to you?
I believe shifting the focus from micro-optimizations to recompilation is important. We've been conditioned to care about the small (\<50%) differences because of
```
(M3) HALF-SPEED: Conformance and 50% of required performance (including PVM impl): 100,000 DOT + 1,000 KSM
(M4) FULL-SPEED: Conformance and 100% of required performance (including PVM impl): 100,000 DOT + 1,000 KSM
```
It's pretty obvious that performance here is all about recompilation, and I think it's valuable to get more teams focused on this.
We know it's not a thing for M1, but since we know with 100% certainty it's a not-pointless goal for M3/M4, it's entertainment now, but useful entertainment while we're waiting for, like, 1.0 ratification.
Why not introduce at least 1M-10M gas of randomly generated \[but still valid\] PVM byte code interpretation to always consume > 10ms (for baseline polkajam) so we can improve performance on at least _some_ traces?
By introducing this > 10x factor, 50% differences become visible in `import_max`, and:
1. we have a baby step towards PVM fuzzing
2. recompiler teams can play a useful, substantive M3/M4 optimization game amidst M1 fuzzing
3. more teams will get engaged on recompilers (because we love optimizing)
PVM fuzzing is valuable to attack right within M1, I presume, is it not? [edited]
2025-09-19 22:56 clearloop: my testing linux machine is likely 10x-20x slower than the fuzz machine; I can confirm that some of our big optimizations actually started from 1ms on my machine, which could be just 0.1ms/0.05ms on the testing machine. however, at the current level, for spacejam we won't optimize more
beyond the performance part, I think we're feeling anxious mostly because there is no obvious threshold for a stable implementation. the performance of polkajam is really hard to chase (we started 20x~50x slower, and some parts of polkajam are still unbelievable/insane to us atm, even though we are in the same language), and sometimes I feel like I'm working like a slave to chase it 🫠
hope the threshold for M3/M4, or the baseline for the current M1, will not even be in X ms but based on X0 ms or X00 ms, i.e. something like: 「there are 6s in a block; if you can import most blocks within 30ms avg on machine A, that's totally enough for the stability of our network, you have done a great job, and I won't force you to achieve 0.X ms since we all know there are indeed differences between impls」 [edited]
2025-09-20 03:38 jan: > hope the threshold for M3/M4, or the baseline for the current M1, will not even be in X ms but based on X0 ms or X00 ms
This is not official, but I imagine for M3/M4 we'll look at what's achievable (by looking at the fastest implementation) and just pick a threshold that's slightly lower than that, so that it won't be limited to only the most performance-oriented teams and/or languages (where "slightly lower" is very much TBD; we'll probably know exactly how much once we can run more real-world tests on an actual JAM chain)
2025-09-20 03:40 jan: I can see a world where some parts of the protocol will be a bottleneck and that could make it less important to optimize other parts because they wouldn't actually make things faster in practice. [edited]
2025-09-20 03:41 jan: But, again, this is still TBD until we can do more real-world testing instead of microbenchmarks.
2025-09-20 03:43 jan: > Why not introduce at least 1M-10MM million gas of randomly generated [but still valid] PVM byte code interpretation to always consume > 10ms (for baseline polkajam) so we can improve performance on at least some traces?
For purely PVM drag racing it probably doesn't make sense to use a general-purpose fuzzer/harness like this and it'd be a better idea to make a dedicated one which *only* does PVM (like I'm planning to have for M3/M4)
2025-09-20 03:45 sourabhniyogi: Life is what happens when you're busy making plans =)
2025-09-20 04:49 ascriv: How will the number of M1 conformance vectors be defined? Based on the number of vectors polkajam-int can run in 3 days (e.g.) at the time of conformance? Or something else?
2025-09-20 11:42 rustybot: https://github.com/davxy/jam-conformance/pull/95
2025-09-20 12:07 boymaas: Could something have gone wrong with the last performance run? When I bench in a VM, I already get much lower values (between 5x and 10x lower) using the same weighted scoring algorithm.
2025-09-20 12:20 davxy: Indeed there is a big regression for jamzig. I'll try to re-run, but I used exactly the same setup for all the targets.
↳ 2025-09-20 12:25 clearloop: could you please try downloading the latest spacejam binaries as well? I pushed an update to our crypto stuff late yesterday that possibly provides a 25% optimization of our basic fallback mechanism; not sure if it really works
2025-09-20 12:20 clearloop: looks like this new rule has a lot of effect on crypto-related stuff XD
2025-09-20 12:22 boymaas: Thanks, davxy. An hour ago, I released a new binary that also addresses the last failing reports.
2025-09-20 12:34 davxy: Perhaps my message was not clear:
- I'm not supporting this kind of competition
- I'm not here to run your impl on each update (for a sub-ms race)
- I may re-run impls with strange behavior
I will run your update in the next round [edited]
2025-09-20 12:35 davxy: FWIW the fuzzer protocol is public, the traces are public, anyone can check their perfs
2025-09-20 12:49 boymaas: No problem. It's just an unusual regression I do not understand. It will resolve itself the next run. There's no rush.
2025-09-20 12:53 ascriv: I think I speak for everyone when I say we all really appreciate the work you’re basically soloing @davxy
2025-09-20 12:58 clearloop: understood, this sort of devops-related stuff for all the teams indeed needs mental health insurance ))
2025-09-20 13:02 davxy: Well the point is that it is **literally** not my job to bench your impls. I work on an implementation and I built a fuzzer. That so far is the most effective [edited]
2025-09-20 13:03 davxy: Well your impl has a very strange regression, so I'll check that of course :-)
2025-09-20 13:05 boymaas: Thank you; Much appreciated.
2025-09-20 14:47 ycc3741: davxy:
When testing [trace/preimage_light/0000008](https://github.com/davxy/jam-test-vectors/blob/master/traces/preimages_light/00000008.json), we found some issues with [pi_V[v]_g](https://graypaper.fluffylabs.dev/#/38c4e62/19ea0119fc01?v=0.7.0), as follows:
We know that in tiny mode, the parameters are R = 4, E = 12, which means the cores where the guarantors for slot 7 and slot 8 reside will rotate.
We observed that originally our pi_V[v]_g was 2. In slot 8, we received guarantee_extrinsic with two elements belonging respectively to slot 7 and slot 8. According to the core assignment for slot 8 — [0,1,1,0,0,1] — and for slot 7 — [1,0,0,1,1,0], we confirmed that the reports received for slot 7 and slot 8 were both valid.
Thus, we incremented pi_V[v]_g twice by +1 [(since kappa_v_prime is in G, per GP 0.7.0 formula 13.5)](https://graypaper.fluffylabs.dev/#/38c4e62/19f40119f401?v=0.7.0), resulting in pi_V[v]_g_prime = 4.
However, GP 0.7.0 formula 13.5 only checks whether kappa_v_prime is in G. But doesn't this conflict with the description above 13.3:
[g: The number of reports guaranteed by the validator.](https://graypaper.fluffylabs.dev/#/38c4e62/197700197a00?v=0.7.0)
Shouldn’t both reports be counted? Or should we simply follow the [formula (13.5)](https://graypaper.fluffylabs.dev/#/38c4e62/19f40119f401?v=0.7.0) and add +1 just once?
2025-09-20 14:47 ycc3741: (attachment: post.png)
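To make the two readings of the statistic concrete, a toy contrast (the guarantor sets are made up; reading (b) matches 13.5 as written):
```python
# Reading (a): +1 per report the validator guaranteed.
# Reading (b): +1 if the validator appears in the union set G at all.
from collections import Counter

def per_report_count(guarantees: list[set[int]]) -> Counter:
    c = Counter()
    for guarantor_set in guarantees:
        c.update(guarantor_set)  # one increment per report guaranteed
    return c

def set_membership_count(guarantees: list[set[int]]) -> Counter:
    union = set().union(*guarantees) if guarantees else set()
    return Counter(union)  # at most one increment per validator

# two reports in one extrinsic, both guaranteed by validator 3:
guarantees = [{0, 3, 5}, {1, 3, 4}]
print(per_report_count(guarantees)[3])      # 2 (per-report reading)
print(set_membership_count(guarantees)[3])  # 1 (13.5 as written)
```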
2025-09-22 12:24 dave: IIRC Gav said the GP behaviour in this case is intentional, to keep things simple. I agree the counting behaviour you describe would probably be more useful. You can propose a GP change by making a PR in the GP repo, if the change is simple I think there's a fair chance of it being accepted
2025-09-23 01:29 ycc3741: THX a lot
2025-09-23 01:29 ycc3741: we will deal with it later
2025-09-23 16:18 davxy: https://github.com/davxy/jam-conformance/discussions/98#discussioncomment-14488639
2025-09-25 15:58 danicuki: Many of the new fuzzer batch traces are related to on_transfer, which was dropped in GP 0.7.2.
Do you think it is worth having tests that use on_transfer at this point?
2025-09-25 16:33 prematurata: in my case I found other issues not directly related to on_transfer, so it was worth it I'd say
2025-09-25 18:43 davxy: Keep in mind that it is not mandatory to pass every case. You may decide that some are not worth addressing and leave them as is.
This is the final batch for 0.7.0; after this, we will move on to 0.7.1.