#jam-conformance:matrix.org

last updated 2025-09-26 03:25 UTC

# 2025-08-20 15:56erin: tomusdrw: i've set up the archiver to archive this also, so hopefully will be up starting tomorrow
# 2025-08-21 06:33ascriv: https://github.com/davxy/jam-conformance/issues/26 should we not discourage open source implementations (for now)? (edited)
# 2025-08-21 10:27oliver.tale-yazdi: I think open source repos are fine. Other implementors are just not supposed to look at it
# 2025-08-21 10:28oliver.tale-yazdi: Maybe in their self interest it should be private though since someone else may steal their code and submit it first, but that is up to them to decide IMO (edited)
# 2025-08-21 10:30sourabhniyogi: I've never understood when we are required to make everything open-source -- is it after M1 (like, this year =) ), or would it be at the very end? Or is there some "share with the w3feval team" process? As the reports are clear for > 2/3 of active teams now (omg ... yay!), can we move onto 0.7.0 traces for the rest of the month and start the 0.7.0 fuzzing in first half of September, and then do 0.7.1 traces/fuzzing in the second half? (edited)
# 2025-08-21 10:32oliver.tale-yazdi: I think you only need to share the code with the Milestone judges eventually.
# 2025-08-21 10:45sourabhniyogi: Does anyone want to see refining added to the fuzz protocol?
# 2025-08-21 10:47ascriv: Sure. But doesn’t that require networking for the erasure coding stuff?
# 2025-08-21 10:56sourabhniyogi: For an auditable work package bundle, it doesn't have to. We can add this to the fuzzer protocol:
--> Work Package Bundle ++ Core Index ++ Segment Root Mappings ++ Timeslot
<-- Work Report
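A rough sketch of the message pair proposed above; field names and types are illustrative guesses, not the actual fuzz-proto definitions:

```rust
// Hypothetical shapes mirroring "Work Package Bundle ++ Core Index ++ Segment Root
// Mappings ++ Timeslot --> Work Report"; encodings are left as opaque byte blobs.
struct RefineRequest {
    work_package_bundle: Vec<u8>,                      // encoded work package bundle
    core_index: u16,                                   // core the bundle is assigned to
    segment_root_mappings: Vec<([u8; 32], [u8; 32])>,  // work-package hash -> segments root
    timeslot: u32,
}

struct RefineResponse {
    work_report: Vec<u8>,                              // encoded work report produced by refine
}
```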
# 2025-08-21 10:57jaymansfield: > <@sourabhniyogi:matrix.org> Does anyone want to see refining added to the fuzz protocol? That would be great
# 2025-08-21 11:01sourabhniyogi: Anyone see the need to add more to the above than 1+2, or have a desire to add this in 0.7.0 vs 0.7.1? There is also this ancestors detail; I'm wondering when that will come into play
# 2025-08-21 11:02ascriv: I’m ok with whatever the evaluators decide, but clarity/a roadmap would be nice
# 2025-08-21 11:18clearloop: I hope polkajam can join the discussion, which would make our time spent more efficient, for example on the testing data format
# 2025-08-21 11:26sourabhniyogi: I think the expectation is that we all become evaluators, and we earn our fellowship stripes by suggesting the changes we want ... like, how do you want to see ancestors in the data
# 2025-08-21 11:26davxy: Your time schedule looks realistic and reasonable
# 2025-08-21 11:36davxy: This seems doable. Maybe you could open a PR to extend the fuzzer protocol? Right now, I am focused on fixing issues in the fuzzer and getting stuff ready for version 0.7.0 (test vectors first). The refinement extension could realistically land in 0.7.1 instead. For 0.7.0, the fuzzer protocol should already be extended with:
- supported-features handshake during PeerInfo message exchange
- ancestors set
If the target supports refinement, that capability could be included as part of the features exchange.
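A hypothetical shape for the features handshake described above; the flag values and struct layout are assumptions, not the published protocol:

```rust
// Illustrative feature bits exchanged in PeerInfo so the fuzzer knows what the target supports.
const FEATURE_ANCESTRY: u32 = 1 << 0;   // target maintains an ancestors set
const FEATURE_REFINEMENT: u32 = 1 << 1; // target can execute refine (possible future extension)

struct PeerInfo {
    name: String,
    app_version: (u8, u8, u8),
    jam_version: (u8, u8, u8),
    features: u32, // bitwise OR of the FEATURE_* flags this peer supports
}
```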
# 2025-08-21 11:37sourabhniyogi: Is refining successfully part of M1 conformance? M2?
# 2025-08-21 11:38sourabhniyogi: I am happy to do a refining extension PR targeting 0.7.1 and try it out with everyone here in Sept =)
# 2025-08-21 11:40sourabhniyogi: I'll primarily just extend this README.md here https://github.com/davxy/jam-conformance/blob/main/fuzz-proto/README.md unless you suggest something else?
# 2025-08-21 11:44davxy: From a technical standpoint, M1 is doing block importing, so as far as I can tell, the answer is no.
# 2025-08-21 11:54sourabhniyogi: Can we consider the 0.6.7 fuzzer "done" (which allows other teams to roll in) or are you going to go through another few waves of 0.6.7 for the next 7-10 days, like the gas accounting wave?
# 2025-08-21 11:57sourabhniyogi: For others doing fuzzers, what else should be addressed? I think you had a task list to share maybe
# 2025-08-21 12:09jaymansfield: Seems like this should work. I might attempt to build my own refine fuzzer actually as I'm getting close to finishing my recompiler. Will use the same refine arguments when I get to it next.
# 2025-08-21 16:03r2rtnl: JAM Prize rules #14 and #15 appear to favor using a public GitHub repo as evidence of organic development. Rule #7 seems reasonably interpreted as: “don’t look; and if you did look, disclose it publicly.”
# 2025-08-21 16:16oliver.tale-yazdi: Rule #16 also explicitly allows private repos. So not sure if either is favoured
# 2025-08-21 16:20r2rtnl: Agreed — I’ve just shared an explanation of why TurboJam chose to go the open-source route.
# 2025-08-21 16:20ascriv: I’m pretty sure private meant off-GitHub according to gav’s comments last October
# 2025-08-21 16:21ascriv: image_D9A1B8B1-C051-4866-9047-C48BBDBB137B_1755793257.png
# 2025-08-21 16:21ascriv: But I’m in the process of confirming this with the fellowship, can share once I hear
# 2025-08-21 16:23ascriv: “If you don’t want to use GitHub (private or public), add commits timestamped on a blockchain”
# 2025-08-21 16:23oliver.tale-yazdi: But also depends on how one uses it. Just pushing commits to public GitHub won't help, it will need to be Merge Requests since they carry timestamps.
# 2025-08-21 16:26oliver.tale-yazdi: Yea I am doing this to avoid any controversy later on: https://github.com/JamBrains/remark-commit (i already posted this recently but nobody seems to want to use it 😆)
# 2025-08-21 16:27ascriv: It’s definitely best to do something like OTS or your remark tool regardless. They should have required everyone to do this whether on GitHub or off imo (edited)
# 2025-08-21 16:28ascriv: Bc as it’s written and as gav explained, it seems just being on github suffices
# 2025-08-21 17:41erin: archive is up: https://paritytech.github.io/matrix-archiver/archive/_21ksYpYHcVftKsUAsdMa_3Amatrix.org/index.html cc tomusdrw
# 2025-08-21 17:45davxy: Two new highly controversial reports: 1755796851 1755796995
# 2025-08-21 17:46davxy: I guess after these two we can start with 0.7.0 :-)
# 2025-08-21 17:46davxy: https://github.com/davxy/jam-conformance/tree/main/fuzz-reports
# 2025-08-21 17:49davxy: (the traces passed by all teams were removed from the table)
# 2025-08-22 14:10danicuki: I am trying to match the preimage error because of this: https://matrix.to/#/!ddsEwXlCWnreEGuqXZ:polkadot.io/$ps8N0jq66pBJG1o26Z5XUGyU23QETifiYuVaaiiypIc?via=polkadot.io&via=matrix.org&via=parity.io It also affects test vector https://github.com/davxy/jam-test-vectors/blob/master/stf/preimages/tiny/preimage_not_needed-1.json
# 2025-08-22 16:59rustybot: > <@danicuki:matrix.org> I am trying to match the preimage error because of this: https://matrix.to/#/!ddsEwXlCWnreEGuqXZ:polkadot.io/$ps8N0jq66pBJG1o26Z5XUGyU23QETifiYuVaaiiypIc?via=polkadot.io&via=matrix.org&via=parity.io > > It also affects test vector https://github.com/davxy/jam-test-vectors/blob/master/stf/preimages/tiny/preimage_not_needed-1.json > Looks like you've got your answer in the other channel?
# 2025-08-22 17:44danicuki: If that answer is correct then I still can't understand why some of the fuzzer cases shouldn't raise preimage_not_needed error (edited)
# 2025-08-22 17:46danicuki: Cases 1755530896, 1755530728, 1755531265, 1755620371
# 2025-08-22 17:51ascriv: Can you provide a deeper analysis/argument for why one of these cases should error? It’s always possible we all have the same mind virus
# 2025-08-22 17:56danicuki: In all these cases d[s]_l[(h,l)] = null, which according to 12.38 should fail the block.
# 2025-08-22 17:58ascriv: I’ll take a look but I think it could be because 12.38 is a definition of R, not a condition that must be true
# 2025-08-22 17:58ascriv: If that doesn’t explain it then I’ll look deeper later
# 2025-08-22 17:59danicuki: All E_P must satisfy R
# 2025-08-22 18:01ascriv: Right, let me look now then
# 2025-08-22 18:05ascriv: I checked 1755530896, my service 2494444674 indeed does not have the preimage in the extrinsic but it does have a [] in the historical lookup
# 2025-08-22 18:05ascriv: The key for the historical lookup begins with 0x92115b..
# 2025-08-22 18:06ascriv: And I see it in the pre state for ….008.json for this test vector
# 2025-08-22 18:06ascriv: Value 0x00 which corresponds with empty list
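A hedged sketch of the distinction being worked out here: an absent (null) lookup-meta entry means the preimage was never solicited, while an empty slot list [] means it is solicited and may be provided. Types and names are illustrative, not the GP's:

```rust
use std::collections::BTreeMap;

type Hash = [u8; 32];

enum PreimageCheck {
    NotNeeded, // reject with preimage_not_needed
    Provide,   // solicited and unprovided: accept
}

fn check_preimage(
    lookup_meta: &BTreeMap<(Hash, u32), Vec<u32>>, // (hash, len) -> availability timeslots
    hash: Hash,
    len: u32,
) -> PreimageCheck {
    match lookup_meta.get(&(hash, len)) {
        None => PreimageCheck::NotNeeded,                           // null: never solicited
        Some(slots) if slots.is_empty() => PreimageCheck::Provide,  // []: solicited, unprovided
        Some(_) => PreimageCheck::NotNeeded,                        // already provided
    }
}
```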
# 2025-08-22 18:24danicuki: I will double check on my side but I think I know what we are doing wrong.
# 2025-08-23 11:22davxy: **0.7.0 test vectors are up:** https://github.com/davxy/jam-test-vectors/pull/90
# 2025-08-23 17:54jaymansfield: Another set of 0.7.0 safrole vectors if anyone is interested: https://github.com/javajamio/javajam-trace
# 2025-08-24 07:31davxy: ⚠️ https://github.com/davxy/jam-test-vectors/pull/90#issuecomment-3217905803
# 2025-08-24 07:33davxy: For tiny we're using G_T = 20_000_000 and not 3_500_000_000. That is why we call into accumulate twice in preimages/00000008.bin. As I described in the comment, I think a different value for tiny is reasonable to trigger interesting cases
# 2025-08-24 07:33davxy: if there are no objections I'll open a PR in jamcha.in and set this as the value for G_T in tiny config
# 2025-08-24 08:16davxy: https://github.com/JamBrains/jam-docs/pull/59
# 2025-08-24 11:13prematurata: hey @davxy I just found out about the sbrk note in the traces readme https://github.com/davxy/jam-test-vectors/tree/v0.7.0/traces besides the sbrk implementation the RAM code looks a bit off to me. For example in A.42 the writeable section starts at 2Z_Z + Z(|o|), while in the code there it seems to end there
# 2025-08-24 11:18davxy: Was written by @sourabhniyogi:matrix.org some time ago. Feel free to open a PR
# 2025-08-24 11:18prematurata: ah ok maybe at that time the gp was different :)
# 2025-08-24 11:20davxy: Honestly I didn't go through that Go code LOL
# 2025-08-24 11:20prematurata: npnp thanks for clarifying
# 2025-08-24 11:27davxy: I don’t feel like I’ve clarified anything :-) I’ll check that section out
# 2025-08-24 11:28prematurata: well it's clear that there MAY be an error :)
# 2025-08-24 17:05sourabhniyogi: Yes, it deserves an update -- the 0.6.7 fuzz traces did a thorough job of showing how it's out of date (64k, read vs write OOB). I'm not aware of really great SBRK / heap testing expansion anywhere; the "s" component [stack] remains elusive and entirely untouched so far as I can tell, in addition to the "h" [heap] of SBRK, which has not been fuzzed much. It would be ideal if the next 0.7.2+ fuzzing series attacked this fully and somehow incorporated "picoalloc", which I gather is the replacement for SBRK (the "h"). I am wondering if it's useful to really go crazy with fuzz testing of SBRK given that it's going to be replaced. We can keep busy with outeraccumulate in 0.7.0 and transfer in 0.7.1 in the meantime =). (edited)
# 2025-08-25 07:52dakkk: Where can I find the source code of the service used for the tests? My implementation is raising a panic from services/bootstrap-service/src/lib.rs in two traces
# 2025-08-25 10:13oliver.tale-yazdi: I think you can get it with cargo clone jam-bootstrap-service, at least that code matched with the logs that we saw when running the test vectors
# 2025-08-25 11:17dakkk: thank you
# 2025-08-25 14:57emielsebastiaan: davxy: Potential ASN error: https://github.com/davxy/jam-test-vectors/issues/93 Please review.
# 2025-08-26 06:50davxy: https://github.com/davxy/jam-test-vectors/pull/94
# 2025-08-26 07:02emielsebastiaan: In the office in an hour. Let me check then. It is simply the reordering. New order: i, x, z, e for both CoreActivityRecord and ServiceActivityRecord. (edited)
# 2025-08-26 08:24davxy: @room Could you please check these vectors https://github.com/davxy/jam-test-vectors/pull/94 before I proceed with merging? STF and traces. Ty
# 2025-08-26 08:370xjunha: fastroll passes all stf/traces
# 2025-08-26 12:10jaymansfield: Looks good.
# 2025-08-26 08:32dakkk: it works for me, and now I pass all the tests as well
# 2025-08-26 08:45arjanz: Traces and STF are passing
# 2025-08-26 08:53vinsystems: vinwolf passes all stf and traces as well
# 2025-08-26 09:36ascriv: Jamzilla passes as well
# 2025-08-26 10:59danicuki: Do we already have fuzzer for 0.7.0?
# 2025-08-26 11:03danicuki: Jamixir passes all 0.7.0 traces and test vectors.
# 2025-08-26 11:06danicuki: We are still struggling with two fuzzer cases on 0.6.7 - Could anyone please provide the PVM traces for 1755531265 and 1755796995 fuzz cases (0.6.7)?
# 2025-08-26 16:20clearloop: 1755531265.log
# 2025-08-26 16:20clearloop: 1755796995.log
# 2025-08-26 16:21clearloop: hope these help ⬆️ (edited)
# 2025-08-26 23:49danicuki: ❤️
# 2025-08-27 19:23danicuki: clearloop | SpaceJam: I have a doubt about your logs:
2025-08-27 00:19:42 TRACE stf:accumulate: pos=26517  Ecalli               gas=9997503 regs=[804, 4278056528, 17, 16, 4278056271, 206752, 70, 2, 66858, 4, 206752, 57, 8]
2025-08-27 00:19:42 DEBUG stf:accumulate: calling host call 100
2025-08-27 00:19:42  INFO program: Bootstrap Service Accumulate, 0h @8 $18446744073709003931 target="boot"
2025-08-27 00:19:42 TRACE stf:accumulate: pos=26519  Fallthrough          gas=9997502 regs=[804, 4278056528, 17, 16, 4278056271, 206752, 70, 0, 66858, 4, 206752, 57, 8]
The host call is a log, so it should not change r7 from 2 to 0. The next call is a fallthrough, which should also not change r7 from 2 to 0. So why is your r7 changed to 0?
# 2025-08-27 23:25clearloop: good catch! we have a wrapper around host calls, and 0 means the host call executed successfully 🤦‍♂️ this could be confusing, but it should not be the root cause, since r7 is commonly used as an exit status and is actually ignored by the log host call in the program (e.g. it will be overwritten by other operations without being loaded). Since you can pass the traces in the test vectors, I believe the bug in the fuzz tests is in host call / memory op / encoding stuff
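An illustrative sketch of the pattern being discussed (register indices and the dispatch shape are assumptions): the log host call should leave the registers untouched, so a wrapper that writes its own status into r7 after every call produces the 2 -> 0 change seen in the trace above.

```rust
// Simplified host-call dispatch for a 13-register PVM; not any team's real code.
fn dispatch_host_call(regs: &mut [u64; 13], call_index: u64) {
    const R7: usize = 7;
    match call_index {
        100 => {
            // log: decode the message from memory and print it; registers stay unchanged
        }
        _ => {
            // other host calls place their own result/status in r7
            regs[R7] = 0;
        }
    }
    // Bug pattern: an unconditional `regs[R7] = 0;` here (outside the match) would
    // also clobber r7 after the log call, which is exactly what the trace shows.
}
```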
# 2025-08-28 10:08danicuki: I am double checking fetch. What value do you have for G_A, G_T and G_R?
# 2025-08-28 10:11danicuki: also, from what I know, it uses 0.7.0 spec for fetch, right?
# 2025-08-28 15:01clearloop: the traces were generated by a commit that we haven't upgraded to 0.7.0 yet; if I'm not mistaken the fetch call updates are covered by both 0.6.7 and 0.7.0 (registers changed). From what I can recall:
- 1755796995 is related to the threshold of accounts (if it was the last row of the table); you need to check your info implementation instead of fetch, the diff is actually in memory so register values are not that helpful
- 1755531265 I'm not sure about this; if you can share which key you failed to match, maybe I can recall something helpful
# 2025-08-26 11:25davxy: Polkajam has it; I'll provide some reports in the next days
# 2025-08-26 14:41emielsebastiaan: davxy afk -> 28 Aug: https://github.com/w3f/jamtestvectors/pull/55#issuecomment-3224458898
# 2025-08-26 15:04emielsebastiaan: https://github.com/paritytech/polkajam-releases/releases/tag/v0.1.25 CoreVM & DOOM included. (edited)
# 2025-08-26 18:48prematurata: ~~Hello can someone provide tracelog for storage 13 in 0.7.0? I am struggling to pass that one~~ (edited)
# 2025-08-27 14:10yu2c: davxy afk -> 28 Aug: I'm curious about some issues in the jam-test-vectors stf/assurances (v0.7.0). From the README:
1. "assurances_for_stale_report-1" pins a red dot (which implies an error code exists), but in the file there's no error code; it has ok with reports (which implies a green dot)
2. Why doesn't "no_assurances_with_stale_report-1" have output? (we don't output timeout reports?)
> Stale work report assignment is removed (but not returned in the output).
# 2025-08-28 15:42davxy: I'll have a look. Please open an issue if you want to be sure I don't forget :-D
# 2025-08-28 15:50davxy: Interesting dispute (0.7.0): https://github.com/davxy/jam-conformance/discussions/37
# 2025-08-28 15:54prematurata: i will fix the build script for the target tonight but i can give it a spin on this trace beforehand
# 2025-08-28 21:35sourabhniyogi: https://github.com/davxy/jam-conformance/discussions/37#discussioncomment-14249768
# 2025-08-28 21:35sourabhniyogi: Jason | JavaJAM: What do you think of the above ^^
# 2025-08-28 21:45jaymansfield: I did see that as well, but it's hard to interpret whether it's referring to a single parallel accumulation (since it refers to **n** and **m**, which are defined in ∆∗) or to multiple outer accumulations. (edited)
# 2025-08-28 21:48sourabhniyogi: The bottom line is that we cannot accumulate any service more than once. If you incorporate this constraint, you should be able to match their state root in this case.
# 2025-08-28 21:50jaymansfield: You may be right it does make sense but would be good to have davxy's opinion. (edited)
# 2025-08-28 21:55dave: image.png
# 2025-08-28 21:55dave: Are you referring to this section of the GP?
# 2025-08-28 21:56sourabhniyogi: Yes -- does that constraint get modeled in another equation somehow?
# 2025-08-28 21:56dave: This is talking about service modification/creation/deletion. It doesn't have anything to do with accumulating a service multiple times AFAIK (which is permitted AFAIK).
# 2025-08-28 21:57sourabhniyogi: I am interpreting "altered" as service modification via accumulation. What blocks the third report from accumulating otherwise? (edited)
# 2025-08-28 21:58dave: The point is that the parallel accumulation won't work sensibly if the same service is created/modified/deleted by multiple different services in the same parallel accumulation step
# 2025-08-28 22:00dave: As there is no logic to sensibly merge the results
# 2025-08-28 22:00sourabhniyogi: Here we have two outer accumulates.
# 2025-08-28 22:00dave: Most of that paragraph is merely "informative"; I think the only "normative" bit is that if there _is_ a conflict the block must be considered invalid and discarded.
# 2025-08-28 22:01dave: This can only happen if there is a collision amongst the pseudo-randomly generated IDs of newly created services
# 2025-08-28 22:02dave: Right, from the perspective of that paragraph the "outer" accumulates should not interact
# 2025-08-28 22:02sourabhniyogi: Ok then we need to find the source in the GP which says "If you accumulated service X in one outer accumulate, you cannot accumulate that service X again (in a following outer accumulate) " -- (edited)
# 2025-08-28 22:02sourabhniyogi: If there is no source, then Polkajam has a bug, I think.
# 2025-08-28 22:02dave: I don't believe there is such a requirement
# 2025-08-28 22:03jaymansfield: There might not be one. It could just be a bug. (edited)
# 2025-08-28 22:04dave: Yes, quite possibly. I haven't looked into the fuzzer case so I can't comment on what's actually happening there
# 2025-08-28 22:04sourabhniyogi: Alright, we'll hold off on publishing a "fix" based on your read =)
# 2025-08-28 22:09dave: You shouldn't take my word for it. But I will say that in general most things in the GP are defined formally. You should avoid reading too much into the exact wording of the text surrounding the formal definitions; that is there primarily to aid understanding. Of course if there is a significant mismatch between the two that can and should be fixed...
# 2025-08-28 22:17sourabhniyogi: Not sure if this should be in this room or not (you decide!) -- we are trying out running DOOM work packages from the 0.7.0 release of polkajam in a 5+1 polkajam+jamduna testnet and only seeing "null-authorizer" work packages being submitted by the 2-3 key steps
# Create CoreVM.
./jamt vm new ./doom.corevm 1000000000

# Run CoreVM builder (SERVICE_ID is in jamt's output).
./corevm-builder --temp --chain dev --gas 1000000000 SERVICE_ID

# Run CoreVM monitor (SERVICE_ID is in jamt's output).
./corevm-monitor SERVICE_ID
where the corevm-monitor is showing nothing on the 3rd step and the only work package logging we see is from "null-authorizer" caused by the corevm-builder. Am I missing a key step here or is there something else involved to have us test our JAM implementations against the famous 60fps target? (edited)
# 2025-08-29 04:59clearloop: doesn't CoreVM have a different entrypoint, such that neither refine nor accumulate can be invoked? (edited)
# 2025-08-29 05:00jan: CoreVM payloads cannot be loaded as toplevel services. They use a different blob format than JAM services do, and need hostcalls which are not provided by the JAM client.
# 2025-08-29 05:12clearloop: oh I got it, so in this test, jamt vm ./doom.corevm creates a process which runs the corevm and keeps submitting work packages to the network
# 2025-08-28 22:21dave: I don't think anything is missing there. It does take a little while to get going. jamtop should show you the doom service and packages being refined/accumulated for it
# 2025-08-28 22:22sourabhniyogi: A little while = how many seconds? Ok, will run jamtop
# 2025-08-28 22:22dave: Well, nothing is missing assuming you have a JAM network running
# 2025-08-28 22:22dave: If you are not starting a JAM network before those steps then not much will happen :P
# 2025-08-28 22:23sourabhniyogi: How many "frames" are represented in each doom work package
# 2025-08-28 22:28dave: I guess 360 if it's running at 60fps. Can't confirm if it actually is running at 60fps as I don't see any indication of the fps anywhere...
# 2025-08-28 22:29dave: I can confirm though that it's running for me with the above commands, preceded by polkajam-testnet to launch a network (edited)
# 2025-08-28 22:29dave: Using the latest nightly release (https://github.com/paritytech/polkajam-releases/releases/tag/nightly-2025-08-28)
# 2025-08-28 22:30dave: There are a couple of sticking points though. If you run the jamt command too soon after starting the network you will get an error like "2025-08-28 23:24:28 Fatal error: ErrorObject { code: ServerError(1000), message: "Chain error: storage access error: core assigments error: invalid epoch 287320, reference epoch is 0", data: None } Caused by: ErrorObject { code: ServerError(1000), message: "Chain error: storage access error: core assigments error: invalid epoch 287320, reference epoch is 0", data: None }"
# 2025-08-28 22:31dave: I also got a "Step error: Work error: BadCode" from corevm-builder the first time I ran it... I don't know what the explanation for that one is. I reran it and it worked the second time! 😅
# 2025-08-28 22:40sourabhniyogi: image.png
# 2025-08-28 22:40sourabhniyogi: Ok thank you for making this happen -- super psyched!
# 2025-08-29 02:41sourabhniyogi: David Emett: In attempting 0.7.0 work report alignment between polkajam and our jamduna client (now with a lot of jam-conformance hardening!) against the latest release of polkajam (2025-08-28), we believe the authorization args ([now just E_2(c)] (https://graypaper.fluffylabs.dev/#/38c4e62/2ec2002ec200?v=0.7.0)) are incorrect, because very early in the polkajam logs for the first invocation of the bootstrap service (in ./jamt vm new ./doom.corevm 1000000000) we see
sp = 0xfefdfa58
u64 [0xfefdfff8] = ra = 0xffff0000
u64 [0xfefdfff0] = s0 = 0x0
u64 [0xfefdffe8] = s1 = 0x0
u64 [0xfefdfc68] = a1 = 0x1 <<<<< the argument input length should be 2, not 1 for 0.7.0 Authorization
If we change our encoding from E_2(c) to E_1(c) we get work reports to match (in addition to matching this little "a1") ... except for gas, where we're wondering when polkajam 0.7.0 does "basic block" gas accounting and when it does single-instruction gas accounting. (edited)
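For reference, E_l(x) in the GP is the fixed-length little-endian encoding of x in l octets, so the two candidate encodings of the core index discussed here look like this (illustrative helpers only, not either implementation's code):

```rust
// One-byte fixed-length encoding: what the polkajam trace's a1 = 1 suggests it passes.
fn e1(c: u16) -> Vec<u8> {
    vec![c as u8]
}

// Two-byte little-endian fixed-length encoding: what the linked 0.7.0 GP section specifies.
fn e2(c: u16) -> Vec<u8> {
    c.to_le_bytes().to_vec()
}
```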
# 2025-08-29 04:32davxy: You can accumulate one service multiple times. I don't understand why you thought this wasn't possible and came to that conclusion. The paragraph you shared is completely unrelated - as David mentioned, it refers to conflicts in IDs of new services. I suspect this might just be a bug on our side. I'll investigate it today. (edited)
# 2025-08-29 05:07davxy: There's a bug
# 2025-08-29 10:06sourabhniyogi: It had a lot to do with this trace 😂 but really ... because service designers want accumulation to bring async refined data together across multiple cores synchronously. I definitely have a "GROUP BY" map-reduce view of what I want to see happen from JAM's 'rollup host' purpose: A service designer will want to integrate ALL the work that was done in the last N seconds every N seconds and have a single place to compute and write out a SINGLE view of what happened. While I'm glad no work report is left behind, JAM's accumulate appears to have a depressing G_T problem for its rollup host purpose: accumulate does NOT accumulate everything every N seconds, not even typically. Yes, you can organize work packages in a chain with every work package having a prereq, but that is pretty lame. (edited)
# 2025-08-29 10:09dave: I believe the only case where JAM _won't_ accumulate everything is when there are unsatisfied dependencies
# 2025-08-29 10:10dave: It will accumulate these as their dependencies are satisfied, subject to per-block gas limits
# 2025-08-29 10:11dave: So I'm not sure I agree with "it won't typically accumulate everything"
# 2025-08-29 10:15sourabhniyogi: I mean a service designer would like to see one accumulate execution across all work reports. Whether it typically does or not depends on what else is happening. It's definitely not the case that a service designer can expect to get all the work reports grouped together. Anyway, we're outside the topic of JAM conformance =)
# 2025-08-29 10:16dave: You certainly can't rely on that no
# 2025-08-29 14:15boymaas: davxy: I see you are already publishing the import stats. That's great! I was wondering if you are measuring with verbose output enabled, or if you ran the JamZig⚡ node with -vv or without any verbose flags. I compiled these with detailed runtime debugging on, and the best performance is achieved without -vv and with all debug output removed. Let me know if you want a new binary compiled without this.
# 2025-08-29 14:23davxy: I'll disable debug then. No worries -- Zig looks fast ;-)
# 2025-08-29 14:24davxy: FYI, we're working on extracting some rough numbers to get a basic comparison between the implementations and to provide a bit of extra incentive :-D Soon we'll publish a nice-looking chart, courtesy of Erin. The perf reports are based on monitoring your import timings (accounting for the small overhead of the process link). Nothing scientific here -- just estimates. In practice, we also run polkajam as a target. In this case, the difference between the fuzzer importing the block and the target import is pure overhead. This is the overhead that we considered for your targets as well. The perf reports are generated from the test vector traces. I'm scripting the production of these numbers, so if you update your binary, regeneration will be immediate.
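A minimal sketch of the adjustment described above, assuming the process-link overhead is estimated from running polkajam both in-process (by the fuzzer itself) and as a target; the function and parameter names are made up:

```rust
/// Estimate a target's "real" import time by subtracting the process-link overhead,
/// where the overhead is the gap between polkajam imported in-process by the fuzzer
/// and polkajam run as an external target.
fn adjusted_import_ms(target_observed_ms: f64, fuzzer_inprocess_ms: f64, polkajam_target_ms: f64) -> f64 {
    let link_overhead_ms = (polkajam_target_ms - fuzzer_inprocess_ms).max(0.0);
    (target_observed_ms - link_overhead_ms).max(0.0)
}
```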
# 2025-08-29 17:53clearloop: hi davxy just did some optimizations! hope can try spacejam again! https://github.com/davxy/jam-conformance/issues/13#issuecomment-3237780866
# 2025-08-29 14:34clearloop: curious about the system info of the benchmarking machine; also, if it is possible, providing memory peaks as well would be nice, I'm a bit worried about it since some programs are pretty big (edited)
# 2025-08-29 14:39davxy: Perhaps we can run this on some more beefy machine at some point. But for the moment I think these results already give you enough hints
# 2025-08-29 14:39jaymansfield: I published a new version of javajam yesterday that included about a 70% import performance improvement for the storage test vectors. Are you able to grab the latest 0.2.7? It looks like the report used 0.2.6.
# 2025-08-29 14:39davxy: My machine AMD Ryzen Threadripper 3970X 32-Core (64) @ 4.55 GHz
# 2025-08-29 14:40davxy: > if it is possible providing memory peaks as well
Not in the current reports; perhaps we will in the future.
# 2025-08-29 14:46davxy: done
# 2025-08-29 14:50jaymansfield: Thanks!
# 2025-08-29 14:51ascriv: Where can I see the import stats? Been afk for a while
# 2025-08-29 14:53ascriv: Or are they not published yet
# 2025-08-29 14:54clearloop: Jamzilla is super fast https://github.com/davxy/jam-conformance/blob/import-perf/fuzz-reports/0.7.0/reports/jamzilla/perf/storage.json
# 2025-08-29 14:57ascriv: 🫢
# 2025-08-29 14:59clearloop: I'm curious, are you all running accumulation in parallel already? How is it that fast?
# 2025-08-29 15:00ascriv: I am doing in parallel yes
# 2025-08-29 15:00ascriv: Interpreted pvm for now
# 2025-08-29 15:00sourabhniyogi: davxy: Which metric should teams optimize for to get the most fuzzing action?
% grep import_p75 */perf/storage.json  
boka/perf/storage.json:    "import_p75": 6925.978,
jamduna/perf/storage.json:    "import_p75": 171.9,
jamzig/perf/storage.json:    "import_p75": 39.349,
jamzilla/perf/storage.json:    "import_p75": 81.848,
javajam/perf/storage.json:    "import_p75": 233.967,
pyjamaz/perf/storage.json:    "import_p75": 5186.699,
spacejam/perf/storage.json:    "import_p75": 187.597,
turbojam/perf/storage.json:    "import_p75": 10.395,

# 2025-08-29 15:02clearloop: can't believe the performance of turbojam
# 2025-08-29 15:04davxy: You have two ways of making the fuzzer happy: run the STF, or hardcode the replies in the binary :-D (given that we run the test vector traces here)
# 2025-08-29 15:04sourabhniyogi: Removing logs and recompiler is 95% of the optimizing
# 2025-08-29 15:04davxy: But the thing is that turbojam has the same perfs with a random seed, so it can't predict the state roots. Ergo: it looks really fast
# 2025-08-29 15:05davxy: I'll provide perf reports for polkajam in few mins
# 2025-08-29 15:05davxy: both with interpreter and recompiler
# 2025-08-29 15:05davxy: you can provide optimized binaries. I don't need the logs actually
# 2025-08-29 15:07davxy: @room If you republish optimized bins, please leave a note in your dedicated GH issue. I won’t re-run them right away, and otherwise I’ll probably forget.
# 2025-08-29 15:09clearloop: I have doubts about the performance of the recompiler... I've tried compiling the programs in the storage set, however they take at least 10s to compile... not sure what the numbers from others are (edited)
# 2025-08-29 15:10jan: 10s per program or in total?
# 2025-08-29 15:11clearloop: per... I must have made some sort of horrible mistake... haven't implemented the cache yet so each test triggers a compilation; there may be some memory issues as well. Still reorganizing my implementation these days, will take the benchmark with polkavm/bentools soon! (edited)
# 2025-08-29 15:14jan: Note that if you have optimal enough code then you don't actually need a cache. Anyway, 10 seconds sounds very abnormal.
# 2025-08-29 23:20clearloop: Just made my recompiler 20x faster by removing my "sugar" of register operations .... compiling storage_light::00000004 went from 11s -> 0.6s, indeed a horrible mistake! trying to remove more "sugar methods" now XD I was actually handling the instructions with an interpreter mindset! (edited)
# 2025-08-29 15:01boymaas: davxy: I pushed a new version for JamZig⚡ with all the debugging mechanics removed. I'm curious about the performance gains! It looks very promising locally. 🤠 The -vv flags on the target do not do anything anymore.
# 2025-08-29 15:20boymaas: https://asciinema.org/a/eMIt8geWsubdseBLBn9ygUkiN 🤠
# 2025-08-29 15:23clearloop: this visualization tool looks perfect, can you make a PR to the repo so that we can try it as well?
# 2025-08-29 15:21boymaas: image.png
# 2025-08-29 15:22jan: Would be nice to also have the implementation language next to the name.
# 2025-08-29 15:24dakkk: is this using only fallback / safrole right?
# 2025-08-29 15:24boymaas: Safrole indeed
# 2025-08-29 15:26dakkk: davxy: I managed to solve the issue that was causing an error on jampy when using the pvm; I've just uploaded a fixed version
# 2025-08-29 15:38boymaas: Good idea, let me check if I can add that quickly.
# 2025-08-29 15:41sourabhniyogi: Way to go everyone -- this will turbocharge all of us =)
# 2025-08-29 15:43boymaas: Does anyone know what language TurboJam uses? I cannot find it that quickly online.
# 2025-08-29 15:43clearloop: C++
# 2025-08-29 15:44emielsebastiaan: davxy: Is there a reason why you chose not to include preimages & preimages_light in the speed benchmark?
# 2025-08-29 15:44arjanz: C++ I believe
# 2025-08-29 15:44ascriv: The repo is public too
# 2025-08-29 15:44davxy: Not really. I can add in the next iteration
# 2025-08-29 15:48prematurata: wow nice charts. :)
# 2025-08-29 15:49prematurata: do we know all the language sets? it would be nice to have it after the name until we get used to it. I, for example, don't know many of the other implementers' languages other than the ones that contain it in the name
# 2025-08-29 15:50dakkk: it could be useful to have a script that generates those charts automatically for every test set
# 2025-08-29 15:52boymaas: image.png
# 2025-08-29 15:53davxy: Added polkajam and polkajam interpreted
# 2025-08-29 16:03clearloop: fact: polkajam is 2x faster than turbojam = =
# 2025-08-29 16:00davxy: Given the data I have provided, you do not need to worry about variations of around +/-3ms. This is not a dedicated machine, so results may fluctuate a bit. Still, the data gives good hints on where improvements are needed, and it shows which trace step took more time so you can re-execute it.
# 2025-08-29 16:00davxy: https://github.com/davxy/jam-conformance/tree/import-perf/fuzz-reports/0.7.0/reports/perf
# 2025-08-29 16:04boymaas: image.png
# 2025-08-29 16:04boymaas: Forgive me if I have associated the wrong language; please let me know, and I will correct it.
# 2025-08-29 16:08ascriv: Turbojam is c++, not unknown https://github.com/r2rationality/turbojam
# 2025-08-29 16:10sourabhniyogi: Ok, since polkajam is in the top 3 of the reports/perf (for non-zero values anyway), we can all aim to turbocharge up to polkajam / polkajam_perf_int most easily if we use the polkajam fuzzer target to compare against our own. Is there a fuzzer option within polkajam or is that a separate binary that can be provided? Not asking for the polkajam fuzzer binary -- just the polkajam fuzzer target.
# 2025-08-29 16:12jaymansfield: Will be interesting to see where everyone stands in a few days after working on optimizations
# 2025-08-29 16:12jaymansfield: Really useful seeing the numbers
# 2025-08-29 16:12boymaas: Thank you ascriv | Jamzilla and arjan | PyJAMaz, will add it.
# 2025-08-29 16:47danicuki: Jamixir already has 0.7.0 - would you please include it in this speed report?
# 2025-08-30 07:02dakkk: > <@boymaas:matrix.org> Forgive me if I have associated the wrong language; please let me know, and I will correct it. Can you share the script you re using to create this chart?
# 2025-08-30 07:40boymaas: visualize_perf_enhanced.py
# 2025-08-30 08:32boymaas: davxy: Feel free to include it in the repo as well.
# 2025-08-30 10:56davxy: Can you open a PR ?
# 2025-08-30 10:56dakkk: maybe you can also integrate it in the github-CI, so every time perf are updated, it creates the chart
# 2025-08-30 10:57prematurata: this changed from "make a graypaper compliant implementation" to a drag race (edited)
# 2025-08-30 10:57prematurata: :)
# 2025-08-30 10:57dakkk: image.png
# 2025-08-30 10:57dakkk: image.png
# 2025-08-30 10:57dakkk: btw, those are from last update:
# 2025-08-30 10:58prematurata: tsjam is TypeScript Language btw... i see it is marked as unknown (edited)
# 2025-08-30 11:04jan: Jamixir is Elixir AFAIK (hence the "ixir" in the name)
# 2025-08-30 11:06jan: And it'd make more sense to change "Rust (int)" to "Rust" and "Rust" to "Rust (recomp)" for polkajam (or perhaps have another column whether the implementation is using a recompiler; not sure if any other implementation besides polkajam uses one currently?) (edited)
# 2025-08-30 11:07ascriv: I think jamduna has one?
# 2025-08-30 11:07ascriv: @sourabhniyogi
# 2025-08-30 11:08jan: Yeah, I know a few of the implementations have one in-progress, but it's unclear if any are used here for these tests. (edited)
# 2025-08-30 11:09clearloop: btw I want to confirm whether block-based gas charging matches the tracing tests; we can match the accumulate tests but not the tracing tests with block-based gas charging. It could be caused by the host call being tried once, but the problem doesn't seem that obvious (edited)
# 2025-08-30 11:10jan: Block based gas is not yet part of the GP; not sure if current traces use per-instruction gas yet (you'd have to ask davxy ), but I have implemented per-instruction gas metering in PolkaVM so the plan is to have the fuzzer/traces use per-instruction gas until the new gas cost model is ready.
# 2025-08-30 11:11jan: You can already start implementing a block-based gas cost model to get a head start on the final gas cost model, but for now the official model is per-instruction.
# 2025-08-30 11:18davxy: We use per-instruction gas charging. The test vector traces work for both models, since they never fail midway through a block.
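A minimal sketch of per-instruction charging, the current official model per the message above; a block-based model would instead subtract a whole basic block's cost when entering it. Gas and cost types are simplified assumptions:

```rust
/// Charge gas for a single instruction; trap exactly at the instruction that
/// exhausts the remaining gas.
fn charge(gas: &mut i64, instruction_cost: i64) -> Result<(), &'static str> {
    *gas -= instruction_cost;
    if *gas < 0 {
        return Err("out-of-gas");
    }
    Ok(())
}
```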
# 2025-08-30 11:18clearloop: kk so the problem is in my implementation, will take a deeper look at it then!
# 2025-08-30 11:55danicuki: > <@prematurata:matrix.org> this turned out from "make a graypaper compliant implementation" to a drag race Tell me how you measure me and I will tell you how I will behave.
# 2025-08-30 13:46davxy: EoW batch: https://github.com/davxy/jam-conformance/pull/41
# 2025-08-30 13:47davxy: 0.7.0 table: https://github.com/davxy/jam-conformance/blob/reports-batch-0.7.0/fuzz-reports/README.md#disputes
# 2025-08-30 14:55boymaas: dakkk | JamPy: If you updated the script, please feel free to create the PR for the visualize_perf tool.
# 2025-08-30 17:03vinsystems: I started a discussion about trace 1756548916 https://github.com/davxy/jam-conformance/discussions/42 which affects almost all teams
# 2025-08-30 17:23davxy: replied
# 2025-08-31 10:39danicuki: davxy: how frequently do you update the reports tables (performance and disputes)? On our side, do we need to notify you when we have a new version of our binary, or just put the binary in the same place?
# 2025-08-31 12:55davxy: In general, just drop a message in your team's issue. I usually re-run the scripts when it is worthwhile -- for example, if a team has made significant improvements or fixed some traces. Of course, do not expect me to re-run them immediately after a ping, since I might be away or focused on something at the time. Still, given that there are not too many teams, I tend to run the scripts more frequently than expected, as I enjoy seeing things improve and become more resilient to the fuzzer day by day.
# 2025-08-31 13:32davxy: Just to be clear - currently - the performance table is mostly meant to provide an indication of how long fuzzing should run to reach the same level of confidence across implementations. This is also to start thinking a bit more about the auditing process duration. For example, if we take (in the screenshot shared by dakk above) PolkaJam as the baseline, and auditors require 3 days of continuous fuzzing without crashes/diffs, then:
- An implementation 300x slower would require 900 days (about 2.5 years)
- If the baseline is instead 10x slower than Polkajam, the same 300x implementation would need 90 days
These numbers are speculative, but that is the point of sharing performance results at this stage: they are not about improving by a few ms to climb the ranking, but about giving a tangible sense of scale for testing. I will likely update performance numbers less often (about once per week) to reduce noise from micro-optimizations and keep the focus on M1 compliance. Slower implementations that report tangible improvements may get more frequent updates. (edited)
# 2025-08-31 13:39jaymansfield: I have started a discussion about 1756548741 : https://github.com/davxy/jam-conformance/discussions/43
# 2025-08-31 17:37jaymansfield: Regenerated the rankings:
# 2025-08-31 17:37jaymansfield: Screenshot 2025-08-31 at 1.34.03 PM.png
# 2025-08-31 17:37jaymansfield: Screenshot 2025-08-31 at 1.35.38 PM.png
# 2025-08-31 17:37jaymansfield: Screenshot 2025-08-31 at 1.35.48 PM.png
# 2025-08-31 17:41clearloop: seems our perf is not updated XD, we should be at least ~3x faster now
# 2025-08-31 19:01sourabhniyogi: Since most of us are deathly afraid of flaming out of fear of being too slow (or have really large egos 🤪🤣 ), why not just state some precise guidelines now that you got almost all of us into basic "walking" shape. Concretely:
1. Your implementation _should_ perform no slower than 10x (or 25x or 50x, you pick the number) polkajam_int across all test groups
2. Your interpreter _should_ be able to interpret at 1MM gas/s (or 10M gas/s, 50MM gas/s, you pick the number).
By stating this clearly, teams who are way ahead of this "should" threshold can proceed without fear on winning the M3/M4 marathon (refining Doom/.. workpackages with a recompiler) rather than an M1 sprint rankings game (which we should save for M3/M4, right?). In the end the timings are dominated by PVM accumulate execution? What do you think -- does that make sense? (edited)
# 2025-08-31 19:11dakkk: > <@sourabhniyogi:matrix.org> Since most of us are deathly afraid of flaming out of fear of being too slow (or have really large egos 🤪🤣 ), why not just state some precise guidelines now that you got almost all of us into basic "walking" shape.
>
> Concretely:
>
> 1. Your implementation _should_ perform no slower than 10x (or 25x or 50x, you pick the number) polkajam_int across all test groups
> 2. Your interpreter _should_ be able to interpret at 1MM gas/s (or 10M gas/s, 50MM gas/s, you pick the number).
>
> By stating this clearly, teams who are way ahead of this "should" threshold can proceed without fear on winning the M3/M4 marathon (refining Doom/.. workpackages with a recompiler) rather than an M1 sprint rankings game (which we should save for M3/M4, right?). In the end the timings are dominated by PVM accumulate execution?
>
> What do you think -- does that make sense?
I don't think it would be a good idea for M1 and M2; interpreted languages are slower by design, and they may never achieve certain performances. (edited)
# 2025-08-31 19:12sourabhniyogi: image.png
# 2025-08-31 19:12sourabhniyogi: That makes sense -- probably need to make Set C + D versions of those guidelines based on the data.
# 2025-08-31 19:15sourabhniyogi: Since you (and other Set C folks) know what Python (and other interpreted languages) can achieve, what would you have as guidelines? Key word is "should" not turning into a "must" (edited)
# 2025-08-31 19:19dakkk: Idk, I personally prefer not to have constraints on it
# 2025-08-31 19:19sourabhniyogi: Using the word "must" would be a constraint. "should" would mean you would be expected to improve your interpreter to get more fuzzing action.
# 2025-08-31 19:27davxy: What you said makes sense, but keep in mind:
1. The data we collected is meant to support this goal. Giving guidelines without data would just be speculation.
2. I can't really give you an answer here. You might need to escalate this kind of question to the architect :-D Maybe in jam chat. Not sure.
# 2025-08-31 19:31davxy: I see that Gav is not in this channel BTW
# 2025-08-31 19:32clearloop: I'd like to share some opinions on this topic as well, not sure if we should pivot this discussion to the let's jam channel first (edited)
# 2025-08-31 19:32davxy: So yeah, if your questions can't be answered here, I'd say try with the JAM channel
# 2025-08-31 19:34clearloop: could you please (mb) re-post your paragraph in the let's JAM channel sourabhniyogi , and then we can continue on it? (edited)
# 2025-08-31 20:00prematurata: any difference in tiny for WA and WB and WC? (edited)
# 2025-09-01 07:02prematurata: I want to tell you how i lost 4 hours of my life debugging. but i will share the code. now that I share it you'll instantly find it, but it took me 4 hours: SERVICECODE_MAX_SIZE = 4_000_0000; damn fat fingers (edited)
# 2025-09-01 07:37clearloop: I once spent 2 days on an encoding order error = =
# 2025-09-01 07:39davxy: Jason | JavaJAM: can you pls open a PR with the latest version of the perf script? I'd like to have it in the jam-conformance scripts folder. Ty
# 2025-09-01 08:09clearloop: I just made one for Boy Maas | JamZig⚡ with his commit info. Would you mind checking it over, or you can make one instead, Boy Maas | JamZig⚡; I want to keep you as the original author of this script in the git history, out of respect https://github.com/davxy/jam-conformance/pull/46 then we can work on updates to it together later (edited)
# 2025-09-01 08:31boymaas: Thank you clearloop | SpaceJam I asked you to create one. Thank you for the PR and the attribution; I appreciate it! 😎
# 2025-09-01 09:47davxy: Please add your considerations/proposals
# 2025-09-01 09:47davxy: https://github.com/davxy/jam-conformance/pull/47
# 2025-09-01 10:03sourabhniyogi: Jason | JavaJAM: would you like to add your ideas on how to add refinement to the above? davxy is this ok to do?
# 2025-09-01 10:05sourabhniyogi: Are there [Set C/D?] teams who want to work on an FFI into a PVM recompiler? If so, we probably can do a meaningful call together, well beyond intros, at the intersection of everything. (edited)
# 2025-09-01 10:34bamzedev: davxy: https://github.com/davxy/jam-test-vectors/blob/master/stf/reports/tiny/banned_validator_guarantee-1.json if I understand correctly, you are expecting that the guarantee is invalid, because one of the guarantors is in the offenders list? If that is the case, could you point out the GP equation for that validation? (edited)
# 2025-09-01 11:54davxy: 11.21 & 11.22 https://graypaper.fluffylabs.dev/#/1c979cb/155800155800?v=0.7.1 the key is zeroed out if it belongs to the offenders set (as per 6.14)
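A hedged sketch of that check with simplified types: a guarantor key that belongs to the offenders set is treated as zeroed out (per 6.14), so its signature can never verify and the guarantee is rejected. Names here are illustrative, not the GP's notation:

```rust
use std::collections::HashSet;

type Ed25519Key = [u8; 32];

/// Returns the key to verify a guarantor's signature against, or None if the
/// validator is an offender (zeroed-out key, so the guarantee cannot be valid).
fn effective_guarantor_key(
    validators: &[Ed25519Key],
    offenders: &HashSet<Ed25519Key>,
    validator_index: usize,
) -> Option<Ed25519Key> {
    let key = *validators.get(validator_index)?;
    if offenders.contains(&key) {
        None
    } else {
        Some(key)
    }
}
```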
# 2025-09-01 11:55davxy: I'll add it to the features.
# 2025-09-01 12:03sourabhniyogi: would you be able to join a call with us (this week or next) to give us some tips on how to do a refine fuzzer that is approximately what you would do? I understand your priorities are on M1 but a little guidance would go a long way
# 2025-09-01 20:44sourabhniyogi: https://github.com/davxy/jam-conformance/pull/50
# 2025-09-02 00:20r2rtnl: A discussion about 1756548706: https://github.com/davxy/jam-conformance/discussions/51
# 2025-09-02 09:20davxy: Reports batch: https://github.com/davxy/jam-conformance/pull/52
# 2025-09-02 12:03ascriv: @davxy can you enlighten us on what is left to be done in pre-submission fuzzing before M1 submissions are opened?
# 2025-09-02 12:07davxy: 1. finish some open tasks on the fuzzer 2. define/enact fuzz protocol v1 https://github.com/davxy/jam-conformance/pull/47 3. better define baseline timings/requirements/auditors etc. I'll discuss about this with the team as soon as I have the opportunity
# 2025-09-02 12:09ascriv: thanks!
# 2025-09-03 12:00erin: Hello everyone, the conformance performance dashboard is now ready -> https://paritytech.github.io/jam-conformance-dashboard/ Please let me know if anything looks off/wrong. Do note it does use a weighted scoring mechanism (methodology detailed at the bottom of the page) so it may not always look strictly "in order". (edited)
# 2025-09-03 12:07erin: Checks for new data are performed hourly.
# 2025-09-03 12:38oliver.tale-yazdi: Maybe there could be an option to anchor it to a "fixed baseline" version of PolkaJAM (e.g. 0.7.0)? Then teams can continuously check if they improved, otherwise it would look like they get slower after PolkaJAM optimized a bit (edited)
# 2025-09-03 12:41erin: good feedback, thanks!
# 2025-09-03 18:35boymaas: Very nice dashboard erin . Now I also need to start optimizing 🤠
# 2025-09-03 19:15clearloop: wow, this dashboard looks much prettier than I could have imagined (edited)
# 2025-09-05 11:00sourabhniyogi: For all of us macro data refiners (you know you want to =)), please check out / comment on https://github.com/davxy/jam-conformance/pull/59 -- this aims to add the ability to trace the causes of work report differences with a new feature-exports in addition to the basics of feature-bundle-refinement.
# 2025-09-05 20:30jaymansfield: https://github.com/davxy/jam-conformance/discussions/64
# 2025-09-08 10:01dakkk: davxy: what should we do when we create a new release of our impl? I see other teams tag you in GH issues both for updating perf and traces. Is that ok for you? It could be an idea to schedule an automatic rebuild of reports / perf (once or twice a day)
# 2025-09-08 11:42boymaas: image.png
# 2025-09-08 11:42boymaas: Good question. I started optimizing yesterday, and now a new version is coming with some promising improvements. It's not done yet. 🤠 (edited)
# 2025-09-08 16:46davxy: Feel free to leave a note in your team's issue. For now, though, I'm not going to regenerate performance reports more than once a week. Automation would be nice, but I'm not going to invest my time improving my hacky scripts and maintain some infrastructure. I realize automation may seem like a "fire and forget" solution, but right now that is not feasible. On top of that, my DevOps skills are not sharp enough to make the setup fully robust :-D Last but not least, the focus should stay on M1, not on chasing a tight race for the top spot in the speed rankings.
# 2025-09-08 17:02sourabhniyogi: For Fuzzer Protocol v1 should we aim for implementing that this week or next amidst v0.7.0 or plan to have that as part of v0.7.1?
# 2025-09-08 17:03sourabhniyogi: Probably some timing expectations on when we wrap up 0.7.0 fuzzing and do 0.7.1 would be useful?
# 2025-09-08 17:07sourabhniyogi: Not sure if we should be doing { Ancestry support, Simple forking } exploration in 0.7.0 fuzzing -- it may be simplest for everyone to have Fuzzer Protocol v1 synced with 0.7.1?
# 2025-09-08 17:19rustybot: I think we are ready to deploy v1. It's just about changing the PeerInfo message, which is trivial. If a target doesn't support a feature, it can turn off the flag
# 2025-09-08 17:22rustybot: https://github.com/davxy/jam-conformance/pull/47#issuecomment-3267172222
# 2025-09-08 17:26sourabhniyogi: How about finalizing v1 by merging 47 (with or without an Error string) and having a specific date (suggest: 9/11) be the v0=>v1 cutover?
# 2025-09-08 17:35davxy: Sounds good. What about the Error message? Sounds good? Any suggestion? I'd like to keep this dead simple
# 2025-09-08 17:36davxy: So perhaps the error message (which can be empty) can be included in the fuzz report
# 2025-09-08 17:42sourabhniyogi: The situation is that because you give us 1-5 homework disputes every 2-3 days, it makes very little difference since we just debug each of them by hand. If it was 10-100x greater and we had "I can't replicate what you see" problems an error string would matter a LOT! But we do not. However, I imagine if you had programmatic fuzzing (of PVM) working at 10-100x scale, these error strings will be 10-100x more useful for fuzzing scalability. I think a lot of us became SUPER performance focussed in the last 7-10 days because of the "audit" expectations that we will be doing 10-1000x more fuzzing, running for 3 days rather than 3-30s. The moment you have programmatic fuzzing working at higher scale, it becomes valuable. Looking ahead to PVM fuzzing, this error string could summarize the result of "we differ at gas/step/opcode with xyz registers" from the fuzzer probing the target via this feature https://github.com/davxy/jam-conformance/pull/65 -- but we should not have this feature part of v1 with a v0=>v1 of this week. (edited)
# 2025-09-09 01:20clearloop: I got sick last week working on my implementation (after 3~5 hours of sleep per day) re-writing my compiler, and my compiler is now getting even worse 😅
# 2025-09-09 02:32xlchen: this is the fun part of doing optimizations. you never know if your optimization actually optimizes it or not until you've spent/wasted a bunch of time on it (edited)
# 2025-09-09 19:30danicuki: > <@xlchen:matrix.org> this is the fun part of doing optimizations. you never know if your optimization actually optimizes it or not until you spend/wasted bunch time on it In practice, the theory is different.
# 2025-09-08 18:23sourabhniyogi: In the end, because there is just 1 of you and 15+ of us, we should do whatever makes your life easier so we get over the finish line together faster -- anything to increase the fuzzing level so we get as much conformance as possible and at least half of us done with M1 before sub0 ( ideally ) and most of us before xmas would be best. It seems pretty clear from the extremely high speeds that teams do their homework each week that we're all capable and motivated to work at high speeds but we don't want to stress you out either =) (edited)
# 2025-09-09 01:10clearloop: agree with the error message, because the current report (per-team level) doesn't make a lot of sense, since we can get the diffs anyway by debugging locally (diff of key-values). With error messages embedded, the reports will be more useful, and it would be nice if the fuzzer provided error messages in the report as well, e.g. for the cases where the target can import blocks that should not be imported. From yesterday till today I have hit some traces that break remotely but work locally, and without the error message from the remote I actually have no idea what to fix ... (edited)
# 2025-09-09 14:45davxy: https://github.com/davxy/jam-conformance/pull/68
# 2025-09-09 20:47vinsystems: https://github.com/davxy/jam-conformance/discussions/71
# 2025-09-09 20:53prematurata: damn bernar. you beat me by some minutes. i opened it as well for 771 https://github.com/davxy/jam-conformance/discussions/72
# 2025-09-09 21:35davxy: Fuzzer protocol v1 will go into effect on Friday, 12 Sep. Starting on that date, our fuzzer will implement the changes proposed in: https://github.com/davxy/jam-conformance/pull/47 (edited)
# 2025-09-09 21:43sourabhniyogi: Friday 9/12 yes?
# 2025-09-09 21:50davxy: Fixed 😅
# 2025-09-10 10:30clearloop: (are the fork tests the last set for M1? ^ ^)
# 2025-09-10 10:41ascriv: > <@davxy:matrix.org> Fuzzer protocol v1 will go into effect on Friday, 10/11. Starting on that date, our fuzzer will implement the changes proposed in: > https://github.com/davxy/jam-conformance/pull/47 On import failure due to a jam protocol-defined error condition, it seems like the README says we should both return the 0 hash AND an 'Error' message, but shouldn't it be one or the other?
# 2025-09-10 10:44davxy: We should return Error message. I need to remove the 0 hash requirement. That is a leftover
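A hedged sketch of the v1 reply being settled here: on a protocol-defined import failure the target answers with an Error string rather than a zero state root. Variant names are illustrative, not the final fuzz-proto types:

```rust
// Possible shape of the target's answer to an ImportBlock message under protocol v1.
enum ImportResponse {
    StateRoot([u8; 32]), // successful import: post-state root
    Error(String),       // import rejected: short reason, may be empty
}
```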
# 2025-09-10 14:12clearloop: btw, wanna discuss whether header verification or fork detection should be at the network level? if the received header is not validated, we won't execute it at all (or put it in the pool). tbh the fork part and the header validation break our design, in that I have to move my validations down from the network layer to the block import layer, and for the fork part, I have to rewrite the logic just for the tests (bcz there is no finalization, no "grandpa", no peer handshake info). but as a "block importer", header validation & fork detection make sense again ... (edited)
# 2025-09-10 14:51dave: Not entirely sure what you mean. From the networking perspective you should verify blocks before announcing them to peers. You should announce and attempt to import all forks. It should be possible to verify blocks significantly faster than executing them. In particular, it is intended to be possible to verify a block without performing any accumulation, so without any PVM execution. There is _one_ validity condition that cannot be checked without performing accumulation; that we don't attempt to create two services with the same ID. It is not intended that this is checked before announcing/distributing a block to peers.
# 2025-09-10 14:53dave: I think we could probably only require verification of the author and the author's signature before announcing, this isn't what the JAM-NP doc currently says though
# 2025-09-10 14:57clearloop: > From the networking perspective you should verify blocks before announcing them to peers
yes, but verifying headers after producing a block or receiving a new block from remote peers happens right at the SNP level; if it is validated, I'll put it into a fork, waiting for the block processor to pick it up and then execute it (block import). I meant that **header validation is ahead of block execution**, they are not bound together. also, if we receive a block earlier than the finalization chain, I won't even validate it. so to adapt to the tests, I need to bind them together, but it's fine indeed since the logic is minor, and the header verification logic does need to be confirmed or it may be tough for the networking part (edited)
# 2025-09-10 15:02dave: Still not really sure what you're saying. Are you saying the fuzzing interface is forcing you to change your design, even though your current design is fine for implementing JAM-NP?
# 2025-09-10 15:04clearloop: sorry, and yes, I meant that I have to write new logic just to adapt to the tests now (which will not contribute to production). header validation is indeed fine, but testing forks at the current state, I think it may eventually go out of control (edited)
# 2025-09-10 15:14dave: Nodes are expected to follow all forks, not clear to me why it's an issue for the fuzzer to test forks? What do you mean by "testing fork at the current state"?
# 2025-09-10 15:20clearloop: for example, we are now introducing fork tests at this level: https://github.com/davxy/jam-conformance/discussions/71#discussioncomment-14360978 , which simply replace the latest imported block. I believe the approach in production is creating a new fork at the parent, resulting in two fork chains (currently without grandpa or finalization), but for the current fuzzing interface we are more like just maintaining a "finalized chain" (or importing from a finalized chain), which conflicts. starting from this, I'm afraid we'll introduce more and more parameters/arguments into the fuzzer until we can cover the entire fork system, and during that process I believe some teams will be rewriting their implementation 5%, 10%...100%, until they are back to the status of months ago (edited)
# 2025-09-10 15:42dave: Ok, think davxy's input is needed here. But I would think you probably shouldn't be immediately finalizing blocks that are imported. Ideally the fuzzer won't force the node to be architected in any particular way, but you're naturally going to have to write _some_ code to interface with it; M1 doesn't require JAM-NP so we can't just use that.
# 2025-09-10 15:44clearloop: > But I would think you probably shouldn't be immediately finalizing blocks that are imported. what do you think about the case of a light node? If I connect to a node which is present in the validator set, can I import its finalized blocks without verification? (edited)
# 2025-09-10 15:44dave: Light nodes will probably follow the finalized chain and so lag slightly
# 2025-09-10 15:47dave: Though it really depends on the use case, for some use cases maybe it's good enough to trust instead of verify
# 2025-09-10 15:48dave: In the case of a finalized block you can always ask for proof of finality
# 2025-09-10 15:48dave: (This isn't in the JAM-NP spec, GRANDPA stuff will be added soon)
# 2025-09-10 15:49clearloop: but if a node is present in the validator set and we can't trust its finalized blocks, isn't consensus broken? or the case would be like:
- if we import blocks from it from 0 up to the latest finalized block, we can trust it (mb with small verifications)
- if we just pick some blocks in the middle, we'll need the proof of finality
hmm, but the rpc may cheat or be broken as well? (edited)
# 2025-09-10 15:51dave: What do you mean by "trust its finalized blocks"? Validators are generally fairly trusted but the protocol is designed to work with up to 1/3 byzantine validators. A byzantine validator could lie to you about which blocks have been finalized
# 2025-09-10 15:52dave: If you want to be sure a block has been finalized you need to ask for a proof and verify it. A proof will consist of various GRANDPA votes, possibly going all the way back to the genesis block to verify validator set changes and so on
# 2025-09-10 15:52dave: RPC is totally different, as the RPC interface is specifically intended to only be used between two processes which trust each other (edited)
# 2025-09-10 15:53dave: Probably because they are being run by the same operator
# 2025-09-10 15:57danicuki: Sorry if I missed this information in the protocol: when a block should not be imported for any reason, how do we know the expected reason why the block wasn't imported? On test vectors there was an expected error message. Do we have something similar in the fuzzer?
# 2025-09-10 15:57clearloop: you got me, I have probably been less careful about security in my previous work lol, e.g. importing blocks from an open rpc directly without checks to build centralized databases. the concept of proof of finality makes sense to me! (edited)
# 2025-09-10 16:01clearloop: I had asked for it before as well, but it seems it's still not included in Fuzzer V1 yet (there is only an error from target to fuzzer, but not fuzzer to target); probably we can comment on it in https://github.com/davxy/jam-conformance/pull/47 (edited)
# 2025-09-10 16:33davxy: Ideally, you should not need to write any special code to interact with the fuzzer. You only need to implement the communication channel, which is straightforward. No special handling is required at the block import level.

## Basic Protocol

1. Receive a block
2. Import the block, which yields either success or failure
3. Reply:
   - On success: return the resulting state root
   - On failure: return the Error message (the reason is irrelevant and not specified in the GP)

## Handling Forks

In production, you must support forks that occur above the last finalized block. However, since there is no finalization rule in this context, you will not finalize anything. Technically, this means you must be able to handle any kind of fork. For fuzzing, of course, this is not required.

The only requirement (which matches a correct implementation and does not need special code) is that you must allow forks at the **last imported block**. This is necessary because fuzzing involves mutating blocks in several ways, and you must attempt to import all the mutations (some may fail, some may be successful -> fork).

## Example Session

1. Let i = 0
2. Increment i and produce block B_i with parent B_(i-1)
3. Mutate B_i into several variants: B_i1, B_i2, B_i3
4. Send these variants in order: B_i1, B_i2, B_i3, and finally B_i
5. Repeat from step 2, using B_i as the parent for the next batch

## Notes

- Finality is not required: just import the blocks
- Forks created by the mutations can remain in the state
- (optional) You may prune these forks as you progress if you wish

(edited)
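A minimal sketch of the basic protocol loop above; read_message, send_message and import_block are hypothetical placeholders, and the real framing/codec is the one specified in the fuzz-proto README, not here:

def serve(conn, chain) -> None:
    while True:
        msg = read_message(conn)
        if msg.kind == "ImportBlock":
            try:
                # May create a fork at the last imported block; that is expected.
                state_root = import_block(chain, msg.block)
                send_message(conn, kind="StateRoot", value=state_root)
            except Exception as err:
                # On failure just reply with an Error message; the reason is
                # irrelevant and not specified by the GP.
                send_message(conn, kind="Error", value=str(err))
        elif msg.kind == "GetState":
            send_message(conn, kind="State", value=chain.state_at(msg.header_hash))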
# 2025-09-10 16:34davxy: This goes straight into the fuzz protocol v1 readme
# 2025-09-10 16:39davxy: The trace and the report. Re-execute to find out why your impl fails. This is the same as with the jam-test-vectors traces, a dynamic that matches how the system actually works. (edited)
# 2025-09-10 16:43clearloop: > The only requirement (which matches a correct implementation and does not need special code) is that you must allow forks at the last imported block.
> This is necessary because fuzzing involves mutating blocks in several ways, and you must attempt to import all the mutations (some may fail, some may be successful -> fork)

well, I can accept the current fork system now since I just realized that I have to refactor my test infra anyway..., but I want to confirm **if the current fork system won't be expanded anymore?** block mutation happens, and we already have the logic, which is pretty large: handling orphan blocks, invalid blocks, forks, forks of forks; block mutation may happen in any part of them and we have already handled all of that, but to adapt to the fuzzer we still need to write new logic, because our existing system requires more information to bootstrap (edited)
# 2025-09-10 16:46davxy: > but want to confirm if the current fork system won't be expanded anymore I can confirm this. At least according to our fuzzer protocol proposal (edited)
# 2025-09-10 18:21danicuki: In this case, https://github.com/davxy/jam-conformance/blob/main/fuzz-reports/0.7.0/traces/1757406441/00000117.json - the post_state is the same as the pre_state. This means that the block import should fail for some reason, resulting in the same state root. But we don't know, based on the trace, what the reason is for rejecting the block. On the test vectors, there is something like "err": "bad_validator_index" or other hints about why the block fails.
# 2025-09-10 19:10davxy: Hmm interesting... indeed we could include the result of our last block import in the final fuzz report - it might serve as a helpful hint. However, it should be noted that the evaluation order of some expressions is arbitrary, so with a block that has the potential to trigger multiple errors the result can be a bit ambiguous. Edit: After thinking about it, I agree - I don't see a major issue with adding it to the report. (edited)
# 2025-09-10 22:04danicuki: > <@davxy:matrix.org> Hmm interesting... indeed we could include the result of our last block import in the final fuzz report - it might serve as a helpful hint.
> However, it should be noted that the evaluation order of some expressions is arbitrary, so with a block that has the potential to trigger multiple errors the result can be a bit ambiguous.
> It thus also exposes some implementation details, which might not be super ideal.
> I'm not against including it, just unsure.
> Perhaps we should add it if it meaningfully speeds up the analysis.
> What do you think, @dave:parity.io

You could list all potential errors in no particular order, or at least reveal the first one. It is very, very hard from our side to know why a particular block is rejected. When we have a state value diff it is easier to debug and find reasons for the diff. But when the next state doesn't change, we stay clueless and can't even challenge the fuzzer.
# 2025-09-10 22:08danicuki: Another idea: fuzzer target binaries are public. This means anyone can download other teams' binaries, run the blocks through them, and inspect the logs to find clues. For this reason, since you already run the teams' fuzzer targets, you could share the running logs of all teams in the repo. (edited)
# 2025-09-11 11:44davxy: As you are the main stakeholders, could I ask for one more review round before we merge? The changes are summarized in the PR description: https://github.com/davxy/jam-conformance/pull/47
# 2025-09-11 13:08r2rtnl: https://github.com/davxy/jam-conformance/discussions/74
# 2025-09-11 16:02charliewinston14: Is it correct that the ancestry set is maintained separately for each fork?
# 2025-09-11 16:15davxy: The ancestry is determined using the P() function (see 5.3), which traces the chain backward starting from a specified header. For a block within one fork, the state in other forks is irrelevant. (edited)
# 2025-09-11 16:17davxy: For the sake of fuzzing you can maintain a simple bounded queue... (as the chain is never extended from a mutation) (edited)
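For illustration, a minimal sketch of such a bounded queue, assuming it stores (header_hash, slot) pairs; the depth and helper names are placeholders, not prescribed by the protocol:

from collections import deque

# Hypothetical bounded ancestry buffer for fuzzing: the fuzzer never extends the
# chain from a mutation, so a single queue along the imported chain is enough.
MAX_ANCESTRY = 24  # placeholder depth (cf. the L = 24 for tiny config discussed later)

ancestry: deque = deque(maxlen=MAX_ANCESTRY)

def on_block_imported(header_hash: bytes, slot: int) -> None:
    # Oldest entries fall off automatically once the queue is full.
    ancestry.append((header_hash, slot))

def in_ancestry(header_hash: bytes) -> bool:
    return any(h == header_hash for h, _ in ancestry)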
# 2025-09-11 22:15r2rtnl: https://github.com/davxy/jam-conformance/discussions/75
# 2025-09-11 23:55sourabhniyogi: It would likely be very useful to have a v1 .bin/.json ancestors (and maybe forks) trace for us to test our v1 implementations against as we post them in the next 24-72 hours. In examples/v1, with the updated names (edited)
# 2025-09-12 17:04davxy: V1 Enacted
# 2025-09-12 17:38sourabhniyogi: Can you suggest the proper home for this issue https://github.com/davxy/jam-conformance/issues/76
# 2025-09-12 17:44clearloop: I assume this should be under w3f/jam-test-vectors or https://github.com/polkadot-fellows/JIPs ? (edited)
# 2025-09-12 17:46clearloop: pretty much like a standard interface for PVM-FFI; I think Jan Bujak may have some input? according to the benchmark tool in polkavm, there is a sort of architecture of Module, compile, various steps, etc. (edited)
# 2025-09-12 17:48clearloop: https://github.com/paritytech/polkavm/blob/master/tools/benchtool/src/backend.rs#L14 this is pretty much a universal interface, but I think it's still missing the host calls part to make it focused on PVM in JAM (edited)
# 2025-09-12 18:06sourabhniyogi: Jan Bujak: Can you advise?
# 2025-09-12 18:56prematurata: 🥳 should we signal support of our targets somehow? in the ghissue maybe? (edited)
# 2025-09-13 05:16jan: Something similar to this but cut down: https://docs.rs/polkavm/latest/polkavm/struct.RawInstance.html

Basically:
1. A function to turn a raw program blob into a module (i.e. a compilation step for recompilers; for interpreters this will be mostly a nop)
2. A function to instantiate a new VM instance from that module.
3. Then various getters/setters to modify that instantiated VM's state and be able to call into it.

We definitely do not want to have callbacks in the interface. There should be just a single run method that you call on the instance, and if it happens to trigger a host call then it should return a status code signifying that a hostcall was called. (Similar to how inner PVM invocation hostcalls work.) Same with handling page-faults.

The various "read_bytes_8" functions are completely unnecessary; there should only be one function for reading bytes with a given length, and then using that to e.g. read a 32-bit number is just a few trivial bitshifts and bitors away.

And of course, most of the stuff in your "Debug and Tracing" section should not be in a standard API at all.
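A rough sketch of the interface shape being described; all names (Module, Instance, RunStatus) are illustrative, not an actual API:

from enum import Enum, auto

class RunStatus(Enum):
    HALT = auto()        # program finished normally
    HOST_CALL = auto()   # a hostcall was requested; caller services it, then resumes
    PAGE_FAULT = auto()  # caller handles the fault, then resumes
    OUT_OF_GAS = auto()

class Module:
    """Result of turning a raw program blob into a module: a real compilation
    step for recompilers, mostly a no-op for interpreters."""
    @staticmethod
    def compile(program_blob: bytes) -> "Module": ...

    def instantiate(self) -> "Instance":
        """Create a fresh VM instance from this module."""
        ...

class Instance:
    # Getters/setters for registers, program counter, gas, etc. would live here.
    def read_memory(self, address: int, length: int) -> bytes: ...
    def write_memory(self, address: int, data: bytes) -> None: ...

    def run(self) -> RunStatus:
        """Single entry point, no callbacks: returns a status code when a
        hostcall or page fault interrupts execution; the caller handles it and
        calls run() again to resume."""
        ...

Reading a 32-bit value then needs no dedicated helper: something like int.from_bytes(instance.read_memory(addr, 4), "little") covers it.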
# 2025-09-13 12:12sourabhniyogi: OK! I will adjust our pvm.h to match that API 100% and shift the callback process to the run method approach, thank you!
# 2025-09-13 18:21danicuki: Do we need to maintain a fuzzer target that accepts multiple versions? Or should we keep only the latest (in this case, now, v1)?
# 2025-09-13 19:52davxy: as far as our fuzzer is concerned, you can switch to v1 only. All new fuzzing sessions will use v1
# 2025-09-14 15:23danicuki: Are you planning to add some examples that use ancestors on v1? As far as I saw, none of the 30 examples use the ancestors feature.
# 2025-09-14 17:06davxy: Yes, but I've not implemented the feature yet. Indeed during the handshake the flag is turned off in our PeerInfo message
# 2025-09-14 17:06davxy: new batch https://github.com/davxy/jam-conformance/pull/78 (edited)
# 2025-09-14 20:13prematurata: https://github.com/davxy/jam-conformance/discussions/79
# 2025-09-15 02:42clearloop: interesting, each time new traces come out, I can trigger the sum of failures from all other teams
# 2025-09-15 22:32r2rtnl: davxy: a question regarding ancestor support and sharing the same implementation of the (11.35) check between a fuzzer target and jam-test-vectors. In GP 0.7.0, (11.35) requires two things:
1. The lookup_anchor hash must be present in the ancestor set.
2. Its slot must match lookup_anchor_slot.
The issue I'm seeing is that the reports test vectors only provide recent_blocks, which don't include slot information. This creates a mismatch:
- To pass the reports test vectors, the implementation must skip the slot check and check only against recent_blocks.
- For full ancestor support in the fuzzer target, the slot check must remain.
Would it be feasible to adjust the reports test vectors to include slot information, or otherwise make it possible to construct the ancestor set so that the full (11.35) check can run? (edited)
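For reference, a tiny sketch of the two-part check described above, assuming the ancestor set maps header hash to slot (names are illustrative, not GP notation):

# Both conditions of the (11.35)-style check in one place: the lookup anchor
# must be a known ancestor AND its recorded slot must match lookup_anchor_slot.
def lookup_anchor_ok(ancestors: dict[bytes, int],
                     lookup_anchor: bytes,
                     lookup_anchor_slot: int) -> bool:
    return ancestors.get(lookup_anchor) == lookup_anchor_slot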
# 2025-09-16 02:55ascriv: https://github.com/davxy/jam-conformance/discussions/84
# 2025-09-16 09:07davxy: Please try to run your implementation against minifuzz first https://github.com/davxy/jam-conformance/pull/85
# 2025-09-16 10:01r2rtnl: davxy: Does minifuzz assume that an implementation supports forks? Messages 5 and 7 seem to import a block with the same slot (3).
# 2025-09-16 10:39davxy: In the next hours I'll share a set with forks and another without forks.
# 2025-09-16 09:08davxy: May be a bit buggy :-) feel free to open a PR to improve it
# 2025-09-16 09:24davxy: Teams that are already enrolled are encouraged to run it anyway, as I have found some minor issues in the targets (e.g., incorrect error message length, wrong argument order in PeerInfo, etc.). I will not fuzz your target if you have not successfully completed the first 20 steps in the examples/v1 folder using minifuzz (edited)
# 2025-09-16 10:21tomusdrw: Are there any more details on the examples (i.e. expected state, etc.)? We are getting a wrong state root at some point, but I'm not really sure how I could debug it.
# 2025-09-16 10:38davxy: I added a readme file to the examples folder. However, I'll copy the first 20 steps to the minifuzz folder to avoid misunderstandings
# 2025-09-16 10:46clearloop: we have unit tests for examples/v1, e.g. decode/encode the *.bin files and check that things match; not sure if this is an easier way to test the format. will try the mini-fuzz script anyway
# 2025-09-16 10:56tomusdrw: Ah, okay, perfect! I missed that.
# 2025-09-16 13:01jaymansfield: davxy: Just a heads up, minifuzz seems to work until the invalid state root in step 29. It then shuts down after seeing the mismatch rather than proceeding with the final get state request.
# 2025-09-16 13:02davxy: Yes this is known. See the README in the examples folder. I'll prepare some other traces for minifuzz testing (with and without forks)
# 2025-09-16 13:04davxy: In the examples we intentionally return a wrong root to emulate a bad target, and thus include the GetState message in the examples sequence
# 2025-09-16 13:04davxy: But this is causing some confusion (you are the third to ask)
# 2025-09-16 13:04davxy: so I'll provide a dedicated folder for the "self test"
# 2025-09-16 13:07jaymansfield: This would help. I think the readme is clear that it should fail, but not that minifuzz won't actually perform the get state itself afterwards. That was the missing part. (edited)
# 2025-09-16 13:10clearloop: wait, I just realized that the minifuzzer stopped after 10 pairs on my machine without any error messages; is this not normal as well?
==========================================================================
Processing pair 10: 00000009_fuzzer_import_block.bin -> 00000009_target_state_root.bin
TX: import_block
RX: state_root

Stopping after 10 file pairs as requested
(edited)
# 2025-09-16 13:11jaymansfield: Try adding the argument: --stop-after 30
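For example (combining the flag with the invocation shown a few messages below; the exact directory and socket path may differ per setup):

python minifuzz/minifuzz.py -d examples/v1 --target-sock /tmp/jam_target.sock --stop-after 30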
# 2025-09-16 13:47davxy: https://github.com/davxy/jam-conformance/tree/main/fuzz-proto#preliminary-self-testing
# 2025-09-16 13:47davxy: I hope it is clear now
# 2025-09-16 14:12danicuki: I get an error when trying to run minifuzz:
$ python minifuzz/minifuzz.py -d examples/v1/forks --target-sock /tmp/jam_target.sock
Traceback (most recent call last):
  File "/Users/danicuki/dev/jam-conformance/fuzz-proto/minifuzz/minifuzz.py", line 9, in <module>
    from jam_types.fuzzer import FuzzerMessage
ModuleNotFoundError: No module named 'jam_types'
Where can I find the jam_types code?
# 2025-09-16 14:14clearloop:
pip install git+https://github.com/davxy/jam-types-py.git
# 2025-09-16 14:22clearloop: davxy: I think there is a problem in the fork tests at 00000004, which proposes block.slot=1 with the parent hash of the block in 00000003 (block.slot=1, imported successfully) and expects import success. if I'm not mistaken, if this is a fork, it should have the parent hash of block.slot=0
# 2025-09-16 14:59jaymansfield: I'm seeing the parent of 00000004 as the one from the initialize in 00000001
# 2025-09-16 15:40danicuki: the minifuzzer should not compare message sizes, since error messages can differ from implementation to implementation, no?
Processing pair 7: 00000006_fuzzer_import_block.bin -> 00000006_target_error.bin
TX: import_block
Error decoding target response: Decoding <String> - No more bytes available (needed: 69 / total: 68)
Connection closed
I believe FuzzerMessage(data=scale_bytes).decode() might have a bug (edited)
# 2025-09-16 15:43clearloop: can you get past that? from my logs, 0x7003387fc23f422113a94dd5abc0b4fa31eae0108b4ac9ea6a133ffd14e315fb is the parent 00000004 wants
DEBUG  read: message(length): Info
DEBUG write: message(21): Info
DEBUG  read: message(length): Initialize(len=21)
DEBUG write: message(33): StateRoot(0xe174c1ea94e21958db28c9ea621f09ae2b360a9ecb6047afdbf94827d216e1db)

DEBUG  read: message(length): ImportBlock(slot=1, hash=0x252ff7a523a698b91e872f90f3f046fac90b80869bf9c7cf2000d627f4e7df3e)
DEBUG import: importing block(1)=0x252ff7a523a698b91e872f90f3f046fac90b80869bf9c7cf2000d627f4e7df3e, best block: 0
 WARN failed to import block: BadCoreIndex
DEBUG write: message(14): Error(BadCoreIndex)

DEBUG  read: message(length): ImportBlock(slot=1, hash=0x7003387fc23f422113a94dd5abc0b4fa31eae0108b4ac9ea6a133ffd14e315fb)
DEBUG import: importing block(1)=0x7003387fc23f422113a94dd5abc0b4fa31eae0108b4ac9ea6a133ffd14e315fb, best block: 0
DEBUG write: message(33): StateRoot(0x25b1bd54e8d0d5c82202ae51de2d378b8dfa13f637fe75bda672484c25e1c6a6)

DEBUG  read: message(length): ImportBlock(slot=1, hash=0xe63b5fa906678415f5b15ae6514d44b35e4dafa497f52a1640cbf374811b9c89)
DEBUG import: importing block(1)=0xe63b5fa906678415f5b15ae6514d44b35e4dafa497f52a1640cbf374811b9c89, best block: 1
 WARN import: Fallback state to 0
 WARN failed to import block: Parent mismatch, expected: 0x7003387fc23f422113a94dd5abc0b4fa31eae0108b4ac9ea6a133ffd14e315fb, got: 0x2bf11dc5e1c7b9bbaafc2c8533017abc12daeb0baf22c92509ad50f7875e5716
(edited)
# 2025-09-16 16:39davxy: > I think there is a problem in the fork tests at 00000004

I don't see any issue

> I'm seeing the parent of 00000004 to be the one from the initialize in 00000001

Correct

Given the filename number associated with each block, the chain graph of the blocks (attempted to be imported) up to 7 should be:
1--+--2
   +--3
   +--4--+--5
         +--6--7
(edited)
# 2025-09-17 07:09clearloop: thanks! I found that our impl has some bugs in the fallback handling
# 2025-09-16 16:43davxy: From the replies, it appears that: - Blocks **2** and **5** fail to be imported. - Blocks **3** and **4** are successfully imported (actual fork). According to the fuzzer protocol, the chain is extended using the **last successfully imported block**. In other words, you will **never** see blocks 3 and 4 successfully imported and then the chain extended from block 3.
# 2025-09-16 16:43davxy: https://github.com/davxy/jam-conformance/tree/main/fuzz-proto#example-session
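A small worked illustration of that rule against the graph above, assuming the batch structure from the Example Session (mutated variants first, then the block the chain extends from) and the import results just described (blocks 2 and 5 fail):

# Compute the parent used for each attempted block: every attempt in a batch is
# parented on the current chain tip, and the tip only advances to the last block
# of the batch that imported successfully.
chain_tip = 1  # block from the Initialize message
batches = [[(2, False), (3, True), (4, True)],   # 2 fails; 3 and 4 fork off 1
           [(5, False), (6, True)],              # 5 fails; 6 extends 4
           [(7, True)]]
for attempts in batches:
    for block, ok in attempts:
        print(f"block {block}: parent {chain_tip}, imported: {ok}")
    imported = [b for b, ok in attempts if ok]
    if imported:
        chain_tip = imported[-1]
# Parents printed: 2, 3, 4 -> 1;  5, 6 -> 4;  7 -> 6 (matching the graph above)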
# 2025-09-16 20:42sourabhniyogi: After teams get through "minifuzz" this week and a round or two of ancestry + forks this week and next, can we wrap up 0.7.0 and move collectively onto 0.7.1, say, in the week of Sept 29? Since there isn't _that_ much happening on 0.7.2, perhaps we can jump from 0.7.0 to 0.7.2? (edited)
# 2025-09-17 03:36ascriv: Should L be 24 for the tiny config? Can't pass 1757862468 otherwise
# 2025-09-17 04:47prematurata: > <@ascriv:matrix.org> Should L be 24 for the tiny config? Cant pass 1757862468 otherwise Yes
# 2025-09-17 04:48prematurata: Anyone passing https://github.com/davxy/jam-conformance/discussions/79 who can shed some light on this?
# 2025-09-17 05:45danicuki: > <@ascriv:matrix.org> Should L be 24 for the tiny config? Cant pass 1757862468 otherwise Yes
# 2025-09-17 08:260xjunha: https://github.com/davxy/jam-conformance/discussions/91
# 2025-09-17 10:49davxy:
# 2025-09-17 11:38clearloop: anybody know what's going on with faulty_000000029? from 000000030, I found that the trace expects block 0xb13b648e9030118a6bf912aaca95a78b66c86cbd41d112b21393d4b896eaf864 in the history, by getting the state of 0x91fcda538898b174da9b61af42c141fb0e1549e4e0dfc1ca8caec4c6185eeea5 (and that is exactly the last block imported in my target), while the block 0xb13b6... does not exist at all in my importing process (edited)
# 2025-09-17 13:01vinsystems: > <@clearloop:matrix.org> anybody know what's going on with faulty_000000029? from 000000030, I found that the trace expects block 0xb13b648e9030118a6bf912aaca95a78b66c86cbd41d112b21393d4b896eaf864 in the history, by getting the state of 0x91fcda538898b174da9b61af42c141fb0e1549e4e0dfc1ca8caec4c6185eeea5 (and that is exactly the last block imported in my target), while the block 0xb13b6... does not exist at all in my importing process https://github.com/davxy/jam-conformance/tree/main/fuzz-proto/examples/v1#warning-faulty-session
# 2025-09-17 13:02clearloop: thank you so much! I have just spent hours on this 🤦‍♂️ (edited)
# 2025-09-17 14:58davxy: https://github.com/gavofyork/graypaper/pull/497
# 2025-09-18 06:46dakkk: It seems I'm having a similar issue with that trace
# 2025-09-18 13:40danicuki: > <@sourabhniyogi:matrix.org> After teams get through "minifuzz" this week and a round or two of ancestry + forks this week and next, can we wrap up 0.7.0 and move collectively onto 0.7.1, say, on week of Sept 29? > > Since there isn't _that_ much happening on 0.7.2, perhaps we can jump from 0.7.0 to 0.7.2? I agree we should go directly to 0.7.2
# 2025-09-18 14:12clearloop: interesting, I just found that in the tracing tests (conformance reports/traces) there are programs that have the same code_hash but different initial registers. isn't the code_hash the hash of (metadata + standard program), such that the initial registers should be fixed? (edited)
# 2025-09-18 16:54boymaas: When running minifuzz, do other teams also verify whether the author_index is determined using https://graypaper.fluffylabs.dev/#/38c4e62/0e1c040e3104?v=0.7.0 in key mode, as part of their header validation? I observe a delta in my implementation **only during the first epoch** of the jamtestvectors traces and the minifuzzer protocol test.
# 2025-09-18 19:13sourabhniyogi: Poll: December 2025 In-person JAM Meetup @ PBA Lisbon https://github.com/davxy/jam-conformance/discussions/93 It would be fabulous if we had as many of the JAM implementer enrolled teams there as we possibly can!
# 2025-09-18 19:16emielsebastiaan: Does anyone have specific dates?
# 2025-09-18 19:49sourabhniyogi: Hello nikos -- we heard about the PBA JAM course way back in the JAM XP May meetup -- it would be great to have JAM implementers assemble there. Can you tell us about details so we can plan with you a bit?
# 2025-09-18 19:50wirednkod: sourabhniyogi: the dates are not final
# 2025-09-18 19:50wirednkod: we are recovering from PBA Bali and expecting some updates concerning the toaster - and will then decide the final dates
# 2025-09-18 19:51wirednkod: I would suggest avoiding polls with specific dates like the one above - but i will keep the JAM community up to date as soon as we have the final dates
# 2025-09-18 19:51sourabhniyogi: Are you ok with a JAM implementer headcount of 20-30 there?
# 2025-09-18 19:52wirednkod: i cannot provide a definite answer to that at the moment as it may sound like a promise
# 2025-09-18 19:53wirednkod: again - i would suggest a bit of patience - im sure by the end of Sept we will have many more details to move forward
# 2025-09-18 19:53wirednkod: but at the moment neither the date nor the participant numbers or profile are 100% defined
# 2025-09-18 19:53sourabhniyogi: Absolutely, will be patient =)
# 2025-09-18 19:54wirednkod: thank you
# 2025-09-19 16:37shimonchick: Is there something wrong with the JIP-5 altnames? It's a very basic algorithm, but applying it yields different values for me. Can somebody confirm whether the current JIP-5 altnames in https://docs.jamcha.in/basics/dev-accounts are correct?
# 2025-09-19 18:09sourabhniyogi: Hats off to Tianyi | SpaceJam ... with the FIRST JAM Implementation to match/beat Polkajam in performance!
# 2025-09-19 18:10sourabhniyogi: As teams develop PVM recompilers, should we aim to have two entries in the dashboard like Polkajam or just one?
# 2025-09-19 20:30davxy: First of all, congrats to Spacejam for achieving such impressive results! That said, we've now entered the sub-ms domain, where my workstation setup isn't deterministic at all. One run can be faster, and the next might not be. To avoid encouraging pointless sub-ms races, I think it makes sense to round measurements to millisecond granularity
# 2025-09-19 21:48clearloop: 0.05ms, it's driving me crazy 🫠 agree with not overdoing this. from my side, I'm feeling exhausted indeed in recent weeks, however I think the sub-ms numbers on a single dataset are still useful for checking whether there is room to optimize XD for example, by seeing that polkajam is faster in safrole/fallback, I know things can be done better, and I just found something new that can be optimized in our implementation, but I actually hope I can't find more, since this stuff looks endless XD (edited)
# 2025-09-20 03:33jan: Nice! I'm always happy to see other implementations matching PolkaVM in performance. Congratulations to SpaceJam!
# 2025-09-20 03:34jan: The next frontier will be recompilation speed (along with the new gas cost model which will be a lot more challenging to make fast), which isn't currently very well tested by these performance benchmarks.
# 2025-09-19 20:37davxy: For example, one could gain some pointless advantage just by removing **all** the logs. I've seen some implementations keep a few logs, while others completely removed them. We even spotted implementations removing signature checking ;-) (quickly restored, of course). Honestly, we don't care much about these micro-optimizations. What really matters is avoiding spending orders of magnitude more time fuzzing a target compared to the baseline.
# 2025-09-19 22:01sourabhniyogi: Were the (now missing) logs useful to you? I believe shifting the focus from micro-optimizations to recompilation is important. We've been conditioned to care about the small (<50%) differences because of
(M3) HALF-SPEED: Conformance and 50% of required performance (including PVM impl): 100,000 DOT + 1,000 KSM
(M4) FULL-SPEED: Conformance and 100% of required performance (including PVM impl): 100,000 DOT + 1,000 KSM
It's pretty obvious that performance is all about recompilation here, and I think it's valuable to get more teams focused on this. We know it's not a thing for M1, but since we know with 100% certainty it's a not-pointless goal for M3/M4, it's entertainment now, but useful entertainment, while we're waiting for, like, 1.0 ratification. Why not introduce at least 1M-10MM gas of randomly generated [but still valid] PVM byte code interpretation, so it always consumes > 10ms (for baseline polkajam) and we can improve performance on at least _some_ traces? By introducing a > 10x factor like this, 50% differences become visible in import_max, and:
1. we have a baby step towards PVM fuzzing
2. recompiler teams can play a useful, substantive M3/M4 optimization game amidst M1 fuzzing
3. more teams will get engaged on recompilers (because we love optimizing)
PVM fuzzing is valuable to attack right within M1, I presume, is it not? (edited)
# 2025-09-19 22:56clearloop: my testing linux machine is likely 10x-20x slower than the fuzz machine; I can confirm that some of our big optimizations actually started from 1ms on my machine, which could be just 0.1ms/0.05ms on the testing machine. however, at the current level, for spacejam we won't optimize more.
aside from the performance part, I think we're feeling anxious mostly because there is no obvious threshold for a stable implementation; the performance of polkajam is really hard to chase (we started 20x~50x slower, and some parts of polkajam are still unbelievable/insane to us atm, even though we are in the same language), and sometimes I feel like I'm working like a slave to chase it 🫠
hope the threshold of M3/M4, or the baseline for the current M1, will not even be in X ms but based on X0 ms or X00 ms, which is something like: 「there are 6s per block; if you can import most blocks within 30ms avg on machine A, it's totally enough for the stability of our network, you have done a great job, I won't force you to achieve 0.X ms since we all know there are indeed differences in impls」 (edited)
# 2025-09-20 03:38jan: > hope the threshold of M3/M4 or the baseline for the current M1 will not even be in X ms but based on X0 ms or X00 ms This is not official, but I imagine for M3/M4 we'll look at what's achievable (by looking at the fastest implementation), and just pick a threshold that's slightly lower than that so that it won't be limited to only the most performance-oriented teams and/or languages (where "slightly lower" is very much TBD; we'll probably know how much that is exactly once we can run more real-world tests on an actual JAM chain)
# 2025-09-20 03:40jan: I can see a world where some parts of the protocol will be a bottleneck and that could make it less important to optimize other parts because they wouldn't actually make things faster in practice. (edited)
# 2025-09-20 03:41jan: But, again, this is still TBD until we can do more real-world testing instead of microbenchmarks.
# 2025-09-20 03:43jan: > Why not introduce at least 1M-10MM million gas of randomly generated [but still valid] PVM byte code interpretation to always consume > 10ms (for baseline polkajam) so we can improve performance on at least some traces? For purely PVM drag racing it probably doesn't make sense to use a general-purpose fuzzer/harness like this and it'd be a better idea to make a dedicated one which only does PVM (like I'm planning to have for M3/M4)
# 2025-09-20 03:45sourabhniyogi: Life is what happens when you're busy making plans =)
# 2025-09-20 04:49ascriv: How will the number of M1 conformance vectors be defined? Based on the number of vectors polkajam-int can run in 3 days (e.g.) at the time of conformance? Or something else?
# 2025-09-20 11:42rustybot: https://github.com/davxy/jam-conformance/pull/95
# 2025-09-20 12:07boymaas: Could something have gone wrong with the last performance run? When I bench in a VM, I already get much lower values (between 5x and 10x lower) using the same weighted scoring algorithm.
# 2025-09-20 12:20davxy: Indeed there is a big regression for jamzig. I'll try to re-run, but I used exactly the same setup for all the targets.
# 2025-09-20 12:25clearloop: could you please try to download the latest spacejam binaries as well? I made an update to our crypto stuff late yesterday, which possibly provides a 25% optimization of our basic fallback mechanism; not sure if it really works
# 2025-09-20 12:20clearloop: looks like this new rule has a lot of effect on crypto-related stuff XD
# 2025-09-20 12:22boymaas: Thanks, davxy. An hour ago, I released a new binary that also addresses the last failing reports.
# 2025-09-20 12:34davxy: Perhaps my message was not clear:
- I'm not endorsing this kind of competition
- I'm not here to run your impl on each update (for a sub-ms race)
- I may re-run impls with strange behavior
I will run your update on the next round (edited)
# 2025-09-20 12:35davxy: FWIW fuzzer protocol is public, traces are public, anyone can check their perfs
# 2025-09-20 12:49boymaas: No problem. It's just an unusual regression I do not understand. It will resolve itself the next run. There's no rush.
# 2025-09-20 12:53ascriv: I think I speak for everyone when I say we all really appreciate the work you’re basically soloing @davxy
# 2025-09-20 12:58clearloop: understood; this sort of devops-related stuff for all teams does indeed need mental health insurance ))
# 2025-09-20 13:02davxy: Well the point is that it is **literally** not my job to bench your impls. I work on an implementation and I built a fuzzer. That, so far, is the most effective one (edited)
# 2025-09-20 13:03davxy: Well your impl has a very strange regression, so I'll check that of course :-)
# 2025-09-20 13:05boymaas: Thank you; Much appreciated.
# 2025-09-20 14:46ycc3741:
# 2025-09-20 14:47ycc3741: davxy: When testing trace/preimage_light/0000008, we found some issues with [pi_V[v]_g](https://graypaper.fluffylabs.dev/#/38c4e62/19ea0119fc01?v=0.7.0), as follows:

We know that in tiny mode the parameters are R = 4, E = 12, which means the cores where the guarantors for slot 7 and slot 8 reside will rotate. We observed that originally our pi_V[v]_g was 2. In slot 8, we received a guarantees extrinsic with two elements belonging to slot 7 and slot 8 respectively. According to the core assignment for slot 8, [0,1,1,0,0,1], and for slot 7, [1,0,0,1,1,0], we confirmed that the reports received for slot 7 and slot 8 were both valid. Thus, we incremented pi_V[v]_g twice by +1 (since kappa_v_prime is in G according to GP 0.7.0 formula (13.5)), resulting in pi_V[v]_g_prime = 4.

However, formula (13.5) only checks whether kappa_v_prime is in G. Doesn't this conflict with the description above (13.3): "g: The number of reports guaranteed by the validator"? Shouldn't both reports be counted? Or should we simply follow formula (13.5) and add +1 just once?
# 2025-09-20 14:47ycc3741: post.png
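For clarity, a hypothetical sketch contrasting the two readings raised in the question above; names are illustrative, not GP notation:

# Reading A, as formula (13.5) is written: at most +1 per block's guarantees
# extrinsic, if the validator's key appears among the guarantor keys G at all.
def bump_per_formula(stats: dict, v_key: bytes, guarantor_keys: set) -> None:
    if v_key in guarantor_keys:
        stats[v_key] = stats.get(v_key, 0) + 1

# Reading B, as the prose above (13.3) suggests ("number of reports guaranteed
# by the validator"): +1 for each report the validator guaranteed in the block.
def bump_per_description(stats: dict, v_key: bytes, reports_by_v: list) -> None:
    stats[v_key] = stats.get(v_key, 0) + len(reports_by_v)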
# 2025-09-22 12:24dave: IIRC Gav said the GP behaviour in this case is intentional, to keep things simple. I agree the counting behaviour you describe would probably be more useful. You can propose a GP change by making a PR in the GP repo; if the change is simple I think there's a fair chance of it being accepted
# 2025-09-23 01:29ycc3741: THX a lot
# 2025-09-23 01:29ycc3741: we will deal with it later
# 2025-09-23 16:18davxy: https://github.com/davxy/jam-conformance/discussions/98#discussioncomment-14488639
# 2025-09-25 15:58danicuki: Many of the cases in the new fuzzer batch are related to on_transfer, which was dropped in GP 0.7.2. Do you think it is worth having tests that use on_transfer at this point?
# 2025-09-25 16:33prematurata: in my case I found other issues not directly related to on_transfer, so I'd say it was worth it
# 2025-09-25 18:43davxy: Keep in mind that it is not mandatory to pass every case. You may decide that some are not worth addressing and leave them as is. This is the final batch for 0.7.0; after this, we will move on to 0.7.1.