-
Notifications
You must be signed in to change notification settings - Fork 79
feat: new keccak #2303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop-new-keccak
Are you sure you want to change the base?
feat: new keccak #2303
Conversation
672c42f to
fb9244f
Compare
7d7039c to
05a6d51
Compare
809a00d to
a7e7113
Compare
05a6d51 to
b6423af
Compare
3fce685 to
fcd99c6
Compare
| let limb = u16_idx % U64_LIMBS; | ||
| let y = i / 5; | ||
| let x = i % 5; | ||
| let x = i / 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why'd you change this? unless you updated the plonky3 commit; this is just definitional
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because somehow some bus which was unbalanced before became balanced after this change but I am still trying to understand what is the correct way to do it since the memory read / write busses are still unbalanced because the data being written after the keccakf operation does not match
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn generate_trace_rows_for_perm<F: PrimeField64>(rows: &mut [KeccakCols<F>], input: [u64; 25]) {
let mut current_state: [[u64; 5]; 5] = unsafe { transmute(input) };
let initial_state: [[[F; 4]; 5]; 5] =
array::from_fn(|y| array::from_fn(|x| u64_to_16_bit_limbs(current_state[x][y])));
turns out inside plonky3's generate_trace_rows the input gets transposed which is why the bus becoming balanced makes sense now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
turns out in the current keccak trace gen we initially transpose the state first
openvm/extensions/keccak256/circuit/src/trace.rs
Lines 403 to 410 in 77adf7e
| // We need to transpose state matrices due to a plonky3 issue: https://github.com/Plonky3/Plonky3/issues/672 | |
| // Note: the fix for this issue will be a commit after the major Field crate refactor PR https://github.com/Plonky3/Plonky3/pull/640 | |
| // which will require a significant refactor to switch to. | |
| let state = from_fn(|i| { | |
| let x = i / 5; | |
| let y = i % 5; | |
| states[block_idx][x + 5 * y] | |
| }); |
which is why we don't swap y and x there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
turns out to fix it I just have to transpose the input to the plonky3 trace gen and since the transpose of the transpose is the original matrix, everything will be as expected. fixed it here
openvm/extensions/new-keccak256/circuit/src/keccakf/trace.rs
Lines 233 to 236 in f1d6a76
| // the reason we give the transpose instead is inside, plonky3 transpose the input | |
| // so transpose of transpose fixes it | |
| let p3_trace: RowMajorMatrix<F> = generate_trace_rows(vec![preimage_buffer_bytes_u64_transpose], 0); | |
| row[..NUM_KECCAK_PERM_COLS].copy_from_slice( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonathanpwang when the plonky3 thing gets fixed the change to the code should be to just revert into passing in vec![preimage_buffer_bytes_u64 instead of the transpose to generate_trace_rows
b3982f4 to
024d03b
Compare
…ruct with new one
024d03b to
e1dc17b
Compare
CodSpeed Performance ReportMerging #2303 will improve performances by ×2.2Comparing
|
| Mode | Benchmark | BASE |
HEAD |
Change | |
|---|---|---|---|---|---|
| ⚡ | WallTime | benchmark_execute_metered[quicksort] |
19.5 ms | 9 ms | ×2.2 |
Footnotes
-
No successful run was found on
develop-new-keccak(77adf7e) during the generation of this report, somain(77adf7e) was used instead as the comparison base. There might be some changes unrelated to this pull request in this report. ↩ -
36 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…go.toml - place it back when cuda tracegen is merged
Commit: e1c71cf |
shuklaayush
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dropped a few comments. still reviewing
| } | ||
|
|
||
| #[inline(always)] | ||
| unsafe fn execute_e12_impl<F: PrimeField32, CTX: ExecutionCtxTrait, const IS_E1: bool>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use IS_E1 to switch between slice and normal reads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for pointing this out
| let x = i % 5; | ||
|
|
||
| let state_limb: AB::Expr = local.inner.preimage[y][x][limb].into(); | ||
| let hi: AB::Expr = local.preimage_state_hi[i * U64_LIMBS + limb].into(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do these need to be range checked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes it needs to be and I forgot about it. thank you for pointing this out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am now wondering what constraints the correctness of preimage_state_hi as the upper u8 limb of preimage in the current keccak256 chip
Description
New Keccak with Xorin and Keccakf chip and opcode
Checklist
To reviewer: I will still have to complete the above checklist. But you can start reviewing if you would like to.
Closes INT-5017, INT-5721, INT-5720, INT-5718, INT-5717, INT-5646, INT-5018