@@ -233,8 +233,7 @@ fn hash_compute(light: &Light, full_size: usize, header_hash: &H256, nonce: u64)
 			Node { bytes: out.assume_init() }
 		},
-		// This is fully initialized before being read, see `let mut compress = ...` below
-		compress_bytes: unsafe { mem::uninitialized() },
+		compress_bytes: [0u8; MIX_WORDS],
❯ cargo bench --features=bench -- compute
bench_light_compute_memmap
time: [980.59 us 986.10 us 992.69 us]
change: [-3.7744% -1.4892% +0.5632%] (p = 0.20 > 0.05)
No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) high mild
2 (2.00%) high severe
bench_light_compute_memory
time: [1.0336 ms 1.0410 ms 1.0493 ms]
change: [-3.9993% -2.0714% +0.0881%] (p = 0.05 > 0.05)
No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
Amazing, must have been part of one of the LLVM upgrades since this code was written. I absolutely benchmarked it before and 0-initialising it was significantly slower. Wonder if there's anywhere else where we can remove uninit entirely now.
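For reference, a minimal sketch of the two patterns being compared here, with a hypothetical `BUF_LEN` standing in for the crate's `MIX_WORDS` (this is not the crate's actual code):

```rust
use std::mem::MaybeUninit;
use std::ptr;

// Hypothetical buffer size, standing in for the crate's MIX_WORDS.
const BUF_LEN: usize = 64;

// Straightforward replacement for `mem::uninitialized()`: zero-initialise.
// LLVM can usually elide the memset when every byte is overwritten before
// it is read, which matches the flat benchmark numbers above.
fn zero_init() -> [u8; BUF_LEN] {
    [0u8; BUF_LEN]
}

// Fallback if zeroing ever shows up in profiles again: keep the memory
// behind `MaybeUninit` and only call `assume_init` once it is fully written.
fn fill_without_zeroing() -> [u8; BUF_LEN] {
    let mut buf = MaybeUninit::<[u8; BUF_LEN]>::uninit();
    unsafe {
        // Stand-in for the real initialisation (e.g. a keccak write).
        ptr::write_bytes(buf.as_mut_ptr() as *mut u8, 0xAB, BUF_LEN);
        buf.assume_init()
    }
}

fn main() {
    assert_eq!(zero_init()[0], 0);
    assert_eq!(fill_without_zeroing()[0], 0xAB);
}
```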
Just a single minor comment, this appears to be perfect.
ethash/src/compute.rs
Outdated
	// This is initialized in `keccak_256` below.
	let mut hash = mem::MaybeUninit::<[u8; 32]>::uninit();
	keccak_256::unchecked(hash.as_mut_ptr() as *mut u8, 32, buf.as_ptr(), buf.len());
Can you extract 32 to a constant so you don't have to ensure that the 32 in the definition of `hash` and the 32 in the `unchecked` call are the same? Before you could do `.len` but that's now impossible because of `MaybeUninit`.
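A runnable sketch of the suggested shape; `keccak_256_unchecked` here is a hypothetical stub with the same pointer-based signature as the crate's `keccak_256::unchecked`, not the real function, and the constant name is only a placeholder (the naming discussion below lands on something similar):

```rust
use std::mem::MaybeUninit;

// Suggested shape: one shared constant so the buffer size and the length
// passed to the hashing call cannot drift apart.
const HASH_BYTES_LEN: usize = 32;

// Hypothetical stand-in for the crate's `keccak_256::unchecked`; it just
// fills the output with a dummy value instead of hashing.
unsafe fn keccak_256_unchecked(out: *mut u8, out_len: usize, input: *const u8, input_len: usize) {
    let _ = (input, input_len);
    std::ptr::write_bytes(out, 0u8, out_len);
}

fn main() {
    let buf = [0u8; 64];
    // This is initialized in the hashing call below.
    let mut hash = MaybeUninit::<[u8; HASH_BYTES_LEN]>::uninit();
    let hash = unsafe {
        keccak_256_unchecked(hash.as_mut_ptr() as *mut u8, HASH_BYTES_LEN, buf.as_ptr(), buf.len());
        hash.assume_init()
    };
    assert_eq!(hash.len(), HASH_BYTES_LEN);
}
```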
	// big-endian arches like mips.
	let compress: &mut [u32; MIX_WORDS / 4] =
		unsafe { make_const_array!(MIX_WORDS / 4, &mut buf.compress_bytes) };
	#[cfg(target_endian = "big")]
	{
		compile_error!("parity-ethereum currently only supports little-endian targets");
Great, this should have been done long ago. Re-reading the code, I think it might actually work on big-endian as long as you don't copy the files we're reading from between little- and big-endian machines, although I can't be certain. Might be worth testing.
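To spell out the endianness concern: the cache/DAG bytes are reinterpreted as `u32` words, and the same bytes decode to different integers on little- and big-endian hosts. A standard-library-only illustration (not the crate's `make_const_array!` path):

```rust
fn main() {
    let bytes = [0x01u8, 0x00, 0x00, 0x00];

    // Native-endian reinterpretation: 1 on little-endian hosts,
    // 16_777_216 on big-endian hosts.
    let native = u32::from_ne_bytes(bytes);
    // Explicit little-endian decoding is the same on every host.
    let little = u32::from_le_bytes(bytes);

    #[cfg(target_endian = "little")]
    assert_eq!(native, little);

    // A file of such words written on one architecture and memory-mapped on
    // the other would therefore be read back as different numbers.
    println!("native = {}, little-endian = {}", native, little);
}
```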
ethash/src/compute.rs
Outdated
@@ -152,9 +152,10 @@ pub fn quick_get_difficulty(header_hash: &H256, nonce: u64, mix_hash: &H256, pro
 	let buf = buf.assume_init();

+	const HASH_BYTES_LENGTH: usize = 32;
I'm bad at naming, so any other suggestions are welcome :)
It'll do
Maybe `KECCAK_LEN`?
No, because `KECCAK_LEN` would be the total length, right? We run keccak with different lengths during this function. At least `HASH_BYTES_LEN` is generic enough that it doesn't imply that we always run it with this length; it's just a barrier against typos.
* master:
  Run cargo fix on a few of the worst offenders (#10854)
  removed redundant fork choice abstraction (#10849)
  Extract state-db from ethcore (#10858)
  Fix fork choice (#10837)
  Move more code into state-account (#10840)
  Remove compiler warning (#10865)
  [ethash] use static_assertions crate (#10860)
Ok, so I've removed uninit completely, and running
-	ptr::copy_nonoverlapping(&nonce as *const u64 as *const u8, buf[32..].as_mut_ptr(), 8);
+	let hash_len = header_hash.len();
+	buf[..hash_len].copy_from_slice(header_hash);
+	buf[hash_len..hash_len + mem::size_of::<u64>()].copy_from_slice(&nonce.to_ne_bytes());
I'm using `to_ne_bytes` here to preserve the previous behavior, but I guess it still only works for little-endian targets (I don't have a machine to test this, though).
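For what it's worth, a small std-only illustration of the difference; if the intended byte order is little-endian, `to_le_bytes` would make that explicit on every target (this is a sketch, not the crate's code):

```rust
fn main() {
    let nonce: u64 = 0x0102_0304_0506_0708;

    // What the diff above does: native byte order, i.e. whatever the host uses.
    let native = nonce.to_ne_bytes();
    // Explicit little-endian byte order, identical on every target.
    let little = nonce.to_le_bytes();

    // On a little-endian machine these agree, which is why the new code
    // matches the old `ptr::copy_nonoverlapping` behaviour there.
    #[cfg(target_endian = "little")]
    assert_eq!(native, little);

    println!("ne = {:02x?}, le = {:02x?}", native, little);
}
```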
The benches on this branch:
However, the same bench command on master is… weird. Here's the output:
Notice the ~3000s estimate and the duplicate bench names.
Part of #10842. Supersedes #10853.
Closes #10842.