Talk:Btrfs/Checksum Algorithms
From Forza's ramblings
Kernel hash algorithm performance
Great article, but perhaps it is worth mentioning that the in-kernel implementations behave differently from what SMHasher reports? For example, on my Zen 2 CPU, the xxhash64 driver (xxhash64-generic) is about 1.7 times slower than the in-kernel crc32c-intel driver when hashing 4K, 16K and 64K blocks, which are similar to the block sizes the btrfs module hashes in inode.c and scrub.c.
Perhaps SMHasher tests on larger blocks? Increasing the size of the blocks fed through the crypto_shash_update function makes xxhash64 faster than crc32c-intel once the blocks reach about 4800K (1200 pages), but btrfs doesn't hash blocks of this size.
[ 6990.504207] hashbench: driver=crc32c-intel pages=1 iters=1000000
[ 6990.688875] hashbench: total: 184 ms, 184 ns/page
[ 6981.959992] hashbench: driver=xxhash64-generic pages=1 iters=1000000
[ 6982.275400] hashbench: total: 315 ms, 315 ns/page

[ 7115.187618] hashbench: driver=crc32c-intel pages=1200 iters=1000
[ 7115.664394] hashbench: total: 467 ms, 389 ns/page
[ 7111.475708] hashbench: driver=xxhash64-generic pages=1200 iters=1000
[ 7111.921164] hashbench: total: 435 ms, 363 ns/page
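For anyone who wants to check the figures, the ns/page values and the 1.7x ratio can be recomputed from the totals in the log. This is only a sketch: the helper names `ns_per_page` and `time_ratio` are made up for illustration, and the rounding to the nearest nanosecond is an assumption, since the module's own calculation is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: average cost per page in nanoseconds, rounded to
 * the nearest integer, from a total run time given in milliseconds. */
static uint64_t ns_per_page(uint64_t total_ms, uint64_t pages, uint64_t iters)
{
	uint64_t total_pages = pages * iters;

	assert(total_pages > 0);
	return (total_ms * 1000000ULL + total_pages / 2) / total_pages;
}

/* Hypothetical helper: how many times longer run A took than run B. */
static double time_ratio(uint64_t a_ms, uint64_t b_ms)
{
	assert(b_ms > 0);
	return (double)a_ms / (double)b_ms;
}
```

With the totals above, `ns_per_page(467, 1200, 1000)` gives 389 and `ns_per_page(435, 1200, 1000)` gives 363, matching the log. `time_ratio(315, 184)` is roughly 1.71 for single pages, while `time_ratio(467, 435)` is roughly 1.07 at 1200 pages, where crc32c-intel is the slower one.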
	ktime_t start = ktime_get();

	for (size_t i = 0; i < iterations; i++) {
		/* Start a fresh digest for every iteration. */
		if (crypto_shash_init(shash)) {
			hb_err("failed to init shash\n");
			goto free_vmem;
		}
		/* Feed the buffer to the hash one page at a time. */
		for (u8 *page = vmem; page != end; page += PAGE_SIZE) {
			if (crypto_shash_update(shash, page, PAGE_SIZE)) {
				hb_err("failed to update shash\n");
				goto free_vmem;
			}
		}
		/* Write the resulting digest to hash. */
		if (crypto_shash_final(shash, hash)) {
			hb_err("failed to finalize shash\n");
			goto free_vmem;
		}
	}

	/* Total elapsed time across all iterations. */
	ktime_t delta = ktime_sub(ktime_get(), start);
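The same init/update/final pattern can be tried outside the kernel. The following is a rough user-space sketch, not the hashbench module: it swaps in 64-bit FNV-1a as a stand-in hash (it is neither crc32c nor xxhash64), uses `clock()` CPU time in place of `ktime_get()`, and assumes a 4096-byte page. The names `SKETCH_PAGE_SIZE`, `struct fnv1a64` and `bench_pages` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* Stand-in streaming hash: 64-bit FNV-1a. */
struct fnv1a64 {
	uint64_t state;
};

static void fnv1a64_init(struct fnv1a64 *h)
{
	h->state = 0xcbf29ce484222325ULL;	/* FNV-1a 64-bit offset basis */
}

static void fnv1a64_update(struct fnv1a64 *h, const uint8_t *data, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		h->state ^= data[i];
		h->state *= 0x100000001b3ULL;	/* FNV 64-bit prime */
	}
}

static uint64_t fnv1a64_final(const struct fnv1a64 *h)
{
	return h->state;
}

/* Hash `pages` pages, one update call per page, `iters` times over.
 * Stores the last digest in *digest and returns the elapsed CPU time
 * in nanoseconds as reported by clock(). */
static uint64_t bench_pages(const uint8_t *buf, size_t pages, size_t iters,
			    uint64_t *digest)
{
	struct fnv1a64 h;
	clock_t start, end;

	assert(pages > 0 && iters > 0);
	start = clock();
	for (size_t i = 0; i < iters; i++) {
		fnv1a64_init(&h);
		for (size_t p = 0; p < pages; p++)
			fnv1a64_update(&h, buf + p * SKETCH_PAGE_SIZE,
				       SKETCH_PAGE_SIZE);
		*digest = fnv1a64_final(&h);
	}
	end = clock();
	return (uint64_t)((double)(end - start) * 1e9 / CLOCKS_PER_SEC);
}
```

Dividing the returned value by `pages * iters` gives a ns/page figure comparable in form to the log lines, though the absolute numbers say nothing about the kernel drivers.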