FIPS202/x86_64: Don't use native x4 backend on x86_64 when not needed#1032

Merged
mkannwischer merged 1 commit into main from no-unneeded-x64-x4-backend on Apr 12, 2026

flynd commented Apr 8, 2026

Match the AArch64 behavior and skip the native Keccak-f1600x4 backend when MLD_CONFIG_SERIAL_FIPS202_ONLY or MLD_CONFIG_REDUCE_RAM is set.

This change was part of PR #1000 but doesn't need to be part of that stack.

flynd requested a review from a team as a code owner April 8, 2026 10:12

oqs-bot commented Apr 8, 2026

CBMC Results (ML-DSA-87)

⚠️ Attention Required

| Proof | Status | Current | Previous | Change |
|---|---|---|---|---|
| mld_attempt_signature_generation | ⚠️ | 204s | 89s | +129% |
| poly_pointwise_montgomery_c | ⚠️ | 255s | 139s | +83% |

Full Results (186 proofs)
Proof Status Current Previous Change
**TOTAL** 2472s 2129s +16.1%
poly_pointwise_montgomery_c ⚠️ 255s 139s +83%
polyvec_matrix_expand 236s 209s +13%
mld_attempt_signature_generation ⚠️ 204s 89s +129%
polyvecl_pointwise_acc_montgomery_c 196s 185s +6%
rej_uniform_native 147s 136s +8%
sign_verify_internal 113s 136s -17%
mld_invntt_layer 104s 90s +16%
polyvec_matrix_expand_serial 86s 82s +5%
mld_ct_memcmp 74s 69s +7%
mld_ntt_layer 56s 52s +8%
sign_signature_internal 38s 38s +0%
polymat_permute_bitrev_to_custom 30s 26s +15%
mld_compute_t0_t1_tr_from_sk_components 24s 24s +0%
fqmul 22s 18s +22%
poly_uniform_eta_4x 21s 15s +40%
rej_uniform 20s 20s +0%
poly_chknorm_c 19s 19s +0%
polyeta_unpack 17s 17s +0%
poly_uniform_4x 16s 17s -6%
polyt0_unpack 16s 13s +23%
polyveck_power2round 16s 18s -11%
polyveck_decompose 14s 12s +17%
rej_uniform_c 14s 13s +8%
mld_polyvecl_permute_bitrev_to_custom_native 13s 11s +18%
keccakf1600x4_permute_native 12s 15s -20%
mld_check_pct 12s 12s +0%
mld_ntt_butterfly_block 12s 11s +9%
keccak_absorb_once_x4 11s 11s +0%
poly_invntt_tomont_c 11s 7s +57%
polyvecl_ntt 11s 7s +57%
mld_sample_s1_s2_serial 10s 7s +43%
poly_add 10s 9s +11%
polyveck_use_hint 10s 11s -9%
polyz_unpack_c 10s 9s +11%
keccakf1600_permute_native 9s 8s +12%
pointwise_acc_native_x86_64 9s 10s -10%
polyveck_ntt 9s 7s +29%
unpack_sk 9s 9s +0%
keccakf1600_permute 8s 7s +14%
poly_decompose_c 8s 8s +0%
polyvec_matrix_pointwise_montgomery 8s 8s +0%
polyveck_add 8s 10s -20%
polyveck_invntt_tomont 8s 7s +14%
polyveck_pointwise_poly_montgomery 8s 9s -11%
polyveck_shiftl 8s 7s +14%
sign_pk_from_sk 8s 7s +14%
keccak_absorb 7s 7s +0%
mld_sample_s1_s2 7s 5s +40%
polyveck_reduce 7s 7s +0%
polyveck_sub 7s 6s +17%
sign_verify_pre_hash_shake256 7s 3s +133%
unpack_pk 7s 5s +40%
keccak_squeezeblocks_x4 6s 8s -25%
mld_prepare_domain_separation_prefix 6s 2s +200%
pointwise_acc_native_aarch64 6s 7s -14%
poly_challenge 6s 4s +50%
poly_ntt_c 6s 3s +100%
poly_power2round 6s 6s +0%
poly_reduce 6s 3s +100%
polyt0_pack 6s 3s +100%
polyveck_caddq 6s 10s -40%
polyvecl_chknorm 6s 7s -14%
rej_eta_c 6s 4s +50%
unpack_hints 6s 6s +0%
keccak_f1600_x1_native_aarch64_v84a 5s 4s +25%
keccak_squeeze 5s 2s +150%
ntt_native_aarch64 5s 3s +67%
poly_caddq_native 5s 4s +25%
poly_caddq_native_aarch64 5s 3s +67%
poly_chknorm 5s 5s +0%
poly_decompose 5s 3s +67%
polyveck_pack_t0 5s 3s +67%
polyveck_unpack_eta 5s 6s -17%
polyvecl_pointwise_acc_montgomery_native 5s 2s +150%
polyvecl_unpack_z 5s 3s +67%
sign 5s 5s +0%
sign_keypair 5s 3s +67%
sign_signature_pre_hash_shake256 5s 6s -17%
sign_verify 5s 6s -17%
sys_check_capability 5s 4s +25%
caddq 4s 2s +100%
keccakf1600_extract_bytes (big endian) 4s 2s +100%
keccakf1600_xor_bytes 4s 3s +33%
keccakf1600x4_permute 4s 2s +100%
make_hint 4s 5s -20%
mld_compute_pack_z 4s 6s -33%
mld_ct_cmask_neg_i32 4s 3s +33%
mld_ct_cmask_nonzero_u32 4s 2s +100%
mld_ct_cmask_nonzero_u8 4s 3s +33%
mld_h 4s 4s +0%
ntt_native_x86_64 4s 3s +33%
pack_sig_z 4s 3s +33%
pointwise_native_aarch64 4s 3s +33%
pointwise_native_x86_64 4s 5s -20%
poly_caddq 4s 3s +33%
poly_chknorm_native_aarch64 4s 3s +33%
poly_invntt_tomont 4s 3s +33%
poly_uniform_eta 4s 5s -20%
poly_uniform_gamma1 4s 5s -20%
poly_uniform_gamma1_4x 4s 4s +0%
poly_use_hint 4s 6s -33%
poly_use_hint_native 4s 3s +33%
polyt1_unpack 4s 3s +33%
polyveck_chknorm 4s 7s -43%
polyveck_pack_w1 4s 3s +33%
polyvecl_pack_eta 4s 3s +33%
polyvecl_unpack_eta 4s 5s -20%
polyw1_pack 4s 1s +300%
polyz_pack 4s 2s +100%
rej_eta_native 4s 5s -20%
shake256x4_absorb_once 4s 5s -20%
sign_keypair_internal 4s 5s -20%
sign_open 4s 5s -20%
sign_signature 4s 5s -20%
sign_signature_pre_hash_internal 4s 5s -20%
sign_verify_pre_hash_internal 4s 6s -33%
intt_native_x86_64 3s 3s +0%
keccak_f1600_x4_native_aarch64_v84a 3s 1s +200%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 3s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 3s 1s +200%
keccak_finalize 3s 3s +0%
keccak_init 3s 3s +0%
keccakf1600_xor_bytes (big endian) 3s 3s +0%
keccakf1600x4_xor_bytes 3s 2s +50%
mld_ct_get_optblocker_i64 3s 3s +0%
mld_ct_get_optblocker_u32 3s 1s +200%
mld_ct_get_optblocker_u8 3s 1s +200%
mld_ct_sel_int32 3s 2s +50%
mld_keccakf1600_extract_bytes 3s 5s -40%
mld_value_barrier_u8 3s 3s +0%
pack_sig_h_poly 3s 3s +0%
poly_caddq_c 3s 5s -40%
poly_chknorm_native 3s 3s +0%
poly_decompose_native 3s 3s +0%
poly_ntt_native 3s 2s +50%
poly_pointwise_montgomery 3s 3s +0%
poly_pointwise_montgomery_native 3s 3s +0%
poly_uniform 3s 4s -25%
poly_use_hint_c 3s 3s +0%
polyeta_pack 3s 3s +0%
polyt1_pack 3s 3s +0%
polyveck_unpack_t0 3s 3s +0%
polyvecl_permute_bitrev_to_custom 3s 4s -25%
polyvecl_pointwise_acc_montgomery 3s 2s +50%
polyvecl_uniform_gamma1 3s 4s -25%
power2round 3s 2s +50%
reduce32 3s 2s +50%
shake128_init 3s 2s +50%
shake128_squeeze 3s 1s +200%
shake128x4_squeezeblocks 3s 2s +50%
shake256 3s 1s +200%
shake256_absorb 3s 2s +50%
shake256_init 3s 2s +50%
sign_signature_extmu 3s 3s +0%
use_hint 3s 3s +0%
keccak_f1600_x1_native_aarch64 2s 2s +0%
keccakf1600x4_extract_bytes 2s 3s -33%
mld_ct_abs_i32 2s 2s +0%
mld_value_barrier_i64 2s 1s +100%
mld_value_barrier_u32 2s 2s +0%
pack_pk 2s 1s +100%
pack_sig_c 2s 3s -33%
pack_sk 2s 5s -60%
poly_invntt_tomont_native 2s 2s +0%
poly_make_hint 2s 4s -50%
poly_ntt 2s 3s -33%
poly_sub 2s 1s +100%
polyveck_pack_eta 2s 3s -33%
polyvecl_uniform_gamma1_serial 2s 4s -50%
polyz_unpack 2s 5s -60%
polyz_unpack_native 2s 2s +0%
rej_eta 2s 4s -50%
shake128_finalize 2s 2s +0%
shake128_release 2s 3s -33%
shake128x4_absorb_once 2s 1s +100%
shake256_finalize 2s 3s -33%
shake256_release 2s 1s +100%
shake256_squeeze 2s 2s +0%
shake256x4_squeezeblocks 2s 3s -33%
sign_verify_extmu 2s 4s -50%
unpack_sig 2s 4s -50%
decompose 1s 3s -67%
fqscale 1s 6s -83%
montgomery_reduce 1s 4s -75%
poly_shiftl 1s 2s -50%
shake128_absorb 1s 2s -50%

oqs-bot commented Apr 8, 2026

CBMC Results (ML-DSA-65)

⚠️ Attention Required

| Proof | Status | Current | Previous | Change |
|---|---|---|---|---|
| poly_pointwise_montgomery_c | ⚠️ | 243s | 146s | +66% |

Full Results (186 proofs)
Proof Status Current Previous Change
**TOTAL** 2443s 2252s +8.5%
sign_verify_internal 302s 272s +11%
polyvecl_pointwise_acc_montgomery_c 268s 234s +15%
poly_pointwise_montgomery_c ⚠️ 243s 146s +66%
rej_uniform_native 146s 139s +5%
mld_attempt_signature_generation 116s 105s +10%
polyvec_matrix_expand 107s 107s +0%
mld_invntt_layer 98s 92s +7%
polyvec_matrix_expand_serial 81s 80s +1%
mld_ct_memcmp 76s 69s +10%
mld_ntt_layer 53s 52s +2%
polymat_permute_bitrev_to_custom 33s 35s -6%
mld_compute_t0_t1_tr_from_sk_components 27s 25s +8%
polyveck_use_hint 25s 23s +9%
sign_signature_internal 24s 28s -14%
rej_uniform 22s 20s +10%
fqmul 19s 18s +6%
poly_chknorm_c 18s 21s -14%
polyveck_decompose 17s 15s +13%
poly_uniform_eta_4x 16s 16s +0%
poly_uniform_4x 15s 16s -6%
keccak_absorb_once_x4 13s 11s +18%
keccakf1600x4_permute_native 13s 13s +0%
mld_ntt_butterfly_block 13s 12s +8%
polyt0_unpack 13s 13s +0%
polyveck_power2round 13s 12s +8%
rej_uniform_c 13s 14s -7%
mld_check_pct 12s 11s +9%
polyvec_matrix_pointwise_montgomery 12s 12s +0%
polyveck_ntt 11s 10s +10%
poly_add 10s 12s -17%
poly_invntt_tomont_c 10s 10s +0%
polyveck_caddq 10s 9s +11%
polyveck_invntt_tomont 10s 7s +43%
keccakf1600_permute 9s 6s +50%
keccakf1600_permute_native 9s 10s -10%
polyveck_add 9s 8s +12%
polyveck_chknorm 9s 9s +0%
polyveck_pointwise_poly_montgomery 9s 9s +0%
pointwise_acc_native_x86_64 8s 7s +14%
polyveck_reduce 8s 6s +33%
sign_pk_from_sk 8s 6s +33%
unpack_sk 8s 11s -27%
mld_polyvecl_permute_bitrev_to_custom_native 7s 11s -36%
pointwise_acc_native_aarch64 7s 3s +133%
poly_decompose_c 7s 11s -36%
polyveck_shiftl 7s 6s +17%
rej_eta_c 7s 4s +75%
keccak_absorb 6s 7s -14%
keccakf1600_xor_bytes (big endian) 6s 3s +100%
mld_sample_s1_s2_serial 6s 6s +0%
ntt_native_x86_64 6s 5s +20%
poly_caddq 6s 5s +20%
poly_uniform_gamma1 6s 5s +20%
polyeta_unpack 6s 7s -14%
polyvecl_ntt 6s 7s -14%
polyvecl_pointwise_acc_montgomery 6s 4s +50%
sign 6s 7s -14%
sign_open 6s 5s +20%
intt_native_x86_64 5s 3s +67%
keccak_squeezeblocks_x4 5s 6s -17%
mld_compute_pack_z 5s 4s +25%
mld_sample_s1_s2 5s 6s -17%
pack_pk 5s 4s +25%
pointwise_native_x86_64 5s 5s +0%
poly_caddq_native 5s 4s +25%
poly_challenge 5s 4s +25%
poly_make_hint 5s 3s +67%
poly_power2round 5s 7s -29%
poly_sub 5s 2s +150%
poly_use_hint_c 5s 4s +25%
polyveck_pack_w1 5s 4s +25%
polyveck_sub 5s 6s -17%
polyvecl_pointwise_acc_montgomery_native 5s 3s +67%
polyz_pack 5s 3s +67%
polyz_unpack_native 5s 4s +25%
shake256x4_absorb_once 5s 3s +67%
sign_keypair_internal 5s 5s +0%
sign_signature_extmu 5s 2s +150%
caddq 4s 3s +33%
keccak_f1600_x1_native_aarch64_v84a 4s 3s +33%
keccakf1600x4_extract_bytes 4s 2s +100%
mld_ct_cmask_nonzero_u8 4s 3s +33%
mld_h 4s 5s -20%
ntt_native_aarch64 4s 4s +0%
pack_sk 4s 5s -20%
pointwise_native_aarch64 4s 2s +100%
poly_caddq_c 4s 4s +0%
poly_caddq_native_aarch64 4s 4s +0%
poly_ntt_c 4s 2s +100%
poly_pointwise_montgomery 4s 2s +100%
poly_pointwise_montgomery_native 4s 5s -20%
poly_reduce 4s 4s +0%
poly_shiftl 4s 3s +33%
poly_use_hint 4s 4s +0%
polyveck_pack_eta 4s 6s -33%
polyveck_pack_t0 4s 3s +33%
polyveck_unpack_eta 4s 4s +0%
polyveck_unpack_t0 4s 4s +0%
polyvecl_chknorm 4s 4s +0%
polyvecl_permute_bitrev_to_custom 4s 2s +100%
polyvecl_uniform_gamma1 4s 2s +100%
polyvecl_unpack_z 4s 4s +0%
reduce32 4s 3s +33%
rej_eta_native 4s 6s -33%
shake128_release 4s 4s +0%
shake256 4s 3s +33%
shake256_release 4s 5s -20%
shake256x4_squeezeblocks 4s 1s +300%
sign_keypair 4s 5s -20%
sign_signature_pre_hash_shake256 4s 2s +100%
sign_verify 4s 3s +33%
sign_verify_extmu 4s 5s -20%
unpack_hints 4s 5s -20%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 2s +50%
keccakf1600_extract_bytes (big endian) 3s 2s +50%
keccakf1600_xor_bytes 3s 3s +0%
keccakf1600x4_xor_bytes 3s 2s +50%
mld_ct_abs_i32 3s 2s +50%
mld_ct_cmask_nonzero_u32 3s 4s -25%
mld_ct_get_optblocker_u32 3s 3s +0%
mld_prepare_domain_separation_prefix 3s 4s -25%
pack_sig_c 3s 4s -25%
pack_sig_h_poly 3s 2s +50%
pack_sig_z 3s 4s -25%
poly_chknorm 3s 5s -40%
poly_chknorm_native_aarch64 3s 5s -40%
poly_decompose 3s 4s -25%
poly_decompose_native 3s 4s -25%
poly_invntt_tomont 3s 3s +0%
poly_ntt 3s 3s +0%
poly_uniform 3s 3s +0%
polyt0_pack 3s 3s +0%
polyvecl_uniform_gamma1_serial 3s 4s -25%
polyvecl_unpack_eta 3s 2s +50%
polyz_unpack 3s 3s +0%
polyz_unpack_c 3s 2s +50%
shake128x4_squeezeblocks 3s 3s +0%
shake256_absorb 3s 2s +50%
shake256_finalize 3s 1s +200%
shake256_init 3s 3s +0%
sign_signature 3s 5s -40%
sign_verify_pre_hash_internal 3s 6s -50%
sign_verify_pre_hash_shake256 3s 5s -40%
unpack_sig 3s 5s -40%
use_hint 3s 2s +50%
decompose 2s 5s -60%
keccak_f1600_x1_native_aarch64 2s 1s +100%
keccak_f1600_x4_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 2s +0%
keccak_finalize 2s 2s +0%
keccak_squeeze 2s 4s -50%
keccakf1600x4_permute 2s 2s +0%
make_hint 2s 2s +0%
mld_ct_cmask_neg_i32 2s 2s +0%
mld_ct_get_optblocker_i64 2s 2s +0%
mld_ct_get_optblocker_u8 2s 2s +0%
mld_keccakf1600_extract_bytes 2s 6s -67%
mld_value_barrier_u32 2s 2s +0%
mld_value_barrier_u8 2s 2s +0%
poly_chknorm_native 2s 3s -33%
poly_invntt_tomont_native 2s 3s -33%
poly_ntt_native 2s 3s -33%
poly_uniform_eta 2s 5s -60%
poly_uniform_gamma1_4x 2s 4s -50%
poly_use_hint_native 2s 2s +0%
polyeta_pack 2s 2s +0%
polyt1_pack 2s 3s -33%
polyvecl_pack_eta 2s 3s -33%
power2round 2s 3s -33%
shake128_absorb 2s 2s +0%
shake128_squeeze 2s 3s -33%
shake128x4_absorb_once 2s 3s -33%
sign_signature_pre_hash_internal 2s 4s -50%
unpack_pk 2s 4s -50%
fqscale 1s 3s -67%
keccak_init 1s 4s -75%
mld_ct_sel_int32 1s 1s +0%
mld_value_barrier_i64 1s 2s -50%
montgomery_reduce 1s 2s -50%
polyt1_unpack 1s 4s -75%
polyw1_pack 1s 2s -50%
rej_eta 1s 2s -50%
shake128_finalize 1s 3s -67%
shake128_init 1s 1s +0%
shake256_squeeze 1s 2s -50%
sys_check_capability 1s 3s -67%

oqs-bot commented Apr 8, 2026

CBMC Results (ML-DSA-44)

⚠️ Attention Required

| Proof | Status | Current | Previous | Change |
|---|---|---|---|---|
| mld_attempt_signature_generation | ⚠️ | 276s | 104s | +165% |
| poly_invntt_tomont_c | ⚠️ | 24s | 4s | +500% |

Full Results (186 proofs)
Proof Status Current Previous Change
**TOTAL** 2262s 2036s +11.1%
polyvecl_pointwise_acc_montgomery_c 345s 334s +3%
mld_attempt_signature_generation ⚠️ 276s 104s +165%
sign_verify_internal 160s 205s -22%
poly_pointwise_montgomery_c 156s 151s +3%
rej_uniform_native 143s 141s +1%
mld_invntt_layer 102s 84s +21%
mld_ct_memcmp 76s 70s +9%
mld_ntt_layer 53s 53s +0%
polymat_permute_bitrev_to_custom 29s 28s +4%
poly_invntt_tomont_c ⚠️ 24s 4s +500%
poly_chknorm_c 21s 19s +11%
polyvec_matrix_expand 21s 23s -9%
polyvec_matrix_expand_serial 21s 21s +0%
rej_uniform 21s 22s -5%
fqmul 19s 22s -14%
polyt0_unpack 18s 13s +38%
sign_signature_internal 18s 18s +0%
poly_uniform_eta_4x 17s 18s -6%
polyeta_unpack 17s 17s +0%
poly_uniform_4x 16s 16s +0%
rej_uniform_c 15s 14s +7%
keccakf1600x4_permute_native 14s 12s +17%
mld_compute_t0_t1_tr_from_sk_components 14s 10s +40%
mld_ntt_butterfly_block 14s 11s +27%
poly_add 14s 12s +17%
polyz_unpack_c 13s 10s +30%
keccak_absorb_once_x4 11s 10s +10%
mld_check_pct 10s 8s +25%
polyveck_use_hint 10s 6s +67%
keccak_absorb 9s 9s +0%
pointwise_acc_native_aarch64 9s 4s +125%
sign_pk_from_sk 9s 6s +50%
poly_power2round 8s 5s +60%
polyveck_decompose 8s 6s +33%
sign 8s 8s +0%
unpack_sk 8s 6s +33%
keccakf1600_permute 7s 8s -12%
keccakf1600_permute_native 7s 10s -30%
mld_compute_pack_z 7s 5s +40%
mld_polyvecl_permute_bitrev_to_custom_native 7s 9s -22%
polyvec_matrix_pointwise_montgomery 7s 9s -22%
polyveck_add 7s 7s +0%
rej_eta 7s 4s +75%
sign_keypair 7s 4s +75%
sign_signature_extmu 7s 5s +40%
sign_signature_pre_hash_internal 7s 5s +40%
keccak_squeezeblocks_x4 6s 5s +20%
mld_sample_s1_s2_serial 6s 4s +50%
pointwise_native_aarch64 6s 3s +100%
poly_caddq_c 6s 5s +20%
poly_use_hint_c 6s 5s +20%
poly_use_hint_native 6s 3s +100%
rej_eta_native 6s 4s +50%
unpack_hints 6s 5s +20%
keccakf1600_xor_bytes (big endian) 5s 2s +150%
keccakf1600x4_extract_bytes 5s 5s +0%
keccakf1600x4_xor_bytes 5s 1s +400%
mld_prepare_domain_separation_prefix 5s 4s +25%
mld_sample_s1_s2 5s 2s +150%
pointwise_acc_native_x86_64 5s 7s -29%
poly_caddq 5s 3s +67%
poly_challenge 5s 4s +25%
poly_decompose_c 5s 4s +25%
poly_ntt_c 5s 4s +25%
poly_pointwise_montgomery_native 5s 2s +150%
poly_reduce 5s 2s +150%
poly_uniform 5s 5s +0%
poly_uniform_eta 5s 3s +67%
polyveck_ntt 5s 4s +25%
polyveck_pack_eta 5s 3s +67%
polyveck_power2round 5s 5s +0%
polyvecl_pack_eta 5s 3s +67%
polyvecl_uniform_gamma1 5s 2s +150%
rej_eta_c 5s 5s +0%
shake128_init 5s 4s +25%
sign_open 5s 8s -38%
sign_signature 5s 4s +25%
sign_signature_pre_hash_shake256 5s 3s +67%
sign_verify_extmu 5s 3s +67%
sign_verify_pre_hash_internal 5s 5s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 4s 3s +33%
keccak_init 4s 2s +100%
mld_ct_abs_i32 4s 3s +33%
mld_value_barrier_u8 4s 2s +100%
ntt_native_aarch64 4s 3s +33%
ntt_native_x86_64 4s 2s +100%
pack_sig_z 4s 2s +100%
pointwise_native_x86_64 4s 3s +33%
poly_caddq_native_aarch64 4s 4s +0%
poly_decompose 4s 3s +33%
poly_ntt_native 4s 4s +0%
poly_uniform_gamma1 4s 3s +33%
poly_use_hint 4s 3s +33%
polyt0_pack 4s 5s -20%
polyt1_pack 4s 4s +0%
polyveck_caddq 4s 2s +100%
polyveck_pack_t0 4s 4s +0%
polyveck_pointwise_poly_montgomery 4s 3s +33%
polyveck_reduce 4s 4s +0%
polyveck_shiftl 4s 3s +33%
polyveck_sub 4s 4s +0%
polyveck_unpack_eta 4s 5s -20%
polyvecl_ntt 4s 6s -33%
shake128x4_absorb_once 4s 2s +100%
shake128x4_squeezeblocks 4s 5s -20%
sign_verify_pre_hash_shake256 4s 5s -20%
unpack_pk 4s 3s +33%
fqscale 3s 3s +0%
intt_native_x86_64 3s 4s -25%
keccak_f1600_x1_native_aarch64 3s 3s +0%
keccak_f1600_x1_native_aarch64_v84a 3s 3s +0%
keccak_finalize 3s 2s +50%
keccakf1600_xor_bytes 3s 2s +50%
mld_ct_get_optblocker_u8 3s 4s -25%
mld_ct_sel_int32 3s 3s +0%
mld_h 3s 4s -25%
mld_keccakf1600_extract_bytes 3s 2s +50%
poly_chknorm_native_aarch64 3s 2s +50%
poly_pointwise_montgomery 3s 2s +50%
poly_shiftl 3s 4s -25%
poly_uniform_gamma1_4x 3s 4s -25%
polyt1_unpack 3s 2s +50%
polyveck_chknorm 3s 4s -25%
polyveck_invntt_tomont 3s 3s +0%
polyveck_pack_w1 3s 3s +0%
polyveck_unpack_t0 3s 3s +0%
polyvecl_chknorm 3s 3s +0%
polyvecl_permute_bitrev_to_custom 3s 2s +50%
polyvecl_pointwise_acc_montgomery 3s 4s -25%
polyvecl_pointwise_acc_montgomery_native 3s 3s +0%
polyvecl_uniform_gamma1_serial 3s 2s +50%
polyvecl_unpack_z 3s 5s -40%
polyz_unpack 3s 3s +0%
polyz_unpack_native 3s 2s +50%
shake128_absorb 3s 2s +50%
shake256 3s 4s -25%
shake256_init 3s 3s +0%
sign_keypair_internal 3s 6s -50%
sign_verify 3s 8s -62%
unpack_sig 3s 4s -25%
use_hint 3s 2s +50%
caddq 2s 5s -60%
decompose 2s 1s +100%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 2s +0%
keccak_squeeze 2s 3s -33%
make_hint 2s 3s -33%
mld_ct_cmask_neg_i32 2s 2s +0%
mld_ct_cmask_nonzero_u32 2s 2s +0%
mld_ct_cmask_nonzero_u8 2s 4s -50%
mld_ct_get_optblocker_u32 2s 2s +0%
mld_value_barrier_u32 2s 3s -33%
montgomery_reduce 2s 3s -33%
pack_pk 2s 2s +0%
pack_sig_c 2s 7s -71%
pack_sig_h_poly 2s 5s -60%
poly_caddq_native 2s 4s -50%
poly_chknorm_native 2s 4s -50%
poly_decompose_native 2s 4s -50%
poly_invntt_tomont_native 2s 3s -33%
poly_make_hint 2s 4s -50%
poly_ntt 2s 3s -33%
poly_sub 2s 3s -33%
polyeta_pack 2s 4s -50%
polyvecl_unpack_eta 2s 3s -33%
polyw1_pack 2s 4s -50%
polyz_pack 2s 2s +0%
power2round 2s 2s +0%
shake128_release 2s 2s +0%
shake256_absorb 2s 2s +0%
shake256_finalize 2s 5s -60%
shake256_squeeze 2s 2s +0%
shake256x4_squeezeblocks 2s 1s +100%
sys_check_capability 2s 3s -33%
keccak_f1600_x4_native_aarch64_v84a 1s 4s -75%
keccakf1600_extract_bytes (big endian) 1s 4s -75%
keccakf1600x4_permute 1s 2s -50%
mld_ct_get_optblocker_i64 1s 1s +0%
mld_value_barrier_i64 1s 3s -67%
pack_sk 1s 3s -67%
poly_chknorm 1s 2s -50%
poly_invntt_tomont 1s 3s -67%
reduce32 1s 4s -75%
shake128_finalize 1s 4s -75%
shake128_squeeze 1s 3s -67%
shake256_release 1s 2s -50%
shake256x4_absorb_once 1s 3s -67%

mkannwischer commented

Thanks @flynd. Can you elaborate how this matches the AArch64 behavior? I don't see a matching guard in the AArch64 backend.

flynd commented Apr 8, 2026

> Thanks @flynd. Can you elaborate how this matches the AArch64 behavior? I don't see a matching guard in the AArch64 backend.

This was a commit you made (added to my stack in PR #1000) that I noticed didn't have any dependencies on that stack. I see now that its commit message refers to one of my commits, so it points at something that doesn't exist on master yet.

I will grab the relevant parts of my commit and add them to this PR to make it consistent.

flynd force-pushed the no-unneeded-x64-x4-backend branch from 0b8f3f2 to c29f36d (Compare), April 8, 2026 13:03
mkannwischer self-assigned this Apr 9, 2026

mkannwischer left a comment

Thanks @flynd! LGTM.

When building with MLD_CONFIG_SERIAL_FIPS202_ONLY, Keccak-f1600x2/x4
is not used and can be skipped.

Signed-off-by: Anders Sonmark <Anders.Sonmark@axis.com>

mkannwischer force-pushed the no-unneeded-x64-x4-backend branch from c29f36d to f773262 (Compare), April 12, 2026 07:15
mkannwischer merged commit 5dc4211 into main Apr 12, 2026
382 checks passed
mkannwischer deleted the no-unneeded-x64-x4-backend branch April 12, 2026 10:38