Allow AP_SAT, AP_RND for 'maximum' precision in HLS Config#1422

Open
morunner wants to merge 2 commits into fastmachinelearning:main from morunner:allow-sat-rnd-in-max-precision

Conversation

@morunner

Description

During the development of a GravNet model for hls4ml, I found that setting rounding and saturation modes for the maximum allowed precision can be beneficial for accuracy. Below you can see histograms and the mean difference with one standard deviation when the maximum precision is ap_fixed<16,8,AP_RND,AP_SAT,0> and when rounding and saturation are not enabled (ap_fixed<16,8>).

In the current upstream main branch of hls4ml, rounding and saturation modes set through the 'maximum' field in the HLS config are ignored during precision inference (see e.g. here). I thus propose a single function _apply_max_precision_constraints to be applied where necessary in the infer_precision.py module, which adheres to the following rules:

  • Width/Integer: Always constrained to the minimum of inferred vs max.
  • Rounding/Saturation: Inherited from max_precision ONLY if they differ from the defaults
    (meaning the user likely set them explicitly).
  • Signedness: max_precision signed arg is always preferred.
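As an illustration, the ruleset could be sketched roughly as below. The `Fixed` dataclass is a hypothetical stand-in for hls4ml's internal precision types (field names are illustrative, not the actual hls4ml API), and AP_TRN/AP_WRAP are the ap_fixed defaults; the real helper in infer_precision.py operates on the actual precision classes.

```python
from dataclasses import dataclass


@dataclass
class Fixed:
    """Hypothetical stand-in for an hls4ml fixed-point precision type."""
    width: int
    integer: int
    signed: bool = True
    rounding: str = 'AP_TRN'     # ap_fixed default rounding mode
    saturation: str = 'AP_WRAP'  # ap_fixed default overflow mode


def apply_max_precision_constraints(inferred: Fixed, max_prec: Fixed) -> Fixed:
    """Sketch of the proposed rules: clamp width/integer to the maximum,
    inherit RND/SAT only when they differ from the defaults (i.e. the user
    likely set them explicitly), and always take signedness from the max."""
    return Fixed(
        width=min(inferred.width, max_prec.width),
        integer=min(inferred.integer, max_prec.integer),
        signed=max_prec.signed,
        rounding=(max_prec.rounding
                  if max_prec.rounding != 'AP_TRN' else inferred.rounding),
        saturation=(max_prec.saturation
                    if max_prec.saturation != 'AP_WRAP' else inferred.saturation),
    )
```

With a maximum of ap_fixed<16,8,AP_RND,AP_SAT>, an inferred ap_fixed<24,12> would be clamped to ap_fixed<16,8,AP_RND,AP_SAT>; with a plain ap_fixed<16,8> maximum, the inferred rounding and saturation modes are left untouched.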

We can of course discuss what the preferred ruleset should be here.

No additional dependencies are required for this change.

Type of change

  • Other (Accuracy Improvement)

Tests

Pytest

Added a new pytest module, test_max_precision.py, which tests the newly added _apply_max_precision_constraints function both in isolation and within the _infer_precision function, using mocks.

Conversion to HLS

Ran the full Jupyter notebook for GravNet Keras conversion to HLS at hls4ml-gravnet (Link) to generate the plots listed below, with the proposed change enabled and disabled. The profiling section was run with this fix applied. We currently do not provide the fully trained model open-source, since it is not finalized. @bo3z please contact me directly, also regarding the dataset, if needed.

Checklist

  • I have read the guidelines for contributing.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation. -> tbd depending on the ruleset for RND, SAT
  • My changes generate no new warnings.
  • I have installed and run pre-commit on the files I edited or added.
  • I have added tests that prove my fix is effective or that my feature works.

GravNet plots showing accuracies and bias across layers

With and without rounding and saturation enabled for the maximum precision.

[Plots: gravnet_accuracy_with_rnd_sat_dense_params, gravnet_accuracy_without_rnd_sat_dense_params, gravnet_bias_with_rnd_sat_dense_params, gravnet_bias_without_rnd_sat_dense_params]

@morunner force-pushed the allow-sat-rnd-in-max-precision branch from 5d8237f to 55ed9c0 on January 23, 2026 at 11:16
@bo3z bo3z self-assigned this Jan 30, 2026
@jmitrevs
Contributor

jmitrevs commented Jan 30, 2026

Generally, setting the maximum precision is not something we recommend using much, I don't think. It's better to either quantize the values in the training, or if doing PTQ, explicitly set certain widths to more reasonable values in the configuration. The maximum width is not granular enough for that. One can see what width one gets without the maximum setting and modify the configuration till it is satisfactory.

Also, rounding and saturation for the accumulator often make the accumulation much slower. It is better to keep it wider and if needed, use saturation and rounding in the activation step right after it, where its cost is insignificant. This more fine-grained way of doing things is recommended instead of using the maximum precision.
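The fine-grained alternative described above can be sketched as an hls4ml per-layer config override. This is a hedged sketch: the layer names and exact precision strings are illustrative, following the usual `'LayerName'` override pattern in hls4ml configs, not values taken from this PR.

```python
# Sketch: keep the accumulator wide with the cheap default AP_TRN/AP_WRAP
# behavior, and apply AP_RND/AP_SAT only at the activation right after the
# dense layer, where their cost is insignificant. Layer names are examples.
config = {
    'Model': {'Precision': 'ap_fixed<16,6>', 'ReuseFactor': 1},
    'LayerName': {
        'dense1': {
            # wide accumulator, no rounding/saturation in the accumulation loop
            'Precision': {'accum': 'ap_fixed<32,16>'},
        },
        'dense1_relu': {
            # narrow output, rounded and saturated only at this single step
            'Precision': {'result': 'ap_fixed<16,8,AP_RND,AP_SAT>'},
        },
    },
}
```

The point of the design is that saturation logic then sits on one output assignment per layer instead of inside every multiply-accumulate iteration.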

@bo3z
Contributor

bo3z commented Feb 2, 2026

@morunner do you have some results (in terms of resource usage) before and after this change? @jmitrevs and @calad0i mentioned in our last dev meeting that AP_SAT may not be the most resource-friendly for accumulators (due to the underlying implementation of the saturation operation). Most times, the recommended way is to simply increase the bit width of the variable.

If that's the case based on your results, we should keep this PR open as a reference (in case someone is interested in using similar functionality), but not merge it.
