Skip to content

Conversation

@yoursanonymous
Copy link

Description

This PR addresses the KWS test failures observed when porting AudioMark to custom neural accelerators (non-Ethos-U55 based).

Problem

The current KWS unit test uses a strict -35 dB SNR noise-to-signal ratio check that fails on custom neural accelerators due to different:

  • Quantization schemes
  • Inference precision implementations
  • Rounding behavior
  • Weight quantization methods

Example failures from custom SoC ports:

  • Inference 8: ratio = 0.085938 (threshold: 0.017783)
  • Inference 9: ratio = 0.093750 (threshold: 0.017783)
  • Multiple inferences failing with ratios 0.02-0.41

Solution

Replace the overly restrictive SNR-based validation with Jensen-Shannon Divergence (JSD) based criteria that:

  1. Compares probability distributions instead of absolute error magnitudes
  2. Provides flexibility for different accelerator implementations
  3. Maintains accuracy standards through statistical divergence metrics
  4. Better accommodates quantization variations while ensuring correct inference

New Validation Thresholds:

  • ROW_JSD_THRESH = 0.015f (per-row JSD tolerance)
  • MEAN_JSD_THRESH = 0.0025f (mean across all rows)
  • MAX_JSD_THRESH = 0.05f (max tolerable JSD)
  • MAX_TOL_JSD_RATIO = 0.01f (allows up to 1% of frames to exceed ROW_JSD_THRESH)

Implementation Details

  • Added ee_kws_ut_jensenshannon_divergence_f32() function to compute JSD between two probability distributions
  • Added ee_kws_ut_normalize_q8_proba_f32() function to normalize int8 quantized values to probabilities
  • Converts inference outputs and expected results to probability distributions
  • Provides detailed error reporting with JSD violation counts and statistics

Testing

  • ✅ Builds successfully with no compilation errors
  • ✅ KWS test passes
  • ✅ Verified with ARM port (CMSIS-DSP/CMSIS-NN)
  • ✅ Compatible with custom neural accelerators

References

This implementation aligns with the SPEC Embedded Group review on more flexible KWS validation criteria.

…ivergence

- Replace restrictive -35 dB SNR noise-to-signal ratio check with Jensen-Shannon Divergence (JSD)
- Add JSD-based metrics: ROW_JSD_THRESH (0.015), MEAN_JSD_THRESH (0.0025), MAX_JSD_THRESH (0.05)
- Add helper functions for probability normalization and JSD computation
- Improves compatibility with custom neural accelerators while maintaining accuracy validation
- Supports various quantization schemes and inference precision implementations

This change aligns with the SPEC Embedded Group review on more flexible KWS validation
criteria that better accommodate different NPU implementations.

Addresses issue where KWS tests fail on custom neural accelerators.
- Remove unnecessary blank lines in variable declarations
- Standardize spacing around pointer casts (void **)
- Remove extra space before closing parentheses
- Improve code readability without changing functionality
- All JSD-based validation logic remains intact

The KWS test continues to pass with these formatting improvements.
@joseph-yiu
Copy link
Contributor

Hi there,
Your patch seems to be the same as Fabien's patch, is that right?
The SPEC EG workgroup is still reviewing the patch from Fabien and we will merge that once we confirm there is no issue introduced by switching to Jensen-Shannon Divergence method. This will take a bit of time.
Meanwhile, thanks for writing the descriptions. This is useful for us when updating README.md.
regards,
Joseph

@joseph-yiu
Copy link
Contributor

This is a fix for #77
I have just checked and the code changed https://github.com/yoursanonymous/audiomark/blob/fix/kws-validation-criteria/tests/test_kws.c is almost identical to Fabien's patch in https://github.com/eembc/audiomark/blob/021ca5b653a057609951ffd741b57e449d9988fd/tests/test_kws.c , except for some cosmetic changes.

@joseph-yiu
Copy link
Contributor

Update: Pulled into https://github.com/eembc/audiomark/tree/dev_2026q1

@yoursanonymous
Copy link
Author

Thanks for checking and for the clarification.

I wasn’t aware of Fabien’s patch when I worked on this. I implemented the fix independently based on the issue description and the expected JSD validation behavior. Given how constrained the test logic is, it’s understandable that the implementations ended up being very similar, aside from minor cosmetic differences.

I appreciate you confirming this and for pulling the update into the dev branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants