
Conversation

@doux-jy doux-jy commented Nov 28, 2025

Fix Issue #97: Use Heavy-Tailed Distribution for MSELoss to Prevent Partial Computation from Passing Verification

Summary

Replace the uniform distribution with a Pareto distribution in the input generation of level1/94_MSELoss.py, so that incorrect kernel implementations that compute the loss over only part of the data no longer pass accuracy verification.

Problem

The original implementation uses a uniform distribution for test data generation:

def get_inputs():
    scale = torch.rand(())
    # batch_size and input_shape are module-level constants in 94_MSELoss.py
    return [torch.rand(batch_size, *input_shape) * scale, torch.rand(batch_size, *input_shape)]

Because the uniform distribution has a finite second moment, the Law of Large Numbers guarantees that the sample MSE converges to the same expected value $(2s^2 - 3s + 2)/6$ whether it is averaged over all elements or only a large subset. A faulty kernel implementation (e.g., one that computes only part of the data) therefore produces nearly the same loss and passes accuracy verification.
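For reference, writing the two inputs as $sU$ and $V$ with independent $U, V \sim \mathrm{Uniform}(0,1)$, the per-element expectation is

$$
\mathbb{E}\big[(sU - V)^2\big]
= s^2\,\mathbb{E}[U^2] - 2s\,\mathbb{E}[U]\,\mathbb{E}[V] + \mathbb{E}[V^2]
= \frac{s^2}{3} - \frac{s}{2} + \frac{1}{3}
= \frac{2s^2 - 3s + 2}{6},
$$

so the average over any sufficiently large subset of elements lands near the same value.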

Solution

Adopt a Pareto distribution, Pareto(scale=0.01, alpha=1.5) (or another heavy-tailed distribution with a divergent second moment):

  • Bounded first moment (α = 1.5 > 1): keeps the expected input magnitude finite, ensuring numerical stability
  • Divergent second moment (α = 1.5 ≤ 2): the empirical MSE grows with the number of elements reduced, rather than converging

This ensures that implementations reducing over different numbers of elements produce significantly different losses, so faulty kernel implementations are correctly detected; a sketch of the updated input generation follows below.
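A minimal sketch of what the updated get_inputs could look like, assuming torch.distributions.Pareto and illustrative values for the module-level batch_size and input_shape constants (the actual constants and diff in 94_MSELoss.py may differ):

import torch

batch_size = 128          # illustrative placeholder; 94_MSELoss.py defines its own value
input_shape = (4096,)     # illustrative placeholder

def get_inputs():
    # Pareto(scale=0.01, alpha=1.5): finite mean, infinite second moment, so the
    # empirical MSE depends on how many elements the kernel actually reduces over.
    pareto = torch.distributions.Pareto(torch.tensor(0.01), torch.tensor(1.5))
    scale = torch.rand(())
    return [pareto.sample((batch_size, *input_shape)) * scale,
            pareto.sample((batch_size, *input_shape))]

With these inputs, a kernel that averages only a fraction of the elements typically produces a loss far from the reference value, whereas with uniform inputs the two results were nearly indistinguishable.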

Changed Files

  • KernelBench/level1/94_MSELoss.py

@doux-jy doux-jy marked this pull request as draft November 28, 2025 08:44
@doux-jy doux-jy marked this pull request as ready for review November 28, 2025 08:45