Conversation
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main     #102      +/-   ##
==========================================
- Coverage   97.24%   97.21%   -0.03%
==========================================
  Files           4        4
  Lines         435      431       -4
==========================================
- Hits          423      419       -4
  Misses         12       12
```
Any chance you can produce timings before and after the PR for an example, to show the difference locally?
It's a very contrived example, which uses the size of the L2 cache (because fitting it in the L1 cache wasn't as significant as expected, probably because of runtime dispatch / allocations), and it doesn't give that great a speedup. I think the best argument for doing this is that it generally should be better, and it would probably be made more effective with type stability and thus less runtime dispatch and fewer allocations, which I tried to improve with an extra function barrier in a new commit.

Benchmark on PR:

```julia
julia> @benchmark mul!(out, p, M) setup=(mdims=(4,32768,10); M=randn(ComplexF64, mdims); out=similar(M); p=plan_fft(out, 2))
BenchmarkTools.Trial: 101 samples with 1 evaluation per sample.
 Range (min … max):  35.603 ms … 46.353 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     36.979 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   37.236 ms ±  1.384 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▆▃▄▁▆ ▄▁▁▁▁█ ▆ ▆
  ▄▁▄▇█████▆██████▆█▁▄█▁▄▇▄▆▄▄▁▁▄▄▁▁▄▄▄▄▁▁▁▁▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▄ ▄
  35.6 ms         Histogram: frequency by time        41.8 ms <

 Memory estimate: 19.38 KiB, allocs estimate: 831.
```

Benchmark on main:

```julia
julia> @benchmark mul!(out, p, M) setup=(mdims=(4,32768,10); M=randn(ComplexF64, mdims); out=similar(M); p=plan_fft(out, 2))
BenchmarkTools.Trial: 70 samples with 1 evaluation per sample.
 Range (min … max):  58.159 ms … 64.638 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     58.988 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   59.194 ms ± 884.073 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃█ ▁ ▃▃ ▁ ▁ ▃▁ ▁
  ▄▄▄▄▄▄▇▁▇██▇█▇██▇▇▁█▄▄█▄██▄▄▇▁▁▄█▇▁▁▁▁▄▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▄ ▁
  58.2 ms         Histogram: frequency by time        61.2 ms <

 Memory estimate: 19.19 KiB, allocs estimate: 819.
```

The improvement is outside of the margin of error.

versioninfo:

```julia
julia> versioninfo()
Julia Version 1.12.4
Commit 01a2eadb04 (2026-01-06 16:56 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
  WORD_SIZE: 64
  LLVM: libLLVM-18.1.7 (ORCJIT, tigerlake)
  GC: Built with stock GC
Threads: 8 default, 1 interactive, 8 GC (on 8 virtual cores)
Environment:
  JULIA_CONDAPKG_BACKEND = Null
  JULIA_DEPOT_PATH = Q:\.julia;
  JULIA_NUM_THREADS = auto
```

edit: actually the size of the first two dimensions of
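The "extra function barrier" mentioned above is a standard Julia pattern for containing runtime dispatch. A minimal sketch of the pattern (the names `sum_columns` and `_sum_columns` are illustrative, not from this PR):

```julia
# Function-barrier pattern: when a value's concrete type is only known at
# runtime, do the type-unstable access once in an outer function and hand the
# result to an inner kernel, which the compiler specializes for the concrete
# type it receives.

# Outer function: `data` is a Dict{String,Any}, so `v` has an abstract
# inferred type here and the hot loop would dispatch on every iteration
# if it lived in this function.
function sum_columns(data)
    v = data["matrix"]        # type-unstable access
    return _sum_columns(v)    # barrier: dispatch happens once, here
end

# Inner kernel: `A` is concretely typed inside this function, so the loop
# compiles to specialized code with no per-iteration dispatch or allocation.
function _sum_columns(A::AbstractMatrix)
    s = zero(eltype(A))
    for j in axes(A, 2), i in axes(A, 1)   # column-major order
        s += A[i, j]
    end
    return s
end

data = Dict{String,Any}("matrix" => [1.0 2.0; 3.0 4.0])
sum_columns(data)   # 10.0
```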
Nevertheless, it makes sense. I'd bet it's technically better still if you used 16384 (2^14) instead of 32768 (2^15), just because we have a better power-of-four FFT, so the looping/memory-read bottleneck would matter more relative to arithmetic throughput. Obviously further contrived; I just wanted to see a bit of evidence that the memory throughput doesn't get washed out by the compiler or something.
dannys4 left a comment:
Overall, thanks for cleaning up the loops. It reads much better now. I'll approve as soon as gitignore is cleaned up.
This reverts commit e9410af.
Although imo it's a good idea to leave the
Ah I see, didn't consider that. I'll save it for another day.
As I understand it, it's generally slightly more performant to iterate so that the leftmost index varies fastest, since Julia is column-major, although there's going to be limited benefit here as these views are strided in memory. If there isn't a reason for the current iteration order, this order is a little bit better.
edit: oops, sticky fingers, accidentally closed this
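The column-major point can be illustrated with a small sketch (the function names are illustrative, not from the package):

```julia
# Julia stores arrays column-major: A[1,1], A[2,1], A[3,1], ... are adjacent
# in memory. Making the leftmost index the innermost loop variable walks
# memory sequentially; swapping the loop order strides through memory and
# wastes cache bandwidth on larger arrays.

function fill_fast!(A)
    for j in axes(A, 2), i in axes(A, 1)   # i innermost: contiguous access
        A[i, j] = i + j
    end
    return A
end

function fill_slow!(A)
    for i in axes(A, 1), j in axes(A, 2)   # j innermost: strided access
        A[i, j] = i + j
    end
    return A
end

# Both produce identical results; only the memory access pattern differs.
fill_fast!(zeros(3, 3)) == fill_slow!(zeros(3, 3))   # true
```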