Replace internal maps with unsorted Vecs #192
Conversation
Note: despite the claimed advantages, I would 100% understand if this was rejected to avoid having to review and maintain the hand-rolled map implementation. But I was curious how much it would help w.r.t. binary size, and once I had implemented it, I figured I might as well write a proper commit message and submit a PR in case it's of interest.
Force-pushed from ff75628 to 1a3150b
Looking at it, it seems to be going in a beneficial direction. Though I'd still have a few suggestions, if I may:
I started out by modifying the places using collection APIs to open-code …

I can add some basic tests once the API surface is settled.

That was actually my first instinct, but my initial attempt at it became too hairy halfway through (not only due to open-coding). Then I took a step back and came to the conclusion that it's an unnecessary complication for no performance gain (see the analysis in the commit message). But maybe the scales tip the other way if the API surface is trimmed down. On the other hand, if the entry API goes away, there's less need for the index-based …
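For context on the trade-off being discussed, here is a rough sketch (hypothetical stand-in types, not the actual signal-hook code) of how an insert-if-absent reads with the map entry API versus open-coded against an unsorted Vec:

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins for illustration; not the PR's actual types.
type ActionId = u64;
struct Action;

// With a map, "insert unless already present" is a one-liner via the entry API.
fn register_map(map: &mut BTreeMap<ActionId, Action>, id: ActionId, action: Action) {
    map.entry(id).or_insert(action);
}

// Open-coded against an unsorted Vec, the same operation needs an explicit scan;
// repeating this pattern at every call site is where it can start to get hairy.
fn register_vec(actions: &mut Vec<(ActionId, Action)>, id: ActionId, action: Action) {
    if !actions.iter().any(|(existing, _)| *existing == id) {
        actions.push((id, action));
    }
}

fn main() {
    let mut map = BTreeMap::new();
    register_map(&mut map, 1, Action);

    let mut actions = Vec::new();
    register_vec(&mut actions, 1, Action);
    register_vec(&mut actions, 1, Action); // duplicate id, second call is a no-op
    assert_eq!(actions.len(), 1);
}
```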
Force-pushed from 97356b0 to 4d1be65
Updates:
Looking at it, I'd say the sorted version adds a lot of complexity to the code that's probably not worth it. So, my proposal: can you rebase (the CI changes were merged to master), remove the last commit, and add some tests, so we can merge it?
HashMap and BTreeMap are overkill in this context. Unsorted vectors are plenty fast enough and the necessary collection interfaces are straightforward to implement. This change has two benefits.

First, it improves binary size. For the `print` example from signal-hook in release mode, the .text section shrinks by about 18 KiB and the overall file size shrinks by about 30 KiB. That's roughly a 6% reduction in both metrics.

Second, the simpler data structures make it more obvious that the signal handler only does async-signal-safe operations. In particular, the default HashMap has a `RandomState`, which can access TLS, do dlsym lookups, open and read from files, etc. depending on the platform. I don't think that's a problem for the hash table *lookup* done in the signal handler, since that shouldn't touch the `RandomState`, but it's a bit subtle and the standard library doesn't make any guarantees about this. Avoiding hash maps entirely removes the need to think about it.

Performance notes:

* (Un-)registering actions does an insert/remove by ActionId, which is asymptotically slower with this PR. However, (un-)registering is a slow operation and should be done rarely. Besides locking, it always clones the entire `SignalData`, so it already takes O(n) time when there are n actions registered across all signals.
* The signal handler looks up the `Slot` by signal number, which is asymptotically slower with this PR. However, there's only a very small constant number of signals, so asymptotics don't matter.
* After looking up the right `Slot`, the signal handler only iterates sequentially over the actions; it doesn't do any lookups by ActionId.
* For a simple microbenchmark that registers one action each for 20 signals and then raises one signal 100k times, this implementation appears to be slightly *faster* regardless of which signal is raised.
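To make the "straightforward to implement" claim concrete, here is a minimal sketch of the general idea behind a Vec-backed map. The names (`VecMap`, `entries`) are hypothetical stand-ins, not the PR's actual types:

```rust
// A minimal, illustrative Vec-backed map; not the PR's actual implementation.
#[derive(Clone, Default)]
struct VecMap<K, V> {
    entries: Vec<(K, V)>,
}

impl<K: PartialEq, V> VecMap<K, V> {
    /// Linear-scan lookup; fine when the number of entries is tiny
    /// (a handful of signals or actions in practice).
    fn get(&self, key: &K) -> Option<&V> {
        self.entries.iter().find(|(k, _)| k == key).map(|(_, v)| v)
    }

    /// Insert or replace. O(n), but registering already clones the whole
    /// structure anyway, so the extra scan doesn't change the asymptotics.
    fn insert(&mut self, key: K, value: V) {
        match self.entries.iter_mut().find(|(k, _)| *k == key) {
            Some(entry) => entry.1 = value,
            None => self.entries.push((key, value)),
        }
    }

    /// Remove by key. `swap_remove` is fine because order isn't maintained.
    fn remove(&mut self, key: &K) -> Option<V> {
        let idx = self.entries.iter().position(|(k, _)| k == key)?;
        Some(self.entries.swap_remove(idx).1)
    }

    /// The hot path (the signal handler) only needs sequential iteration.
    fn iter(&self) -> impl Iterator<Item = (&K, &V)> {
        self.entries.iter().map(|(k, v)| (k, v))
    }
}

fn main() {
    let mut map: VecMap<i32, &str> = VecMap::default();
    map.insert(2, "SIGINT action");
    map.insert(15, "SIGTERM action");
    assert_eq!(map.get(&2), Some(&"SIGINT action"));
    assert_eq!(map.iter().count(), 2);
    assert_eq!(map.remove(&2), Some("SIGINT action"));
    assert!(map.get(&2).is_none());
}
```

Registering and un-registering go through `insert`/`remove`, which are linear scans, while the handler's hot path is just the sequential iteration, matching the performance notes above.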
Force-pushed from 3231514 to 16cdbf8
Done.
Force-pushed from 16cdbf8 to c6bb7ff
A CI job appears to be stuck in a signal-hook-async-std doctest: https://github.com/vorner/signal-hook/actions/runs/19989264522/job/57327601452?pr=192#step:5:453

I can reproduce this locally by running …
Force-pushed from c6bb7ff to 0cae452
(Pushed a no-op …