Optimisation: Replace linear searches with binary searches for more consistent performance. #138

willvale · 2025-12-23T08:57:13Z

In a middle-sized story in a release build, I was seeing good performance at one end (100us or so per getline) vs. bad performance at the other end (300us). This change puts getline under 100us for the entire story.

Define file format with C/C++ structs rather than code.
Describe file sections in header so we don't need to scan the whole thing on load.
Remove vestiges of Endian-swapping and make read_list_flag a free function.
Bump format version.
Fix UTF-8 test (needed to use prefix).
Rename compiler's _containers stream to _instructions since that's what it stores.
Remove iterate_containers, add find_container_for, find_container_id, container_data and container_offset.
Implement find_container_for and find_offset_for with upper_bound.
Store expanded information about containers (container_data) including tree structure.
Rewrite jump using new toolkit - update ip, unwind stack, then generate new stack with search/tree walk.
Move a little bit of container entry logic out of globals_impl::visit.
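As a rough illustration of the upper_bound lookup and tree walk mentioned in the list above, here is a minimal sketch. It is not the code from this PR: `container_entry`, `no_container` and `innermost_container_at` are hypothetical names, and it assumes containers are stored sorted by start offset, are properly nested, and that an enclosing container sorts before the containers inside it.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical sentinel meaning "no container" / "no parent".
constexpr std::uint32_t no_container = ~std::uint32_t(0);

// Hypothetical per-container record, sorted by start offset
// (an enclosing container sorts before the containers nested in it).
struct container_entry {
    std::uint32_t start;   // offset of the first instruction in the container
    std::uint32_t end;     // one past the last instruction
    std::uint32_t parent;  // index of the enclosing container, or no_container
};

// Returns the index of the innermost container covering `offset`,
// or no_container if nothing covers it.
inline std::uint32_t innermost_container_at(
    const std::vector<container_entry>& containers, std::uint32_t offset)
{
    // First entry that starts strictly after `offset`.
    auto it = std::upper_bound(
        containers.begin(), containers.end(), offset,
        [](std::uint32_t value, const container_entry& c) { return value < c.start; });
    if (it == containers.begin()) {
        return no_container;  // offset lies before every container
    }
    // Step back to the last entry that starts at or before `offset`...
    auto index = static_cast<std::uint32_t>((it - containers.begin()) - 1);
    // ...then walk up the tree until a container actually covers it.
    while (index != no_container && offset >= containers[index].end) {
        index = containers[index].parent;
    }
    return index;
}
```

The search costs O(log n) comparisons plus at most one step per nesting level, so the lookup time no longer depends on how far into the story the offset lies.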

NB: This might be too much change? My 2p is that the speed improvements are worthwhile and the changes to the file format get it into a better shape which is more clearly defined and easier to extend.

I spotted more avenues for optimisation, but I'm trying not to get too distracted by them :)

Define aligned sections in header so we don't need to parse the data on load. Use defined sized types for header and contents. Expand first field to u32 so we don't get any padding. Remove vestigial Endian-swapping. Add new container data section and remove runtime setup.
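A header along the lines described above (fixed-size types, a 32-bit first field, and sections located through the header rather than by scanning) might look roughly like the sketch below. The struct names, field names and section list are assumptions made for illustration, not the actual inkcpp format.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical section descriptor: where one block of data lives in the file.
struct section_ref {
    std::uint32_t offset;  // byte offset from the start of the file
    std::uint32_t size;    // size of the section in bytes
};

// Hypothetical header layout; fields and order are illustrative only.
struct file_header {
    std::uint32_t version;         // 32-bit first field, so no padding follows it
    section_ref   strings;
    section_ref   container_data;
    section_ref   instructions;
};

// Compile-time checks that the layout contains no hidden padding.
static_assert(sizeof(section_ref) == 8, "unexpected padding in section_ref");
static_assert(offsetof(file_header, strings) == 4, "unexpected padding after version");
static_assert(sizeof(file_header) == 28, "unexpected padding in file_header");

// Copies the header out of a raw buffer; returns false if the buffer is too small.
inline bool read_header(const unsigned char* data, std::size_t size, file_header& out)
{
    if (size < sizeof(file_header)) {
        return false;
    }
    std::memcpy(&out, data, sizeof(file_header));
    return true;
}
```

With a layout like this, loading reduces to one bounds check and one copy, and each section can then be reached directly from its recorded offset.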

Fix lists test (data was terminated early; postpone termination until writing out lists, and only do it if there are lists). Correctly truncate end of story, ignoring section padding. Fix descent parsing for container hash and accept potential duplicate entries (for now), as they won't hurt the search.

JBenda · 2025-12-23T21:22:30Z

First of all, thanks for your PR and your 2p.
Run-time optimisation has been low priority, but the project has reached a state where I appreciate work in this direction.

Luckily we have tests so I'm quite confident in your work.

I will take a deeper look soon.

Feel free to open an issue if you'd like to discuss further optimisation.

Please keep an eye on the feature/migration branch.

One change in it is that every comment now has a 32-bit payload. This allows easier code navigation for migration.
(Also, the node reconstruction will profit from your optimisation.)

willvale added 30 commits November 8, 2025 00:27

Small config changes to build without STL and Unreal

f791b36

Fix optional::emplace(), which wasn't marking the newly-constructed

8645d01

value as present. Replace unchecked access to _value with checked access for the various operators.

Verify sized types

00509f5

And define ptrdiff_t using decltype so it works for 32b and 64b arch.

Merge branch 'JBenda:master' into master

5ee26cd

Merge pull request #1 from willvale/master

65fdeba

Merge changes from master into optimisation

Structural optimisations

fce4b1e

Closer to working, added original jump back for comparisons.

9359507

Optimise/fix jump to visit all nested knots when we change path.

45471cf

Keep selective visit logic on the runner side.

Use ~0 to delimit containers.

7ce4729

More correctness about whether we have a root container. Generate sorted hash in compile step, remove from runtime.

Use explicit types in compiled data

461a652

And share with the runtime. Introduced container_data_t, container_hash_t, container_map_t and used to emit data. Still need to generate the container data at compile time.

Move read_list_flag out of header and make header private to story.

7baa748

Allow manual workflow dispatch.

6a44a0a

Allow manual dispatch of build workflow.

3958835

Fix traits for ubuntu?

b4f47e0

Fix definitions for line_type-based APIs.

4c56a70

This might be better with a high-level flag as well?

For real this time

23ffa04

(Template wasn't being instantiated in my build environment, d'oh)

Merge branch 'master' into optimisation

b2ea4e1

Fix warning - store command flags as integral.

30e05b6

The choice flags (5b) don't fit into the container flags space (4b), but that's OK because we never put them there.

Remove profiling code.

4d926c6

Fix non-deref of empty container.

4f33a1e

Fix remaining unit test

eb68bad

We do need to process the new container in jump() or we lose context about what kind of jump it is. In that case the normal flow could track a knot that we've tunnelled inside, losing the knot tags.

Fix UTF-8 test

afe3e1e

Use source path macro. Include globals.h so we can destruct the impl properly.

Rework jump slightly and fix tests.

88dc8d0

Try and do everything in one loop - cleaner - while traversing original stack. Honour the visit and knot options properly (according to tests, at least)

Fix restrictive numeric casts/assignments.

5f3a179

Allow type conversion on redefine if the casting matrix says the types have a common base. Allow bool->float conversion.

Clangformat fixes.

adefe3e

Merge branch 'JBenda:master' into master

97e7f4b

Fix knot visit counts

48bac0f

This only came up in my story, re-visiting a relatively simple knot, so it's odd that there's no test for it. It's a bug in inkcpp::master, but it was easier to fix it here after simplifying jump.

Merge branch 'fix-casts' into optimisation

01a04c2

JBenda and others added 12 commits November 28, 2025 12:11

build(Options) Add NO_EH and NO_RTTI as cmake options.

3f7f6f2

Fix a variety of cast problems

7106179

Allow conversions from bool to string. Added test case for allowed and forbidden casts. Added missing include to UTF8 test.

Merge branch 'JBenda:master' into fix-casts

c180251

Clangformat fixes

be2b6f7

Also removed debug trace left in by mistake.

Merge latest from fix_casts (and implicitly upstream)

918ebc3

Merge branch 'master' of https://github.com/willvale/inkcpp

75c247a

Merge branch 'JBenda:master' into master

c540ff7

Merge branch 'master' into optimisation

ecb7077

Emit list metadata in separate section and remove duplicate parsing.

e1ca8ba

Don't keep trying to terminate a stream which failed.

9d653ae

Run Clangformat on affected files.

31885d4

And set lines to LF.

Missed one.

46569c3
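For readers unfamiliar with the kind of bug the `optional::emplace()` commit above describes, here is a minimal sketch of an optional type whose `emplace()` marks the value as present and whose operators check before access. `tiny_optional` is a hypothetical class written for illustration; it is not inkcpp's implementation and only borrows the `_value` member name from the commit message.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <utility>

// Hypothetical minimal optional, for illustration only.
template <typename T>
class tiny_optional {
public:
    tiny_optional() : _empty(0) {}
    tiny_optional(const tiny_optional&) = delete;
    tiny_optional& operator=(const tiny_optional&) = delete;
    ~tiny_optional() { reset(); }

    bool has_value() const { return _has_value; }

    // Destroys any current value, constructs a new one in place,
    // and records that a value is now present.
    template <typename... Args>
    T& emplace(Args&&... args)
    {
        reset();
        ::new (static_cast<void*>(std::addressof(_value))) T(std::forward<Args>(args)...);
        _has_value = true;  // without this line the optional still looks empty
        return _value;
    }

    // Destroys the held value, if there is one.
    void reset()
    {
        if (_has_value) {
            _value.~T();
            _has_value = false;
        }
    }

    // Checked access: asserts instead of reading storage that holds no value.
    T& operator*() { assert(_has_value); return _value; }
    T* operator->() { assert(_has_value); return std::addressof(_value); }

private:
    union {
        T             _value;  // only alive while _has_value is true
        unsigned char _empty;  // active member when there is no value
    };
    bool _has_value = false;
};
```

If `emplace()` forgot to set the flag, `has_value()` would keep returning false after a successful construction and the destructor would never run for the stored object, which is why the checked accessors are useful for catching this class of mistake early.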
