Skip to content

Conversation

@willvale
Copy link
Contributor

In a middle-sized story in a release build, I was seeing good performance at one end (100us or so per getline) vs. bad performance at the other end (300us). This change puts getline under 100us for the entire story.

  • Define file format with C/C++ structs rather than code.
  • Describe file sections in header so we don't need to scan the whole thing on load.
  • Remove vestiges of Endian-swapping and make read_list_flag a free function.
  • Bump format version.
  • Fix UTF-8 test (needed to use prefix)
  • Rename compiler's _containers stream to _instructions since that's what it stores.
  • Remove iterate_containers, add find_container_for, find_container_id, container_data and container_offset.
  • Implement find_container_for and find_offset_for with upper_bound.
  • Store expanded information about containers (container_data) including tree structure.
  • Rewrite jump using new toolkit - update ip, unwind stack, then generate new stack with search/tree walk.
  • Move little bit of container entry logic out of globals_impl::visit.

NB: This might be too much change? My 2p is that the speed improvements are worthwhile and the changes to the file format get it into a better shape which is more clearly defined and easier to extend.

I spotted more avenues for optimisation but trying not to get too distracted by them :)

value as present.
Replace unchecked access to _value with checked access for the various
operators.
And define ptrdiff_t using decltype so it works for 32b and 64b arch.
Merge changes from master into optimisation
Keep selective visit logic on the runner side.
More correctness about whether we have a root container.
Generate sorted hash in compile step, remove from runtime.
And share with the runtime. Introduced container_data_t, container_hash_t, container_map_t and used to emit data.

Still need to generate the container data at compile time.
Define aligned sections in header so we don't need to parse the data on load.
Use defined sized types for header and contents.
Expand first field to u32 so we don't get any padding.
Remove vestigial Endian-swapping.
Add new container data section and remove runtime setup.
This might be better with a high-level flag as well?
(Template wasn't being instantiated in my build environment, d'oh)
The choice flags (5b) don't fit into the container flags space (4b), but that's OK because we never put them there.
Fix lists test (data was terminated early, postpone termination until writing out lists, and only do it if there are lists)
Correctly truncate end of story ignoring section padding.
Fix descent parsing for container hash and accept potential duplicate entries (for now) as they won't hurt the search.
We do need to process the new container in jump() or we lose context about what kind of jump it is. In that case the normal flow could track a knot that we've tunnelled inside, losing the knot tags.
Use source path macro.
Include globals.h so we can destruct the impl properly.
Try and do everything in one loop - cleaner - while traversing original stack.
Honour the visit and knot options properly (according to tests, at least)
Allow type conversion on redefine if the casting matrix says the types have a common base.
Allow bool->float conversion.
This only came up in my story, re-visiting a relatively simple knot, so it's odd that there's no test for it.
It's a bug in inkcpp::master, but it was easier to fix it here after simplifying jump.
@JBenda
Copy link
Owner

JBenda commented Dec 23, 2025

At first thanks for your PR and 2p.
Run-time optimisation was low priority. But the project has reached a state where I appreciate work in this direction.

Luckily we have tests so I'm quite confident in your work.

I will take a deeper look soon.

Feel free to open an issue if you like to discuss further optimisation.

Please keep an eye on the feature/migration branch.

One change in it is that every comment now has a 32bit payload. This allows easier code navigation for migration.
(Also the node reconstruction will profit from your optimisation)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants