Skip to content

Conversation

@kripken
Copy link
Member

@kripken kripken commented May 6, 2025

No description provided.

@kripken
Copy link
Member Author

kripken commented May 6, 2025

Note: I have yet to confirm the binary format here interops with others, see WebAssembly/branch-hinting#31

@kripken kripken marked this pull request as draft May 7, 2025 15:41
@kripken
Copy link
Member Author

kripken commented May 7, 2025

Converting to draft as the order was wrong here: unlike DWARF, this section must appear before the code. I did that in the reader, but doing it in the writer will take some interesting plumbing...

@tlively
Copy link
Member

tlively commented May 7, 2025

Since we'll apparently have to assemble the output in chunks, maybe it's a good opportunity to investigate parallelizing binary writing for functions.

@kripken
Copy link
Member Author

kripken commented May 7, 2025

Yeah, that can make sense after #7575 which allows us to work on side buffers - we could emit functions in parallel ones.

@kripken kripken marked this pull request as ready for review May 7, 2025 18:54
@kripken
Copy link
Member Author

kripken commented May 7, 2025

This is now ready, but includes parts of #7575 which should land first.


auto funcIter = binaryLocations.functions.find(func);
assert(funcIter != binaryLocations.functions.end());
auto funcDeclarations = funcIter->second.declarations;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we call this funcOffset instead? That will be clearer when we subtract it from the expr offset below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the comment so it is right above this code, and I think now makes it explanatory? I.e. the code is now doing literally what the comment says, "take the offset relative to function declarations"

Comment on lines 1911 to 1916
if (DWARF) {
// Do not break, so we keep looking for DWARF.
continue;
} else {
break;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this conditional selection between continue and break is worth the complexity. Can we just unconditionally continue here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it means we pre-scan through the entire binary in the common case, but I measured it now and the difference is maybe 1%, so I agree it's not worth it. Done.

@kripken
Copy link
Member Author

kripken commented May 7, 2025

(rebased, this no longer has parts of another PR)

Comment on lines +489 to +491
std::move_backward(&o[sectionStart - 1], &o[oldSize], o.end());
std::copy(
annotationsBuffer.begin(), annotationsBuffer.end(), &o[sectionStart - 1]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be simpler if we wrote sections to several buffers and then concatenated them at the end rather than shifting contents around within buffers. That would be a bigger change, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and possibly less efficient (though I'm not sure, maybe reallocating a single buffer is worse?)


// We compute the location of the function declaration area (where the
// locals are declared) the first time we need it.
BinaryLocation funcDeclarations = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we call this funcDeclarationsOffset or similar so it makes more sense when we subtract it from exprOffset below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done.

}
}

if (!funcHints.exprHints.empty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's turn this into an early exit:

Suggested change
if (!funcHints.exprHints.empty()) {
if (funcHints.exprHints.empty()) {
continue;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// Write the final size. We can ignore the return value, which is the number
// of bytes we shrank (if the LEB was smaller than the maximum size), as no
// value in this section cares.
(void)buffer.emitRetroactiveSectionSizeLEB(lebPos);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this void cast to avoid compiler errors? It doesn't seem like we should.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the return value isn't marked as erroring if unused, yeah, and the comment should be enough. Done.

readDylink(payloadLen);
} else if (sectionName.equals(BinaryConsts::CustomSections::Dylink0)) {
readDylink0(payloadLen);
} else if (sectionName.equals(Annotations::BranchHint.str)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since both sides are Name, can't we use ==?

Suggested change
} else if (sectionName.equals(Annotations::BranchHint.str)) {
} else if (sectionName == Annotations::BranchHint) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 5255 to 5256
auto funcIndex = getU32LEB();
auto& func = wasm.functions[funcIndex];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should guard against OOB indices here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done.


;; RUN: wasm-opt -all %s -S -o - | filecheck %s
;; RUN: wasm-opt -all %s -S -o - | filecheck %s
;; RUN: wasm-opt -all --roundtrip %s -S -o - | filecheck %s --check-prefix=BINARY
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes we use RTRIP as the check prefix so it has the same number of characters as CHECK.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines +1611 to +1613
// TODO: This mode may not matter (when debugging, code annotations are an
// optimization that can be skipped), but atm source maps cause
// annotations to break.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for the breakage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, I only noticed it by chance, and added the TODO to investigate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually I think we will need this to work. One use case for source maps is deobfuscation of stack traces collected from the wild, which means they need to be accurate (to the extent possible) for production binaries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, makes sense. I'll get to it sooner, then.

@kripken
Copy link
Member Author

kripken commented May 8, 2025

(Meanwhile I am pretty sure this is correct wrt the binary format. The last difference with the spec interpreter turned out to be a bug on that side, WebAssembly/branch-hinting#34)

@kripken kripken merged commit f634373 into WebAssembly:main May 8, 2025
14 checks passed
@kripken kripken deleted the branch.hint.binary branch May 8, 2025 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants