@larissakl
Contributor

This adds a label-synchronous search algorithm on a search tree built by the AedTreeBuilder (#127).
It is derived from the lexicon-free label-synchronous beam-search algorithm (#126). Similar to the time-synchronous tree search (#113), the active hypotheses can be pruned either globally or separately for within-word and word-end hypotheses, and a language model score is added at word ends.
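
As a rough illustration of the difference between global and separate pruning, here is a minimal sketch. It assumes score-threshold pruning with lower scores being better; `Hyp`, `scorePrune`, `pruneGlobal` and `pruneSeparate` are hypothetical names for illustration and are not taken from this PR.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical, simplified hypothesis. Assumption: lower score is better.
struct Hyp {
    double score;
    bool   isWordEnd;
};

// Among the hypotheses selected by `select`, drop those whose score is more than
// `threshold` worse than the best selected score. Unselected hypotheses are kept.
template<typename Pred>
void scorePrune(std::vector<Hyp>& beam, double threshold, Pred select) {
    double best = std::numeric_limits<double>::infinity();
    for (auto const& h : beam) {
        if (select(h)) {
            best = std::min(best, h.score);
        }
    }
    beam.erase(std::remove_if(beam.begin(), beam.end(),
                              [&](Hyp const& h) { return select(h) && h.score > best + threshold; }),
               beam.end());
}

// Global pruning: all active hypotheses compete against one common best score.
void pruneGlobal(std::vector<Hyp>& beam, double threshold) {
    scorePrune(beam, threshold, [](Hyp const&) { return true; });
}

// Separate pruning: within-word and word-end hypotheses only compete within their own group.
void pruneSeparate(std::vector<Hyp>& beam, double withinWordThreshold, double wordEndThreshold) {
    scorePrune(beam, withinWordThreshold, [](Hyp const& h) { return !h.isWordEnd; });
    scorePrune(beam, wordEndThreshold, [](Hyp const& h) { return h.isWordEnd; });
}
```

With separate pruning, word-end hypotheses that have just received a language model score are not compared against within-word hypotheses that have not, which is one plausible motivation for offering both modes.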

Simon Berger and others added 30 commits February 19, 2025 19:10
@larissakl larissakl requested review from SimBe195 and curufinwe and removed request for curufinwe May 30, 2025 14:59
larissakl and others added 7 commits May 30, 2025 17:03
# Conflicts:
#	apptainer/2022-10-21_tensorflow-1.15_arm_v1/makefiles/Modules.make
#	apptainer/2022-10-21_tensorflow-1.15_v1/makefiles/Modules.make
#	apptainer/2023-05-08_tensorflow-2.8_v1/makefiles/Modules.make
#	apptainer/2023-08-09_tensorflow-2.8_onnx-1.15_v1/makefiles/Modules.make
#	apptainer/2023-11-08_tensorflow-2.14_v1/makefiles/Modules.make
#	apptainer/2025-04-23_tensorflow-2.17_onnx-1.20_v1/makefiles/Modules.make
#	src/Search/Makefile
#	src/Search/Module.cc
#	src/Search/Module.hh
currentToken(extension.nextToken),
currentState(extension.state),
lmHistory(extension.lmHistory),
length(base.length + 1),
Contributor

As in LexiconfreeLabelsyncBeamSearch, shouldn't length be incremented depending on the transitionType?
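
A minimal sketch of what a transition-dependent length update could look like. The enum below is a hypothetical stand-in, not the real `Nn::LabelScorer::TransitionType`, and which transitions count towards the length is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the scorer's transition type; not the real enum.
enum class TransitionType {
    LABEL_TO_LABEL,
    LABEL_LOOP,
    LABEL_TO_BLANK,
    BLANK_TO_LABEL,
    BLANK_LOOP
};

// Assumption: only transitions that emit a new label make the hypothesis longer.
inline std::size_t lengthIncrement(TransitionType t) {
    switch (t) {
        case TransitionType::LABEL_TO_LABEL:
        case TransitionType::BLANK_TO_LABEL:
            return 1;
        case TransitionType::LABEL_LOOP:
        case TransitionType::LABEL_TO_BLANK:
        case TransitionType::BLANK_LOOP:
            return 0;
    }
    assert(false && "unknown transition type");
    return 0;
}

// Length of the extended hypothesis, given the length of its base hypothesis.
inline std::size_t extendedLength(std::size_t baseLength, TransitionType t) {
    return baseLength + lengthIncrement(t);
}
```

In the constructor above, this would correspond to replacing `base.length + 1` with a value that depends on the extension's transition type.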

extension.timeframe + 1,
{extension.score - extension.lmScore, extension.lmScore},
{}));
break;
Contributor

Maybe it's better without the empty line between cases.

}
break;

default:
Contributor

What about LABEL_LOOP?
LexiconfreeLabelsyncBeamSearch also handles blank; should this one include it as well?

sentenceEndLemma = lexicon_->specialLemma("sentence-boundary");
}
sentenceEndLabelIndex_ = sentenceEndLemma->id();
log() << "Use sentence-end index " << sentenceEndLabelIndex_ << " inferred from lexicon";
Contributor

@hannah220 hannah220 Jan 7, 2026

It would also be better to have an option to read the index from paramSentenceEndLabelIndex.
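
One possible shape for this, as a sketch only: prefer an explicitly configured index and fall back to the lexicon otherwise. `LabelIndex` and `resolveSentenceEndIndex` are hypothetical names, the precedence order is an assumption, and the two `std::optional` arguments stand in for the parameter lookup and the lexicon lookup.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>

using LabelIndex = std::uint32_t;  // hypothetical stand-in for Nn::LabelIndex

// Assumption: an explicitly configured index takes precedence over the lexicon.
// `configured` is empty if the parameter was not set.
// `fromLexicon` is empty if the lexicon has no sentence-end lemma.
LabelIndex resolveSentenceEndIndex(std::optional<LabelIndex> configured,
                                   std::optional<LabelIndex> fromLexicon) {
    if (configured) {
        return *configured;
    }
    if (fromLexicon) {
        return *fromLexicon;
    }
    throw std::runtime_error("sentence-end label index is neither configured nor present in the lexicon");
}
```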

Nn::LabelIndex tokenIdx = network_->structure.state(successorState).stateDesc.acousticModel;

auto transitionType = Nn::LabelScorer::TransitionType::LABEL_TO_LABEL;
if (hyp.currentToken == Core::Type<Nn::LabelIndex>::max) {
Contributor

Suggested change
- if (hyp.currentToken == Core::Type<Nn::LabelIndex>::max) {
+ if (hyp.currentToken == Nn::invalidLabelIndex) {

}
};

recombinedHypotheses_.clear();
Contributor

    // Reserve capacity because future reallocations would break the raw pointer we are storing later
    recombinedHypotheses_.reserve(newBeam_.size());
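
To illustrate why the reserve matters, here is a small standalone sketch (names are hypothetical). `std::vector` guarantees that no reallocation happens while the size stays within the reserved capacity, so raw pointers to already stored elements remain valid during the fill.

```cpp
#include <cstddef>
#include <vector>

struct Hyp {
    double score;
};

// Fills a vector with `n` elements while collecting raw pointers to them and
// returns true if every collected pointer still refers to its element afterwards.
bool pointersStayValid(std::size_t n) {
    std::vector<Hyp> storage;
    storage.reserve(n);  // no reallocation as long as size() <= capacity()
    Hyp const* const bufferBefore = storage.data();

    std::vector<Hyp const*> refs;
    for (std::size_t i = 0; i < n; ++i) {
        storage.push_back(Hyp{static_cast<double>(i)});
        refs.push_back(&storage.back());  // raw pointer stored for later use
    }

    if (storage.data() != bufferBefore) {
        return false;  // the buffer moved, so earlier pointers would dangle
    }
    for (std::size_t i = 0; i < n; ++i) {
        if (refs[i] != &storage[i]) {
            return false;
        }
    }
    return true;
}
```

Without the `reserve` call, any `push_back` that exceeds the current capacity may reallocate the buffer and leave the earlier pointers dangling.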

break;

default:
defect(); // Unexpected transition type which can not be produced by `inferTransitionType`
Contributor

Also, why isn't the inferTransitionType function defined in this class?

Base automatically changed from lexiconfree_labelsync_search to master January 16, 2026 13:01