Set up performance metrics eval on squad #11

@plpxsk

... comparing current implementation to fine-tuned models on HF

Research metrics from HF

Notes

SQuAD "validation" examples, in contrast to "train" examples, can have more than one reference answer.

  • HF pipeline counts the model as correct if its prediction matches any one of the answers
  • TBD: confirm this and implement it

See https://huggingface.co/learn/nlp-course/chapter7/7?fw=pt#processing-the-training-data
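
A minimal sketch of the "correct if it matches any one of the answers" rule noted above, in plain Python. The normalization (lowercase, strip punctuation and English articles, collapse whitespace) is modeled on what is commonly used for SQuAD scoring; whether the HF pipeline behaves exactly this way is still the TBD above. All function names here (`normalize_answer`, `exact_match`, `token_f1`, `best_over_references`) are hypothetical helpers for illustration, not part of any library.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> bool:
    """True when the two strings are identical after normalization."""
    return normalize_answer(prediction) == normalize_answer(reference)


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a prediction and a single reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_references(metric, prediction: str, references: list[str]) -> float:
    """Score against every reference and keep the best one.

    Assumes `references` is non-empty, as in SQuAD v1.1 validation examples.
    """
    return max(float(metric(prediction, ref)) for ref in references)


# Illustrative example with several acceptable answers (made-up data).
references = ["Denver Broncos", "The Denver Broncos", "Broncos"]
em = best_over_references(exact_match, "the Denver Broncos", references)
f1 = best_over_references(token_f1, "Denver", references)
```

Averaging `em` and `f1` over all validation examples would give dataset-level scores that can be compared against the numbers reported for fine-tuned models on HF, assuming those use the same convention.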
