... comparing current implementation to fine-tuned models on HF
Research metrics from HF
Notes
SQuAD "validation" examples, in contrast to "train" examples, can have more than one reference answer per question.
- HF pipeline counts a prediction as correct if it matches any one of the reference answers
- TBD: confirm this and implement it
See https://huggingface.co/learn/nlp-course/chapter7/7?fw=pt#processing-the-training-data
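A minimal sketch of the "correct if it matches one of the answers" rule, pending the TBD above. It assumes SQuAD-style answer normalization (lowercase, strip punctuation and the articles a/an/the, collapse whitespace) and exact-match scoring. The function names `normalize_answer` and `exact_match_any` and the sample record are made up for illustration; this is not the HF implementation.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match_any(prediction: str, references: list[str]) -> bool:
    """Return True if the prediction equals at least one reference answer
    after normalization (assumed rule, still to be confirmed against HF)."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(ref) for ref in references)


# Made-up validation-style record: several reference texts for one question.
example = {"answers": {"text": ["Eiffel Tower", "the Eiffel Tower in Paris"]}}

is_correct = exact_match_any("The Eiffel Tower.", example["answers"]["text"])
print(is_correct)
```

For "train" examples the same function applies unchanged, since a single answer is just a one-element list.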