Conversation

@mcelrath

This is to fix #52 .

I've tested it with:

starcoder2
qwen2.5-coder
deepseek-coder-v2
codegemma
codellama

Do you agree? I think it might be interesting to put some templates in the repo showing how to use FIM for larger models that don't natively support it, but may be good at following instructions.

@mcelrath
Author

This does NOT work with gemma3, which supports FIM through the token <|fim_middle|> but doesn't mention it anywhere in its template.

The method in my patch can be detected with ollama show --template <model> | grep Suffix, but I don't see how to go about doing this without a laundry list of models and templates. The existing solution with the config/ directory definitely doesn't work with a lot of models (especially thinking models like deepseek-r1).
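The template check described above could be sketched roughly like this (a hypothetical helper, assuming the Go-style template text that `ollama show --template` prints; the example templates are illustrative, not copied from real models):

```python
import re

def supports_fim(template: str) -> bool:
    """Heuristically check whether an Ollama Go-style template
    references the .Suffix variable, which indicates native
    fill-in-the-middle support (mirrors the `grep Suffix` test)."""
    return re.search(r"\.Suffix\b", template) is not None

# Illustrative templates, NOT taken from any real model:
fim_template = "<PRE> {{ .Prompt }} <SUF>{{ .Suffix }} <MID>"
plain_template = "{{ .Prompt }}"
```

This only detects that a template mentions `.Suffix`; it says nothing about which FIM tokens the model actually expects, which is exactly the laundry-list problem.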

@gergap
Owner

gergap commented Mar 15, 2025

Hi, thank you for the contribution. I went this route initially, but had the problem that the templates in ollama were not working for some models.
There is even a suffix argument in the REST API now, where ollama is supposed to do much of the work of building a correct prompt for code completion.

I will need to test this, and if it works more reliably now than in the past, I can get rid of my own templates and token configurations. I can imagine keeping both: you can override bogus behavior with local configurations, but if one is missing it will default to what ollama provides.

I'm also working on repo completions and inserting complete files (or vim buffers) as context. For this I will still need to create my own prompts. However, if I can read out the FIM tokens reliably from the model's template, I would be happy to do so.

@gergap
Owner

gergap commented Mar 15, 2025

Hi, I tested it with my default starcoder2:3b completion model. Technically it works, but the results are radically different and not really useful. I don't know why this happens with the ollama REST API.

See for yourself: first I try it with your branch. The task is trivial; it should only complete a missing 'f' for a printf call, which tests the fill-in-the-middle problem pretty well. With your solution it generates some nonsense and also garbage on the following lines instead of the required completion. Then I switch back to the master branch with my manual FIM code and it works as expected.
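For comparison, a manual FIM prompt for a StarCoder-style model is assembled from the model's published special tokens. This is an illustrative sketch of that technique, not the plugin's actual code:

```python
def starcoder_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a raw fill-in-the-middle prompt using the StarCoder
    special tokens; the model generates the missing middle part
    after <fim_middle>."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# The printf example: the model should produce the missing 'f'.
prompt = starcoder_fim_prompt("    print", '("%d\\n", x);\n')
```

Sending such a prompt requires the raw mode of the API, so the model's chat template is not applied on top of it.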

vim-pull-53-2025-03-15_11.06.09.mp4

@gergap
Owner

gergap commented Mar 15, 2025

It does not look better with codellama either:

Ollama REST API result (your branch):
codellama1

Manual templates (master branch):
codellama2

I also tried changing the "Raw" option to false on your branch, but this didn't help either.
It looks to me like the ollama template handling still does not work as it should.
I'm running ollama version 0.5.4.
Please let me know if you are running a newer version with better results for this simple example task.

@mcelrath
Author

mcelrath commented Mar 15, 2025 via email

@mcelrath
Author

Are you on Discord? Or do you belong to any Discord/Slack or other chat that discusses FIM usage? There are a bunch of things I want to do here and it would be good to discuss it with someone. ;-)

@gergap
Owner

gergap commented Mar 15, 2025

Hi, at the moment I don't have time for discussions, but you might find the use_model_template branch useful.
I just committed this experiment for you.



Development

Successfully merging this pull request may close these issues.

Don't use config files, use the model's template
