Releases: ggml-org/LlamaBarn

0.21.0

07 Jan 08:01

  • Update header to show base URL as a separate element
  • Include memory budget details in context length info in Settings
  • Add Hugging Face link to model context menu
  • Remove --no-mmap to enable memory mapping
  • Update llama.cpp to b7652
  • Fix Nemotron KV cache footprint calculation
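The --no-mmap change affects how llama.cpp loads model weights. A hedged sketch of the distinction, using llama.cpp's llama-server flags (the exact launch command LlamaBarn assembles internally is an assumption, and model.gguf is a placeholder path):

```shell
# With memory mapping (the default once --no-mmap is dropped),
# model weights are paged in from disk on demand and can be
# shared with the OS file cache:
llama-server -m model.gguf

# With --no-mmap, the entire model is read into RAM up front:
llama-server -m model.gguf --no-mmap
```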

0.20.0

02 Jan 13:14

  • Refine catalog menu layout and navigation
  • Show granular download progress with decimal percentages
  • Optimize settings menu toggle to avoid full menu rebuild

0.19.0

30 Dec 14:15

  • Family items now open detailed views instead of expanding in place
  • Add descriptions to model families
  • Show incompatible models in the catalog with clear memory requirements
  • Group Qwen3, Qwen3 VL, and Ministral 3 models with their reasoning variants

0.18.0

29 Dec 13:05

  • Add Q4_K_M quantizations for Devstral 2 models
  • Refactor catalog and model structures for better organization
  • Update llama.cpp to b7569

0.17.0

28 Dec 12:27

  • Add Devstral 2 model family
  • Add Nemotron Nano 3 model family
  • Use xcassets for icon to reduce bundle size
  • Update llama.cpp to b7561

0.16.0

19 Dec 07:56

  • Add Show in Finder button to installed model context menu
  • Always show context length and estimated memory usage in model items
  • Remove memory limit option in favor of automatic safety budget
  • Add expandable info descriptions to settings
  • Fix UI flicker when downloading or cancelling models
  • Update llama.cpp to b7475

0.15.0

17 Dec 11:42

  • Add "Memory cap" setting to replace hard-coded logic and allow user control
  • Remove confusing right-click button for "run at max ctx"
  • Rename "Context length" option to "Context length cap" and update labels
  • Update context length display in model metadata
  • Make model deletion immediate for better UX

0.14.0

16 Dec 04:07

  • Remove show-quantized toggle — catalog now prefers full-precision models and falls back to quantized only when full-precision won't run on device
  • Add setting to configure context length
  • Add setting to display estimated memory usage
  • Redesign model items and settings
  • Update icon for better integration with macOS Tahoe
  • Update llama.cpp to b7406

0.13.0

04 Dec 09:17

  • Add Ministral 3 model family
  • Add vision support indicator to catalog
  • Replace hover-based delete with right-click delete for installed models
  • Add toggle button for quantized models in catalog divider
  • Change default context window from max to 4k
  • Improve visual hierarchy with refined colors and font sizes throughout
  • Update app icon to better match menu bar icon
  • Update llama.cpp to b7247
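The 4k default corresponds to llama.cpp's context-size option. A sketch of the difference, assuming LlamaBarn passes something like the following when launching the server (model.gguf is a placeholder):

```shell
# Previous behavior: run at the model's maximum trained context
# (omitting --ctx-size lets llama-server use the model default)
llama-server -m model.gguf

# New default: cap the context window at 4096 tokens, which
# bounds the KV cache and reduces memory usage
llama-server -m model.gguf --ctx-size 4096
```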

0.12.0

25 Nov 08:29

  • Enable resuming model downloads after network interruptions
  • Add default sampling parameters based on model author recommendations
  • Add experimental option to expose llama-server to network by binding to 0.0.0.0 -- see issue #17
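Binding to 0.0.0.0 makes the embedded llama-server reachable from other machines on the local network instead of localhost only. llama-server exposes this through its --host option; a hedged sketch (the port number is illustrative, and model.gguf is a placeholder):

```shell
# Default: listen on loopback, reachable only from this Mac
llama-server -m model.gguf --host 127.0.0.1 --port 8080

# Experimental network exposure: listen on all interfaces,
# reachable from any device on the LAN (no authentication,
# so use with care)
llama-server -m model.gguf --host 0.0.0.0 --port 8080
```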