Releases · Blaizzy/mlx-vlm

03 Dec 21:47

Blaizzy

v0.3.9

a010fd3

v0.3.9 Latest

What's Changed

Fix qwen3_vl ValueError: Image features and image tokens do not match by @ziya32 in #608
Add ministral3 by @Blaizzy in #611

New Contributors

@ziya32 made their first contribution in #608

Full Changelog: v0.3.8...v0.3.9

Contributors

Blaizzy and ziya32

27 Nov 01:34

Blaizzy

v0.3.8

cc9609e

v0.3.8

What's Changed

Add chat_ui command by @Blaizzy in #589
Fix 422 error when calling /chat/completions with OpenAI SDK #590 by @Blaizzy in #592
Fix qwen3_vl by @awni in #599

Full Changelog: v0.3.7...v0.3.8

Contributors

awni and Blaizzy

17 Nov 16:04

Blaizzy

v0.3.7

34d7e75

v0.3.7

What's Changed

update readme with new openai endpoints details by @mguella in #585
Fix lighton OCR empty result on main by @Blaizzy in #587
Add model revision loading by @Blaizzy in #588

Full Changelog: v0.3.6...v0.3.7

Contributors

mguella and Blaizzy

14 Nov 22:34

Blaizzy

v0.3.6

84b56e5

v0.3.6

What's Changed

Fix: Input cast error by @Blaizzy in #559
Fix qwen-vl position ids (2.5 & 3) by @Blaizzy in #564
Add Evals by @Blaizzy in #563
Fix Qwen3-VL Attention by @Blaizzy in #566
Update evals init by @Blaizzy in #567
Fix Qwen3-VL multi image reshape by @Blaizzy in #569
Add processor args to DeepSeekOCR by @Blaizzy in #570
host and port params for server by @mguella in #568
FastVLM by @pcuenca in #502
Add support for InternVL3 by @iRonJ in #540
Add Z.ai GLM-4.1v by @Blaizzy in #572
Make rope deltas private by @Blaizzy in #573
Add example notebook for interleaving text and images in prompts by @Copilot in #574
[Bugfix] fix mrope in qwen2vl and qwen2.5vl by @JJJYmmm in #576
Add lighton-ocr by @Blaizzy in #550
changed image parameter instead of files in stream_generate by @Manikandan-t in #521
Remove auto config loading by @Blaizzy in #577
[BugFix][Qwen3VL] fix deepstack and multi-image inference by @JJJYmmm in #581
openai compatible endpoints by @mguella in #580
Bump version to 0.3.6 by @Blaizzy in #582

New Contributors

@mguella made their first contribution in #568
@iRonJ made their first contribution in #540
@Copilot made their first contribution in #574
@JJJYmmm made their first contribution in #576
@Manikandan-t made their first contribution in #521

Full Changelog: v0.3.5...v0.3.6

Contributors

pcuenca, iRonJ, and 4 other contributors

26 Oct 22:43

Blaizzy

v0.3.5

6fe088e

v0.3.5

What's Changed

Remove docs actions temporarily by @Blaizzy in #536
Fix load_image for URLs by @pcuenca in #534
Add Deepseek ocr by @Blaizzy in #541

Full Changelog: v0.3.4...v0.3.5

Contributors

pcuenca and Blaizzy

14 Oct 08:00

Blaizzy

v0.3.4

49760b6

v0.3.4

What's Changed

Add n_kv_heads property used in LM Studio to glm4v_moe.LanguageModel by @hehua2008 in #472
Fix: rope rotation (GLM-4.5v) by @Blaizzy in #481
Remove scipy dep by @Blaizzy in #482
Add CUDA and CPU as optional deps by @Blaizzy in #483
Fix Deepseek vl2 chat template by @Blaizzy in #488
Fix deepseek-vl default chat template by @Blaizzy in #490
Fix smolvlm video generate by @Blaizzy in #491
Fix base64 encoded images by @Blaizzy in #493
Map Apriel configs to pixtral and fix prompt formatting by @ivanfioravanti in #518
Fix video understanding NB by @Blaizzy in #519
Fix fine-tuning bug in trainer.py by @avishekjana in #473
Bump minimum required Python version to 3.10 by @dokterbob in #485
Add Qwen3-VL (Dense & MoE) by @Blaizzy & @vincentamato in #528
Fix video Qwen3-VL by @Blaizzy in #529
Fix Qwen3 VL (Dense) Sanitize by @Blaizzy in #531
Bump version to 0.3.4 by @Blaizzy in #535

New Contributors

@hehua2008 made their first contribution in #472
@ivanfioravanti made their first contribution in #518
@avishekjana made their first contribution in #473
@dokterbob made their first contribution in #485
@vincentamato made their first contribution in #528

Full Changelog: v0.3.3...v0.3.4

Contributors

dokterbob, ivanfioravanti, and 4 other contributors

20 Aug 14:52

Blaizzy

v0.3.3

7051821

v0.3.3

What's Changed

fix changelog task by @Blaizzy in #441
Add LFM2-VL by @Blaizzy in #460
[llava_next] Fix config inheritance by @neilmehta24 in #448
Kimi_VL: Fix activation args by @Blaizzy in #465
Add GLM-4-5V by @Blaizzy in #458
External access to LFM2-VL merge-input-IDs method by @christian-lms in #466
Add Command-A-Vision by @Blaizzy in #467
[kernels] Use a header for bicubic_interpolate for compatibility with macOS < 15 by @mattjcly in #469

New Contributors

@christian-lms made their first contribution in #466

Full Changelog: v0.3.2...v0.3.3

Contributors

Blaizzy, mattjcly, and 2 other contributors

22 Jul 16:12

Blaizzy

v0.3.2

8c8d1c0

v0.3.2

What's Changed

Fix energy calc in omni.py by @Blaizzy in #427
Feat(build): migrate to pyproject.toml by @SauravMaheshkar in #282
Fix quant predicate by @Blaizzy in #430
Load dependencies from requirements.txt by @Blaizzy in #431
Fix broken wheel builds caused by ambiguous package spec by @neilmehta24 in #432
Make UI and audio dependencies optional by @zhnext in #433
Cleanup: Refactor config by @Blaizzy in #437
Add cuda support by @Blaizzy in #438
Fix/server module exposure by @zhnext in #434
Support text only training by @Goekdeniz-Guelmez in #424
Fix phi3_v and molmo mask by @Blaizzy in #440

New Contributors

@zhnext made their first contribution in #433
@Goekdeniz-Guelmez made their first contribution in #424

Full Changelog: v0.3.1...v0.3.2

Contributors

zhnext, Blaizzy, and 3 other contributors

12 Jul 16:21

Blaizzy

v0.3.1

71f2611

v0.3.1

What's Changed

fix(chat-ui): Fix imports, blocking Chat-ui server start by @zenyr in #420
Fix server empty image for Gemma3n by @Blaizzy in #422
[Gemma3n] Add hooks for image embedding computation by @will-lms in #407
[gemma3n] Fix OCR after weight re-upload by @neilmehta24 in #425
Add gemma3n omni example by @Blaizzy in #426

New Contributors

@zenyr made their first contribution in #420

Full Changelog: v0.3.0...v0.3.1

Contributors

zenyr, Blaizzy, and 2 other contributors

05 Jul 17:29

Blaizzy

v0.3.0

fee22cc

v0.3.0

What's Changed

[gemma3n] Correctly scale text embeddings for quantized gemma3n conversions by @neilmehta24 in #397
smolvlm_video example: fix typo in system prompt by @pcuenca in #389
Fix gemma3n pixel casting by @Blaizzy in #398
Fix audio model check and prompt utils by @Blaizzy in #395
Add KV Quantization by @Blaizzy in #401
Fix Gemma3n multi-task merging and update LM by @Blaizzy in #405
[gemma3n] Fix vision encoder implementation of EdgeResidual and UniversalInvertedResidual by @neilmehta24 in #410
fix: Remove unnecessary unicode_escape decoding for Chinese text input by @nicekate in #403
Add support for Mixed Quant by @Blaizzy in #413
Fix gemma3n Vision OCR + LM only responses by @Blaizzy in #414
Fix generate signature by @Blaizzy in #416
Add support for audio modality in server by @Blaizzy in #417
Update server, readme and misc by @Blaizzy in #418

New Contributors

@nicekate made their first contribution in #403

Full Changelog: v0.2.0...v0.3.0

Contributors

pcuenca, Blaizzy, and 2 other contributors
