Releases: dgarage/llama.cpp

b6782

17 Oct 02:23
1bb4f43

mtmd : support home-cooked Mistral Small Omni (#14928)

b6665

02 Oct 08:04
95ce098

HIP: add IMbackK to codeowner (#16375)

b6503

18 Sep 05:58
62c3b64

CANN: Remove print (#16044)

Signed-off-by: noemotiovon <757486878@qq.com>

b6153

14 Aug 07:05
3ea913f

perplexity: give more information about constraints on failure (#15303)

* perplexity: give more information about constraints on failure

This checks whether -np is insufficient vs context, and provides clues as to how much is needed for each.

* log formatting

* log error and return instead of storing max_seq_exceeded int

* check if s0 is zero for -np check

b6140

13 Aug 02:13
b049315

HIP: disable sync warp shuffel operators from clr amd_warp_sync_funct…

b6106

07 Aug 03:58
5fd160b

ggml: Add basic SET_ROWS support in WebGPU (#15137)

* Begin work on set_rows

* Work on set rows

* Add error buffers for reporting unsupported SET_ROWS indices

* Remove extra comments
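
For readers unfamiliar with the operation, the following is a minimal CPU-side sketch of what a SET_ROWS-style scatter does: each source row is copied into the destination row named by an index array, and an index that cannot be honoured is reported instead of written. The names `set_rows` and `set_rows_result` and the error-reporting shape are hypothetical and chosen for illustration. This is not the WebGPU shader or the ggml API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: `ok` is false when an index was rejected,
// and `bad_index` then holds the first offending value.
struct set_rows_result {
    bool    ok;
    int64_t bad_index;
};

// Illustrative scatter: copy row r of `src` into row idx[r] of `dst`.
// Both buffers are row-major with `n_cols` floats per row.
static set_rows_result set_rows(std::vector<float> & dst,
                                const std::vector<float> & src,
                                const std::vector<int64_t> & idx,
                                const size_t n_cols) {
    if (n_cols == 0 || src.size() < idx.size() * n_cols) {
        return {false, -1}; // malformed input, nothing written
    }
    const size_t n_dst_rows = dst.size() / n_cols;
    for (size_t r = 0; r < idx.size(); ++r) {
        const int64_t row = idx[r];
        // Reject indices outside the destination instead of writing out of bounds.
        if (row < 0 || static_cast<uint64_t>(row) >= n_dst_rows) {
            return {false, row};
        }
        for (size_t c = 0; c < n_cols; ++c) {
            dst[static_cast<size_t>(row) * n_cols + c] = src[r * n_cols + c];
        }
    }
    return {true, -1};
}
```

With a 3x2 destination of zeros, a 2x2 source `{1, 2, 3, 4}`, and indices `{2, 0}`, the first source row lands in destination row 2 and the second in row 0. An index such as `5` is returned in `bad_index` and leaves the destination untouched.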

b6039

31 Jul 01:31
6e67254

opencl: add `mul_mat_f32_f32_l4_lm` and `mul_mat_f16_f32_l4_lm` (#14809)

b6020

29 Jul 08:02
0a5036b

CUDA: add roll (#14919)

* CUDA: add roll

* Make everything const, use __restrict__
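
As a rough illustration of the two points above, here is a one-dimensional host-side sketch of a roll (circular shift) written with `const` parameters and `__restrict__` pointers. The function name `roll_1d` and its signature are hypothetical, and the real kernel operates on multi-dimensional tensors on the GPU. `__restrict__` is a compiler extension accepted by GCC, Clang, and NVCC (MSVC spells it `__restrict`); it promises the compiler that `src` and `dst` do not overlap.

```cpp
// Illustrative circular shift: dst[i] = src[(i - shift) mod n].
// A positive shift moves elements toward higher indices and wraps around.
static void roll_1d(const float * __restrict__ src,
                    float * __restrict__ dst,
                    const int n,
                    const int shift) {
    if (n <= 0) {
        return;
    }
    // Normalise the shift into [0, n) so negative shifts also work.
    const int s = ((shift % n) + n) % n;
    for (int i = 0; i < n; ++i) {
        const int j = (i - s < 0) ? (i - s + n) : (i - s);
        dst[i] = src[j];
    }
}
```

Rolling `{1, 2, 3, 4}` by `1` gives `{4, 1, 2, 3}`, and rolling by `-1` gives `{2, 3, 4, 1}`. Because the pointers are declared non-aliasing, the caller must pass distinct buffers.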

b5926

18 Jul 01:18
670e136

convert : fix Ernie4.5 MoE without shared experts (#14746)

b5581

03 Jun 05:19
71e74a3

opencl: add `backend_synchronize` (#13939)

* This is not needed by the normal use where the result is read
  using `tensor_get`, but it allows perf mode of `test-backend-ops`
  to properly measure performance.