Add the sifive_rvv configuration #832
We should try to merge this with the |
Thanks, @devinamatthews! Just rebased it.
I'm not sure whether it can be determined at run-time or configure-time. Currently, it is set at compile-time in |
Sorry for the late reply. @angsch and I would like to merge the functionality of the different RISC-V configurations. Our RISC-V configurations are generalized, in that they will work for both RV32 and RV64, with or without the vector extension present, and our vector implementation is vector-length-agnostic, but only the vector *GEMM functions have been optimized at this time (the rest of the functionality is the same as
OK, I've been reading around a bit and this is what I think I understand:
So, my suggestions are:
#define BLIS_NR_s ( 4 * __riscv_v_min_vlen / 32 )
#define BLIS_NR_d ( 4 * __riscv_v_min_vlen / 64 )
#define BLIS_NR_c ( 2 * __riscv_v_min_vlen / 32 )
#define BLIS_NR_z ( 2 * __riscv_v_min_vlen / 64 )
Is this really necessary (to fix these sizes)? The kernels do need MR fixed but NR could be determined dynamically from VLEN at runtime. Setting these macros to -1 here (or not defining them) simply disables the unrolled reference GEMM kernel.
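To make the runtime alternative concrete, here is a minimal sketch of how NR could be derived from VLEN when it is only known at run time. The helper name `sketch_nr` and its parameters are hypothetical and are not part of BLIS; the factor (4 for real types, 2 for complex) is simply copied from the macros under review. VLEN is passed in as a plain argument so the sketch builds on any machine; on real hardware it could presumably be obtained from the `vlenb` CSR, which holds VLEN in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a BLIS function: mirrors the arithmetic of the
 * BLIS_NR_* macros above, but takes VLEN as a runtime value.
 *   factor    - 4 for real types, 2 for complex types (as in the macros)
 *   vlen_bits - vector register length in bits (e.g. 128, 256, 512)
 *   elem_bits - width of the real element in bits (32 or 64) */
static size_t sketch_nr(size_t factor, size_t vlen_bits, size_t elem_bits)
{
    assert(elem_bits != 0);              /* guard against division by zero */
    return factor * vlen_bits / elem_bits;
}
```

With VLEN = 128 this gives NR = 16 for single precision and NR = 8 for double precision, the same values the compile-time macros produce for the minimum vector length.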
@myeh01 last comment: AFAICT it seems that NR is fixed based on the minimum vector length specified in This shouldn't cause any problems for GEMM, but it might have an effect on packing, and I don't know how TRSM would be affected. If it would be too much work to support then we can go as-is. |
Thanks for the suggestions, @devinamatthews! I have updated the code so that |
Does in |
This PR adds a configuration called sifive_rvv to support RVV platforms beyond SiFive's x280. Essentially, the kernel code currently under sifive_x280 has been migrated to sifive_rvv, but the packm kernel and the gemm and gemmtrsm microkernels have been modified slightly so that NR is defined in terms of the machine's VLEN instead of being hardcoded for VLEN = 512. sifive_rvv is currently compiled with VLEN = 128 (Zvl128b), the minimum VLEN required by the standard V extension, but users can modify make_defs.mk to change it to the VLEN of their target machine for potentially better performance. The sifive_x280 configuration is now defined in terms of sifive_rvv, calling the kernels from sifive_rvv and using VLEN = 512 for packm, gemm, and gemmtrsm.
This PR is based on #822.
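As a rough illustration of how NR scales between the two configurations, the sketch below evaluates the NR formula from this PR's macros at compile time for VLEN = 128 (the sifive_rvv default) and VLEN = 512 (sifive_x280). All `SKETCH_*` names are hypothetical and exist only for this example; the actual configuration headers may be organized differently.

```c
#include <assert.h>

/* Hypothetical macro: NR for a given scaling factor (4 for real types,
 * 2 for complex types), VLEN in bits, and real element width in bits. */
#define SKETCH_NR(factor, vlen, bits) ((factor) * (vlen) / (bits))

enum {
    SKETCH_RVV_VLEN  = 128,   /* default for sifive_rvv */
    SKETCH_X280_VLEN = 512,   /* value used by sifive_x280 */

    /* Single- and double-precision real NR for each configuration. */
    SKETCH_RVV_NR_s  = SKETCH_NR(4, SKETCH_RVV_VLEN, 32),
    SKETCH_RVV_NR_d  = SKETCH_NR(4, SKETCH_RVV_VLEN, 64),
    SKETCH_X280_NR_s = SKETCH_NR(4, SKETCH_X280_VLEN, 32),
    SKETCH_X280_NR_d = SKETCH_NR(4, SKETCH_X280_VLEN, 64)
};

/* NR grows linearly with VLEN: 512 / 128 = 4 times as many columns. */
static_assert(SKETCH_X280_NR_s == 4 * SKETCH_RVV_NR_s, "NR_s scales with VLEN");
static_assert(SKETCH_X280_NR_d == 4 * SKETCH_RVV_NR_d, "NR_d scales with VLEN");
```

Under these assumptions a VLEN = 128 build uses a quarter of the NR that the x280 uses, which is why raising VLEN in make_defs.mk to match the target machine may improve performance.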
Many thanks to Eric Love (@ericlove) and Aaron Hutchinson (@Aaron-Hutchinson) for their help with this PR.
@fgvanzee, @devinamatthews, and others, any feedback is appreciated!