Optimization of the assembly code of uaccess APIs. #120
Merged
sterling-teng merged 2 commits into RVCK-Project:rvck-6.6 on Sep 27, 2025
mainline inclusion
from mainline-v6.10-rc1
commit f190594
category: feature
bugzilla: RVCK-Project#112

--------------------------------

When the dst buffer pointer points to the last accessible aligned addr, we
could still run another iteration of unrolled copy.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240313103334.4036554-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>

mainline inclusion
from mainline-v6.10-rc1
commit 9850e73
category: feature
bugzilla: RVCK-Project#112

--------------------------------

The bytes copy for unaligned head would cover at most SZREG-1 bytes, so
it's better to set the threshold as >= (SZREG-1 + word_copy stride size)
which equals to 9*SZREG-1.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240313091929.4029960-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>

Starting tests.
check patch done. log: https://jenkins.oerv.ac.cn/job/rvck-pipeline/job/check-patch/154/consoleFull
Kernel build success!
Lava check done! result url: https://lava.oerv.ac.cn/results/635/0_rvck_common-test_qemu
kunit test done. log: https://jenkins.oerv.ac.cn/job/rvck-pipeline/job/kunit-test/170/consoleFull

Expect to attempt the merge later this week.

sterling-teng pushed a commit that referenced this pull request on Nov 3, 2025

…nter dereference

[ Upstream commit 9cf9aa7 ]

There is a critical race condition in kprobe initialization that can
lead to NULL pointer dereference and kernel crash.

[1135630.084782] Unable to handle kernel paging request at virtual address 0000710a04630000
...
[1135630.260314] pstate: 404003c9 (nZcv DAIF +PAN -UAO)
[1135630.269239] pc : kprobe_perf_func+0x30/0x260
[1135630.277643] lr : kprobe_dispatcher+0x44/0x60
[1135630.286041] sp : ffffaeff4977fa40
[1135630.293441] x29: ffffaeff4977fa40 x28: ffffaf015340e400
[1135630.302837] x27: 0000000000000000 x26: 0000000000000000
[1135630.312257] x25: ffffaf029ed108a8 x24: ffffaf015340e528
[1135630.321705] x23: ffffaeff4977fc50 x22: ffffaeff4977fc50
[1135630.331154] x21: 0000000000000000 x20: ffffaeff4977fc50
[1135630.340586] x19: ffffaf015340e400 x18: 0000000000000000
[1135630.349985] x17: 0000000000000000 x16: 0000000000000000
[1135630.359285] x15: 0000000000000000 x14: 0000000000000000
[1135630.368445] x13: 0000000000000000 x12: 0000000000000000
[1135630.377473] x11: 0000000000000000 x10: 0000000000000000
[1135630.386411] x9 : 0000000000000000 x8 : 0000000000000000
[1135630.395252] x7 : 0000000000000000 x6 : 0000000000000000
[1135630.403963] x5 : 0000000000000000 x4 : 0000000000000000
[1135630.412545] x3 : 0000710a04630000 x2 : 0000000000000006
[1135630.421021] x1 : ffffaeff4977fc50 x0 : 0000710a04630000
[1135630.429410] Call trace:
[1135630.434828] kprobe_perf_func+0x30/0x260
[1135630.441661] kprobe_dispatcher+0x44/0x60
[1135630.448396] aggr_pre_handler+0x70/0xc8
[1135630.454959] kprobe_breakpoint_handler+0x140/0x1e0
[1135630.462435] brk_handler+0xbc/0xd8
[1135630.468437] do_debug_exception+0x84/0x138
[1135630.475074] el1_dbg+0x18/0x8c
[1135630.480582] security_file_permission+0x0/0xd0
[1135630.487426] vfs_write+0x70/0x1c0
[1135630.493059] ksys_write+0x5c/0xc8
[1135630.498638] __arm64_sys_write+0x24/0x30
[1135630.504821] el0_svc_common+0x78/0x130
[1135630.510838] el0_svc_handler+0x38/0x78
[1135630.516834] el0_svc+0x8/0x1b0

kernel/trace/trace_kprobe.c: 1308
0xffff3df8995039ec <kprobe_perf_func+0x2c>: ldr x21, [x24,#120]
include/linux/compiler.h: 294
0xffff3df8995039f0 <kprobe_perf_func+0x30>: ldr x1, [x21,x0]

kernel/trace/trace_kprobe.c
1308: head = this_cpu_ptr(call->perf_events);
1309: if (hlist_empty(head))
1310:         return 0;

crash> struct trace_event_call -o
struct trace_event_call {
  ...
  [120] struct hlist_head *perf_events; //(call->perf_event)
  ...
}
crash> struct trace_event_call ffffaf015340e528
struct trace_event_call {
  ...
  perf_events = 0xffff0ad5fa89f088, //this value is correct, but x21 = 0
  ...
}

Race Condition Analysis:

The race occurs between kprobe activation and perf_events initialization:

  CPU0                                    CPU1
  ====                                    ====
  perf_kprobe_init
    perf_trace_event_init
      tp_event->perf_events = list;(1)
      tp_event->class->reg (2)← KPROBE ACTIVE
                                          Debug exception triggers
                                          ...
                                          kprobe_dispatcher
                                            kprobe_perf_func (tk->tp.flags & TP_FLAG_PROFILE)
                                              head = this_cpu_ptr(call->perf_events)(3)
                                              (perf_events is still NULL)

Problem:
1. CPU0 executes (1) assigning tp_event->perf_events = list
2. CPU0 executes (2) enabling kprobe functionality via class->reg()
3. CPU1 triggers and reaches kprobe_dispatcher
4. CPU1 checks TP_FLAG_PROFILE - condition passes (step 2 completed)
5. CPU1 calls kprobe_perf_func() and crashes at (3) because
   call->perf_events is still NULL

CPU1 sees that kprobe functionality is enabled but does not see that
perf_events has been assigned.

Add pairing read and write memory barriers to guarantee that if CPU1
sees that kprobe functionality is enabled, it must also see that
perf_events has been assigned.

Link: https://lore.kernel.org/all/20251001022025.44626-1-chenyuan_fl@163.com/
Fixes: 50d7805 ("tracing/kprobes: Add probe handler dispatcher to support perf and ftrace concurrent use")
Cc: stable@vger.kernel.org
Signed-off-by: Yuan Chen <chenyuan@kylinos.cn>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
[ Adjust context ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
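
The pairing of write and read barriers described in that commit message can be illustrated with a userspace analogue. This is only a sketch using C11 atomics, not the kernel primitives or the actual patch; `struct event`, `FLAG_PROFILE`, `enable_event` and `handler_get_list` are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the event object: a payload pointer that
 * must be visible before the "enabled" flag is observed. */
struct event {
    int *perf_events;   /* payload, written first */
    atomic_uint flags;  /* enable flag, published last */
};

#define FLAG_PROFILE 1u /* hypothetical flag bit */

/* Writer side: store the payload, then publish the flag with release
 * semantics so the payload store cannot be reordered after it. */
static void enable_event(struct event *ev, int *list)
{
    ev->perf_events = list;
    atomic_fetch_or_explicit(&ev->flags, FLAG_PROFILE, memory_order_release);
}

/* Reader side: load the flag with acquire semantics. If the flag is
 * seen as set, the earlier payload store is guaranteed to be visible. */
static int *handler_get_list(struct event *ev)
{
    unsigned int f = atomic_load_explicit(&ev->flags, memory_order_acquire);

    if (!(f & FLAG_PROFILE))
        return NULL; /* not enabled yet: do not touch the payload */
    return ev->perf_events;
}
```

The release store on the writer and the acquire load on the reader form the pair: either the reader does not see the flag and skips the payload, or it sees the flag and also sees the assigned pointer.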

This PR merges two uaccess optimization patches to improve uaccess performance.
riscv: uaccess: Relax the threshold for fast path
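Details: This patch relaxes the threshold for the uaccess fast path on RISC-V. When data is copied between user space and kernel space and the buffer is poorly aligned, the unaligned head must be copied first, byte by byte; this covers at most SZREG-1 bytes. The previous threshold was stricter than necessary, so some copies that were large enough for the word-copy fast path still fell back to the slow byte-copy path. The patch sets the threshold to size >= (SZREG-1) + word_copy stride size, that is 9*SZREG-1, where SZREG is the register size in bytes.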
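
The threshold arithmetic can be checked with a small C sketch. The helper names are hypothetical, and the 8-register stride is an assumption inferred from the commit message's statement that (SZREG-1) plus the word_copy stride equals 9*SZREG-1; this is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the unrolled word copy moves 8 registers per iteration,
 * which is what makes (SZREG-1) + stride equal 9*SZREG-1. */
enum { UNROLL_REGS = 8 };

/* Worst-case number of head bytes copied one at a time to align dst. */
static size_t max_head_bytes(size_t szreg)
{
    return szreg - 1;
}

/* Smallest size that still leaves one full unrolled iteration after
 * the worst-case unaligned head has been copied. */
static size_t fast_path_threshold(size_t szreg)
{
    return max_head_bytes(szreg) + UNROLL_REGS * szreg;
}

/* Hypothetical helper: would a copy of `size` bytes take the fast path? */
static int takes_fast_path(size_t size, size_t szreg)
{
    return size >= fast_path_threshold(szreg);
}
```

With SZREG = 8 the threshold works out to 71 bytes, and with SZREG = 4 to 35 bytes.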
riscv: uaccess: Allow the last potential unrolled copy
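Details: When the destination pointer reaches the last aligned address at which a full unrolled block still fits, this patch lets the unrolled copy run one more iteration instead of stopping. The remaining data then benefits from the unrolled copy rather than falling through to the slower tail copy, which further improves copy efficiency.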
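
A simplified C model shows the effect of allowing that last iteration. It assumes the change amounts to making the loop's continue condition inclusive rather than strict, and that each unrolled iteration copies 8 registers; `unrolled_iterations` is a hypothetical helper, not the kernel's assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many unrolled iterations run for an aligned destination
 * range [dst, dst_end). Precondition: dst_end - dst >= 8 * szreg.
 * `inclusive` selects the relaxed condition (dst <= limit) instead of
 * the strict one (dst < limit). */
static size_t unrolled_iterations(size_t dst, size_t dst_end,
                                  size_t szreg, int inclusive)
{
    size_t stride = 8 * szreg;       /* bytes copied per iteration */
    size_t limit = dst_end - stride; /* last dst where a full block fits */
    size_t n = 0;

    do {
        dst += stride; /* stands in for the block of loads and stores */
        n++;
    } while (inclusive ? dst <= limit : dst < limit);

    return n;
}
```

For a 128-byte aligned range with SZREG = 8, the strict condition stops after one iteration and leaves 64 bytes to the tail copy, while the inclusive condition runs a second iteration and leaves nothing.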
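Testing mainly consists of uaccess boundary tests and unaligned-access tests that check whether the uaccess routines return the correct values. See: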
uaccess_test1.c
uaccess_test2.c
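
The contents of the two test files are not shown here. As a rough idea of what an unaligned-access test could look like from user space, the sketch below pushes data through a pipe with misaligned source and destination buffers and sizes around the 9*SZREG-1 threshold. It assumes that pipe `write` and `read` go through the kernel's user-copy routines; `pipe_roundtrip` and `sweep` are hypothetical helpers and are not taken from uaccess_test1.c or uaccess_test2.c.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: send `len` bytes through a pipe using buffers at
 * the given misalignments, so the kernel copies once from user space
 * (write) and once to user space (read). Returns 0 on success. */
static int pipe_roundtrip(size_t src_off, size_t dst_off, size_t len)
{
    enum { MAX = 512, GUARD = 0x5a };
    unsigned char src[MAX + 16], dst[MAX + 16];
    int fds[2];
    int rc = -1;

    /* len == 0 is rejected because reading an empty pipe would block. */
    if (len == 0 || len > MAX || src_off > 8 || dst_off > 8)
        return -1;
    if (pipe(fds) != 0)
        return -1;

    for (size_t i = 0; i < len; i++)
        src[src_off + i] = (unsigned char)(i * 7 + 1);
    memset(dst, GUARD, sizeof(dst));

    /* Check return values, payload, and the guard bytes on both sides. */
    if (write(fds[1], src + src_off, len) == (ssize_t)len &&
        read(fds[0], dst + dst_off, len) == (ssize_t)len &&
        memcmp(src + src_off, dst + dst_off, len) == 0 &&
        (dst_off == 0 || dst[dst_off - 1] == GUARD) &&
        dst[dst_off + len] == GUARD)
        rc = 0;

    close(fds[0]);
    close(fds[1]);
    return rc;
}

/* Sweep sizes around the threshold (71 on RV64) with every combination
 * of head misalignment from 0 to 7 bytes. */
static int sweep(void)
{
    for (size_t len = 60; len <= 140; len++)
        for (size_t s = 0; s < 8; s++)
            for (size_t d = 0; d < 8; d++)
                if (pipe_roundtrip(s, d, len) != 0)
                    return -1;
    return 0;
}
```

The guard bytes before and after the destination range catch a copy that writes outside the requested span, which is the kind of error an off-by-one in the unrolled loop bound would produce.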