Defer TLB flush on unmap to improve system performance #222

Open
uestc-gr wants to merge 4 commits into RVCK-Project:rvck-6.6 from uestc-gr:unmap_tlb

@uestc-gr
Contributor

issues: #221

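Backport of upstream patches. Tested with the test case from the patches; batched unmap performance improves substantially.

Before the patches: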

~ # time /home/unmap_tlb_flush
Testing with memory size: 256 MB
Threads created, starting swap test...
Completed 1 iterations
Completed 2 iterations
Completed 3 iterations
Completed 4 iterations
Completed 5 iterations
Completed 6 iterations
Completed 7 iterations
Completed 8 iterations
Completed 9 iterations
Completed 10 iterations
Test completed, cleaning up...
Done.
real 10m 56.58s
user 0m 1.88s
sys 10m 53.74s

After applying the patches:

~ # time /home/unmap_tlb_flush
Testing with memory size: 256 MB
Threads created, starting swap test...
Completed 1 iterations
Completed 2 iterations
Completed 3 iterations
Completed 4 iterations
Completed 5 iterations
Completed 6 iterations
Completed 7 iterations
Completed 8 iterations
Completed 9 iterations
Completed 10 iterations
Test completed, cleaning up...
Done.
real 2m 56.10s
user 0m 1.54s
sys 2m 46.82s
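
To put the two runs above in numbers, the following is a minimal sketch that converts the `time` output to seconds and computes the speedup. The helper names are hypothetical and only illustrate the arithmetic; the inputs are the `real` and `sys` figures reported above.

```c
/* Convert a "Xm Y.ZZs" reading from time(1) into seconds. */
static double to_seconds(unsigned int minutes, double seconds)
{
        return minutes * 60.0 + seconds;
}

/* How many times faster the patched run is. */
static double speedup(double before, double after)
{
        return before / after;
}

/* Percentage of the original time that was saved. */
static double reduction_percent(double before, double after)
{
        return (before - after) / before * 100.0;
}
```

With real going from 10m 56.58s to 2m 56.10s, this works out to roughly a 3.7x speedup (about 73% less wall-clock time); sys time drops by a similar proportion.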

Alexandre Ghiti added 2 commits February 11, 2026 14:20
mainline inclusion
from mainline-6.7-rc1
commit 9e11306
category: feature
bugzilla: RVCK-Project#221

--------------------------------

flush_tlb_range() uses a fixed stride of PAGE_SIZE and in its current form,
when a hugetlb mapping needs to be flushed, flush_tlb_range() flushes the
whole tlb: so set a stride of the size of the hugetlb mapping in order to
only flush the hugetlb mapping. However, if the hugepage is a NAPOT region,
all PTEs that constitute this mapping must be invalidated, so the stride
size must actually be the size of the PTE.

Note that THPs are directly handled by flush_pmd_tlb_range().

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/Five SMARC
Link: https://lore.kernel.org/r/20231030133027.19542-3-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
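
The stride choice described in the commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel's flush_tlb_range(): the function names, the 4 KiB base page size, and passing the NAPOT PTE size as a parameter are all hypothetical.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed base page size */

/*
 * Hypothetical model of the stride selection:
 *  - regular mapping: step one base page at a time
 *  - hugetlb mapping: step one huge page at a time
 *  - NAPOT hugetlb mapping: step one constituent PTE at a time,
 *    because every PTE of the NAPOT region must be invalidated
 */
static unsigned long sketch_flush_stride(bool is_hugetlb,
                                         unsigned long hugepage_size,
                                         bool is_napot,
                                         unsigned long napot_pte_size)
{
        if (!is_hugetlb)
                return SKETCH_PAGE_SIZE;
        if (is_napot)
                return napot_pte_size;
        return hugepage_size;
}

/* Number of per-address invalidations needed to cover [start, end). */
static unsigned long sketch_flush_count(unsigned long start,
                                        unsigned long end,
                                        unsigned long stride)
{
        return (end - start + stride - 1) / stride;
}
```

In this model a 2 MiB hugetlb mapping is covered by a single invalidation, while a 64 KiB NAPOT mapping built from 4 KiB PTEs still needs 16.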
mainline inclusion
from mainline-6.8-rc1
commit 54d7431
category: feature
bugzilla: RVCK-Project#221

--------------------------------

Allow deferring the TLB flush when unmapping pages, which reduces the
number of IPIs and the number of sfence.vma instructions.

The microbenchmark used in commit 43b3dfd ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration"),
made multithreaded to force the use of IPIs, shows a good performance
improvement on all platforms:

* Unmatched: ~34%
* TH1520   : ~78%
* Qemu     : ~81%

In addition, perf on qemu reports a significant decrease in time spent
dealing with IPIs:

Before:  68.17%  main     [kernel.kallsyms]            [k] __sbi_rfence_v02_call
After :   8.64%  main     [kernel.kallsyms]            [k] __sbi_rfence_v02_call

* Benchmark:

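#define _GNU_SOURCE     /* pthread_setaffinity_np(), CPU_SET(), MADV_PAGEOUT */
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * SIZE is not defined in the commit message. 256 MB is an assumption
 * taken from the test output earlier in this PR; adjust as needed.
 */
#define SIZE (256 * 1024 * 1024)

/* Pin the calling thread to the given core. */
int stick_this_thread_to_core(int core_id) {
        int num_cores = sysconf(_SC_NPROCESSORS_ONLN);
        if (core_id < 0 || core_id >= num_cores)
                return EINVAL;

        cpu_set_t cpuset;
        CPU_ZERO(&cpuset);
        CPU_SET(core_id, &cpuset);

        pthread_t current_thread = pthread_self();
        return pthread_setaffinity_np(current_thread,
                                      sizeof(cpu_set_t), &cpuset);
}

/* Idle thread pinned to its own core, so TLB flushes must reach other CPUs. */
static void *fn_thread (void *p_data)
{
        stick_this_thread_to_core((int)(intptr_t)p_data);

        while (1) {
                sleep(1);
        }

        return NULL;
}

int main()
{
        volatile unsigned char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        pthread_t threads[4];
        int ret;

        if (p == MAP_FAILED) {
                perror("mmap");
                return 1;
        }

        for (int i = 0; i < 4; ++i) {
                ret = pthread_create(&threads[i], NULL, fn_thread,
                                     (void *)(intptr_t)i);
                if (ret)
                {
                        printf("%s", strerror (ret));
                }
        }

        /* Touch every page so that it is actually populated. */
        memset((void *)p, 0x88, SIZE);

        for (int k = 0; k < 10000; k++) {
                /* swap in */
                for (int i = 0; i < SIZE; i += 4096) {
                        (void)p[i];
                }

                /* swap out */
                madvise((void *)p, SIZE, MADV_PAGEOUT);
        }

        for (int i = 0; i < 4; i++)
        {
                pthread_cancel(threads[i]);
        }

        for (int i = 0; i < 4; i++)
        {
                pthread_join(threads[i], NULL);
        }

        return 0;
}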

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Tested-by: Jisheng Zhang <jszhang@kernel.org> # Tested on TH1520
Tested-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20240108193640.344929-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
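
The effect of deferring the flush can be sketched with a toy userspace model that counts simulated remote flushes. This is a hedged illustration only: the struct, the function names, and the 64-bit mask are hypothetical and do not mirror the kernel's batching interface.

```c
#include <stdint.h>

/* Hypothetical batch state: one bit per CPU that may hold stale entries. */
struct sketch_tlb_batch {
        uint64_t cpumask;
};

/* Counts simulated remote flushes (each stands for one round of IPIs). */
static unsigned long sketch_ipi_rounds;

static void sketch_remote_flush(uint64_t cpumask)
{
        if (cpumask)
                sketch_ipi_rounds++;
}

/* Immediate mode: one remote flush per unmapped page. */
static void sketch_unmap_immediate(uint64_t mm_cpumask, unsigned int npages)
{
        for (unsigned int i = 0; i < npages; i++)
                sketch_remote_flush(mm_cpumask);
}

/* Deferred mode: only record which CPUs will need a flush later. */
static void sketch_batch_add(struct sketch_tlb_batch *b, uint64_t mm_cpumask)
{
        b->cpumask |= mm_cpumask;
}

/* Deferred mode: a single remote flush for the whole batch. */
static void sketch_batch_flush(struct sketch_tlb_batch *b)
{
        sketch_remote_flush(b->cpumask);
        b->cpumask = 0;
}
```

In this model, unmapping 256 pages costs 256 remote flushes in immediate mode and one in deferred mode, which matches the direction of the perf numbers quoted above.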
@github-actions

github-actions bot commented Feb 11, 2026


Test started. Log: https://github.com/RVCK-Project/rvck/actions/runs/21894978777

Parsed arguments

args            value
repository      RVCK-Project/rvck
head ref        pull/222/head
base ref        rvck-6.6
LAVA repo       RVCK-Project/lavaci
LAVA Template   lava-job-template/qemu/qemu-ltp.yaml
Testcase path   lava-testcases/common-test/ltp/ltp.yaml
need run job    kunit-test,kernel-build,check-patch,lava-trigger

Testing complete

Detailed results:

RVCK result

check          result
kunit-test     success
kernel-build   success
lava-trigger   success
check-patch    success

Kunit Test Result

[06:29:19] Testing complete. Ran 457 tests: passed: 445, skipped: 12

Kernel Build Result

Kernel build succeeded: RVCK-Project/rvck/222/

591c03e1eb5e54bec8a9067becf7b688 /srv/guix_result/0923fab5c983410da4748e639a9d3f34fd411cd6/Image
7918b6eba5e1c00f0214e6624775aa4b /root/initramfs.img

LAVA Check

args:

result:

Lava check done! lava log: https://lava.oerv.ac.cn/scheduler/job/1418

lava result count: [fail]: 175, [pass]: 1434, [skip]: 290

Check Patch Result

Total Errors 0
Total Warnings 3

Alexandre Ghiti and others added 2 commits February 11, 2026 16:03
mainline inclusion
from mainline-6.8-rc4
commit 3951f6a
category: feature
bugzilla: RVCK-Project#221

--------------------------------

We must clear the cpumask once we have flushed the batch, otherwise cpus
accumulate and we end up sending IPIs to more cpus than needed.

Fixes: 54d7431 ("riscv: Add support for BATCHED_UNMAP_TLB_FLUSH")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Link: https://lore.kernel.org/r/20240130115508.105386-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
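
The accumulation problem fixed by the commit above can be shown with a small model. This is a sketch under assumptions: the struct, the function names, and the `clear_after` flag are hypothetical and exist only to contrast the behaviour before and after the fix.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical batch state: one bit per CPU to be flushed. */
struct demo_batch {
        uint64_t cpumask;
};

/* Count the CPUs set in the mask. */
static unsigned int demo_count_cpus(uint64_t mask)
{
        unsigned int n = 0;

        for (; mask; mask &= mask - 1)
                n++;
        return n;
}

/*
 * Flush the batch and return how many CPUs were targeted.
 * clear_after models the fix: reset the mask once the flush is done.
 */
static unsigned int demo_flush(struct demo_batch *b, bool clear_after)
{
        unsigned int targeted = demo_count_cpus(b->cpumask);

        if (clear_after)
                b->cpumask = 0;
        return targeted;
}
```

Without the clear, a second batch that only involves one new CPU still targets the CPUs left over from the first batch; with the clear, it targets just that one CPU.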
mainline inclusion
from mainline-6.9-rc2
commit 674bc01
category: feature
bugzilla: RVCK-Project#221

--------------------------------

__flush_tlb_range() does not modify the provided cpumask, so its cmask
parameter can be pointer-to-const. This avoids the unsafe cast of
cpu_online_mask.

Fixes: 54d7431 ("riscv: Add support for BATCHED_UNMAP_TLB_FLUSH")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240301201837.2826172-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
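
The const-correctness point in the commit above can be illustrated in plain C. This is a minimal sketch: `demo_cpumask`, `demo_online_mask`, and `demo_range_flush_targets` are hypothetical stand-ins, not the kernel's types or functions.

```c
#include <stdint.h>

struct demo_cpumask {
        uint64_t bits;
};

/* Stand-in for a read-only global mask such as cpu_online_mask. */
static const struct demo_cpumask demo_online_mask = { .bits = 0xF };

/*
 * The function only reads the mask, so the parameter is pointer-to-const.
 * Callers can then pass a const object directly, with no cast that
 * discards the qualifier.
 */
static unsigned int demo_range_flush_targets(const struct demo_cpumask *cmask)
{
        unsigned int n = 0;

        for (uint64_t m = cmask->bits; m; m &= m - 1)
                n++;
        return n;
}
```

Calling `demo_range_flush_targets(&demo_online_mask)` compiles cleanly; with a non-const parameter the same call would need a cast.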
@github-actions

github-actions bot commented Feb 11, 2026


Test started. Log: https://github.com/RVCK-Project/rvck/actions/runs/21897220829

Parsed arguments

args            value
repository      RVCK-Project/rvck
head ref        pull/222/head
base ref        rvck-6.6
LAVA repo       RVCK-Project/lavaci
LAVA Template   lava-job-template/qemu/qemu-ltp.yaml
Testcase path   lava-testcases/common-test/ltp/ltp.yaml
need run job    kunit-test,kernel-build,check-patch,lava-trigger

Testing complete

Detailed results:

RVCK result

check          result
kunit-test     success
kernel-build   success
lava-trigger   success
check-patch    success

Kunit Test Result

[08:09:12] Testing complete. Ran 457 tests: passed: 445, skipped: 12

Kernel Build Result

Kernel build succeeded: RVCK-Project/rvck/222/

a5e3c9075e2abc19384a60d0e5595e10 /srv/guix_result/9f3c81015b46e6eea0158c3011297a1cd23aac2a/Image
4ffb1b3948c6a0b46f4ddab6a0e993c0 /root/initramfs.img

LAVA Check

args:

result:

Lava check done! lava log: https://lava.oerv.ac.cn/scheduler/job/1419

lava result count: [fail]: 174, [pass]: 1435, [skip]: 290

Check Patch Result

Total Errors 0
Total Warnings 5
