Skip to content

Comments

Backport Linux RISC-V IOMMU Support#114

Merged
sterling-teng merged 18 commits intoRVCK-Project:rvck-6.6from
uestc-gr:iommu
Sep 23, 2025
Merged

Backport Linux RISC-V IOMMU Support#114
sterling-teng merged 18 commits intoRVCK-Project:rvck-6.6from
uestc-gr:iommu

Conversation

@uestc-gr
Copy link
Contributor

@uestc-gr uestc-gr commented Sep 1, 2025

目前基于设备树的IOMMU系列补丁已经合入开源,此PR将合入这一系列补丁

验证方法,使用qemu-9.2.0以上的版本,运行如下命令,使用qemu自带的sbi并关闭acpi,使用设备树方式,配置aia中断和iommu设备和网卡设备,然后启动设备,网卡设备可成功加入iommu_group
qemu-system-riscv64 -smp 8 -m 10G -nographic
-M virt,aia=aplic-imsic,aia-guests=5
-device riscv-iommu-pci,vendor-id=0x1efd,device-id=0x8
-device e1000e,netdev=net1 -netdev user,id=net1,net=192.168.0.0/24
-device e1000e,netdev=net2 -netdev user,id=net2,net=192.168.200.0/24
........

@oervci
Copy link

oervci commented Sep 1, 2025

开始测试

@oervci
Copy link

oervci commented Sep 1, 2025

开始测试

@oervci
Copy link

oervci commented Sep 1, 2025

@oervci
Copy link

oervci commented Sep 1, 2025

@oervci
Copy link

oervci commented Sep 1, 2025

Kernel build success!

mainline inclusion
from mainline-v6.6
commit 06c3750
category: feature
bugzilla: RVCK-Project#112

--------------------------------

Following the pattern of identity domains, just assign the BLOCKED domain
global statics to a value in ops. Update the core code to use the global
static directly.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Sven Peter <sven@svenpeter.dev>
Link: https://lore.kernel.org/r/1-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.6
commit 13fbceb
category: feature
bugzilla: RVCK-Project#112

--------------------------------

Move the global static blocked domain to the ops and convert the unmanaged
domain to domain_alloc_paging.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Sven Peter <sven@svenpeter.dev>
Link: https://lore.kernel.org/r/4-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.8
commit 0061ffe
category: feature
bugzilla: RVCK-Project#112

--------------------------------

The current device_release callback for individual iommu drivers does the
following:

1) Silent IOMMU DMA translation: It detaches any existing domain from the
   device and puts it into a blocking state (some drivers might use the
   identity state).
2) Resource release: It releases resources allocated during the
   device_probe callback and restores the device to its pre-probe state.

Step 1 is challenging for individual iommu drivers because each must check
if a domain is already attached to the device. Additionally, if a deferred
attach never occurred, the device_release should avoid modifying hardware
configuration regardless of the reason for its call.

To simplify this process, introduce a static release_domain within the
iommu_ops structure. It can be either a blocking or identity domain
depending on the iommu hardware. The iommu core will decide whether to
attach this domain before the device_release callback, eliminating the
need for repetitive code in various drivers.

Consequently, the device_release callback can focus solely on the opposite
operations of device_probe, including releasing all resources allocated
during that callback.

Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240305013305.204605-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from Linux Linux 6.7-rc3
commit 66b73e9
category: feature
bugzilla: https://github.com/RVCK-Project/rvck-olk/issues/78

--------------------------------

sizes.h has a gap in defines between SZ_32G and SZ_64T. Add the missing
defines so they can be used in drivers.

Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Signed-off-by: Sarah Walker <sarah.walker@imgtec.com>
Signed-off-by: Donald Robson <donald.robson@imgtec.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/58b227d96f27859b453caf0ceaaac81a6616304b.1700668843.git.donald.robson@imgtec.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from Linux 6.8-rc5
commit b42a905
category: feature
bugzilla: RVCK-Project#112

--------------------------------

The xlate callbacks are supposed to translate of_phandle_args to proper
provider without modifying the of_phandle_args.  Make the argument
pointer to const for code safety and readability.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240216144027.185959-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from Linux 6.9-rc4
commit 06c3750
category: feature
bugzilla: RVCK-Project#112

--------------------------------

In order to improve observability and accountability of IOMMU layer, we
must account the number of pages that are allocated by functions that
are calling directly into buddy allocator.

This is achieved by first wrapping the allocation related functions into a
separate inline functions in new file:

drivers/iommu/iommu-pages.h

Convert all page allocation calls under iommu/intel to use these new
functions.

Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20240413002522.1101315-2-pasha.tatashin@soleen.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from Linux 6.12-rc6
commit 14d050c
category: feature
bugzilla: RVCK-Project#112

--------------------------------

Add bindings for the RISC-V IOMMU device drivers.

Co-developed-by: Anup Patel <apatel@ventanamicro.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/6c255602e296feaf0f005b498de4e6fdf8686ff7.1729059707.git.tjeznach@rivosinc.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from Linux 6.12-rc6
commit 5c0ebbd
category: feature
bugzilla: RVCK-Project#112

--------------------------------

Introduce platform device driver for implementation of RISC-V IOMMU
architected hardware.

Hardware interface definition located in file iommu-bits.h is based on
ratified RISC-V IOMMU Architecture Specification version 1.0.0.

This patch implements platform device initialization, early check and
configuration of the IOMMU interfaces and enables global pass-through
address translation mode (iommu_mode == BARE), without registering
hardware instance in the IOMMU subsystem.

Link: https://github.com/riscv-non-isa/riscv-iommu
Co-developed-by: Nick Kossifidis <mick@ics.forth.gr>
Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Co-developed-by: Sebastien Boeuf <seb@rivosinc.com>
Signed-off-by: Sebastien Boeuf <seb@rivosinc.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/2f2e4530c0ee4a81385efa90f1da932f5179f3fb.1729059707.git.tjeznach@rivosinc.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from Linux 6.12-rc6
commit 68682e9
category: feature
bugzilla: RVCK-Project#112

--------------------------------

Introduce device driver for PCIe implementation
of RISC-V IOMMU architected hardware.

IOMMU hardware and system support for MSI or MSI-X is
required by this implementation.

Vendor and device identifiers used in this patch
matches QEMU implementation of the RISC-V IOMMU PCIe
device, from Rivos VID (0x1efd) range allocated by the PCI-SIG.

MAINTAINERS | added iommu-pci.c already covered by matching pattern.

Link: https://lore.kernel.org/qemu-devel/20240307160319.675044-1-dbarboza@ventanamicro.com/
Co-developed-by: Nick Kossifidis <mick@ics.forth.gr>
Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/12f3bdbe519ebb7ca482191e7334d38b25b8ae8f.1729059707.git.tjeznach@rivosinc.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from Linux 6.12-rc6
commit 822e8bc
category: feature
bugzilla: RVCK-Project#112

--------------------------------

Advertise IOMMU device and its core API.
Only minimal implementation for single identity domain type, without
per-group domain protection.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/ba79c8eb9c7f1cd9a8961a1b048e3991ee9a2b05.1729059707.git.tjeznach@rivosinc.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from Linux 6.12-rc6
commit 1bac10c
category: feature
bugzilla: RVCK-Project#112

--------------------------------

Introduce device context allocation and device directory tree
management including capabilities discovery sequence, as described
in Chapter 2.1 of the RISC-V IOMMU Architecture Specification.

Device directory mode will be auto detected using DDTP WARL property,
using highest mode supported by the driver and hardware. If none
supported can be configured, driver will fall back to global pass-through.

First level DDTP page can be located in I/O (detected using DDTP WARL)
and system memory.

Only simple identity and blocking protection domains are supported by
this implementation.

Co-developed-by: Nick Kossifidis <mick@ics.forth.gr>
Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/e1c763aeccd2c05fd4ad3a32f6f2ff3b3148d907.1729059707.git.tjeznach@rivosinc.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from Linux 6.12-rc6
commit 856c0cf
category: feature
bugzilla: RVCK-Project#112

--------------------------------

Introduce device command submission and fault reporting queues,
as described in Chapter 3.1 and 3.2 of the RISC-V IOMMU Architecture
Specification.

Command and fault queues are instantiated in contiguous system memory
local to IOMMU device domain, or mapped from fixed I/O space provided
by the hardware implementation. Detection of the location and maximum
allowed size of the queue utilize WARL properties of queue base control
register. Driver implementation will try to allocate up to 128KB of
system memory, while respecting hardware supported maximum queue size.

Interrupts allocation is based on interrupt vectors availability and
distributed to all queues in simple round-robin fashion. For hardware
Implementation with fixed event type to interrupt vector assignment
IVEC WARL property is used to discover such mappings.

Address translation, command and queue fault handling in this change
is limited to simple fault reporting without taking any action.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/c4735fb6829053eff37ce1bcca4906192afd743c.1729059707.git.tjeznach@rivosinc.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from Linux 6.12-rc6
commit 488ffbf
category: feature
bugzilla: RVCK-Project#112

--------------------------------

Introduce first-stage address translation support.

Page table configured by the IOMMU driver will use the highest mode
implemented by the hardware, unless not known at the domain allocation
time falling back to the CPU’s MMU page mode.

This change introduces IOTINVAL.VMA command, required to invalidate
any cached IOATC entries after mapping is updated and/or removed from
the paging domain.  Invalidations for the non-leaf page entries use
IOTINVAL for all addresses assigned to the protection domain for
hardware not supporting more granular non-leaf page table cache
invalidations.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/1109202d389f51c7121cb1460eb2f21429b9bd5d.1729059707.git.tjeznach@rivosinc.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.13-rc4
commit d5f88ac
category: feature
bugzilla: RVCK-Project#112

--------------------------------

Apply platform_device_msi_init_and_alloc_irqs() to add support for
MSIs when the IOMMU is a platform device.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20241112133504.491984-4-ajones@ventanamicro.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.13-rc7
commit 8d8d375
category: feature
bugzilla: RVCK-Project#112

--------------------------------

Changing cqen/fqen/pqen from 0 to 1 sets the cqh/fqt/pqt registers to 0.
But the cqt/fqh/pqh registers are left unmodified. This commit resets
cqt/fqh/pqh registers to ensure corresponding queues are empty before
being enabled during initialization.

Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
Link: https://lore.kernel.org/r/20250103093220.38106-2-luxu.kernel@bytedance.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.13-rc7
commit 8d8d375
category: feature
bugzilla: RVCK-Project#112

--------------------------------

This commit supplies shutdown callback for iommu driver. The shutdown
callback resets necessary registers so that newly booted kernel can pass
riscv_iommu_init_check() after kexec. Also, the shutdown callback resets
iommu mode to bare instead of off so that new kernel can still use PCIE
devices even when CONFIG_RISCV_IOMMU is not enabled.

Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
Link: https://lore.kernel.org/r/20250103093220.38106-3-luxu.kernel@bytedance.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.13
commit 10c62c3
category: feature
bugzilla: RVCK-Project#112

--------------------------------

When __BITS_PER_LONG == 32, size_t is defined as unsigned int rather
than unsigned long. Therefore, we should use size_t to avoid
type-checking errors.

Fixes: 488ffbf ("iommu/riscv: Paging domain support")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Tomasz Jeznach <tjeznach@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Link: https://lore.kernel.org/r/20250103024616.3359159-1-guoren@kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.17-rc3
commit 99d4d1a
category: feature
bugzilla: RVCK-Project#112

--------------------------------

The riscv_iommu_pte_fetch() function returns either NULL for
unmapped/never-mapped iova, or a valid leaf pte pointer that
requires no further validation.

riscv_iommu_iova_to_phys() failed to handle NULL returns.
Prevent null pointer dereference in
riscv_iommu_iova_to_phys(), and remove the pte validation.

Fixes: 488ffbf ("iommu/riscv: Paging domain support")
Cc: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: XianLiang Huang <huangxianliang@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250820072248.312-1-huangxianliang@lanxincomputing.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
@uestc-gr
Copy link
Contributor Author

uestc-gr commented Sep 8, 2025

iommu的基础功能重新合入,该系列补丁完全是移植开源社区正式补丁,已测试通过,因我们有多个重要功能的开发依赖于这一系列补丁,麻烦优先评审下,多谢

@sterling-teng
Copy link
Contributor

iommu的基础功能重新合入,该系列补丁完全是移植开源社区正式补丁,已测试通过,因我们有多个重要功能的开发依赖于这一系列补丁,麻烦优先评审下,多谢

抱歉,因为调试CI期间累积pr过多,所以耽搁了些时日。该pr补丁没有问题,已经进入物理机测试阶段,通过测试后将合并。

@sterling-teng sterling-teng merged commit 939db55 into RVCK-Project:rvck-6.6 Sep 23, 2025
1 check failed
@uestc-gr uestc-gr deleted the iommu branch September 24, 2025 02:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants