Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
54 changes: 54 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ The project demonstrates how to assemble a small SPH solver with Julia. It focus
- **Dynamic boundary condition** – inspired by DualSPHysics.
- **Density diffusion** – based on Fourtakas et al. 2019 to reduce pressure noise.
- **Wendland quintic kernel** – simple and stable without tensile corrections.
- **CUDA neighbor lists** – optional GPU neighbor cell list and loop utilities for CUDA arrays.

## Folder Structure

Expand Down Expand Up @@ -91,6 +92,59 @@ one plus the particles in its neighbor stencil.
Neighbor rebuilds use a per-cell ordering buffer without reordering particle
storage.

### CUDA Neighbor Utilities

This package ships with an optional CUDA-focused neighbor list to offload the
cell list construction and neighbor iteration. Install CUDA.jl in the project
environment before using these helpers:

```julia
using Pkg
Pkg.add("CUDA")
```

The CUDA utilities are only loaded when CUDA.jl is available. The workflow is:

1. Build a `CUDACellGrid` describing the simulation bounds and desired cell size.
2. Allocate a `CUDANeighborList` once for the particle count.
3. Call `UpdateNeighborsCUDA!` each step after updating positions.
4. Launch `NeighborLoopCUDA!` with a GPU kernel that computes pairwise
interactions.
For per-particle accumulation (mirroring `NeighborLoopPerParticle!`),
use `NeighborLoopPerParticleCUDA!` with init/interact/final callbacks.

Here is a minimal sketch:

```julia
using CUDA
using SPHExample

grid = CUDACellGrid(SVector(0.0f0, 0.0f0), 0.01f0, SVector(128, 128))
neighbor_list = AllocateCUDANeighborList(grid, length(position))

UpdateNeighborsCUDA!(neighbor_list, position)

function InteractionKernel(i, j, acc, pos)
@inbounds acc[i] += pos[j] - pos[i]
return nothing
end

NeighborLoopCUDA!(InteractionKernel, neighbor_list, acc, position)
```

These CUDA utilities are intended as building blocks for integrating GPU
workflows into the existing solver. You can keep the CPU path intact and
incrementally replace the neighbor loop where it makes sense for your use case.
The `example/StillWedgeMDBC_CUDA.jl` script shows how to load the StillWedge
geometry and run a CUDA neighbor-count pass using these utilities.

For an end-to-end GPU path (limited to `NoShifting`, `NoKernelOutput`, `NoMDBC`,
`ArtificialViscosity`, and `LinearDensityDiffusion`), use `RunSimulationCUDA`
from the `SPHCUDASimulation` module. This runs the StillWedge-style workflow on
CUDA arrays and builds neighbor lists on the CPU, then launches GPU neighbor
loops (particle data is synced to the CPU only for neighbor-list construction
and output/logging).

## Help

Questions or issues can be posted on the GitHub issue tracker. Response times may vary but all feedback is welcome.
Expand Down
133 changes: 133 additions & 0 deletions example/StillWedgeMDBC_CUDA.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,133 @@
using SPHExample
using CUDA
using StaticArrays

if !SPHExample.HasCUDA
error("CUDA.jl is required to run this example.")
end

@inline function ComputeBounds(Position)
MinCorner = reduce((A, B) -> min.(A, B), Position)
MaxCorner = reduce((A, B) -> max.(A, B), Position)
return MinCorner, MaxCorner
end

@inline function BuildGrid(MinCorner, MaxCorner, CellSize)
Extents = MaxCorner - MinCorner
Dims = SVector{length(MinCorner), Int}(ceil.(Int, Extents ./ CellSize) .+ 1)
return CUDACellGrid(MinCorner, CellSize, Dims)
end

@inline function SquaredNorm(Vector)
Accumulator = zero(eltype(Vector))
@inbounds for Index in eachindex(Vector)
Accumulator += Vector[Index] * Vector[Index]
end
return Accumulator
end

@inline function NeighborCountInteractionKernel(Index, NeighborIndex, Counts, Position, SupportRadiusSquared)
Δx = Position[Index] - Position[NeighborIndex]
if SquaredNorm(Δx) <= SupportRadiusSquared
CUDA.atomic_add!(Counts, Index, 1)
end
return nothing
end

function CountNeighborsCUDA(Position, NeighborList, SupportRadiusSquared)
Counts = CUDA.zeros(Int, length(Position))
NeighborLoopCUDA!(NeighborCountInteractionKernel, NeighborList, Counts, Position, SupportRadiusSquared)
return Counts
end

let
Dimensions = 2
FloatType = Float32

SimConstantsWedge = SimulationConstants{FloatType}(dx=0.02f0, c₀=42.485762f0, δᵩ=0.1f0, CFL=0.5f0)

FixedBoundary = Geometry{Dimensions, FloatType}(
CSVFile = "./input/still_wedge/StillWedge_Dp$(SimConstantsWedge.dx)_Bound.csv",
GroupMarker = 1,
Type = Fixed,
Motion = nothing,
)

Water = Geometry{Dimensions, FloatType}(
CSVFile = "./input/still_wedge/StillWedge_Dp$(SimConstantsWedge.dx)_Fluid.csv",
GroupMarker = 2,
Type = Fluid,
Motion = nothing,
)

SimMetaDataWedge = SimulationMetaData{Dimensions, FloatType, NoShifting, NoKernelOutput, NoMDBC, StoreLog}(
SimulationName="StillWedge",
SaveLocation="W:/Simulations/StillWedge2D_MDBC_CUDA",
SimulationTime=4.0f0,
OutputTimes=0.01f0,
VisualizeInParaview=true,
ExportSingleVTKHDF=true,
ExportGridCells=true,
OpenLogFile=true,
)

if !isdir(SimMetaDataWedge.SaveLocation)
mkdir(SimMetaDataWedge.SaveLocation)
end

SimulationGeometry = [FixedBoundary; Water]
SimParticles = AllocateDataStructures(SimulationGeometry, SimMetaDataWedge)

SimKernel = SPHKernelInstance{Dimensions, FloatType}(WendlandC2(); dx = SimConstantsWedge.dx)
SimLogger = SimulationLogger(SimMetaDataWedge.SaveLocation)

CleanUpSimulationFolder(SimMetaDataWedge.SaveLocation)

PositionGPU = CuArray(SimParticles.Position)

ParticleRanges = zeros(Int, length(SimParticles) + 2)
UniqueCells = zeros(CartesianIndex{Dimensions}, length(SimParticles))
CellDict = Dict{CartesianIndex{Dimensions}, Int}()
FullStencil = ConstructStencil(Val(Dimensions))
NeighborCellLists = [Int[] for _ in 1:length(UniqueCells)]
ParticleOrder = zeros(Int, length(SimParticles))
CellOffsets = zeros(Int, length(ParticleRanges))
CellIdsHost = zeros(Int, length(SimParticles))

IndexCounter = UpdateNeighbors!(SimParticles, SimKernel.H⁻¹, ParticleRanges, UniqueCells, CellDict, ParticleOrder, CellOffsets)
UniqueCellsView = view(UniqueCells, 1:IndexCounter)
BuildNeighborCellLists!(NeighborCellLists, FullStencil, UniqueCellsView, ParticleRanges, CellDict)
@inbounds for Index in eachindex(CellIdsHost, SimParticles.Cells)
CellIdsHost[Index] = CellDict[SimParticles.Cells[Index]]
end
NeighborListPacked = UpdateNeighborsCPUToCUDA!(
nothing,
CellIdsHost,
ParticleOrder,
view(ParticleRanges, 1:(IndexCounter + 1)),
view(NeighborCellLists, 1:IndexCounter),
)

NeighborCounts = CountNeighborsCUDA(PositionGPU, NeighborListPacked, SimKernel.H²)

NeighborCountsHost = Array(NeighborCounts)
MinimumNeighbors = minimum(NeighborCountsHost)
MaximumNeighbors = maximum(NeighborCountsHost)
AverageNeighbors = sum(NeighborCountsHost) / length(NeighborCountsHost)

println("StillWedge CUDA neighbor stats (within H):")
println(" Min neighbors: ", MinimumNeighbors)
println(" Max neighbors: ", MaximumNeighbors)
println(" Avg neighbors: ", AverageNeighbors)

RunSimulationCUDA(
SimGeometry = SimulationGeometry,
SimMetaData = SimMetaDataWedge,
SimConstants = SimConstantsWedge,
SimKernel = SimKernel,
SimLogger = SimLogger,
SimParticles = SimParticles,
SimViscosity = ArtificialViscosity(),
SimDensityDiffusion = LinearDensityDiffusion(),
)
end
Loading