Interactive real-time 2D fluid simulation powered only by neural networks
2026-01-12.20-53-35.mp4
FluidNet is an interactive real-time fluid simulation system that aims to combine the visual quality of interactive physics-based fluid solvers with the speed of neural network inference (not there yet). A convolutional neural network (UNet) is trained on Blender Mantaflow simulations, then deployed in a C++ engine for real-time autoregressive rollout.
Note: The entire pipeline currently operates at 128×128 resolution. This project was developed on a single consumer-grade machine - while the code supports higher resolutions (256×256+), generating training data and training at higher resolutions requires significant compute time. A higher-resolution pipeline is planned for future work when resources permit.
Key Features:
- Neural network trained on generated Blender/Mantaflow simulations
- Real-time inference using ONNX Runtime (CPU or GPU)
- Interactive controls: inject density via an emitter mask, apply forces, and add obstacles via a collider mask
End-to-end pipeline that transforms 3D physics simulations into a multi-frame 2D dataset used for training:
```mermaid
graph LR
    A[Blender Simulation] --> B[VDB/Alembic Export]
    B --> C[NPZ Conversion]
    C --> D[CNN Training]
    D --> E[ONNX Export]
    E --> F[C++ Engine]
    style A fill:#4FC3F7,stroke:#01579B,stroke-width:2px,color:#000
    style B fill:#FFB74D,stroke:#E65100,stroke-width:2px,color:#000
    style C fill:#F06292,stroke:#880E4F,stroke-width:2px,color:#000
    style D fill:#66BB6A,stroke:#1B5E20,stroke-width:2px,color:#000
    style E fill:#BA68C8,stroke:#4A148C,stroke-width:2px,color:#000
    style F fill:#FF7043,stroke:#BF360C,stroke-width:2px,color:#000
```
- Blender Simulation → Automated generation of randomized fluid scenarios using the Mantaflow solver (vdb-tools/README.md)
- VDB/Alembic Export → 3D volumetric cache files and mesh metadata
- NPZ Conversion → Transform 3D simulations to 2D training data with normalization
- CNN Training → UNet learns fluid dynamics (ml/README.md)
- ONNX Export → Optimized model format for C++ inference
- C++ Engine → Real-time interactive execution (engine/README.md)
Training data is generated through an automated pipeline:
1. Blender Simulation Generation: Headless Blender scripts create randomized fluid scenarios using Mantaflow, producing 3D volumetric data (VDB format) and mesh animations (Alembic format).
2. 3D to 2D Projection (vdb-tools): A custom pipeline converts 3D simulations into 2D training data by projecting volumetric fields onto a plane and extracting the relevant physics quantities (density, velocity, emitter/collider masks).
3. NPZ Dataset: Processed sequences are saved as compressed NPZ files, ready for PyTorch training. Each sample contains the current state, previous state, and boundary conditions needed to predict the next frame.
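As a rough sketch of steps 2-3 (the actual vdb-tools projection method and NPZ key names may differ, so treat the names below as assumptions):

```python
import numpy as np

def project_to_2d(volume):
    """Collapse a 3D field (x, y, z) to 2D by averaging along the depth axis.
    The real vdb-tools projection may use a different reduction."""
    return volume.mean(axis=1)

# Hypothetical 3D fields read from a VDB cache at frame t
density_3d = np.random.rand(128, 128, 128).astype(np.float32)
velx_3d = np.random.rand(128, 128, 128).astype(np.float32)
velz_3d = np.random.rand(128, 128, 128).astype(np.float32)

sample = {
    "density_t": project_to_2d(density_3d),           # current density, (128, 128)
    "velx_t": project_to_2d(velx_3d),                 # current x-velocity
    "velz_t": project_to_2d(velz_3d),                 # current z-velocity
    "density_tm1": np.zeros((128, 128), np.float32),  # previous-frame density
    "emitter_t": np.zeros((128, 128), np.float32),    # emitter mask (boundary condition)
    "collider_t": np.zeros((128, 128), np.float32),   # collider mask (boundary condition)
    "density_tp1": np.zeros((128, 128), np.float32),  # training target: next frame
}
np.savez_compressed("sample_0001.npz", **sample)
```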
The model is a UNet-based encoder-decoder architecture with skip connections. Different configurations were tested extensively, varying depth, base channel count, normalization layers, and activation functions.
Input/Output:
- Input: 6 channels (density_t, velx_t, velz_t, density_t-1, emitter_t, collider_t)
- Output: 3 channels (density_t+1, velx_t+1, velz_t+1)
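A minimal shape check of this I/O contract; the stand-in conv layer below is only a placeholder for the real UNet, and the channel ordering is assumed to match the list above:

```python
import torch
import torch.nn as nn

# Stand-in for the real UNet: only the 6-in / 3-out channel contract matters here.
model = nn.Conv2d(in_channels=6, out_channels=3, kernel_size=3, padding=1)

B, H, W = 1, 128, 128
density_t   = torch.zeros(B, 1, H, W)
velx_t      = torch.zeros(B, 1, H, W)
velz_t      = torch.zeros(B, 1, H, W)
density_tm1 = torch.zeros(B, 1, H, W)
emitter_t   = torch.zeros(B, 1, H, W)
collider_t  = torch.zeros(B, 1, H, W)

# Assemble the 6-channel input in the documented order (assumed).
x = torch.cat([density_t, velx_t, velz_t, density_tm1, emitter_t, collider_t], dim=1)
y = model(x)                                       # (B, 3, H, W)
density_tp1, velx_tp1, velz_tp1 = y.split(1, dim=1)
```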
Training involved experimentation with multiple loss components:
- MSE Loss - Standard reconstruction loss
- Divergence Penalty - Enforces fluid incompressibility
- Emitter Loss - Prevents density generation in non-emitter regions, avoiding hallucinations
- Gradient Loss - Preserves sharp density features
Different combinations and weightings were tested to balance visual quality and physical realism.
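A sketch of how such a combination might look in PyTorch; the finite-difference stencils, the emitter-loss form, and the default weights are illustrative, not the exact implementation in ml/:

```python
import torch
import torch.nn.functional as F

def divergence(velx, velz):
    # Central-difference approximation of dvx/dx + dvz/dz on interior cells.
    dvx_dx = (velx[..., :, 2:] - velx[..., :, :-2]) * 0.5
    dvz_dz = (velz[..., 2:, :] - velz[..., :-2, :]) * 0.5
    return dvx_dx[..., 1:-1, :] + dvz_dz[..., :, 1:-1]

def total_loss(pred, target, prev_density, emitter, w_div=0.1, w_em=0.1, w_grad=0.1):
    d_pred, vx_pred, vz_pred = pred.split(1, dim=1)
    d_tgt = target[:, 0:1]

    mse = F.mse_loss(pred, target)                       # standard reconstruction

    # Divergence penalty: predicted velocity should be near divergence-free.
    div = divergence(vx_pred, vz_pred).pow(2).mean()

    # Emitter loss (one illustrative form): penalize density *growth* in cells
    # outside the emitter mask, discouraging hallucinated sources.
    em = ((d_pred - prev_density).clamp(min=0) * (1.0 - emitter)).mean()

    # Gradient loss: match spatial density gradients to keep edges sharp.
    gx = (d_pred.diff(dim=-1) - d_tgt.diff(dim=-1)).abs().mean()
    gz = (d_pred.diff(dim=-2) - d_tgt.diff(dim=-2)).abs().mean()

    return mse + w_div * div + w_em * em + w_grad * (gx + gz)
```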
While physics-based metrics (divergence, kinetic energy, collider violations) were tracked during training, most validation came from exporting models and qualitatively evaluating autoregressive rollout behavior to "feel" the fluid dynamics.
For complete training details, see ml/README.md.
Models are trained using a staged approach where each K-step model initializes from the previous:
K1 (1-step) → K2 (2-step) → K3 (3-step) → K4 (4-step)
Each stage learns to predict further into the future, building on the physics knowledge from the previous stage. This curriculum approach is critical for autoregressive stability - without it, models drift and collapse after a few dozen frames. With K-training, stable rollouts last 600+ frames.
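A condensed sketch of one curriculum stage, assuming `model`, `loader`, and `optimizer` are defined elsewhere and a batch layout of (B, T, 6, H, W) inputs with ground-truth masks available at every step:

```python
import torch
import torch.nn.functional as F

def train_stage(model, loader, optimizer, K):
    """One curriculum stage: unroll K steps autoregressively and backpropagate
    through the whole rollout. Weights come pre-initialized from stage K-1."""
    model.train()
    for inputs, targets in loader:      # (B, T, 6, H, W) / (B, T, 3, H, W), T > K
        state = inputs[:, 0]            # initial 6-channel state
        loss = 0.0
        for k in range(K):
            pred = model(state)         # (B, 3, H, W): density, velx, velz at t+k+1
            loss = loss + F.mse_loss(pred, targets[:, k])
            if k + 1 < K:
                # Feed the prediction back in: predicted fields, previous density,
                # and the ground-truth emitter/collider masks for the next step.
                state = torch.cat([pred, state[:, 0:1], inputs[:, k + 1, 4:6]], dim=1)
        optimizer.zero_grad()
        (loss / K).backward()
        optimizer.step()

# Staged curriculum K1 → K4, each stage starting from the previous checkpoint.
for K in (1, 2, 3, 4):
    train_stage(model, loader, optimizer, K)
    torch.save(model.state_dict(), f"unet_k{K}.pt")
```

Gradients flow through the full K-step rollout, which is what teaches the model to correct its own drift rather than only matching single-step targets.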
The C++ engine provides an interactive environment for experiencing trained fluid models:
Technology Stack:
- ONNX Runtime: ML inference with CPU/CUDA execution providers
- OpenGL: Rendering
- GLFW: Cross-platform windowing and input
- ImGui: Real-time UI and debugging interface
Interactive Controls:
- Velocity mode + drag: Inject velocity forces
- Configurable brush radius, force strength, and emission rate
- Pause, reset, and model switching
For build instructions and usage details, see engine/README.md.
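The engine itself is C++, but the same autoregressive loop is easy to prototype with the onnxruntime Python API. The model filename, single-output assumption, and channel order below are assumptions:

```python
import numpy as np
import onnxruntime as ort

# CUDA if available, otherwise CPU (mirrors the engine's provider selection).
session = ort.InferenceSession(
    "fluidnet_medium.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
input_name = session.get_inputs()[0].name

H = W = 128
state = np.zeros((1, 6, H, W), dtype=np.float32)  # d, vx, vz, d_prev, emitter, collider
state[0, 4, 60:68, 60:68] = 1.0                   # paint a square emitter

for frame in range(600):
    (pred,) = session.run(None, {input_name: state})  # (1, 3, H, W)
    state[0, 3] = state[0, 0]      # current density becomes previous density
    state[0, 0:3] = pred[0]        # predicted fields become the new state
    # channels 4-5 (emitter/collider masks) are driven by user input each frame
```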
2026-01-13.00-58-20.mp4
2026-01-13.00-54-07.mp4
2026-01-13.00-52-08.mp4
2026-01-13.00-33-40.mp4
2026-01-13.00-31-43.mp4
Note: This project is not yet packaged for easy public use. Pre-trained weights will be made available on HuggingFace in the future. However, if you're handy with build systems, everything should work - the engine can be built and trained models exported following the instructions in engine/README.md and ml/README.md.
Different UNet architectures were tested with varying inference speeds (GPU metrics on RTX 3070):
- Medium UNet (GPU): ~5ms inference - good balance of quality and speed, handles colliders well
- Small UNet (GPU): ~1-2ms inference - very fast, but collider behavior isn't quite solved yet (work in progress)
- Small UNet quantized (INT8, CPU): ~30-35ms inference - on-the-fly quantization without calibration (see the sketch after this list); surprisingly usable and fluid on CPU without a GPU, though collider interactions still need "refinement"
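The INT8 variant above can be produced with onnxruntime's dynamic (calibration-free) quantization; the file paths are placeholders:

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

# Calibration-free quantization: weights are stored as INT8, activations stay
# float and are quantized on the fly at runtime, matching the setup above.
quantize_dynamic(
    model_input="fluidnet_small.onnx",
    model_output="fluidnet_small_int8.onnx",
    weight_type=QuantType.QInt8,
)
```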
A key advantage of this approach: inference time is constant regardless of scene complexity. Whether you have one emitter or ten colliders, performance stays the same - unlike traditional physics solvers that scale with complexity.
The goal is to get the smaller/faster models to match the physics quality of the medium architecture while maintaining speed gains.
Used in this project:
- tempoGAN: A Temporally Coherent Generative Model (arXiv:1806.02071). Referenced for physics-aware loss implementations (divergence and gradient losses).
- Accelerating Eulerian Fluid Simulation with CNNs (ICML 2017). Foundational work on using CNNs for fluid simulation.

For future exploration:
- Awesome Neural Physics (Project Page). Curated collection of papers and resources on physics-informed machine learning.
- PhiFlow: Learning Physics (arXiv:2006.08762). Differentiable physics framework for training neural simulators.
- Transolver: A Fast Transformer Solver for PDEs (arXiv:2412.10748). Recent work on learned PDE solvers using transformers.