
Conversation

@jeremyfirst22 (Contributor) commented Oct 24, 2025

Bugfixes

  • add8c60: CRITICAL: Extended the fix for the GPU race condition in MPI communication (#98) to the remaining call sites in ScaLBL.cpp.
  • 1681cb6: The timestep in Restart.db was never read, so a restart started at timestep = 0 and overwrote previous output files.
  • 38a21f7: The min_steady_timesteps database parameter never triggered an exit on converged Ca, because Ca_previous was never updated, so (fabs((Ca - Ca_previous) / Ca) < tolerance) was always false. Also added a break to exit once the steady point is reached, rather than continuing to write steady-point information until EXIT_TIMESTEP.
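
For illustration, the corrected exit logic looks roughly like this (a standalone sketch, not the simulator code: the function name, the stand-in Ca relaxation, and the numeric values are all illustrative):

```cpp
#include <cmath>

// Sketch of the corrected steady-state exit logic: Ca_previous is now
// updated at every check, and an early return exits once convergence is
// reached past min_steady_timesteps. Returns the exit timestep.
int run_until_steady(double tolerance, int min_steady_timesteps,
                     int timestepMax) {
    double Ca_previous = 0.0;
    for (int timestep = 1; timestep <= timestepMax; ++timestep) {
        // Stand-in for an LBM step: Ca relaxes toward 2e-5.
        double Ca = 2e-5 * (1.0 - std::exp(-timestep / 50.0));
        if (timestep >= min_steady_timesteps &&
            std::fabs((Ca - Ca_previous) / Ca) < tolerance) {
            return timestep;  // steady point reached: stop here
        }
        Ca_previous = Ca;  // the update that was missing before the fix
    }
    return timestepMax;  // never converged: ran to timestepMax
}
```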

Features

  • a4dee77: Added a lbpm_color_unsteady_simulator module to run an unsteady simulation until the water saturation converges below the supplied tolerance (default = 0, i.e., always run to timestepMax)
  • 1ee867e: Adjusted the body force search algorithm. In the steady-state calculations, the body force is proportionally adjusted to achieve the desired capillary number, but because these properties are nonlinearly related, the proportional scaling was prone to 'straddling' the target and bouncing back and forth indefinitely (as identified in #75). Now, once the proportional scaling brackets the target, we switch to the bisection method for guaranteed convergence. Resolves #75 (endless overshoot/undershoot of the capillary number using the fractional flow protocol).
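
The hybrid search can be sketched as follows (illustrative names throughout; Ca_of stands in for a full steady-state run at a given body force, and the power-law map is an arbitrary nonlinear monotone relation, not the real physics):

```cpp
#include <algorithm>
#include <cmath>

// Arbitrary nonlinear, monotone stand-in for "run to steady state at
// body force F and measure the capillary number".
double Ca_of(double F) { return 1e-3 * std::pow(F, 1.7); }

// Proportional scaling until the target Ca is bracketed, then bisection.
double find_body_force(double Ca_target, double F0, double tol) {
    double F = F0, F_lo = 0.0, F_hi = 0.0;  // 0.0 = bracket side unset
    for (int iter = 0; iter < 200; ++iter) {
        double Ca = Ca_of(F);
        if (std::fabs(Ca - Ca_target) / Ca_target < tol)
            return F;  // close enough to the target capillary number
        // Tighten whichever side of the bracket this sample falls on.
        if (Ca < Ca_target) F_lo = std::max(F_lo, F);
        else F_hi = (F_hi == 0.0) ? F : std::min(F_hi, F);
        if (F_lo > 0.0 && F_hi > 0.0)
            F = 0.5 * (F_lo + F_hi);  // bracketed: bisection step
        else
            F *= Ca_target / Ca;      // not yet bracketed: proportional
    }
    return F;
}
```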

Cleanup

  • ebe2ae5: Switched Ca and body_force print statements to use scientific notation
  • 7e060fc: Consolidated the actual LBM run logic into a ForwardStep function to share between the two Run functions.
  • 4f3be0c: Removed vestigial variables flagged by compiler warnings

I've targeted this PR at dev, but the bugfixes may be better merged into master (particularly the first). Please let me know if you'd like me to split this PR and merge the bugfixes directly into master.

@diogosiebert (Collaborator)

Hello @jeremyfirst22,

Thanks for the pull requests — we really appreciate your effort in contributing to the improvement of LBPM.

When you submitted the first bug fix, it occurred to me that the best approach might be to include the barrier code inside the PACK function, which would ensure the same behavior as in the CPU code.
Additionally, this would avoid the need for a ScaLBL_DeviceBarrier() after each ScaLBL_D3Q19_Pack() call.

For future pull requests, I just ask that you separate them by individual feature or bug fix so we can discuss and evaluate each change more easily.

For example, in your latest submission, you proposed three different bug fixes. It would be much better if each of these fixes were submitted as separate pull requests, which would make the discussion and testing of each change simpler and more effective.

Best Regards,

Diogo Nardelli Siebert

@JamesEMcClure (Collaborator) commented Nov 4, 2025

As implemented, there is no race condition in ScaLBL_D3Q19_Pack() because the data packed into the communication buffers at a given timestep is unique (i.e., there is a unique mapping between the distributions and the buffer locations). My recommendation would be to add synchronization within MPI_COMM_SCALBL.Isend, because the race condition is introduced at this point (i.e., because MPI is not ensuring the data is synchronized). The ideal solution would be for the Isend command to take a build-time flag that determines whether device synchronization is needed, since this depends on the MPI implementation and the underlying communication details.
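
Roughly, the idea is something like this (hypothetical names, not the actual ScaLBL API; comm_isend stands in for MPI_COMM_SCALBL.Isend, and the counter exists only to make the flag's effect observable):

```cpp
int device_barrier_calls = 0;  // instrumentation for this sketch only

// Stand-in for ScaLBL_DeviceBarrier(): the real one waits for all
// queued device kernels to finish before returning.
void device_barrier() { ++device_barrier_calls; }

// Hypothetical Isend wrapper: when SCALBL_SYNC_BEFORE_ISEND is defined
// at build time, packing kernels are forced to complete before MPI may
// read the buffer; otherwise (e.g. a CUDA-aware MPI that already orders
// the transfer correctly), no synchronization cost is paid.
void comm_isend(const double *buf, int count, int dest) {
#ifdef SCALBL_SYNC_BEFORE_ISEND
    device_barrier();  // race-condition guard, compiled in on demand
#endif
    (void)buf; (void)count; (void)dest;  // MPI_Isend(...) would go here
}
```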

@jeremyfirst22 (Contributor, Author)

@diogosiebert The reason I did not put the cudaDeviceSynchronize call within ScaLBL_D3Q19_Pack itself is that it would force the buffer-packing kernel to complete before the function returns. That may not be ideal: other work could proceed concurrently, or multiple packing kernels could be launched on different cudaStreams to pack buffers simultaneously.

It could just as easily be nested within the MPI_COMM_SCALBL.Isend routine itself (as @JamesEMcClure suggests), which would keep the code open to overlapping work with communication. But I do not see any benefit to wrapping the DeviceBarrier in preprocessor guards like #ifdef USE_GPU. ScaLBL_DeviceBarrier is already overloaded for the GPUs, so there should be no need for build-time flags, right?

And noted about splitting each bugfix into a separate PR. I will close this PR and submit a PR per feature/bugfix. Thank you both for the comments!

-- Jeremy

