
Decompose across panels #220

Open
thomasmelvin wants to merge 10 commits into MetOffice:main from thomasmelvin:decompose_across_panels

@thomasmelvin thomasmelvin commented Jan 15, 2026

PR Summary

Sci/Tech Reviewer: @tommbendall
Code Reviewer: @christophermaynard

This is a port of the previous SRS ticket https://code.metoffice.gov.uk/trac/lfric/ticket/4699 and its related linked ticket https://code.metoffice.gov.uk/trac/lfric/ticket/4699. Both tickets have already been science and code reviewed on the SRS and will go to the same reviewers here to check for any further issues.

In detail, the aim of this ticket is to modify the existing partitioning strategy so that it can partition multiple panels (either 3 sets of 2 or 2 sets of 3) into uniform rectangular domains. This means that the number of MPI ranks only needs to have a factor of 2 or 3, instead of 6, as a base requirement.
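
As a rough illustration of the rank-count requirement (a minimal Python sketch, not the code in this branch): the helper below lists the panel groupings a given number of MPI ranks would allow on a six-panel cubed sphere. The function name `valid_groupings` and the tuple layout are hypothetical, and special cases such as a serial run are ignored.

```python
def valid_groupings(n_ranks: int) -> list[tuple[int, int, int]]:
    """List (n_sets, panels_per_set, ranks_per_set) options for n_ranks.

    Hypothetical helper for illustration only; it is not an LFRic routine.
    Assumes six panels split into 6 sets of 1, 3 sets of 2, or 2 sets of 3,
    with the ranks shared equally between the sets.
    """
    if n_ranks < 1:
        raise ValueError("n_ranks must be positive")
    options = []
    for n_sets in (6, 3, 2):
        # A grouping is only usable if the ranks divide evenly between sets.
        if n_ranks % n_sets == 0:
            options.append((n_sets, 6 // n_sets, n_ranks // n_sets))
    return options


if __name__ == "__main__":
    for n in (4, 9, 12, 5):
        print(n, valid_groupings(n))
```

Under these assumptions, 4 ranks only fit 2 sets of 3 panels, 9 ranks only fit 3 sets of 2 panels, 12 ranks fit all three groupings, and 5 ranks fit none.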

With this option, code that assumes all owned cells are on the same panel (we think only the extended mesh in the transport code) will not give correct results.
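
To make the single-panel caveat concrete, here is a small Python sketch under stated assumptions: the panels in a set are imagined laid side by side as one strip of cells, and the strip is cut into equal-width columns of ranks. The layout, the function name `panels_owned` and its parameters are all illustrative assumptions, not the decomposition implemented in this branch.

```python
def panels_owned(rank_col: int, n_cols: int,
                 panels_in_set: int, cells_per_edge: int) -> list[int]:
    """Return the panel indices (within the set) that one rank column overlaps.

    Hypothetical helper: assumes the set is a strip that is
    panels_in_set * cells_per_edge cells wide, split into n_cols equal columns.
    """
    width = panels_in_set * cells_per_edge
    if width % n_cols != 0:
        raise ValueError("strip width must divide evenly among rank columns")
    if not 0 <= rank_col < n_cols:
        raise ValueError("rank_col out of range")
    col_width = width // n_cols
    first_cell = rank_col * col_width
    last_cell = first_cell + col_width - 1
    # Integer division by the panel width gives the panel each cell lies on.
    return list(range(first_cell // cells_per_edge,
                      last_cell // cells_per_edge + 1))


if __name__ == "__main__":
    # Three panels of 12 cells shared between two rank columns.
    print(panels_owned(0, 2, 3, 12))
    print(panels_owned(1, 2, 3, 12))
```

In this toy layout, splitting three panels between two rank columns leaves each rank owning cells on two panels, which is the situation that code assuming a single owned panel would not handle.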

Code Quality Checklist

(Some checks are automatically carried out via the CI pipeline)

  • I have performed a self-review of my own code
  • My code follows the project's
    style guidelines
  • Comments have been included that aid understanding and enhance the
    readability of the code
  • My changes generate no new warnings

Testing

  • I have tested this change locally, using the LFRic Core rose-stem suite
  • If required (e.g. API changes) I have also run the LFRic Apps test suite
    using this branch
  • If any tests fail (rose-stem or CI) the reason is understood and
    acceptable (e.g. kgo changes)
  • I have added tests to cover new functionality as appropriate (e.g. system
    tests, unit tests, etc.)
  • Any new tests have been assigned an appropriate amount of compute resource
    and have been allocated to an appropriate testing group (i.e. the
    developer tests are for jobs which use a small amount of compute resource
    and complete in a matter of minutes)

trac.log

Test Suite Results - lfric_core - core_decompose_across_panels/run5

Suite Information

| Item | Value |
| --- | --- |
| Suite Name | core_decompose_across_panels/run5 |
| Suite User | thomas.melvin |
| Workflow Start | 2026-01-19T13:54:16 |
| Groups Run | developer |

| Dependency | Reference | Main Like |
| --- | --- | --- |
| lfric_core | thomasmelvin/lfric_core@decompose_across_panels | False |
| SimSys_Scripts | MetOffice/SimSys_Scripts@2025.12.1 | True |

Task Information

✅ succeeded tasks - 384

Security Considerations

  • I have reviewed my changes for potential security issues
  • Sensitive data is properly handled (if applicable)
  • Authentication and authorisation are properly implemented (if applicable)

Performance Impact

  • Performance of the code has been considered and, if applicable, suitable
    performance measurements have been conducted

AI Assistance and Attribution

  • Some of the content of this change has been produced with the assistance
    of Generative AI tool name (e.g., Met Office Github Copilot Enterprise,
    Github Copilot Personal, ChatGPT GPT-4, etc) and I have followed the
    Simulation Systems AI policy
    (including attribution labels)

Documentation

  • Where appropriate I have updated documentation related to this change and
    confirmed that it builds correctly

PSyclone Approval

  • If you have edited any PSyclone-related code (e.g. PSyKAl-lite, Kernel
    interface, optimisation scripts, LFRic data structure code) then please
    contact the
    tooscollabdevteam@metoffice.gov.uk

Sci/Tech Review

  • I understand this area of code and the changes being added
  • The proposed changes correspond to the pull request description
  • Documentation is sufficient (do documentation papers need updating?)
  • Sufficient testing has been completed

Please alert the code reviewer via a tag when you have approved the SR

Code Review

  • All dependencies have been resolved
  • Related Issues have been properly linked and addressed
  • CLA compliance has been confirmed
  • Code quality standards have been met
  • Tests are adequate and have passed
  • Documentation is complete and accurate
  • Security considerations have been addressed
  • Performance impact is acceptable

@github-actions bot added the cla-required label (The CLA has not yet been signed by the author of this PR - added by GA) on Jan 15, 2026
@github-actions bot added the cla-signed label (The CLA has been signed as part of this PR - added by GA) and removed the cla-required label on Jan 15, 2026
@mo-marqh
Member

Ping @EdHone FYI

@thomasmelvin added the Linked Apps label (This PR is linked to a MetOffice/lfric_apps PR) on Jan 16, 2026
@thomasmelvin thomasmelvin self-assigned this Jan 16, 2026
@tommbendall
Contributor

This can pass science review. I already reviewed this on trac and all issues that I previously raised have been addressed.

Contributor

@mike-hobson mike-hobson left a comment

I probably need to look more carefully at this, but from a very quick glance through, I can see a number of new routines, but there are no new unit tests. It adds new functionality (being able to partition over n*2 and n*3 partitions) - but none of that is tested in any rose-stem system tests. The new functionality will also need to be documented in the Core documentation.

@mo-rickywong
Contributor

It will also require a suitable test in mesh tools from a pre-partitioning point of view, for, say, the 2-partition and 3-partition cases.

@thomasmelvin
Author

> I probably need to look more carefully at this, but from a very quick glance through, I can see a number of new routines, but there are no new unit tests. It adds new functionality (being able to partition over n*2 and n*3 partitions) - but none of that is tested in any rose-stem system tests. The new functionality will also need to be documented in the Core documentation.

Thanks Mike, there's only really one new routine for the get_nprocs but I'll add a unit test for it. Where is the core documentation located and are there any guides for updating them?

The change is tested in the lfric_apps rose-stem system tests of the linked pull request so I thought that should be enough to cover the new functionality

@mo-rickywong
Contributor

> > I probably need to look more carefully at this, but from a very quick glance through, I can see a number of new routines, but there are no new unit tests. It adds new functionality (being able to partition over n*2 and n*3 partitions) - but none of that is tested in any rose-stem system tests. The new functionality will also need to be documented in the Core documentation.
>
> Thanks Mike, there's only really one new routine for the get_nprocs but I'll add a unit test for it. Where is the core documentation located and are there any guides for updating them?
>
> The change is tested in the lfric_apps rose-stem system tests of the linked pull request so I thought that should be enough to cover the new functionality

A system test in lfric_apps is good, though it is insufficient. This change adds functionality to infrastructure and so should be tested within the infrastructure test suite, not an external repository. I also note that this is a linked ticket; however, there is no link or reference to the actual linked ticket on either PR. The orange icons on this page take you to a linked PR page for core, not apps, so you essentially have to navigate to lfric_apps and search for something that looks like it's linked. Better to have an actual link on the core/lfric_apps PRs.

@mike-hobson
Contributor

mike-hobson commented Jan 16, 2026

> Thanks Mike, there's only really one new routine for the get_nprocs but I'll add a unit test for it. Where is the core documentation located and are there any guides for updating them?
>
> The change is tested in the lfric_apps rose-stem system tests of the linked pull request so I thought that should be enough to cover the new functionality

Sorry, I hadn't noticed there was a linked ticket with a system test - but as Ricky says, if we're adding functionality to Core, we shouldn't have to rely on a test in Apps, so it would be good to have a system test in one of the Core tests.

The documentation is held in the documentation/ directory at the top of the repository. I'm guessing that any new documentation might possibly go in documentation/source/how_to_use_it/technical_articles/lfric_distmem_impl.rst. There's certainly a section on partitioning in there.

@thomasmelvin
Author

> It will also require a suitable test in mesh tools from a pre-partitioning point of view, for say a 2-partitions, 3-partitions

I've added in the requested tests

@thomasmelvin
Author

> > > I probably need to look more carefully at this, but from a very quick glance through, I can see a number of new routines, but there are no new unit tests. It adds new functionality (being able to partition over n*2 and n*3 partitions) - but none of that is tested in any rose-stem system tests. The new functionality will also need to be documented in the Core documentation.
>
> > Thanks Mike, there's only really one new routine for the get_nprocs but I'll add a unit test for it. Where is the core documentation located and are there any guides for updating them?
> > The change is tested in the lfric_apps rose-stem system tests of the linked pull request so I thought that should be enough to cover the new functionality
>
> A system test in lfric_apps is good, though is insufficient. This change adds functionality to infrastructure and so should be tested within the infrastructure test suite, not an external repository. I also note that this is a linked ticket, however there is no link/reference to the actual linked ticket on the (either) PR request. The orange icons on this page take you to a linked PR page for core not apps, so you essentially have to navigate to lfric_apps and search for something that looks like its linked. Better to have an actual link on the core/lfric_app PRs.

Sorry, that was my inexperience with GitHub. The link is now in the PR summary.

Contributor

@tommbendall tommbendall left a comment

I'm happy with these updates from a science review point-of-view, so this is ready for code review @christophermaynard

@thomasmelvin
Author

Results from 5 day C224 NWP forecasts

| Field | Branch (image) | Trunk (image) |
| --- | --- | --- |
| Cloud | branch-nwp-gal9-5day-cloud | trunk-nwp-gal9-5day-cloud |
| m_cl | branch-nwp-gal9-5day-m_cl | trunk-nwp-gal9-5day-m_cl |
| m_v | branch-nwp-gal9-5day-m_v | trunk-nwp-gal9-5day-m_v |
| precip | branch-nwp-gal9-5day-prec | trunk-nwp-gal9-5day-prec |
| theta | branch-nwp-gal9-5day-theta | trunk-nwp-gal9-5day-theta |
| u | branch-nwp-gal9-5day-u_in_w3 | branch-nwp-gal9-u_in_w3 |

@thomasmelvin thomasmelvin added this to the Spring 2026 milestone Jan 29, 2026
@thomasmelvin
Author

@christophermaynard Science review is all filled in, this is ready for code review


Labels

cla-signed (The CLA has been signed as part of this PR - added by GA)
Linked Apps (This PR is linked to a MetOffice/lfric_apps PR)


5 participants