-
Notifications
You must be signed in to change notification settings - Fork 25
Interval POMDPs, POSMG rewards, POSMG constraints #70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
… initial_memory setting to POSMG; updated POSMG synthesis to display game iterations; added new test POSMG model
TheGreatfpmK
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Remove the pomdp/sketches from models as these are used for policy tree experiments which is not part of this PR.
- Add some tests if possible
| type = line.removeprefix(cls.TYPE_PREFIX).removesuffix('\n') | ||
| return type | ||
| raise ValueError | ||
| if cls.INTERVAL_BEGINNING in line: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like that we are going through to whole file if it's not an interval model but I don't know how to make this better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that not all transitions in the model have to be specified as intervals for the model to be an interval pomdp. In other words: if there is at least one transition with intervals, it is an interval model. So you cannot be sure, until you check the whole file. See e.g. model models/ipomdp/simple1/sketch.templ where action 1 is not defined with intervals but action 0 is.
| if updated is not None: explicit_quotient = updated | ||
| if not payntbind.synthesis.assertChoiceLabelingIsCanonic(explicit_quotient.nondeterministic_choice_indices,explicit_quotient.choice_labeling,False): | ||
| logger.warning("WARNING: choice labeling for the quotient is not canonic") | ||
| # TEMPORARY FIX |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be difficult to implement?
paynt/quotient/ipomdp.py
Outdated
| self.ipomdp = ipomdp | ||
| self.specification = specification | ||
|
|
||
| logger.debug(f'ipomdp has {max(self.ipomdp.observations)+1} observations') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it max() here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal is to print the number of distinct observations in the model. It was now improved to not require observations to be continuous sequence of numbers starting with 0.
| synthesis_timer.stop() | ||
| time = synthesis_timer.time | ||
| logger.info(f'synthesis completed, value: {round(value, 6)}, time: {round(time, 2)} s') | ||
| # better summary?? use Statistic class? (specification, game iterations, ...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be nice to be more in line with other synthesizers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improved by adding statistics for ipomdp
| import paynt.quotient.pomdp | ||
| import paynt.synthesizer.synthesizer_ar | ||
|
|
||
| class SynthesizerPomdpOneByOne(paynt.synthesizer.synthesizer.Synthesizer): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this one-be-one synthesis in master? I would probably remove this, I understand it was used for experiments but this is not something anyone will use in PAYNT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if we want to keep it then I would look into how to make use of the one-by-one synthesizer that's already in PAYNT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved by removing POMDP family related code.
paynt/quotient/ipomdp.py
Outdated
| # the new states will have new action representing combinations of lower and upper bound of the interval | ||
| # new states (and their actions) will be at the end of the matrix | ||
| # IDEA use p1state,p2state,choice(action),destination,transition,probability or originalState,newState,row,column,entry,value? | ||
| # IDEA return just new state count instead of new states |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clean up the comments so it's only the important stuff, if something is potential TODO add it to where it belongs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed ideas, kept description
| posmg = payntbind.synthesis.posmg_from_smg(smg, observations) | ||
| logger.debug(f'constructed game abstraction having {posmg.nr_states} states and {posmg.nr_choices} choices.') | ||
|
|
||
| return posmg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the whole purpose of this class and all the function inside is to create the POSMG right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. And the class also stores the specification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds end-to-end support for interval POMDPs by extending the parser, quotient, POSMG reward handling, and synthesizer, along with corresponding tests.
- Introduce
IpomdpQuotientfor interval POMDP abstraction into POSMG games. - Enhance
PosmgManagerto construct and propagate SMG reward models. - Update DRN parser to recognize and build interval models, and wire through a new
SynthesizerIpomdp. - Add a comprehensive test suite for IPOMDP abstraction and synthesis.
Reviewed Changes
Copilot reviewed 126 out of 136 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/ipomdp/test_ipomdp.py | New tests validating IPOMDP abstraction and solving |
| paynt/quotient/ipomdp.py | Implements IPOMDP-to-POSMG quotient logic |
| paynt/parser/drn_parser.py | Detects IPOMDP sketches and builds interval models |
| payntbind/src/synthesis/posmg/PosmgManager.cpp | Adds constructRewardModel and integrates reward models |
| paynt/synthesizer/synthesizer_ipomdp.py | New synthesizer class for IPOMDP using POSMG abstraction |
| paynt/synthesizer/policy_tree.py | Refactors game-stat logging into log_game_stats |
Comments suppressed due to low confidence (1)
paynt/parser/drn_parser.py:56
- [nitpick] Using
typeshadows the built-in. Consider renaming this variable tomodel_typeor similar.
type = line.removeprefix(cls.TYPE_PREFIX).removesuffix('\n')
paynt/synthesizer/policy_tree.py
Outdated
|
|
||
|
|
||
| def log_game_stats(self, states, game_solver): | ||
| self.stat.iteration_game(states) |
Copilot
AI
Jul 3, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The game_solver parameter is unused in log_game_stats. Either remove it or use it for logging.
| self.stat.iteration_game(states) | |
| self.stat.iteration_game(states) | |
| logger.info("Game solver stats: solution value = {}, solution state-to-action mapping size = {}".format( | |
| game_solver.solution_value, len(game_solver.solution_state_to_player1_action))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I said this in my review as well. Please resolve!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved by removing POMDP family related code.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
No description provided.