Conversation
Codecov Report ✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

@@           Coverage Diff            @@
##           master    #3301    +/-   ##
=========================================
  Coverage   99.95%   99.95%
=========================================
  Files         380      381       +1
  Lines       53949    54020      +71
=========================================
+ Hits        53927    53998      +71
  Misses         22       22

☔ View full report in Codecov by Sentry.
@sergisiso @arporter This is ready for a first review. The implementation is currently pretty small and neat - if you want more functionality testing let me know and I can add more.
arporter left a comment:
Thanks very much Aidan - as you say it's pleasingly simple. Also as you say, it could do with some more checks and associated tests :-)
On the naming front, I worry that the use of Extract will be confused with the existing ExtractTrans, which is obviously quite different. IAAI (two of them), and it seems that IntroduceLocal/TemporaryTrans is a good alternative that doesn't involve "Extract".
Finally, (once everything else is done) I think it would be exciting (and possibly educational) to try this for real. I'm thinking we'd alter nemo/scripts/utils.py::normalise_loops so that it e.g. looks for any Calls which have array expressions as arguments.
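To make that idea concrete, here is a minimal, hedged sketch of what such a step in normalise_loops might look like. The helper name is made up, the import path of DataNodeExtractTrans is an assumption taken from this PR, and the "array expression" test (any Operation or Range under an argument) is a deliberately rough heuristic, not the final implementation.

from psyclone.psyir.nodes import Call, IntrinsicCall, Operation, Range
from psyclone.psyir.transformations import TransformationError
# Import location of the new class is an assumption taken from this PR.
from psyclone.psyir.transformations import DataNodeExtractTrans

def hoist_call_array_arguments(schedule):
    '''Apply DataNodeExtractTrans to any Call argument that looks like an
    array expression (rough heuristic, for illustration only).'''
    trans = DataNodeExtractTrans()
    for call in schedule.walk(Call):
        if isinstance(call, IntrinsicCall):
            continue  # leave intrinsic calls alone in this sketch
        for arg in call.arguments:
            # Treat the argument as an array expression if it contains an
            # Operation or an array Range anywhere below it.
            if arg.walk(Operation) or arg.walk(Range):
                try:
                    trans.apply(arg)
                except TransformationError:
                    # Not every expression will be a valid target.
                    pass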
| "OMPDeclareTargetTrans", | ||
| "OMPCriticalTrans", | ||
| "OMPParallelTrans", | ||
| "DataNodeExtractTrans", |
I think the doc generation works better if we don't put the entry here but instead have a __all__ at the end of the file containing the class implementation. (We can keep the earlier import though since that makes it available from the transformations module.)
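For reference, a sketch of what that would look like at the end of the file that implements the class (the file path is an assumption):

# At the bottom of the module defining the class, e.g.
# src/psyclone/psyir/transformations/datanode_extract_trans.py (path assumed):

# For the Sphinx documentation generation.
__all__ = ["DataNodeExtractTrans"]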
| f"'{node.debug_string().strip()}'." | ||
| ) | ||
|
|
||
| dtype = node.datatype |
Please move this up to L93 so that we can re-use the result rather than re-computing it.
    )

def apply(self, node: DataNode, storage_name: str = "", **kwargs):
    """Applies the DataNodeExtractTrans to the input arguments.

    :param node: The datanode to extract.
    :param storage_name: The name of the temporary variable to store
Perhaps "The (base) name of the..." to indicate that it won't necessarily be exactly that name that is used. Also, please could you de-dent the following lines to just use four spaces so we don't need as many lines.
I guess we need to decide whether this should be a base name or, if the storage_name argument is provided, whether that name must be used exactly? Happy to go either way - if the name must be exact I should probably add it to validate.
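For illustration, a minimal sketch of the "base name" interpretation, assuming apply() creates the temporary via the scope's symbol table (the helper name and the "tmp" default are made up; new_symbol() already guarantees uniqueness by suffixing the root name if it clashes):

from psyclone.psyir.nodes import DataNode
from psyclone.psyir.symbols import DataSymbol, Symbol

def _make_storage_symbol(node: DataNode, storage_name: str = "") -> Symbol:
    '''Create the symbol for the temporary, treating storage_name as a
    base name rather than an exact name.'''
    return node.scope.symbol_table.new_symbol(
        root_name=storage_name if storage_name else "tmp",
        symbol_type=DataSymbol,
        datatype=node.datatype)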
| f"so the DataNodeExtractTrans cannot be applied. " | ||
| f"Input node was '{node.debug_string().strip()}'." | ||
| ) | ||
|
|
Please add a check that storage_name is a str and raise a TypeError if not.
This is done by the validate_options function so we don't need to.
    assign = psyir.walk(Assignment)[0]
    with pytest.raises(TransformationError) as err:
        dtrans.validate(assign.rhs)
    assert ("Input node's datatype is an array of unknown size, so the "
In this case, we know that we can query e.g. a for its L/UBOUND but I guess that this requires an extension to the datatype method which you discussed with @sergisiso elsewhere. Might be worth a comment.
If datatype does not work you can check the expression produced by get_effective_shape.
But this made me think that expressions using imported symbols (even with explicit bounds) may be a problem. Is the symbol in the explicit bound available locally? Does it have the global or the local interface?
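A small, hedged sketch of that fallback (the helper name is made up; as I understand it, get_effective_shape() returns one PSyIR expression per dimension of the access and can raise NotImplementedError for index types it cannot handle):

from psyclone.psyir.nodes import ArrayReference

def _effective_shape_or_none(node):
    '''Return the list of shape expressions for an array access, or None
    if it cannot be determined.'''
    if not isinstance(node, ArrayReference):
        return None
    try:
        return node.get_effective_shape()
    except NotImplementedError:
        return None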
@sergisiso Is there a good way to test this? I'm a bit unfamiliar with making codes that correctly follow imports for the test suite.
Yes, there are a few tests that use a code and an imported module code. See for example src/psyclone/tests/psyir/symbols/symbol_table_test.py::test_resolve_imports_from_child_symtabs
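For anyone else reading this thread, a hedged sketch of the shape of such a test, loosely modelled on those existing tests. The include-path handling, the module contents, the temporary's name and the expected output are all assumptions for illustration.

from psyclone.configuration import Config
from psyclone.psyir.nodes import Assignment, Routine
# Import location of the class under review is an assumption.
from psyclone.psyir.transformations import DataNodeExtractTrans

def test_extract_with_imported_shape(fortran_reader, fortran_writer, tmpdir):
    '''Sketch: the array being extracted is imported and its shape uses a
    parameter declared in the imported module.'''
    tmpdir.join("my_mod.f90").write(
        "module my_mod\n"
        "  integer, parameter :: n = 4\n"
        "  integer, dimension(n) :: a\n"
        "end module my_mod\n")
    old_paths = Config.get().include_paths
    Config.get().include_paths = [str(tmpdir)]
    try:
        psyir = fortran_reader.psyir_from_source(
            "subroutine sub(b)\n"
            "  use my_mod, only: a\n"
            "  integer, dimension(4) :: b\n"
            "  b = a + 1\n"
            "end subroutine sub\n")
        routine = psyir.walk(Routine)[0]
        # Resolve 'a' (and the shape symbol 'n') from the module on disk.
        routine.symbol_table.resolve_imports()
        assign = psyir.walk(Assignment)[0]
        DataNodeExtractTrans().apply(assign.rhs)
        out = fortran_writer(psyir)
        # Exactly how the temporary is declared depends on the
        # implementation; the key point is that the result must compile.
        assert "tmp = a + 1" in out
    finally:
        Config.get().include_paths = old_paths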
This works until an array index is itself an imported symbol in whichever module it is imported from. I need to think of a good solution for that.
Yes, I agree that it's only the latter check that's necessary. In general, it's best not to have an individual transformation import symbols because that performs very badly. It's much better to get the frontend to import them when it first constructs the PSyIR (by having the module names added to RESOLVE_IMPORTS in the transformation script). Therefore, if you could update the error message to say this that would be great - you'll probably need to finesse it a bit depending on whether it is Unresolved or Unsupported.
Further, it's probably going to really help the user if we could say which Symbol(s) is/are causing the problem. Therefore, once we know we're going to raise an error, it would be worth examining node to get its Symbols and seeing which of them are the problem.
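Something along these lines might do (a sketch only; the helper name is made up and the exact conditions will depend on how validate distinguishes the Unresolved and Unsupported cases):

from psyclone.psyir.nodes import Reference
from psyclone.psyir.symbols import UnresolvedType, UnsupportedType

def _problem_symbols(node):
    '''Return the names of symbols under `node` whose type information is
    missing (unresolved) or not representable (UnsupportedType).'''
    problems = set()
    for ref in node.walk(Reference):
        sym = ref.symbol
        dtype = getattr(sym, "datatype", None)
        if sym.is_unresolved or isinstance(dtype,
                                           (UnresolvedType, UnsupportedType)):
            problems.add(sym.name)
    return sorted(problems)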
I need to think more about the "symbol merging" as it seems kinda hard to do - the existing symbol can compare as not equal yet be essentially the same symbol (same module, name, type etc.), so I need to work out how to handle this.
Ah, by "import symbols" I only meant adding the relevant use x_mod, only: symbol entries when we find those symbols while importing known symbols.
I've added some code now to handle imported symbols, including those in shape declarations. I'm hoping I've not over-engineered it, but I'll start working on the tests for it next.
| assert ("Input node's datatype cannot be computed, so the " | ||
| "DataNodeExtractTrans cannot be applied. Input node " | ||
| "was 'b + a'" in str(err.value)) | ||
|
|
Please add a test for an UnsupportedType declaration.
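A hedged sketch of such a test: the frontend stores a pointer declaration as an UnsupportedFortranType, so an expression involving it should be rejected. The import path of the class and the absence of a message assertion are deliberate simplifications.

import pytest
from psyclone.psyir.nodes import Assignment
from psyclone.psyir.transformations import TransformationError
# Import location of the class under review is an assumption.
from psyclone.psyir.transformations import DataNodeExtractTrans

def test_validate_unsupported_type(fortran_reader):
    '''The transformation must refuse a node involving a symbol whose
    declaration is an UnsupportedFortranType (here, a pointer).'''
    dtrans = DataNodeExtractTrans()
    psyir = fortran_reader.psyir_from_source(
        "subroutine sub(a, b)\n"
        "  integer :: a, b\n"
        "  integer, pointer :: c\n"
        "  b = a + c\n"
        "end subroutine sub\n")
    assign = psyir.walk(Assignment)[0]
    with pytest.raises(TransformationError):
        dtrans.validate(assign.rhs)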
Please also add tests when the transformation is given the wrong type of node and a non-str name.
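And a sketch of the other two tests (again, names and error classes are assumptions; the non-str case relies on a TypeError being raised, e.g. by validate_options, as discussed above):

import pytest
from psyclone.psyir.nodes import Assignment
from psyclone.psyir.transformations import TransformationError
# Import location of the class under review is an assumption.
from psyclone.psyir.transformations import DataNodeExtractTrans

CODE = (
    "subroutine sub(a, b)\n"
    "  integer :: a, b\n"
    "  b = a + 1\n"
    "end subroutine sub\n")

def test_validate_wrong_node_type(fortran_reader):
    '''An Assignment is a Statement, not a DataNode, so it must be rejected.'''
    dtrans = DataNodeExtractTrans()
    psyir = fortran_reader.psyir_from_source(CODE)
    assign = psyir.walk(Assignment)[0]
    with pytest.raises(TransformationError):
        dtrans.validate(assign)

def test_apply_non_str_storage_name(fortran_reader):
    '''A non-str storage_name must raise a TypeError.'''
    dtrans = DataNodeExtractTrans()
    psyir = fortran_reader.psyir_from_source(CODE)
    assign = psyir.walk(Assignment)[0]
    with pytest.raises(TypeError):
        dtrans.apply(assign.rhs, storage_name=1)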
    assign = psyir.walk(Assignment)[0]
    dtrans.apply(assign.rhs.operands[1])
    out = fortran_writer(psyir)
    assert ("integer, dimension(SIZE(a, dim=1),SIZE(b, dim=2)) :: tmp"
Ah, this was what you discussed. I guess the same comment applies (that we could do better but it's up to datatype to do that).
    out = fortran_writer(psyir)
    assert "integer :: temporary" in out
    assert "temporary = INT(a)" in out
    assert "b = temporary" in out
Please could you add at least one test with a more complicated code structure - e.g. an expression inside a loop. This will exercise the hierarchy of symbol tables.
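A hedged sketch of what such a test could look like (the generated variable name and the exact output lines are assumptions about what the transformation will produce):

from psyclone.psyir.nodes import Assignment
# Import location of the class under review is an assumption.
from psyclone.psyir.transformations import DataNodeExtractTrans

def test_apply_inside_loop(fortran_reader, fortran_writer):
    '''Extract a sub-expression from inside a loop body: the temporary
    should be declared in the routine but assigned inside the loop.'''
    dtrans = DataNodeExtractTrans()
    psyir = fortran_reader.psyir_from_source(
        "subroutine sub(a, b, n)\n"
        "  integer :: n, i\n"
        "  integer, dimension(n) :: a, b\n"
        "  do i = 1, n\n"
        "    b(i) = a(i) * 2 + 1\n"
        "  end do\n"
        "end subroutine sub\n")
    assign = psyir.walk(Assignment)[0]
    dtrans.apply(assign.rhs.children[0])  # extract 'a(i) * 2'
    out = fortran_writer(psyir)
    assert "integer :: tmp" in out      # declaration in the routine scope
    assert "tmp = a(i) * 2" in out      # assignment inside the loop body
    assert "b(i) = tmp + 1" in out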
    # Create a symbol of the relevant type.
    if not storage_name:
        symbol = node.scope.symbol_table.new_symbol(
Talking with @sergisiso about possible use-cases for this transformation, he mentioned things like:

a(2:4) = 3*a(1:3)

where we need to explicitly store the copy of a(1:3) in a temporary in order to parallelise. With this transformation we'd now generate:

tmp = a(1:3)
a(2:4) = tmp

That has made me think that we need to check that we create tmp with dimensions starting at 1. (I think we do because that's what datatype returns?)
Are there any other considerations @sergisiso?
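For the record, a hedged sketch of a test capturing this use-case (the temporary's name, its import path and exactly how the output is rendered are assumptions about the implementation):

from psyclone.psyir.nodes import Assignment
# Import location of the class under review is an assumption.
from psyclone.psyir.transformations import DataNodeExtractTrans

def test_apply_overlapping_slice(fortran_reader, fortran_writer):
    '''Copy an overlapping read into a temporary so that the original
    assignment can be parallelised.'''
    dtrans = DataNodeExtractTrans()
    psyir = fortran_reader.psyir_from_source(
        "subroutine sub(a)\n"
        "  integer, dimension(10) :: a\n"
        "  a(2:4) = 3 * a(1:3)\n"
        "end subroutine sub\n")
    assign = psyir.walk(Assignment)[0]
    dtrans.apply(assign.rhs.children[1])  # extract the read of a(1:3)
    out = fortran_writer(psyir)
    assert "tmp = a(1:3)" in out
    assert "a(2:4) = 3 * tmp" in out
    # The declaration of 'tmp' should give it an extent of 3 with a lower
    # bound of 1 (in whatever form datatype returns for that).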
That has made me think that we need to check that we create tmp with dimensions starting at 1.
I am not sure about this; if it doesn't start at the same lbound as the lhs it will generate the offset expression.
However, impure calls could be a problem:
a = globalvar + b(fn(4))
If we try to move b, it could internally update 'globalvar', which would change the order of the use-update. I would not allow impure or unknown-purity functions.
Added tests for both of these cases with appropriate errors in validate. (I think the function one is currently already caught by the datatype but I was explicit in case we make improvements in the future).
Just to be clear, the problem is there even without moving the fn():

a = fn(3) + b(globalvar)

Moving b still changes the order if fn also uses globalvar, and this one is not caught by datatype.
Also, if the result_type of fn is known we will have a valid datatype but still be impure.
Yeah - probably just my inexperience; I couldn't create a test that could correctly resolve the datatype of a function (but I do fail on any impure function call in the provided node). I guess what I don't currently handle is:

a = fn(somevar) + b(globalvar)

If b was provided as the argument (and is PURE), in theory we'd happily move it, which is probably incorrect (if fn is impure)? I'm not sure what the best solution to this is - probably check the statement containing the input node for impure calls?
Ah, that's what you wrote - I'll make that change when I'm next on the branch.
On second thought, this will be too restrictive for the first use case: taking argument expressions out of IO calls. These are always impure.
So now I think we don't need this check. Sorry for the distraction.
…e into 1646_datanode_extract_trans
@arporter I think I fixed all the issues I found from the last review, ready for another look.
Initial implementation - transformation is quite straightforward so should be good for a review assuming CI is ok.