Make use of Numpy-Arrays #138
560eee0 to 70fc6cf
Doing a rebuild based on the new master.
As said here #134, I think that one of those last two commits (well, I hope it is one of those) might have introduced a problem. Once the results of a currently running generation are done, I will post an image, should anything go wrong. (Or just a message, if everything is ok.)
e24735c to 4c5a0d2
Update on this: The ocean_level branch will have to wait for a bit of discussion with you guys, since the output might not be desirable. But that can be fixed after some input. :) @psi29a What do I have to do to update whatever tests/data/data_generator.py spit at me? This branch will not pass the tests without it, I think. (There is always a complaint about a numpy method that is missing.)
numpy is just awesome. :D
agreed :)
I added a ton of comments since these commits change quite a bit. Hopefully I didn't make anything unnecessarily cryptic.
This looks to me like ocean isn't a numpy array...
Is this a typo?
No. "T" is the transpose attribute of the numpy array. The error only occurs when an old world is used. However, I can replace the use of T with something clearer; it doesn't gain any performance anyway. (That's why I tried it.)
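For reference, `.T` is an attribute that only numpy arrays provide; a plain nested list has nothing comparable, which is the kind of error you get when an old (non-numpy) world sneaks in. A quick sketch of the difference:

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)  # a 2x3 numpy array
print(grid.T.shape)                # .T transposes -> (3, 2)

# A plain nested list (an "old world" array) has no .T attribute,
# so accessing it raises AttributeError:
plain = [[0, 1, 2], [3, 4, 5]]
print(hasattr(plain, "T"))         # False
```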
It is "fixed" now. I should have done that earlier.
I didn't notice this before:
I have not really understood why, but it is the same problem as the missing transpose() method: somehow, when loading an old map, the arrays are not (yet?) numpy arrays. Did I overlook a place where the arrays could get into the program without being converted? Is this due to the pickle format?
I think the problems come from this: I assume the test world is a pickle world, so things go like this:
The protobuf loaders could be manipulated in World.py, but the old pickle save does not contain numpy arrays, so after loading, the arrays cannot be used as such. If this sounds plausible, we should really regenerate that world and see how the two tests that have been failing so far (for me, at least) behave.
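If that is the cause, the failure is easy to reproduce in isolation. A minimal sketch of the suspected failure mode, assuming the old pickle save simply stored plain nested lists (the `elevation` key here is hypothetical, not necessarily the project's real field name):

```python
import pickle

import numpy as np

# Simulate an "old" save: elevation stored as plain nested lists.
old_save = pickle.dumps({"elevation": [[0.1, 0.2], [0.3, 0.4]]})

world = pickle.loads(old_save)
print(type(world["elevation"]))   # still a list, not an ndarray,
                                  # so world["elevation"].T would fail

# Converting right after loading restores the numpy interface:
elevation = np.asarray(world["elevation"])
print(elevation.T.shape)          # (2, 2)
```

Pickle faithfully restores whatever type was saved, so a save written before the numpy switch will always come back as plain lists unless it is converted on load or the world file is regenerated.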
That was my first thought as well... loading old data files will have the data as plain arrays, not numpy arrays. This means we'll likely have to regenerate the world files to match the new format. :) We'll be breaking compatibility with older world files as a result. This is likely the case for protobuf as well. This is why we have unit and functional tests: to help us stop and think about the problem before a user is exposed to it. :)
All the arrays are converted after being loaded and before being saved with protobuf, respectively. I expect old protobuf saves to still work; I don't have one to test with, though. I never made the biome map use protobuf (I think that is the only one; it just looked like quite a bit of work). So if the world were to be regenerated... at some point another break of compatibility would have to occur, I fear.
I think I might at least put very basic code in there to make the biome map a numpy array (right now), without the usual small optimisations that follow all over the code. That way we don't have to break pickle compatibility twice. (By the way: if protobuf were to be the favoured serializer, maybe the test world file should be in protobuf format? That way we wouldn't need two separate files for Python 2 and 3, either.) Or is this kind of break maybe too much of a problem, making the switch to numpy not feasible? I would hate to see that.
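One way a string-valued biome map could become a compact numpy array is to encode biome names as small integers; this is a hypothetical sketch, not the project's actual representation (the names and the `uint8` choice are made up for illustration):

```python
import numpy as np

# Hypothetical biome catalogue; indices double as the numeric codes.
biome_names = ["ocean", "grassland", "forest"]
codes = {name: i for i, name in enumerate(biome_names)}

# A 4x4 biome map stored as one byte per tile, all "ocean" by default.
biome_map = np.zeros((4, 4), dtype=np.uint8)
biome_map[1, 2] = codes["forest"]

# Decoding a tile goes back through the catalogue:
print(biome_names[biome_map[1, 2]])   # forest
```

An integer-coded map serializes cleanly through both pickle and protobuf, which is what would let the compatibility break happen only once.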
I agree totally... @ftomassetti I think we're getting to that point where we need to pull the trigger on axing pickle. ;)
…arrays. Some smaller functions make use of numpy now.
… river-maps with numpy maps.
…esponsible for half of all function calls during world-generation (i.e. tens of millions).
I could use some help right now: I generated a full set of data, including that for Python 2. Under Python 3 all tests pass; under Python 2 the ancient-map tests don't pass (that is 2/40). Did I do something wrong while generating? I don't know anything about the ancient maps yet; I thought they were something that is generated after the actual world is already finished and saved. Is that wrong? (Once this is sorted out, I can push all the changes and this should be done.) PS: So far I haven't touched the code for the ancient maps. I really don't know what's going on. Is there maybe another rogue RNG seed at work?
I will take another look at this tomorrow morning. Maybe I made some mistake that happens to make Python 2 and 3 create different outputs. EDIT: Maybe there is a chance that this is a legitimate bug? I don't know how well Python 2/3 compatibility had been tested before. EDIT2: All other output images seem to be fine. And since my Python 3 version passed the test, it couldn't be randomness introduced during the ancient-map generation. Are there any preparatory steps during world generation that could influence the outcome?
I think I found the problem. Maybe we need a new random module. And a new noise module? :( For now I guess I will push all my changes and then we treat this as a problem to be fixed. In principle it concerns all drawing functions that make use of the random module.
I am wondering: did you generate the two random numbers on the same machine, just using different versions of Python? I ask because I read "linux" and "linux2". If the random generator gives different numbers on different versions of Python, that is bad :(
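The "linux" vs "linux2" strings are presumably `sys.platform` values (an assumption about where those strings came from): Python 2 reported "linux2" on Linux, while Python 3.3+ reports "linux". That difference is cosmetic; with an integer seed, the Mersenne Twister itself should be deterministic, as this small sketch shows:

```python
import random
import sys

# Python 2 reported sys.platform == "linux2" on Linux; Python 3
# reports "linux". A prefix check covers both interpreters:
on_linux = sys.platform.startswith("linux")

# With an integer seed, re-seeding reproduces exactly the same floats,
# so any 2-vs-3 divergence likely comes from how the seed or the
# drawing code differs, not from the generator core.
random.seed(1234)
first = [random.random() for _ in range(3)]
random.seed(1234)
second = [random.random() for _ in range(3)]
print(first == second)  # True
```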
Just opened an issue: #145 The changes in the Python module can probably be spotted and compared. I don't know about gcc, though. Should I go through with the pushes so we can sort this out later? I will switch out the problematic parts in the binaries for the Python 2 versions, so the Travis tests should work out at least.
Finally: Here Mindwerks/worldengine-data#2 is the PR for the actual, final, eternal data that makes Python 2 pass all tests. (Python 3 doesn't like the ancient maps, everything else works: 38/40.) If that PR is merged and the tests are rerun, things should work fine. I really, really hope. :)
How do you trigger a re-test?
Yippie! :)
Make it so @ftomassetti ? :) I'm happy with this.
What's the situation here?
Fine by me: all tests are passing :)
I didn't want to change anything about this branch anymore, no. The ones after might need some polish, though.
Merged, now we can see what adaptations the other PRs require




Another branch, as promised (#136). I replaced almost all arrays with numpy arrays. Anything not covered by the serialization tests has probably been left untouched, too. (I had to start somewhere.)
Here are some performance-test results:
Parameters:
Generation times for every branch:
Images (master - perf_opt - numpy_repl - platec1.4.0):




The latest changes were not supposed to speed things up much; only some of the most heavily used functions were replaced. I did what I could while keeping an eye on the goal of getting numpy ready. In the future, getting a bit of optimisation done won't be a pain anymore. And maybe the memory footprint has changed a bit; I still have no way of reliably checking that.
I will do some tests with larger maps and see if anything changed there. (And I am really looking forward to the new version of platec. :) )
All tests except two pass, although I did have to run one of the regeneration scripts in the tests folder to make that happen. At this point, especially due to the change to the seeding, worldengine-data should be updated a bit anyway, I think. (One of the failed tests, maybe even both, can probably be passed with fresh data; maybe it has to be Python 2/3-specific, though.)
EDIT: I took a look at the new platec version and added the time above. In total we are down to roughly 50% of the generation time from a couple of days ago! :)
EDIT2: As far as I can see, the failed test comes from the fact that the old world file was generated without numpy arrays. I wouldn't know what to do about that; on my system, regenerating the test world was enough, though.
EDIT3: I did a little more profiling this evening. The air gets thin now: there are only three methods left that still justify spending optimisation time on (world.is_land() ~11% of total time, world.tiles_around() ~20%, common.anti_alias_point() ~20%; that's roughly 50% in total!). And after looking at them for a while and trying some things, I really couldn't come up with any improvement.
A funny sidenote is that the function calls for my test map went down from more than 300 million to less than 100 million, half of which are attributed to is_land(). Speed-wise this might be the end of the road, unless a lot more work is invested.
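For what it's worth, a per-tile predicate like is_land() is exactly the kind of call numpy can collapse into one array operation, if the callers can be rewritten to work on whole maps. A hypothetical sketch (the `elevation` array and `ocean_level` threshold are made-up stand-ins, not the real WorldEngine fields):

```python
import numpy as np

# Made-up stand-ins for the real world data.
rng = np.random.default_rng(0)
elevation = rng.random((512, 512))
ocean_level = 0.5

# One vectorized comparison replaces ~262k per-tile is_land() calls:
land_mask = elevation > ocean_level
land_tiles = int(land_mask.sum())   # count land tiles without a loop

print(land_mask.shape)              # (512, 512)
```

Whether the remaining hot spots can be restructured this way depends on their callers, which is probably why they resisted the local optimisation attempts described above.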