Skip to content

Conversation

@ftomassetti
Copy link
Member

Fix #91

@tcld
Copy link
Contributor

tcld commented Oct 26, 2015

I have not checked every single line, it's a bit too late for that (here). But I love it anyway.^^

What are the major advantages of this format, by the way? I know that pickle isn't fully compatible between Python 2 and 3, and I read about you pointing out some problems with Protobuf (which I honestly don't like that much either, because I believe that Google is in charge of it - and Google tends to change course by 180 degrees regardless of how many million lifes are on the line xP; I hate relying on their work).

@ftomassetti
Copy link
Member Author

Honestly is just that I want open worlds from Java and using protobuf I cannot do it right now. HDF5 seems a good format, with a great support for many languages and Bret suggested it for a while so I thought... why not?

@ftomassetti
Copy link
Member Author

tox was not aware of h5py. Let's try again

@ftomassetti
Copy link
Member Author

but sure, I have to install also the library... for both Travis and Appveyor. This is going to be a lot of fun, I am sure.

@psi29a
Copy link
Member

psi29a commented Oct 27, 2015

Yeah... new libraries are hell.

I imagine that pyinstaller is going to be worse. ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how it works, but: it seems like both of these are numpy-arrays - is it really necessary to copy them element by element instead of using numpy.copy() or something similar?
If not, there are some more arrays that are copied that way (further down: ocean, sea_depth, ... - maybe even the biome-map).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this:

numpy.copyto(elevation_data, world.elevation['data'])

But I got this:

TypeError: argument 1 must be numpy.ndarray, not Dataset

I guess it is not exactly a numpy array

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I can use:
elevation_data.write_direct(world.elevation['data'])

@tcld
Copy link
Contributor

tcld commented Oct 29, 2015

I think this looks good - although it might be possible to get rid of almost all element-by-element copy-processes that are being done, since all participants are numpy-arrays anyway. Maybe I am overlooking something there?

Other than looking at it I think that a field test is the easiest way to see if all data is properly written and read.

@tcld
Copy link
Contributor

tcld commented Nov 4, 2015

What's up with the future replacement of pickle and protobuf?^^

@psi29a
Copy link
Member

psi29a commented Nov 4, 2015

I don't see this replacing protobuf, since protobuf will eventually support java. HDF5 is just a stop-gap.

@ftomassetti
Copy link
Member Author

Update status: the code is done and the round-trip works (serialization -> unserialization -> back to the original world). Then I started thinking about having to install another library on AppVeyor, having to explain to users how to installed. So I cried a little, opened a bottle of whiskey and not touched this PR for a while...

Now that I am sober again I am not sure... It is true that we could make hdf5 optional as GDAL is. What do you think guys?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comparison part should be moved to its own function. I think this piece of code exists three times now. And should another save-format ever be supported, things would get even messier.

EDIT: Yes, three times.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this code has to be moved into its own function to then replace the tests for all formats.

@tcld
Copy link
Contributor

tcld commented Nov 5, 2015

As long as it isn't trivial to add another format, I would actually prefer seeing us focus on exactly one. Pickle is kind of worthless as long as Python 2 and 3 are supported since the change in Unicode-support breaks save-compatibility between the Python versions. Protobuf has been at the latest alpha for quite a while and I personally don't like relying on Googles code - they have scrapped or "reimagined" other software before, no matter how many people relied on it. (Of course, we could stick to the latest working version in that case. And even modify it ourselves: https://github.com/google/protobuf )

So while I don't know if we need a third way of saving data, we could really use one that is actually reliable and works for all purposes we can come up with. (And then hopefully get rid of the other formats.)

@psi29a
Copy link
Member

psi29a commented Nov 5, 2015

I think hdf5 should be optional like gdal, it is there and useful for when the right libs are in place. :)

@ftomassetti
Copy link
Member Author

rebased

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question since I don't know: Is this "int" the same for Python 2 and 3? (Maybe it doesn't even matter and save-compatibility will exist either way.)

EDIT: I guess the tests will find out once we load a Python 2 HDF5-world into Python 3. So just disregard my comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would hope so but I do not know. Should we use something like uint16?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. Unless the tests fail, this will be fine. (Can you run the tests on Python 2 and 3?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad @tcld found this, yes we should avoid python datatypes when working with numpy and use numpy's standard datatypes:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html

numpy.dtype(numpy.uint16)
dtype('uint16')

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. I am now using numpy.uint16, numpy.float and numpy.bool

@ftomassetti
Copy link
Member Author

In the meantime Travis complains:

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-Lorf5r/pygdal
ERROR: could not install deps [coverage, numpy, pygdal==1.10.0.1, Pillow, PyPlatec, noise, nose, protobuf, six, h5py]; v = InvocationError('/home/travis/build/Mindwerks/worldengine/.tox/py27/bin/pip install coverage numpy pygdal==1.10.0.1 Pillow PyPlatec noise nose protobuf six h5py (see /home/travis/build/Mindwerks/worldengine/.tox/py27/log/py27-1.log)', 1)

@psi29a
Copy link
Member

psi29a commented Nov 6, 2015

Numpy failed to compile... I'm going to assume that the wheel is not being installed here?

@ftomassetti
Copy link
Member Author

That could be the case but I do not understand why this patch is affecting numpy

@psi29a
Copy link
Member

psi29a commented Nov 6, 2015

It's not... likely a new release of numpy and we don't specify a specific version.

@ftomassetti
Copy link
Member Author

Should we specify a min and max release?

@tcld
Copy link
Contributor

tcld commented Nov 6, 2015

The latest version is about a month old.

Why would it cause problems if there was a new version of a module? Does it maybe have to be recompiled?

@ftomassetti
Copy link
Member Author

Perhaps it is trying to use a version for which there are no pre-compiled packages. Damn how much easier are these things on the JVM....

@ftomassetti
Copy link
Member Author

rebased after #187 has been merged

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we maybe think of a way that generalizes the saving of all the different maps? Iterate over the worlds variables. If there's a plain numpy-array, write it. If it's a dict, write it's key-/value-pairs.
It isn't absolutely necessary, but it would be great if a much more future-proof.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should create some intermediate concepts such as "Layer" and perhaps having subclasses like "LayerWithQuantiles" or "LayerWithThresholds". In this way we could streamline many parts of the code, including serialization.

@ftomassetti
Copy link
Member Author

Ok, it seems good to go from a tests point of view. Feel free to point out corrections and improvements!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In tox.ini there is a different numpy-version used that looks like it is newer than the highest version here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is true. Here the version has been chosen to make AppVeyor happy. Version 1.10.1 has no wheels packages available. However version 1.10.1 was working on Travis. Let me check if other version works on Travis too.

@ftomassetti
Copy link
Member Author

Most comments addressed and all tests passing. Time to merge and move forward with the release?

@tcld
Copy link
Contributor

tcld commented Nov 15, 2015

Do you want to release immediately after this PR? I wouldn't mind adding the tests (although...not too important for a release, I guess) and maybe the frozen oceans (which are done here, I just need to update the data).
Also...you really didn't want to put the serialization-stuff into its own function?^^

@ftomassetti
Copy link
Member Author

about the serialization stuff, like this it is much easier to debug and to implement: when I support a new serialization method I start serializating the different elements one by one and as I figure things out I add another layer. So for me it is much simpler like this, until we introduce classes for the single layers and we reorder the World class.

About other features, I guess we will be always working on cool stuff we want but a certain point we need to release stuff so that users can benefit from it. I would suggest we do a release very soon and after that we start doing releases more and more often.

@tcld
Copy link
Contributor

tcld commented Nov 15, 2015

I think the frozen oceans are something that at least provides a visual benefit to users. And that's always good. ;P (Also: It is practically done, but nobody commented the latest version yet. I would have to adapt the HDF5-code a bit to make it work, though.)

The serialization-code should be put into its own function. I think it is very ugly that it appears twice in the same file (and we aren't talking about five lines of code) and even differs slightly due to personal preference of the person who wrote the respective lines of code.
Of course that's not a huge problem, especially since it's only tests we are talking about. I wouldn't object a merge at all, just wanted to put some more salt into the wound.^^

@ftomassetti
Copy link
Member Author

Well, there are other tasks to do before releasing the new version. Whatever it is merged before we tag release 0.19.0 can go in. Otherwise it could end up in 0.19.1.

This repository has ~700 commits. 200 commits have been added since we released version 0.18.0 (and we have not yet merged this PR). It is really time to share it with the world :)

@tcld
Copy link
Contributor

tcld commented Nov 15, 2015

I would really like to see the satellite-map as a standard output for the next release, since it is such great work by @pangal. Including polar caps. Of course it's not a necessity but it might leave a good impression to have some more eye-candy.

@ftomassetti
Copy link
Member Author

I definitely agree with that: I love the satellite view!

@tcld
Copy link
Contributor

tcld commented Nov 15, 2015

I suggest merging this PR and then #185 (which also fixes several bugs of the satellite-view - it's actually not just about tests). After that WorldEngine should be in a good state to release.
If there is another day we can wait, let's at least make the satellite-view a default. (Polar caps...well, I would prefer adding them now, since they also introduce a new variable that will be saved to the world-file. I don't know if that could cause any problems when done later.)

@ftomassetti
Copy link
Member Author

Let's start by merging this one then. At this point we should move to the next point of the schedule for releasing a new version which is getting the documentation ready. While we work on that we could merge more stuff but it is important to understand that it is not the end of the world if some feature ends up in the next release instead of this one.

ftomassetti added a commit that referenced this pull request Nov 15, 2015
support hdf5 serialization/unserialization of all the world data
@ftomassetti ftomassetti merged commit 81dc7af into master Nov 15, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants