Skip to content

Conversation

@mayurdb
Copy link

@mayurdb mayurdb commented Dec 24, 2019

  • With Spark moving towards the v2 for both reads as well write, eventually Fits Datasource also needs to move to v2.
  • This PR adds support to read Fits data in Spark using v2 APIs by specifying the format as, fitsv2.
  • v2 takes a somewhat different approach compared to v1 on how the data is partitioned across multiple tasks. The new way has simplified the code a bit

There are still few ToDos -
1. Add unit tests
2. Add examples
3. Check the working on a larger dataset
4. Test corner cases (covered in the UTs I guess)
4. Currently, the user will have to give format as fitsv2. We will have to check based on some conf if we can toggle between v1 and v2, as that would be more ideal.
5. Some of the v1 code is replicated in utils for use in v2, refactor the v1 code to use the code in utils

@mayurdb
Copy link
Author

mayurdb commented Jan 13, 2020

Scala 2.11 the build will fail. This is something we should consider, these changes are compiled against Spark 3.0 which requires scala 2.12. I don't think we can compile this against Spark 2.4.3 because as far as I know the V2 APIs are different between those two versions. I'll give it a try, if they have ported the changes back to 2.4.3 it might work

@JulienPeloton
Copy link
Member

Good point - we are at this moment of transition where we need to take a decision on what to support. The future being spark v3 with the associated DSv2, we should primarily focus on this combination for this PR. We can always let a branch open to deal with spark 2.4 support. If there is a chance of backporting, it should be secondary (although important!).

@mayurdb
Copy link
Author

mayurdb commented Jan 16, 2020

@JulienPeloton Yup checked it. As expected does not compile with Spark 2.4.3

@JulienPeloton
Copy link
Member

Thanks - good to know for the future. I will make the review of the current code asap.

@JulienPeloton
Copy link
Member

Hi @mayurdb - I haven't forgotten this important one, I am just overwhelmed these times...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants