Skip to content

Conversation

@ddiez
Copy link
Contributor

@ddiez ddiez commented Jan 17, 2025

Aims to fix #212. This does not work satisfactorily at the moment because I did not realize _infer_dataset_id() will fail when there is no id attached to feature_slice.h5. It works however if we pass dataset_id="". But I wonder what the optimal approach is. By default spaceranger will return feature_slice.h5 so it makes sense to accept this the default and provide the possibility of passing dataset_id when this is not true. In that case perhaps it is best to modify _infer_dataset_id() to check whether feature_slice.h5 exists. If not then check for files with file id, unless dataset_id is passed in which case id_feature_slice.h5 will be used. Comments?

@ddiez ddiez mentioned this pull request Jan 17, 2025
@codecov-commenter
Copy link

codecov-commenter commented Jan 17, 2025

Codecov Report

Attention: Patch coverage is 9.09091% with 10 lines in your changes missing coverage. Please review.

Project coverage is 50.15%. Comparing base (c4f6cf7) to head (70f5060).
Report is 122 commits behind head on main.

Files with missing lines Patch % Lines
src/spatialdata_io/readers/visium_hd.py 12.50% 7 Missing ⚠️
src/spatialdata_io/readers/visium.py 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #264      +/-   ##
==========================================
+ Coverage   43.68%   50.15%   +6.46%     
==========================================
  Files          23       26       +3     
  Lines        2353     2660     +307     
==========================================
+ Hits         1028     1334     +306     
- Misses       1325     1326       +1     
Files with missing lines Coverage Δ
src/spatialdata_io/readers/visium.py 19.35% <0.00%> (-0.32%) ⬇️
src/spatialdata_io/readers/visium_hd.py 17.83% <12.50%> (-0.46%) ⬇️

... and 3 files with indirect coverage changes

@grst
Copy link
Contributor

grst commented Jan 21, 2025

For non-HD visium, the reader fails when no dataset_id can be inferred. IMO this is desirable as it forces the user to specify the dataset id manually. This helps to actually know what is stored in a spatialdata object.

@ddiez
Copy link
Contributor Author

ddiez commented Jan 21, 2025

@grst Enforcing having a dataset_id might be useful to help recognize what is in the spatialdata object. I do not know much about default file storage options for other platforms, except perhaps Visium V1, but the problem now is that passing dataset_id that is not "" will fail with the standard visium_hd format returned by spaceranger.

One way to solve the double requirement is to guess dataset_id from the folder name that is passed, since that usually has that meaning, but then use by default feature_slice.h5 for the h5 file. In the (probably rare) cases when the h5 file has another name, like the data that can be downloaded from 10x website, being able to specify a custom file name for feature_slice.h5 makes also sense. What do you think? Not sure how much reworking the current approach this requires.

@LucaMarconato LucaMarconato mentioned this pull request Feb 3, 2025
@LucaMarconato LucaMarconato marked this pull request as ready for review February 3, 2025 13:20
@LucaMarconato
Copy link
Member

Thanks @ddiez for starting the PR. I started from your code and now the issue should be addressed. I'll merge but wait to gather feedback from the users before the next release, so if something needs to be adjusted we will do it.

@LucaMarconato LucaMarconato merged commit 7724871 into scverse:main Feb 3, 2025
4 checks passed
@quentinblampey
Copy link
Contributor

Hello,
Just a small note here: for Visium HD only, when dataset_id=None, what do you think of trying both to infer the dataset_id and (if not working) try to fallback to dataset_id=""? It would work most of the time, while, right now, it's not very convenient to always have to set dataset_id="" after running Space Ranger.

Let me know if that sounds good to you, and I can make a PR.

@LucaMarconato
Copy link
Member

@quentinblampey opened a PR here (#352). Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

5 participants