-
Notifications
You must be signed in to change notification settings - Fork 22
SWAPI: Fix I-ALiRT error resulting from new sweep table number #2485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
SWAPI: Fix I-ALiRT error resulting from new sweep table number #2485
Conversation
|
Here is the test I wrote that shows that the code does at least run (I guess you can also just use the existing tests with simulated data...): @pytest.mark.external_test_data
def test_process_real_packet():
from pathlib import Path
packets_directory = Path('/home/hafarooki/projects/imap_processing/swapi-test/packets')
xtce_ialirt_path = (
imap_module_directory / "ialirt" / "packet_definitions" / "ialirt.xml"
)
calibration_file = pd.read_csv(
f"{imap_module_directory}/tests/ialirt/data/l0/swapi_ialirt_energy_steps.csv"
)
lines = []
for packet_path in packets_directory.iterdir():
sc_xarray_data = packet_file_to_datasets(
packet_path, xtce_ialirt_path, use_derived_value=False
)[478]
lines.append(packet_path.name)
lines.append('\tswapi_coin_cnt# sum: ' + ' '.join(f'{str(key).split("_")[-1]}:{sc_xarray_data[key].sum().item()}'
for key in sc_xarray_data.keys()
if str(key).startswith('swapi_coin_cnt')))
lines.append('\ttotal count: ' + str(sum(sc_xarray_data[key].sum().item()
for key in sc_xarray_data.keys()
if str(key).startswith('swapi_coin_cnt'))))
swapi_product = process_swapi_ialirt(sc_xarray_data, calibration_file)
for i in range(len(swapi_product)):
density = swapi_product[i]['swapi_pseudo_proton_density']
temperature = swapi_product[i]['swapi_pseudo_proton_temperature']
speed = swapi_product[i]['swapi_pseudo_proton_speed']
lines.append(f'\ti={i} ({swapi_product[i]["met_in_utc"]})')
lines.append(f'\t\tpseudo moments: density={density}, temperature={temperature}, speed={speed}')
output_file = packets_directory.parent / 'output.txt'
output_file.write_text('\n'.join(lines))
assert False, 'need to add assertions comparing with omni'
``` |
laspsandoval
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take a look at my comments.
| 60000 * (initial_speed_guess / 400) ** 2, | ||
| ] | ||
| ) | ||
| five_point_range = range(max_index - 2, max_index + 2 + 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep the range the same as before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous range was incorrect -- I should have put this in a separate PR. It was using 6 points instead of 5 points.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I included this here by accident -- should I put this in a separate PR or just update the name/description of this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might take more time if we change the range. Bishwas will need to review and we will need to fix the test. Could we leave it for now and update it in the next PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to this:
five_point_range = range(max_index - 3, max_index + 3)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its okay to leave the energy range like this now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. We will need to update the test that is failing:
test_optimize_parameters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Figure 4 of Lee et al. 2025, the I-ALiRT paper, it looks like they actually used 6 points to fit.
https://link.springer.com/article/10.1007/s11214-025-01244-9
Bishwasls
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its not critical to change the energy bins used for fitting.
| 60000 * (initial_speed_guess / 400) ** 2, | ||
| ] | ||
| ) | ||
| five_point_range = range(max_index - 2, max_index + 2 + 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its okay to leave the energy range like this now.
|
I reverted to using 6 points for fitting. Locally, I confirmed that the failing test once again passes. |
|
One thing that should be discussed is how we handle cases where less than 3 points are available for fitting. With less than 3 points, the 3-parameter Maxwellian model cannot be constrained. Some sort of fill value should be used instead? |
bishwassth
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This version looks good to me.
That's a good point. Did you see any case where less than three points are available? I always thought we will have at least three points available for fitting, but may be not due to compression of the counts in the I-ALiRT packets. |
|
So far I have seen a case with only 3 points available but not one with less than 3. However would be good to include the edge case in case it ever happens so that the I-ALiRT service is not interrupted again. |
greglucas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have some example tests so that we can add some assertions on what you want to happen in the various cases.
Something like the following pseudo code to test just this curve fitting function.
x = np.arange(63)
# Add a nan to our x that should be ignored in the fits
x[50] = np.nan
output = optimize_pseudo_parameters(x, ...)
assert output ...
|
Here is a link to the failing test you should be able to run locally: |
Co-authored-by: Greg Lucas <greg.m.lucas@gmail.com>
… into fix-nan-handling
|
Something strange I just realized: Why is there even a nan/infinite value in the xdata (as the stack trace in the original email that raised this issue shows) in the first place? In # Find the sweep's energy data for the latest time, where sweep_id == 2
subset = calibration_lut_table[
(calibration_lut_table["timestamp"] == calibration_lut_table["timestamp"].max())
& (calibration_lut_table["Sweep #"] == 2)
]
if subset.empty:
energy_passbands = np.full(NUM_IALIRT_ENERGY_STEPS, np.nan, dtype=np.float64)
else:
subset = subset.sort_values(["timestamp", "ESA Step #"])
energy_passbands = (
subset["Energy"][:NUM_IALIRT_ENERGY_STEPS].to_numpy().astype(float)
)Note in particular that when the data subset is empty, nans are provided as energy. Is this appropriate behavior? Maybe instead of masking out nans in I am not sure what that line of code is meant to be doing exactly---under what circumstances would |
This reverts commit 194115e.
@hafarooki I agree with you here. If that is the case, we should probably just exit/return early because there is no point in even continuing on after that subset.empty call. Another thing that is slightly confusing to me and might help is to make |
|
Why do we hardcode sweep # == 2? We have actually changed to sweep 3 now, so is there a way for I-ALiRT to know which sweep # to lookup or do we have to hardcode 3 now and make sure we are in-sync manually. |
|
Good catch! After consulting with the engineering team, I concluded that we can use |
|
pre-commit.ci autofix |
…ng into fix-nan-handling
|
I'm sorry, but I'm not going to review this. You have touched too many things since the last time. Please keep updates minimal! |
greglucas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is touching L2 code now too. Please remove that
|
I agree, the scope of this PR has changed because the issue turned out to be different from the original one, and we have identified some other issues along the way. I will split this into multiple PRs once we finalize the changes to be made. |
|
A lot has happened today for the SWAPI team in terms of how to process the I-ALiRT packets to get our pseudo moments. Because of this, I think we should pause with the PRs for now. |
Change Summary
Overview
I was shown in an email that I-ALiRT is broken because this portion of the code is failing to handle NaNs. I also found that an interval where the same solution kept being returned was having that issue because the zeros were not being excluded from
curve_fit. I am working on another potential PR to improve this fitting procedure, but this PR should fix those particular bugs. However, I haven't tested it properly (I did validate that the code runs with a custom unit test but the test is not well-written and not included in the PR) and would like help doing that. Also, there is a potential issue with how I did it: If ALL of the counts are excluded by the mask, there will be NO samples passed to curve fit.New Dependencies
N/A
New Files
N/A
Deleted Files
N/A
Updated Files
Just updated
process_swapi.pyto exclude zeros (as otherwise the initial guess is left as the solution and the covariance is infinite) and nan/infinite points (as otherwise there is an error that breaks I-ALiRT)Testing
I think there should be a test or two added to ensure the new behavior is correct. I haven't gotten around to creating such tests yet as I am not sure how the software team would prefer them to be implemented and I understand that this is urgent..