Skip to content

Issue 53625: remap assay lookup keys before validation so that validators can validate the remapped value#7100

Merged
cnathe merged 7 commits intodevelopfrom
fb_assayLookupValidator53625
Oct 7, 2025
Merged

Issue 53625: remap assay lookup keys before validation so that validators can validate the remapped value#7100
cnathe merged 7 commits intodevelopfrom
fb_assayLookupValidator53625

Conversation

@cnathe
Copy link
Contributor

@cnathe cnathe commented Oct 3, 2025

Rationale

Issue 53625: Assay result lookup field import fails when titleColumn value provided with "Ensure Value Exists in Lookup Target" checkbox enabled for lookup field

When an assay results domain has a look field with the "Ensure Value Exists in Lookup Target" enabled, we are currently throwing a ConversionException in the validator code because it is trying to validate if the provided value (which could be the key value for the lookup or the titleColumn value) is in the lookup table.

This PR fixes that by catching the ConversionException in the LookupValidator.validate() method but also by making sure we first remap the provided value before checking the validators in AbstractAssayTsvDataHandler.checkData.

Changes

  • remap assay lookup keys before validation so that validators can validate the remapped value
  • add test case to AssayTest

Tasks 📍

  • Manual Testing @XingY
    If my lookup label is string type but the value is number like, for example "123", then import fails with "look: Value '123' was not present in lookup target 'lists.cRck" @cnathe
  • Needs Automation

@cnathe cnathe self-assigned this Oct 3, 2025
@cnathe cnathe requested a review from XingY October 3, 2025 21:52
TableInfo lookupTable = remappableLookup.get(pd);
try
{
Object remapped = cache.remap(lookupTable, s, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I have a item with label 4 (whose pk is 5), this will map input 4 to pk 4. It's probably not big deal, but it is inconsistent with how we resolve samples (ExperimentService.findExpMaterial).

@cnathe
Copy link
Contributor Author

cnathe commented Oct 6, 2025

@labkey-klum @labkey-jeckels would either of you like to also do a CR on this?
Also, do you have any thoughts on Xing's last comment regarding if we should be resolving by PK RowId first or by lookup value first?

@labkey-klum
Copy link
Contributor

@labkey-klum @labkey-jeckels would either of you like to also do a CR on this? Also, do you have any thoughts on Xing's last comment regarding if we should be resolving by PK RowId first or by lookup value first?

I doubt (or hope not) that we would see a scenario in real life where the lookup resembles the PK but I think I agree with Xing's assessment that we should favor the primary key.

}
catch (ConversionException e)
{
errors.add(new PropertyValidationError(e.getMessage(), pd.getName()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this exception always have a message, or should we fill in a generic one? (I see that this is moved code, but still could be relevant)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was existing code that I moved. However, I don't think the catch block is needed here. The remap code doesn't look to convert the value or throw the ConversionException. The error for a value that can't be mapped ends up in the code above this where remapped == null. So I believe we can remove this try/catch.

@labkey-jeckels
Copy link
Contributor

@labkey-klum @labkey-jeckels would either of you like to also do a CR on this? Also, do you have any thoughts on Xing's last comment regarding if we should be resolving by PK RowId first or by lookup value first?

I doubt (or hope not) that we would see a scenario in real life where the lookup resembles the PK but I think I agree with Xing's assessment that we should favor the primary key.

Adding @labkey-susanh in case she hasn't been following along. For TSV import, it seems more likely that the data provided would be the lookup value (a sample ID that happens to be a number, for example). However, I don't know the full details of the scenario that prompted this change.

@labkey-susanh
Copy link
Contributor

@labkey-klum @labkey-jeckels would either of you like to also do a CR on this? Also, do you have any thoughts on Xing's last comment regarding if we should be resolving by PK RowId first or by lookup value first?

I doubt (or hope not) that we would see a scenario in real life where the lookup resembles the PK but I think I agree with Xing's assessment that we should favor the primary key.

Adding @labkey-susanh in case she hasn't been following along. For TSV import, it seems more likely that the data provided would be the lookup value (a sample ID that happens to be a number, for example). However, I don't know the full details of the scenario that prompted this change.

Yeah, in the absence of other information, I would tend toward resolving as a name first instead of as a PK first, especially since we know customers have number names.

@cnathe
Copy link
Contributor Author

cnathe commented Oct 6, 2025

@labkey-klum @labkey-jeckels would either of you like to also do a CR on this? Also, do you have any thoughts on Xing's last comment regarding if we should be resolving by PK RowId first or by lookup value first?

I doubt (or hope not) that we would see a scenario in real life where the lookup resembles the PK but I think I agree with Xing's assessment that we should favor the primary key.

Adding @labkey-susanh in case she hasn't been following along. For TSV import, it seems more likely that the data provided would be the lookup value (a sample ID that happens to be a number, for example). However, I don't know the full details of the scenario that prompted this change.

Yeah, in the absence of other information, I would tend toward resolving as a name first instead of as a PK first, especially since we know customers have number names.

@labkey-susanh @labkey-jeckels @labkey-klum @XingY
I agree with you all. It would make more sense to use the incoming value to first try to resolve the lookup value by titleColumn and then fall back to resolving it by PK column. This PR doesn't change any of that code, this change just moves the remap code before the validator code so that we have the remapped value when we decide if ColumnValidator is valid.

It looks like the code that is currently resolving the lookup value by pkColumn value and then titleColumn value is in SimpleTranslator.java RemapConverter.mappedValue(). The method there uses _pkColumnLookupMap first and returns if there is a non-null value found before trying the _titleColumnLookupMap. I think the group here is suggesting that we flip the order of these. If that is true, I'd like to make that change as a separate PR. Thoughts?

@labkey-susanh
Copy link
Contributor

@labkey-klum @labkey-jeckels would either of you like to also do a CR on this? Also, do you have any thoughts on Xing's last comment regarding if we should be resolving by PK RowId first or by lookup value first?

I doubt (or hope not) that we would see a scenario in real life where the lookup resembles the PK but I think I agree with Xing's assessment that we should favor the primary key.

Adding @labkey-susanh in case she hasn't been following along. For TSV import, it seems more likely that the data provided would be the lookup value (a sample ID that happens to be a number, for example). However, I don't know the full details of the scenario that prompted this change.

Yeah, in the absence of other information, I would tend toward resolving as a name first instead of as a PK first, especially since we know customers have number names.

@labkey-susanh @labkey-jeckels @labkey-klum @XingY I agree with you all. It would make more sense to use the incoming value to first try to resolve the lookup value by titleColumn and then fall back to resolving it by PK column. This PR doesn't change any of that code, this change just moves the remap code before the validator code so that we have the remapped value when we decide if ColumnValidator is valid.

It looks like the code that is currently resolving the lookup value by pkColumn value and then titleColumn value is in SimpleTranslator.java RemapConverter.mappedValue(). The method there uses _pkColumnLookupMap first and returns if there is a non-null value found before trying the _titleColumnLookupMap. I think the group here is suggesting that we flip the order of these. If that is true, I'd like to make that change as a separate PR. Thoughts?

Definitely agree it would be a separate PR If this PR didn't introduce a change in the ordering. I'm not sure we would reorder these statement for all (not something to be done on a whim) but might choose when or whether includePkLookup is set to true. Within the RemappingConvertColumn, we first convert with this property set to false, then convert with it set to true, and part (or all) of the reason for that is the ordering of the logic in RemapConverter.mappedValue().

@cnathe
Copy link
Contributor Author

cnathe commented Oct 7, 2025

FYI, I am going to merge this change and then open a related issue with the details for the "resolve lookup value before PK" discussion.

@cnathe cnathe merged commit e8e15b7 into develop Oct 7, 2025
17 checks passed
@cnathe cnathe deleted the fb_assayLookupValidator53625 branch October 7, 2025 17:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants