FILTER support for '*' or '?' constraint for more than 1 variable #37
Problem
A FILTER expression that uses '*' or '?' gives incorrect results when more than 1 variable is involved.
Example:
Query Q1:
filter "(BAR[*] < 0.1)" test.vcf
-> yields no results
Query Q2:
filter "(BAR[*] < 0.1) & (FOO[*] >= 2)" test.vcf
-> yields 3 results
Q2 is stricter than Q1, so its result should always be a subset of Q1's result.
Cause
After looking at the code, the '*' or '?' query currently supports only 1 variable.
If multiple variables, such as FOO and BAR, both have a '*' or '?' predicate, the index n used to evaluate BAR[n] is not evaluated properly.
This causes incorrect results to be generated. See also TestCasesFilter#test_57 in this pull request for a concrete example.
Proposed Solution
Track the current index for each query variable in FieldIterator.
In this context, FOO will track some index m and BAR will track another index n.
Previously, both FOO and BAR tracked the same index n even though they were at different stages in the iteration.
Verification
Added the following test files:
TestCasesFilter#test_57
test/test_filter_multiple_var.vcf
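To illustrate the idea under Proposed Solution, here is a minimal, self-contained Java sketch of per-variable index tracking. It is not the code from this pull request: IndexTracker, matches, q1, q2 and the record contents are hypothetical names and data made up for illustration, and the real FieldIterator may be structured quite differently. The sketch only assumes that '*' means "all values must match" and '?' means "any value may match".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoublePredicate;

public class PerVariableIndexSketch {

    /** Hypothetical stand-in for iterator state: one current index per variable name. */
    static final class IndexTracker {
        private final Map<String, Integer> current = new HashMap<>();

        int get(String variable) {
            return current.getOrDefault(variable, 0);
        }

        void set(String variable, int index) {
            current.put(variable, index);
        }
    }

    /** Evaluates VAR[*] (all values must match) or VAR[?] (any value may match). */
    static boolean matches(IndexTracker tracker, Map<String, double[]> record,
                           String variable, boolean requireAll, DoublePredicate test) {
        double[] values = record.getOrDefault(variable, new double[0]);
        for (int i = 0; i < values.length; i++) {
            tracker.set(variable, i); // advances this variable's index only
            boolean ok = test.test(values[tracker.get(variable)]);
            if (requireAll && !ok) return false; // '*': one failure rejects
            if (!requireAll && ok) return true;  // '?': one success accepts
        }
        return requireAll; // '*' passed every value; '?' found no match
    }

    /** Q1 from the example: (BAR[*] < 0.1). */
    static boolean q1(Map<String, double[]> record) {
        return matches(new IndexTracker(), record, "BAR", true, v -> v < 0.1);
    }

    /** Q2 from the example: (BAR[*] < 0.1) & (FOO[*] >= 2). */
    static boolean q2(Map<String, double[]> record) {
        IndexTracker tracker = new IndexTracker();
        return matches(tracker, record, "BAR", true, v -> v < 0.1)
            && matches(tracker, record, "FOO", true, v -> v >= 2);
    }

    public static void main(String[] args) {
        // Made-up record: BAR has three values, FOO has two.
        Map<String, double[]> record = Map.of(
            "BAR", new double[] {0.01, 0.05, 0.02},
            "FOO", new double[] {2.0, 3.0});
        System.out.println("Q1: " + q1(record));
        System.out.println("Q2: " + q2(record));
    }
}
```

Because BAR and FOO each advance their own entry in the tracker, iterating over one variable cannot disturb the position of the other, so any record accepted by q2 is also accepted by q1, which is the subset property the example expects.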