Description
General Remarks:
- Are `LIST OF DOUBLE` and `ARRAY_OF_DOUBLES` in the functions handled as float?
- What about the vector types `ARRAY_OF_SHORTS`, `ARRAY_OF_INTEGERS`, `ARRAY_OF_LONGS`?
- Is `ARRAY_OF_FLOATS` a `LIST OF FLOAT`? If so, what is `[1.0, 2.0]` by default: `LIST` or `ARRAY_OF_FLOATS`? I assume a list. I guess this hierarchy needs documenting.
- Is `SparseVector` a SQL data type? Can a property have this type?
- The dot product computation (sum of squares) is implemented in various functions. Should there be one that is reused?
Specific Remarks:
`vectorDimension`
- Errors for a `NULL` argument. IMHO it should return `0`, like `.length()` or `.size()` for a `NULL` argument.
- Wouldn't `vectorDim` be sufficient as a name, given that it is also `vectorHasInf` and not `vectorHasInfinity`?
`vectorHasNaN`, `vectorHasInf`
- Cannot be tested with `LIST OF FLOAT` because `NAN` values are automatically converted to `NULL`, which is not allowed in typed collections.
- `SELECT vectorHasNaN([1.0,sqrt(-1.0),3.0])` errors with `Cannot invoke "Object.getClass()" because "elem" is null`.
- Is maybe a function `vectorHasNull` needed?
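To illustrate the `NULL` element problem above, here is a minimal sketch of a null-tolerant check. The class and method names (`VectorNaNSketch`, `hasNaN`, `hasNull`) are hypothetical and not taken from the project; the sketch only assumes that the vector arrives as a list of boxed numbers.

```java
import java.util.List;

// Hypothetical sketch, not the project's implementation.
final class VectorNaNSketch {
    // Returns true if any non-null element is NaN; NULL elements are skipped
    // instead of being dereferenced.
    static boolean hasNaN(List<? extends Number> vector) {
        for (Number elem : vector) {
            if (elem != null && Double.isNaN(elem.doubleValue())) {
                return true;
            }
        }
        return false;
    }

    // Returns true if any element is NULL (the suggested vectorHasNull).
    static boolean hasNull(List<? extends Number> vector) {
        for (Number elem : vector) {
            if (elem == null) {
                return true;
            }
        }
        return false;
    }
}
```

With such a split, a vector whose `NAN` entries were converted to `NULL` could at least be detected via the second function.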
`vectorIsNormalized`
- The test for normalization is numerically as complex as normalizing itself, so instead of testing for normality one could just normalize.
- Why is the default threshold `0.001`? Is this supposed to be approximately `sqrt(eps)` for float? Then it should be `0.0000001` for doubles.
- Wouldn't `vectorIsNormal` be sufficient as a name?
`vectorAdd`, `vectorSubtract`
- How to broadcast? Meaning, how to add a scalar to (or subtract it from) a vector without creating a vector, i.e. `[1.0,2.0,3.0] + 4.0` (which currently would add the element 4.0 to the vector instead of adding 4.0 to every element)?
- Wouldn't `vectorSub` be sufficient as a name?
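What broadcasting would mean here can be sketched as follows. `BroadcastSketch` and `addScalar` are hypothetical names used for illustration only; how (or whether) the SQL functions should expose this is exactly the open question above.

```java
// Hypothetical sketch of scalar broadcasting, not the project's code.
final class BroadcastSketch {
    // Adds the scalar to every element and returns a new array,
    // leaving the input vector untouched.
    static double[] addScalar(double[] vector, double scalar) {
        double[] result = new double[vector.length];
        for (int i = 0; i < vector.length; i++) {
            result[i] = vector[i] + scalar;
        }
        return result;
    }
}
```

Subtraction would follow by passing the negated scalar.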
`vectorMagnitude`
- Why is this function not named `vectorL2Norm`, symmetrically with `vectorL1Norm` and `vectorLInfNorm`?
`vectorLInfNorm`
- The loop does not need a conditional if something like `maxAbs = Math.max(maxAbs, Math.abs(value))` is used.
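A minimal sketch of the branch-free loop suggested above, assuming the vector is available as a `double[]`. The class and method names are hypothetical.

```java
// Hypothetical sketch, not the project's implementation.
final class LInfNormSketch {
    // L-infinity norm: the largest absolute value among the elements.
    // An empty vector yields 0.0.
    static double lInfNorm(double[] vector) {
        double maxAbs = 0.0;
        for (double value : vector) {
            maxAbs = Math.max(maxAbs, Math.abs(value));
        }
        return maxAbs;
    }
}
```

Note that `Math.max` returns NaN if either argument is NaN, so a NaN element propagates to the result, which may differ from a comparison-based loop.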
`vectorSparsity`
- Should there be a default threshold like `sqrt(eps)`?
- Alternatively or additionally, the L0 pseudonorm could be computed as a sparsity measure (geometric mean of absolute values).
`vectorSum`, `vectorAvg`, `vectorMax`, `vectorMin`
- There is an error in the `vectorSum` function: repeated calls of `SELECT vectorSum([1.0,2.0,3.0])` yield different (increasing) results, and the same happens when using a property as argument. `vectorAvg` does not have this problem.
- These seem to work differently from other aggregating functions, which for a single argument aggregate over the argument. Here that would mean, for example, the sum of the vector elements.
- `vectorAvg` does not produce the arithmetic average (just one specific property of a record).
`vectorStdDev`, `vectorVariance`
- Unlike the `variance` and `stddev` SQL functions, these are not aggregating.
- Unlike the `variance` and `stddev` SQL functions, `vectorStdDev` is not reusing the `vectorVariance` code.
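The reuse suggested above could be as simple as the following sketch. The names are hypothetical, and the population variance (dividing by n) is an assumption; the actual functions may use the sample variance (dividing by n - 1).

```java
// Hypothetical sketch, not the project's implementation.
final class VarianceSketch {
    // Population variance of the vector elements (assumes a non-empty vector).
    static double variance(double[] vector) {
        double mean = 0.0;
        for (double x : vector) {
            mean += x;
        }
        mean /= vector.length;

        double sumSquares = 0.0;
        for (double x : vector) {
            double diff = x - mean;
            sumSquares += diff * diff;
        }
        return sumSquares / vector.length;
    }

    // Standard deviation defined in terms of variance, so the logic lives in one place.
    static double stdDev(double[] vector) {
        return Math.sqrt(variance(vector));
    }
}
```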
`vectorClip`
- Since this is called `clamp` in Java terminology, should this be renamed?
`vectorCosineSimilarity`
- The two loops in the computation can be merged into one.
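A sketch of the merged loop, accumulating the dot product and both squared norms in a single pass. The names are hypothetical, and returning `0.0` for a zero vector is an assumption, not necessarily what the current function does.

```java
// Hypothetical sketch, not the project's implementation.
final class CosineSketch {
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        // Single pass: dot product and both squared norms together.
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double denominator = Math.sqrt(normA) * Math.sqrt(normB);
        // Assumption: a zero vector yields similarity 0.0 instead of NaN.
        return denominator == 0.0 ? 0.0 : dot / denominator;
    }
}
```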
`vectorQuantizeBinary`
- Why is the median used to decide?
- Why is there no `vectorDequantizeBinary`? At least for completeness.
`vectorDequantizeInt8`
- This does not work: `SELECT vectorDequantizeInt8(vectorQuantizeInt8([1.0, 2.0, 3.0]), 1.0, 3.0)` gives the error `Quantized vector must be an array or list, found: QuantizationResult`.
`vectorApproxDistance`
- What does "ranking is preserved" mean for `INT8`? Vector spaces are not ordered. Is this meant element-wise?
- Why can't the function deduce the quantization from its arguments?
- The following errors: `SELECT vectorApproxDistance(vectorQuantizeInt8([1.0, 2.0, 3.0]),vectorQuantizeInt8(1.0, 3.0, 3.0),'INT8')` with `vectorQuantizeInt8(<vector>)`.
`vectorNormalizeScores`
- Wouldn't it be faster to create a new array with the midpoint value for the edge case of range zero instead of looping?
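The zero-range edge case could be handled with a bulk fill, as in this sketch. The names are hypothetical, and both min-max scaling to `[0, 1]` and a midpoint of `0.5` are assumptions about what the function does.

```java
import java.util.Arrays;

// Hypothetical sketch, not the project's implementation.
final class NormalizeScoresSketch {
    // Min-max normalization of scores to [0, 1].
    static double[] normalize(double[] scores) {
        double[] result = new double[scores.length];
        if (scores.length == 0) {
            return result;
        }
        double min = scores[0];
        double max = scores[0];
        for (double s : scores) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        double range = max - min;
        if (range == 0.0) {
            // Edge case: all scores equal, so fill with the assumed midpoint.
            Arrays.fill(result, 0.5);
            return result;
        }
        for (int i = 0; i < scores.length; i++) {
            result[i] = (scores[i] - min) / range;
        }
        return result;
    }
}
```

`Arrays.fill` still iterates internally, so the gain is mainly in readability; whether it is measurably faster would need benchmarking.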
`vectorMultiScore`
- The associated Java class filename does not fit the pattern (misses the `Vector` prefix).
- Why is the weighted average an extra type, and not just `'AVG'` with an extra argument? Or `AVG` could always be weighted, but by default with a vector of ones.
`vectorHybridScore`
- This is just a special case of `vectorMultiScore` for the case of two scores with a weighted average. Is this extra function needed?
- This is not really a `vector` function as it does not handle vectors.
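The relationship between the two functions can be sketched like this: an average that is always weighted (defaulting to weights of one), with the two-score hybrid as a thin wrapper. All names are hypothetical, and the convention that `alpha` weights the first score and `1 - alpha` the second is an assumption.

```java
import java.util.Arrays;

// Hypothetical sketch, not the project's implementation.
final class ScoreFusionSketch {
    // Weighted average of scores; weights must have the same length.
    static double weightedAverage(double[] scores, double[] weights) {
        if (scores.length != weights.length) {
            throw new IllegalArgumentException("Scores and weights must have the same length");
        }
        double weightedSum = 0.0;
        double weightSum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            weightedSum += weights[i] * scores[i];
            weightSum += weights[i];
        }
        if (weightSum == 0.0) {
            throw new IllegalArgumentException("Weights must not sum to zero");
        }
        return weightedSum / weightSum;
    }

    // Plain average: the weighted average with a default weight vector of ones.
    static double average(double[] scores) {
        double[] ones = new double[scores.length];
        Arrays.fill(ones, 1.0);
        return weightedAverage(scores, ones);
    }

    // Two-score hybrid as a special case (assumed convention for alpha).
    static double hybrid(double first, double second, double alpha) {
        return weightedAverage(new double[] {first, second}, new double[] {alpha, 1.0 - alpha});
    }
}
```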
`vectorRRFScore`
- This produces wrong results for more than two scores, as the optional last argument cannot be distinguished from a score.
- Why are the scores not grouped into a vector as for `vectorMultiScore`?
- This is not really a `vector` function as it does not handle vectors.
`vectorScoreTransformation`
- This is not really a `vector` function as it does not handle vectors.
- `LN` would be clearer than `LOG` about which type of logarithm is meant.
- Additionally, `TANH` might be a useful variant of `SIGMOID`.
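For context on the two remarks above: in Java, `Math.log` is the natural logarithm and `Math.log10` the base-10 one, and `tanh` is a rescaled sigmoid, `tanh(x) = 2 * sigmoid(2x) - 1`, mapping to `(-1, 1)` instead of `(0, 1)`. The class and method names below are hypothetical.

```java
// Hypothetical sketch, not the project's implementation.
final class ScoreTransformSketch {
    // Logistic sigmoid, mapping any real score to (0, 1).
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Hyperbolic tangent, mapping any real score to (-1, 1).
    static double tanh(double x) {
        return Math.tanh(x);
    }

    // Natural logarithm, which an 'LN' option would name unambiguously.
    static double ln(double x) {
        return Math.log(x);
    }
}
```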
`vectorDenseToSparse`
- Associated Java class filename differs from the pattern (`vector` prefix missing).
- Couldn't it be named `vectorAsSparse`?
`vectorSparseCreate`, `vectorSparseDot`, `vectorSparseToDense`
- Associated Java class filename differs from the pattern (`vector` prefix missing).
`vectorToString`
- Are these meant to be copy-pasted into code / scripts, or saved to a file which is then loaded?
- When the `numpy` `fromString` method is used, a comma-separated list is expected; the brackets are only for code AFAIK.
- In MATLAB the separator determines the type of vector: a space (or comma!) produces a row vector, while a semicolon results in a column vector.
- Julia is similar to MATLAB in many regards, so the MATLAB variant should also work in Julia, but it would be more obvious if a `'JULIA'` variant were also available.
- Should this be renamed to `vectorAsString`?
Labels: enhancement (New feature or request)