Add pairwise distance computation in condensed form for symmetric metrics. #79
lucabrivio wants to merge 1 commit into JuliaStats:master
5d36c4e to b20e644
b20e644 to 1fdc3d2
1fdc3d2 to c8aec6d
|
Thanks, but I'm not sure we want to provide this kind of function. Julia supports very powerful So instead of returning a vector, which is arguably a hack due to the limitations of other languages, we should use What is the most interesting feature for you in this PR? Saving RAM? Making subsequent computations faster?
|
Thank you for your answer.
The most interesting feature for me is indeed making subsequent computations faster, for instance when comparing a large number of such matrices... (In my case it also matches the form I intended to store the results in.)
I am actually fairly new to Julia and could not find the best way to represent the result matrix; however, I agree that we should use something different!
|
|
If you mainly care about the computation speed, then returning a I think you could just add a |
|
I too agree that |
|
What is the status on this? I also think that the output for semi-metrics should be symmetric. Also, besides saving computation, we could have a method to flatten a symmetric matrix (the upper or lower part of it as a vector). Many times that is what is needed in distance calculations on a huge point cloud. |
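A flattening method of the kind suggested here could look roughly like the sketch below. The name `lower_condensed` is hypothetical and not part of Distances.jl; the sketch assumes a square input and copies the strictly lower triangle in column-major order, so for a symmetric matrix each distinct pair appears exactly once.

```julia
# Hypothetical helper, not part of Distances.jl.
# Returns the strictly lower triangle of a square matrix as a vector,
# walking down each column in turn (column-major order).
function lower_condensed(D::AbstractMatrix)
    n = size(D, 1)
    size(D, 2) == n || throw(ArgumentError("matrix must be square"))
    v = Vector{eltype(D)}(undef, div(n * (n - 1), 2))
    k = 0
    for j in 1:n-1, i in j+1:n
        k += 1
        v[k] = D[i, j]
    end
    return v
end

# Toy symmetric matrix with a zero diagonal.
D = [0.0 5.0 1.0;
     5.0 0.0 2.0;
     1.0 2.0 0.0]

v = lower_condensed(D)
```

For `n` points the result has `n * (n - 1) / 2` entries, which is the number of unordered pairs.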
|
To restate the earlier comments: the main advantage of storing half of the matrix entries in a flattened vector is that subsequent calculations become more efficient. Simple statistics such as histograms, means, and variances are easy to apply. Even if pairwise returned a symmetric matrix type, its zero entries would distort those summaries.
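The effect on a summary statistic can be seen in a small sketch. It assumes that the zero entries in question are the diagonal of a full distance matrix; the toy data and variable names are illustrative only.

```julia
using Statistics

# Toy symmetric distance matrix for three points (zero diagonal).
D = [0.0 5.0 1.0;
     5.0 0.0 2.0;
     1.0 2.0 0.0]

# The same three distinct distances in condensed form, written out by hand.
v = [5.0, 1.0, 2.0]

mean_condensed = mean(v)  # averages only the three real pairwise distances
mean_full = mean(D)       # also averages the three diagonal zeros
```

Here the mean over the full matrix is pulled toward zero by the diagonal, while the mean over the condensed vector reflects only actual pairwise distances.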
This patch adds support for computing pairwise distances (for semimetrics) in a condensed form, similar to how pdist works in several packages available for MATLAB/Octave, R, Python, etc. (although on columns).
At the moment I have added no tests for the new methods, nor have I written functions to convert between the condensed form and the redundant one.
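As an illustration of the condensed form described above, here is a minimal sketch that computes each pairwise distance between columns once. It is not the code in this patch: `condensed_pairwise` and `euclid` are hypothetical names, the distance function is assumed to be symmetric, and the ordering (strictly lower triangle, column by column) is an assumption.

```julia
# Hypothetical helper, not the implementation in this patch.
# Computes dist(X[:, i], X[:, j]) once for every pair i > j and stores
# the results in a vector of length n * (n - 1) / 2.
function condensed_pairwise(dist, X::AbstractMatrix)
    n = size(X, 2)
    out = Vector{Float64}(undef, div(n * (n - 1), 2))
    k = 0
    for j in 1:n-1, i in j+1:n
        k += 1
        out[k] = dist(view(X, :, i), view(X, :, j))
    end
    return out
end

# Plain Euclidean distance, defined here so the sketch is self-contained.
euclid(a, b) = sqrt(sum(abs2, a .- b))

# Three points stored as columns: (0, 0), (3, 4), (0, 1).
X = [0.0 3.0 0.0;
     0.0 4.0 1.0]

d = condensed_pairwise(euclid, X)
```

Only the pairs with `i > j` are evaluated, so the metric is called half as many times as it would be for the full off-diagonal matrix.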