Skip to content

Conversation

@MattScicluna
Copy link
Contributor

close #87

@coveralls
Copy link

coveralls commented Dec 18, 2025

Pull Request Test Coverage Report for Build 20349244901

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 0.0%

Totals Coverage Status
Change from base Build 18852513838: 0.0%
Covered Lines: 0
Relevant Lines: 0

💛 - Coveralls

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for random landmarking with precomputed distance and affinity matrices, addressing issue #87. Previously, random landmarking only worked with raw data using Euclidean distance computations. Now users can provide precomputed distance or affinity matrices and use random landmarking for landmark-based dimensionality reduction.

Key changes:

  • Extended build_landmark_op() to detect and handle precomputed matrices by using affinity-based clustering instead of distance-based clustering
  • Added test coverage for random landmarking with precomputed affinity matrices
  • Preserved existing functionality for non-precomputed graphs

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
graphtools/graphs.py Modified build_landmark_op() method to check for precomputed matrices and use kernel-based affinity clustering (argmax) instead of distance-based clustering (argmin) when precomputed matrices are detected
test/test_random_landmarking.py Added new test test_random_landmarking_with_precomputed_affinity() to verify random landmarking works correctly with precomputed affinity matrices

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +408 to +446
def test_random_landmarking_with_precomputed_affinity():
"""Random landmarking should work with precomputed affinity matrices"""
affinity = np.array(
[
[1.0, 0.8, 0.1, 0.0, 0.0, 0.0],
[0.8, 1.0, 0.2, 0.0, 0.0, 0.0],
[0.1, 0.2, 1.0, 0.9, 0.4, 0.0],
[0.0, 0.0, 0.9, 1.0, 0.5, 0.2],
[0.0, 0.0, 0.4, 0.5, 1.0, 0.9],
[0.0, 0.0, 0.0, 0.2, 0.9, 1.0],
]
)
affinity = (affinity + affinity.T) / 2 # ensure symmetry
n_landmark = 3
random_state = 42

G = graphtools.Graph(
affinity,
precomputed="affinity",
n_landmark=n_landmark,
random_landmarking=True,
random_state=random_state,
knn=3,
thresh=0,
)

# Trigger landmark construction
_ = G.landmark_op

rng = np.random.default_rng(random_state)
landmark_indices = rng.choice(affinity.shape[0], n_landmark, replace=False)
expected_clusters = np.asarray(
G.kernel[:, landmark_indices].argmax(axis=1)
).reshape(-1)

assert np.array_equal(G.clusters, expected_clusters)
assert G.transitions.shape == (affinity.shape[0], n_landmark)
assert G.landmark_op.shape == (n_landmark, n_landmark)

Copy link

Copilot AI Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this test covers precomputed="affinity", there is no test coverage for precomputed="distance". Since the implementation in graphs.py handles precomputed distance matrices by converting them to affinity matrices in build_kernel(), it would be valuable to add a similar test case that verifies random landmarking works correctly with precomputed distance matrices as well.

Copilot uses AI. Check for mistakes.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +408 to +446
def test_random_landmarking_with_precomputed_affinity():
"""Random landmarking should work with precomputed affinity matrices"""
affinity = np.array(
[
[1.0, 0.8, 0.1, 0.0, 0.0, 0.0],
[0.8, 1.0, 0.2, 0.0, 0.0, 0.0],
[0.1, 0.2, 1.0, 0.9, 0.4, 0.0],
[0.0, 0.0, 0.9, 1.0, 0.5, 0.2],
[0.0, 0.0, 0.4, 0.5, 1.0, 0.9],
[0.0, 0.0, 0.0, 0.2, 0.9, 1.0],
]
)
affinity = (affinity + affinity.T) / 2 # ensure symmetry
n_landmark = 3
random_state = 42

G = graphtools.Graph(
affinity,
precomputed="affinity",
n_landmark=n_landmark,
random_landmarking=True,
random_state=random_state,
knn=3,
thresh=0,
)

# Trigger landmark construction
_ = G.landmark_op

rng = np.random.default_rng(random_state)
landmark_indices = rng.choice(affinity.shape[0], n_landmark, replace=False)
expected_clusters = np.asarray(
G.kernel[:, landmark_indices].argmax(axis=1)
).reshape(-1)

assert np.array_equal(G.clusters, expected_clusters)
assert G.transitions.shape == (affinity.shape[0], n_landmark)
assert G.landmark_op.shape == (n_landmark, n_landmark)

Copy link

Copilot AI Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test only covers dense precomputed affinity matrices. Consider adding test coverage for sparse precomputed affinity matrices to ensure the sparse matrix handling code path (lines 1210-1217 in graphs.py) works correctly with random landmarking.

Copilot uses AI. Check for mistakes.
Comment on lines +448 to +488
def test_random_landmarking_with_precomputed_distance():
"""Random landmarking should work with precomputed distance matrices"""
dist = np.array(
[
[0, 1, 4, 4, 4, 4],
[1, 0, 4, 4, 4, 4],
[4, 4, 0, 1, 4, 4],
[4, 4, 1, 0, 4, 4],
[4, 4, 4, 4, 0, 1],
[4, 4, 4, 4, 1, 0],
]
)

n_landmark = 3
random_state = 42

G = graphtools.Graph(
dist,
precomputed="distance",
n_landmark=n_landmark,
random_landmarking=True,
random_state=random_state,
bandwidth=1, # deterministic affinity: exp(-dist)
decay=1,
thresh=0,
knn=3,
)

# Trigger landmark construction
_ = G.landmark_op

rng = np.random.default_rng(random_state)
landmark_indices = rng.choice(dist.shape[0], n_landmark, replace=False)
expected_clusters = np.asarray(
G.kernel[:, landmark_indices].argmax(axis=1)
).reshape(-1)

assert np.array_equal(G.clusters, expected_clusters)
assert G.transitions.shape == (dist.shape[0], n_landmark)
assert G.landmark_op.shape == (n_landmark, n_landmark)

Copy link

Copilot AI Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test only covers dense precomputed distance matrices. Consider adding test coverage for sparse precomputed distance matrices to ensure the sparse matrix code path works correctly with random landmarking.

Copilot uses AI. Check for mistakes.
Comment on lines +408 to +446
def test_random_landmarking_with_precomputed_affinity():
"""Random landmarking should work with precomputed affinity matrices"""
affinity = np.array(
[
[1.0, 0.8, 0.1, 0.0, 0.0, 0.0],
[0.8, 1.0, 0.2, 0.0, 0.0, 0.0],
[0.1, 0.2, 1.0, 0.9, 0.4, 0.0],
[0.0, 0.0, 0.9, 1.0, 0.5, 0.2],
[0.0, 0.0, 0.4, 0.5, 1.0, 0.9],
[0.0, 0.0, 0.0, 0.2, 0.9, 1.0],
]
)
affinity = (affinity + affinity.T) / 2 # ensure symmetry
n_landmark = 3
random_state = 42

G = graphtools.Graph(
affinity,
precomputed="affinity",
n_landmark=n_landmark,
random_landmarking=True,
random_state=random_state,
knn=3,
thresh=0,
)

# Trigger landmark construction
_ = G.landmark_op

rng = np.random.default_rng(random_state)
landmark_indices = rng.choice(affinity.shape[0], n_landmark, replace=False)
expected_clusters = np.asarray(
G.kernel[:, landmark_indices].argmax(axis=1)
).reshape(-1)

assert np.array_equal(G.clusters, expected_clusters)
assert G.transitions.shape == (affinity.shape[0], n_landmark)
assert G.landmark_op.shape == (n_landmark, n_landmark)

Copy link

Copilot AI Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a test case for the edge case where some samples have zero affinity to all randomly selected landmarks. This would test the warning logic implemented in lines 1223-1228 of graphs.py, which is currently not covered by these tests.

Copilot uses AI. Check for mistakes.
precomputed = getattr(self, "precomputed", None)

if precomputed is not None:
# Use the precomputed affinities/distances directly to avoid Euclidean fallback
Copy link

Copilot AI Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment could be clearer. It says "Use the precomputed affinities/distances directly" but the code actually uses self.kernel, which is always an affinity matrix (distances are converted to affinities in build_kernel). Consider updating to "Use affinities from the kernel computed from the precomputed matrix" for clarity.

Suggested change
# Use the precomputed affinities/distances directly to avoid Euclidean fallback
# Use affinities from the kernel computed from the precomputed matrix to avoid Euclidean fallback

Copilot uses AI. Check for mistakes.
@MattScicluna
Copy link
Contributor Author

@copilot open a new pull request to apply changes based on the comments in this thread

Copy link

Copilot AI commented Dec 18, 2025

@MattScicluna I've opened a new pull request, #89, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits December 18, 2025 19:35
…prove comment clarity

Co-authored-by: MattScicluna <19255250+MattScicluna@users.noreply.github.com>
Add sparse matrix and edge case test coverage for random landmarking with precomputed matrices
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Random landmarking incompatible with precomputed affinities

3 participants