48 changes: 48 additions & 0 deletions supabase/migrations/20251021110000_task_ticket_vector_search.sql
@@ -0,0 +1,48 @@
CREATE OR REPLACE FUNCTION get_similar_tasks(
query_embedding vector(768),
match_count INT DEFAULT 3
)
RETURNS TABLE (
task_id UUID,
title TEXT,
description TEXT,
similarity FLOAT
) AS $$
BEGIN
RETURN QUERY
SELECT
t.id AS task_id,
t.title,
t.description,
1 - (t.description_embedding <=> query_embedding) AS similarity
FROM tasks t
WHERE t.description_embedding IS NOT NULL
ORDER BY t.description_embedding <=> query_embedding
Comment on lines +17 to +20
Contributor

@coderabbitai coderabbitai bot Feb 13, 2026

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the migration file
find . -name "*20251021110000_task_ticket_vector_search.sql" -type f

Repository: AOSSIE-Org/Ell-ena

Length of output: 128


🏁 Script executed:

#!/bin/bash
# Check all SQL files for vector index creation patterns
rg -n 'hnsw|ivfflat|vector_cosine_ops|vector_l2_ops' --type sql

Repository: AOSSIE-Org/Ell-ena

Length of output: 44


🏁 Script executed:

#!/bin/bash
# Also search for CREATE INDEX with vector patterns
rg -n 'CREATE INDEX.*embedding|CREATE INDEX.*vector' --type sql -i

Repository: AOSSIE-Org/Ell-ena

Length of output: 44


🏁 Script executed:

#!/bin/bash
# List all migration files to understand the structure
find . -path "*/migrations/*.sql" -type f | sort

Repository: AOSSIE-Org/Ell-ena

Length of output: 743


🏁 Script executed:

#!/bin/bash
# Read the migration file to see its full content
cat -n ./supabase/migrations/20251021110000_task_ticket_vector_search.sql

Repository: AOSSIE-Org/Ell-ena

Length of output: 1518


🏁 Script executed:

#!/bin/bash
# Also check the meeting vector search migration for comparison
cat -n ./supabase/migrations/20251021090000_meeting_vector_search.sql

Repository: AOSSIE-Org/Ell-ena

Length of output: 4913


Add vector indexes for efficient similarity queries.

The <=> operator in both ORDER BY clauses will fall back to sequential scans without indexes. Add HNSW indexes to enable efficient vector search:

CREATE INDEX IF NOT EXISTS idx_tasks_embedding
  ON tasks USING hnsw (description_embedding vector_cosine_ops);

CREATE INDEX IF NOT EXISTS idx_tickets_embedding
  ON tickets USING hnsw (description_embedding vector_cosine_ops);

This is critical once the tables grow beyond a few thousand rows. If PR1 creates these indexes separately, this reminder can be disregarded.
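To check that an index like the ones suggested above is actually picked up, one option is to inspect the query plan. The following is a rough sketch, not part of this PR. It assumes a pgvector version with HNSW support is installed, spells out `m` and `ef_construction` at what pgvector documents as their defaults, and uses an arbitrary stored embedding as a stand-in query vector.

```sql
-- Assumption: pgvector with HNSW support is available in this database.
-- m and ef_construction are written out explicitly at pgvector's documented
-- defaults; tune them only after measuring recall and build time.
CREATE INDEX IF NOT EXISTS idx_tasks_embedding
  ON tasks USING hnsw (description_embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

-- Optional, session-level: a larger candidate list improves recall
-- at the cost of query speed.
SET hnsw.ef_search = 100;

-- Inspect the plan. The scalar subquery borrows any stored embedding
-- so the example does not need a hand-written 768-dimension literal.
EXPLAIN ANALYZE
SELECT t.id
FROM tasks t
WHERE t.description_embedding IS NOT NULL
ORDER BY t.description_embedding <=> (
  SELECT description_embedding
  FROM tasks
  WHERE description_embedding IS NOT NULL
  LIMIT 1
)
LIMIT 3;
```

The operator class has to match the operator used in `ORDER BY`: `vector_cosine_ops` pairs with `<=>`. On a very small table the planner may still choose a sequential scan even when the index exists, so a plan without an index scan is not necessarily a misconfiguration.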

Also applies to: 42-45

🤖 Prompt for AI Agents
In `@supabase/migrations/20251021110000_task_ticket_vector_search.sql` around
lines 17 - 20, Add HNSW vector indexes so the <=> similarity ORDER BY on
description_embedding uses an index: create IF NOT EXISTS idx_tasks_embedding on
tasks using hnsw for column description_embedding with vector_cosine_ops, and
create IF NOT EXISTS idx_tickets_embedding on tickets similarly for
description_embedding; ensure the new indexes are applied before running
similarity queries (also add the same index when you see other queries ordering
by description_embedding <=> query_embedding).

Author

I left indexing out for now on purpose because I wanted to keep this PR focused on adding the similarity functions for tasks and tickets without expanding the scope too much. I can open another PR for indexing as the tables grow.

Contributor

@dolliecoder, that makes sense! Keeping the PR focused is a good approach.

Would you like me to create a follow-up issue to track adding the HNSW indexes for tasks.description_embedding and tickets.description_embedding? This way it won't be forgotten as the tables grow.

LIMIT match_count;
END;
$$ LANGUAGE plpgsql;


CREATE OR REPLACE FUNCTION get_similar_tickets(
query_embedding vector(768),
match_count INT DEFAULT 3
)
RETURNS TABLE (
ticket_id UUID,
title TEXT,
description TEXT,
similarity FLOAT
) AS $$
BEGIN
RETURN QUERY
SELECT
tk.id AS ticket_id,
tk.title,
tk.description,
1 - (tk.description_embedding <=> query_embedding) AS similarity
FROM tickets tk
WHERE tk.description_embedding IS NOT NULL
ORDER BY tk.description_embedding <=> query_embedding
LIMIT match_count;
END;
$$ LANGUAGE plpgsql;
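
For reference, the two functions above could be called roughly as follows. This is a hedged usage sketch: the UUID is a placeholder rather than a real row, and it assumes `tasks.id` and `tickets.id` are UUID columns, as the `RETURNS TABLE` definitions suggest.

```sql
-- Hypothetical usage: find the 5 tasks most similar to an existing task,
-- using that task's stored embedding as the query vector.
-- The UUID below is a placeholder, not a real row.
SELECT s.task_id, s.title, s.similarity
FROM get_similar_tasks(
  (SELECT description_embedding
   FROM tasks
   WHERE id = '00000000-0000-0000-0000-000000000000'),
  5
) AS s;

-- Same idea for tickets, relying on the default match_count of 3.
SELECT s.ticket_id, s.title, s.similarity
FROM get_similar_tickets(
  (SELECT description_embedding
   FROM tickets
   WHERE id = '00000000-0000-0000-0000-000000000000')
) AS s;
```

Because `<=>` is cosine distance, `similarity` is `1 - distance` and ranges from -1 to 1, with values near 1 meaning the closest match. When the query vector comes from a stored row, that row will normally appear in its own results with a similarity close to 1, so callers may want to filter it out.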