1 change: 1 addition & 0 deletions changelog.d/19306.misc
@@ -0,0 +1 @@
Prune stale entries from `sliding_sync_connection_required_state` table.
38 changes: 38 additions & 0 deletions synapse/storage/databases/main/sliding_sync.py
@@ -450,6 +450,9 @@ def _get_and_clear_connection_positions_txn(

# Now that we have seen the client has received and used the connection
# position, we can delete all the other connection positions.
#
# Note: the rest of the code here assumes this is the only remaining
# connection position.
sql = """
DELETE FROM sliding_sync_connection_positions
WHERE connection_key = ? AND connection_position != ?
@@ -515,6 +518,41 @@ def _get_and_clear_connection_positions_txn(
required_state_map=required_state_map[required_state_id],
)

# Clean up any required state IDs that are no longer used by any
# connection position on this connection.
#
# We store the required state config per-connection per-room. Since this
# can be a lot of data, we deduplicate the required state JSON and store
# it separately, with multiple rooms referencing the same required state
# ID. Over time as the required state configs change, some required
# state IDs may no longer be referenced by any room config, so we need
# to clean them up.
#
# We do this by noting that we have pulled out *all* rows from
# `sliding_sync_connection_required_state` for this connection above. We
# have also pulled out all referenced required state IDs for *this*
# connection position, which is the only connection position that
# remains (we deleted the others above).
#
# Thus we can compute the unused required state IDs by looking for any
# required state IDs that are not referenced by the remaining connection
# position.
used_required_state_ids = {
    required_state_id for _, _, required_state_id in room_config_rows
}

unused_required_state_ids = required_state_map.keys() - used_required_state_ids
Review comment (Contributor):
I think this is wrong 🤔

`required_state_map.keys()` will be things like the state event types requested.

`used_required_state_ids` is a set of `required_state_id` values, each of which is an auto-incrementing number.
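
To make the question concrete, here is a minimal sketch of the set difference under one assumption: that `required_state_map` is keyed by the integer `required_state_id`. The excerpt does not show how the map is built, so that keying is an assumption, and all sample data below is hypothetical. If the keys were event types instead, every key would land in the "unused" set and the delete would not match the intended rows.

```python
# Hypothetical data. ASSUMPTION: required_state_map is keyed by
# required_state_id; the diff excerpt does not confirm this.
required_state_map: dict[int, dict[str, set[str]]] = {
    1: {"m.room.member": {"$LAZY"}},
    2: {"m.room.name": {""}},
    3: {"m.room.topic": {""}},
}

# Rows for the one remaining connection position, shaped like the
# three-tuples unpacked in the diff: (room_id, room_status, required_state_id).
room_config_rows = [
    ("!a:example.org", "live", 1),
    ("!b:example.org", "live", 3),
]

used_required_state_ids = {
    required_state_id for _, _, required_state_id in room_config_rows
}

# dict.keys() returns a set-like view, so `-` produces a plain set.
unused_required_state_ids = required_state_map.keys() - used_required_state_ids
```

Under that assumption only ID 2 is unreferenced, so it is the only row the delete would target.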

if unused_required_state_ids:
    self.db_pool.simple_delete_many_batch_txn(
        txn,
        table="sliding_sync_connection_required_state",
        keys=("connection_key", "required_state_id"),
        values=[
            (connection_key, required_state_id)
            for required_state_id in unused_required_state_ids
        ],
    )
Review comment (Contributor):
Probably could use a test to sanity check the database looks fine after this runs
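
One possible shape for such a sanity check, sketched against an in-memory SQLite database rather than Synapse's own test harness. The table schema here is a simplified guess, and `prune_unused_required_state` is a hypothetical stand-in for the pruning step in the diff, not the function under review.

```python
import sqlite3


def prune_unused_required_state(conn, connection_key, used_ids):
    """Hypothetical helper: delete this connection's required-state rows
    whose IDs are not in `used_ids`. Returns the set of deleted IDs."""
    rows = conn.execute(
        "SELECT required_state_id FROM sliding_sync_connection_required_state"
        " WHERE connection_key = ?",
        (connection_key,),
    ).fetchall()
    unused = {row[0] for row in rows} - set(used_ids)
    conn.executemany(
        "DELETE FROM sliding_sync_connection_required_state"
        " WHERE connection_key = ? AND required_state_id = ?",
        [(connection_key, rid) for rid in sorted(unused)],
    )
    return unused


conn = sqlite3.connect(":memory:")
# ASSUMPTION: simplified schema for illustration only.
conn.execute(
    "CREATE TABLE sliding_sync_connection_required_state ("
    " required_state_id INTEGER PRIMARY KEY,"
    " connection_key INTEGER NOT NULL,"
    " required_state TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO sliding_sync_connection_required_state VALUES (?, ?, ?)",
    [(1, 10, "{}"), (2, 10, "{}"), (3, 10, "{}"), (4, 11, "{}")],
)

# Connection 10 still references IDs 1 and 3; ID 2 is stale.
deleted = prune_unused_required_state(conn, connection_key=10, used_ids={1, 3})

remaining = conn.execute(
    "SELECT required_state_id, connection_key"
    " FROM sliding_sync_connection_required_state"
    " ORDER BY required_state_id"
).fetchall()
```

A real test would check the same two things: the stale row for the connection is gone, and rows belonging to other connections (here, ID 4 on connection 11) are untouched.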


# Now look up the per-room stream data.
rooms: dict[str, HaveSentRoom[str]] = {}
receipts: dict[str, HaveSentRoom[str]] = {}