Conversation
heyronhay
left a comment
There was a problem hiding this comment.
Couple of suggestions!
datadog_sync/utils/resource_utils.py
Outdated
| sorter.prepare() | ||
| # If prepare() succeeds, no cycles | ||
| return None | ||
| except Exception: |
There was a problem hiding this comment.
Seems like an overly broad catch, is there a more specific error we can catch here?
There was a problem hiding this comment.
Good call, fixed to pull in specifically the CycleError
| await r_class._send_action_metrics("delete", _id, Status.FAILURE.value) | ||
| self.config.logger.error(f"error deleting resource {resource_type} with id {_id}: {str(e)}") | ||
| finally: | ||
| # Mark as done in cleanup sorter if it exists |
There was a problem hiding this comment.
Do we need to test if it was actually deleted successfully first?
There was a problem hiding this comment.
Ideally yes, but I think that's a bigger restructure. Calling done(q_item) frees up the sorter so it doesn't hang forever. If we have a dashboard->monitor->slo dependency chain and the dashboard fails to get deleted then the calls to delete the monitor and slo will still happen (and will both fail). That's no different from today. To actively skip those other delete calls we need to start tracking every dependency chain I think. It would be better of course, just a bigger change.
What does this PR do?
This fix is to delete items in the correct order. We've always created them in the correct order, for example if a Dashboard depends on an SLO being there we create the SLO before creating the Dashboard. The opposite wasn't true though, we didn't always delete the Dashboard before deleting the SLO.
Description of the Change
Topological sort for deleting things.