-
Notifications
You must be signed in to change notification settings - Fork 226
Fix n+1 select issues in various views, significantly improving performance #801
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix n+1 select issues in various views, significantly improving performance #801
Conversation
Fix most of the n+1 select issues with the view. TaggableManager.names seems to issue a new select, even when prefetched, so use a manual for loop instead.
All of the n+1 selects appear to be fixed, leading to a massive performance boost. Rather than using `has_access`/`user_can_view` in templates, annotate the domains/servers/etc with a `current_user_can_view` field that checks if the object's project is in the set of projects that the user is allowed to access. This is done as part of fetching all objects, avoiding the n+1 select issue, while still using the `Project.for_user` function and thus avoids duplicating the permissions logic.
Applies similar prefetching and tag expansion to the project detail page, reducing the number of queries and increasing performance.
Prefetch and use tags similar to previous commits.
Prefetch and use tags similar to previous commits.
Prefetch server_provider, tags, and aux server addresses, and also alter the `get_primary_address` tag to use the (now prefetched) field on the object rather than having it run its own query.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #801 +/- ##
=======================================
Coverage 91.43% 91.43%
=======================================
Files 368 368
Lines 20924 20941 +17
=======================================
+ Hits 19131 19148 +17
Misses 1793 1793 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR addresses critical N+1 query performance issues across multiple views by adding select_related and prefetch_related to Django querysets. The changes significantly reduce database queries and improve page load times (e.g., client detail pages went from ~25 seconds to <1 second).
Changes:
- Added
prefetch_related("tags")to domain, server, finding, and observation queries to bulk-load tags - Modified templates to use
tag.nameinstead oftags.names()to leverage prefetched data - Added annotated
current_user_can_viewfields to client detail queries to check permissions in-database - Added
select_relatedfor related models across multiple views to reduce joins
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| ghostwriter/shepherd/views.py | Added tag and auxiliary address prefetching for domain/server autocomplete queries |
| ghostwriter/shepherd/templates/shepherd/domain_list.html | Changed tag rendering to use prefetched data |
| ghostwriter/rolodex/views.py | Added permission annotations and extensive prefetching for client detail and project list views |
| ghostwriter/rolodex/tests/test_views.py | Added test data and assertions for access control verification |
| ghostwriter/rolodex/templatetags/determine_primary.py | Updated to use prefetched auxiliary addresses |
| ghostwriter/rolodex/templates/rolodex/project_list.html | Changed tag rendering to use prefetched data in autocomplete |
| ghostwriter/rolodex/templates/rolodex/client_detail.html | Updated to use annotated permission fields |
| ghostwriter/reporting/views2/report.py | Added tag prefetching for findings and observations |
| ghostwriter/reporting/views2/observations.py | Added tag prefetching to observation queryset |
| ghostwriter/reporting/views2/finding.py | Added tag prefetching and reorganized query ordering |
| ghostwriter/reporting/templates/reporting/report_detail.html | Changed tag rendering to use prefetched data |
| ghostwriter/reporting/templates/reporting/observation_list.html | Changed tag rendering to use prefetched data |
| ghostwriter/reporting/templates/reporting/finding_list.html | Changed tag rendering to use prefetched data |
| ghostwriter/oplog/views.py | Added select_related for project relationships in oplog list |
Comments suppressed due to low confidence (1)
ghostwriter/rolodex/tests/test_views.py:1
- Corrected spelling of 'assigend' to 'assigned'.
# Standard Libraries
chrismaddalena
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good. I didn't notice any issues, and the new tests cover the changes to the RBAC (can_access filter moved to current_user_can_view annotations).
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
An "N+1 Select" issue is a common issue in ORMs where after fetching a set of objects from a database, the set is iterated over and related objects are fetched one-at-a-time, resulting in many additional selects where a join would be more performant. The round-trip overhead of issuing these selects quickly adds up, leading to slow page rendering.
Django will fetch related objects lazily when accessing the related field on the ORM object, which is convenient, but also makes it easy to accidentally trigger this issue when accessing the field in a loop - for example, when rendering lists of items and including info about their projects and tags. This has turned out to be a major performance problem on SpecterOps' own GW instance, which is recently exceeding the nginx timeout and leading to errors.
Thankfully, Django offers a solution - the
select_relatedandprefetch_relatedmethods on querysets will fetch the related fields, either as part of the query using a JOIN (select_related) or by running a few extra queries to bulk load the data (prefetch_related), which is much more performant.These patches add calls to those methods on appropriate queries throughout several Ghostwriter views and applies some other related tweaks, significantly reducing the amount of queries and improving page load times, especially on pages with a lot of data. For example, loading the detail page of one of our clients went from taking roughly 25 seconds to less than a second.
Most of the patches add the above method calls to queries, with two notable exceptions:
names()method does not make use of the prefetched data and will always result in another query. Such instances have been changed to iterate through the tags and accesstag.nameinstead, which uses the prefetched data. I have filed a bug with django-taggit:TaggableManager.names()does not use the prefetch cache jazzband/django-taggit#936.annotate'd field on the objects, which tests if the object's project is in the set of projects that the user can access as part of the main query. This is a bit technical, but allows us to check access to the objects without needing to duplicate the access control logic.This PR isn't comprehensive, and there are likely other similar issues to find in other pages. However, it does optimize most of the major views, that are causing issues in our production instance.