-
Notifications
You must be signed in to change notification settings - Fork 31
INTPYTHON-698 Allow defining embedded model indexes on the top-level model #376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
INTPYTHON-698 Allow defining embedded model indexes on the top-level model #376
Conversation
998a080 to
5dd116b
Compare
django_mongodb_backend/schema.py
Outdated
| # Remove the top level indexes. | ||
| # TODO: Find a workaround | ||
| for index in model._meta.indexes: | ||
| if any( | ||
| field_name.startswith(f"{field.column}{LOOKUP_SEP}") | ||
| for field_name in index.fields | ||
| ): | ||
| self.remove_index(model, index) | ||
| for constraint in model._meta.constraints: | ||
| if any( | ||
| field_name.startswith(f"{field.column}{LOOKUP_SEP}") | ||
| for field_name in constraint.fields | ||
| ): | ||
| self.get_collection(model._meta.db_table).drop_index(constraint.name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think this is unnecessary since the Index/Constraint would have to be removed from the model's Meta.indexes/constraints before removing the field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 Oki, I will try to move the logic into _remove_model_indexes and _remove_field_unique. The downfall is I have to pass all the indexes from its parents. Like concatenating all way down the recursion. The index could have been created in anywhere of the hierarchy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's simpler than you describe. Example:
class Book(models.Model):
author = EmbeddedModelField(Author)
class Meta:
app_label = "schema_"
indexes = [
models.Index(fields=["author__indexed_two"]),
models.Index(fields=["author__address__indexed_one"]),
]I removing the an index, the generated migration operation is:
migrations.RemoveIndex(
model_name='book',
name='embed_index_author._f84329_idx',
),The logic you describe isn't needed.
django_mongodb_backend/lookups.py
Outdated
| class Options(base.Options): | ||
| def get_field(self, field_name): | ||
| if LOOKUP_SEP in field_name: | ||
| previous = self | ||
| keys = field_name.split(LOOKUP_SEP) | ||
| path = [] | ||
| for field in keys: | ||
| field = base.Options.get_field(previous, field) | ||
| if isinstance(field, EmbeddedModelField): | ||
| previous = field.embedded_model._meta | ||
| else: | ||
| previous = field | ||
| path.append(field.column) | ||
| column = ".".join(path) | ||
| embedded_column = field.clone() | ||
| embedded_column.column = column | ||
| return embedded_column | ||
| return super().get_field(field_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Monkey patching at such a low level seems risky, though I haven't much about what could go wrong. Instead I imagined an Index subclass with this sort of logic. Did you consider it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 Maybe is feasible if we create something like EMFIndex. I didn't considered yet. I could take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmh this way (monkey patching) will not work, I forgot that foreign field, lookups and many other things use this. So isn't very straightforward. I have to think another way to solve this
3e722d1 to
2010776
Compare
86aed5e to
9636c17
Compare
9636c17 to
e1c98f3
Compare
Jibola
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks good to me!
Only missing component is documentation.
django_mongodb_backend/indexes.py
Outdated
| query = Query(model=model, alias_cols=False) | ||
| compiler = query.get_compiler(connection=schema_editor.connection) | ||
| for expression in self.expressions: | ||
| query = Query(model=model, alias_cols=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to reinstantiate this? Does expression.resolve_expression(query) mutate query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I moved the query outside the loop, and forgot to remove this one.
e1c98f3 to
eb29e72
Compare
|
Concurrent with the release of this, will we stop support unique constraints and indexes declared on embedded models and their fields? Or perhaps we should try to put that through a deprecation process? |
|
Probably should be deprecated in case anyone is using them. |
We should put this in a deprecation process. For the first bump where this change occurs, we'll let them know that the deprecation is coming. |
eb29e72 to
c0707cc
Compare
d40c24a to
9c56be4
Compare
42d9062 to
4ad1c49
Compare
1c28e45 to
d9c04dc
Compare
This section needs to be rewritten in the PR description. The constraints/indexes for nested fields need to be defined within the EmbeddedModel. Right now this "before" looks like a better system than the AFTER. 😭
|
|
Potentially out of scope for this PR, but could we introduce an indexing specific expression subclass that just converts dot notation to class MongoDBIndexExpression(F):
def __init__(name, *args, **kwargs):
super().(name.replace(".", "__"), *args, **kwargs) |
Yes, I’ll rewrite it. |
| def index_expression(self, compiler, connection, as_expr=False): # noqa: ARG001 | ||
| result = [] | ||
| for expr in self.get_source_expressions(): | ||
| if expr is None: | ||
| continue | ||
| for sub_expr in expr.get_source_expressions(): | ||
| try: | ||
| result.append(sub_expr.as_mql(compiler, connection)) | ||
| except FullResultSet: | ||
| result.append(Value(True).as_mql(compiler, connection)) | ||
| return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me this should be in indexes.py rather than the expressions module?
| def _unique_supported( | ||
| self, | ||
| condition=None, | ||
| deferrable=None, | ||
| include=None, | ||
| expressions=None, | ||
| nulls_distinct=None, | ||
| ): | ||
| return ( | ||
| (not condition or self.connection.features.supports_partial_indexes) | ||
| and (not deferrable or self.connection.features.supports_deferrable_unique_constraints) | ||
| and (not include or self.connection.features.supports_covering_indexes) | ||
| and ( | ||
| not expressions | ||
| or self.connection.features.supports_expression_indexes | ||
| or self._check_expression_indexes_applicable(expressions) | ||
| ) | ||
| and ( | ||
| nulls_distinct is None | ||
| or self.connection.features.supports_nulls_distinct_unique_constraints | ||
| ) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This copies much from the base class. It needs a comment explaining the method is is overridden. I guess it's about _check_expression_indexes_applicable.
| Indexes from Expressions | ||
| ======================== | ||
|
|
||
| Django MongoDB Backend now supports creating indexes from expressions. | ||
| Currently, only ``F()`` expressions are supported, which allows referencing | ||
| fields from the top-level model inside embedded fields. | ||
|
|
||
| Example:: | ||
|
|
||
| from django.db import models | ||
| from django.db.models import F | ||
|
|
||
| class Author(models.EmbeddedModel): | ||
| name = models.CharField() | ||
|
|
||
| class Book(models.Model): | ||
| author = models.EmbeddedField(Author) | ||
|
|
||
| class Meta: | ||
| indexes = [ | ||
| models.Index(F("author__name")), | ||
| ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of this document, probably "Indexing embedded models" should be part of that topic guide.
| - Added support for creating indexes from expressions. | ||
| Currently, only ``F()`` expressions are supported to reference top-level | ||
| model fields inside embedded models. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should say something like "You can now index embedded model fields by adding an index to the parent model." and link to the new documentation.
| class Book(models.Model): | ||
| author = EmbeddedModelField(Author) | ||
|
|
||
| class Meta: | ||
| app_label = "schema_" | ||
| indexes = [ | ||
| models.Index(F("author__indexed_two").asc(), name="indexed_two"), | ||
| models.Index(F("author__address__indexed_one").asc(), name="indexed_one"), | ||
| ] | ||
|
|
||
| new_field = EmbeddedModelField(Author) | ||
| new_field.set_attributes_from_name("author") | ||
|
|
||
| with connection.schema_editor() as editor: | ||
| # Create the table and add the field. | ||
| editor.create_model(Book) | ||
| editor.add_field(Book, new_field) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the test you copied this from needs to be fixed, but this logic isn't correct. You started with Book that already has an author field, then added the field again. Anyway, this doesn't simulate how the migration operations will be generated. When you add the author field like this, you'll also have AddIndex operations, so really I think there's nothing additional to test. Unlike the case that you copied this test from, you haven't added any logic to SchemaEditor that requires testing. I'll try to provide some more guidance about this later.
| def _unique_supported( | ||
| self, | ||
| condition=None, | ||
| deferrable=None, | ||
| include=None, | ||
| expressions=None, | ||
| nulls_distinct=None, | ||
| ): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't analyze this complete, but it could be problematic to override this without also overriding Index.check. This method will silently ignore unsupported indexes. The system check framework is what gives the user a warning that the index they declared isn't supported.
NOW:
Allow simple expressions to support top-level index model.
BEFORE:
Indexes on
EmbeddedModelField(EMF) are defined using dot notation:Notes on Django checks
Django enforces models.E016, which requires all
UniqueConstraintandIndexdefinitions to reference an existing local field.Migration output
The generated migrations expand embedded models into flat fields using dot-notation:
Current issue
Because embedded fields are materialized both inside the main model and within the embedded model itself, columns are projected twice. For example:
This duplication means fields can be referenced redundantly. A side effect is that they are still reachable through
F("embedded.something"), but projections end up with repeated columns.