implicitly annotate subject showup

Closed Bengfort requested to merge refactor-annotate-property into main

This is the final bit of !1533 (closed) that is still unmerged, and probably the most controversial.

Django uses an object-relational mapper (ORM), so it tries to unify the interface for python objects and database tables. This works quite well in many cases. One area where this works less well is computed attributes: On the database side, there is annotate(). On the python side there is @property. (There is also @cached_property and when to use which of these two is a topic on its own).

  • A @property is easy to write and understand and should therefore be used by default
  • If you want to efficiently filter a list by computed attributes you need to use annotate()
  • annotate() can sometime have massive performance benefits for lists with complex computed attributes
  • It is possible to annotate a model's default queryset by overwriting Manager.get_queryset(). However, complex annotations should only be called on demand.
  • Annotations are not always easy to add, e.g. for related objects (e.g. participation.subject).
  • Instances that have not yet been saved to the database cannot have annotations. So even if we overwrite Manager.get_queryset() we can still not rely on the computed attributes.

In summary:

  • If you want to do efficient database queries: use annotate()
  • If you want to access the value from everywhere without much up-front work: use @property

The issue is that sometimes we want both, but we do not want to duplicate the logic.

My solution to this problem is a small function annotate_instance() that allows to easily define properties from annotations:

def __getattr__(self, key):
    if key == 'showup_score':
        values = annotate_instance(self, Subject.objects.annotate_showup_score())
        return values[key]
    raise AttributeError(key)

__getattr__() is only executed when someone tries to access an attribute that does not exist. annotate_instance sets the computed attributes on the instance, so after calling it once the attribute now exists and it is not called a second time. On the other hand, if the attribute is never accessed, no database query is executed. (This is very similar to how @cached_property works.)

The downside is that this used a lot of complicated python magic. Many developers probably have never heard of __getattr__() and __dict__.

Coming back to the motivation: In most cases we are completely fine using either annotate() or @property. I am not sure if the few cases where we need both justify this code.

Edited by Bengfort

Merge request reports