castellum merge requests
https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests

**Conflicting studies charts** (Bengfort, 2021-11-01)
https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests/2110

Same as !2101, but with a different UI:

![2021-11-01_17-23-47](/uploads/a4866b70d11815e0a6c80437f1d6131d/2021-11-01_17-23-47.png)

The UI is very much inspired by the execution progress view we added in !1865. Our hope was that setting `min_subject_count`, potential subjects, conflicting studies, and `CASTELLUM_EXPECTED_SUBJECT_FACTOR` in relation makes them clearer.
I think this is somewhat better, but still far from understandable.

**avoid session-scoped fixtures** (Bengfort, 2021-10-20)
https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests/2090

In https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests/2089 I had some issues with session-scoped fixtures. So I wanted to check how much of a performance gain we get from them and whether it is worth the additional complexity.
I started with `groups` and `study_groups` because they were relatively easy to do. The result (averaged over 10 runs):
- session-scoped: 32.793 s
- not session-scoped: 38.470 s
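The trade-off can be sketched without pytest. This is purely illustrative: the real session-scoped fixtures create `groups` and `study_groups` rows in the test database, while here the expensive setup is replaced by a counter so the effect of the scope is visible:

```python
created = {"count": 0}

def make_groups():
    # stands in for the expensive database setup done by the real fixture
    created["count"] += 1
    return ["grp-a", "grp-b"]

def run(tests, session_scoped):
    """Run tests, rebuilding the fixture per test unless session-scoped."""
    created["count"] = 0
    shared = make_groups() if session_scoped else None
    for test in tests:
        test(shared if session_scoped else make_groups())

tests = [lambda groups: None] * 10

run(tests, session_scoped=True)
print(created["count"])   # 1: one setup shared by the whole "session"

run(tests, session_scoped=False)
print(created["count"])   # 10: a fresh setup per test
```

The flip side, which the sketch also shows, is that all tests in the session-scoped run share the same mutable objects.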
So the not-session-scoped version takes 17 % longer (38.470 s / 32.793 s ≈ 1.17). IMHO that is enough to justify the complexity.

**Try nplusone to detect inefficient DB queries** (Bengfort, 2021-04-06)
https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests/1841

[nplusone](https://github.com/jmcarp/nplusone/) is a library that is supposed to help find inefficient DB queries. I spent some time trying it. My conclusion is that it is not all that helpful. I like the general approach. However, it generates many false positives and it's hard to see what would have to be changed to fix the warnings. Also, fixing all warnings would be premature optimization.

**bootstrap 5** (Bengfort, 2021-12-15)
https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests/1719

I tried [bootstrap 5](https://getbootstrap.com/docs/5.0/migration/) for a test run. There are quite some changes:
- Slightly more color contrast
- More aggressive styling for checkbox, radio, and selects
- More focus on utility classes (e.g. no more `btn-block`, `form-group`, `badge-secondary`, `form-row`)
- Use "start"/"end" rather than "left"/"right" in utility classes (e.g. `me-2`)
- `bs-` prefix for CSS variables and JavaScript hooks
- JavaScript no longer depends on jQuery
Additionally, we need to wait for an update of [django-bootstrap4](https://github.com/zostera/django-bootstrap4/issues/224) or switch to my replacement (see also !1406).

**implicitly annotate subject showup** (Bengfort, 2020-09-29)
https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests/1546

This is the final bit of !1533 that is still unmerged, and probably the most controversial.
Django uses an object-relational mapper (ORM), so it tries to unify the interface for python objects and database tables. This works quite well in many cases. One area where it works less well is computed attributes: On the database side, there is `annotate()`. On the python side there is `@property`. (There is also `@cached_property`; when to use which of the two is a topic of its own.)
- A `@property` is easy to write and understand and should therefore be used by default
- If you want to efficiently filter a list by computed attributes you need to use `annotate()`
- `annotate()` can sometimes have massive performance benefits for lists with complex computed attributes
- It is possible to annotate a model's default queryset by overriding `Manager.get_queryset()`. However, complex annotations should only be computed on demand.
- Annotations are not always easy to add for related objects (e.g. `participation.subject`).
- Instances that have not yet been saved to the database cannot have annotations. So even if we override `Manager.get_queryset()` we still cannot rely on the computed attributes.
In summary:
- If you want to do efficient database queries: use `annotate()`
- If you want to access the value from everywhere without much up-front work: use `@property`
The issue is that sometimes we want both, but we do not want to duplicate the logic.
My solution to this problem is a small function `annotate_instance()` that makes it easy to define properties from annotations:
```python
def __getattr__(self, key):
    # only called when normal attribute lookup fails
    if key == 'showup_score':
        # fetch the annotated values from the database and cache
        # them on the instance
        values = annotate_instance(self, Subject.objects.annotate_showup_score())
        return values[key]
    raise AttributeError(key)
```
[`__getattr__()`](https://docs.python.org/3/reference/datamodel.html#object.__getattr__) is only executed when someone tries to access an attribute that does not exist. `annotate_instance` sets the computed attributes on the instance, so after it has been called once the attribute exists and `__getattr__()` is not triggered a second time. On the other hand, if the attribute is never accessed, no database query is executed. (This is very similar to how `@cached_property` works.)
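The lazy-caching behaviour can be illustrated without Django. This is a minimal sketch, not the actual castellum code; `_compute_showup_score()` merely stands in for the annotation query:

```python
class Subject:
    def _compute_showup_score(self):
        # stands in for the database annotation query
        print("computing...")
        return 0.75

    def __getattr__(self, key):
        # only runs when normal attribute lookup fails
        if key == 'showup_score':
            value = self._compute_showup_score()
            # cache on the instance: later lookups find the value in
            # __dict__ and never reach __getattr__ again
            self.__dict__[key] = value
            return value
        raise AttributeError(key)

s = Subject()
print(s.showup_score)  # prints "computing..." once, then 0.75
print(s.showup_score)  # 0.75 only; no second computation
```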
The downside is that this uses a lot of complicated python magic. Many developers probably have never heard of `__getattr__()` and `__dict__`.
Coming back to the motivation: In most cases we are completely fine using either `annotate()` or `@property`. I am not sure if the few cases where we need both justify this code.

**1766 showup in list** (Bengfort, 2020-09-16)
https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests/1533

This was supposed to be a simple template change to display showup in the execution list. However, I fell down the annotate/property rabbit hole.
The simple way to add a computed field to a model is by defining a `@property`. This works well and is easy to understand. However, it becomes a performance issue on lists if the property makes additional database requests.
The better option in that case is to calculate the field in the database using `.annotate()`. That is more complicated but can improve performance a lot.
The big trouble with `.annotate()` is that it is not available everywhere, notably on instances that have not yet been saved to the database.
Another issue is that you have to decide: Either you add the annotation to a model's default manager which makes every single query more complicated, or you do the annotation on a case-by-case basis, which makes them unavailable in many situations.
The holy grail would be to combine both approaches: to have a single definition that works and is efficient in all cases.
This MR could bring us a step closer to that goal: It adds the `annotate_instance()` helper that will get annotations from the database and add them to an existing instance, but only if they do not exist yet. It is based on [this stackoverflow answer](https://stackoverflow.com/a/59060833).
This helper is then used to efficiently render showup both on single subjects as well as on a pre-annotated list of subjects.
~~**Scratch everything I wrote above.** The list we need to show the showup in is not actually a list of subjects, but a list of participations. So the issue is that we cannot annotate a `.select_related()` query -- at least I don't know how.~~ [`.prefetch_related()` did the trick]
So now the big question is: Do we want this?
- Merge like this
- Remove the optimization and just use `@property` instead
- Do not show showup in the list at all

**Add script to export current attribute descriptions as JSONSchema** (Bengfort, 2020-09-08)
https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests/1521

This is a script that converts attribute descriptions to JSONSchema as used in Leipzig. It includes two different modes:
- Default mode is compatible with the format used in huscy
- Raw mode is compatible with the format used in castellum
There are two notable things missing from this representation:
- Translations (for labels, help texts, and choice labels)
- Filter-specific information such as `operators` or `filter_form_class`
I did not add support for Categories yet because they are rarely used.
```console
$ ./manage.py attribute_schema
{
  "properties": {
    "Date of birth": {
      "format": "date",
      "type": "string"
    },
    "Handedness": {
      "enum": [
        "Right",
        "Left"
      ],
      "type": "string"
    },
    "Highest degree": {
      "enum": [
        "No degree",
        "Elementary school",
        "Hauptschule",
        "Mittlere Reife",
        "Abitur",
        "Bachelor",
        "Master"
      ],
      "type": "string"
    },
    "Language": {
      "type": "string"
    }
  },
  "type": "object"
}
```
```console
$ ./manage.py attribute_schema --raw
{
  "properties": {
    "d5": {
      "description": "Handedness",
      "enum": [
        10,
        11
      ],
      "type": "integer"
    },
    "d6": {
      "description": "Language",
      "type": "string"
    },
    "d7": {
      "description": "Date of birth",
      "format": "date",
      "type": "string"
    },
    "d8": {
      "description": "Highest degree",
      "enum": [
        12,
        13,
        14,
        15,
        16,
        17,
        18
      ],
      "type": "integer"
    }
  },
  "type": "object"
}
```https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests/1503Squash migrations2020-11-30T15:26:40ZBengfortSquash migrationsI wanted to learn about [`squashmigrations`](https://docs.djangoproject.com/en/stable/topics/migrations/#squashing-migrations) so this is an example on how we could use it.
Migrations are no longer relevant once they have been execute...I wanted to learn about [`squashmigrations`](https://docs.djangoproject.com/en/stable/topics/migrations/#squashing-migrations) so this is an example on how we could use it.
Migrations are no longer relevant once they have been executed, so they are basically a bunch of dead code. Migrations can also get quite complex during larger refactorings. Getting rid of dead, complex code seems like a good idea.
On the other hand, this breaks installations that have not executed these old migrations. In other words: It makes updating more complicated. And even though it feels strange, keeping those old migrations doesn't actually hurt.
If you check out this branch and run `migrateall` it will not work. Instead, you first have to check out the "squash migrations" commit and run `migrateall` there. This will record the squashed migrations in the database so that django later knows that they do not need to be executed again.
Theoretically, we could squash all migrations on every release and then clean up on the next release. But should we? I am torn.
The django docs recommend not to do it:
> You are encouraged to make migrations freely and not worry about how many you have
Still, I feel like we should clean up early refactorings before our first "real" release. But maybe not adding migrations at all until !1058 was already sufficient for that goal.

**Example geofilter** (Bengfort, 2020-06-15)
https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests/1399

This is an example geofilter using the existing interfaces. What is striking to me is how much boilerplate code is necessary. I have some ideas how this could be improved.
- use [project templates](https://cookiecutter.readthedocs.io/) to generate the boilerplate automatically
- store the result in `AttributeSet.data` instead of a custom model. This is a `JSONField`, so we can add more data without migrations. On the other hand, I am not sure if I want to encourage filter authors to meddle with `AttributeSet` directly.
- Do not use custom filters at all and instead provide an interface to import multipolygons as geofilters (inside the polygon is a match)

**419 external settings with python dist** (Bengfort, 2020-06-10)
https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests/684

See also !64

**example of how we can use django-db-mailer inside of app** (Tyapkov, 2020-06-10)
https://git.mpib-berlin.mpg.de/castellum/castellum/-/merge_requests/67

This example shows how we can use the django-db-mailer app to send HTML emails.
**Info:**
Django module to easily send emails/push/SMS/TTS using Django templates stored in a database.
Out of the box you can use it with django-celery to send background messages.
You can also create reports from logs by mail category and slug.
Groups with recipients and sending via model signals are also available by default.
It can be used as an external service, independent of any programming language.
The app is very simple to install and use in your projects.
https://github.com/LPgenerator/django-db-mailer
Pros:
1. We don't have to save templates in variables anymore. Plain text files or templates saved in the DB can be used.
2. There are different backends we can use (SMS, push, email). If required we can write our own backend.
3. DB-Mailer can be used with Celery.