
Move all storage information from dataset to study

Bengfort requested to merge rm-dataset-redundancy into main

The problem

We do not yet have a clear idea whether storage belongs to the study or to the dataset.

  • We usually talk about "study folders".
  • In the code, the storage location and "to be stored until" both belong to the dataset (as did the expected size, until it was removed in f88fd3c1).
  • In !44 (merged), we added Study.storage_location, which is required for ARC compatibility (especially deployments).
  • We sometimes mentioned that dataset folders could be subfolders of study folders, but we never got into the specifics, and it sounds overly complicated to me.
  • Currently, it is unclear when Study.storage_location should be deleted. We only know that for DataSet.location.
  • We renamed datasets multiple times ("data set" -> "data storage" -> "(raw) data folder"). This could indicate that the concept itself is not clear.

The proposal

My proposal is to move all storage information to the study. In practice this means that "to be stored until" is moved to the study and DataSet.location is removed.
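As a rough illustration of the resulting data model, here is a minimal sketch in plain Python dataclasses. It is not the project's actual code: only Study.storage_location and DataSet.location are names taken from this merge request, while `to_be_stored_until`, `name`, `datasets`, and the helper `may_delete_storage` are hypothetical names chosen for the example. The rule that the folder may be deleted once the date has passed is also an assumption.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class DataSet:
    # After the change, a dataset only describes the data itself.
    # It no longer has a `location` or a retention date.
    name: str


@dataclass
class Study:
    name: str
    storage_location: str      # added in !44
    to_be_stored_until: date   # hypothetical field name, moved here from the dataset
    datasets: List[DataSet] = field(default_factory=list)


def may_delete_storage(study: Study, today: date) -> bool:
    """Assumed rule: the study folder may be deleted once the retention date has passed."""
    return today > study.to_be_stored_until
```

With all storage information on the study, the question of when Study.storage_location may be deleted can be answered from the study alone, without looking at its datasets.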

Side effect: Explain that folder will be created

Without further explanation, "to be stored until" would be very confusing in the study form. This is basically the same issue already described in !34 (closed): When users register a study, they do not know that a folder will be created for them.

I added a box around the "to be stored until" field that contains an explanation, including some of the text that was previously included in the blue info box in the dataset form. I think this turned out really well.

For good measure, I also added a success message that is shown after a study has been created.

Position in the form

Conceptually, "to be stored until" is quite different from most other fields in the study form. It is somewhat connected to "Users who can edit this study profile and access the private folder" because that field also affects the storage folder. It is also a bit similar to "Expected end of study" because that is also a rough estimate about the future.

My proposal is to have it at the very end, just before "Users who can edit this study profile and access the private folder".

What is the purpose of datasets?

I changed the wording back from "data folders" to "data sets".

I also removed the sentence "If you store data on MPIB's storage cluster, please provide some additional information in the following fields" because this is no longer optional.

So what is left of datasets? I guess the remaining fields are more coherent. However, I am not sure what motivation a researcher could have to enter this data, especially internal/external reuse.


[Screenshot, 2023-03-14: Studies, Study registration form]
