diff --git a/source/index.rst b/source/index.rst index 80e76e76b0c96aa7226401a337a64ba62a13413a..34ccb6c1c0c92d98d3a3109768478d18395a386a 100644 --- a/source/index.rst +++ b/source/index.rst @@ -12,7 +12,7 @@ Welcome to Castellum's documentation! overview features roles - privacy + pseudonyms security faqs diff --git a/source/privacy.rst b/source/privacy.rst deleted file mode 100644 index 01c5a69e27f61c7ae87aeee8abf767a58c6c5647..0000000000000000000000000000000000000000 --- a/source/privacy.rst +++ /dev/null @@ -1,99 +0,0 @@ -Privacy -======= - -At its core, Castellum is about splitting a subject's data into little pieces. -On the one hand this means that users can only access the pieces that are -necessary for them. On the other hand this means that castellum contains the -necessary information to put all the pieces back together, e.g. so it can be -deleted on request. - - -Contact data ------------- - -Contact details are stored in Castellum itself. This means that anyone who -wants to get in contact with a subject needs to go through castellum. - -.. warning:: - Traces of contact data can also exist in the systems that are used for - communication, e.g. email servers or payment providers. - - -Pseudonyms ----------- - -Scientific data should never be stored with a subject's name. Instead, -Castellum automatically generates and stores random pseudonyms that can be used -to link the data back to the subject. - -.. note:: - An alternative approach for generating pseudonyms would be to calculate an - encrypted hash over immutable, subject-related information (e.g. name, date - of birth) - - That approach would have the benefit of not relying on a central - infrastructure to store the pseudonyms. However, in cases where such a - central infrastructure with strict access control is feasible, Castellum's - approach is much simpler. 
- - For more information on these two approaches, see `Anforderungen an den - datenschutzkonformen Einsatz von Pseudonymisierungslösungen (german) - `_. - -.. note:: - The algorithm that is used to generate pseudonyms can be configured. The - algorithm that is used by default produces alphanumeric strings with 20 - bits of entropy and two checkdigits that are guaranteed to detect single - errors. It is also available as a `standalone package - `_. - -A subject can have many different pseudonyms in different domains. Castellum -automatically creates a new domain for each study. There can be more than one -domain per study as well as *general domains* that are not connected to studies -at all. - -.. warning:: - Pseudonyms are only unique (and therefore useful) within their domain. - Whenever you use a pseudonym, make sure that it is clear which domain it - belongs to. If in doubt, store the domain along with the pseudonym. - -It is up to you to decide on a granularity of domains. For example you could -use a single domain for all bio samples. Or you could use separate domains for -blood, saliva, stool, …. - -Using study pseudonyms -~~~~~~~~~~~~~~~~~~~~~~ - -Whenever you collect data in the context of a study, it should be stored with a -study pseudonym. Pseudonyms can also be printed on questionnaires or passed to -external survey services. - -Relevant guides: - -- :ref:`study-domains` -- :ref:`subject-by-pseudonym` -- :ref:`subject-get-pseudonym` - -.. todo:: - - attribute export - -.. _general-domains: - -Using pseudonyms from general domains -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Central repositories (e.g. for bio samples or IQ scores) often store data that -is not related to a specific study. In these cases, you can use pseudonyms from -a *general domain*. - -Because these pseudonyms are the same across all studies, access to them is -highly restricted. Both the user and the study need to be authorized before it -shows up in list of pseudonyms. 
- -Relevant guides: - -- :ref:`admin-general-domains` -- :ref:`admin-users` -- :ref:`study-domains` -- :ref:`subject-get-pseudonym` -- :ref:`subject-delete` diff --git a/source/pseudonyms.rst b/source/pseudonyms.rst new file mode 100644 index 0000000000000000000000000000000000000000..0db10c071f522db07eb15b2aaf8546f771e7766e --- /dev/null +++ b/source/pseudonyms.rst @@ -0,0 +1,80 @@ +Pseudonyms +========== + +Scientific data should never be stored with a subject's name. Instead, +Castellum provides pseudonyms that can be used to link the data back to the +subject. Anyone who wants to get in contact with a subject should have to go +through Castellum. + +.. warning:: + Traces of contact data can also exist in the systems that are used for + communication, e.g. email servers or payment providers. + +A subject can have many different pseudonyms in different domains. Castellum +automatically creates a new domain for each study. There can be more than one +domain per study as well as *general domains* that are not connected to studies +at all. + +Pseudonyms are only unique (and therefore useful) in the context of a domain. +Whenever you use a pseudonym, make sure that it is clear which domain it +belongs to. If in doubt, store the domain along with the pseudonym. + +It is up to you to decide on the granularity of domains. For example, you could +use a single domain for all bio samples. Or you could use separate domains for +blood, saliva, stool, etc. + +Using study pseudonyms +---------------------- + +Whenever you collect data in the context of a study, it should be stored with a +study pseudonym. Pseudonyms can also be printed on questionnaires or passed to +external survey services. + +Relevant guides: + +- :ref:`study-domains` +- :ref:`subject-by-pseudonym` +- :ref:`subject-get-pseudonym` + +.. todo:: + - attribute export + +.. _general-domains: + +Using pseudonyms from general domains +------------------------------------- + +Central repositories (e.g.
for bio samples or IQ scores) often store data that +is not related to a specific study. In these cases, you can use pseudonyms from +a *general domain*. + +Because these pseudonyms are the same across all studies, access to them is +highly restricted. Both the user and the study need to be authorized before such +a pseudonym shows up in the list of pseudonyms. + +Relevant guides: + +- :ref:`admin-general-domains` +- :ref:`admin-users` +- :ref:`study-domains` +- :ref:`subject-get-pseudonym` +- :ref:`subject-delete` + +How pseudonyms are generated +---------------------------- + +Castellum generates random pseudonyms and stores them in a database. + +An alternative approach for generating pseudonyms would be to calculate an +encrypted hash over immutable, subject-related information (e.g. name, date of +birth). That approach would have the benefit of not relying on a central +infrastructure to store the pseudonyms. However, in cases where such a central +infrastructure with strict access control is feasible, Castellum's approach is +much simpler. For more information on these two approaches, see `Anforderungen +an den datenschutzkonformen Einsatz von Pseudonymisierungslösungen (in German) +`_. + +The algorithm that is used to generate pseudonyms can be configured. The +default algorithm produces alphanumeric strings with 20 bits of entropy and two +check digits that are guaranteed to detect single errors. It is also available +as a `standalone package `_.
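
Note on the "How pseudonyms are generated" section: the idea of a random pseudonym with 20 bits of entropy and two check digits can be illustrated with a short sketch. This is not Castellum's actual algorithm. The alphabet, the weights, the checksum scheme, and the names ``generate`` and ``is_valid`` are all assumptions made for illustration. The sketch draws four random symbols from a 32-character alphabet (4 × 5 = 20 bits) and appends two check symbols, a plain sum and a weighted sum modulo 32.

```python
import secrets

# Hypothetical 32-symbol alphabet (5 bits per symbol) that leaves out
# easily confused letters. This is an assumption, not Castellum's alphabet.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
DATA_SYMBOLS = 4        # 4 symbols * 5 bits = 20 bits of entropy
WEIGHTS = (1, 3, 5, 7)  # odd weights are invertible modulo 32


def _checks(values):
    """Return two check values: a plain sum and a weighted sum, both mod 32."""
    plain = sum(values) % 32
    weighted = sum(w * v for w, v in zip(WEIGHTS, values)) % 32
    return plain, weighted


def generate():
    """Return a random pseudonym: 4 data symbols followed by 2 check symbols."""
    values = [secrets.randbelow(32) for _ in range(DATA_SYMBOLS)]
    return "".join(ALPHABET[v] for v in (*values, *_checks(values)))


def is_valid(pseudonym):
    """Check length, alphabet, and both check symbols."""
    if len(pseudonym) != DATA_SYMBOLS + 2:
        return False
    try:
        values = [ALPHABET.index(c) for c in pseudonym]
    except ValueError:
        return False
    return tuple(values[DATA_SYMBOLS:]) == _checks(values[:DATA_SYMBOLS])
```

In this sketch, replacing any single character with a different one changes the plain sum (for a data symbol) or breaks the comparison (for a check symbol), so ``is_valid`` rejects the result. A real deployment would also have to check a new pseudonym against the database to keep it unique within its domain.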