diff --git a/source/features.rst b/source/features.rst index 2d3c7851da6ac2477059b2e5f96596ee97f512c7..f8fe6ae4be46406d29a8b7b059a10c6d812b4918 100644 --- a/source/features.rst +++ b/source/features.rst @@ -13,10 +13,12 @@ Relevant guides: - :ref:`subject-search` - :ref:`subject-create` - :ref:`subject-edit` +- :ref:`subject-by-pseudonym` - :ref:`subject-get-pseudonym` - :ref:`subject-to-be-deleted` - :ref:`subject-delete` - :ref:`study-create` +- :ref:`study-domains` - :ref:`study-delete` - :ref:`data-protection-dashboard` - :ref:`subject-export` diff --git a/source/guides/admin.rst b/source/guides/admin.rst index 4336d41dc4469288b1f511677ce24207956fbe57..8caac3606be1b5cf3ce7fef0d9820b6cdbc3b39a 100644 --- a/source/guides/admin.rst +++ b/source/guides/admin.rst @@ -13,8 +13,9 @@ Manage Users 5. Add the appropriate global :ref:`roles` 6. Add the appropriate :ref:`privacy-level` -7. Set an expiration date -8. Click on one of the saving options +7. Add the appropriate **general domains** +8. Set an expiration date +9. Click on one of the saving options .. _admin-unlock: @@ -121,3 +122,16 @@ case there is a two step process: up in the data protection dashboard. The legal basis for each subject can be found in the subject detail view. + + +.. _admin-general-domains: + +Manage general domains +---------------------- + +1. Click on **Admin** on the front page +2. Go to **Domains** +3. Click on **Add Domain** (oval with grey background) +4. Enter a name +5. Leave the ``object_id`` and ``content_type`` fields empty +6. Click on one of the saving options diff --git a/source/guides/pseudonyms.rst b/source/guides/pseudonyms.rst new file mode 100644 index 0000000000000000000000000000000000000000..28573ab8aa4d2baa89020691f0e8fe73f14adae7 --- /dev/null +++ b/source/guides/pseudonyms.rst @@ -0,0 +1,26 @@ +.. _subject-by-pseudonym: + +Find subject by study pseudonym +=============================== + +1. Click on **Studies** on the front page +2. 
In the list of studies, find the study and click **Execution** +3. Go to the **By pseudonym** tab +4. Enter the pseudonym. If there is more than one study domain, you also have + to select the correct domain. + + +.. _subject-get-pseudonym: + +Get the pseudonyms of a subject +=============================== + +1. Click on **Studies** on the front page +2. In the list of studies, find the study and click **Execution** +3. In the list of participating subjects, click **Details** +4. In the subject overview, the pseudonym is listed among the subject's + contact data and operational hints + +.. note:: + The pseudonyms are only shown once you click a button. Each access to a + pseudonym is monitored to detect abuse. diff --git a/source/guides/study-management.rst b/source/guides/study-management.rst index 90586201813f8cd2f54252ab77535ddfd684be90..df2b06413f1d5d7b35b812d4a18e8195d5bde47a 100644 --- a/source/guides/study-management.rst +++ b/source/guides/study-management.rst @@ -172,6 +172,18 @@ your study: links can be inserted as standard text. +.. _study-domains: + +Manage study pseudonym domains +------------------------------ + +In the **Pseudonym domains** tab you can add a new domain or change the name of +an existing domain. + +If there are general domains you can also define which general domains need to +be accessed in the context of this study. + + .. _study-members: Manage study members diff --git a/source/guides/subject-get-pseudonym.rst b/source/guides/subject-get-pseudonym.rst deleted file mode 100644 index 6fe0ddbaa74e637970ee89e44345cb471294044e..0000000000000000000000000000000000000000 --- a/source/guides/subject-get-pseudonym.rst +++ /dev/null @@ -1,14 +0,0 @@ -.. _subject-get-pseudonym: - -Get the pseudonym of a subject -============================== - -1. Click on **Studies** on the front page - -2. In the list of studies where you are a member, click **Execution** - next to the name of the study and its contact person - -3. 
In the list of participating subjects, click **Details** - -4. In the subject overview, the pseudonym is listed among the subject's - contact data and operational hints diff --git a/source/guides/subject-management.rst b/source/guides/subject-management.rst index 6a2ade14c606f9dd5461bcc7ed87f34433c54797..f636ac147f6d24e97f9a538ff512805fa4ebf7a4 100644 --- a/source/guides/subject-management.rst +++ b/source/guides/subject-management.rst @@ -267,7 +267,7 @@ sufficient legal basis to keep the data. This is only available for users with permissions that are granted to staff members who are data protection coordinators or the like. - If deletion was requested by a subject you should follow your institutes + If deletion was requested by a subject you should follow your institute's rules on verifying identity of requester. In order to delete the externally and internally stored data of a subject, @@ -281,10 +281,18 @@ please proceed as follows: - Contact the responsible person for each study and ask them to delete all collected data of the subject concerned. Identify the subject using the study pseudonym that is displayed. - - Once the responsible contact person has conformed the deletion of all + - Once the responsible contact person has confirmed the deletion of all data, delete the participation record using the **Delete** button. -3. Once all participation have been deleted you will see a message saying +3. If you see a message saying **This subject may still have data in general + domains.**, proceed as follows: + + - Click on **Pseudonyms** next to **General pseudonym domains** to get a + list of pseudonyms. + - Contact the responsible person for each general domain and make sure + that all data is deleted. + +4. Once all participations have been deleted you will see a message saying **Are you sure you want to permanently delete this subject and all related data?** You can now click **Confirm** and the subject will be deleted. 
diff --git a/source/index.rst b/source/index.rst index 6e7f34a885f05d355d83f4df3c8fa965118433f9..53c4656cc1f1791195de9786ec14266a9e588f8a 100644 --- a/source/index.rst +++ b/source/index.rst @@ -12,6 +12,7 @@ Welcome to Castellum's documentation! overview features roles + privacy security faqs @@ -22,7 +23,7 @@ Welcome to Castellum's documentation! guides/two-factor-authentication guides/subject-management guides/study-management - guides/subject-get-pseudonym + guides/pseudonyms guides/data-protection guides/consent-management diff --git a/source/privacy.rst b/source/privacy.rst new file mode 100644 index 0000000000000000000000000000000000000000..718f87edfc9fea31cb5b72a9c7c1795307179c6c --- /dev/null +++ b/source/privacy.rst @@ -0,0 +1,116 @@ +Privacy +======= + +At its core, Castellum is about splitting a subject's data into little pieces. +On the one hand this means that users can only access the pieces that are +necessary for them. On the other hand this means that Castellum contains the +necessary information to put all the pieces back together, e.g. so the data can +be deleted on request. + + +Contact data +------------ + +Contact details are stored in Castellum itself. This means that anyone who +wants to get in contact with a subject needs to go through Castellum. + +.. warning:: + Traces of contact data can also exist in the systems that are used for + communication, e.g. email servers or payment providers. + + +Pseudonyms +---------- + +Scientific data should never be stored with a subject's name. Instead, +Castellum automatically generates and stores random pseudonyms that can be used +to link the data back to the subject. + +.. note:: + An alternative approach for generating pseudonyms would be to calculate an + encrypted hash over immutable, subject-related information (e.g. name, date + of birth). + + That approach would have the benefit of not relying on a central + infrastructure to store the pseudonyms. 
However, in cases where such a + central infrastructure with strict access control is feasible, Castellum's + approach is much simpler. + + For more information on these two approaches, see `Anforderungen an den + datenschutzkonformen Einsatz von Pseudonymisierungslösungen (German) + `_. + +.. note:: + The algorithm that is used to generate pseudonyms can be configured. The + algorithm that is used by default produces alphanumeric strings with 20 + bits of entropy and two check digits that are guaranteed to detect single + errors. It is also available as a `standalone package + `_. + +A subject can have many different pseudonyms in different domains. Castellum +automatically creates a new domain for each study. There can be more than one +domain per study as well as *general domains* that are not connected to studies +at all. + +.. warning:: + Pseudonyms are only unique (and therefore useful) within their domain. + Whenever you use a pseudonym, make sure that it is clear which domain it + belongs to. If in doubt, store the domain along with the pseudonym. + +It is up to you to decide on the granularity of domains. For example, you could +use a single domain for all bio samples. Or you could use separate domains for +blood, saliva, stool, etc. + +Using study pseudonyms +~~~~~~~~~~~~~~~~~~~~~~ + +Whenever you collect data in the context of a study, it should be stored with a +study pseudonym. Pseudonyms can also be printed on questionnaires or passed to +external survey services. + +Relevant guides: + +- :ref:`study-domains` +- :ref:`subject-by-pseudonym` +- :ref:`subject-get-pseudonym` + +.. todo:: + - attribute export + +Using pseudonyms from general domains +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Central repositories (e.g. for bio samples or IQ scores) often store data that +is not related to a specific study. In these cases, you can use pseudonyms from +a *general domain*. 
+ +Because these pseudonyms are the same across all studies, access to them is +highly restricted. Both the user and the study need to be authorized before a +general pseudonym shows up in the list of pseudonyms. This also means that, even +though general domains exist independently of studies, they can only be accessed +through studies. + +Relevant guides: + +- :ref:`admin-general-domains` +- :ref:`admin-users` +- :ref:`study-domains` +- :ref:`subject-get-pseudonym` +- :ref:`subject-delete` + + +Database split +-------------- + +In Castellum, contact data is handled in a database server which is separated +from everything else to provide an additional barrier. + +This provides a clear structure for developers that should help avoid +critical data leaks. Even if an attacker is able to dump a whole table or even +a whole database, this structure still limits the impact. + +However, it is important to understand that the barrier between recruitment and +contact data is not that high. Since Castellum has full access to both, an +attacker who compromises Castellum can also gain full access. Spreading the +system across several databases on different servers or even in different +organizations does not help much if there is still a single point of entry. diff --git a/source/roles.rst b/source/roles.rst index 5abc22939a209482515d579a12998e4786c8798f..1d000910bcea8d939afc47f4f5d0789b1bbf6422 100644 --- a/source/roles.rst +++ b/source/roles.rst @@ -41,6 +41,7 @@ Relevant guides: - :ref:`study-members` - :ref:`study-sessions` - :ref:`study-recruitment-settings` +- :ref:`study-domains` - :ref:`study-finish` - :ref:`study-delete` - :ref:`set-up-external-scheduler` diff --git a/source/security.rst b/source/security.rst index 0933d30aac834918b609b71c2ac02a6ae99532c2..8e482b77028d284f15395b09667e299baccd0152 100644 --- a/source/security.rst +++ b/source/security.rst @@ -81,60 +81,6 @@ user's privacy level is controlled via the special permissions ``privacy_level_1`` and ``privacy_level_2``. 
The three levels (0-2) accord to the data security levels of the Max Planck Society. -Pseudonyms ----------- - -There are generally two approaches to generate pseudonyms: - -- Calculate an encrypted hash over immutable, subject-related information - (e.g. name, date of birth) -- Generate a random pseudonym and store it in a mapping table - -The former approach has the benefit of not relying on a central infrastructure. -However, in cases where such a central infrastructure with strict access -control is feasible, the latter approach is much simpler. - -Castellum implements the latter approach. - -For more information on these two approaches, see `Anforderungen an den -datenschutzkonformen Einsatz von Pseudonymisierungslösungen (german) -`_. - -The algorithm that is used to generate pseudonyms can be configured. The -algorithm that is used by default produces alphanumeric strings with 20 bits of -entropy and two checkdigits that are guaranteed to detect single errors. It is -also available as a `standalone package -`_. - -Data separation ---------------- - -Implementation -~~~~~~~~~~~~~~ - -We chose to split the data into three different categories: - -- Scientific data is handled outside of castellum. Castellum only - provides the pseudonyms that are used to map this data to subjects. -- Data relevant for recruitment is handled in castellum. -- Contact data is also handled in castellum, but in a separate database - to provide an additional barrier. - -Security Considerations -~~~~~~~~~~~~~~~~~~~~~~~ - -The described architecture provides a clear structure for developers -that should help avoiding critical data leaks. Even if an attacker is -able to dump a whole table or even a whole database, this structure -still limits the impact. - -However, it is important to understand that the barrier between -recruitment and contact data is not that high. Since castellum has full -access to both, an attacker can also gain full access. 
Spreading the -system across several databases on different servers or even in -different organizations does not help much if there is still a single -point of entry. - Monitoring ----------