adaptive pseudonym length
I was not sure how to pass the pseudonym length:
- Different pseudonym generators can have a different amount of entropy per char. I think giving a measure of entropy (domain size) is better than a specific length.
- The expected domain size might be a huge number. I feel like a logarithm of that should be sufficient.
- We generate random pseudonyms and only check uniqueness afterwards. So the bigger the domain, the less likely it is that the uniqueness check fails (see also !895 (merged)). Which part of the application should decide how much extra space we plan for that?
- The only indication of expected study domain size is `study.min_subjects_count`. It is not completely clear how big a `max_subjects_count` would be in relation to that, but I guess a factor of 10 is probably sufficient.
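
To make the sizing question concrete, here is a rough sketch of how a bit count could be derived from `study.min_subjects_count`. It is not the code in this MR: the function name `required_bits`, the `growth_factor` default of 10 (the guess from above) and the `max_retry_probability` parameter are assumptions for illustration. The reasoning is that once `n` pseudonyms exist in a domain of size `N`, a freshly drawn random pseudonym fails the uniqueness check with probability `n / N`.

```python
import math


def required_bits(
    min_subjects_count: int,
    growth_factor: int = 10,
    max_retry_probability: float = 0.01,
) -> int:
    """Return log2 of the domain size needed for a study (hypothetical helper).

    Assumes the study never holds more than min_subjects_count * growth_factor
    pseudonyms, and that a single random draw should collide with an existing
    pseudonym with probability at most max_retry_probability.
    """
    if min_subjects_count < 1 or growth_factor < 1:
        raise ValueError("counts must be positive")
    if not 0 < max_retry_probability <= 1:
        raise ValueError("max_retry_probability must be in (0, 1]")

    # Assumed upper bound on the number of pseudonyms in the study.
    expected_max = min_subjects_count * growth_factor
    # A draw collides with probability n / N, so we need N >= n / p.
    domain_size = expected_max / max_retry_probability
    return math.ceil(math.log2(domain_size))
```

With these defaults, a study with `min_subjects_count = 100` would ask for 17 bits. Which part of the application owns `max_retry_probability` is exactly the open question above.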
For now I went with `bits` as it is a common way to measure entropy. Not completely sure though.
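
As an illustration of why bits are generator-independent, the sketch below shows how each generator could turn a bit count into its own pseudonym length. The helper name `pseudonym_length` and the alphabet sizes in the usage note are assumptions, not the actual generators in this project.

```python
def pseudonym_length(bits: int, alphabet_size: int) -> int:
    """Smallest length L such that alphabet_size ** L >= 2 ** bits.

    Hypothetical helper: uses integer arithmetic only, so there are no
    floating-point rounding problems at exact powers.
    """
    if bits < 0:
        raise ValueError("bits must be non-negative")
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")

    target = 2 ** bits
    length = 0
    capacity = 1  # number of distinct pseudonyms of the current length
    while capacity < target:
        capacity *= alphabet_size
        length += 1
    return length
```

For the same 17 bits, a digits-only generator (alphabet size 10) would need 6 characters, while a generator with a 32-character alphabet would need only 4.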
I was wondering whether it is a security issue to have pseudonyms with dynamic length. For example, it could be easy to identify contact pseudonyms since they are much longer than other ones.