adaptive pseudonym length

Bengfort requested to merge 1187-adaptive-pseudonym-length into master

I was not sure how to pass the pseudonym length to the generator. Some considerations:

  • Different pseudonym generators can have different amounts of entropy per character. I think passing a measure of entropy (domain size) is better than passing a specific length.
  • The expected domain size might be a huge number. I feel like the logarithm of that number should be sufficient.
  • We generate random pseudonyms and only check uniqueness afterwards. So the bigger the domain, the less likely it is that the uniqueness check fails (see also !895 (merged)). Which part of the application should decide how much extra space we plan for that?
  • The only indication of the expected study domain size is study.min_subjects_count. It is not completely clear how big a max_subjects_count would be in relation to that, but I guess a factor of 10 is probably sufficient.
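To make the points above concrete, here is a rough sketch of how a number of bits could be derived from study.min_subjects_count. The function name `required_bits`, the `growth_factor` default of 10 and the `max_collision_probability` parameter are assumptions for illustration, not code from this merge request. It relies on the fact that a fresh random pseudonym collides with one of n existing pseudonyms with probability n / 2^bits.

```python
import math


def required_bits(min_subjects_count, growth_factor=10,
                  max_collision_probability=0.001):
    """Hypothetical helper: estimate the entropy (in bits) a pseudonym needs.

    Assumptions (not taken from the merge request):
    - the study never has more than min_subjects_count * growth_factor subjects
    - a single random draw should fail the uniqueness check with probability
      at most max_collision_probability
    """
    expected_subjects = max(1, min_subjects_count) * growth_factor
    # A new random pseudonym collides with probability
    # expected_subjects / 2**bits, so we need
    # 2**bits >= expected_subjects / max_collision_probability.
    bits = math.log2(expected_subjects) - math.log2(max_collision_probability)
    return math.ceil(bits)


print(required_bits(100))
print(required_bits(1000))
```

Passing the logarithm keeps the value small even for very large domains, and the extra space for the uniqueness check becomes an explicit parameter instead of being hidden in a length.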

For now I went with bits, as that is a common way to measure entropy. I am not completely sure about this choice, though.
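As an illustration of why bits are generator-independent, the following sketch shows how each generator could translate a bit count into its own length. `length_for_bits` and the alphabet sizes are hypothetical and only serve as an example; the actual generators may work differently.

```python
def length_for_bits(bits, alphabet_size):
    """Hypothetical helper: smallest length whose domain covers 2**bits values."""
    if alphabet_size < 2:
        raise ValueError('alphabet_size must be at least 2')
    length = 0
    capacity = 1  # number of distinct pseudonyms of the current length
    # Integer arithmetic avoids floating point rounding issues.
    while capacity < 2 ** bits:
        capacity *= alphabet_size
        length += 1
    return length


# The same entropy requirement results in different lengths per alphabet.
for alphabet_size in (10, 16, 32):
    print(alphabet_size, length_for_bits(20, alphabet_size))
```

With 20 bits, a digits-only generator would need 7 characters, while a generator with a 32-character alphabet would need only 4.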

I was wondering whether it is a security issue to have pseudonyms with dynamic length. For example, it could be easy to identify contact pseudonyms since they are much longer than the other ones.

Edited by Bengfort