From: Methods for the de-identification of electronic health records for genomic research
De-identification method | Techniques | Details |
---|---|---|
Masking (applied to direct identifiers) | Suppression/redaction | Direct identifiers are removed from the data or replaced with tags |
 | Random replacement/randomization | Direct identifiers are replaced with randomly chosen values (for example, for names and medical record numbers) |
 | Pseudonymization | Direct identifiers are replaced with unique numbers that are not reversible |
Generalization (applied to quasi-identifiers) | Hierarchy-based generalization | Generalization follows a predefined hierarchy that describes how the precision of quasi-identifiers is reduced |
 | Cluster-based generalization | Individual transactions are grouped empirically or according to predefined utility policies |
Suppression (applied to records flagged for suppression) | Casewise deletion | The full record is deleted |
 | Quasi-identifier deletion | Only the quasi-identifiers are deleted |
 | Local cell suppression | An optimization scheme is applied to the quasi-identifiers to suppress the fewest values while keeping the re-identification probability below the threshold |
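
To make the three method families in the table concrete, the following is a minimal Python sketch, not a method taken from the source. It combines suppression of a direct identifier (the name is dropped), pseudonymization of the medical record number, hierarchy-based generalization of two quasi-identifiers, and casewise deletion of records flagged for suppression. Everything specific here is an assumption made for illustration: the field names (`name`, `mrn`, `age`, `zip`, `diagnosis`), the keyed hash used to produce pseudonyms, the 10-year age bands and 3-digit ZIP prefix, and the rule that flags a record when fewer than `k` records share its generalized quasi-identifiers.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret; in practice it would be stored apart from the data.
SECRET_KEY = b"hypothetical-secret-key"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Masking: replace a direct identifier with a keyed hash (assumed scheme)."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def generalize_age(age: int, width: int = 10) -> str:
    """Generalization: map an exact age onto a band of the given width."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_zip(zip_code: str, keep: int = 3) -> str:
    """Generalization: keep the leading digits and mask the rest."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def deidentify(records, k=2):
    """Mask, generalize, then delete records in groups smaller than k."""
    masked = []
    for r in records:
        masked.append({
            # The name is not copied over at all (suppression/redaction).
            "pid": pseudonymize(r["mrn"]),
            "age": generalize_age(r["age"]),
            "zip": generalize_zip(r["zip"]),
            "diagnosis": r["diagnosis"],
        })
    # Count how many records share each combination of quasi-identifiers.
    counts = Counter((m["age"], m["zip"]) for m in masked)
    # Casewise deletion: drop every record whose group has fewer than k members.
    return [m for m in masked if counts[(m["age"], m["zip"])] >= k]


# Made-up sample records for illustration only.
records = [
    {"name": "A", "mrn": "001", "age": 34, "zip": "12345", "diagnosis": "asthma"},
    {"name": "B", "mrn": "002", "age": 37, "zip": "12399", "diagnosis": "diabetes"},
    {"name": "C", "mrn": "003", "age": 62, "zip": "54321", "diagnosis": "hypertension"},
]

for row in deidentify(records, k=2):
    print(row)
```

With these sample records, the first two fall into the same group (age `30-39`, ZIP `123**`) and are kept, while the third is alone in its group and is deleted in full. Quasi-identifier deletion or local cell suppression would instead keep that record and blank only some of its quasi-identifier values; local cell suppression needs an optimization step that this sketch does not attempt.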