Health information that does not identify an individual and for which there is no reasonable basis to believe it can be used to identify an individual.
De-identified information is health data from which all individually identifiable elements have been removed such that the remaining information cannot reasonably be used to identify a specific person. Once health information is properly de-identified under HIPAA standards, it is no longer considered protected health information (PHI) and is not subject to the Privacy Rule's use and disclosure restrictions. This distinction matters for research, public health analysis, and health care operations that rely on population-level health data and do not need to know individual identities.
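HIPAA provides two approved methods for de-identifying health information. The Expert Determination method (45 CFR 164.514(b)(1)) requires a person with appropriate knowledge and experience to apply generally accepted statistical or scientific principles, determine that the risk of identifying an individual is very small, and document the methods and results that justify that determination. The Safe Harbor method (45 CFR 164.514(b)(2)) requires the removal of 18 categories of identifiers relating to the individual or to the individual's relatives, employers, or household members: names, geographic subdivisions smaller than a state, all elements of dates (except year) directly related to the individual, phone numbers, fax numbers, email addresses, Social Security numbers, medical record numbers, health plan beneficiary numbers, account numbers, certificate and license numbers, vehicle identifiers and serial numbers, device identifiers and serial numbers, URLs, IP addresses, biometric identifiers, full-face photographs and comparable images, and any other unique identifying number, characteristic, or code. Two details are easy to miss: the first three digits of a ZIP code may be kept only when the area they cover contains more than 20,000 people (otherwise they are replaced with 000), and ages over 89 must be aggregated into a single category of 90 or older. Under Safe Harbor, the organization must also have no actual knowledge that the remaining information could be used, alone or in combination with other information, to identify an individual.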
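A few of the Safe Harbor transformations can be sketched in code. The following Python example is a minimal illustration, not a compliant implementation: the field names (name, zip, age, and so on), the function safe_harbor_sketch, and the contents of LOW_POPULATION_ZIP3 are all assumptions made for the example, and the sketch covers only a subset of the 18 identifier categories. It does not handle free-text fields, images, or the "no actual knowledge" requirement.

```python
from datetime import date

# Hypothetical field names for direct identifiers; a real schema will differ,
# and this set covers only part of the Safe Harbor list.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "phone", "fax", "email", "ssn", "medical_record_number",
    "health_plan_number", "account_number", "ip_address", "url",
}

# Placeholder values for illustration only. In practice this set must be
# derived from current Census data: 3-digit ZIP prefixes whose combined
# area contains 20,000 or fewer people.
LOW_POPULATION_ZIP3 = {"036", "059"}


def safe_harbor_sketch(record: dict) -> dict:
    """Return a copy of `record` with some Safe Harbor rules applied."""
    # 1. Drop fields that hold direct identifiers.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIER_FIELDS}

    # 2. Reduce every date (or datetime) value to its year.
    for key, value in list(out.items()):
        if isinstance(value, date):
            out[key] = value.year

    # 3. Keep only the first three ZIP digits, or "000" for sparse areas.
    zip_code = out.pop("zip", None)
    if zip_code is not None:
        prefix = str(zip_code)[:3]
        out["zip3"] = "000" if prefix in LOW_POPULATION_ZIP3 else prefix

    # 4. Aggregate ages over 89 into a single "90+" category.
    age = out.get("age")
    if isinstance(age, int) and age > 89:
        out["age"] = "90+"

    return out


if __name__ == "__main__":
    patient = {
        "name": "Jane Doe",
        "zip": "03601",
        "admitted": date(2023, 5, 17),
        "age": 93,
        "diagnosis": "J45",
    }
    print(safe_harbor_sketch(patient))
```

A field-level filter like this is only a starting point. Identifiers often hide in free-text notes and attachments, which is one reason the rule also requires that the organization have no actual knowledge that the remaining data could identify someone.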
While de-identified information offers significant flexibility, organizations must exercise caution. Re-identification risk increases as more data elements are combined, particularly with small populations or rare conditions, because a combination of individually harmless values (for example, a three-digit ZIP code, a birth year, and a rare diagnosis) can match only one person. The growing availability of external data sources that can be linked against a released dataset has made re-identification attacks more practical. Organizations should regularly reassess their de-identification processes, consider using the Expert Determination method for higher-risk datasets, and avoid including data elements that could serve as quasi-identifiers. It is also important to distinguish between de-identified information, which falls outside the Privacy Rule's use and disclosure restrictions, and limited data sets, which retain some identifiers (such as dates and city, state, or ZIP code), remain PHI, and may be disclosed only under a data use agreement for research, public health, or health care operations.
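One common way to gauge the re-identification risk described above is to count how many records share each combination of quasi-identifier values; a combination held by a single record is a likely target for linkage. The Python sketch below uses this k-anonymity style check. It is an illustration only: HIPAA does not prescribe k-anonymity or any particular threshold, and the function names, field names, and the choice of k are assumptions made for the example.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count records for each combination of quasi-identifier values."""
    return Counter(
        tuple(record.get(field) for field in quasi_identifiers)
        for record in records
    )


def smallest_group_size(records, quasi_identifiers):
    """Size of the rarest combination (the dataset's k); 0 if no records."""
    counts = group_sizes(records, quasi_identifiers)
    return min(counts.values()) if counts else 0


def risky_groups(records, quasi_identifiers, k):
    """Combinations shared by fewer than k records."""
    counts = group_sizes(records, quasi_identifiers)
    return {combo: n for combo, n in counts.items() if n < k}


if __name__ == "__main__":
    # Hypothetical, already field-filtered records.
    records = [
        {"zip3": "021", "birth_year": 1980, "sex": "F"},
        {"zip3": "021", "birth_year": 1980, "sex": "F"},
        {"zip3": "021", "birth_year": 1951, "sex": "M"},
    ]
    fields = ["zip3", "birth_year", "sex"]
    print(smallest_group_size(records, fields))
    print(risky_groups(records, fields, k=2))
```

Adding a field to the quasi-identifier list can only split groups, never merge them, so the smallest group size stays the same or shrinks with every extra data element. A check like this can flag datasets that warrant generalization, suppression, or a formal Expert Determination.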