Detect and/or Redact Protected Health Information v1.0.0 Help

Inspects text for protected health information (PHI) entities and returns details about them. The Step can also redact the identified PHI entities with the masks you provide. The Step performs named entity recognition (NER).

How can I use the Step?

The Step lets you find and redact PHI entities in unstructured clinical text such as physician's notes, discharge summaries, test results, and medical records. This way, you can automate PHI data collection and implement specific policies to deal with protected health information to meet the HIPAA Privacy Rule. The Step only supports English and can handle 9 PHI entity types, covering all the HIPAA identifiers.

Warning: This Step is not a substitute for professional medical advice, diagnosis, or treatment. In any medical scenario, review and validate results before use.

How does the Step work?

A PHI entity is a text reference to personally identifiable information associated with the health data content. To learn more about PHI, visit this webpage.

For example, in a text, "The patient is John Doe, a 48-year-old teacher, and resident of Seattle, Washington," the Step recognizes John Doe as a name, 48 as an age, teacher as a profession, and Seattle, Washington as an address.

In addition, the Step assigns a confidence score to each PHI entity found in the text. This score indicates how confident the Step is that it identified the PHI entity type correctly. To learn more, see the Output example.

To meet the HIPAA Privacy Rule, you can mask the detected PHI entities using different Redaction options.

Note: To confirm the accuracy of detected PHI for specific compliance use cases, we recommend adding human review or other validation methods.
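One way to combine the confidence score with human review is to route low-scoring entities to a reviewer. The following is a minimal sketch, not part of the Step itself. The field names `text`, `type`, and `score` follow the Output example; the 0.95 threshold and the function name `split_by_confidence` are assumptions chosen for illustration.

```python
# Hypothetical threshold: entities scored below it go to a human reviewer.
REVIEW_THRESHOLD = 0.95


def split_by_confidence(entities, threshold=REVIEW_THRESHOLD):
    """Split detected entities into (accepted, needs_review) lists by score."""
    accepted = [e for e in entities if e["score"] >= threshold]
    needs_review = [e for e in entities if e["score"] < threshold]
    return accepted, needs_review


# Two entities with scores rounded from the Output example.
entities = [
    {"text": "John Doe", "type": "name", "score": 0.9913},
    {"text": "teacher", "type": "profession", "score": 0.9306},
]

accepted, needs_review = split_by_confidence(entities)
print("accepted:", [e["text"] for e in accepted])
print("needs review:", [e["text"] for e in needs_review])
```

The right threshold depends on your compliance requirements; the value here is only a placeholder.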

Input settings

To set up the section, do the following:

  1. For Operations, select at least one of the following options: Detect, Redact.

  2. For Input text, enter text to analyze.

  3. For PHI entity types, select entity types you want to detect/redact in the text.

Input text

The input text must be a UTF-8 string. The string must contain at least one character. The maximum string size is 20 KB. English is the only valid language.
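If you build the input text in code before passing it to the Step, you can check these constraints up front. The sketch below is illustrative only. It assumes that "20 KB" means 20 * 1024 bytes measured on the UTF-8 encoding; the function name `validate_input_text` is hypothetical.

```python
# Assumption: the 20 KB limit applies to the UTF-8 encoded size, 1 KB = 1024 bytes.
MAX_INPUT_BYTES = 20 * 1024


def validate_input_text(text: str) -> int:
    """Return the UTF-8 size in bytes, or raise ValueError if the text is invalid."""
    if len(text) == 0:
        raise ValueError("Input text must contain at least one character.")
    try:
        encoded = text.encode("utf-8")
    except UnicodeEncodeError as exc:
        # Raised, for example, by lone surrogate code points.
        raise ValueError("Input text is not valid UTF-8.") from exc
    if len(encoded) > MAX_INPUT_BYTES:
        raise ValueError(
            f"Input text is {len(encoded)} bytes; the limit is {MAX_INPUT_BYTES}."
        )
    return len(encoded)


print(validate_input_text("The patient is John Doe."))
```

Note that non-ASCII characters take more than one byte in UTF-8, so a string can exceed the size limit even when its character count is below 20,480.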

PHI entity types

In the following table, you can find information about PHI entity types:

| PHI entity type | Description |
| --- | --- |
| address | This includes all geographical subdivisions of an address of any facility, named medical facilities, or wards within a facility. |
| age | All components of age, spans of age, and any age mentioned, be it patient, family member, or others involved in the note. Default is in years unless otherwise noted. |
| date | Any date related to patient or patient care. |
| email | Any email address, such as marymajor@email.com. |
| id | Any number associated with a patient's identity, including their social security number, medical record number, facility identification number, clinical trial number, certificate or license number, vehicle or device number. It also includes biometric numbers and numbers identifying the place of care or provider. |
| name | All names mentioned in the clinical note, typically belonging to the patient, family, or provider. |
| phoneOrFax | Any phone, fax, pager; excludes named phone numbers such as 1-800-QUIT-NOW and 911. |
| url | Any web URL. |
| profession | Includes any profession or employer mentioned in a note on the patient or the patient's family. |

Datetime settings

The Datetime feature converts dates found in the text from one timezone to another and returns the converted date in a selected date and time format.

To set up the section, follow these steps:

  1. For Timezone, select input and output timezones for date conversion.
  2. For Output format, select the date and time format and specify options that suit your application.
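To illustrate what a timezone conversion with a chosen output format involves, here is a minimal sketch using only the Python standard library. It is not the Step's implementation. The function name `convert_date`, the input and output formats, and the fixed UTC-8 offset are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def convert_date(value: str, input_format: str, input_tz: tzinfo,
                 output_tz: tzinfo, output_format: str) -> str:
    """Parse a date string, interpret it in input_tz, and render it in output_tz."""
    naive = datetime.strptime(value, input_format)
    aware = naive.replace(tzinfo=input_tz)  # attach the input timezone
    return aware.astimezone(output_tz).strftime(output_format)


# A fixed UTC-8 offset stands in for a named timezone to keep the sketch
# dependency-free; zoneinfo.ZoneInfo("America/Los_Angeles") could be passed
# instead where the IANA timezone database is available.
PST = timezone(timedelta(hours=-8), "PST")

converted = convert_date(
    "03/01/2023 22:30", "%m/%d/%Y %H:%M", PST, timezone.utc, "%Y-%m-%d %H:%M"
)
print(converted)  # 2023-03-02 06:30
```

The example shows why the timezone matters for dates in clinical text: a late-evening timestamp can fall on a different calendar day after conversion.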

Redaction options

The Redact operation lets you mask PHI entities using one of the following two options:

  • Entity type mask (default)
  • Custom mask

Entity type mask

The Entity type mask option replaces each PHI entity with the name of its predefined PHI entity type in square brackets.

For example, using the text, "The patient is John Doe, a 48-year-old teacher and resident of Seattle, Washington," with a Redact operation and Entity type mask, the Step returns the following text:

"The patient is [name], a [age]-year-old [profession] and resident of [address]"

Custom mask

The Custom mask option works similarly to Entity type mask but redacts PHI entities with the characters you provide instead of the predefined PHI entity type names.
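The two redaction options can be sketched as a single span-replacement function. This is an illustration under assumptions, not the Step's code. The entity fields `beginOffset`, `endOffset`, and `type` follow the Output example; `redact` and `span` are hypothetical helpers, and the sketch assumes that a custom mask replaces each whole entity with the provided characters.

```python
def redact(text, entities, custom_mask=None):
    """Replace each entity span with "[type]" or, if given, with custom_mask."""
    parts = []
    cursor = 0
    for ent in sorted(entities, key=lambda e: e["beginOffset"]):
        parts.append(text[cursor:ent["beginOffset"]])  # text before the entity
        parts.append(custom_mask if custom_mask is not None else f"[{ent['type']}]")
        cursor = ent["endOffset"]  # end offset is treated as exclusive
    parts.append(text[cursor:])  # remaining text after the last entity
    return "".join(parts)


def span(text, fragment, entity_type):
    """Hand-built stand-in for one detected entity (first occurrence of fragment)."""
    begin = text.index(fragment)
    return {"beginOffset": begin, "endOffset": begin + len(fragment), "type": entity_type}


TEXT = "The patient is John Doe, a 48-year-old teacher and resident of Seattle, Washington"
ENTITIES = [
    span(TEXT, "John Doe", "name"),
    span(TEXT, "48", "age"),
    span(TEXT, "teacher", "profession"),
    span(TEXT, "Seattle, Washington", "address"),
]

print(redact(TEXT, ENTITIES))
print(redact(TEXT, ENTITIES, custom_mask="***"))
```

Building the result from left to right keeps the original offsets valid, which would not be the case if the text were modified in place before all spans were processed.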

Output and exit behavior

To set up this section, take the following steps:

  1. For Output data options, select the appropriate options to configure the output structure. This setting is available only for the Detect operation.
  2. In Output data structure, ensure that the output structure suits your application.

Merge field settings

The Step returns the result as a JSON object and stores it in the Merge field variable. Thus you can access the output JSON object from any point of your Flow. To learn more about this Step's output, see the Output example.

Skip logic exit

Use this setting to handle cases where a Merge field variable with the same name already exists in your Flow and holds a value.

By default, in such cases, the Step overwrites the existing variable with the new value. Another option is to skip the Step execution and direct the Flow down the selected exit. To do so, follow these steps:

  1. Enable the Skip step execution if existing merge field has data toggle.
  2. In the Skip logic exit list, select the exit to direct the Flow down.
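The choice between overwriting and skipping can be pictured with the following sketch. It only models the behavior described above and makes several assumptions: the function `run_step`, the exit labels `"next"` and `"skip"`, and the rule for what counts as "has data" are all hypothetical.

```python
def run_step(merge_fields, field_name, compute, skip_if_has_data=False):
    """Model of the skip logic: return the name of the exit the Flow takes."""
    existing = merge_fields.get(field_name)
    # Assumption: None and empty containers or strings count as "no data".
    has_data = existing not in (None, "", [], {})
    if skip_if_has_data and has_data:
        return "skip"  # Step is not executed; existing value is kept
    merge_fields[field_name] = compute()  # default: overwrite the variable
    return "next"


fields = {"phiResult": {"count": 1}}

# Default behavior: the existing value is overwritten.
print(run_step(fields, "phiResult", lambda: {"count": 4}), fields)

# With the toggle enabled, the Step is skipped and the value stays as-is.
print(run_step(fields, "phiResult", lambda: {"count": 9}, skip_if_has_data=True), fields)
```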

Output example

The Step's output contains information about each detected PHI entity, including its type, confidence score, category, start and end points in the text, context, and redacted text (if applicable).

For example, using the Detect and Redact operations with default settings and the input text "The patient is John Doe, a 48-year-old teacher and resident of Seattle, Washington," the Step returns the following JSON object:

json
{
  "count": 4,
  "byOrder": [
    {
      "id": 1,
      "beginOffset": 15,
      "endOffset": 23,
      "score": 0.991275429725647,
      "text": "John Doe",
      "category": "protectedHealthInformation",
      "type": "name",
      "traits": []
    },
    {
      "id": 2,
      "beginOffset": 27,
      "endOffset": 29,
      "score": 0.9998018145561218,
      "text": "48",
      "category": "protectedHealthInformation",
      "type": "age",
      "traits": []
    },
    {
      "id": 3,
      "beginOffset": 39,
      "endOffset": 46,
      "score": 0.9306008219718933,
      "text": "teacher",
      "category": "protectedHealthInformation",
      "type": "profession",
      "traits": []
    },
    {
      "id": 4,
      "beginOffset": 64,
      "endOffset": 83,
      "score": 0.9836329817771912,
      "text": "Seattle, Washington",
      "category": "protectedHealthInformation",
      "type": "address",
      "traits": []
    }
  ],
  "redacted": "The patient is [name], a [age]-year-old [profession] and resident of [address]"
}
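Once the JSON object is available from the Merge field, downstream logic can read it like any other JSON. The sketch below parses a trimmed copy of the example above (two entities only) and shows two simple checks. The helper names `entities_by_type` and `offsets_match_text` are hypothetical; in the example, `endOffset` minus `beginOffset` equals the length of `text`, which the second helper relies on.

```python
import json

# Trimmed copy of the example output: two of the four entities, no "redacted" field.
RAW = """
{
  "count": 2,
  "byOrder": [
    {"id": 1, "beginOffset": 15, "endOffset": 23, "score": 0.991275429725647,
     "text": "John Doe", "category": "protectedHealthInformation",
     "type": "name", "traits": []},
    {"id": 2, "beginOffset": 27, "endOffset": 29, "score": 0.9998018145561218,
     "text": "48", "category": "protectedHealthInformation",
     "type": "age", "traits": []}
  ]
}
"""

result = json.loads(RAW)


def entities_by_type(result):
    """Group the detected entity texts by their PHI entity type."""
    grouped = {}
    for ent in result["byOrder"]:
        grouped.setdefault(ent["type"], []).append(ent["text"])
    return grouped


def offsets_match_text(ent):
    """Check that the offset span has the same length as the entity text."""
    return ent["endOffset"] - ent["beginOffset"] == len(ent["text"])


print(entities_by_type(result))
print(all(offsets_match_text(ent) for ent in result["byOrder"]))
```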

Error Handling

By default, the Step handles errors using a separate exit. So if any error occurs during the Step execution, the Flow proceeds down the error exit.

Note: If you disable the Handle error toggle, the Step does not handle errors. With this setup, if any error occurs during the Step execution, the Flow fails immediately after exceeding the Flow's timeout. To prevent the Flow from being suspended while continuing to handle errors in the Flow, place the Flow Error Handling Step before the main Flow logic.

Reporting

The Step reports once after its execution. You can change the Step's log level and add new tags in this section.

Log level

By default, the Step inherits its log level from the Flow's log level. You can change the Step's log level by selecting an appropriate option from the Log level list.

Tags

Tags help organize and filter session information when generating reports. You can specify the tag category, label, and value when adding a new tag.

Service dependencies

  • flow builder - v2.28.3
  • event-manager - v2.3.0
  • deployer - v2.6.0
  • comprehend provider - v0.9.0

Release notes

v1.0.0

  • Initial release