Bulk Text-Based Redaction
Overview
This technical note explains the procedure for performing bulk text-based redactions using a combination of the Lexbe eDiscovery Platform (LEP) and Adobe Acrobat Pro.
Background
LEP can identify documents containing Personally Identifiable Information (PII) through specific search options. Once documents are identified, the LEP in-line redaction tool works on a per-document, per-redaction basis. This works well for most redactions.
However, sometimes hundreds or thousands of documents need the same repetitive text or data types redacted. Adobe Acrobat Pro is the best-in-class tool for this requirement and allows text and pattern searching and bulk redaction across a document collection. A strong feature of Adobe Acrobat Pro is that redaction can be done by common PII patterns, including Social Security Numbers, Phone Numbers, Credit Cards, and Email Addresses.
Documents identified in LEP for redaction can be exported to the desktop, bulk redacted using Adobe Acrobat Pro, and reimported into the same document record in LEP, where each then appears as a redacted document on the Redaction Tab in the LEP viewer.
Recommended Workflow
1. Search in LEP to identify and code documents that need to be redacted. For PII search assistance, see Searching for Personal Identifiable Information (PII).
2. Once documents are identified and coded, export a log of the DocIds (this will become the .mergemapping files you will use in step 6).
3. Export the documents via Briefcase and be sure to name the files per DocId. For more information, see Briefcases. Download the zip file and extract it on your local computer. You will need the PDFs in one main folder.
4. Perform bulk redactions in Adobe Acrobat Pro with text searches:
a. Open Adobe Acrobat Pro
b. Click on Tools>Redact
c. Click on Mark for Redaction>Find Text
d. In the new window that displays:
i. Select ‘All PDF Documents in’ and browse to the folder in which the exported PDFs are saved
ii. Select Pattern as the ‘Search for’ option and then use the dropdown box to specify the pattern type, e.g. Social Security Number, Phone Number, etc. Other search options include Single word or phrase and Multiple words or phrase (see screenshot below).
iii. Click Search and Remove Text
e. A list of hits will appear in the results. Click ‘Check All’ and then click ‘Mark Checked for Redaction’
f. The next window to appear will be the output options:
i. Create a folder on your local computer called REDACTED. Use that File Path as the target folder
ii. Select ‘Keep Original Filenames’
iii. Check ‘Apply Redaction Marks’
iv. Click OK
g. The redacted documents will now be in the REDACTED folder
5. Ensure the REDACTED folder count is as expected
6. Perform bulk upload/merge of redacted documents into LEP; for instructions see Bulk Upload of Redacted Documents
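The count check in step 5 can be partly automated. The following is a minimal Python sketch, not part of LEP or Adobe Acrobat Pro, that compares the PDF file names in the export folder against those in the REDACTED folder. The function name and the folder paths passed to it are hypothetical, and the sketch assumes ‘Keep Original Filenames’ was selected so that names match exactly.

```python
from pathlib import Path


def list_pdf_names(folder):
    """Return the set of PDF file names found directly in a folder."""
    return {
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    }


def compare_pdf_folders(exported_dir, redacted_dir):
    """Compare two folders by PDF file name.

    Returns (missing, unexpected):
      missing    - PDFs in the export folder with no counterpart in REDACTED
      unexpected - PDFs in REDACTED that were not in the export folder
    """
    exported = list_pdf_names(exported_dir)
    redacted = list_pdf_names(redacted_dir)
    missing = sorted(exported - redacted)
    unexpected = sorted(redacted - exported)
    return missing, unexpected
```

Calling compare_pdf_folders with the extracted Briefcase folder and the REDACTED folder returns two lists. Both lists being empty only shows that the file names line up; it says nothing about whether the redactions themselves are correct.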
Cautions
We recommend using Adobe Acrobat Pro for this procedure because the PDF file type is complex and Adobe is the most reliable and advanced tool for this purpose. Mass redaction relies on many factors, including the PDF not being corrupt, the accuracy of the redaction tool, the accuracy of the underlying text, and other factors. If OCR was used in the PDF creation, this may lead to lower quality redaction results, because a text search can only find what the OCR recognized correctly. For these and other reasons, we recommend that a robust QC procedure be used to check redactions after any mass or bulk redaction procedure.
QC Procedure
The recommended manual QC process includes spot checking individual documents after redactions are applied in Adobe, as well as a spot check search and review within LEP after the redacted documents have been merged. In LEP, search for the word, phrase, or PII pattern and spot check the results. Open each document, verify the hit on the Hits tab, confirm there is a redacted document in the Redacted tab, and ensure the hit is redacted on the Redacted document.
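As a supplement to the manual spot check, residual PII can also be looked for programmatically. The sketch below is illustrative only: it assumes the text of a redacted document has already been extracted to a string by some other tool, and the regular expressions are simplified stand-ins, not the patterns Adobe Acrobat Pro or LEP actually use. The names PII_PATTERNS and find_residual_pii are hypothetical.

```python
import re

# Simplified, US-centric patterns for illustration only (assumptions,
# not the definitions used by Adobe Acrobat Pro or LEP).
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(
        r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
    ),
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
}


def find_residual_pii(text):
    """Return (label, matched_text) pairs for PII-like strings still in text.

    Results are grouped by pattern in the order the patterns are defined,
    and by position in the text within each pattern.
    """
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits
```

An empty result does not prove a document is clean: OCR errors or unusual formatting can hide PII from any pattern search, so this kind of check narrows the manual review rather than replacing it.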
Additional Information
Lexbe’s Professional Services staff can perform the bulk text-based redaction procedure upon request as a billable service. For more information, please contact professionalservices@lexbe.com.