NearDup Grouping for Faster Review

This technical note provides step-by-step instructions to speed document review using NearDup Groupings in the Lexbe eDiscovery Platform (LEP).


Near-duplicates are documents whose text is substantially similar but not identical, for example versions of the same document that differ by a few words, sentences, or paragraphs. Our Near Duplication technology identifies and groups documents that are at least 50% similar in text content. See Near Duplication for more information.
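To make the 50% similarity idea concrete, here is a minimal Python sketch. It is not the algorithm LEP uses, which is not described in this note. It assumes one common way to measure text similarity, the Jaccard overlap of word shingles, and the function names, the shingle size k, and the sample sentences are all hypothetical.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def is_near_duplicate(a, b, threshold=0.5, k=3):
    """True when the texts differ but meet the similarity threshold.

    The 0.5 default mirrors the 50% figure in this note; treating it as a
    Jaccard threshold is an assumption made for illustration only.
    """
    return a != b and jaccard(a, b, k) >= threshold


draft_1 = "The parties agree to meet on Monday to review the draft contract terms"
draft_2 = "The parties agree to meet on Tuesday to review the draft contract terms"
print(is_near_duplicate(draft_1, draft_2))
```

Under these assumptions the two drafts, which differ by a single word, are flagged as near-duplicates, while a document compared with an identical copy of itself is not, since exact matches are excluded.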

NearDup group numbers are identifiers that indicate which documents belong to the same group. The numbers are assigned randomly, and there is no relationship between one group and another. Group numbers may not be continuous (there may be gaps in the numbering) because groups can be consolidated during the grouping process.

NearDup Grouping is a resource-intensive process and can take considerable computing resources on a large case. The exact run time varies depending on the documents in the case and other factors, but we can usually complete a run within several days of an order. If Professional Services reruns NearDup Groupings in a case after new documents are added, all documents must be processed again. Group numbers will be reassigned, so existing groups and numbers will not be retained, but the groupings will be consistent.
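The gaps in group numbering can be illustrated with a short Python sketch. This is an assumption about how consolidation might work, not a description of LEP internals: each document starts with a provisional group number, and groups linked by a similar pair are merged with a union-find structure, so some provisional numbers disappear. The function name and the sample document IDs are hypothetical.

```python
def group_near_duplicates(doc_ids, similar_pairs):
    """Give each document a provisional group number, then consolidate
    groups that are linked by a similar pair.

    Returns a dict mapping doc_id to its final group number.
    """
    # Provisional numbering: one group per document, numbered from 1.
    provisional = {doc: n for n, doc in enumerate(doc_ids, start=1)}
    parent = {n: n for n in provisional.values()}

    def find(n):
        # Follow parent links to the surviving group, shortening the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in similar_pairs:
        ra, rb = find(provisional[a]), find(provisional[b])
        if ra != rb:
            # Consolidate: the higher-numbered group folds into the lower one.
            parent[max(ra, rb)] = min(ra, rb)

    return {doc: find(n) for doc, n in provisional.items()}


groups = group_near_duplicates(
    ["A", "B", "C", "D", "E"],
    [("A", "B"), ("B", "C"), ("D", "E")],
)
print(sorted(set(groups.values())))
```

In this sketch the surviving group numbers are 1 and 4. Numbers 2, 3, and 5 were consolidated away, which is the kind of gap the note describes.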

Steps Involved in Working with NearDup Groups

Step 1. Request a quote from your sales consultant for Professional Services to run NearDup Groupings on a case. Once the run completes, the case database is updated with the near-duplicate groupings of documents, and you receive an Excel spreadsheet report entitled NearDup Grouping Report.

Step 2. Open the NearDup Grouping Report spreadsheet and look at the Large Groups Ordered or All Groups Ordered sheet to get group numbers sorted by size.

Step 3. Select a NearDup Group ID to evaluate from the Excel spreadsheet (e.g., working from largest to smallest).

Step 4. Go to Browse in LEP and show the following fields: Title, Pages, Words, Master Date, NearDup Groupings, and any others of interest.

Step 5. Click Filters > Select Filters, enter the desired NearDup Grouping ID in the NearDup Grouping field, and apply the filter (e.g., NearDup Group No. 62).

Step 6. Browse will now show all documents in the selected NearDup Grouping.

Step 7. Open sample documents in the document viewer by clicking the Title field, and check them.

Step 8. Return to Browse, select all documents in the filter (top of table), and then use Multi Doc Edit to assign them to one of the fields: Responsive, Non-Responsive, or Needs Further Review.

Step 9. The documents in the NearDup Grouping will be coded for the review.

Step 10. Then proceed to the next NearDup Grouping number (by size) and repeat until all are assigned.

Step 11. Progress can be checked anytime by running Case Assessment reports on the Responsiveness tag.
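The review loop in the steps above (take the largest group, code it in bulk, move to the next, check progress) can be modeled with a short Python sketch. It runs on in-memory data only and does not call any LEP API; the function names, document IDs, and group numbers are hypothetical, and the coding values are the three fields named in Step 8.

```python
from collections import Counter

CODINGS = {"Responsive", "Non-Responsive", "Needs Further Review"}


def groups_by_size(doc_groups):
    """Return (group_id, document_count) pairs, largest group first.

    Ties are broken by group ID so the order is stable.
    """
    counts = Counter(doc_groups.values())
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def code_group(doc_groups, coding, group_id, value):
    """Apply one coding value to every document in a group (bulk edit)."""
    if value not in CODINGS:
        raise ValueError(f"unknown coding: {value}")
    for doc, group in doc_groups.items():
        if group == group_id:
            coding[doc] = value


def review_progress(doc_groups, coding):
    """Fraction of grouped documents that have been coded so far."""
    if not doc_groups:
        return 0.0
    coded = sum(1 for doc in doc_groups if doc in coding)
    return coded / len(doc_groups)


# Hypothetical case: document ID mapped to NearDup group number.
doc_groups = {"A": 62, "B": 62, "C": 62, "D": 62,
              "E": 17, "F": 17, "G": 90, "H": 90}
coding = {}

order = groups_by_size(doc_groups)
largest_group = order[0][0]
code_group(doc_groups, coding, largest_group, "Responsive")
print(order, review_progress(doc_groups, coding))
```

Working from the largest group down means each bulk coding decision covers as many documents as possible, which is why the steps suggest ordering by size.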

We offer eDiscovery Consulting and Professional Services (billed hourly) as needed.  Contact your sales consultant for a quote.