Assisted Review+

The following description will guide you through using Assisted Review+ in the Lexbe eDiscovery Platform.

Step 1: Setting up an Assisted Review Job in the eDiscovery Platform

1. Within the eDiscovery Platform, select ‘Assisted Review’ under the ‘Discovery’ tab.

2. To start a new assisted review task, select ‘Create New Job.’ Add a job title and notes as desired, then select 'Create.'


3. Change the number of documents in your seed set and the number of documents in each control set by selecting ‘Edit,’ then ‘Update’ once complete.


Step 2: Adding Documents to the Assisted Review Job

1. Navigate to the ‘Browse’ tab to find the documents you will be adding to this Assisted Review Job.

2. Apply any filters needed until you are viewing only the documents you wish to add to the Assisted Review job.  This will typically be all of the documents in a case.

3. Click ‘Select All’ at the top of the ‘Browse’ list to select all documents.

4. On the left side of the screen, under the ‘Assisted Review Jobs’ section, select the Assisted Review job you wish to add documents to from the drop-down menu.

5. Confirm the number of documents is correct and click ‘Add Selected Docs’ in the pop-up confirmation screen.

Step 3: Reviewing/Training the Seed Set

After setting up your Assisted Review job, determining the size of your seed set, and adding documents to the job, it is time to review the seed set documents.

Accessing the Seed Set for Review/Training

1. Navigate back to ‘Assisted Review’ under the ‘Discovery’ tab, and select your Assisted Review job from the drop-down menu on the left.

2. Select ‘Next’ in the upper right corner of the screen to proceed to ‘2. Seed Set.’

3. Selecting ‘View’ takes users to the Browse screen which is populated by the documents contained in the seed set.

4. With the documents from the seed set populated in the Browse screen, the reviewer may begin their review by clicking the Title of the first document, which opens the document in the document viewer.

5. Navigate to the ‘DISC’ tab within the document viewer and ensure that the ‘Propagate Coding’ checkbox is NOT selected. The ‘Auto Advance’ checkbox may be selected if desired.

6. Select the applicable type of responsiveness under the ‘Coding’ tab. Please note, all documents must be coded as Responsive or Non-Responsive before moving to the Control Set step in the Assisted Review process.

7. Clicking 'Save' will save your changes, and, if ‘Auto Advance’ is selected, you will be advanced to the next document.

8. When you are finished coding the entire seed set, return to your Assisted Review job by choosing ‘Assisted Review’ under the ‘Discovery’ tab and selecting your Assisted Review job from the drop-down menu.

Step 4: Review Control Sets

1. Once all documents within the seed set have been coded, it is time to apply the algorithm to a control set.

2. Selecting ‘Next’ on the seed set display will automatically generate a control set.

3. When the assisted review algorithm has completed its automated coding of the control set, you will be advanced to the control set display. Here you will find your first control set, the number of documents it contains, and several columns containing Assisted Review metrics.

4. Selecting ‘Review’ will direct you to the Browse screen populated by documents from the first control set.

5. Select the first control set document by clicking the document title and review the coding that has been automatically applied in the ‘DISC’ tab of the document by the Assisted Review+ algorithm.

6. To overturn how the assisted review algorithm reviewed a document, simply select the appropriate designation under the ‘DISC’ tab. If the coding is already correct, check the ‘Document Reviewed by Me’ box. Ensure ‘Propagate Coding’ is unchecked, and save the document to advance to the next.

7. When you are finished reviewing a control set, select ‘Add Control Set’ to release another set of documents to review. Lexbe uses technology that makes the next control set available immediately, saving reviewers the traditional wait for the next control set to be ready. If the next control set does not generate immediately, simply refresh the page. **Do not click 'Next,' as this will apply the algorithm to all documents prematurely.**

8. Continue reviewing control sets until the F-score has stabilized. Stabilization is an indication that the metrics used to evaluate assisted review have also stabilized and will likely be unaffected by continued review of control sets. Continuing to review control sets will serve only to reduce the margin of error associated with the F-score.

Step 5: Apply Assisted Review+ to the Remaining Documents

1. Once the F-score has stabilized after reviewing control sets, select ‘Next’ to apply Assisted Review+ to the remaining documents in the collection.

2. After the remaining documents have been reviewed by the algorithm, a report detailing the outcome of your application of Assisted Review+ is automatically generated.

Step 6: Viewing the Assisted Review Report

1. Select ‘Download’ on the report display.

2. Open the assisted review report using Excel.

Understanding Your Results

Following the application of Assisted Review+, an Assisted Review Report will be generated. This report describes the procedures used to generate the computer-assisted review results.

Assisted Review Report

The following is a breakdown of the key elements of the Assisted Review Report.

1. Assisted Review Case Information: This area of the report identifies the name of the case, the applied title, the date and time the assisted review process was completed, the email address associated with the user who ran assisted review, and any comments added to the report.

2. Assisted Review Graph: This chart is a visual representation of key assisted review metrics and results. The x-axis identifies the number of control sets that have been reviewed, and the y-axis is a proportional measure from 0 (0%) to 1 (100%). Three lines appear on the graph: a blue line representing the F-score, and two red lines representing the upper and lower bounds of the margin of error. This graph allows you to visualize how the margin of error converged on the stabilizing F-score as control sets were reviewed.

3. Predictive Coding Results: This section of the report quantifies the proportion of responsive and nonresponsive coding through the stages of assisted review. The number and proportion of documents coded responsive and nonresponsive in the seed set are 28 (56%) and 22 (44%), respectively. The number and proportion of documents coded responsive and nonresponsive in the control set are 40 (57%) and 30 (43%), respectively.

4. Predictive Coding Statistics: The last section of the report identifies the final F-score (93%), precision measure (87%), recall measure (100%), and margin of error (±18%). This section summarizes the final statistical measures available to evaluate the outcome of the predictive coding process on your case.
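As a sanity check, the reported F-score can be reproduced from the reported precision and recall. The sketch below uses the standard harmonic-mean (F1) formula as an assumption about how the score is combined; the platform's internal weighting may differ.

```python
# Reproduce the reported F-score from the precision and recall shown in the
# example report above. The harmonic-mean (F1) combination is an assumption.
precision = 0.87
recall = 1.00

f_score = 2 * precision * recall / (precision + recall)
print(round(f_score, 2))  # 0.93, matching the reported 93%
```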

Frequently Asked Questions

What content is reviewed by the Assisted Review algorithm?

Only the OCR content from the PDF version of the document is reviewed by the algorithm.

What is a stabilized F-score, and how will you know it has been reached?

The F-score is the harmonic mean of precision and recall. An F-score of 1.0 represents perfect precision and recall. A stabilized F-score is reached when you receive a similar F-score each time you review a control set. You will want to receive a similar F-score across several control sets in succession to consider it stabilized.

Is there a certain number of Control Sets that should be reviewed to achieve a stabilized score?

Stabilization is highly dependent on the data set. As such, there is no specific or predetermined number of control sets that will provide you with a stabilized F-score. It is possible that an F-score may stabilize at an undesirable value, which would indicate that the data set is likely not appropriate for TAR.

Key Terms

Seed Set. The Seed Set is created by compiling a random sampling of documents from the entire set.  The seed set is then reviewed by attorney(s) to serve as the training foundation for the predictive coding algorithm to automatically review the remainder of the case documents. The predictive coding outcomes are heavily determined by the accuracy of the seed set review. The size of the seed set is determined by the number of documents in the entire data set. The seed set should be between approximately 2,400 and 4,800 documents.

Control Set. The control set serves a quality control function. Documents in the control set are released in batches of the size established when the Assisted Review job was created. Reviewers either confirm or overturn how each document has been coded (i.e., changing responsive to non-responsive and vice versa). Overturning a document that has been automatically coded as non-responsive negatively affects the recall element of the F-score. The frequency of false-negative overturns indicates how effectively the front-end manual review has trained the predictive coding algorithm.

F2 Score. F scores are generally determined by considering the precision and recall of the predictive coding algorithm. Precision is a measure of how often the algorithm accurately predicts a document to be responsive. Recall is a measure of what percentage of the responsive documents in a data set have been found by the algorithm. A low precision score indicates an abundance of false-positive identifications, or over-delivery, but a high precision score does not mean that all the responsive documents have been found. A low recall score is an indication of under-delivery, while a high recall score indicates that most of the responsive documents in the data set have been found. The Lexbe eDiscovery Platform uses an F2 score, which weights recall more heavily than precision.
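The general F-beta formula makes the weighting explicit: beta = 1 weights precision and recall equally, while beta = 2 (the F2 score) counts recall more heavily. The sketch below uses the standard textbook formula with hypothetical precision and recall values; it is not Lexbe's internal implementation.

```python
# Standard F-beta formula: beta = 1 weights precision and recall equally;
# beta = 2 (the F2 score) weights recall more heavily than precision.
def f_beta(precision, recall, beta):
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical values chosen to show the effect of the weighting: with
# recall higher than precision, F2 exceeds F1.
p, r = 0.60, 0.90
print(round(f_beta(p, r, 1), 3))  # 0.72
print(round(f_beta(p, r, 2), 3))  # 0.818
```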

Margin of Error. The margin of error is a statistical measure of uncertainty based on the possibility that the data sampled was not an accurate representation of the entire data set, assuming a normal distribution of documents. As the amount of data sampled increases, the margin of error is reduced. In assisted review, the margin of error decreases as more control sets are reviewed to verify that the algorithm correctly coded the documents. The margin of error should be interpreted along with the final F-score. For example, if there is a final F-score of 0.75 and a margin of error of ±5%, then there is 95% certainty that the harmonic mean of the recall and precision in this instance of assisted review is between 0.7 and 0.8.
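The relationship between sample size and margin of error can be sketched with the standard formula for a sampled proportion at 95% confidence. This is an illustrative approximation using hypothetical numbers, not the platform's exact computation.

```python
import math

# Approximate 95% margin of error for a sampled proportion:
# moe = z * sqrt(p * (1 - p) / n), with z = 1.96 for 95% confidence.
def margin_of_error(p, n, z=1.96):
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical: an observed proportion of 0.75 over growing sample sizes.
for n in (100, 400, 1600):
    print(n, round(margin_of_error(0.75, n), 3))
# The margin shrinks as n grows: roughly 0.085, 0.042, 0.021.
```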

More Information

For more information about technology assisted review, please consult Lexbe Assisted Review: Background & Key Concepts.