Load Files

Background

Load files are plain text files that use delimiter characters (commas, tabs, or special characters) to separate the metadata items they contain. A load file records and associates data items for each document, such as the Bates or control number, the beginning and ending Bates numbers of the document, the beginning and ending Bates numbers of email attachments, dates (commonly the date and time sent and received for email, and the date last modified for other native files), the sender or author, the recipient, and the custodian. Fields also reference the paths to single-page TIFF images, single-page text files, and PDF and native file versions. Extracted or OCR text can be delivered as separate files referenced in the load file, or sometimes included in the load file itself.
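As an illustration of how delimiters separate metadata items, the following Python sketch reads a small delimited load file into one record per document. It assumes the delimiters commonly used as Concordance defaults (ASCII 20 between fields and the thorn character, ASCII 254, around each value); these are not stated in this article and a given production may use others. The field names BEGDOC, ENDDOC, and CUSTODIAN and the sample values are hypothetical.

```python
import csv
import io

# Assumed Concordance-style defaults; adjust to match the actual file.
FIELD_DELIMITER = "\x14"   # ASCII 20, separates fields
TEXT_QUALIFIER = "\xfe"    # ASCII 254 (thorn), wraps each value


def read_load_file(handle, delimiter=FIELD_DELIMITER, qualifier=TEXT_QUALIFIER):
    """Return the rows of a delimited load file as dicts keyed by the header row."""
    reader = csv.DictReader(handle, delimiter=delimiter, quotechar=qualifier)
    return [dict(row) for row in reader]


def make_line(values):
    """Build one sample line using the assumed delimiter and qualifier."""
    return FIELD_DELIMITER.join(f"{TEXT_QUALIFIER}{v}{TEXT_QUALIFIER}" for v in values)


# Hypothetical sample data standing in for a real load file.
sample = "\n".join([
    make_line(["BEGDOC", "ENDDOC", "CUSTODIAN"]),
    make_line(["ABC0000001", "ABC0000003", "Smith, Jane"]),
    make_line(["ABC0000004", "ABC0000004", "Doe, John"]),
]) + "\n"

records = read_load_file(io.StringIO(sample))
for record in records:
    print(record["BEGDOC"], record["ENDDOC"], record["CUSTODIAN"])
```

For a file on disk, the same function can be given an open file handle, for example open(path, newline="", encoding=...) with whatever encoding the file was produced in. Because the field separator is not a comma, values such as "Smith, Jane" stay in a single field.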

The Lexbe eDiscovery Platform (LEP) Production Directory Structure

LEP supports Concordance (DAT) and Summation (DII) load files and also has its own Excel-based load file, which is easier to view and work with than DAT or DII files. When producing load files as part of a production, LEP supports an extensive list of standard processing fields (populated when the data is available), and it can produce to or load from the standard Concordance DAT and Summation DII formats. A downloaded production includes the following folder structure:

IMAGES: This folder contains image files and is only present with a Standard plus TIFF production.

LOADFILES: The folder designated for load files. In a large production where the documents are split into separate volumes compressed into the same ZIP file, the LOADFILES sub-folder may be in any of the ZIP file volumes.

ORIGINALS: The folder designated for all the native files (Word, Excel, JPG, PNG, etc.).


PDF: This folder contains the PDF versions of the files and is also included in Standard plus TIFF productions.


TEXT: The folder designated for all document-based text files.
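After extracting a downloaded production, the folder structure above can be checked with a short script. The following Python sketch is one possible approach, not an LEP tool: the function name summarize_production is hypothetical, and the demonstration builds a temporary directory rather than reading a real production. Missing folders are reported rather than treated as errors, since some folders appear only in certain production types or volumes.

```python
import tempfile
from pathlib import Path

# Top-level folder names as listed in this article.
EXPECTED_FOLDERS = ("IMAGES", "LOADFILES", "ORIGINALS", "PDF", "TEXT")


def summarize_production(root):
    """Map each expected folder name to its file count, or None if the folder is absent."""
    root = Path(root)
    summary = {}
    for name in EXPECTED_FOLDERS:
        folder = root / name
        if folder.is_dir():
            # Count files at any depth beneath the folder.
            summary[name] = sum(1 for p in folder.rglob("*") if p.is_file())
        else:
            summary[name] = None
    return summary


# Demonstration on a throwaway directory (stand-in for an extracted ZIP).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp)
    (demo / "LOADFILES").mkdir()
    (demo / "LOADFILES" / "VOL001.dat").write_text("")
    (demo / "TEXT").mkdir()
    demo_summary = summarize_production(demo)

for name, count in demo_summary.items():
    print(name, "missing" if count is None else f"{count} file(s)")
```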

Load File Sub-Folder Contents

The LOADFILES sub-folder contains the following:

VOL001.dat: Concordance-style load file.

VOL001.dii: Summation-style load file.

VOL001.lexbeloadfile.xlsx: LEP's load file.

VOL001.opt: Opticon image cross-reference file.

VOL001.tsv: Tab-separated values (TSV) load file. TSV is a simple, widely supported format that is often used to move tabular data between programs.

VOL001.txt: Concordance load file with tab delimiter substitutions.

These load file formats include variations that can handle TIFF, PDF, and native productions.
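The VOL001.opt cross-reference file listed above ties page images back to documents. This article does not describe its layout, so the Python sketch below assumes the commonly used Opticon convention of seven comma-separated columns per page image: image key, volume, image path, document break (Y on the first page of a document), folder break, box break, and page count. The function name and the sample lines are hypothetical.

```python
import csv
import io
from collections import namedtuple

# Assumed seven-column Opticon layout; verify against the actual file.
OptRow = namedtuple(
    "OptRow",
    "image_key volume path doc_break folder_break box_break page_count",
)


def group_pages_by_document(handle):
    """Group image paths under the image key of each document's first page."""
    documents = {}
    current = None
    for fields in csv.reader(handle):
        if not fields:
            continue  # skip blank lines
        # Pad or trim to seven columns so short lines do not raise errors.
        row = OptRow(*(fields + [""] * 7)[:7])
        if row.doc_break.upper() == "Y" or current is None:
            current = row.image_key
            documents[current] = []
        documents[current].append(row.path)
    return documents


# Hypothetical sample: a two-page document followed by a one-page document.
sample = (
    "ABC0000001,VOL001,IMAGES\\001\\ABC0000001.tif,Y,,,2\n"
    "ABC0000002,VOL001,IMAGES\\001\\ABC0000002.tif,,,,\n"
    "ABC0000003,VOL001,IMAGES\\001\\ABC0000003.tif,Y,,,1\n"
)

pages = group_pages_by_document(io.StringIO(sample))
for first_page, paths in pages.items():
    print(first_page, len(paths), "page(s)")
```

Grouping by the document-break flag, rather than trusting the page count column alone, keeps the sketch usable on files where the count is left blank.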