TIFF Image (DAT) Load File Specifications

This technical note specifies the TIFF image (DAT) load files accepted for import or ingestion into the Lexbe eDiscovery Platform (LEP). For files that have been processed to Native format with load files, see Native Load File Specifications. Load file data fields should be named according to our Standard Metadata Processing and Load File Fields document.

TIFF Image (DAT) Load Files can be produced from a number of eDiscovery processing, review, and production tools, including Concordance, Summation, iPro, Relativity, and iConnect.

The load file format that LEP accepts is also known in the industry as a Concordance TIFF Load File.


General Description

A standardized TIFF Concordance load file consists of two related files:

Concordance Load File - A text-delimited file ending with the file extension DAT.  The Concordance load file references one document per line, and includes document metadata.

Opticon Cross-Reference File - A text-delimited file ending with the extension OPT.  The Opticon cross-reference file references one Bates number per line.

The load files reference the following document files:

TIFF Images - Single-page TIFF files in TIFF CCITT Group IV format, which are page-based images of processed ESI.  TIFF images are named by Bates number and end with the extension TIF.  Multi-page TIFFs are not supported.

Text files - Single-page text files containing ASCII text of processed ESI.  Text files are named by Bates number and end with the extension TXT.

Native files - Native versions of files used to generate the TIFF images and TXT files, with minimal or no ESI processing applied.


Folder Structure

The Concordance load file set must be delivered in the following folder structure:


Level 1     Level 2            Level 3          Description
LOADFILES   VOL1.DAT                            Concordance load file
LOADFILES   VOL1.TXT                            Concordance load file with tab delimiter substitutions (LEP output only)
LOADFILES   VOL1.OPT                            Opticon image cross-reference file
IMAGES      /001, /002, etc.   XYZ 00177.TIF    Single-paged TIFF images; first page of multi-page document
TEXT        /001, /002, etc.   XYZ 00177.TXT    Text file accompanying single-paged TIFF image; first page of multi-page document
ORIGINALS   /001, /002, etc.   XYZ 00177.DOCX   Original native file (entire multi-page document)
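
A quick pre-delivery check of this layout can be sketched in Python as follows. This is a minimal sketch, not part of LEP: the function name missing_entries is hypothetical, and it assumes that all four top-level folders plus VOL1.DAT and VOL1.OPT are required, while VOL1.TXT is skipped because it is listed as LEP output only.

```python
from pathlib import Path

# Folder and file names taken from the folder structure table above.
# Assumption: all four top-level folders are treated as mandatory.
REQUIRED_DIRS = ("LOADFILES", "IMAGES", "TEXT", "ORIGINALS")
REQUIRED_LOAD_FILES = ("VOL1.DAT", "VOL1.OPT")


def missing_entries(production_root):
    """Return relative paths of required folders and load files that are absent."""
    root = Path(production_root)
    missing = [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
    for file_name in REQUIRED_LOAD_FILES:
        if not (root / "LOADFILES" / file_name).is_file():
            missing.append(f"LOADFILES/{file_name}")
    return missing
```

An empty list means every expected folder and load file was found under the production root.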



File Naming

Native files are named by the Bates number of the first page, with an optional Confidential suffix, and are located inside the ORIGINALS folder in sub-folders of up to 5,000 files each. The sub-folders are named with three digits, starting with '001'.

For example:
ORIGINALS/001/XYZ 000177.xlsx
ORIGINALS/001/XYZ 000180 Confidential.docx
ORIGINALS/001/XYZ 000181.jpg
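
The naming scheme shown in these examples can be sketched in Python as follows. The helper names subfolder_for and native_path are hypothetical, and the sketch assumes that files fill the sub-folders sequentially (the first 5,000 files in 001, the next 5,000 in 002, and so on), which this note does not state explicitly.

```python
FILES_PER_FOLDER = 5000  # per-folder limit stated in this specification


def subfolder_for(file_index):
    """Return the three-digit sub-folder name for a zero-based file index."""
    if file_index < 0:
        raise ValueError("file_index must be non-negative")
    return f"{file_index // FILES_PER_FOLDER + 1:03d}"


def native_path(file_index, bates_number, extension, confidential=False):
    """Build the relative path of a native file inside ORIGINALS."""
    suffix = " Confidential" if confidential else ""
    folder = subfolder_for(file_index)
    return f"ORIGINALS/{folder}/{bates_number}{suffix}.{extension}"
```

For instance, native_path(1, "XYZ 000180", "docx", confidential=True) yields ORIGINALS/001/XYZ 000180 Confidential.docx, and subfolder_for(5000) yields 002.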


Opticon Image Cross-Reference File Format

The Opticon image cross-reference file should be named VOL1.OPT and located in the LOADFILES folder. Each Bates-stamped page (TIFF image) should have a corresponding entry (new line) in the file. Entries (one per Bates number) are separated by Windows-style (CR+LF) line breaks. Each entry has the following comma-delimited format: Bates Number, Volume Label, Image File Path, Document break, Page Count, Empty, Empty.

Field Name        Example                     Description
Bates Number      XYZ 000177
Volume Label      PROD_IMG001
Image File Path   IMAGES\030\XYZ 000177.TIF   Relative image file path
Document break    Y                           Y if a new document is starting and blank otherwise
Page Count        10                          Number of pages converted to TIF. Field populated on the first page of a document
Empty                                         Not used
Empty                                         Not used


Example entries:
XYZ 000177,PROD_IMG001,IMAGES\030\XYZ 000177.TIF,Y,3,,
XYZ 000178,PROD_IMG001,IMAGES\030\XYZ 000178.TIF,,,,
XYZ 000179,PROD_IMG001,IMAGES\030\XYZ 000179.TIF,,,,
XYZ 000180,PROD_IMG001,IMAGES\030\XYZ 000180.TIF,Y,1,,
XYZ 000181,PROD_IMG001,IMAGES\030\XYZ 000181.JPG,Y,1,,
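
Entries in this format can be read and grouped into documents with a few lines of Python. The following is a minimal sketch rather than LEP's own parser: the names OptEntry, parse_opt_line, and group_documents are hypothetical, and it assumes that no field contains a comma, so a plain split is sufficient.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OptEntry:
    bates_number: str
    volume_label: str
    image_path: str
    document_break: bool        # True when the entry starts a new document
    page_count: Optional[int]   # populated only on the first page of a document


def parse_opt_line(line):
    """Parse one OPT entry; assumes fields themselves contain no commas."""
    fields = line.rstrip("\r\n").split(",")
    if len(fields) != 7:
        raise ValueError(f"expected 7 comma-delimited fields, got {len(fields)}")
    bates, volume, path, doc_break, pages = fields[:5]
    return OptEntry(bates, volume, path, doc_break == "Y",
                    int(pages) if pages else None)


def group_documents(entries):
    """Group page entries into documents, starting a new one at each 'Y'."""
    documents = []
    for entry in entries:
        if entry.document_break or not documents:
            documents.append([])
        documents[-1].append(entry)
    return documents
```

Applied to the five example entries above, group_documents returns three documents of 3, 1, and 1 pages. Comparing each document's length with the page count on its first entry is a simple consistency check.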


Concordance Load File Format

The Concordance load file is named VOL1.DAT and should be located in the LOADFILES folder.

The first line contains headers using the data field names listed in the Standard Metadata Processing and Load File Fields document. The text file should be delimited using the following character substitutions:

Text Character   ASCII Substitution
Comma            020
Quote            254
New line         174
Multi-Value      059
Nested Values    092
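
To illustrate how the substitutions listed above are applied, the following Python sketch splits DAT lines into fields. It is not LEP's implementation: the function names parse_dat_line and read_dat are hypothetical, and it assumes the file has already been decoded so that the delimiters appear as the characters with code points 20, 254, and 174. The file encoding is not stated in this note.

```python
FIELD_DELIM = chr(20)   # "Comma" substitution: separates fields
QUOTE = chr(254)        # "Quote" substitution: wraps each field value
NEWLINE = chr(174)      # "New line" substitution: line break inside a value


def parse_dat_line(line):
    """Split one DAT line into field values and restore embedded line breaks."""
    values = []
    for raw in line.rstrip("\r\n").split(FIELD_DELIM):
        # Remove the surrounding quote characters when both are present.
        if len(raw) >= 2 and raw.startswith(QUOTE) and raw.endswith(QUOTE):
            raw = raw[1:-1]
        values.append(raw.replace(NEWLINE, "\n"))
    return values


def read_dat(lines):
    """Return one dict per document, keyed by the header names on the first line."""
    rows = iter(lines)
    header = parse_dat_line(next(rows))
    return [dict(zip(header, parse_dat_line(row)))
            for row in rows if row.strip("\r\n")]
```

Multi-value fields can then be separated with value.split(";"), since ASCII 059 is the semicolon. Nested values use ASCII 092, the backslash.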


Each production to be loaded to LEP (native files, text files, and load file) should be 50GB or less prior to compression. A production larger than 50GB should be split into smaller volumes prior to ingestion. The number of Native files per directory should be limited to 5,000.
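
Both limits can be checked before delivery with a short script. The Python sketch below is illustrative only: the function name check_production is hypothetical, 50GB is interpreted here as 50 x 1024^3 bytes (an assumption, since the note does not define the unit), and the per-directory count is applied to every directory rather than only to those holding Native files.

```python
import os

MAX_VOLUME_BYTES = 50 * 1024**3   # assumption: "50GB" read as 50 GiB
MAX_FILES_PER_DIR = 5000          # per-directory limit from this specification


def check_production(root, max_files_per_dir=MAX_FILES_PER_DIR):
    """Return (total size in bytes, list of directories over the file limit)."""
    total_bytes = 0
    crowded_dirs = []
    for dir_path, _dir_names, file_names in os.walk(root):
        if len(file_names) > max_files_per_dir:
            crowded_dirs.append(dir_path)
        for name in file_names:
            total_bytes += os.path.getsize(os.path.join(dir_path, name))
    return total_bytes, crowded_dirs
```

A production passes when the returned total is at most MAX_VOLUME_BYTES and the list of crowded directories is empty. Otherwise it should be split into smaller volumes or sub-folders.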