TIFF Image (DAT) Load File Specifications
This technical note details the specifications for accepting TIFF image (DAT) load files, as well as files that have been processed to Native format with load files (see Native Load File Specifications), for import or ingestion into the Lexbe eDiscovery Platform (LEP). Load file data fields should be named according to our Standard Metadata Processing and Load File Fields document.
TIFF Image (DAT) Load Files can be produced from a number of eDiscovery processing, review, and production tools, including Concordance, Summation, iPro, Relativity, and iConnect.
The load file format that LEP accepts is also known in the industry as a Concordance TIFF Load File.
A standardized TIFF Concordance load file consists of two related files:
Concordance Load File - A text-delimited file ending with the file extension DAT. The Concordance load file references one document per line, and includes document metadata.
Opticon Cross-Reference File - A text-delimited file ending with the extension OPT. The Opticon cross-reference file references one Bates number per line.
The load files reference the following document files:
TIFF Images - Single-page TIFF files in TIFF CCITT Group IV format, which are page-based images of processed ESI. TIFF images are named by Bates number and end with the extension TIF. Multi-page TIFFs are not supported.
Text files - Single-page text files containing the ASCII text of processed ESI. Text files are named by Bates number and end with the extension TXT.
Native files - Native versions of files used to generate the TIFF images and TXT files, with minimal or no ESI processing applied.
The Concordance load file grouping is located within the following folder structure and must be present:
Native files are named by the Bates number of the first page, followed by an optional Confidential suffix, and are located inside the ORIGINALS folder in sub-folders of up to 5,000 files each. Sub-folder names use three digits and start with ‘001’. For example:
ORIGINALS/001/XYZ 000180 Confidential.docx
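The naming rule above can be sketched in a few lines of Python. This is only an illustration: the function name native_path and its parameters are hypothetical, and it assumes that sub-folder numbering starts at 001 (as in the example path) and that files fill each sub-folder in order.

```python
FILES_PER_FOLDER = 5000  # per-directory limit stated in this specification


def native_path(index, bates, ext, confidential=False):
    """Build the ORIGINALS path for the index-th native file (0-based).

    Assumption: sub-folders are numbered 001, 002, ... and are filled
    in order, 5,000 files at a time.
    """
    folder = index // FILES_PER_FOLDER + 1
    suffix = " Confidential" if confidential else ""
    return f"ORIGINALS/{folder:03d}/{bates}{suffix}.{ext}"


print(native_path(0, "XYZ 000180", "docx", confidential=True))
```

With these assumptions, the first 5,000 files land in ORIGINALS/001 and file number 5,001 starts ORIGINALS/002.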
Opticon Image Cross-Reference File Format
The Opticon image cross-reference file should be named VOL1.OPT and located in the LOADFILES folder. Each Bates-stamped page (TIFF image) should have a corresponding entry (a new line) in the Opticon image cross-reference file. The file uses Windows line breaks (CR LF) between entries, with one Bates number per line. Each entry uses comma delimiters in the following format: Bates Number, Volume Label, Image File Path, Document Break, Page Count, Empty, Empty. For example:
XYZ 000177,PROD_IMG001,IMAGES\030\XYZ 000177.TIF,Y,3,,
XYZ 000178,PROD_IMG001,IMAGES\030\XYZ 000178.TIF,,,,
XYZ 000179,PROD_IMG001,IMAGES\030\XYZ 000179.TIF,,,,
XYZ 000180,PROD_IMG001,IMAGES\030\XYZ 000180.TIF,Y,1,,
XYZ 000181,PROD_IMG001,IMAGES\030\XYZ 000181.JPG,Y,1,,
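As a rough check of the format described above, the following Python sketch parses entries in the field order given in this note and compares each declared page count with the number of entries that follow the document break. The names OpticonEntry, parse_opticon_line, and check_page_counts are hypothetical, and the sketch assumes that no field contains a comma.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpticonEntry:
    bates: str
    volume: str
    image_path: str
    doc_break: bool            # True when the line starts a new document
    page_count: Optional[int]  # only set on the first page of a document


def parse_opticon_line(line):
    """Parse one entry: Bates, Volume, Path, Doc Break, Page Count, Empty, Empty."""
    fields = line.rstrip("\r\n").split(",")
    if len(fields) != 7:
        raise ValueError(f"expected 7 comma-separated fields, got {len(fields)}")
    bates, volume, image_path, doc_break, page_count = fields[:5]
    return OpticonEntry(
        bates, volume, image_path,
        doc_break == "Y",
        int(page_count) if page_count else None,
    )


def check_page_counts(entries):
    """Return the first-page Bates numbers of documents whose declared
    page count differs from the number of entries that belong to them."""
    problems = []
    current, seen = None, 0
    for entry in entries:
        if entry.doc_break:
            if current is not None and current.page_count != seen:
                problems.append(current.bates)
            current, seen = entry, 1
        else:
            seen += 1
    if current is not None and current.page_count != seen:
        problems.append(current.bates)
    return problems
```

Running check_page_counts over the parsed sample entries above returns an empty list, since the three-page document and both one-page documents match their declared counts.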
Concordance Load File Format
The Concordance load file is named VOL1.DAT and should be located in the LOADFILES folder.
The first line contains headers using the data field names listed in the Standard Metadata Processing and Load File Fields document. The text file should be delimited using the following character substitutions:
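As an illustration of reading a header-first delimited file of this kind, the Python sketch below takes the field separator and text qualifier as parameters. The default values shown (ASCII 20 as separator and þ, ASCII 254, as qualifier) are the common Concordance defaults and are an assumption here, not values confirmed by this specification. The function name parse_dat is hypothetical, and the sketch does not handle multi-line field values.

```python
def _unquote(value, qualifier):
    """Remove one qualifier character from each end, if both are present."""
    if len(value) >= 2 and value.startswith(qualifier) and value.endswith(qualifier):
        return value[1:-1]
    return value


def parse_dat(text, field_sep="\x14", qualifier="\u00fe"):
    """Return one dict per document, keyed by the header names on line 1.

    Assumption: the delimiter defaults are the usual Concordance ones;
    pass the delimiters actually used by the load file if they differ.
    """
    lines = [ln for ln in text.replace("\r\n", "\n").split("\n") if ln]
    if not lines:
        return []
    headers = [_unquote(h, qualifier) for h in lines[0].split(field_sep)]
    records = []
    for line in lines[1:]:
        values = [_unquote(v, qualifier) for v in line.split(field_sep)]
        if len(values) != len(headers):
            raise ValueError(f"field count mismatch in line: {line!r}")
        records.append(dict(zip(headers, values)))
    return records
```

Keeping the delimiters as parameters means the same function can be reused if a producing tool was configured with non-default characters.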
Each production to be loaded to LEP (native files, text files, and load file) should be no larger than 50GB prior to compression. A production larger than 50GB should be split into volumes of 50GB or less prior to ingestion. Note that the number of Native files per directory should be limited to 5,000.
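One simple way to plan such a split is to walk the documents in production order and start a new volume whenever the next document would push the running total past the limit. The Python sketch below does this; the function name split_into_volumes is hypothetical, each item is assumed to be a whole document (so that no document is divided between volumes), and 1GB is assumed to mean 1024**3 bytes.

```python
MAX_VOLUME_BYTES = 50 * 1024**3  # assumption: 50GB counted as 50 * 1024**3 bytes


def split_into_volumes(documents, limit=MAX_VOLUME_BYTES):
    """Group (name, size_in_bytes) pairs, in order, into volumes of at most limit bytes."""
    volumes, current, total = [], [], 0
    for name, size in documents:
        if size > limit:
            raise ValueError(f"{name} alone exceeds the volume limit")
        if current and total + size > limit:
            # Close the current volume and start a new one.
            volumes.append(current)
            current, total = [], 0
        current.append(name)
        total += size
    if current:
        volumes.append(current)
    return volumes
```

Preserving the input order keeps the Bates ranges in each volume contiguous, at the cost of not always packing volumes as tightly as possible.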