Standard File Types

This technical note describes standard file types used and supported in the Lexbe eDiscovery Platform (LEP).

Details

Prior to converting files, LEP applies container file expansion, DeNIST, and extension repair procedures.  See Automated ESI Processing for more information.  
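As an illustration of what extension repair can involve, the sketch below compares a file's leading bytes against a few well-known signatures and suggests a corrected extension. It is a minimal example under stated assumptions, not LEP's actual procedure: the function name suggest_extension and the short SIGNATURES table are hypothetical and were chosen only for illustration.

```python
# Hypothetical sketch of extension repair; NOT LEP's actual implementation.
# Each entry pairs a well-known leading byte signature with an extension.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\xff\xd8\xff", "jpg"),
    (b"PK\x03\x04", "zip"),      # also used by docx, xlsx, pptx and similar
    (b"Rar!\x1a\x07", "rar"),
]

# Office Open XML files are ZIP archives, so a ZIP signature is expected here.
ZIP_BASED = {"docx", "xlsx", "xlsm", "pptx", "ppsx"}


def suggest_extension(filename: str, head: bytes) -> str:
    """Return the extension implied by the leading bytes, else the current one."""
    current = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    for magic, ext in SIGNATURES:
        if head.startswith(magic):
            if ext == "zip" and current in ZIP_BASED:
                return current   # extension is already consistent
            if ext == "jpg" and current == "jpeg":
                return current   # jpeg and jpg name the same format
            return ext
    return current               # unknown signature: leave the name alone
```

For example, a file named report.txt whose content begins with %PDF- would be flagged as a PDF, while memo.docx with a ZIP signature would be left unchanged.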

Files that do not convert as part of the automated processing services are marked with a placeholder file, either in PDF or TIFF, depending on the service ordered.  

Standard File Types for Conversion

LEP identifies and attempts to convert the following file types (to TIFF or PDF, depending on the service):

 Ext  Application / Description  Type
 bmp  Image BMP  Image
 class  Java programming file  Text/Word Processing
 config  Application configuration file  Text/Word Processing
 css  Cascading style sheet (web page support)  Text/Word Processing
 csv  CSV (comma-separated values) (*.csv)  Text/Word Processing
 doc  Microsoft Word 2003 for Windows (*.doc)  Text/Word Processing
 docx  Microsoft Word 2007 and 2010 for Windows (*.docx)  Text/Word Processing
 eml  Microsoft Outlook Express email  Email
 gif  Image GIF  Image
 htm  HTML web page  Web Page
 html  HTML web page  Web Page
 ics  iCalendar file  Text/Word Processing
 inf  Setup information file  Text/Word Processing
 ini  Text configuration file  Text/Word Processing
 jpeg  JPEG  Image
 jpg  JPG  Image
 js  JavaScript programming file  Text/Word Processing
 json  JavaScript Object Notation file  Text/Word Processing
 lnk  Windows file shortcut  Text/Word Processing
 log  Application log file  Text/Word Processing
 manifest  Java programming file  Text/Word Processing
 mht  HTML web page  Web Page
 mht  MHT archives (HTML archives saved by Internet Explorer) (*.mht)  Web Page
 msg  Microsoft Outlook email  Email
 pdf  Adobe Acrobat (converted from text)  Text/Word Processing
 pdf  Adobe Acrobat (image only)  Image
 pdf  Adobe Acrobat (text under image)  Image
 php  PHP programming file  Text/Word Processing
 png  PNG image  Image
 pps  Microsoft PowerPoint 2003 for Windows  Presentation
 ppsx  Microsoft PowerPoint 2007 and 2010 for Windows  Presentation
 ppt  Microsoft PowerPoint 2003 for Windows  Presentation
 pptx  Microsoft PowerPoint 2007 and 2010 for Windows  Presentation
 pst  Microsoft Outlook data files  Container
 py  Python programming script  Text/Word Processing
 rar  RAR  Container
 rtf  Microsoft Rich Text Format (*.rtf)  Text/Word Processing
 tif  TIF  Image
 tiff  TIFF  Image
 txt  ASCII text  Text/Word Processing
 url  Uniform Resource Locator file  Text/Word Processing
 vcf  vCard contact information file  Text/Word Processing
 xls  Microsoft Excel 2003 for Windows (*.xls)  Spreadsheet
 xlsm  Microsoft Excel 2007 and 2010 for Windows (*.xlsm)  Spreadsheet
 xlsx  Microsoft Excel 2007 and 2010 for Windows (*.xlsx)  Spreadsheet
 xml  XML text  Text/Word Processing
 zip  Archive container ZIP (*.zip)  Container
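
For scripting against the list above (for example, to pre-screen a collection before upload), the table can be expressed as a lookup. The following is a minimal sketch under stated assumptions: the names STANDARD_TYPES and standard_type are hypothetical, and pdf is mapped to a single combined label even though the table distinguishes text-based and image-based PDFs.

```python
from pathlib import PurePath

# Extensions grouped by the Type column of the table above.
_GROUPS = {
    "Image": "bmp gif jpeg jpg png tif tiff",
    "Text/Word Processing": (
        "class config css csv doc docx ics inf ini js json lnk log "
        "manifest php py rtf txt url vcf xml"
    ),
    "Email": "eml msg",
    "Web Page": "htm html mht",
    "Presentation": "pps ppsx ppt pptx",
    "Container": "pst rar zip",
    "Spreadsheet": "xls xlsm xlsx",
}

STANDARD_TYPES = {
    ext: kind for kind, exts in _GROUPS.items() for ext in exts.split()
}
# Assumption: one combined label for pdf, which the table lists three times.
STANDARD_TYPES["pdf"] = "Text/Word Processing or Image"


def standard_type(filename: str):
    """Return the table's Type for a standard file, or None if not listed."""
    ext = PurePath(filename).suffix.lstrip(".").lower()
    return STANDARD_TYPES.get(ext)
```

A call such as standard_type("Budget.XLSX") returns "Spreadsheet", while a file with an unlisted extension such as mov returns None. The lookup relies on the extension only, so it is most useful after extension repair has run.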
   
Failure to Convert Standard File Types

If a standard file type fails to convert, a placeholder file is created and the database record notes that the file Failed to Convert. Standard file types may fail to convert for a variety of reasons, including file corruption, file type misidentification, print or data extraction issues, and password protection. Password-protected files are inherently not searchable (even with dual index) and require extra due diligence.

Some non-converted standard file types can be converted manually as a technical service (billable hourly or per GB, depending on file type and issues involved).

Other Files Not Converted

LEP does not auto-convert files other than the standard file types listed above. Instead, a placeholder file is created and the database record notes that the file was Not Converted (i.e., the file type is not supported).
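
The two outcomes described in this note (Failed to Convert for standard file types, Not Converted for everything else) can be summarized as a small decision function. This is a sketch only: the function name conversion_status, the convert callable, the abbreviated STANDARD_EXTENSIONS set, and the "Converted" label for the success case are illustrative assumptions, not LEP's API or database values.

```python
from typing import Callable

# Abbreviated, hypothetical subset of the standard file types.
STANDARD_EXTENSIONS = {"doc", "docx", "pdf", "xls", "xlsx", "msg", "txt"}


def conversion_status(ext: str, convert: Callable[[], None]) -> str:
    """Classify one file given its extension and a conversion attempt."""
    if ext.lower() not in STANDARD_EXTENSIONS:
        # Non-standard type: no conversion is attempted.
        return "Not Converted"
    try:
        convert()  # may raise on corruption, password protection, etc.
    except Exception:
        return "Failed to Convert"
    return "Converted"  # assumed label for the success case
```

In both non-success cases a placeholder file would stand in for the document, so the status value is what tells a reviewer whether a retry or a manual conversion is the appropriate next step.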

Some non-converted, non-standard file types can be converted manually as a technical service (billable hourly or per GB, depending on file type and issues involved).

Files that do not convert include media files, some container files, some email files, database files, and others, described in more detail below.

Failure to convert a file does not mean that it lacks probative evidence, only that it did not convert with automated procedures. These files should be reviewed and, when appropriate, further steps taken to convert them.

Media Files

Media files (video and audio) cannot be converted to TIFF or PDF.  They can be uploaded to LEP and coded.  They can sometimes be viewed or played depending on file type, connection speed, local browser, computer settings, and installed applications.  The following is a list of common media file types:

 Ext  Application / Description  Type
 avi  Windows video  Video
 asf  ASF  Video
 m4a  QuickTime  Audio
 m4p  Apple  Audio
 m4v  QuickTime  Video
 mov  QuickTime  Video
 mp3  MP3  Audio
 swf  Flash  Video
 wav  Wav file  Audio
 wma  WMA  Audio or Video
 wmf  Windows Metafile Format
 wmv  Windows Video  Video

Unusual Container Files

As part of automated processing, LEP extracts ZIP and RAR files. LEP does not automatically extract unusual container files. Examples include 7z, G7, Iza, Jar, and Sit. Many container files can be extracted manually as a technical service (billed hourly).
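
To illustrate the distinction, the sketch below expands a ZIP archive with Python's standard zipfile module and sets aside any members whose extensions mark them as unusual containers. It is an example under stated assumptions: the function name expand_zip and the UNUSUAL_CONTAINERS set (taken from the examples in the paragraph above) are hypothetical, and RAR handling is omitted because the Python standard library does not read RAR archives.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Extensions named above as unusual containers (illustrative, not exhaustive).
UNUSUAL_CONTAINERS = {"7z", "g7", "iza", "jar", "sit"}


def expand_zip(data: bytes):
    """Return (extracted files, member names needing manual extraction)."""
    extracted, needs_manual = {}, []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no content
            ext = PurePosixPath(info.filename).suffix.lstrip(".").lower()
            if ext in UNUSUAL_CONTAINERS:
                # Nested unusual container: flag it instead of expanding it.
                needs_manual.append(info.filename)
            else:
                extracted[info.filename] = archive.read(info)
    return extracted, needs_manual
```

Keeping the flagged names in a separate list makes it straightforward to hand them off for manual extraction while the remaining files continue through processing.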

Email Files

The automated conversion process converts Outlook PST and MSG files. Other email files or stores that can be manually converted as a technical service (billed hourly) prior to automated processing are listed below:

 Ext  Application
 dbx  Microsoft Outlook Express 5 and 6 for Windows
 mbs  Opera Email for Windows
 mbx  Eudora MBX message files
 mbx  MBOX email archives (including Thunderbird)
 nsf  Lotus Notes email

Database Files

The automated conversion process does not convert database files.  Database types (depending on type and version) that can be processed manually as a technical service (billed hourly) are listed below: 

 Ext  Application
 dbf  Oracle or other database
 frm  MySQL
 myd  MySQL
 myi  MySQL
 mdb  Microsoft Access Database 2003
 mdbx  Microsoft Access Database 2007 and 2010
 iif  Intuit interchange file (QuickBooks)
 ldf  SQL Server
 qba  QuickBooks
 qbb  QuickBooks
 qbm  QuickBooks
 qbw  QuickBooks
 qbx  QuickBooks
 qby  QuickBooks

Mac Files

The automated conversion process does not typically convert productivity files used for the Mac (e.g., Microsoft Office for the Mac, Apple Numbers). These files occasionally convert, depending on version and other factors. More often, however, they will not convert, will mis-convert, or will generate internal Mac resource fork files. Mac email (Mac native or Outlook for the Mac) is not supported, but these files can be converted by Technical Services. Best practice is to convert Mac productivity files to MS Office for Windows version 2007 or 2010 format prior to upload.
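
When Mac-originated collections are zipped on macOS, resource fork data commonly appears as AppleDouble files (names beginning with "._") or under a __MACOSX folder. A pre-upload check along the following lines can flag such entries. This is a rough sketch, and the function name is_mac_resource_artifact is hypothetical rather than part of LEP.

```python
from pathlib import PurePosixPath


def is_mac_resource_artifact(path: str) -> bool:
    """True if an archive member path looks like macOS resource fork residue."""
    parts = PurePosixPath(path).parts
    if "__MACOSX" in parts:
        return True  # folder added by the macOS archive utility
    # AppleDouble companion files are named "._" plus the original file name.
    return bool(parts) and parts[-1].startswith("._")
```

Filtering a file listing with this check before upload separates the resource fork residue from the actual documents, which can then be converted to Windows Office formats as recommended above.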

Other Files Not Automatically Converted

Many of the file types listed below can be converted to PDF or TIFF manually as a technical service (billed hourly). 

 Ext  Application  Type
 123  Lotus 1-2-3 (*.123, *.wk?)  Spreadsheet
 art  Compressed bitmap image file (AOL)  Graphic/Image
 doc  Microsoft Word for the Mac (any version)  Text/Word Processing
 docs  Microsoft Word for the Mac (any version)  Text/Word Processing
 pages  iWork Pages for the Mac (any version)  Text/Word Processing
 numbers  iWork Numbers for the Mac (any version)  Spreadsheet
 key  iWork Keynote for the Mac (any version)  Presentation
 dwg  AutoCAD  Other
 dxf  AutoCAD  Other
 epsf  EPSF  Image
 hjt  Treepad HJT files  Other
 mpp  Microsoft Project 2003  Other
 mppx  Microsoft Project 2007 - 2010  Other
 obd  Office binder document  Container
 qpw  Quattro Pro  Spreadsheet
 sam  Ami Pro (*.sam)  Text/Word Processing
 tmp  Application temporary file  Other
 vdx  Visio XML files (*.vdx)  Image
 vcf  MS Outlook, other programs (contact info)  Text
 xlk  Backup file created by MS Excel  Other
 wb1  Quattro Pro  Spreadsheet
 wb2  Quattro Pro  Spreadsheet
 wb3  Quattro Pro  Spreadsheet
 wks  Microsoft Works  Text/Word Processing
 wpg  WPG (WPG version 1.0 only)  Image