What is FreeEed?



FreeEed™ is fun and cool software for eDiscovery. Here is the slide deck we use at conferences, "FreeEed eDiscovery Overview," and a brief overview for lawyers, "FreeEed popcorn". It is all packaged for your use in the Download section.


It is based on an open source project published by SHMsoft and released under the Apache 2.0 License. It is built with Hadoop and other Big Data technologies. FreeEed is intended for use in eDiscovery, as an engine and kernel for a company's search application, or as an investigator's tool. It runs on a Windows, Mac, or Linux workstation, or on a Hadoop cluster.




Support includes telephone support, training, and help finding the best way to fit the FreeEed family of tools into your environment, whether by leveraging the existing design or by modifying it. Support is offered in two packages: initial and ongoing yearly.



How it works



Processing is organized by the Hadoop framework. The input data is staged by zipping it into archives of a set size. During processing, each file is read from its archive, assigned a unique ID, and processed with Tika, which extracts text and metadata. The metadata, the text, and the file itself are delivered as processed results.
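The staging and processing steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FreeEed's actual code: the function names stage, extract, and process are hypothetical, the archives are kept in memory rather than on HDFS, Hadoop is left out entirely, and extract is a simple stand-in for Tika that treats every file as UTF-8 text.

```python
import io
import uuid
import zipfile


def stage(files, max_bytes):
    """Pack (name, data) pairs into in-memory zip archives.

    A new archive is started when adding a file would push the
    uncompressed size of the current one past max_bytes.
    """
    archives = []
    buf, zf, size = None, None, 0
    for name, data in files:
        if zf is None or (size and size + len(data) > max_bytes):
            if zf is not None:
                zf.close()
                archives.append(buf.getvalue())
            buf = io.BytesIO()
            zf = zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED)
            size = 0
        zf.writestr(name, data)
        size += len(data)
    if zf is not None:
        zf.close()
        archives.append(buf.getvalue())
    return archives


def extract(data):
    """Hypothetical stand-in for Tika: returns (text, metadata)."""
    text = data.decode("utf-8", errors="replace")
    return text, {"length": len(data)}


def process(archives):
    """Read each file from each archive, assign an ID, extract text and metadata."""
    results = []
    for archive in archives:
        with zipfile.ZipFile(io.BytesIO(archive)) as zf:
            for name in zf.namelist():
                data = zf.read(name)
                text, metadata = extract(data)
                results.append({
                    "id": uuid.uuid4().hex,  # unique ID per file (assumed format)
                    "name": name,
                    "text": text,
                    "metadata": metadata,
                    "native": data,
                })
    return results
```

Calling process(stage(files, max_bytes)) on a list of (name, bytes) pairs yields one record per input file, each carrying the three deliverables named above: metadata, text, and the native file.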


The major building blocks of the system are HDFS, Hadoop, Tika, Lucene, and Hive.




Each FreeEed project will create its own Lucene/Solr index for later searches.




Metadata results are output as a CSV file, while the native files and the extracted text are stored in one or more zip files. The end results can be used for culling and for producing native files for legal review.
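As a rough illustration of this output layout, the sketch below writes one CSV row of metadata per file and places the native file and its extracted text in a single zip. The record fields (id, name, text, metadata, native), the function name deliver, and the native/ and text/ folder names inside the zip are all assumptions made for the example; FreeEed's real column set and archive layout may differ.

```python
import csv
import io
import zipfile


def deliver(results):
    """Return (csv_text, zip_bytes) for a list of processed-file records."""
    # Collect every metadata key so the CSV header covers all records.
    keys = sorted({k for r in results for k in r["metadata"]})
    csv_buf = io.StringIO()
    writer = csv.writer(csv_buf, lineterminator="\n")
    writer.writerow(["id", "name"] + keys)

    zip_buf = io.BytesIO()
    with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for r in results:
            # One metadata row per file; missing keys become empty cells.
            row = [r["id"], r["name"]] + [r["metadata"].get(k, "") for k in keys]
            writer.writerow(row)
            # Native file and extracted text go into the same zip,
            # linked back to the CSV row by the ID (assumed naming).
            zf.writestr("native/" + r["id"] + "_" + r["name"], r["native"])
            zf.writestr("text/" + r["id"] + ".txt", r["text"])
    return csv_buf.getvalue(), zip_buf.getvalue()
```

Keying both the CSV row and the zip entries on the same ID is what lets a reviewer cull on metadata first and then pull only the matching native files.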

With the compilation and professional support available for enterprise use, FreeEed brings high performance, scalability and reliability to data processing at a fraction of the cost of proprietary products.


Supported file formats

MS Office and other formats (over 300)

PST processing

OST processing


Other capabilities

Text extraction

Data culling

Native/Text/Metadata results delivery

Optical Character Recognition (OCR)

Imaging (PDF creation)

Instant search

Deduplication (configurable for emails)
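To make the last capability concrete, here is one common way deduplication can work: hash each item and keep only the first occurrence of each hash. This is a sketch of the general technique, not a description of FreeEed's implementation. The function names, the choice of SHA-256, and the email field names (from, to, subject, body) are all assumptions; the fields parameter shows what "configurable for emails" could look like.

```python
import hashlib


def content_key(data):
    """Key for ordinary files: a hash of the raw bytes."""
    return hashlib.sha256(data).hexdigest()


def email_key(email, fields=("from", "to", "subject", "body")):
    """Key for emails: a hash over a configurable set of fields.

    Values are trimmed and lowercased first, so differences in case or
    surrounding whitespace do not defeat the match (an assumed policy).
    """
    h = hashlib.sha256()
    for f in fields:
        h.update(f.encode("utf-8"))
        h.update(b"\0")
        h.update(email.get(f, "").strip().lower().encode("utf-8"))
        h.update(b"\0")
    return h.hexdigest()


def deduplicate(items, key):
    """Keep the first item for each distinct key, preserving input order."""
    seen = set()
    unique = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            unique.append(item)
    return unique
```

With the default fields, two copies of a message that differ only in a field outside the list, such as a delivery date, collapse into one. Passing a key that includes that field, for example lambda e: email_key(e, fields=("from", "to", "subject", "body", "date")), keeps them apart.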