What is FreeEed? 

FreeEed™ is fun and cool software for eDiscovery. It is also a unique Big Data project with Hadoop and Search technologies. Here is a slide overview, "FreeEed popcorn."

It is an open-source project released under the Apache 2.0 License. FreeEed is intended for use in eDiscovery, as the engine and kernel of a company's search application, or as an investigator's tool. It runs on a Windows, Mac, or Linux workstation and on a Hadoop cluster.

What's hot?

Save $$$ on Early Case Assessment (ECA), try FreeEed in the cloud, and more.


Scaling is an especially important aspect of FreeEed. Because it is based on Big Data technologies and runs on hordes of computers, you can fire up 100 servers and process the data roughly a hundred times faster for about the same price, since the total number of server-hours stays about the same. Hadoop management is a one-click operation provided in the FreeEed Player.

How it works

Processing is organized by the Hadoop framework. The input data is staged by zipping it into archives of a set size. During processing, each file is read from the archive, assigned a unique ID, and processed with Tika, which extracts its text and metadata. The metadata, the text, and the file itself are delivered as the processed results.
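The staging and processing steps described above can be sketched roughly as follows. This is only an illustration, not FreeEed's actual code: the function names (stage_files, process_archive, extract_text_and_metadata), the archive naming scheme, and the use of random UUIDs as unique IDs are all assumptions, and the Tika step is replaced by a placeholder because calling Tika requires an external library.

```python
import uuid
import zipfile
from pathlib import Path


def stage_files(root, staging_dir, max_bytes):
    """Pack every file under root into numbered zip archives.

    Each archive holds at most about max_bytes of uncompressed input
    (a single larger file still gets an archive of its own).
    Assumes staging_dir is not located inside root.
    """
    root = Path(root)
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    current = None
    current_size = 0
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        size = path.stat().st_size
        # Start a new archive when adding this file would exceed the limit.
        if current is None or (current_size > 0 and current_size + size > max_bytes):
            if current is not None:
                current.close()
            archive_path = staging_dir / f"staged_{len(archives) + 1:04d}.zip"
            current = zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED)
            archives.append(archive_path)
            current_size = 0
        current.write(path, arcname=path.relative_to(root).as_posix())
        current_size += size
    if current is not None:
        current.close()
    return archives


def extract_text_and_metadata(data, name):
    """Hypothetical stand-in for the Tika step: returns empty text and basic metadata."""
    return "", {"file_name": name, "size": len(data)}


def process_archive(archive_path):
    """Yield one result per archived file: unique ID, metadata, text, native bytes."""
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            data = archive.read(info)
            text, metadata = extract_text_and_metadata(data, info.filename)
            yield {
                "id": uuid.uuid4().hex,  # assumed ID scheme, for illustration only
                "metadata": metadata,
                "text": text,
                "native": data,
            }
```

Because each staged archive can be processed independently of the others, a framework such as Hadoop can hand different archives to different machines.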

The primary building blocks of the system are HDFS, Hadoop, Tika, Lucene, and Hive.


The companion application, FreeEed Review, offers document review. 

It is integrated with Solr, so users get the advantage of the Solr open-source ecosystem.


Each FreeEed project creates its own Lucene/Elasticsearch index for later searches.


Metadata results are output as a CSV file, while the native files and the extracted text are stored in one or more zip files. The end results can be used for culling and for producing native files for legal review. You can review them in FreeEedUI or load them into Concordance.
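As a rough illustration of culling on the metadata output, the sketch below reads a metadata CSV and keeps only the rows whose chosen column contains a search term. The column names (doc_id, file_name, custodian) and the sample rows are hypothetical; the real load file may use different headers and a different delimiter.

```python
import csv
import io


def read_metadata(handle):
    """Read a metadata CSV (with a header row) into a list of dicts."""
    return list(csv.DictReader(handle))


def cull(rows, column, needle):
    """Keep rows whose value in column contains needle, ignoring case."""
    needle = needle.lower()
    return [row for row in rows if needle in (row.get(column) or "").lower()]


# Hypothetical sample standing in for a real metadata file.
sample = io.StringIO(
    "doc_id,file_name,custodian\n"
    "0001,contract.docx,alice\n"
    "0002,notes.txt,bob\n"
    "0003,contract_v2.docx,alice\n"
)
rows = read_metadata(sample)
kept = cull(rows, "file_name", "contract")
print([row["doc_id"] for row in kept])
```

For a file on disk, open it with open(path, newline="", encoding="utf-8") and pass the handle to read_metadata.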

With the compilation and professional support available for enterprise use, FreeEed brings high performance, scalability, and reliability to data processing at a fraction of the cost of proprietary products.


Supported input file formats


Other capabilities

  • Text extraction
  • Data culling
  • Native/Text/Metadata results delivery
  • Optical Character Recognition (OCR)
  • Imaging (PDF creation)
  • Instant search
  • Deduplication (configurable for emails)
  • Text analytics
  • Social media analytics
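
To make the deduplication item in the list above more concrete, here is one common approach, sketched under assumptions: documents are treated as duplicates when a key derived from them matches. The functions content_key, email_key, and deduplicate are illustrative names, and hashing a configurable set of email headers is only one possible reading of "configurable for emails"; it is not confirmed to be how FreeEed does it.

```python
import hashlib
from email import message_from_bytes


def content_key(data):
    """Key for ordinary files: SHA-256 hash of the raw bytes."""
    return hashlib.sha256(data).hexdigest()


def email_key(data, fields=("From", "To", "Subject", "Date")):
    """Key for emails: hash of the selected headers only (the field list is configurable)."""
    msg = message_from_bytes(data)
    joined = "\n".join(f"{field}:{msg.get(field, '')}" for field in fields)
    return hashlib.sha256(joined.encode("utf-8", "replace")).hexdigest()


def deduplicate(documents, key=content_key):
    """Return the (name, data) pairs whose key has not been seen before, in order."""
    seen = set()
    unique = []
    for name, data in documents:
        k = key(data)
        if k not in seen:
            seen.add(k)
            unique.append((name, data))
    return unique
```

Passing key=email_key treats two messages with the same chosen headers as duplicates even when their raw bytes differ, for example because of different routing headers.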