
Posted on: January 14, 2016


Guide to Using Analytics Technology During Discovery

This guide defines the different analytics tools available and explains which types of cases would be best suited for the application of each technology.

This post was originally published in January 2015 to bring clarity to the options legal teams have when choosing analytics technology. It has been updated to reflect both feature advancements and changes in the way the industry applies analytics. Although analytics is now more widely used in the legal industry, there is still much confusion about which analytics technology to use and when to apply it, especially on smaller matters.

Performing pre-review analysis on your data can show you which workflow would be most beneficial for your case and help you narrow your focus.

To help clients determine which analytics and review workflows to implement for each project, the conversation typically starts with a simple question: “What do you want to accomplish?” This is important to ask because each application has a different function, and depending on the type of project and the desired results, one technology may be better suited than another. There is much more to analytics than predictive coding or assisted review, and cases of all sizes can benefit from the use of advanced technologies.

A successful analytics project depends on the quality of the textual content, so always validate the text before implementing analytics and review workflows.

The Evolution of Analytics Technologies in Discovery

Analytics technology and workflows have come a long way in the last year; however, there is still a long way to go in industry knowledge and implementation. For example, near duplicate detection and email threading have proven to increase efficiency, speed up review, and decrease costs for matters of any size, yet many legal teams are still not using these applications on every case. Legal teams that have been willing to try them have seen great results in both expedited review and significant cost savings.

There are many different types of analytics technologies and functionality, and they continue to evolve. Below is an updated chart outlining a few of the more popular choices, with suggestions for when each might be considered for a particular case.

Near Duplicate Detection

Identifies documents that are near‐duplicates of each other based on textual content and then groups those documents according to similarity.

Classification: Structured

Best Case Size for Implementation: All Cases

Minimum # of Documents Required: None

Use with OCR Documents: Yes

Use with EDA/ECA Workflows: Yes
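
To make the idea of grouping by textual similarity concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular review platform implements near duplicate detection: it assumes similarity is measured as the Jaccard overlap of three-word "shingles", and the function names and the 0.5 threshold are hypothetical choices made for this example.

```python
import re
from itertools import combinations


def shingles(text, k=3):
    """Return the set of overlapping k-word runs found in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Share of shingles two documents have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_near_duplicates(docs, threshold=0.5):
    """Group document ids whose similarity meets the (assumed) threshold."""
    sets = {doc_id: shingles(text) for doc_id, text in docs.items()}
    parent = {doc_id: doc_id for doc_id in docs}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(docs, 2):
        if jaccard(sets[a], sets[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for doc_id in docs:
        groups.setdefault(find(doc_id), []).append(doc_id)
    return sorted(sorted(g) for g in groups.values())
```

Calling `group_near_duplicates` with a dictionary of document ids and extracted text returns lists of ids that belong together, so a reviewer could code one document and compare the others in its group against it. Comparing every pair is only practical for small sets; large collections generally need an indexing step that this sketch leaves out.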

Email Threading

Determines the relationship between email messages by grouping related email items together and creating a visualization that helps users track the progression of an email chain.

*Email threading will not be as accurate with OCR text as with native ESI; however, it can still be done.

Classification: Structured

Best Case Size for Implementation: All Cases

Minimum # of Documents Required: None

Use with OCR Documents: Yes

Use with EDA/ECA Workflows: Yes
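
As a rough illustration of how related messages can be grouped into a chain, the sketch below follows reply links back to the first message in a conversation. It assumes each message has already been reduced to a dictionary with hypothetical `message_id` and `in_reply_to` fields (modeled on the standard Message-ID and In-Reply-To email headers). Commercial threading tools also analyze message bodies, which this example does not attempt.

```python
from collections import defaultdict


def build_threads(messages):
    """Group message ids into threads by following reply links to a root."""
    by_id = {m["message_id"]: m for m in messages}

    def root_of(msg):
        seen = set()  # guards against malformed data with circular replies
        while msg["in_reply_to"] in by_id and msg["message_id"] not in seen:
            seen.add(msg["message_id"])
            msg = by_id[msg["in_reply_to"]]
        return msg["message_id"]

    threads = defaultdict(list)
    for m in messages:
        threads[root_of(m)].append(m["message_id"])
    return dict(threads)
```

The result maps each root message to the messages in its chain, in the order they were supplied. A reply whose parent is not in the collection becomes the root of its own thread. Header-based linking of this kind depends on metadata that scanned paper typically lacks, which is one likely reason threading is less accurate on OCR text.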

Keyword Expansion

Identifies and validates words, terms, names, email domains, word variations and/or conceptually related terms that a legal team wasn’t aware of.

Classification: Conceptual

Best Case Size for Implementation: All Cases

Minimum # of Documents Required: None

Use with OCR Documents: Yes

Use with EDA/ECA Workflows: Yes

Clustering

Identifies groups of conceptually similar documents. However, because clustering does not require user input, there is no way to specify which concepts are of particular interest to a legal team. Clustering is most useful when working with unfamiliar data sets.

Classification: Conceptual

Best Case Size for Implementation: All Cases

Minimum # of Documents Required: None

Use with OCR Documents: Yes

Use with EDA/ECA Workflows: Yes

Categorization

Users create a set of example documents that will be used as the basis for identifying and grouping together other conceptually similar documents. As documents are reviewed, users can designate example documents and add them to various categories. These examples can then be used to apply categories to the remaining document population.

Classification: Conceptual

Best Case Size for Implementation: Medium‐Large Cases

Minimum # of Documents Required: 50,000+ Documents

Use with OCR Documents: Yes

Use with EDA/ECA Workflows: Yes
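
The example-driven approach described above can be sketched as follows. This is a deliberate simplification: it assumes documents are compared by cosine similarity over raw word counts, whereas conceptual analytics engines build a richer index of the collection. The function names, the example categories and the 0.2 minimum score are all hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Count how often each word appears in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def categorize(text, examples, min_score=0.2):
    """Return (category, score) for the closest example document.

    The category is None when no example scores at least min_score.
    """
    vec = vectorize(text)
    best_category, best_score = None, 0.0
    for category, example_texts in examples.items():
        for example in example_texts:
            score = cosine(vec, vectorize(example))
            if score > best_score:
                best_category, best_score = category, score
    if best_score < min_score:
        return None, best_score
    return best_category, best_score
```

Here `examples` maps each category name to the example documents reviewers have designated for it. Documents that resemble none of the examples are left uncategorized rather than forced into the nearest category, which mirrors the way a team would keep adding examples as review progresses.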

Concept Searching

Identifies documents with similar conceptual content and is very different from keyword or metadata searching. A concept search reveals conceptual matches between a query and textual content rather than matching a specific word, search term or set of search terms. This can help prioritize or find important content quickly.

Classification: Conceptual

Best Case Size for Implementation: Medium‐Large Cases

Minimum # of Documents Required: 50,000+ Documents

Use with OCR Documents: Yes

Use with EDA/ECA Workflows: Yes

Relativity Assisted Review

Relativity Assisted Review uses categorization to train the system to determine whether a document is responsive or not responsive.

Subject Matter Experts (SMEs) on the review team begin by coding a sample set of documents. Based on those coding decisions, Relativity determines how the remaining document population should be coded. Reviewers then validate Relativity’s automated decisions by manually reviewing statistically‐relevant subsets of documents in order to ensure coding accuracy.

Classification: Predictive

Best Case Size for Implementation: Medium‐Large Cases

Minimum # of Documents Required: 50,000+ Documents

Use with OCR Documents: Yes

Use with EDA/ECA Workflows: Yes
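
To give a sense of how large a validation sample might be, the sketch below applies a common textbook sample-size formula for estimating a proportion, with a finite population correction. It is not taken from Relativity's documentation, and the default confidence level (95%), margin of error (2.5%) and the function name are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin=0.025, p=0.5):
    """Estimate how many documents to sample for a validation round.

    p=0.5 is the most conservative guess at the proportion being measured.
    """
    # z-score for a two-sided interval at the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # sample size for an effectively unlimited population
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # finite population correction shrinks the sample for smaller sets
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)
```

Under these assumptions, the required sample grows very slowly with the size of the collection, which is part of why sampling-based validation becomes more economical as case size increases.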

Equivio Relevance

Equivio Relevance is a predictive coding workflow that uses machine‐learning technology trained, by way of examples, to imitate the decisions of Subject Matter Experts (SMEs).

Relevance uses a scoring system to organize and rank documents according to relevance. This allows legal teams to quickly cull down raw data collections to more substantive datasets, focus on key documents and prioritize data for review.

Classification: Predictive

Best Case Size for Implementation: Medium‐Large Cases

Minimum # of Documents Required: 20,000‐50,000+ Documents

Use with OCR Documents: Yes

Use with EDA/ECA Workflows: Yes

Brainspace Discovery

Brainspace Discovery is a visual analytics solution that allows users to access clustering, concept searching and near-duplicate identification technology all within one seamless interface.

This collective and intuitive approach allows legal teams to quickly cull down raw data collections to more substantive datasets while at the same time connecting data, people and knowledge through the relationships the technology identifies.

Classification: Conceptual/Predictive

Best Case Size for Implementation: Medium‐Large Cases

Minimum # of Documents Required: 20,000‐50,000+ Documents

Use with OCR Documents: Yes

Use with EDA/ECA Workflows: Yes


Analytics is a powerful supplement to any standard review workflow. It should be considered for every case, regardless of size. Don’t let assumptions, misconceptions or lack of experience get in the way of investigating and realizing the many benefits of advanced technologies. There are numerous resources that indicate how far analytics technology has come. It’s not going away anytime soon, and equipping yourself with detailed knowledge about the different types of technology can help you make informed decisions for your next case.
