Posted on: April 26, 2017

How to Use Office 365 Advanced eDiscovery to Prioritize Your Review

While my last blog post focused on features within Office 365 Enterprise E5, the biggest differentiator between E3 and E5 is the Advanced eDiscovery component.

Advanced eDiscovery within Office 365 E5 builds on the existing set of eDiscovery capabilities and provides a more efficient, streamlined process that gives you more control over your data.

E5 Advanced eDiscovery Benefits

Identify relevant data quickly to save time and money, and focus on what’s unique and relevant by training the system to identify emails and documents through predictive coding. Reduce document volume with near-duplicate detection and email threading, which shrinks the data set and lowers the cost of reviewing a matter.

Office 365 In-Place Holds

One of the biggest advantages of eDiscovery in O365 is the premise of “In-Place” holds. You can work with in-place data (Exchange, SharePoint, OneDrive, Skype) and with in-place custodians (groups, mailboxes, sites, …). You can search for email, documents, Skype for Business conversations, and other content in your organization, then place a legal hold on a custodian, preserve the data, and perform all the processing, analysis, and culling activities from within the same repository. That is a dream scenario for the security-conscious and the IT novice alike.

Applying the text analytics, machine learning, and relevance/predictive coding capabilities of Advanced Analytics to that data lets you quickly process thousands of email messages, documents, and other kinds of data to find the items most likely to be relevant to a specific case. This streamlined process of analytics and artificial intelligence could end up saving you hundreds of thousands of dollars annually in review costs.

E5 has also added an enhanced way to gather the data you need. According to Microsoft:

"Content Search is a new eDiscovery search tool with new and improved scaling and performance capabilities. Use Content Search to run very large eDiscovery searches. You can search all mailboxes, all Exchange public folders, and all SharePoint Online sites and OneDrive for Business locations in a single Content Search. There are no limits on the number of content locations that you can search. There are also no limits on the number of searches that can run at the same time. After you run a Content Search, the number of content locations and an estimated number of search results are displayed. After you run a search you can preview the results, get keyword statistics for one or more searches, bulk-edit content searches or realize that you are ready for that data to be analyzed in Office 365 Advanced eDiscovery."

Prioritize Your Review and Identify Cost-Savings

Let’s take a deeper dive into what it looks like when Advanced eDiscovery learns from your tagging decisions on documents and how it applies statistical and self-learning techniques to calculate the relevance of each document in the data set. This information enables you to focus on key documents, make quick yet informed decisions on case strategy, cull data, and prioritize review.

In this particular example, here’s my sweet spot, and here’s why…

Suppose I review 25% of my data set and see that I am at roughly a 90% relevance recall rate (let’s round up), meaning that I am capturing about 90% of the relevant documents. Knowing that capturing the next relevant document will cost me about $23, am I comfortable with this scenario?

I know the review should cost me somewhere around $170k (assuming an estimated cost of $1 per document for our purposes here). Am I ready to export my data for review based on these numbers? Or does it make sense to keep training the system with machine learning?
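The arithmetic behind that estimate can be sketched in a few lines of Python. This is only an illustration: the function name `review_cost` is made up, and the collection size of 680,000 documents is an assumption inferred from the figures above (25% of the data set costing about $170k at $1 per document), not a number reported by the tool.

```python
def review_cost(total_docs: int, review_fraction: float, cost_per_doc: float = 1.0) -> float:
    """Estimated cost of reviewing a fraction of the collection at a flat per-document rate."""
    if not 0.0 <= review_fraction <= 1.0:
        raise ValueError("review_fraction must be between 0 and 1")
    return total_docs * review_fraction * cost_per_doc


# ASSUMPTION: collection size inferred from the post's numbers, not stated in it.
ASSUMED_TOTAL_DOCS = 680_000

cost_at_25 = review_cost(ASSUMED_TOTAL_DOCS, 0.25)
print(f"Estimated review cost at 25%: ${cost_at_25:,.0f}")  # $170,000
```

Changing `cost_per_doc` to your own blended reviewer rate gives a quick feel for how sensitive the estimate is to that single assumption.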

As you can see, at a certain point the recall curve flattens out:

  • The cost to obtain the next relevant document goes up considerably
  • The total cost for review climbs steeply
  • And although we’ve almost doubled the share of the data set used for training, we’ve only increased our relevance recall by roughly 10 points
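One way to read the "cost of the next relevant document" figure is as the number of documents you expect to review before hitting one more relevant item, multiplied by the per-document rate. The sketch below works under that interpretation, which is an assumption and not Microsoft's documented formula; both function names are hypothetical.

```python
def docs_per_relevant_hit(next_doc_cost: float, cost_per_doc: float = 1.0) -> float:
    """Documents you expect to review to find one more relevant document."""
    return next_doc_cost / cost_per_doc


def implied_marginal_precision(next_doc_cost: float, cost_per_doc: float = 1.0) -> float:
    """Share of the next documents reviewed that are expected to be relevant."""
    return cost_per_doc / next_doc_cost


# The $23 figure from the 25% reviewed scenario, at the post's $1 per document.
docs_needed = docs_per_relevant_hit(23)
precision = implied_marginal_precision(23)
print(f"About {docs_needed:.0f} documents per relevant hit ({precision:.1%} marginal precision)")
```

Read this way, a rising next-document cost simply means relevant documents are getting rarer in what is left to review.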

This is a key tool to understand and work with so that you are confident in your accuracy and cost predictions. To illustrate further, have your SME keep training the system on a particular data set; the results show that it does not make sense to spend more and more hours going through additional documents.

Look at the review-recall ratio line between 50% and 75%. It doesn’t change, but how many hours of resource time and cost did an SME just spend on that block of hundreds, maybe thousands, of documents?

Now your next relevant document will cost you $105 and your review will cost you $509k.

Sure, you’re at 97% relevance recall now, but the extra recall wasn’t worth the time and cost it took to get there. In other words, you were in a good spot to pull the trigger hours and dollars ago, at the 25% reviewed stage.
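To put a number on that trade-off, here is a small Python sketch that compares the two cutoffs using only the figures quoted in this post. The `Cutoff` class and `cost_per_recall_point` helper are hypothetical names for illustration, and the 90% recall value is the rounded figure from earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cutoff:
    recall_pct: float     # relevance recall at this cutoff, in percent
    review_cost: float    # estimated total review cost, in dollars
    next_doc_cost: float  # cost of the next relevant document, in dollars


def cost_per_recall_point(a: Cutoff, b: Cutoff) -> float:
    """Extra review dollars spent per additional point of recall when moving from a to b."""
    return (b.review_cost - a.review_cost) / (b.recall_pct - a.recall_pct)


early = Cutoff(recall_pct=90, review_cost=170_000, next_doc_cost=23)
late = Cutoff(recall_pct=97, review_cost=509_000, next_doc_cost=105)

extra_cost = late.review_cost - early.review_cost
per_point = cost_per_recall_point(early, late)

print(f"Extra review cost: ${extra_cost:,.0f}")           # $339,000
print(f"Cost per extra recall point: ${per_point:,.0f}")  # $48,429
```

On these numbers, each of the last seven points of recall costs more than a quarter of the entire review budget at the 25% stage.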

…You get the trend.

Once you’ve run Relevance in E5 Advanced eDiscovery analytics and found your sweet spot, the reduced data set can be exported for SME review.
