Posted on: November 24, 2015 in Blog
Microsoft, Predictive Coding and eDiscovery
This article was originally published on The Daily Record.
This post explains the advantages of Microsoft’s acquisition
of Equivio and how it can help O365 users leverage advanced analytics to
preserve and collect electronically stored information (ESI).
Last week I travelled to Seattle to attend a Microsoft conference where the company announced upcoming enhancements to Office 365. Besides the cool factor of getting to visit the headquarters of the largest software company on earth, I was also wowed by the direction in which Microsoft is taking its Office 365 (“O365”) product line. Microsoft is wagering that O365 is the future of business cloud computing. It’s a good bet. O365 has a user base of over 60 million, and subscribers can access SharePoint and Office products and easily share data with colleagues, all in the cloud. In my opinion, the cloud is the future when it comes to collaboration and productivity in business, and Microsoft is doing its best to capitalize on that trend.
So what does this have to do with eDiscovery? A lot. Many of the new enhancements and functions in O365 specifically address eDiscovery, with the aim of helping organizations mitigate the costs and risks traditionally associated with managing large volumes of ESI. It makes sense if you think about it. One of the main challenges (and cost drivers) with eDiscovery is finding ESI across myriad systems and custodians. With the cloud, and specifically O365, some of those challenges become much easier (though nothing ever goes away completely). In theory, the ESI is now in one “place” and can be managed, collected, searched, reviewed and produced much more easily. You may get the managed and collected part, but what about the other three? What is Microsoft doing to tackle those issues?
Microsoft's Acquisition of Equivio
Earlier this year, Microsoft purchased Equivio, a predictive coding software company. When I first heard the news I was a bit baffled, as Microsoft is not in the eDiscovery game. But then it made sense. Microsoft may not be in the eDiscovery business, but it is in the information business.
Not familiar with Equivio? Equivio is a software company that develops data analytics technology. It made a name for itself as the predictive coding wave swept through the eDiscovery industry. What is predictive coding, you ask? Think artificial intelligence or think Pandora. You click on a song you like and Pandora tries to predict the type of music that suits your fancy. In a document review, users train the system to identify documents relevant to a particular subject, such as a legal case or investigation. This iterative process is more accurate and cost-effective than keyword searches and manual review of vast quantities of documents. The technology has achieved broad acceptance in the legal community as a valuable eDiscovery tool, and it allows organizations and legal teams to plow through large volumes of data much faster, with very high quality of review.
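To make the "train the system" idea concrete, here is a minimal sketch of a relevance ranker. It is not Equivio's algorithm, which is proprietary; it swaps in a generic naive Bayes word-count model, and the class name, sample documents and labels are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; real tools do far more normalization.
    return text.lower().split()


class TinyRelevanceModel:
    """Naive Bayes over word counts with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, text, relevant):
        # A reviewer codes one document as relevant (True) or not (False).
        self.word_counts[relevant].update(tokenize(text))
        self.doc_counts[relevant] += 1

    def score(self, text):
        # Log-odds that the document is relevant; higher means more likely.
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        log_odds = math.log(
            (self.doc_counts[True] + 1) / (self.doc_counts[False] + 1)
        )
        for label, sign in ((True, 1), (False, -1)):
            total = sum(self.word_counts[label].values())
            for word in tokenize(text):
                p = (self.word_counts[label][word] + 1) / (total + len(vocab) + 1)
                log_odds += sign * math.log(p)
        return log_odds


# Hypothetical reviewer decisions on a handful of documents.
model = TinyRelevanceModel()
model.train("merger pricing agreement with supplier", relevant=True)
model.train("revised pricing terms for the merger", relevant=True)
model.train("lunch menu for friday", relevant=False)
model.train("office holiday party friday", relevant=False)

# Rank the unreviewed documents so the likeliest-relevant ones surface first.
unreviewed = ["draft merger agreement pricing", "friday lunch order"]
ranked = sorted(unreviewed, key=model.score, reverse=True)
```

The iterative part would come from repeating the loop: reviewers code the top-ranked documents, those decisions are fed back through `train`, and the remaining documents are re-ranked.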
Pretty cool: O365 subscribers will soon have access to cutting-edge analytics. If that doesn’t get your leg tingling, let’s put this in terms of cost and process. In the olden days (November 2015 and prior), users of O365 relied on antiquated search terms to identify potentially relevant documents, or they would pull down an entire mailbox in response to a document request. This was just the way it was in the olden days!
Microsoft, Equivio, Analytics and How it All Works
Let’s say that an organization wanted to collect all the email stored in O365 for 33 custodians. If each mailbox was 3 GB in size, we would have approximately 100 GB of email. That 100 GB has to be sent to an eDiscovery vendor in order to be processed, searched and prepared for review. There’s a cost to that. Pricing in eDiscovery is usually measured by the GB. Recent surveys peg the cost of taking a single GB of data through collection to review at well over $5,000. Logically, the less there is to collect and review, the lower the cost.
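The arithmetic is easy to sketch. The custodian count, mailbox size and per-GB figure below come from the example above (with $5,000 treated as a floor, since the surveys say "well over"); the 70% cull rate is purely a hypothetical assumption for illustration, not a figure from Microsoft or Equivio.

```python
# Figures from the example above.
CUSTODIANS = 33
GB_PER_MAILBOX = 3
COST_PER_GB = 5_000  # dollars; "well over $5,000", so treat this as a floor

# ASSUMPTION: hypothetical share of data culled before collection.
CULL_PERCENT = 70

total_gb = CUSTODIANS * GB_PER_MAILBOX          # the "approximately 100 GB"
full_cost = total_gb * COST_PER_GB              # collect everything

# Integer math keeps the dollar amounts exact.
reduced_cost = full_cost * (100 - CULL_PERCENT) // 100
savings = full_cost - reduced_cost

print(f"Collect everything: {total_gb} GB, ${full_cost:,}")
print(f"After a {CULL_PERCENT}% cull: ${reduced_cost:,} (saves ${savings:,})")
```

Change `CULL_PERCENT` to see how sensitive the total is to how much data can be set aside before collection.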
This is where Microsoft and Equivio enter the picture. With Equivio, an organization doesn’t need to collect all the email from O365. It will now have the ability to apply predictive coding as well as near-duplicate detection, email threading and other analytics at the beginning of the process, a stage traditionally called early data/case assessment. That’s where the real power is with this technology. It shifts much of the process to the left, prior to collection and preparation for review. But what if the corporation doesn’t have the internal expertise or desire to properly utilize all of this whiz-bang, new-fangled technology? A company could hire a third-party expert or vendor to assist with the Equivio and analytics piece. There are certainly nuances to the technology, and many legal teams who choose to use a predictive coding process wind up putting an expert on the stand to testify about that process. Microsoft is actively creating an eDiscovery partner program for its O365 customers for just this purpose.
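For a feel of what near-duplicate detection means, here is a minimal sketch using a common textbook technique: Jaccard similarity over three-word shingles. This is an assumption chosen for illustration, not a description of how Equivio or O365 actually does it, and the sample emails and the 0.7 threshold are hypothetical.

```python
def shingles(text, size=3):
    # Break text into overlapping runs of `size` consecutive words.
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    # Shared shingles divided by all distinct shingles across both texts.
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


# Hypothetical emails: two differ by one word, the third is unrelated.
email_a = "Please review the attached contract before our call on Monday"
email_b = "Please review the attached contract before our call on Tuesday"
email_c = "Lunch is in the break room today"

THRESHOLD = 0.7  # ASSUMPTION: arbitrary cutoff for calling two emails near-duplicates

for other in (email_b, email_c):
    similarity = jaccard(email_a, other)
    verdict = "near-duplicate" if similarity >= THRESHOLD else "distinct"
    print(f"{similarity:.2f} {verdict}: {other}")
```

Grouping near-duplicates this way means a reviewer can look at one representative instead of every slightly different copy.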
Now let’s think outside the box. Sure, the technology can be used in litigation in response to a document request to reduce the volume of ESI sent to review, thereby significantly reducing the cost of review, but let’s think beyond immediate litigation needs.
This technology could be used by a company to assess old email and other electronic records that were being retained for business, regulatory or past litigation reasons. Assessment may lead to purging, thereby reducing storage costs and future liabilities and improving compliance with corporate data retention policies. I have consulted with many clients on using analytics to purge data from backup tapes. We used keywords, clustering and predictive coding to develop a repeatable and defensible method to purge unnecessary and potentially costly data. It was a worthy endeavor, but sometimes painful and laborious. I imagine a simpler, more cost-effective and streamlined process with Equivio and data stored in O365.
In my mind this is game-changing. Hooray for Microsoft for innovating (or buying and integrating) and raising the bar.