Posted on: October 07, 2015 in Blog
How to Remove False Positives From Proximity Searches
This post explains how to reduce the number of false positives returned in your keyword search results by using the dtSearch NOT W/ and AND NOT operators.
Counsel conducting discovery often need to remove false positives from their search hits by applying exclusion criteria. This is typically the case when a search term hits on words contained in an email footer. The NOT W/ and AND NOT dtSearch operators can exclude most false positives. For example, if we are searching for the word privileged, we can use privileged NOT W/1 “This message is privileged and confidential.” Other cases, however, can be much more difficult.
Consider the following hypothetical case. Suppose your client is a milk company that wants to search an employee’s email account for any correspondence regarding price skimming. A search for skim* W/15 price yields too many false positives that reference skim milk instead of price skimming. You might consider using (skim* W/15 price) AND NOT milk; however, this construction is over-exclusive because it filters out every document that mentions milk anywhere, including documents that discuss both price skimming and skim milk. You might also be tempted to use the following invalid construction:
INVALID SEARCH: (skim* NOT W/3 milk) W/15 (price)
Though the above search may appear to return your intended search hits, a NOT W/ proximity operator should not be nested within another W/ proximity operator, because the resulting syntax is ambiguous and the set of search hits is inaccurate. The above string may return only a partial list of hits instead of the full population of hits responsive to your discovery request. It may also return a handful of documents that fall completely outside the intended search criteria. A more accurate method of finding the responsive documents combines the following two search formulas:
CERTAIN HITS: (skim* W/15 price) AND NOT (skim* W/3 milk)
- AND -
POTENTIAL HITS: (skim* NOT W/3 milk) AND (skim* W/15 price) AND (skim* W/3 milk)
The Certain Hits search string returns a subset of documents that meet our search criteria without returning any of the false positives that reference milk. The Potential Hits string returns additional documents that meet our search criteria, along with a limited number of false positives.
In the first search above, we verify that skim* is near price and that no instance of skim* is near milk. These hits meet our criteria, and we can prioritize them for review. In the second search above, we verify that the document has an instance of skim* without milk nearby and that an instance of skim* is within 15 words of price. However, because the document also contains an instance of skim* near milk, we cannot be sure which instance of skim* actually falls within 15 words of price. Only a careful review of each document in the Potential Hits group will determine whether or not it is a false positive.
Alternatively, you may want to return all potential and certain hits in one search, and that string is below:
ALL CERTAIN AND POTENTIAL HITS: (skim* NOT W/3 milk) AND (skim* W/15 price)
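To make the difference between the three groups concrete, here is a minimal Python sketch that models the searches on plain text. It is a simplified stand-in for the proximity logic described above, not dtSearch's actual implementation: the tokenizer, the word-distance rule, and the helper names (tokens, positions, near, not_near, classify) are all assumptions made for illustration.

```python
import re

def tokens(text):
    """Lowercase word tokens; a stand-in for a real indexer's tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())

def positions(words, term):
    """Indexes of words matching term; a trailing * matches any suffix."""
    if term.endswith("*"):
        return [i for i, w in enumerate(words) if w.startswith(term[:-1])]
    return [i for i, w in enumerate(words) if w == term]

def near(words, a, b, n):
    """Models 'a W/n b': some occurrence of a is within n words of b."""
    pb = positions(words, b)
    return any(abs(i - j) <= n for i in positions(words, a) for j in pb)

def not_near(words, a, b, n):
    """Models 'a NOT W/n b': some occurrence of a has no b within n words."""
    pb = positions(words, b)
    return any(all(abs(i - j) > n for j in pb) for i in positions(words, a))

def classify(text):
    """Sort a document into the Certain, Potential, or excluded group."""
    words = tokens(text)
    near_price = near(words, "skim*", "price", 15)
    near_milk = near(words, "skim*", "milk", 3)
    away_from_milk = not_near(words, "skim*", "milk", 3)
    if near_price and not near_milk:
        return "certain"      # (skim* W/15 price) AND NOT (skim* W/3 milk)
    if away_from_milk and near_price and near_milk:
        return "potential"    # needs manual review
    return "excluded"

docs = [
    "We should discuss price skimming before the launch.",
    "The price of skim milk rose again.",
    "Price skimming is risky. Separately, skim milk sales are flat.",
]
for doc in docs:
    print(classify(doc), "|", doc)
```

Under this model the first document is a certain hit, the second is excluded, and the third lands in the potential group. The third document is also the kind that (skim* W/15 price) AND NOT milk would wrongly discard, since it mentions milk while still discussing price skimming.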
The above case is certainly an exception, but it highlights the need to understand in detail how different dtSearch operators function together and where problems can arise. As advanced as today’s search tools are, they do not always throw errors when search terms contain invalid syntax. If there is any doubt about how a search term is functioning, it is best to have an industry expert assist with formatting and testing search terms. Included below is a resource for your personal use; each string uses generic placeholders that can be replaced with terms specific to your search.
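As one way to work with placeholder strings, the sketch below assembles the three search strings from this post out of generic parts. The function name build_searches and its parameters (term, context, noise, and the two distances) are hypothetical names chosen for this example; only the resulting string patterns come from the searches shown above, and any generated string should still be tested in your own dtSearch environment.

```python
def build_searches(term, context, noise, context_dist=15, noise_dist=3):
    """Return the Certain, Potential, and All search strings.

    term    -- the keyword you want (for example skim*)
    context -- the word that makes a hit relevant (for example price)
    noise   -- the word that signals a false positive (for example milk)
    """
    near_context = f"({term} W/{context_dist} {context})"
    near_noise = f"({term} W/{noise_dist} {noise})"
    away_from_noise = f"({term} NOT W/{noise_dist} {noise})"
    return {
        "certain": f"{near_context} AND NOT {near_noise}",
        "potential": f"{away_from_noise} AND {near_context} AND {near_noise}",
        "all": f"{away_from_noise} AND {near_context}",
    }

# Reproduce the strings used in this post.
for name, query in build_searches("skim*", "price", "milk").items():
    print(f"{name.upper()}: {query}")
```

Keeping each proximity clause as its own parenthesized group means a NOT W/ clause is never nested inside another W/ clause, which is the construction this post warns against.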