“Help! I received 1 TB of data from my client and I don’t know what’s in there and what’s relevant!”
Does this sound familiar? If so, you’re not alone. It is not uncommon to feel overwhelmed after receiving large data sets from clients — especially when you’re unsure of how much data needs to be reviewed. Remember: just because it was collected, doesn’t mean it all needs to be processed and reviewed.
The next time you receive a large data dump, try these practices when building out your strategy. All five steps can be done remotely and help you through even the most frustrating cases.
1. DeNISTing
The NIST list is a standard list of known system applications and files that are unrelated to a matter. Most eDiscovery software allows administrators to load the NIST list so those files can be removed automatically during processing, a step known as DeNISTing. This eliminates unwanted system files and reduces the document count.
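Under the hood, DeNISTing is a hash lookup: each file is hashed and discarded if its hash appears on the known-file list. Here is a minimal sketch in Python; the `nist_hashes` set is a stand-in for the real NIST list, which your eDiscovery tool would load for you.

```python
import hashlib
from pathlib import Path

# Hypothetical stand-in for the NIST known-file list, which in practice
# is a large database of hashes loaded by your processing tool.
nist_hashes = {
    "d41d8cd98f00b204e9800998ecf8427e",  # MD5 of an empty file, as an example
}

def denist(paths, known_hashes):
    """Keep only files whose hash is NOT on the known-file list."""
    kept = []
    for path in paths:
        digest = hashlib.md5(Path(path).read_bytes()).hexdigest()
        if digest not in known_hashes:
            kept.append(path)
    return kept
```

Real tools do this at scale during ingestion, but the principle is the same: known system files are matched by hash and never reach review.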
2. Deduplication
Deduplicating, otherwise known as deduping, removes exact duplicates based on the hash value of each electronic document. Depending on the tool being used, there are different methods of deduping your data. The most common is to dedupe at the family level: identical emails with identical attachments collected from multiple custodians are removed. However, if the same Word document is attached to two different emails sent to different recipients, it is not removed, because the parent (the email) is different. Deduping is a common practice and an easy way to reduce data across multiple custodians.
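Family-level deduping can be sketched as hashing the parent together with its attachments, so the same attachment under a different parent email survives. This is a simplified illustration, not how any particular tool implements it; the names `family_hash` and `dedupe_families` are hypothetical.

```python
import hashlib

def family_hash(email_body, attachments):
    """Hash the whole family: the parent email plus every attachment."""
    h = hashlib.sha256(email_body)
    for att in attachments:
        h.update(hashlib.sha256(att).digest())
    return h.hexdigest()

def dedupe_families(families):
    """families: list of (email_body, attachments) pairs.
    Keep only the first occurrence of each identical family."""
    seen, kept = set(), []
    for body, atts in families:
        digest = family_hash(body, atts)
        if digest not in seen:
            seen.add(digest)
            kept.append((body, atts))
    return kept
```

Note that the same attachment paired with a different parent produces a different family hash, which is exactly why it is not removed.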
3. Date Range Filtering
If your matter is within a specific time frame, you can easily apply a date range for the data to be reviewed. This means only the documents within those specific dates would be processed or exported for your designated review tool.
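Conceptually, date range filtering is just an inclusive date comparison against each document's metadata. A minimal sketch, assuming each document is a simple (name, date) pair:

```python
from datetime import date

def filter_by_date(docs, start, end):
    """docs: list of (name, doc_date) pairs.
    Keep only documents dated within [start, end], inclusive."""
    return [(name, d) for name, d in docs if start <= d <= end]
```

In practice your tool applies this against sent dates, modified dates, or other metadata fields, so confirm which date field the filter uses before culling.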
4. Search Terms
Applying search terms can greatly reduce your data. The search terms can be applied either during the processing phase (so only the potentially relevant data is exported) or once the data is uploaded to your review tool. Depending on the complexity of your case and the tool being used, you can create searches that range from simple to complex.
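At its simplest, term culling keeps any document that hits at least one search term. The sketch below uses case-insensitive whole-word matching; real review tools add stemming, proximity, and Boolean operators on top of this idea. The function names are illustrative, not from any specific tool.

```python
import re

def matches_terms(text, terms):
    """True if the text contains any term as a whole word, ignoring case."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, terms)) + r")\b",
        re.IGNORECASE,
    )
    return bool(pattern.search(text))

def cull(docs, terms):
    """Keep only documents that hit at least one search term."""
    return [doc for doc in docs if matches_terms(doc, terms)]
```

Whole-word matching matters: a bare substring search for "art" would also hit "partner", inflating your review set instead of shrinking it.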
5. File Analysis
File Analysis, Early Case Assessment or Data Assessment — however you refer to it, this process can analyze your data and provide crucial insights. Through programs such as ActiveNav, you can:
- Get a high-level view of the folder structures within the data dump
- See an overview of the types of data you're dealing with
- Determine whether specific folders can be completely excluded from processing
- Discover the true value of your data
Each insight helps you pare down the data and expedite the task of sorting through large sets.
If you have questions about managing large data sets or would like to learn more about ActiveNav, reach out to us today.