Cost of Storage and Processing

Aaron asked a good question in the comments on last night’s post:

Given the (low) cost of storage, is it worth the time/hassle of keeping track of 1GB vs. 15GB? I have no idea what typical volumes of data are for your organization, but as you noted, the client will end up paying more for the larger quantity of data reviewed… does the storage and machine time really make up a large actual cost?

I started to reply to Aaron’s comment in the comments, but then it occurred to me that I’d be better off making this a new post, because I want to get into some of the nuances of e-discovery in my explanation.

First off, yes: with the cost of storage being what it is, it doesn’t cost us much to store data, especially when comparing 1GB to 15 or 20GB. Processing that much data, however, is a bit different. For example, let’s take three cases where clients present us with PST files and we need to use processing software to extract each message, each attachment, and all corresponding metadata so that we can load it into a review database. The first is a 1GB PST, the second is a group of PSTs totaling 20GB, and the third is 100GB.

With billable hours being the determining factor in the cost of in-house processing, you are going to be billed for the time I spend connecting the external drive to our processing machine, setting up the new project parameters, labeling the appropriate information, doing a quality control check after it processes, and finally starting the export process. There’s no difference in my time for 1GB or 20GB, but how long the machine is in use is vastly different. Granted, we don’t have the most robust processing tools, but 1GB can be kicked out in a couple of hours, while 20GB takes a couple of days! Keep in mind that during those couple of days, anything else that comes in has to wait to be processed. There may not be a fixed “cost” of machine time, but there is definitely an opportunity cost in tying up resources that could otherwise be doing billable work, especially if the waiting work is time-sensitive and we end up having to outsource it, losing potential revenue, in the interest of getting it done on time.

The 100GB case, given our resource limitations, would almost certainly be shipped to an outside vendor, and that cost (which may very well be a per GB charge by the vendor!) is then passed directly to the client.

At the end of the day, two of the clients are billed the exact same amount for processing and the third is billed a cost that may be higher or lower because it’s based on a completely different factor than the first two. Throw in the very real possibility that these are actually three different collections from the same client, coming in at different times, and what you have is a lack of clarity in billing/costs that is the essence of the argument against billable hours!

So while the firm, as a whole, really does make up the difference when it comes to the longer review process, it might be clearer to people if we simplified the process when it comes to certain tasks, and moved away from time as the factor and toward volume. My point is less about the actual cost in dollars and more about how to make the process clear and fair to all clients.
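To make the contrast concrete, here is a rough sketch of the three cases billed both ways. Every number in it is made up for illustration: the hourly rate, the hands-on hours, the per GB rates, and the cutoff for outsourcing are all assumptions, not our actual figures.

```python
# All figures are hypothetical, chosen only to illustrate the shape of the problem.
HOURLY_RATE = 150.0       # assumed billable rate, dollars per hour
STAFF_HOURS = 2.0         # assumed hands-on time; the same for any in-house job
PER_GB_RATE = 25.0        # assumed flat volume-based rate, dollars per GB
VENDOR_PER_GB = 40.0      # assumed outside vendor rate, dollars per GB
IN_HOUSE_LIMIT_GB = 50.0  # assumed size above which the job goes to a vendor


def hourly_bill(size_gb: float) -> float:
    """Time-based model: staff hours in house, vendor pass-through otherwise."""
    if size_gb > IN_HOUSE_LIMIT_GB:
        return size_gb * VENDOR_PER_GB
    return STAFF_HOURS * HOURLY_RATE


def volume_bill(size_gb: float) -> float:
    """Volume-based model: the bill scales directly with the data received."""
    return size_gb * PER_GB_RATE


for size in (1, 20, 100):
    print(
        f"{size:>3} GB  time-based: ${hourly_bill(size):>8.2f}"
        f"  volume-based: ${volume_bill(size):>8.2f}"
    )
```

Under the time-based model the 1GB and 20GB bills come out identical and the 100GB bill follows a different rule entirely, while the volume-based model gives one rule that applies to all three.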

Secondly, the cost of storage actually works the other way too. As organizations struggle with the amount of data they have, it’s cheap and easy to “throw more storage” at the problem, as opposed to making the hard decisions about what they need to keep, what they don’t, and how to enforce that policy. I call it the Gmail theory: “don’t worry about deleting or organizing, just keep everything and we’ll search it!” That may work great when Google is constantly indexing everything in your email, but it might not work so great when you’re keeping everything on servers and shared drives, and not indexing or organizing it in any way.

When these organizations are then party to litigation, the amount of data that needs to be searched, or possibly reviewed, continuously grows. There may be no real difference to our firm in the cost of a 1GB versus 15GB case right now, but when you start talking about terabytes of data? As the cost of storage declines, the amount of data being stored by organizations grows, and so does the amount of storage we have to add in order to hold the data relevant to our cases. That cost will need to get passed on somehow, and right now that is through the review process, but it’s an indirect method of passing the cost along. If you want something clearer and easier to understand, perhaps we should have a small per GB charge for storing data? Tie the amount of data directly to their costs, and perhaps that would even encourage them toward better document retention and organizing behavior as well, in turn leaving us fewer documents to review, and attorneys more time to work on other, much more interesting, endeavors?

Possibly, but that’s a discussion for another time, and one that, being a non-lawyer, I wouldn’t even know where to begin with. 🙂

Tags: BillableHours, Ediscovery, ESIVolume

  1. Makes sense. Putting actual numbers to the issue as well as a quantifiable opportunity cost and I see where it would make sense to charge by volume.

  2. Of course the challenge is getting a firm to put a number to it and get out of the billable hour mindset, as well as going back to the original post, figuring out what to use for something like review instead.
