Make the most of new technology in litigation

Issue February 2003 By Douglas M. Bean

The use of technology is no longer a question in the practice of law. Remaining only are the issues of how lawyers will use it, who will be most effective at exploiting it and the fate of those who fail to embrace it.

Today, an autopsy of any successful case will show some level of effective use of technology. Corporate clients demand technology as a cost-cutting measure. The courts already require some level of technology in discovery management and soon will for electronic filing. And if it hasn't happened already, opposing counsel will soon confront you with powerful information management through their use of technology.

The focus of this article will be the effective use of four pieces of litigation technology: digital imaging or scanning, data extraction and indexing from the images, electronic discovery and data management.

Digital imaging

Law firms do gain economic and strategic advantage through imaging. Advantages come through more powerful control over data and more efficient access to their documents.

There are several economic advantages, including cost savings, storage savings, employee savings and review savings.

Typical law firms still reprint at least one quarter of their current collections over a 12-month period. These copies come in the form of "extra" sets to minimize damage to original copy sets, productions to opposing counsel, etc. A firm with collections totaling 1 million pages therefore prints approximately 250,000 pages over a 12-month period. Paper, toner and hardware wear and tear cost approximately $45,000. Maintaining a set of digital images would eliminate most of these costs, virtually paying for imaging all collections in the first year.

A 1 million page collection occupies from 500 to 750 boxes requiring at least 2,000 square feet of storage space. At an average of $14 per square foot, firms could realize an annual savings of $28,000 just by eliminating costly storage space.

By eliminating full-time clerks dedicated to retrieving documents produced by database searches, copying relevant documents, circulating copies to appropriate legal staff and dealing with the inevitable problem of re-filing and searching for misfiled documents, a typical savings on a medium-sized case could be as much as $50,000 annually.

Accurately quantifying review savings is not simple. A conservative estimate is that attorneys reviewing the documents in a case take about 250 fewer hours with digital images than with a paper-based system. At $100 per hour, this translates into a savings of approximately $25,000.
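
The four estimates above can be added up in a few lines. The following is a minimal sketch in Python that uses only the figures quoted in this article; the variable names are illustrative, and each figure is treated as an upper estimate rather than a guaranteed saving.

```python
# Rough annual savings for a 1-million-page collection, using the
# article's estimates. All names here are illustrative.

PAGES_IN_COLLECTION = 1_000_000
REPRINT_PERCENT = 25               # at least one quarter reprinted per year
REPRINT_COST = 45_000              # paper, toner, hardware wear and tear

STORAGE_SQUARE_FEET = 2_000
COST_PER_SQUARE_FOOT = 14          # dollars per year

CLERK_SAVINGS = 50_000             # medium-sized case, upper estimate

REVIEW_HOURS_SAVED = 250
HOURLY_RATE = 100                  # dollars per attorney hour

pages_reprinted = PAGES_IN_COLLECTION * REPRINT_PERCENT // 100
cost_per_reprinted_page = REPRINT_COST / pages_reprinted
storage_savings = STORAGE_SQUARE_FEET * COST_PER_SQUARE_FOOT
review_savings = REVIEW_HOURS_SAVED * HOURLY_RATE

# Assumes imaging eliminates all of the reprint cost; the article says "most".
total = REPRINT_COST + storage_savings + CLERK_SAVINGS + review_savings

print(f"Pages reprinted per year: {pages_reprinted:,}")
print(f"Implied cost per reprinted page: ${cost_per_reprinted_page:.2f}")
print(f"Storage savings: ${storage_savings:,}")
print(f"Review savings: ${review_savings:,}")
print(f"Total estimated annual savings: ${total:,}")
```

Taken together, the estimates come to roughly $148,000 a year. Substituting a firm's own page counts, rates and rents into the same arithmetic gives a quick first check on whether imaging pays for itself.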

In addition, there are strategic advantages in terms of accuracy in response and collaboration.

Both the plaintiff and defense bar now use technology to their advantage. Inconsistencies in responses across multiple jurisdictions are now more likely to be uncovered by aggressive and organized opposing counsel. Unifying the document collection into one set of digital images allows for more accurate productions, better tracking and a more efficient way to ensure consistency in multi-jurisdictional cases.

The ability to share document images across a wide group of users quickly and at low cost creates synergies that win cases. Copy sets can be created faster, more cheaply, and with 100 percent accuracy, a significant advantage in multi-counsel or other multi-copy cases.

The ability to quickly share documents enables a better institutional knowledge of the nature and content of the collection. Those reviewing can spot "hot" documents more quickly, respond to last-minute discovery more thoroughly and prepare more accurate responses during the life of the litigation.

Efficiency is another advantage of imaging. CD-ROM, the most popular storage medium for images, is purported to have a 50-year lifespan. Imaged document collections are also portable. Imaging reduces five boxes of documents to one CD, so 1,000 boxes (2 million pages, or the equivalent of a 4,000-square-foot storage facility) fit on CDs held in 6-inch binders.

Dos and don'ts of imaging include:

•  Don't become an imaging expert. Too frequently firms lose focus on what they do best: practicing law. If your document collection exceeds five boxes, turn to an expert for imaging. The cost of scanning paper in-house while maintaining quality and compatibility with other parties is far greater than the per page cost charged by an imaging vendor.

•  Choose the right vendor. All imaging vendors are not created equal. Many profess expertise. However, to realize the true advantages of imaging documents you need a vendor who is committed to high quality, has years of experience and who guarantees the work.

•  Make sure your vendor can output images in all industry formats (TIF, PDF, JPG, GIF, etc.) and that they are compatible with current database programs (Concordance, Summation, etc.). Visit their facility; often your eyes tell you everything you need to know. When you visit, request a demonstration of their quality control process in practice. Do not settle for an explanation; make sure you see it work. Ask for at least five law firm clients they have worked for in the last six months. Call each reference and ask about quality and timeliness.

•  Think about integration. Many of the advantages of imaging come through the ability to share the images with others. To do so, you must have images that come in a format compatible with other programs and systems. Frequently opposing counsel will share costs if you can produce images in a format compatible with their software. If the case goes to trial, make sure the image format you choose is compatible with your trial presentation software, as well as any database you may use before trial.

Data extraction

After digital imaging, the next step to effectively managing documents is to extract meaningful data for searching purposes. There are a variety of ways to accomplish this, each with its own advantages, costs and efficiencies.

Bibliographic Coding

Bibliographic coding is the process of extracting key pieces of information from a document such as date, title, source, document type, author, addressee, copyee, etc.

The advantages to this type of coding are numerous. Once the information is extracted in a searchable form, document review becomes simple across large collections. Users have the ability to focus searches (e.g., all CONTRACTS dated after 1990 that mention JOHN DOE). Attorneys can remove privileged and irrelevant documents and respond to discovery much more quickly and effectively. All this means better preparation and more powerful control of data.
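
A focused search of the kind described above might look like the following minimal sketch in Python. The record layout, field names and sample documents are all hypothetical, and "mention" is assumed to mean that the name appears in one of the coded fields; commercial litigation databases offer their own query tools.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CodedDocument:
    """One record of bibliographic coding (hypothetical field names)."""
    doc_id: str
    doc_type: str
    doc_date: date
    author: str
    addressee: str
    title: str


def focused_search(records, doc_type, after_year, name):
    """Return records of a given type, dated after a year, that mention a name."""
    name = name.upper()
    hits = []
    for rec in records:
        # Search only the coded fields, not the full text of the document.
        mentions = " ".join([rec.author, rec.addressee, rec.title]).upper()
        if (rec.doc_type.upper() == doc_type.upper()
                and rec.doc_date.year > after_year
                and name in mentions):
            hits.append(rec)
    return hits


# Made-up sample records, for illustration only.
records = [
    CodedDocument("DOC-0001", "Contract", date(1994, 3, 1),
                  "John Doe", "Acme Corp.", "Supply agreement"),
    CodedDocument("DOC-0002", "Contract", date(1988, 6, 15),
                  "John Doe", "Acme Corp.", "Lease"),
    CodedDocument("DOC-0003", "Memo", date(1996, 1, 9),
                  "Jane Roe", "John Doe", "Status update"),
    CodedDocument("DOC-0004", "Contract", date(1999, 11, 30),
                  "Jane Roe", "Acme Corp.", "Services agreement"),
]

for rec in focused_search(records, "CONTRACT", 1990, "JOHN DOE"):
    print(rec.doc_id, rec.doc_date.isoformat(), rec.title)
```

Of the four sample records, only DOC-0001 satisfies all three conditions. The point is that each coded field becomes a filter, which is what lets a reviewer narrow a large collection quickly.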

In-text Coding

In-text coding involves the same basic process as bibliographic coding, but the information extracted is slightly different. This extraction involves looking for specific key words, specific names, product types, etc.

Depending on the needs of the litigation, this type of data extraction may be more useful if users plan on searching mainly for the key words and not bibliographic information.

Since this type of coding is usually more expensive than bibliographic coding, it is critical to examine the searching needs of the attorneys before starting this process. If you can isolate the type of searching most critical to the case, you can make more informed decisions and choose the type of data extraction best suited for your case.

OCR (Optical Character Recognition)

OCR is the process by which a computer program reads the characters on a digital image and attempts to reproduce the text from the original document.

Much has been written about OCR and its use in litigation. Some corporate law departments routinely OCR every document they receive. My experience is that the use of OCR must be carefully weighed against the end users' needs for searching.

Frequently, the quality of OCR is poor; accuracy levels of 70 percent to 80 percent are common. Those levels may sound high until you consider that for a typical 2,000-word document, 70 percent accuracy means 600 words are missed! If extremely accurate searching is required, then OCR may not be acceptable.
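
The arithmetic behind that warning is simple. The sketch below, in Python, assumes accuracy is measured per word and uses a 2,000-word document; the function name is illustrative.

```python
def words_missed(total_words: int, accuracy_percent: int) -> int:
    """Estimate how many words OCR gets wrong at a given word-level accuracy."""
    return total_words * (100 - accuracy_percent) // 100


for accuracy in (70, 80, 95, 99):
    print(f"{accuracy}% accuracy on 2,000 words: "
          f"about {words_missed(2000, accuracy)} words missed")
```

At 70 percent accuracy, 600 of 2,000 words are wrong; at 80 percent, 400. Any search term that falls among the missed words will not find the document.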

A variation of this process involves using OCR to produce an initial text file, and then using a vendor to "clean up" the mistakes generated in the OCR. This approach is widely used, and can be very cost effective when only the most important document types are selected for this treatment.


Re-keying

The most time-consuming and costly type of data extraction is re-keying. This process involves humans physically looking at the digital image and re-typing the entire document into a searchable format.

This process is also the most accurate and searchable, and it is likely to be the most effective in litigation, since the entirety of the document exists in the database for searching.

Some dos and don'ts include the following:

•  Don't become a data extraction expert. Firms also fall into the trap of thinking they can become data extraction experts, employing untrained (or barely trained) "temps" to do the work. Too frequently firms who undertake this challenge find the cost is higher, the accuracy lower and processing slower than if they chose a reputable vendor to do the same work.

•  Choose the right vendor. As with the imaging vendors, data extraction vendors are not created equal. The key component of data extraction is accuracy. No matter how quick or inexpensive a vendor may be, if their work is shoddy and inaccurate the end product (the database) will be unusable. If lawyers have no confidence the data was extracted properly, they will have no confidence in the accuracy of their searching and will quickly abandon the technology.

To protect yourself, make the vendor show you the following: 1) Quality control measures used to ensure the level of accuracy you demand. Do not settle for a discussion of the process; require the vendor to show you actual output from their quality control system. 2) A list of satisfied legal industry clients they have served in the last six months. Make sure to call each reference and ask specifically about quality and timeliness. 3) Proven mastery of industry technology. Make sure the vendor can output data in all industry database formats. 4) A concrete pricing schedule that shows you exactly what the cost of the project will be, procedures for overage and the process for resolving questions on invoices.

Remember, the lowest price will not always buy the most usable data. The key is achieving the right price for the quality you need.

•  Think about integration. The quality of the data matters little if you cannot use it. Before beginning a data extraction project think carefully about how the data will be "consumed." Consider questions such as: how will the data be searched (in a database, on-line, etc.); will you share the data with others (the client, co-counsel, joint defense groups, etc.); what database application will be used for searching; how long will you need to retain the data?

Make sure any vendor you choose can provide the data to you in whatever format you choose and compatible with whatever database application you intend to use.

Electronic discovery

Today over 50 percent of corporate documents are never printed; they exist only electronically. According to a recent study, "93 percent of all information generated in 1999 was generated only in digital form." ("Digital Discovery Starts to Work." The National Law Journal, Monday, November 4, 2002, Pg. C3, Cover Story.) In a legal industry traditionally focused on paper, a radical paradigm shift is necessary for managing electronic data for discovery purposes.

Any large production now involves a client's electronic files, as well as his or her paper documents. Lawyers must now cope with the process of collecting, searching and producing relevant electronic files, but how?

For several years technology has been available to convert electronic files directly into searchable databases for review purposes. Attorneys can now review a client's e-mail and other electronic data, select relevant files and then create images of responsive documents, or print them to paper, for production to opposing counsel.

Dos and don'ts include the following:

•  Don't go solo. Electronic discovery is a complex process. Firms should use the resources they have available: their IT staff, the vendor and their support staff to make an accurate decision on how data will be collected, how it will be delivered to the vendor and how the data will be searched once it is processed. Let each group focus on its core competencies and the team will operate much more efficiently.

•  Let technology do your work. Too many firms try to cut costs through manual review, printing or other "home grown" solutions before giving data to the vendor. This strategy nearly always fails. Vendors have the technology to quickly and efficiently convert data. Let them do the "heavy lifting" of making your data searchable, and then focus on what you do best, legal analysis.

•  Manage cost creatively. There are many ways to attack electronic discovery review. For example, use a vendor who can allow you to search electronic files before creating images of any documents. Choose a vendor who can modify their conversion process (perhaps you want to exclude a file type from conversion, or elect not to convert attachments to e-mails, etc.). Choose vendors who can "de-dupe" files, remove useless blanks, garbage and virus-laden files. Choose a vendor who has a convenient review application (Web-based is easiest) so users can review and mark up data from anywhere at any time. In large reviews this is both very cost effective and can speed the process.

•  Again, choose the right vendor. Because electronic discovery is relatively new, there is less standardization and uniformity in the industry. Hence, selection of the right vendor becomes even more critical.

When choosing a vendor, make sure the following are present: 1) Flexibility. Beware a vendor who tells you it can't be done. Most technological puzzles can be solved, although cost is always an issue. The main point is you want a vendor who will think creatively and help you solve problems, not offer rigid excuses. 2) Experience. As with all technology, the more experience a vendor has with the complexity and vagaries of a given process the better. Ask for current references and call each one. Ask specifically about quality, timeliness and expertise.

•  Plan for the unexpected. The result of large-scale electronic discovery can sometimes be surprising: you may confront exponentially larger data sets than you expected, image counts that soar beyond predictions and databases so large that meaningful review could extend into years.

Before undertaking electronic processing of this kind consider the following: 1) Where will you "house" the data that comes from this processing? If the data set is large it will likely exceed your current hardware. Consider having a vendor bear the burden and expense of housing the data. Purchasing hardware and managing the personnel to support it are not the traditional forte of a law firm. 2) How will you effectively review the data? One gigabyte (the size of a typical individual's Outlook e-mail file) can produce 70,000 pages (or approx. 20,000 documents). Do you have the personnel to review a document population that large? Consider key word searches to narrow the population, and a database application sharable among many users to facilitate coordinated review.
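
Several of the steps in this list (removing blanks, "de-duping," key word narrowing and estimating volume) can be illustrated with a minimal sketch in Python. It assumes duplicates are byte-for-byte identical files, which is only one way to define a duplicate; the function names and sample files are hypothetical, and the pages-per-gigabyte ratio is the rough figure quoted above, not a constant. Vendor tools are considerably more sophisticated.

```python
import hashlib

PAGES_PER_GIGABYTE = 70_000   # the article's rough estimate, not a constant


def estimate_pages(gigabytes: float) -> int:
    """Rough page count for a data set, using the article's ratio."""
    return round(gigabytes * PAGES_PER_GIGABYTE)


def de_dupe(files):
    """Keep the first copy of each file whose content is byte-for-byte identical.

    `files` is a list of (name, content_bytes) pairs.
    """
    seen = set()
    unique = []
    for name, content in files:
        digest = hashlib.sha256(content).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((name, content))
    return unique


def keyword_filter(files, keywords):
    """Keep files whose text contains at least one key word (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for name, content in files:
        text = content.decode("utf-8", errors="ignore").lower()
        if any(k in text for k in wanted):
            hits.append((name, content))
    return hits


# Made-up sample data, for illustration only.
files = [
    ("msg1.txt", b"Please review the attached supply contract."),
    ("msg1_copy.txt", b"Please review the attached supply contract."),
    ("msg2.txt", b"Lunch on Friday?"),
    ("msg3.txt", b""),
]

non_blank = [(name, content) for name, content in files if content.strip()]
unique = de_dupe(non_blank)
to_review = keyword_filter(unique, ["contract"])

print(estimate_pages(5))                  # pages expected from 5 GB
print([name for name, _ in unique])       # after removing blanks and duplicates
print([name for name, _ in to_review])    # after key word narrowing
```

In this toy example four files shrink to one before any attorney reads a page. The same ordering, with cheap automated culling first and human review last, is what keeps a large electronic review manageable.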

Data management

Digital images and thorough data extraction do no good if you have no way of accessing the information. Countless firms have spent fortunes only to find the data management tool (typically database software) is underpowered, or is ill suited for the task at hand.

Again, plan with the end in mind. Think about the real world needs of the final user, typically an attorney or paralegal responding to discovery requests. Will they need to redact images and do frequent productions to the opposing side? Or will they need raw search power to find that one "hot" document? Will they need to constantly import new data into the system or export it to other parties? Will the sheer volume of the data overwhelm typical databases? When contemplating these questions, think of these items:

•  Raw power. Choose a data management product that is scalable. You may not have a million documents in the present collection, but you may have 100,000 or half a million in the future. Choose an application that can handle the stress of multiple users searching large quantities of data. You may not need it now, but finding out your database is underpowered midstream can cause sleepless nights.

•  Collaboration. In today's techno-savvy marketplace you will need to share your data with others. You may want to share documents with the client, co-counsel and insurer, or opposing counsel. Choose an application that allows multiple, disconnected parties to access the documents.

•  Security. Second only to the need for collaboration is the need for security. Choose an application that employs the industry's highest level of encryption. Also, choose a tool that provides a high degree of flexibility in rights management. Your software should allow you to prevent unwanted access, allow limited access to some and full access to others. The best tools allow extreme granularity in rights administration, down to the ability to limit access to individual pieces of information on a given document.

•  One-stop shopping. Choose a data management product that allows you to go to one location to do your work. Applications that force you to look in many different repositories to find your data often frustrate users.

•  Integration. Choose a data management tool that allows you to export data to various other formats and applications. If your data is trapped in your system, its efficiency is greatly reduced. Whether you are exporting images to trial presentation software, sharing information with others who use a different application or producing data to opposing counsel, your application should enable simple delivery of data to others.
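
To make the idea of granular rights administration concrete, here is a minimal sketch in Python of field-level access control. The roles, field names and sample record are hypothetical and do not describe any particular product; a real system would also need authentication, encryption and audit logging.

```python
# Hypothetical roles and the fields each may see; not any product's actual model.
FIELD_PERMISSIONS = {
    "lead_attorney": {"doc_id", "title", "author", "date", "privilege_notes"},
    "co_counsel": {"doc_id", "title", "author", "date"},
    "client": {"doc_id", "title"},
}


def visible_fields(record: dict, role: str) -> dict:
    """Return only the fields of a record that the given role may see.

    Unknown roles get nothing, so access is denied by default.
    """
    allowed = FIELD_PERMISSIONS.get(role, set())
    return {field: value for field, value in record.items() if field in allowed}


# Made-up sample record, for illustration only.
record = {
    "doc_id": "DOC-0001",
    "title": "Supply agreement",
    "author": "John Doe",
    "date": "1994-03-01",
    "privilege_notes": "Draft reviewed by counsel",
}

for role in ("lead_attorney", "co_counsel", "client", "opposing_counsel"):
    print(role, sorted(visible_fields(record, role)))
```

Here co-counsel sees the bibliographic fields but not the privilege notes, the client sees only the identifier and title, and a role that has not been granted anything sees nothing at all.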

In today's marketplace it remains for every firm to decide to what degree it will master current applications and invoke the advantage, power and efficiency of technology. The effective use of digital imaging, data extraction, electronic discovery and data management is a start to crafting a comprehensive technology strategy. If firms will take care to examine their data "consumption" needs up front, choose experienced vendors committed to quality and focus on choosing applications that make long-term sense, they will enjoy quick and often decisive advantage over their opponents.

Douglas M. Bean is chief technology officer and general counsel for CaseData Corporation.