By Rick Borstein
Optical Character Recognition, commonly referred to as OCR, is the process of converting scanned images of letters and words into electronic versions. For example, you can use the Recognize Text feature in Acrobat DC to convert an image of a page into a searchable version in which you can select text, comment on it, and even edit it.
OCR is an imperfect process. While some very good originals will process at or near 100% accuracy, results will suffer if you feed Acrobat a poor-quality document. So, yes, a fax of a fax of a fax is not going to OCR well. Scanned documents may also contain handwriting, which is seldom recognized as text.
OCR affects search quality, and that should be a concern to legal professionals. Consider a contract that may be part of your case. Perhaps the only place your client’s name can be found in the document is in handwritten Name and Signature fields.
If you use Acrobat (or other tools) to search for your client’s name, no results will be returned, because the handwritten name was never recognized as text. Since your client’s name is an important term for most cases, you might want to consider correcting key documents to enhance search results.
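To make the search problem concrete, here is a minimal Python sketch of how you might flag documents whose recognized text never mentions a key term. It assumes you have already exported each document's OCR text to a plain string by some other means; the file names, the sample text, and the function name `documents_missing_term` are hypothetical illustrations and not part of Acrobat.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks in OCR output do not hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def documents_missing_term(ocr_text_by_document: dict[str, str], term: str) -> list[str]:
    """Return the names of documents whose recognized text does not contain the term."""
    needle = normalize(term)
    return sorted(
        name
        for name, text in ocr_text_by_document.items()
        if needle not in normalize(text)
    )


# Hypothetical OCR output: in the contract, the client's name appears only in
# handwritten fields, so it is absent from the recognized text.
ocr_text_by_document = {
    "contract.pdf": "This Agreement is made between the parties.\nName: ________  Signature: ________",
    "cover_letter.pdf": "Dear Ms. Jane\nRivera, please find the contract enclosed.",
}

missing = documents_missing_term(ocr_text_by_document, "Jane Rivera")
print(missing)  # ['contract.pdf']
```

A document that shows up in this list is not necessarily irrelevant. It may simply be one where the term exists only as handwriting or in a poorly scanned region, which makes it a candidate for manual review and correction.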
Fortunately, Acrobat DC includes tools to help you audit OCR quality and correct OCR errors.
The Acrolaw Blog is a resource for lawyers, law firms, paralegals, legal IT pros and anyone interested in the use of Acrobat in the legal community. Rick Borstein, blog author, is a Principal Solutions Consultant with Adobe Systems Incorporated.