OCR: A Tool for Family Historians
Biff Barnes
Old documents are a common problem for family historians. At the recent RootsTech conference I spoke with several people who had diaries, journals or books written by ancestors that they hoped to publish in whole or in part. The question all of them posed was how to do it without having to retype the whole document.
Optical character recognition (OCR) is one possible answer. OCR is a process for translating text from paper into electronic files that can be edited on a computer using a word processing program. An OCR system requires an optical scanner to read the text and software to analyze the images the scanner creates. Some high-end OCR systems use both hardware (specialized circuit boards) and software. Some less expensive systems rely solely on software. OCR systems can read printed text in a large variety of fonts, but they don’t work as well on handwritten documents.
In fact, if you have a scanner, you may already have OCR software that came bundled with it. Open your scanner software and check the options. Text recognition may be one of them.
If you don’t have OCR software, free downloads are available. Check CNET for reviews of some of the available free downloads. Two of the most popular OCR programs available for free download are Free OCR and Smart OCR.
More sophisticated OCR software is available from a variety of sources. The Wise Geek website offers some good guidance in evaluating potential choices, including "What is Optical Character Recognition?", a good overview of the features of OCR programs, and "How Do I Choose the Best OCR Software?" Top Ten Reviews offers comments on top OCR programs and the specific features each offers in its article "2012 Best OCR Software Comparisons and Reviews."
Whatever OCR program you use, plan to compare the original carefully against the document generated by the software, because OCR is not always 100% accurate. You'll want to catch any errors.
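If you are comfortable running a short script, a little automation can point you to places worth a second look while you proofread. The sketch below is only an illustration, not part of any OCR program: it assumes your OCR output has been saved as plain text, and it uses a simple rule of thumb (a word that mixes letters and digits, such as "c0uld" or "o1d", is often a misread letter). The function name flag_suspect_words is made up for this example.

```python
import re

# A word that contains at least one letter AND at least one digit.
# Assumption: such words are often OCR misreads (l/1, O/0 confusion).
MIXED = re.compile(r"(?=\w*[A-Za-z])(?=\w*\d)\w+")


def flag_suspect_words(text):
    """Return (line_number, word) pairs that deserve a second look."""
    suspects = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in MIXED.findall(line):
            suspects.append((line_no, word))
    return suspects


# Hypothetical OCR output; a plain year like 1887 is left alone.
sample = "We arrived in 1887 and c0uld not find\nthe o1d farm."
for line_no, word in flag_suspect_words(sample):
    print(f"line {line_no}: {word}")
```

A check like this will also flag legitimate words such as "4th", and it cannot catch a misread that happens to spell a real word, so it supplements a careful reading against the original rather than replacing it.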