Peace Library News

A weblog from the Peace Library
at the Centre for Conflict Resolution, Cape Town

Tips from the Librarian's desk : OCR scanning

These tips have been compiled from various scanner manuals:

  • Use crisp, clear text. The most important step toward an accurate text scan is to start with a good-quality original. Tears, wrinkles and smudges can confuse the OCR software and lead to errors in the final output.

  • Black text on a white background is best.

  • Clean up a dirty original with a dab of correction fluid, or make a photocopy to improve its contrast.

  • Use text set in 9-point type or larger.

  • If the text on the original is slanted, or the page shape is irregular, it can be difficult to align the text properly for scanning. Draw a line on the back of the page that corresponds with the baseline of the text, and use that line to align the page with the scanner's guides.

  • Translucent papers and newsprint allow text from the opposite side to show through the paper, confusing the scanner. Place a sheet of black paper over the back of the page to block the light.

  • These types of text might convert less accurately:

      • text close to non-text elements: bullets, lines, or graphics
      • text in spreadsheets, tables, or forms
      • letters that have gaps, bleed along their edges, or touch each other
      • underlined text
      • text on colored paper

  • Handwriting cannot be converted.

  • Use your word processor's spell checker to check the scanned text when you have finished scanning. Though OCR programs have built-in dictionaries, they are seldom as extensive or precise as the one you've personalized for your own use.
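
As a rough illustration of the spell-checking tip, the sketch below flags words in scanned text that are missing from a personal word list. The word list, the sample sentence, and the function name flag_suspect_words are all made up for this example, and a real spell checker does far more than this simple lookup.

```python
import re

# Hypothetical personal word list; in practice it would be loaded from a file.
KNOWN_WORDS = {"the", "peace", "library", "is", "in", "cape", "town"}


def flag_suspect_words(text, known_words):
    """Return words not found in known_words, in order of first appearance."""
    suspects = []
    # Treat runs of letters, digits and apostrophes as words, so that a
    # typical OCR slip such as "T0wn" (zero for "o") stays in one piece.
    for word in re.findall(r"[A-Za-z0-9']+", text):
        if word.lower() not in known_words and word not in suspects:
            suspects.append(word)
    return suspects


if __name__ == "__main__":
    scanned = "The Peace Libnary is in Cape T0wn"
    print(flag_suspect_words(scanned, KNOWN_WORDS))
```

Words that are flagged are only candidates for correction: proper names and technical terms will show up too until they are added to the word list.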