Let’s say you’re out of your mind, which I just might be, and you want to republish an older novel of your own to which you own the rights. You might take a bandsaw and one of the original hardcovers of your novel, which was finished and published before the advent of digital publishing, cut off the spine, and then run the pages through your scanner, the feeder of which might just break while you’re doing it. Let’s just say that’s what you’re doing. Here are my recommendations:
- Scan at a high resolution. I did it at 600 dpi. Really, less than that should be fine; I just wanted to give the OCR function in Acrobat Pro as much information as possible.
- Run the OCR function in Acrobat Pro (or the OCR program of your choice).
- Export to Word or RTF after scanning and recognizing text.
- Then, whenever you find a nit, such as 1 for I, do a global search and replace, but look at each hit to make sure it’s one you want to change. Your 1 could be part of that /1 for A ugliness, where a capital A has come through as a slash and a 1.
- Run Word’s grammar checker; it will help find stuff.
- Search for 0 in place of O (zero for the letter O).
- Quotation marks will be screwed up. Guaranteed. Check each one. Search on ” and ‘ and then make sure they’re correct. I had some that were truly weird, especially in sentences beginning with “I. And then go through and make sure that your close-quotes are all correct.
- Then, if you have a lot of formatting, like I did, with italics, etc., make sure the italics are italics and that the stuff that’s supposed to be plain text is actually plain text.
- Finally, go through the PDF and the Word/RTF doc side by side on the screen and make sure you’ve got what you think you’ve got. Then put your Word or RTF file into another font. You’d be surprised at how many things this will help you find. (Blowing it up to 200 percent and marching through it will also help.)
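
If you would rather have a list of suspects in front of you before you start searching by hand, the checks above can be roughed out in a few lines of Python. This is only a sketch, not what I actually ran: it assumes you have saved the exported file as plain text, and the patterns and the function names (`find_suspects`, `unbalanced_quote_paragraphs`) are my own made-up illustrations that you would need to tune for your scan. It changes nothing. It only points, so you still look at every hit yourself.

```python
import re

# Hypothetical patterns for the OCR nits described above; tune them for your scan.
SUSPECT_PATTERNS = {
    "lone 1 (maybe I)": re.compile(r"(?<![\w/])1(?![\w/])"),
    "0 or 1 touching a letter (maybe O, I, or l)": re.compile(r"[A-Za-z][01]|[01][A-Za-z]"),
    "/1 (maybe A)": re.compile(r"/1"),
}

OPEN_QUOTE = "\u201c"   # left curly double quote
CLOSE_QUOTE = "\u201d"  # right curly double quote


def find_suspects(text):
    """Return (line number, column, label, line) for every suspicious spot.

    Nothing is replaced; each hit is meant to be reviewed by eye.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SUSPECT_PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((lineno, match.start(), label, line.strip()))
    return hits


def unbalanced_quote_paragraphs(text):
    """Return 1-based numbers of paragraphs whose open and close quotes differ.

    Paragraphs are assumed to be separated by blank lines. A speech that
    runs over several paragraphs will be flagged too, which is correct
    typography, so treat this as a reading list and not an error list.
    """
    flagged = []
    paragraphs = re.split(r"\n\s*\n", text)
    for number, paragraph in enumerate(paragraphs, start=1):
        if paragraph.count(OPEN_QUOTE) != paragraph.count(CLOSE_QUOTE):
            flagged.append(number)
    return flagged
```

To use it, read your text file into a string (something like `open("novel.txt", encoding="utf-8").read()`, where `novel.txt` is a placeholder name) and pass it to each function. Ordinary numbers such as years are left alone by these patterns, but expect some false alarms anyway; that is the price of not trusting a blind replace.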