Below is a message from an ezine to which I subscribe.
The idea is that they put out-of-print or out-of-copyright books pertaining to Scotland on their site for interested parties to read. They have obviously reached an impasse, and I would appreciate any help, even if it is just a website or a Google email address which could help.
Lockergnome has been great in the past with answers to my queries.
ELECTRIC SCOTLAND
-----------------
Just been pushing on with getting more books onto the site this week, and there are a couple of new books started, for which see below.
I might add that on the whole I try to OCR books onto the site, but there are books that really defy the OCR programs, either because they use a non-standard font, which means the programs make a real mess of trying to understand the words, or because the page is very faint, which also causes real problems.
When I understood that the Adobe Acrobat program could scan in pages and make a reasonable job of OCR'ing the results at the same time, I moved to this method of posting such books as .pdf files. I am still confused, however, about how Google indexes such files, and despite emailing Adobe and Google I still don't have an answer. The point is that Adobe states that you need to scan in at 300dpi if the program is to try to OCR the text. When I did a chapter of a book at this dpi it came out at 4.6MB. I then took the option of reducing the file size, and that got it down to 648KB, which is a lot smaller.
My problem is this... if you ever use Google to search for something and it offers a .pdf file in the results, they also offer an HTML view of the .pdf file, and when you click on it you can see they have obviously made an attempt at OCR'ing the text. So what I need to know is: if I reduce the file size of the .pdf file, will they still be able to do this? In other words, does the .pdf file already contain the text, so that there would be no need to put up the larger file? As I still don't know the answer to this, I am posting both versions up on the site, with the main link going to the larger file and a "Version for dial up visitors" link below it going to the smaller file. It would be great if I knew the answer to this question, as then I could just use the smaller file.
Anyway... should anyone have an answer to this I would be very pleased to hear from you.
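
On the question of whether the reduced .pdf file still contains the text, one rough way to check is to look inside the file for text-drawing operators. The Python sketch below is only a heuristic, not anything Adobe or Google provides: the function name has_text_layer is made up for illustration, it assumes page content is either uncompressed or Flate-compressed, and it could be fooled by unusual files. If it reports text in both the large and the small version, that would suggest the OCR text survived the size reduction.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of a PDF object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)
# PDF text-showing operators: "(string) Tj", "<hex> Tj" or "[array] TJ".
TEXT_OP_RE = re.compile(rb"[)>]\s*Tj|\]\s*TJ")


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any stream appears to draw text (BT ... Tj/TJ)."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # Most content streams are Flate (zlib) compressed.
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not zlib data (for example a JPEG image, or plain text).
            data = raw
        if b"BT" in data and TEXT_OP_RE.search(data):
            return True
    return False
```

To try it on a file, read it in binary mode and pass the bytes in, for example has_text_layer(open("chapter1.pdf", "rb").read()), where chapter1.pdf is a placeholder name.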
