Google adds searching of image-based PDF files for free

6 Nov 2008

In what I’d consider one of the most exciting non-election-related developments in recent times, CIO Today reports that Google is in the process of applying OCR technology to its indexing of PDF files. This is a massive boon: it allows image-based PDF files (usually scanned documents, books and faxes) to be returned in search results.

Evin Levey, a Google product manager, says: “This is a small but important step forward in our mission of making all the world’s information accessible and useful.”

In the article, Google’s Legal Officer David Drummond is said to talk of an agreement with the Association of American Publishers that resolves lawsuits over sharing the content of in-copyright books without the explicit permission of the owner. The article says Google has been scanning book collections from major libraries at a rate of 3,000 titles per day since it unveiled its Google Book Search program at the Frankfurt Book Fair in 2004.

CIO Today also reports that Adobe Systems is in the process of making information within Flash-based media accessible to search engines.

All in all, a fantastic result for those of us who like to have access to information, and yes, that’s pretty much all of us.

One Comment »

  • Rowan Hanna said:

    I’m starting to come across more and more indexed PDFs, fairly high in the search results too, as well as some Flash documents published at Scribd.com.

    Good time for people to take another look at PDFDownload.com or PDFMeNot.com if they don’t want to download PDFs…
