Search multiple PDFs

skaertus

Distinguished
Mar 20, 2010
9
0
18,510
I have a library of 500+ research articles in PDF (about 700-800 MB on my HD). They are all searchable PDFs (I have OCRed the ones which were scanned pages with no text). The articles vary in length, but average about 50 pages each.

I've put all these PDFs together in one folder and now I am looking for a search engine (for Windows 7) which is able to perform full-text searches across this whole library of PDF files. I've tried several pieces of software, but none has given me a satisfactory experience. Let me tell you which ones I've used:

Windows Search: fast and indexes files, but it is not very straightforward to limit the searches to a specific folder.

Google Desktop: fast and indexes files, but I have not found a way to limit the searches to PDFs inside a specific folder (I don't want it to search the thousands of PDFs stored on my HD). Plus, it has been discontinued by Google.

Copernic: fast, indexes files and I can limit the searches to a specific folder. However, it is not able to render the text inside the PDFs properly.

Mendeley: it creates a database of PDFs, indexes and searches those PDFs included in the database. However, it has crashed due to the large number of PDF files I've added. In addition, it cannot display all the instances of a specific word I search for.

Zotero: I couldn't even try it, it crashed as I tried to add my PDFs to its database.

Adobe Reader: it searches all PDF files inside a specific folder. However, the search is very slow (it does not index files). It is able to show every instance of a word found in each PDF file, renders PDFs very well, and it is possible to read and annotate the PDFs right after the search. But it is sooooo slow.

PDF X-Change Viewer: pretty much the same as Adobe Reader.

Foxit Reader: the best so far. Just like Adobe Reader and PDF X-Change Viewer, but the searches are a bit faster. In addition, I liked the interface better.

The ideal solution for me would be if Foxit Reader could index all PDFs inside a specific folder, so searches would be much faster. Is it possible? Is there a solution which I have not yet tried?
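For anyone wondering why indexing makes such a difference: a non-indexed search (Adobe Reader, Foxit) has to re-read every PDF on every query, while an indexed search reads them once and then only looks words up in a table. Here is a rough sketch of that idea, not how any of the programs above actually work. It assumes the text of each page has already been extracted by some PDF library; the `library` dict, the file names and the function names are all made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(library):
    """Map each word to the set of (filename, page_number) pairs containing it.

    `library` is a hypothetical dict: filename -> list of page texts,
    assumed to have been extracted from the PDFs beforehand.
    """
    index = defaultdict(set)
    for filename, pages in library.items():
        for page_number, page_text in enumerate(pages, start=1):
            for word in tokenize(page_text):
                index[word].add((filename, page_number))
    return index


def search(index, query):
    """Return sorted (filename, page) pairs whose page contains every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Made-up stand-in for text extracted from two PDFs.
library = {
    "smith2009.pdf": ["Full-text search of scanned articles.", "OCR quality matters."],
    "jones2010.pdf": ["Indexing makes search fast."],
}
index = build_index(library)  # slow step, done once
print(search(index, "search"))  # fast step, repeated for every query
```

The index is built once (the slow part), and each query afterwards is only a few set lookups, which is why the indexing tools feel so much faster than the PDF readers on a 500-file folder.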
 

skaertus

Right-click on the folder, select "Search".

Thank you. I've managed to do it with Windows Search. However, Windows Search will not highlight the results in the PDF files. This would be a useful feature because many PDFs are 50+ (or even 100+) pages of written text, and I don't want to read the entire document just because the search engine told me that the word or expression I want is there...
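To make clear what I mean by highlighting: instead of just naming the file, the tool should list every hit with its page number and a bit of surrounding text. A minimal sketch of that behaviour follows, again assuming the page texts were already extracted somehow; `find_in_context`, the `[[...]]` markers and the sample pages are invented for the example and do not come from any of the programs mentioned.

```python
import re


def find_in_context(pages, term, width=30):
    """Yield (page_number, snippet) for every occurrence of `term`.

    `pages` is a list of page texts for one document (assumed already
    extracted). The match is case-insensitive and is wrapped in [[...]]
    with up to `width` characters of context on each side.
    """
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    for page_number, text in enumerate(pages, start=1):
        for match in pattern.finditer(text):
            start = max(match.start() - width, 0)
            end = min(match.end() + width, len(text))
            snippet = (
                text[start:match.start()]
                + "[[" + match.group(0) + "]]"
                + text[match.end():end]
            )
            yield page_number, snippet


# Made-up two-page document.
pages = [
    "The index is built once.",
    "Searching an index is fast; the Index can be rebuilt.",
]
for page_number, snippet in find_in_context(pages, "index", width=10):
    print(f"p. {page_number}: ...{snippet}...")
```

With output like this I could jump straight to the right page of a 100-page article instead of reading the whole thing.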

I've found out that Mendeley and Qiqqa are able to do that (search indexed PDFs and show the highlighted results), but they are both too slow and sluggish (and I guess they're not going to get any better, since I'm running them on a Core i7-2720M machine with 8 GB of RAM). Copernic is able to do that too, but it cannot display the PDFs properly. I still prefer Foxit, which does not index PDFs, but handles them much better. The only piece of software I have found so far that does the task satisfactorily is dtSearch, but I guess it would be overkill (and it costs US$ 199). Any other ideas?
 
Adobe has its own search filter http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611, but that one looks to be older. See also http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025

Take a look at Adobe's site to see what else they may have.

There is also an option to make Windows index PDF files better: http://answers.microsoft.com/en-us/windows/forum/windows_7-files/i-need-to-index-pdf-files-in-my-laptop-running/ef5fa7c3-bb64-4f3e-9607-3e01349dccca It seems that the PDF iFilter was not tested under Windows 7 and does not work well there, if at all.
 

skaertus

I've tested a few other pieces of software:

dtSearch: A great search engine. It is able to index, search and show PDF files highlighting the results. However, it is buggy and expensive (US$ 199).

Archivarius: Another great piece of software, although not as powerful as dtSearch. It indexes, searches and shows the results in the text. However, it is not able to render the PDF files properly. But it is very fast. And much cheaper than dtSearch.

Qiqqa: does all that for free. But it is very slow and sluggish. Not worth it even for free...

I've also tested some alternatives for Mac OS. Papers is very good, although it lacks a feature to navigate through the highlighted search results. And DEVONThink is even better.

However, from what I've seen so far, I think Archivarius is the best, unless you can suggest other alternatives.
 
