Wednesday, April 29, 2009

PDF Ifilters in SharePoint


We have more PDF files than anything else in our SharePoint sites, by a huge margin. We also have some pretty oversized hardware, so until recently we were able to get by with the free, single-threaded Adobe ifilter.

When the Foxit ifilter was released I did some testing and could not see sufficient benefit to make the change. I am pretty sure my testing was flawed, as others have confirmed the superior performance of the Foxit ifilter. I retested carefully, and with a relatively small sample (fewer than 2,000 files) I got not quite double the throughput. I think the benefit would be greater with a larger sample, but the successful test is sufficient for my needs and allows me to justify the purchase.

Checking Ifilter Registry Entries

Along the way I ran into some difficulties. The instructions for installing the Foxit ifilter are very simple, and sometimes they work as they should, but not always. If you are installing a PDF ifilter for the first time, follow Foxit's instructions carefully. To replace Adobe, you simply uninstall the Adobe ifilter and then install Foxit. I did this uninstall/install cycle a couple of times. Either uninstall may fail to complete cleanly, so you may need to get out your SharePoint/IIS/Windows hammer and do a bit of tapping.

As I was testing, I was getting an error on all the PDFs saying that no ifilter was installed. I ended up manually modifying two registry keys, at first just to put Adobe back so I could test it. After that the uninstall/reinstall worked, but I still checked the keys just to be sure. My problems could have come from not following procedures (starts, stops, etc.) or from an incomplete install or uninstall. Checking these keys was a big part of my solution.

The two keys are:
  • [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf]
  • [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf]
Their multi-string value should be the GUID of the class ID (CLSID) for the ifilter you are using (I think that's what this is):
  • {987f8d1a-26e6-4554-b007-6b20e2680632} Foxit
  • {4C904448-74A9-11D0-AF6E-00C04FD8DC02} Adobe 6
  • {E8978DA6-047F-4E3D-9C78-CDBE46041603} Adobe 8 (or 9 or 64 bit??)
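
Checking the two keys by hand gets tedious after a few install cycles. Here is a minimal Python sketch of the same check, not something I ran as part of the original testing. It assumes the GUID is stored in each key's default (unnamed) value, which may not match your setup, and the names `identify_filter` and `read_pdf_filter_values` are made up for illustration. On 64-bit Windows, a 32-bit Python may be redirected to a different registry view.

```python
# Registry paths from the post (both under HKEY_LOCAL_MACHINE).
PDF_FILTER_KEYS = [
    r"SOFTWARE\Microsoft\Office Server\12.0\Search\Setup"
    r"\ContentIndexCommon\Filters\Extension\.pdf",
    r"SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup"
    r"\ContentIndexCommon\Filters\Extension\.pdf",
]

# GUIDs listed in the post, lowercased for case-insensitive lookup.
KNOWN_FILTERS = {
    "{987f8d1a-26e6-4554-b007-6b20e2680632}": "Foxit",
    "{4c904448-74a9-11d0-af6e-00c04fd8dc02}": "Adobe 6",
    "{e8978da6-047f-4e3d-9c78-cdbe46041603}": "Adobe 8 (or 9 or 64 bit??)",
}


def identify_filter(value):
    """Map a registry value (a string or a multi-string list) to a filter name.

    Returns None when the value is empty or the GUID is not one from the post.
    """
    if isinstance(value, (list, tuple)):
        value = value[0] if value else ""
    return KNOWN_FILTERS.get(str(value).strip().lower())


def read_pdf_filter_values():
    """Read each key's default value. Windows only.

    ASSUMPTION: the GUID lives in the default (unnamed) value of the key.
    """
    import winreg  # imported here so the rest of the file loads anywhere

    results = {}
    for path in PDF_FILTER_KEYS:
        try:
            with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
                value, _value_type = winreg.QueryValueEx(key, "")
        except OSError:
            value = None  # key or value missing: no ifilter registered here
        results[path] = value
    return results


if __name__ == "__main__":
    for path, value in read_pdf_filter_values().items():
        print(path)
        print("   value:", value, "->", identify_filter(value) or "unknown/missing")
```

If the two keys report different filters, or one of them is missing, that would be consistent with the kind of half-finished uninstall I ran into.
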
You also need to be sure to stop and start the search services, and why not throw in an IIS reset while you are at it (hammer). Of course, a reboot has the same effect (more hammering). After that, to make your tests cleaner, it's good to prime the setup first. You can do that by running the crawl twice (resetting the crawled content in between), or by running it just long enough the first time to be sure that everything is working. If you monitor the index, you will see a delay of a minute or two on an un-primed setup.
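
The stop/start/reset routine can be scripted too. This is only a sketch: the service names `osearch` (Office SharePoint Server Search) and `spsearch` (Windows SharePoint Services Search) are my assumption for a SharePoint 2007 box, so check your own service list first. The function name `restart_search` is made up, and it defaults to a dry run that only reports what it would do.

```python
import subprocess

# ASSUMPTION: service names for a SharePoint 2007 server; verify before use.
RESTART_COMMANDS = [
    ["net", "stop", "osearch"],
    ["net", "start", "osearch"],
    ["net", "stop", "spsearch"],
    ["net", "start", "spsearch"],
    ["iisreset"],
]


def restart_search(run=subprocess.run, dry_run=True):
    """Run each command in order and return the commands as strings.

    With dry_run=True (the default) nothing is executed. With dry_run=False
    each command is passed to `run` with check=True, so a failing command
    raises and stops the sequence.
    """
    done = []
    for cmd in RESTART_COMMANDS:
        if not dry_run:
            run(cmd, check=True)
        done.append(" ".join(cmd))
    return done


if __name__ == "__main__":
    # Prints the planned commands only; pass dry_run=False from an
    # elevated prompt on the server to actually run them.
    for line in restart_search():
        print(line)
```
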

I am looking forward to getting this into a production environment with hundreds of thousands of PDFs. I hope I won't need my hammers, but at least I know where to look if it doesn't seem to be working.

2 comments:

  1. Good to see you're still fighting the good fight with all those PDFs. When I started over here, they weren't even indexing PDFs. They were surprised when I turned it on and 30,000 new items showed up in the index.

    I'd be interested in seeing Foxit in an x64 environment, as that is the requirement for SharePoint 2010.

  2. Anonymous, 6:53 PM

    Interesting post. I haven't tried the 64-bit installation yet, but I have seen that Foxit performs better than Adobe's own. I liked this post on 32-bit installation; it gives some ideas on other ifilters.

    http://zebracube.wordpress.com/2009/06/21/pdf-ifilter-sharepoint/
