virtualvoid's posterous


PDF to image conversion: comparing PDF renderer performance

During some performance analysis I found out that our server spends a significant amount of CPU time on PDF-to-image conversion, so I went looking for the fastest way to convert a PDF to an image.

My initial implementation just used ImageMagick for the conversion, which in turn delegates to Ghostscript to read the PDF. For some of the more complex PDFs, Ghostscript needs more than 10 seconds for the conversion, so there is definitely room for improvement.

I tried the following PDF renderers/converters:
- Ghostscript 8.7
- pdftoppm 0.12.1 from the Poppler distribution
- pdf2image by pdftron
- PDF Renderer
- sips (scriptable image processing system)

sips is a command-line interface to Apple's ImageIO framework and thus only available on Mac OS X. I only included it because Apple claims to have the fastest PDF renderer available. Since our server runs on Linux, I can't use it in production, but it is still interesting as a "lower bound" on rendering time.
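For reference, here is a rough sketch of how the three command-line renderers might be invoked. The flags shown are the commonly documented ones for each tool, but the resolution, the output names, and the `build_commands` helper itself are assumptions made for illustration, not the exact commands used for the measurements. sips has no resolution option set here, and pdf2image and PDF Renderer are left out.

```python
from pathlib import Path
from typing import Dict, List


def build_commands(pdf: str, out_dir: str, dpi: int = 150) -> Dict[str, List[str]]:
    """Return one argv list per renderer (hypothetical helper, illustrative flags)."""
    out = Path(out_dir)
    return {
        # Ghostscript: batch mode, PNG output device, one file per page.
        "ghostscript": [
            "gs", "-dBATCH", "-dNOPAUSE", "-dQUIET", "-dSAFER",
            "-sDEVICE=png16m", f"-r{dpi}",
            f"-sOutputFile={out / 'gs-%03d.png'}",
            pdf,
        ],
        # Poppler: pdftoppm writes PPM files named <prefix>-<page>.ppm by default.
        "poppler": ["pdftoppm", "-r", str(dpi), pdf, str(out / "poppler")],
        # sips (Mac OS X only): convert the input to PNG.
        "sips": ["sips", "-s", "format", "png", pdf, "--out", str(out / "sips.png")],
    }


if __name__ == "__main__":
    for name, argv in build_commands("content.pdf", "out").items():
        print(name, " ".join(argv))
```

Keeping the commands as plain argument lists makes it easy to hand them to whatever timing code you use without going through a shell.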

All tests were performed on a MacBook Pro 2.4 GHz. Ghostscript and Poppler were installed using MacPorts and compiled as 64-bit binaries. (I also tried some of the tests on Debian Linux, and the results were consistent with what I saw on the Mac.)
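A measurement like this can be scripted in a few lines. The following is only a sketch of one way to do it, not the harness behind the numbers in this post: the `time_command` name and the 60-second timeout are assumptions, and it measures wall-clock time rather than CPU time.

```python
import subprocess
import sys
import time


def time_command(argv, timeout=60.0):
    """Run one renderer command and return (elapsed_seconds, succeeded).

    A non-zero exit code, a timeout, or a missing binary all count as failure.
    """
    start = time.perf_counter()
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        ok = proc.returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        ok = False
    return time.perf_counter() - start, ok


if __name__ == "__main__":
    # Stand-in command so the sketch runs without any PDF tools installed;
    # replace it with a real renderer invocation.
    seconds, ok = time_command([sys.executable, "-c", "pass"])
    print(f"{seconds:.3f}s ok={ok}")
```

Note that a zero exit code only means the tool finished. It does not tell you whether the page was rendered correctly, so the output images still need to be checked by eye.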

I used the following PDF to test the renderers because it was the PDF that took the longest to render:

Download: content.pdf (620 KB)

Both PDF Renderer and pdf2image failed to render this PDF correctly, so I didn't consider them further.

Here is the result for the remaining contestants:

[Screenshot: Screen_shot_2010-01-18_at_14]

So for this single PDF Poppler is a lot faster than Ghostscript.

Unfortunately, if you look at all of the roughly 1300 PDFs in my test set, the results are somewhat different.

[Screenshot: Screen_shot_2010-01-18_at_15]

Poppler is slower on the whole set because it is slightly slower at rendering PDFs that contain only a single big image (mostly scans). I still chose Poppler because it has more consistent rendering times (none of its rendering times were above 10 seconds, while Ghostscript needed more than 10 seconds in 5 instances) and it successfully converts more of the PDFs (38 failures vs. 52 for Ghostscript).
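The trade-off between total time and consistency is easier to see when each renderer's results are reduced to a few numbers. Here is a minimal sketch, assuming each run is recorded as a `(seconds, succeeded)` pair; the `summarize` helper is hypothetical and the sample values are invented for illustration, they are not measurements from my test set.

```python
def summarize(results, slow_threshold=10.0):
    """Reduce a list of (seconds, succeeded) pairs to a few summary numbers."""
    ok_times = [seconds for seconds, ok in results if ok]
    return {
        # Total time spent on successful conversions.
        "total_seconds": sum(ok_times),
        # Worst single conversion, a rough measure of consistency.
        "slowest_seconds": max(ok_times, default=0.0),
        # Number of successful conversions slower than the threshold.
        "slow_count": sum(1 for seconds in ok_times if seconds > slow_threshold),
        # Number of PDFs that could not be converted at all.
        "failures": sum(1 for _, ok in results if not ok),
    }


if __name__ == "__main__":
    # Invented sample data: two renderers with different timing profiles.
    steady = [(2.0, True), (2.5, True), (3.0, True)]
    spiky = [(0.5, True), (0.5, True), (12.0, True), (0.2, False)]
    print("steady:", summarize(steady))
    print("spiky:", summarize(spiky))
```

A renderer can win on one of these numbers and lose on another, which is why I looked at the slow outliers and the failure count as well as the overall time.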