virtualvoid's posterous

PDF to image conversion: comparing PDF renderer performance

During some performance analysis I found out that our server spends a significant amount of CPU time on PDF to image conversion. So I went looking for the fastest way to convert a PDF to an image.

My initial implementation used ImageMagick for the conversion, which in turn delegates to Ghostscript to read the PDF. For some of the more complex PDFs, Ghostscript needs more than 10 seconds per conversion, so there is definitely room for improvement.

I tried the following PDF renderers/converters:
- Ghostscript 8.7
- pdftoppm 0.12.1 from the Poppler distribution
- pdf2image by pdftron
- PDF Renderer
- sips (scriptable image processing system)

sips is a command line interface to Apple's ImageIO framework and is thus only available on Mac OS X. I only included it because Apple claims to have the fastest PDF renderer available. Since our server runs on Linux, I can't use it in production, but it is still interesting as a "lower bound" on rendering time.

All tests were performed on a MacBook Pro 2.4 GHz. Ghostscript and Poppler were installed using MacPorts and compiled as 64 bit binaries. (I also ran some of the tests on Debian Linux, and the results were consistent with what I saw on the Mac.)

I used the following PDF to test the renderers because it was the PDF that took the longest to render:

[Attachment: content.pdf (620 KB)]

Both PDF Renderer and pdf2image failed to render this PDF correctly, so I didn't consider them further.
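A comparison like this can be reproduced with a small timing harness. The following is only a minimal sketch, not the script used for the measurements in this post: the function names, the 72 dpi resolution, the output file names and the exact Ghostscript and pdftoppm flags are my assumptions.

```python
import subprocess
import sys
import time


def time_command(args, timeout=60.0):
    """Run one command and return (elapsed_seconds, succeeded)."""
    start = time.perf_counter()
    try:
        result = subprocess.run(
            args,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        succeeded = result.returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        # A hang or a missing binary counts as a failed conversion.
        succeeded = False
    return time.perf_counter() - start, succeeded


def renderer_commands(pdf_path, out_prefix, dpi=72):
    """Build one command line per renderer (flags and names are assumptions)."""
    return {
        "ghostscript": [
            "gs", "-dBATCH", "-dNOPAUSE", "-dSAFER",
            "-sDEVICE=png16m", f"-r{dpi}",
            f"-sOutputFile={out_prefix}-gs-%03d.png", pdf_path,
        ],
        # Without a format flag pdftoppm writes PPM files.
        "poppler": [
            "pdftoppm", "-r", str(dpi), pdf_path, f"{out_prefix}-poppler",
        ],
    }
```

To time both renderers on one file, loop over `renderer_commands("content.pdf", "/tmp/page").items()` and pass each command to `time_command`. Wall-clock time for a single run is noisy, so repeating each conversion a few times gives steadier numbers.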

Here is the result for the remaining contestants:

[Image: Screen_shot_2010-01-18_at_14]

So for this single PDF Poppler is a lot faster than Ghostscript.

Unfortunately, the results look somewhat different across all of the roughly 1300 PDFs in my test set.

[Image: Screen_shot_2010-01-18_at_15]

Poppler is slower on the whole set because it is slightly slower at rendering PDFs that contain only a single big image (mostly scans). I still chose Poppler since its rendering times are more consistent (none were above 10 seconds, while Ghostscript needed more than 10 seconds in 5 instances) and it successfully converts more of the PDFs (38 failures vs. 52 failures for Ghostscript).
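The criteria used for this decision (total time, worst case, number of slow conversions, number of failures) are easy to compute from a list of per-file results. This is a sketch under my own assumptions: the `summarize` helper and the `(elapsed_seconds, succeeded)` tuple layout are hypothetical, and counting slow conversions only among successful runs is my choice, not necessarily how the numbers above were produced.

```python
def summarize(results, slow_threshold=10.0):
    """Summarize one renderer's results.

    `results` is a list of (elapsed_seconds, succeeded) tuples, one per PDF.
    """
    # Only successful conversions contribute to the timing figures.
    times = [elapsed for elapsed, ok in results if ok]
    return {
        "total_seconds": sum(times),
        "slowest_seconds": max(times, default=0.0),
        "slow_count": sum(1 for t in times if t > slow_threshold),
        "failures": sum(1 for _, ok in results if not ok),
    }
```

Comparing the `slow_count` and `failures` entries of two renderers side by side makes the trade-off explicit: a renderer can lose on `total_seconds` and still be the better choice if it has fewer outliers and fewer failed conversions.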

iPhone Network Roundtrip Time

Before you optimize you need to measure. So I spent an idle afternoon measuring the network roundtrip times in our iPhone application. The results were quite interesting so I'd like to share them with you.

The timings are complete roundtrip times, so they include sending the HTTP request, the server response time, and decoding the JSON result. I measured the most frequently issued request of our application: a simple query that asks the server for a given number of matching entries.
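The measurements in this post were taken inside the iPhone application itself. The same idea can be sketched in Python; everything here is an assumption for illustration, in particular the `timed_roundtrip` helper and the `fetch` callable, which stands in for whatever HTTP client performs the request. The sketch also splits the total into a network part and a decoding part.

```python
import gzip
import json
import time


def timed_roundtrip(fetch):
    """Time one complete roundtrip: fetch the body, then decode the JSON.

    `fetch` is any callable that performs the request and returns the raw
    response body as bytes (hypothetical; plug in your own HTTP client).
    """
    start = time.perf_counter()
    body = fetch()
    fetched = time.perf_counter()
    # Gzip streams start with the magic bytes 0x1f 0x8b.
    if body[:2] == b"\x1f\x8b":
        body = gzip.decompress(body)
    items = json.loads(body.decode("utf-8"))
    done = time.perf_counter()
    return {
        "network_seconds": fetched - start,
        "decode_seconds": done - fetched,
        "total_seconds": done - start,
        "items": items,
    }
```

Reporting the decode time separately shows how much of the roundtrip is spent on the device's CPU rather than on the network.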

The first graph shows the network roundtrip time (in seconds) as a function of the number of requested items on a 1st generation iPhone using WLAN to access the server.

[Graph: PastedGraphic-3.pdf (5 KB)]

So the network roundtrip time is linearly dependent on the size of the JSON document. For these tests the JSON document was GZIP compressed and the communication was SSL encrypted.
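A linear relationship like this can be checked with an ordinary least-squares fit of roundtrip time against the number of requested items. The helper below is only a sketch: `fit_line` is a hypothetical name, and any numbers you feed it are your own measurements, not values from the graph above.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

With item counts as `xs` and roundtrip times as `ys`, the slope is the cost per additional item and the intercept approximates the fixed per-request overhead (connection setup, SSL handshake, server latency).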

In the next test I requested 200 items over different network connections on the same device.

[Graph: PastedGraphic-4.pdf (7 KB)]

These results were quite unexpected. Given the slow iPhone CPU, I expected SSL encryption to have a bigger impact.

For the final test I repeated the same measurements on an iPhone 3GS:

[Graph: PastedGraphic-9.pdf (18 KB)]

These tests show that the CPU time needed to parse the JSON response is a significant part of the complete roundtrip time. They also show how slow our office WLAN connection really is. I didn't expect that a 3G network would beat a 2 Mbit DSL connection.

Chili and Beer

Today I tried the latest recipe from Cooking For Engineers: Buffalo Chicken Chili (see http://www.cookingforengineers.com/recipe/268/Buffalo-Chicken-Chili ).

[Image: Dscf2416]

It's very tasty - especially with the right beer :)

Posted May 16, 2009

Paragliding is fun!

I went paragliding this weekend. We started from the Wallberg near Tegernsee. From the launch point we had an excellent view over the lake. Unfortunately we didn't have enough lift to make it to the lake and back.

Before the start I thought it would be scary to be kept in the air by a lot of strings and a few square meters of fabric. But actually I was way too fascinated to be scared. I would do it again immediately.

[Image: Dscf2404]

I can really recommend the tandem flights with Paragliding-Oberbayern (http://www.paragliding-oberbayern.de/). I felt really safe with their excellent and experienced pilot.

Posted May 4, 2009

New Office

After a few finishing touches our office is finally set up.

Our great printer/fax/scanner/card reader/kitchen sink:

[Image: Dscf2410]

Bruce guards Gregor's 30 inch display:

[Image: Dscf2411]

My old G5 got a new life as office server:

[Image: Dscf2412]

The most important piece of equipment:

[Image: Dscf2413]

Posted May 4, 2009

Excluding the Maven Repository from Time Machine Backups

Over the last few days I wondered why Time Machine was backing up several hundred MB per day. I found out that this is caused by the constantly changing Maven repository, which is located in $HOME/.m2/repository by default.

To avoid this problem I tried to move the Maven repository to $HOME/Library/Caches, which keeps Time Machine from backing it up. This turned out to be harder than I thought.

Setting the localRepository configuration in $M2_HOME/conf/settings.xml to "~/Library/Caches/..." didn't work: Maven then created a "~" folder in the current directory containing the repository. Using "${user.home}/Library/Caches/..." didn't work either. Using "${env.HOME}/Library/Caches/Maven/repository" finally did the trick.
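For reference, the working setting would look roughly like this in settings.xml. This is a minimal fragment showing only the relevant element; a real settings.xml usually carries namespace attributes and further sections that are omitted here.

```xml
<settings>
  <!-- Path taken from the text above; Maven resolves ${env.HOME}
       from the HOME environment variable. -->
  <localRepository>${env.HOME}/Library/Caches/Maven/repository</localRepository>
</settings>
```

Artifacts already downloaded to $HOME/.m2/repository are not moved automatically, so Maven will download them again into the new location unless the old directory is copied over first.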

Fungi living on radiation

I finally rediscovered this very interesting article about fungi that live on radiation:

http://www.foxnews.com/story/0,2933,276196,00.html

I was fascinated by this when I first read about it some time ago, but I was never able to find the article again. So I'm posting it here so that I don't lose it again :)

Making the iPhone ring longer

I finally figured out how to make the iPhone ring longer on T-Mobile.

"Simply" dial **61*3311*11*25# so that the call is forwarded to voicemail only after 25 seconds. The last number can also be 30, for a 30-second delay.

Source: http://www.t-mobile.de/downloads/mobilbox/anleitung_rufumleitung.pdf

Posted April 8, 2009