Performance Comparison of Image Libraries Revisited

A few years ago, I wrote up a brief comparison of various image libraries running a series of operations (image compositing and resizing) that we use for Lensley on an OS X + Python setup.

Just recently, I started doing work with Ruby + RMagick, but I ran into some issues: basic operations (PNG resizing on a set of images) were just incredibly slow.

ruby 1.9.3p385 (rvm) + RMagick 2.13.2 (ImageMagick 6.8.0-7 2013-02-19 Q16)
real	9m37.735s
user	4m11.995s
sys	3m16.644s

What’s going on here? Looking more closely, Ruby started out maxing out the CPU, but usage actually declined as the script ran while memory steadily climbed, reaching 6GB by the end (which took about 30s after processing just to release). Obviously a GC issue, and sure enough, there was a thread about it. Adding a couple of destroy! calls at the end seemed to fix things nicely:

real	3m49.132s
user	3m44.673s
sys	0m3.150s

Now, how did that compare with running convert via Python, I wondered?
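The script behind that run isn't reproduced in this post. As a rough sketch only, a loop of that kind might look like the following. It uses subprocess from the standard library as a stand-in for envoy, is written for Python 3 rather than the 2.7.3 used here, and assumes the same -scale 800x450 option as the mogrify call further down. The names convert_cmd and resize_all are hypothetical.

```python
import os
import subprocess

def convert_cmd(src, out_dir, size="800x450"):
    """Build the argument list for one `convert` resize call."""
    dst = os.path.join(out_dir, os.path.basename(src))
    # Assumption: -scale with the same geometry as the mogrify run.
    return ["convert", src, "-scale", size, dst]

def resize_all(in_dir, out_dir, size="800x450"):
    """Resize every PNG in in_dir, spawning one convert process per file."""
    os.makedirs(out_dir, exist_ok=True)
    for name in sorted(os.listdir(in_dir)):
        if name.lower().endswith(".png"):
            cmd = convert_cmd(os.path.join(in_dir, name), out_dir, size)
            subprocess.check_call(cmd)  # raises if convert exits non-zero
```

Because every file gets its own convert process, process startup is paid once per image, which is one plausible reason for this approach to trail a single mogrify call.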

Python 2.7.3 + envoy + convert (ImageMagick 6.8.0-7 2013-02-19 Q16)
real	4m58.882s
user	4m40.536s
sys	0m15.840s

Seems about right (one interesting thing to note is that each convert process finished faster than Activity Monitor’s refresh interval, so it never showed maxed CPU usage). Now how about running ImageMagick directly?

time mogrify -path test-seq3 -scale 800x450  test-in/*.png
real	3m17.050s
user	3m14.937s
sys	0m1.879s

OK, and since we’re just doing simple manipulation, let’s see how it does against sips.

real	0m56.272s
user	0m51.000s
sys	0m5.150s
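The exact sips invocation isn't shown above. A minimal sketch of an equivalent resize, assuming the same 800x450 target and driven from Python via subprocess, could look like this. The helper names are hypothetical, and sips itself only exists on OS X.

```python
import os
import subprocess

def sips_cmd(src, out_dir, width=800, height=450):
    """Build a sips resize call; note that -z takes height before width."""
    return ["sips", "-z", str(height), str(width), src, "--out", out_dir]

def sips_resize(src, out_dir, width=800, height=450):
    """Resize one image with sips, creating the output directory if needed."""
    os.makedirs(out_dir, exist_ok=True)
    subprocess.check_call(sips_cmd(src, out_dir, width, height))
```

Unlike -scale 800x450 in ImageMagick, -z sets both dimensions exactly, so the two only produce the same output when the source already has a 16:9 aspect ratio.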

Well now, that’s a bit embarrassing, isn’t it? Still, one thing all of these had in common so far was that only a single processor was being maxed out.

I decided to try multiprocessing (this was easier for me in Python) to see how fast I could really process these images. I used multiprocessing + Queue with 8 processes, one per core (similar to this example).
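The linked example isn't reproduced here, so the following is only a sketch of the general multiprocessing + Queue pattern, not the script that produced the timings below. The names worker, run_pool and SENTINEL are made up for illustration; func stands for whatever per-image resize function is being benchmarked.

```python
import multiprocessing as mp

SENTINEL = None  # tells a worker that no more tasks are coming

def worker(func, tasks, results):
    """Pull items off `tasks` until the sentinel arrives; report each outcome."""
    while True:
        item = tasks.get()
        if item is SENTINEL:
            break
        try:
            results.put((item, True, func(item)))
        except Exception as exc:
            # Report failures instead of dying, so the parent never waits forever.
            results.put((item, False, repr(exc)))

def run_pool(func, items, n_procs=8):
    """Run func(item) for every item across n_procs worker processes."""
    items = list(items)
    tasks, results = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=worker, args=(func, tasks, results))
             for _ in range(n_procs)]
    for p in procs:
        p.start()
    for item in items:
        tasks.put(item)
    for _ in procs:
        tasks.put(SENTINEL)  # one sentinel per worker
    # Drain the results before join() so no worker blocks on a full pipe.
    out = [results.get() for _ in items]
    for p in procs:
        p.join()
    return out  # (item, ok, value) tuples in completion order
```

A call would look something like run_pool(resize_one, paths, n_procs=8), where resize_one is a hypothetical module-level function taking a file path. On platforms that spawn rather than fork worker processes, that call needs to sit under an if __name__ == "__main__": guard.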

Python MP + envoy + convert (ImageMagick 6.8.0-7 2013-02-19 Q16)
real	0m51.472s
user	4m44.998s
sys	0m19.737s

Python MP + PIL 1.1.7
real	0m18.123s
user	2m5.540s
sys	0m2.721s

Python MP + Wand (ImageMagick 6.8.0-7 2013-02-19 Q16)
real	0m39.012s
user	4m39.162s
sys	0m4.145s

Python MP + pgmagick (GraphicsMagick 1.3.17 2012-10-13 Q8)
real	0m17.148s
user	1m47.560s
sys	0m1.593s

Python MP + envoy + sips
real	0m52.984s
user	0m58.504s
sys	0m13.715s

The biggest surprise was that sips gained virtually nothing from multiprocessing: wall time barely moved (about 56s down to 53s). I wonder if there’s some pipelining going on, or how much was lost to subprocess overhead… PIL and GraphicsMagick beat the pants off ImageMagick, both being over twice as fast in processor and wall time.
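To put numbers on those comparisons, here is a small sketch that parses the time output quoted above and computes the ratios. The figures are copied from this post; the helper names (seconds, wall, cpu, ratio) are made up for illustration, and "processor time" is taken to mean user + sys.

```python
import re

def seconds(stamp):
    """Convert a `time`-style stamp such as '4m58.882s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if m is None:
        raise ValueError("unrecognised time stamp: %r" % stamp)
    return int(m.group(1)) * 60 + float(m.group(2))

# (real, user, sys) as reported in the timings above
RUNS = {
    "convert, 1 process": ("4m58.882s", "4m40.536s", "0m15.840s"),
    "convert, 8 processes": ("0m51.472s", "4m44.998s", "0m19.737s"),
    "sips, 1 process": ("0m56.272s", "0m51.000s", "0m5.150s"),
    "sips, 8 processes": ("0m52.984s", "0m58.504s", "0m13.715s"),
    "Wand, 8 processes": ("0m39.012s", "4m39.162s", "0m4.145s"),
    "PIL, 8 processes": ("0m18.123s", "2m5.540s", "0m2.721s"),
    "pgmagick, 8 processes": ("0m17.148s", "1m47.560s", "0m1.593s"),
}

def wall(name):
    """Elapsed (real) time in seconds."""
    return seconds(RUNS[name][0])

def cpu(name):
    """Processor time in seconds: user + sys."""
    return seconds(RUNS[name][1]) + seconds(RUNS[name][2])

def ratio(slow, fast, metric):
    """How many times larger `slow` is than `fast` under the given metric."""
    return metric(slow) / metric(fast)

for lib in ("convert", "sips"):
    gain = ratio(lib + ", 1 process", lib + ", 8 processes", wall)
    print("%s: %.2fx faster wall time with 8 processes" % (lib, gain))
for lib in ("PIL", "pgmagick"):
    name = lib + ", 8 processes"
    print("%s vs Wand: %.2fx wall, %.2fx CPU" % (
        lib, ratio("Wand, 8 processes", name, wall),
        ratio("Wand, 8 processes", name, cpu)))
```

By this measure convert gets roughly 5.8x faster in wall time going from 1 to 8 processes, while sips improves by only about 1.06x.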

I would have liked to compare against freeimage as well, but alas, I couldn’t get the wrappers to work. smc.freeimage and FreeImagePy had problems talking to the dylib, and although I was able to get mhotas’ freeimage wrapper mostly working, it was giving me fits on resizing. Maybe next time.

  • jcupitt

    Hi, I’m the maintainer of libvips, I wonder if you’ve looked at that? It’s about 6x faster than ImageMagick, on this test at least:

    There’s a nice ruby binding and it’s all on homebrew.

  • lhl

    Hadn’t looked at libvips, just did right now (used MacPorts which is based on 7.26.6, which is pretty old? I guess no one is maintaining that port?)

    I didn’t have my previous test set, but here are some quick results that I got scaling some PNG files. Looks like libvips is about twice as fast in Python, which is nice. That being said, I couldn’t find out how to do stuff like basic PNG compositing/color transforms from the docs. I’m sure it can do it, it’s just impossible to find out how.

    wand (ImageMagick 6.8.7-7)
    23.05 real 170.79 user 4.60 sys

    sips via envoy
    11.32 real 28.91 user 6.71 sys

    12.82 real 94.20 user 2.10 sys

    vipsthumbnail via envoy (couldn’t scale HxW so file size is a bit bigger, so not apples to apples)
    25.47 real 135.30 user 9.74 sys

  • Mihail Naydenov

    Don’t bother with freeimage, it is not multi-threaded at all.

  • Underground

    FreeImage is described as thread safe. It seems to be, to a point.
    I have it running in parallel on a 64 bit OS, but there are some caveats. It doesn’t seem to like hyperthreading, and would consistently error out as soon as there was a load on the first virtual core. Disabling hyperthreading eliminated that issue though.