A friend of mine recently sent me a link to an interesting article, “Not Your Father’s Von Neumann Machine: A Crash Course in Modern Hardware”.

One of the interesting elements discussed in this article is the relative cost of memory access. An access satisfied from L1 cache (the cache closest to the processor) might take just 3 clock cycles, but one that has to go all the way to main memory might take 200 clock cycles.

That’s a massive difference in performance.
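To get a feel for how quickly cache misses add up, here is a rough back-of-the-envelope sketch using the two figures above. It assumes a deliberately simplified model with only two levels (L1 and main memory) and ignores the L2/L3 caches a real processor has. The function name and the hit rates are my own illustrative choices, not measurements.

```python
def average_access_cycles(l1_hit_rate, l1_cycles=3, memory_cycles=200):
    """Expected cycles per access in a simplified two-level model.

    Assumption: every access is either an L1 hit (3 cycles) or a trip to
    main memory (200 cycles); intermediate cache levels are ignored.
    """
    if not 0.0 <= l1_hit_rate <= 1.0:
        raise ValueError("l1_hit_rate must be between 0 and 1")
    miss_rate = 1.0 - l1_hit_rate
    return l1_hit_rate * l1_cycles + miss_rate * memory_cycles


# Illustrative hit rates only; these are not measured values.
for rate in (0.99, 0.90, 0.50):
    cycles = average_access_cycles(rate)
    print(f"{rate:.0%} L1 hits: {cycles:.1f} cycles per access on average")
```

Under these assumptions, even a 10% miss rate makes the average access more than seven times slower than a pure L1 hit.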

I began to wonder whether locality of reference was affecting the performance of my Mandelbrot Screensaver. Since the first public release in 2001, my constant quest has been to improve performance - to generate each new Mandelbrot frame in less time.

After a bit of work to rearrange processing to maximize locality of reference, I ran some benchmarks.
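I won't go through the actual changes here, but as a generic illustration of what "locality of reference" means, consider a frame stored as one flat buffer, row after row. The sketch below assumes that row-major layout, which is not necessarily how the screensaver stores its data, and all the function names are hypothetical. It only compares the order in which two loop arrangements touch the buffer. It is not a benchmark.

```python
def row_major_offsets(width, height):
    """Buffer offsets visited when the inner loop walks along a row."""
    return [y * width + x for y in range(height) for x in range(width)]


def column_major_offsets(width, height):
    """Buffer offsets visited when the inner loop walks down a column."""
    return [y * width + x for x in range(width) for y in range(height)]


def sequential_fraction(offsets):
    """Fraction of steps that move to the very next element in memory."""
    steps = list(zip(offsets, offsets[1:]))
    if not steps:
        return 1.0
    return sum(1 for a, b in steps if b - a == 1) / len(steps)


# A small grid keeps the demo quick; a real frame would be 1280x1024.
WIDTH, HEIGHT = 8, 4
print("row order:   ", sequential_fraction(row_major_offsets(WIDTH, HEIGHT)))
print("column order:", sequential_fraction(column_major_offsets(WIDTH, HEIGHT)))
```

Both loops visit exactly the same elements, but only the row order moves through memory one neighbour at a time, which is the pattern caches are built to reward. With a 1280 pixel wide frame, each step of the column order would jump 1280 elements ahead.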

Current Public Release (v3.1): Single 1280x1024 frame in 12.7 seconds

Current Development Version: Single 1280x1024 frame in 2.6 seconds

In a word, WOW!

The new code does the same amount of work as the old, but does it almost 5 times faster.

This implies that over 10 seconds of my original run-time was wait time - time spent by the processor waiting for memory to provide required information.

Clearly, not every application can be sped up to this extent - but it’s a useful thing to know for those rare times when performance is the #1 priority, bar none.
