<movie-trailer-guy> Many months ago he attempted Tim Bray’s first Wide Finder in C++, mainly as a coding exercise. Back then the goal was readability and conciseness. This time … it’s performanceal. </movie-trailer-guy>
Wide Finder 2: The Widening
Required Viewing
If you’re at all interested in computing technology you can’t help but be amazed at the advances in CPU power over the last few decades, Moore’s Law, blah blah blah. But a few seconds pondering this invariably provokes the question as to how long this party can last.
The commonly accepted wisdom is that CPUs have gotten about as fast as they are likely to go in terms of sheer clock speed, and now manufacturers are turning to multiprocessing to provide more processing power for a given price point. The recent Intel price drops which made the quad-core Q6600 CPU available for less than AUD400 are a highly relevent (and welcome) data point to illustrate this trend.
This raises lots of hairy questions for developers, such as “how are we going to design our software to run efficiently in a multi-processing environment?” The previously-linked wide finder experiment is an attempt to explore some of these issues. And it’s pretty obvious that so far there is no silver bullet.
But wait, it gets worse. I will point you to a long but highly thought-provoking presentation from Herb Sutter. Turns out we are already hitting major architectural hurdles in the form of memory access limitations, and we’ll need to find some solutions for these before tackling the parallel computation problem.
Sutter’s presentation is deeply technical, but still quite accessible, and delivered with an engaging style that makes it required viewing. Highly recommended.
I recently had some experience diagnosing some memory-related performance problems (not quite in the same class as that discussed by Sutter, but similar) and I have to say there is a serious deficit in the development tools for these kinds of problems. Currently we need to look aggregate behaviour over multiple iterations to isolate some of these problems, and this is a difficult and error-prone approach. For example, check out Sutter’s technique to discover the memory cache line size in code. In the future it would be great if we could monitor cache misses, pipeline stalls, page faults, and other performance-impacting events within the debugger.
These issues also make me wonder about how higher-level languages are going to provide appropriate abstractions to avoid the performance problems. For example, garbage collection is a major win for programmer productivity but it does encourage memory usage patterns that are not always conducive to performance given architectural limitations in the underlying hardware. The same abstraction problems affect C/C++ of course but at least there is the option to go “bare-metal” where necessary.
Whatever the answers are here, it’s certain there are some interesting times ahead for developers.
Wide Finder in C++
Have you been following Tim Bray’s Wide Finder project? It’s an exercise to express a fairly simple task in a manner that will scale across multiple CPU cores. Some of Tim’s initial progress with Erlang, and other contributors in different languages, is quite fascinating.
Like Tim I was also amused at Pete Kirkham’s C++ implementation which was purported to be shorter than an initial attempt by an Erlang expert (in Erlang obviously). However on closer examination it seems that Pete’s C++ implementation was simply handling the I/O portion with simplified parsing and not the subsequent sorting.
So as another C++ coding Kata I decided to have a go. Whereas Tim’s goal was to evaluate different methods of expressing algoritms for parallel computation, mine was a lot more modest: just get it running concisely in C++ and compare performance with the raw Ruby version. Here’s what I came up with.
C++ 1, Unicode 0
Yes, another C++ post. Yes, I’ve been doing a lot of it lately.
Recently on WorseThanFailure there have been several incidences of functions intended to perform relatively simple string manipulation tasks. Being worthy of posting to WTF, they have of course been hilariously over-long, complicated and bug-ridden. One recent example was attempting to compare two strings in a case-insensitive manner. Another was attempting to remove spaces from a string.
I’m going to have a go at a similar problem, namely writing a C++ program to count whitespace characters in a file.
Ready to follow along? Good!
Kata Four in C++
On a whim, I attempted Dave Thomas’ Kata Four in C++. Yes that’s right, C++.
Here’s what I ended up with, feel free to throw peanuts.


