<movie-trailer-guy> Many months ago he attempted Tim Bray’s first Wide Finder in C++, mainly as a coding exercise. Back then the goal was readability and conciseness. This time … it’s performanceal. </movie-trailer-guy>
Wide Finder 2: The Widening
Twitter over IP
Let’s solve Twitter’s scalability problems, shall we?
So, like most people, I don’t know much about the problems there and certainly don’t have any solutions to suggest. But I do know there is a certain class of solutions which isn’t on the table.
If you look at Twitter from a suitably high vantage point you see real-time communication between small groups. People entering short messages and having these messages appear for their peers a short time later. There’s also a central archive, but I’ve heard Twitter described as “public Instant-Messaging” and this seems to characterise it best for me.
In short, Twitter seems more suited to peer-to-peer communication than to client-server. What sort of protocol would it use? I can imagine a protocol which would probably be UDP-based, and which would send tweets to followers either directly from peers or perhaps through a local aggregation point. Large groups of followers could perhaps even use UDP multicast. Archive servers could be reached through network anycast addresses, to allow for greater decentralisation. IPv6 to get universal connectivity. And so on; fill in your own pet network technology here, there are certainly lots of potential solutions.
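To make the idea a little more concrete, here is a minimal sketch of the simplest variant: one UDP datagram sent directly to each follower. Everything in it is an assumption for illustration only: the names send_tweet and demo, the size limit, and the use of loopback sockets standing in for real peers. It ignores delivery guarantees, NAT traversal, multicast and anycast entirely.

```python
import socket

# Assumption: 140 characters, each at most 4 bytes in UTF-8.
MAX_TWEET_BYTES = 140 * 4


def send_tweet(sock, text, followers):
    """Send one datagram per follower address; return how many were sent."""
    payload = text.encode("utf-8")
    if len(payload) > MAX_TWEET_BYTES:
        raise ValueError("tweet too long for a single datagram")
    for address in followers:
        sock.sendto(payload, address)
    return len(followers)


def demo():
    """Fan a tweet out to two hypothetical followers on the loopback interface."""
    followers = []
    for _ in range(2):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        s.settimeout(2.0)
        followers.append(s)

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        send_tweet(sender, "hello, followers", [f.getsockname() for f in followers])
        return [f.recvfrom(2048)[0].decode("utf-8") for f in followers]
    finally:
        sender.close()
        for f in followers:
            f.close()
```

Note there is no polling anywhere: each follower simply waits for a datagram to arrive. UDP gives no delivery guarantee, so a real protocol would need acknowledgements or some way to catch up from the archive.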
Instead of these, clients communicate directly with the Twitter servers using HTTP. Not only that, but they poll for updates. Bit of an architectural blunder, you might think. Well not really. In fact I don’t think the Twitter designers had any choice.
Once upon a time it was possible to deploy new application-layer protocols on the Internet. But those times have passed, it seems. These days, it’s HTTP(S) or nothing. And this is not the protocol you would choose for carrying tweets, if you had the choice. So the fact that Twitter works at all over this sub-optimal application-layer protocol is quite an achievement.
This is a great example of the many ways in which innovation can be stifled by enforcing a lowest-common-denominator.
The impact is of course more widespread than just Twitter. In fact, the so-called end-to-end principle which was one of the fundamental founding principles of the Internet is now all but abandoned in practice. Geoff Huston examines the issue in some detail in a recent article, and it is highly recommended.
Of course, there are no easy answers, either for Twitter or the next application to suffer due to the proliferation of network middleware. But it’s certainly an issue that does need to be more prominent.
(This post is an obvious departure from my usual style of blatant attack pieces in order to score traffic and fame for myself. Normal service will resume shortly.)
Blogging Horror
So you’re watching a new TV show and you’re hooked. It’s clever, the characters are believable, the dialog is witty, the cinematography is inspired, the direction is tight and the plot is engaging. You want to see more. You’re in love. With a TV show.
Developer Essentials
A long, long time ago in a corporate universe far, far away… I was admitted to an elite group. A group where the members each had a manifesto. Chris was a founding member of the group, and has since published his manifesto on his blog. I don’t quite know how I was deemed worthy for membership. My manifesto was unpublished, it was mostly unfinished, and it was unseen by all but me. Regardless, I was admitted. A fraud. Living a lie. For years I have lived with this shame.
At last all can be revealed, for here is my manifesto. Or as I originally called it, my List of Skills and Knowledge That All Developers Should Have. There are 9 items in no particular order, so I guess that makes it a set rather than a list for all you pedants in the audience (and I know you’re there).
The idea is to document a set of skills that every developer should have. That’s everyone who develops software professionally. Doesn’t matter what type of position or company or industry; this is my stab at a body of knowledge that every serious developer should have. Essentials, essentially.
It’s partly based on experience; specifically the experience of surprise I felt when a colleague, or random stranger on the internet, expressed ignorance at one of the items on this list. Other items on the list have come from an exacting process of posterior extraction. I’ll leave it to you to guess which is which, and who is who. Or at least skim the headings. Read on.
Web Forums Considered Annoying
Specialised web forums are commonplace these days. They cover the entirety of the long tail, and are therefore indispensable for discussing obscure, and not-so-obscure, topics.
These days I wouldn’t consider making any serious purchase without at least a brief consultation of the relevant web forum. Technical problems with just about anything can usually be resolved with a well-crafted search through the appropriate forum. Put simply, the web forum is the basic unit of online community these days.
But despite their ubiquity and obvious utility they remain frustratingly limited in lots of ways. In this post I vent some of these frustrations.
Latest and Greatest
Those of you familiar with my propensity to succumb to temptation for the latest and greatest may have been looking at my setup, marvelling at the mildly dated hardware, rolling your eyes and thinking “that’s not going to last long”. And I’m happy to confirm that, in accordance with prophecy, the hardware upgrade juggernaut has finally rolled through my part of the world. Oh my word yes.
In the first week of January, Apple announced an upgrade to the Mac Pro line of workstations. They would have 8 cores as “standard”, through the use of two 4-core Harpertown Xeon CPUs. What they didn’t highlight was the fact that the “standard” configuration was actually customisable down to a mere single 2.8GHz 4-core CPU, making it available for a comparatively-quite-reasonable A$3300.
And naturally, I bought one. Read on for initial impressions.
Arc Flashlight
A few years ago I purchased an Arc-AAA LED flashlight, mainly on the advice of Dan. It was an excellent piece of kit. Small enough to carry everywhere but bright enough to be useful. For years it was the only thing that hung off my keychain (besides keys of course). A month ago it died due to a battery leak.
I wasn’t hopeful of saving it. Since I bought my unit, the guy had stopped selling them for a while, and then been bought out by another company. Nevertheless I wrote to the new company and asked if it could be fixed. They said sure, send it in. A few weeks later, this arrived:

A brand new Arc-AAA, mysteriously labeled an Arc-P.
The new one is slightly heavier I think, but feels more rugged. The knurled metal case is that dull greenish-grey colour, as might be carried by Brown from Spook Country. The light is activated by twisting the head, and the new unit has a lot more resistance which makes it a lot less likely to turn on in your pocket. It’s at least as bright as the old Arc-AAA, in other words surprisingly good for the size (and capable of out-shining any AAA Maglite). If you want to get really mental I see there’s now a “premium” version which is even brighter.
The quality of the unit itself and of the support from the vendor gets two thumbs up from me.
Vendor Lock-In
I generally agree unequivocally with Bruce Schneier, but his recent column on vendor lock-in has me wanting to take issue with some of his points.
Vendor lock-in is real, but the example he gives of the iPhone is not a very good one. Why? Because it’s easy to switch: you call up the carrier (AT&T in this case) and say “I don’t like my iPhone, it’s too sleek and good looking and its user interface is too elegant. Instead I’d like to subject myself to some nonsense from the traditional handset vendors.” To which the AT&T person says “sure, we’ll charge you $X and ship out a new handset. When it arrives, just activate and transfer your contacts.” Bingo, you’re off the iPhone.
[Update: Andrew points out in comments that the 24-month contract may impede switching in this manner. I don’t know the details, but I’d be surprised if it was impossible to switch away from the iPhone, merely expensive. This is, to my mind anyway, not sufficient to justify the term “vendor lock-in”, but I suppose that depends on your definition. My definition is below.]
In Australia we have number portability which means that I can generally switch handset or carrier without too much fuss. I’m not sure about the situation in the US, but as illustrated above you are still free to switch handsets while keeping the same carrier. So if there’s lock-in at play here, it’s lock-in to AT&T, not the iPhone.
So what is vendor lock-in anyway? I would define it as the presence of constraints on a given product or service that are imposed by the vendor and which prevent you from switching to a different product. These constraints may take the form of missing features which would enable a switch, or of usage constraints imposed by licensing, or both. Either way there has been an explicit decision — technical or policy — by the vendor which prevents switching to a competitive product. Hence the term is a mild pejorative.
It’s a slightly confusing term because it applies to a product or service, and not to the vendor. So it’s quite possible for product X to exhibit vendor lock-in, but not product Y from the same vendor. “Vendor-imposed lock-in” might be a better term.
Note that there is an implicit assumption that the features and capabilities of the product in question are available elsewhere in the marketplace. In other words, there exists an equivalent product to switch to. This assumption does not always hold, and sometimes you may find yourself unable to switch to a different product, simply because there are no other products on the market with a given capability or feature. This does not, by my definition anyway, constitute vendor lock-in, because the inability to switch does not arise as a result of a decision from the vendor.
Does the lack of an SDK constitute vendor lock-in for the iPhone, as claimed by Schneier? Well, does the lack of this feature prevent switching to a different product? No, of course it doesn’t, as illustrated above.
In fact, it is the presence of an SDK which constitutes vendor lock-in, of a sort. Third-party applications written for the iPhone cannot, by definition, be easily ported to other mobile platforms. If you suddenly decide you don’t like your iPhone any more, but have hundreds of third-party applications installed, you have a problem.
This problem is common to all computing platforms; vendor lock-in is a necessary consequence of all vendor-controlled SDKs and APIs.
Incidentally the delay in making an iPhone SDK available can quite easily be explained by the technical challenges involved, and does not necessarily imply any policy decision by Apple to deliberately lock out third-party developers. Producing an SDK of any quality is a hard task, and the instant it is released it has to be supported for the life of the product. As Charles Miller puts it, “third party apps are for life, not just for Christmas”. It is quite understandable that Apple would make sure their SDK is just right before committing to it.
But where does the “no SDK == lock-in” idea come from anyway? I suspect that it arises from the expectation that we are able to install third-party applications on the iPhone. Where does the expectation come from? It comes from the disclosed fact that the iPhone runs OS X. If Apple had not divulged this fact, or if the iPhone ran some un-named OS — as is the case for all classic iPods, for example — there would be no expectation of third-party applications. It is for this reason no one is claiming that the lack of an iPod SDK exhibits vendor lock-in.
However, Schneier claims that there *is* vendor lock-in on the iPod, due to the fact that “music purchased from Apple for your iPod won’t work on other brands of music players”. This is misleading; it is quite possible to purchase DRM-free music from Apple for the iPod and other players. Again, he’s incorrectly identified the source of the vendor lock-in, which in this case is *certain* music from the iTunes Store and not the iPod.
To reiterate, vendor lock-in is real and is important. It is contrary to the idea of Free Data and deserves to be more widely discussed. However, let’s first understand what we are talking about, so that we can think critically.
"Open This First"
To justify my occasional lapses into Apple fanboy-ism, I offer the following for your consideration.
Exhibit A: Apple, 24 years ago
Exhibit B: Microsoft, today
Yeah, I know, who cares about packaging? But how come so few companies get it right?
And in the long run, I think it *is* important. The message you send with the packaging of your product is one of the care you have put into producing it. And of the importance of the customer’s time in getting up and running quickly.
Message received, Apple.
Freedom Zero
Some prominent bloggers have recently asked the question “Why don’t people care about freedom 0?” To me this just raises the question “What is this freedom 0 stuff anyway?” Herewith the (lengthy) results of my own attempts to find out the answers to both of these questions, and even raise some of my own.
Code 2.0
So I recently finished Lawrence Lessig’s Code 2.0 and I am compelled to pass some sort of comment on it. Book reviews aren’t really my strength, but this one definitely deserves attention of some sort.
First the easy stuff. This book is a new edition of an earlier book with the obvious title. The changes and updates have been through an interesting process whereby the entire text of the book was posted on a wiki and anonymous contributions were invited to produce the successor. This might make you a bit wary about the overall continuity and structure of the book but I’m happy to report this is not a problem at all. Follow the link above and you’ll note that the entire text of the book is still available online, and indeed contributions are currently being sought for Code 3.0.
As you might expect from a Creative Commons founder, Lessig is putting his money where his mouth is by putting it all online under a CC license. Notwithstanding this, you’ll probably want a paper version. I bought my copy through my usual source, the Book Depository (£8.40 shipped anywhere).
OK so it’s already an interesting book, but what is it about, you may be wondering? Let me see if I can do it some kind of justice.
iDefend iTunes
The Claytons iPhone
Behold my Claytons iPhone. It’s the iPhone I have when I don’t have an iPhone. And in many ways it’s better than the real thing (besides being non-vapourware here in Australia).
This phone is a Nokia 6110 Navigator. It replaces the Nokia N70 I bought, and wrote extensively about, a while back. Unlike last time when I did a detailed review of all aspects of the phone, this time I’m going to look at the two big features of the 6110N first, followed by a list of carefully selected bullet points relating to the rest of the phone.
Server Shelf
One of the main joys of home ownership is the ability to run Ethernet cable throughout the house without asking anyone’s permission. For ages I have wanted to do this, and now I have. Behold.
In lieu of a server room, I have a server shelf. It’s in the laundry/garage area under the house. The gang plate terminates ethernet cables which run to all parts of the house. They are in turn connected by short cables to my trusty WRT54GS.
You may be able to just see a phone line in the centre of the gang plate; it provides connection to the ADSL2 modem. This line is connected through a central splitter to the outside world, providing as-good-as-it-gets ADSL2 throughput (still not great though because I am a long way from the exchange).
Yes, I have used a slightly dodgy double adapter to make room for the stupid wall warts to co-exist. I am, however, totally desensitised to this sort of hackery, because it is just so common. The power socket people really need to get together with the wall-wart people. Failing that, there’s definitely a market for 10cm extension cords; anyone know where I can get these, cheap?
Required Viewing
If you’re at all interested in computing technology you can’t help but be amazed at the advances in CPU power over the last few decades, Moore’s Law, blah blah blah. But a few seconds pondering this invariably provokes the question as to how long this party can last.
The commonly accepted wisdom is that CPUs have gotten about as fast as they are likely to go in terms of sheer clock speed, and now manufacturers are turning to multiprocessing to provide more processing power for a given price point. The recent Intel price drops which made the quad-core Q6600 CPU available for less than AUD400 are a highly relevant (and welcome) data point to illustrate this trend.
This raises lots of hairy questions for developers, such as “how are we going to design our software to run efficiently in a multi-processing environment?” The previously-linked wide finder experiment is an attempt to explore some of these issues. And it’s pretty obvious that so far there is no silver bullet.
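For a flavour of what the design question looks like in code, here is a rough sketch of the partition-count-merge shape that a Wide Finder style log counter tends to take. It is not any particular Wide Finder entry: the HIT pattern is a simplified assumption, and the names count_chunk, split and wide_count are made up for illustration.

```python
import re
from collections import Counter

# Assumption: a simplified request pattern, not the one from the Wide Finder posts.
HIT = re.compile(r'"GET (/\S+) ')


def count_chunk(lines):
    """Count requested paths in one chunk of log lines."""
    counts = Counter()
    for line in lines:
        match = HIT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def split(lines, parts):
    """Split a list into at most `parts` contiguous chunks."""
    size = max(1, -(-len(lines) // parts))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def wide_count(lines, parts=4, map_fn=map):
    """Partition the input, count each chunk via map_fn, then merge the results."""
    total = Counter()
    for partial in map_fn(count_chunk, split(lines, parts)):
        total.update(partial)
    return total
```

With the default map_fn this runs serially. Passing the map method of a concurrent.futures.ProcessPoolExecutor instead spreads the chunks across cores, but the merge step stays serial, and reading and splitting the input can easily dominate. That is roughly why there is no silver bullet.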
But wait, it gets worse. I will point you to a long but highly thought-provoking presentation from Herb Sutter. Turns out we are already hitting major architectural hurdles in the form of memory access limitations, and we’ll need to find some solutions for these before tackling the parallel computation problem.
Sutter’s presentation is deeply technical, but still quite accessible, and delivered with an engaging style that makes it required viewing. Highly recommended.
I recently had some experience diagnosing some memory-related performance problems (not quite in the same class as that discussed by Sutter, but similar) and I have to say there is a serious deficit in the development tools for these kinds of problems. Currently we need to look at aggregate behaviour over multiple iterations to isolate some of these problems, and this is a difficult and error-prone approach. For example, check out Sutter’s technique to discover the memory cache line size in code. In the future it would be great if we could monitor cache misses, pipeline stalls, page faults, and other performance-impacting events within the debugger.
These issues also make me wonder about how higher-level languages are going to provide appropriate abstractions to avoid the performance problems. For example, garbage collection is a major win for programmer productivity but it does encourage memory usage patterns that are not always conducive to performance given architectural limitations in the underlying hardware. The same abstraction problems affect C/C++ of course but at least there is the option to go “bare-metal” where necessary.
Whatever the answers are here, it’s certain there are some interesting times ahead for developers.



