It Sure Is Feisty
I wish I could say that my experience upgrading to Ubuntu 7.04 was delightful.
Last Friday I invoked the handy updater tool. After a couple of hours of downloading (from the Internode mirror, thanks) and installing the new software, I rebooted to find...
A blank screen.
Not just that, but a completely unresponsive computer. It had almost completed initialising, just to the point of loading X. After the screen went blank though, it wouldn't exit X to the text console. In fact, it wouldn't even respond to pings.
So began a weekend attempting to diagnose the blank screen. I soon isolated it to a problem with the nVidia proprietary drivers, but no solutions were forthcoming.
The Ubuntu Forums are a bit of a blessing and a curse in my experience. There is just so much traffic there, which is good, but there is a fairly low signal-to-noise ratio, which is bad. When I was diagnosing RAID disk problems, I found many threads that had become inundated with folks describing similar-but-different problems (or even completely different problems), and with others proposing half-arsed solutions. It's often quite difficult to wade through it all to a real solution to any given problem.
After some time wading, I tried a different tack. The following ominous message appeared early in the boot sequence, and it bugged me:
agpgart: Detected AGP bridge 0
agpgart: Aperture conflicts with PCI mapping.
agpgart: Aperture from AGP @ f0000000 size 4096 MB
agpgart: Aperture too small (0 MB)
agpgart: No usable aperture found.
agpgart: Consider rebooting with iommu=memaper=2 to get a good aperture.
PCI-DMA: Disabling IOMMU.
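If you want to check your own boot messages for the same complaint, a rough Python sketch like the one below would do it. Treat it as an illustration only: the function name is my own invention, the sample text is just the messages quoted above, and in practice you would feed it whatever your kernel log actually contains.

```python
import re

# Sample text copied from the boot messages quoted above.
BOOT_LOG = """\
agpgart: Detected AGP bridge 0
agpgart: Aperture conflicts with PCI mapping.
agpgart: Aperture from AGP @ f0000000 size 4096 MB
agpgart: Aperture too small (0 MB)
agpgart: No usable aperture found.
agpgart: Consider rebooting with iommu=memaper=2 to get a good aperture.
PCI-DMA: Disabling IOMMU.
"""


def check_aperture_messages(log_text):
    """Return (problem_reported, suggested_option) for a kernel log.

    problem_reported is True when agpgart says no usable aperture was found.
    suggested_option is the boot option the kernel recommends, or None.
    """
    problem_reported = "No usable aperture found" in log_text
    match = re.search(r"Consider rebooting with (\S+)", log_text)
    suggested_option = match.group(1) if match else None
    return problem_reported, suggested_option


problem, suggestion = check_aperture_messages(BOOT_LOG)
print(problem, suggestion)
```

Bear in mind that the kernel's own suggestion is only a hint, and it may not be the option that actually works on your hardware.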
So I tried the obvious, rebooting with iommu=memaper=2. This was no good, but a bit of googling led me to this nV News Forums post, where someone describes symptoms very similar to mine and gets a response from an nVidia engineer. There was no resolution, but from this thread I learned two important facts:
- That some recent changes were made in the kernel in the area of IOMMU and aperture size.
- The existence of
The second piece of information was the one that eventually solved my problem, namely adding iommu=noaperture to the kernel boot line.
The solution is really sub-optimal because it results in a 32MB aperture, and apparently 64MB is the useful minimum. However, I can say that 32MB does at least allow the system to load the nVidia drivers and run at an acceptable speed. In fact it runs beautifully.
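For the record, here is a minimal Python sketch of the edit itself, treating the kernel boot line as a plain string. The function name and the example line are hypothetical (the kernel path and root device are placeholders, not my real configuration), and where the boot line lives depends on your boot loader. The sketch drops any earlier iommu= setting, such as the memaper one that didn't help, before adding the new one.

```python
def set_iommu_option(kernel_line, value="noaperture"):
    """Return kernel_line with exactly one iommu= option, set to value.

    Any existing iommu=... parameter is removed first so the line never
    ends up carrying two conflicting settings. Whitespace is normalised
    to single spaces as a side effect.
    """
    parts = [p for p in kernel_line.split() if not p.startswith("iommu=")]
    parts.append("iommu=" + value)
    return " ".join(parts)


# Hypothetical boot line; the path and root device are placeholders.
before = "kernel /boot/vmlinuz root=/dev/sda1 ro iommu=memaper=2 quiet splash"
print(set_iommu_option(before))
```

You would still need to write the result back to your boot loader configuration by hand and reboot. The sketch only shows the string manipulation.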
It seems likely that fixing this problem would also have fixed the appalling graphics performance I had with the old 6.10 version. Whereas under the old kernel the lack of a usable AGP aperture resulted in performance problems, under the new kernel the driver simply refuses to work at all.
Of course it has to be asked: why wasn't this caught in testing?
My theory is that this problem is caused by a combination of: an AMD x86-64 CPU, AGP graphics, and a particular type of BIOS that, for some reason, doesn't set/report the correct AGP aperture size.
As for the chances of your average human being able to diagnose any of this: well, I'll leave it up to you humans to decide.