Some thoughts after recovering from a hard drive failure

The other night the internal hard drive of my MacBook Pro suddenly died. A few observations regarding the recovery process, etc.


16 Apr, 2013

File under: I didn't see this coming

Don’t get me wrong: I’m experienced enough to know that you can’t naïvely expect a hard drive to last for long, especially if it’s the internal hard drive of the machine you use most. Yet, however well prepared you are, the death of a hard drive tends to catch you by surprise.

In the last 20 years of dealing with hard drives, I’ve witnessed the most diverse demises. A hard drive usually dies rather quickly, but it generally has a way to warn you that its passing is getting alarmingly near: it starts emitting new noises or noise patterns that are different from the usual. This ‘acoustic’ approach has saved me in a couple of situations in the past, when I wasn’t as backup-savvy as I am now. By hearing different ticking patterns, I could predict the imminent failure and save 80–90% of my stuff in time.

I also witnessed extreme cases, like with the internal 40 GB hard drive of my 12″ PowerBook G4, where the drive degradation was so gradual it actually kept working for two months after manifesting strange (and at times frighteningly loud) mechanical noises. During that period, the PowerBook could boot fine and a lot of things were working well. There were no performance-related issues or applications that took a suspiciously long time to launch. But the upcoming death of the hard drive was evident, not only because of the noises. Disk Utility had deemed the disk ‘irreparable’ after aborting a verification process that had already taken 45 minutes. After half an hour of use, the PowerBook’s fan would rapidly reach full speed and the chassis would become extremely hot in the disk drive area. I was lucky I could save everything before having to replace the drive. At the time, money was quite tight, and that ‘slow death’ bought me some time to save enough for a replacement. (I can’t emphasise this enough: this is not common hard drive failure behaviour, so treat this anecdote as the exception, not the rule).

Two nights ago, the hard drive of my MacBook Pro did nothing of the sort. Quite the opposite, actually. It just died without warning. No strange noises, no unusual ticking patterns, not even an increase in noisiness (even quiet drives tend to get a bit noisier as they grow old and especially when their time’s about over). I periodically run Disk Utility on the main drive to check up on its health. Never a problem, not even the occasional mix-up in the drive’s logic structure. Nothing. I was watching a movie and suddenly the image froze, while the audio kept playing for a while. I thought there was something wrong with the video file, or that the player application was acting up, so I tried quitting it. No response. Force-quitting didn’t get me far, either. The Mac quickly became unresponsive, so I forced a reboot.

Grey screen. Apple logo. Spinning wheel. Minutes passed. Not good. Fans started, rapidly accelerating. Not good at all. Then a flow of text and command strings (like when you reboot your Mac in Verbose mode) appeared briefly. Then, a message in various languages warned that the computer would restart due to a problem. Definitely not good. At this point, the Mac entered a self-restart loop, trying to finish the boot process but never succeeding. After seven attempts, I turned it off. The drive was evidently gone. I felt more surprised than angry or worried. I stared at the powered-off MacBook Pro for a few moments: Did that just happen? Really? — I was asking myself.

A bit of luck

I keep various backups of my stuff, and although it isn’t the perfect tool, I’ve given Time Machine a chance since day one. I keep Time Machine backups of my MacBook Pro, although I don’t keep the external Time Machine drive on all day. In other words, I don’t do hourly backups (partly because I have CrashPlan always running in the background, keeping a tight backup schedule of my entire Home folder); instead I turn the external drive on at some point, usually towards the end of the day, and do at least three or four backups.

Luckily, when disaster hit on Saturday night, the Time Machine drive had been running for a few hours, and when the internal hard drive failed at 3:50 AM, the last useful Time Machine backup had been completed at 3:43 AM. I thought, If the backup is fine, I can restore the system from Time Machine and lose practically nothing.

The problem is that drive failures always catch you at a bad moment, and I needed to recover and have the Mac up and running again as soon as possible. Having a drive die on you on a Saturday night means waiting at least until Monday to do anything. So I started browsing online for a quick replacement. Since my current financial situation is not good, budget was another constraint: the replacement had to arrive quickly and be cheap. Disappointingly, the online Apple Store doesn’t sell internal hard drives for Mac laptops, only a few expensive solutions for Mac Pros (at least here in Spain). I checked other good sources and found a few eligible candidates. Not needing a huge internal disk helped (the one that failed was the stock 320 GB drive this MacBook Pro came with, and I still had 60 GB free), because today internal hard drives in the 320–500 GB range are in fact quite affordable. Yet even for the best of the options I would have to wait a few days for international shipping. Oh well, I can’t do much to speed things up anyway, I thought; I’ll place the order on Sunday evening and hope for the best, and meanwhile I can continue my work on my 12″ and 17″ PowerBook G4s.

Yesterday I went for a stroll with my wife in the city centre, to clear my mind and divert it from the paranoid trains of thought one inevitably has in these situations (“What if my backups are corrupted and I lose all those important documents and a year’s worth of photos?”, things like this). We visited the FNAC store just out of curiosity, though I clearly remembered from a previous visit that they didn’t sell internal hard drives; they mostly had desktop and portable external storage solutions. When I saw a box with a Seagate 500 GB 2.5″ internal hard drive on special offer, I couldn’t believe my eyes. I had found an affordable solution that was also bigger than the drive it would replace. And I had found it in a store. On a Sunday. If all went well, I could be restoring everything in a matter of hours.

Keep calm and carry on

Last night was devoted to replacing the hard drive and attempting to restore the entire system from the last Time Machine backup. I connected the external disk via FireWire 800 and crossed my fingers. When I booted the MacBook Pro I realised it couldn’t boot from the Recovery HD partition, because the internal disk was a blank new hard drive, but evidently there was one on the external Time Machine drive, because after a few moments the OS X Utilities main interface was there on the screen. I selected Restore from Time Machine backup and prepared to wait a long, long while, as almost 200 GB worth of stuff had to be copied back onto the internal hard drive. At around 9%, the recovery application aborted, citing unspecified errors. I was bummed. Since it was the most recent backup, the one performed just 7 minutes before the previous drive died, I thought that maybe an error had occurred because some of the essential files in that backup had been corrupted. So I tried with the penultimate backup. An error, again. It was 4 AM, I was tired and a bit depressed, and decided to go to bed.

This morning, as I resumed the recovery operations, something occurred to me. I verified my suspicions and I was right: I had tried to restore a Mountain Lion backup using an older version of the OS X Utilities, the Lion version. Since Internet Recovery was out of the question (I tried rebooting with Cmd-Option-R a few times, but nothing happened), to have a working Mountain Lion Recovery HD partition I would need to install a fresh copy of Mountain Lion on the MacBook Pro, then reboot the MacBook Pro from that partition, and try the Restore from Time Machine backup option again. But I had no physical copy of OS X Mountain Lion. When I upgraded I forgot to create an installation disk for cases like this, mea culpa. Luckily I still had the USB pendrive with a copy of the OS X Lion installer lying around, so I installed Lion from the pendrive, connected to the Mac App Store, redownloaded Mountain Lion, upgraded, and finally rebooted into the correct, freshly created Recovery HD partition. The process of restoring the entire system from the last Time Machine backup went smoothly, although the wait was long and suspenseful.
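
The mismatch described above boils down to a version comparison. The following is a minimal Python sketch of that rule, not anything Apple ships: the function names parse_version and can_restore are made up for illustration, and the rule itself (the recovery tools must belong to an OS X release at least as new as the one that made the backup) is an assumption drawn from this one experience. Lion is OS X 10.7 and Mountain Lion is OS X 10.8.

```python
def parse_version(text):
    """Turn a version string such as '10.8.3' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def can_restore(recovery_version, backup_version):
    """Hypothetical rule of thumb: the recovery tools must come from an
    OS X release at least as new as the release that made the backup.
    Only the first two components (e.g. 10.8) are compared."""
    return parse_version(recovery_version)[:2] >= parse_version(backup_version)[:2]


# Lion recovery tools (10.7) against a Mountain Lion backup (10.8)
print(can_restore("10.7", "10.8"))  # False
# Mountain Lion recovery tools against the same backup
print(can_restore("10.8", "10.8"))  # True
```

Only the release (10.7 versus 10.8) is compared, on the assumption that point updates within one release don't matter for this purpose.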

Considerations, in no particular order

– The Restore process is rather amazing. After the final reboot, I was presented with an almost identical snapshot of my system just minutes before the previous hard drive died. All the applications that were open before the disaster simply reopened, with the browsers restoring all open tabs, and other applications reopening the last opened documents. I know it’s something that has to be expected from restoring a full backup, but I was amazed nonetheless.

– If you want to be up and running in no time, keep a bootable clone of your main system on an external hard drive, and update it at least once a week. Use great tools like SuperDuper! or Carbon Copy Cloner.

– Create a bootable backup install disk of the latest OS X version. Believe me, it really comes in handy in situations like these. On the Web there are quite a few good articles explaining how to make such a disk (for Mountain Lion, see for example this one from Ars Technica, or this one from Macworld, just to name two of the most prominent sources). The process is straightforward, and you only have to invest in a small 8 GB USB pendrive.

– Checking the SMART state of the hard drive is of little use for predicting impending failures. It has never worked for me, always reporting “Verified” or similar reassuring statuses even when performing diagnostics on patently bad drives. This time was no exception. Don’t rely on that. Try to develop a fine ear for your hard drive’s noises, establishing a baseline of normal noises and patterns during daily operations, and listening out for anything out of the ordinary (strange repeated ticking, unexpected whirring, and the like).

– Even if you don’t check your internal drive routinely, drop everything and do so the moment you notice something unusual happening on your Mac, e.g. unexpected slowdowns in system performance, applications becoming sluggish or non-responsive for no apparent reason, applications that take an unusually long time to launch, etc. In my experience, the most telling visual sign of something odd involving the hard drive is the whole system becoming sluggish and registering any user action with noticeable delay (cursor movement included).

– Remember: hard drives die unexpectedly in most cases. (Solid State drives too, in case you’re wondering.) It will happen when you least expect it. It will happen at an inconvenient time. You will be bothered. If you don’t have a backup of your stuff, you will also be panicking. Be prepared.

– Backup, backup, backup, of course. Maybe I was just lucky in trusting Time Machine, but if you really don’t want to invest energy in developing a backup strategy, at least buy an external drive and configure it as a Time Machine drive. It’s an easy, hassle-free and low-maintenance method. Even if you can’t restore your entire system from a Time Machine backup (for whatever reason or error), and you need to reinstall OS X, you will at least be able to recover some settings, documents and data from that backup using the Setup Assistant.

– Check which version of the OS X Utilities you’re using to restore your Time Machine backup. An older version won’t be able to restore a Time Machine backup of a newer OS X version. I was trying to restore a Mountain Lion backup using the recovery utilities created by Lion, and I kept getting errors. You can tell which version of the OS X Utilities you’re using by looking at the icon of the Reinstall OS X option.

– If you’re looking for good-quality, step-by-step guides to replace your Mac’s hard drive (and much, much more), keep iFixit’s online Guides in your bookmarks. Really an invaluable resource.

