By John Gruber
This is the easiest product review I’ve ever written. The iPhone 4S is exactly what Apple says it is: just like the iPhone 4, but noticeably faster, with a significantly improved camera, and an impressive new voice-driven feature called Siri.
Siri feels like old-school Apple. Newton-esque, at least in spirit. Sculley-era Apple was obsessed with this sort of thing — natural language processing, the computer as “digital assistant”. New models of human-computer interaction. AI that works naturally. Even the startups from former Apple employees of that era worked on this sort of thing — remember General Magic?
It’s also sort of the antithesis of everything prior in iOS. iOS is explicit and visual. Everything you can do in iOS is something you can see and touch on screen. The limits are visible and obvious. Siri, on the other hand, feels limitless. It’s fuzzy, and fuzzy on purpose. There’s no way to tell what will work and what won’t. You must explore. I found it extremely fun to explore Siri — primarily because so many of the things I tried actually worked. It’s a completely different interface for interacting with your iPhone. You’re not driving or commanding the existing iPhone interface with commands. There is no syntax to memorize. You’re just, well, talking to your iPhone.
I tried the same things Scott Forstall demoed on stage. They all worked, as promised. On a whim, I asked Siri, with no other context, “When is my next haircut?” Siri answered with my appointment scheduled for later this month. I asked, “When was my last haircut?”, and it found that appointment from a month ago. I told Siri, “Play something by the Rolling Stones” and it played a random Stones song (“Gimme Shelter”, from Let It Bleed). I interrupted and said, “Play ‘Some Girls’ by the Rolling Stones”, and Siri played just that song.
I was out running errands today, walking through the city. I remembered, a mile away from home, that a screw had fallen out of my wife’s favorite eyeglasses over the weekend, and that she was waiting for me to fix them. Walking down a city street, I said, “Remind me to fix Amy’s glasses when I get home.”[1] Half an hour later, within a few doors of our house, the reminder went off.
Me: “Set an alarm for 9 AM.”
Siri: “It’s set for 9 AM.”
Me: “Change that to 10 AM.”
Siri: “I changed your alarm to 10 AM tomorrow.”
Me: “Cancel that alarm.”
Siri: “I deleted your 10 AM alarm.”
Me: “Thank you, Siri.”
Siri: “Your wish is my command.”
In a sense, Siri is like a second interface to iOS. The first interface is the app interface. Launch, tap, drag, slide. The Siri interface is a different world. As stated above, this new interface is in many ways the opposite of the regular one — open-ended and implicit instead of narrowly defined and explicit. I don’t mean to imply that Siri doesn’t fit in or feel right at home — it does. But Siri is indicative of an AI-focused ambition that Apple hasn’t shown since before Steve Jobs returned to the company. Prior to Siri, iOS struck me as being designed to make it easy for us to do things. Siri is designed to do things for us.
But there are parallels between Siri and the regular iPhone interface, too. The original iPhone launched in 2007 with a limited feature set and no third-party apps. 2011’s Siri is largely based on the same feature set: messages, email, phone calls, calendars, alarms, web search. But with the original iPhone, it was always obvious how Apple could allow third-party developers into the party: by allowing third-party apps. Apps are a nice clean, obvious concept. The iOS interface is fundamentally only two levels deep: the first level is the home screen, listing all available apps. The second level is when you tap an app to use it. Hit the home button to go back to the home screen. That’s it. File system sandboxing and background processing restrictions make it easy to keep apps from interfering with each other or with the system as a whole.
People are going to start clamoring for third-party Siri integration as soon as they see Siri in action. But I’m not sure what form that integration could take. Best I can think is that apps could hook up to (as yet utterly hypothetical) Siri APIs much in the same way that Mac apps can supply system-wide Services menu items. But how would they keep from stomping on one another? If Siri supported third-party apps and you said, “Schedule lunch tomorrow at noon,” what would Siri do if you have multiple Siri-enabled calendar apps installed? This is similar to the dilemma Mac OS X faces when you open a document with a file extension that multiple installed apps register support for.
Perhaps Siri is too centralized to allow for third-party integration? Where by third-party integration, I mean “any app in the App Store” integration. Siri does support two third-party services right now: Yelp and Wolfram Alpha. But it was Apple that added those two, and Apple that determines when to use them. They’re data services, not software.[2] With the regular iOS interface, the central hub is very thin: a home screen of app icons. It wasn’t necessarily obvious from the get-go that Apple would allow third-party apps, but it was obvious how they could: just add those new icons to the home screen. It’s not so obvious how you could add new commands to Siri without potentially stomping on existing ones.
Here’s an example. Wolfram Alpha has terrific stock-price information and comparison features. I link to them frequently for stock info from Daring Fireball. So I tried asking Siri, “What was Apple’s stock price 10 years ago?” But once Siri groks that you’re asking about a stock price, it queries the built-in Stocks app for data, and the Stocks app doesn’t have historical data that goes back that far. “What did Apple’s stock price close at today?” works, but asking for historical data does not. But Wolfram Alpha has that data. And in fact, you can get it through Siri, by asking something like “Search Wolfram Alpha for Apple’s stock price ten years ago.” But there’s no way to tell Siri to prefer Wolfram Alpha for stock market information by default.
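To make that routing behavior concrete, here is a toy model of what I observed. It is only a sketch: the function name `route`, the keyword rule, and the "web search" fallback are all assumptions for illustration, and nothing here reflects how Siri is actually implemented.

```python
# Toy model only: the keyword rule and the fallback are assumptions made for
# illustration, not Siri's actual routing logic.
EXPLICIT_PREFIX = "search wolfram alpha for "


def route(query: str) -> str:
    """Return the name of the service that would handle the query."""
    text = query.lower()
    # An explicit request names the service, so it bypasses domain matching.
    if text.startswith(EXPLICIT_PREFIX):
        return "Wolfram Alpha"
    # Domain matching: anything recognized as a stock question goes to the
    # built-in app, and there is no user preference that could change that.
    if "stock price" in text:
        return "Stocks app"
    # Hypothetical fallback for everything else.
    return "web search"


print(route("What was Apple’s stock price 10 years ago?"))
print(route("Search Wolfram Alpha for Apple’s stock price ten years ago."))
```

The point of the sketch is the ordering: the domain rule is hard-wired, and the only way around it is to name the service explicitly in the request.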
Even if Siri is never opened up to App Store apps, though, there’s clearly a sense when you use it that Apple is only just getting started with this.
The best sign I can think of regarding Siri’s practical utility: after a week of using this test iPhone 4S, yesterday, while using my regular iPhone 4, without thinking I held down the home button to create a new reminder for myself, and when the old Voice Control interface appeared, my mind went blank for a few seconds while I pondered what went wrong. I missed Siri already.
I wouldn’t say I can’t live without Siri. But I can say that I don’t want to.
Alongside Siri, the digital assistant, is the straight-up speech-to-text dictation feature now available system-wide. This was my favorite feature of Android when I tested a Nexus S early this year. It’s now one of my favorite features of the iPhone 4S. I composed numerous tweets, text messages, and email replies this week using it. It’s not flawless but it is excellent. It works just as well while walking on a city street as it does while I’m sitting alone in my office. It even worked remarkably well in a crowded, busy bar.
Worth noting that you can say punctuation, like “comma”, “period”, and “question mark”. So if you say, “Try saying quote open the pod bay doors unquote”, Siri should properly interpret that as:
Try saying “open the pod bay doors”
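The spoken-punctuation rule can be sketched in a few lines. This is a minimal illustration, assuming only the spoken words mentioned above; the names `ATTACH_LEFT`, `ATTACH_RIGHT`, and `render_dictation` are hypothetical, and the real dictation engine surely works differently.

```python
# Hypothetical tables covering only the spoken words mentioned in the text.
# Symbols in ATTACH_LEFT stick to the preceding word; the opening quote in
# ATTACH_RIGHT sticks to the word that follows it.
ATTACH_LEFT = {"comma": ",", "period": ".", "question mark": "?", "unquote": "”"}
ATTACH_RIGHT = {"quote": "“"}


def render_dictation(spoken: str) -> str:
    """Turn a transcript with spoken punctuation into punctuated text."""
    words = spoken.split()
    pieces = []        # output chunks, joined with single spaces at the end
    glue_next = False  # True right after an opening quote
    i = 0
    while i < len(words):
        # Prefer a two-word command ("question mark") over a one-word match.
        pair = " ".join(words[i:i + 2]).lower()
        if pair in ATTACH_LEFT:
            key, step = pair, 2
        else:
            key, step = words[i].lower(), 1
        if key in ATTACH_LEFT:
            symbol = ATTACH_LEFT[key]
            if pieces:
                pieces[-1] += symbol
            else:
                pieces.append(symbol)
            glue_next = False
        elif key in ATTACH_RIGHT:
            pieces.append(ATTACH_RIGHT[key])
            glue_next = True
        elif glue_next:
            pieces[-1] += words[i]
            glue_next = False
        else:
            pieces.append(words[i])
        i += step
    return " ".join(pieces)


print(render_dictation("Try saying quote open the pod bay doors unquote"))
```

A sketch this naive cannot tell the command “period” from the ordinary word “period”, which hints at why the real feature needs context to get it right.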
The cynical answer to why Siri and dictation are exclusive to the iPhone 4S, of course, would be that Apple is withholding them from other iOS devices in order to spur additional upgrades to the iPhone 4S.
A non-cynical answer would be that Siri depends on certain hardware in the iPhone 4S that doesn’t exist on any other iOS device. But what, exactly? The iPad 2 has the same CPU (A5) and same amount of RAM (512 MB), but no Siri, no text dictation. No one at Apple has an answer for this other than to say that Siri was developed alongside and specifically for just one device: the iPhone 4S. And as good as Siri is, I get the impression that Apple is far from satisfied with where it stands today. The Newton was killed by that “egg freckles” stuff — it never recovered from the public perception that its handwriting recognition wasn’t good enough. If Siri is any less accurate on older iOS devices than it is on the 4S, Apple isn’t going to allow it.
Don’t forget that there’s a server/cloud-based backend that is required for Siri to function. I can’t help but suspect there’s some truth to this tweet from Mark Crump, speculating that Apple might be limiting Siri to the 4S simply to restrict the server load while the service is “beta”. There could be 100 million iOS 5 users by the end of this weekend; there will only be 1 or 2 million iPhone 4S users.
The most-hated sight in all of Mac OS X: the rainbow “wait” cursor, a.k.a. the spinning pizza of death. Waiting sucks.
iOS doesn’t have any cursors, let alone the spinning pizza, but it does have an equivalent: the closed iris splash screen of the Camera app. You want to snap a picture, but instead, you see the iris, and… you wait.
The most profound difference between the 4S and 4 cameras has nothing to do with image quality. It’s that you don’t have to wait nearly as long. That closed iris comes up for a moment and then it’s gone, and you’re ready to shoot. And after you shoot, the camera is ready to snap additional photos almost instantly. The difference is huge, and it’s especially nice in conjunction with iOS 5’s new lock screen shortcut to jump right into the Camera app.
I spoke to some friends familiar with the development of iOS 5 and the 4S, and word on the Cupertino street is that camera speed — time from launch to being able to snap a photo, as well as the time between subsequent photos — received an enormous amount of engineering attention during development. The stopwatches were out, and every single tenth of a second that could be shaved was shaved.
Image quality is improved, too, as promised. White balance and exposure choices look more accurate (or at least more pleasing) to my eyes, it performs noticeably better in low light, and dynamic range has been improved significantly. Here’s a small photoset I put on Flickr, showing the same scenes photographed with three different cameras: my old iPhone 4, the iPhone 4S, and my trusty Ricoh GR-D — a dedicated point-and-shoot camera I bought for $800 four years ago.
The photos were not edited, retouched, etc. I simply imported them into iPhoto and then uploaded them to Flickr. You can quibble with the exposure settings, but I snapped the photos naturally — I framed the image, tapped to choose a focus/exposure point, and snapped. (With the Ricoh, I focused first, then recomposed.)
The photos from the 4S look better than those from the 4. They don’t look as good as those from the Ricoh, but over the last year, I found myself carrying the Ricoh less and less, because the iPhone 4 was good enough — and already in my pocket. Now, I wonder if the Ricoh is going to start collecting dust.
I asked for and received a Sprint model from Apple for testing, so that I could compare it against AT&T and Verizon. Sprint service was decent in both downtown San Francisco and at home in Philadelphia. Network benchmarks (using the Speedtest.net iPhone app) showed very similar results to those I’d gotten with the Verizon iPhone 4 earlier this year.
My tests were certainly far from extensive, but from what I’ve seen, Sprint’s service is very much comparable to Verizon’s. Compared to AT&T, Sprint and Verizon are better for voice (both in terms of audio quality and call-dropping), but worse for data (slower, often much slower).
Because I use 3G for data far more than I do for voice calls, I’m sticking with AT&T for at least another year. At whatever point in the future the iPhone adds support for LTE networking, I will definitely reconsider this decision, and likely switch. (Look at this comparison of the service plans available from AT&T, Sprint, and Verizon. Sprint’s is far simpler — the only difference between tiers is the number of monthly voice minutes. Text messages and data are unlimited on all Sprint plans.) If you live or work in an area with excellent Sprint coverage, I wouldn’t hesitate to recommend a Sprint iPhone.
Apps launch quicker, scrolling is smoother, web pages render faster. If you used both an original iPad and an iPad 2, it’s a lot like that. If anything, the jump from iPad to iPad 2 was a little more of an improvement, though, because the iPad 2 went from 256 MB of RAM to 512. The iPhone 4 and 4S both have 512 MB of RAM.
The iPhone 4 was my favorite product that Apple has ever made. The iPhone 4S has all the best features of the iPhone 4 — same look, same feel, same Retina Display — and adds several significant improvements. The one and only disappointment I have with the iPhone 4S is that the shutdown spinner animation is still low-res. That’s pretty low on the list of nits to pick.
[1] You might wonder, Hey, don’t you feel like a jerk walking around the city talking to your phone? But here’s the thing: Siri, by default, kicks in when you hold the iPhone up to your ear, so you can talk to it and it looks like you’re on a phone call.
[2] You ever notice how Wolfram Alpha’s website returns just about all results as static rendered images, not text? I’ve always suspected the point of that is to make it harder for anyone to systematically spider their results. One side effect for Siri, though, is that Siri can’t read anything from Wolfram Alpha results.