By John Gruber
For many years now, iOS has offered an option in the Passcode section of the Settings app: “Erase all data on this iPhone after 10 failed passcode attempts.”
I’ve long been intrigued by this setting, but never turned it on, out of the vague fear that something could happen and I’d wind up with a wiped iPhone. Say, if a “friend” surreptitiously took my phone at a bar and entered 10 wrong passcodes as a prank. Something like that.
I asked on Twitter over the weekend how many people use this feature, and over 4,000 people responded to the poll. One-third use the feature, two-thirds don’t. Among those who don’t, the most common response, by far, is that they don’t use it because they’re the parents of young children, and they fear that their kids will trigger the erasure of their phone.
I had no idea until I looked into it last weekend, but it turns out this feature is far more clever than I realized, and it’s highly unlikely that your kids or jackass drinking buddies could ever trigger it. After the 5th failed attempt, iOS requires a 1-minute timeout before you can try again. During this timeout the only thing you can do is place an emergency call to 911. After the 6th attempt, you get a 5-minute timeout. After the 7th, 15 minutes. These timeouts escalate such that it would take over 3 hours to enter 10 incorrect passcodes.
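The escalating timeouts described above can be sketched with a bit of arithmetic. Only the first three delays (1, 5, and 15 minutes) are stated here; the delays for the 8th and 9th attempts below are illustrative placeholders consistent with the “over 3 hours” total, not official Apple figures.

```python
# Minimum forced waiting implied by iOS's escalating passcode lockouts.
# Only the delays stated above are known (after the 5th, 6th, and 7th
# failed attempts); the last two values are assumed placeholders chosen
# to be consistent with the reported "over 3 hours" total.
assumed_delays_min = [
    1,    # after the 5th failed attempt
    5,    # after the 6th
    15,   # after the 7th
    60,   # after the 8th (assumed)
    120,  # after the 9th (assumed)
]

total = sum(assumed_delays_min)
print(f"At least {total} minutes ({total / 60:.1f} hours) of forced waiting "
      f"before a 10th incorrect passcode can even be entered")
```

Whatever the exact later delays are, the point stands: a prankster or a toddler can't burn through 10 attempts in a sitting.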
It seems pretty clear from the responses to my poll that I wasn’t alone in thinking that this feature was more dangerous than it really is. I’ve got it turned on now, and I can’t think of a good reason why anyone wouldn’t enable this. ★
Let me just reiterate up front that my suspicions surrounding Google’s Duplex recordings are not suspicions regarding the idea of Duplex itself. If I had to bet on who will be the first to create an AI voice system that passes for human, even within the limited constraints of a single well-defined task like booking reservations, it would be Google. If Vegas had a betting line on this, Amazon would probably have decent odds too, but surely Google would be the favorite.
We can all hear for ourselves how well Google Assistant works today. I’m not alleging that these recordings are complete fabrications, or betting against Google being further ahead in this effort than anyone else.
But everything about the way Google announced this — the curious details of the calls released so far, the fact that no one in the media has been allowed to see an actual call happen live — makes me suspect that for one or more reasons, the current state of Duplex is less than what Sundar Pichai implied on stage. His words before the first recording was played: “What you’re going to hear is the Google Assistant actually calling a real salon to schedule an appointment for you. Let’s listen.” And after the second recording: “Again, that was a real call.”
You can parse those words precisely and argue that Pichai never said they were unscripted or un-coached, or that the recordings are unedited. But that’s like saying Bill Clinton was technically truthful with his “I did not have sexual relations with that woman” statement. The implication of Clinton’s statement was that he wasn’t involved sexually with his intern, and that wasn’t true. The implication of Pichai’s statement was that right now, today, Google has a version of Duplex in its lab that can call a real restaurant or hair salon and book a reservation and sound truly human while doing so. Not soon, today. Look at the news coverage from the announcement — Mashable, The Guardian, The Verge, The Evening Standard — all of those reports on Duplex’s announcement are written in the present tense, as though it’s something Google has working, as heard, with no or very minimal editing, today.
If a few months or more from now Google can demonstrate a real Duplex call, live, that wouldn’t disprove my suspicion that they can’t do it right now in May 2018 — even though Sundar Pichai clearly implied last week that they can. If I’m wrong — if stories come out in the next week or two from journalists granted behind-the-scenes access to listen to Duplex make live calls (and watch them be parsed correctly, creating calendar events and notifications of the reservation dates and times), and those calls sound every bit as realistically human as the recordings Google has released so far — my suspicion will be proven false. And I’d be delighted by that. Part of the reason I’m so focused on Duplex is that if it really works like it does in these recordings, it’s one of the most amazing advances in technology in years.
But Google hasn’t done that, and the more I think about it, and the longer Google stonewalls on press inquiries about Duplex, the more suspicious I get that they can’t. Even if Duplex still has a low success rate, it would be amazing if, say, half its calls worked as well and sounded as good as these recordings. That would be perfectly understandable for a technology still in development.
But Pichai also said “This will be rolling out in the coming weeks as an experiment.” On the one hand, that makes me feel like maybe I am off my rocker for being so skeptical. Why in the world would Pichai say that if they weren’t at a stage in internal testing where Duplex works as the recordings suggest? But on the other hand, if they are that close, why haven’t they invited anyone from the media to see Duplex in action?
They did invite Richard Nieva from CNet to a behind-the-scenes preview before I/O, but all he got to hear were recordings, too:
In a building called the Partnerplex on Google’s sprawling campus in Mountain View, California, I’ve been invited to hear a 51-second phone recording of someone making a dinner reservation. […]
As I listen to what sounds like a man and a woman talking, Google’s top executives for Assistant, the search giant’s digital helper, watch closely to gauge my reaction. They’re showing off the Assistant’s new tricks a few days before Google I/O, the company’s annual developer conference that starts Tuesday.
Turns out this particular trick is pretty wild.
That’s because Person 2, the one who sounds like a man, isn’t a person at all. It’s the Google Assistant.
Why not let Nieva hear it live? Why not let Nieva answer the phone and book the reservation himself, as though he works at the restaurant? If it’s “weeks” away from rolling out in a limited beta to the public, that should be possible.
The job of journalists is to verify these things, not just to take a company’s word for it. Here’s Om Malik, linking to Dan Primack’s Axios story on Google’s stonewalling:
“Google may well have created a lifelike voice assistant…Or it was partially staged. Or something else entirely. We just don’t know, because Google won’t answer the questions.” @danprimack doing what journalists are supposed to do. Verify and dig deeper!
Finally journalism starts asking obvious questions of tech.
Tech journalism has never asked basic questions like “how did you do this?”
Apple once used my software to demo their tech, which wasn’t ready.
Reporters refused to ask about this.
“How did you do this?” is a necessary question. But even broader, when you’re only shown a recording, the question is “How do we know this is real?”
Maybe Duplex, today, works just as well and sounds just as human as these recordings suggest. But maybe it doesn’t work as well as they claimed, or doesn’t sound so human,1 or takes pauses that were edited out of the clips they’ve released. We don’t know, because Google hasn’t allowed anyone to verify anything about it. It’s like a card trick where the magician, rather than an audience member, picks the card and shuffles the deck.
It’s the difference between, say, watching video of a purported self-driving car versus watching — or even better, riding as a passenger in — an actual self-driving car.
The headlines last week should have been along the lines of “Google Claims Assistant Can Make Human-Sounding Phone Calls”, not “Google Assistant Can Make Human-Sounding Phone Calls”. There’s a difference.
A recording is not a demo. You can demo hardware and software that isn’t shipping yet — most companies do, because that’s when the products are still under wraps and can make for a surprise. But there’s an obligation to be clear about the current state of the product, and to demo what you currently have working “for real”. Showing it privately to select members of the media is another acceptable strategy. Just to cite one famous example from Apple: in January 2007 the original iPhone was six months away from shipping and still needed a lot of work. But what Steve Jobs showed on stage was real — early stage software running on prototype hardware. Everything demoed was live, not a recording. And then to further prove that, after the keynote, select members of the media, including Jason Snell, Andy Ihnatko, and David Pogue, got up to 45 minutes of actual hands-on time with a prototype, even though the software was at such an early stage that some of the default apps only showed screenshots of what they were supposed to look like.
That’s how you prove to the world that a demo was what you said it was. It is damn curious that Google won’t do that with Duplex. ★
Google now claims their plan all along has been to have Duplex identify itself to humans. I don’t understand how that squares with the efforts they clearly went through to make Duplex sound convincingly human. It seems clear that they only started thinking about disclosing Duplex as a bot to humans in response to the ethical outcry after the keynote. Ethics aside though, what makes the promise of Duplex so tantalizing as a technology is its seeming humanness. ↩︎
At the bottom of Google’s AI Blog announcement of Duplex (“An AI System for Accomplishing Real World Tasks Over the Phone”), they included a photo of two Duplex engineers eating a meal, with the following caption:
Yaniv Leviathan, Google Duplex lead, and Matan Kalman, engineering manager on the project, enjoying a meal booked through a call from Duplex.
As suspicions around this announcement deepened, I got to wondering if we could identify this restaurant. If we could identify the restaurant, we could ask them if they had been told in advance they would be speaking to Google Duplex, among other interesting questions.
The image is cropped somewhat tightly, but they’re clearly eating Chinese food, the bench style and wall color are distinctive, and there’s a large picture hanging over their heads. So, I did the laziest thing I could possibly do: I asked my Twitter followers if any of them recognized it.
22 minutes later, we had the answer from DF reader Jay P: Hong’s Gourmet, in Saratoga, CA. This image on Yelp shows the same bench, same wall, and same picture on the wall. Next door to Hong’s Gourmet is Masu Sushi, whose sign is legibly reflected in the glass of the picture behind the Google engineers.1
My thanks to Jay P and everyone else who contributed to the thread on Twitter. Jay deserves the credit for cracking this, by going backwards from the Masu Sushi sign in the reflection.2 All I did was ask. The fact that I had an answer to my question in just 22 minutes shows that having a large follower count on Twitter is a bit of a super power. I honestly can’t think of another way to answer this question without Google PR’s help. I suppose, without Twitter, I could have just posted the question on Daring Fireball, and I might have gotten the same answer. But the threaded, public, instant nature of Twitter allowed for multiple people to contribute — we went from “this might be the place” to “this is definitely the place” in just a handful of minutes. Remarkable, really. ★
One weird detail is that the image from Google of the engineers has been flipped horizontally, so the reflection of the neighboring restaurant’s sign isn’t mirrored. My only guess as to why Google flipped this image is that they wanted Leviathan, the project lead, to have his name listed first in the caption. ↩︎