By John Gruber
If you missed it, here’s a re-link to last week’s special episode of The Talk Show, with special guests Craig Federighi and Greg “Joz” Joswiak.
By necessity, it was shot remotely — Federighi and Joz were at Apple Park, and I was at home in Philadelphia. Overall I think it turned out pretty well, and whatever is wrong with it is the result of my middling skills as an interviewer. Technically, I think it came out amazingly well — it looks great and sounds great. It doesn’t look or sound like a Zoom or FaceTime call that was simply recorded and played back.
A lot of folks noticed that, and have asked how we made it. I have good news and bad news. The good news is the answer is very simple and doesn’t require any expensive equipment. The bad news is it’s a lot of work.
Federighi and Joz were using iPad Pros for the call itself, and I was using a MacBook Pro, all with the built-in device cameras for video. One advantage of using iPads is that you guarantee there will be no fan noise. We all wore AirPods for the call to avoid feedback.
But all that stuff was just for the conference call. We didn’t use any of that footage for the show.
For the show’s audio, we used real podcast/TV-quality microphones — desktop mics for Federighi and Joz; and a professional lav mic and boom mic for me, connected to a Sound Devices MixPre-6 digital audio recorder (all borrowed from local audio pro and Sandwich collaborator Zach Phillips). We didn’t need both microphones, but using two gave us more choices in post. I could have recorded my side with the microphone and USB preamp I usually use for my podcast, but it didn’t really work visually with the space where we set up to film in my office. The point is all you really need is any microphone good enough to record a podcast.
For video, we shot 4K 30 FPS using iPhone 11 Pros. That’s right, iPhones. Apple seems to have plenty of them so Federighi and Joz each had two — one positioned head-on, and one to the side for a wider-angle view. I just used one. The trick to getting that “looking right at the camera” angle is to position the iPhone camera just behind and above the device being used for the conference call. We weren’t using the iPhones as webcams for the call, but we positioned the main ones as though we were. That’s key to the setup.
So we wound up with three audio recordings (one of each of us) and five video recordings (one for me, two each for Federighi and Joz). We also had an “if all else fails” recording of the Webex call. I’m lucky to have nice natural light in my office (we shot at 10am PT / 1pm ET), and we set up a few additional low-cost LED lights that, to be honest, I don’t think accomplished much.[1] After that, it was just a matter of editing.
Which, of course, is a huge matter. So, a few weeks before the show, I called upon my friends at Sandwich, and they were keen to help. They know me, they know The Talk Show, and they know how to make good videos. They nailed it. They were so easy to work with and the end result is exactly how I imagined the show turning out. Really, the biggest problem was just getting them my footage. I get somewhere between 250-300 Mbps downstream, but my upstream connection maxes out around 10 Mbps. With 17 GB of footage, that wound up taking around 7-8 hours. (Because they used four cameras, Apple’s footage was close to 100 GB in total — they, however, apparently have faster upstream internet service than I do.)
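As a rough sanity check on those upload numbers, here is a minimal sketch of the arithmetic. It assumes decimal gigabytes (10^9 bytes) and a constant transfer rate, and the function names are purely illustrative, not part of any tool used for the show.

```python
def upload_hours(size_gb, mbps):
    """Hours needed to upload size_gb gigabytes at a steady mbps megabits per second.

    Assumes decimal units: 1 GB = 8,000 megabits.
    """
    megabits = size_gb * 8 * 1000
    return megabits / mbps / 3600


def effective_mbps(size_gb, hours):
    """Average throughput, in megabits per second, implied by an observed upload time."""
    megabits = size_gb * 8 * 1000
    return megabits / (hours * 3600)


if __name__ == "__main__":
    # Best case: 17 GB at the connection's nominal 10 Mbps peak.
    print(f"17 GB at 10 Mbps: {upload_hours(17, 10):.1f} hours")
    # What a 7.5-hour upload implies about sustained throughput.
    print(f"17 GB in 7.5 hours: {effective_mbps(17, 7.5):.1f} Mbps")
```

At a sustained 10 Mbps, 17 GB would take a little under four hours, so a 7-8 hour upload implies an average closer to 5 Mbps. That gap is unsurprising, since a connection that "maxes out" at a given rate rarely holds that peak for hours.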
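So to recap: run the call on whatever devices you like, but record each participant’s video locally on an iPhone positioned just behind and above the call device, record each participant’s audio locally with a podcast-quality microphone, and edit the separate recordings together afterward.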
None of this is magic or particularly expensive. Calling in Sandwich for post-production and editing was, let’s face it, a cheat code, but the raw video footage from the iPhones was really good. Recent iPhones are amazingly good video cameras.
Basically, the secret to shooting a remote interview that doesn’t look like a recorded internet call is simply not to use the internet call video. Instead, shoot each participant like you would if there were no internet call involved, recording video and audio locally for everyone, using decent cameras and microphones. In audio podcasting we call this technique “double-ending” — recording the audio locally for each participant. We used the same principle for my show, just with both video and audio.
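One practical consequence of double-ending is that the separate local recordings start at different moments and have to be lined up in post. How the editors synced this show is not described here; the following is only a minimal sketch of one common approach, which slides a local track against a reference track (for example, the backup call recording) and keeps the offset where the two match best. The function names and the tiny sample lists are hypothetical, and real audio would call for an optimized library rather than this brute-force loop.

```python
def estimate_offset(reference, local, max_lag):
    """Return the lag (in samples) at which `local` best lines up with `reference`.

    A positive lag means the local recording started `lag` samples after the
    reference did; a negative lag means it started earlier.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation score: sum of products over the overlapping samples.
        score = 0.0
        for n, sample in enumerate(local):
            m = n + lag
            if 0 <= m < len(reference):
                score += reference[m] * sample
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align(local, lag):
    """Shift `local` onto the reference timeline by padding or trimming its start."""
    if lag >= 0:
        return [0.0] * lag + list(local)
    return list(local[-lag:])


if __name__ == "__main__":
    # Hypothetical toy data: the local recorder started three samples late.
    reference = [0, 0, 0, 1, -1, 2, 0, 0]
    local = [1, -1, 2, 0, 0]
    lag = estimate_offset(reference, local, max_lag=5)
    print(lag, align(local, lag))
```

The same offset, converted from samples to seconds, would apply to the video shot alongside that audio track, which is why a shared reference such as the call recording is handy to keep around.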
[1] I checked the forecast days in advance, and expected and got good weather. A severely overcast day here in Philadelphia would have necessitated a Plan B for lighting on my side.