Linked List: June 20, 2024

Thursday, 20 June 2024

Matt Levine on OpenAI’s True Purpose ★

Matt Levine, in his Money Stuff column:

OpenAI was founded to build artificial general intelligence safely, free of outside commercial pressures. And now every once in a while it shoots out a new AI firm whose mission is to build artificial general intelligence safely, free of the commercial pressures at OpenAI.

Anthropic Introduces Claude 3.5 Sonnet ★

Anthropic:

Claude 3.5 Sonnet sets new industry benchmarks for graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval). It shows marked improvement in grasping nuance, humor, and complex instructions, and is exceptional at writing high-quality content with a natural, relatable tone.

Claude 3.5 Sonnet operates at twice the speed of Claude 3 Opus. This performance boost, combined with cost-effective pricing, makes Claude 3.5 Sonnet ideal for complex tasks such as context-sensitive customer support and orchestrating multi-step workflows.

In an internal agentic coding evaluation, Claude 3.5 Sonnet solved 64% of problems, outperforming Claude 3 Opus which solved 38%. Our evaluation tests the model’s ability to fix a bug or add functionality to an open source codebase, given a natural language description of the desired improvement. When instructed and provided with the relevant tools, Claude 3.5 Sonnet can independently write, edit, and execute code with sophisticated reasoning and troubleshooting capabilities. It handles code translations with ease, making it particularly effective for updating legacy applications and migrating codebases.

I’ll take them with a grain of self-promoting salt, but the evaluation tests presented by Anthropic position Claude 3.5 Sonnet as equal to or better than ChatGPT-4o. Again: I don’t think there’s a moat in this game.

Also, from the bottom of the post, this interesting nugget:

One of the core constitutional principles that guides our AI model development is privacy. We do not train our generative models on user-submitted data unless a user gives us explicit permission to do so. To date we have not used any customer or user-submitted data to train our generative models.

~~Even Apple can’t say that.~~ Follow-Up: “Training Large Language Models on the Public Web”.

Warp’s Agent Mode: Natural Language for Terminal Commands ★

Michelle Lim, writing for the Warp blog (Warp, I’ll note, has previously sponsored DF):

Agent Mode raises the level of abstraction in the terminal. Using natural language, you can ask the terminal to accomplish any high level task without worrying about the specific commands you need.

Agent Mode knows when it needs more information to help you complete a task. It will ask permission to run commands on your machine and use the outputs to guide you, step-by-step. It’s also self-correcting when it runs into errors.

Because Agent Mode executes tasks through the CLI, it integrates with practically any service with zero configuration. If the service has a CLI, an API, or publicly available docs, you can use Agent Mode for the task. Agent Mode has inherent knowledge of most public CLIs, and you can easily teach it how to use internal CLIs by asking it to read their help content.

Man (no pun intended?), is this cool. I just tried “How do I generate a random number between 0000 and 9999?” and got:

You can generate a random number between 0000 and 9999 using the following shell command in the Fish shell:
  printf "%04d\n" (random 0 9999)

And when I followed up with “How can I use jot to generate a random number between 0000 and 9999?” I got:

You can use the jot command to generate a random number between 0000 and 9999 using the following command:
  jot -w "%04d" -r 1 0 9999

Both answers are correct. For jot (a tool I first learned about, of course, from the inimitable Dr. Drang), I think a simpler, and thus better, answer is:

  jot -r 1 0000 9999

but Warp’s Agent Mode suggestion is certainly good enough.
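
For a shell with neither Fish's random builtin nor jot installed, a rough equivalent might look like the sketch below. It is not something Agent Mode produced. It assumes bash, since $RANDOM is a bash feature rather than POSIX sh, and the function name rand4 is made up for illustration. Because $RANDOM only spans 0 to 32767, the modulo step slightly favors 0000 through 2767, which is fine for casual use but not for anything security-sensitive.

```shell
# Sketch only: assumes bash, where $RANDOM yields an integer from 0 to 32767.
# rand4 is a hypothetical helper name, not part of any tool mentioned above.
rand4() {
  # Reduce to 0..9999, then zero-pad to four digits.
  printf '%04d\n' "$(( RANDOM % 10000 ))"
}

rand4
```

The %04d format does the same zero-padding as in the Fish and jot commands above.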

Lacking Votes, EU Postpones Vote on CSAM Law That Would Ban End-to-End Encryption for Messaging ★

Clothilde Goujard, reporting for Politico:

A vote scheduled today to amend a draft law that may require WhatsApp and Signal to scan people’s pictures and links for potential child sexual abuse material was removed from European Union countries’ agenda, according to three EU diplomats.

Ambassadors in the EU Council were scheduled to decide whether to back a joint position on an EU regulation to fight child sexual abuse material (CSAM). But many EU countries including Germany, Austria, Poland, the Netherlands and the Czech Republic were expected to abstain or oppose the law over cybersecurity and privacy concerns.

“In the last hours, it appeared that the required qualified majority would just not be met,” said an EU diplomat from the Belgian presidency, which is spearheading negotiations until end June as chair of the EU Council.

Sanity prevails — for now.

‘This $8 Cardboard Knife Will Change Your Life’ ★

Matthew Panzarino, writing at The Obsessor:

The cardboard is inescapable if you use Amazon or other online stores, they pile up in the hallways and next to the garbage cans and you triage as you can.

We get so many that I have to break down our boxes in order to fit them in our recycle bin. I’ve used all of the typical tools — scissors, pocket knife, box cutter — and many unconventional ones like drywall saws just trying to make this painful job a bit easier.

The CANARY is uniquely serrated all the way around its edge, like a chainsaw. This makes it incredibly good at cutting cardboard either with or across corrugation with ease. I cannot express how easily this knife cuts cardboard, it’s like slicing through regular old paper, it’s amazing.

Last year when I linked to (and recommended) Studio Neat’s Keen — the world’s best box cutter, but which costs about $100 — at least one reader recommended the Canary. For $8 I figured why not. It truly is an amazing product. I do still love my prototype Keen but for opening and breaking down cardboard boxes, the Canary can’t be beat. It’s both highly effective and very safe.

‘Fast Crimes at Lambda School’ ★

Ben Sandofsky went deep on the history of Lambda School, a learn-to-code startup that aimed to disrupt computer science education, and its founder, Austen Allred:

What set his boot camp apart from the others were “Income Share Agreements.” Instead of paying up-front for tuition, students agreed to pay a portion of future income. If you don’t get a job, you pay nothing. It was an idea so clever it became a breakout hit of Y Combinator, the same tech incubator that birthed Stripe, AirBnb, and countless other unicorns.

When Lambda School launched in 2017, critics likened ISAs to indentured servitude, but by 2019 it was Silicon Valley’s golden child. Every day, Austen tweeted jaw-dropping results. [...]

Things got worse from there, and we’ll get to it. First I need to address a common question: what do I have to do with any of this? I have no professional or personal connections to the company or the team. What compelled me to follow this story for the last five years?

On the surface, this is another window into the 2010’s tech bubble, a period where mediocre people could raise ludicrous money amid a venture capitalist echo chamber fueled by low-interest rates. But what makes this any worse than Juicero, Clinkle, or Humane? Why does this rise to the level of Theranos?

These stories hinge on their villains, whose hubris and stupidity end in comeuppance. Theranos had Elizabeth Holmes, Fyre Festival had Billy McFarland, and Lambda School has Austen Allred.

Independent journalism at its best.

Apple ID to Be Renamed to Apple Account ★

Adam Engst, TidBITS:

The real problem comes when tech writers document features across multiple versions of Apple’s operating systems. We’ll probably use both terms for a while before slowly standardizing on the new term. Blame Apple if you see awkward sentences like “Continuity features require that you be logged into the same Apple Account (Apple ID in pre-2024 operating systems).” Or maybe writers will compress further to “Continuity features require that you be logged into the same Apple Account/ID.”

I do think “Apple Account” is a better name, so I think the transitional pain is worthwhile.