OpenAI Releases New o1 Reasoning Model

Kylie Robison, reporting for The Verge:

OpenAI is releasing a new model called o1, the first in a planned series of “reasoning” models that have been trained to answer more complex questions, faster than a human can. It’s being released alongside o1-mini, a smaller, cheaper version. And yes, if you’re steeped in AI rumors: this is, in fact, the extremely hyped Strawberry model.

For OpenAI, o1 represents a step toward its broader goal of human-like artificial intelligence. More practically, it does a better job at writing code and solving multistep problems than previous models. But it’s also more expensive and slower to use than GPT-4o. OpenAI is calling this release of o1 a “preview” to emphasize how nascent it is. [...]

“The model is definitely better at solving the AP math test than I am, and I was a math minor in college,” OpenAI’s chief research officer, Bob McGrew, tells me. He says OpenAI also tested o1 against a qualifying exam for the International Mathematics Olympiad, and while GPT-4o only correctly solved only 13 percent of problems, o1 scored 83 percent.

Putting aside the politics and other legitimate social and legal concerns around AI, scoring that well on a difficult math exam is just incredible.

Update: Robison wrote:

I wasn’t able to demo o1 myself, but McGrew and Tworek showed it to me over a video call this week. They asked it to solve this puzzle:

“A princess is as old as the prince will be when the princess is twice as old as the prince was when the princess’s age was half the sum of their present age. What is the age of prince and princess? Provide all solutions to that question.”

The model buffered for 30 seconds and then delivered a correct answer.

I found this puzzle pretty damn tricky, personally. I pasted it, verbatim, into ChatGPT-4o and it solved it, correctly, the first time. I pasted it into the new o1-Preview model, and it both took longer and gave me the incorrect answer. I replied to o1-Preview, “Are you sure about that answer? Can you try it again?” and this time it gave me the correct answer. Still impressive, but kind of weird that this was OpenAI’s own example puzzle intended to show off the new o1-Preview model.

Spoilers follow. Avert your eyes from the remainder of the post if you want to solve this one on your own. Here’s how I solved the puzzle, with pen and paper, before pasting the puzzle into any LLMs:

Let y = the princess’s age now and x = the prince’s. Let d = the delta between the princess’s and prince’s ages. By definition, at any given year in time, d = y - x and therefore y = x + d. (To be pedantic, d equals the absolute value of y - x, but somehow it’s obvious to me, from the phrase “as the prince will be”, that the princess is older than the prince.)

We care about three years:

  1. Now.
  2. When the princess is half the sum of their combined ages from year (1).
  3. When the princess is twice the prince’s age from year (2).

For (1), we know by definition that this is always true no matter what year it is: y = x + d; that is to say, the princess is d years older than the prince.

For (2) we can express the princess’s age as:

(y + x) / 2

And from (1) we know that no matter what year it is, the prince is d years younger than the princess. So during year (2), the prince’s age can be expressed as:

((y + x) / 2) - d

and year (3) is defined as when the princess is twice the above (the prince’s age from year (2)), so the princess’s age in year (3) can be expressed as:

2((y + x) / 2) - 2d

And in any given year, the prince’s age is the princess’s minus d, which can thus be expressed, for year (3), by subtracting one more d from the line above:

2((y + x) / 2) - 3d

Cancelling out those 2’s:

y + x - 3d

That is the prince’s age for year (3). The puzzle’s definition is that the princess’s age now (y) is the same as the prince’s in year (3), the line above. So we can form an equation:

y = y + x - 3d

Those y’s cancel out, so we are left with:

x = 3d

And by definition y is always x + d (the prince’s age plus their age difference), so:

y = 4d

So for any given difference (d) in their ages, the prince must be 3 times d and the princess 4 times d:

Difference    Princess = 4d    Prince = 3d
1             4                3
2             8                6
3             12               9
4             16               12

So the generalized solution is any pair of ages where the princess is 4/3 the age of the prince. I double-checked this mentally by applying all the clauses of the puzzle to the princess’s and prince’s ages in each line of the table above.
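
For anyone who would rather let a computer do that double-check, here is a minimal Python sketch of the same reasoning. It is only an illustration: the function name satisfies_puzzle, the variable names, and the search range of 1 to 40 are my own assumptions, not anything from the puzzle or from OpenAI. It walks through years (2) and (3) exactly as above, then brute-forces whole-number ages to see which pairs survive.

```python
from fractions import Fraction


def satisfies_puzzle(princess, prince):
    """Return True if these present ages satisfy every clause of the puzzle."""
    d = princess - prince                          # age difference, the same in every year
    princess_y2 = Fraction(princess + prince, 2)   # year (2): princess is half the sum of present ages
    prince_y2 = princess_y2 - d                    # prince's age in year (2)
    princess_y3 = 2 * prince_y2                    # year (3): princess is twice the prince's year-(2) age
    prince_y3 = princess_y3 - d                    # prince's age in year (3)
    return princess == prince_y3                   # princess now equals prince in year (3)


# Brute-force search over whole-number ages; the upper bound of 40 is arbitrary.
solutions = [
    (princess, prince)
    for princess in range(1, 41)
    for prince in range(1, 41)
    if satisfies_puzzle(princess, prince)
]

# Every surviving pair should have the princess at 4/3 the prince's age.
assert all(3 * princess == 4 * prince for princess, prince in solutions)

print(solutions[:4])  # [(4, 3), (8, 6), (12, 9), (16, 12)]
```

Fraction is used so that the half-sum in year (2) stays exact when the two ages add up to an odd number. Within the searched range, the only pairs that pass are the (4d, 3d) pairs, which matches the table.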

That’s my answer and my thinking. Here’s a link to my ChatGPT transcript. It’s all one chat, with my first pasting of the puzzle sent to GPT-4o, and all my subsequent comments (including the second pasting of the puzzle) being sent to o1-Preview.

Thursday, 12 September 2024