“Andrew White, a chemist at FutureHouse, a non-profit organization in San Francisco that focuses on how AI can be applied to molecular biology, says that observers have been surprised and disappointed by a general lack of improvement in chatbots’ ability to support scientific tasks over the past year and a half, since the public release of GPT-4. The o1 series, he says, has changed that.
Strikingly, o1 has become the first large language model to beat PhD-level scholars on the hardest series of questions, the ‘diamond’ set, in a test called the Graduate-Level Google-Proof Q&A Benchmark (GPQA). OpenAI says that the scholars scored just under 70% on GPQA Diamond, and o1 scored 78% overall, with a particularly high score of 93% in physics…
OpenAI also tested o1 on a qualifying exam for the International Mathematics Olympiad. Its previous best model, GPT-4o, correctly solved only 13% of the problems, whereas o1 scored 83%.”
From Nature.