6 Comments

Great piece. Lots of impressive big AI programs that do specific narrow tasks. Even generating AI art is pretty specific, though not in the way you're talking about here.

All these programs impress me more than the more general bullshit machine that is ChatGPT. (But many people love ChatGPT, obviously.)

This is a great summary of this fascinating research. Some things I wonder:

1. How many mathematical domains can this sort of approach be extended to? The authors of this study seem to think it may be limited to just geometry proofs.

2. I'd be so curious to hear from gold-medal high school students who participate in the Olympiad and learn how they solve these problems. I guarantee it doesn't involve generating millions of examples that go into a synthetic data set.

3. It's interesting to me that, despite generating far more data than any human could, this approach still doesn't surpass gold-medal performance. At some point it might be interesting to develop a taxonomy of tasks where AI systems have clearly exceeded human capabilities (chess, Go), are at parity (geometry), and are still way behind (abstract reasoning).

A few useful other notes about this work:

1. Even though pure search methods are intractable for this problem, the best computer-algebra-based solution gets about 10 of the IMO problems right. The paper basically says that they don't care about this approach because it isn't relevant to the broader AI questions they are interested in. (I found that disappointing.)

2. They show in the paper that the deductive models they built get 21 problems right on their own, so that is already well past the SOTA and close to gold-medalist level, even without the deep learning aspect.

Great piece, Tim! So my takeaway is that AI is, or will be, truly creative. It would be great if you could do a piece on the subject 'how can scientists observe what an AI is doing' or 'can we ask an AI to explain its reasoning and procedure'.

Sounds like Kahneman’s system 1 and system 2.

Yes, that's a great comparison!
