To me, a model is a representation that helps you pay attention to the important parts and the relationships between them. If attention cannot be managed, I don't think actual mental models are possible, and I think real learning is effectively mental modelling.
Great transition from author prediction to implicit knowledge to scientific research/knowledge work. I came into this without a strong opinion but I am now convinced it needs to be addressed (maybe retrain more often?)
Right now AI doesn't learn on the job past its context-building, but it seems to me that there is so much economic power in continual training that it will become an irresistible attraction.
Here's an example. Say I get an AI agent to be my marketing assistant. It learns how to work in my business, but it also takes those lessons back to the central model. Right now there would be a lot of resistance to this - we would have privacy concerns. But I would be forced to swallow any privacy concerns if the marketing assistant had the expertise you would get from running millions of A/B tests across thousands of firms. McKinsey has built their business model not on putative brain power, but on their having visibility of best business practices across all of their client firms. An AI agent that brought back its learnings would have the same advantage, squared. This would create a business moat that the AI firms currently do not have. It would bring network effects to what is now close to a commodity business.
The feedback loop length is doing a lot of work here that I think gets overlooked.
AI can already do something close to science in domains where you run an experiment and get a result in seconds. Protein folding, chip design. The implicit knowledge gap matters less when you can brute-force through thousands of iterations before a human scientist finishes their coffee.
But in domains where the feedback loop runs in months or years, geopolitics, macroeconomics, the gap becomes everything. You can't iterate your way to understanding a trade war. You get maybe three or four data points per decade and each one is confounded by a hundred other variables.
That's probably where human+AI stays orders of magnitude ahead for longest. Humans have spent decades accumulating the kind of gut-level calibration that only comes from living inside slow systems. Models haven't, and training data can't substitute for it.
The frozen-weights point is the cleanest version of why current architectures cannot autonomously do science. Test-time compute and tools stretch retrieval and reasoning over the existing weights, but they do not add a learn-from-the-experiment loop. Sam Altman's March 2028 automated researcher target is a bet that someone closes the inference-time learning gap, not that scaling current systems gets there.
Love the post but I don't think it's going to convince the doubters. Also, Dreyfus is good but his feuds with the AI people are hard to stomach; I would suggest Harry Collins's book Artificial Experts, as well as this critical review he wrote of Dreyfus: https://www.sciencedirect.com/science/article/pii/0004370296000836
I think, though, that the problem with the Dreyfusian argument is that AI does end up doing many of the things he thinks it won't, including natural language conversation. I think he's still right about what makes humans human and what computers can't do (points similar to the ones you're making), but the point is that humans routinely create conditions where technology plays a big role in our lives. A lot of what made modern AI possible was the web and the fact that we lead a lot of our lives on the web now, which makes our LLM-driven "agents" possible and useful.
You need to stop using ChatGPT.
It's not currently the best model.
Go ahead and spend some money and use a real AI agent that has memory, like Claude 4.7.
If you legitimately give it articles you actually wrote, it can start recognizing when you upload things that you did not write and that do not belong to you.
Yeah you sound like a shill. Nearly everybody who uses LLMs on a serious basis understands that every model is useful for different things and no one straight up writes anything off.
Claude Code is still pretty far behind Codex in several ways, most notably long-running tasks. They even admit that in their own writings; that’s why it’s such a high priority for the team.
If you think one model is drastically worse than the other, it’s probably because of your prompts and your lack of context or intent. Actually, you can prove that if you still have a go-to prompt or a list of prompts that you’ve bookmarked, because prompt engineering is not the way to get the best results. Even Anthropic has several guides telling you that exact same thing.
Hi Nathanael! ChatGPT recognized me so I don't quite understand your comment. Probably if I tried Claude it would also recognize me. I'm not sure what that would prove.
An interesting question is whether I could teach ChatGPT or Claude to recognize new authors whose work isn't in the training set of the underlying model. I am skeptical of this for the reasons laid out in the article, but it's a hard question to test since I don't have a large trove of unpublished articles by various writers.
From my experience working with scientists, extracting deep insights is like 0.1% of the job. Many professional scientists aren't expected to extract deep insights at all - the insights are often obvious given the data, and the vast majority of the job is data collection. Or they're working in a lab and it's someone else's job to extract insights.
Or consider reproducing research papers. Checking to be sure if a research paper is reproducible is valid science, right? Arguably we need more of that. But there isn't even supposed to be any deep insight there. You still need to hire scientists for this, because so many of the practices in a field are only known by the scientists working in it.
I'm sure there are many scientists whose jobs cannot be replaced by AI. But the "science" part doesn't seem like the hardest part to replace - the AI is much worse at, for example, delivering PowerPoint slides asking for funding. A critical part of the PI job! A lot of entry-level scientists seem like they could have all of the work they would do five years ago entirely replaced with a modern AI.
There isn’t *supposed* to be any deep insight in replicating an experiment. But that’s only true if the initial author managed to write down all the tips and tricks needed to get the initial experiment working. You often have to fly out someone from the initial lab to show you where they had to kick the machine or duct-tape something to get it working.
That sounds perfect, I bet the AI could also fly someone out from the initial lab to ask them questions ;-)
Well a lot of that data gathering involves interacting with the real world, though, right? AI seems much further away from being able to automate that kind of work.
I guess it probably depends on what sort of science. For example I have been doing some work recently with astronomers, and there it's very common to share these huge public datasets, and then describe your analysis of it. You could reproduce just the "huge dataset to analysis" part of it without any additional real-world access. Of course building another huge telescope in the desert is a ton of interacting with the real world, but that stuff never "replicates" anyway.
This is the point Dwarkesh Patel made last year in talking about “continual learning”. And I’m arguing in a talk I’m giving these days that it’s actually very close to the point that Hubert Dreyfus was making about expert systems back in the 1980s (and about a lot of analytic philosophy). It’s true that a lot of intelligence can be reduced to knowledge that can be expressed as sentences in a language. But there are things you need to practice and optimize on and can’t express in words.
Modern LLMs do a great job of improving on expert systems by having a bottom layer that has trained and practiced. All the types of reinforcement learning they’re adding are doing more. But they don’t do any better on your own task than the instructions you can write down unless that task makes it into the reinforcement learning loop for the next model.
Has your talk been recorded and posted anywhere?
I am also interested in this!
I don’t believe there’s been a recording but if you’re interested enough I can share my notes (which are already halfway to a full paper, because I wanted them to be detailed enough that Claude could write my slides for me as a part of the demonstration).
Tim, this is one of the cleanest popular-press articulations of the implicit-knowledge problem I've read; the temp-worker analogy in particular does real work.
One push: the reason a seasoned hunch is trustworthy isn't just that it's pattern-rich. It has carried a consequence. The practitioner has made calls, paid for the wrong ones, and adjusted. That loop gives the hunch its weight. An LLM mid-session has nothing in the loop that bears a cost, which means this isn't a context-window problem, it's a stake problem.
Extended the thought (and where it goes for governance) here: https://substack.com/@jammit1994/p-196711207
In other words, the seasoned human practitioner knows what it's like to get burned. Anyone ever release a code change to production that brought down a business-critical process? Bet you will never forget doing that. Does any AI model or ecosystem have that feature yet? Does it know what it feels like to get burned? That's where the hunches really come from, even if it happened to you 20 years ago.
Exactly, and if companies make the mistake of replacing junior employees with AI, then you have a whole other problem. In 10 or so years, you are left without the would-be seasoned human professionals that are needed to ensure AI doesn't just run amok.
There’s an interesting question about how this relates to what goes on during training. You start with a randomized network, feed it some training data, evaluate how badly it did on the data, and then tweak the parameters in the direction that makes it do better on the data. It bears that cost and updates in response. But once it’s released, there are no more changes (because there is no label of whether what it did was good or bad).
(OpenAI tried for a while to use the thumbs up or thumbs down that users gave on responses in later training, but this is what made their models from 4o to 4.5 so extremely sycophantic.)
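The training loop described above can be sketched in a few lines of plain Python. This is only a toy illustration, assuming a single weight, a squared-error cost, and made-up data; the names `train` and `predict` are hypothetical and nothing here reflects how any lab actually trains its models.

```python
import random

def train(data, steps=200, lr=0.05, seed=0):
    """Fit y ~ w * x by gradient descent on squared error (toy example)."""
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)      # start from a randomized parameter
    for _ in range(steps):
        for x, y in data:
            error = w * x - y       # how badly it did on this labeled example
            grad = 2 * error * x    # derivative of error**2 with respect to w
            w -= lr * grad          # tweak the parameter to do better
    return w

# Made-up labeled data following y = 2x (an assumption for illustration)
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = train(data)

def predict(x, w=w):
    # After "release" the weight is frozen: no label arrives, so nothing updates
    return w * x
```

The cost only exists inside `train`, where every example comes with a label; `predict` has no way to find out whether its answer was good or bad.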
Kenny — this is exactly the right place to push, and I think it sharpens rather than complicates the picture.
You're right that training involves a real form of consequence-bearing. The network bears a cost, measured against the labeled data, and updates in response. That is genuinely a feedback loop, and it's the loop that produces whatever capability the deployed model has. So the "no consequences ever" framing some critics use isn't quite right.
But I think the OpenAI sycophancy story you point to is the proof that two very different things have been getting collapsed under the same label. Training-time feedback shapes behavior by penalizing the model when its outputs diverge from labeled targets. Consequence-bearing in the wisdom sense requires something stronger: that the cost of being wrong fall on the agent making the call, in a way that matters to that agent's continued standing, capability, or relationships, and over a horizon long enough that the agent has to live with what they decided.
Training has the first. It doesn't have the second, and arguably can't. The thumbs-up/thumbs-down attempt is what you get when an organization tries to manufacture a substitute. The signal is real but the consequence is wrong: the model learns what users approve of, not what was right. Sycophancy is the predictable result, because telling people what they want to hear is the dominant strategy under that reward structure. You've essentially built a system that has been trained on the social cost of disagreement without any of the long-run costs of being wrong.
Which I think is why post-deployment severing of the loop matters more than it first appears. It's not just that the model stops learning. It's that there's no architecture available, and maybe none possible, for routing the right kind of consequence back into the system. The kind that produces judgment in humans takes years of carrying outcomes you cannot deflect. The training paradigm doesn't have a clean analog for that, and the proxies we've tried produce something that pattern-matches to wisdom from the outside while being structurally different underneath.
Curious whether you see a path to a richer feedback structure post-deployment, or whether you think the architecture itself rules it out.
Cheers, James.
"LLMs seem to lack a capacity for continual learning: the ability to recognize new patterns in — and form new hunches about — information they encounter at inference time."
We have two modes now. One is the online mode described in this article. The AI agent diligently reads and writes notes as it works. This is indeed proving to be quite revolutionary for Claude.
The second is to at some point take all the accumulated knowledge and retrain the model from scratch. This can make much deeper connections.
While I understand the issues raised in this article, it is not fully clear to me humans genuinely do something totally different than these two modes.
At most one could argue humans have more granularity so can refresh their mind faster and without total reset. This is surely smarter and more efficient. But is it fundamentally different?
Maybe text itself is lossy, and verbalization is the root of the problem.
I always start from the position that we know very little about human cognition and whether the best way to think about it is, in fact, in terms of information processing.
But assuming this is a helpful description of what humans are doing, it’s in part an empirical and engineering question: how feasible and reliable is it to fine-tune a model on a new corpus (like all your chats) and have it glean the right insights (while not forgetting other crucial stuff, etc)? I don’t know—someone more familiar with fine-tuning than me could probably answer that. Setting aside the computational demands, I assume part of the challenge is curating the continual learning corpus—you don’t necessarily want to learn from every new input. Maybe another LLM could do that? It’s an interesting question.
We will see. We have barely started to capitalize on what we have so far, and the success of Claude was far from assured given where we were last year. There's so much work within reach of current and near-term AI.
"translate them to English, Python, or any other explicit form"
The word "explicit" crystallized for me why talking about LLMs in relation to testable languages like coding languages and hand-waving languages like most human languages in the same breath is inherently misleading.
LLM results that can be expressed in Python et al. are explicit (definite in form and content). English et al. are only explicit when stating names (Charlie Chaplin) and pointing words (this sentence).
Tim: As much as you did deliver information useful to me throughout your post, your wrapping it in an apparent acceptance of sameness of scopes of the two language types will encourage some lazy folks to believe they behave the same in terms of delivering something actionable. I know this because I am lazy as often as not.
It isn't the main point of this post, but I still want a definition of an "automated AI scientist" that can be operationalized and hasn't already happened.
Technically, continual learning is almost trivial - just keep applying backprop during inference. The problem is not really continual learning - it is that LLM learning itself is slow. So continual learning offers few benefits - and has substantial costs - regular brain wipes are a safety feature.
You can’t apply backprop (or the reinforcement learning they use to train reasoning) during inference if you don’t have labels on what is correct and what isn’t!
The solution to that is the same as it is for training: use self-supervised learning. Then you have abundant data which is labelled for correctness.
But of course, that only works for some kinds of tasks. You can learn to predict the next word, or to solve math problems, but you can’t learn to write interesting stories or make persuasive arguments.
It works for interesting stories and persuasive arguments - if they are in the training data. Obliterate part of the training data and score the agent on how well they reconstruct it. That's the basis of how researchers got around the need for labelled training data - which was previously a significant impediment. It expanded the domain of ML from labelled datasets - such as ImageNet - to practically anything.
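To make the "obliterate part of the data and score the reconstruction" idea concrete, here is a minimal sketch in plain Python. It assumes a toy word-count table standing in for a real model and a made-up one-sentence corpus; the helper names `make_examples`, `fit`, and `score` are hypothetical. The point it tries to show is only that the labels come from the text itself, with no human annotation.

```python
from collections import Counter, defaultdict

def make_examples(text):
    """Turn raw text into (previous word, hidden next word) pairs."""
    words = text.split()
    return [(words[i], words[i + 1]) for i in range(len(words) - 1)]

def fit(examples):
    """Count which word follows which: a toy stand-in for a trained model."""
    table = defaultdict(Counter)
    for prev, nxt in examples:
        table[prev][nxt] += 1
    return table

def score(table, examples):
    """Fraction of hidden words the toy model reconstructs correctly."""
    hits = 0
    for prev, nxt in examples:
        guess = table[prev].most_common(1)[0][0] if table[prev] else None
        hits += guess == nxt
    return hits / len(examples)

# Made-up corpus (an assumption for illustration)
corpus = "the cat sat on the mat and the cat sat on the rug"
examples = make_examples(corpus)   # the hidden next word is the label
model = fit(examples)
accuracy = score(model, examples)
```

Whether a reconstruction score like `accuracy` says anything about a story being interesting or an argument being persuasive is exactly what the thread is disputing; the sketch only shows where the training signal comes from.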
My guess is that backpropagation will only optimize the system further, not make the system capable of abstraction, which I think is required to create improved, or even new, models of the world.
Current LLMs do not have any concept of consequences and causality; tokens are generated purely from past tokens, as LLMs only have attention mechanisms for relating tokens to each other.
Personally I believe it needs a new kind of attention mechanism that relates output with outcomes, allowing it to modify the attention mechanism itself. In a sense, to have expectations, and modulate uncertainty and confidence about these expectations.
Backpropagation is part of this, but I doubt it is enough.
What about AI computer scientists? They can do their own experiments easily enough.
Instinctively, I believe (or want to believe) in short timelines. When it comes to continual learning, I lean toward Zvi and away from Dwarkesh. But you make a good point. Is this a fair summary: there are at least two important types of human learning, explicit (capable of being rendered into text) and implicit? LLMs can have both, but they acquire the latter only through training. The latter is critically important. So until such time as we can train on the fly, human performance (at least at many tasks) is going to elude our models. Not implausible, although I find myself wondering whether training will turn out to be the only way to impart implicit learning to these things.
Yes that sounds right to me!
Thanks! (Aside: I really enjoy your occasional conversations with Bob Wright.)
I doubt training on the fly will give LLMs human-like learning capabilities. Humans can learn from a few examples, through abstractions, mental models, and experimentation. LLMs only "learn" through brute-force optimization of their neural networks. Adding more data, even real-time data, will not give an LLM any new capability to integrate the outcomes of its output with a model of a system, to improve that model, or even to evaluate whether the model reflects reality at all.
Evolution resulted in better and better visual perception for a lot of organisms, but the increase in visual data was probably not what made some species more successful than others. What you are seeing is not nearly as important as where to look.
Contextual expectations allow complex biological systems to make predictions based on their surroundings, and to actively adapt those predictions. An LLM, even with continuous training, cannot adapt itself to change where it looks; it cannot improve its attention.
To me, a model is a representation that helps you pay attention to the important parts and the relationships between the parts. If attention cannot be managed, I don't think actual mental models are possible, and I think real learning is effectively mental modelling.
Great transition from author prediction to implicit knowledge to scientific research/knowledge work. I came into this without a strong opinion, but I am now convinced it needs to be addressed (maybe retrain more often?).
Agreed.
Well, we are close to AIs that can do science, but the best humans guiding the best models will be orders of magnitude more effective scientists.
So, I agree in spirit.
Right now AI doesn't learn on the job past its context-building, but it seems to me that there is so much economic power in continual training that it will become an irresistible attraction.
Here's an example. Say I get an AI agent to be my marketing assistant. It learns how to work in my business, but it also takes those lessons back to the central model. Right now there would be a lot of resistance to this - we would have privacy concerns. But I would be forced to swallow any privacy concerns if the marketing assistant had the expertise you would get from running millions of A/B tests across thousands of firms. McKinsey has built their business model not on putative brain power, but on their having visibility of best business practices across all of their client firms. An AI agent that brought back its learnings would have the same advantage, squared. This would create a business moat that the AI firms currently do not have. It would bring network effects to what is now close to a commodity business.
The feedback loop length is doing a lot of work here that I think gets overlooked.
AI can already do something close to science in domains where you run an experiment and get a result in seconds. Protein folding, chip design. The implicit knowledge gap matters less when you can brute-force through thousands of iterations before a human scientist finishes their coffee.
But in domains where the feedback loop runs in months or years, geopolitics, macroeconomics, the gap becomes everything. You can't iterate your way to understanding a trade war. You get maybe three or four data points per decade and each one is confounded by a hundred other variables.
That's probably where human+AI stays orders of magnitude ahead for longest. Humans have spent decades accumulating the kind of gut-level calibration that only comes from living inside slow systems. Models haven't, and training data can't substitute for it.
The frozen-weights point is the cleanest version of why current architectures cannot autonomously do science. Test-time compute and tools stretch retrieval and reasoning over the existing weights, but they do not add a learn-from-the-experiment loop. Sam Altman's March 2028 automated researcher target is a bet that someone closes the inference-time learning gap, not that scaling current systems gets there.
Love the post but I don't think it's going to convince the doubters. (Also, Dreyfus is good but his feuds with the AI people are hard to stomach; I would suggest Harry Collins's book Artificial Experts, but also this critical review he wrote of Dreyfus: https://www.sciencedirect.com/science/article/pii/0004370296000836 .)
I think, though, that the problem with the Dreyfusian argument is that AI does end up doing many of the things he thinks it won't, including natural language conversation. I think he's still right about what makes humans human and what computers can't do (points similar to the ones you're making), but the point is that humans routinely create conditions where technology plays a big role in our lives. A lot of what made modern AI possible was the web, and the fact that we now lead so much of our lives on the web is what makes our LLM-driven "agents" possible and useful.