Thank you for this interesting article. May I request a brief explanation of why protein-folding models are important to pharmaceutical companies?
Protein-folding models like AlphaFold 2, which just predict the structure of a protein given its sequence, aren't that important to pharmaceutical companies, at least as far as I can tell.
But newer models like AlphaFold 3 are trying to predict the structure of a protein interacting with another molecule (e.g., another protein, a strand of DNA, an antibody, or a small molecule). This, if it worked really well, could be very valuable to drug companies. Take a drug company trying to make a drug that binds to a specific receptor: it could screen a bunch of candidates using this hypothetical model and only experimentally test the ones that seemed promising on the computer. That could substantially decrease the cost of finding good drugs.
I'm honestly not sure whether the models are useful enough yet to provide this value to drug companies. (It's very hard to tell from the outside; one thing that helps the academics working on OpenFold is that they get feedback from pharma companies' internal benchmarks, which the general public probably won't hear about.) But the vision has enough potential to make drug companies willing to invest fairly large sums of money.
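To make the screening idea concrete, here is a minimal sketch of that filter-then-test loop. Everything here is hypothetical: `predict_binding_affinity` stands in for an AlphaFold 3-style co-folding model and uses a meaningless toy heuristic just so the sketch runs; it is not a real API, and the SMILES strings are arbitrary examples.

```python
def predict_binding_affinity(receptor_sequence: str, candidate_smiles: str) -> float:
    """Placeholder for a structure-prediction model's binding score in [0, 1].

    Toy heuristic (longer molecule = higher score) purely so the code runs;
    a real model would co-fold the receptor and ligand and score the complex.
    """
    return min(len(candidate_smiles) / 20.0, 1.0)


def screen(receptor_sequence: str, candidates: list[str], threshold: float = 0.5) -> list[str]:
    """Keep only candidates whose predicted affinity clears the threshold."""
    return [c for c in candidates
            if predict_binding_affinity(receptor_sequence, c) >= threshold]


candidates = ["CCO", "CCN", "c1ccccc1O", "CC(=O)Oc1ccccc1C(=O)O"]
promising = screen("MKTAYIAKQR", candidates, threshold=0.3)
# Only `promising` would move on to (expensive) wet-lab testing.
```

The economics come entirely from the asymmetry: a model call costs fractions of a cent, while each experimental assay can cost thousands of dollars, so even a moderately accurate filter pays for itself.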
Thank you. That is very interesting and helped me understand your post. I appreciate it.
“But second, the experience of watching LLMs become increasingly restricted underlined the importance of open source. ‘It’s not something that I thought I cared about all that much,’ he told me. ‘I’ve become a bit more of a true open source advocate.’”
Nice to see people come around to open source; it's been a huge boon for me and my career. Love to see this. Nice read, thank you.
This vendor dependency dynamic rings very true.
At Kult, we ran into a similar issue with foundation models. We started heavily on GPT-4 for product recommendations. It worked great until OpenAI changed their API pricing mid-quarter and our costs jumped 40% overnight.
We've since built abstraction layers that let us swap between Claude, GPT, and open models. It's more engineering overhead upfront, but the freedom to optimize for cost and performance without rewriting everything is worth it.
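For anyone curious what that kind of abstraction layer looks like, here's a stripped-down sketch: one `complete()` entry point that dispatches to interchangeable backends. The backends below are stubs I made up for illustration; real ones would wrap the OpenAI, Anthropic, or local-model client libraries behind the same signature.

```python
from typing import Callable, Dict

class LLMRouter:
    """Registry of provider backends sharing one prompt -> text interface."""

    def __init__(self) -> None:
        self._backends: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, backend: Callable[[str], str]) -> None:
        self._backends[name] = backend

    def complete(self, prompt: str, provider: str) -> str:
        if provider not in self._backends:
            raise KeyError(f"no backend registered for {provider!r}")
        return self._backends[provider](prompt)


router = LLMRouter()
# Stub backends; in practice these would call the vendors' client libraries.
router.register("openai", lambda p: f"[openai stub] {p}")
router.register("anthropic", lambda p: f"[anthropic stub] {p}")

# Swapping providers is now a one-argument change, not a rewrite:
print(router.complete("recommend a product", provider="anthropic"))
```

The point isn't the router itself but the discipline it enforces: application code only ever sees the shared interface, so a pricing change becomes a config change instead of a migration.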
The pharma companies funding OpenFold are making the same calculation: short-term cost for long-term autonomy.
Curious how this plays out as the gap between open and closed models potentially widens. Is independence worth it if the proprietary models stay meaningfully better?
This is a fantastic piece Kai. Super interesting and well reported. Do you guys cross publish stuff? I could see this in the NYT or Post or WSJ.
This tension between open science and commercial interests mirrors a deeper architectural challenge in AI development. When we design systems with monolithic, proprietary architectures, we create dependencies that extend beyond code to knowledge itself. What if we approached protein folding models more like biological systems, with modular, interoperable components that different labs could specialize in and share?
This could allow both academic depth and commercial application without the current zero-sum dynamic. The brain doesn't hoard specialized functions; it distributes them across regions that collaborate.
Really enjoyed this perspective. It highlights how cooperation often beats isolation: academia gets openness, pharma gets independence, and the ecosystem becomes more resilient. Sometimes the biggest progress happens when unlikely allies choose shared infrastructure over dependence on a single gatekeeper. Teamwork!