When Anton Korinek, an economist at the University of Virginia and a fellow at the Brookings Institution, got access to the new generation of large language models such as ChatGPT, he did what a lot of us did: he began playing around with them to see how they might help his work. He carefully documented their performance in a paper in February, noting how well they handled 25 “use cases,” from brainstorming and editing text (very useful) to coding (pretty good with some help) to doing math (not great).
ChatGPT did explain one of the most fundamental principles in economics incorrectly, says Korinek: “It screwed up really badly.” But the mistake, easily spotted, was quickly forgiven in light of the benefits. “I can tell you that it makes me, as a cognitive worker, more productive,” he says. “Hands down, no question for me that I’m more productive when I use a language model.”
When GPT-4 came out, he tested its performance on the same 25 questions that he documented in February, and it performed far better. There were fewer instances of making stuff up; it also did much better on the math assignments, says Korinek.
Since ChatGPT and other AI bots automate cognitive work, as opposed to physical tasks that require investments in equipment and infrastructure, a boost to economic productivity could happen far more quickly than in past technological revolutions, says Korinek. “I think we may see a greater boost to productivity by the end of the year—certainly by 2024,” he says.
What’s more, he says, in the longer term, the way the AI models can make researchers like himself more productive has the potential to drive technological progress.
That potential of large language models is already turning up in research in the physical sciences. Berend Smit, who runs a chemical engineering lab at EPFL in Lausanne, Switzerland, is an expert on using machine learning to discover new materials. Last year, after one of his graduate students, Kevin Maik Jablonka, showed some interesting results using GPT-3, Smit asked him to demonstrate that GPT-3 is, in fact, useless for the kinds of sophisticated machine-learning studies his group does to predict the properties of compounds.
“He failed completely,” jokes Smit.
It turns out that after being fine-tuned for a few minutes with a few relevant examples, the model performs as well as advanced machine-learning tools specially developed for chemistry in answering basic questions about things like the solubility of a compound or its reactivity. Simply give it the name of a compound, and it can predict various properties based on the structure.