
Large Language Models' Emergent Abilities Are a Mirage | WIRED
notes
Spend enough time with these technologies (and ask a lot of questions, in different directions) and the point of this article becomes clear.
link
summary
A new paper by researchers at Stanford University argues that the sudden jumps in LLMs' abilities, the so-called emergent abilities, are neither surprising nor unpredictable, but are a consequence of how researchers measure performance in AI. (The original version of this story appeared in Quanta Magazine.) Earlier work had observed that on certain tasks, performance stayed near zero as models grew larger, then abruptly jumped past some scale. The size of a language model is measured in parameters, roughly analogous to all the ways that words can be connected; the more parameters, the more connections an LLM can find. The Stanford researchers point out that these models were judged only on all-or-nothing accuracy: either they did a task perfectly, or they got no credit at all. They suggest using a metric that awards partial credit instead, under which the same abilities improve smoothly and predictably with scale.
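The metric argument is easy to demonstrate numerically. Below is a minimal sketch, not from the paper itself: the sequence length, the parameter counts, and the per-token accuracy curve are all invented for illustration. It shows how a smoothly improving per-token accuracy can look like a sudden "emergent" jump under an all-or-nothing exact-match metric, while a partial-credit (per-token) metric stays smooth.

```python
import math

SEQ_LEN = 30  # tokens per answer; "exact match" requires all 30 to be right

def per_token_accuracy(n_params: float) -> float:
    # Hypothetical smooth improvement: linear in log10(parameter count),
    # capped at 0.99. These numbers are made up for illustration.
    return min(0.99, 0.80 + 0.0475 * (math.log10(n_params) - 8))

for n_params in (1e8, 1e9, 1e10, 1e11, 1e12):
    p = per_token_accuracy(n_params)
    exact_match = p ** SEQ_LEN  # all-or-nothing: every token must be correct
    print(f"{n_params:.0e} params | partial credit (per token): {p:.3f} "
          f"| exact match: {exact_match:.3f}")
```

Run as-is, the partial-credit column climbs steadily (0.800 to 0.990) while the exact-match column sits near zero before jumping (0.001, 0.007, 0.036, 0.169, 0.740), mirroring the "emergent" curves the paper critiques.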