TL;DR

METR wrote a paper (reasonable, measured) and tweeted (sensational) that AI agent autonomy was doubling every seven months, insinuating that there was a new “Moore’s Law” for AI agents. You can tell from people’s reactions who read the paper and who only saw the tweet.

There is no new Moore’s Law for AI Agents.

Read on to see why.

Original tweet that got all of the attention: “Moore’s Law for AI Agents.”

More measured tweet (“persnickety title”) from METR’s founder and CEO. It drew roughly 1,500x fewer views than the original.

Moore’s Law: Predictable from Observation, Predictive from Roadmap

“The number of transistors in an integrated circuit doubles every two years.”

  • Gordon Moore

Gordon Moore’s famous prediction was made in 1965, after only a few years of observing this growth. It ended up being predictive because it was grounded in four well-understood engineering challenges with clear paths for improvement:

  1. Miniaturizing transistors and wires, fitting more onto a chip.
  2. Improving the printing process for chips with four known ways to improve photolithography.
  3. Making the wafers larger to accommodate more transistors.
  4. Improving circuit design.

Moore made the projection based on observed trends in a rapidly developing but fundamentally understandable technology.

Now, did this prediction become a target that motivated the industry? Yes. History shows it created enormous pressure across the industry to invest in R&D to drive this technological progress.

The observation and the pressure are where similarities between Moore and METR end.

METR’s Law for AI Agents?

No.

In contrast with Moore’s Law, there is no set of well-understood engineering challenges with clear paths for improvement to keep driving growth in AI agents’ abilities.

The measurements for the engineering challenges Moore saw were objective. METR’s attempt to quantify AI agent ability is subjective. If you read the research paper, as well as the HCAST (Human-Calibrated Autonomy Software Tasks) paper it references, you’ll see how much subjectivity went into devising a way to measure AI agent ability.

That’s a longer way of saying that microchip measurements were straightforward, while measuring AI agent ability means measuring knowledge work, which is fundamentally subjective.
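To make that subjectivity concrete, here’s a minimal sketch of the kind of calculation behind a “50% time horizon”: fit the agent’s success probability against the log of how long each task takes a human, then read off the task length at which the agent succeeds half the time. This is broadly the shape of METR’s metric, but everything below is invented for illustration, and the subjective choices (which tasks to include, how to baseline human times, what counts as success) all happen before the fit ever runs.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented data: how long each task takes a skilled human (minutes),
# and whether the agent passed it. Task selection, human baselining,
# and pass/fail judgments are all subjective human decisions.
human_minutes = np.array([1, 2, 4, 8, 15, 30, 60, 120, 240, 480])
agent_passed = np.array([1, 1, 1, 1, 1, 0, 1, 0, 0, 0])

# Fit success probability against log2(task length).
X = np.log2(human_minutes).reshape(-1, 1)
model = LogisticRegression().fit(X, agent_passed)

# The 50% horizon is where the fitted log-odds cross zero.
log2_horizon = -model.intercept_[0] / model.coef_[0, 0]
print(f"50% time horizon: ~{2 ** log2_horizon:.0f} human-minutes")
```

The regression itself is mechanical; the judgment calls live in the inputs.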

The Contrast With Moore’s Law

In generative AI, METR is observing an early trend, similar to Moore.
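Taken at face value, the arithmetic of a seven-month doubling is easy to run out. A back-of-the-envelope sketch, where the one-hour starting horizon is illustrative rather than METR’s published figure:

```python
# Naive extrapolation of a 7-month doubling time. Only the doubling
# period comes from the headline; the starting horizon is made up.
horizon_minutes = 60.0  # assume a ~1-hour task horizon today
for months in (0, 7, 14, 21, 28, 35, 42):
    projected = horizon_minutes * 2 ** (months / 7)
    print(f"+{months:2d} months: ~{projected:,.0f} min (~{projected / 60:.0f} h)")
```

Compounding like that turns hour-scale tasks into week-scale tasks within about four years, which is exactly why the next question matters.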

But what’s going to continue driving improvements that could enable doubling?

Who, if anyone, truly understands the underlying technology? There is a lot going on under the hood of a generative AI model. Anthropic’s interpretability research on tracing the thoughts of a language model demonstrates both how much is happening and how much we don’t yet understand about this technology.

An AI Roadmap, or Premature Extrapolation?

Capital has poured into GenAI and continues to do so. The large labs are in constant competition. The pressure that Gordon Moore thought was needed to drive Moore’s Law already exists in spades in the generative AI industry.

It’s the roadmap that is missing.

What’s the data, algorithm, compute, or unknown that’s going to get us to autonomy?

The waters are muddy.

The rhetoric from Sam Altman and Dario Amodei about expecting something approximating AGI has been off the charts. They have better seats than anyone to see the whole field, but we also have to discount what they say. For one, CEOs will always talk their book. Second, we’ve seen that some of what they have proclaimed hasn’t proven true (e.g., that scaling laws for pre-training would drive us to AGI).

I don’t see an equivalent to Gordon Moore’s four-part roadmap. Instead I see unknowns. Does more basic research need to be done? Could we strip-mine the mountain of AI research that already exists for ideas that might unlock breakthroughs and turn METR’s observation into a continued reality?

The tsunami of cash pouring into the field, alongside a history of AI that stretches back 70 years, makes for ripe conditions for advances. What those advances may be, though, is unclear.

Communication Is Perception

Here’s one I’m torn on, but it seems worth calling out.

Is METR’s headline what we need communicated right now?

Communication is perception.

It doesn’t matter what you say; it matters what others hear. This is a principle often ignored by people who try to communicate.

What people are hearing in reaction to this headline is “AI will take over knowledge work.”

Andrew Yang’s reaction was illustrative and representative of the broader response to METR’s announcement.

There’s an argument that METR’s research puts pressure on the world to become aware of the potential for a fundamental reordering of society.

There’s also an argument that this is alarmist and untrue.

The U.S. is awash in gaslighting these days, and maybe I’m just irked at a sensational headline.

But at the same time, for someone whose life’s work is to build a better education system, the potential of AI to reorder society as we know it has some pretty big implications for that work.

Trying to sort through that while in the fog of uncertain and unclear times is a challenge in its own right, and misleading headlines don’t help.