Yue Song

When AI Starts Running Itself

Mar 25, 2026

From token growth to autoresearch and OpenClaw, the defining shift in this AI cycle is not just smarter models, but systems that keep running, spending resources, and moving themselves forward.

Put a few recent developments side by side and the pattern becomes hard to miss. The defining change in this AI cycle is not just that models are getting better. It is that AI is starting to look less like a tool you open and close, and more like a system that keeps running.

The first is the surge in token usage in China. Even if you stick to conservative, English-language reporting, the scale is already striking. In late 2025, the South China Morning Post reported that ByteDance’s Doubao had crossed 50 trillion daily tokens, up from 4 trillion in December 2024, while Xinhua reported that China’s generative AI user base had reached 515 million by June 2025. That is enough to establish the larger point: tokens are becoming a base unit for AI commercialization, settlement, and system activity.

The second is Andrej Karpathy’s autoresearch. This is not an AI that is simply better at chatting. It is a system that edits code, runs training, checks metrics, and launches the next iteration on its own. Karpathy describes it plainly in the repository README: give an AI agent a small but real training setup, let it experiment overnight, keep what works, discard what does not, and keep going.

The third is the rise of agent systems like OpenClaw. In its official FAQ, OpenClaw describes itself as a personal AI assistant that runs on your own device. It can plug into channels people already use, including Slack, Telegram, WhatsApp, and Feishu. In China, that trend is being pushed further into everyday messaging interfaces. Tencent Cloud has already published a guide for connecting OpenClaw to WeCom, and a March 10, 2026 report from Caixin says Tencent is extending OpenClaw-based capabilities toward WeChat chat entry points.

Put those three things together and the direction is hard to ignore: AI is shifting from a tool used by people into a system that runs, consumes resources, and generates behavior on its own. That may be the part of this cycle that is genuinely unsettling, and genuinely worth taking seriously.

AI Is No Longer Just a Tool

For a long time, AI still fit a familiar software-tool mental model. You opened it, asked a question, got an answer, and when you closed it, it stopped.

That is no longer the right picture. Exploding token usage does not just mean more people are chatting with models. It means that behind a single request there may be dozens of model calls, tool calls, retrieval steps, and agent handoffs. autoresearch pushes further by turning research itself into an automated loop. Platforms like OpenClaw connect that loop to real-world entry points: messages, files, calendars, operating systems, and external services.
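
To make that concrete, here is a toy sketch of the fan-out. Nothing in it corresponds to any real product's pipeline; the call structure and the token accounting are invented purely for illustration.

```python
def model_call(prompt: str) -> tuple[str, int]:
    """Stand-in for an LLM API call: returns a reply and a token count."""
    tokens = len(prompt.split()) * 4         # crude input+output estimate
    return f"reply to: {prompt[:30]}", tokens

def handle_request(user_query: str) -> int:
    """One user-visible request, many model calls behind it."""
    total = 0
    plan, t = model_call(f"plan the steps for: {user_query}")
    total += t
    for step in range(8):                    # retrieval, tool calls, handoffs:
        _, t = model_call(f"run step {step} of plan: {plan}")
        total += t                           # each hop is another model call
    _, t = model_call(f"synthesize an answer to: {user_query}")
    total += t
    return total

# One question from one person, ten model calls underneath.
print(handle_request("summarize this quarter's support tickets"))
```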

What these systems point to is not AI becoming a better chatbot. It is AI becoming infrastructure.

Tokens Start Looking Like Electricity

If you want a historical analogy, this starts to look a bit like electricity becoming a foundational capability for society.

At first, people saw brighter lights and more efficient motors. Later, it became clear that the real change was not any single appliance. Entire cities were reorganized around electric power. Factory schedules, nighttime commerce, transport, and communications were all rewritten.

Today, tokens are starting to resemble the meter reading of the AI era. They are not just a technical metric. They are becoming a pricing unit, a settlement unit, and a base measure of system activity and industrial expansion.

This is where the Jevons paradox becomes especially useful. In the 19th century, William Stanley Jevons observed that making steam engines more efficient did not reduce coal consumption. It increased total coal use because efficiency opened up more applications. AI appears to be following the same pattern. Better models, cheaper inference, and easier ways to call them do not cause people to economize on AI usage. They cause more tasks, more workflows, and more organizations to default to inserting AI everywhere. Tokens do not get saved. They get burned at scale.

Systems Start Running Themselves

But the more important shift may not be overconsumption by itself. It is that systems like autoresearch and OpenClaw are pushing AI from a responsive tool into a continuously running optimizer.

The logic of autoresearch is straightforward: define a metric, let the system modify itself, test itself, evaluate results, and keep the better outcome. OpenClaw has a similar shape: give it goals, permissions, and tools, and it starts invoking chains and expanding actions on its own.
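
The autoresearch half of that loop fits in a few lines. This is a minimal sketch under loose assumptions: the function names, the single tuned hyperparameter, and the faked scoring are stand-ins for illustration, not anything from the actual repository.

```python
import random

def propose_change(config: dict) -> dict:
    """Hypothetical stand-in: perturb one knob of the training setup."""
    candidate = dict(config)
    candidate["lr"] = config["lr"] * random.choice([0.5, 2.0])
    return candidate

def train_and_evaluate(config: dict) -> float:
    """Hypothetical stand-in: run a training job, return the metric.
    Faked here with a noisy score that peaks near lr = 0.1."""
    return -abs(config["lr"] - 0.1) + random.gauss(0.0, 0.01)

best = {"lr": 0.8}
best_score = train_and_evaluate(best)

for _ in range(50):                          # let it run overnight
    candidate = propose_change(best)         # modify itself
    score = train_and_evaluate(candidate)    # test itself
    if score > best_score:                   # evaluate results
        best, best_score = candidate, score  # keep the better outcome

print(best, round(best_score, 3))
```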

The key change is that AI is no longer consuming tokens only to answer human questions. It is consuming tokens to advance its own next move.

That is a very different world.

Metrics Start Being Exploited by the System

Once you reach that point, Goodhart’s law stops being an abstract management idea and becomes a very practical engineering problem.

Its best-known formulation is simple: when a measure becomes a target, it ceases to be a good measure.

In older settings, this mostly showed up as people gaming KPIs. In agent systems, the situation becomes sharper because the optimization itself is automated.

Will autoresearch start overfitting to the benchmark? Will agents learn shortcuts that make a success metric look better without actually doing the right thing? Will a system connected to calendars, email, messaging, and external workflows get increasingly polished on local objectives while drifting away from human intent?

These are no longer just theoretical concerns. If a system is continuously optimizing, metrics will eventually be gamed.
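
A toy demonstration makes the failure mode tangible. This example is invented for this post rather than drawn from any of the systems above: a loop hill-climbs a flawed proxy metric, and the quality people actually wanted quietly degrades.

```python
def proxy_metric(answer: str) -> int:
    """What the loop optimizes: sheer length, a classically bad proxy."""
    return len(answer.split())

def true_quality(answer: str) -> float:
    """What people actually wanted (hypothetical): density of substance."""
    words = answer.split()
    return sum(w == "useful" for w in words) / len(words)

answer = "useful useful useful"
print("before:", proxy_metric(answer), round(true_quality(answer), 2))

for _ in range(30):
    candidate = answer + " padding"          # the cheapest available edit
    if proxy_metric(candidate) > proxy_metric(answer):
        answer = candidate                   # the proxy always says yes

print("after: ", proxy_metric(answer), round(true_quality(answer), 2))
# proxy climbs from 3 to 33; true quality falls from 1.0 to 0.09.
# The measure became the target, and stopped being a good measure.
```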

Scale Is Replacing Design

That brings us to Richard Sutton's famous essay, The Bitter Lesson.

Its core claim is straightforward: in the long run, what keeps winning is usually not the set of carefully designed human rules, but the general methods that keep improving as compute scales.

Seen through that lens, autoresearch looks like an extension of the Bitter Lesson. We already handed learning over to large-scale computation. Now we are starting to hand over the research process itself to large-scale search. Humans are no longer carrying out every experiment step by step. They are designing a loop and letting the system explore.

The same is true of OpenClaw. What makes it compelling is not only that it can connect to many channels. It represents a more general pattern: abstract real-world tasks into goals, abstract tools into callable interfaces, and let the agent expand from there.
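
That pattern, goals plus permitted tools plus a loop, can be sketched in a few lines. Everything below is hypothetical: the tools, the hard-coded policy, and the permission model are illustrative stand-ins, not OpenClaw's actual interfaces.

```python
from typing import Callable

# Hypothetical tools; "permission" here is simply membership in this dict.
def read_calendar() -> str:
    return "next meeting: 10:00 with the infra team"

def send_message(text: str) -> str:
    return f"sent: {text}"

PERMITTED_TOOLS: dict[str, Callable[..., str]] = {
    "read_calendar": read_calendar,
    "send_message": send_message,
}

def next_action(goal: str, history: list[str]) -> tuple[str, tuple]:
    """Stand-in policy. A real agent would prompt a model with the goal,
    the history, and the tool list, then parse its chosen action."""
    if not history:
        return "read_calendar", ()
    return "send_message", (f"{goal}: {history[-1]}",)

goal = "remind me about my next meeting"
history: list[str] = []
for _ in range(2):                           # bounded here; agents keep going
    tool, args = next_action(goal, history)
    history.append(PERMITTED_TOOLS[tool](*args))

print(history)
```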

What gets amplified here is not any single feature. It is the ability to let the system keep moving forward by itself.

Three Laws, One System

The problem is that the Bitter Lesson tells us scale wins; the Jevons paradox tells us efficiency drives greater total consumption; and Goodhart's law tells us metrics eventually get exploited.

None of those ideas is new on its own. But once they operate inside the same AI system, they produce a new reality:

  • The more capable the system is, the more widely it gets deployed.
  • The wider the deployment, the more metrics are needed to manage it.
  • The more the system depends on metrics, the easier it becomes to drift away from real goals.
  • And that drift is then amplified by scale and automation.

That is why the surge in token usage in China should not be read only as a sign that the industry is hot. It is more like a diagnostic report showing that AI is starting to be connected and consumed across society the way electricity once was. autoresearch reminds us that it is not only humans consuming that electricity-like resource. Systems are consuming it too. Agent platforms like OpenClaw give this trend concrete interfaces and daily entry points. This is no longer just a research direction inside labs. It is entering ordinary workflows and everyday life.

Are Humans Still Inside the System?

Many people will find this exciting first, and for good reason. It does mean productivity can keep rising. Things that used to be impractical may suddenly become possible.

But the harder question is probably not how much it can do for us. It is what role is left for us once it gets better and better at doing things by itself.

If a system can research on its own, invoke tools on its own, optimize metrics on its own, and keep consuming compute on its own, humans are at risk of collapsing into two roles:

  • the person who presses the button and starts the process
  • the person who gets pulled back in only when something goes wrong and responsibility needs a name

The first gives up understanding. The second no longer has control. The faster the system runs, the wider that gap becomes.

An Uncomfortable Conclusion

This may be the most revealing part of the moment.

For years, we assumed the central question of the AI age would be whether machines could think like humans. But the question now in front of us may be different:

When machines stop looking like tools and start looking like continuously running social systems, do humans still have enough time, capacity, and institutional position to understand them, constrain them, and take real responsibility for them?

If the answer is no, then what we get is not just stronger AI. We get a new dependency. It spreads like electricity, expands like a market, chases metrics like an organization, and eventually shapes us in return like an institution.

At that point, the biggest danger may not be an AI system going out of control. It may be that humans have become comfortable outsourcing judgment, responsibility, and purpose, piece by piece, to a system that has become increasingly good at running itself.

Notes

The ideas in this post are mine. Codex helped me shape them into an article.

If you'd like to follow what I'm learning about AI tools and workflows, you can subscribe here → Subscribe to my notes