AI has progressed rapidly in the past few years.
The primary cause of recent AI progress is scaling. AI is made by using compute to execute algorithms that find a model that fits data. The scaling laws show that AI performance improves predictably with more compute, more data, or bigger models. In other words, you can improve AI basically just by spending more money. This is significant because it suggests AI will continue to progress rapidly.
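The shape of a scaling law can be sketched in a few lines. The coefficients below are made up purely for illustration (published fits use carefully measured constants and units); the point is only the qualitative behavior: loss falls smoothly as a power law in compute.

```python
def predicted_loss(compute_flops, reference_flops=1e30, alpha=0.05):
    # Illustrative power-law scaling curve. reference_flops and alpha are
    # HYPOTHETICAL constants chosen for this sketch, not measured values.
    # More compute -> smaller predicted loss, with diminishing returns.
    return (reference_flops / compute_flops) ** alpha

for flops in [1e19, 1e21, 1e23, 1e25]:
    print(f"{flops:.0e} FLOPs -> predicted loss {predicted_loss(flops):.2f}")
```

The curve never hits zero and each additional order of magnitude of compute buys a smaller absolute improvement, which is why the trend is described as predictable rather than explosive.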
Since the discovery of the scaling laws, investment in large, general-purpose AI systems has increased drastically. OpenAI’s GPT-1 used around 10^19 FLOPs (a measurement of compute). GPT-2 used around 10^21 FLOPs. GPT-3 used around 10^23 FLOPs. GPT-4 is estimated to have used 10^25 FLOPs. In April, a leaked Anthropic pitch deck said they plan to develop a competing system that is 10 times more capable than GPT-4.
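The FLOP estimates above imply a remarkably steady trend: each generation used roughly 100 times the compute of the one before. A quick calculation makes the growth rate explicit:

```python
# Compute estimates from the text above (orders of magnitude, not exact counts).
flops = {"GPT-1": 1e19, "GPT-2": 1e21, "GPT-3": 1e23, "GPT-4": 1e25}

models = list(flops)
for prev, nxt in zip(models, models[1:]):
    factor = flops[nxt] / flops[prev]
    print(f"{prev} -> {nxt}: {factor:,.0f}x more compute")
```

Compounded across four generations, that is a millionfold increase in training compute from GPT-1 to GPT-4.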
What will upcoming models be able to do? Anthropic claimed “These models could begin to automate large portions of the economy. We believe that companies that train the best 2025/26 models will be too far ahead for anyone to catch up in subsequent cycles.”
Companies are not just scaling AI systems to make them smarter. They are also investing in making AI more agentic, more autonomous, and better at long-term memory and planning, so systems can take significant actions in the real world.
OpenAI has been reported to be raising money to develop “artificial general intelligence that is advanced enough to improve its own capabilities.” Just days after OpenAI’s head of alignment warned about the scramble to develop risky planning and action-taking abilities, OpenAI announced it had connected GPT-4 to a massive range of tools, including Slack and Zapier.
Anthropic announced it was expanding its chatbot Claude’s context window from 9K to 100K tokens, corresponding to around 75,000 words. Per their website, “this means businesses can now submit hundreds of pages of materials [as well as an entire book] for Claude to digest and analyze, and conversations with Claude can go on for hours or even days.”
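The tokens-to-words figure comes from a common rule of thumb: one token is roughly three-quarters of an English word. That ratio is an approximation and varies with the text, but it reproduces the numbers above:

```python
# Rough heuristic: ~0.75 English words per token (an assumption; the true
# ratio depends on the tokenizer and the text being encoded).
WORDS_PER_TOKEN = 0.75

def tokens_to_words(num_tokens):
    return int(num_tokens * WORDS_PER_TOKEN)

print(tokens_to_words(100_000))  # the 100K-token window described above
```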
DeepMind has also worked on generalist agents. Their research on Gato showed a single AI can “play Atari, caption images, chat, stack blocks with a real robot arm and much more, deciding based on its context whether to output text, joint torques, button presses, or other tokens.” In fact, it could outperform humans on hundreds of tasks, even though it was never specifically trained to do them.
AutoGPT – a tool that enables GPT to interact with itself so it can take real-world actions without humans in the loop – has become the most popular repository on GitHub. There are countless other examples of AIs beginning to improve themselves. If AIs begin self-improving more, we could see capabilities explode at an accelerating rate.