Elon Musk’s xAI Colossus Supercomputer Is on Its Way to 1 Million GPUs
(Credit: Nvidia)
Phase One of xAI’s supercomputer is operating as planned. Elon Musk’s massive AI training system in Memphis, Tenn., has transitioned to the city’s main power grid, according to Tom’s Hardware. The grid gives xAI access to the 150MW it needs to power a center that operates 200,000 Nvidia GPUs. The company plans to reach 1 million GPUs as it scales up to compete with rivals like Oracle Cloud Infrastructure (OCI), which has at least 131,072 GPUs.
One of the most intriguing aspects of xAI’s ambitious project is the speed with which it’s moving. The company came onto the AI scene in the summer of 2023, when Musk announced xAI on X, writing that its purpose was “to understand reality.” Fast forward a year, and construction of Colossus, the company’s AI supercluster, was well underway. Colossus started with 100,000 Nvidia H100 Hopper AI accelerators, a stunning number of devices in its own right, and that count doubled to 200,000 in February 2025.
According to xAI, Colossus was built in 122 days, a shockingly fast construction period, and the company then doubled the cluster to 200,000 GPUs in another 92 days. During this time, xAI faced a monumental challenge: supplying power to the massive number of GPUs in its cluster. Unfortunately, the power grid provided just 7MW to Colossus at launch. To bridge the gap while awaiting a better connection to the Memphis grid, the company relied on natural gas generators, which, as Tom’s Hardware notes, drew complaints from some residents.
Now, the situation is different. Memphis supplies 150MW of grid power, which has allowed Colossus to reduce the number of generators it uses. On top of that, Colossus reportedly has 150MW of Tesla battery storage to provide power when needed. That’s good news for Phase One, but Phase Two will require more. Another substation should bring Colossus to 300MW, possibly as early as the fall of this year, and the supercomputer’s power needs will continue to rise as more GPUs are added to the existing setup.
xAI offers Colossus for training large language models (LLMs) and uses it to train its own Grok, an AI tool available on Musk’s X social network. The company recently released Grok 3 beta, which lets users see both the answer it provides and the reasoning it used to arrive at that answer.