🚨 Mark Zuckerberg says Meta is building a 5GW AI data center
Meta plans a 5GW AI mega-center, xAI plans Grok micro-agents, Cognition bags Windsurf, the Pentagon hands contracts to AI cos, Karpathy pushes a deep multi-metric RL rethink.
Read time: 11 min
📚 Browse past editions here.
(I publish this newsletter daily. Noise-free, actionable, applied-AI developments only.)
⚡ In today's Edition (14-July-2025):
🚨 Mark Zuckerberg says Meta is building a 5GW AI data center
💡 xAI will spin Grok into hundreds of task-focused agents that talk to each other.
🛠️ Cognition AI is taking the remaining Windsurf team and tech, days after Google bought its founders for $2.4B
🗞️ Byte-Size Briefs:
Pentagon picked Google, OpenAI, xAI, and Anthropic for new defense deals. Each agreement carries a spending limit of $200 million.
🧑‍🎓 Deep Dive: Reinforcement Learning Needs More Than 1 Number, says Andrej Karpathy
🚨 Mark Zuckerberg says Meta is building a 5GW AI data center
Mark Zuckerberg announced that Meta will spend hundreds of billions of dollars building AI data centers that each pull gigawatt-scale power, chasing models that out-think humans. He had earlier stated that Meta will invest US$65bn in AI in 2025 as part of this push.
He confirmed that Meta is building multiple gigawatt-scale clusters. The 1,000MW Prometheus will come online in 2026, and Hyperion will be built in phases, scaling to 5,000MW+. For context, the largest H100/H200 clusters operational today are only 150-200MW.
Prometheus, the planned AI super-compute campus, goes live in 2026, and Hyperion (the bolder sequel) later ramps to 5 GW, all paid for with Meta's own capital. Zuckerberg said Hyperion's footprint will be large enough to cover most of Manhattan. When Prometheus comes online, Meta will be one of the first tech companies to control an AI data center of this size.
Meta folded every AI project into Superintelligence Labs after Llama 4 stalled. Bigger models need far more compute, so the plan pivots from “add servers” to “build mini-power plants”.
A single 1 GW cluster can power hundreds of thousands of cutting-edge GPUs, matching the electricity draw of a mid-size nuclear unit.
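For a rough sense of scale, here's a back-of-envelope sketch. The per-GPU wattage, host overhead, and PUE figures below are my own illustrative assumptions, not Meta's numbers:

```python
# Back-of-envelope: how many GPUs fit in a given power budget?
# All constants below are illustrative assumptions, not Meta's figures.

GPU_TDP_W = 700           # H100 SXM board power (approx.)
HOST_OVERHEAD_W = 300     # CPU, RAM, NIC, storage per GPU (assumed)
PUE = 1.3                 # facility overhead: cooling, power conversion (assumed)

def gpus_for_budget(power_budget_w: float) -> int:
    """GPUs a facility can power, given all-in per-GPU draw and PUE."""
    all_in_per_gpu = (GPU_TDP_W + HOST_OVERHEAD_W) * PUE
    return int(power_budget_w / all_in_per_gpu)

for name, budget_mw in [("largest cluster today", 200),
                        ("Prometheus", 1_000),
                        ("Hyperion (full build-out)", 5_000)]:
    print(f"{name}: ~{gpus_for_budget(budget_mw * 1e6):,} GPUs")
```

Even with generous overhead assumptions, a 1 GW budget lands well into six-figure GPU counts.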
Designing that much power and liquid cooling under one roof means custom grid ties, on-site substations, and new waste-heat recovery tricks, not just racks and cables.
Capital spending for 2025 already sits at $64-72 B and rises from there.
Mark Zuckerberg argues the ad business throws off enough cash to fund the gamble. Leadership poached top builders like Alexandr Wang and Nat Friedman, betting their know-how plus vast compute equals a lead over OpenAI and Google.
If Prometheus hits schedule, Meta's network latency drops, training cycles shrink, and fresh revenue streams (Meta AI chat, video ads that write themselves, smarter glasses) arrive faster. Investors still wonder when those billions loop back, especially with electricity prices and chip supply both volatile.
A single gigawatt cluster marks a shift: compute is now limited by grid wires, not silicon.
And last week I discussed why Meta is getting super serious: ChatGPT Is Turning Into The Busiest Hangout Online, And That Scares Meta.
💡 xAI will spin Grok into hundreds of task-focused agents that talk to each other.
Elon Musk said that they are “creating a multi-agent AI software company xAI, where Grok will spawn hundreds of specialized coding and image/video generation/understanding agents all working together and then emulates humans interacting with the software in virtual machines until the result is excellent.”
With $2 B in fresh SpaceX backing and a valuation reported at $113 B, xAI now has the cash to chase OpenAI and Google on the agent frontier.
The goal is an AI agent architecture that can simulate real humans in ways far more complex than traditional approaches allow. Stanford HAI recently showed that agent swarms can recreate answers from 1,000 real survey participants, hinting at realistic human-in-the-loop testing in silico.
Agent teams split long coding tasks: one drafts functions, another writes tests, a third trims dependencies. Images and video follow the same patternβspecialist agents handle layout, color grading, and captioning, then merge outputs.
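xAI hasn't published any architecture details, but that division of labor reads like a simple role pipeline. A minimal sketch, where `call_model` is a stub standing in for whatever LLM API you'd actually use, and the roles and prompts are my own illustrations:

```python
# Minimal sketch of a role-split coding pipeline. `call_model` is a
# placeholder for a real LLM call; roles and prompts are illustrative.

def call_model(role: str, prompt: str) -> str:
    """Stub. Swap in a real LLM client (OpenAI, xAI, etc.)."""
    return f"[{role} output for: {prompt[:40]}...]"

def drafter(task: str) -> str:
    return call_model("drafter", f"Write functions that: {task}")

def tester(code: str) -> str:
    return call_model("tester", f"Write unit tests for:\n{code}")

def trimmer(code: str, tests: str) -> str:
    return call_model("trimmer",
                      f"Remove unused dependencies, keep tests green:\n{code}\n{tests}")

def run_pipeline(task: str) -> str:
    code = drafter(task)          # one agent drafts functions
    tests = tester(code)          # another writes tests against the draft
    return trimmer(code, tests)   # a third trims dependencies and merges

print(run_pipeline("parse a CSV of GPU clusters and sum their megawatts"))
```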
And Musk wants the agents to run whole apps inside headless virtual machines, clicking buttons and writing files the same way a person would.
The timing is no accident. OpenAI, Google, and Anthropic all previewed agent toolkits this quarter, and Cognition's Devin pitches full-stack AI workers.
Goldman Sachs is piloting Cognition's Devin, an autonomous coding agent, and expects to deploy hundreds or even thousands of these bots alongside its 10,000 human software developers, letting the machines write, test, and ship routine code inside secure sandboxes.
We are on a fast route to a working AI “factory floor”.
Common technical hurdles that need to be overcome
Latency grows when 100+ agents wait on each other.
GPU memory balloons, even with parameter-sharing tricks.
Coordination failures can snowball; an agent might pass bad context that derails the group (a minimal guard against this is sketched below). arXiv surveys list environment design, memory, and evaluation as open pain points.
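Here is that sketch: a guard that keeps one bad handoff from poisoning the rest of the group. The `Message` schema and validation rules are illustrative assumptions, not anyone's shipped design:

```python
# Sketch: guard inter-agent messages so one bad handoff can't derail the
# group. The message schema and validation rules are illustrative.

from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    content: str

def validate(msg: Message) -> bool:
    """Reject obviously bad context before it propagates."""
    if not msg.content.strip():
        return False                       # empty handoff
    if len(msg.content) > 8_000:
        return False                       # runaway context growth
    if "TODO" in msg.content:
        return False                       # unfinished work passed along
    return True

def relay(msg: Message, downstream) -> str:
    if not validate(msg):
        # Bounce back to sender instead of poisoning the next agent.
        return f"rejected message from {msg.sender}; requesting retry"
    return downstream(msg.content)

print(relay(Message("drafter", "def add(a, b): return a + b"),
            downstream=lambda ctx: f"tester received {len(ctx)} chars"))
```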
Grok also adds a ridiculous anime companion with an “NSFW” mode
SuperGrok just added animated AI companions Ani and Rudy; Ani even hides an optional NSFW outfit. The avatars live inside Grok's voice chat and, for now, users flip a setting to meet them, though xAI promises a simpler switch soon. You need to update your app to try out Grok companions!
These companions are not mere stickers. Grok streams its usual large-language-model replies through a real-time animation pipeline, so Ani tilts her head or flashes a shy wink while the text comes back. The same speech-to-text loop that already powers Grok's voice mode now layers 3D rigging, lip-sync, and emotion tags on top, then pushes a compact animation stream to the app.
That keeps bandwidth low while the GPUs crunch the language response in xAI's cloud. A similar tag drives the NSFW flag, swapping Ani's default costume for lingerie once the user's relationship level or toggle allows it; Rudy's “Bad Rudy” mood works the same way.
Security engineers slipped the avatars behind the same guardrails that filter Grok's text. xAI lets adventurous users unlock more daring modes because the images themselves stay client-side, limiting server liability.
A third avatar, Chad, sits in the build and hints at constant content drops that mimic game seasons. At $30 per month for SuperGrok, xAI is betting that playful skins plus voice chat will keep subscribers around longer than plain text.
How to enable Companions on SuperGrok?
01. Click the two bars on the top left.
02. Select the gear icon at the bottom right.
03. Enable the Companions slider.
🛠️ Cognition AI is taking the remaining Windsurf team and tech, days after Google bought its founders for $2.4B
Cognition AI is taking Windsurf's code, brand, and $82M revenue days after Google bought its founders for $2.4B, slotting the prize under a $4B valuation.
No acquisition amount was disclosed publicly, nor were specific terms of the deal (both are private startups). The acquisition gives Cognition access to Windsurf's core product, brand, and remaining team, but not its original CEO or co-founders, several of whom have now joined Google.
So basically, Google took the captains, Cognition got the ship. Google sidestepped a full purchase by licensing the tech and hiring the chiefs, a play that avoids antitrust noise yet strips the startup of leadership.
Google gains Windsurf's trained weights plus the brains that built them. In code generation, those weights decide how well an LLM predicts the next line of source. Cursor, Anthropic, and others already fight for this space, so moving them means Windsurf's best ideas can now fuel Google's stack, crushing any advantage the stand-alone remnant holds.
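To make “predicting the next line of source” concrete, here's a small sketch that scores a model's next-token loss on a code snippet. It uses GPT-2 purely as a stand-in, since Windsurf's actual weights are not public:

```python
# Sketch: measuring how well a model's weights predict source code.
# GPT-2 is a stand-in here; Windsurf's real models are not public.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

code = ("def fibonacci(n):\n"
        "    if n < 2:\n"
        "        return n\n"
        "    return fibonacci(n - 1) + fibonacci(n - 2)\n")
ids = tok(code, return_tensors="pt").input_ids

with torch.no_grad():
    # Mean cross-entropy over next-token predictions; lower is better.
    loss = model(ids, labels=ids).loss

print(f"per-token loss: {loss.item():.3f}  perplexity: {torch.exp(loss).item():.1f}")
```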
Cognition grabs the rest, promises instant vesting for every engineer, and will feed Windsurf's data into Devin, its automated coder, hoping the extra examples cut hallucinations and widen language support.
Wang also stated in the announcement video: “And of course, we're friends with Anthropic again,” a reference to Windsurf's prior falling-out with Anthropic, which resulted in Anthropic's Claude models being pulled from the list of options that developers could rely on to power their Windsurf AI coding agents and processes.
Cognition's future strategic direction
The Cognition-Windsurf combo now goes head-to-head with GitHub Copilot, Replit, Cursor, Google's Gemini, and Visual Studio Code's agent mode. Earlier, Devin drew attention by fixing GitHub tickets and finishing coding jobs on its own. Blending that skill with Windsurf's Previews, Reviews, and Enterprise flows could trim silos and crank up automation beyond what rivals manage.
🗞️ Byte-Size Briefs
Pentagon picked Google, OpenAI, xAI, and Anthropic for new defense deals. Each agreement carries a spending limit of $200 million.
On this, Anthropic said in its official announcement that engineers will fine-tune Claude on classified data, plug it into secure chat and data-fusion dashboards, and stress-test it against spoofing or hallucination tricks. Anthropic already runs Claude Gov inside Palantir's classified networks and at Lawrence Livermore Lab, where researchers parse nuclear-materials data faster than manual review. Those wins gave DoD confidence that the model can handle real-world security stakes.
xAI also said Grok for Government makes xAI's frontier models available to federal customers; it holds a new DoD contract and sits on the GSA schedule, so any federal agency can buy.
So the Pentagon is hedging its bets. Google, OpenAI, and Musk's xAI each won a similar $200M ceiling, keeping competition tight and reducing vendor lock-in.
🧑‍🎓 Deep Dive: Reinforcement Learning Needs More Than 1 Number, says Andrej Karpathy
Andrej Karpathy writes a long post on Twitter: he thinks reinforcement learning (RL) still brings solid near-term wins, yet he sees a ceiling because today's setups squeeze 1 reward number out of a whole task.
What's this “1 reward number out of a whole task”?
Reinforcement learning, or RL, treats an AI agent like a gamer who only sees the final score once the game ends. The agent tries a bunch of moves, waits, and then receives 1 number that says “good” or “bad”. For short arcade-style tasks that single score works fine. For long jobs that stretch over many steps, it hides most of the useful details.
Andrej Karpathy points out that this single-score habit still brings quick wins right now, yet it creates a ceiling. Long tasks such as writing code, answering multi-part questions, or planning a robot routine need richer feedback. If the agent only hears 1 number after all those steps, it struggles to understand which decision helped and which hurt.
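In code, the setup he's criticizing looks something like this toy sketch: many decisions, one terminal score, and REINFORCE-style updates that hand every action the same credit. The environment and policy here are stand-ins I made up to show the bottleneck:

```python
# Toy sketch of the "1 number per task" RL setup. The environment and
# policy are illustrative; the point is the single terminal reward.

import random

def make_policy(p):
    return lambda: 1 if random.random() < p else 0

def run_episode(policy, n_steps=20):
    actions = [policy() for _ in range(n_steps)]          # many decisions...
    reward = 1.0 if sum(actions) > n_steps / 2 else 0.0   # ...one score at the end
    return actions, reward

p = 0.5
for _ in range(200):
    actions, reward = run_episode(make_policy(p))
    # REINFORCE-style update: every action in the episode gets the SAME
    # credit, whether it helped or hurt -- the information bottleneck.
    for a in actions:
        p += 0.001 * (reward - 0.5) * (1 if a else -1)
    p = min(max(p, 0.01), 0.99)

print(f"learned action probability: {p:.2f}")
```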
Karpathy's fix is simple. After the agent finishes its run, let it read its own work, write down what worked, note the mistakes, and store those lessons for next time. People do this when we study: we spot the error, jot a short reminder like “count letters one by one”, and carry that note into the next attempt.
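A minimal sketch of that review loop, with `call_model` standing in for a real LLM API and the prompts being my own illustrative wording:

```python
# Sketch of the proposed review loop: attempt, self-review, store the
# lesson, and prepend it to the next attempt. `call_model` is a stub
# for a real LLM call; prompts and the lesson store are illustrative.

lessons: list[str] = []   # persistent notes carried across attempts

def call_model(prompt: str) -> str:
    return f"[model output for: {prompt[:50]}...]"

def attempt(task: str) -> str:
    notes = "\n".join(f"- {l}" for l in lessons)
    return call_model(f"Lessons from past attempts:\n{notes}\n\nTask: {task}")

def review(task: str, output: str) -> str:
    return call_model(f"Task: {task}\nOutput: {output}\n"
                      "What worked, what failed? Reply with one short lesson.")

for _ in range(3):
    out = attempt("count the letter r in 'strawberry'")
    lessons.append(review("count letters", out))   # e.g. "list letters one by one"

print(lessons)
```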
Real research groups already test this idea. One project called Satori adds a review loop that lets a 7B-parameter model reflect on each reasoning path before updating its policy, and that smaller model now tops math benchmarks that used to need larger systems.
📝 Lessons, review, reflect
Karpathy argues that people learn by pulling many insights out of each attempt, writing them down, then distilling them into intuition later. Research is moving that way. Satori trains a 7B model with a Chain-of-Action-Thought loop that includes self-reflection before updating policy weights, and the authors report state-of-the-art math scores after reinforcement fine-tuning (arXiv). Nvidia's Eureka uses GPT-4 to propose and refine reward code through iterative critique, beating expert-designed rewards on 83% of 29 robotics tasks (eureka-research.github.io). These projects show that extra textual feedback, not just the scalar, makes RL far more data-efficient.
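The shape of the Eureka loop, reduced to stubs. The real system (eureka-research.github.io) uses GPT-4 and full RL training runs to score each candidate reward function; everything below is illustrative:

```python
# Shape sketch of an Eureka-style loop: an LLM writes reward code,
# training results come back, and the LLM revises. All functions here
# are stand-ins, not the real Eureka pipeline.

def llm_write_reward(task: str, feedback: str) -> str:
    """Stub for 'ask an LLM to draft reward code given past feedback'."""
    return ("def reward(state):\n"
            "    return -abs(state['target'] - state['pos'])  # candidate\n")

def train_and_score(reward_code: str) -> float:
    """Stub for an RL training run that returns a task success rate."""
    return 0.7

best, best_score, feedback = None, -1.0, "none yet"
for _ in range(3):
    candidate = llm_write_reward("make the robot reach the target", feedback)
    score = train_and_score(candidate)
    if score > best_score:
        best, best_score = candidate, score
    feedback = f"last candidate scored {score:.2f}; improve shaping terms"

print(best_score, best, sep="\n")
```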
💾 Memory as a quick patch
Karpathy mentions how Claude fixes its letter-count bug by adding a prompt line telling itself to list letters one by one. That pattern mirrors OpenAI's memory rollout: ChatGPT now stores user-specific facts and injects them into new chats.
Claude's internal prompt literally instructs it to separate letters by commas before counting, solving the famous “how many R's in strawberry” failure.
Even mainstream products follow the pattern: ChatGPT's memory feature stores user facts or special instructions so the model stops repeating old mistakes.
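Stripped to its mechanics, prompt-injected memory is just retrieve-then-prepend. A minimal sketch, with a plain dict standing in for the embedding-based retrieval real products use:

```python
# Minimal sketch of prompt-injected memory. Real products rank stored
# facts by relevance (often with embeddings); a dict stands in here.

memory: dict[str, str] = {}

def remember(key: str, fact: str) -> None:
    memory[key] = fact

def build_prompt(user_msg: str) -> str:
    facts = "\n".join(f"- {f}" for f in memory.values())
    return f"Known facts about this user:\n{facts}\n\nUser: {user_msg}"

remember("counting", "When counting letters, list them one by one first.")
print(build_prompt("How many r's are in 'strawberry'?"))
```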
So start with the classic RL loop (try actions, get 1 score) but bolt on a written self-review that captures many small signals. Feed those notes back in, and the agent learns faster and pushes past the ceiling that comes from relying on a single number.
🔬 What other research says
RL with AI feedback scales better than human labeling and keeps parity on helpfulness and harmlessness metrics (arXiv).
Chain-of-thought verification combined with RL improves alignment scores in Anthropicβs tests (arXiv).
Autoregressive search plus reflection, as in Satori, outperforms pure supervised fine-tuning on math benchmarks (arXiv).
All of these studies suggest that richer intermediate signals (self-critique, memory strings, code rewards) push models past the plateau Karpathy worries about.
🤔 My take
RL is not going away, but it starts to shine only when paired with explicit feedback channels that tell the model what went well, what failed, and how to change next time. Storing those bits in a reward model, a memory bank, or a prompt “lesson” shrinks the sample complexity and closes obvious gaps like counting letters or following multi-step instructions. Future gains will likely come from stitching these pieces together: an RL core, a fast verifier that drops detailed comments, and a memory module that remembers the comments so the system stops repeating old mistakes.
Thatβs a wrap for today, see you all tomorrow.