
Why Meta is spending tens of billions on ad AI
Meta Engineering blog, 2024-03. Meta announced two clusters of 24,576 GPUs each for training GenAI and recommendation models. They are the hardware foundation under later announcements like Llama 3, Andromeda, and GEM.
For advertisers, it's the backstory to "why Meta keeps pushing Advantage+ and GEM."
Source: Meta Engineering — Building Meta's GenAI Infrastructure
What a 24K-GPU cluster actually is
Numbers Meta disclosed:
- GPU count: 24,576 per cluster × 2 (NVIDIA H100)
- Network: one cluster runs on RoCE (RDMA over Converged Ethernet), the other on InfiniBand, so the two fabrics are operated side by side
- Purpose: Llama 3 training + recommendation and ranking model training
A 24K-GPU cluster is massive. It is roughly the scale needed to train a single frontier-size AI model. By listing recommendation and ranking next to Llama 3, Meta is signaling that this class of capacity also goes into training ad-ranking models (GEM and others).
Why this matters to advertisers
1. "Why did Advantage+ get better this year?" — here's the backstory
Your settings didn't change, yet Advantage+ performance keeps improving. That is because central models like GEM and Andromeda keep getting retrained, and every downstream ad product improves automatically. Clusters at the 24K-GPU scale make that possible.
2. AI creative generation goes mainstream
As Meta trains and open-sources Llama 3 and later Llama releases, features like Advantage+ Creative get smarter. Advertisers can generate a wider range of creative with less manual work.
3. Faster algorithm change cadence
More GPU capacity means shorter training cycles. Algorithm adjustments that used to land roughly monthly can become weekly and, as autonomous agent-based automation comes online, possibly daily.
Advertisers need an ops rhythm that keeps up with that speed.
Meta's infrastructure investment scale (for context)
- 2024 capital expenditure (CapEx): $35–40B
- A large portion goes to GPUs and data centers
- Among the largest in the industry, in the same tier as Google, Microsoft, and Amazon
Why does Meta invest at this level? Ads are roughly 98% of total revenue, so ad quality is a matter of company survival. Using AI to push up ad accuracy is the surest return on that spend.
So what do we do?
There is no direct lever for you to pull, but you can position yourself to benefit from the trend:
- Raise your Advantage+ share — the direct path to central-model benefits
- Keep event quality high — good data makes good models (garbage in, garbage out)
- Feed the creative pipeline — supply varied signals so AI can learn fast
Emotional reactions to avoid:
- "AI is getting so strong that ad costs will spike" — actually, higher relevance drives CPA down
- "AI will replace my job" — strategy and verification stay with humans. Only execution and analysis automate
- "The algorithm changes too often to keep up" — judge on weekly averages and it's fine
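To make "judge on weekly averages" concrete, here is a minimal sketch in Python. It assumes you can export daily spend and conversion counts from your reporting tool. The function name `weekly_cpa` and the sample figures are hypothetical and are not part of any Meta API. It groups days by ISO week and divides total spend by total conversions, so one noisy day does not dominate the number.

```python
from collections import defaultdict
from datetime import date


def weekly_cpa(daily_rows):
    """Aggregate (day, spend, conversions) rows into CPA per ISO week.

    Returns a dict keyed by (iso_year, iso_week). A week with zero
    conversions maps to None instead of raising a division error.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [spend, conversions]
    for day, spend, conversions in daily_rows:
        iso = day.isocalendar()
        key = (iso[0], iso[1])  # (ISO year, ISO week number)
        totals[key][0] += spend
        totals[key][1] += conversions

    result = {}
    for key in sorted(totals):
        spend, conversions = totals[key]
        result[key] = spend / conversions if conversions else None
    return result


# Hypothetical sample data: three days in one week, one day in the next.
sample = [
    (date(2024, 3, 4), 100.0, 4),
    (date(2024, 3, 5), 120.0, 3),
    (date(2024, 3, 6), 80.0, 5),
    (date(2024, 3, 11), 200.0, 10),
]

for week, cpa in weekly_cpa(sample).items():
    print(week, cpa)
```

In the sample, daily CPA swings between 16 and 40 within the first week, while the weekly figure is a steadier 25. Dividing weekly totals, rather than averaging the daily CPAs, weights each day by how much it actually spent.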
Outlook over the next 2–3 years
Meta is going to keep adding GPUs. Official roadmap:
- End of 2024: ~350,000 NVIDIA H100 GPUs
- 2025–26: reaching ~1 million GPUs (one of the industry leaders)
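To put these roadmap figures in proportion with the two 24,576-GPU clusters described earlier, the arithmetic can be sketched in a few lines of Python. The constants come from the numbers quoted in this article, and the helper name `cluster_share` is only illustrative.

```python
# Figures quoted in this article (not independently verified here).
GPUS_PER_CLUSTER = 24_576
CLUSTERS = 2
YEAR_END_2024_TARGET = 350_000


def cluster_share(gpus_per_cluster, clusters, fleet_target):
    """Return (total GPUs in the clusters, their fraction of the fleet target)."""
    total = gpus_per_cluster * clusters
    return total, total / fleet_target


total, share = cluster_share(GPUS_PER_CLUSTER, CLUSTERS, YEAR_END_2024_TARGET)
print(f"{total:,} GPUs, about {share:.1%} of the end-of-2024 target")
```

The two announced clusters add up to 49,152 GPUs, or about 14% of the end-of-2024 figure, so most of the planned capacity sits outside them.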
How far will AI-model quality go at that scale? Expect CPA improvements, creative automation, and ops automation to keep expanding across the board.
Advantage+ usage, metric interpretation, and operations in the AI era are covered in Meta Ads Book 4.