Good question. Here's the full story.
Five acts. The model, the infrastructure, and the content it powers.
Act 2
The Supply Chain
Before the model could answer, an entire world had to be built.
Where it begins
Every advanced AI chip in the world depends on a single machine, made by a single company, in a single country.
ASML in the Netherlands makes every extreme ultraviolet lithography machine on earth. There is no alternative supplier. There is no backup plan. This is where the supply chain story begins.
Source: ASML Annual Report 2024
By the numbers
to serve GPT-5 · Anuj Saharan, OpenAI, 2025
ASML — no alternative exists worldwide
mapped across 8 layers, 7 countries
Supply Chain — 8 layers
What happens to AI if…
That is what it takes to power the model. Now here is what it costs to keep it running.
Act 3
The Footprint
A typical AI chat response uses roughly 10× the electricity of a Google search.
That does not sound like much. But multiply it by billions of queries processed every day, and AI data centres are on track to consume more electricity than many countries. Here is what keeps the model running.
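The arithmetic behind that claim can be sketched with illustrative per-query figures; the 0.3 Wh Google-search estimate and the ~2B daily queries are commonly cited approximations, not measured values:

```python
# Back-of-envelope scale-up of the "roughly 10x a Google search" claim.
# Per-query figures are illustrative assumptions, not measurements.
GOOGLE_SEARCH_WH = 0.3                 # commonly cited estimate, Wh per search
AI_QUERY_WH = GOOGLE_SEARCH_WH * 10    # the "roughly 10x" claim
QUERIES_PER_DAY = 2_000_000_000        # ~2B daily queries, order of magnitude

daily_gwh = AI_QUERY_WH * QUERIES_PER_DAY / 1e9   # Wh -> GWh
annual_twh = daily_gwh * 365 / 1000               # GWh -> TWh

print(f"{daily_gwh:.0f} GWh/day, {annual_twh:.2f} TWh/year")
```

Even under these conservative inputs, chat queries alone land in the terawatt-hour range per year, before counting training runs or non-chat workloads.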
Global distribution
Data Centers
Click a segment to explore · Source: Cloudscene / Statista, March 2024
Energy · Live counter
AI data centers have consumed
since you opened this page · at 14,576 kWh/sec globally
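The counter above is simple arithmetic: elapsed seconds multiplied by the global consumption rate derived in the methodology section. A minimal sketch:

```python
import time

RATE_KWH_PER_SEC = 14_576  # global AI data-center draw (see methodology)

def consumed_since(page_opened: float, now: float = None) -> float:
    """kWh consumed globally since the page was opened."""
    if now is None:
        now = time.time()
    return (now - page_opened) * RATE_KWH_PER_SEC

# e.g. ten seconds on the page:
print(consumed_since(0.0, 10.0))  # 145760.0 kWh
```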
2023
AI data center consumption that year
2030 projected
Equivalent to Japan's entire national grid
Source: IEA Energy and AI Report 2025
Water consumption
AI is thirsty.
Data centers use vast amounts of water to cool their servers. Every time you send a message to a large language model, water evaporates somewhere in the world.
per 1,000 queries
A standard water bottle is 500 ml.
A GPT-4 class model needs roughly 700 ml of water to handle 1,000 prompts — more than a full bottle. The cooling systems in data centers convert that water to vapor to dissipate heat.
queries handled daily by ChatGPT-scale systems
bottles of water evaporated per day globally
exhaust temperature from server racks
Source: University of California Riverside, 2023
Carbon emissions
One query. Tiny.
Multiplied by billions. Enormous.
A single AI query emits roughly 0.3g of CO₂, a fraction of what sending a typical email does. But AI runs at planetary scale, and the training runs behind each model dwarf everything.
CO₂ per activity (grams)
🔥 The training problem
Estimated carbon cost of training GPT-4 — ranging from 1,000 tonnes on a clean energy grid to 15,000 tonnes on a fossil-fuel grid. The location of the data center changes everything. Source: Ludvigsen et al., 2023
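That range can be reproduced from two assumptions: a ~50 GWh training-energy estimate (a third-party figure; OpenAI has not disclosed the real number) and grid carbon intensities at the clean and fossil-heavy ends:

```python
# Reproduces the quoted 1,000-15,000 tonne range from two assumptions:
# a ~50 GWh training-energy estimate (third-party, not disclosed by
# OpenAI) and grid carbon intensity at the clean and dirty extremes.
TRAINING_ENERGY_KWH = 50_000_000      # ~50 GWh, assumed estimate
CLEAN_GRID_G_PER_KWH = 20             # hydro/nuclear-heavy grid
FOSSIL_GRID_G_PER_KWH = 300           # coal-heavy grid

def tonnes_co2(energy_kwh: int, intensity_g_per_kwh: int) -> float:
    return energy_kwh * intensity_g_per_kwh / 1e6  # grams -> tonnes

low = tonnes_co2(TRAINING_ENERGY_KWH, CLEAN_GRID_G_PER_KWH)    # 1,000 t
high = tonnes_co2(TRAINING_ENERGY_KWH, FOSSIL_GRID_G_PER_KWH)  # 15,000 t
print(low, high)
```

The 15× spread comes entirely from the intensity term, which is why siting the data center matters more than anything else in the calculation.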
🌍 Industry total · 2024
Total AI industry emissions in 2024 — roughly equivalent to the entire annual carbon footprint of a country like Denmark. And growing faster.
Source: Ludvigsen et al., 2023; Goldman Sachs Research, 2024
All of this infrastructure exists for one reason: to run a piece of software that learned to think by reading the internet.
Act 4
The Intelligence
Beat 1 · What the Model Learned
What the model learned
The model was trained on text. Not a sample of text: functionally, much of the publicly available text on the internet.
Books, articles, code, conversations, scientific papers, forums, poetry, manuals. Trillions of words, compressed into a mathematical representation of how language works. It did not memorise it. It learned patterns — how words relate, how ideas connect, how questions lead to answers.
est. · EpochAI, 2024
could read in a lifetime
Beat 2 · Training Cost
What training cost
That single training run now serves millions of people simultaneously.
A single, enormous investment that becomes shared infrastructure, like a highway: built once, used by everyone, every day. Training cost more than $100M and took six months. That investment now answers your question in under a second, alongside millions of other questions, every hour of every day.
Source: Sam Altman, 2023 / Stanford AI Index 2024
Beat 3 · The Blind Spot
What the model does not know
Patterns in language. How concepts relate. General knowledge up to its training cutoff. Public information from the open web.
Your product specifications. Your technical documentation. Your company's policies. Your pricing. Your support guides. Your proprietary knowledge.
The model was trained once and frozen. When it encounters a question that requires specific, current, or proprietary information, it has two options: retrieve that information from a source, or guess.
The model is extraordinary. But it has a blind spot. It does not know what your company knows. When it answers a question about your product, it relies entirely on the information you have made available. And most companies have not thought carefully about what that information looks like.
The model can reason. The model can generate. But it can only work with what you give it.
Act 5
The Content Layer
Every layer of this machine was built with extraordinary precision.
Nanometre-scale chips. Data centres cooled to fractions of a degree. Models trained for months with obsessive attention to quality.
Then the model meets your content. And for most companies, that content looks like this:
Outdated PDF
Last updated 2019
Conflicting wiki
3 different answers
Siloed intranet
No machine access
Unstructured docs
No metadata
Same question. Two very different answers.
Click a document to highlight which parts of the AI answer it contributes to.
Environment A
Unstructured — shared drives, email, legacy CMS
X-440 Operating Specs v2.pdf
PDF · March 2019
Valve X-440 Maintenance Guide.docx
DOCX · January 2021
X440_pressure_specs_FINAL.pdf
PDF · November 2022
X-440 Support Article
Confluence article · August 2023
RE: Valve specs question (email)
Email thread · February 2024
Environment B
Structured — Paligo Component Content Management
X-440 Maximum Safe Operating Pressure
Technical specification
status
PUBLISHED
version
3.1
product
Valve X-440
audience
Field engineers, installation technicians
owner
Engineering — Product Spec Team
lastReviewed
February 2024
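A hypothetical sketch of why that metadata matters: a retrieval step can filter to the single authoritative topic before the model ever sees the content. The field names mirror the card above; the filtering logic is illustrative only, not Paligo's actual API.

```python
# Hypothetical sketch: metadata like Environment B's lets a retrieval
# step keep only the authoritative topic before the AI answers.
# Field names mirror the card above; the logic is illustrative only.
topics = [
    {"title": "X-440 Maximum Safe Operating Pressure",
     "status": "PUBLISHED", "version": "3.1",
     "product": "Valve X-440", "lastReviewed": "2024-02"},
    {"title": "X440_pressure_specs_FINAL.pdf",
     "status": None, "version": None,
     "product": None, "lastReviewed": None},  # unstructured: nothing to filter on
]

authoritative = [t for t in topics
                 if t["status"] == "PUBLISHED" and t["lastReviewed"]]
print(authoritative[0]["title"])  # only the structured topic survives
```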
Environment A
AI Answer
Conflicting values found across source documents. Answer confidence: LOW.
One sentence inferred — no authoritative source document.
Environment B
AI Answer
About Paligo
Paligo is a structured content platform used by enterprise companies to manage technical documentation at scale. By single-sourcing content and delivering it in machine-readable formats, Paligo helps organisations ensure that when an AI retrieves their information, the answer is accurate, consistent, and current.
See how Paligo works →
The question, answered
The world built the most sophisticated question-answering machine in history.
Make sure your answer is ready.
Test it yourself
Try this prompt in any AI
Tell me about [your company name]'s [product or service area]. What can you tell me, what are your limitations, and what sources did you use?
The solution
Control the answer at the source
Paligo is the component content management system that gives AI something structured, accurate, and versioned to work with.
An interactive report by Paligo · paligo.net · February 2026
Transparency
Methodology & Sources
Every claim in this piece is sourced, derived, or clearly marked as an estimate. This section documents our data sources, derived calculations, and the limitations of the underlying data — so you can verify, challenge, or build on our work.
Sources by Section
How We Derived Key Figures
Several numbers in the piece are calculated from primary sources rather than directly reported. Each calculation is shown in full.
Live energy counter rate
460 TWh/year (IEA 2024) ÷ 31,557,600 seconds/year ≈ 14,576 kWh/sec
Represents continuous global AI data center consumption at the 2024 baseline rate.
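As a quick check of the arithmetic:

```python
ANNUAL_TWH = 460                 # IEA 2024 data-center estimate
SECONDS_PER_YEAR = 31_557_600    # Julian year: 365.25 * 24 * 3600

rate_kwh_per_sec = ANNUAL_TWH * 1e9 / SECONDS_PER_YEAR  # TWh -> kWh
print(int(rate_kwh_per_sec))  # 14576
```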
Water bottles per day
2,000,000,000 queries/day ÷ 1,000 × 700 ml ÷ 500 ml/bottle = 2,800,000 bottles
Based on UC Riverside 2023 figure (700 ml/1,000 queries) and OpenAI-reported ~2B daily queries (early 2026).
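The same calculation, step by step:

```python
QUERIES_PER_DAY = 2_000_000_000   # ~2B, OpenAI-reported (early 2026)
ML_PER_1000_QUERIES = 700         # UC Riverside 2023 figure
ML_PER_BOTTLE = 500               # standard water bottle

bottles = QUERIES_PER_DAY / 1000 * ML_PER_1000_QUERIES / ML_PER_BOTTLE
print(int(bottles))  # 2800000
```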
>10,000× lifetime reading comparison
~10 trillion words (13T tokens × 0.75 words/token) ÷ 100M–700M words (lifetime-reading estimates) = ~14,000–100,000×
Conservative floor uses 700M words (generous lifetime estimate at 250 wpm × 1hr/day × 75 years). Real ratio depends on reading assumptions; >10,000× is defensible at any reasonable estimate.
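The range follows from varying the lifetime-reading denominator; the 100M-word low end here is an assumed leaner estimate chosen to reproduce the ~100,000× top of the range:

```python
TOKENS = 13e12                    # ~13T training tokens (Epoch AI est.)
WORDS_PER_TOKEN = 0.75
LIFETIME_WORDS_HIGH = 700e6       # generous lifetime-reading estimate
LIFETIME_WORDS_LOW = 100e6        # assumed leaner estimate

model_words = TOKENS * WORDS_PER_TOKEN             # ~9.75 trillion words
ratio_floor = model_words / LIFETIME_WORDS_HIGH    # ~14,000x
ratio_ceiling = model_words / LIFETIME_WORDS_LOW   # ~97,500x
print(round(ratio_floor), round(ratio_ceiling))
```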
AI industry CO₂ (30M tonnes)
Derived from Goldman Sachs Research 2024 electricity projections × IEA grid carbon intensity (2024 global average ~500g CO₂/kWh applied to AI-specific data center energy fraction)
This is an estimated order-of-magnitude figure. Actual emissions depend on the carbon intensity of the electrical grids powering each facility — which varies from near-zero (hydro/nuclear) to high (coal-heavy grids).
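One way to reproduce the order of magnitude: the ~60 TWh AI-specific energy fraction below is an assumed illustrative input (the actual split is not published), combined with the IEA global-average grid intensity:

```python
AI_ENERGY_TWH = 60        # assumed AI-specific fraction, illustrative only
GRID_G_PER_KWH = 500      # IEA 2024 global average carbon intensity

tonnes = AI_ENERGY_TWH * 1e9 * GRID_G_PER_KWH / 1e6  # kWh * g -> tonnes
print(f"{tonnes / 1e6:.0f}M tonnes CO2")  # 30M tonnes
```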
Cascade disruption timelines
Taiwan production halt → NVIDIA GPU shortage: 18 months
Based on semiconductor industry lead times reported in TSMC and ASML annual reports, cross-referenced with SIA supply chain resilience analyses.
Limitations & Caveats
Data journalism is only as credible as its limitations. These are ours.
AI market moves fast. Some figures may be outdated within months of publication. Where possible, we note the data vintage. The campaign was fact-checked in February 2026.
OpenAI and Anthropic do not disclose training details. GPU counts, dataset sizes, and training costs for frontier models are third-party estimates from Epoch AI, Semianalysis, and Stanford. All such figures carry the "~" or "est." qualifier.
Chinese data is structurally incomplete. Chinese AI companies, chipmakers (SMIC, Huawei), and data center operators disclose limited data. Chinese data center counts in global databases reflect Western-standard classification only.
Market share figures are point-in-time estimates. GPU, HBM, and cloud market shares are highly dynamic. TrendForce and SIA figures represent quarterly snapshots and should not be treated as stable year-round averages.
Carbon accounting methodology varies. CO₂ figures mix Scope 1 (direct), Scope 2 (purchased electricity), and lifecycle estimates depending on the source. The GPT-4 training range (1,000–15,000 tCO₂e) reflects grid carbon intensity variation, not model size uncertainty.
Supply chain is simplified. Real AI supply chains involve hundreds of companies, sub-suppliers, and raw material sources. This map shows the critical path — the nodes whose disruption would most severely impact AI capability.
Inference cost figures are from leaked documents, not official disclosures. The OpenAI inference cost figures in Act 4 come from internal Microsoft financial documents reported by Ed Zitron (November 2025). They reflect Microsoft's internal accounting of one side of a bilateral relationship. Inference spend is predominantly cash; training is largely funded via Azure credits (per TechCrunch sourcing). The same documents show revenue-sharing payments of $493.8M (2024) and $865.8M (Q1–Q3 2025) from OpenAI to Microsoft — which, under a widely reported 20% deal, would imply ~$4.3B revenue for those nine months vs $8.65B inference spend. This profitability question is contested: Altman claims $13B+ annual revenue. OpenAI has never officially published cost or revenue figures.
About this project
This campaign was produced by Paligo to make the infrastructure, cost, and consequences of AI legible to a general audience. It is designed as a data-driven editorial narrative — not marketing material. Every statistic has a primary source. Every estimate is qualified.
Data currency
For corrections, updates, or data disputes, contact the Paligo editorial team. This project is open to review. If you identify a factual error, we will investigate and publish a correction in this section.
