AI Model Sustainability & CO2 Consumption

This document provides comprehensive data on the environmental impact of AI models used in the SF Explorer application, including CO2 emissions, water consumption, and energy usage.

Overview

AI model inference has a measurable environmental footprint. Understanding these impacts helps make informed decisions about model selection, balancing performance needs with sustainability goals.

Key Metrics

  • CO2 Emissions (gCO2e): Grams of CO2 equivalent per 1,000 tokens
  • Water Consumption (L): Liters of water per 1,000 tokens (cooling + energy production)
  • Host Provider: Cloud infrastructure hosting the model
  • PUE (Power Usage Effectiveness): Data center efficiency (lower is better, 1.0 is ideal)
  • CIF (Carbon Intensity Factor): gCO2e per Wh of electricity
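
A minimal TypeScript sketch of how one model's record combining these metrics might be represented (field names are illustrative, not the application's actual types):

```typescript
// Hypothetical shape for one model's sustainability record.
// Field names are illustrative, not the application's actual types.
interface ModelSustainability {
  model: string;            // e.g. "GPT-4o Mini"
  host: "Azure" | "AWS" | "Google";
  co2PerKTokens: number;    // gCO2e per 1,000 tokens
  waterPerKTokens: number;  // liters per 1,000 tokens (cooling + energy production)
  source: string;           // benchmark reference or estimation note
}
```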

Model CO2 Consumption Data

OpenAI Models (Azure)

| Model | CO2 (g/1k tokens) | Water (L/1k tokens) | Host | Source |
| --- | --- | --- | --- | --- |
| GPT-5 | 13.778 | 17.686 | Azure | Artificial Analysis (gpt-5-2025-08-07 high, DGX H200/H100) |
| GPT-5 Mini | 7.746 | 6.102 | Azure | Artificial Analysis (gpt-5-mini-2025-08-07 high, DGX H200/H100) |
| GPT-4.1 | 0.555 | 0.437 | Azure | Artificial Analysis (gpt-4.1-2025-04-14, DGX H200/H100) |
| GPT-4.1 Mini | 0.588 | 0.464 | Azure | Artificial Analysis (gpt-4.1-mini-2025-04-14, DGX H200/H100) |
| GPT-4o | 1.165 | 0.878 | Azure | Artificial Analysis (gpt-4o-2024-11-20, DGX H200/H100) |
| GPT-4o Mini | 0.639 | 0.463 | Azure | Artificial Analysis (gpt-4o-mini-2024-07-18, DGX A100) |
| O3 | 0.990 | 0.780 | Azure | Artificial Analysis (o3-2025-04-16, DGX H200/H100) |
| O4 Mini | 5.132 | 4.043 | Azure | Artificial Analysis (o4-mini-2025-04-16 high, DGX H200/H100) |

Anthropic Models (AWS Bedrock)

| Model | CO2 (g/1k tokens) | Water (L/1k tokens) | Host | Source |
| --- | --- | --- | --- | --- |
| Claude 4.5 Sonnet | 1.200 | 0.630 | AWS | Artificial Analysis (claude-sonnet-4-5-20250929, DGX H200/H100) |
| Claude 4 Sonnet | 1.178 | 0.620 | AWS | Artificial Analysis (claude-sonnet-4-20250514, DGX H200/H100) |
| Claude 4.5 Haiku | 0.784 | 0.413 | AWS | Artificial Analysis (claude-haiku-4-5-20251001, DGX H200/H100) |
| Claude 3 Haiku | 0.644 | 0.339 | AWS | Artificial Analysis (claude-3-haiku-20240307, DGX H200/H100) |

Google Models (Vertex AI)

| Model | CO2 (g/1k tokens) | Water (L/1k tokens) | Host | Source |
| --- | --- | --- | --- | --- |
| Gemini 2.5 Pro | 1.543 | 1.838 | Google | Artificial Analysis (gemini-2.5-pro, TPU V6e) |
| Gemini 2.5 Flash | 0.556 | 0.663 | Google | Artificial Analysis (google/gemini-2.5-flash, TPU V6e) |
| Gemini 2.5 Flash Lite | 0.150 | 0.180 | Google | Estimated based on Gemini 2.5 Flash |
| Gemini 2.0 Flash | 0.450 | 0.540 | Google | Estimated based on Gemini 2.5 Flash |
| Gemini 2.0 Flash Lite | 0.120 | 0.145 | Google | Estimated based on Gemini 2.5 Flash Lite |

Amazon Models (AWS Bedrock)

| Model | CO2 (g/1k tokens) | Water (L/1k tokens) | Host | Source |
| --- | --- | --- | --- | --- |
| Amazon Nova Pro | 0.500 | 0.350 | AWS | Estimated based on similar models |
| Amazon Nova Lite | 0.100 | 0.070 | AWS | Estimated based on similar lightweight models |

Environmental Factors by Cloud Provider

Different cloud providers have varying environmental impacts based on their infrastructure and energy sources:

| Provider | PUE | CIF (gCO2e/Wh) | WUE (Site) | WUE (Source) | Notes |
| --- | --- | --- | --- | --- | --- |
| Azure | 1.12 | 0.34 | 0.3 | 4.35 | Higher carbon intensity |
| AWS | 1.14 | 0.30 | 0.18 | 5.12 | Moderate carbon intensity |
| Google | 1.09 | 0.231 | 0.3 | 1.1 | Lowest carbon due to renewable energy |
| xAI | 1.5 | 0.385 | 0.36 | 3.142 | Higher PUE |
| DeepSeek | 1.27 | 0.6 | 1.2 | 6.016 | Highest carbon intensity |
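
For reference, the same factors can be captured as a TypeScript constant; the sketch below simply transcribes the table above (the structure and names are illustrative, not the application's actual implementation):

```typescript
// Provider-level factors transcribed from the table above (illustrative structure).
interface ProviderFactors {
  pue: number;        // Power Usage Effectiveness
  cif: number;        // Carbon Intensity Factor, gCO2e per Wh
  wueSite: number;    // on-site water usage effectiveness
  wueSource: number;  // water usage from energy production
}

const PROVIDER_FACTORS: Record<string, ProviderFactors> = {
  Azure:    { pue: 1.12, cif: 0.34,  wueSite: 0.3,  wueSource: 4.35 },
  AWS:      { pue: 1.14, cif: 0.30,  wueSite: 0.18, wueSource: 5.12 },
  Google:   { pue: 1.09, cif: 0.231, wueSite: 0.3,  wueSource: 1.1 },
  xAI:      { pue: 1.5,  cif: 0.385, wueSite: 0.36, wueSource: 3.142 },
  DeepSeek: { pue: 1.27, cif: 0.6,   wueSite: 1.2,  wueSource: 6.016 },
};
```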

Key Takeaways

  • Google has the lowest carbon footprint due to extensive renewable energy investments
  • Azure and AWS have comparable environmental profiles
  • PUE measures data center efficiency (1.0 = perfect, typical range 1.1-1.5)
  • CIF varies significantly based on regional energy grid composition

Sustainability Ratings

Models in the SF Explorer application are assigned sustainability ratings based on their CO2 emissions relative to other available models:

| Rating | Percentile | Description |
| --- | --- | --- |
| A+ | Top 20% | Most sustainable - lowest emissions |
| A | 21-40% | Very sustainable |
| B | 41-60% | Moderate impact |
| C | 61-80% | Higher impact |
| D | Bottom 20% | Least sustainable - highest emissions |
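
A minimal sketch of how such a percentile-based rating could be computed from the per-model CO2 figures (illustrative only; the application's actual implementation may differ):

```typescript
type Rating = "A+" | "A" | "B" | "C" | "D";

// Rank models by CO2 per 1k tokens (ascending) and map each model's
// percentile within the ranking to a letter band, per the table above.
function assignRatings(co2ByModel: Record<string, number>): Record<string, Rating> {
  const sorted = Object.entries(co2ByModel).sort(([, a], [, b]) => a - b);
  const ratings: Record<string, Rating> = {};
  sorted.forEach(([model], index) => {
    const percentile = ((index + 1) / sorted.length) * 100;
    if (percentile <= 20) ratings[model] = "A+";
    else if (percentile <= 40) ratings[model] = "A";
    else if (percentile <= 60) ratings[model] = "B";
    else if (percentile <= 80) ratings[model] = "C";
    else ratings[model] = "D";
  });
  return ratings;
}
```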

Choosing Sustainable Models

Best for Sustainability (A+ rated)

  1. Amazon Nova Lite - 0.100 g CO2/1k tokens
  2. Gemini 2.0 Flash Lite - 0.120 g CO2/1k tokens
  3. Gemini 2.5 Flash Lite - 0.150 g CO2/1k tokens
  4. Gemini 2.0 Flash - 0.450 g CO2/1k tokens
  5. Gemini 2.5 Flash - 0.556 g CO2/1k tokens

Best Balance of Performance & Sustainability

  1. GPT-4.1 - 0.555 g CO2/1k tokens (excellent reasoning)
  2. GPT-4.1 Mini - 0.588 g CO2/1k tokens (fast, efficient)
  3. Claude 3 Haiku - 0.644 g CO2/1k tokens (quick responses)
  4. GPT-4o Mini - 0.639 g CO2/1k tokens (multimodal)
  5. Claude 4.5 Haiku - 0.784 g CO2/1k tokens (fast Claude)

High-Performance Models (Higher Emissions)

  1. GPT-5 - 13.778 g CO2/1k tokens (flagship reasoning)
  2. GPT-5 Mini - 7.746 g CO2/1k tokens (GPT-5 quality, faster)
  3. O4 Mini - 5.132 g CO2/1k tokens (reasoning model)
  4. Gemini 2.5 Pro - 1.543 g CO2/1k tokens (Google flagship)

Real-World Equivalents

To help contextualize emissions, here are equivalents for processing 1 billion prompts:

| Model | Energy (MWh) | CO2 (Tons) | Water (kL) | Equivalent |
| --- | --- | --- | --- | --- |
| GPT-5 (high) | 12,156 | 4,133 | 56,136 | ~898 cars/year |
| GPT-4o Mini | 564 | 192 | 1,910 | ~42 cars/year |
| Claude 3 Haiku | 644 | 193 | 3,398 | ~42 cars/year |
| Gemini 2.5 Flash | 723 | 167 | 995 | ~36 cars/year |
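
The CO2 column is consistent with scaling the per-1k-token figures, assuming roughly 300 tokens per prompt (the short query profile described under Methodology). A minimal sketch for GPT-5:

```typescript
// Scale per-1k-token CO2 to 1 billion prompts, assuming ~300 tokens per prompt
// (the "short" query profile from the methodology below).
const co2PerKTokens = 13.778;        // gCO2e per 1,000 tokens (GPT-5)
const tokensPerPrompt = 300;         // assumed average prompt size
const prompts = 1_000_000_000;
const grams = co2PerKTokens * (tokensPerPrompt / 1000) * prompts;
const tons = grams / 1_000_000;      // ≈ 4,133 tons, matching the table above
```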

Methodology

Data Sources

  • Primary: Artificial Analysis benchmark data
  • Hardware profiles: DGX H200/H100, DGX A100, TPU V6e
  • Query length: Short (300 tokens), Medium (1000 tokens), Long (1500 tokens)

Calculation Formula

CO2 (gCO2e) = Energy (Wh) × CIF (gCO2e/Wh) × PUE
Water (mL) = Energy (Wh) × WUE × PUE
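
A direct transcription of these formulas as a TypeScript sketch (variable names and the example energy value are illustrative):

```typescript
// Sketch of the formulas above; energyWh is the measured inference energy in watt-hours.
function co2Grams(energyWh: number, cif: number, pue: number): number {
  return energyWh * cif * pue;   // gCO2e
}

function waterMl(energyWh: number, wue: number, pue: number): number {
  return energyWh * wue * pue;   // mL
}

// Example using the Azure factors from the provider table (10 Wh is a made-up energy value):
const exampleCo2 = co2Grams(10, 0.34, 1.12);  // ≈ 3.81 gCO2e
```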

Assumptions

  1. Query length normalization: Values normalized to per-1,000-token basis
  2. Combined metrics: Uses Mean Combined Energy/Carbon/Water values (average of min/max utilization)
  3. Estimated models: Some models without direct benchmarks use extrapolated values from similar architectures

Last updated: January 2026

Data source: packages/sf-explorer-app/src/Framework/utils/co2Consumption.csv