
AI Model Sustainability Guide

Understanding and minimizing the environmental impact of AI model usage.


Why Sustainability Matters

The Environmental Impact of AI

AI models, especially large language models (LLMs), require significant computational resources:

  • Energy Consumption: Large models like GPT-5 can use 20-100x more energy than efficient models
  • CO₂ Emissions: Training GPT-3 produced ~552 tons of CO₂; inference adds up at scale
  • Water Usage: Data centers consume water for cooling - roughly 18 millilitres per 1,000 tokens for the largest models covered in this guide
  • Scaling Impact: As AI adoption grows, cumulative environmental impact becomes significant

Business & Environmental Benefits

| Benefit | Impact |
|---|---|
| Cost Reduction | More sustainable models are typically 5-50x cheaper |
| Regulatory Compliance | EU AI Act and sustainability reporting requirements |
| Corporate Responsibility | ESG goals and stakeholder expectations |
| Performance | Efficient models often have faster response times |

How We Calculate Sustainability

CO₂ Emissions Formula

CO₂ (grams) = (Tokens ÷ 1000) × CO₂_per_1k_tokens

Data Sources:

  • Artificial Analysis benchmark data
  • Model provider specifications
  • Industry research on AI carbon footprints

Factors Considered:

  • Model size (parameters)
  • Hardware efficiency (DGX A100, H100, H200, TPU)
  • Data center PUE (Power Usage Effectiveness)
  • Carbon Intensity Factor (CIF) by region

Water Consumption Formula

Water (mL) = (Tokens ÷ 1000) × Water_mL_per_1k_tokens (divide by 1,000 to express the result in litres)

Factors:

  • Direct cooling water usage
  • Indirect water from power generation
  • Data center location and climate
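
To make the two formulas concrete, the sketch below computes both figures for a single request. The per-1k factors come from the reference table later in this guide (with water expressed in millilitres); the FACTORS object and function name are illustrative, not part of any published API.

```javascript
// Per-1k-token factors for a few models (values from the reference table below).
// CO₂ in grams, water in millilitres, per 1,000 tokens.
const FACTORS = {
  'gpt-5':            { co2: 13.78, water: 17.69 },
  'gpt-4o-mini':      { co2: 0.64,  water: 0.46  },
  'amazon-nova-lite': { co2: 0.10,  water: 0.07  },
};

// CO₂ (g) = (tokens ÷ 1000) × CO₂_per_1k; Water (mL) = (tokens ÷ 1000) × water_per_1k
function estimateFootprint(model, tokens) {
  const f = FACTORS[model];
  if (!f) throw new Error(`No sustainability data for model: ${model}`);
  const units = tokens / 1000;
  return {
    co2Grams: units * f.co2,
    waterLiters: (units * f.water) / 1000, // convert mL to litres
  };
}

// Example: a single 500-token GPT-5 request
// → { co2Grams: 6.89, waterLiters: ~0.0088 }
console.log(estimateFootprint('gpt-5', 500));
```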

Sustainability Rating System

Models are rated on a percentile-based scale:

| Rating | Percentile | CO₂ Range | Description |
|---|---|---|---|
| A+ | Top 20% | < 1.0 g/1k | Most sustainable |
| A | 20-40% | 1.0-2.0 g/1k | Very sustainable |
| B | 40-60% | 2.0-5.0 g/1k | Moderate |
| C | 60-80% | 5.0-10.0 g/1k | Higher impact |
| D | Bottom 20% | > 10.0 g/1k | Highest impact |
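
As a rough illustration, the helper below maps a CO₂-per-1k figure onto the letter grades using the range column above. Keep in mind the published ratings are percentile-based across the whole catalogue, so this range approximation can disagree with the table for borderline models.

```javascript
// Approximate letter rating from CO₂ grams per 1,000 tokens,
// using the range column of the rating table above.
function ratingFromCO2(gramsPer1k) {
  if (gramsPer1k < 1.0)  return 'A+'; // Most sustainable
  if (gramsPer1k < 2.0)  return 'A';  // Very sustainable
  if (gramsPer1k < 5.0)  return 'B';  // Moderate
  if (gramsPer1k < 10.0) return 'C';  // Higher impact
  return 'D';                         // Highest impact
}

console.log(ratingFromCO2(0.64));  // 'A+' (e.g. GPT-4o Mini)
console.log(ratingFromCO2(13.78)); // 'D'  (e.g. GPT-5)
```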

Complete Model Sustainability Reference

All Models Ranked by CO₂ Efficiency

| Rank | Model | CO₂/1k (g) | Water/1k (mL) | Cost/1k ($) | Rating |
|---|---|---|---|---|---|
| 1 | Amazon Nova Lite | 0.10 | 0.07 | 0.0005 | A+ |
| 2 | Gemini 2.0 Flash Lite | 0.12 | 0.15 | 0.0007 | A+ |
| 3 | Gemini 2.5 Flash Lite | 0.15 | 0.18 | 0.0008 | A+ |
| 4 | Gemini 2.0 Flash | 0.45 | 0.54 | 0.002 | A+ |
| 5 | Amazon Nova Pro | 0.50 | 0.35 | 0.003 | A+ |
| 6 | GPT-4.1 | 0.56 | 0.44 | 0.012 | A+ |
| 7 | Gemini 2.5 Flash | 0.56 | 0.66 | 0.0025 | A+ |
| 8 | GPT-4.1 Mini | 0.59 | 0.46 | 0.002 | A+ |
| 9 | GPT-4o Mini | 0.64 | 0.46 | 0.0015 | A+ |
| 10 | Claude 3 Haiku | 0.64 | 0.34 | 0.0008 | A+ |
| 11 | Claude Haiku 4.5 | 0.78 | 0.41 | 0.001 | A+ |
| 12 | O3 (Beta) | 0.99 | 0.78 | 0.006 | A |
| 13 | GPT-4o | 1.17 | 0.88 | 0.010 | A |
| 14 | Claude 3.7 Sonnet | 1.18 | 0.62 | 0.015 | A |
| 15 | Claude Sonnet 4 | 1.18 | 0.62 | 0.018 | A |
| 16 | Claude Sonnet 4.5 | 1.20 | 0.63 | 0.018 | A |
| 17 | Gemini 2.5 Pro | 1.54 | 1.84 | 0.010 | A |
| 18 | O4 Mini (Beta) | 5.13 | 4.04 | 0.002 | B |
| 19 | GPT-5 Mini | 7.75 | 6.10 | 0.005 | B |
| 20 | GPT-5 | 13.78 | 17.69 | 0.020 | D |
| 21 | GPT-5.1 (Beta) | 13.78 | 17.69 | 0.025 | D |

Relatable Equivalents

To make environmental impact tangible, we convert metrics to everyday equivalents:

CO₂ Equivalents

| CO₂ Amount | Equivalent |
|---|---|
| 1 gram | 0.004 km driving |
| 10 grams | Charging a smartphone |
| 100 grams | 1 dishwasher cycle |
| 1 kg | 4 km driving |

Water Equivalents

| Water Amount | Equivalent |
|---|---|
| 0.1 liters | 1/5 glass of water |
| 0.5 liters | 1 water bottle |
| 1 liter | 2 water bottles |
| 5 liters | 1 minute shower |

Example: 10,000 Requests (500 tokens each)

| Model | CO₂ | Equivalent | Water | Equivalent |
|---|---|---|---|---|
| GPT-5 | 68.9 kg | 275 km driving | 88.4 L | 18 min shower |
| GPT-4o Mini | 3.2 kg | 13 km driving | 2.3 L | 5 water bottles |
| Amazon Nova Lite | 0.5 kg | 2 km driving | 0.35 L | 1 glass water |
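
Worked example for the GPT-5 row: 10,000 requests × 500 tokens = 5,000,000 tokens, i.e. 5,000 per-1k units. CO₂: 5,000 × 13.78 g ≈ 68.9 kg. Water: 5,000 × 17.69 mL ≈ 88.4 L. The other rows follow the same arithmetic with their respective per-1k factors.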

Best Practices for Sustainable AI

1. Right-Size Your Model

Rule of thumb: Use the smallest model that meets quality requirements.

Simple tasks → Lite/Haiku models (A+)
Standard tasks → Mini models (A+)
Complex tasks → Pro/Full models (A/B)
Critical tasks → Premium models (C/D) - only when necessary

2. Optimize Token Usage

Reduce input tokens:

  • Use concise prompts
  • Remove unnecessary context
  • Use efficient prompt templates

Reduce output tokens:

  • Set appropriate max_tokens limits (see the sketch after this list)
  • Request concise responses
  • Use structured output formats
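
A minimal sketch of capping output length on an OpenAI-style chat completions call (Node 18+). The endpoint and max_tokens field follow the public OpenAI REST API; the model choice and the 150-token cap are illustrative.

```javascript
// Cap the response length so a verbose model cannot burn tokens (and CO₂) unnecessarily.
async function conciseCompletion(prompt) {
  const response = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: 'gpt-4o-mini',       // A+ rated model for standard tasks
      messages: [{ role: 'user', content: prompt }],
      max_tokens: 150,            // hard cap on output tokens
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content;
}
```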

3. Implement Smart Routing

Route requests based on complexity:

```javascript
// Route each request to the lowest-impact model that can handle it.
function selectModel(complexity) {
  switch (complexity) {
    case 'simple':
      return 'amazon-nova-lite'; // A+ rating
    case 'standard':
      return 'gpt-4o-mini';      // A+ rating
    case 'complex':
      return 'gpt-4.1';          // A+ rating
    case 'critical':
      return 'gpt-5';            // D rating - use sparingly
    default:
      return 'gpt-4o-mini';      // low-impact fallback for unknown complexity
  }
}
```
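
How the complexity label is produced is up to you. The heuristic below is a deliberately crude placeholder based on prompt length and a keyword check; in practice you might use a small classifier model instead.

```javascript
// Very rough complexity heuristic - tune or replace with a classifier for real workloads.
function estimateComplexity(prompt) {
  const longPrompt = prompt.length > 2000;
  const needsReasoning = /\b(prove|derive|multi-step|architecture|legal|diagnos)/i.test(prompt);
  if (longPrompt && needsReasoning) return 'critical';
  if (needsReasoning) return 'complex';
  if (longPrompt) return 'standard';
  return 'simple';
}

const model = selectModel(estimateComplexity('Summarize this paragraph in one sentence.'));
// → 'amazon-nova-lite'
```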

4. Cache and Batch

  • Cache responses for repeated queries
  • Batch similar requests to reduce overhead
  • Use embeddings for semantic caching
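
A minimal exact-match cache is sketched below. Semantic caching with embeddings follows the same pattern but keys on vector similarity rather than the raw prompt string; the Map-based store here is illustrative only.

```javascript
// Exact-match response cache: identical prompts never hit the model twice.
const responseCache = new Map();

async function cachedCompletion(prompt, callModel) {
  const key = prompt; // for semantic caching, replace with an embedding lookup
  if (responseCache.has(key)) {
    return responseCache.get(key); // zero extra tokens, zero extra CO₂
  }
  const result = await callModel(prompt);
  responseCache.set(key, result);
  return result;
}
```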

5. Monitor and Report

Track sustainability metrics:

  • Total CO₂ emissions per day/week/month
  • Water consumption trends
  • Cost vs. sustainability correlation
  • Model usage distribution
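
One lightweight way to track these metrics is to accumulate per-model totals on every request, as in the sketch below. It reuses the estimateFootprint helper sketched earlier; the field names and structure are illustrative.

```javascript
// Accumulate CO₂, water, token, and cost totals per model for periodic reporting.
const usageLog = {};

function recordUsage(model, tokens, costPer1k) {
  const { co2Grams, waterLiters } = estimateFootprint(model, tokens);
  const entry = usageLog[model] ?? { tokens: 0, co2Grams: 0, waterLiters: 0, costUsd: 0 };
  entry.tokens += tokens;
  entry.co2Grams += co2Grams;
  entry.waterLiters += waterLiters;
  entry.costUsd += (tokens / 1000) * costPer1k;
  usageLog[model] = entry;
}

// At the end of the day/week/month, dump the totals to your reporting pipeline.
console.table(usageLog);
```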

Model Selection by Sustainability Priority

🌱 Sustainability-First (A+ Only)

Best for organizations prioritizing environmental impact:

| Model | CO₂/1k | Use Case |
|---|---|---|
| Amazon Nova Lite | 0.10 g | Simple tasks, high volume |
| Gemini 2.0 Flash Lite | 0.12 g | Fast responses |
| GPT-4.1 | 0.56 g | Complex tasks with sustainability |
| GPT-4o Mini | 0.64 g | General purpose |
| Claude 3 Haiku | 0.64 g | High-volume chatbots |

⚖️ Balanced Approach (A+ and A)

Good sustainability with broader capability:

| Model | CO₂/1k | Use Case |
|---|---|---|
| GPT-4o | 1.17 g | Multimodal tasks |
| Claude Sonnet 4 | 1.18 g | Complex reasoning |
| Gemini 2.5 Pro | 1.54 g | Research, analysis |

🎯 Quality-First (Accept Higher Impact)

When quality is paramount:

| Model | CO₂/1k | Use Case |
|---|---|---|
| GPT-5 | 13.78 g | Most complex tasks |
| GPT-5.1 | 13.78 g | Cutting-edge features |

ROI Calculator: Sustainability Switch

Scenario: 100,000 requests/month (500 tokens each, ~50M tokens total)

Current State: GPT-5 for all requests

  • Monthly CO₂: 689 kg
  • Monthly Water: 884 L
  • Monthly Cost: $1,000

After Optimization: Hybrid approach

  • 70% GPT-4o Mini (35M tokens) → 22.4 kg CO₂
  • 20% GPT-4.1 (10M tokens) → 5.6 kg CO₂
  • 10% GPT-5 (5M tokens) → 68.9 kg CO₂

Results:

| Metric | Before | After | Savings |
|---|---|---|---|
| CO₂ | 689 kg | ~97 kg | ~86% reduction |
| Water | 884 L | ~109 L | ~88% reduction |
| Cost | $1,000 | ~$273 | ~73% reduction |
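
The sketch below reproduces the hybrid-scenario arithmetic from the tables above, using the per-1k factors from the reference table. The 70/20/10 traffic split and the 500-token request size are the scenario's assumptions.

```javascript
// 100,000 requests/month at 500 tokens each, split across three models.
const MIX = [
  { model: 'gpt-4o-mini', share: 0.7, co2: 0.64,  waterMl: 0.46,  cost: 0.0015 },
  { model: 'gpt-4.1',     share: 0.2, co2: 0.56,  waterMl: 0.44,  cost: 0.012  },
  { model: 'gpt-5',       share: 0.1, co2: 13.78, waterMl: 17.69, cost: 0.020  },
];
const totalTokens = 100_000 * 500; // 50M tokens/month

const totals = MIX.reduce((acc, m) => {
  const units = (totalTokens * m.share) / 1000; // per-1k units routed to this model
  acc.co2Kg       += (units * m.co2) / 1000;
  acc.waterLiters += (units * m.waterMl) / 1000;
  acc.costUsd     += units * m.cost;
  return acc;
}, { co2Kg: 0, waterLiters: 0, costUsd: 0 });

console.log(totals); // ≈ { co2Kg: 96.9, waterLiters: 109, costUsd: 272.5 }
```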

Data Sources & Methodology

Calculation Methodology

Our sustainability data is based on:

  1. Artificial Analysis Benchmarks: Real-world inference measurements
  2. Hardware Specifications:
    • NVIDIA DGX A100/H100/H200
    • Google TPU v5e/v6e
  3. Data Center Metrics:
    • PUE (Power Usage Effectiveness): 1.09-1.14
    • CIF (Carbon Intensity Factor): 0.231-0.34 gCO₂e/Wh
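
Putting these factors together, the underlying estimate chain is roughly: CO₂ per 1k tokens = energy per 1k tokens (Wh) × PUE × CIF. For example, if a model draws about 4 Wh per 1,000 tokens on an H100-class system (an illustrative figure, not taken from the benchmark data), a PUE of 1.12 and a CIF of 0.28 gCO₂e/Wh give 4 × 1.12 × 0.28 ≈ 1.25 g CO₂ per 1,000 tokens.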

Assumptions

  • Inference only (not training)
  • Standard 300-token query baseline
  • Average data center efficiency
  • Regional carbon intensity averages

Limitations

  • Actual values vary by:
    • Geographic location
    • Time of day (grid mix)
    • Specific hardware configuration
    • Workload patterns
  • Values are estimates; actual impact may differ ±20%


Key Takeaways

  1. Small changes = Big impact: Switching from GPT-5 to GPT-4o Mini reduces CO₂ by 95%
  2. Cost and sustainability align: Cheaper models are usually more sustainable
  3. Right-size your models: Use premium models only when quality demands it
  4. Monitor and optimize: Track your AI carbon footprint over time
  5. Hybrid approach wins: Route requests to appropriate models based on complexity

Building sustainable AI isn't just good for the planet—it's good for your bottom line.


Last Updated: January 2025
Data Version: Based on models.json with Artificial Analysis benchmark data