AI ROI Framework: Conservation-Consistent Measurement

Date: November 2025
Status: Phase 8 Extension


Executive Summary

This framework extends the discrete accounting conservation system to answer the central question facing investors and analysts:

Will the $1–3 trillion wave of AI capital expenditures earn their cost of capital?

By embedding AI-specific metrics within the 15-constraint accounting core, we transform qualitative claims about AI productivity into testable, filing-grounded economics that survive diligence.

Key Innovation: Traditional DCF models treat AI capex as generic capital. This framework enforces that incremental ROIC on AI assets (ROIC^{AI}) respects the growth-reinvestment identity and the conservation constraints, preventing internally inconsistent valuations.


1. The Wall Street Question

1.1 Investment Scale

From 2024–2029, hyperscalers (Microsoft, Amazon, Google, Meta, Oracle) are projected to invest at the trillion-dollar scale cited in the Executive Summary.

Q3 2024 Snapshot: Top 4 hyperscalers spent $58.9B in capex (63% YoY growth), with capex-to-revenue ratios reaching 22% vs. historical 11–16%.

1.2 Return Uncertainty

Bearish View (Goldman Sachs, June 2024):
- ~$1T spend with “little to show for it” beyond developer efficiency
- Daron Acemoglu (MIT): 0.5% productivity increase over the next decade
- Only 6.1% of U.S. businesses currently using AI for production

Bullish View (Morgan Stanley, 2024):
- GenAI expected to generate profits starting 2025 (34% margin)
- $153B revenue (2025) → $1.1T (2028), roughly a 7× increase
- Joseph Briggs (GS): 15% labor productivity increase possible

Bubble Warnings (IMF, Bank of England, October 2024):
- Stock valuations “comparable to the peak” of the 2000 dot-com bubble
- “Growing risk that AI bubble could burst” with systemic implications
- High concentration in a small cluster of AI-heavy companies

1.3 Measurement Gap

Despite massive investment:
- 97% of enterprises struggle to demonstrate GenAI business value (2024)
- 65% adoption but only 11% use at scale
- AI-attributed revenue and NOPAT not separately disclosed in 10-K/10-Q filings

This framework provides the first LP-based system to reconcile AI assumptions with accounting conservation, enabling firm-by-firm, quarter-by-quarter ROI measurement from public filings.


2. Conservation Framework Extension

2.1 Accounting Core (Review)

The base system uses a 15-constraint integer matrix (A \in \mathbb{Z}^{15 \times m}) encoding double-entry conservation:

[ A x = 0 ]

where (x \in \mathbb{R}^{m}) contains line items from the balance sheet, income statement, and cash flow statement. Any reported filing (x_t) with non-zero residual (r_t = A x_t) is projected to the feasible set:

[ \hat{x}_t = \arg\min_{x} \| x - x_t \|_1 \quad \text{s.t.} \quad A x = 0 ]

Quality Metric: [ Q_t = \frac{\| \hat{x}_t - x_t \|_1}{\| x_t \|_1} ]

Higher (Q_t) indicates more “fixes” needed to satisfy conservation, signaling lower reporting quality.
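The projection step can be sketched as a small linear program: the correction (d = x - x_t) is split into positive and negative parts so the L1 objective becomes linear. The 2-constraint matrix below is a hypothetical toy (not the full 15-constraint core), and (Q_t) is taken as the L1 projection distance normalized by (\| x_t \|_1), one consistent reading of the quality-metric definition.

```python
import numpy as np
from scipy.optimize import linprog

def project_to_feasible(A, x_t):
    """L1-project a reported vector x_t onto {x : A x = 0}.

    Split the correction d = p - n (p, n >= 0) so that
    min ||d||_1 becomes a linear program over (p, n)."""
    m, n_vars = A.shape
    c = np.ones(2 * n_vars)                  # minimize sum(p) + sum(n)
    A_eq = np.hstack([A, -A])                # A(p - n) = -A x_t
    b_eq = -A @ x_t
    res = linprog(c, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (2 * n_vars), method="highs")
    d = res.x[:n_vars] - res.x[n_vars:]
    x_hat = x_t + d
    Q = np.abs(d).sum() / np.abs(x_t).sum()  # quality metric Q_t
    return x_hat, Q

# Toy 2-constraint example (hypothetical): assets = liabilities + equity,
# plus a residual line that must net to zero, on a 4-item vector.
A = np.array([[1, -1, -1, 0],
              [0,  0,  0, 1]])
x_t = np.array([100.0, 60.0, 39.0, 2.0])    # off by 1 and by 2
x_hat, Q = project_to_feasible(A, x_t)
```

The minimal total fix is $3 on a $201 gross vector, so (Q_t \approx 0.015); a noisier filing would need larger corrections and score higher.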

2.2 AI-Specific Augmentation

We extend the constraint matrix to track AI assets as a sub-ledger within PP&E and intangibles:

[ IC^{AI}_t = NWC^{AI}_t + NFA^{AI}_t ]

where:
- (NFA^{AI}_t): GPU servers, data center infrastructure, capitalized AI software (from footnote roll-forwards)
- (NWC^{AI}_t): Working capital directly attributable to AI operations (typically small)

Roll-Forward Constraint (extends existing PP&E conservation): [ IC^{AI}_t = IC^{AI}_{t-1} + \text{Capex}^{AI}_t - \text{D\&A}^{AI}_t - \text{Disposals}^{AI}_t ]

This embeds AI capex within the existing conservation matrix, ensuring AI asset tracking respects double-entry physics.
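The roll-forward constraint can be checked directly on an extracted AI sub-ledger; the quarterly figures below are hypothetical:

```python
def rollforward_residuals(ic, capex, da, disposals):
    """Residual of IC^AI_t - (IC^AI_{t-1} + Capex^AI_t - D&A^AI_t
    - Disposals^AI_t) for t = 1..T-1; all-zero means the sub-ledger
    conserves."""
    return [ic[t] - (ic[t - 1] + capex[t] - da[t] - disposals[t])
            for t in range(1, len(ic))]

# Hypothetical quarterly AI ledger ($B): opening 80, then capex less D&A.
ic        = [80.0, 92.0, 100.0]
capex     = [0.0, 15.0, 12.0]
da        = [0.0, 3.0, 4.0]
disposals = [0.0, 0.0, 0.0]
res = rollforward_residuals(ic, capex, da, disposals)
```

A non-zero residual flags a sub-ledger that does not tie to the disclosed roll-forward.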


3. Incremental ROIC on AI Assets

3.1 Definition

Return on Invested Capital for AI infrastructure:

[ ROIC^{AI}_t = \frac{\Delta NOPAT^{AI}_t}{\overline{IC}^{AI}_{t-1:t}} ]

where:
- (\Delta NOPAT^{AI}_t): Incremental Net Operating Profit After Tax attributed to AI (see §3.2)
- (\overline{IC}^{AI}_{t-1:t}): Average AI invested capital over the period

Threshold Test: If (ROIC^{AI} < WACC), the AI capex is value-destructive.

3.2 NOPAT Attribution Methodology

AI-attributed NOPAT is not directly disclosed. We estimate via three approaches:

Method 1: Top-Down Trend Break

Compare actual NOPAT growth to pre-AI trend:

[ \Delta NOPAT^{AI}_t = NOPAT_t - \widehat{NOPAT}^{\,\text{trend}}_t ]

where (\widehat{NOPAT}^{\,\text{trend}}_t) is extrapolated from the 2020–2022 trend (before major AI capex).

Advantages: Simple, uses only financial statements
Disadvantages: Attributes all trend deviation to AI (may miss other factors)
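A sketch of the trend-break estimate, assuming a linear pre-AI trend (the functional form is an assumption; the series below is synthetic):

```python
import numpy as np

def trend_break_nopat(nopat_hist, n_pre):
    """Method 1: NOPAT_t minus a linear trend fit on the first n_pre
    (pre-AI-capex) quarters and extrapolated forward."""
    t = np.arange(len(nopat_hist))
    slope, intercept = np.polyfit(t[:n_pre], nopat_hist[:n_pre], 1)
    trend = intercept + slope * t
    return nopat_hist - trend

# Synthetic series: +1/quarter pre-AI trend, then a +2 level break.
nopat = np.array([10.0, 11.0, 12.0, 13.0, 16.0, 17.0])
delta = trend_break_nopat(nopat, n_pre=4)
```

The last two deviations (+2 each) are what Method 1 would attribute to AI, illustrating its main weakness: any other level shift would be attributed the same way.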

Method 2: Bottom-Up Unit Economics

For disclosed AI revenue segments (e.g., Microsoft’s “$13B AI business run rate”):

[ \Delta NOPAT^{AI}_t = Rev^{AI}_t \times m^{AI} \times (1 - \tau) ]

Incremental Margin Estimation ((m^{AI})):
- Use management-disclosed margins (e.g., Meta: “strong ROI from core AI”)
- Or apply segment-level operating margin adjusted for infrastructure depreciation

Advantages: Ties to disclosed revenue
Disadvantages: Requires margin assumptions; AI revenue not always broken out

Method 3: Cost Savings (Accelerated Computing)

For GPU-based compute cost reductions:

[ \Delta NOPAT^{AI}_t = \text{BaselineCPUCost}_t - \text{AccelCost}_t - \text{D\&A}^{GPU}_t ]

where:
- Baseline CPU Cost: Estimated from prior workload requirements
- Accelerated Cost: GPU opex (power, maintenance)
- D&A(^{GPU}): Depreciation on GPU assets

Nvidia’s Claim: “90% cost savings” from accelerated computing. We test this with disclosed unit economics (see §6).

Advantages: Validates vendor claims with filings
Disadvantages: Requires workload-specific modeling

3.3 Multi-Method Consensus

In practice, we compute all three estimates and report bounds:

[ ROIC^{AI}_t \in \left[ \min_i ROIC^{AI,(i)}_t,\; \max_i ROIC^{AI,(i)}_t \right] ]

If the upper bound < WACC, AI capex is definitively value-destructive. If the lower bound > WACC, AI capex is definitively value-creative. Otherwise, ROI is ambiguous given data limitations.
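The consensus rule above reduces to a small classifier over the per-method estimates (the three rates below are illustrative, matching the ambiguous §11.3 case):

```python
def consensus_verdict(roic_estimates, wacc):
    """Bound ROIC^AI by the min/max over Methods 1-3 and classify
    against WACC."""
    lo, hi = min(roic_estimates), max(roic_estimates)
    if hi < wacc:
        return lo, hi, "definitively value-destructive"
    if lo > wacc:
        return lo, hi, "definitively value-creative"
    return lo, hi, "ambiguous"

lo, hi, verdict = consensus_verdict([0.054, 0.071, 0.091], wacc=0.085)
```

Because the WACC of 8.5% falls inside the 5.4%–9.1% band, the verdict is "ambiguous", exactly the status flag shown in §11.3.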


4. Distributed Lag Model

4.1 Infrastructure Payback Timeline

AI infrastructure (data centers, GPU farms) requires 3–5 years to generate full returns, evidenced by:

  1. Academic Literature (Bom & Ligthart, 2014): Public infrastructure shows short-run elasticity 0.083, long-run 0.122 (5–10 year lag)
  2. AI-Specific Evidence (McElheran et al., 2025): Initial productivity decline of 1.33% to 60% (J-curve), recovering over 3–5 years
  3. Hyperscaler Timeline (Morgan Stanley, 2024): Profitability expected 2025 for early movers

We model AI capex impact with Autoregressive Distributed Lag (ARDL) structure:

[ NOPAT_t = \alpha + \sum_{k=0}^{K} \beta_k \, \text{Capex}^{AI}_{t-k} + \gamma X_t + \varepsilon_t ]

Typical Parameters (from infrastructure literature):
- (K = 4) to (8) quarters (1–2 years of lag)
- Peak coefficient at (k = 4) (1-year lag)
- 50% of total effect within 6 months, 90% within 3 years
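The distributed-lag regression can be sketched with plain OLS; this noise-free synthetic check (controls (X_t) omitted, lag weights assumed) just verifies that known lag coefficients are recovered:

```python
import numpy as np

def fit_ardl(nopat, capex, K):
    """OLS fit of NOPAT_t = alpha + sum_{k=0..K} beta_k * Capex^AI_{t-k}
    (controls X_t omitted in this sketch)."""
    T = len(nopat)
    X = np.array([[1.0] + [capex[t - k] for k in range(K + 1)]
                  for t in range(K, T)])
    y = np.asarray(nopat[K:], dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1:]            # alpha, beta_0..beta_K

# Synthetic data with known lag weights (assumed, not estimated):
rng = np.random.default_rng(0)
capex = rng.uniform(10.0, 20.0, 40)
true_betas = np.array([0.1, 0.3, 0.2])  # peak at k = 1 in this toy
nopat = np.zeros(40)
for t in range(2, 40):
    # beta_0*capex[t] + beta_1*capex[t-1] + beta_2*capex[t-2]
    nopat[t] = 5.0 + true_betas @ capex[t - 2:t + 1][::-1]
alpha, betas = fit_ardl(nopat, capex, K=2)
```

On real quarterly filings the fit would use (K = 4) to (8) and include controls; with so few observations per firm, pooling across firms or shrinking the lag profile is usually necessary.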

4.2 Net Present Value Calculation

Over a 5-year horizon:

[ NPV^{AI} = \sum_{t=0}^{20} \frac{\Delta NOPAT^{AI}_t}{(1 + WACC)^{t/4}} - \sum_{t=0}^{20} \frac{\text{Capex}^{AI}_t}{(1 + WACC)^{t/4}} ]

where (t) indexes quarters.

Decision Rule: Invest if (NPV^{AI} > 0), accounting for distributed lags.

Implication: A project showing negative annual ROIC in years 1–2 may still be NPV-positive if the J-curve effect is temporary.
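The J-curve implication can be made concrete with quarterly discounting; the cash-flow pattern below is a hypothetical illustration, not a forecast:

```python
def npv_ai(delta_nopat, capex, wacc_annual):
    """5-year NPV over quarters t = 0..20, discounting each quarterly
    cash flow at (1 + WACC)^(t/4)."""
    return sum((dn - cx) / (1.0 + wacc_annual) ** (t / 4.0)
               for t, (dn, cx) in enumerate(zip(delta_nopat, capex)))

# J-curve illustration ($B, hypothetical): $10B capex up front,
# zero returns for the first year, then $3B per quarter.
npv = npv_ai([0.0] * 4 + [3.0] * 17, [10.0] + [0.0] * 20, wacc_annual=0.085)
```

Despite negative cash flows (and negative annual ROIC) in year 1, the project is NPV-positive once the lagged returns arrive.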


5. Terminal Value Consistency

5.1 Conservation-Consistent Multiple

The implied enterprise value-to-EBITDA multiple must satisfy:

[ \frac{EV}{EBITDA} = \frac{(1 - \tau)\left(1 - \dfrac{g}{ROIC}\right)}{WACC - g} ]

where:
- (\tau): Effective tax rate
- (g): Terminal growth rate
- (ROIC): Return on invested capital (including AI assets)
- (WACC): Weighted average cost of capital

Feasibility Bounds:
  1. (WACC > g) (else the perpetuity diverges)
  2. (g \le ROIC) (implied by the growth-reinvestment identity (g = s \times ROIC) with reinvestment rate (s \le 1))
  3. (g \ge 0) for mature firms (negative terminal growth requires explicit justification)
  4. (\tau) within empirical tax-rate bounds for developed markets

5.2 AI Growth Constraints

If AI capex drives terminal growth assumptions:

[ g = g^{\text{base}} + g^{AI} ]

Power Constraint Check:
- If (g) implies (X) GW of compute capacity
- And only (Y < X) GW is feasible given power constraints
- Then (g) must be reduced to (g' = g \times (Y/X))

This prevents analysts from forecasting growth rates that physically cannot be supplied with electricity.

5.3 Solver Output

For a given set of inputs ((g, ROIC, WACC, )), the validator:

  1. Checks feasibility bounds (reports violations as terminal_value_infeasible)
  2. Computes implied EV/EBITDA
  3. Compares to analyst’s chosen multiple
  4. Reports the inconsistency gap (\Delta^* = | M_{\text{analyst}} - M_{\text{implied}} |)

If (\Delta^* > 0.5) (tolerance: 0.5 turns of EBITDA), the model is internally inconsistent.
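The validator steps reduce to a few lines; the value-driver multiple treats EBITDA as approximately EBIT (a simplification), and the 33% tax rate below is an assumption chosen so the §11.3 example numbers reproduce:

```python
def implied_ev_ebitda(g, roic, wacc, tau):
    """Conservation-consistent EV/EBITDA implied by (g, ROIC, WACC, tau);
    EBITDA ~ EBIT is assumed here."""
    if wacc <= g:
        raise ValueError("terminal_value_infeasible: need WACC > g")
    if g > roic:
        raise ValueError("terminal_value_infeasible: need g <= ROIC")
    return (1.0 - tau) * (1.0 - g / roic) / (wacc - g)

def multiple_gap(analyst_multiple, g, roic, wacc, tau, tol=0.5):
    """Implied multiple, gap vs. the analyst's multiple, and an
    inconsistency flag (True when gap > tol turns of EBITDA)."""
    implied = implied_ev_ebitda(g, roic, wacc, tau)
    gap = abs(analyst_multiple - implied)
    return implied, gap, gap > tol

# §11.3 example: g = 6.5%, ROIC = 10%, WACC = 8.5%, tau = 33% (assumed).
implied, gap, inconsistent = multiple_gap(15.0, 0.065, 0.10, 0.085, 0.33)
```

The implied multiple is about 11.7×, so a 15.0× exit multiple is flagged as internally inconsistent by roughly 3.3 turns, matching the §11.3 report.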


6. Cost-of-Compute Parity Test

6.1 Accelerated Computing Claims

Nvidia and partners claim:
- “90%+ cost/energy savings” from GPU vs. CPU for certain workloads
- “Best ROI computing infrastructure investment” (Jensen Huang, CEO)

We formalize as a shadow P&L:

[ \text{Savings}_t = \text{BaselineCPUCost}_t - \text{AccelCost}_t ]

where:
- Baseline CPU Cost: (\text{FLOPs}_t \times P^{CPU})
- Accel Cost: (\text{FLOPs}_t \times P^{GPU} + \text{PowerCost}_t + \text{D\&A}^{GPU}_t)

with (P^{CPU}) and (P^{GPU}) the respective prices per FLOP.

Data Sources:
- FLOPs per workload: Published in academic papers (e.g., GPT-4 training: (2 \times 10^{25}) FLOPs)
- CPU $/FLOP: AWS EC2 pricing ($0.096/hr for c6i.xlarge at 19.2 GFLOPS ≈ $0.005/GFLOP-hr)
- GPU $/FLOP: AWS p4d.24xlarge pricing ($32.77/hr, 8×A100 = 2.5 PFLOPS ≈ $0.000013/GFLOP-hr, ~385× cheaper per FLOP)
- Power: A100 draws 400W; industrial electricity ~$0.07/kWh
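The parity test follows directly from those unit prices; the workload size, power cost, and D&A figures in the example are hypothetical:

```python
CPU_PRICE = 0.096 / 19.2     # $/GFLOP-hr on c6i.xlarge (~$0.005)
GPU_PRICE = 32.77 / 2.5e6    # $/GFLOP-hr on p4d.24xlarge (~$0.000013)

def parity_savings(gflop_hours, power_cost=0.0, gpu_da=0.0):
    """Shadow P&L: baseline CPU cost minus accelerated cost (GPU compute
    plus power plus GPU depreciation) for the same workload, pre-tax."""
    baseline = gflop_hours * CPU_PRICE
    accel = gflop_hours * GPU_PRICE + power_cost + gpu_da
    return baseline - accel

ratio = CPU_PRICE / GPU_PRICE       # per-FLOP cost advantage of GPU
# Hypothetical workload: 1M GFLOP-hours, $100 power, $500 GPU D&A.
savings = parity_savings(1e6, power_cost=100.0, gpu_da=500.0)
```

At these list prices the per-FLOP ratio comes out near 380×, in line with the ~385× figure above; the savings stay positive even after power and depreciation, which is the feasibility condition of §6.3.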

6.2 Empirical Case Studies

Commonwealth Bank of Australia (Nvidia case study): - 640× performance boost (RAPIDS Accelerator) - 80% cost reduction vs. CPU

AT&T: - 3.3× faster data processing - 60% lower cost

IRS: - 20× speed improvements - 50% cost reduction

We extract disclosed metrics from footnotes and test consistency with Nvidia’s unit-economics claims.

6.3 Feasibility Check

If savings < (D&A + Power), ROI fails despite performance gains. Embed as contra-COGS item:

[ \widetilde{COGS}_t = COGS_t - \text{Savings}_t ]

Then recompute margins on conservation-adjusted financials (\hat{x}_t).


7. Power Constraints and Utilization

7.1 Capacity Requirement

Citi Estimate (2025): 55 GW of new power needed by 2030, at ~$50B per GW (~$2.8T total infrastructure)

AWS Disclosure (2024): Added 3.8 GW capacity in past 12 months

For a given AI capex trajectory, implied power demand:

[ \text{Power}^{\text{implied}}_t = \text{Capex}^{AI}_t \times \rho ]

Typical Ratio ((\rho)): ~0.5–1 MW per $1M in data center capex (varies by PUE and GPU density)
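A sketch of the capacity check, using the midpoint of the 0.5–1 MW per $1M heuristic (the 0.75 default is an assumption) and the PPA comparison of §7.3:

```python
def implied_power_gw(ai_capex_billions, mw_per_million=0.75):
    """Implied data-center power demand (GW) from AI capex ($B),
    via the ~0.5-1 MW per $1M capex heuristic (midpoint default)."""
    mw = ai_capex_billions * 1000.0 * mw_per_million
    return mw / 1000.0

def power_constrained(implied_gw, disclosed_ppa_gw):
    """Flag the power-constrained scenario of §7.3."""
    return implied_gw > disclosed_ppa_gw

demand = implied_power_gw(111.0)          # $111B AI capex, as in §11.3
flag = power_constrained(demand, 0.8)     # vs. 0.8 GW disclosed PPA
```

$111B of capex implies ~83 GW of demand against 0.8 GW of disclosed PPAs, reproducing the two-orders-of-magnitude gap flagged in the §11.3 output.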

7.2 Utilization Rate Estimation

Industry Reality (multiple sources):
- 85%+ of GPU capacity sits idle
- 10–15% typical utilization in hybrid research/production systems

Traditional metrics (nvidia-smi) overstate utilization by measuring kernel execution time, not computational throughput. Advanced metrics (SM efficiency) reveal true waste.

Financial Impact: At 15% utilization, effective cost per useful GPU-hour is 6.7× nominal cost.

Adjustment Factor: [ IC^{AI,\text{eff}}_t = IC^{AI}_t / u_t ], where (u_t) is the measured utilization rate.

Recalculate ROIC(^{AI}) using effective IC to reflect capital efficiency.
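The adjustment is a one-line rescaling; the figures below are illustrative:

```python
def effective_ic(ic_ai, utilization):
    """Scale invested capital by 1/utilization: idle GPUs inflate the
    capital consumed per useful compute-hour."""
    return ic_ai / utilization

def utilization_adjusted_roic(delta_nopat, ic_ai, utilization):
    """ROIC^AI recomputed on effective (utilization-adjusted) capital."""
    return delta_nopat / effective_ic(ic_ai, utilization)

multiplier = effective_ic(1.0, 0.15)   # cost multiplier at 15% utilization
```

At 15% utilization the multiplier is about 6.7×, matching the effective-cost figure above; raising utilization to 30% halves effective capital and doubles the adjusted ROIC.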

7.3 Power Purchase Agreement (PPA) Tracking

Extract from 10-K risk factor disclosures:
- Microsoft + Constellation: 20-year PPA for Three Mile Island reactor (PA)
- AWS + Talen Energy: 960 MW from Susquehanna nuclear plant (PA)

Constraint: If disclosed PPA capacity < implied power demand, flag power-constrained scenario.


8. Quality-Adjusted Cash Flows

8.1 Haircut Formula

For companies with high (Q_t) (noisy accounting):

[ \widetilde{FCFF}_t = FCFF_t \times (1 - \lambda Q_t) ]

where (\lambda) is a severity parameter (default: 0.5).

Rationale: Mechanically inconsistent reporters don’t get full credit in DCF valuation.

8.2 Application to AI Capex

If AI-attributed NOPAT estimates rely on management commentary (Method 2), apply quality adjustment:

[ \Delta NOPAT^{AI,\text{adj}}_t = \Delta NOPAT^{AI}_t \times (1 - \lambda Q_t) ]

This penalizes companies with loose accounting discipline when making aggressive AI ROI claims.
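Both haircuts share one formula; the sketch below applies it with the default severity (inputs illustrative):

```python
def quality_haircut(value, q_t, severity=0.5):
    """Discount FCFF or ΔNOPAT^AI by (1 - lambda * Q_t); lambda defaults
    to the framework's 0.5 severity parameter."""
    return value * (1.0 - severity * q_t)

# $100B of FCFF at Q_t = 10% keeps 95% under the default severity.
adj = quality_haircut(100.0, q_t=0.10)
```

A clean reporter ((Q_t = 0)) keeps full credit; a noisy one loses value in proportion to the fixes its filings required.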


9. Comps by Conservation Signature

9.1 Signature Vector

For each company, compute:

[ s = \begin{bmatrix} ROIC \\ \text{D/EBITDA} \\ \text{WC intensity} \\ \text{Cash conversion} \\ \tau \\ \text{Capex/Revenue} \\ Q_t \end{bmatrix} ]

from conservation-adjusted financials (\hat{x}_t).

9.2 Distance Metric

Choose comps by minimizing Mahalanobis distance:

[ d_{ij} = \sqrt{(s_i - s_j)^{\top} \Sigma^{-1} (s_i - s_j)} ]

where (\Sigma) is the covariance matrix of signatures across the universe.

Advantage: Accounting-consistent multiples for private valuations, ignoring buzzwords.
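Comp selection can be sketched with numpy alone; the 2-feature signatures below are a hypothetical toy universe (the real vector has the seven components of §9.1):

```python
import numpy as np

def mahalanobis_comps(signatures, target_idx, k=3):
    """Rank peers by Mahalanobis distance of conservation signatures to
    a target firm; returns the k nearest indices."""
    S = np.asarray(signatures, dtype=float)
    cov = np.cov(S, rowvar=False)
    cov_inv = np.linalg.pinv(cov)     # pinv guards a near-singular cov
    diffs = S - S[target_idx]
    d2 = np.einsum("ij,jk,ik->i", diffs, cov_inv, diffs)
    order = np.argsort(d2)
    return [int(i) for i in order if i != target_idx][:k]

# Toy (ROIC, D/EBITDA) signatures; firm 1 nearly duplicates firm 0.
sigs = [[0.12, 2.0], [0.12, 2.01], [0.30, 6.0], [0.05, 1.0], [0.22, 4.0]]
peers = mahalanobis_comps(sigs, target_idx=0, k=2)
```

Using (\Sigma^{-1}) rather than Euclidean distance downweights directions in which the universe naturally varies, so a near-duplicate signature always ranks first regardless of feature scales.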


10. Debt Capacity and Rating Impact

10.1 Coverage Ratio

As AI capex scales, interest coverage may decline:

[ \text{Coverage}_t = \frac{EBIT_t}{\text{InterestExpense}_t} ]

Rating Thresholds (S&P):
- AAA: Coverage > 20×
- AA: Coverage 10–20×
- A: Coverage 6–10×
- BBB: Coverage 3–6×

10.2 WACC Re-Pricing

If AI-driven capex pushes coverage below a threshold:
  1. Credit rating downgrades (e.g., AA → A)
  2. Cost of debt (r_d) increases (e.g., 4.0% → 4.5%)
  3. WACC rises: (WACC = w_e r_e + w_d r_d (1 - \tau))
  4. Hurdle rate increases, making AI capex less attractive

This feedback loop is embedded in the solver so that the circularity between capex, rating, and WACC is resolved consistently rather than ignored.
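The coverage-to-WACC feedback can be sketched with the §10.1 bands; the cost-of-debt schedule is a hypothetical illustration, not S&P data:

```python
def rating_from_coverage(coverage):
    """Map EBIT/interest coverage to an S&P-style band (per §10.1)."""
    if coverage > 20: return "AAA"
    if coverage > 10: return "AA"
    if coverage > 6:  return "A"
    if coverage > 3:  return "BBB"
    return "sub-investment-grade"

# Hypothetical rating -> cost-of-debt schedule (illustrative only):
COST_OF_DEBT = {"AAA": 0.038, "AA": 0.040, "A": 0.045, "BBB": 0.055,
                "sub-investment-grade": 0.080}

def wacc(w_e, r_e, w_d, rating, tau):
    """WACC = w_e r_e + w_d r_d (1 - tau), with r_d set by rating."""
    return w_e * r_e + w_d * COST_OF_DEBT[rating] * (1.0 - tau)

# Coverage falling from 12x to 8x: AA -> A, r_d 4.0% -> 4.5%.
wacc_after = wacc(0.8, 0.10, 0.2, rating_from_coverage(8.0), tau=0.25)
```

Each pass of the solver re-prices WACC from the updated coverage ratio and re-tests the AI capex against the higher hurdle until the loop converges.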


11. Practical Workflow

11.1 Input Data

From 10-K/10-Q:
- Total capex (quarterly, annual)
- Segment revenue and operating income
- PP&E roll-forwards (additions, D&A, disposals)
- Footnote disclosures for AI assets (GPUs, data centers, capitalized software)
- Risk factor mentions of power, AI competition

From external sources:
- AI revenue estimates (analyst reports, management commentary)
- GPU utilization benchmarks (industry surveys, academic papers)
- Power PPA details (press releases, utility filings)

11.2 Computation Steps

  1. Build AI Ledger: Extract IC(^{AI}_t) from roll-forwards + footnotes
  2. Estimate (NOPAT^{AI}): Apply Methods 1–3, report bounds
  3. Calculate ROIC(^{AI}): Compare to WACC
  4. Lag Model: Fit ARDL with (K=4) to (8) quarters
  5. Terminal Multiple: Check physics-consistency with growth assumptions
  6. Power Constraint: Validate capacity vs. implied demand
  7. Quality Adjust: Apply (Q_t) haircut if accounting noisy
  8. Generate Report: Bands, feasibility flags, sensitivity tables

11.3 Output Format

Company: MSFT (FY2024)
──────────────────────────────────────
AI Invested Capital:     $111.0B (ending), $98.0B (average)
Δ NOPAT (AI, estimated): $5.3B – $8.9B
ROIC^AI:                 5.4% – 9.1%
WACC:                    8.5%
Status:                  ⚠ AMBIGUOUS (lower bound < WACC < upper bound)
──────────────────────────────────────
Terminal Multiple (15.0×) implies:
  g = 6.5%, ROIC = 10%, WACC = 8.5%
  Physics-consistent multiple: 11.7×
  → EXIT MULTIPLE TOO HIGH BY 3.3×
──────────────────────────────────────
Power Constraint:
  Disclosed PPA:         0.8 GW
  Implied Demand:        83.3 GW
  → POWER-CONSTRAINED (104× gap)
──────────────────────────────────────
Recommendation: Expand PPA capacity or
                reduce terminal growth

12. Measurement Uncertainty

12.1 Data Availability

Available in 10-K/10-Q (high quality):
- Total capex by segment
- Segment revenue and operating income
- PP&E roll-forwards
- Risk factor disclosures (qualitative)

Requires Estimation (moderate to high uncertainty):
- AI-specific capex (% of total)
- AI-attributed NOPAT
- GPU utilization rates
- Distributed lag parameters

Not Disclosed (must infer):
- Model training/inference costs per unit
- PPA pricing terms ($/MWh)
- Detailed ROI by AI product line

12.2 Sensitivity Analysis

For each key input, compute elasticity:

[ \epsilon_i = \frac{\partial \ln(\text{output})}{\partial \ln(\theta_i)} ]

where (\theta_i) is the input and the output is, e.g., (ROIC^{AI}) or (NPV^{AI}).

High-Impact Parameters (from literature):
  1. AI-attributed NOPAT margin (±5 pp changes ROIC(^{AI}) by ±2–3 pp)
  2. Utilization rate (15% → 30% halves effective capex, doubles ROIC(^{AI}))
  3. Distributed lag (K) (4 qtrs → 8 qtrs delays payback, reducing NPV by 10–15%)

Generate tornado charts and scenario tables for transparency.
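Elasticities can be computed numerically by central differences on any model output (the quadratic check and the linear ROIC-vs-utilization example are illustrative):

```python
def elasticity(f, theta, rel_step=0.01):
    """Central-difference elasticity d ln f / d ln theta of output f
    with respect to input theta."""
    up, dn = theta * (1.0 + rel_step), theta * (1.0 - rel_step)
    return (f(up) - f(dn)) / (f(theta) * 2.0 * rel_step)

# Sanity check: f(x) = x^2 has elasticity 2 everywhere.
e_quad = elasticity(lambda x: x * x, 3.0)
# Utilization-adjusted ROIC is linear in utilization -> elasticity 1.
e_util = elasticity(lambda u: 7.0 * u / 98.0, 0.15)
```

Ranking inputs by |elasticity| × plausible input range gives the ordering for the tornado chart.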


13. Citations

Acemoglu, D. (2024). The Simple Macroeconomics of AI. MIT Economics. https://economics.mit.edu/sites/default/files/2024-05/The%20Simple%20Macroeconomics%20of%20AI.pdf

Bank of England. (2024, October). Financial Stability Report. https://www.bankofengland.co.uk/

Bloom Energy. (2025). 2025 Data Center Power Report. https://www.bloomenergy.com/

Bom, P. R., & Ligthart, J. E. (2014). What Have We Learned From Three Decades Of Research On The Productivity Of Public Capital? Journal of Economic Surveys, 28(5), 889-916.

Brynjolfsson, E., Li, D., & Raymond, L. R. (2023). Generative AI at Work (NBER Working Paper No. 31161). https://www.nber.org/papers/w31161

Citigroup Research. (2025, September). Big Tech AI Spending Forecast.

Goldman Sachs Research. (2024, June). Gen AI: Too Much Spend, Too Little Benefit? Top of Mind, Issue 129. https://www.goldmansachs.com/insights/top-of-mind/gen-ai-too-much-spend-too-little-benefit

McElheran, K., Yang, M., Brynjolfsson, E., & Kroff, Z. (2025). The Rise of Industrial AI in America: Microfoundations of the Productivity J-curve(s) (Census Working Paper CES-WP-25-27). https://www2.census.gov/library/working-papers/2025/adrm/ces/CES-WP-25-27.pdf

Morgan Stanley Research. (2024). GenAI Revenue Growth and Profitability. https://www.morganstanley.com/insights/articles/genai-revenue-growth-and-profitability


Framework Version: 1.0 (November 2025)
Module: src/ai_roi/
Test Coverage Target: 85%+
Publication Status: Draft for peer review