Measurement Challenges: Data Availability and Estimation Methods
Date: November 2025
Executive Summary
Measuring AI capex ROI faces fundamental data limitations: AI-attributed revenue and NOPAT are not separately disclosed in financial statements. This document catalogs what’s available, what requires estimation, and quantifies uncertainty.
Key Finding: 97% of enterprises struggle to demonstrate GenAI business value despite 65% adoption (McKinsey, 2024).
1. Data Availability Matrix
1.1 HIGH QUALITY (Disclosed in 10-K/10-Q)
| Metric | Source | Quality | Frequency |
|---|---|---|---|
| Total CapEx | Statement of Cash Flows | ★★★★★ Audited | Quarterly |
| Segment Revenue | Income Statement | ★★★★★ Audited | Quarterly |
| Segment Operating Income | Income Statement | ★★★★★ Audited | Quarterly |
| PP&E Roll-Forward | Footnotes | ★★★★★ Audited | Annual |
| Risk Factors | 10-K Item 1A | ★★★★☆ Qualitative | Annual |
Example (Microsoft FY2024):
- Total CapEx: $48.4B (disclosed, precise)
- Intelligent Cloud Revenue: $109.6B (disclosed, precise)
- Intelligent Cloud Operating Income: $52.1B (disclosed, precise)
1.2 MODERATE QUALITY (Voluntary Disclosures)
| Metric | Source | Quality | Frequency |
|---|---|---|---|
| AI Business Metrics | Earnings Calls, Press Releases | ★★★☆☆ Unaudited | Irregular |
| Capacity Additions | Management Commentary | ★★★☆☆ Unaudited | Irregular |
| Future Capex Guidance | Earnings Calls | ★★☆☆☆ Forward-looking | Quarterly |
Example (Microsoft Q2 FY2025 Earnings):
- AI business: “$13B annual run rate” (unaudited, not in 10-Q)
- Azure AI growth: “175% YoY” (qualitative, no segment breakdown)
Issue: Definitions vary across companies; no GAAP standardization.
1.3 REQUIRES ESTIMATION (Not Disclosed)
| Metric | Estimation Method | Uncertainty | Impact |
|---|---|---|---|
| AI-Specific CapEx | % of total based on qualitative disclosures | ★★★☆☆ HIGH | Direct input to ROIC^AI denominator |
| AI-Attributed NOPAT | Top-down trend break OR bottom-up unit economics | ★★★★☆ VERY HIGH | Direct input to ROIC^AI numerator |
| GPU Utilization Rates | Industry benchmarks (10-15% typical) | ★★★★☆ VERY HIGH | 6.7× cost multiplier at 15% |
| Distributed Lag Parameters | Infrastructure literature (K=4-8 quarters) | ★★★☆☆ HIGH | Affects NPV by 10-15% |
| Model Training Costs | FLOPs × cloud pricing | ★★★★☆ VERY HIGH | Rarely disclosed |
2. Estimation Methods and Uncertainty
2.1 AI-Attributed NOPAT (Numerator of ROIC^AI)
Method 1: Top-Down Trend Break
\[ NOPAT^{AI}_t = NOPAT_t - \widehat{NOPAT}^{\text{trend}}_t \]
where \(\widehat{NOPAT}^{\text{trend}}_t\) is the extrapolation of the pre-AI NOPAT trend.
Uncertainty Sources:
- Macro conditions (interest rates, demand shocks) confound the trend
- Other (non-AI) strategic initiatives also drive NOPAT
- Choice of pre-AI trend window (2020-2022 includes COVID distortions)
Typical Error Bounds: ±30% (comparing to disclosed AI metrics when available)
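A minimal sketch of the top-down method under stated assumptions (the linear-trend specification and function names are illustrative, not from the source framework):

```python
import numpy as np

def nopat_ai_top_down(nopat, pre_ai_quarters):
    """Estimate AI-attributed NOPAT as the deviation from a linear trend
    fitted on the pre-AI window (the first `pre_ai_quarters` observations).
    The trend-window choice is the dominant uncertainty source."""
    nopat = np.asarray(nopat, dtype=float)
    t = np.arange(len(nopat))
    # Fit the counterfactual trend on pre-AI quarters only
    slope, intercept = np.polyfit(t[:pre_ai_quarters], nopat[:pre_ai_quarters], 1)
    trend = intercept + slope * t
    # AI-attributed NOPAT = actual minus the extrapolated counterfactual
    return nopat - trend
```

Shifting the pre-AI window by a few quarters can move the estimate materially, which is why the ±30% bound applies on top of any point estimate.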
Method 2: Bottom-Up Unit Economics
\[ NOPAT^{AI}_t = Revenue^{AI}_t \times m^{\text{incr}} - D^{AI}_t \]
where \(m^{\text{incr}}\) is the incremental margin and \(D^{AI}_t\) the depreciation allocated to AI.
Uncertainty Sources:
- AI revenue is voluntary disclosure, inconsistently defined
- Incremental margin requires assumptions (20-40% range typical for cloud/SaaS)
- Depreciation allocation to AI unclear
Typical Error Bounds: ±40% (wide margin assumption range)
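A sketch of the bottom-up calculation; the margin and the depreciation allocation are assumptions, as the uncertainty list above notes:

```python
def nopat_ai_bottom_up(ai_revenue, incr_margin, ai_depreciation):
    """Method 2: disclosed AI revenue times an assumed after-tax incremental
    margin (20-40% typical for cloud/SaaS), less AI-allocated depreciation.
    All three inputs carry the uncertainty sources listed above."""
    return ai_revenue * incr_margin - ai_depreciation

# Illustration with Microsoft's disclosed "$13B annual run rate" and a
# hypothetical 30% margin / $2B depreciation allocation (both assumptions):
estimate = nopat_ai_bottom_up(13.0, 0.30, 2.0)  # ≈ $1.9B
```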
Method 3: Cost Savings (Accelerated Computing)
\[ NOPAT^{AI}_t = C^{\text{CPU baseline}}_t - C^{GPU}_t \]
Uncertainty Sources:
- Baseline CPU cost is a counterfactual (what would have been spent)
- Nvidia’s “90% savings” claims are workload-specific (ML training), not universal
- Allocation of shared infrastructure costs
Typical Error Bounds: ±50% (highly workload-dependent)
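The cost-savings method reduces to a difference of two estimates, neither of which is observed directly (the CPU baseline is a counterfactual); a sketch:

```python
def nopat_ai_cost_savings(cpu_baseline_cost, gpu_cost):
    """Method 3: AI benefit as avoided compute cost -- the counterfactual
    CPU spend minus the actual GPU spend. Workload-specific; the +/-50%
    bound reflects that both inputs must be estimated."""
    return cpu_baseline_cost - gpu_cost
```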
Consensus Approach: Report bounds across all three methods:
\[ ROIC^{AI}_t \in \left[ \min_{m} ROIC^{AI,(m)}_t,\; \max_{m} ROIC^{AI,(m)}_t \right] \]
where \(m\) indexes the three estimation methods.
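The consensus interval can be assembled mechanically from the method-level estimates; a sketch (the inputs reuse the Section 6.3 example figures):

```python
def roic_ai_bounds(nopat_estimates, ic_ai):
    """Report ROIC^AI as a [min, max] interval across the estimation
    methods rather than a single point estimate."""
    roics = [n / ic_ai for n in nopat_estimates]
    return min(roics), max(roics)

# $5.3B-$8.9B NOPAT range over $98.0B average AI invested capital:
low, high = roic_ai_bounds([5.3, 8.9], 98.0)  # low ≈ 5.4%, high ≈ 9.1%
```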
2.2 AI-Specific CapEx (Denominator of ROIC^AI)
Method 1: Footnote Extraction
If the PP&E schedule breaks out “AI/ML servers”:
- Uncertainty: ★★☆☆☆ LOW (when disclosed)
- Availability: ~30% of hyperscalers provide this level of detail
Method 2: Qualitative Disclosure Percentage
“Approximately 70% of capex for AI infrastructure” (MD&A):
- Uncertainty: ★★★☆☆ MODERATE (management estimate, unaudited)
- Availability: ~60% provide a qualitative percentage
Method 3: Proxy Ratio
\[ CapEx^{AI}_t = CapEx_t \times \frac{Revenue^{AI}_t}{Revenue_t} \]
- Uncertainty: ★★★★☆ HIGH (assumes capital intensity scales linearly with revenue)
- Availability: Always possible (last resort)
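The proxy-ratio fallback is mechanical; a sketch (the linear-scaling assumption is the main source of its HIGH uncertainty rating, and the revenue split below is hypothetical):

```python
def capex_ai_proxy(total_capex, ai_revenue, total_revenue):
    """Method 3 (proxy): allocate total capex by AI revenue share.
    Assumes capital intensity scales linearly with revenue."""
    return total_capex * (ai_revenue / total_revenue)

# e.g. $48.4B total capex with a hypothetical 25% AI revenue share:
capex_ai_proxy(48.4, 25.0, 100.0)  # ≈ $12.1B
```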
Method 4: Industry Benchmark
Median AI capex as % of total (peers): 66.5% (2024)
- Uncertainty: ★★★☆☆ MODERATE TO HIGH (company-specific strategies vary)
- Use Case: Private companies or non-disclosers
2.3 GPU Utilization Rates
Industry Evidence:
- 85%+ idle time (multiple surveys, academic studies)
- 10-15% typical utilization in hybrid systems
- 40% considered good
Measurement Challenge: Companies don’t disclose utilization; it must be inferred from:
- Academic research (NERSC Perlmutter study, METR analysis)
- Third-party monitoring tools (when available)
- Industry practitioner surveys
Financial Impact:
\[ IC^{AI,\text{eff}}_t = \frac{IC^{AI}_t}{u_t} \]
where \(u_t\) is the GPU utilization rate.
At 15% utilization: Effective capex is 6.7× nominal, crushing ROIC^AI.
Uncertainty: ★★★★☆ VERY HIGH (ranges 10-50% depending on workload mix)
Conservative Assumption: Use 20-30% utilization for production inference workloads; 10-15% for research/training.
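The utilization adjustment is a simple rescaling of the capital base; a sketch (utilization itself is an inferred input, per the measurement challenge above):

```python
def utilization_adjusted_roic(nopat_ai, ic_ai, utilization):
    """Scale the capital base by 1/utilization: at u = 0.15 the effective
    capex is ~6.7x nominal, and ROIC^AI shrinks by the same factor."""
    ic_effective = ic_ai / utilization
    return nopat_ai / ic_effective  # == (nopat_ai / ic_ai) * utilization

# Section 6.3 lower bound: $5.3B NOPAT on $98.0B at 20% utilization
utilization_adjusted_roic(5.3, 98.0, 0.20)  # ≈ 1.1% effective ROIC^AI
```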
3. Quality Score Incorporation
3.1 Accounting Quality Metric (Q_t)
From the base framework:
\[ Q_t = \frac{\left|\text{reconciliation plugs}_t\right|}{\text{total roll-forward activity}_t} \]
Interpretation: Higher Q_t → more “fixes” needed → lower reporting quality.
3.2 Haircut Application
For companies with noisy accounting (high Q_t):
\[ NOPAT^{AI,\text{adj}}_t = NOPAT^{AI}_t \times (1 - \lambda Q_t) \]
where \(\lambda\) is the severity parameter (default: 0.5).
Rationale: If a company can’t maintain clean roll-forwards for basic assets, their aggressive AI ROI claims deserve skepticism.
Example:
- Company A: Q_t = 0.02 (clean) → Haircut = 1% → Accept AI claims at 99% face value
- Company B: Q_t = 0.15 (sloppy) → Haircut = 7.5% → Discount AI claims by ~8%
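The haircut of Section 3.2 is one line of code; the severity default comes from the text, and the input figures below are illustrative:

```python
def quality_haircut(nopat_ai, q_t, severity=0.5):
    """Discount AI-attributed NOPAT in proportion to accounting noise Q_t.
    `severity` is the lambda parameter (default 0.5 per Section 3.2)."""
    return nopat_ai * (1 - severity * q_t)

# Company B from the example: Q_t = 0.15 gives a 7.5% haircut
quality_haircut(100.0, 0.15)  # ≈ 92.5
```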
4. Distributed Lag Uncertainty
4.1 Parameter Selection
Academic Literature (Bom & Ligthart, 2014; Kumar, 2024):
- Infrastructure shows 3-5 year lags to full productivity impact
- Peak coefficient at K=4 quarters (1-year lag)
AI-Specific Evidence (McElheran et al., 2025):
- J-curve: initial productivity decline of 1.33-60%
- Recovery timeline: 3-5 years
ARDL Model:
\[ NOPAT_t = \alpha + \sum_{k=0}^{K} \beta_k \, CapEx^{AI}_{t-k} + \varepsilon_t \]
Uncertainty: Choice of (K) (4 vs. 8 quarters) changes NPV by 10-15%.
Sensitivity Analysis: Report NPV for K=4, K=6, K=8 to bound outcomes.
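A sketch of the ARDL fit for a given K, so the K=4/6/8 sensitivity can be run side by side (plain OLS via numpy; function and variable names are illustrative, not the framework's actual module):

```python
import numpy as np

def fit_ardl(nopat, capex_ai, K):
    """OLS fit of NOPAT_t = alpha + sum_{k=0}^{K} beta_k * CapEx^AI_{t-k} + eps.
    Returns (alpha, betas); usable observations start at t = K."""
    nopat = np.asarray(nopat, dtype=float)
    capex_ai = np.asarray(capex_ai, dtype=float)
    n = len(nopat)
    # Design matrix: intercept column plus one lagged-capex column per k
    X = np.column_stack(
        [np.ones(n - K)] + [capex_ai[K - k : n - k] for k in range(K + 1)]
    )
    coef, *_ = np.linalg.lstsq(X, nopat[K:], rcond=None)
    return coef[0], coef[1:]

# Sensitivity run: refit for each candidate lag length and compare betas
# results = {K: fit_ardl(nopat_series, capex_series, K) for K in (4, 6, 8)}
```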
5. Measurement Controversies
5.1 The “$600B Question” (Sequoia Capital)
Argument: Infrastructure capex requires commensurate revenue to justify ROI.
- Current GenAI revenue: ~$100B (estimated)
- Infrastructure build-out implies: ~$600B revenue needed
- Gap: $500B
Rebuttal: Natural 5-10 year time lag between infrastructure and application revenue (distributed lag defense).
Resolution: Our framework models this explicitly via ARDL → NPV over 5-year horizon.
5.2 Acemoglu vs. Goldman Sachs (10× Disagreement)
- Acemoglu (MIT): 0.5% productivity increase over 10 years (only 4.6% of tasks exposed)
- Goldman Sachs / McKinsey: 15% productivity increase (30% of work hours automatable)
Source of Disagreement:
1. Task exposure rate: 4.6% vs. 30%
2. Productivity gain per task: modest vs. transformational
3. General equilibrium effects: deflationary pressures vs. margin expansion
Framework Approach: We measure firm-by-firm, quarter-by-quarter from filings, not macro forecasts. Aggregate ex-post.
5.3 Utilization Rate Controversy
Industry Claims: “capacity constrained”, “sold out of GPUs” (implies near-100% utilization)
Empirical Evidence: 85%+ idle time measured
Possible Explanations:
1. Peak demand ≠ average utilization (provisioned for spikes)
2. Development/testing workloads are highly bursty
3. Inefficient orchestration/scheduling
Financial Implication: If true utilization is 10-15%, the effective capital base is 6.7-10× nominal, so effective ROIC^AI is only a tenth to a seventh of the nominal figure.
6. Best Practices for Practitioners
6.1 Transparency Checklist
When reporting AI ROI analysis, disclose:
- Data Sources: Which metrics from 10-K vs. estimated?
- Estimation Methods: Which of the 3 NOPAT methods used?
- Assumptions: Utilization rate, incremental margin, distributed lag K
- Sensitivity: Tornado chart showing impact of key assumptions
- Bounds: Report ranges, not point estimates
- Quality Adjustment: Apply Q_t haircut for noisy reporters
6.2 Red Flags
Avoid these pitfalls:
1. Precision Illusion: Reporting ROIC^AI to 0.1% when underlying estimates carry ±30% error
2. Cherry-Picking: Using Method 2 (bottom-up) because it gives the highest number
3. Ignoring Utilization: Assuming 100% when the industry average is 10-15%
4. Zero Lag: Expecting immediate payback when infrastructure takes 3-5 years
5. Quality Blindness: Trusting aggressive AI claims from companies with high Q_t
6.3 Example Disclosure (Good Practice)
MSFT AI ROIC Estimate (FY2024)
───────────────────────────────────────────────────────────
AI Invested Capital: $111.0B ending ($98.0B average)
Δ NOPAT (AI, 3 methods): $5.3B – $8.9B (±40% uncertainty)
ROIC^AI (nominal): 5.4% – 9.1%
ROIC^AI (20% utilization): 1.1% – 1.8% ← Effective, after util. adjustment
WACC: 8.5%
Status: ⚠ AMBIGUOUS (lower 5.4% < WACC 8.5% < upper 9.1%)
🔴 BELOW HURDLE at industry utilization (10-15%)
Sensitivity:
+10pp utilization → +2.7pp ROIC^AI
+5pp margin → +1.9pp ROIC^AI
+2 qtrs lag (K) → -1.3pp NPV
Data Quality (Q_t): 0.03 (low, clean accounting)
Quality Haircut: Minimal (1.5%)
Conclusion: ROI highly dependent on achieving >50% utilization;
current industry benchmarks (10-15%) imply value destruction.
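The status line in the disclosure follows mechanically from the ROIC bounds and the WACC hurdle; a sketch (the classifier labels are illustrative):

```python
def hurdle_status(roic_low, roic_high, wacc):
    """Classify an ROIC^AI interval against the WACC hurdle: the verdict
    is only unambiguous when the whole interval sits on one side."""
    if roic_high < wacc:
        return "BELOW HURDLE"
    if roic_low > wacc:
        return "ABOVE HURDLE"
    return "AMBIGUOUS"

hurdle_status(0.054, 0.091, 0.085)  # → "AMBIGUOUS", as in the example above
hurdle_status(0.011, 0.018, 0.085)  # → "BELOW HURDLE" (utilization-adjusted)
```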
7. Future Improvements
7.1 Standardized Disclosure (Advocacy)
Proposal: SEC require separate AI capex and AI revenue segment disclosure for companies with >10% capex in AI.
Benefits: - Reduces estimation uncertainty from ±40% to ±10% - Enables better capital allocation across economy - Prevents AI hype from obscuring weak fundamentals
7.2 Third-Party Verification
Opportunity: Independent auditors verify: - GPU utilization rates (via DCGM telemetry) - Model training costs (FLOPs × time × power) - Incremental revenue attribution (A/B test results)
Precedent: Energy industry audits reserves; financial services audits risk models.
8. Citations
Brynjolfsson, E., Li, D., & Raymond, L. R. (2023). Generative AI at Work (NBER Working Paper No. 31161).
Cahn, D. (2024). AI’s $600B Question. Sequoia Capital Blog. https://www.sequoiacap.com/article/ais-600b-question/
McElheran, K., Yang, M., Brynjolfsson, E., & Kroff, Z. (2025). The Rise of Industrial AI in America. Census Working Paper CES-WP-25-27.
McKinsey & Company (2024). The State of AI in Early 2024. https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-2024
Module: src/ai_roi/quality_adjust.py
Recommended Practice: Always report bounds, never point estimates.