Aggregation Idempotence Theorem
Author: Nirvan Chitnis Date: 2025-11-08 Status: Formal Proof
Abstract
We prove that the tag-to-bucket projection operator $\Pi$ satisfies the idempotence property $\Pi^2 = \Pi$. This establishes that aggregating accounts multiple times produces the same result as aggregating once, formalizing the correctness of hierarchical account classifications (e.g., XBRL taxonomy mappings).
1. Statement of the Theorem
Theorem (Aggregation Idempotence): Let $\Pi: \mathbb{R}^n \to \mathbb{R}^m$ be a linear projection operator that maps detailed accounts to aggregated buckets (e.g., XBRL concepts to equity bridge source terms). Then:
Equivalently: - Applying the projection twice is equivalent to applying it once - The image of $\Pi$ is a direct summand of the domain - There exists a complementary projector $\Pi^\perp = I - \Pi$ such that $\mathbb{R}^n = \text{Im}(\Pi) \oplus \text{Im}(\Pi^\perp)$
2. Mathematical Framework
2.1 Projection Operators in Accounting
In the accounting conservation framework, we classify accounts into source term buckets:
| Detailed Accounts (XBRL tags) | Bucket (Source Term) |
|---|---|
us-gaap:NetIncomeLoss |
NI |
us-gaap:OtherComprehensiveIncomeLoss |
OCI |
us-gaap:ProceedsFromIssuanceOfCommonStock |
Issuance |
us-gaap:PaymentsForRepurchaseOfCommonStock |
Repurchases |
us-gaap:DividendsCommonStock |
Dividends |
The tag-to-bucket projection $\Pi$ is defined by:
where $\mathcal{I}_{\text{NI}}, \mathcal{I}_{\text{OCI}}, \ldots$ are index sets for each bucket.
2.2 Matrix Representation
We can represent $\Pi$ as an $m \times n$ matrix where entry $\Pi_{ij} \in \{0,1\}$ indicates whether account $j$ contributes to bucket $i$:
Key constraint: Each account belongs to at most one bucket (disjoint partition):
3. Proof of Idempotence
3.1 Direct Proof via Matrix Multiplication
Claim: $\Pi^2 = \Pi$ when $\Pi$ represents a disjoint partition.
Proof: We compute $\Pi^2$ entry-by-entry:
Case 1: If account $j$ is in bucket $i$ (i.e., $j \in \mathcal{I}_i$), then: - $\Pi_{ij} = 1$ (by definition) - For any intermediate index $k$, either $k \in \mathcal{I}_i$ (and $\Pi_{ik} = 1, \Pi_{kj} = 0$ since $k \neq j$) or $k \notin \mathcal{I}_i$ (and $\Pi_{ik} = 0$) - Only the term $k=j$ survives: $\Pi_{ij} \cdot \Pi_{jj} = 1 \cdot 1 = 1$ - Thus $(\Pi^2)_{ij} = 1 = \Pi_{ij}$ ✓
Case 2: If account $j$ is NOT in bucket $i$ (i.e., $j \notin \mathcal{I}_i$), then: - $\Pi_{ij} = 0$ (by definition) - For all $k$, either $\Pi_{ik} = 0$ or $\Pi_{kj} = 0$ (disjoint partition property) - Thus $(\Pi^2)_{ij} = 0 = \Pi_{ij}$ ✓
Since all entries match, $\Pi^2 = \Pi$. ∎
3.2 Proof via Kernel and Image
Alternative Proof: An operator $\Pi$ is a projection if and only if it satisfies the splitting property:
For tag-to-bucket projection: - $\text{Im}(\Pi) = \text{span}\{e_1, \ldots, e_m\}$ (the $m$ bucket vectors) - $\ker(\Pi) = \text{span}\{v : v_i = 0 \text{ for } i \in \bigcup_j \mathcal{I}_j\}$ (accounts not in any bucket)
Since these subspaces are complementary (by disjoint partition), $\Pi$ is a projection, hence $\Pi^2 = \Pi$. ∎
4. Consequences for Account Classification
4.1 Idempotent Classification
Corollary 1 (Re-classification stability): If accounts are already aggregated into buckets, re-applying the classification does not change the result:
Practical implication: Intermediate aggregation steps (e.g., XBRL → US-GAAP → equity bridge) can be applied in any order without loss of information within the image.
4.2 Complementary Null Space
Corollary 2 (Untagged accounts): The complementary projector $\Pi^\perp = I - \Pi$ identifies accounts not classified into any bucket:
Application: Diagnostics for missing XBRL tags or accounts outside the taxonomy.
Example: If
us-gaap:CommonStockDividendsPerShareDeclared is not mapped
to any bucket:
5. Implications for Equity Bridge Validation
5.1 Residual Decomposition
The equity bridge residual can be decomposed as:
where $x$ is the vector of all account changes. The residual lives in the kernel of $\Pi$:
Diagnostic: If $R \neq 0$ and $\Pi^\perp x \neq 0$, the residual may be due to untagged accounts.
5.2 Hierarchical Aggregation
For multi-level taxonomies (e.g., XBRL → US-GAAP → Equity Bridge), idempotence ensures associativity:
This allows modular validation: validate each level independently, then compose.
6. Non-Idempotent Operations (Counter-Examples)
Not all accounting aggregations are idempotent. Examples:
| Operation | Idempotent? | Reason |
|---|---|---|
| Tag-to-bucket projection | ✅ Yes | Disjoint partition |
| Consolidation (parent ← subsidiaries) | ❌ No | Intercompany eliminations change on repeated application |
| FX translation (functional → presentation) | ❌ No | Re-translating using new rates gives different result |
| Inflation restatement (IAS 29) | ❌ No | Restatement factor compounds |
| Balance sheet aggregation | ✅ Yes | Sum of subtotals = total |
| IFRS 16 lease present value | ❌ No | Discounting again uses different term |
Key insight: Linear aggregations (sums, weighted sums) are idempotent. Non-linear operations (consolidation, discounting, translation with different rates) are not.
7. Test Coverage
7.1 Property-Based Tests
We validate idempotence via Hypothesis property-based testing:
@given(st.lists(st.floats(allow_nan=False, allow_infinity=False), min_size=10))
def test_aggregation_idempotence(account_values):
"""Applying tag projection twice equals applying once."""
x = np.array(account_values)
projected_once = tag_to_bucket_projection(x)
projected_twice = tag_to_bucket_projection(projected_once)
assert np.allclose(projected_once, projected_twice, atol=1e-6)7.2 Null Space Tests
def test_untagged_accounts_in_null_space():
"""Accounts not in any bucket should be in kernel of Π."""
x = np.zeros(100)
x[99] = 100.0 # Account 99 not tagged
projected = tag_to_bucket_projection(x)
residual = x - inverse_projection(projected)
assert residual[99] == 100.0 # Untagged account preserved in null space7.3 Decomposition Tests
def test_kernel_image_decomposition():
"""Verify that x = Πx + Π⊥x (orthogonal decomposition)."""
x = random_account_vector(100)
image_part = tag_to_bucket_projection(x)
kernel_part = x - image_part
reconstructed = image_part + kernel_part
assert np.allclose(x, reconstructed, atol=1e-10)See:
tests/property_based/test_aggregation_idempotence.py
8. Practical Applications
8.1 XBRL Taxonomy Validation
Problem: XBRL filings may have multiple tags mapping to the same concept.
Solution: Idempotence ensures that
hierarchical roll-ups (e.g., NetIncomeLoss
as sum of revenue minus expenses) are consistent with direct
tags.
Check: $\Pi_{\text{rollup}}(x) = \Pi_{\text{direct}}(x)$ implies consistent tagging.
8.2 Diagnostic Reporting
Application: Generate a “classification completeness” report:
- Coverage = 1: All accounts classified
- Coverage < 1: Some accounts in null space (untagged or ambiguous)
8.3 Auditor Assurance
Assertion for auditors: “All equity movements are classified into mutually exclusive, collectively exhaustive source terms.”
Formal statement: $\Pi$ is a projection and $\ker(\Pi) = \{0\}$ (no untagged accounts).
9. Connection to Category Theory
In category-theoretic terms, a projection $\Pi$ is a retract (satisfies $\Pi \circ \pi = \text{id}_{\text{Im}(\Pi)}$). The idempotence property is equivalent to saying:
This connects to Ellerman’s (2014) work on accounting as a category with duality, where projections correspond to coproduct injections.
10. Conclusion
Main Result: The tag-to-bucket projection operator $\Pi$ is idempotent ($\Pi^2 = \Pi$), ensuring:
- ✅ Hierarchical aggregations are stable (re-aggregating gives same result)
- ✅ Untagged accounts are identifiable via null space $\Pi^\perp$
- ✅ Equity bridge residuals decompose into classified vs. unclassified components
Practical Impact: - Validates correctness of XBRL taxonomy mappings - Enables modular validation of multi-level aggregations - Provides mathematical foundation for completeness audits
Formal Verification: See
formal/lean/aggregation_idempotence.lean for mechanized
proof.
References
- Halmos, P. R. (1958). Finite-Dimensional Vector Spaces. Springer. — Projection operators and direct sums
- Ellerman, D. (2014). Accounting and Category Theory. arXiv:1412.4229 — Duality and retracts in double-entry
- XBRL International (2023). XBRL Dimensions Specification 1.0 — Hierarchical taxonomy design principles
Generated with Claude Code (claude-sonnet-4-5) Co-Authored-By: Claude noreply@anthropic.com