Aggregation Idempotence Theorem

Author: Nirvan Chitnis
Date: 2025-11-08
Status: Formal Proof


Abstract

We prove that the tag-to-bucket projection operator $\Pi$ satisfies the idempotence property $\Pi^2 = \Pi$. This establishes that aggregating accounts multiple times produces the same result as aggregating once, formalizing the correctness of hierarchical account classifications (e.g., XBRL taxonomy mappings).


1. Statement of the Theorem

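Theorem (Aggregation Idempotence): Let $\Pi$ be the linear operator that maps detailed accounts in $\mathbb{R}^n$ to $m$ aggregated buckets (e.g., XBRL concepts to equity bridge source terms), where the bucket space $\mathbb{R}^m$ is identified with an $m$-dimensional subspace of $\mathbb{R}^n$ so that $\Pi: \mathbb{R}^n \to \mathbb{R}^n$ can be composed with itself. Then: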

$$\Pi^2 = \Pi$$

Equivalently:
- Applying the projection twice is equivalent to applying it once
- The image of $\Pi$ is a direct summand of the domain
- There exists a complementary projector $\Pi^\perp = I - \Pi$ such that $\mathbb{R}^n = \text{Im}(\Pi) \oplus \text{Im}(\Pi^\perp)$


2. Mathematical Framework

2.1 Projection Operators in Accounting

In the accounting conservation framework, we classify accounts into source term buckets:

| Detailed Accounts (XBRL tags) | Bucket (Source Term) |
|---|---|
| us-gaap:NetIncomeLoss | NI |
| us-gaap:OtherComprehensiveIncomeLoss | OCI |
| us-gaap:ProceedsFromIssuanceOfCommonStock | Issuance |
| us-gaap:PaymentsForRepurchaseOfCommonStock | Repurchases |
| us-gaap:DividendsCommonStock | Dividends |

The tag-to-bucket projection $\Pi$ is defined by:

$$\Pi x = \begin{bmatrix} \sum_{i \in \mathcal{I}_{\text{NI}}} x_i \\ \sum_{i \in \mathcal{I}_{\text{OCI}}} x_i \\ \sum_{i \in \mathcal{I}_{\text{Iss}}} x_i \\ \vdots \end{bmatrix}$$

where $\mathcal{I}_{\text{NI}}, \mathcal{I}_{\text{OCI}}, \ldots$ are index sets for each bucket.

2.2 Matrix Representation

We can represent $\Pi$ as an $m \times n$ matrix where entry $\Pi_{ij} \in \{0,1\}$ indicates whether account $j$ contributes to bucket $i$:

$$\Pi = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}$$

Key constraint: Each account belongs to at most one bucket (disjoint partition):

$$\mathcal{I}_i \cap \mathcal{I}_j = \emptyset \quad \forall i \neq j$$
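As a minimal sketch of the definitions above, the index sets can be held as a tag-to-bucket dictionary, which satisfies the disjointness constraint by construction because each key has exactly one value. The names `TAG_TO_BUCKET`, `aggregate`, and `indicator_matrix`, and the sample amounts, are illustrative assumptions rather than the project's code:

```python
from collections import defaultdict

# Mapping taken from the table in Section 2.1. A dict assigns each tag to
# exactly one bucket, so the index sets are disjoint by construction.
TAG_TO_BUCKET = {
    "us-gaap:NetIncomeLoss": "NI",
    "us-gaap:OtherComprehensiveIncomeLoss": "OCI",
    "us-gaap:ProceedsFromIssuanceOfCommonStock": "Issuance",
    "us-gaap:PaymentsForRepurchaseOfCommonStock": "Repurchases",
    "us-gaap:DividendsCommonStock": "Dividends",
}


def aggregate(values, mapping=TAG_TO_BUCKET):
    """Sum account values into buckets; tags missing from the mapping are skipped."""
    totals = defaultdict(float)
    for tag, amount in values.items():
        bucket = mapping.get(tag)
        if bucket is not None:
            totals[bucket] += amount
    return dict(totals)


def indicator_matrix(tags, buckets, mapping=TAG_TO_BUCKET):
    """Return the m x n 0/1 matrix with entry [i][j] = 1 when tags[j] maps to buckets[i]."""
    return [[1 if mapping.get(tag) == bucket else 0 for tag in tags] for bucket in buckets]


# Illustrative amounts only (not taken from any filing).
sample = {
    "us-gaap:NetIncomeLoss": 120.0,
    "us-gaap:DividendsCommonStock": -25.0,
    "us-gaap:CommonStockDividendsPerShareDeclared": 0.5,  # unmapped tag
}
totals = aggregate(sample)
matrix = indicator_matrix(list(sample), ["NI", "OCI", "Issuance", "Repurchases", "Dividends"])
```

Each column of `matrix` contains at most one 1, which is the matrix form of the disjointness constraint; the column for the unmapped tag is all zeros.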


3. Proof of Idempotence

3.1 Direct Proof via Matrix Multiplication

Claim: $\Pi^2 = \Pi$ when $\Pi$ represents a disjoint partition.

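Proof: The product $\Pi \cdot \Pi$ is only defined for a square matrix, so we store each bucket total at a representative account: choose $r_i \in \mathcal{I}_i$ for every bucket $i$ (assumed non-empty) and regard $\Pi$ as the $n \times n$ matrix with $\Pi_{r_i j} = 1$ for $j \in \mathcal{I}_i$ and all other entries $0$. We compute $\Pi^2$ entry-by-entry:

$$(\Pi^2)_{aj} = \sum_{k=1}^n \Pi_{ak} \Pi_{kj}$$

If row $a$ is not a representative, then $\Pi_{ak} = 0$ for all $k$ and both sides vanish. It remains to check the rows $a = r_i$.

Case 1: If account $j$ is in bucket $i$ (i.e., $j \in \mathcal{I}_i$), then:
- $\Pi_{r_i j} = 1$ (by definition)
- The factor $\Pi_{kj}$ is nonzero only for $k = r_i$, the representative of the unique bucket containing $j$ (disjoint partition property)
- Only the term $k = r_i$ survives: $\Pi_{r_i r_i} \cdot \Pi_{r_i j} = 1 \cdot 1 = 1$, where $\Pi_{r_i r_i} = 1$ because $r_i \in \mathcal{I}_i$
- Thus $(\Pi^2)_{r_i j} = 1 = \Pi_{r_i j}$ ✓

Case 2: If account $j$ is NOT in bucket $i$ (i.e., $j \notin \mathcal{I}_i$), then:
- $\Pi_{r_i j} = 0$ (by definition)
- If $j$ is in no bucket, $\Pi_{kj} = 0$ for all $k$
- Otherwise $j \in \mathcal{I}_{i'}$ with $i' \neq i$, so $\Pi_{kj} \neq 0$ only for $k = r_{i'}$, and $\Pi_{r_i r_{i'}} = 0$ because $r_{i'} \notin \mathcal{I}_i$ (disjoint partition property)
- Thus $(\Pi^2)_{r_i j} = 0 = \Pi_{r_i j}$ ✓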

Since all entries match, $\Pi^2 = \Pi$. ∎
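The entry-by-entry argument can be checked numerically on a small case. The sketch below assumes that each bucket total is stored at the bucket's smallest index, so that the operator is a square $n \times n$ matrix; `square_projection` and `matmul` are hypothetical helpers written for this illustration. It also shows that overlapping index sets break idempotence:

```python
def square_projection(index_sets, n):
    """Build the n x n 0/1 matrix that stores each bucket total at the bucket's smallest index."""
    pi = [[0] * n for _ in range(n)]
    for members in index_sets:
        rep = min(members)  # representative account for this bucket
        for j in members:
            pi[rep][j] = 1
    return pi


def matmul(a, b):
    """Plain-Python product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


# Disjoint buckets over five accounts; account 3 is left untagged.
PI = square_projection([{0}, {1, 2}, {4}], n=5)
PI_SQUARED = matmul(PI, PI)

# Overlapping buckets (account 1 is in both) violate the key constraint.
BAD = square_projection([{0, 1}, {1, 2}], n=3)
BAD_SQUARED = matmul(BAD, BAD)
```

With the disjoint sets, `PI_SQUARED` equals `PI`; with the overlapping sets, `BAD_SQUARED` differs from `BAD` because account 1 is counted in two buckets.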

3.2 Proof via Kernel and Image

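Alternative Proof: A linear operator $\Pi$ on $\mathbb{R}^n$ is a projection if and only if it restricts to the identity on its image, in which case the space splits as:

$$\mathbb{R}^n = \ker(\Pi) \oplus \text{Im}(\Pi)$$

For tag-to-bucket projection, with each bucket total stored at a representative account $r_i \in \mathcal{I}_i$ so that $\Pi$ acts on $\mathbb{R}^n$:
- $\text{Im}(\Pi) = \text{span}\{e_{r_1}, \ldots, e_{r_m}\}$ (the $m$ bucket vectors)
- $\ker(\Pi) = \{v : \sum_{j \in \mathcal{I}_i} v_j = 0 \text{ for every bucket } i\}$ (vectors whose entries cancel within each bucket, including every vector supported on accounts not in any bucket)

Since $\Pi e_{r_i} = e_{r_i}$ for every $i$ (bucket $\mathcal{I}_i$ contains $r_i$ and, by disjoint partition, no other bucket does), $\Pi$ is the identity on its image, hence $\Pi^2 = \Pi$. ∎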
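The same conclusion can be written as a short factorization. The symbols $A$ (the $m \times n$ indicator matrix of Section 2.2) and $E$ (an $n \times m$ embedding that places the total of bucket $i$ at a representative account $r_i \in \mathcal{I}_i$) are notation introduced here for illustration, assuming every bucket is non-empty:

$$A E = I_m, \qquad \Pi = E A \quad \Longrightarrow \quad \Pi^2 = E (A E) A = E A = \Pi$$

Here $A E = I_m$ because $(AE)_{i i'} = A_{i r_{i'}}$ equals $1$ exactly when $r_{i'} \in \mathcal{I}_i$, which by disjointness happens only for $i' = i$.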


4. Consequences for Account Classification

4.1 Idempotent Classification

Corollary 1 (Re-classification stability): If accounts are already aggregated into buckets, re-applying the classification does not change the result:

$$\Pi(\Pi x) = \Pi x$$

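Practical implication: In a multi-step pipeline (e.g., XBRL → US-GAAP → equity bridge), any aggregation step can be re-run on its own output, for example after a retry, without changing the bucket totals. Idempotence alone does not make different steps interchangeable in order.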

4.2 Complementary Null Space

Corollary 2 (Untagged accounts): The complementary projector $\Pi^\perp = I - \Pi$ identifies accounts not classified into any bucket:

$$\Pi^\perp x = x - \Pi x$$

Application: Diagnostics for missing XBRL tags or accounts outside the taxonomy.

Example: If us-gaap:CommonStockDividendsPerShareDeclared (account index $j$) is not mapped to any bucket, then $(\Pi x)_j = 0$ and its value survives in the complement:

$$(\Pi^\perp x)_j = x_j \neq 0 \implies \text{Unclassified accounts exist}$$
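A diagnostic of this kind can be sketched by listing the accounts that carry a non-zero value but have no bucket. The function name `untagged_accounts`, the reduced mapping, and the amounts are assumptions made for the example:

```python
def untagged_accounts(values, mapping):
    """Return accounts with a non-zero value that the mapping leaves unclassified."""
    return {
        tag: amount
        for tag, amount in values.items()
        if tag not in mapping and amount != 0
    }


# Reduced mapping and illustrative amounts (not from any filing).
mapping = {
    "us-gaap:NetIncomeLoss": "NI",
    "us-gaap:DividendsCommonStock": "Dividends",
}
values = {
    "us-gaap:NetIncomeLoss": 120.0,
    "us-gaap:DividendsCommonStock": -25.0,
    "us-gaap:CommonStockDividendsPerShareDeclared": 0.5,
}
report = untagged_accounts(values, mapping)
```

An empty `report` means no non-zero value sits in an untagged account.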


5. Implications for Equity Bridge Validation

5.1 Residual Decomposition

The equity bridge residual can be decomposed as:

$$R = \Delta E_{\text{actual}} - \Delta E_{\text{expected}} = \Delta E_{\text{actual}} - \mathbf{1}^\top \Pi x$$

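where $x$ is the vector of all account changes and $\mathbf{1}^\top \Pi x$ is the sum of the bucket totals. Because $R$ is a scalar, the validation condition is on its value:

$$R = 0 \iff \Delta E_{\text{actual}} = \mathbf{1}^\top \Pi x$$

which holds when all equity movements are tagged and correctly classified.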

Diagnostic: If $R \neq 0$ and $\Pi^\perp x \neq 0$, the residual may be due to untagged accounts.
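The residual can be computed directly from the bucket totals. This sketch assumes that account changes are signed so that outflows (repurchases, dividends) are negative; the function name, the tag `example:UnmappedEquityMovement`, and all amounts are hypothetical:

```python
def bridge_residual(delta_equity_actual, changes, mapping):
    """Return R: the actual change in equity minus the sum of all classified changes."""
    expected = sum(amount for tag, amount in changes.items() if tag in mapping)
    return delta_equity_actual - expected


mapping = {
    "us-gaap:NetIncomeLoss": "NI",
    "us-gaap:OtherComprehensiveIncomeLoss": "OCI",
    "us-gaap:ProceedsFromIssuanceOfCommonStock": "Issuance",
    "us-gaap:PaymentsForRepurchaseOfCommonStock": "Repurchases",
    "us-gaap:DividendsCommonStock": "Dividends",
}
# Illustrative signed changes; outflows are negative (assumed convention).
changes = {
    "us-gaap:NetIncomeLoss": 120.0,
    "us-gaap:OtherComprehensiveIncomeLoss": -5.0,
    "us-gaap:ProceedsFromIssuanceOfCommonStock": 30.0,
    "us-gaap:PaymentsForRepurchaseOfCommonStock": -40.0,
    "us-gaap:DividendsCommonStock": -25.0,
}
clean_residual = bridge_residual(80.0, changes, mapping)

# Add a movement that no bucket claims; the actual change in equity includes it.
changes_with_gap = dict(changes, **{"example:UnmappedEquityMovement": 7.0})
gap_residual = bridge_residual(87.0, changes_with_gap, mapping)
```

In the second case the residual equals the value of the unmapped movement, which is the situation the diagnostic is meant to flag.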

5.2 Hierarchical Aggregation

For multi-level taxonomies (e.g., XBRL → US-GAAP → Equity Bridge), the levels compose as linear maps, and composition is associative, so the grouping of levels does not matter:

$$\Pi_{\text{bridge}} \circ \Pi_{\text{GAAP}} \circ \Pi_{\text{XBRL}} = \Pi_{\text{bridge}} \circ (\Pi_{\text{GAAP}} \circ \Pi_{\text{XBRL}})$$

This allows modular validation: validate each level independently, then compose.
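Composition of levels can be sketched with dictionaries, where each level maps a label to its parent label. The intermediate labels (`StockIssued`, `OwnerTransactions`, and so on) and the helper `compose` are hypothetical and only illustrate the grouping property:

```python
def compose(outer, inner):
    """Return the mapping that applies `inner` first and then `outer`."""
    return {tag: outer[mid] for tag, mid in inner.items() if mid in outer}


# Three hypothetical levels of a taxonomy.
level1 = {
    "us-gaap:ProceedsFromIssuanceOfCommonStock": "StockIssued",
    "us-gaap:PaymentsForRepurchaseOfCommonStock": "StockRepurchased",
    "us-gaap:DividendsCommonStock": "DividendsPaid",
}
level2 = {
    "StockIssued": "Issuance",
    "StockRepurchased": "Repurchases",
    "DividendsPaid": "Dividends",
}
level3 = {
    "Issuance": "OwnerTransactions",
    "Repurchases": "OwnerTransactions",
    "Dividends": "OwnerTransactions",
}

# Group the levels in both possible ways.
left = compose(level3, compose(level2, level1))
right = compose(compose(level3, level2), level1)
```

Both groupings give the same end-to-end mapping, and because the result is again a dictionary, each detailed tag still lands in at most one top-level bucket.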


6. Non-Idempotent Operations (Counter-Examples)

Not all accounting aggregations are idempotent. Examples:

| Operation | Idempotent? | Reason |
|---|---|---|
| Tag-to-bucket projection | ✅ Yes | Disjoint partition |
| Consolidation (parent ← subsidiaries) | ❌ No | Intercompany eliminations change on repeated application |
| FX translation (functional → presentation) | ❌ No | Re-translating using new rates gives different result |
| Inflation restatement (IAS 29) | ❌ No | Restatement factor compounds |
| Balance sheet aggregation | ✅ Yes | Sum of subtotals = total |
| IFRS 16 lease present value | ❌ No | Discounting again uses different term |

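Key insight: Aggregations that sum over a fixed disjoint partition are idempotent. Operations that re-measure values (eliminations, translation at a rate, restatement factors, discounting) are not, even when they are linear for fixed parameters, because each application applies the adjustment again.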
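The contrast can be illustrated with two small functions. This is a sketch under assumptions: the mapping sends bucket labels to themselves so that the classification can be re-applied to bucket-level data, and the rate and amounts are made up:

```python
def aggregate(amounts, mapping):
    """Sum amounts into buckets according to the mapping; unmapped keys are skipped."""
    totals = {}
    for key, value in amounts.items():
        bucket = mapping.get(key)
        if bucket is not None:
            totals[bucket] = totals.get(bucket, 0.0) + value
    return totals


def translate(amounts, rate):
    """Convert every amount at a single exchange rate."""
    return {key: value * rate for key, value in amounts.items()}


# Bucket labels map to themselves so already-aggregated data can be re-classified.
mapping = {
    "us-gaap:NetIncomeLoss": "NI",
    "us-gaap:DividendsCommonStock": "Dividends",
    "NI": "NI",
    "Dividends": "Dividends",
}
accounts = {"us-gaap:NetIncomeLoss": 100.0, "us-gaap:DividendsCommonStock": -20.0}

aggregated_once = aggregate(accounts, mapping)
aggregated_twice = aggregate(aggregated_once, mapping)

translated_once = translate(accounts, 1.25)
translated_twice = translate(translated_once, 1.25)
```

Aggregating a second time leaves the totals unchanged, while translating a second time applies the rate again and changes every amount.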


7. Test Coverage

7.1 Property-Based Tests

We validate idempotence via Hypothesis property-based testing:

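import numpy as np
from hypothesis import given, strategies as st

# tag_to_bucket_projection is the project's function under test; its output
# must be a valid input to itself for the second application.

@given(st.lists(st.floats(allow_nan=False, allow_infinity=False), min_size=10))
def test_aggregation_idempotence(account_values):
    """Applying tag projection twice equals applying once."""
    x = np.array(account_values)
    projected_once = tag_to_bucket_projection(x)
    projected_twice = tag_to_bucket_projection(projected_once)
    assert np.allclose(projected_once, projected_twice, atol=1e-6)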

7.2 Null Space Tests

def test_untagged_accounts_in_null_space():
    """Accounts not in any bucket should be in kernel of Π."""
    x = np.zeros(100)
    x[99] = 100.0  # Account 99 not tagged
    projected = tag_to_bucket_projection(x)
    residual = x - inverse_projection(projected)
    assert residual[99] == 100.0  # Untagged account preserved in null space

7.3 Decomposition Tests

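def test_kernel_image_decomposition():
    """Verify that x = Πx + Π⊥x with Π⊥x in ker(Π)."""
    x = random_account_vector(100)
    image_part = tag_to_bucket_projection(x)
    kernel_part = x - image_part
    # The sum equals x by construction; the substantive check is that
    # Π annihilates the remainder, i.e. the remainder lies in ker(Π).
    assert np.allclose(image_part + kernel_part, x, atol=1e-10)
    assert np.allclose(tag_to_bucket_projection(kernel_part), 0.0, atol=1e-6)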

See: tests/property_based/test_aggregation_idempotence.py


8. Practical Applications

8.1 XBRL Taxonomy Validation

Problem: XBRL filings may have multiple tags mapping to the same concept.

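Solution: Compare the hierarchical roll-up (e.g., NetIncomeLoss as sum of revenue minus expenses) with the direct tag at bucket level. Because each projection is idempotent, the comparison gives the same answer whether or not the inputs were already aggregated.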

Check: $\Pi_{\text{rollup}}(x) = \Pi_{\text{direct}}(x)$ implies consistent tagging.

8.2 Diagnostic Reporting

Application: Generate a “classification completeness” report:

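$$\text{Coverage} = \frac{\sum_{j \in \bigcup_i \mathcal{I}_i} |x_j|}{\sum_{j=1}^n |x_j|} \in [0,1]$$

The ratio is taken over account-level absolute values because $\|\Pi x\| / \|x\|$ can exceed 1 for a summing operator (two accounts of 1 in one bucket give $2/\sqrt{2}$ in the Euclidean norm).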
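A coverage ratio of this kind can be sketched as the share of absolute account value that falls in tagged accounts, which is one choice of norm that keeps the result in $[0, 1]$. The function name `coverage`, the convention for an all-zero vector, and the amounts are assumptions:

```python
def coverage(values, mapping):
    """Return the share of absolute account value that sits in tagged accounts."""
    total = sum(abs(amount) for amount in values.values())
    if total == 0:
        return 1.0  # assumed convention: nothing to classify counts as fully covered
    tagged = sum(abs(amount) for tag, amount in values.items() if tag in mapping)
    return tagged / total


mapping = {
    "us-gaap:NetIncomeLoss": "NI",
    "us-gaap:DividendsCommonStock": "Dividends",
}
# Illustrative amounts; the last tag is hypothetical and unmapped.
values = {
    "us-gaap:NetIncomeLoss": 75.0,
    "us-gaap:DividendsCommonStock": -25.0,
    "example:UnmappedEquityMovement": 25.0,
}
ratio = coverage(values, mapping)
```

Using absolute values prevents offsetting positive and negative movements from hiding an unmapped account.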

8.3 Auditor Assurance

Assertion for auditors: “All equity movements are classified into mutually exclusive, collectively exhaustive source terms.”

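Formal statement: The index sets $\mathcal{I}_1, \ldots, \mathcal{I}_m$ are pairwise disjoint and cover every account, $\bigcup_i \mathcal{I}_i = \{1, \ldots, n\}$ (no untagged accounts).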
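The "mutually exclusive, collectively exhaustive" assertion can be checked mechanically on the index sets. The helper `is_mece` and the zero-based indexing are assumptions of this sketch:

```python
def is_mece(index_sets, n):
    """Return True when the index sets are pairwise disjoint and cover accounts 0..n-1."""
    seen = set()
    for members in index_sets:
        if seen & members:  # an account already claimed by an earlier bucket
            return False
        seen |= members
    return seen == set(range(n))


complete = is_mece([{0}, {1, 2}, {3, 4}], n=5)
overlapping = is_mece([{0, 1}, {1, 2}, {3, 4}], n=5)
with_gap = is_mece([{0}, {1, 2}, {4}], n=5)
```

The first call passes; the second fails on mutual exclusivity (account 1 appears twice) and the third on exhaustiveness (account 3 is untagged).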


9. Connection to Category Theory

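In category-theoretic terms, the idempotent $\Pi$ splits: it factors through its image as $\Pi = \iota \circ \rho$, where $\rho: \mathbb{R}^n \to \text{Im}(\Pi)$ is the aggregation and $\iota: \text{Im}(\Pi) \to \mathbb{R}^n$ is the inclusion, making the bucket space a retract of the account space. The idempotence property is equivalent to the existence of such a splitting:

$$\Pi = \iota \circ \rho, \qquad \rho \circ \iota = \text{id}_{\text{Im}(\Pi)}$$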

This connects to Ellerman’s (2014) work on accounting as a category with duality, where projections correspond to coproduct injections.


10. Conclusion

Main Result: The tag-to-bucket projection operator $\Pi$ is idempotent ($\Pi^2 = \Pi$), so aggregating accounts multiple times produces the same result as aggregating once.

Practical Impact:
- Validates correctness of XBRL taxonomy mappings
- Enables modular validation of multi-level aggregations
- Provides mathematical foundation for completeness audits

Formal Verification: See formal/lean/aggregation_idempotence.lean for mechanized proof.


References

  1. Halmos, P. R. (1958). Finite-Dimensional Vector Spaces. Springer. — Projection operators and direct sums
  2. Ellerman, D. (2014). Accounting and Category Theory. arXiv:1412.4229 — Duality and retracts in double-entry
  3. XBRL International (2023). XBRL Dimensions Specification 1.0 — Hierarchical taxonomy design principles
