# CAPAS Claim Admissibility Calculus v1

CAPAS is not a scientific truth oracle. It is a deterministic admissibility
compiler for structured claim/evidence records. Its output answers a narrower
question:

> Given this supplied evidence packet and this registered evidence contract,
> what reuse boundary is licensed for this claim?

This document defines the formal layer implemented by
`admissibility_certificate` in CAPAS schema v3 outputs.

## 1. Problem Class

Neighboring systems usually operate at different levels:

- Dataset governance controls records, source summaries, licenses, and splits.
- Provenance systems such as W3C PROV and RO-Crate describe entities,
  activities, agents, and packaged research objects.
- Scientific fact-checking systems retrieve literature and estimate whether
  a natural-language claim is supported, contradicted, or uncertain.
- Model governance frameworks manage lifecycle risks and documentation duties.

CAPAS occupies the gap between those layers: claim-level admissibility. It
does not ask whether a statement is broadly true. It asks whether supplied,
typed evidence licenses a specific claim boundary for controlled reuse.

This is the operational failure mode CAPAS is designed to catch:

```text
paper sentence:      association observed under limited conditions
dataset sentence:    treatment causes improvement
downstream artifact: governed training example
```

The drift is not only a provenance problem. The source can be known and still
fail to license the rewritten claim boundary.

## 2. Evidence Contracts

Each supported claim type has a registered evidence contract:

```text
C_t = (required_fields, optional_fields, decision_rule, admissible_bounds)
```

For a payload `P = (claim, evidence, training_evidence)`, CAPAS first resolves
the claim type `t = claim.type`. If `t` has no registered contract, CAPAS does
not decide the scientific claim. It emits a HOLD with a proof obligation to
register the claim type and evidence contract.

This is why universal coverage is not achieved by relaxing validation. It is
achieved by making contract registration explicit. Unknown scientific domains
can be added without weakening existing domains.

## 3. Admissibility Lattice

CAPAS assigns each decision to a finite product lattice:

```text
A = Contract x Evidence x Boundary x Provenance x Defeaters
```

Each axis is totally ordered from weakest to strongest.

### Contract Axis

```text
unregistered < registered < schema_clean < contract_complete
```

This axis measures whether CAPAS has a known contract and whether the payload
conforms to it.

### Evidence Axis

```text
none < declared < complete < supports_bounded_claim < supports_claim_boundary
```

This axis measures whether the evidence packet is merely present, complete, or
strong enough for the specific claim boundary.

### Boundary Axis

```text
none < schema_blocked < claim_excluded < bounded_rewrite < claim_licensed < training_ready
```

This is the main reuse boundary. It maps to the operational decision:

- `schema_blocked`: payload cannot be evaluated yet.
- `claim_excluded`: evidence contradicts or fails the claim.
- `bounded_rewrite`: weaker claim boundary is licensed.
- `claim_licensed`: claim is accepted for controlled reuse.
- `training_ready`: claim is accepted and external provenance gates pass.

### Provenance Axis

```text
none < declared < source_backed < externally_reviewed < externally_verified
```

Browser decisions can preview this axis. Full external verification requires
CLI/API operations: URL hashing, review hash checks, witness registry
resolution, RO-Crate validation, and reviewer attestation verification.

### Defeaters Axis

```text
open_defeaters < schema_undercut < burden_gap < bounded_defeater < no_active_defeater
```

Defeaters encode what blocks reuse. A schema undercutter blocks all domain
reasoning. A burden gap means the evidence contract is incomplete. A bounded
defeater supports REWRITE rather than ACCEPT.

## 4. Meet Operations

CAPAS computes two conservative meet values:

```text
controlled_reuse_meet = min(contract, evidence, boundary, defeaters)
training_reuse_meet   = min(controlled_reuse_meet, provenance)
```

The meet prevents a strong result on one axis from hiding a weak result on
another. For example, an ACCEPT verdict with weak provenance can license a
claim boundary for controlled review, but it cannot become `training_ready`
until external provenance verification passes.

## 5. Dialectical Certificate

Each `admissibility_certificate` also includes a dialectical tuple:

```text
D = (thesis, licensed_thesis, warrant, supports, undercutters, rebuttals, obligations)
```

- `thesis`: the original claim text.
- `licensed_thesis`: the accepted or rewritten claim boundary.
- `warrant`: the deterministic reason reported by the rule engine.
- `supports`: supplied evidence fields used in the contract.
- `undercutters`: schema errors and missing fields that block reasoning.
- `rebuttals`: rule-level reasons that reject or weaken the claim.
- `obligations`: concrete proof obligations required for reuse.

This makes CAPAS more like a compiler than a scorer. It produces an artifact
that can be reviewed, queued, exported, and re-run.

## 6. Exception Queue

Batch mode aggregates per-claim certificates into:

```json
{
  "admissibility_summary": {
    "reuse_boundaries": {
      "claim_licensed": 12,
      "bounded_rewrite": 7,
      "schema_blocked": 3
    },
    "next_actions": {
      "verify_provenance_for_training": 12,
      "edit_and_resubmit": 7,
      "repair_schema": 3
    }
  },
  "exception_queue": []
}
```

The queue is not a generic task list. It is routed by formal boundary and next
action:

- `repair_schema`: the record has structural defects.
- `register_claim_type`: no contract exists for this claim type.
- `supply_evidence`: required evidence fields are missing.
- `edit_and_resubmit`: the weaker licensed claim should replace the original.
- `exclude_or_replace_evidence`: the claim is not licensed by current evidence.
- `verify_provenance_for_training`: the claim boundary is licensed, but
  training-data reuse still requires active provenance verification.

## 7. Universalization Strategy

CAPAS should not become universal by accepting arbitrary fields. That would
turn it into a weak form validator. It scales by adding explicit contracts:

```text
new claim family -> evidence contract -> validation rules -> boundary rule
```

For unsupported types, the correct result is HOLD with a contract-registration
obligation. This preserves auditability while giving the system a path to
expand into causality, systematic reviews, multimodal evidence, theorem
claims, biomedical claims, and organization-specific evidence contracts.

## 8. SOTA Positioning

The hard distinction is:

```text
Fact-checking asks: Is this statement true?
Provenance asks: Where did this artifact come from?
Dataset governance asks: What records are in this dataset?
CAPAS asks: Does this evidence packet license this claim boundary for reuse?
```

That last question is narrower than truth and stricter than metadata. It is
also where claim drift enters training data.

Relevant external anchors:

- NIST AI RMF defines governance-oriented risk management for AI systems:
  https://www.nist.gov/itl/ai-risk-management-framework
- EU AI Act Article 10 requires data governance practices for high-risk AI
  training, validation, and testing data:
  https://artificialintelligenceact.eu/article/10/
- W3C PROV defines provenance concepts for entities, activities, and agents:
  https://www.w3.org/TR/prov-overview/
- RO-Crate packages research objects with metadata and contextual entities:
  https://www.researchobject.org/ro-crate/
- Data Provenance Initiative audits dataset lineage, licensing, and
  attribution across large-scale AI datasets:
  https://www.dataprovenance.org/

CAPAS does not replace those layers. It consumes and complements them by
compiling claim-level admissibility certificates.

## 9. Scope Limits

CAPAS is intentionally strict:

- It does not infer hidden evidence.
- It does not certify broad scientific truth.
- It does not replace external expert review.
- It does not make unsupported claim types pass by default.
- It does not make `fine_tune_ready` true from browser-only self-reporting.

These limits are part of the product boundary, not missing features. The
system is useful because it is explicit about what it can and cannot license.