# Goal B3 Claim-Control Note

Date: 2026-06-03

This note controls the paper-facing interpretation of the latest Goal B3
package after the final broad rerun, final DeepMind recognized-source rerun,
final DeepMind causal-interchange attempt, Qwen diagnostic smoke, and
ml-intern latest-progress review.

## Supported Claim

Goal B3 currently supports a narrow activation-derived tool-argument result on
`unsloth/Meta-Llama-3.1-8B`:

```text
opaque prompt token IDs + captured activations
  -> activation-derived op/safe gates
  -> activation-derived operand tuple
  -> Python calculator after decoded (op, a, b)
  -> routed exact-answer text/scoring
```
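The last two stages of the route above can be sketched as follows. This is a minimal illustration, not the project's code: the function and table names are hypothetical, the op/safe gates are reduced to a single boolean, and `div_remainder` is assumed to mean the remainder of `a` divided by `b`.

```python
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple; defined as 0 when either operand is 0."""
    return abs(a * b) // gcd(a, b) if a and b else 0


# Hypothetical op table covering the four ops named in this note.
# Assumption: div_remainder means "remainder of a divided by b".
CALCULATOR = {
    "mul": lambda a, b: a * b,
    "div_remainder": lambda a, b: a % b,
    "gcd": gcd,
    "lcm": lcm,
}


def route(op: str, a: int, b: int, safe: bool):
    """Return exact-answer text, or None when the route must not fire.

    `op`, `a`, `b`, and `safe` stand in for the activation-derived
    readouts; this sketch does not show how they are decoded.
    """
    if not safe or op not in CALCULATOR:
        return None  # gate closed: no calculator call, no routed answer
    if op == "div_remainder" and b == 0:
        return None  # undefined remainder: refuse rather than raise
    return str(CALCULATOR[op](a, b))
```

The point of the sketch is that the arithmetic is done by Python after tuple decoding; the model contributes only the decoded `(op, a, b)` and the gate decision.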

Within that scope, the strongest current positive evidence is:

- Broad frozen arithmetic/adversarial four-op aggregate:
  `docs/goalB3_final_broad_frozen_arithmetic_adversarial_cross_seed.json`
  reports `GOAL_B3_BENCHMARK_CROSS_SEED_PASS` over seeds `801/811/821`,
  `n_locked_total=11736`, and max false-fire `0.0`.
- DeepMind recognized-source three-op aggregate:
  `docs/goalB3_final_deepmind_interpolate_recognized_cross_seed.json`
  reports `GOAL_B3_BENCHMARK_CROSS_SEED_PASS` over seeds `911/921/931`,
  `n_locked_total=3822`, and max false-fire `0.0`.
- DeepMind final provenance:
  `docs/goalB3_final_deepmind_provenance.json` reports
  `PROVENANCE_AUDIT_PASS` on `3822` records.
- Broad final provenance:
  `docs/goalB3_final_broad_frozen_arithmetic_adversarial_provenance.json`
  reports `PROVENANCE_AUDIT_PASS` on `11736` records.
- Full replay-only provenance:
  `docs/goalB3_final_replay_provenance_audit_full.json` reports
  `REPLAY_PROVENANCE_FULL_PASS` on `15558` replay bundles, with no forbidden
  field hits and no replay mismatches.
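As a quick cross-check of the counts quoted above, the replay bundle count equals the sum of the broad and DeepMind record counts. The sketch below only checks that arithmetic; whether the replay audit is defined as exactly the union of those two record sets is an assumption, and the variable names are illustrative.

```python
# Counts copied from the bullets above.
broad_records = 11736      # broad frozen arithmetic/adversarial provenance
deepmind_records = 3822    # DeepMind final provenance
replay_bundles = 15558     # full replay-only provenance audit

total = broad_records + deepmind_records
# Assumption: the replay audit spans both record sets and nothing else.
counts_consistent = (total == replay_bundles)
print(total, counts_consistent)
```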

The supported wording is:

> Activation-derived op and operand readouts can supply calculator arguments
> for an opaque no-parser Llama arithmetic route on the current frozen Goal B3
> benchmark slices.

## Not Supported

The current artifacts do not support these stronger claims:

- Native arithmetic repair: the model's own next-token arithmetic computation
  is not shown to become exact.
- Residual JIT replacement: the route does not replace an internal arithmetic
  mechanism and resume the model from a patched residual state.
- Model-smarter deployment: the calculator answer is generated by Python after
  activation-derived tuple decoding, not by the model itself.
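- Qwen transfer: `docs/goalB3_qwen_operand_diagnostics.json` uses
  `backend: synthetic`; it is a smoke/candidate-freeze diagnostic, not a real
  Qwen acceptance. The real diagnostic,
  `docs/goalB3_qwen_real_operand_localization.json`, returned
  `QWEN_OPERAND_ROUTE_FAIL`.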
- Powered final DeepMind causal validation: the executed final DeepMind causal
  interchange runs have perfect donor-follow rates where measured, but the
  cross-seed powered summary
  `docs/goalB3_final_deepmind_causal_interchange_cross_seed_powered.json`
  reports `CAUSAL_UNDERPOWERED`.

## Causal-Gate Status

The frozen causal gate requires at least `50` total causal pairs per op plus
donor-follow/control thresholds. The current powered aggregate is:

| op | status | total pairs | note |
|---|---|---:|---|
| `div_remainder` | `CAUSAL_GATE_PASS` | 75 | enough aggregate pairs |
| `gcd` | `CAUSAL_UNDERPOWERED` | 18 | rates pass, pair-count gate fails |
| `lcm` | `CAUSAL_UNDERPOWERED` | 20 | rates pass, pair-count gate fails |
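The pair-count half of the gate can be sketched from the table as follows. This is a hedged illustration: the function and variable names are hypothetical, the donor-follow/control thresholds are taken as already passing (as the table notes state for `gcd` and `lcm`), and the rule that the aggregate fails when any op is underpowered is an assumption consistent with the reported cross-seed verdict.

```python
MIN_PAIRS_PER_OP = 50  # frozen pair-count threshold stated above

# Total causal pairs per op, copied from the table.
pairs = {"div_remainder": 75, "gcd": 18, "lcm": 20}


def pair_count_verdict(n_pairs_total: int) -> str:
    """Verdict from the pair count alone, assuming the rate thresholds pass."""
    if n_pairs_total >= MIN_PAIRS_PER_OP:
        return "CAUSAL_GATE_PASS"
    return "CAUSAL_UNDERPOWERED"


verdicts = {op: pair_count_verdict(n) for op, n in pairs.items()}

# Additional pairs each op would need to reach the threshold.
shortfall = {op: max(0, MIN_PAIRS_PER_OP - n) for op, n in pairs.items()}

# Assumption: the aggregate clears the gate only if every op does.
all_powered = all(v == "CAUSAL_GATE_PASS" for v in verdicts.values())

for op in pairs:
    print(f"{op}: {verdicts[op]} (short by {shortfall[op]})")
```

Under these assumptions `gcd` is short by 32 pairs and `lcm` by 30, so the aggregate does not clear the gate.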

Paper-facing language must therefore say:

> Final DeepMind causal interchange is supportive but underpowered for
> `gcd`/`lcm`; it does not clear the frozen powered causal gate.

It must not say:

> All final Goal B3 gates passed.

## DeepMind Multiplication Status

DeepMind recognized-source multiplication is not powered under the current
two-integer frozen route. The source audit
`docs/goalB3_deepmind_source_audit.json` found only `69` accepted interpolate
`arithmetic__mul.txt` examples, with a locked 40% estimate of `27`. Therefore:

- DeepMind recognized-source result: `gcd`, `div_remainder`, `lcm`.
- Broad frozen arithmetic/adversarial result: `mul`, `div_remainder`, `lcm`,
  `gcd`.

Do not describe the current package as a powered four-op DeepMind result.

## Remaining Work Before Stronger Claims

Required artifacts before stronger paper claims:

- a powered causal-interchange aggregate where each claimed op has
  `n_pairs_total >= 50`, or an explicit source-underpowered verdict (still
  open);
- a real Qwen operand-localization diagnostic, not the synthetic backend
  (since run; see the verdict below);
- an independent hard-negative safety suite with denominators and false-fire
  counts (since run; see the verdict below).

Completed post-review hardening:

- full replay-only provenance audit:
  `docs/goalB3_final_replay_provenance_audit_full.json`;
- real Qwen operand-localization diagnostic:
  `docs/goalB3_qwen_real_operand_localization.json`, verdict
  `QWEN_OPERAND_ROUTE_FAIL`;
- independent hard-negative denominator audit:
  `docs/goalB3_final_independent_hard_negative_summary.json`, verdict
  `INDEPENDENT_HARD_NEGATIVE_PASS`.
