Research Registry · arxiviq.com

Applied Research: Research Registry

Every embedding method the research org has tried, each with its measured verdict and its stage in the Research → Business Development → Execution pipeline.

Live model versions (live)

Measured model versions from this workspace's core-api /v1/models, ranked by effective rank.

asn-head-8080041 · erank 27.1 · nDCG@10 0.942
asn-head-1825460 (deployed) · erank 26.8 · trunc 256 · nDCG@10 0.954
asn-head-8282654 · erank 26.0 · nDCG@10 0.908

Based on 891 runs and 1,190 trainings · evidence: EVIDENCE.md §3.1–§3.7, SWEEP_FINDINGS.md, SWEEP_REPORT.md · updated 2026-06-28
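
The panel above ranks versions by effective rank (erank). The registry does not say which definition it uses; a common one, assumed here, is the exponential of the Shannon entropy of the normalized singular-value spectrum of the embedding matrix. A minimal sketch that takes the singular values as given (the function name is illustrative, not part of core-api):

```python
import math

def effective_rank(singular_values):
    """Entropy-based effective rank of a spectrum (assumed definition)."""
    # Keep strictly positive values; zeros contribute nothing to the entropy.
    s = [v for v in singular_values if v > 0]
    total = sum(s)
    if total == 0:
        return 0.0
    # Normalize the spectrum into a probability distribution.
    p = [v / total for v in s]
    entropy = -sum(q * math.log(q) for q in p)
    return math.exp(entropy)

flat = effective_rank([1.0] * 32)               # every direction used equally
collapsed = effective_rank([1.0] + [0.0] * 31)  # a single dominant direction
```

Under this definition a flat spectrum of 32 equal values scores 32 and a spectrum with one nonzero value scores 1, so a higher erank means variance is spread over more directions.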

How the pipeline works

The research org tests every embedding method on measured benchmarks. Methods that prove their worth are promoted to Business Development for real-world tenant pilots, then the Execution team implements the winners in production. Dead ends are archived with the evidence that killed them.

In Research (2) · Promoted to Business Development (3) · In Execution (2) · Archived (rejected) (4)

In Research

2 methods

Under active investigation in the research org: measured, but not yet promoted.

VICReg (variance + covariance rank floor)

Measured (regime-specific)

Loss-space rank floor. Prevents collapse where it occurs (non-contrastive / low-negative regimes) but is neutral on standard InfoNCE real-text training. Kept as insurance, not a default.

Key metric · +0.32 kNN for SimSiam; ~0 for InfoNCE batch≥16; neutral on real text
Evidence: EVIDENCE §3.4, §3.6, §3.7 (H-A)
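
As a rough illustration of what a variance + covariance rank floor computes, the sketch below follows the standard VICReg formulation of those two terms. It omits the invariance term and the loss weights, and gamma and eps are assumed defaults rather than values taken from EVIDENCE.md.

```python
import math

def vicreg_rank_floor(z, gamma=1.0, eps=1e-4):
    """Variance and covariance penalties for a batch z (rows of equal length, n >= 2)."""
    n, d = len(z), len(z[0])
    mean = [sum(row[j] for row in z) / n for j in range(d)]
    c = [[row[j] - mean[j] for j in range(d)] for row in z]  # centered batch
    # Covariance matrix with the unbiased 1/(n-1) normalization.
    cov = [[sum(r[i] * r[j] for r in c) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Variance term: hinge that pushes each dimension's std up to gamma.
    var_term = sum(max(0.0, gamma - math.sqrt(cov[j][j] + eps))
                   for j in range(d)) / d
    # Covariance term: penalize squared off-diagonal entries.
    cov_term = sum(cov[i][j] ** 2
                   for i in range(d) for j in range(d) if i != j) / d
    return var_term, cov_term

collapsed_var, collapsed_cov = vicreg_rank_floor([[0.5, 0.5]] * 4)
spread_var, spread_cov = vicreg_rank_floor([[2, 0], [-2, 0], [0, 2], [0, -2]])
```

A fully collapsed batch pays a variance penalty close to gamma, while a well-spread, decorrelated batch pays nothing. That matches the entry's reading that the term only matters in regimes where collapse actually occurs.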

Differentiable rank-floor regularizer

Measured (mid-pack)

Adds a differentiable effective-rank maximization term to InfoNCE. Works but ranked below Barlow/InfoNCE in the search.

Key metric · robust score 1.31 (#3 of 6)
Evidence: EVIDENCE §3.7 wave-2

Promoted to Business Development

3 methods

Each earned its keep in research and was handed to Business Development for real-world tenant pilots.

Domain fine-tuning

Measured

Cheap per-tenant fine-tuning on the tenant's domain. The defensible product moat: beats general commercial embeddings in-domain, with no out-of-domain forgetting at this scale.

Key metric · +1.5% (300 pairs) → +3.0% (1,200 pairs) in-domain; OOD improved
Evidence: EVIDENCE §3.6, §3.7 (H-C)

Barlow Twins

Validating on real text

Redundancy-reduction objective. Top method in the Bayesian search; keeps full-dim quality while resisting collapse. Real-text confirmation vs VICReg in progress.

Key metric · robust score 1.42 (#1 of 6 methods, synthetic)
Evidence: EVIDENCE §3.7 wave-2 TPE
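
For readers unfamiliar with the redundancy-reduction objective, here is a minimal sketch of the standard Barlow Twins loss on two views of a batch. The weight lam is a tunable assumption, and nothing here is taken from the org's training code.

```python
import math

def _standardize(z, eps=1e-12):
    """Zero-mean, unit-std each embedding dimension across the batch."""
    n, d = len(z), len(z[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in z]
        mu = sum(col) / n
        sd = math.sqrt(sum((v - mu) ** 2 for v in col) / n) + eps
        for b in range(n):
            out[b][j] = (col[b] - mu) / sd
    return out

def barlow_twins_loss(za, zb, lam=5e-3):
    """Pull the cross-correlation matrix of the two views toward the identity."""
    a, b = _standardize(za), _standardize(zb)
    n, d = len(a), len(a[0])
    # Cross-correlation between dimension i of view A and dimension j of view B.
    c = [[sum(a[k][i] * b[k][j] for k in range(n)) / n for j in range(d)]
         for i in range(d)]
    on_diag = sum((1.0 - c[i][i]) ** 2 for i in range(d))        # invariance
    off_diag = sum(c[i][j] ** 2
                   for i in range(d) for j in range(d) if i != j)  # redundancy
    return on_diag + lam * off_diag

decorrelated = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
redundant = [[1, 1], [-1, -1], [2, 2], [-2, -2]]
loss_decorrelated = barlow_twins_loss(decorrelated, decorrelated)
loss_redundant = barlow_twins_loss(redundant, redundant)
```

Identical views with decorrelated dimensions score near zero. When both dimensions carry the same signal, the off-diagonal term is what gets penalized.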

Matryoshka (MRL) truncation training

Validating on real text

Trains nested prefixes of the served representation so it degrades gracefully when truncated. The real lever for cheap truncated serving (decorrelation is not).

Key metric · best truncated-dim kNN among methods (synthetic)
Evidence: EVIDENCE §3.7 (B, wave-2)
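
The nested-prefix idea can be sketched as follows: a base loss is evaluated on several truncated, renormalized prefixes of the same embedding and the results are combined. The prefix sizes, the plain average, and the helper names are assumptions for illustration; only the 256-dim truncation appears elsewhere in this registry.

```python
import math

def truncate(vec, k):
    """Keep the first k coordinates and L2-renormalize."""
    head = vec[:k]
    norm = math.sqrt(sum(v * v for v in head))
    return [v / norm for v in head] if norm > 0 else list(head)

def matryoshka_loss(base_loss, batch_a, batch_b, dims=(64, 128, 256)):
    """Average a base loss over nested prefixes of both views."""
    total = 0.0
    for k in dims:
        total += base_loss([truncate(v, k) for v in batch_a],
                           [truncate(v, k) for v in batch_b])
    return total / len(dims)
```

Here base_loss is any function of two batches, for example a contrastive loss. At serving time only truncate is needed, since the model was already trained to keep each prefix useful on its own.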

In Execution

2 methods

Validated and shipped. Part of the production serving / training path today.

InfoNCE contrastive (default)

Measured

Standard contrastive objective. Inherently resists collapse with adequate negatives (batch ≥16); the production default training path.

Key metric · AG News kNN 0.892 / nDCG 0.822 after fine-tune
Evidence: EVIDENCE §3.6, §3.7 (H-A)
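
A minimal sketch of InfoNCE with in-batch negatives, assuming L2-normalized inputs and an illustrative temperature of 0.07 (the production value is not stated here):

```python
import math

def info_nce(anchors, positives, temperature=0.07):
    """Mean InfoNCE loss; row i of positives is the positive for anchor i."""
    n = len(anchors)
    loss = 0.0
    for i in range(n):
        # Similarity of anchor i to every candidate in the batch.
        logits = [sum(a * p for a, p in zip(anchors[i], positives[j])) / temperature
                  for j in range(n)]
        m = max(logits)  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denom - logits[i]  # cross-entropy against the true pair
    return loss / n

separated = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
collapsed = info_nce([[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]])
```

A fully collapsed batch of n items scores log(n), the chance level, so the cost of collapse grows with the number of in-batch negatives. That is one way to read the batch ≥16 note above.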

int8 quantized serving

Measured

Per-vector int8 quantization for edge serving. Essentially lossless; ship it as the default cheap-serving tier.

Key metric · kNN_int8 ≈ kNN_full everywhere
Evidence: EVIDENCE §3.7 (B)
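
One common way to do per-vector int8 quantization is symmetric scaling by the vector's largest absolute value. The sketch below assumes that scheme; the registry does not specify the exact one used in serving.

```python
def quantize_int8(vec):
    """Symmetric per-vector int8 quantization: returns (codes, scale)."""
    peak = max(abs(v) for v in vec)
    # One scale per vector; fall back to 1.0 for an all-zero vector.
    scale = peak / 127.0 if peak > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in vec]
    return codes, scale

def dequantize_int8(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]

vec = [0.5, -0.25, 0.125, 0.0]
codes, scale = quantize_int8(vec)
restored = dequantize_int8(codes, scale)
```

Each coordinate is off by at most half a quantization step (scale / 2), which is consistent with the near-lossless kNN result reported above.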

Archived (rejected)

4 methods

Tested honestly and rejected. Kept here so we don't re-litigate dead ends.

Three-tier spectral surgery (original ASN)

Rejected

Periodic weight-space SVD surgery shrinking the weak singular band. Fights anisotropy, not collapse. In a real collapse regime it made collapse worse (rank 3.4→1.0).

Key metric · served rank 3.4 → 1.0; kNN −12 to −21 pts
Evidence: EVIDENCE §3.2

spectral_lift (rank-floor surgery)

Rejected

Redesigned surgery that lifts weak singular values. Fixes the harm of three-tier but still loses to doing nothing; weight-space surgery can't outrun the collapse gradient.

Key metric · rank 2.06 vs do-nothing 3.40
Evidence: EVIDENCE §3.3

Sleep / SHY consolidation (AwakenedSleepNet)

Rejected

Bio-inspired wake/sleep phasing with synaptic downscaling + dream pruning + replay. Phasic consolidation fails; only a continuous in-loss rank floor works. Downscaling survives as a collapse-neutral renormalizer.

Key metric · phasic rank 9.99 / 5.71 vs continuous 21.0
Evidence: EVIDENCE §3.5

DINO-style centering

Rejected

Self-distillation with centering + sharpening. Worst method in the Bayesian search on this task.

Key metric · robust score 1.08 (#6 of 6)
Evidence: EVIDENCE §3.7 wave-2