O(1): memory retrieval (vs O(N²) for attention)
~100: pristine anchor vectors, mathematically orthogonal
90°: angular separation via Gram-Schmidt surgery

Vector Symbolic Architectures

This technique, rooted in cognitive science, gives the AI a native algebra for reasoning rather than making it reason by predicting the next word. Think of it as a built-in calculator for logic that operates on the structure of information itself.

Three core operations power this algebra:

  1. Binding — Links a role to a value (like assigning a variable). Uses circular convolution in the Fourier domain.
  2. Superposition — Combines multiple bound states into one vector (like working memory holding several facts simultaneously).
  3. Unbinding — Retrieves a specific value from a superposed state (like a dictionary lookup by key) — instantly.

Imagine encoding information on a vinyl record. You "bind" a fact by pressing a groove at a unique frequency. You "superpose" facts by pressing many frequencies onto the same record. To "unbind", you play the record through a filter tuned to one frequency and the original fact comes back — instantly, no matter how many other grooves are there.
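The three operations can be sketched in a few lines of NumPy. This is a minimal HRR-style illustration at toy dimensionality; the helper names (`bind`, `unbind`, `unitary`) are mine for illustration, not the framework's API:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024  # toy dimensionality

def unitary(d):
    """Random unitary vector: unit magnitude in every Fourier coefficient."""
    phases = rng.uniform(-np.pi, np.pi, d // 2 - 1)
    # conjugate-symmetric spectrum, so the inverse FFT is exactly real
    spectrum = np.concatenate(([1.0], np.exp(1j * phases), [1.0],
                               np.exp(-1j * phases[::-1])))
    return np.fft.ifft(spectrum).real

def bind(role, value):
    """Binding: circular convolution, computed in the Fourier domain."""
    return np.fft.ifft(np.fft.fft(role) * np.fft.fft(value)).real

def unbind(trace, role):
    """Unbinding: divide out the role's spectrum (exact for unitary roles)."""
    return np.fft.ifft(np.fft.fft(trace) / np.fft.fft(role)).real

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

colour, shape = unitary(d), unitary(d)   # roles
red, square = unitary(d), unitary(d)     # values

memory = bind(colour, red) + bind(shape, square)  # superposition of two facts
guess = unbind(memory, colour)                    # noisy copy of `red`

print(cos(guess, red) > cos(guess, square))  # True: the right value wins
```

The retrieved `guess` is noisy (its similarity to `red` is well below 1.0), which is why a clean-up step against a codebook of known values is used in practice.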

Physical Embedding Surgery

Before any of this algebra can work cleanly, the framework must fix a critical problem: operator tokens (like => or |) already carry messy, ambiguous meanings from pre-training on internet text. So the framework performs Physical Embedding Surgery — forcefully resetting ~50–100 operator tokens and repositioning them to be exactly 90° apart in the model's internal space via Gram-Schmidt orthogonalisation. These cleaned operators are then frozen so they never drift back.
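The orthogonalisation step itself is standard linear algebra. A sketch at toy sizes, assuming the operator rows of the embedding matrix can simply be overwritten (classical Gram-Schmidt shown for clarity):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 768, 8  # embedding dim, number of operator tokens (toy sizes)
# stand-in for the messy pre-trained embedding rows of the operator tokens
operators = rng.normal(size=(k, d))

def gram_schmidt(rows):
    """Classical Gram-Schmidt over the rows: O(d * k^2) work overall."""
    basis = []
    for v in rows:
        for u in basis:
            v = v - (v @ u) * u  # strip the component along each earlier anchor
        basis.append(v / np.linalg.norm(v))
    return np.stack(basis)

anchors = gram_schmidt(operators)

print(np.allclose(anchors @ anchors.T, np.eye(k)))  # True: pairwise 90 degrees
# In a real model these rows would then be written back into the embedding
# matrix and frozen (e.g. masked out of optimiser updates) so they cannot drift.
```

At the sizes quoted in the deep dive (d ≈ 12,288, k ≈ 100), the O(d·k²) cost lands in the ~10⁸-operation range, a negligible one-time cost at initialisation.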

🔬 Technical Deep Dive — VSA Operations & Kanerva Limit

Binding uses Holographic Reduced Representations (HRR): V_bound = F⁻¹(F(Role) ⊙ F(Value)). Both Role and Value must be unitary (unit magnitude in every Fourier coefficient) to prevent magnitude explosion in recursive chains, enforced via LayerNorm or cosine-similarity loss penalties.
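The unitary constraint is what keeps recursive binding stable. A quick check, using spectrum normalisation as one way to project a vector onto the unitary set (my choice for the demo; the text names LayerNorm or loss penalties as the actual enforcement mechanisms):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024

def make_unitary(v):
    """Force unit magnitude on every Fourier coefficient of v."""
    s = np.fft.fft(v)
    return np.fft.ifft(s / np.abs(s)).real

def bind(a, b):
    """HRR binding: circular convolution via the Fourier domain."""
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

x = make_unitary(rng.normal(size=d))

# A deep recursive chain of bindings: the norm stays pinned at 1, because
# products of unit-magnitude spectra remain unit-magnitude (Parseval).
chain = x
for _ in range(50):
    chain = bind(chain, x)
print(round(float(np.linalg.norm(chain)), 6))  # 1.0
```

Without the unitary projection, each binding multiplies Fourier magnitudes that are not 1, so the norm of the chain grows or shrinks geometrically with depth.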

Topo-Categorical Anchors: ~50–100 ASCII operator tokens undergo Physical Embedding Surgery — their pre-trained weights are forcefully reset and orthogonalised via Gram-Schmidt (O(d·k²) ≈ 1.5×10⁸ ops at init), then frozen to prevent semantic drift.

Kanerva Limit: The maximum number of items that can be safely superposed is k ≈ 0.10–0.15 × d_k. For d_k = 12,288, that's ~1,200–1,800 simultaneous bindings — orders of magnitude more efficient than attention-based retrieval over the same token count.
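The capacity ceiling is easy to see empirically: recall against a clean-up codebook is near-perfect well under the limit and collapses well past it. A toy-scale sketch (small d, random codebook; the 0.10–0.15 constant itself is a heuristic and is not verified here):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 2048

def unitary_vec():
    s = np.fft.fft(rng.normal(size=d))
    return np.fft.ifft(s / np.abs(s)).real  # unit magnitude per coefficient

def bind(a, b):
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

def unbind(t, r):
    return np.fft.ifft(np.fft.fft(t) / np.fft.fft(r)).real

def recall_accuracy(k):
    """Superpose k role/value bindings, then try to retrieve every value."""
    roles = [unitary_vec() for _ in range(k)]
    values = np.stack([unitary_vec() for _ in range(k)])
    memory = sum(bind(r, values[i]) for i, r in enumerate(roles))
    # clean-up: match each noisy retrieval against the codebook of values
    hits = sum(np.argmax(values @ unbind(memory, r)) == i
               for i, r in enumerate(roles))
    return hits / k

light = recall_accuracy(int(0.02 * d))  # well under the limit: near-perfect
heavy = recall_accuracy(int(0.30 * d))  # far past the limit: recall collapses
print(light, heavy)
```

The signal-to-noise ratio of a retrieval scales roughly as √(d/k), so loading the memory past a fixed fraction of d makes the clean-up step unreliable regardless of absolute dimensionality.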

Complexity comparison: HRR circular convolution is O(d log d) (FFT-bound), FHRR complex binding is O(d), vs standard self-attention at O(N²·d) per layer.