I’ve been making progress.
TextSensor is built and acts like a simple eye over text. It has multiple patches, moves with a saccade/fixation rhythm, and emits sparse feature SDRs from local character windows.
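To make the sensor idea concrete, here is a minimal sketch of how such a sensor could work. It is not the actual TextSensor code: the names `window_to_sdr` and `sense`, the SDR width, the patch offsets, and the hash-based encoding are all assumptions chosen for illustration.

```python
import hashlib

SDR_SIZE = 256    # assumed SDR width (hypothetical value)
ACTIVE_BITS = 8   # assumed number of active bits per SDR (hypothetical value)


def window_to_sdr(window, size=SDR_SIZE, active=ACTIVE_BITS):
    """Hash a character window to a small, deterministic set of active bit indices."""
    bits = set()
    counter = 0
    while len(bits) < active:
        digest = hashlib.sha256(f"{window}|{counter}".encode("utf-8")).digest()
        bits.add(int.from_bytes(digest[:4], "big") % size)
        counter += 1
    return frozenset(bits)


def sense(text, patch_offsets=(-1, 0, 1), window=3, saccade=2, fixation_steps=2):
    """Yield (position, [one SDR per patch]) for every sensing step.

    The sensor holds each position for `fixation_steps` steps, then jumps
    `saccade` characters. Each patch reads its own local character window;
    positions outside the text are read as spaces.
    """
    for pos in range(0, len(text), saccade):
        for _ in range(fixation_steps):
            sdrs = []
            for off in patch_offsets:
                start = pos + off
                chars = [text[i] if 0 <= i < len(text) else " "
                         for i in range(start, start + window)]
                sdrs.append(window_to_sdr("".join(chars)))
            yield pos, sdrs


if __name__ == "__main__":
    for pos, sdrs in sense("hello world"):
        print(pos, [sorted(s) for s in sdrs])
```

Hashing gives unrelated windows unrelated codes, which is the simplest possible choice; a real encoder would probably want similar windows to share bits.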
Layer-1 columns now receive those SDRs, maintain monotonic pose, learn temporal transitions with a graph-based memory, and settle into private object SDRs.
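A rough sketch of what a graph-based transition memory could look like, assuming first-order transitions and a count-based surprise signal. The class name `TransitionGraph` and the surprise formula are hypothetical and only illustrate the idea; the real columns may learn transitions differently.

```python
from collections import Counter, defaultdict


class TransitionGraph:
    """Toy temporal memory: nodes are observed codes, edges count transitions."""

    def __init__(self):
        self.edges = defaultdict(Counter)  # previous code -> Counter of next codes
        self.prev = None

    def observe(self, code):
        """Return surprise in [0, 1] for this observation, then learn the transition.

        Surprise is 1 minus the empirical probability of `code` following the
        previous code. With no history for the previous code, surprise is 1.
        """
        surprise = 1.0
        if self.prev is not None:
            counts = self.edges[self.prev]
            total = sum(counts.values())
            if total:
                surprise = 1.0 - counts[code] / total
            counts[code] += 1
        self.prev = code
        return surprise


if __name__ == "__main__":
    graph = TransitionGraph()
    # Codes must be hashable; single characters stand in for SDRs here.
    print([graph.observe(c) for c in "abab"])
```

In this toy version the codes are just hashable keys, so a `frozenset` of active bits works as well as a character.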
The big recent shift was making layer 1 preserve retinotopic structure instead of collapsing local observations into a bag of features. That let columns form stable visual token objects like words, punctuation-attached words, repeated-letter forms, and digit/symbol tokens.
I also added stress tests to check what the representations actually encode. Exact token forms are now stable and distinct. The remaining abstraction gaps are also informative: for example, the tokens “word,” and “word.” are different layer-1 visual objects, while a higher layer may later learn their relationship.
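The difference between a bag of features and a position-preserving code can be shown with a toy example. This is only an illustration under simplifying assumptions: characters stand in for features, and `bag_code`, `retinotopic_code`, and `overlap` are hypothetical helpers, not part of the project.

```python
def bag_code(token):
    """Position-free code: the set of characters seen, order and count discarded."""
    return frozenset(token)


def retinotopic_code(token):
    """Position-preserving code: each character is tagged with its offset."""
    return frozenset(enumerate(token))


def overlap(a, b):
    """Jaccard overlap between two codes (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Anagrams and repeated-letter forms collapse in the bag code...
    print(bag_code("listen") == bag_code("silent"))
    print(bag_code("aa") == bag_code("aaa"))
    # ...but stay distinct when position is kept.
    print(retinotopic_code("listen") == retinotopic_code("silent"))
    print(retinotopic_code("aa") == retinotopic_code("aaa"))
    # Punctuation-attached forms are distinct objects that still share structure.
    print(overlap(retinotopic_code("word,"), retinotopic_code("word.")))
```

In this toy, the two punctuation-attached forms share four of six position-tagged features, which is the kind of partial overlap a higher layer could later use to relate them.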
Most recently, I started wiring hierarchy. Layer-1 object SDRs now project upward through a sparse InterLayerRelay component into layer-2 columns. The wiring is not all-to-all and not one-to-one: it is sparse, overlapping, deterministic-random, and designed to become plastic later.
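One way to get wiring that is sparse, overlapping, and deterministic-random is to sample a fixed fan-in per upper column from a seeded generator. The sketch below assumes that approach; `build_relay`, `project`, and their parameters are hypothetical and do not describe the actual InterLayerRelay interface.

```python
import random


def build_relay(n_lower, n_upper, fan_in, seed=0):
    """Return, for each upper column, the sorted lower-column indices feeding it.

    The same seed always yields the same wiring. Each upper column samples
    `fan_in` distinct lower columns, so different upper columns may overlap.
    """
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_lower), fan_in)) for _ in range(n_upper)]


def project(wiring, lower_sdrs):
    """Build each upper column's input as a set of (source column, bit) pairs.

    `lower_sdrs[i]` is the set of active bits in lower column i. Tagging each
    bit with its source keeps the private codes of different columns apart.
    """
    return [
        frozenset((src, bit) for src in sources for bit in lower_sdrs[src])
        for sources in wiring
    ]


if __name__ == "__main__":
    wiring = build_relay(n_lower=8, n_upper=4, fan_in=3, seed=42)
    lower = [frozenset({i, i + 10}) for i in range(8)]  # made-up object SDRs
    for column, inputs in enumerate(project(wiring, lower)):
        print(column, wiring[column], sorted(inputs))
```

Because the wiring is a plain list of source indices, making it plastic later could mean adding or dropping indices per upper column without changing the projection step.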
It is still early, but the architecture is starting to take shape:
- sensor-like input
- sparse distributed representations
- independent columns with private codes
- movement and pose
- prediction and surprise
- stable object formation
- lateral context
- sparse bottom-up hierarchy
The next step is to characterize what layer 2 learns from the layer-1 object surface, without hardcoding concepts like “phrase,” “sentence,” or “topic.” Those names should be diagnostics, not mechanisms.
