May 29, 2026 – Million Minds

I’ve been making progress.

TextSensor is built and acts like a simple eye over text. It has multiple patches, moves with a saccade/fixation rhythm, and emits sparse feature SDRs from local character windows.
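
To make the idea concrete, here is a minimal sketch of one patch turning a local character window into a sparse feature SDR. It is not the actual TextSensor code. The names `window_at`, `feature_sdr`, and `scan`, the SDR size, the active-bit count, and the hash-based encoding are all assumptions chosen for illustration, and the fixed-stride loop is only a crude stand-in for a saccade/fixation rhythm.

```python
import hashlib
import random

SDR_SIZE = 1024    # assumed width of the feature SDR
ACTIVE_BITS = 20   # assumed number of active bits (about 2% sparsity)


def window_at(text, center, radius=2):
    """Return the characters one patch sees while fixating at `center`."""
    lo = max(0, center - radius)
    hi = min(len(text), center + radius + 1)
    return text[lo:hi]


def feature_sdr(window, size=SDR_SIZE, active=ACTIVE_BITS):
    """Map a character window to a fixed set of active bit indices.

    Hashing is an illustrative stand-in: the same window always yields
    the same SDR, and different windows almost never share many bits.
    """
    digest = hashlib.sha256(window.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return frozenset(rng.sample(range(size), active))


def scan(text, stride=3, radius=2):
    """Fixate every `stride` characters and emit (position, SDR) pairs."""
    return [
        (center, feature_sdr(window_at(text, center, radius)))
        for center in range(0, len(text), stride)
    ]


if __name__ == "__main__":
    for position, sdr in scan("hello world"):
        print(position, repr(window_at("hello world", position)), sorted(sdr)[:5])
```

A real sensor would presumably use features that share bits between similar windows. A hash deliberately does not, which keeps the sketch short but throws away similarity structure.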

Layer-1 columns now receive those SDRs, maintain monotonic pose, learn temporal transitions with a graph-based memory, and settle into private object SDRs.
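
As a rough illustration of what a graph-based transition memory can look like, the sketch below stores first-order transitions as edges between hashable states (for example, frozen SDRs) and reports surprise when the current input was not predicted from the previous one. The class name `TransitionMemory` and its methods are hypothetical. The real columns also track pose and settle into object SDRs, and none of that is modeled here.

```python
from collections import defaultdict


class TransitionMemory:
    """First-order transition graph: state -> set of states seen next."""

    def __init__(self):
        self.edges = defaultdict(set)
        self.prev = None

    def predicted(self):
        """States the graph expects next, given the previous state."""
        if self.prev is None:
            return set()
        return set(self.edges.get(self.prev, ()))

    def observe(self, state):
        """Learn the transition into `state`; return True if it was a surprise."""
        surprised = state not in self.predicted()
        if self.prev is not None:
            self.edges[self.prev].add(state)
        self.prev = state
        return surprised

    def reset(self):
        """Forget the current position in the sequence, keep learned edges."""
        self.prev = None


if __name__ == "__main__":
    memory = TransitionMemory()
    print([memory.observe(ch) for ch in "abab"])
```

On the sequence `a, b, a, b` the first three steps are surprising and the fourth is not, because the edge from `a` to `b` has been learned by then.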

The big recent shift was making layer 1 preserve retinotopic structure instead of collapsing local observations into a bag of features. That let columns form stable visual token objects like words, punctuation-attached words, repeated-letter forms, and digit/symbol tokens.

I also added stress tests to check what the representations actually encode. Exact token forms are now stable and distinct. The remaining abstraction gaps are informative too: for example, “word,” and “word.” are different layer-1 visual objects, and a higher layer may later learn their relationship.
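
A stress test of this kind can be sketched as two checks: re-encoding the same token must give the same SDR (stability), and different tokens must share few active bits (distinctness). Everything below is an assumption made for illustration. `toy_encode` is a hash-based stand-in for whatever actually produces a column's object SDR, and `stress_report` and `overlap` are hypothetical helper names.

```python
import hashlib
import random


def toy_encode(token, size=2048, active=20):
    """Stand-in encoder (assumption): deterministic sparse SDR per token."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return frozenset(rng.sample(range(size), active))


def overlap(a, b):
    """Fraction of shared active bits, relative to the smaller SDR."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def stress_report(encode, tokens, repeats=3):
    """Return (unstable tokens, pairwise overlaps) for a list of tokens."""
    unstable = [
        token for token in tokens
        if len({encode(token) for _ in range(repeats)}) != 1
    ]
    cross = {
        (a, b): overlap(encode(a), encode(b))
        for i, a in enumerate(tokens)
        for b in tokens[i + 1:]
    }
    return unstable, cross


if __name__ == "__main__":
    unstable, cross = stress_report(toy_encode, ["word", "word,", "word.", "42%"])
    print("unstable:", unstable)
    for pair, score in cross.items():
        print(pair, round(score, 2))
```

With a hash encoder, “word,” and “word.” are unrelated by construction, so the sketch only demonstrates the measurement. In the real system the pairwise overlap is the interesting number to watch.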

Most recently, I started wiring hierarchy. Layer-1 object SDRs now project upward through a sparse InterLayerRelay component into layer-2 columns. The wiring is not all-to-all and not one-to-one: it is sparse, overlapping, deterministic-random, and designed to become plastic later.
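
The wiring pattern can be sketched as a seeded random fan-in: each layer-2 column samples a small fixed subset of layer-1 columns, so the map is sparse, subsets can overlap, and the same seed always rebuilds the same wiring. This is not the InterLayerRelay implementation. `build_fan_in`, `relay`, the column counts, and the fan-in value are assumptions for illustration, and plasticity is left out entirely.

```python
import random


def build_fan_in(n_lower, n_upper, fan_in, seed=0):
    """For each upper column, pick `fan_in` distinct lower columns.

    A seeded generator makes the wiring random in shape but reproducible.
    """
    if not 0 < fan_in <= n_lower:
        raise ValueError("fan_in must be between 1 and n_lower")
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_lower), fan_in)) for _ in range(n_upper)]


def relay(wiring, lower_sdrs):
    """Project lower-layer object SDRs upward along the wiring.

    `lower_sdrs` maps a lower column id to its set of active bits. Each
    upper column receives (source column, bit) pairs, so identical bit
    indices from different private codes stay distinguishable.
    """
    return [
        frozenset(
            (source, bit)
            for source in sources
            for bit in lower_sdrs.get(source, ())
        )
        for sources in wiring
    ]


if __name__ == "__main__":
    wiring = build_fan_in(n_lower=8, n_upper=4, fan_in=3, seed=42)
    lower = {column: {column * 10, column * 10 + 1} for column in range(8)}
    for upper_id, inputs in enumerate(relay(wiring, lower)):
        print(upper_id, wiring[upper_id], len(inputs))
```

With 8 lower columns, 4 upper columns, and a fan-in of 3, there are 12 connections onto 8 sources, so at least one layer-1 column must feed more than one layer-2 column. That is the overlapping, neither all-to-all nor one-to-one shape described above.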

It is still early, but the architecture is starting to take shape:

sensor-like input
sparse distributed representations
independent columns with private codes
movement and pose
prediction and surprise
stable object formation
lateral context
sparse bottom-up hierarchy

The next step is to characterize what layer 2 learns from the layer-1 object surface, without hardcoding concepts like “phrase,” “sentence,” or “topic.” Those names should be diagnostics, not mechanisms.