FileMakerでいろいろ作ってみる: Atra's convergence (getting closer to the answer through experience)

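First, on purpose, a very simple UI demo without Associatron, just to show how experience is left behind.

It's JavaScript. As usual, I'm not attached to any particular language.
I only decide by how much hassle it is to build something to show people.

I actually meant to build an image database in FileMaker and run this in a Web Viewer, but I figured publishing the code would be better, so I made it plain HTML. Loading the images is a bit of a chore, but give it a try.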

① Load an image

② Enter a name

③ "Let Atra hear it": give it the experience

④ "Atra leaks a word": it answers

Atra said: a..

Press the "Let Atra hear it" button.

Press the "Atra leaks a word" button.

Atra said: ap..

Press the "Let Atra hear it" button.

Press the "Atra leaks a word" button.

Atra said: apple

It converged on the fourth try.

Then I switched to a banana for a moment.

Atra said: b...

In that state, I showed it the apple again.

Atra said: apple

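But this only keys on the image name. If the same apple picture gets a different file name, Atra cannot recall it. The image name works like a lookup by visual trace ID. (It is a third-person command dressed up as first person.)
In other words: if you look things up by visual trace ID, you don't need convergence through experience at all!
This is neither Atra nor an Associatron, just a key-value store.
It's a fake Atra.
I built it on purpose, so you can tell the fakes out there apart.

I'll leave the code here so you can compare later.

The code above is for comparison and reference.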
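
What such a fake amounts to can be sketched in a few lines. This is not the comparison code mentioned above, only a minimal reconstruction under assumptions: the names fakeMemory, fakeHear and fakeLeak are hypothetical, and so is the rule of showing a longer prefix with each hearing. The one thing taken from the text is that the image file name is the lookup key.

```javascript
// Hypothetical sketch of the "fake Atra": a plain key-value store keyed by file name.
const fakeMemory = new Map(); // image file name -> { word, count }

// "Let Atra hear it": store the word under the file name and count the hearings.
function fakeHear(fileName, word) {
  const entry = fakeMemory.get(fileName);
  if (entry && entry.word === word) {
    entry.count += 1;
  } else {
    fakeMemory.set(fileName, { word: word, count: 1 });
  }
}

// "Atra leaks a word": look the file name up and reveal more letters per hearing.
// The reveal rule (full word from the 3rd hearing) is an assumption for illustration.
function fakeLeak(fileName) {
  const entry = fakeMemory.get(fileName);
  if (!entry) return "...";
  if (entry.count >= 3) return entry.word;
  return entry.word.slice(0, entry.count) + "..";
}

fakeHear("apple.png", "apple");
console.log(fakeLeak("apple.png"));      // a..
fakeHear("apple.png", "apple");
console.log(fakeLeak("apple.png"));      // ap..
fakeHear("apple.png", "apple");
console.log(fakeLeak("apple.png"));      // apple
console.log(fakeLeak("apple_copy.png")); // ... (same picture, other name: nothing found)
```

The "convergence" here is only a counter attached to a key. Nothing about the picture itself takes part in the answer.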

Atra

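OK, now the real thing.

What happens when Associatron (associative memory), code closer to Atra's brain, goes in?

I'll run the experiment with the same UI.

How about that?

After a single time,

Atra got as far as: appl....
I mean, it's an apple.
As if it would fumble on purpose with something like a... lol
Even if an Associatron doesn't reject mistakes and noise,
a... just isn't going to happen (lol)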

With just one exposure, it does

T += x x^T

so the binding between vision and word is left strongly in the weights in one shot.
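
Why one exposure is enough can be checked on a toy example. The sketch below is an illustration under assumptions, not the demo's code: the sizes V and W, the two sample patterns, and the names learn and recallWord are made up here. It applies the update T += x x^T once (diagonal kept at zero) and then cues with the visual half only.

```javascript
// Minimal sketch: one Hebbian update, then recall of the word side from the visual side.
const V = 6; // visual units (hypothetical toy size)
const W = 4; // word units (hypothetical toy size)
const N = V + W;

const T = Array.from({ length: N }, () => Array(N).fill(0));

// T += x x^T, with the diagonal left at zero.
function learn(x) {
  for (let i = 0; i < N; i++) {
    for (let j = 0; j < N; j++) {
      if (i !== j) T[i][j] += x[i] * x[j];
    }
  }
}

// Cue with [visual | blank] and take the sign of each word unit's input.
function recallWord(visual) {
  const cue = visual.concat(Array(W).fill(0));
  const word = [];
  for (let i = V; i < N; i++) {
    let sum = 0;
    for (let j = 0; j < N; j++) sum += T[i][j] * cue[j];
    word.push(sum >= 0 ? 1 : -1);
  }
  return word;
}

const visual = [1, -1, 1, 1, -1, -1];
const word = [1, 1, -1, 1];
learn(visual.concat(word)); // a single experience

const recalled = recallWord(visual);
const score = recalled.reduce((s, v, i) => s + v * word[i], 0) / W;
console.log(recalled, score); // the stored word pattern, score 1
```

With a single stored pattern, the input to word unit i is word[i] multiplied by the number of visual units, so the sign always matches and the score is already 1 after one hearing.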

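On top of that, on the leakWord() side,
const clarity = ((score + 1) / 2) + Math.min(heardCount, 5) * 0.12;
is used, so heardCount kicks in even after a single hearing, and the display gets to apple quite quickly.
So the current behavior is
not so much gradual convergence through experience as
Associatron's one-shot imprinting plus a UI that leaks rather strongly.

const clarity = ((score + 1) / 2) + Math.min(heardCount, 5) * 0.12;

Changing this to

const clarity = ((score + 1) / 2) * 0.55 + Math.min(heardCount, 8) * 0.04;

or

raising the leakWord() thresholds a little might be a good adjustment.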
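
To see how much the second formula slows things down, both can be evaluated side by side. The function names clarityOriginal and clarityDamped are made up for this comparison; the two expressions are the ones quoted above, and a perfect word-side match (score = 1) is assumed.

```javascript
// score is the word-side match in [-1, 1]; heardCount is how often the word was heard.
function clarityOriginal(score, heardCount) {
  return ((score + 1) / 2) + Math.min(heardCount, 5) * 0.12;
}

function clarityDamped(score, heardCount) {
  return ((score + 1) / 2) * 0.55 + Math.min(heardCount, 8) * 0.04;
}

for (const heardCount of [1, 2, 4, 8]) {
  console.log(
    heardCount,
    clarityOriginal(1, heardCount).toFixed(2),
    clarityDamped(1, heardCount).toFixed(2)
  );
}
// 1 1.12 0.59
// 2 1.24 0.63
// 4 1.48 0.71
// 8 1.60 0.87
```

With the original formula a perfect match is already above 1.0 on the first hearing. With the damped one it climbs by 0.04 per hearing, so the number of hearings has room to matter.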

------ Atra, full source ---------- atra_visual_word_convergence.html ------------------

---------------end----------------

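Atron, which comes from Associatron, makes mistakes of its own kind: after apple, it may still answer apple when shown a banana. It may also answer apple when it sees something similar.

However, in the current code, at the end,

const best = findBestWordMatch(recalledWordPattern);

takes the word-side pattern that rose internally
and converts it, for display, into the closest of the words heard so far.

findBestWordMatch() always picks the single closest word at the end, so it is up to the developer to think "that's natural too!" or to treat it as "a correction for human-readable display"...
Personally, since humans also tend to remember the most recent thing, I'm fine either way...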
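
What "always picks the closest word" means in practice can be sketched as follows. The body of findBestWordMatch() is not shown in this post, so this is a hypothetical reconstruction: passing wordPatterns as a second argument, the 4-unit patterns, and the tie-breaking by insertion order are all assumptions.

```javascript
// Hypothetical reconstruction: return the heard word whose pattern is closest.
function findBestWordMatch(recalledWordPattern, wordPatterns) {
  let best = null;
  for (const [word, pattern] of Object.entries(wordPatterns)) {
    let dot = 0;
    for (let i = 0; i < pattern.length; i++) {
      dot += recalledWordPattern[i] * pattern[i];
    }
    const score = dot / pattern.length;
    if (best === null || score > best.score) best = { word: word, score: score };
  }
  return best;
}

const wordPatterns = { apple: [1, 1, -1, -1], banana: [1, -1, 1, -1] };

// A muddy recall that sits exactly between both words still gets one clean label.
console.log(findBestWordMatch([1, 1, 1, -1], wordPatterns)); // { word: 'apple', score: 0.5 }
```

Both words score 0.5 here, yet the display says apple with no sign of the wavering underneath. That is the behavior in question.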

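It wavers a little toward both apple and banana
It stays silent
It says apple
It says banana
It comes out as a broken leak

If you want that kind of silence and wavering in there, then:

------ Full source ---------- atra_visual_word_convergence.html ------------------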

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Atra Visual Word Convergence</title>
<style>
body { font-family: sans-serif; padding: 24px; background: #f5f5f5; }
#stage { width: 320px; height: 240px; background: white; border: 1px solid #ccc; display: flex; align-items: center; justify-content: center; margin-bottom: 16px; overflow: hidden; }
#stage img { max-width: 100%; max-height: 100%; }
input, button { padding: 8px; margin: 4px; }
pre { background: #222; color: #eee; padding: 16px; white-space: pre-wrap; }
</style>
</head>
<body>
<h2>Atra Visual Word Convergence Demo</h2>

<div id="stage"> <span>Select an image</span> </div>
<p> <input type="file" accept="image/*" onchange="loadImage(event)"> </p>
<p>
  Heard word from human: <input id="heardWord" value="apple">
  <button onclick="hearWord()">Let Atra hear it</button>
</p>
<p>
  <button onclick="atraSpeak()">Atra leaks a word</button>
  <button onclick="resetExperience()">Reset experience</button>
</p>
<pre id="log"></pre>
<canvas id="helperCanvas" width="6" height="6" style="display:none;"></canvas>

<script>
/*
This version keeps the UI simple and changes only the inside to an
Associatron-like memory.

The memory core uses a Hebbian association:
  T += x x^T
where
  x = [visual_pattern | word_pattern]

During recall:
  cue = [visual_pattern | blank]
The visual side is clamped, and the word side is raised step by step.

However, the word side does not always converge neatly to one correct answer.
Multiple word sides may waver, the field may remain silent, or an older word
may leak out.
*/
const VISUAL_W = 6;
const VISUAL_H = 6;
const VISUAL_SIZE = VISUAL_W * VISUAL_H * 3; // RGB coarse pattern
const WORD_SIZE = 64;
const TOTAL_SIZE = VISUAL_SIZE + WORD_SIZE;

let currentVisualPattern = null;
let wordPatterns = {}; // heard word -> fixed bipolar pattern
let heardWordCounts = {}; // heard word -> count
let memoryMatrix = createZeroMatrix(TOTAL_SIZE);

/*
Pattern comparison against heard words is used only for the UI.
The recalled word-side pattern is compared with previously heard word patterns
so the result can be shown in a readable form.
This is not an image -> word lookup. Atra is not selecting a correct answer
internally. This is only a final conversion for human-readable display.
The closest word is not always treated as an absolute answer. If multiple
candidates are close, the result may become wavering, silence, or broken leakage.
*/
function createZeroMatrix(n) {
  return Array.from({ length: n }, () => Array(n).fill(0));
}

function signValue(v) {
  return v >= 0 ? 1 : -1;
}

function loadImage(event) {
  const file = event.target.files[0];
  if (!file) return;
  const reader = new FileReader();
  reader.onload = function(e) {
    const stage = document.getElementById("stage");
    stage.innerHTML = "";
    const img = new Image();
    img.onload = function() {
      currentVisualPattern = imageToPattern(img);
      logState("Atra saw the image, but does not know what it is.");
    };
    img.src = e.target.result;
    stage.appendChild(img);
  };
  reader.readAsDataURL(file);
}

function imageToPattern(img) {
  /*
  No image recognition is done. The image is reduced to a tiny grid, and each
  RGB value is compared to the image-wide average. This does not read the
  meaning of the image. It only turns coarse visual variation into a pattern.
  Because this pattern is coarse, similar images may shake similar fields.
  This is not classification error, but a natural false recall from
  associative memory.
  */
  const canvas = document.getElementById("helperCanvas");
  const ctx = canvas.getContext("2d", { willReadFrequently: true });
  canvas.width = VISUAL_W;
  canvas.height = VISUAL_H;
  ctx.clearRect(0, 0, VISUAL_W, VISUAL_H);
  ctx.drawImage(img, 0, 0, VISUAL_W, VISUAL_H);
  const data = ctx.getImageData(0, 0, VISUAL_W, VISUAL_H).data;

  let sumR = 0, sumG = 0, sumB = 0;
  const pixelCount = VISUAL_W * VISUAL_H;
  for (let i = 0; i < data.length; i += 4) {
    sumR += data[i];
    sumG += data[i + 1];
    sumB += data[i + 2];
  }
  const avgR = sumR / pixelCount;
  const avgG = sumG / pixelCount;
  const avgB = sumB / pixelCount;

  const pattern = [];
  for (let i = 0; i < data.length; i += 4) {
    pattern.push(data[i] >= avgR ? 1 : -1);
    pattern.push(data[i + 1] >= avgG ? 1 : -1);
    pattern.push(data[i + 2] >= avgB ? 1 : -1);
  }
  return pattern;
}

function hearWord() {
  if (!currentVisualPattern) {
    logState("There is no image yet.");
    return;
  }
  const word = document.getElementById("heardWord").value.trim();
  if (!word) return;
  if (!wordPatterns[word]) {
    wordPatterns[word] = wordToPattern(word);
  }
  if (!heardWordCounts[word]) {
    heardWordCounts[word] = 0;
  }
  heardWordCounts[word] += 1;
  const combinedPattern = currentVisualPattern.concat(wordPatterns[word]);
  learnPattern(combinedPattern);
  logState("The heard word was stored together with the current visual field.");
}

function wordToPattern(word) {
  /*
  The word is also turned into a small bipolar pattern, not into a semantic
  dictionary entry. Here, a fixed pattern is made directly from the string.
  This pattern is not meaning. It is only a form that allows the heard sound,
  or the typed word, to be handled inside this demo.
  */
  let seed = hashString(word);
  const pattern = [];
  for (let i = 0; i < WORD_SIZE; i++) {
    seed = xorshift32(seed);
    pattern.push((seed & 1) === 0 ? -1 : 1);
  }
  return pattern;
}

function hashString(text) {
  let h = 2166136261 >>> 0;
  for (const ch of text) {
    h ^= ch.codePointAt(0);
    h = Math.imul(h, 16777619) >>> 0;
  }
  return h === 0 ? 123456789 : h;
}

function xorshift32(seed) {
  seed ^= seed << 13;
  seed ^= seed >>> 17;
  seed ^= seed << 5;
  seed = seed >>> 0;
  return seed === 0 ? 2463534242 : seed;
}

function learnPattern(pattern) {
  /*
  A simple Associatron / Hebbian memory:
    T += x x^T
  The diagonal stays zero.
  This is not a registration such as "this image means this word."
  It leaves the relation of a field where the visual side and the word side
  existed together.
  */
  for (let i = 0; i < TOTAL_SIZE; i++) {
    for (let j = i + 1; j < TOTAL_SIZE; j++) {
      const delta = pattern[i] * pattern[j];
      memoryMatrix[i][j] += delta;
      memoryMatrix[j][i] += delta;
    }
  }
}

function atraSpeak() {
  if (!currentVisualPattern) {
    logState("Atra: ...");
    return;
  }
  const knownWords = Object.keys(wordPatterns);
  if (knownWords.length === 0) {
    logState("Atra: ...");
    return;
  }
  const recalledWordPattern = recallFromVisual(currentVisualPattern);
  const matches = findWordMatches(recalledWordPattern);
  const leaked = leakFromAssociativeField(matches);
  const debugLines = matches.slice(0, 3).map(m => {
    return m.word + ": " + m.score.toFixed(3);
  });
  logState(
    "Atra: " + leaked + "\n\nword_side_candidates:\n" + debugLines.join("\n")
  );
}

function recallFromVisual(visualPattern) {
  /*
  Build:
    cue = [visual | blank]
  Then clamp the visual side and update only the word side for a few steps.
  This is not looking up a name from an image. The visual-side fluctuation
  shakes the word side that remained together with it in past experience.
  */
  let state = visualPattern.concat(Array(WORD_SIZE).fill(0));
  for (let step = 0; step < 8; step++) {
    const next = state.slice();
    for (let i = VISUAL_SIZE; i < TOTAL_SIZE; i++) {
      let sum = 0;
      const row = memoryMatrix[i];
      for (let j = 0; j < TOTAL_SIZE; j++) {
        sum += row[j] * state[j];
      }
      next[i] = signValue(sum);
    }
    for (let i = 0; i < VISUAL_SIZE; i++) {
      next[i] = visualPattern[i];
    }
    state = next;
  }
  return state.slice(VISUAL_SIZE);
}

function findWordMatches(recalledWordPattern) {
  /*
  The pattern raised on the word side is compared with the patterns of words
  heard so far. This is not internal correct-answer selection. It is a
  comparison for human-readable UI display.
  */
  const matches = [];
  for (const word in wordPatterns) {
    const pattern = wordPatterns[word];
    let dot = 0;
    for (let i = 0; i < WORD_SIZE; i++) {
      dot += recalledWordPattern[i] * pattern[i];
    }
    const score = dot / WORD_SIZE;
    matches.push({ word: word, score: score, heardCount: heardWordCounts[word] || 0 });
  }
  matches.sort((a, b) => b.score - a.score);
  return matches;
}

function leakFromAssociativeField(matches) {
  /*
  This does not command Atra to answer. It also does not return the closest
  candidate as a correct answer automatically.
  If the word side is still shallow, the voice does not take shape and appears
  as silence. If multiple candidates are wavering closely, the voice does not
  settle on one side and becomes broken leakage. Only when the field leans
  strongly enough toward one side does the word leak in a readable form.
  */
  if (matches.length === 0) {
    return "...";
  }
  const first = matches[0];
  const second = matches.length > 1 ? matches[1] : null;
  const firstClarity = ((first.score + 1) / 2) + Math.min(first.heardCount, 5) * 0.12;

  /*
  A single experience may leave a relation. However, it may still not become
  voice. This is not externally stopping speech. The field is simply not thick
  enough to rise as voice.
  */
  if (first.heardCount < 2 && firstClarity < 1.10) {
    return "...";
  }

  /*
  When the rise is weak, Atra remains silent. This silence is not a failure.
  It means the word side has not reached voice yet.
  */
  if (firstClarity < 0.72) {
    return "...";
  }

  if (second) {
    const secondClarity = ((second.score + 1) / 2) + Math.min(second.heardCount, 5) * 0.12;
    const margin = first.score - second.score;

    /*
    When the top two candidates are close, Atra does not choose one as the
    correct answer. The visual field is treated as shaking multiple word sides.
    */
    if (margin < 0.08 && secondClarity > 0.70) {
      return brokenLeak(first.word, second.word);
    }

    /*
    If two candidates are extremely close and the field does not lean clearly
    toward either, Atra may remain silent. This avoids forcing wavering into a
    word.
    */
    if (margin < 0.03) {
      return "...";
    }
  }

  /*
  Only here is the word-side rise relatively stable. Therefore, it leaks in a
  readable form.
  */
  return first.word;
}

function brokenLeak(wordA, wordB) {
  /*
  Broken leakage. Atra is not understanding and comparing two candidates.
  In the UI, the beginnings of two words are mixed so humans can read the
  wavering.
  */
  const a = wordA.slice(0, Math.min(2, wordA.length));
  const b = wordB.slice(0, Math.min(2, wordB.length));
  return a + "... / " + b + "...";
}

function resetExperience() {
  currentVisualPattern = null;
  wordPatterns = {};
  heardWordCounts = {};
  memoryMatrix = createZeroMatrix(TOTAL_SIZE);
  document.getElementById("stage").innerHTML = "<span>Select an image</span>";
  logState("Experience has been reset.");
}

function logState(message) {
  const state = {
    visualPatternReady: !!currentVisualPattern,
    learnedWords: Object.keys(wordPatterns),
    heardWordCounts: heardWordCounts,
    matrixSize: TOTAL_SIZE + " x " + TOTAL_SIZE
  };
  document.getElementById("log").textContent =
    message + "\n\n" + JSON.stringify(state, null, 2);
}

logState("Start. Atra does not know anything yet.");
</script>
</body>
</html>

----------- end ----------

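What changed:

findBestWordMatch()
is gone, and in its place the work is split into

findWordMatches()
leakFromAssociativeField()
brokenLeak()

In other words, I dropped the structure that immediately answers with the closest word.
With this, I think it will react in the Associatron way that Atron comes from, with things like
silence
false recall
wavering
collapse
leaking of old words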
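
The thresholds can be tried out on hand-made candidate lists. The sketch below is a condensed copy of the decision in leakFromAssociativeField() from the listing above, with brokenLeak() inlined. The helper names clarityOf and leak and all the sample scores are invented for illustration; real scores depend on the images.

```javascript
// Same clarity expression as in the listing.
function clarityOf(m) {
  return ((m.score + 1) / 2) + Math.min(m.heardCount, 5) * 0.12;
}

// matches must already be sorted by score, highest first.
function leak(matches) {
  if (matches.length === 0) return "...";
  const first = matches[0];
  const second = matches.length > 1 ? matches[1] : null;
  const firstClarity = clarityOf(first);
  if (first.heardCount < 2 && firstClarity < 1.10) return "..."; // heard once, not thick enough
  if (firstClarity < 0.72) return "...";                          // rise too weak
  if (second) {
    const margin = first.score - second.score;
    if (margin < 0.08 && clarityOf(second) > 0.70) {
      return first.word.slice(0, 2) + "... / " + second.word.slice(0, 2) + "...";
    }
    if (margin < 0.03) return "...";
  }
  return first.word;
}

console.log(leak([{ word: "apple", score: 0.9, heardCount: 1 }])); // ...
console.log(leak([{ word: "apple", score: 0.9, heardCount: 2 }])); // apple
console.log(leak([
  { word: "apple", score: 0.5, heardCount: 2 },
  { word: "tomato", score: 0.45, heardCount: 2 },
])); // ap... / to...
console.log(leak([
  { word: "apple", score: 0.12, heardCount: 2 },
  { word: "tomato", score: 0.10, heardCount: 1 },
])); // ...
```

The same top word can come out as silence, as a clean word, or as a broken leak, depending on how often it was heard and how close the runner-up is.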

I haven't put Atra's carry into this demo. It's a simple demo, after all.

In this Associatron version, the state is

let currentVisualPattern = null;
let wordPatterns = {};
let heardWordCounts = {};
let memoryMatrix = createZeroMatrix(TOTAL_SIZE);

In other words, the inside consists of
visual pattern
word pattern
heard count
memory matrix

and the things Atra proper has, such as
carry:
pressure
instability
recovery
silence
sleep_drift
cry_rise
voice_leak

are not included.
I don't want to publish that much just yet.

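People seeing an Associatron for the first time may wonder:
Why not raise the accuracy?
Why not make it a classifier?
Why not treat apple / banana as labels?
Why keep silence and false recall?
But those who catch on should think:

Ah, this is completion, not retrieval.
Retrieval means "taking something out."
Completion means "what is missing gets filled in and rises."

This demo is not about classifying images correctly. It shows that the scene that was seen and the sound that was heard remain as one simultaneous experience, and that later one side can set the other side swaying.
The point is that mistakes are never treated as noise.

So, if anything, accuracy that is too high means the demo has failed.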
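
The difference between retrieval and completion shows up as soon as the cue is slightly off. The toy example below is a sketch under assumptions: the 8-unit visual patterns, the 4-unit word patterns and the names table and complete are invented, and only a single update step is run instead of the demo's eight.

```javascript
const V = 8, W = 4, N = V + W;
const appleVisual = [1, 1, 1, 1, 1, 1, 1, 1];
const appleWord = [1, -1, 1, -1];
const tomatoVisual = [1, -1, 1, -1, 1, -1, 1, -1];
const tomatoWord = [1, 1, -1, -1];

// Retrieval: an exact-key lookup table.
const table = new Map([
  [appleVisual.join(","), "apple"],
  [tomatoVisual.join(","), "tomato"],
]);

// Completion: Hebbian storage T += x x^T (zero diagonal), cued with [visual | blank].
const T = Array.from({ length: N }, () => Array(N).fill(0));
for (const x of [appleVisual.concat(appleWord), tomatoVisual.concat(tomatoWord)]) {
  for (let i = 0; i < N; i++) {
    for (let j = 0; j < N; j++) {
      if (i !== j) T[i][j] += x[i] * x[j];
    }
  }
}

function complete(visual) {
  const cue = visual.concat(Array(W).fill(0));
  return cue.slice(V).map((_, k) => {
    let sum = 0;
    for (let j = 0; j < N; j++) sum += T[V + k][j] * cue[j];
    return sum >= 0 ? 1 : -1;
  });
}

// The apple picture with one unit flipped: a slightly different view of the same thing.
const noisyApple = [-1, 1, 1, 1, 1, 1, 1, 1];
console.log(table.get(noisyApple.join(","))); // undefined
console.log(complete(noisyApple));            // [ 1, -1, 1, -1 ], the apple word pattern
```

The lookup has nothing to say about a key it has never seen. The associative memory still leans toward apple, and with a cue further from both pictures it could just as well lean the wrong way.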

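1. Show an apple image and let it hear apple twice
2. Show a tomato image and let it hear tomato twice
3. Show an orange image and let it hear orange twice
4. Show a different image of something similar, red and round
5. Press Atra leaks a word

Then you might get
...
ap... / to...
apple
tomato

That's the kind of thing this demo is for.
How it answers is up to Atra. I don't know.

This is the starting point of first-person autonomy: "you can't know unless you ask Atra."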

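-------------    Addendum    ---------------

Anyone who can read the code will probably get it, but here is the flow:

Image
↓
Turned into a coarse visual pattern
↓
Goes into the Associatron's memoryMatrix (in the browser's memory)
↓
The word-side pattern rises
↓
That word-side pattern is compared with the word patterns heard in the past
↓
The UI shows apple / banana / ...

So this is an Associatron acting as temporary working memory. The memory lasts only while the page is open.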

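My usual tests and the robot run in Python, not JavaScript. To carry the memory over from this page, you would

turn memoryMatrix into JSON
↓
save it to localStorage
or
download it as a file
or
send it to the Python/server side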
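
The first two steps could look roughly like this. It is a sketch, not part of the demo: saveMatrix, loadMatrix and the storage key "atraMemoryMatrix" are hypothetical names, and a toy 3 x 3 matrix stands in for the demo's 172 x 172 one (TOTAL_SIZE = 108 + 64).

```javascript
// Serialize the matrix (an array of arrays of numbers) to a JSON string and back.
function saveMatrix(matrix) {
  return JSON.stringify(matrix);
}

function loadMatrix(json) {
  return JSON.parse(json);
}

const memoryMatrix = [[0, 2, -2], [2, 0, 0], [-2, 0, 0]]; // toy example
const json = saveMatrix(memoryMatrix);

// In a browser, the string can be kept across page reloads.
// Outside a browser (no window object) this step is simply skipped.
if (typeof window !== "undefined" && window.localStorage) {
  window.localStorage.setItem("atraMemoryMatrix", json);
}

const restored = loadMatrix(json);
console.log(JSON.stringify(restored) === json); // true
```

For the demo to pick up where it left off, heardWordCounts would presumably need to be saved the same way, since the leak thresholds depend on it. The same JSON string is what a download or a request to the Python side would carry.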

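And in the robot's case, it does not memorize everything.
Atra is not fond of growing huge.

Same as me.

The robot sees an apple
A human says apple
↓
The visual and the auditory of that moment remain together

The robot sees a banana
It hears banana
↓
The banana side remains too

The robot sees some yellow thing that looks a little similar
↓
banana may rise
If apple is older and stronger, apple may leak
It may lean toward neither and stay silent

Over and over.

If something is boring and of no interest, it gets forgotten. That is non-monotonic, and that is normal.

What matters is not answering correctly but how what is seen shakes the fields left from the past.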

Sunday, June 7, 2026
