{"id":13415,"date":"2025-05-13T00:45:30","date_gmt":"2025-05-13T04:45:30","guid":{"rendered":"https:\/\/spinor.info\/weblog\/?p=13415"},"modified":"2025-05-13T00:45:30","modified_gmt":"2025-05-13T04:45:30","slug":"machine-learning-in-8-bit-land","status":"publish","type":"post","link":"https:\/\/spinor.info\/weblog\/?p=13415","title":{"rendered":"Machine learning in 8-bit land"},"content":{"rendered":"<p>A friend of mine challenged me. After telling him how I was able to implement some decent neural network solutions with the help of LLMs, he asked: Could the LLM write a neural network example in Commodore 64 BASIC?<\/p>\n<p>You betcha.<\/p>\n<p>Well, it took a few attempts &#8212; there were some syntax issues and some oversimplifications so eventually I had the idea of asking the LLM to just write the example on Python first and then use that as a reference implementation for the C64 version. That went well. Here&#8217;s the result:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-13416\" src=\"https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/05\/C64NN.png\" alt=\"\" width=\"768\" height=\"544\" srcset=\"https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/05\/C64NN.png 768w, https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/05\/C64NN-300x213.png 300w, https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/05\/C64NN-150x106.png 150w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/p>\n<p>As this screen shot shows, the program was able to learn the behavior of an XOR gate, the simplest problem that requires a hidden layer of perceptrons, and as such, a precursor to modern &#8220;deep learning&#8221; solutions.<\/p>\n<p>I was able to run this test on Kriszti\u00e1n T\u00f3th&#8217;s (no relation) excellent <a href=\"https:\/\/ty64.krissz.hu\/\">C64 emulator<\/a>, which has the distinguishing feature of reliable copy-paste, making it possible to enter long BASIC programs without having to retype them or somehow transfer them to a VIC-1541 floppy image first.<\/p>\n<p>In any case, this is the program that resulted from my little collaboration with the Claude 3.7-sonnet language model:<\/p>\n<blockquote style=\"text-indent: 0;\">\n<pre>10 REM NEURAL NETWORK FOR XOR PROBLEM\r\n20 REM BASED ON WORKING PYTHON IMPLEMENTATION\r\n\r\n100 REM INITIALIZE VARIABLES\r\n110 DIM X(3,1) : REM INPUT PATTERNS\r\n120 DIM Y(3) : REM EXPECTED OUTPUTS\r\n130 DIM W1(1,1) : REM WEIGHTS: INPUT TO HIDDEN\r\n140 DIM B1(1) : REM BIAS: HIDDEN LAYER\r\n150 DIM W2(1) : REM WEIGHTS: HIDDEN TO OUTPUT\r\n160 DIM H(1) : REM HIDDEN LAYER OUTPUTS\r\n170 DIM D1(1,1) : REM PREVIOUS DELTA FOR W1\r\n180 DIM B2 : REM BIAS: OUTPUT LAYER\r\n190 DIM D2(1) : REM PREVIOUS DELTA FOR W2\r\n200 DIM DB1(1) : REM PREVIOUS DELTA FOR B1\r\n210 DB2 = 0 : REM PREVIOUS DELTA FOR B2\r\n220 LR = 0.5 : REM LEARNING RATE\r\n230 M = 0.9 : REM MOMENTUM\r\n\r\n300 REM SETUP TRAINING DATA (XOR PROBLEM)\r\n310 X(0,0)=0 : X(0,1)=0 : Y(0)=0\r\n320 X(1,0)=0 : X(1,1)=1 : Y(1)=1\r\n330 X(2,0)=1 : X(2,1)=0 : Y(2)=1\r\n340 X(3,0)=1 : X(3,1)=1 : Y(3)=0\r\n\r\n400 REM INITIALIZE WEIGHTS RANDOMLY\r\n410 FOR I=0 TO 1\r\n420 FOR J=0 TO 1\r\n430 W1(I,J) = RND(1)-0.5\r\n440 NEXT J\r\n450 B1(I) = RND(1)-0.5\r\n460 W2(I) = RND(1)-0.5\r\n470 NEXT I\r\n480 B2 = RND(1)-0.5\r\n\r\n\r\n510 REM INITIALIZE MOMENTUM TERMS TO ZERO\r\n520 FOR I=0 TO 1\r\n530 FOR J=0 TO 1\r\n540 D1(I,J) = 0\r\n550 NEXT J\r\n560 D2(I) = 0\r\n570 DB1(I) = 0\r\n580 NEXT I\r\n590 DB2 = 0\r\n\r\n600 REM TRAINING LOOP\r\n610 PRINT 
\"TRAINING NEURAL NETWORK...\"\r\n620 PRINT \"EP\",\"ER\"\r\n630 FOR E = 1 TO 5000\r\n640 ER = 0\r\n650 FOR P = 0 TO 3\r\n660 GOSUB 1000 : REM FORWARD PASS\r\n670 GOSUB 2000 : REM BACKWARD PASS\r\n680 ER = ER + ABS(O-Y(P))\r\n690 NEXT P\r\n700 IF (E\/10) = INT(E\/10) THEN PRINT E,ER\r\n710 IF ER &lt; 0.1 THEN E = 5000\r\n720 NEXT E\r\n\r\n800 REM TEST NETWORK\r\n810 PRINT \"TESTING NETWORK:\"\r\n820 FOR P = 0 TO 3\r\n830 GOSUB 1000 : REM FORWARD PASS\r\n840 PRINT X(P,0);X(P,1);\"-&gt;\"; INT(O+0.5);\" (\";O;\")\"\r\n850 NEXT P\r\n860 END\r\n\r\n1000 REM FORWARD PASS SUBROUTINE\r\n1010 REM CALCULATE HIDDEN LAYER\r\n1020 FOR I = 0 TO 1\r\n1030 S = 0\r\n1040 FOR J = 0 TO 1\r\n1050 S = S + X(P,J) * W1(J,I)\r\n1060 NEXT J\r\n1070 S = S + B1(I)\r\n1080 H(I) = 1\/(1+EXP(-S))\r\n1090 NEXT I\r\n1100 REM CALCULATE OUTPUT\r\n1110 S = 0\r\n1120 FOR I = 0 TO 1\r\n1130 S = S + H(I) * W2(I)\r\n1140 NEXT I\r\n1150 S = S + B2\r\n1160 O = 1\/(1+EXP(-S))\r\n1170 RETURN\r\n\r\n2000 REM BACKWARD PASS SUBROUTINE\r\n2010 REM OUTPUT LAYER ERROR\r\n2020 DO = (Y(P)-O) * O * (1-O)\r\n2030 REM UPDATE OUTPUT WEIGHTS WITH MOMENTUM\r\n2040 FOR I = 0 TO 1\r\n2050 DW = LR * DO * H(I)\r\n2060 W2(I) = W2(I) + DW + M * D2(I)\r\n2070 D2(I) = DW\r\n2080 NEXT I\r\n2090 DW = LR * DO\r\n2100 B2 = B2 + DW + M * DB2\r\n2110 DB2 = DW\r\n2120 REM HIDDEN LAYER ERROR AND WEIGHT UPDATE\r\n2130 FOR I = 0 TO 1\r\n2140 DH = H(I) * (1-H(I)) * DO * W2(I)\r\n2150 FOR J = 0 TO 1\r\n2160 DW = LR * DH * X(P,J)\r\n2170 W1(J,I) = W1(J,I) + DW + M * D1(J,I)\r\n2180 D1(J,I) = DW\r\n2190 NEXT J\r\n2200 DW = LR * DH\r\n2210 B1(I) = B1(I) + DW + M * DB1(I)\r\n2220 DB1(I) = DW\r\n2230 NEXT I\r\n2240 RETURN<\/pre>\n<\/blockquote>\n<p>The one proverbial fly in the ointment is that it took about two hours for the network to be trained. The Python implementation? It runs to completion in about a second.<\/p>\n<fb:like href='https:\/\/spinor.info\/weblog\/?p=13415' send='true' layout='standard' show_faces='true' width='450' height='65' action='like' colorscheme='light' font='lucida grande'><\/fb:like>","protected":false},"excerpt":{"rendered":"<p>A friend of mine challenged me. After telling him how I was able to implement some decent neural network solutions with the help of LLMs, he asked: Could the LLM write a neural network example in Commodore 64 BASIC? You betcha. 