{"id":13768,"date":"2025-11-09T01:40:38","date_gmt":"2025-11-09T06:40:38","guid":{"rendered":"https:\/\/spinor.info\/weblog\/?p=13768"},"modified":"2025-11-09T01:40:38","modified_gmt":"2025-11-09T06:40:38","slug":"elizagpt","status":"publish","type":"post","link":"https:\/\/spinor.info\/weblog\/?p=13768","title":{"rendered":"ElizaGPT"},"content":{"rendered":"<p>A few weeks ago I had an idea.<\/p>\n<p>What if I implement a GPT? No, not something on the scale of ChatGPT, with many hundreds of billions of parameters, consuming countless terawatt-hours, training on a corpus that encompasses much of the world&#8217;s literature and most of the Internet.<\/p>\n<p>No, something far more modest. How about&#8230; a GPT that emulates the world&#8217;s first chatbot, <a href=\"https:\/\/en.wikipedia.org\/wiki\/ELIZA\">Eliza<\/a>?<\/p>\n<p>Long story short (the long story will follow in due course on my Web site) I succeeded. I have built a GPT from scratch in C++, including training. I constructed a sensible (though far from perfect) training corpus of user prompts and Eliza responses. And over the course of roughly a week, using a consumer-grade GPU for hardware acceleration, I managed to train my smallest model.<\/p>\n<p>No, don&#8217;t expect perfection. My little model does not have hundreds of billions of parameters. It does not even have millions of parameters. It is only a 38 thousand (!) parameter model.<\/p>\n<p>Yet&#8230; it works. Sometimes its output is gibberish. But most of the time, the output is definitely Eliza-like.<\/p>\n<p>The best part? 
The model is so small that its inference runtime works well when implemented in JavaScript, running in-browser.<\/p>\n<p>And here is my first-ever exchange with the JavaScript implementation, unfiltered and unedited.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-13770\" src=\"https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/11\/elizaGPT.png\" alt=\"\" width=\"488\" height=\"383\" srcset=\"https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/11\/elizaGPT.png 488w, https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/11\/elizaGPT-300x235.png 300w, https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/11\/elizaGPT-150x118.png 150w\" sizes=\"(max-width: 488px) 100vw, 488px\" \/><\/p>\n<p>No, I am not going to win awards with this chatbot, but the fact that it works at all, and that it successfully learned the basic Eliza-like behavior, is no small potatoes.<\/p>\n<p>For what it&#8217;s worth, I was monitoring its training using a little bit of homebrew near-real-time instrumentation, which allowed me to keep an eye on key model parameters, so that I could intervene and adjust learning rates before the training destabilized the model.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-13769\" src=\"https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/11\/training.png\" alt=\"\" width=\"361\" height=\"477\" srcset=\"https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/11\/training.png 722w, https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/11\/training-227x300.png 227w, https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/11\/training-114x150.png 114w\" sizes=\"(max-width: 361px) 100vw, 361px\" \/><\/p>\n<p>I am now training a roughly 10 times larger version. I do not yet know if that training will be successful. 
If it is, I expect its behavior will be more robust, with less gibberish and more Eliza-like behavior.<\/p>\n<p>In the meantime, I can now rightfully claim that I know what I am talking about&#8230; after all, I have a C++ implementation, demonstrably working, complete with backpropagation, by way of credentials.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A few weeks ago I had an idea. What if I implement a GPT? No, not something on the scale of ChatGPT, with many hundreds of billions of parameters, consuming countless terawatt-hours, training on a corpus that encompasses much of the world&#8217;s literature and most of the Internet. No, something far more modest. How about&#8230; <a href='https:\/\/spinor.info\/weblog\/?p=13768' class='excerpt-more'>[&#8230;]<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[58,35,36],"tags":[],"class_list":["post-13768","post","type-post","status-publish","format-standard","hentry","category-cybernetics","category-personal","category-programming","category-58-id","category-35-id","category-36-id","post-seq-1","post-parity-odd","meta-position-corners","fix"],"_links":{"self":[{"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/posts\/13768","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=%2Fwp%2Fv2
%2Fcomments&post=13768"}],"version-history":[{"count":3,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/posts\/13768\/revisions"}],"predecessor-version":[{"id":13773,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/posts\/13768\/revisions\/13773"}],"wp:attachment":[{"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=13768"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=13768"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=13768"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}