Nov 28 2025
 

Behind every front-end there is a back-end. My WISPL.COM chatbot is no exception. It’s one thing to provide a nice chatbot experience to my select users. It’s another thing to be able to manage the system efficiently.

Sure, I can, and do, perform management tasks directly in the database, using SQL commands. But it’s inelegant and inconvenient. And just because I am the only admin does not mean I cannot make my own life easier by creating a more streamlined management experience.

Take announcements. The WISPL chatbot operates in four languages. Creating an announcement entails writing it in a primary language, translating it into the three other languages, and then posting the requisite records to the database. Doing it by hand is not hard, but it is a chore.

Well, not anymore. I just created a nice back-end UI for this purpose. By itself it’s no big deal of course, but it’s the first time the software itself uses a large language model for a targeted purpose.

Note the highlighted Translate button. It sends the English-language text to a local copy of Gemma, Google’s open-weights LLM. Gemma is small but very capable. Among other things, it can produce near-flawless translations not just into German or French, but even into Hungarian.
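Just to illustrate the plumbing behind such a button: the sketch below is not the actual WISPL code. It assumes, purely hypothetically, that the local Gemma instance is served through an Ollama-style HTTP endpoint at localhost:11434; the URL, model name, and prompt are all placeholders.

```cpp
#include <curl/curl.h>
#include <iostream>
#include <string>

// Accumulate the HTTP response body into a std::string.
static size_t collect(char* data, size_t size, size_t nmemb, void* out) {
    static_cast<std::string*>(out)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl) return 1;

    // Hypothetical endpoint, model name, and prompt; adjust to taste.
    const std::string body =
        R"({"model": "gemma", "stream": false,)"
        R"( "prompt": "Translate this announcement into Hungarian: ..."})";

    std::string response;
    curl_slist* headers =
        curl_slist_append(nullptr, "Content-Type: application/json");
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:11434/api/generate");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body.c_str());
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

    CURLcode rc = curl_easy_perform(curl);
    if (rc == CURLE_OK)
        std::cout << response << std::endl;  // JSON containing the translation

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}
```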

This back-end also lets me manage WISPL chatbot users as well as the language models themselves. It shows system logs, too.

 Posted by at 5:44 pm
Nov 22 2025
 

Earlier this morning, still in bed, I was thinking about how electricity entered people’s lives in the past century or so.

My wife and I personally knew older people who spent their childhood in a world without electricity.

  • When electricity finally arrived, it was at first in the form of electric lights.
  • Not much later, simple machines appeared. Maybe a vacuum cleaner. Maybe a coffee grinder. Something with a simple electric motor and some trivial mechanical construction.
  • Next came the radio. Suddenly, electricity introduced a whole new dimension: you were never alone anymore. If you had a radio set and a power source, you could listen to the world.
  • Then there were refrigerators, revolutionizing kitchens. Leftovers were no longer waste or slop for farm animals: they could be consumed a day or two later, kept fresh in the fridge.
  • Not long after, another miracle began to pop up in homes: television sets.
  • Sometime along the way, electric stoves, ventilators, and ultimately, air conditioning also appeared in many homes.

One could almost construct a timeline along these lines. This is what was on my mind earlier in the morning as I was waking up.

And then, a few hours later, a post showed up in my feed on Facebook, courtesy of the City of Ottawa Archives, accompanied by an image of some exhibition grounds celebrating World Television Day back in 1955.

It’s almost as though Facebook read my mind.

No, I do not believe that they did (otherwise I’d be busy constructing my first official tinfoil hat) but it is still an uncanny coincidence. I could not have possibly come up with a better illustration to accompany my morning thoughts on this subject.

 Posted by at 10:16 pm
Nov 22 2025
 

It was high time, I think. I just finished putting together a Web site that showcases my AI and machine learning work.

The site is called WISPL. The domain name is one I fortuitously obtained almost a decade ago with an entirely different concept in mind, but it fits perfectly: it is short, pronounceable, and reminiscent of the phrase “AI whisperer”.

The site has of course been the home of my “chatbot” for more than two years already, but now it is something more. In addition to the chatbot, I now present my retrieval augmented generation (RAG) solution; I show a Web app that allows the user to play chess against GPT “properly” (while also demonstrating the ground truth that autoregressive stochastic next-token predictors will never be great reasoning engines); I showcase my work on Maxima (the computer algebra system, an example of more “conventional” symbolic AI); and I describe some of my AI/ML research projects.

 Posted by at 3:48 am
Nov 11 2025
 

I came across this meme earlier today:

For the first time in history, you can say “He is an idiot” and 90% of the world will know whom you are talking about.

In an inspired moment, I fed this sentence, unaltered, to Midjourney. Midjourney 6.1 to be exact.

Our AI friends are… fun.

 Posted by at 6:19 pm
Nov 09 2025
 

While I was working on my minimalist but full implementation of a GPT, I also thought of a game that can help participants better understand how language models really work. Here are the rules:

  1. Someone asks a question.
  2. Participants take turns, making a best effort to contribute to the answer, ONE WORD AT A TIME.
  3. The round is finished when someone ends it with a period.

Say, there are three participants, Alice, Bob and Christine, trying to answer the question, “What was the most significant geopolitical event of the 20th century?”

A: THE
B: ATOMIC
C: BOMB
A: WAS
B: TESTED
C: IN
A: THE
B: SUMMER
C: OF
A: 1945
B: .

Did Alice really want to talk about the atomic bomb? Perhaps she was thinking of the Sarajevo assassination and the start of WW1. Or the collapse of the USSR.

Did Bob really mean to talk about the bomb? Perhaps he was thinking about the discovery of the atomic nature of matter and how it shaped society. Or maybe something about the atomic chain reaction?

Did Christine really mean to talk about the first atomic test, the Trinity test in New Mexico? Maybe she had in mind Hiroshima and Nagasaki.

The answer we got is entirely sensible. But none of the participants knew in advance that this would be the actual answer. There was no “mind” conceiving this specific answer. Yet the “latent knowledge” was present in the “network” of the three players. At each turn, there were high-probability and lower-probability variants. Participants typically picked the highest-probability “next word”, but sometimes opted for a lower-probability alternative on a whim, as when Bob used “TESTED” instead of “DROPPED”.

Language models do precisely this, except that in most cases, what they predict next is not a full word (though it might be) but a fragment, a token. There is no advance knowledge of what the model would say, but the latent knowledge is present, as a result of the model’s training.
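To make those high- and lower-probability variants concrete: a model assigns a raw score (a logit) to each candidate token, squashes the scores into a probability distribution, and samples from it. The sketch below uses made-up tokens and scores; it illustrates the mechanism, nothing more.

```cpp
#include <cmath>
#include <iostream>
#include <random>
#include <string>
#include <vector>

int main() {
    // Made-up candidate tokens and raw scores (logits) for illustration.
    std::vector<std::string> tokens = {"DROPPED", "TESTED", "DETONATED", "FIRST"};
    std::vector<double> logits      = {2.1,       1.7,      0.9,         0.2};

    // Softmax: p_i = exp(l_i) / sum_j exp(l_j)
    std::vector<double> probs(logits.size());
    double sum = 0.0;
    for (size_t i = 0; i < logits.size(); ++i)
        sum += probs[i] = std::exp(logits[i]);
    for (double& p : probs) p /= sum;

    // Sample: the likeliest token usually wins, but not always, exactly like
    // Bob choosing TESTED over the likelier DROPPED on a whim.
    std::mt19937 rng(std::random_device{}());
    std::discrete_distribution<size_t> pick(probs.begin(), probs.end());
    size_t choice = pick(rng);

    for (size_t i = 0; i < tokens.size(); ++i)
        std::cout << tokens[i] << ": " << probs[i] << "\n";
    std::cout << "sampled: " << tokens[choice] << "\n";
}
```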

In 1980, Searle argued, in the form of his famous Chinese Room thought experiment, that algorithmic symbol manipulation does not imply understanding. In the thought experiment, a person who speaks no Chinese manipulates Chinese symbols according to preset rules, conveying the illusion of comprehension without actual understanding. I think my little game offers a perfect counterexample: a non-algorithmic game demonstrating the emergence of disembodied intelligence, based on the prior world knowledge of its participants but not directly associated with any specific player.

My wife and I just played two rounds of this game. It was a fascinating experience for both of us.

 Posted by at 7:39 pm
Nov 09 2025
 

A few weeks ago I had an idea.

What if I implement a GPT? No, not something on the scale of ChatGPT, with many hundreds of billions of parameters, consuming countless terawatt-hours, training on a corpus that encompasses much of the world’s literature and most of the Internet.

No, something far more modest. How about… a GPT that emulates the world’s first chatbot, Eliza?

Long story short (the long story will follow in due course on my Web site), I succeeded. I have built a GPT from scratch in C++, including training. I constructed a sensible (though far from perfect) training corpus of user prompts and Eliza responses. And over the course of roughly a week, using a consumer-grade GPU for hardware acceleration, I managed to train my smallest model.
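To give a taste of what “from scratch” entails, here is the core operation of any GPT-style model: a causal self-attention forward pass. This is not my project’s code, just a minimal single-head sketch with toy dimensions and placeholder weights; the real thing adds tokenization, multiple heads and layers, and of course backpropagation.

```cpp
#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

using Mat = std::vector<std::vector<double>>;

// x: T x D sequence of embeddings; Wq, Wk, Wv: D x D projection matrices.
Mat attention(const Mat& x, const Mat& Wq, const Mat& Wk, const Mat& Wv) {
    const size_t T = x.size(), D = x[0].size();
    auto matmul = [&](const Mat& a, const Mat& w) {
        Mat r(a.size(), std::vector<double>(D, 0.0));
        for (size_t i = 0; i < a.size(); ++i)
            for (size_t j = 0; j < D; ++j)
                for (size_t k = 0; k < D; ++k) r[i][j] += a[i][k] * w[k][j];
        return r;
    };
    Mat Q = matmul(x, Wq), K = matmul(x, Wk), V = matmul(x, Wv);
    Mat out(T, std::vector<double>(D, 0.0));
    for (size_t i = 0; i < T; ++i) {
        // Scaled dot-product scores against positions 0..i only: the causal
        // mask, so a token attends to the past, never to the future.
        std::vector<double> w(i + 1);
        double mx = -1e300, sum = 0.0;
        for (size_t j = 0; j <= i; ++j) {
            double s = 0.0;
            for (size_t k = 0; k < D; ++k) s += Q[i][k] * K[j][k];
            w[j] = s / std::sqrt(static_cast<double>(D));
            mx = std::max(mx, w[j]);
        }
        for (size_t j = 0; j <= i; ++j) sum += w[j] = std::exp(w[j] - mx);
        for (size_t j = 0; j <= i; ++j)  // softmax-weighted sum of values
            for (size_t k = 0; k < D; ++k) out[i][k] += (w[j] / sum) * V[j][k];
    }
    return out;
}

int main() {
    Mat x = {{1, 0}, {0, 1}, {1, 1}};  // three toy embeddings, D = 2
    Mat I = {{1, 0}, {0, 1}};          // identity matrices as dummy weights
    for (const auto& row : attention(x, I, I, I))
        std::cout << row[0] << " " << row[1] << "\n";
}
```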

No, don’t expect perfection. My little model does not have hundreds of billions of parameters. It does not even have millions of parameters. It is only a 38 thousand (!) parameter model.
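To put 38 thousand in perspective, some purely hypothetical arithmetic: the dimensions below are placeholders, not my model’s actual configuration. They merely show how a budget of a few tens of thousands of parameters might be spent in a tiny GPT-style network.

```cpp
#include <iostream>

int main() {
    // Placeholder dimensions, not the actual model's configuration.
    long V = 128;  // vocabulary size
    long D = 32;   // embedding width
    long L = 2;    // transformer layers
    long T = 64;   // context length

    long embeddings = V * D + T * D;    // token + positional embeddings
    long attention  = 4 * D * D;        // Wq, Wk, Wv, Wo per layer
    long mlp        = 2 * D * (4 * D);  // up/down projections per layer
    long head       = V * D;            // untied output projection

    long total = embeddings + L * (attention + mlp) + head;
    std::cout << total << " parameters, plus biases and layer norms\n";
    // Prints 34816: the same ballpark as a 38k-parameter model.
}
```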

Yet… it works. Sometimes its output is gibberish. But most of the time, the output is definitely Eliza-like.

The best part? The model is so small that its inference runtime works well when implemented in JavaScript, running in-browser.

And here is my first ever exchange with the JavaScript implementation, unfiltered and unedited.

No, I am not going to win awards with this chatbot, but the fact that it works at all, and that it successfully learned the basic Eliza-like behavior is no small potatoes.

For what it’s worth, I monitored its training using a bit of homebrew near-real-time instrumentation, which let me keep an eye on key model parameters and intervene, adjusting learning rates, to prevent the training from destabilizing the model.
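The sort of intervention I mean can be sketched like this. The threshold and the halving rule below are illustrative assumptions, not my actual instrumentation; the idea is simply to watch the global gradient norm and rein in the learning rate before an update destabilizes the model.

```cpp
#include <cmath>
#include <iostream>
#include <vector>

double grad_norm(const std::vector<double>& grads) {
    double s = 0.0;
    for (double g : grads) s += g * g;
    return std::sqrt(s);
}

void training_step(std::vector<double>& params,
                   const std::vector<double>& grads,
                   double& lr, double spike_threshold) {
    double norm = grad_norm(grads);
    if (norm > spike_threshold) {
        lr *= 0.5;  // intervene before the update destabilizes the model
        std::cout << "grad norm " << norm << " exceeded threshold; lr now "
                  << lr << "\n";
    }
    for (size_t i = 0; i < params.size(); ++i)
        params[i] -= lr * grads[i];  // plain SGD update
}

int main() {
    std::vector<double> params = {0.1, -0.2, 0.3};
    double lr = 0.01;
    training_step(params, {0.5, -0.4, 0.2}, lr, 10.0);    // calm step
    training_step(params, {90.0, -80.0, 50.0}, lr, 10.0); // spiking step
}
```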

I am now training a roughly 10 times larger version. I do not yet know if that training will be successful. If it is, I expect its behavior will be more robust, with less gibberish and more Eliza-like behavior.

In the meantime, I can now rightfully claim that I know what I am talking about… after all, I have a C++ implementation, demonstrably working, complete with backpropagation, by way of credentials.

 Posted by at 1:40 am