Apr 07 2026
 

Earlier tonight, at 6:44 PM Eastern time to be precise, we saw this on NASA TV:

It was followed, 42 minutes later, by this image:

Yes, that tiny thing in the depths of space is Earth. Every human being currently alive, and the remains of every human being who ever lived, are there, except for the four people on board Artemis 4, the spacecraft that took these images as part of its live video feed as it circumnavigated the Moon. (And no, don’t worry, the Moon is still there in the second picture, it’s just that from their perspective it’s mostly dark. Soon thereafter, they actually experienced a solar eclipse in space, as the Sun vanished behind the Moon for about an hour.)

Meanwhile, a friend of mine sent me a picture of something that hangs on his wall:

Yes, it means exactly what it implies: he was part of the recovery force that retrieved the crew of the last human expedition to the Moon after their successful return and oceanic splashdown, back in 1972.

I was not yet 10 back then. Now? I have already celebrated my 63rd birthday.

Past and present, both now part of one of those rare arcs of history that are worth remembering for all the right reasons.

As the Artemis II crewmembers remarked after breaking the distance record from the Earth: they hope their new record will not stand for long.

I hope so, too. Very, very much.

 Posted by at 12:50 am
Mar 28 2026
 

Very well, I was not really chatting or conferencing with a toy animal. My wife’s little tiger just served as a prop, since he was far more likely to stay put than any of our actual miniature tigers, I mean cats.

In any case, the point was not so much who I was chatting with but how: I was doing so using my very own little conferencing service, implemented in half an afternoon with prodigious coding help from Claude.

No, not “vibe coding”. I tried “vibe coding” and it’s not my cup of tea. Not because I am a control freak but because I like to know what my applications do and why, and how to fix and debug them. Moreover, I don’t mind owning the concept: that’s not the time-consuming part. The time-consuming part is implementation, and this is where coding assistants, Claude in particular, excel. I don’t outsource the combinatorial reasoning, like navigating the vast landscape of design options. Nor do I need a coding agent to type commands for me (and, on a bad day, wipe out my code base). What I need the AI for is to write the routine stuff once the design is settled.

That is exactly what we did here, and the result… well, works. The concept is simple: keep everything TCP. Of course TCP is the worst choice for real-time media streaming, except for the alternative: UDP works until it doesn’t, because it is blocked by a NAT firewall, a mobile network policy, or something else. In fact, it was my struggle to get things running well in these post-Skype days that led me to my simple (but not simplistic) implementation: sending compressed, differenced video and audio at a bandwidth that remains manageable even with my self-hosted relay host for up to maybe half a dozen users.
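For illustration, here is roughly what the framing layer of such a TCP design might look like; a minimal Python sketch, not my actual implementation. The idea is simply that each compressed audio or video frame gets a small type-and-length prefix, so the relay can split the continuous TCP byte stream back into discrete frames before fanning them out.

```python
import struct

def pack_frame(kind, payload):
    """Prefix a media frame with a 1-byte type and a 4-byte big-endian length."""
    return struct.pack(">BI", kind, len(payload)) + payload

def unpack_frames(buf):
    """Extract all complete (kind, payload) frames from buf; return leftovers.

    TCP delivers a byte stream, not messages, so a read may contain several
    frames or a partial one; incomplete trailing bytes are carried over.
    """
    frames = []
    while len(buf) >= 5:
        kind, n = struct.unpack(">BI", buf[:5])
        if len(buf) < 5 + n:
            break  # partial frame: wait for more bytes
        frames.append((kind, buf[5:5 + n]))
        buf = buf[5 + n:]
    return frames, buf
```

A relay built this way never has to understand the media itself; it just reassembles frames and retransmits them to the other participants.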

The thing works. Granted, so far I only tested it with two users (me and me) but it works robustly and reliably even over a cellular connection. I might soon get a chance to test it with real users, some friends. Until then, I have my plush tiger to talk to.

All in all, 2000+ lines of good quality, working code in half an afternoon. That’s what a good AI assistant can do under competent supervision.

 Posted by at 1:09 am
Mar 25 2026
 

In case anyone is under the illusion (or is that a delusion?) that cats are not judging us, I have proof that they do.

Yes, that’s our cat Marcel and his mother Luisa. The resemblance to Statler and Waldorf from The Muppet Show is… revealing.

 Posted by at 1:19 am
Mar 13 2026
 

I have an old radio: an ICOM IC-PCR1000. It’s a nondescript little black box, with no controls other than a power switch. It is controlled by a computer.

Its age is perfectly characterized by this simple fact: It is controlled through a serial port, good old RS-232. Not even USB.

It works perfectly well. Or rather, it did, under Windows, with its 25-year-old control software. But that same software cannot reliably connect to the radio under Linux, via Wine. It runs, it connects, but every few seconds it reports an error despite the fact that it seems to read and control the radio just fine.

Well, I solved the problem now, with the help of Claude. No, not “vibe” programming: the concept is mine; in fact, it is based on software I myself wrote some 12 years ago. At that time, I was experimenting with a C# back end connecting to the radio. This time around? A plain C back end (a translation done by ChatGPT) and a front end and middleware in HTML+JS and PHP.

So the code logic is mine, even as Claude saved me a lot of time. The layout and visual appearance, however, are entirely Claude’s doing, Claude 4.6 Opus in particular, running on my WISPL Web site. I only introduced some very minor tweaks to refine the appearance.

And yes, I am using it right now, listening to CBC Radio 2.

 Posted by at 9:42 pm
Mar 09 2026
 

Though I lived in three different countries and worked in several more, I am not terribly well-traveled. Most of the travel I’ve done in my life was within Europe or North America.

Even so, I now realize that there’s a growing list of places I visited that have since become targets of military action:

  • Lviv, in Western Ukraine, known from Austro-Hungarian times as Lemberg: I visited in 1983 or 1984 I believe. Since 2022, it’s been attacked several times by Russian drones or missiles;
  • Vilnius, the capital of Lithuania, which I also visited back then, was briefly the target of military action in 1991, during Lithuania’s less-than-fully-peaceful divorce from the former USSR;
  • Abu Dhabi in the United Arab Emirates, where I briefly did some consulting work back in the mid 2010s, has been targeted by Iran in the most recent war between Iran and the combined US-Israeli force;
  • Dubai, the same, targets even including civilian locations like a hotel or the international airport.

I suppose I could also add to this some cities in Romania that I visited often, including Timisoara and Bucharest, which saw brief bursts of regime violence during the overthrow of Nicolae Ceausescu in 1989.

Then again, I should count myself as one of the lucky ones. For me, these remain relatively distant places, distant memories. For others, it’s home.

 Posted by at 1:00 am
Feb 28 2026
 

Today was an ordinary day.

I did some real work. Working on a slide deck for an upcoming seminar. Fixing code in my WISPL chatbot implementation, so that it can handle SVG graphics better. Working on a paper. Thinking about a referee report that I am about to write. Reviewing our tax returns that I just completed. Doing some ordinary bookkeeping. Answering e-mails. Posting several nice (I hope) Quora answers. Playing with a Feynman diagram in LaTeX (the feynmf package) for one of them. Listening to music through my workstation, and watching Netflix on the same workstation while eating a meal.

Sounds perfectly ordinary, doesn’t it.

Here are the things I wasn’t doing, well, not much anyway.

  • I wasn’t thinking about Debian and KDE;
  • I wasn’t messing with a new backup for my workstation;
  • I wasn’t trying to rebuild my workflow in a new desktop environment;
  • I wasn’t searching for packages to install to allow software I need to run;
  • I wasn’t testing Wine settings for old Windows programs.

In short, I was doing perfectly ordinary things on a perfectly ordinary Saturday on what feels like a perfectly ordinary computer… no longer thinking much about the fact that a few weeks ago, I abandoned Windows as my primary desktop operating system after 35 years of muscle memory.

Why I did it of course has a lot to do with Microsoft’s policies and decisions, but also with the fact that Linux has matured in amazing ways. I tried this once almost a quarter century ago… I still have a VM of that old desktop. I almost did it… but there was too much friction, and my workflow was too dependent on Windows.

Not anymore.

Oh well. I am sure there will be moments of frustration, perhaps even regret, in the future. But by and large I actually feel liberated. And, well, things just work. Whether it is my taxes, my favorite games, my papers, my entertainment, or the software code I write: things work.

OK, I lied. I actually did some work on the machine itself as opposed to with the machine. I installed my old copy of Maple 11, the last version that supports the Classic Worksheet interface, which I prefer.

It works.

 Posted by at 11:56 pm
Feb 11 2026
 

It’s done.

I changed my primary workstation from Windows 10 to Debian with KDE. Hello, world, please say hello to my nice new KDE desktop.

I am really one of the least likely candidates to make this switch. I have been using Windows as my primary desktop operating system since Windows 3.0, back in 1990. That is a long time. And although I’ve been using Linux for almost as long (1992, SLS Linux 0.96pl12), I never quite made the switch to a Linux desktop.

I came the closest in 2001. That’s when Microsoft introduced Windows XP, with Activation. I despised Activation. This was the first time Microsoft made you feel like your computer is not yours anymore, you’re just renting space from Microsoft. So I went in almost all the way. Slackware Linux with KDE. A nice desktop. I even managed to get my essential Windows applications running (with minor glitches).

However, I was able to secure instead a small business volume license (which at the time did not require Activation). Not to mention that there was just too much friction. My primary client base was using Windows. I was developing for Windows. Heck, I even wrote books on Visual C++. So I stayed with Windows.

I “survived” Vista by ignoring it. Windows 7 was decent. Once again I “survived” Windows 8, and I eased myself into Windows 10 after discovering third-party apps that allowed me to keep a working Start menu (instead of that abomination Microsoft introduced after trying to get rid of the Start menu altogether) and allowed me to keep my desktop widgets that I got so used to.

Windows 10 at first was touted as Microsoft’s “forever” operating system, with no pre-announced sunset date. Sadly, that changed. Eventually, Microsoft did decide to retire Windows 10 in favor of its successor, Windows 11.

And now comes the saddest part. I was ready to upgrade. Not necessarily eager, especially now with everything that I knew about Windows 11 but hey, you have to move with the times and all.

Except… Except that Microsoft made it impossible. You see, my hardware is not young. Between 1992 and 2016, it appears that I replaced my hardware more or less like clockwork, every 6 years or so. And I had every expectations to continue doing that, say, in 2022… except that there was no point. The hardware I would have built in 2022, or for that matter even now, in early 2026, would be virtually identical to my 10-year old boxes, with only marginal gains. A decent (but low power, for longevity/low thermal load) Xeon CPU, 32 GB ECC RAM, a good quality motherboard, a better PSU. What qualified as a robust, solid workstation in 2016, it turns out, continues to qualify as a robust, solid workstation in 2026. Moore’s law appears to be dead. One notable exception may be the GPU and indeed, my new-old workstation has a fairly new GPU (for CUDA/machine learning work), but for that, I do not need to ditch the old hardware. Hardware that still works reliably. Hardware, of which I have redundant backup. Yet Microsoft would want me to turn all that into e-waste because… because my perfectly decent 4th generation Xeons are a teeny bit older than what Microsoft deems acceptable, and the motherboard does not support TPM 2.0, which I do not want anyway. (Contrary to suggestions, the trusted platform module is not about you trusting your computer; it’s about the likes of Microsoft trusting that they can control your computer.)

So Windows 11 was out. But… what about Windows 10 extended support? And that’s where the story turns surreal. You see, I have some very legitimate licenses of Windows, on account of the fact that for many years, I’ve been a Visual Studio Enterprise subscriber. In fact, my subscription goes back to the early beta days when it was called the Microsoft Developer Network (MSDN.) Under this license, I have legitimate copies of everything, including all versions of Windows, with appropriate activation keys. Except… except that these licenses do not entitle me to extended support. Worse yet, I cannot even buy extended support. The version of Windows 10 I installed foolishly back in 2016 is Windows 10 Enterprise. Yes, you can get extended support for Windows 10 Enterprise… assuming you acquired it as a business, through a volume license. Which I didn’t.

That settled it. For several months now, I’ve been exploring options. Which Linux? Go back to Slackware? A bit pedestrian but I am very fond of that distribution and it’s part muscle memory. Stay with RHEL/Oracle OS/Fedora? The server versions are a tad too conservative, and in any case, I don’t trust Red Hat anymore, not after what they’ve done to CentOS. Some of the CentOS successors? Again, not exactly for desktops. I also wanted a desktop running KDE, not Gnome, because I never really liked Gnome’s philosophy. Eventually, my searches and experiments allowed me to zero in on Debian: almost as old as Slackware, at least as respectable, and, well, a decent old school distro with a modern face.

I installed Debian 13 on my backup workstation four days ago. Earlier tonight, the big swap happened. With vacuum cleaner at hand (10-year old dust in places!) I swapped the backup and my primary workstation. So now I am sitting at my usual desk, typing on my worn keyboard, listening to Sirius XM on my computer… and apart from a few oddities, I no longer even notice that I am not on Windows.

Part of what made it possible of course is that today, my professional life is almost entirely operating system agnostic. I do very little development for Windows these days. It’s been months since I last fired up a version of Visual Studio. Mostly I target the Web, maybe writing back-end code in whichever language best suits the task, from Python to C++. Frankly, my biggest compatibility concern are my favorite games. But I’m sure I’ll find a way to run versions of Fallout, S.T.A.L.K.E.R., Metro, and a few more GOG games that I enjoyed playing from time to time.

Meanwhile, routine tasks work just fine, including my personal bookkeeping (using Microsoft Money 98 — yes, that ancient, since it contains my records going back almost 30 years.) I can also open just fine my Office files, including Visio diagrams, nontrivial Excel spreadsheets, and encrypted Word documents in LibreOffice. Besides, it’s not like my Windows 10 machine went anywhere. It’ll serve me as a backup box (workstation or server) likely for years to come. Until finally these old Xeon boxes either become truly obsolete or reach the end of their useful lives. But not because Microsoft forces me to ditch them.

 Posted by at 2:48 am
Jan 29 2026
 

A few days ago, I was in for a shock. Something I held as absolutely true, something that seemed self-evident in general relativity turned out to be blatantly wrong.

We learn early on that there is no dipole gravitational radiation, because there is no such thing as intrinsic gravitational dipole moment. Gravitational radiation is therefore heavily suppressed, as the lowest mode is quadrupole radiation.

Therefore, I concluded smartly, an axisymmetric system cannot possibly produce gravitational radiation as it does not have a quadrupole mass moment. Logical, right? Makes perfect sense.

Wrong.

What is true is that an axisymmetric system that is stationary – e.g., a rigid, axisymmetric rotating object with an axis of rotation that corresponds to its symmetry axis – will not radiate.

But two masses in a head-on collision? They sure do. There is even literature on this subject, some really interesting papers going back more than half a century. That’s because there is more to gravity than just a static distribution of masses: there’s also momentum. An axisymmetric system can still possess a nonzero quadrupole moment, and when that moment changes with time, as it does during a head-on collision, the gravitational field acquires a time-varying quadrupole component and the system radiates.

So I took it upon myself to calculate a simple, pedagogical case: two masses on a spring, bouncing back and forth without rotation. A neat, clean, nonrelativistic case, which can be worked out in an analytic approximation, without appealing to exotics like event horizons, without resorting to opaque numerical relativity calculations.

Yes, this system will produce gravitational radiation. Not a lot, mind you, but it will.
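For the record, here is the back-of-the-envelope version, under simplifying assumptions of my own choosing (two equal point masses $m$ at $z = \pm z(t)$, with $z(t) = a + b\cos\omega t$ and $b \ll a$); a sketch, not the full analytic treatment:

```latex
% Reduced (traceless) mass quadrupole of two masses m at z = \pm z(t):
I_{zz} = 2m\left(z^2 - \tfrac{1}{3}z^2\right) = \tfrac{4}{3}\,m z^2,
\qquad
I_{xx} = I_{yy} = -\tfrac{2}{3}\,m z^2 .
% With z(t) = a + b\cos\omega t, to leading order in b:
\dddot{I}_{zz} \approx \tfrac{8}{3}\,m a b\,\omega^3 \sin\omega t .
% Quadrupole formula, averaged over one period
% (using I_{xx} = I_{yy} = -I_{zz}/2, so \dddot{I}_{ij}\dddot{I}_{ij} = \tfrac{3}{2}\dddot{I}_{zz}^2):
P = \frac{G}{5c^5}\left\langle \dddot{I}_{ij}\dddot{I}_{ij} \right\rangle
  = \frac{3G}{10\,c^5}\left\langle \dddot{I}_{zz}^2 \right\rangle
  = \frac{16\,G\,m^2 a^2 b^2 \omega^6}{15\,c^5} .
```

Tiny for any laboratory spring, thanks to the $c^{-5}$ suppression, but manifestly nonzero, even though the configuration is axisymmetric throughout.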

Once I understood this, however, I had another concern. Over the years, surely I must have written answers on Quora promoting my flawed understanding? Yikes! How do I find those answers?

Oh wait. Not too long ago, I built a RAG: a retrieval-augmented generation AI demo, which performs semantic (cosine similarity) searches using the totality of my over 11,000 Quora answers up to mid-2025 as its corpus. That meant I could interrogate my RAG solution to find Quora answers that no keyword search could possibly uncover. So I did just that, asked my RAG a question about axisymmetry and gravitational radiation, and presto: the RAG found several answers of mine, three of which were wrong, one dead wrong.
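The retrieval core of such a RAG is genuinely small. Here is a stdlib-only Python sketch of the cosine-similarity step; the embedding vectors themselves come from an embedding model, which I am taking as given here:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query, corpus, k=3):
    """Indices of the k corpus embeddings most similar to the query embedding."""
    ranked = sorted(range(len(corpus)), key=lambda i: -cosine(query, corpus[i]))
    return ranked[:k]
```

In a real deployment the corpus vectors are precomputed and the search is vectorized, but the logic is exactly this: rank by angle, not by keywords.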

These are now corrected on Quora. And this exercise demonstrated how RAG works in an unexpected way. Note that the RAG answer is itself wrong, in part because it is faithfully based on my own incorrect Quora answers. Garbage in, garbage out. In this case, though, it meant that the same cosine similarity search zeroed in on my most relevant wrong answers. In an almost picture-perfect demonstration of the utility of a RAG-based solution, it saved me what would likely have amounted to hours of fruitless searching for past Quora answers of mine.

 Posted by at 3:33 am
Jan 28 2026
 

The other day, my friend John Moffat called me. He told me about a letter that he received the day before. An old school letter, paper-in-an-envelope, brought by the mailman.

A letter from the Prime Minister of Canada.

In these sad days, we have to question the authenticity of everything but no, this letter was the real thing: an actual letter, on embossed letterhead, with a handwritten signature.

I don’t know what prompted the Prime Minister’s office to send this letter to John. I refuse to speculate.

The letter itself stands on its own. It is a very dignified recognition of John’s amazing (and amazingly long!) career as a physicist, and a true note of appreciation that he decided to live his incredible life in this great country, Canada.

Congratulations, John. Perhaps you deserve more recognition. You definitely deserved this. And keep up the good work. I am rooting for you: perhaps I’ll be there when you celebrate your 100th birthday, still in good health and good spirits, still working on MOG/STVG, your nonlocal quantum field theory, your complexified manifold proposal and other intriguing ideas!

 Posted by at 2:32 am
Dec 20 2025
 

So I had a surprisingly smooth experience with a chat agent, most likely an AI agent though probably with some human-in-the-loop supervision. This had to do with canceling/downgrading my Sirius XM subscription, now that we no longer have a vehicle with satellite radio.

And it got me thinking. Beyond the hype, what does it take to build a reliable AI customer experience (CX) agent?

And that’s when it hit me: I already did it. Granted, not an agent per se, just the way I set up GPT-5 to play chess.

The secret? State machines.

I did not ask GPT to keep track of the board. I did not ask GPT to update the board either. I told GPT the state of the board and asked GPT to make a move.

The board state was tracked not by GPT but by conventional, deterministic code. The board is a state machine. Its state transitions are governed by the rules of chess. There is no ambiguity. The board’s state (including castling and en passant) is encoded in a FEN string unambiguously. When GPT offers a move, its validity is determined by a simple question: does it represent a valid state transition for the chessboard?
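That validate-then-apply loop can be sketched in a few lines of Python. The FEN string below is the real starting position, but `legal_moves()` is a stub I made up for illustration; in practice it would be a deterministic chess rules engine (for instance, the python-chess library) operating on the FEN:

```python
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

def legal_moves(fen):
    """Stub: a few legal opening moves for the start position only.
    A real implementation delegates to a chess rules engine."""
    if fen == START_FEN:
        return {"e2e4", "d2d4", "g1f3", "b1c3"}
    return set()

def apply_move(fen, move):
    """Placeholder: a real implementation computes the successor FEN."""
    return fen + " -> " + move

def try_move(fen, proposed):
    """Accept the LLM's proposed move only if it is a valid state transition."""
    if proposed not in legal_moves(fen):
        return None  # reject: the model is asked to propose again
    return apply_move(fen, proposed)
```

The model never mutates the board; it can only propose, and deterministic code decides.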

And this is how a good AI CX agent works. It does not unilaterally determine the state of the customer’s account. It offers state changes, which are then evaluated by the rigid logic of a state machine.

Diagram created by ChatGPT to illustrate a CX state machine

Take my case with Sirius XM. Current state: Customer with a radio and Internet subscription. Customer indicates intent to cancel radio. Permissible state changes: Customer cancels; customer downgrades to Internet-only subscription. This is where the LLM comes in: with proper scaffolding and a system prompt, it interrogates the customer. Do you have any favorite Sirius XM stations? Awesome. Are you planning on purchasing another XM radio (or a vehicle equipped with one)? No, fair enough. Would you like a trial subscription to keep listening via the Internet-only service? Great. State change initiated… And that’s when, for instance, a human supervisor comes in, to approve the request after glancing at the chat transcript.

The important thing is, the language model does not decide what the next state is. It has no direct authority over the state of the customer’s account. What it can do, the only thing it can do at this point, is to initiate a valid state transition.
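In code, the gatekeeper can be as small as this Python sketch. The state and action names are my illustrative inventions, not any real provider’s schema:

```python
# Account state machine: state -> {action: next_state}.
# Illustrative states and actions only.
TRANSITIONS = {
    "radio+internet": {"cancel_all": "cancelled",
                       "drop_radio": "internet_only"},
    "internet_only":  {"cancel_all": "cancelled"},
    "cancelled":      {},
}

def propose(state, action):
    """The LLM proposes an action; only a valid transition changes the state.

    Returns (new_state, accepted). An invalid proposal leaves the state
    untouched, which is the whole point: the model has no direct authority.
    """
    nxt = TRANSITIONS.get(state, {}).get(action)
    if nxt is None:
        return state, False
    return nxt, True
```

Everything the chat model says is just conversation until `propose()` accepts a transition, at which point a human supervisor (or further deterministic logic) can sign off.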

The hard part when it comes to designing such a CX solution is mapping the states adequately, and making sure that the AI has the right instructions. Here is what is NOT needed:

  • There is no need for a “reasoning” model;
  • There is no need for “agentic” behavior;
  • There is no need for “council-of-experts”, “chain-of-thought” reasoning, “self-critique”, or any of the other hyped inventions.

In fact, a modest locally run model like Gemma-12B would be quite capable of performing the chat function. So there’s no need even to worry about leaking confidential customer information to the cloud.

Bottom line: use language models for what they do best, associative reasoning. Do not try to use a system with no internal state and no modeling capability as a reasoning engine. That is, if I may offer a crude but (I hope) not stupid analogy, like building a world-class submarine and then, realizing that it is not capable of flying, nailing some makeshift wooden wings onto its body.

I almost feel tempted to create a mock CX Web site to demonstrate all this in practice. Then again, I realize that my chess implementation already does much of the same: the AI agent supplies a narrative and a proposed state transition, but the state (the chessboard) is maintained, and its consistency ensured, by conventional software scaffolding.

 Posted by at 3:31 pm
Dec 20 2025
 

Good-bye, 2022 Accord. Hardly knew ya.

Really, our Accord had ridiculously low mileage. We weren’t driving much even before COVID but since then? I’ve not been to NYC since 2016, or to the Perimeter Institute since, what, 2018 I think. In fact in the past 5 years, the farthest I’ve been from Ottawa was Montreal, maybe twice.

Needless to say, when our dealer saw a car with such low mileage, they pounced. Offered a new lease. I told them, sure, but I’m downsizing: no need for an Accord when we use the car this little, a Civic will do just fine. And a Civic it is.

Things missing? Very few. In no particular order:

  • No Sirius XM satellite radio (apparently it’s gone from Hondas?)
  • No built-in GPS (but Android Auto works way better than it ever did in the Accord);
  • Somewhat fewer USB and power outlets (but enough for our needs);
  • No HUD (but the instrument panel is quite adequate); and
  • No turn signals on the mirrors.

This is it. Really. That’s all. (I thought it also lacked blindspot warning, but I was mistaken.) And it’s a car just as decent and capable as the Accord, but substantially cheaper. So… who am I to complain?

So here we go, nice little Civic, until this lease expires. (No, I am not buying cars anymore. They are loaded with things that, when they go bad, are very costly to repair or replace. The technical debt is substantial.)


Almost forgot this rather important bit:

Yes. Made in Canada. These days, sadly, it matters.

 Posted by at 2:33 am
Nov 28 2025
 

Behind every front-end there is a back-end. My WISPL.COM chatbot is no exception. It’s one thing to provide a nice chatbot experience to my select users. It’s another thing to be able to manage the system efficiently.

Sure, I can, and do, perform management tasks directly in the database, using SQL commands. But it’s inelegant and inconvenient. And just because I am the only admin does not mean I cannot make my own life easier by creating a more streamlined management experience.

Take announcements. The WISPL chatbot operates in four languages. Creating an announcement entails writing it in a primary language, translating it into the three other languages, and then posting the requisite records to the database. Doing it by hand is not hard, but it is a chore.

Well, not anymore. I just created a nice back-end UI for this purpose. By itself it’s no big deal of course, but it’s the first time the software itself uses a large language model for a targeted purpose.

Note the highlighted Translate button. It sends the English-language text to a local copy of Gemma, Google’s open-weights LLM. Gemma is small but very capable. Among other things, it can produce near-flawless translations not just into German or French, but even into Hungarian.

This back-end also lets me manage WISPL chatbot users as well as the language models themselves. It shows system logs, too.

 Posted by at 5:44 pm
Nov 22 2025
 

Earlier this morning, still in bed, I was thinking about how electricity entered people’s lives in the past century or so.

My wife and I personally knew older people who spent their childhood in a world without electricity.

  • When electricity finally arrived, it was at first in the form of electric lights.
  • Not much later, simple machines appeared. Maybe a vacuum cleaner. Maybe a coffee grinder. Something with a simple electric motor and some trivial mechanical construction.
  • Next came the radio. Suddenly, electricity introduced a whole new dimension: you were never alone anymore. If you had a radio set and a power source, you could listen to the world.
  • Then there were refrigerators, revolutionizing kitchens. Leftovers were no longer waste or slop for farm animals: they could be consumed a day or two later, kept fresh in the fridge.
  • Not long after, another miracle began to pop up in homes: television sets.
  • Sometime along the way, electric stoves, ventilators, and ultimately, air conditioning also appeared in many homes.

One could almost construct a timeline along these lines. This is what was on my mind earlier in the morning as I was waking up.

And then, a few hours later, a post showed up in my feed on Facebook, courtesy of the City of Ottawa Archives, accompanied by an image of some exhibition grounds celebrating World Television Day back in 1955.

It’s almost as though Facebook read my mind.

No, I do not believe that they did (otherwise I’d be busy constructing my first official tinfoil hat) but it is still an uncanny coincidence. I could not have possibly come up with a better illustration to accompany my morning thoughts on this subject.

 Posted by at 10:16 pm
Nov 22 2025
 

It was high time, I think. I just finished putting together a Web site that showcases my AI and machine learning related work.

The site is called WISPL. It is a domain name I fortuitously obtained almost a decade ago with an entirely different concept in mind, but which fits perfectly. It’s a short, pronounceable domain name and it reminds one of the phrase, “AI whisperer”.

The site has of course been the home of my “chatbot” for more than two years already, but now it is something more. In addition to the chatbot, I now present my retrieval augmented generation (RAG) solution; I show a Web app that allows the user to play chess against GPT “properly” (while also demonstrating the ground truth that autoregressive stochastic next-token predictors will never be great reasoning engines); I showcase my work on Maxima (the computer algebra system, an example of more “conventional” symbolic AI); and I describe some of my AI/ML research projects.

 Posted by at 3:48 am
Nov 09 2025
 

While I was working on my minimalist but full implementation of a GPT, I also thought of a game that can help participants better understand how language models really work. Here are the rules:

  1. Someone asks a question.
  2. Participants take turns, making a best effort to contribute to the answer, ONE WORD AT A TIME.
  3. The round is finished when someone ends it with a period.

Say, there are three participants, Alice, Bob and Christine, trying to answer the question, “What was the most significant geopolitical event of the 20th century?”

A: THE
B: ATOMIC
C: BOMB
A: WAS
B: TESTED
C: IN
A: THE
B: SUMMER
C: OF
A: 1945
B: .

Did Alice really want to talk about the atomic bomb? Perhaps she was thinking of the Sarajevo assassination and the start of WW1. Or the collapse of the USSR.

Did Bob really mean to talk about the bomb? Perhaps he was thinking about the discovery of the atomic nature of matter and how it shaped society. Or maybe something about the atomic chain reaction?

Did Christine really mean to talk about the first atomic test, the Trinity test in New Mexico? Maybe she had in mind Hiroshima and Nagasaki.

The answer we got is an entirely sensible answer. But none of the participants knew that this would be the actual answer. There was no “mind” conceiving this specific answer. Yet the “latent knowledge” was present in the “network” of the three players. At each turn, there were high probability and lower probability variants. Participants typically but not necessarily picked the highest probability “next word”, but perhaps opted for a lower probability alternative on a whim, for instance when Bob used “TESTED” instead of “DROPPED”.

Language models do precisely this, except that in most cases, what they predict next is not a full word (though it might be) but a fragment, a token. There is no advance knowledge of what the model would say, but the latent knowledge is present, as a result of the model’s training.
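The game maps almost one-to-one onto a toy next-token sampler. Here is a hypothetical Python miniature; the probability table is invented purely for illustration, standing in for the participants’ shared latent knowledge:

```python
import random

# Invented toy "knowledge": candidate next words and their probabilities.
NEXT = {
    "<start>":  [("THE", 1.0)],
    "THE":      [("ATOMIC", 0.6), ("COLLAPSE", 0.4)],
    "ATOMIC":   [("BOMB", 0.9), ("AGE", 0.1)],
    "BOMB":     [(".", 1.0)],
    "AGE":      [(".", 1.0)],
    "COLLAPSE": [(".", 1.0)],
}

def play_round():
    """Emit one word at a time until a period ends the round, just like the game."""
    word, answer = "<start>", []
    while True:
        words, weights = zip(*NEXT[word])
        word = random.choices(words, weights=weights)[0]
        if word == ".":
            return " ".join(answer) + "."
        answer.append(word)
```

No single call “knows” the final sentence; sensible answers nonetheless emerge from the table, exactly as in the game, except that real models condition on the whole context, not just the previous word.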

In 1980, Searle argued, in the form of his famous Chinese Room thought experiment, that algorithmic symbol manipulation does not imply understanding. In his proposed game, participants who do not speak Chinese manipulate Chinese language symbols according to preset rules, conveying the illusion of comprehension without actual understanding. I think my little game offers a perfect counterexample: A non-algorithmic game demonstrating the emergence of disembodied intelligence based on the prior world knowledge of its participants, but not directly associated with any specific player.

My wife and I just played two rounds of this game. It was a fascinating experience for both of us.

 Posted by at 7:39 pm
Nov 09 2025
 

A few weeks ago I had an idea.

What if I implement a GPT? No, not something on the scale of ChatGPT, with many hundreds of billions of parameters, consuming countless terawatt-hours, training on a corpus that encompasses much of the world’s literature and most of the Internet.

No, something far more modest. How about… a GPT that emulates the world’s first chatbot, Eliza?

Long story short (the long story will follow in due course on my Web site), I succeeded. I have built a GPT from scratch in C++, including training. I constructed a sensible (though far from perfect) training corpus of user prompts and Eliza responses. And over the course of roughly a week, using a consumer-grade GPU for hardware acceleration, I managed to train my smallest model.

No, don’t expect perfection. My little model does not have hundreds of billions of parameters. It does not even have millions of parameters. It is only a 38 thousand (!) parameter model.

Yet… it works. Sometimes its output is gibberish. But most of the time, the output is definitely Eliza-like.

The best part? The model is so small, its inference runtime works well when implemented in JavaScript, running in-browser.

And here is my first ever exchange with the JavaScript implementation, unfiltered and unedited.

No, I am not going to win awards with this chatbot, but the fact that it works at all, and that it successfully learned the basic Eliza-like behavior is no small potatoes.

For what it’s worth, I was monitoring its training using a little bit of homebrew near-real-time instrumentation, which allowed me to keep an eye on key model parameters so that I could intervene, adjusting learning rates, to prevent the training from destabilizing the model.
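My interventions were manual, but the logic I was applying by hand amounts to something like the following automated sketch (thresholds invented for illustration; this is not the instrumentation I actually ran):

```python
def lr_guard(lr, loss, ema, beta=0.98, spike=1.5, cut=0.5):
    """Track an exponential moving average of the training loss; if the
    current loss spikes well above it, cut the learning rate to restabilize.

    Returns the (possibly reduced) learning rate and the updated average.
    """
    ema = loss if ema is None else beta * ema + (1 - beta) * loss
    if loss > spike * ema:
        lr = lr * cut
    return lr, ema
```

Called once per logging interval, this reproduces the hand-tuned behavior: steady loss leaves the learning rate alone, a sudden spike halves it.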

I am now training a roughly 10 times larger version. I do not yet know if that training will be successful. If it is, I expect its behavior will be more robust, with less gibberish and more Eliza-like behavior.

In the meantime, I can now rightfully claim that I know what I am talking about… after all, I have a C++ implementation, demonstrably working, complete with backpropagation, by way of credentials.

 Posted by at 1:40 am
Oct 17 2025
 

My little RAG project (little but functional, more than a mere toy demo) led to another idea. The abstract vector database (embedding) that represents my answers can be visualized, well, sort of, in a two-dimensional representation, and I built just that: an interactive visualization of all my Quora answers.

It is very educational to explore how the embedding model managed to cluster answers by semantics. As a kind of trivial example, there is a little “cat archipelago” in the upper right quadrant: several of my non-physics answers related to cats can be found in this corner. Elsewhere there is, for instance, a cluster of some of my French-language answers.

Anyhow, feel free to take a look. It’s fun. Unlike the RAG engine itself, exploring this map does not even consume any significant computing (GPU) resources on my server.

 Posted by at 7:18 pm