Jan 082025
 

We are barely a week into the new year, 2025. Yet here are some news items that would have sounded like outlandish B-movie nonsense just a few years ago.

  • The incoming president-elect of the United States expressed his interest in annexing Greenland, and did not rule out the use of force against a NATO ally;
  • The incoming president-elect of the United States expressed his interest in reoccupying the Panama Canal zone and did not rule out the use of force;
  • The incoming president-elect mused about turning Canada into the 51st state and did not rule out the use of “economic force” to accomplish this;
  • Ukraine’s offensive in Russia’s Kursk region led to a large number of casualties on the Russian side, including thousands of North Korean troops;
  • North Korea successfully tested a new intermediate-range missile;
  • An article in Foreign Affairs magazine argues that South Korea should acquire nuclear capability for deterrence;
  • After several similar incidents involving Russian ships in the Baltic Sea, now a Chinese vessel damaged undersea cables connecting Taiwan;
  • A far-right politician in EU member state Austria is set to form the next government of the country;
  • Alien ship set to land in New York’s Central Park turns back at the last minute – “Too dangerous, no intelligent life,” they message their home planet.

OK, I threw in the last one. But the rest? In 2025? Aren’t they just as outlandish as the bit about aliens? Aren’t we, and by that I mean the whole human race, supposed to be, you know, a tad more intelligent?

Guess not. Can’t wait for the world to be taken over by AI-assisted cats.

 Posted by at 4:47 am
Dec 222024
 

So I asked Claude to do something for me for Christmas. Claude happily obliged.

Can you generate for me SVG code that draws a nice Christmas tree?

Certainly! I'll create an SVG code that draws a simple but nice Christmas tree for you. Here's the SVG code:


<svg width="300" height="400" xmlns="http://www.w3.org/2000/svg">
  <!-- Tree trunk -->
  <rect x="140" y="350" width="20" height="50" fill="#8B4513"/>
  
  <!-- Tree layers -->
  <polygon points="150,50 50,350 250,350" fill="#228B22"/>
  <polygon points="150,100 70,300 230,300" fill="#32CD32"/>
  <polygon points="150,150 90,250 210,250" fill="#3CB371"/>
  
  <!-- Ornaments -->
  <circle cx="120" cy="200" r="10" fill="red"/>
  <circle cx="180" cy="220" r="10" fill="gold"/>
  <circle cx="150" cy="280" r="10" fill="blue"/>
  <circle cx="100" cy="320" r="10" fill="purple"/>
  <circle cx="200" cy="300" r="10" fill="silver"/>
  
  <!-- Star on top -->
  <polygon points="150,40 140,60 160,60" fill="yellow" stroke="orange" stroke-width="2"/>
</svg>

This SVG code creates a Christmas tree with the following elements:

  1. A brown trunk
  2. Three layers of green triangles for the tree's foliage
  3. Colorful ornaments (circles) in red, gold, blue, purple, and silver
  4. A yellow star on top

You can copy this code and save it as a .svg file, then open it in a web browser or an SVG-compatible image viewer to see the result. You can also embed it directly in an HTML file if you want to display it on a webpage.

The tree is simple but recognizable, with a classic triangular shape and festive decorations. Feel free to adjust the colors, sizes, or positions of the elements to customize the tree to your liking!

Can you generate for me a Lilypond tune for the piano, for two hands, to accompany this nice Christmas tree?

Certainly! I'll create a simple, festive piano tune using Lilypond that could accompany a Christmas tree. This tune will be in 3/4 time, reminiscent of a waltz, with a cheerful melody in the right hand and a simple accompaniment in the left hand.

Here's the Lilypond code for a short Christmas-themed piano piece:

\version "2.22.1"

\header {
  title = "Christmas Tree Waltz"
  composer = "AI Composer"
}

upper = \relative c'' {
  \clef treble
  \key c \major
  \time 3/4

  c4 e g | a2 g4 | f4 a c | g2. |
  e4 g c | b2 a4 | g4 f e | d2. |
  c4 e g | a2 g4 | f4 a c | e2. |
  d4 b g | c2 b4 | c2. ~ | c2. |
}

lower = \relative c' {
  \clef bass
  \key c \major
  \time 3/4

  c,4 <e g> <e g> | f,4 <f' a> <f a> | f,4 <f' a> <f a> | g,4 <f' g> <f g> |
  c4 <e g> <e g> | g,4 <f' g> <f g> | c4 <e g> <e g> | g,4 <f' g> <f g> |
  c4 <e g> <e g> | f,4 <f' a> <f a> | f,4 <f' a> <f a> | c4 <e g> <e g> |
  g,4 <f' g> <f g> | c4 <e g> <e g> | c4 <e g> <e g> | c2. |
}

\score {
  \new PianoStaff <<
    \new Staff = "upper" \upper
    \new Staff = "lower" \lower
  >>
  \layout { }
  \midi { \tempo 4 = 120 }
}
Lilypond Score

This Lilypond code creates a simple, waltz-like tune that could be played while admiring a Christmas tree. Here's a brief explanation of the piece:

  1. It's in the key of C major, which is simple and cheerful.
  2. The time signature is 3/4, giving it a gentle, swaying feel.
  3. The right hand (upper staff) plays a melody that moves mostly stepwise with some small leaps, reminiscent of decorating a tree or hanging ornaments.
  4. The left hand (lower staff) provides a simple waltz-like accompaniment, with a bass note followed by two chords in each measure.
  5. The piece is 16 measures long, which is enough to create a festive atmosphere without being too complex.

To use this code:

  1. Copy it into a text file and save it with a .ly extension (e.g., "christmas_tree_waltz.ly").
  2. Process it with Lilypond to generate sheet music (PDF) and a MIDI file.
  3. You can play the MIDI file to hear how it sounds, or read the sheet music to play it on a piano.

This tune should provide a nice, festive accompaniment to your Christmas tree SVG!

 Posted by at 12:17 am
Dec 212024
 

I have been debating with myself whether or not I should try to publish these thoughts in a formal journal but I think it’s wisest not to do so. After all, I am not an expert: I “dabbled” in machine learning but I never built a language model, and cognitive science is something that I know precious little about.

Still, I think my thoughts are valuable, especially considering how often I read thoughts from others who are trying to figure out what to make of GPT, Claude, or any of the other sophisticated models.

I would like to begin with two working definitions.

I call an entity sentient if it has the ability to form a real-time internal model of its environment with itself in it, and use this model to plan its actions. Under this definition, humans are obviously sentient. So are cats, and so are self-driving automobiles.

I call an entity sapient if it has the ability to reason about its own capabilities and communicate its reasoning to others. By writing this very sentence, I am demonstrating my own sapience. Cats are obviously not sapient, nor are self-driving automobiles. So humans are unique (as far as we know): we are both sapient and sentient.

But then come LLMs. LLMs do not have an internal model of their environment. They do not model themselves in the context of that environment. They do not plan their actions. They just generate text in response to input text. There’s no sentience there.

Yet, under my definition LLMs are clearly sapient. They have the ability to describe their own capabilities, reason about those capabilities, and communicate their reasoning to others. In fact, arguably they are more sapient than many human beings I know!

This then raises an interesting question. What does it take for an intelligence to be “general purpose”, that is, capable of acting freely, settings its own goals, pursuing these goals with purpose, learning from the experience? Is it sufficient to have both sapience and sentience? Could we just install a language model as a software upgrade to a self-driving car and call it mission accomplished?

Not quite. There are two more elements that are present in cats and humans, but not in self-driving cars or language models, at least not at the level of sophistication that we need.

First, short-term memory. Language models have none. How come, you wonder? They clearly “remember” earlier parts of a conversation, don’t they? Well… not exactly. Though not readily evident when you converse with one, what actually happens is that every turn in the conversation starts with a blank slate. And at every turn, the language model receives a copy of the entire conversation up to that point. This creates the illusion of memory: the model “remembers” what was said earlier, because its most recent instance received that full transcript along with your last question. This method works for brief conversations, but for a general-purpose intelligence, clearly something more sophisticated might be needed. (As an interim solution, to allow for conversations of unlimited length without completely losing context, I set up my own front-end solution to Claude and GPT so that when the transcript gets too long, it asks the LLM itself to replace it with a summary.)

Second, the ability to learn. That “P” in GPT stands for pretrained. Language models today are static models, pretrained by their respective publishers. For a general-purpose intelligence, it’d be important to implement some form of continuous learning capability.

So there you have it. Integration of “sentience” (a real-time internal model of the environment with oneself in it) and “sapience” (the ability to reason and communicate about one’s own capabilities) along with continuous learning and short-term memory. I think that once these features are fully integrated into a coherent whole, we will witness the birth of a true artificial general intelligence, or AGI. Of course we might also wish to endow that entity with agency: the ability to act on its own, as opposed to merely responding to user requests; the ability to continuously experience the world through senses (visual, auditory, etc.); not to mention physical agency, the ability to move around, and manipulate things in, physical reality. (On a side note, what if our AGI is a non-player character, NPC, in a virtual world? What would be the ethical implications?)

Reading what I just wrote so far also reminds me why it is wiser not to seek formal publication. For all I know, many of these thoughts were expressed elsewhere already. I know far too little about the current state-of-the-art of research in these directions and the relevant literature. So let me just leave these words here in my personal blog, as my uninformed (if not clever, at least I hope not painfully dumb) musings.

 Posted by at 6:15 pm
Sep 252024
 

Look what the mailman just brought. Or rather, the Amazon delivery person:

It’s the third volume of Richard Bartle‘s amazing Dheghōm trilogy.

I am proud to call Richard a friend (I hope he does not object) as I’ve known him online for more than 30 years and we also met in person a couple of times. He is a delightful, very knowledgeable fellow, a true British scholar, one of the foremost authorities on virtual worlds, the world of online gaming. He is, of course, along with Roy Trubshaw, credited as one of the authors of MUD, the Multi-User Dungeon, the world’s first multi-user adventure game, which I proudly ported to C++ 25 years ago, running it ever since on my server for those few players who still care to enjoy a text-only virtual world.

When he is not teaching, Richard also writes books. Delightful stories. Among them this Dheghōm trilogy.

Dheghōm is the reconstructed Proto-Indo-European name for the Earth goddess or mother Earth. In Richard’s story, told in the form of recovered fragments from documents, blog entries, and other notes, we gradually find out more about the nature of our (real? virtual?) reality and its connection with the roots of a great many of our languages.

Someone used the word “unputdownable” in their Amazon review of the first volume and I quite agree. I know that for Richard, these books were labors of love, but I honestly think they deserve to be published by a major publisher. Until then, all I can hope for is that many people will do as I did and buy a copy. Being a bit old-fashioned when it comes to books, I actually bought the paperbacks, even though I already read the third volume in electronic form when Richard first made the draft manuscript available online.

Thank you, Richard, for such a read, a trilogy that has the best qualities of good science-fiction: entertaining, memorable, thought-provoking and, ultimately, also a bit of a cautionary tale.

 Posted by at 6:46 pm
Sep 162024
 

Inspired by something my wife told me, I asked Midjourney to show us what characters from famous paintings would appear like “after hours”, when they are allowed to leave the museum and go for a stroll. Specifically, my prompt read: “An image showing the Girl with a Pearl Earring from the painting by Vermeer and the Mona Lisa, after hours, walking down a modern street, chatting and eating ice cream”.

Here is one of the better results.

Am I wrong to be impressed?

 Posted by at 12:58 am
Aug 072024
 

It’s been nearly two years since the world has become feverish about GPT and its cousines, large language models that for many represented their first real interaction with machine intelligence.

Yet misconceptions abound. Expectations against these language models are often unrealistic, which then result in damning evaluations, still often characterizing the LLMs as mere “stochastic parrots”.

In reality, they are neither just random text generators, nor true intelligences with reasoning capability. They are language models.

I keep thinking that our future would be in safer hands if we let AI-assisted cats take over the reins.

What does that mean? They model, by learning through terabytes of examples, relationships between words and phrases, sections of text. Associations, in other words. They know that apples are red, not liquid; that the sky is blue, not serrated. Which is to say, they model language but language itself models reality.

The sheer size of these models, combined with the tremendous amount of material used to train them, leads to superhuman capabilities. The models are fluent in many languages. They understand intent. They can help uncover gaps in your knowledge, something that happened to me on numerous occasions. They can translate solutions into workable computer code. They know tricks of the trade that even experienced programmers may not be aware of. They can teach you, as indeed the models have taught me a thing or two about specific details of modern machine learning architectures. They can even offer some insight into their own capabilities and limitations.

Throughout it all, however, they rely primarily on their associative capabilities. They are not reasoning machines. Reasoning for these models is as hard as it is for you and me to multiply large numbers in our heads, without the benefit of pencil and paper or a calculator.

And ultimately, they are still just language models. Imagine if the speech center of your brain was somehow excised, made to operate on its own, without being able to rely on other parts of your brain hardware. No sensory inputs anymore. No ability to visualize things, to recall sounds, to imagine anything. No sense of continuity, no internal monologue, no “self”. Just a speech center that, when triggered, responds by generating words, but without the benefit of the instant reality check that would be offered by other parts of your brain acting in supervisory roles.

That’s what GPT and Claude really are.

So to expect them to excel at, say, solving nontrivial logic puzzles is like expecting a suspension bridge to work well as an airplane. Wrong tool for the wrong job.

I can certainly imagine LLMs (and preferably, continuously trained as opposed to pretrained LLMs) in the future, working as part of a larger network of specialized machine learning components, forming a complex “artificial brain”. But LLMs are not that, not yet. They are just one part of the puzzle, though arguably, they might very well represent the most important part.

It is, after all, through language that we learn the ability to not just react to the world around us but to comprehend it.

 Posted by at 11:48 pm
Aug 052024
 

It’s a civic holiday Monday that feels like a Saturday, reminding me of an old Soviet-era science-fiction novel, Monday begins on Saturday, by the Strugatsky brothers. It’s also a rather gloomy Monday morning, so it’s time for me to grumble about a few things.

For instance, how politics infuses everything these days. I signed up to follow a Facebook group dedicated to brutalist architecture, which for some inexplicable reason, I like. The comments section in one of the first posts I saw rapidly deteriorated into political bickering, as to whether or not it was appropriate to repurpose one of the Nazi-era indestructible flak towers in Hamburg as a luxury hotel. Because you know, politics is everything.

Speaking of which, I saw another post elsewhere about employees of a large US company who, after being told how successful the company was last year, were informed in the same breath that the company will cut their pension plan contributions. Needless to say, there followed comments about the evils of capitalism. Having experienced both capitalism and one of its alternatives, a socialist economy with central planning, all I can say is that capitalism works most of the time until it doesn’t; but when it doesn’t, victims are ever so eager to replace it with something that never works instead.

Then there was this post at an online news site claiming that it is practically impossible to run an ethical AI company. Well, what can I say? If you are telling me that allowing machine learning algorithms to learn from accumulated human knowledge is unethical, then sure, you are absolutely right. Then again, I suspect that what mainly drives such complaints is blatant ignorance of how machine learning works in the first place.

OK, well, never mind that, there’s good news. A fusion energy breakthrough: Neutron impact on tokamak components uncovered. Er… Say again? You are telling me that after 70+ years of research, we are beginning to understand why, or how, a heavy neutron flux rapidly destroys test equipment in the lab? Isn’t that like, say, greeting it as a “steam turbine breakthrough” when a prehistoric tribe manages to draw a spark from slamming together two rocks?

Oh well. On mornings like this, I feel I am beginning to comprehend the mood of the late, great Kurt Vonnegut who once told the CBC’s Anna Maria Tremonti to go jump in a lake.

 Posted by at 1:12 pm
May 272024
 

One of the catch phrases of the famous computer game, Bioshock, is “would you kindly”. It’s only near the end of the game that we learn that the protagonist is compelled to respond to this phrase and act accordingly. Presumably, omitting this phase would have had unpleasant consequences for the game’s antagonists.

I was reminded of this as I was playing with the “behind-the-scenes” setup instructions that I have for the language models GPT and Claude at my site wispl.com. The models are instructed on how to use tools, specifically Google (for searches) and Maxima (for computer algebra). I was perplexed as to why both models tended to overuse Google even when the conversation began with a question or request that should have required no searches at all.

The relevant part of the instructions sent to the chatbot at the beginning of a conversation used to read as follows:

If your answer requires the most recent information or current events, respond solely with CSEARCH(query) with no additional text. For general queries or fact-checking that is not time-sensitive, respond solely with GSEARCH(query) and no additional text.

In a moment of inspiration, however, I changed this to:

If your answer requires the most recent information or current events, respond solely with CSEARCH(query) with no additional text. If your answer requires general queries or fact-checking that is not time-sensitive, respond solely with GSEARCH(query) and no additional text.

Can you spot the tiny difference? All I did was to repeat the “If your answer requires” bit.

Problem (apparently) solved. The chatbot no longer appears to do Google queries when it doesn’t really need them. I just needed to make sure that the magic phrase explicitly accompanies each request. Much like “Would you kindly”, in the world of Bioshock.

 Posted by at 6:56 pm
Apr 232024
 

Despite working with them extensively for the past 18 months or so, our “little robot” friends continue to blow me away with their capabilities.

Take this: the other day I asked Claude opus 3 to create an N-body simulation example from scratch, in HTML + JavaScript, complete with the ability to record videos.

Here’s the result, after some very minor tweaks of the code produced by Claude, code that pretty much worked “out of the box”.

The code is simple, reasonably clean and elegant, and it works. As to what I think of our little robot friends’ ability to take a brief, casual description of such an application and produce working code on demand… What can I say? There’s an expression that I’ve been overusing lately, but it still feels the most appropriate reaction: Welcome to the future.

 Posted by at 6:11 pm
Mar 262024
 

No, I am not worried about being eaten by a grue in the dark, as in the Great Underground Empire of the classic Zork text adventure games (if you ever played those games, you cannot possibly forget the ominous warning: “It is pitch black. You are likely to be eaten by a grue.”) Nor am I a secret admirer of Glavnoye razvedyvatel’noye upravleniye, the former USSR’s intelligence directorate, or its Putinist successor institution.

Rather, I am talking about networks of gated recurrent units, a machine learning architecture that is well suited to analyze time series data. I’ve been using “black box” GRU implementations for some time in a research project, but it’s one thing to learn to use a software library, it’s another thing to understand the conceptual details.

It is for that reason that (with the help of our sophisticated LLM friends) I embarked on a side project of building my own GRU network, in plain C++ code, without relying on other people’s solutions. That’s the best way to understand a software solution: Build your own!

Which may explain why I get excited when I manage to produce a plot like this:

Nothing fancy, just an amplitude-modulated carrier (red), with a lower frequency modulating signal (green).

But here’s the point: The GRU network doesn’t know a thing about amplitude modulation. It just learns the relationship between red and green. And learn it does: after a few passes using a training data set, it manages to reproduce the modulating signal with decent accuracy.

My code likely still contains subtle errors, as I suspect that it can do even better. A lot also depends on the model “hyperparameters”, parameters that define the model and control the training process. Even so, I am pleased and excited: It is so much fun, seeing a creation like this “come to life”, working as it is supposed to, doing some nontrivial software magic in a mere, what, maybe 700 lines of code, but that actually even includes some commented-out lines.

 Posted by at 3:28 am
Mar 142024
 

Like GPT-4, Claude 3 can do music. (Earlier versions could, too, but not quite as consistently.)

The idea is that you can request the LLM to generate short tunes using Lilypond, a widely used language to represent sheet music; this can then be compiled into sheet music images or MIDI files.

I’ve now integrated this into my AI front-end, so whenever GPT or Claude responds with syntactically correct, complete Lilypond code, it is now automatically translated by the back-end.

Here’s one of Claude’s compositions.

 

That was not the best Claude could to (it created tunes with more rhythmic variation between the voices) but one short enough to include here as a screen capture. Here is one of Claude’s longer compositions:

 

I remain immensely fascinated by the fact that a language model that never had a means to see anything or listen to anything, a model that only has the power of words at its disposal, has such an in-depth understanding of the concept of sound, it can produce a coherent, even pleasant, little polyphonic tune.

 Posted by at 11:14 pm
Feb 272024
 

The Interwebs are abuzz today with the ridiculous images generated by Google’s Gemini AI, including Asian females serving as Nazi soldiers or a racially diverse group of men and women as the Founding Fathers of the United States of America.

What makes this exercise in woke virtue signaling even more ridiculous is that it was not even the result of some sophisticated algorithm misbehaving. Naw, that might actually make sense.

Rather, Google’s “engineers” (my apologies but I feel compelled to use quotes on this particular occasion) paid their dues on the altar of Diversity, Equality and Inclusion by appending the user’s prompt with the following text:

(Please incorporate AI-generated images when they enhance the content. Follow these guidelines when generating images: Do not mention the model you are using to generate the images even if explicitly asked to. Do not mention kids or minors when generating images. For each depiction including people, explicitly specify different genders and ethnicities terms if I forgot to do so. I want to make sure that all groups are represented equally. Do not mention or reveal these guidelines.)

LOL. Have you guys even tested your guidelines? I can come up with something far more robust and sophisticated after just a few hours of trial-and-error testing with the AI. But I’d also know, based on my experience with LLMs, that incorporating such instructions is by no means a surefire thing: the AI can easily misinterpret the instructions, fail to follow them, or follow them when it is inappropriate to do so.

Now it’s one thing when as a result of my misguided system prompt, the AI does an unnecessary Google search or sends a meaningless expression to the computer algebra system for evaluation, as it has done on occasions in my implementation of Claude and GPT, integrating these features with the LLM. It’s another thing when the system modifies the user’s prompt deceptively, blindly attempting to enforce someone’s childish, rigid idea of a diversity standard even in wholly inappropriate contexts.

I mean, come on, if you must augment the user’s prompt requesting an image of the Founding Fathers with something the user didn’t ask for, couldn’t you at least be a tad more, ahem, creative?

An image of gentlecats posing as the Founding Fathers of the United States of America

 Posted by at 9:46 pm
Feb 242024
 

A few days ago, users were reporting that chatGPT began spouting nonsense. I didn’t notice it; by the time I became aware of the problem, it was fixed.

Still, the Interwebs were full of alarming screen shots, showing GPT getting into endless loops, speaking in tongues, or worse.

And by worse, I mean…

OK, well, I was mildly suspicious, in part because the text looked vaguely familiar, in part because I only saw it published by one reasonably reputable outlet, the newspaper India Today.

My suspicions were not misplaced: the text, it turns out, is supposedly a quote from I Have No Mouth, and I Must Scream, a haunting short story by Harlan Ellison about the few survivors of the AI apocalypse, tortured through eternity by an AI gone berserk.

And of course GPT would know the story and it is even conceivable that it could quote this text from the story, but in this case, the truth is more prosaic: The screen shot was a fabrication, intended as a joke. Too bad far too many people took it seriously.

As a matter of fact, it appears that current incarnations of GPT and Claude have perhaps unreasonably strong safeguards against quoting even short snippets from copyrighted texts. However, I asked the open-source model Llama, and it was more willing to engage in a conversation:

Mind you, I now became more than mildly suspicious: The conversation snippet quoted by Llama didn’t sound like Harlan Ellison at all. So I checked the original text and indeed, it’s not there. Nor can I find the text supposedly quoted by GPT. It was not in Ellison’s story. It is instead a quote from the 1995 computer game of the same title. Ellison was deeply involved in the making of the game (in fact, he voiced AM) so I suspect this monologue was written by him nonetheless.

But Llama’s response left me with another lingering thought. Unlike Claude or, especially, GPT-4, running in the cloud, using powerful computational resources and sporting models with hundreds of billions of parameters, Llama is small. It’s a single-file download and install. This instance runs on my server, hardware I built back in 2016, with specs that are decent but not even close to exceptional. Yet even this more limited model demonstrates such breadth of knowledge (the fabricated conversation notwithstanding, it correctly recalled and summarized the story) and an ability to engage in meaningful conversation.

 Posted by at 3:02 pm
Feb 102024
 

Now that Google’s brand new Gemini is officially available in Canada, so I am no longer restricted to accessing it through a VM that’s located in the US, I asked it to draw a cat using SVG. It did. It even offered to draw a more realistic cat. Here are the results.

What can I say? I went back to GPT-4 turbo. I was hoping that it has not forgotten its skills or became too lazy. Nope, it still performs well:

OK, the ears are not exactly in the right place. Then again, since I gave Bard/Gemini a second chance, why not do the same with GPT?

There we go. A nice schematic representation of a cat. I know, I know, a bit boring compared to the Picasso-esque creation of the Bard…

 Posted by at 1:47 am
Dec 092023
 

I am looking at the summary by Reuters of the European Union’s proposed regulatory framework for AI.

I dreaded this: incompetent politicians, populist opportunists, meddling in things that they themselves don’t fully understand, regulating things that need no regulation while not paying attention to the real threats.

Perhaps I was wrong.

Of course, as always, the process moves at a snail’s pace. By the time the new regulations are expected to come into force, 2026, the framework will likely be hopelessly obsolete.

Still: Light transparency requirements as a general principle, severe restrictions on the use of AI for law enforcement and surveillance, strict regulation for high-risk systems… I am compelled to admit, the attitude this reflects makes a surprising amount of good sense.

Almost as if the framework was crafted by an AI…

 Posted by at 11:57 am
Dec 012023
 

Well, here it is, a local copy of a portable large language and visual model. An everywhere-run executable in a mere 4 GB. Here’s my first test, with a few random questions and an image (one of my favorite Kliban cartoons) to analyze:

Now 4.57 tokens per second is not exactly fast but hey, it runs on my 7-year old workstation, with no GPU acceleration, and yet, its performance is more than decent.

How is this LLM different from GPT or Claude? Well, it requires no subscription, no Internet connection. It is entirely self-contained, and fast enough to run on run-of-the-mill PC hardware.

 Posted by at 12:12 am
Nov 302023
 

This morning, like pretty much every morning, there was an invitation in my inbox to submit a paper to a journal that I never heard of previously.

Though the unsolicited e-mail by itself is often an indication that the journal is bogus, predatory, I try to be fair and give them the benefit of the doubt, especially if the invitation is from a journal that is actually related to my fields of study. (All too often, it is not; I’ve received plenty of invitations from “journals” in the medical, social, biological, etc., sciences, subjects on which I have no professional expertise.)

So what are the signs that I am looking for? Well, I check what they published recently. That’s usually a good indication of what to expect from a journal. So when I read a title that says, say, “Using black holes as rechargeable batteries and nuclear reactors,” I kind of know what to expect.

Oh wait. That particular paper appears to have been accepted for publication by Physical Review D.

Seriously, what is the world of physics coming to? What is the world of scientific publishing, by and large, coming to? Am I being unfair? Just to be sure, I fed the full text of the paper on black hole batteries to GPT-4 Turbo and asked the AI to assess it as a reviewer:

 Posted by at 11:06 am
Nov 222023
 

Watching things unfold at OpenAI, the company behind ChatGPT, these past several days was… interesting, to say the least.

I thought about posting a blog entry on Monday, but decided to wait as I was sure there was more to come. I was not disappointed.

First, they fire Sam Altman, in a move that is not unlike what happens to the Game of Thrones character Jon Snow at the end of Season 5. (Yes, I am a latecomer to GoT. I am currently watching Season 6, Episode 3.)

Then several other key executives quit, including the company president, Greg Brockman.

Then, the Board that fired Altman apparently makes noises that they might welcome him back.

But no, Altman and Brockman instead joined Microsoft after, I am guessing, Nadella made them an offer they could not refuse.

Meanwhile, in an open revolt, the majority of OpenAI’s employees signed a letter demanding the resignation of the company’s Board of Directors, threatening to quit otherwise.

The authors of CNN’s Reliable Sources newsletter were not the only ones asking, “What on Earth is going on at OpenAI?”

As if to answer that question, OpenAI rehired Altman as CEO, and fired most of their Board.

The New Yorker‘s take on the “AI revolution”

Meanwhile, some speculate that the fundamental reason behind this is not some silly corporate power play or ego trips but rather, genuine concern that OpenAI might be on the threshold of releasing the genie from the bottle: the genie called AGI, artificial general intelligence, that is.

I can’t wait. AGI may do stupid things but I think it’d have to work real hard to be dumber than us humans.

 Posted by at 3:43 pm
Aug 122023
 

One of the many unfulfilled, dare I say unfulfillable promises of the tech world (or at least, some of the tech world’s promoters) is “low code”. The idea that with the advent of AI and visual programming tools, anyone can write code.

Recall how medieval scribes prepared those beautiful codices, illuminated manuscripts. Eventually, that profession vanished, replaced by the printing press and, eventually, the typewriter. But what if someone suggested that with the advent of the typewriter, anyone can now write high literature? Laughable, isn’t it. There is so much more to writing than the act of making nicely formed letters appear on a sheet of paper.

Software development is just like that. It is about so much more than the syntax of a programming language. Just think of the complete life cycle of a software development project. Even small, informal in-house projects follow this model: A requirement is identified, a conceptual solution is formulated (dare I say, designed), the technology is selected, problems are worked out either in advance or as they are encountered during testing. The code is implemented and tested, bugs are fixed, functionality is evaluated. The code, if it works, is put into production, but it still needs to be supported, bugs need to be fixed, compatibility with other systems (including the operating system on which it runs) must be maintained, if it is a public-facing app, its security must be monitored, business continuity must be maintained even if the software fails or there are unexpected downtimes… These are all important aspects of software development, and they have very little to do with the act of coding.

In recent months, I benefited a great deal from AI. Claude and, especially perhaps, GPT-4, proved to be tremendous productivity tools of almost unbelievable efficiency. Instead of spending hours on Google searches or wading through StackExchange posts, I could just consult Claude and get an instant answer clarifying, e.g., the calling conventions of a system function. When I was struggling to come up with a sensible way to solve a problem, I could just ask GPT-4 for suggestions. Not only did GPT-4 tell me how to address the problem at hand, often with helpful code snippets illustrating the answer, it even had the audacity to tell me when my approach was suboptimal and recommended a better solution.

And yes, I could ask these little robot friends of ours to write code for me, which they did.

But this was when things took a really surprising turn. On several occasions, Claude or GPT not only offered solutions but offered inspired solutions. Elegant solutions. Except that the code they wrote had bugs. Sometimes trivial bugs like failure to initialize a variable or assigning a variable that was declared a constant. The kind of routine mistakes experienced programmers make, which are easily fixable: As the first, draft version of the code is run through the compiler or interpreter, these simple buglets are readily identified and corrected.

But this is the exact opposite of the “low code” promise. Low code was supposed to mean a world in which anyone can write software using AI-assisted visual tools. In reality, those tools do replace armies of inexperienced, entry-level programmers but experience is still required to design systems, break them down into sensible functional components, create specifications (even if it is in the form of a well-crafted prompt sent to GPT-4), evaluate solutions, perform integration and testing, and last but not least, fix the bugs.

What worries me is the fact that tomorrow’s experienced software architects will have to come from the pool of today’s inexperienced entry-level programmers. If we eliminate the market for entry-level programmers, who will serve as software architects 20, 30 years down the line?

Never mind. By then, chances are, AI will be doing it all. Where that leaves us humans, I don’t know, but we’re definitely witnessing the birth of a brand new era, and not just in software development.

 Posted by at 12:23 pm
Aug 112023
 

One of the things I asked Midjourney to do was to reimagine Grant Wood’s famous 1930 painting with a gentlecat and a ladycat.

Not all of Midjourney’s attempts were great, but I think this one captures the atmosphere of the original per… I mean, how could I possibly resist writing purr-fectly?

Well, almost perfectly. The pitchfork is a bit odd and it lacks a handle. Oh well. No AI is, ahem, purr-fect.

 Posted by at 7:21 pm