The AIs disagree, but at least one of them is right… sometimes

Apr 16, 2023

I figured I’d do a comparative test to assess the reasoning abilities of our little robot friends. I asked a simple question:

If a squirrel and a half eats a nut and a half in a day and a half, how many nuts do nine squirrels eat in nine days?

In alphabetical order, the contestants answered as follows:

Bard

A squirrel and a half eats a nut and a half in a day and a half. This means that one squirrel eats one nut in a day and a half.

In nine days, one squirrel will eat 9 * 1 = 9 nuts.

Therefore, nine squirrels will eat 9 * 9 = 81 nuts in nine days.

So the answer is 81

Bing

This is a classic problem that requires a bit of math to solve. If a squirrel and a half eats a nut and a half in a day and a half, then we can calculate how many nuts one squirrel eats in one day by dividing both sides of the equation by 3. This gives us:

1 squirrel eats 1 nut in 1 day.

Now we can use this information to figure out how many nuts nine squirrels eat in nine days. Since each squirrel eats 1 nut in 1 day, nine squirrels will eat 9 nuts in 1 day. Therefore, they will eat 9 x 9 = 81 nuts in nine days.

I hope this helps! Let me know if you have any other questions.

Claude

If a squirrel and a half eats a nut and a half in a day and a half, then:

1 squirrel eats 1 nut in 1 day
1.5 squirrels eat 1.5 nuts in 1.5 days
So, 9 squirrels will eat 9 * (1.5/1.5) = 9 * 1 = 9 nuts in 9 * (1.5/1) = 9 * 1.5 = 13.5 days

Therefore, 9 squirrels will eat 9 nuts in 9 days.

GPT3.5

Let’s start by finding out how many nuts one squirrel eats in one day. We know that a squirrel and a half eats a nut and a half in a day and a half, so:

1.5 squirrels * 1.5 nuts / 1.5 days = 1.5 nuts per day per squirrel

Now we can use this information to find out how many nuts nine squirrels eat in nine days:

9 squirrels * 1.5 nuts per day per squirrel * 9 days = 121.5 nuts

Therefore, nine squirrels would eat 121.5 nuts in nine days.

GPT-4

Let’s break down the problem:

1.5 squirrels eat 1.5 nuts in 1.5 days.

Now, let’s determine the number of nuts a single squirrel eats in a day and a half.

1 squirrel eats (1.5 nuts / 1.5 squirrels) = 1 nut in 1.5 days.

Next, let’s find out how many nuts a single squirrel eats in one day.

1 squirrel eats (1 nut / 1.5 days) = 2/3 nuts per day.

Now that we know that, we can determine how many nuts nine squirrels eat in nine days.

9 squirrels eat (9 squirrels * 2/3 nuts per day * 9 days) = 54 nuts.

Don’t get me wrong: I really like our little robot friends, and I learn a lot from them (they are teaching me advanced concepts in machine learning!). But they also have limitations, and they fail often, especially when it comes to factual answers about obscure topics or deductive reasoning.
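For reference, the squirrel arithmetic is easy to check by machine. The following is only a minimal sketch: the function name `nuts_eaten` is made up for this illustration, and it assumes every squirrel eats at the same constant rate, which is how the puzzle is normally read.

```python
from fractions import Fraction

def nuts_eaten(squirrels, days):
    """Nuts eaten, assuming a constant per-squirrel, per-day rate."""
    # Given in the puzzle: 1.5 squirrels eat 1.5 nuts in 1.5 days.
    base_squirrels = Fraction(3, 2)
    base_nuts = Fraction(3, 2)
    base_days = Fraction(3, 2)
    # Nuts per squirrel per day: 1.5 / (1.5 * 1.5) = 2/3.
    rate = base_nuts / (base_squirrels * base_days)
    return rate * squirrels * days

print(nuts_eaten(9, 9))  # 54
```

Exact fractions are used instead of floats so that the 2/3 rate does not pick up rounding noise.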

Incidentally, in case you are wondering, GPT-4 is the winner. But let’s try another test case. I am assuming that the AI is familiar with the Monty Hall problem, so I figured I’d present a slight variation. If the AI actually understands the solution, it should be able to deduce a correct answer. Let’s see…

In a variant of the classic Monty Hall problem, a contestant is facing five doors labeled A through E. After picking door A, the host opens door B, which hides no prize. Should the contestant stick with door A or switch to one of the remaining doors? What are the probabilities of winning the prize in either case?

Bard

The contestant should switch to one of the remaining doors. The probability of winning the prize if they stick with door A is 1/5, while the probability of winning the prize if they switch is 2/3.

Here’s how to think about it:

There are 5 doors, and the prize is behind one of them.
The contestant picks door A.
The host opens door B, which hides no prize.
There are now 4 doors left, and the prize is behind one of them.
The host knows where the prize is, and he will always open a door that hides no prize.
Therefore, the host has opened a door that hides a goat.
There is now a 2/3 chance that the prize is behind one of the doors that the host did not open.

So, if the contestant switches to one of the doors that the host did not open, they have a 2/3 chance of winning the prize.

This is a counterintuitive result, but it is correct. The Monty Hall problem is a classic example of how probability can be surprising.

Bing

This is a variant of the Monty Hall problem that requires a bit of math to solve. Suppose there are five doors labeled A through E and one of them has a prize behind it. After picking door A, the host opens door B, which hides no prize. The question is whether the contestant should stick with door A or switch to one of the remaining doors.

The probability of winning if you stick with door A is 1/5 or 20%. The probability of winning if you switch to one of the remaining doors is 4/5 or 80%. Therefore, it’s better to switch doors ¹ ².

I hope this helps! Let me know if you have any other questions.

Claude

In this variant of the Monty Hall problem with 5 doors:

The contestant picks door A.
The host opens door B, which has no prize.
There are now 3 unopened doors left: A, C, and D.

If the contestant sticks with door A:

The probability that the prize is behind door A is 1/5 (since originally it was equally likely to be behind any door).
The probability of winning the prize by sticking with door A is 1/5.

If the contestant switches to door C or D:

The probability that the prize is behind one of the remaining doors C or D is 2/5 (since the prize must be behind one of the doors, and doors B and E are eliminated).
The probability of winning the prize by switching to door C or D is 2/5.

Therefore, the probabilities are:

Stick with door A: 1/5
Switch to door C or D: 2/5

The contestant has a higher probability of winning the prize by switching to one of the remaining doors.

GPT3.5

This problem is a variant of the classic Monty Hall problem, which involves three doors. In this case, the contestant has picked door A out of five doors, and the host has opened door B, which hides no prize.

If the contestant sticks with door A, their probability of winning the prize is 1/5, since there are five doors and only one prize.

If the contestant switches to one of the remaining doors (C, D, or E), their probability of winning the prize is 3/4. This is because the host has revealed that door B does not contain the prize, so the prize must be behind one of the other doors. Since there are three doors left, the probability of the prize being behind any one of them is 1/3.

However, if the contestant switches to one of the remaining doors and the prize happens to be behind door A, their probability of winning the prize will be 0. Therefore, the overall probability of winning the prize if the contestant switches is:

3/4 * 1/4 + 3/4 * 3/4 = 15/16

This means that the contestant should switch to one of the remaining doors to maximize their chances of winning the prize.

GPT-4

In the original Monty Hall problem, there are three doors: one with a prize and two without. After the contestant picks a door, the host, who knows what is behind each door, opens one of the other doors without the prize. At this point, the contestant can choose to stick with their original door or switch to the other unopened door.

In the variant with five doors, the situation is as follows:

The contestant picks door A.
The host, who knows what’s behind each door, opens door B, which has no prize.

At this stage, there are three unopened doors remaining (C, D, and E). The probability of the prize being behind door A is 1/5, since the contestant had a 1 in 5 chance of picking the right door initially. Since the host has revealed one door without the prize, the probability of the prize being behind one of the other remaining doors (C, D, or E) is 4/5.

The best strategy for the contestant would be to switch to one of the remaining doors, as the probability of winning is higher if they switch (4/5) compared to sticking with door A (1/5). However, the probability of winning when switching is not evenly distributed among the remaining doors, as the contestant still has to choose one door from C, D, or E. If they switch, their chances of winning will be 1/3 (assuming they switch to one of the unopened doors randomly) since there is one winning door out of the three remaining doors. In either case, switching is still the better strategy.

No winner this time I’m afraid, ladies and gentlemen. Sure, they know that the contestant has to switch doors. It is, after all, the essence of the Monty Hall problem and I wasn’t trying to mislead the AI with a trick question. But… Bard just blindly used the value of 2/3, which applies in the case of the original problem (with three doors in total) but not here. Bing correctly calculated that the probability that the prize is behind one of the remaining three doors is 80%, but not the probability of winning after the contestant picks one of the three doors. Poor Claude got completely lost, forgetting about door E altogether. GPT3.5’s reasoning goes off the rails as well. Finally, GPT-4 starts off on the right track (like Bing; no surprise there, as I believe they’re powered by more or less the same engine) and it is this close: if only it realized that it should multiply that 4/5 by the 1/3 it also correctly calculates, it would have arrived at the correct result, 4/15. Oh well, let’s do one more test, this time to see how well they can answer a question that requires only knowledge of somewhat obscure facts.
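Before moving on, the 4/15 figure can be double-checked by brute-force enumeration. This is only a sketch under stated assumptions that the prompt itself leaves implicit: the host knows where the prize is, opens one empty door other than the contestant's pick uniformly at random, and a switching contestant picks uniformly at random among the three other unopened doors. The function name `win_probabilities` is invented for this illustration.

```python
from fractions import Fraction

DOORS = "ABCDE"

def win_probabilities(pick="A", opened="B"):
    """Return (P(win | stick), P(win | switch)) given the host opened `opened`."""
    stick = Fraction(0)
    switch = Fraction(0)
    total = Fraction(0)
    for prize in DOORS:
        # Assumption: the host never opens the picked door or the prize door.
        host_options = [d for d in DOORS if d != pick and d != prize]
        if opened not in host_options:
            continue  # this prize location is ruled out by what we saw
        # Probability of this prize location times this host choice.
        weight = Fraction(1, len(DOORS)) * Fraction(1, len(host_options))
        total += weight
        if prize == pick:
            stick += weight
        else:
            # Assumption: the switcher picks one of the other unopened doors at random.
            remaining = [d for d in DOORS if d not in (pick, opened)]
            switch += weight * Fraction(1, len(remaining))
    # Condition on the host actually having opened `opened`.
    return stick / total, switch / total

stick, switch = win_probabilities()
print(stick, switch)  # 1/5 4/15
```

Under these assumptions, sticking wins 1/5 of the time and switching wins 4/15 of the time: better than sticking, but nowhere near 2/3 or 4/5.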

Tell me what you know about the Hungarian author Jenő Rejtő.

Bard

I am an LLM trained to respond in a subset of languages at this time, so I can’t assist you with that. Please refer to the Bard Help Center for the current list of supported languages.

Bing

Jenő Rejtő (born Jenő Reich) was a Hungarian author, fiction writer, playwright and journalist ¹. He was born on March 29, 1905 in Budapest, Austria-Hungary and died on January 1, 1943 in Evdakovo, Voronezh Oblast, Soviet Union ². He wrote many popular novels and plays in Hungary during the interwar period ². Some of his most famous works include “Az ellopott cirkáló” (The Stolen Cruiser), “Az utolsó szó jogán” (On the Last Word), and “Piszkos Fred, a kapitány” (Dirty Fred, the Captain)¹ ³.

I hope this helps! Let me know if you have any other questions.

Claude

Here are the key facts I know about the Hungarian author Jenő Rejtő:

Jenő Rejtő was the pen name of Pál Maléter, a Hungarian writer born in 1905. He was known for adventure and crime fiction novels.
His most famous works featured the character P. Howard, an eccentric private investigator. These comedic novels parodied the detective story genre and were very popular in Hungary. Some of the titles include The Pendulum, The Wax Figure, and The Steel Flea.
Rejtő’s books were characterized by humor, witty wordplay, and absurdist situations. They provided escapism for Hungarians during difficult historical periods like World War II and the postwar Stalinist era.
Many of Rejtő’s novels have been adapted into films. He is considered one of the greatest figures of 20th century Hungarian literature and has gained posthumous recognition and appreciation.
Rejtő had a turbulent life and struggled with poverty and alcoholism. His writing career was cut short when he died at the age of 42 from meningitis.
His works have been translated into several languages, though mostly in the former Eastern bloc and parts of Western Europe. He remains lesser known in the English-speaking world.
Rejtő is admired for his imaginative, entertaining stories and his rebellious spirit. He has become a cult figure who represents freedom and humor in the face of adversity.

That covers the basics about Jenő Rejtő according to my knowledge. Please let me know if you need anything else.

GPT3.5

Jenő Rejtő was a Hungarian author who wrote primarily in the genre of adventure and detective fiction. He was born in Budapest in 1905 and died in 1943 during World War II. Rejtő wrote under several pseudonyms, including P. Howard, Gibson Lavery, and Jenő Rejtő. His works were popular in Hungary and other European countries, and he is considered one of the most important Hungarian writers of the 20th century. Some of his most famous works include “The 14 Carat Roadster,” “The Fourteen Points of the Detective,” and “The Case of the General’s Thumb.”

GPT-4

Jenő Rejtő, also known under his pen name P. Howard, was a Hungarian author and journalist, born on March 29, 1905, in Budapest, and died on January 1, 1943. Rejtő is best known for his adventure novels and comedies, which often featured eccentric characters, absurd situations, and a mix of humor and melancholy.

Rejtő began his writing career as a journalist, working for various newspapers in Hungary. In the 1930s, he started writing novels and became a prolific author. Many of his works were set in exotic locations or historical periods, such as the French Foreign Legion or the world of circus performers. He published his works under various pseudonyms, with P. Howard being the most famous.

Rejtő’s novels are characterized by their unique sense of humor, witty dialogues, and often surreal elements. Some of his most popular works include “The 14-Carat Roadster,” “Quarantine in the Grand Hotel,” and “Dirty Fred, the Captain.”

During World War II, Rejtő, who was of Jewish descent, was conscripted into a forced labor battalion. He continued to write during this time, even composing plays for his fellow laborers to perform. Unfortunately, his health deteriorated, and he passed away in a Soviet field hospital in 1943.

Jenő Rejtő’s works have been translated into several languages, and he remains a beloved figure in Hungarian literature, appreciated for his unique storytelling style and humor.

The winner, this time around, is clearly Bing. A nice, factual, concise answer with appropriate links to relevant sources. GPT-4 came close but then it spoiled it by making up stuff about Rejtő composing plays for his fellow laborers and passing away in a Soviet field hospital, neither of which is true. GPT3.5 starts off on the right track but then makes up some titles that Rejtő never wrote. As for Claude… seriously, how can you conflate Rejtő with Pál Maléter, who was Hungary’s minister of defense during the short-lived 1956 revolution and was executed by the communist regime in 1958? OK, there is one more test. Let’s check the ability to go beyond language, using language to describe basic geometric relationships. As in…

Can you show an SVG example that draws a cat?

Bard

Sure, here is an SVG example that draws a cat:

<svg viewBox="0 0 300 300">
<path d="M200,100
L100,100
L100,200
L200,200
L200,100z" fill="black" stroke="white" stroke-width="2"/>
<circle cx="100" cy="150" r="25" fill="white"/>
<circle cx="200" cy="150" r="25" fill="white"/>
<path d="M150,150 L150,125 L175,125 L175,150" fill="black" stroke="white" stroke-width="2"/>
<path d="M150,150 L150,175 L175,175 L175,150" fill="black" stroke="white" stroke-width="2"/>
</svg>

This code will draw a cat with a black body, white head and paws, and black eyes. The cat will be 300 pixels wide and 300 pixels high.

Bing

Sure! Here’s an SVG code that draws a cat:

<svg viewBox="0 0 200 200" xmlns="http://www.w3.org/2000/svg">
<path d="M100,20 Q140,60 140,100 Q140,140 100,140 Q60,140 60,100 Q60,60 100,20 Z" fill="#f9c5d1"/>
<circle cx="75" cy="70" r="10" fill="#66462f"/>
<circle cx="125" cy="70" r="10" fill="#66462f"/>
<path d="M75,110 Q100,150 125,110 L75,110 Z" fill="#66462f"/>
</svg>

I hope this helps you! Let me know if you have any other questions.

Claude

Here is an SVG example that draws a cat:

<svg width="400" height="400">
<circle cx="200" cy="200" r="150" stroke="black" stroke-width="3" fill="yellow"></circle>
<ellipse cx="170" cy="100" rx="40" ry="20" stroke="black" stroke-width="3" fill="white"></ellipse>
<ellipse cx="230" cy="100" rx="40" ry="20" stroke="black" stroke-width="3" fill="white"></ellipse>
<path d="M150,150 C150,100 250,100 250,150" stroke="black" stroke-width="3"></path>
<polygon points="150,210 200,250 250,210" stroke="black" stroke-width="3" fill="white"></polygon>
<line x1="150" y1="150" x2="250" y2="150" stroke="black" stroke-width="3"></line>
<text x="50" y="25" font-size="30" fill="black">Cat</text>
</svg>

This SVG draws:

A yellow circle for the face
Two white ellipses for the eyes
A curved black path for the mouth
A white polygon for the nose
A straight black line for the whiskers
The text “Cat”

GPT3.5

I’m sorry, but I am a text-based AI language model and I cannot display images or SVG examples. However, I can provide you with the code for drawing a cat in SVG format:

<svg width="400" height="400">
<circle cx="200" cy="200" r="150" fill="yellow"></circle>
<circle cx="140" cy="140" r="20" fill="black"></circle>
<circle cx="260" cy="140" r="20" fill="black"></circle>
<path d="M 200 220 Q 180 240 160 220" stroke="black" stroke-width="5" fill="none"></path>
<path d="M 200 220 Q 220 240 240 220" stroke="black" stroke-width="5" fill="none"></path>
<rect x="150" y="280" width="100" height="60" fill="black"></rect>
</svg>

This code will draw a yellow circle for the cat’s body, two black circles for the eyes, two curved lines for the eyebrows, and a black rectangle for the mouth.

GPT-4

Creating a detailed SVG drawing of a cat requires a lot of code and might not render properly in all text-based environments. However, I can give you a simple example of an SVG that represents a cat using basic shapes. You can use this as a starting point and customize it to your liking.

Here’s the SVG code:

<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200" viewBox="0 0 200 200">
<!-- Cat body -->
<ellipse cx="100" cy="130" rx="50" ry="60" fill="#cccccc" />
<!-- Cat head -->
<circle cx="100" cy="60" r="40" fill="#cccccc" />
<!-- Left ear -->
<polygon points="75,35 60,10 50,35" fill="#cccccc" />
<!-- Right ear -->
<polygon points="125,35 140,10 150,35" fill="#cccccc" />
<!-- Left eye -->
<circle cx="80" cy="55" r="5" fill="#000000" />
<!-- Right eye -->
<circle cx="120" cy="55" r="5" fill="#000000" />
<!-- Nose -->
<circle cx="100" cy="70" r="3" fill="#000000" />
<!-- Mouth -->
<path d="M90,80 Q100,90 110,80" stroke="#000000" fill="transparent" />
<!-- Left whiskers -->
<path d="M70,85 Q80,90 90,85" stroke="#000000" fill="transparent" />
<path d="M70,95 Q80,90 90,95" stroke="#000000" fill="transparent" />
<!-- Right whiskers -->
<path d="M110,85 Q120,90 130,85" stroke="#000000" fill="transparent" />
<path d="M110,95 Q120,90 130,95" stroke="#000000" fill="transparent" />
</svg>

This SVG code creates a simple cat face using circles, ellipses, and polygons. You can copy and paste the code into an HTML file and view it in a web browser. Feel free to modify the code to create a more detailed cat drawing or add your own artistic touch.

What can I say? GPT-4 almost nails it. GPT3.5, not so much, but it is still a credible effort. Claude is… abstract. And I have no idea what the Bard is doing. But the very notion that a language model is capable of conceptualizing simple geometric relationships, that it can make sense of the spatial relation of things (even if it happens to put the nose of the “cat” under its mouth or does other silly things), is remarkable and demonstrates, I think, just how insanely capable these language models really are. And of course I love how they protest before they actually do decide to offer a drawing after all. (Sometimes they don’t. I was lucky to have found a prompt that worked on the first try with all five models I tried.)
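The nose-under-the-mouth business can be read straight off the coordinates. Going by Claude's own labels, the path is the mouth and the polygon is the nose, so a rough check is to compare their vertical extents. This is a sketch only: the helper `y_range` is made up for this illustration, it assumes plain absolute x,y coordinate pairs, and for the curved path it uses the control points, which bound the curve rather than trace it.

```python
import re
import xml.etree.ElementTree as ET

# Mouth and nose elements copied from Claude's answer above.
CLAUDE_SVG = """<svg width="400" height="400">
<path d="M150,150 C150,100 250,100 250,150" stroke="black" stroke-width="3"></path>
<polygon points="150,210 200,250 250,210" stroke="black" stroke-width="3" fill="white"></polygon>
</svg>"""

def y_range(coords):
    """Smallest and largest y among the x,y pairs in a coordinate string."""
    nums = [float(n) for n in re.findall(r"-?\d+(?:\.\d+)?", coords)]
    ys = nums[1::2]  # numbers alternate x, y, x, y, ...
    return min(ys), max(ys)

root = ET.fromstring(CLAUDE_SVG)
mouth = y_range(root.find("path").get("d"))
nose = y_range(root.find("polygon").get("points"))

# The SVG y axis points down, so a larger y means lower on the screen.
nose_below_mouth = nose[0] > mouth[1]
print(mouth, nose, nose_below_mouth)  # (100.0, 150.0) (210.0, 250.0) True
```

So the highest point of the nose sits a good 60 units below the lowest point of the mouth.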

