Dec 092022
 

Inspired by examples I saw elsewhere, I figured I’d test ChatGPT some more. This time around, I asked it to spot a programming error, and while its answer was flawed, it was not altogether wrong.

Here is our brief conversation (ChatGPT also adds some nice colorized code formatting that I shall not attempt to reproduce here):

Can you spot the bug in the following C-language code fragment?

char *str = "SOME TEXT";
char *buf = malloc(strlen(str));
strcpy(buf, str);

The code is missing a null terminator character '\0' at the end of the string assigned to buf. The strcpy function does not automatically add a null terminator character when copying the string from str to buf, so buf will not be a properly-formatted string.

To fix the code, you can add a null terminator character to the end of buf after it has been copied from str, like this:

char *str = "SOME TEXT";
char *buf = malloc(strlen(str) + 1); // Add one to include space for the null terminator character
strcpy(buf, str);
buf[strlen(str)] = '\0'; // Add the null terminator character at the end of the string

Alternatively, you can use the strdup function to copy str into buf, which automatically adds a null terminator character to the end of the copied string, like this:

char *str = "SOME TEXT";
char *buf = strdup(str);

Note that the strdup function is not a standard C library function and may not be available on all platforms.

Pretty darn close! The AI correctly noticed that there is a problem with the terminating null character. But it’s not what it says: it’s not that strcpy won’t copy the string terminator. Quite the contrary, the problem is that strcpy does copy that terminating null, but we have not allocated memory for it: strlen returns the length of the string without that terminator.

Interestingly, the AI seems to be aware of this as in the corrected version, it adds room for that extra byte. The corrected code is error-free but redundant: the last line is not required, because strcpy would have already copied the terminating null. This line would be required, on the other hand, if we had used strncpy.

The suggestion to use strdup is a good one, except that contrary to the caveat, strdup is now part of the upcoming ISO C standard.

Still, it is doing better already than many a C programmer I knew. So how about offering it a really tricky related problem?

The following C-language code fragment works:

char *str = "SOME TEXT";
char *buf = malloc(strlen(str));
strcpy(buf, str);

but if I replace the first line with

char *str = "A DIFFERENT TEXT";

the program crashes. Can you explain why?

When you replace the first line of the code with char *str = "A DIFFERENT TEXT";, the length of the string assigned to str increases, but the size of the memory allocated for buf remains the same. This causes a buffer overflow, which can lead to a crash or other undefined behavior.

To fix this issue, you can use the strlen function to calculate the length of the string assigned to str and allocate enough memory for buf to hold the entire string, like this:

char *str = "A DIFFERENT TEXT";
char *buf = malloc(strlen(str) + 1); // Add one to include space for the null terminator character
strcpy(buf, str);

Alternatively, you can use the strdup function to copy str into buf, which automatically allocates enough memory for the entire string, including the null terminator character, like this:

char *str = "A DIFFERENT TEXT";
char *buf = strdup(str);

Note that the strdup function is not a standard C library function and may not be available on all platforms.

Really close! But no, it’s not that the size of the memory allocated for the buffer remains the same. It’s that in both cases, the size of the memory allocated for the buffer failed to include room for the terminating null character. However, in the first case the string was 9 bytes long, or 10 with the terminating null. On most modern platforms, memory is allocated in units of 32-bit words or larger, so at the very least, there’d he 12 bytes allocated, leaving enough room for the terminator. But in the second case, the text is exactly 16 bytes long. Chances are that the call to malloc will allocate exactly 16 bytes, and therefore, copying the terminating null will cause a buffer overflow, overwriting something else that it shouldn’t.

Still, even though its reasoning is flawed, the AI’s proposed fix is spot on: include room for the null terminator!

I am not unimpressed. Yet all these examples tend to remind me of the behavior of the “teacher’s pet” A-student who memorizes a lot, can even apply simple reasoning, but lacks in-depth understanding.

 Posted by at 12:29 pm

  2 Responses to “More ChatGPT adventures”

  1. Almost all the criticism of ChatGTP that I have seen could be applied to many (most?) people as well: It copies solutions without really understanding them; gives confident sounding but wrong answers; not up on current events; only shallowly thinks about the ethics of the answers it is proposing; and so on. These are all very human mistakes. It’s fascinating to me that despite a very different architecture, and wildly different training, it’s making the same kind of mistakes as humans. That’s telling us something of the nature of intelligence, I think.

  2. I quite agree, Lou. I compared it several times to the “teacher’s pet” A-student who memorizes a lot, understands superficial relationships, but often lacks in-depth understanding.

    Having said that, its ability to analyze and understand (!) source code is beyond uncanny. Recognizing my little alternate process lister (an impromptu security tool I developed 20+ years ago in response to a trojan that was hiding itself from ps) and recognizing the midpoint circle algorithm? Had it not been my own code, I’d have struggled to make sense of either program.