{"id":13704,"date":"2025-09-22T03:30:52","date_gmt":"2025-09-22T07:30:52","guid":{"rendered":"https:\/\/spinor.info\/weblog\/?p=13704"},"modified":"2025-09-22T03:30:52","modified_gmt":"2025-09-22T07:30:52","slug":"rhetorical-or-tactical","status":"publish","type":"post","link":"https:\/\/spinor.info\/weblog\/?p=13704","title":{"rendered":"Rhetorical or tactical?"},"content":{"rendered":"<p>I again played a little with my code that implements a functional user interface to <a href=\"https:\/\/www.vttoth.com\/CMS\/github\/playing-chess-with-gpt\">play chess with language models<\/a>.<\/p>\n<p>This time around, I tried to play chess with GPT-5. The model played reasonably, roughly at my level as an amateur: it knows the rules, but its reasoning is superficial and it loses a game even against a weak machine opponent (GNU Chess at its lowest level).<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-13705\" src=\"https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/09\/chess.png\" alt=\"\" width=\"670\" height=\"711\" srcset=\"https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/09\/chess.png 670w, https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/09\/chess-283x300.png 283w, https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/09\/chess-141x150.png 141w, https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2025\/09\/chess-24x24.png 24w\" sizes=\"(max-width: 670px) 100vw, 670px\" \/><\/p>\n<p>Tellingly, it is strong in the opening moves, when it can rely on its vast knowledge of the chess literature. It then becomes weak mid-game.<\/p>\n<p>In my implementation, the model is asked to reason and then move. It comments as it reasons. When I showed the result to another instance of GPT-5, it made an important observation: language models have <em>rhetorical competence<\/em>, but little <em>tactical competence<\/em>.<\/p>\n<p>This, actually, is a rather damning statement. 
It implies that efforts to turn language models into autonomous &#8220;reasoning agents&#8221; are likely misguided.<\/p>\n<p>This should come as no surprise. Language models learn, well, they learn language. They have broad knowledge and can be extremely useful assistants at a wide variety of tasks, from business writing to code generation. But their knowledge is not grounded in experience. Just as they cannot track the state of a chess board, they cannot analyze the consequences of a chain of decisions. The models produce plausible narratives, but they are often hollow shells: there is no real understanding of the consequences of decisions.<\/p>\n<p>This is well in line with recent accounts of LLMs failing at complex coordination or problem-solving tasks. The same LLM that writes a flawless subroutine under the expert guidance of a seasoned software engineer often produces subpar results in a &#8220;vibe coding&#8221; exercise when asked to deliver a turnkey solution.<\/p>\n<p>My little exercise using chess offers a perfect microcosm. The top-of-the-line LLM, GPT-5, knows the rules of chess, &#8220;understands&#8221; chess. Its moves are legal. But it lacks the ability to analyze the outcome of its planned moves to any meaningful depth: thus, it pointlessly sacrifices its queen, loses pieces in reckless moves, and ultimately loses the game even against a lowest-level machine opponent. The model&#8217;s rhetorical strength is exemplary; its tactical abilities are effectively non-existent.<\/p>\n<p>This reflects a simple fact: LLMs are designed to produce continuations of text. They are not designed to perform in-depth analysis of decisions and consequences.<\/p>\n<p>The inevitable conclusion is that attempts to use LLMs as high-level agents, orchestrators of complex behavior without external grounding, are bound to fail. 
Treating language models as autonomous agents is a mistake: they should serve as components of autonomous systems, but the autonomy itself must come from something other than a language model.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I again played a little with my code that implements a functional user interface to play chess with language models. This time around, I tried to play chess with GPT-5. The model played reasonably, roughly at my level as an amateur: it knows the rules, but its reasoning is superficial and it loses a game <a href='https:\/\/spinor.info\/weblog\/?p=13704' class='excerpt-more'>[&#8230;]<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[58,35],"tags":[],"class_list":["post-13704","post","type-post","status-publish","format-standard","hentry","category-cybernetics","category-personal","category-58-id","category-35-id","post-seq-1","post-parity-odd","meta-position-corners","fix"],"_links":{"self":[{"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/posts\/13704","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=13704"}],"version-history":[{"count":2,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/posts\/13704\/revisions"}],"predecessor-version":[{"id":13707,"hr
ef":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/posts\/13704\/revisions\/13707"}],"wp:attachment":[{"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=13704"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=13704"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=13704"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}