So I had a surprisingly smooth experience with a chat agent, most likely an AI agent, though probably with some human-in-the-loop supervision. This had to do with canceling/downgrading my Sirius XM subscription, now that we no longer have a vehicle with satellite radio.
And it got me thinking. Beyond the hype, what does it take to build a reliable AI customer experience (CX) agent?
And that’s when it hit me: I already did it. Granted, not an agent per se, just the way I set up GPT-5 to play chess.
The secret? State machines.
I did not ask GPT to keep track of the board. I did not ask GPT to update the board either. I told GPT the state of the board and asked GPT to make a move.
The board state was tracked not by GPT but by conventional, deterministic code. The board is a state machine. Its state transitions are governed by the rules of chess. There is no ambiguity. The board’s state (including castling and en passant) is encoded in a FEN string unambiguously. When GPT offers a move, its validity is determined by a simple question: does it represent a valid state transition for the chessboard?
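To make this concrete, here is a minimal sketch of the kind of deterministic scaffolding I mean, using the python-chess library. The function and variable names are illustrative, not the actual code behind my chess setup.

```python
import chess

def apply_model_move(fen: str, proposed_san: str) -> str:
    """Validate a move proposed by the language model against the board's
    state machine, and return the new FEN if the transition is legal."""
    board = chess.Board(fen)  # the authoritative state, never held by the LLM
    try:
        move = board.parse_san(proposed_san)  # e.g. "Nf3" or "O-O"
    except ValueError:
        raise ValueError(f"Not a legal move in this position: {proposed_san}")
    board.push(move)   # the only way the state ever changes
    return board.fen() # hand the new state back to the prompt scaffolding

# Usage: the LLM sees the FEN in its prompt and replies with a move;
# the scaffolding, not the model, decides whether the transition happens.
new_fen = apply_model_move(chess.STARTING_FEN, "e4")
```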
And this is how a good AI CX agent works. It does not unilaterally determine the state of the customer’s account. It offers state changes, which are then evaluated by the rigid logic of a state machine.

Diagram created by ChatGPT to illustrate a CX state machine
Take my case with Sirius XM. Current state: Customer with a radio and Internet subscription. Customer indicates intent to cancel radio. Permissible state changes: Customer cancels; customer downgrades to Internet-only subscription. This is where the LLM comes in: with proper scaffolding and a system prompt, it interrogates the customer. Do you have any favorite Sirius XM stations? Awesome. Are you planning on purchasing another XM radio (or a vehicle equipped with one)? No, fair enough. Would you like a trial subscription to keep listening via the Internet-only service? Great. State change initiated… And that’s when, for instance, a human supervisor comes in, to approve the request after glancing at the chat transcript.
The important thing is, the language model does not decide what the next state is. It has no direct authority over the state of the customer’s account. What it can do, the only thing it can do at this point, is initiate a valid state transition.
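Sketched in code, the same pattern might look something like this. The states and transition names below are hypothetical, chosen just to mirror the Sirius XM example; the point is only that the model’s output is checked against an explicit transition table before anything touches the account.

```python
from enum import Enum, auto

class AccountState(Enum):
    RADIO_AND_INTERNET = auto()
    INTERNET_ONLY = auto()
    CANCELLED = auto()

# The rigid logic: which transitions are permissible from each state.
VALID_TRANSITIONS = {
    AccountState.RADIO_AND_INTERNET: {
        "downgrade_to_internet": AccountState.INTERNET_ONLY,
        "cancel": AccountState.CANCELLED,
    },
    AccountState.INTERNET_ONLY: {
        "cancel": AccountState.CANCELLED,
    },
    AccountState.CANCELLED: {},
}

def propose_transition(current: AccountState, action: str) -> AccountState:
    """The LLM proposes an action; deterministic code decides whether it is
    a valid state transition. The model never writes the state directly."""
    allowed = VALID_TRANSITIONS[current]
    if action not in allowed:
        raise ValueError(f"'{action}' is not a permissible change from {current.name}")
    return allowed[action]  # the new state, pending e.g. human approval

# Usage: the chat model, after interrogating the customer, emits an action name.
new_state = propose_transition(AccountState.RADIO_AND_INTERNET, "downgrade_to_internet")
```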
The hard part when it comes to designing such a CX solution is mapping the states adequately, and making sure that the AI has the right instructions. Here is what is NOT needed:
- There is no need for a “reasoning” model;
- There is no need for “agentic” behavior;
- There is no need for “council-of-experts”, “chain-of-thought” reasoning, “self-critique”, or any of the other hyped inventions.
In fact, a modest locally run model like Gemma-12B would be quite capable of performing the chat function. So there’s no need even to worry about leaking confidential customer information to the cloud.
Bottom line: use language models for what they do best, associative reasoning. Do not try to use a system with no internal state and no modeling capability as a reasoning engine. It is, if I may offer a crude but (I hope) not stupid analogy, like building a world-class submarine and then, realizing that it is not capable of flying, nailing some makeshift wooden wings onto its body.
I am almost tempted to create a mock CX Web site to demonstrate all this in practice. Then again, I realize that my chess implementation already does much the same: the AI agent supplies a narrative and a proposed state transition, but the state (the chessboard) is maintained, and its consistency ensured, by conventional software scaffolding.