LLMs aren't fully deterministic. That's what makes them so useful. You can ask them to do wild, open-ended things and they'll find a way to get there, improvising as they go. A traditional piece of software can't do that. It follows binary logic: yes or no, pass or fail.
But that non-determinism cuts both ways. You can't give an LLM a set of rules and expect it to follow them 100% of the time. That's not how the technology works. At the end of the day, it's a maths formula with a bit of wobble in it. It tries its best, but it will drift.
Soft rules are all the LLM understands
What you can give an LLM is context, skills, and a clear description of what it's meant to be doing. Claude doesn't try to fool you. It tries to do what's been asked of it. But it can only ever be as good as what you give it. So the quality of your instructions, your prompts, your system configuration: that's what determines the quality of the output.
But you can't hang your business on soft rules alone. If "never delete customer records" is a soft instruction to the LLM, it will follow it almost all the time. Almost. And "almost" isn't good enough when you're talking about business-critical data.
Hard rules live in the connectors
So where do hard rules go? In the connectors: the way Cowork reaches out to your systems, talks to your data, interacts with the rest of your business. That's where you put your guardrails. Not in the prompt. In the infrastructure.
One approach: give Cowork fewer permissions than you have. Think of it like onboarding a junior employee. They can see what they need to see. They can do what they need to do. But they can't delete records, approve payments, or access systems above their level. If Cowork needs something done that requires more authority, you go to the UI and do it yourself.
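The junior-employee model can be sketched as a plain permission check. This is illustrative only: the permission names and the `authorize` helper are assumptions, not anything Cowork actually exposes. The point is that the agent's account simply never receives the dangerous grants.

```python
# Hypothetical sketch: the agent account gets a narrower permission set
# than the human who configured it. Permission names are illustrative.
AGENT_PERMISSIONS = {"records:read", "records:create", "reports:read"}

def authorize(granted: set[str], action: str) -> bool:
    """Allow an action only if it was explicitly granted."""
    return action in granted

# The agent can read and create records...
assert authorize(AGENT_PERMISSIONS, "records:read")
# ...but deleting records or approving payments was never granted,
# so no prompt can talk the system into it.
assert not authorize(AGENT_PERMISSIONS, "records:delete")
assert not authorize(AGENT_PERMISSIONS, "payments:approve")
```

Nothing here reasons about intent. The deny isn't a rule the model weighs up; the capability just isn't there.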
Another approach: put a rules layer between Cowork and your data. An MCP server, for example, that lets Cowork read and create records but blocks deletes entirely. The LLM can do whatever it wants within those boundaries. The hard limits are enforced by the system, not by asking the AI nicely.
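A minimal sketch of that rules layer, assuming a simple key-value store; this is not the MCP SDK, and the operation names are made up for illustration. Only `read` and `create` are registered as handlers, so a delete isn't refused by policy so much as it doesn't exist as an operation.

```python
# Sketch of a rules layer between the model and the data store.
# Illustrative only: operation names and the handler shape are assumptions.
from dataclasses import dataclass, field

@dataclass
class GuardedStore:
    records: dict = field(default_factory=dict)

    def handle(self, op: str, key: str, value=None):
        # Only these operations are exposed to the model. "delete" has
        # no handler, so it cannot be invoked regardless of the prompt.
        allowed = {"read": self._read, "create": self._create}
        if op not in allowed:
            raise PermissionError(f"operation '{op}' is not exposed")
        return allowed[op](key, value)

    def _read(self, key, _value):
        return self.records.get(key)

    def _create(self, key, value):
        self.records.setdefault(key, value)
        return self.records[key]

store = GuardedStore()
store.handle("create", "customer-42", {"name": "Ada"})
store.handle("read", "customer-42")        # fine
# store.handle("delete", "customer-42")    # raises PermissionError
```

The enforcement lives in ordinary code, which is deterministic. The model's wobble never touches it.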
What if Cowork takes the human route?
Here's the question that comes up once people think this through. What if Cowork, instead of using the API, opens the browser and clicks the delete button itself? It has screen access, after all. It could bypass the connector rules by going through the same UI a human would.
This is a real concern, and it has a real answer. You block direct browser access to sensitive systems. You configure which applications Cowork can interact with on screen and which ones it can't. The hard rules in the connectors and the access restrictions on the screen work together. Soft rules guide the LLM's behaviour. Hard rules in the infrastructure enforce the boundaries. And screen-level controls close the back door. Those three layers, working together, are how you govern this properly.
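The screen-level layer can be sketched as a blocklist checked before the agent is allowed to drive an application. A minimal sketch, assuming such a configuration exists; the application names and the `can_control` helper are hypothetical.

```python
# Hypothetical screen-access policy: which applications the agent may
# drive directly. Application names are illustrative.
BLOCKED_APPS = {"crm-admin", "payments-console"}

def can_control(app_name: str) -> bool:
    """Deny screen control of sensitive applications outright."""
    return app_name not in BLOCKED_APPS

assert can_control("spreadsheet")          # everyday tools: allowed
assert not can_control("crm-admin")        # the delete button's home: blocked
```

With this in place, "open the browser and click delete yourself" fails at the same layer as the connector rules: in the infrastructure, not in the prompt.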
