by v_CodeSentinal 6 hours ago
Nothing super fancy. For me “proof” just means the agent has to make its intent explicit in a way I can check before running it.
For example: 1) If it wants to delete a file, it has to output the exact path it thinks it’s deleting. I normalize it and make sure it’s inside the project root. If not, I block it. 2) If it proposes a big change, I require a diff first instead of letting it execute directly. 3) After code changes, I run tests or at least a lint/type check before accepting it.
So it’s less about formal proofs and more about forcing the agent to surface assumptions in a structured way, then verifying those assumptions mechanically.
Still hacky, but it reduced the “creative workaround” behavior a lot.