Does good design up front matter as much if an AI can refactor in a few hours something that would take a good developer a month? Refactoring is one of those tasks that's tedious and too non-trivial for traditional automation, but seems perfect for an AI. Especially if you already have all the tests.
I’m constantly using code agents to work on feature development and they are constantly getting things wrong. They can refactor high-level concepts, but I have to nudge them to think about the proper abstractions. I don’t see how a multi-agent flow could handle those interactions. The bus factor is 1: me.
Try building review skills based on how you review. I built one recently based on how I review some of the concurrent backend stuff one of our tools does. I have it auto-run on every PR. It's great, it catches tons of stuff, and ranks the issues by severity. Over 10 reviews, only 1 false positive (hallucination) and several critical catches. I wish I'd set it up sooner.
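The severity-ranking part of that setup is simple to sketch. Here's a minimal, hypothetical version, assuming the review skill emits findings tagged with a severity label (none of these names come from the comment above; they're illustrative):

```python
# Hypothetical sketch: sort review findings from an AI review pass by severity.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def rank_findings(findings):
    """Return findings sorted most-severe-first; unknown labels sort last."""
    return sorted(
        findings,
        key=lambda f: SEVERITY_ORDER.get(f["severity"], len(SEVERITY_ORDER)),
    )

findings = [
    {"severity": "low", "msg": "naming nit"},
    {"severity": "critical", "msg": "lock held across await"},
    {"severity": "medium", "msg": "missing timeout on RPC"},
]

for f in rank_findings(findings):
    print(f"{f['severity']}: {f['msg']}")
```

The actual skill would run on the PR diff and produce the `findings` list; this only shows the ranking step that puts the critical catches at the top of the review comment.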
You can also, after those sessions where they get stuff wrong, ask for an analysis of what the agent got wrong that session and have it produce a ranked list. I just started doing that, and wow, it comes up with pretty solid lists. I'm not sure if it's sustainable to simply consolidate and prune the list over time, but maybe it is?
Upgrades, API compatibility, and cross-version communication are really important in some domains. A bad design can cause huge pain downstream when you need to make a change.
> Especially if you already have all the tests.
Most tests people write have to be changed if you refactor.