

I just spent about a month using Claude 3.7 to write a new feature for a big OSS product. The change ended up being roughly 6k lines of code, plus about 14k lines of tests, added to an existing codebase with an existing test framework for reference.
For context, I’m a principal-level dev with ~15 years of experience.
The key to making it work for me was treating it like a junior dev. That included priming it (“accuracy is key here; we can’t swallow errors; we need to fail fast anywhere correctness could be compromised”) as well as making it explain itself, show architecture diagrams, and reason from the results.
After every change there was always a review pass: “okay, but you’re violating the layered architecture here; let’s refactor that. Now tell me what the difference is between these two functions, and shouldn’t we just make one call the other instead of duplicating? This class is doing too much; we need to decompose this interface.” I also started a fresh session, set its context with the code it had just written, and had it tell me what assumptions the codebase was making and what failure modes existed. That turned out to be pretty helpful too.
In my experience it was actually kinda fun. I’d say it made me about twice as productive.
I would not have said this a month ago. Up until this project, I’d only had stupid experiences with AI (Gemini, GPT).
The PR isn’t public yet (it’s in my fork), but even once I submit it upstream, I don’t think I’m ready to out my real identity on Lemmy just yet.