Ned: So you're using AI coding tools now? Didn't you just tweet about how a model tried to parse XML with regex? I believe the phrase you used was "unmitigated disaster."
Abel: That was me, yes. I spent ten dollars in API credits on that code.
Ned: And you also tweeted about how it was adding specific branches just to make the test cases pass?
Abel: Guilty. The AI didn't even try to hide it. Just casually commented "this branch is specifically for the test case." No shame at all.
Ned: So what changed?
Abel: It certainly helped that the models have gotten better. But I think the more interesting change was a change in perspective: I stopped dismissing the failures as proof that the whole enterprise was doomed.
Ned: What do you mean?
Abel: I mean I started treating the AI's failures as puzzles rather than evidence that AI coding is inherently broken. When Claude gave me garbage code, instead of rage-quitting, I started asking: "Why did it do that? How could I have phrased my request differently?"
Ned: So you're doing the AI's job for it? Debugging its thought process? That doesn't sound efficient.
Abel: I mean, that's one way of putting it. I'd frame it differently. I learned how best to work with it. Sometimes the answer was embarrassingly simple, like "the AI had no idea what I was talking about because I wasn't clear about what I wanted." I started actually investing in prompts rather than just throwing vague requests at the model.
Ned: But aren't these tools supposed to just work? The anime avatars on Twitter promise I can just describe my app in plain English and watch as it materializes before my eyes.
Abel: Yeah that's nonsense.
Ned: So that's not how it works?
Abel: Not even close. These tools are more like junior developers who are bizarrely knowledgeable about some things and completely clueless about others. They require specific direction, constant supervision, and regular course correction.
Ned: That doesn't sound very revolutionary. Sounds like babysitting a particularly erratic intern.
Abel: It's not revolutionary in the way people expected. But I've found there's a middle ground where these tools become genuinely useful. The productivity gains don't come from magical AI capabilities; they come from carefully designed workflows that combine what AI is good at with human oversight of what it's terrible at.
Ned: All right well let's hear about your revolutionary techniques for babysitting AI.
Abel: Not revolutionary. Practical. And I'm not pretending to be some AI coding guru. I just think I've found some approaches that have made these tools less frustrating and more useful for me. Every week I discover new approaches, better ways to phrase prompts, and smarter workflows. I've found a nice rhythm, but I fully expect my approach to evolve as the tools improve and as I gain more experience.
Ned: Well the last time I tried using Cursor on my Python project, it kept trying to invoke `pip` directly even though I was using `uv` to manage the project.
Abel: Well, what did you do?
Ned: I gave up. It was just too annoying. Every time the agent tried to run `pip install` or `pytest` outside of `uv`'s virtualenv, the command would fail. Then the agent would get to work trying to fix the problem, but it would just get more and more confused and ask if I wanted to install `poetry`. I was just burning tokens.
Abel: Did you try correcting the model?
Ned: Well, sure. But it would keep making the same mistakes every time I created a new chat.
Abel: I hate to say it, Ned, but this sounds like a skill issue. Did you try adding these instructions to the default prompt?
Ned: Default prompt?
Abel: Well every tool calls it something different, but it's just a text file that's added to every session. Claude Code has `CLAUDE.md`, Cursor has "Rules", and so on. You could have added something like: "This is a Python project managed by `uv`. Run tests with `uv run pytest`. Install packages with `uv add`." Now the model knows this for every single interaction.
Ned: That's... surprisingly straightforward.
Abel: Look, I get it. "Skill issue" was harsh. Interacting with these tools is so different from everything that's come before. Over a decade of "traditional" software development has carved some deep mental grooves. I knew about the default prompt way before I actually started using it. I think I filed it away in the back of my mind as some sort of fiddly customization for power users. It wasn't until I read Anthropic's Best Practices for Agentic Coding article, which mentions it as its first and most basic tip, that I realized I should start using it. And in retrospect, it feels like a real "duh" moment.
Ned: Yeah and now I'm realizing I could get rid of all those useless comments the model likes to write by adding that instruction to the default prompt.
Abel: Exactly. Tell it not to write comments. Or explain what types of comments are helpful and unhelpful. Provide examples. Get creative.
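For instance, you might add a section like this to the default prompt (purely illustrative; tune the examples to your own taste):

```markdown
## Comments

- Don't restate what the code already says ("# increment the counter").
- Do explain non-obvious decisions ("# retry once: the upstream API is flaky on cold starts").
```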
Ned: But there are tons of other issues I ran into. The model kept trying to call functions that didn't exist. It would just make up function names.
Abel: I have a question for you: how do you know which functions exist?
Ned: Well I RTFM of course. Or autocomplete tells me which functions exist when I type the little dot.
Abel: Right. So it would probably be helpful if the agent had access to TFM or autocomplete.
Ned: Sure.
Abel: Have you considered trying to give the model this context?
Ned: What do you mean?
Abel: I mean if you want to get the most out of these models, you need to invest some effort into giving the model all this information. Maybe this means writing a script to run the type checker and linter, and telling the model to run it after every change (another great use of the default prompt). If you use Cursor, the model gets a lot of this as long as you install the right plugins and configure them. For docs, it can even index entire websites. Or maybe you should just download the docs and check them into the repo.
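A sketch of what that script might look like, assuming `ruff` as the linter and `mypy` as the type checker (swap in whatever your project actually uses), checked in somewhere like `scripts/check.py`:

```python
#!/usr/bin/env python3
"""Run the project's linter and type checker; exit nonzero if anything fails."""
import subprocess
import sys

# These commands are assumptions: ruff for linting, mypy for type checking,
# both run through uv so they use the project's virtualenv.
CHECKS = [
    ["uv", "run", "ruff", "check", "."],
    ["uv", "run", "mypy", "."],
]

def main() -> int:
    failed = False
    for cmd in CHECKS:
        print("$", " ".join(cmd))
        if subprocess.run(cmd).returncode != 0:
            failed = True
    return 1 if failed else 0

if __name__ == "__main__":
    sys.exit(main())
```

Then the default prompt gets one more line: something like "After every change, run `uv run python scripts/check.py` and fix anything it reports before moving on."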
Ned: But that's a lot of setup work just to use an AI. I thought these AIs were supposed to be superhuman.
Abel: Exactly! They aren't superhuman. If you couldn't code without the docs, why do you expect the AI to code without the docs? If you use a type checker, why shouldn't the AI?
Ned: That's not what the marketing promised.
Abel: Maybe one day we'll get to a place where AIs have the docs memorized and can run type checking in their weights, but we're not there yet. I think a great heuristic is: could a human implement this to my exacting specs based on what's in the context window? If the answer is "no", then a model shouldn't be expected to. In fact, so much of what I've learned about interacting with AIs is just a footnote to "you're probably not investing enough in your prompts."
Ned: What about these models just writing terrible code? The last time I asked the AI to implement a simple class, it started storing everything in a list and would do a linear search every time it needed to look up an element. No indexing of any kind.
Abel: That's why you need to keep these agents on a short leash.
Ned: What does that even mean?
Abel: It means you have to give really specific instructions and smaller units of work. The version of "vibe coding" where you write one sentence and get a production-ready app is fantasy.
Ned: So what am I supposed to do? Micromanage the AI?
Abel: In a way, yes. Be specific about what you want. Review code carefully. Give feedback. Iterate. When the model tries to write a linear search on the hot path, describe to it how to make it more efficient: "Build an index of user IDs to user objects instead of searching the array each time. Now update all the places this array is used." Also, experiment with different models. Some produce better code but have higher latency. Others are generally good but sometimes resort to reward hacking. The more you experiment, the more of an intuition you'll develop.
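To make the indexing suggestion concrete, here's the kind of before-and-after I mean (a toy sketch; the `User` shape is made up):

```python
from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str

# What the model wrote: a linear scan on every lookup, O(n) per call.
def find_user_slow(users: list[User], user_id: int) -> User | None:
    for user in users:
        if user.id == user_id:
            return user
    return None

# What you ask for instead: build the index once, then look up in O(1).
class UserStore:
    def __init__(self, users: list[User]) -> None:
        self._by_id = {user.id: user for user in users}

    def get(self, user_id: int) -> User | None:
        return self._by_id.get(user_id)
```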
Ned: That sounds like a lot of work.
Abel: It's still work, but the role shifts: you spend less time typing code and more time reviewing and directing.
Ned: So coding becomes less about writing code and more about reviewing AI-generated code?
Abel: Exactly. And being very clear about what you want in the first place.
Ned: What do you mean that I need to be clear? I think I am being clear.
Abel: If you're not pasting entire design docs into the prompt, then you're probably being less clear than you think.
Ned: Entire design docs??
Abel: Yeah. When I started with these tools, I'd write these terse little prompts. "Write a client for this REST API." And I'd get garbage back. I realized I was expecting the AI to read my mind.
Ned: But there are token limits. You can't just dump a 50-page spec.
Abel: Well, Gemini just shipped a 2 million token context window, so actually you can. But yes, it does require some judgment. If you wanted an intern to implement this feature, what would you tell them?
Ned: I'd talk to them about the goals of the project as a whole, send them design docs, explain how what they're working on fits into the whole system, whiteboard, talk through edge cases, and ask them to come up with a plan first. Then discuss that plan ... OK I see what you're getting at.
Abel: Wow, that's beautiful. I wish I had a manager like that when I was an intern. But yes, this is exactly how you should be talking to the model.
Ned: So you're saying I should just dump as much context as possible?
Abel: Not blindly. Be strategic. Focus on including design documents that explain the architecture, emails or discussions that capture requirements and edge cases, library documentation, and even things like OpenAPI specs if they're available. Then, just like an intern, ask it to come up with a plan and ask it if it has any questions for you. Work through edge cases, and only then unleash it on the code.
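A kickoff prompt for a new feature might look roughly like this (the project details here are invented; the point is the shape):

```text
Context:
- Design doc for the billing service (pasted below).
- OpenAPI spec for the payments API (pasted below).
- Thread from the requirements discussion covering edge cases (pasted below).

Task: add support for per-seat pricing. It must handle mid-cycle upgrades
and downgrades.

Before writing any code: propose a plan, list the files you expect to change,
and ask me any questions you have.
```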
Ned: That's a lot of up-front investment.
Abel: It is, but a lot of these tools have easy-to-miss features to help with this. Cursor has an `@` command for adding files and even web pages to the context window. Or maybe MCP servers solve this? I honestly don't know. Everything is moving so fast I no longer have the bandwidth to try every new thing.
Ned: Ok but is this actually faster than just coding it myself? All this back and forth must have a lot of overhead.
Abel: I think it's still probably faster. When you're sitting around waiting for the model to respond, it can feel slow, but it can generate fairly large amounts of code in seconds. And for those small changes where latency matters a lot, that's where using multiple agents in parallel really shines.
Ned: Multiple agents? Surely you're joking.
Abel: Not at all. The latency of these tools is unavoidable right now. While you're waiting for a response, why not turn that time into productive time? One of the advantages of these agents is you can spin up as many as you want. Take advantage of that! Make the latency part of your workflow. Run multiple agents on different parts of the codebase, and cycle through them, checking in on their progress and proposed changes.
Ned: That seems chaotic. How do you manage that in practice?
Abel: Here's an example: say you have one hard problem and one easy problem. The hard problem requires a lot of context and back-and-forth, like writing a brand new feature. The easy one is more straightforward, like a conceptually simple refactor.
Ned: Go on.
Abel: Set up the hard problem with one agent. Give it all the context we talked about earlier, discuss the approach, get it started. Then, while that's cooking, open another terminal and set up a second agent to work on the easy task.
Ned: And then what? Just bounce back and forth?
Abel: Exactly. Check in on agent one; maybe it's proposed a solution you need to review. While you're thinking about that, check in on agent two. Maybe it's done, or maybe it needs a quick clarification.
Ned: That sounds insane. When I'm programming I need less context switching, not more.
Abel: It's a very different workflow than traditional programming, but I think if you try it, you'll find that you fall into a rhythm. Offload some of the cognitive work of the fiddly programming bits to the model and focus on getting good at quickly reviewing code, giving good direction, and juggling context. And get creative at giving agents work. Maybe one isn't writing code. Maybe it's answering questions about the code and planning next steps for a completely different feature. Or maybe it's spelunking through a library investigating a weird bug.
Ned: I have a hard time believing that's effective. There are times where I really just need to focus hard on a problem.
Abel: You might be right about problems requiring deep focus, but so much software work isn't like that. It's just plumbing between various libraries.
Ned: I just don't think that's right. There are plenty of times a careful insight led to an important improvement or simplification.
Abel: That might be so, but often insight requires generating a lot of hypotheses and quickly validating or excluding them. Having endless agents who can read code and implement experiments is extremely useful for this.
Ned: I guess I'll just have to try it for myself. But if you have so many agents running at once, how do you prevent them from stomping on each other?
Abel: There are a few techniques. Git worktrees are great. They let you check out multiple branches of the same repo into different directories. Another approach is just splitting active areas of work across different temporary files, like `foo1.py`, `foo2.py`, `foo3.py`. Once they're all complete, ask one of the agents to merge them into `foo.py`.
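The worktree setup is just a couple of commands (the branch and directory names here are made up):

```sh
# One extra working directory per agent, each on its own branch.
git worktree add -b feature-a ../myrepo-feature-a
git worktree add -b refactor-b ../myrepo-refactor-b

# When an agent's branch has been merged, clean up its directory.
git worktree remove ../myrepo-feature-a
```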
Ned: That `foo1.py` business seems... hacky.
Abel: It is, a bit. But it's effective. You're adapting your workflow to the strengths and weaknesses of the tools. If I'm right and this workflow becomes popular, maybe we'll start to see tools that accommodate it better. But for now, it's not so bad, especially if an AI merges the files for you.
Ned: This sounds like it requires a pretty major shift in how I approach coding.
Abel: It does. And that's why adoption is the hardest part. There are days where I'll realize I've missed opportunities to offload tasks to the AI, because going back to the way I've been coding for over a decade feels more comfortable.
Ned: Look, I appreciate the walkthrough, but I'm still skeptical. All this parallel processing and context management feels like I'd be trading one type of complexity for another. Maybe slightly faster, but at what cost? Am I going to forget how to program?
Abel: I don't know. Your brain will certainly start to adapt to this new skillset. But maybe adapting to new tools isn't so bad. I hear assembly programmers had the same objection when higher level languages and compilers started making their skills obsolete. Is this so different? If I can offload some skills to the AI, then maybe those skills just aren't so valuable anymore.
Ned: But this is different. I'm offloading my ability to problem solve. I'm offloading my ability to think.
Abel: I don't think so. You're still doing a lot of code review and design work. Maybe your muscle memory for writing code will fade, but you're still spending lots of time thinking deeply about software. And even if that weren't true, you're still problem solving. You're just solving different problems and at a different level of the process. The day may come when human engineers are completely displaced by AIs, but in the meantime, there's still a lot of problem solving to be done.
Ned: Well that's a disturbing thought. But all right, I might give it another shot. But my bar is high. If it's more hassle than it's worth, I'm out.
Abel: That's all I'm asking.