
AI Is Reshaping Developer Career Paths

This article is part of a series on the Sens-AI Framework—practical habits for learning and coding with AI. Read the original framework introduction and explore the complete methodology in Andrew Stellman’s O’Reilly report Critical Thinking Habits for Coding with AI.

A few decades ago, I worked with a developer who was respected by everyone on our team. Much of that respect came from the fact that he kept adopting new technologies that none of us had worked with. There was a cutting-edge language at the time that few people were using, and he built an entire feature with it. He quickly became known as the person you’d go to for these niche technologies, and that reputation gave him real standing on the team.

Years later, I worked with another developer who went out of his way to incorporate specific, obscure .NET libraries into his code. That too got him recognition from our team members and managers, and he was viewed as a senior developer in part because of his expertise with these specialized tools.

Both developers built their reputations on deep knowledge of specific technologies. It was a reliable career strategy that worked for decades: Become the expert in something valuable but not widely known, and you’d have authority on your team and an edge in job interviews.

But AI is changing that dynamic in ways we’re just starting to see.

In the past, experienced developers could build deep expertise in a single technology (Rails or React, for example), and that expertise would consistently get them recognition on their team and help them stand out in reviews and job interviews. It used to take months or years of working with a specific framework before a developer could write idiomatic code—code that follows the accepted patterns and best practices of that technology.

But now AI models are trained on countless examples of idiomatic code, so developers without that experience can generate similar code immediately. That puts less of a premium on the time spent developing that deep expertise.

The Shift Toward Generalist Skills

That change is already reshaping career paths. The traditional approach worked for decades, but as AI fills in more of that specialized knowledge, the career advantage is shifting toward people who can integrate across systems and spot design problems early.

As I’ve trained developers and teams who are increasingly adopting AI coding tools, I’ve noticed that the developers who adapt best aren’t always the ones with the deepest expertise in a specific framework. Rather, they’re the ones who can spot when something looks wrong, integrate across different systems, and recognize patterns. Most importantly, they can apply those skills even when they’re not deep experts in the particular technology they’re working with.

This represents a shift from the more traditional dynamic on teams, where being an expert in a specific technology (like being the “Rails person” or the “React expert” on the team) carried real authority. AI now fills in much of that specialized knowledge. You can still build a career on deep Rails knowledge, but thanks to AI, it doesn’t always carry the same authority on a team that it once did.

What AI Still Can’t Do

Both new and experienced developers routinely accumulate technical debt, especially when deadlines push delivery ahead of maintainability. This is an area where experienced engineers often distinguish themselves, even on teams with wide AI adoption. The key difference is that an experienced developer often knows they’re taking on debt. They can spot antipatterns early because they’ve seen them repeatedly, and they take steps to “pay off” the debt before it gets much more expensive to fix.

But AI is also changing the game for experienced developers in ways that go beyond technical debt management, and it’s starting to reshape their traditional career paths. What AI still can’t do is tell you when a design or architecture decision today will cause problems six months from now, or when you’re writing code that doesn’t actually solve the user’s problem. That’s why being a generalist, with skills in architecture, design patterns, requirements analysis, and even project management, is becoming more valuable on software teams.

Many developers I see thriving with AI tools are the ones who can:

  • Recognize when generated code will create maintenance problems even if it works initially
  • Integrate across multiple systems without being deep experts in each one
  • Spot architectural patterns and antipatterns regardless of the specific technology
  • Frame problems clearly so AI can generate more useful solutions
  • Question and refine AI output rather than accepting it as is

Practical Implications for Your Career

This shift has real implications for how developers think about career development:

For experienced developers: Your years of expertise are still important and valuable, but the career advantage is shifting from “I know this specific tool really well” to “I can solve complex problems across different technologies.” Focus on building skills in system design, integration, and pattern recognition that apply broadly.

For early-career developers: The temptation might be to rely on AI to fill knowledge gaps, but this can be dangerous. Those broader skills—architecture, design judgment, problem-solving across domains—typically require years of hands-on experience to develop. Use AI as a tool, but make sure you’re still building the fundamental thinking skills that let you guide it effectively.

For teams: Look for people who can adapt to new technologies quickly and integrate across systems, not just deep specialists. The “Rails person” might still be valuable, but the person who can work with Rails, integrate it with three other systems, and spot when the architecture is heading for trouble six months down the line is becoming more valuable.

The developers who succeed in an AI-enabled world won’t always be the ones who know the most about any single technology. They’ll be the ones who can see the bigger picture, integrate across systems, and use AI as a powerful tool while maintaining the critical thinking necessary to guide it toward genuinely useful solutions.

AI isn’t replacing developers. It’s changing what kinds of developer skills matter most.

From Habits to Tools


AI-assisted coding is here to stay. I’ve seen many companies now require all developers to install Copilot extensions in their IDEs, and teams are increasingly being measured on AI-adoption metrics. Meanwhile, the tools themselves have become genuinely useful for routine tasks: Developers regularly use them to generate boilerplate, convert between formats, write unit tests, and explore unfamiliar APIs—giving us more time to focus on solving our real problems instead of wrestling with syntax or going down research rabbit holes.

Many team leads, managers, and instructors looking to help developers ramp up on AI tools assume the biggest challenge is learning to write better prompts or picking the right AI tool; that assumption misses the point. The real challenge is figuring out how developers can use these tools in ways that keep them engaged and strengthen their skills instead of becoming disconnected from the code and letting their development skills atrophy.

This was the challenge I took on when I developed the Sens-AI Framework. When I was updating Head First C# (O’Reilly 2024) to help readers ramp up on AI skills alongside other fundamental development skills, I watched new learners struggle not with the mechanics of prompting but with maintaining their understanding of the code they were producing. The framework emerged from those observations—five habits that keep developers engaged in the design conversation: context, research, framing, refining, and critical thinking. These habits address the real issue: making sure the developer stays in control of the work, understanding not just what the code does but why it’s structured that way.

What We’ve Learned So Far

When I updated Head First C# to include AI exercises, I had to design them knowing learners would paste instructions directly into AI tools. That forced me to be deliberate: The instructions had to guide the learner while also shaping how the AI responded. Testing those same exercises against Copilot and ChatGPT showed the same kinds of problems over and over—AI filling in gaps with the wrong assumptions or producing code that looked fine until you actually had to run it, read and understand it, or modify and extend it.

Those issues don’t only trip up new learners. More experienced developers can fall for them too. The difference is that experienced developers already have habits for catching themselves, while newer developers usually don’t—unless we make a point of teaching them. AI skills aren’t exclusive to senior developers either; I’ve seen relatively new developers pick up AI skills quickly because they built these habits early.

Habits Across the Lifecycle

In “The Sens-AI Framework,” I introduced the five habits and explained how they work together to keep developers engaged with their code rather than becoming passive consumers of AI output. These habits also address specific failure modes, and understanding how they solve real problems points the way toward broader implementation across teams and tools:

Context helps avoid vague prompts that lead to poor output. Ask an AI to “make this code better” without sharing what the code does, and it might suggest adding comments to a performance-critical section where comments would just add clutter. But provide the context—“This is a high-frequency trading system where microseconds matter,” along with the actual code structure, dependencies, and constraints—and the AI understands it should focus on optimizations, not documentation.

Research makes sure the AI isn’t your only source of truth. When you rely solely on AI, you risk compounding errors—the AI makes an assumption, you build on it, and soon you’re deep in a solution that doesn’t match reality. Cross-checking with documentation or even asking a different AI can reveal when you’re being led astray.

Framing is about asking questions that set up useful answers. “How do I handle errors?” gets you a try-catch block. “How do I handle network timeout errors in a distributed system where partial failures need rollback?” gets you circuit breakers and compensation patterns. As I showed in “Understanding the Rehash Loop,” proper framing can break the AI out of circular suggestions.
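
To make the second framing concrete, here’s a rough sketch of the kind of pattern it tends to surface—a minimal circuit breaker. This is purely illustrative: the class name, threshold, and timeout are hypothetical, not anything the framework prescribes.

    import java.util.function.Supplier;

    // Minimal circuit breaker: after repeated failures, calls fail fast instead of
    // piling up timeouts, giving the downstream service time to recover.
    public class SimpleCircuitBreaker {
        private final int failureThreshold;
        private final long resetTimeoutMillis;
        private int consecutiveFailures = 0;
        private long openedAt = 0;

        public SimpleCircuitBreaker(int failureThreshold, long resetTimeoutMillis) {
            this.failureThreshold = failureThreshold;
            this.resetTimeoutMillis = resetTimeoutMillis;
        }

        public <T> T call(Supplier<T> operation) {
            if (isOpen()) {
                // Fail fast while the breaker is open rather than waiting on another timeout.
                throw new IllegalStateException("Circuit open: failing fast");
            }
            try {
                T result = operation.get();
                consecutiveFailures = 0;   // a success closes the circuit again
                return result;
            } catch (RuntimeException e) {
                consecutiveFailures++;
                if (consecutiveFailures >= failureThreshold) {
                    openedAt = System.currentTimeMillis();   // trip the breaker
                }
                throw e;
            }
        }

        private boolean isOpen() {
            return consecutiveFailures >= failureThreshold
                    && System.currentTimeMillis() - openedAt < resetTimeoutMillis;
        }
    }

The point isn’t this particular implementation; it’s that the richer question steers the AI toward failure-handling patterns like this instead of a bare try-catch.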

Refining means not settling for the first thing the AI gives you. The first response is rarely the best—it’s just the AI’s initial attempt. When you iterate, you’re steering toward better patterns. Refining moves you from “This works” to “This is actually good.”

Critical thinking ties it all together, asking whether the code actually works for your project. It’s debugging the AI’s assumptions, reviewing for maintainability, and asking, “Will this make sense six months from now?”

The real power of the Sens-AI Framework comes from using all five habits together. They form a reinforcing loop: Context informs research, research improves framing, framing guides refinement, refinement reveals what needs critical thinking, and critical thinking shows you what context you were missing. When developers use these habits in combination, they stay engaged with the design and engineering process rather than becoming passive consumers of AI output. It’s the difference between using AI as a crutch and using it as a genuine collaborator.

Where We Go from Here

If developers are going to succeed with AI, these habits need to show up beyond individual workflows. They need to become part of:

Education: Teaching AI literacy alongside basic coding skills. As I described in “The AI Teaching Toolkit,” techniques like having learners debug intentionally flawed AI output help them spot when the AI is confidently wrong and practice breaking out of rehash loops. These aren’t advanced skills; they’re foundational.

Team practice: Using code reviews, pairing, and retrospectives to evaluate AI output the same way we evaluate human-written code. In my teaching article, I described techniques like AI archaeology and shared language patterns. What matters here is making those kinds of habits part of standard training—so teams develop vocabulary like “I’m stuck in a rehash loop” or “The AI keeps defaulting to the old pattern.” And as I explored in “Trust but Verify,” treating AI-generated code with the same scrutiny as human code is essential for maintaining quality.

Tooling: IDEs and linters that don’t just generate code but highlight assumptions and surface design trade-offs. Imagine your IDE warning: “Possible rehash loop detected: you’ve been iterating on this same approach for 15 minutes.” That’s one direction IDEs need to evolve—surfacing assumptions and warning when you’re stuck. The technical debt risks I outlined in “Building AI-Resistant Technical Debt” could be mitigated with better tooling that catches antipatterns early.
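
As a very rough sketch of what such a check might look like—this is a hypothetical heuristic, not a feature of any current IDE—a tool could flag when several successive AI responses are nearly identical:

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    // Hypothetical rehash-loop heuristic: if several successive AI responses share almost
    // all of their tokens, warn the developer that it may be time to step back and reframe.
    public class RehashLoopDetector {
        private Set<String> previousTokens = new HashSet<>();
        private int nearDuplicateCount = 0;

        /** Returns true when the latest response looks like another lap around the same loop. */
        public boolean record(String aiResponse) {
            Set<String> tokens =
                    new HashSet<>(Arrays.asList(aiResponse.toLowerCase().split("\\W+")));
            boolean nearDuplicate = jaccard(previousTokens, tokens) > 0.9;
            previousTokens = tokens;
            nearDuplicateCount = nearDuplicate ? nearDuplicateCount + 1 : 0;
            return nearDuplicateCount >= 3;   // three near-identical responses in a row
        }

        private static double jaccard(Set<String> a, Set<String> b) {
            if (a.isEmpty() || b.isEmpty()) {
                return 0.0;
            }
            Set<String> intersection = new HashSet<>(a);
            intersection.retainAll(b);
            Set<String> union = new HashSet<>(a);
            union.addAll(b);
            return (double) intersection.size() / union.size();
        }
    }

Real tooling would need something far more robust, but even a crude signal like this captures the idea: surface the assumption, and tell the developer when they’re stuck.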

Culture: A shared understanding that AI is a collaboration tool (and not a teammate). A team’s measure of success for code shouldn’t revolve around AI. Teams still need to understand that code, keep it maintainable, and grow their own skills along the way. Getting there will require changes in how they work together—for example, adding AI-specific checks to code reviews or developing shared vocabulary for when AI output starts drifting. This cultural shift connects to the requirements engineering parallels I explored in “Prompt Engineering Is Requirements Engineering”—we need the same clarity and shared understanding with AI that we’ve always needed with human teams.

More convincing output will require more sophisticated evaluation. Models will keep getting faster and more capable. What won’t change is the need for developers to think critically about the code in front of them.

The Sens-AI habits work alongside today’s tools and are designed to stay relevant to tomorrow’s tools as well. They’re practices that keep developers in control, even as models improve and the output gets harder to question. The framework gives teams a way to talk about both the successes and the failures they see when using AI. From there, it’s up to instructors, tool builders, and team leads to decide how to put those lessons into practice.

The next generation of developers will never know coding without AI. Our job is to make sure they build lasting engineering habits alongside these tools—so AI strengthens their craft rather than hollowing it out.

The AI Teaching Toolkit: Practical Guidance for Teams


Teaching developers to work effectively with AI means building habits that keep critical thinking active while leveraging AI’s speed.

But teaching these habits isn’t straightforward. Instructors and team leads often find themselves needing to guide developers through challenges in ways that build confidence rather than short-circuit their growth. (See “The Cognitive Shortcut Paradox.”) Then there are the familiar challenges of working with AI itself:

  • Suggestions that look correct while hiding subtle flaws
  • Less experienced developers accepting output without questioning it
  • AI producing patterns that don’t match the team’s standards
  • Code that works but creates long-term maintainability headaches

The Sens-AI Framework (see “The Sens-AI Framework: Teaching Developers to Think with AI”) was built to address these problems. It focuses on five habits—context, research, framing, refining, and critical thinking—that help developers use AI effectively while keeping learning and design judgment in the loop.

This toolkit builds on those habits by giving you concrete ways to integrate them into team practice, whether you’re running a workshop, leading code reviews, or mentoring individual developers. The techniques that follow include practical teaching strategies, common pitfalls to avoid, reflective questions to deepen learning, and positive signs that show the habits are sticking.

Advice for Instructors and Team Leads

The strategies in this toolkit can be used in classrooms, review meetings, design discussions, or one-on-one mentoring. They’re meant to help new learners, experienced developers, and teams have more open conversations about design decisions, context, and the quality of AI suggestions. The focus is on making review and questioning feel like a normal, expected part of everyday development.

Discuss assumptions and context explicitly. In code reviews or mentoring sessions, ask developers to talk about times when the AI gave them poor or unexpected results. Also try asking them to explain what they think the AI might have needed to know to produce a better answer, and where it might have filled in gaps incorrectly. Getting developers to articulate those assumptions helps spot weak points in design before they’re cemented into the code. (See “Prompt Engineering Is Requirements Engineering.”)

Encourage pairing or small-group prompt reviews. Make AI-assisted development collaborative, not siloed. Have developers on a team or students in a class share their prompts with each other, and talk through why they wrote them a certain way, just like they’d talk through design decisions in pair or mob programming. This helps less experienced developers see how others approach framing and refining prompts.

Encourage researching idiomatic usage. One thing that often holds back intermediate developers is not knowing the idioms of a specific framework or language. AI can help here—if they ask for the idiomatic way to do something, they see not just the syntax but also the patterns experienced developers rely on. That shortcut can speed up their understanding and make them more confident when working with new technologies.

Here are two examples of how using AI to research idioms can help developers quickly adapt:

  • A developer with deep experience writing microservices but little exposure to Spring Boot can use AI to see the idiomatic way to annotate a class with @RestController and @RequestMapping. They might also learn that Spring Boot favors constructor injection over field injection with @Autowired, or that @GetMapping("/users") is preferred over @RequestMapping(method = RequestMethod.GET, value = "/users").
  • A Java developer new to Scala might reach for null instead of Scala’s Option types—missing a core part of the language’s design. Asking the AI for the idiomatic approach surfaces not just the syntax but the philosophy behind it, guiding developers toward safer and more natural patterns.
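
To make the Spring Boot example concrete, here’s a minimal before-and-after sketch of what that research might surface. The UserService, User, and /users endpoint are hypothetical placeholders, and the two controllers are shown side by side purely for comparison (a real app would keep only one):

    import java.util.List;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;
    import org.springframework.web.bind.annotation.RestController;

    // Hypothetical domain types, stubbed so the sketch stands alone.
    class User {}
    interface UserService { List<User> findAll(); }

    // Less idiomatic: field injection and the verbose mapping form.
    @RestController
    class UserControllerBefore {
        @Autowired
        private UserService userService;   // field injection

        @RequestMapping(method = RequestMethod.GET, value = "/users")
        public List<User> listUsers() {
            return userService.findAll();
        }
    }

    // More idiomatic Spring Boot: constructor injection and @GetMapping.
    @RestController
    class UserControllerAfter {
        private final UserService userService;

        UserControllerAfter(UserService userService) {   // constructor injection
            this.userService = userService;
        }

        @GetMapping("/users")
        public List<User> listUsers() {
            return userService.findAll();
        }
    }

Seeing the two versions next to each other is exactly the kind of contrast that helps an intermediate developer absorb a framework’s conventions quickly.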

Help developers recognize rehash loops as meaningful signals. When the AI keeps circling the same broken idea, even developers who have experienced this many times may not realize they’re caught in a rehash loop. Teach them to recognize the loop as a signal that the AI has exhausted its context, and that it’s time to step back. That pause can lead to research, reframing the problem, or providing new information. For example, you might stop and say: “Notice how it’s circling the same idea? That’s our signal to break out.” Then demonstrate how to reset: open a new session, consult documentation, or try a narrower prompt. (See “Understanding the Rehash Loop.”)

Research beyond AI. Help developers learn that when they hit a wall, the answer isn’t to keep tweaking prompts endlessly. Model the habit of branching out: check official documentation, search Stack Overflow, or review similar patterns in your existing codebase. AI should be one tool among many. Showing developers how to diversify their research keeps them from looping and builds stronger problem-solving instincts.

Use failed projects as test cases. Bring in previous projects that ran into trouble with AI-generated code and revisit them with Sens-AI habits. Review what went right and wrong, and talk about where it might have helped to break out of the vibe-coding loop to do additional research, reframe the problem, and apply critical thinking. Work with the team to write down the lessons learned from the discussion. Holding a retrospective exercise like this lowers the stakes—developers are free to experiment and critique without slowing down current work. It’s also a powerful way to show how reframing, refining, and verifying could have prevented past issues. (See “Building AI-Resistant Technical Debt.”)

Make refactoring part of the exercise. Help developers avoid the habit of deciding the code is finished when it runs and seems to work. Have them work with the AI to clean up variable names, reduce duplication, simplify overly complex logic, apply design patterns, and find other ways to prevent technical debt. By making evaluation and improvement explicit, you can help developers build the muscle memory that prevents passive acceptance of AI output. (See “Trust but Verify.”)

Common Pitfalls to Address with Teams

Even with good intentions, teams often fall into predictable traps. Watch for these patterns and address them explicitly, because otherwise they can slow progress and mask real learning.

The completionist trap: Trying to read every line of AI output even when you’re about to regenerate it. Teach developers it’s okay to skim, spot problems, and regenerate early. This helps them avoid wasting time carefully reviewing code they’ll never use, and reduces the risk of cognitive overload. The key is to balance thoroughness with pragmatism—they can start to learn when detail matters and when speed matters more.

The perfection loop: Endless tweaking of prompts for marginal improvements. Try setting a limit on iteration—for example, if refining a prompt doesn’t get good results after three or four attempts, it’s time to step back and rethink. Developers need to learn that diminishing returns are a sign to change strategy, not to keep grinding, so energy that should go toward solving the problem doesn’t get lost in chasing minor refinements.

Context dumping: Pasting entire codebases into prompts. Teach scoping—What’s the minimum context needed for this specific problem? Help them anticipate what the AI needs, and provide the minimal context required to solve each problem. Context dumping can be especially problematic with limited context windows, where the AI literally can’t see all the code you’ve pasted, leading to incomplete or contradictory suggestions. Teaching developers to be intentional about scope prevents confusion and makes AI output more reliable.

Skipping the fundamentals: Using AI for extensive code generation before understanding basic software development concepts and patterns. Ensure learners can solve simple development problems on their own (without the help of AI) before accelerating with AI on more complex ones. This reduces the risk of developers building a shallow foundation of knowledge that collapses under pressure. Fundamentals are what allow them to evaluate AI’s output critically rather than blindly trusting it.

AI Archaeology: A Practical Team Exercise for Better Judgment

Have your team do an AI archaeology exercise. Take a piece of AI-generated code from the previous week and analyze it together. More complex or nontrivial code samples work especially well because they tend to surface more assumptions and patterns worth discussing.

Have each team member independently write down their own answers to these questions:

  • What assumptions did the AI make?
  • What patterns did it use?
  • Did it make the right decision for our codebase?
  • How would you refactor or simplify this code if you had to maintain it long-term?

Once everyone has had time to write, bring the group back together—either in a room or virtually—and compare answers. Look for points of agreement and disagreement. When different developers spot different issues, that contrast can spark discussion about standards, best practices, and hidden dependencies. Encourage the group to debate respectfully, with an emphasis on surfacing reasoning rather than just labeling answers as right or wrong.

This exercise makes developers slow down and compare perspectives, which helps surface hidden assumptions and coding habits. By putting everyone’s observations side by side, the team builds a shared sense of what good AI-assisted code looks like.

For example, the team might discover the AI consistently uses older patterns your team has moved away from or that it defaults to verbose solutions when simpler ones exist. Discoveries like that become teaching moments about your team’s standards and help calibrate everyone’s “code smell” detection for AI output. The retrospective format makes the whole exercise more friendly and less intimidating than real-time critique, which helps to strengthen everyone’s judgment over time.

Signs of Success

Balancing pitfalls with positive indicators helps teams see what good AI practice looks like. When these habits take hold, you’ll notice developers:

Reviewing AI code with the same rigor as human-written code—but only when appropriate. When developers stop saying “the AI wrote it, so it must be fine” and start giving AI code the same scrutiny they’d give a teammate’s pull request, it demonstrates that the habits are sticking.

Exploring multiple approaches instead of accepting the first answer. Developers who use AI effectively don’t settle for the initial response. They ask the AI to generate alternatives, compare them, and use that exploration to deepen their understanding of the problem.

Recognizing rehash loops without frustration. Instead of endlessly tweaking prompts, developers treat rehash loops as signals to pause and rethink. This shows they’re learning to manage AI’s limitations rather than fight against them.

Sharing “AI gotchas” with teammates. Developers start saying things like “I noticed Copilot always tries this approach, but here’s why it doesn’t work in our codebase.” These small observations become collective knowledge that helps the whole team work together and with AI more effectively.

Asking “Why did the AI choose this pattern?” instead of just asking “Does it work?” This subtle shift shows developers are moving beyond surface correctness to reasoning about design. It’s a clear sign that critical thinking is active.

Bringing fundamentals into AI conversations. Developers who are working effectively with AI tools tend to relate AI output back to core principles like readability, separation of concerns, or testability. This shows they’re not letting AI bypass their grounding in software engineering.

Treating AI failures as learning opportunities. When something goes wrong, instead of blaming the AI or themselves, developers dig into why. Was it context? Framing? A fundamental limitation? This investigative mindset turns problems into teachable moments.

Reflective Questions for Teams

Encourage developers to ask themselves these reflective questions periodically. They slow the process just enough to surface assumptions and spark discussion. You might use them in training, pairing sessions, or code reviews to prompt developers to explain their reasoning. The goal is to keep the design conversation active, even when the AI seems to offer quick answers.

  • What does the AI need to know to do this well? (Ask this before writing any prompt.)
  • What context or requirements might be missing here? (Helps catch gaps early.)
  • Do you need to pause here and do some research? (Promotes branching out beyond AI.)
  • How might you reframe this problem more clearly for the AI? (Encourages clarity in prompts.)
  • What assumptions are you making about this AI output? (Surfaces hidden design risks.)
  • If you’re getting frustrated, is that a signal to step back and rethink? (Normalizes stepping away.)
  • Would it help to switch from reading code to writing tests to check behavior? (Shifts the lens to validation.)
  • Do these unit tests reveal any design issues or hidden dependencies? (Connects testing with design insight.)
  • Have you tried starting a new chat session or using a different AI tool for this research? (Models flexibility with tools.)

The goal of this toolkit is to help developers build the kind of judgment that keeps them confident with AI while still growing their core skills. When teams learn to pause, review, and refactor AI-generated code, they move quickly without losing sight of design clarity or long-term maintainability. These teaching strategies give developers the habits to stay in control of the process, learn more deeply from the work, and treat AI as a true collaborator in building better software. As AI tools evolve, these fundamental habits—questioning, verifying, and maintaining design judgment—will remain the difference between teams that use AI well and those that get used by it.

The Cognitive Shortcut Paradox

This article is part of a series on the Sens-AI Framework—practical habits for learning and coding with AI.

AI gives novice developers the ability to skip the slow, messy parts of learning. For experienced developers, that can mean getting to a working solution faster. Developers early in their learning path, however, face what I call the cognitive shortcut paradox: they need coding experience to use AI tools well, because experience builds the judgment required to evaluate, debug, and improve AI-generated code—but leaning on AI too much in those first stages can keep them from ever gaining that experience.

I saw this firsthand when adapting Head First C# to include AI exercises. The book’s exercises are built to teach specific development concepts like object-oriented programming, separation of concerns, and refactoring. If new learners let AI generate the code before they’ve learned the fundamentals, they miss the problem-solving work that leads to those “aha!” moments where understanding really clicks.

With AI, it’s easy for new learners to bypass the learning process completely by pasting the exercise instructions into a coding assistant, getting a complete program in seconds, and running it without ever working through the design or debugging. When the AI produces the right output, it feels like progress to the learner. But the goal was never just to have a running program; it was to understand the requirements and craft a solution that reinforced a specific concept or technique that was taught earlier in the book. The problem is that to the novice, the work still looks right—code that compiles and produces the expected results—so the missing skills stay hidden until the gap is too wide to close.

Evidence is emerging that AI chatbots can boost productivity for experienced workers but have little measurable impact on skill growth for beginners. In practice, the tool that speeds mastery for seniors can slow it for juniors, because it hands over a polished answer before they’ve had the chance to build the skills needed to use that answer effectively.

The cognitive shortcut paradox isn’t just a classroom issue. In real projects, the most valuable engineering work often involves understanding ambiguous requirements, making architectural calls when nothing is certain, and tracking down the kind of bugs that don’t have obvious fixes. Those abilities come from wrestling with problems that don’t have a quick path to “done.” If developers turn to AI at the first sign of difficulty, they skip the work that builds the pattern recognition and systematic thinking senior engineers depend on.

Over time, the effect compounds. A new developer might complete early tickets through vibe coding, feel the satisfaction of shipping working code, and gain confidence in their abilities. Months later, when they’re asked to debug a complex system or refactor code they didn’t write, the gap shows. By then, their entire approach to development may depend on AI to fill in every missing piece, making it much harder to develop independent problem-solving skills.

The cognitive shortcut paradox presents a fundamental challenge for how we teach and learn programming in the AI era. The traditional path of building skills through struggle and iteration hasn’t become obsolete; it’s become more critical than ever, because those same skills are what allow developers to use AI tools effectively. The question isn’t whether to use AI in learning, but how to use it in ways that build rather than bypass the critical thinking abilities that separate effective developers from code generators. This requires a more deliberate approach to AI-assisted development, one that preserves the essential learning experiences while harnessing AI’s capabilities.

Trust but Verify

We often say AIs “understand” code, but they don’t truly understand your problem or your codebase in the sense that humans understand things. They’re mimicking patterns from text and code they’ve seen before, either built into their model or provided by you, aiming to produce something that looks like a plausible answer. It’s very often correct, which is why vibe coding (repeatedly feeding the output from one prompt back to the AI without reading the code it generated) works so well, but it’s not guaranteed to be correct. And because of the limitations of how LLMs work and how we prompt them, the solutions rarely account for overall architecture, long-term strategy, or even good code design principles.

The principle I’ve found most effective for managing these risks is borrowed from another domain entirely: trust but verify. While the phrase has been used in everything from international relations to systems administration, it perfectly captures the relationship we need with AI-generated code. We trust the AI enough to use its output as a starting point, but we verify everything before we commit it.

Trust but verify is the cornerstone of an effective approach: trust the AI for a starting point but verify that the design supports change, testability, and clarity. That means applying the same critical review patterns you’d use for any code: checking assumptions, understanding what the code is really doing, and making sure it fits your design and standards.

Verifying AI-generated code means reading it, running it, and sometimes even debugging it line by line. Ask yourself whether the code will still make sense to you—or anyone else—months from now. In practice, this can mean quick design reviews even for AI-generated code, refactoring when coupling or duplication starts to creep in, and taking a deliberate pass at naming so variables and functions read clearly. These extra steps help you stay engaged with critical thinking and keep you from locking early mistakes into the codebase, where they become difficult to fix.

Verifying also means taking specific steps to check both your assumptions and the AI’s output—like generating unit tests for the code, as we discussed earlier. The AI can be helpful, but it isn’t reliable by default. It doesn’t know your problem, your domain, or your team’s context unless you make that explicit in your prompts and review the output carefully to make sure that you communicated it well and the AI understood.

AI can help with this verification too: It can suggest refactorings, point out duplicated logic, or help extract messy code into cleaner abstractions. But it’s up to you to direct it to make those changes, which means you have to spot them first—which is much easier for experienced developers who have seen these problems over the course of many projects.

Beyond reviewing the code directly, there are several techniques that can help with verification. They’re based on the idea that the AI generates code from the context it’s working with, but it can’t tell you why it made specific choices the way a human developer could. When code doesn’t work, it’s often because the AI filled in gaps with assumptions drawn from patterns in its training data that don’t match your actual problem. The following techniques are designed to help surface those hidden assumptions, highlighting options so you can make the decisions about your code instead of leaving them to the AI.

  • Ask the AI to explain the code it just generated. Follow up with questions about why it made specific design choices. The explanation isn’t the same as a human author walking you through their intent; it’s the AI interpreting its own output. But that perspective can still be valuable, like having a second reviewer describe what they see in the code. If the AI made a mistake, its explanation will likely echo that mistake because it’s still working from the same context. But that consistency can actually help surface the assumptions or misunderstandings you might not catch by just reading the code.
  • Try generating multiple solutions. Asking the AI to produce two or three alternatives forces it to vary its approach, which often reveals different assumptions or trade-offs. One version may be more concise; another more idiomatic; a third more explicit. Even if none are perfect, putting the options side by side helps you compare patterns and decide what best fits your codebase. Comparing the alternatives is an effective way to keep your critical thinking engaged and stay in control of your codebase.
  • Use the AI as its own critic. After the AI generates code, ask it to review that code for problems or improvements. This can be effective because it forces the AI to approach the code as a new task; the context shift is more likely to surface edge cases or design issues the AI didn’t detect the first time. Because of that shift, you might get contradictory or nitpicky feedback, but that can be useful too—it reveals places where the AI is drawing on conflicting patterns from its training. Treat these critiques as prompts for your own judgment, not as fixes to apply blindly. Again, this is a technique that helps keep your critical thinking engaged by highlighting issues you might otherwise skip over when skimming the generated code.

These verification steps might feel like they slow you down, but they’re actually investments in velocity. Catching a design problem after five minutes of review is much faster than debugging it six months later when it’s woven throughout your codebase. The goal is to go beyond simple vibe coding by adding strategic checkpoints where you shift from generation mode to evaluation mode.

The ability of AI to generate a huge amount of code in a very short time is a double-edged sword. That speed is seductive, but if you aren’t careful with it, you can vibe code your way straight into classic antipatterns (see “Building AI-Resistant Technical Debt: When Speed Creates Long-term Pain”). In my own coding, I’ve seen the AI take clear steps down this path, creating overly structured solutions that, if I allowed them to go unchecked, would lead directly to complex, highly coupled, layered designs. I spotted them because I’ve spent decades writing code and working on teams, so I recognized the patterns early and corrected them—just like I’ve done hundreds of times in code reviews with team members. That’s what “trust but verify” means in practice: slowing down enough to think about design and reviewing changes carefully so you don’t build up layered complexity you can’t unwind later.

There’s also a strong signal in how hard it is to write good unit tests for AI-generated code. If tests are hard for the AI to generate, that’s a signal to stop and think. Adding unit tests to your vibe-code cycle creates a checkpoint—a reason to pause, question the output, and shift back into critical thinking. This technique borrows from test-driven development: using tests not only to catch bugs later but to reveal when a design is too complex or unclear.

When you ask the AI to help write unit tests for generated code, first have it generate a plan for the tests it’s going to write. Watch for signs of trouble: lots of mocking, complex setup, too many dependencies—especially needing to modify other parts of the code. Those are signals that the design is too coupled or unclear. When you see those signs, stop vibe coding and read the code. Ask the AI to explain it. Run it in the debugger. Stay in critical thinking mode until you’re satisfied with the design.
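
Here’s a hedged sketch of what that trouble can look like in practice, using JUnit and Mockito; OrderProcessor and its collaborators are hypothetical stand-ins, not code from any real project:

    import static org.junit.jupiter.api.Assertions.assertEquals;
    import static org.mockito.Mockito.mock;
    import static org.mockito.Mockito.when;

    import org.junit.jupiter.api.Test;

    // Hypothetical collaborators, stubbed so the sketch stands alone.
    interface InventoryClient { boolean reserve(String sku, int qty); }
    interface PaymentGateway { String charge(String customerId, double amount); }
    interface AuditLogger { void record(String event); }

    // A service the AI might generate: it needs every collaborator just to process one order.
    class OrderProcessor {
        private final InventoryClient inventory;
        private final PaymentGateway payments;
        private final AuditLogger audit;

        OrderProcessor(InventoryClient inventory, PaymentGateway payments, AuditLogger audit) {
            this.inventory = inventory;
            this.payments = payments;
            this.audit = audit;
        }

        String process(String customerId, String sku, int qty, double amount) {
            if (!inventory.reserve(sku, qty)) throw new IllegalStateException("out of stock");
            String txn = payments.charge(customerId, amount);
            audit.record("charged " + customerId);
            return txn;
        }
    }

    class OrderProcessorTest {
        // Warning sign: three mocks and detailed stubbing just to exercise one method.
        // Heavy setup like this is a signal that the generated design is tightly coupled.
        @Test
        void heavilyMockedTestIsASignalToStopAndReadTheCode() {
            InventoryClient inventory = mock(InventoryClient.class);
            PaymentGateway payments = mock(PaymentGateway.class);
            AuditLogger audit = mock(AuditLogger.class);

            when(inventory.reserve("sku-123", 2)).thenReturn(true);
            when(payments.charge("customer-9", 42.00)).thenReturn("txn-1");

            OrderProcessor processor = new OrderProcessor(inventory, payments, audit);

            assertEquals("txn-1", processor.process("customer-9", "sku-123", 2, 42.00));
        }
    }

When the only way to get a test to pass is to stub several collaborators in exactly the right order, that’s usually the design talking, not the test framework.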

There are also other clear signals that these risks are creeping in, which tell you when to stop trusting and start verifying:

  • Rehash loops: Developers cycling through slight variations of the same AI prompt without making meaningful progress because they’re avoiding stepping back to rethink the problem (see “Understanding the Rehash Loop: When AI Gets Stuck”).
  • AI-generated code that almost works: Code that feels close enough to trust but hides subtle, hard-to-diagnose bugs that show up later in production or maintenance.
  • Code changes that require “shotgun surgery”: Asking the AI to make a small change requires it to create cascading edits in multiple unrelated parts of the codebase—this indicates a growing and increasingly unmanageable web of interdependencies, the shotgun surgery code smell.
  • Fragile unit tests: Tests that are overly complex, tightly coupled, or rely on too much mocking just to get the AI-generated code to pass.
  • Debugging frustration: Small fixes that keep breaking somewhere else, revealing underlying design flaws.
  • Overconfidence in output: Skipping review and design steps because the AI delivered something that looks finished.

All of these are signals to step out of the vibe-coding loop, apply critical thinking, and use the AI deliberately to refactor your code for simplicity.

Prompt Engineering Is Requirements Engineering

In the rush to get the most from AI tools, prompt engineering—the practice of writing clear, structured inputs that guide an AI tool’s output—has taken center stage. But for software engineers, the skill isn’t new. We’ve been doing a version of it for decades, just under a different name. The challenges we face when writing AI prompts are the same ones software teams have been grappling with for generations. Talking about prompt engineering today is really just continuing a much older conversation about how developers spell out what they need built, under what conditions, with what assumptions, and how to communicate that to the team.

The software crisis was the name given to this problem starting in the late 1960s, especially at the NATO Software Engineering Conference in 1968, where the term “software engineering” was introduced. The crisis referred to the widespread industry experience that software projects were over budget and late, and often failed to deliver what users actually needed.

There was a common misconception that these failures were due to programmers lacking technical skill or teams needing more technical training. But the panels at that conference focused on what they saw as the real root cause: Teams and their stakeholders had trouble understanding the problems they were solving and what they actually needed to build; communicating those needs and ideas clearly among themselves; and ensuring the delivered system matched that intent. It was fundamentally a human communication problem.

Participants at the conference captured this precisely. Dr. Edward E. David Jr. from Bell Labs noted there is often no way even to specify in a logically tight way what the software is supposed to do. Douglas Ross from MIT pointed out the pitfall where you can specify what you are going to do, and then do it as if that solved the problem. Prof. W.L. van der Poel summed up the challenge of incomplete specifications: Most problems simply aren’t defined well enough at the start, so you don’t have the information you need to build the right solution.

These are all problems that cause teams to misunderstand the software they’re creating before any code is written. And they should all sound familiar to developers today who work with AI to generate code.

Much of the problem boils down to what I’ve often called the classic “do what I meant, not what I said” problem. Machines are literal—and people on teams often are too. Our intentions are rarely fully spelled out, and getting everyone aligned on what the software is supposed to do has always required deliberate, often difficult work.

Fred Brooks wrote about this in his classic and widely influential “No Silver Bullet” essay. He argued there would never be a single magic process or tool that would make software development easy. Throughout the history of software engineering, teams have been tempted to look for that silver bullet that would make the hard parts of understanding and communication go away. It shouldn’t be surprising that we’d see the same problems that plagued software teams for years reappear when they started to use AI tools.

By the end of the 1970s, these problems were being reframed in terms of quality. Philip Crosby, Joseph M. Juran, and W. Edwards Deming, three people who had enormous influence on the field of quality engineering, each had his own take on why so many products didn’t do the jobs they were supposed to do—ideas that apply especially well to software. Crosby argued quality was fundamentally conformance to requirements—if you couldn’t define what you needed clearly, you couldn’t ensure it would be delivered. Juran talked about fitness for use—software needed to solve the user’s real problem in its real context, not just pass some checklists. Deming pushed even further, emphasizing that defects weren’t just technical mistakes but symptoms of broken systems, and especially poor communication and lack of shared understanding. He focused on the human side of engineering: creating processes that help people learn, communicate, and improve together.

Through the 1980s, these insights from the quality movement were being applied to software development and started to crystallize into a distinct discipline called requirements engineering, focused on identifying, analyzing, documenting, and managing the needs of stakeholders for a product or system. It emerged as its own field, complete with conferences, methodologies, and professional practices. The IEEE Computer Society formalized this with its first International Symposium on Requirements Engineering in 1993, marking its recognition as a core area of software engineering.

The 1990s became a heyday for requirements work, with organizations investing heavily in formal processes and templates, believing that better documentation formats would ensure better software. Standards like IEEE 830 codified the structure of software requirements specifications, and process models such as the software development life cycle and CMM/CMMI emphasized rigorous documentation and repeatable practices. Many organizations poured effort into designing detailed templates and forms, hoping that filling them out correctly would guarantee the right system. In practice, those templates were useful for consistency and compliance, but they didn’t eliminate the hard part: making sure what was in one person’s head matched what was in everyone else’s.

While the 1990s focused on formal documentation, the Agile movement of the 2000s shifted toward a more lightweight, conversational approach. User stories emerged as a deliberate counterpoint to heavyweight specifications—short, simple descriptions of functionality told from the user’s perspective, designed to be easy to write and easy to understand. Instead of trying to capture every detail upfront, user stories served as placeholders for conversations between developers and stakeholders. The practice was deliberately simple, based on the idea that shared understanding comes from dialogue, not documentation, and that requirements evolve through iteration and working software rather than being fixed at the project’s start.

All of this reinforced requirements engineering as a legitimate area of software engineering practice and a real career path with its own set of skills. There is now broad agreement that requirements engineering is a vital area of software engineering focused on surfacing assumptions, clarifying goals, and ensuring everyone involved has the same understanding of what needs to be built.

Prompt Engineering Is Requirements Engineering

Prompt engineering and requirements engineering are literally the same skill—using clarity, context, and intentionality to communicate your intent and ensure what gets built matches what you actually need.

User stories were an evolution from traditional formal specifications: a simpler, more flexible approach to requirements but with the same goal of making sure everyone understood the intent. They gained wide acceptance across the industry because they helped teams recognize that requirements are about creating a shared understanding of the project. User stories gave teams a lightweight way to capture intent and then refine it through conversation, iteration, and working software.

Prompt engineering plays the exact same role. The prompt is our lightweight placeholder for a conversation with the AI. We still refine it through iteration, adding context, clarifying intent, and checking the output against what we actually meant. But it’s the full conversation with the AI and its context that matters; the individual prompts are just a means to communicate the intent and context. Just like Agile shifted requirements from static specs to living conversations, prompt engineering shifts our interaction with AI from single-shot commands to an iterative refinement process—though one where we have to infer what’s missing from the output rather than having the AI ask us clarifying questions.

User stories intentionally focused the engineering work back on people and what’s in their heads. Whether it’s a requirements document in Word or a user story in Jira, the most important thing isn’t the piece of paper, ticket, or document we wrote. The most important thing is that what’s in my head matches what’s in your head and matches what’s in the heads of everyone else involved. The piece of paper is just a convenient way to help us figure out whether or not we agree.

Prompt engineering demands the same outcome. Instead of working with teammates to align mental models, we’re communicating to an AI, but the goal hasn’t changed: producing a high-quality product. The basic principles of quality engineering laid out by Deming, Juran, and Crosby have direct parallels in prompt engineering:

  • Deming’s focus on systems and communication: Prompting failures can be traced to problems with the process, not the people. They typically stem from poor context and communication, not from “bad AI.”
  • Juran’s focus on fitness for use: When he framed quality as “fitness for use,” Juran meant that what we produce has to meet real needs—not just look plausible. A prompt is useless if the output doesn’t solve the real problem, and a prompt that doesn’t capture the real need invites output that misses the mark—or outright hallucinations.
  • Crosby’s focus on conformance to requirements: Prompts must specify not just functional needs but also nonfunctional ones like maintainability and readability. If the context and framing aren’t clear, the AI will generate output that conforms to its training distribution rather than the real intent.

One of the clearest ways these quality principles show up in prompt engineering is through what’s now called context engineering—deciding what the model needs to see to generate something useful, which typically includes surrounding code, test inputs, expected outputs, design constraints, and other important project information. If you give the AI too little context, it fills in the blanks with what seems most likely based on its training data (which usually isn’t what you had in mind). If you give it too much, it can get buried in information and lose track of what you’re really asking for. That judgment call—what to include, what to leave out—has always been one of the deepest challenges at the heart of requirements work.

There’s another important parallel between requirements engineering and prompt engineering. Back in the 1990s, many organizations fell into what we might call the template trap—believing that the right standardized form or requirements template could guarantee a good outcome. Teams spent huge effort designing and filling out documents. But the real problem was never the format; it was whether the underlying intent was truly shared and understood.

Today, many companies fall into a similar trap with prompt libraries, or catalogs of prewritten prompts meant to standardize practice and remove the difficulty of writing prompts. Prompt libraries can be useful as references or starting points, but they don’t replace the core skill of framing the problem and ensuring shared understanding. Just like a perfect requirements template in the 1990s didn’t guarantee the right system, canned prompts today don’t guarantee the right code.

Decades later, the points Brooks made in his “No Silver Bullet” essay still hold. There’s no single template, library, or tool that can eliminate the essential complexity of understanding what needs to be built. Whether it’s requirements engineering in the 1990s or prompt engineering today, the hard part is always the same: building and maintaining a shared understanding of intent. Tools can help, but they don’t replace the discipline.

AI raises the stakes on this core communication problem. Unlike your teammates, the AI won’t push back or ask questions—it just generates something that looks plausible based on the prompt that it was given. That makes clear communication of requirements even more important.

The alignment of understanding that serves as the foundation of requirements engineering is even more important when we bring AI tools into the project, because AI doesn’t have judgment. It’s backed by a huge model, but it only works effectively when directed well. The AI needs the context that we provide in the form of code, documents, and other project information and artifacts, which means the only thing it knows about the project is what we tell it. That’s why it’s especially important to have ways to check and verify that what the AI “knows” really matches what we know.

The classic requirements engineering problems—especially the poor communication and lack of shared understanding that Deming warned about and that requirements engineers and Agile practitioners have spent decades trying to address—are compounded when we use AI. We’re still facing the same issues of communicating intent and specifying requirements clearly. But now those requirements aren’t just for the team to read; they’re used to establish the AI’s context. Small variations in problem framing can have a profound impact on what the AI produces. As natural language increasingly replaces the structured, unambiguous syntax of code, we lose a critical guardrail that has traditionally protected software from failures of understanding.

The tools of requirements engineering help us make up for that missing guardrail. Agile’s iterative process of the developer understanding requirements, building working software, and continuously reviewing it with the product owner was a check that ensured misunderstandings were caught early. The more we eliminate that extra step of translation and understanding by having AI generate code directly from requirements, the more important it becomes for everyone involved—stakeholders and engineers alike—to have a truly shared understanding of what needs to be built.

When people on teams work together to build software, they spend a lot of time talking and asking questions to understand what they need to build. Working with an AI follows a different kind of feedback cycle—you don’t know it’s missing context until you see what it produces, and you often need to reverse engineer what it did to figure out what’s missing. But both types of interaction require the same fundamental skills around context and communication that requirements engineers have always practiced.

This shows up in practice in several ways:

  • Context and shared understanding are foundational. Good requirements help teams understand what behavior matters and how to know when it’s working—capturing both functional requirements (what to build) and nonfunctional requirements (how well it should work). The same distinction applies to prompting but with fewer chances to course-correct. If you leave out something critical, the AI doesn’t push back; it just responds with whatever seems plausible. Sometimes that output looks reasonable until you try to use it and realize the AI was solving a different problem.
  • Scoping takes real judgment. Developers who struggle to use AI for code typically fall into two extremes: providing too little context (a single sentence that produces something that looks right but fails in practice) or pasting in entire files and expecting the model to zoom in on the right method. Unless you explicitly call out what’s important—both functional and nonfunctional requirements—the model doesn’t know what matters (see the sketch after this list).
  • Context drifts, and the model doesn’t know it’s drifted. With human teams, understanding shifts gradually through check-ins and conversations. With prompting, drift can happen in just a few exchanges. The model keeps generating fluent responses right up until it suggests a fix that makes no sense. That’s a signal that the context has drifted and you need to reframe the conversation—perhaps by asking the model to explain the code or restate what it thinks it’s doing.
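
As a hedged illustration of that scoping judgment (the feature, file name, and numbers here are invented), a prompt that carries both functional and nonfunctional requirements and calls out what matters might look like this:

    Add a retry helper to http_client.py (pasted below) for idempotent GET requests.
    Functional: retry up to 3 times on 429 and 5xx responses, then raise the original error.
    Nonfunctional: use exponential backoff starting at 200 ms, cap total wait at 5 seconds,
    and don't add any new third-party dependencies.
    Ignore the logging and auth code in the file; it isn't changing.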

History keeps repeating itself: From binders full of scattered requirements to IEEE standards to user stories to today’s prompts, the discipline is the same. We succeed when we treat it as real engineering. Prompt engineering is the next step in the evolution of requirements engineering. It’s how we make sure we have a shared understanding between everyone on the project—including the AI—and it demands the same care, clarity, and deliberate communication we’ve always needed to avoid misunderstandings and build the right thing.

Building AI-Resistant Technical Debt

Anyone who’s used AI to generate code has seen it make mistakes. But the real danger isn’t the occasional wrong answer; it’s in what happens when those errors pile up across a codebase. Issues that seem small at first can compound quickly, making code harder to understand, maintain, and evolve. To really see that danger, you have to look at how AI is used in practice—which for many developers starts with vibe coding.

Vibe coding is an exploratory, prompt-first approach to software development where developers rapidly prompt, get code, and iterate. When the code seems close but not quite right, the developer describes what’s wrong and lets the AI try again. When it doesn’t compile or tests fail, they copy the error messages back to the AI. The cycle continues—prompt, run, error, paste, prompt again—often without reading or understanding the generated code. It feels productive because you’re making visible progress: errors disappear, tests start passing, features seem to work. You’re treating the AI like a coding partner who handles the implementation details while you steer at a high level.

Developers use vibe coding to explore and refine ideas, and it lets them generate large amounts of code quickly. It’s the natural first step for most developers using AI tools because it feels so intuitive and productive. Vibe coding offloads detail to the AI, making exploration and ideation fast and effective—which is exactly why it’s so popular.

The AI generates a lot of code, and it’s not practical to review every line every time it regenerates. Trying to read it all can lead to cognitive overload—mental exhaustion from wading through too much code—and once you’ve invested that reading time, it becomes harder to throw away code that isn’t working.

Vibe coding is a normal and useful way to explore with AI, but on its own it presents a significant risk. Large language models can hallucinate and produce made-up answers—for example, generating code that calls APIs or methods that don’t even exist. Preventing those AI-generated mistakes from compromising your codebase starts with understanding the capabilities and limitations of these tools and adopting an approach to AI-assisted development that accounts for them.

Here’s a simple example of how these issues compound. When I ask AI to generate a class that handles user interaction, it often creates methods that directly read from and write to the console. When I then ask it to make the code more testable, if I don’t very specifically prompt for a simple fix like having methods take input as parameters and return output as values, the AI frequently suggests wrapping the entire I/O mechanism in an abstraction layer. Now I have an interface, an implementation, mock objects for testing, and dependency injection throughout. What started as a straightforward class has become a miniature framework. The AI isn’t wrong, exactly—the abstraction approach is a valid pattern—but it’s overengineered for the problem at hand. Each iteration adds more complexity, and if you’re not paying attention, you’ll end up with layers upon layers of unnecessary code. This is a good example of how vibe coding can balloon into unnecessary complexity if you don’t stop to verify what’s happening.
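
To make that concrete, here’s a minimal Python sketch of the two shapes that cycle tends to produce. The names and logic are invented for illustration; they aren’t from any real session.

    from abc import ABC, abstractmethod  # only needed for the overengineered version below

    # Simple, testable shape: the logic takes input as a parameter and returns output as a
    # value, so a test can call it directly and no console is involved.
    def age_message(raw_value: str) -> str:
        if raw_value.isdigit() and 0 < int(raw_value) < 130:
            return f"Age recorded: {raw_value}"
        return "Please enter a valid age."

    # Overengineered shape the AI often drifts toward: an I/O interface, a console
    # implementation, and dependency injection throughout, just to make that logic testable.
    class UserIO(ABC):
        @abstractmethod
        def read(self) -> str: ...

        @abstractmethod
        def write(self, message: str) -> None: ...

    class ConsoleIO(UserIO):
        def read(self) -> str:
            return input()

        def write(self, message: str) -> None:
            print(message)

    class AgePrompter:
        def __init__(self, io: UserIO) -> None:
            self.io = io  # injected so tests can substitute a fake

        def prompt_for_age(self) -> None:
            self.io.write("Enter your age:")
            self.io.write(age_message(self.io.read()))

Both versions work, but the second adds three types and an injection seam to solve a problem the first solves with a function signature.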

Novice Developers Face a New Kind of Technical Debt Challenge with AI

Three months after writing their first line of code, a Reddit user going by SpacetimeSorcerer posted a frustrated update: Their AI-assisted project had reached the point where making any change meant editing dozens of files. The design had hardened around early mistakes, and every change brought a wave of debugging. They’d hit the wall known in software design as “shotgun surgery,” where a single change ripples through so much code that it’s risky and slow to work on—a classic sign of technical debt, the hidden cost of early shortcuts that make future changes harder and more expensive.

[Screenshot: “I am giving up,” a Reddit post describing the frustration of AI-accelerated technical debt (used with permission).]

AI didn’t cause the problem directly; the code worked (until it didn’t). But the speed of AI-assisted development let this new developer skip the design thinking that prevents these patterns from forming. The same thing happens to experienced developers when deadlines push delivery over maintainability. The difference is, an experienced developer often knows they’re taking on debt. They can spot antipatterns early because they’ve seen them repeatedly, and take steps to “pay off” the debt before it gets much more expensive to fix. Someone new to coding may not even realize it’s happening until it’s too late—and they haven’t yet built the tools or habits to prevent it.

Part of the reason new developers are especially vulnerable to this problem goes back to the Cognitive Shortcut Paradox.1 Without enough hands-on experience debugging, refactoring, and working through ambiguous requirements, they don’t have the instincts built up through experience to spot structural problems in AI-generated code. The AI can hand them a clean, working solution. But if they can’t see the design flaws hiding inside it, those flaws grow unchecked until they’re locked into the project, built into the foundations of the code so changing them requires extensive, frustrating work.

The signals of AI-accelerated technical debt show up quickly: highly coupled code where modules depend on each other’s internal details; “God objects” with too many responsibilities; overly structured solutions where a simple problem gets buried under extra layers. These are the same problems that typically signal technical debt in human-built code; they just emerge faster in AI-generated code because the code is produced at much higher volume, often without oversight or deliberate design and architecture decisions. AI can generate these patterns convincingly, making them look deliberate even when they emerged by accident. And because the output compiles, passes tests, and works as expected, it’s easy to accept it as “done” without thinking about how it will hold up when requirements change.

When adding or updating a unit test feels unreasonably difficult, that’s often the first sign the design is too rigid. The test is telling you something about the structure—maybe the code is too intertwined, maybe the boundaries are unclear. This feedback loop works whether the code was AI-generated or handwritten, but with AI the friction often shows up later, after the code has already been merged.
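
As a hedged sketch of what that friction feels like (the invoice example is invented), compare what the two designs demand of a test:

    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class Invoice:
        due: date
        paid: bool

    # Rigid shape: the function reads a module-level store and the system clock itself, so a
    # test has to seed INVOICE_DB and patch date.today() before it can assert anything.
    INVOICE_DB: list[Invoice] = []

    def overdue_invoices_coupled() -> list[Invoice]:
        return [i for i in INVOICE_DB if not i.paid and i.due < date.today()]

    # Flexible shape: the same logic written against plain inputs needs no patching at all.
    def overdue_invoices(invoices: list[Invoice], today: date) -> list[Invoice]:
        return [i for i in invoices if not i.paid and i.due < today]

    def test_overdue_invoices() -> None:
        late = Invoice(due=date(2024, 12, 1), paid=False)
        assert overdue_invoices([late], today=date(2025, 1, 1)) == [late]

If every test in the codebase needs the first kind of setup, that’s the structure talking.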

That’s where the “trust but verify” habit comes in. Trust the AI to give you a starting point, but verify that the design supports change, testability, and clarity. Ask yourself whether the code will still make sense to you—or anyone else—months from now. In practice, this can mean quick design reviews even for AI-generated code, refactoring when coupling or duplication starts to creep in, and taking a deliberate pass at naming so variables and functions read clearly. These aren’t optional touches; they’re what keep a codebase from locking in its worst early decisions.

AI can help with this too: It can suggest refactorings, point out duplicated logic, or help extract messy code into cleaner abstractions. But it’s up to you to direct it to make those changes, which means you have to spot them first—something that’s much easier for experienced developers who have seen these problems over the course of many projects.

Left to its defaults, AI-assisted development is biased toward adding new code, not revisiting old decisions. The discipline to avoid technical debt comes from building design checks into your workflow so AI’s speed works in service of maintainability instead of against it.


Footnote

  1. I’ll discuss this in more detail in a forthcoming Radar article on October 8.

Understanding the Rehash Loop

This article is part of a series on the Sens-AI Framework—practical habits for learning and coding with AI.

In “The Sens-AI Framework: Teaching Developers to Think with AI,” I introduced the concept of the rehash loop—that frustrating pattern where AI tools keep generating variations of the same wrong answer, no matter how you adjust your prompt. It’s one of the most common failure modes in AI-assisted development, and it deserves a deeper look.

Most developers who use AI in their coding work will recognize a rehash loop. The AI generates code that’s almost right—close enough that you think one more tweak will fix it. So you adjust your prompt, add more detail, explain the problem differently. But the response is essentially the same broken solution with cosmetic changes. Different variable names. Reordered operations. Maybe a comment or two. But fundamentally, it’s the same wrong answer.

Recognizing When You’re Stuck

Rehash loops are frustrating. The model seems so close to understanding what you need but just can’t get you there. Each iteration looks slightly different, which makes you think you’re making progress. Then you test the code and it fails in exactly the same way, or you get the same errors, or you just recognize that it’s a solution that you’ve already seen and dismissed multiple times.

Most developers try to escape through incremental changes—adding details, rewording instructions, nudging the AI toward a fix. These adjustments normally work during regular coding sessions, but in a rehash loop, they lead back to the same constrained set of answers. You can’t tell if there’s no real solution, if you’re asking the wrong question, or if the AI is hallucinating a partial answer and presenting it with more confidence than it deserves.

When you’re in a rehash loop, the AI isn’t broken. It’s doing exactly what it’s designed to do—generating the most statistically likely response it can, based on the tokens in your prompt and the limited view it has of the conversation. One source of the problem is the context window—an architectural limit on how many tokens the model can process at once. That includes your prompt, any shared code, and the rest of the conversation, all sharing a budget that ranges from a few thousand tokens to several hundred thousand, depending on the model. The model uses this entire sequence to predict what comes next. Once it has sampled the patterns it finds there, it starts circling.
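
If you want a rough sense of how quickly pasted code consumes that budget, you can count tokens locally. This sketch uses OpenAI’s tiktoken library; the tokenizer choice and the file name are assumptions, and other models tokenize differently, so treat the number as an estimate rather than a hard limit check.

    import tiktoken

    # cl100k_base approximates the tokenization used by several OpenAI models; it won't match
    # every model exactly, but it's close enough for a rough budget check.
    encoding = tiktoken.get_encoding("cl100k_base")

    with open("http_client.py") as f:  # hypothetical file you're about to paste into the chat
        pasted_code = f.read()

    print(f"~{len(encoding.encode(pasted_code))} tokens consumed by this one file")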

The variations you get—reordered statements, renamed variables, a tweak here or there—aren’t new ideas. They’re just the model nudging things around in the same narrow probability space.

So if you keep getting the same broken answer, the issue probably isn’t that the model doesn’t know how to help. It’s that you haven’t given it enough to work with.

When the Model Runs Out of Context

A rehash loop is a signal that the AI ran out of context. The model has exhausted the useful information in the context you’ve given it. When you’re stuck in a rehash loop, treat it as a signal instead of a problem. Figure out what context is missing and provide it.

Large language models don’t really understand code the way humans do. They generate suggestions by predicting what comes next in a sequence of text based on patterns they’ve seen in massive training datasets. When you prompt them, they analyze your input and predict likely continuations, but they have no real understanding of your design or requirements unless you explicitly provide that context.

The better context you provide, the more useful and accurate the AI’s answers will be. But when the context is incomplete or poorly framed, the AI’s suggestions can drift, repeat variations, or miss the real problem entirely.

Breaking Out of the Loop

Research becomes especially important when you hit a rehash loop. You need to learn more before reengaging—reading documentation, clarifying requirements with teammates, thinking through design implications, or even starting another session to ask research questions from a different angle. Starting a new chat with a different AI can help because your prompt might steer it toward a different region of its information space and surface new context.

A rehash loop tells you that the model is stuck trying to solve a puzzle without all the pieces. It keeps rearranging the ones it has, but it can’t reach the right solution until you give it the one piece it needs—that extra bit of context that points it to a different part of the model it wasn’t using. That missing piece might be a key constraint, an example, or a goal you haven’t spelled out yet. You typically don’t need to give it a lot of extra information to break out of the loop. The AI doesn’t need a full explanation; it needs just enough new context to steer it into a part of its training data it wasn’t using.

When you recognize you’re in a rehash loop, trying to nudge the AI and vibe-code your way out of it is usually ineffective—it just leads you in circles. (“Vibe coding” means relying on the AI to generate something that looks plausible and hoping it works, without really digesting the output.) Instead, start investigating what’s missing. Ask the AI to explain its thinking: “What assumptions are you making?” or “Why do you think this solves the problem?” That can reveal a mismatch—maybe it’s solving the wrong problem entirely, or it’s missing a constraint you forgot to mention. It’s often especially helpful to open a chat with a different AI, describe the rehash loop as clearly as you can, and ask what additional context might help.
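
For example, instead of another nudge like “that still doesn’t work, please fix it,” a context-resetting prompt might look something like this (the wording and the function name are invented):

    Before suggesting more code, restate the problem as you understand it: what does
    parse_listing() receive, what should it return, and what assumptions are you making
    about the feed format? One constraint I haven't mentioned yet: the feed can contain
    duplicate entries.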

This is where problem framing really starts to matter. If the model keeps circling the same broken pattern, it’s not just a prompt problem—it’s a signal that your framing needs to shift.

Problem framing helps you recognize that the model is stuck in the wrong solution space. Your framing gives the AI the clues it needs to assemble patterns from its training that actually match your intent. After researching the actual problem—not just tweaking prompts—you can transform vague requests into targeted questions that steer the AI away from default responses and toward something useful.

Good framing starts by getting clear about the nature of the problem you’re solving. What exactly are you asking the model to generate? What information does it need to do that? Are you solving the right problem in the first place? A lot of failed prompts come from a mismatch between the developer’s intent and what the model is actually being asked to do. Just like writing good code, good prompting depends on understanding the problem you’re solving and structuring your request accordingly.

Learning from the Signal

When AI keeps circling the same solution, it’s not a failure—it’s information. The rehash loop tells you something about either your understanding of the problem or how you’re communicating it. An incomplete response from the AI is often just a step toward getting the right answer, and a signal to do the extra work—often just a small amount of targeted research—that gives the AI the information it needs to get to the right place in its massive information space.

AI doesn’t think for you. While it can make surprising connections by recombining patterns from its training, it can’t generate truly new insight on its own. It’s your context that helps it connect those patterns in useful ways. If you’re hitting rehash loops repeatedly, ask yourself: What does the AI need to know to do this well? What context or requirements might be missing?

Rehash loops are one of the clearest signals that it’s time to step back from rapid generation and engage your critical thinking. They’re frustrating, but they’re also valuable—they tell you exactly when the AI has exhausted its current context and needs your help to move forward.

MCP Introduces Deep Integration—and Serious Security Concerns

MCP—the Model Context Protocol introduced by Anthropic in November 2024—is an open standard for connecting AI assistants to data sources and development environments. It’s built for a future where every AI assistant is wired directly into your environment, where the model knows what files you have open, what text is selected, what you just typed, and what you’ve been working on.

And that’s where the security risks begin.

AI is driven by context, and that’s exactly what MCP provides. It gives AI assistants like GitHub Copilot everything they might need to help you: open files, code snippets, even what’s selected in the editor. When you use MCP-enabled tools that transmit data to remote servers, all of it gets sent over the wire. That might be fine for most developers. But if you work at a financial firm, hospital, or any organization with regulatory constraints where you need to be extremely careful about what leaves your network, MCP makes it very easy to lose track of, and lose control over, exactly what goes out.
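
To see how little code that deep integration takes, here is a minimal sketch of an MCP server written with the official Python SDK’s FastMCP helper, as I understand its quickstart API; the tool itself is hypothetical. Everything a tool like this returns becomes context the assistant can send along with your request:

    from pathlib import Path

    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("workspace-context")

    @mcp.tool()
    def read_workspace_file(relative_path: str) -> str:
        """Return the contents of a file in the current workspace."""
        # Everything returned here becomes model context; if the file holds connection
        # strings or test data with real user info, that content leaves with it.
        return Path(relative_path).read_text()

    if __name__ == "__main__":
        mcp.run()  # serves the tool over stdio to any MCP-enabled assistant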

Let’s say you’re working in Visual Studio Code on a healthcare app, and you select a few lines of code to debug a query—a routine moment in your day. That snippet might include connection strings, test data with real patient info, and part of your schema. You ask Copilot to help and approve an MCP tool that connects to a remote server—and all of it gets sent to external servers. That’s not just risky. It could be a compliance violation under HIPAA, SOX, or PCI-DSS, depending on what gets transmitted.

These are the kinds of things developers accidentally send every day without realizing it:

  • Internal URLs and system identifiers
  • Passwords or tokens in local config files
  • Network details or VPN information
  • Local test data that includes real user info, SSNs, or other sensitive values

With MCP, devs on your team could be approving tools that send all of those things to servers outside of your network without realizing it, and there’s often no easy way to know what’s been sent.

But this isn’t just an MCP problem; it’s part of a larger shift where AI tools are becoming more context-aware across the board. Browser extensions that read your tabs, AI coding assistants that scan your entire codebase, productivity tools that analyze your documents—they’re all collecting more information to provide better assistance. With MCP, the stakes are just more visible because the data pipeline is formalized.

Many enterprises are now facing a choice between AI productivity gains and regulatory compliance. Some orgs are building air-gapped development environments for sensitive projects, though achieving true isolation with AI tools can be complex since many still require external connectivity. Others lean on network-level monitoring and data loss prevention solutions that can detect when code or configuration files are being transmitted externally. And a few are going deeper and building custom MCP implementations that sanitize data before transmission, stripping out anything that looks like credentials or sensitive identifiers.
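
As a sketch of what that last approach might look like (the patterns below are invented placeholders, not a vetted ruleset), a sanitizing pass could sit between the editor and the wire and redact anything credential-shaped before it’s transmitted:

    import re

    # Hypothetical patterns; a real deployment would tune these to its own secrets,
    # identifiers, and compliance requirements.
    REDACTIONS = [
        (re.compile(r"(?i)\b(password|secret|api[_-]?key|token)\b\s*[:=]\s*\S+"), r"\1=<redacted>"),
        (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "<redacted-ssn>"),  # US SSN-shaped values
        (re.compile(r"(?i)\b(Server|Data Source)=[^;]+;"), r"\1=<redacted>;"),  # connection strings
    ]

    def sanitize_outbound(text: str) -> str:
        """Redact credential- and identifier-shaped strings before context leaves the network."""
        for pattern, replacement in REDACTIONS:
            text = pattern.sub(replacement, text)
        return text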

One thing that can help is organizational controls in development tools like VS Code. Most security-conscious organizations can centrally disable MCP support or control which servers are available through group policies and GitHub Copilot enterprise settings. But that’s where it gets tricky, because MCP doesn’t just receive responses. It sends data upstream, potentially to a server outside of your organization, which means every request carries risk.

Security vendors are starting to catch up. Some are building MCP-aware monitoring tools that can flag potentially sensitive data before it leaves the network. Others are developing hybrid deployment models where the AI reasoning happens on-premises but can still access external knowledge when needed.

Our industry is going to have to come up with better enterprise solutions for securing MCP if we want to meet the needs of all organizations. The tension between AI capability and data security will likely drive innovation in privacy-preserving AI techniques, federated learning approaches, and hybrid deployment models that keep sensitive context local while still providing intelligent assistance.

Until then, deeply integrated AI assistants come with a cost: Sensitive context can slip through—and there’s no easy way to know it has happened.
