MIT Technology Review
The man who made India digital isn’t done yet

By: Edd Gent

7 January 2026 at 06:00

Nandan Nilekani can’t stop trying to push India into the future. He started nearly 30 years ago, masterminding an ongoing experiment in technological state capacity that started with Aadhaar—the world’s largest digital identity system. Aadhaar means “foundation” in Hindi, and on that bedrock Nilekani and people working with him went on to build a sprawling collection of free, interoperating online tools that add up to nothing less than a digital infrastructure for society. They cover government services, digital payments, banking, credit, and health care, offering convenience and access that would be eye-popping in wealthy countries a tenth of India’s size. In India those systems are called, collectively, “digital public infrastructure,” or DPI.

At 70 years old, Nilekani should be retired. But he has a few more ideas. India’s electrical grid is creaky and prone to failure; Nilekani wants to add a layer of digital communication to stabilize it. And then there’s his idea to expand the financial functions in DPI to the rest of the world, creating a global digital backbone for commerce that he calls the “finternet.”

“It sounds like some crazy stuff,” Nilekani says. “But I think these are all big ideas, which over the next five years will have demonstrable, material impact.” As a last act in public life, why not Aadhaarize the world?

India’s digital backbone

Today, a farmer in a village in India, hours from the nearest bank, can collect welfare payments or transfer money by simply pressing a thumb to a fingerprint scanner at the local store. Digitally authenticated copies of driver’s licenses, birth certificates, and educational records can be accessed and shared via a digital wallet that sits on your smartphone.

In big cities, where cash is less and less common (just trying to break a bill can be a major headache), mobile payments are ubiquitous, whether you’re buying a TV from a high-street retailer or a coconut from a roadside cart. There are no fees, and any payment app or bank account can send money to any other. The country’s chaotic patchwork of public and private hospitals has begun digitizing all its medical records and uploading them to a nationwide platform. On the Open Network for Digital Commerce (ONDC), people can do online shopping searches on whatever app they want, and the results show sellers from an array of other platforms, too. The idea is to liberate small merchants and consumers from the walled gardens of online shopping giants like Amazon and the domestic giant Flipkart.

At the heart of all these tools is Aadhaar. The system gives every Indian a 12-digit number that, in combination with either a fingerprint scan or an SMS code, allows access to government services, SIM cards, basic bank accounts, digital signature services, and social welfare payments. The Indian government says that since its inception in 2009, Aadhaar has saved 3.48 trillion rupees ($39.2 billion) by boosting efficiency, bypassing corrupt officials, and cutting other types of fraud. The system is controversial and imperfect—a database with 1.4 billion people in it comes with inherent security and privacy concerns. Still, in the most populous nation on Earth, a big portion of the bureaucracy anyone might encounter in daily life just happens in the cloud.

Nilekani was behind much of that innovation, marshaling an army of civil servants, tech companies, and volunteers. Now he sees it in action every day. “It reinforces that what you have done is not some abstract stuff, but real stuff for real people,” he says.

By his own admission, Nilekani is entering the twilight of his career. But it’s not over yet. He’s now “chief mentor” for the India Energy Stack (IES), a government initiative to connect the fragmented data held by companies responsible for generating, transmitting, and distributing power. India’s grids are unstable and disparate, but Nilekani hopes an Aadhaar-like move will help. IES aims to give unique digital identities not only to power plants and energy storage facilities but even to rooftop solar panels and electric vehicles. All the data attached to those things—device characteristics, energy rating certifications, usage information—will be in a common, machine-readable format and shared on the same open protocols.

Ideally, that’ll give grid operators a real-time view of energy supply and demand. And if it works, it might also make it simpler and cheaper for anyone to connect to the grid—even everyday folks selling excess power from their rooftop solar rigs, says RS Sharma, the chair of the project and Nilekani’s deputy while building Aadhaar.

Nilekani’s other side hustle is even more ambitious. His idea for a global “finternet” combines Aadhaarization with blockchains—creating digital representations called tokens for not only financial instruments like stocks or bonds but also real-world assets like houses or jewelry. Anyone from a bank to an asset manager or even a company could create and manage these tokens, but Nilekani’s team especially hopes the idea will help poor people trade their assets, or use them as loan collateral—expanding financial services to those who otherwise couldn’t access them.

It sounds almost wild-eyed. Yet the finternet project has 30 partners across four continents. Nilekani says it’ll launch next year.

A call to service

Nilekani was born in Bengaluru, in 1955. His family was middle class and, Nilekani says, “seized with societal issues and challenges.” His upbringing was also steeped in the kind of socialism espoused by the newish nation’s first prime minister, Jawaharlal Nehru.

After studying electrical engineering at the Indian Institute of Technology, in 1981 Nilekani helped found Infosys, an information technology company that pioneered outsourcing and helped turn India into the world’s IT back office. In 1999, he was part of a government-appointed task force trying to upgrade the infrastructure and services in Bengaluru, then emerging as India’s tech capital. But Nilekani was at the time leery of being viewed as just another techno-optimist. “I didn’t want to be seen as naive enough to believe that tech could solve everything,” he says.

Nilekani demonstrates the biometric technology at the heart of Aadhaar, the system he spearheaded that provides a unique digital identity number to all Indians.

Seeing the scope of the problem changed his mind—sclerotic bureaucracy, endemic corruption, and financial exclusion were intractable without technological solutions. In 2008 Nilekani published a book, Imagining India: The Idea of a Renewed Nation. It was a manifesto for an India that could leapfrog into a networked future.

And it got him a job. At the time more than half the births in the country were not recorded, and up to 400 million Indians had no official identity document. Manmohan Singh, the prime minister, asked Nilekani to put into action an ill-defined plan to create a national identity card.

Nilekani’s team made a still-controversial decision to rely on biometrics. A system based on people’s fingerprints and iris scans meant nobody could sign up twice, and nobody had to carry paperwork. In terms of execution, it was like trying to achieve industrialization but skip the steam era. Deployment required a monumental data collection effort, as well as new infrastructure that could compare each new enrollment against hundreds of millions of existing records in seconds. At its peak, the Unique Identification Authority of India (UIDAI), the agency responsible for administering Aadhaar, was registering more than a million new users a day. That happened with a technical team of just about 50 developers, and in the end cost slightly less than half a billion dollars.

Buoyed by their success, Nilekani and his allies started casting around for other problems they could solve using the same digitize-the-real-world playbook. “We built more and more layers of capability,” Nilekani says, “and then this became a wider-ranging idea. More grandiose.”

While other countries were building digital backbones with full state control (as in China) or in public-private partnerships that favored profit-seeking corporate approaches (as in the US), Nilekani thought India needed something else. He wanted critical technologies in areas like identity, payments, and data sharing to be open and interoperable, not monopolized by either the state or private industry. So the tools that make up DPI use open standards and open APIs, meaning that anyone can plug into the system. No single company or institution controls access—no walled gardens.

A contested legacy

Of course, another way to look at putting financial and government services and records into giant databases is that it’s a massive risk to personal liberty. Aadhaar, in particular, has faced criticism from privacy advocates concerned about the potential for surveillance. Several high-profile data breaches of Aadhaar records held by government entities have shaken confidence in the system, most recently in 2023, when security researchers found hackers selling the records of more than 800 million Indians on the dark web.

Technically, this shouldn’t matter—an Aadhaar number ought to be useless without biometric or SMS-based authentication. It’s “a myth that this random number is a very powerful number,” says Sharma, the onetime co-lead of UIDAI. “I don’t have any example where somebody’s Aadhaar disclosure would have harmed somebody.”

One problem is that in everyday use, Aadhaar users often bypass the biometric authentication system. To ensure that people use a genuine address at registration, Aadhaar administrators give people their numbers on an official-looking document. Indians co-opted this paperwork as a proof of identity on its own. And since the document—Indians even call it an “Aadhaar card”—doesn’t have an expiration date, it’s possible for people to get multiple valid cards with different details by changing their address or date of birth. That’s quite a loophole. In 2018 an NGO report found that 67% of people using Aadhaar to open a bank account relied on this verification document rather than digital authentication. That report was the last time anyone published data on the problem, so nobody knows how bad it is today. “Everybody’s living on anecdotes,” says Kiran Jonnalagadda, an anti-Aadhaar activist.

In other cases, flaws in Aadhaar’s biometric technology have caused people to be denied essential government services. The government downplays these risks, but again, it’s impossible to tell how serious the problem is because the UIDAI won’t disclose numbers. “There needs to be a much more honest acknowledgment, documentation, and then an examination of how those exclusions can be mitigated,” says Apar Gupta, director of the Internet Freedom Foundation.

Beyond the potential for fraud, it’s also true that the free and interoperable tools haven’t reached all the people who might find them useful, especially among India’s rural and poorer populations. Nilekani’s hopes for openness haven’t fully come to pass. Big e-commerce companies still dominate, and retail sales on ONDC have been dropping steadily since 2024, when financial incentives to participate began to taper off. The digital payments and government documentation services have hundreds of millions of users, numbers most global technology companies would love to see—but in a country as large as India, that leaves a lot of people out.

Going global

The usually calm Nilekani bristles at that criticism; he has heard it before. Detractors overlook the dysfunction that preceded these efforts, he says, and he remains convinced that technology was the only way forward. “How do you move a country of 1.4 billion people?” he asks. “There’s no other way you can fix it.”

The proof is self-evident, he says. Indians have opened more than 500 million basic bank accounts using Aadhaar; before it came into use, millions of those people had been completely unbanked. Earlier this year, India’s Unified Payments Interface overtook Visa as the world’s largest real-time payments system. “There is no way Aadhaar could have worked but for the fact that people needed this thing,” Nilekani says. “There’s no way payments would have worked without people needing it. So the voice of the people—they’re voting with their feet.”

A street vendor in Kolkata displays a QR code that lets him get paid via India’s Unified Payments Interface, part of the digital public infrastructure Nilekani helped build. The Reserve Bank of India says more than 657 million people used the system in the financial year 2024–2025.

That need might be present in countries beyond India. “Many countries don’t have a proper birth registration system. Many countries don’t have a payment system. Many countries don’t have a way for data to be leveraged,” Nilekani says. “So this is a very powerful idea.” It seems to be spreading. Foreign governments regularly send delegations to Bengaluru to study India’s DPI tools. The World Bank and the United Nations have tried to introduce the concept to other developing countries equally eager to bring their economies into the digital age. The Gates Foundation has established projects to promote digital infrastructure, and Nilekani has set up and funded a network of think tanks, research institutes, and other NGOs aimed at, as he says, “propagating the gospel.”

Still, he admits he might not live to see DPI go global. “There are two races,” Nilekani says. “My personal race against time and India’s race against time.” He worries that the economic potential of its vast young population—the so-called demographic dividend—could turn into a demographic disaster. Despite rapid growth, gains have been uneven. Youth unemployment remains stubbornly high—a particularly volatile problem in a large and economically turbulent country.

“Maybe I’m a junkie,” he says. “Why the hell am I doing all this? I think I need it. I think I need to keep curious and alive and looking at the future.” But that’s the thing about building the future: It never quite arrives.

Edd Gent is a journalist based in Bengaluru, India.

MIT Technology Review
AI coding is now everywhere. But not everyone is convinced.

By: Edd Gent

15 December 2025 at 05:00

Depending on who you ask, AI-powered coding is either giving software developers an unprecedented productivity boost or churning out masses of poorly designed code that saps their attention and sets software projects up for serious long-term maintenance problems.

The problem is that right now, it’s not easy to know which is true.

As tech giants pour billions into large language models (LLMs), coding has been touted as the technology’s killer app. Both Microsoft CEO Satya Nadella and Google CEO Sundar Pichai have claimed that around a quarter of their companies’ code is now AI-generated. And in March, Anthropic’s CEO, Dario Amodei, predicted that within six months 90% of all code would be written by AI. It’s an appealing and obvious use case. Code is a form of language, we need lots of it, and it’s expensive to produce manually. It’s also easy to tell if it works—run a program and it’s immediately evident whether it’s functional.

This story is part of MIT Technology Review’s Hype Correction package, a series that resets expectations about what AI is, what it makes possible, and where we go next.

Executives enamored with the potential to break through human bottlenecks are pushing engineers to lean into an AI-powered future. But after speaking to more than 30 developers, technology executives, analysts, and researchers, MIT Technology Review found that the picture is not as straightforward as it might seem.

For some developers on the front lines, initial enthusiasm is waning as they bump up against the technology’s limitations. And as a growing body of research suggests that the claimed productivity gains may be illusory, some are questioning whether the emperor is wearing any clothes.

The pace of progress is complicating the picture, though. A steady drumbeat of new model releases means these tools’ capabilities and quirks are constantly evolving. And their utility often depends on the tasks they are applied to and the organizational structures built around them. All of this leaves developers navigating confusing gaps between expectation and reality.

Is it the best of times or the worst of times (to channel Dickens) for AI coding? Maybe both.

A fast-moving field

It’s hard to avoid AI coding tools these days. There is a dizzying array of products available, both from model developers like Anthropic, OpenAI, and Google and from companies like Cursor and Windsurf, which wrap these models in polished code-editing software. And according to Stack Overflow’s 2025 Developer Survey, they’re being adopted rapidly, with 65% of developers now using them at least weekly.

AI coding tools first emerged around 2016 but were supercharged with the arrival of LLMs. Early versions functioned as little more than autocomplete for programmers, suggesting what to type next. Today they can analyze entire code bases, edit across files, fix bugs, and even generate documentation explaining how the code works. All this is guided through natural-language prompts via a chat interface.

“Agents”—autonomous LLM-powered coding tools that can take a high-level plan and build entire programs independently—represent the latest frontier in AI coding. This leap was enabled by the latest reasoning models, which can tackle complex problems step by step and, crucially, access external tools to complete tasks. “This is how the model is able to code, as opposed to just talk about coding,” says Boris Cherny, head of Claude Code, Anthropic’s coding agent.

These agents have made impressive progress on software engineering benchmarks—standardized tests that measure model performance. When OpenAI introduced the SWE-bench Verified benchmark in August 2024, offering a way to evaluate agents’ success at fixing real bugs in open-source repositories, the top model solved just 33% of issues. A year later, leading models consistently score above 70%.

In February, Andrej Karpathy, a founding member of OpenAI and former director of AI at Tesla, coined the term “vibe coding”—meaning an approach where people describe software in natural language and let AI write, refine, and debug the code. Social media abounds with developers who have bought into this vision, claiming massive productivity boosts.

But while some developers and companies report such productivity gains, the hard evidence is more mixed. Early studies from GitHub, Google, and Microsoft—all vendors of AI tools—found developers completing tasks 20% to 55% faster. But a September report from the consultancy Bain & Company described real-world savings as “unremarkable.”

Data from the developer analytics firm GitClear shows that most engineers are producing roughly 10% more durable code—code that isn’t deleted or rewritten within weeks—since 2022, likely thanks to AI. But that gain has come with sharp declines in several measures of code quality. Stack Overflow’s survey also found trust and positive sentiment toward AI tools falling significantly for the first time. And most provocatively, a July study by the nonprofit research organization Model Evaluation & Threat Research (METR) showed that while experienced developers believed AI made them 20% faster, objective tests showed they were actually 19% slower.

Growing disillusionment

For Mike Judge, principal developer at the software consultancy Substantial, the METR study struck a nerve. He was an enthusiastic early adopter of AI tools, but over time he grew frustrated with their limitations and the modest boost they brought to his productivity. “I was complaining to people because I was like, ‘It’s helping me but I can’t figure out how to make it really help me a lot,’” he says. “I kept feeling like the AI was really dumb, but maybe I could trick it into being smart if I found the right magic incantation.”

When asked by a friend, Judge had estimated the tools were providing a roughly 25% speedup. So when he saw similar estimates attributed to developers in the METR study, he decided to test his own. For six weeks, he guessed how long a task would take, flipped a coin to decide whether to use AI or code manually, and timed himself. To his surprise, AI slowed him down by a median of 21%, mirroring the METR results.
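A coin-flip self-experiment of this kind is simple to reproduce. The sketch below is illustrative only: the function names, the sample log, and the choice to compare each task's actual time against its up-front estimate are assumptions, not details of Judge's own analysis.

```python
import random
from statistics import median


def assign_condition(rng=random):
    """Flip a fair coin to decide how the next task will be done."""
    return "ai" if rng.random() < 0.5 else "manual"


def median_ratio(log, condition):
    """Median of actual/estimated time for tasks done under one condition."""
    ratios = [actual / estimate
              for cond, estimate, actual in log
              if cond == condition]
    return median(ratios)


def ai_slowdown(log):
    """Relative slowdown of AI-assisted tasks versus manual ones.

    A positive value means AI tasks overran their estimates by more
    than manual tasks did; a negative value means a speedup.
    """
    return median_ratio(log, "ai") / median_ratio(log, "manual") - 1.0


# Hypothetical log entries: (condition, estimated minutes, actual minutes).
sample_log = [
    ("ai", 60, 78), ("manual", 60, 60), ("ai", 30, 36),
    ("manual", 45, 54), ("ai", 90, 99), ("manual", 20, 18),
]

print(f"AI slowdown: {ai_slowdown(sample_log):+.0%}")
```

Randomizing the condition before each task matters: it stops a developer from unconsciously steering the easy tasks toward the tool they want to win. With the made-up log above, the AI tasks come out about 20% slower.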

This got Judge crunching the numbers. If these tools were really speeding developers up, he reasoned, you should see a massive boom in new apps, website registrations, video games, and projects on GitHub. He spent hours and several hundred dollars analyzing all the publicly available data and found flat lines everywhere.

“Shouldn’t this be going up and to the right?” says Judge. “Where’s the hockey stick on any of these graphs? I thought everybody was so extraordinarily productive.” The obvious conclusion, he says, is that AI tools provide little productivity boost for most developers.

Developers interviewed by MIT Technology Review generally agree on where AI tools excel: producing “boilerplate code” (reusable chunks of code repeated in multiple places with little modification), writing tests, fixing bugs, and explaining unfamiliar code to new developers. Several noted that AI helps overcome the “blank page problem” by offering an imperfect first stab to get a developer’s creative juices flowing. It can also let nontechnical colleagues quickly prototype software features, easing the load on already overworked engineers.

These tasks can be tedious, and developers are typically glad to hand them off. But they represent only a small part of an experienced engineer’s workload. For the more complex problems where engineers really earn their bread, many developers told MIT Technology Review, the tools face significant hurdles.

Perhaps the biggest problem is that LLMs can hold only a limited amount of information in their “context window”—essentially their working memory. This means they struggle to parse large code bases and are prone to forgetting what they’re doing on longer tasks. “It gets really nearsighted—it’ll only look at the thing that’s right in front of it,” says Judge. “And if you tell it to do a dozen things, it’ll do 11 of them and just forget that last one.”

LLMs’ myopia can lead to headaches for human coders. While an LLM-generated response to a problem may work in isolation, software is made up of hundreds of interconnected modules. If these aren’t built with consideration for other parts of the software, it can quickly lead to a tangled, inconsistent code base that’s hard for humans to parse and, more important, to maintain.

Developers have traditionally addressed this by following conventions—loosely defined coding guidelines that differ widely between projects and teams. “AI has this overwhelming tendency to not understand what the existing conventions are within a repository,” says Bill Harding, the CEO of GitClear. “And so it is very likely to come up with its own slightly different version of how to solve a problem.”

The models also just get things wrong. Like all LLMs, coding models are prone to “hallucinating”—it’s an issue built into how they work. But because the code they output looks so polished, errors can be difficult to detect, says James Liu, director of software engineering at the advertising technology company Mediaocean. Put all these flaws together, and using these tools can feel a lot like pulling a lever on a one-armed bandit. “Some projects you get a 20x improvement in terms of speed or efficiency,” says Liu. “On other things, it just falls flat on its face, and you spend all this time trying to coax it into granting you the wish that you wanted and it’s just not going to.”

Judge suspects this is why engineers often overestimate productivity gains. “You remember the jackpots. You don’t remember sitting there plugging tokens into the slot machine for two hours,” he says.

And it can be particularly pernicious if the developer is unfamiliar with the task. Judge remembers getting AI to help set up a Microsoft cloud service called Azure Functions, which he’d never used before. He thought it would take about two hours, but nine hours later he threw in the towel. “It kept leading me down these rabbit holes and I didn’t know enough about the topic to be able to tell it ‘Hey, this is nonsensical,’” he says.

The debt begins to mount up

Developers constantly make trade-offs between speed of development and the maintainability of their code—creating what’s known as “technical debt,” says Geoffrey G. Parker, professor of engineering innovation at Dartmouth College. Each shortcut adds complexity and makes the code base harder to manage, accruing “interest” that must eventually be repaid by restructuring the code. As this debt piles up, adding new features and maintaining the software becomes slower and more difficult.

Accumulating technical debt is inevitable in most projects, but AI tools make it much easier for time-pressured engineers to cut corners, says GitClear’s Harding. And GitClear’s data suggests this is happening at scale. Since 2022, the company has seen a significant rise in the amount of copy-pasted code—an indicator that developers are reusing more code snippets, most likely based on AI suggestions—and an even bigger decline in the amount of code moved from one place to another, which happens when developers clean up their code base.

And as models improve, the code they produce is becoming increasingly verbose and complex, says Tariq Shaukat, CEO of Sonar, which makes tools for checking code quality. This is driving down the number of obvious bugs and security vulnerabilities, he says, but at the cost of increasing the number of “code smells”—harder-to-pinpoint flaws that lead to maintenance problems and technical debt.

Recent research by Sonar found that these make up more than 90% of the issues found in code generated by leading AI models. “Issues that are easy to spot are disappearing, and what’s left are much more complex issues that take a while to find,” says Shaukat. “That’s what worries us about this space at the moment. You’re almost being lulled into a false sense of security.”

If AI tools make it increasingly difficult to maintain code, that could have significant security implications, says Jessica Ji, a security researcher at Georgetown University. “The harder it is to update things and fix things, the more likely a code base or any given chunk of code is to become insecure over time,” says Ji.

There are also more specific security concerns, she says. Researchers have discovered a worrying class of hallucinations where models reference nonexistent software packages in their code. Attackers can exploit this by creating packages with those names that harbor vulnerabilities, which the model or developer may then unwittingly incorporate into software.
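One modest guardrail against this class of attack is to compare the imports in generated code against a list of dependencies a team has already vetted. The sketch below assumes Python source and a hand-maintained allowlist; the `VETTED` set and the package name `fastjsonx` are hypothetical, invented purely for illustration.

```python
import ast


def imported_modules(source):
    """Return the top-level module names imported by Python source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports, which refer to local code.
            names.add(node.module.split(".")[0])
    return names


def unvetted_imports(source, vetted):
    """List imported names that are missing from the vetted set."""
    return sorted(imported_modules(source) - set(vetted))


# Hypothetical allowlist and hypothetical model output.
VETTED = {"json", "requests"}
generated = (
    "import json\n"
    "import requests\n"
    "from fastjsonx.core import loads\n"
)

print(unvetted_imports(generated, VETTED))
```

A check like this only flags names for human review. It cannot tell whether an unfamiliar package is malicious, and import names do not always match the names packages are published under.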

LLMs are also vulnerable to “data-poisoning attacks,” where hackers seed the publicly available data sets models train on with data that alters the model’s behavior in undesirable ways, such as generating insecure code when triggered by specific phrases. In October, research by Anthropic found that as few as 250 malicious documents can introduce this kind of back door into an LLM regardless of its size.

The converted

Despite these issues, though, there’s probably no turning back. “Odds are that writing every line of code on a keyboard by hand—those days are quickly slipping behind us,” says Kyle Daigle, chief operating officer at the Microsoft-owned code-hosting platform GitHub, which produces a popular AI-powered tool called Copilot (not to be confused with the Microsoft product of the same name).

The Stack Overflow report found that despite growing distrust in the technology, usage has increased rapidly and consistently over the past three years. Erin Yepis, a senior analyst at Stack Overflow, says this suggests that engineers are taking advantage of the tools with a clear-eyed view of the risks. The report also found that frequent users tend to be more enthusiastic and more than half of developers are not using the latest coding agents, perhaps explaining why many remain underwhelmed by the technology.

Those latest tools can be a revelation. Trevor Dilley, CTO at the software development agency Twenty20 Ideas, says he had found some value in AI editors’ autocomplete functions, but when he tried anything more complex it would “fail catastrophically.” Then in March, while on vacation with his family, he set the newly released Claude Code to work on one of his hobby projects. It completed a four-hour task in two minutes, and the code was better than what he would have written.

“I was like, Whoa,” he says. “That, for me, was the moment, really. There’s no going back from here.” Dilley has since cofounded a startup called DevSwarm, which is creating software that can marshal multiple agents to work in parallel on a piece of software.

The challenge, says Armin Ronacher, a prominent open-source developer, is that the learning curve for these tools is shallow but long. Until March he’d remained unimpressed by AI tools, but after leaving his job at the software company Sentry in April to launch a startup, he started experimenting with agents. “I basically spent a lot of months doing nothing but this,” he says. “Now, 90% of the code that I write is AI-generated.”

Getting to that point involved extensive trial and error, to figure out which problems tend to trip the tools up and which they can handle efficiently. Today’s models can tackle most coding tasks with the right guardrails, says Ronacher, but these can be very task and project specific.

To get the most out of these tools, developers must surrender control over individual lines of code and focus on the overall software architecture, says Nico Westerdale, chief technology officer at the veterinary staffing company IndeVets. He recently built a data science platform 100,000 lines of code long almost exclusively by prompting models rather than writing the code himself.

Westerdale’s process starts with an extended conversation with the model to develop a detailed plan for what to build and how. He then guides it through each step. It rarely gets things right on the first try and needs constant wrangling, but if you force it to stick to well-defined design patterns, the models can produce high-quality, easily maintainable code, says Westerdale. He reviews every line, and the code is as good as anything he’s ever produced, he says: “I’ve just found it absolutely revolutionary. It’s also frustrating, difficult, a different way of thinking, and we’re only just getting used to it.”

But while individual developers are learning how to use these tools effectively, getting consistent results across a large engineering team is significantly harder. AI tools amplify both the good and bad aspects of your engineering culture, says Ryan J. Salva, senior director of product management at Google. With strong processes, clear coding patterns, and well-defined best practices, these tools can shine.

But if your development process is disorganized, they’ll only magnify the problems. It’s also essential to codify that institutional knowledge so the models can draw on it effectively. “A lot of work needs to be done to help build up context and get the tribal knowledge out of our heads,” he says.

The cryptocurrency exchange Coinbase has been vocal about its adoption of AI tools. CEO Brian Armstrong made headlines in August when he revealed that the company had fired staff unwilling to adopt AI tools. But Coinbase’s head of platform, Rob Witoff, tells MIT Technology Review that while they’ve seen massive productivity gains in some areas, the impact has been patchy. For simpler tasks like restructuring the code base and writing tests, AI-powered workflows have achieved speedups of up to 90%. But gains are more modest for other tasks, and the disruption caused by overhauling existing processes often counteracts the increased coding speed, says Witoff.

One factor is that AI tools let junior developers produce far more code. As in almost all engineering teams, this code has to be reviewed by others, normally more senior developers, to catch bugs and ensure it meets quality standards. But the sheer volume of code now being churned out is quickly saturating the ability of midlevel staff to review changes. “This is the cycle we’re going through almost every month, where we automate a new thing lower down in the stack, which brings more pressure higher up in the stack,” he says. “Then we’re looking at applying automation to that higher-up piece.”

Developers also spend only 20% to 40% of their time coding, says Jue Wang, a partner at Bain, so even a significant speedup there often translates to more modest overall gains. Developers spend the rest of their time analyzing software problems and dealing with customer feedback, product strategy, and administrative tasks. To get significant efficiency boosts, companies may need to apply generative AI to all these other processes too, says Wang, and that is still in the works.
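
The arithmetic behind that point can be sketched with Amdahl's law, the standard formula for how much a whole process speeds up when only part of it gets faster. The 20% and 40% figures come from the paragraph above; the assumption that AI doubles coding speed is purely illustrative and does not come from Bain or anyone quoted here.

```python
def overall_speedup(coding_fraction: float, coding_speedup: float) -> float:
    """Amdahl's-law estimate of the total speedup when only coding gets faster.

    coding_fraction: share of a developer's time spent coding (0.0 to 1.0).
    coding_speedup:  how many times faster the coding part becomes (assumed).
    """
    if not 0.0 <= coding_fraction <= 1.0:
        raise ValueError("coding_fraction must be between 0 and 1")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    # Time left after the speedup: the untouched work plus the shrunken coding work.
    remaining_time = (1.0 - coding_fraction) + coding_fraction / coding_speedup
    return 1.0 / remaining_time


if __name__ == "__main__":
    # Illustrative assumption: coding itself becomes twice as fast.
    for fraction in (0.2, 0.4):
        gain = overall_speedup(fraction, coding_speedup=2.0)
        print(f"{fraction:.0%} of time coding, 2x faster coding -> {gain:.2f}x overall")
```

Under that assumption, a developer who codes 40% of the time ends up about 1.25 times as productive overall, and one who codes 20% of the time about 1.11 times.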

Rapid evolution

Programming with agents is a dramatic departure from previous working practices, though, so it’s not surprising companies are facing some teething issues. These are also very new products that are changing by the day. “Every couple months the model improves, and there’s a big step change in the model’s coding capabilities and you have to get recalibrated,” says Anthropic’s Cherny.

For example, in June Anthropic introduced a built-in planning mode to Claude; it has since been replicated by other providers. In October, the company also enabled Claude to ask users questions when it needs more context or faces multiple possible solutions, which Cherny says helps it avoid the tendency to simply assume which path is the best way forward.

Most significant, Anthropic has added features that make Claude better at managing its own context. When it nears the limits of its working memory, it summarizes key details and uses them to start a new context window, effectively giving it an “infinite” one, says Cherny. Claude can also invoke sub-agents to work on smaller tasks, so it no longer has to hold all aspects of the project in its own head. The company claims that its latest model, Claude Sonnet 4.5, can now code autonomously for more than 30 hours without major performance degradation.
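
The general idea of summarizing older material to free up working memory can be sketched in a few lines. This is a toy illustration, not Anthropic's implementation: the function names, the word-count stand-in for tokens, and the placeholder summarizer are all hypothetical.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return len(text.split())


def compact(
    history: List[str],
    limit: int,
    summarize: Callable[[List[str]], str],
    keep_recent: int = 2,
) -> List[str]:
    """Return a history that starts a fresh window once the limit is exceeded.

    Older messages are replaced by a single summary; the most recent
    `keep_recent` messages are carried over unchanged.
    """
    total = sum(count_tokens(message) for message in history)
    if total <= limit or len(history) <= keep_recent:
        return history  # Still fits, or nothing old enough to summarize.
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    return [summarize(older)] + recent


def toy_summarize(messages: List[str]) -> str:
    # Hypothetical placeholder; a real system would ask the model for a summary.
    return f"[summary of {len(messages)} earlier messages]"


if __name__ == "__main__":
    history = ["plan the schema", "write the migration", "fix failing test", "add index"]
    print(compact(history, limit=5, summarize=toy_summarize))
```

The design choice worth noting is that recent messages are kept verbatim, since they are most likely to matter for the next step, while everything older is compressed.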

Novel approaches to software development could also sidestep coding agents’ other flaws. MIT professor Max Tegmark has introduced something he calls “vericoding,” which could allow agents to produce entirely bug-free code from a natural-language description. It builds on an approach known as “formal verification,” where developers create a mathematical model of their software that can prove incontrovertibly that it functions correctly. This approach is used in high-stakes areas like flight-control systems and cryptographic libraries, but it remains costly and time-consuming, limiting its broader use.
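
To give a flavor of what a machine-checked proof looks like, here is a minimal sketch in Lean 4, a proof assistant chosen for illustration (the research described here uses Dafny). The function and theorem are hypothetical toys, not drawn from the vericoding benchmark, and the example assumes a recent Lean 4 release in which the `omega` tactic is built in.

```lean
-- A hypothetical toy function: add a natural number to itself.
def double (n : Nat) : Nat := n + n

-- The specification: doubling always equals multiplying by two.
-- If this file compiles, the proof checker has verified the claim for every n.
theorem double_spec (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

Unlike a test, which checks a handful of inputs, the theorem covers every possible input at once, which is what makes the approach attractive and also what makes it expensive for real software.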

Rapid improvements in LLMs’ mathematical capabilities have opened up the tantalizing possibility of models that produce not only software but the mathematical proof that it’s bug free, says Tegmark. “You just give the specification, and the AI comes back with provably correct code,” he says. “You don’t have to touch the code. You don’t even have to ever look at the code.”

When tested on about 2,000 vericoding problems in Dafny—a language designed for formal verification—the best LLMs solved over 60%, according to non-peer-reviewed research by Tegmark’s group. This was achieved with off-the-shelf LLMs, and Tegmark expects that training specifically for vericoding could improve scores rapidly.

And counterintuitively, the speed at which AI generates code could actually ease maintainability concerns. Alex Worden, principal engineer at the business software giant Intuit, notes that maintenance is often difficult because engineers reuse components across projects, creating a tangle of dependencies where one change triggers cascading effects across the code base. Reusing code used to save developers time, but in a world where AI can produce hundreds of lines of code in seconds, that imperative has gone, says Worden.

Instead, he advocates for “disposable code,” where each component is generated independently by AI without regard for whether it follows design patterns or conventions. The components are then connected via APIs, sets of rules that let components request information or services from each other. Each component’s inner workings are not dependent on other parts of the code base, making it possible to rip them out and replace them without wider impact, says Worden.
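
A minimal sketch of that idea: callers depend only on an agreed interface, so one implementation can be discarded and regenerated without touching anything else. The discount example and every name in it are hypothetical, chosen for illustration rather than taken from Worden or Intuit.

```python
from typing import List, Protocol


class Discount(Protocol):
    """The agreed contract: take a total in cents, return the discounted total."""

    def apply(self, total_cents: int) -> int: ...


class PercentOffV1:
    """First generated implementation: scale the total directly."""

    def __init__(self, percent: int) -> None:
        self.percent = percent

    def apply(self, total_cents: int) -> int:
        return total_cents * (100 - self.percent) // 100


class PercentOffV2:
    """A regenerated replacement with different internals but the same results."""

    def __init__(self, percent: int) -> None:
        self.percent = percent

    def apply(self, total_cents: int) -> int:
        # Round the discount up, which matches rounding the remainder down in V1.
        discount = (total_cents * self.percent + 99) // 100
        return total_cents - discount


def checkout_total(prices_cents: List[int], discount: Discount) -> int:
    # The caller knows only the interface, never the implementation behind it.
    return discount.apply(sum(prices_cents))


if __name__ == "__main__":
    cart = [1000, 2500]
    print(checkout_total(cart, PercentOffV1(10)))
    print(checkout_total(cart, PercentOffV2(10)))
```

Swapping `PercentOffV1` for `PercentOffV2` requires no change to `checkout_total`, which is the property that makes a component disposable.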

“The industry is still concerned about humans maintaining AI-generated code,” he says. “I question how long humans will look at or care about code.”

A narrowing talent pipeline

For the foreseeable future, though, humans will still need to understand and maintain the code that underpins their projects. And one of the most pernicious side effects of AI tools may be a shrinking pool of people capable of doing so.

Early evidence suggests that fears around the job-destroying effects of AI may be justified. A recent Stanford University study found that employment among software developers aged 22 to 25 fell nearly 20% between 2022 and 2025, coinciding with the rise of AI-powered coding tools.

Experienced developers could face difficulties too. Luciano Nooijen, an engineer at the video-game infrastructure developer Companion Group, used AI tools heavily in his day job, where they were provided for free. But when he began a side project without access to those tools, he found himself struggling with tasks that previously came naturally. “I was feeling so stupid because things that used to be instinct became manual, sometimes even cumbersome,” says Nooijen.

Just as athletes still perform basic drills, he thinks the only way to maintain an instinct for coding is to regularly practice the grunt work. That’s why he’s largely abandoned AI tools, though he admits that deeper motivations are also at play.

Part of the reason Nooijen and other developers MIT Technology Review spoke to are pushing back against AI tools is a sense that they are hollowing out the parts of their jobs that they love. “I got into software engineering because I like working with computers. I like making machines do things that I want,” Nooijen says. “It’s just not fun sitting there with my work being done for me.”