The first AI coding assistants reached the consumer market — primarily as autocomplete features in mainstream editors — in 2021. Three years of rapid iteration have produced tools that are genuinely useful, that have changed daily practice for a substantial fraction of working programmers, and that have not, for the most part, lived up to the most aggressive productivity claims their vendors continue to make.
This piece is a working developer’s assessment of where the technology has actually moved the practice of programming, where it has not, and where the remaining open questions are. It is informed by three years of using these tools day-to-day, by the academic literature on programmer productivity that has emerged in this period, and by the regular conversations with other working developers that any technology beat involves.
What the tools actually do well
The clearest productivity gains have been in routine, well-bounded code. Writing test scaffolding, translating between two languages with similar semantics, generating boilerplate that conforms to a familiar pattern, completing the second half of a function whose first half makes the second half mostly obvious — these are the cases where the tools are genuinely faster than typing the code yourself.
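A hypothetical illustration of the "second half obvious from the first" case (the `User` class and its fields are invented for the example, not drawn from any of the studies discussed here):

```python
from dataclasses import dataclass

@dataclass
class User:
    name: str
    email: str

    def to_dict(self) -> dict:
        return {"name": self.name, "email": self.email}

    @classmethod
    def from_dict(cls, d: dict) -> "User":
        # Once to_dict exists, this inverse is near-mechanical:
        # exactly the kind of completion assistants reliably get right.
        return cls(name=d["name"], email=d["email"])
```

The pattern is fully determined by the code above it, which is why autocomplete-style tools handle it so well.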
Several controlled studies have measured these gains. The most rigorous (a randomized study with about 100 developers completing a series of moderately complex tasks) found median productivity gains of about 25 percent on the routine tasks the study isolated. Other studies have found similar or somewhat larger numbers. The basic finding — that AI assistants meaningfully accelerate routine programming — is robust.
Where the tools work less well is on the harder kinds of programming. Tasks that require holding a large, partially understood codebase in mind; tasks that require careful reasoning about edge cases that are not obvious from the local code; tasks that require understanding the specific conventions of a project that does not look like the average open-source project the model was trained on; tasks that require disambiguating between multiple plausible designs for an unsolved problem — these cases are where the tools’ productivity contribution is smaller, and sometimes negative.
A subtler problem on larger tasks is that the tools confidently produce plausible-looking code for problems they have not actually solved. The output compiles, passes the obvious tests, and contains a subtle but serious bug. A developer using the tool well will catch these; a developer using it less well may not. The quality of human review of generated code is the dominant variable in whether the productivity gains translate into actual delivered software.
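As an illustration (this `chunk` helper is invented for the example), here is the shape of the failure: generated code that reads correctly, passes the obvious test, and still carries a subtle edge-case bug.

```python
def chunk(items, size):
    """Split items into consecutive chunks of length `size`."""
    # Looks right, and passes the obvious even-length test,
    # but the range bound silently drops a trailing partial chunk.
    return [items[i:i + size] for i in range(0, len(items) - size + 1, size)]

print(chunk([1, 2, 3, 4, 5, 6], 2))  # [[1, 2], [3, 4], [5, 6]]: looks fine
print(chunk([1, 2, 3, 4, 5], 2))     # [[1, 2], [3, 4]]: the 5 is silently gone
```

A reviewer who only runs the even-length case would approve this; the uneven case is exactly the edge that careful review of generated code exists to catch.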
Where the productivity claims diverge from reality
Vendor marketing claims for these tools have consistently outrun what controlled studies measure. The 2x, 5x, and 10x productivity gains advertised in vendor materials are not what the studies find. The careful studies find, on the populations and tasks they measure, gains in the 20 to 35 percent range — real and meaningful, but a fraction of the headline numbers.
Several factors contribute to the gap. First, vendor studies sometimes measure outcomes narrower than working software (lines of code committed, time-to-completion on isolated tasks), and those proxies overstate the gain on actual delivered features. Second, the productivity gains on routine tasks are real, but those tasks are not the bottleneck on most software projects; speeding up routine work by 30 percent does not speed up overall delivery by 30 percent if the bottleneck is elsewhere. Third, some of the claimed productivity gains have not survived attempts at independent replication.
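The second factor is Amdahl's law applied to delivery time. A sketch with assumed numbers (the 40 percent share of routine work is illustrative, not taken from any of the studies above):

```python
def overall_speedup(routine_fraction, routine_speedup):
    """Amdahl's law: overall speedup when only the routine
    fraction f of total work is accelerated by factor s,
    i.e. 1 / ((1 - f) + f / s)."""
    return 1 / ((1 - routine_fraction) + routine_fraction / routine_speedup)

# If routine coding is 40% of delivery time and the tools speed it up 1.3x,
# overall delivery improves by only about 10%, not 30%.
print(f"{overall_speedup(0.4, 1.3):.3f}")  # 1.102
```

The smaller the routine fraction, the more the headline per-task gain shrinks once measured against whole-project delivery.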
The honest summary is that the tools are useful, that they make daily programming meaningfully different from how it was three years ago, and that they have not produced the order-of-magnitude productivity gains some vendors continue to claim.
What has changed in daily practice
A working developer in 2026 spends meaningfully less time typing routine code than a working developer in 2022 did. The time freed up is spent on a mix of activities: more code review of generated code, more time on architecture and design questions, more time on the cross-system debugging that the tools cannot accelerate, more meetings.
Where that freed-up time goes is the part that matters, and the part that has not been carefully studied. If it flows into the higher-judgment work that distinguishes senior developers, the long-run productivity effect could be larger than the studies measure. If it flows into more meetings, or into reviewing code that ought not to have been generated in the first place, the effect is smaller.
The integration of AI assistants into the development workflow has also changed. Three years ago, the dominant pattern was an autocomplete-style suggestion tool inside the editor. The current dominant patterns include agent-style tools that take higher-level instructions and produce multi-file changes; chat-style assistants that participate in code review; and review tools that flag potential issues in human-written code. The autocomplete pattern is still in heavy use but is no longer the only mode.
The junior-developer question
There is real disagreement among working senior developers about whether AI coding assistants are good or bad for junior developers’ learning. The case for them is that the tools handle the routine work that no one needs to learn through repetition, freeing the junior to reach the harder learning faster. The case against is that the tools short-circuit the precise feedback loop — write code, see it fail in some specific way, understand why — that builds durable programming intuition.
The evidence base on this question is thin and not yet conclusive. The few studies that have tried to measure long-run skill development in junior developers using AI assistants are too small and too short to be definitive. The strongest opinions on the question (in either direction) are more confident than the data warrant.
The position we have come to, after observing several junior developers we work with, is that limited and supervised use of these tools is sensible: enough use to participate in modern engineering culture, not so much use that the routine learning loop is bypassed. We would not be confident that the typical industry practice on this question — which generally is “use the tools as much as possible from day one” — is the right one.
What the tools have not changed
The hardest parts of software engineering have not been substantially affected by these tools. Choosing the right abstractions for a problem still requires judgment that the tools cannot supply. Debugging a complex distributed system still requires reasoning the tools cannot perform reliably. Working with a stakeholder to figure out what they actually want still requires social and product skills the tools do not have.
The interesting practical question for the next several years is whether the tools’ contribution to the routine parts of programming continues to grow at the rate it did between 2022 and 2025, or whether improvement has begun to plateau. Recent benchmarks suggest that gains in raw code-generation capability have continued, but at a slowing rate. Gains on the harder parts of programming — multi-step reasoning, novel algorithm design — have come more slowly still.
Whatever the ultimate trajectory turns out to be, the tools are a permanent part of the practice of programming now. The interesting question is not whether to use them but how thoughtfully.