Is AI actually making developers slower? Recent data reveals a startling discrepancy: while developers feel 20% more productive using AI, independent tests show actual work time can increase by nearly 19% due to poor code quality. From the "vibe-coding" trend to the hidden costs of AI-generated technical debt, we dive into why the 20% "AI discount" offered by consulting firms might be the most expensive mistake a business can make.
If we browse LinkedIn posts, laudatory articles in the trade press, or analyses presented primarily by vendors, the answer seems obvious: AI is a revolution. However, recent data suggests a more nuanced reality.
In 2024, SoftServe conducted an experiment involving 1,000 developers across 7 countries, covering 1,500 tasks performed both with and without the help of Generative AI (GPT-3.5/4.0). They measured task completion time, error rates, documentation quality, and subjective participant feedback. The results were impressive:
Similarly, JetBrains surveyed over 23,000 developers last year. A vast majority (96%) claimed that using AI tools saves them time, nearly a quarter felt the generated code was better, and over half reported that automated tools led to increased productivity.
In 2025, Stanford University’s AI Index Report, supported by experimental research on 100,000 developers from 500 companies, generally agreed that AI increases productivity—though not as drastically as vendor reports suggest. In Poland (4,000 developers), the increase was 18.8%, in the USA 19.3%, and in Sweden 20.6%.
However, researchers highlighted a critical concern: code quality. They found that AI-generated code required manual fixes in 22% of cases, while the European average for traditionally built solutions sits at just 12.5%.
A fascinating wrench was recently thrown into the works of the generative coding narrative by METR (Model Evaluation and Threat Research), an independent research organization that evaluates AI model capabilities. Crucially, METR is not affiliated with AI vendors or implementation firms, which lends its findings a high degree of perceived independence.
Their study focused on a much smaller but highly specialized group: 16 experienced open-source developers (minimum 5 years of experience, working on repositories with at least 1 million lines of code and 22,000 GitHub stars).
The participants worked on 246 real-world technical problems (bug fixes, refactoring, and feature enhancements), with each task randomly assigned to be completed with or without generative tools (primarily Cursor Pro with Claude 3.5 and 3.7). The results were startlingly different:
The conclusions from the METR study feel significantly more realistic than statistics published by companies with a vested interest in the tools. They point to several critical issues:
We should hesitate to claim AI is useless. The difference in results largely stems from the nature of the tasks: corporate studies focus on everyday "business" tasks, while the open-source study looked at "passionate" experts whose priority isn't just economic efficiency but algorithmic excellence and long-term maintainability.
A fitting historical parallel is the introduction of RAD (Rapid Application Development) tools in the 80s and 90s (Borland Delphi, MS Visual Basic, PowerBuilder). These tools significantly boosted speed at the cost of control over underlying algorithms. Much like today’s "vibe-coding," they generated code automatically, albeit through different methods.
Supporting development with Generative AI increases speed, but often at the expense of code quality. You can mitigate this with more rigorous QA and code reviews, but then those headline 45% or 25% productivity gains quickly evaporate.
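To see how quickly those gains can shrink, here is a minimal back-of-envelope sketch in Python. The 45%/25% speedups and the 22% vs. 12.5% rework rates are the figures quoted above; the share of total delivery effort spent on coding and the cost of a rework cycle are purely illustrative assumptions.

```python
# Back-of-envelope model: does an AI coding speedup survive the extra rework?
# All "ASSUMPTION" values are illustrative, not measured.

def net_change(coding_speedup: float, rework_rate: float,
               baseline_rework_rate: float = 0.125,  # European average quoted above
               coding_share: float = 0.4,            # ASSUMPTION: coding is 40% of total delivery effort
               rework_cost: float = 0.5) -> float:   # ASSUMPTION: a rework cycle costs 50% of the original task
    """Return the net change in total delivery time (negative = faster)."""
    # Only the coding portion gets faster; analysis, QA, and deployment do not.
    coding = coding_share / (1.0 + coding_speedup)
    non_coding = 1.0 - coding_share
    # Extra rework relative to the traditional baseline.
    extra_rework = (rework_rate - baseline_rework_rate) * rework_cost
    return (coding + non_coding + extra_rework) - 1.0

for speedup in (0.45, 0.25):  # headline gains quoted above
    delta = net_change(coding_speedup=speedup, rework_rate=0.22)  # 22% of AI code needs manual fixes
    print(f"{speedup:.0%} coding speedup -> {delta:+.1%} change in total delivery time")
```

Under these assumptions, a 45% coding speedup shrinks to roughly an 8% reduction in total delivery time, and a 25% speedup to about 3%, before any additional review overhead is even counted.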
If you accept lower quality standards, the initial economic gain will be short-lived, as the cost of maintaining that software will far exceed that of human-written code.
A real-world example: I recently reviewed RFP responses from several major consulting firms. All of them heavily promoted the use of AI to lower costs, yet they offered suspiciously short warranty periods (1 to 2 months). One bidder even offered two price tiers: one with and one without AI accelerators, with a 20% price difference.
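A rough total-cost-of-ownership comparison puts that discount in perspective. In the sketch below, only the 20% price difference comes from the RFP example; the baseline build cost, maintenance rate, horizon, and the maintenance overhead of an AI-heavy codebase are hypothetical assumptions.

```python
# Hypothetical total-cost-of-ownership comparison for the 20% "AI discount".
# Only the 20% discount comes from the RFP example; everything else is an assumption.

build_cost = 1_000_000          # ASSUMPTION: baseline build price (any currency unit)
annual_maintenance = 0.20       # ASSUMPTION: yearly maintenance at 20% of build cost
years = 3                       # ASSUMPTION: evaluation horizon
ai_maintenance_overhead = 1.5   # ASSUMPTION: AI-heavy codebase costs 50% more to maintain

traditional = build_cost + years * annual_maintenance * build_cost
ai_assisted = 0.8 * build_cost + years * annual_maintenance * build_cost * ai_maintenance_overhead

print(f"Traditional: {traditional:,.0f}")
print(f"AI-assisted: {ai_assisted:,.0f}")
print(f"Break-even maintenance overhead: {1 + 0.2 / (years * annual_maintenance):.2f}x")
```

Under these hypothetical numbers, the discount is erased within three years once maintenance runs roughly a third more expensive than for a traditionally built system.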
Is that 20% discount worth the long-term technical debt? I leave that to the reader to decide.
Strategic Technology, Delivery & Transformation Architect
Seasoned technology executive and transformation leader dedicated to bridging the gap between high-level business strategy and complex engineering execution. Specialized in stabilizing volatile IT environments, scaling agile delivery across international borders, and mentoring the next generation of technology leaders. Whether acting as a Fractional CTO or an Interim Program Director, establishes the operational discipline and strategic oversight needed to drive predictable, high-value outcomes in the most demanding industries.