Starting from specific cross-sections of the CS industry, this paper systematically analyzes the boundaries and stratification of AI’s absorption of human coding capabilities. Version 2 introduces a Three-Dimensional Code Effectiveness Analysis Framework: Dimension 1 (Generation State: syntactic correctness), Dimension 2 (Runtime State: production environment stability), and Dimension 3 (Requirement State: complete fulfillment of user functionality). The research finds that AI code effectiveness decays cliff-like across these three dimensions: Python’s benchmark pass rate is approximately 90%, yet 43% of AI code still requires production debugging after passing tests, and the rate of complete fulfillment of user requirements is virtually unmeasured. Cross-analyzing the programming-language and system-layer dimensions, the paper proposes a correction formula: AI Net Replacement Effect = (Degree of Formalizability × Abstraction Level) ÷ Review & Verification Cost ÷ Three-Dimensional Decay Coefficient. The paper then traces the argument down to physical infrastructure bottlenecks, the stagnation of the physics paradigm, genius cluster theory, and the ultimate cognitive boundary that “matrix computation cannot penetrate the quantum wall,” forming a complete empirical closed loop with LEECHO’s previously published Three Paradigms Theory.
The Starting Point: A Question Sparked by a Single Sentence
When a Chief Architect’s Offhand Remark Exposes Collective Anxiety
In 2026, MiniMax Agent’s chief architect Adao made an intuitive judgment during a podcast: “You will always end up being internalized by the model.” This sentence struck precisely at the collective anxiety of CS professionals — the engineering techniques, architectural ideas, and prompt strategies painstakingly crafted today might be “consumed” by a more powerful model tomorrow, becoming part of its own capabilities.
But to what extent does this statement actually hold true? Where are the boundaries of human coding capabilities that AI can truly absorb? And what does the word “absorb” itself mean: that the code can be generated? That it can run? Or that it can truly fulfill the functionality humans intended? This paper follows that line of inquiry and constructs a three-dimensional analysis framework to deconstruct the problem.
Three-Dimensional Deconstruction of AI Code Effectiveness
Generation ≠ Execution ≠ Fulfillment
There exists a serious conceptual conflation in the AI code generation industry: equating “code can be generated” with “code works,” and equating “code works” with “software is complete.” This paper decomposes AI code effectiveness into three progressive yet independent dimensions.
Dimension 1: Generation State
Whether code compiles, has correct syntax, and passes unit tests. This is what all benchmarks measure — HumanEval, SWE-bench, Aider Polyglot. This dimension produces the best-looking numbers: top models achieve benchmark pass rates approaching 90% on Python, with SWE-bench scores ranging from 75–85%.
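To make concrete what Dimension 1 does and does not measure, here is a minimal, hypothetical sketch of a generation-state check. It is not the harness of HumanEval, SWE-bench, or any real benchmark; the candidate code and tests are invented for illustration.

```python
# Minimal sketch of a Dimension-1 ("Generation State") check: the only question
# asked is whether generated code compiles and passes pre-written unit tests.
# Hypothetical example, not the harness of any actual benchmark.

def dimension1_pass(generated_source: str, tests: list) -> bool:
    """Compile the candidate, bind its definitions, and run the unit tests."""
    namespace = {}
    try:
        exec(generated_source, namespace)          # syntax/import errors fail here
        return all(test(namespace) for test in tests)
    except Exception:
        return False

# A model-generated candidate and its unit tests (both invented for illustration).
candidate = "def add(a, b):\n    return a + b\n"
unit_tests = [
    lambda ns: ns["add"](2, 3) == 5,
    lambda ns: ns["add"](-1, 1) == 0,
]

print(dimension1_pass(candidate, unit_tests))  # True: Dimension 1 satisfied
# Nothing here probes load, concurrency, security, or user intent (Dimensions 2-3).
```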
Dimension 2: Runtime State
Whether code, once deployed to production, can continue to run stably under real-world load, edge cases, concurrency pressure, and security attacks. The Lightrun 2026 report reveals: 43% of AI-generated code changes still require manual debugging in production after passing QA and pre-release testing. 88% of organizations need two to three redeployment cycles to verify AI’s fix recommendations.
Dimension 3: Requirement State
Whether code truly and completely fulfills the human user’s functional requirements — including both explicit and implicit requirements. Microsoft Research defines this dimension as the “Intent Gap”: the distance between user intent and program behavior. This dimension has virtually no reliable measurement methods, and is where AI is most powerless.
The decay across the three dimensions means that AI’s “90% accuracy” on Python translates, after passing through all three filters, into a rate of fully satisfied user requirements of perhaps only 15–25%; at the C/C++ embedded level, the same decay drives it toward zero. All widely cited benchmark and marketing figures refer to Dimension 1, but what users actually need is Dimension 3.
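The arithmetic behind that estimate can be made explicit. The sketch below treats the three dimensions as independent multiplicative filters, which is a simplifying assumption of this illustration rather than a measured model; the Dimension 3 figure is an illustrative guess.

```python
# Back-of-the-envelope decay model (assumes the three dimensions act as
# independent filters, a simplification for illustration only).

dim1_generation = 0.90   # Python benchmark pass rate cited above
dim2_runtime    = 0.57   # ~43% of passing changes still need production debugging
dim3_intent     = 0.30   # illustrative guess: share that fully matches user intent

effective = dim1_generation * dim2_runtime * dim3_intent
print(f"End-to-end requirement fulfillment: {effective:.0%}")  # ~15%
```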
The AI Productivity Paradox: Feeling Faster, Actually Slower
A 39% Gap Between Perception and Reality
Data from May 2026 reveals a shocking paradox.
In 2025, METR (Model Evaluation & Threat Research) studied experienced developers working in mature, complex codebases and found a 39–44% gap between perceived and actual productivity. Two cognitive biases explain the chasm: automation bias (over-trusting automated systems) and the effort heuristic (mistaking reduced typing for reduced cognitive workload).
Further data confirms the structural nature of this paradox:
AI now generates 42% of code; PR velocity increased 20%, but incidents rose 23.5% and failure rates climbed 30%. GitHub Copilot’s code completion rate is 46%, yet only about 30% of its suggestions are accepted by developers. Ox Security’s analysis of over 300 repositories found ten recurring anti-patterns in 80–100% of AI-generated code.
On March 2, 2026, Amazon.com went down for nearly six hours, losing 120,000 orders and generating 1.6 million website errors. Three days later, on March 5, a more severe outage lasted six hours and caused U.S. orders to drop by 99%, with approximately 6.3 million orders lost.
Both incidents were traced back to AI-assisted code changes deployed to production without proper approval. This is the catastrophic real-world manifestation of Dimension 1 → Dimension 2 decay.
Cross-Dimensional Analysis: Language × System Layer
The Diagonal Rule of AI Replacement
4.1 Three-Dimensional AI Programming Effectiveness by Language (May 2026 Calibrated Values)
| Language | Dim 1: Benchmark Pass Rate | Dim 2: Production Viability (Corrected) | Security Pass Rate | Dim 3: Requirement Fulfillment |
|---|---|---|---|---|
| Python | ~88-90% | ~50-55% | ~62% | ~15-25% (estimated) |
| JavaScript/TS | ~82-85% | ~45-50% | ~57% | ~15-20% (estimated) |
| Go | ~75-80% | ~40-45% | Higher | ~10-18% (estimated) |
| Java | ~70-78% | ~30-35% | ~28% | ~8-15% (estimated) |
| C# | ~68-75% | ~30-35% | ~55% | ~8-15% (estimated) |
| Rust | ~65-70% | ~30-35% | Higher (type protection) | ~8-12% (estimated) |
| C++ | ~55-65% | ~15-20% | Low | ~3-8% (estimated) |
| C/Embedded | <50% | <10% | Very low | ≈0% |
4.2 AI Replacement Rate by System Layer (May 2026 Calibrated Values)
| System Layer | AI Code Assistance Share | Dim 2: Production Reliability | Net Replacement Effect | Human Irreplaceability |
|---|---|---|---|---|
| Demo/Prototype | 70-80% | N/A (one-time showcase) | High | Very low |
| Frontend UI | 46-51% | ~35% modification-free | Medium | Low |
| Backend Business Logic | 25-35% | ~20% production-grade | Low | Medium |
| System Architecture/DevOps | 10-15% | <10% reliable | Very low | High |
| Embedded Firmware | Generation 19% · Testing 28% | <5% deployable | Near zero | Very high |
| Hardware Drivers/Kernel | <5% | ≈0% | Zero | 100% |
4.3 Cross-Analysis Conclusion: The Diagonal Rule
When the two dimensions are superimposed, a perfect diagonal emerges: High-abstraction language + High system layer = High replacement rate (but high review cost); Low-abstraction language + Low system layer = Near-zero replacement rate (but zero review burden).
V1 Formula: Speed of AI Absorption = Degree of Formalizability × Abstraction Level
V2 Formula: AI Net Replacement Effect = (Degree of Formalizability × Abstraction Level) ÷ Review & Verification Cost ÷ Three-Dimensional Decay Coefficient
The upper layers appear to have a high replacement rate, but once divided by review cost and three-dimensional decay, the net replacement effect falls far below the advertised figures. The lower layers have low replacement rates, but because they do not depend on AI they also carry no review burden, which makes them the layers with the most stable efficiency.
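Expressed as code, the V2 formula looks like the sketch below. All four inputs are illustrative scores invented for this example; the paper does not define measurement scales for them, so only the relative ordering of the outputs is meaningful.

```python
# Sketch of the V2 formula. The scores are hypothetical placeholders on a
# relative scale; absolute values are assumptions made for illustration.

def net_replacement_effect(formalizability: float,
                           abstraction_level: float,
                           review_cost: float,
                           decay_coefficient: float) -> float:
    """AI Net Replacement Effect = (Formalizability x Abstraction Level)
       / Review & Verification Cost / Three-Dimensional Decay Coefficient."""
    return (formalizability * abstraction_level) / review_cost / decay_coefficient

# Two corners of the diagonal (hypothetical scores):
python_demo = net_replacement_effect(0.9, 0.9, review_cost=2.0, decay_coefficient=4.0)
c_embedded  = net_replacement_effect(0.3, 0.1, review_cost=1.0, decay_coefficient=10.0)

print(f"Python/Demo net effect: {python_demo:.3f}")   # ~0.101
print(f"C/Embedded net effect:  {c_embedded:.3f}")    # ~0.003
```

Even with a generous Dimension 1 score, the upper-layer case lands an order of magnitude below its raw replacement rate once review cost and decay are divided out, while the ordering of the diagonal is preserved.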
The Stratification of AI’s Absorption of CS Skills
Who Gets Consumed, Who Remains Irreplaceable
63% of vibe coding users are non-developers. 40% of junior developers deploy AI-generated code they don’t fully understand. Yet 72% of professional developers do not use vibe coding at work. Full-stack developer AI adoption rate is 32.1%, frontend 22.1%, backend only 8.9% — AI’s penetration depth is precisely inversely correlated with system layer.
In the embedded domain, over 80% of developers use AI, but usage concentrates on test verification (28%), with code generation accounting for only 19% — at the lower layers, AI is treated as a “verification tool” rather than a “generation tool.” 80% of embedded positions remain unfilled for months or even years. The upper layers are laying off workers while the lower layers are desperately hiring.
Labor Market Mirror and Capital Misallocation
Where Money Flows vs. Where Skills Are Needed
By 2030, the chip industry globally will need approximately one million additional technical workers. One-third of semiconductor professionals in the United States are aged 55 or older. Embedded engineers earn far lower average salaries than AI/ML engineers, causing lower-layer talent to migrate upward. Capital rewards the very skills it is about to devour, while ignoring the skills it will always need.
The expected return horizon for human investment keeps shrinking. An AI application can go from idea to funding in mere months, but a wafer fabrication line takes three to five years from planning to production, and an embedded engineer takes five to ten years to develop. Those who chase trends and spin narratives stand in the spotlight, while those who maintain and upgrade hardware go unnoticed.
When the Physical World Refuses the Narrative
Infrastructure Bottlenecks That Money Cannot Solve
In 2026, the centralized AI narrative is being vetoed by the physical world. Nearly half of planned data center projects in the United States have been canceled or postponed. High-voltage transformer delivery lead times have extended from 12–18 months to 36–48 months. Copper prices hit a record high in January 2026. OpenAI’s Stargate project had made no substantive construction progress as of April 2026; money cannot make transformers get manufactured any faster.
The efficiency of centralized AI comes from economies of scale, but physical resources cannot be concentrated indefinitely. The real future may be forced toward distributed, edge-based, and lightweight architectures, precisely the home turf of embedded engineers and engineers who work close to the hardware. And that directional shift demands exactly the kind of capabilities that lower-layer C/C++ engineers build up over years of effort.
The Drag from Below: The Stagnation of Physics
Five Decades Without a New Paradigm
The entire modern technological civilization is built on the legacy of the physics explosion in the first half of the 20th century. Since the completion of the Standard Model in the 1970s, fundamental physics has not achieved a paradigmatic breakthrough. Humanity knows nothing about 95% of the universe’s composition. We lack new core analytical instruments in physics — the last fundamental breakthrough was the scanning tunneling microscope in the 1980s, over forty years ago.
Genius Clusters: The Unplannable Variable of Civilization
When Breakthroughs Come from the Unpredictable
David Banks’ 1997 paper identified that genius clusters appeared in Athens (c. 440–380 BCE), Florence (c. 1440–1490), and London (c. 1570–1640). The two paradigmatic breakthroughs in physics — the Newton cluster (1660–1700) and the Einstein-Bohr cluster (1900–1930) — fit this pattern perfectly. A third cluster has yet to appear.
The core characteristic of genius clusters is abductive reasoning. They see the same data as their contemporaries but forge cross-dimensional causal connections that data aggregation alone cannot produce. Einstein did not “deduce” within Newton’s framework, nor did he “induce” empirical formulas — he directly questioned the very definitions of time and space themselves.
The Ultimate Boundary: Matrix Computation Cannot Penetrate the Quantum Wall
An Ontological Hard Limit on AI
The entire computational essence of AI is matrix multiplication. No matter how large the model, the underlying layer is always linear algebra performing deterministic operations on classical bits. The quantum world’s superposition, entanglement, and uncertainty are ontologically different modes of existence from the classical world. Matrices can simulate the statistical outcomes of quantum behavior, but simulation is not understanding.
This is not an engineering problem, not a compute problem, not a data problem — it is an ontological hard boundary.
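As a concrete illustration of “matrices can simulate the statistical outcomes of quantum behavior,” the NumPy sketch below prepares an entangled two-qubit Bell state using nothing but linear algebra and reproduces its measurement statistics; the catch is that the classical state vector it manipulates grows as 2^n amplitudes. The example is generic and not tied to any particular AI system.

```python
import numpy as np

# Classical (matrix) simulation of an entangled two-qubit Bell state: nothing
# but linear algebra over a vector of 2**n complex amplitudes.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate on one qubit
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])                # controlled-NOT on two qubits

state = np.zeros(4, dtype=complex)
state[0] = 1.0                                 # start in |00>
state = CNOT @ (np.kron(H, np.eye(2)) @ state) # H on qubit 1, then CNOT -> Bell state

probs = np.abs(state) ** 2                     # Born-rule measurement statistics
print(dict(zip(["00", "01", "10", "11"], probs.round(3).tolist())))
# {'00': 0.5, '01': 0.0, '10': 0.0, '11': 0.5}

# The cost of this simulation doubles with every added qubit:
print(f"Amplitudes needed for 50 qubits: {2**50:,}")   # over a quadrillion
```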
Closing the Loop with the Three Paradigms Theory
How Every Layer of Evidence Maps to the Framework
| Argument Layer | V2 New Findings | Corresponding Paradigm Theory |
|---|---|---|
| Three-Dimensional Effectiveness Deconstruction | Generation 90% → Runtime 57% → Requirement ~20% cliff-drop decay | Inherent limitations of AI as the pinnacle of the Second Paradigm |
| AI Productivity Paradox | Perceived +20%, measured -19% | The self-deception of induction (Second Paradigm) |
| Language × Layer Cross-Analysis | Full decay spectrum from Python/Demo to C/Hardware | Degree of formalizability determines depth of AI absorption |
| Labor Market Mirror | Upper layers laid off 260K vs. lower layers 80% vacancy | Tokens are equal, Prompts are not |
| Physical Infrastructure | 50% data centers stalled, 4-year transformer waits | Physical constraints of the First Paradigm |
| Physics Stagnation | 50 years without new paradigm, analytical tools stalled | The 3% observable ratio ceiling |
| Genius Clusters | Abductive essence of Newton and Einstein clusters | Historical instances of Third Paradigm operators |
| Quantum Wall | Ontological boundary of matrix computation | AI forever trapped in the Second Paradigm |
Conclusion
The Boundaries Are Real, and They Are Permanent
AI’s absorption of human coding capabilities is not uniform but strictly stratified, with three-dimensional decay at every layer. The core findings of Version 2:
First, AI code effectiveness must be evaluated separately across three dimensions. The industry-cited “90% accuracy” is only Dimension 1 (Generation State). After the cliff-drop decay through Dimension 2 (Runtime State: 43% require production debugging) and Dimension 3 (Requirement State: the intent gap cannot be bridged), the proportion that truly satisfies user requirements is likely less than 20%.
Second, AI productivity is a cognitive illusion. Developers feel 20% faster but are actually 19% slower. Time spent reviewing AI code (11.4 hours/week) has surpassed time spent writing their own code (9.8 hours/week). Trust has plummeted from 70%+ to 29%.
Third, when the language dimension and system layer dimension are cross-referenced, they form a complete decay diagonal from Python/Demo to C/Hardware. The closer to the physical world, the lower AI’s three-dimensional effectiveness, until it reaches zero.
Fourth, physical infrastructure is rejecting the centralized AI narrative. Capital misallocation is severe, with insufficient investment in the lower layers. Physics itself has stagnated for fifty years, and analytical instruments for forty years. Genius clusters cannot be planned. Matrix computation cannot penetrate the quantum wall.
Returning to the starting point. Adao said “you will always end up being internalized by the model” — this statement is partially true for upper-layer CS skills, but even in the upper layers, the “internalization” after three-dimensional decay is far less thorough than the surface numbers suggest. And for the lower layers, for those coding capabilities deeply coupled with the physical world, for those engineering judgments requiring abductive reasoning, AI’s “internalization” is not a matter of speed — it is impossible.
AI is not the new physics revolution — it is the last wave of applications from the old physics revolution. Unless physics itself achieves its next paradigmatic breakthrough, all technological progress will hit a ceiling within the foreseeable future. And that breakthrough can only come from the abductive leap of the next genius cluster.
Data Sources
[1] Lightrun 2026 State of AI-Powered Engineering Report — 43% production debugging rate, Amazon outage incident (VentureBeat, April 2026)
[2] METR Randomized Controlled Trial — Perceived +20% vs. measured -19% (2025)
[3] Sonar State of Code Developer Survey 2026 — 42% AI code share, 40% increase in technical debt
[4] Stack Overflow 2025 Developer Survey — 29% trust rate, 66% increased time fixing AI code, 72% do not use vibe coding
[5] Digital Applied AI Coding Tool Adoption Survey Q1 2026 — 11.4 vs 9.8 hours review/writing inversion
[6] CodeRabbit State of AI vs Human Code Generation — AI code has 1.7x more issues (2025)
[7] Ox Security 300-repository analysis — 80-100% of AI code exhibits ten anti-patterns
[8] GitClear 211 million lines of code analysis — Code duplication grew 8x, refactoring dropped from 25% to below 10%
[9] Microsoft Research, Lahiri — Definition of the Intent Gap (2026)
[10] Veracode Security Testing Report — Security failure rates by language
[11] RunSafe Security Embedded Developer AI Usage Survey — 2026
[12] SEMI Semiconductor Industry Talent Forecast — Global shortage data for 2030
[13] David Banks, “The Problem of Excess Genius,” 1997
[14] Sabine Hossenfelder on the stagnation of physics
[15] LEECHO Global AI Research Lab, “The Three Paradigms of Human Scientific Cognition,” February 2026