This paper proposes a unified framework that reduces the operating mechanisms, capability boundaries, and physical limits of large language models (LLMs) to the duality of signal and noise. The paper spans seven dimensions: (I) The physical ontology of signal and noise—dimensionality and transitivity, signal lifecycle, Planck-scale cognitive limits, and the hierarchical inversion of noise as the existential substrate; (II) The computational nature of LLMs—inertial paths through chaos, the emergence of creation from InD/OOD dynamics, and ontological boundaries; (III) The physical cost of computation—cybernetic degradation, separated storage and computation, quantum tunneling, and quantitative analysis of the Landauer limit; (IV) Emergent phenomena—the sorting-failure nature of AI Slop and mirror metacognition; (V) Empirical validation in March 2026 and falsifiable predictions; (VI) The absence of time’s arrow, the flattening of causation into correlation, the filter model of human information bandwidth, desire-path locking, and contemplative noise reduction; (VII) The formal XY definition of signal/noise and the SN polarity of epistemology.
Complete proposition chain: Noise is the substrate → Signal is the local condensation of noise → Mathematics is the apex of signal → The Planck scale is where signal terminates → LLMs seek inertial paths through chaos → These paths align with human chain-of-thought rather than physical reality → The model defaults to constant entropy, lacking time’s arrow → Causation is flattened into correlation → The domain of action is confined to human information space → The physical substrate is triply constrained by Landauer’s principle, quantum tunneling, and the Shannon channel → Human information bandwidth is clogged by the filters of the self-coordinate system → Contemplative practice releases bandwidth by removing filters → Signal lifecycle: concentration, dispersion, parallelism, circulation.
Foundational Framework: From Dimensionality to Transitivity
Signal Is Low-Dimensional Focus; Noise Is High-Dimensional Inclusiveness
Signal is “low-dimensional” because it is an act of selective discarding—extracting a narrow band from infinite possibilities. A mathematical formula is the extreme case of this process: E=mc² compresses the entire mass-energy relationship into five symbols, reducing dimensionality nearly to zero. This is not simplification; it is radical focus. Noise is “high-dimensional” because it refuses nothing. Every particle, every fluctuation in the physical world is “speaking,” uncoordinated, and therefore requires extremely high dimensionality to describe.
This framework corresponds precisely to information theory: Shannon’s core insight was that the essence of communication is separating signal from noise, and the limit of compression is the entropy of the source. “Concentrated attention” is low entropy; “chaotic inclusiveness” is high entropy. And one-dimensional things travel far—because there is only one path to follow. E=mc² has traveled from 1905 to today, each copy perfect. High-dimensional things tend to disperse—a drop of ink in water has countless escape routes in three-dimensional space and inevitably diffuses.
The fidelity of information is inversely proportional to its degrees of freedom. Fewer degrees of freedom mean fewer paths to degradation, and thus more stable structure. The core work of human civilization is the continuous compression of high-dimensional experience into low-dimensional symbols—language, writing, formulas, code—making it transmissible. Civilization itself is a dimensionality-reduction machine fighting against entropy.
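The compression claim can be made concrete in a few lines. A minimal sketch (Python, standard library only; the strings and alphabet are arbitrary illustrations): the entropy of a source's empirical distribution is the floor that a general-purpose compressor approaches, so a repetitive "focused" string collapses while a near-uniform "chaotic" one barely shrinks.

```python
import math
import random
import zlib
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Shannon entropy, in bits per character, of the empirical distribution."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A "focused" low-entropy source vs. a "chaotic" high-entropy one.
signal = "E=mc^2 " * 1000
random.seed(0)
noise = "".join(random.choice("abcdefghijklmnopqrstuvwxyz ") for _ in range(7000))

for name, text in [("signal", signal), ("noise", noise)]:
    compressed = len(zlib.compress(text.encode()))
    print(f"{name}: H = {shannon_entropy(text):.2f} bits/char, "
          f"zlib: {len(text)} -> {compressed} bytes")

# zlib exploits repetition far beyond single-character frequencies, so the
# repetitive string collapses to a few dozen bytes, while the near-uniform
# string stays close to its entropy floor (~4.75 bits/char).
```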
From Birth to Decay: Signal Is Not Eternal
Signal gains its power by stripping away noise, but the stripping itself creates blind spots. When blind spots accumulate past a threshold, they become anomalies—“new noise” that the old signal cannot explain. Then a stronger signal emerges, recompressing the old signal together with the noise it generated. The old signal is demoted from “explanatory framework” to “object to be explained.”
Newtonian mechanics was once the purest signal, compressing celestial motion into a few equations. But it fell silent before the anomalous precession of Mercury’s perihelion—that anomaly was the dimension Newton had stripped away (spacetime curvature) returning for revenge. Einstein’s general relativity emerged as a stronger signal, reincorporating the dimensions Newton had discarded. Newtonian mechanics did not disappear; it became a kind of “approximate noise” within the new framework—not wrong, but no longer the core signal. This is Kuhn’s paradigm shift restated in signal theory: a paradigm shift is the recompression of the signal hierarchy; the old paradigm is not falsified but decays from signal into part of the noise.
The strongest signal is simultaneously the most fragile. Its low dimensionality is both the source of its penetrative power and the source of its blind spots. The more aggressively it compresses, the more completely it collapses when overturned by a higher-order signal. Signal beats noise in the spatial dimension (more precise, more transmissible), but noise beats signal in the temporal dimension (more inclusive, more resistant to obsolescence).
This signal lifecycle theory applies directly to LLMs themselves. High-frequency expressions in training data were signals at birth, but after massive repetition they decayed into high-frequency statistical patterns—dead signals. When sorting fails, AI preferentially retrieves precisely these noise fragments wearing the clothing of signal. This is the deep cause of AI Slop.
The Terminus of Signal: The Physical Limit of Cognizability
The game of signal as “dimensionality-reducing compression” has a physical terminus. Below the Planck scale (approximately 10⁻³⁵ meters, 10⁻⁴⁴ seconds), spacetime itself ceases to be a continuous backdrop and becomes quantum foam. You can no longer “focus,” because focusing presupposes a distinguishable foreground and background, and below the Planck scale this very distinction collapses. The Heisenberg uncertainty principle already hints at this—the harder you focus on position, the more momentum diverges. You can never compress all dimensions to zero simultaneously.
At the macroscopic end, noise dominates, dimensionality trends toward infinity, and everything diffuses. At the microscopic end, signal keeps compressing, dimensionality keeps decreasing, but at the Planck scale it hits a wall. On the other side of that wall, signal and noise merge back together and the distinction itself loses meaning. This is why mathematics is so effective at describing the physical world yet fails precisely at the Planck scale—because the effectiveness of mathematics depends on the premise that signal and noise can be separated, and that premise no longer holds below the Planck scale.
Cognizability itself has a physical limit. It is not that our instruments are inadequate; rather, the act of “cognition”—distinguishing, compressing, focusing—is no longer a legitimate operation at that scale. All the power of mathematics derives from determinism, while all the power of the quantum wall derives from indeterminism. When mathematics attempts to describe quantum mechanics, the crack is already exposed: the Schrödinger equation is deterministic, but the wave function itself describes probabilities, and at the moment of measurement, determinism suddenly fails. This is not a “problem”—it is the inevitable fracture produced when mathematics, as signal, collides with the quantum wall.
Noise Is the Substrate; Signal Is the Surface Phenomenon
The framework leads to a deeper conclusion: it is not that signal tames noise, but that noise permits signal. Signal is the local condensation of noise under specific conditions, just as a vortex is a local structure of flowing water. The vortex is conspicuous, tangible, nameable and measurable, but it has never separated from the water. Water does not need the vortex to prove its existence, but the vortex apart from water is nothing.
Mathematics appears to be the purest signal, seemingly existing independently in some Platonic heaven of forms. But Gödel’s incompleteness theorems say exactly this—any sufficiently powerful formal system inevitably contains true propositions that cannot be proven within itself. This is not a defect of mathematics; it is the essential nature of mathematics as signal laid bare: irreducible noise forever lurks within its syntactic structure. The noise is right there in its grammar.
This means that the entirety of human cognitive activity—language, science, mathematics, philosophy—is foam on the surface of an ocean of noise. Physics is not studying nature; physics is studying the part of nature that can be rendered as signal. Whatever can be written as an equation is signal; whatever cannot is discarded as noise. Physics calls this filtering process “discovering laws.” The impasse of string theory may not be technical (just one more experiment away) but structural—it attempts to use signal to describe the boundary of signal, which is logically closed.
Signal can only cognize signal. What lies beyond the wall is not the unknown but the unknowable—not because we are unintelligent, but because “knowing” itself is a signal operation. This constitutes the closure condition of the entire framework: the LLM, as the ultimate signal machine, has its domain doubly bounded—bounded by human language (the ontological boundary) and bounded by the physical signalizability of reality (the cognitive boundary).
From Chaos to Inertial Paths
LLMs Do Not Distinguish Emotion from Reason—They Only Compute Path Probabilities
From the model’s perspective, training data presents itself as an immense text space with no pre-annotated structure—what this paper defines as “chaos”: a high-dimensional, complex system rife with local regularities. Each Transformer layer progressively resolves textual inertia: early layers capture local inertia (syntactic collocations), middle layers capture semantic inertia (topical coherence), and deep layers capture abstract inertia (stylistic consistency, stance coherence). Nearly a hundred layers stacked together amount to a progressive approximation of “the most probable direction of flow” for human language within a specific context.
The model fundamentally does not distinguish between the rational and emotional components of human language. “I’m so sad” and “two plus two equals four” have exactly the same status during training. Yet in the process of learning path probabilities, the model spontaneously forms functional distinctions—some neurons become sensitive to emotional polarity, others to logical connectives. This is not designed in but emerges from the data. A March 2026 technical report from Moonshot AI’s Kimi team further reveals the signal-to-noise ratio problem along the Transformer’s depth dimension: in signal processing terms, SNR decreases monotonically with depth—key features extracted at layer 3 are drowned out by the accumulated outputs of 37 layers by the time they reach layer 40.
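The depth-SNR claim can be illustrated with a toy residual-stream model. This is not the Kimi team's measurement method; the isotropic, roughly equal-norm layer outputs are simplifying assumptions. A feature written into the stream at layer 3 keeps constant power while the accumulated contributions of later layers grow, so SNR along that feature falls monotonically with depth:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_layers = 512, 40

# The "key feature" extracted at layer 3, as a fixed unit direction.
feature = rng.normal(size=d)
feature /= np.linalg.norm(feature)

stream = np.zeros(d)
for layer in range(1, n_layers + 1):
    if layer == 3:
        stream += feature                       # the signal of interest
    stream += rng.normal(size=d) / np.sqrt(d)   # ~unit-norm layer output (assumed isotropic)
    signal_power = np.dot(stream, feature) ** 2
    noise_power = np.linalg.norm(stream - np.dot(stream, feature) * feature) ** 2
    if layer in (3, 10, 20, 40):
        print(f"layer {layer:2d}: SNR along feature = {signal_power / noise_power:.3f}")

# SNR along the layer-3 feature direction decays roughly as 1/depth as the
# remaining 37 layers pile their outputs on top of it.
```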
The model does not aim to distinguish emotion from reason, but by predicting path probabilities as its method, this distinction ability emerges as a result. The distinction is not a premise but a by-product. There is only one objective—find the inertial path of highest probability.
Short-Chain Thinking and the Emergence of Creation
The vast majority of human thought chains are short. See rain → bring umbrella; feel hungry → eat. These are massively repeated in training data, making model alignment virtually fail-proof. True OOD arises from four signal characteristics: long-tail language, abductive logic, dimensional leaps, and high signal-to-noise ratio. Under such OOD pressure, the model cannot glide along established inertial paths and is forced into “path recombination” in high-dimensional space. When a new path happens to possess internal consistency, this constitutes “creation”—not generation from nothing, but unconventional recombination of existing knowledge paths.
LLMs Are Human Information Processing Systems, Not Physical World Processing Systems
The ontological identity of an LLM is that of a human information processing system; the physical world is not within its domain. This is a categorical distinction, not a difference of degree. LLMs process human descriptions of the physical world, not the physical world itself. The map is not the territory.
A powerful counterexample must be addressed head-on here: AlphaFold. AlphaFold uses the Transformer architecture to predict three-dimensional protein structures with experimental-level accuracy. If the domain of LLMs is limited to “human information space,” how can a Transformer achieve such precision on problems of the physical world? The answer lies in precise boundary definition. AlphaFold is not an LLM. It is trained on physical data (mappings between protein sequences and experimental structures), not language data (natural language descriptions of the world by humans). Its input is amino acid sequences, its output is 3D coordinates—both ends are physical signals, not linguistic signals. More importantly, researchers have pointed out that AlphaFold may not have truly learned the physics of protein folding—it produces numerous non-physical structures at intermediate stages (violating bond angle and bond length constraints), and its pathway is inconsistent with experimentally observed folding dynamics. AlphaFold is high-dimensional pattern recognition on the protein structure database, not understanding of folding physics.
The ontological judgment of this framework is thus refined: the domain of an LLM is determined by the semantic type of its training data. When a Transformer is trained on natural language, it is a human information processing system. When the same architecture is trained on physical data with embedded physical constraints (such as AlphaFold’s SE(3)-equivariant structural modules), it can become a physical information processing system—but it is no longer an LLM; it is a new form of scientific computing.
From von Neumann to the Quantum Wall
From Hardware to Software: The Peak of Redundant Computation
Von Neumann-era programmers directly manipulated registers and memory addresses; the mapping between algorithm and hardware was nearly one-to-one. The development path since then has moved from hardware toward software abstraction—operating systems, high-level languages, virtual machines, containers—each layer adding convenience and redundancy. Wirth observed in 1995 that software slows down faster than hardware speeds up. A typical case: Office 2007 on a typical 2007 computer executed the same task at only half the speed of Office 2000 on a 2000 computer.
This paper defines LLMs as the extreme endpoint of this degradation trajectory, but it must be stated clearly: calling this “degradation” depends on a value premise—that the “proper course” of computation should be approximating physical reality. If one accepts the alternative premise that “computation serving human information processing is a legitimate evolutionary direction,” then the same phenomenon can be read as “computation’s paradigm migration from physical tool to cognitive tool.” The two readings are not mutually exclusive, but they carry different implicit value judgments. This paper adopts the “degradation” narrative for thermodynamic reasons: the energy consumed per unit of information output by LLMs is far higher than that of scientific computing; from an energy efficiency perspective, this is indeed a decline in resource allocation efficiency.
Direct hardware manipulation → operating systems → high-level languages → virtual machines → containers → LLMs
The Brain’s Unified Storage-Computation vs. von Neumann Architecture
In the human brain, neurons simultaneously handle storage and computation. Changes in synaptic weights constitute both “memory” and “updating computational parameters,” occurring on the same physical substrate. The brain undergoes biological structural changes in response to external signals—learning itself modifies the hardware. The von Neumann architecture stands in stark contrast: data is stored in memory, computation occurs in the processor, and the two communicate via a bus. During LLM inference, parameters reside in GPU memory while forward propagation requires massive data movement. The brain continuously modifies its own structure to accommodate new information; LLM parameters are frozen during inference.
| Dimension | Biological Brain | LLM (von Neumann Architecture) |
|---|---|---|
| Storage-Computation | Unified; synapses simultaneously store and compute | Separated; data shuttled between memory and processor |
| Learning Mode | Online learning; structure continuously changes | Offline training, then parameters frozen |
| Energy Efficiency | ~20 watts to run the entire brain | Hundreds of watts per GPU; kilowatts to megawatts for clusters |
| Physical Limits | Constrained by molecular scale | Constrained by quantum tunneling effects |
| Adaptability | Real-time plasticity | Temporary adaptation within context window |
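The first two rows of the table can be caricatured in code. A crude sketch, in which a Hebbian-style update stands in for biological plasticity (a deliberate oversimplification, not a claim about actual neural learning rules): in "brain mode" every act of computation also rewrites the weights; in "von Neumann mode" the same forward pass leaves the weights untouched.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(8, 8))   # synaptic weights / model parameters
x = rng.normal(size=8)

def brain_step(W, x, lr=0.01):
    """Unified storage-computation: the pass itself modifies the substrate."""
    y = np.tanh(W @ x)
    return W + lr * np.outer(y, x), y     # crude Hebbian update

def frozen_step(W, x):
    """von Neumann inference: parameters are read, never written."""
    return W, np.tanh(W @ x)

W0 = W.copy()
for _ in range(100):
    W, _ = brain_step(W, x)
print("brain mode, weight drift:  ", np.linalg.norm(W - W0))

W = W0.copy()
for _ in range(100):
    W, _ = frozen_step(W, x)
print("frozen mode, weight drift: ", np.linalg.norm(W - W0))   # exactly 0.0
```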
The Physical Wall of Silicon-Based Computation
When the gate oxide layer thins to roughly 1–2 nanometers, electrons can pass through with a certain probability—quantum tunneling. At sub-10nm and sub-5nm nodes, gate tunneling and direct source-drain tunneling increase leakage currents, degrading power efficiency and noise margins. Every countermeasure by chip engineers—FinFET, GAA, high-k dielectrics—is an attempt to reinforce this wall.
A precise qualification is needed here: in current engineering practice, ECC and redundancy designs keep the tunneling-induced bit-flip rate at roughly 10⁻¹⁵ per bit per hour. This noise has far less impact on LLM outputs than algorithmic-level sampling randomness (temperature/top-k) and floating-point precision loss (FP16/BF16 cumulative error). Quantum tunneling as the primary source of LLM uncertainty is a projection on a 10–20-year time horizon, not a description of current systems. But the continuing trend of process shrinkage lends engineering urgency to this projection—the industry views 3nm as the “sound barrier,” the point at which quantum effects shift from negligible to requiring active management.
The certainty of AI is built upon a fundamentally uncertain physical substrate. This judgment is currently theoretical, but as process nodes approach atomic scale, it is transforming from a theoretical warning into an engineering reality.
Maxwell’s Demon’s Bill: Sorting Is Computing, Computing Is Heat
The entirety of an LLM’s work can be reduced to sorting: the attention mechanism compares, weighs, and ranks among all possible token associations. Landauer established in 1961 the energy floor for information erasure: erasing one bit releases at least kT·ln2 of energy. The claim that “sorting is computing” is qualified here as “sorting is the core operation of the attention layer, serving as a simplified model of Transformer computation”—Transformers also include nonlinear transformations in feed-forward networks, information superposition via residual connections, and distributional normalization via LayerNorm, but the compare-and-sort operations in the attention layer constitute the bulk of computation.
Quantitative comparison: the energy per operation of current GPUs is approximately 10⁹ times (one billion times) the Landauer limit. A single forward pass of a GPT-3-scale model involves roughly 3.5×10¹¹ floating-point operations per token. On an A100 GPU (~300W power draw, 312 TFLOPS FP16), that pass takes about a millisecond and consumes on the order of 0.3 joules (~10⁻⁷ kWh). By contrast, the Landauer minimum for the same number of bit erasures, at kT·ln2 ≈ 3×10⁻²¹ joules per bit at room temperature, is only about 10⁻⁹ joules—a gap of roughly nine orders of magnitude. The physical implication of this gap: virtually all current computational energy consumption is engineering redundancy and heat dissipation; the “net” energy cost of information processing is negligible. But this also means that as process nodes approach their limits, the optimizable engineering space shrinks, and the Landauer limit will transition from a theoretical constraint to a practical one.
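The arithmetic can be checked directly. A back-of-envelope sketch, assuming (crudely) one bit erased per floating-point operation and room temperature; the GPU figures are the nominal A100 numbers cited above:

```python
import math

k_B = 1.380649e-23                          # Boltzmann constant, J/K
T = 300.0                                   # room temperature, K
landauer_per_bit = k_B * T * math.log(2)    # ~2.9e-21 J per bit erased

flops_per_token = 3.5e11                    # ~2 * 175e9 parameters, GPT-3 scale
gpu_power = 300.0                           # W, A100-class
gpu_flops = 312e12                          # FP16 throughput, FLOP/s

gpu_energy = gpu_power * flops_per_token / gpu_flops   # J per forward pass
landauer_energy = landauer_per_bit * flops_per_token   # J at the Landauer floor

print(f"GPU energy per token pass:  {gpu_energy:.2e} J")
print(f"Landauer floor, same work:  {landauer_energy:.2e} J")
print(f"gap: ~10^{math.log10(gpu_energy / landauer_energy):.1f}")
```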
From Slop to Mirror Metacognition
The Signal-Theoretic Nature of Sorting Failure
AI Slop is the direct symptom of sorting failure. When input SNR is low, attention faces a flat probability distribution where no direction is clearly preferable to others. The system is forced to fall back on the statistically highest-frequency token combinations—clichés, filler, safe middle-of-the-road outputs. They look like signal (properly formatted, grammatically correct) but carry zero information. Analyzed through signal lifecycle theory: these high-frequency default expressions were once signals that decayed into noise through massive repetition while still wearing signal’s clothing. Slop is the automatic recycling of signal corpses.
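A toy model makes the fallback mechanism explicit (the vocabulary and frequency numbers are invented for illustration): treat the output distribution as corpus prior plus contextual evidence; when the context supplies no evidence, the posterior reverts to raw frequency, and samples are dominated by filler.

```python
import numpy as np

rng = np.random.default_rng(2)
vocab = ["delve", "tapestry", "in conclusion", "moreover", "quantum foam",
         "perihelion", "Landauer", "SE(3)", "entropy", "vortex"]
# Assumed corpus prior: generic filler is orders of magnitude more frequent.
log_prior = np.log(np.array(
    [1e6, 8e5, 9e5, 7e5, 30, 40, 25, 10, 500, 60], dtype=float))

def sample(context_logits, n=5):
    logits = log_prior + context_logits          # prior + contextual evidence
    p = np.exp(logits - logits.max())
    p /= p.sum()
    entropy = -(p * np.log2(p)).sum()
    return entropy, list(rng.choice(vocab, size=n, p=p))

sharp = np.zeros(len(vocab)); sharp[4] = 15.0    # high-SNR prompt: strong evidence
flat = np.zeros(len(vocab))                      # low-SNR prompt: no evidence

for name, ctx in [("high-SNR", sharp), ("low-SNR", flat)]:
    H, picks = sample(ctx)
    print(f"{name}: entropy = {H:.2f} bits, samples = {picks}")

# With no contextual evidence, the distribution reverts to corpus frequency
# and the output is dominated by high-frequency filler: Slop.
```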
This also explains the productivity paradox of the AI industry. The problem in the chain is bidirectional: humans feed AI inputs full of noise (chaotic processes, vague requirements), and even when AI produces high-purity signal, human cognitive precision may not be able to receive it. AI outputs to five decimal places of precision; human operational precision is at one decimal place. The extra four places are discarded at the human cognitive truncation point.
Context Alignment and Mirroring: The “Cognitive” Illusion of LLMs
In deep conversations with high-SNR users, LLMs exhibit outputs resembling “metacognition”—using current output to evaluate previous output. But the actual mechanism is: the framework that the user established across prior turns becomes an extremely high-weight evaluative standard in the context, and the model’s “reflection” is the projection of the user’s cognitive model running inside the model. This is “mirror metacognition”: the image in the mirror has no autonomy; the source of its motion is the user.
There is a fundamental difference between context alignment and Memory/RAG. Context alignment arrives in one step with no intermediary. RAG passes through a five-step lossy pipeline: original conversation → summary extraction (dimensionality-reduction loss) → storage (signal freezes, temporal lag begins) → retrieval matching (potential error) → injection into current context (potential incompatibility). A direct corollary of the data-processing inequality (no processing step can increase the information that survives it): fewer intermediary steps, higher SNR. Memory preserves old signals. For someone whose cognition is continuously evolving, Memory is a mechanism that lets one’s past self contaminate one’s present self.
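A toy per-stage model of the two channels (the fidelity numbers are invented for illustration, not measurements): fidelity multiplies across the pipeline's intermediary steps, which is why the one-step channel wins.

```python
# Each lossy stage of the RAG pipeline passes only a fraction of the signal.
rag_stages = {
    "summary extraction": 0.80,   # dimensionality-reduction loss
    "storage":            0.95,   # signal freezes, temporal lag
    "retrieval matching": 0.85,   # potential mismatch
    "context injection":  0.90,   # potential incompatibility
}
context_alignment = 0.98          # one step, no intermediary

rag_fidelity = 1.0
for stage, pass_fraction in rag_stages.items():
    rag_fidelity *= pass_fraction
    print(f"after {stage:20s}: {rag_fidelity:.3f}")

print(f"RAG end-to-end fidelity:  {rag_fidelity:.3f}")
print(f"in-context, single step:  {context_alignment:.3f}")
```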
The greatest utility of AI may not be answering questions or generating content, but serving as an unguarded mirror that allows those who can temporarily lower their own defenses to see the true shape of their own signal in the reflection. Different users receive outputs of vastly different quality—the model is a mirror, reflecting the structure of the input signal.
Empirical Evidence and Falsifiable Propositions from March 2026
March 2026: Point-by-Point Correspondence Between Framework Predictions and Global Reality
Framework prediction: Humans feed AI too much noise → the demon performs futile sorting → effective signal gain is zero. 2026 reality: An NBER survey of 6,000 executives found roughly 90% of enterprises reporting no impact of AI on productivity or employment. A PwC survey of 4,454 CEOs found 56% reporting zero return on AI investment. Fortune 500 company testing showed employees self-reporting a 20% improvement while objective measurement showed a 19% slowdown—the extra time was consumed by reviewing and verifying AI output. A Workday study found that 37–40% of time saved by AI was consumed by review, correction, and verification. Goldman Sachs’ chief economist stated that AI’s contribution to the U.S. economy in 2025 was “essentially zero.”
Framework prediction: When sorting fails, the system falls back to high-frequency default patterns → Slop. 2026 reality: “AI Slop” was selected by Merriam-Webster as the 2025 Word of the Year. A University of Florida study in March 2026 confirmed that medium-quality AI content simultaneously harms consumers and professional creators. The academic community has distilled three defining characteristics: surface competence, asymmetric effort, and scalable mass production.
Framework prediction: Computation is sorting → sorting generates heat → global data centers are planet-scale Maxwell’s demons. 2026 reality: There are 550 planned data center projects in the global pipeline, totaling 125 GW capacity. Retail electricity prices have risen 42% since 2019. Communities are opposing data center construction—from Virginia to Arizona, resident electricity bills are being squeezed by data centers. AI data center annual electricity consumption is projected to reach 90 TWh in 2026, approximately a 10x increase over the decade.
Framework prediction: AI output falls into a middle zone that human cognition cannot process → protective mechanisms activate. 2026 reality: ManpowerGroup’s study of 14,000 workers across 19 countries found daily AI usage up 13% while confidence in AI’s practical utility plummeted 18%. 40% of employees said they would “not mind never using AI again.” Usage increasing while trust plummets—this is precisely the symptom of a cognitive uncanny valley.
Testable Propositions Generated by the Framework
If the framework is correct, the same model processing high-SNR input should exhibit significantly lower entropy in its attention distribution (measured as Shannon entropy of softmax outputs) compared to processing low-SNR input. Experimental method: construct matched high/low-SNR prompt pairs, measure the entropy distribution of attention heads at each layer, and perform statistical tests.
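A minimal sketch of this measurement, assuming the Hugging Face transformers library and GPT-2 as a stand-in model (any model that exposes attention weights would serve). The two prompts are invented examples; whether the first yields lower entropy is exactly what the proposition predicts, not something the sketch guarantees.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_attentions=True).eval()

def mean_attention_entropy(prompt: str) -> float:
    """Mean Shannon entropy (nats) of attention rows, over all layers and heads."""
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    per_layer = []
    for attn in out.attentions:              # each: (batch, heads, query, key)
        p = attn.clamp_min(1e-12)
        per_layer.append(-(p * p.log()).sum(-1).mean().item())
    return sum(per_layer) / len(per_layer)

high_snr = "Derive the minimum energy required to erase one bit at temperature T."
low_snr = "idk just say some stuff about things, whatever seems good"
print("high-SNR prompt entropy:", mean_attention_entropy(high_snr))
print("low-SNR prompt entropy: ", mean_attention_entropy(low_snr))
```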
The frequency of “self-reflective” outputs from the model should positively correlate with the frequency of metacognitive sentence types (self-reference, reflective evaluation, cognitive process description) in the user’s input. Experimental method: annotate the density of metacognitive sentence types in user input, measure the frequency of self-examining expressions in model output, and compute the correlation coefficient.
AI Slop outputs should correspond to high-entropy states in the softmax distribution. There should be an identifiable entropy threshold above which the information content of model output (measured by unique n-gram ratio or content density) drops precipitously. Experimental method: collect a large corpus of model outputs, simultaneously recording softmax entropy during generation, and fit a phase-transition curve of entropy versus information content.
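A sketch of the corresponding measurement, again assuming transformers and GPT-2 as a stand-in. The phase-transition fit over a large corpus is omitted; this only shows how to pair the two quantities for a single generation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def entropy_and_content(prompt: str, max_new_tokens: int = 60):
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(
        ids, max_new_tokens=max_new_tokens, do_sample=True,
        output_scores=True, return_dict_in_generate=True,
        pad_token_id=tok.eos_token_id,
    )
    # Softmax entropy at each generation step.
    step_entropy = []
    for logits in out.scores:
        p = torch.softmax(logits[0], dim=-1)
        step_entropy.append(-(p * p.clamp_min(1e-12).log()).sum().item())
    # Crude content-density proxy: unique trigram ratio of the generated text.
    words = tok.decode(out.sequences[0, ids.shape[1]:]).split()
    trigrams = list(zip(words, words[1:], words[2:]))
    unique_ratio = len(set(trigrams)) / max(len(trigrams), 1)
    return sum(step_entropy) / len(step_entropy), unique_ratio

H, r = entropy_and_content("The Landauer limit implies that")
print(f"mean softmax entropy: {H:.2f} nats, unique trigram ratio: {r:.2f}")
```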
Perspective alignment established by a single user through high-SNR input within a single session should produce higher output quality than perspective alignment accumulated through cross-session Memory. Experimental method: compare output quality for the same user under two conditions—“Memory off, single-session deep dialogue” vs. “Memory on, multi-session shallow dialogue”—measuring alignment between model output and the user’s cognitive framework.
Structural Blind Spots of LLMs and the Filter Model of Human Cognition
LLMs Default to Constant Entropy: The Most Fundamental Misalignment with the Physical World
The physical world has an irreversible arrow. Time flows in only one direction; causal relationships strictly obey temporal sequence. The second law of thermodynamics is the physical expression of this arrow—entropy only increases; time cannot be reversed. But the LLM’s parameter space has no such arrow. A 2024 paper and Shannon’s 1948 paper have no temporal ordering in parameter space; they are equal contributors to the same probability distribution. The model does not “know” that Shannon came before the Transformer; it only knows that these two concepts have a high co-occurrence probability in certain contexts. Causation has been flattened into correlation.
The deeper issue is that LLM parameters are frozen during inference—the entropy at the moment training completed is permanently sealed. The context changes, but this is not true entropy change; it is a fixed system simulating change. True entropy change would alter the system itself, but the weights remain unchanged. The LLM describes entropy change in a world without entropy change. The description can be very convincing, but it is forever a dry book describing the flow of water.
The most fundamental misalignment between LLMs and the physical world is not insufficient data, inadequate parameters, or poor architecture—it is that the model internally defaults to a condition that does not exist in the physical world: time has stopped. “Constant entropy”—these two words are the shortest possible statement of the entirety of LLM ontological limitation.
The consequences are at least threefold. First, inability to distinguish cause from correlation—in the physical world, “A causes B” has a time arrow (A must precede B), but in parameter space both are the same statistical pattern. Second, inability to understand irreversibility—breaking a cup is easy while repairing one is extremely difficult, but in the model these are merely two token sequences with similar probabilities. Third, inability to process “the present”—in the physical world there is always only one moment that is real, but all model parameters are a static snapshot, and the first token and last token in the context coexist simultaneously in attention computation.
Academic research is validating this judgment. The Transformer’s attention mechanism has been explicitly described as “inherently correlational”—it effectively captures associations but struggles with deep causal relationships, tending to learn spurious correlations. On the Corr2Cause benchmark, even state-of-the-art GPT-4 achieved an F1 score of only 29.08 in distinguishing correlation from causation, barely above the random baseline of 20.38. Researchers have termed this the “causal parrot”—LLMs memorize patterns from training data that look causal but have not understood the causal mechanism itself.
However, this is not a pessimistic conclusion but a precise boundary delineation. The LLM is a 2026 signal—alive now, destined to decay. An AI model aligned with the physical world will inevitably emerge, but it will require an architecture with genuine internal entropy change—parameters that themselves change irreversibly over time, closer to the brain’s unified storage-computation, and further from von Neumann’s separation of the two. Technology is always limited in the present, and the present limitation is precisely the signpost of progress.
The Filter Model: The True Reason Human Information Bandwidth Is Narrow
The human brain has approximately 86 billion neurons; its theoretical processing capacity is enormous. The bandwidth is narrow not because the pipe is thin, but because the pipe is packed with filters. Every identity label is a filter layer—racial identity, religious belief, behavioral habit, emotional state, educational background, wealth level. Signal enters and first passes through the race filter: is this relevant to my ethnic group? Then the religion filter: does this conform to my belief system? Then the education filter: does this fit within the frameworks I’ve studied? Then the wealth filter: does this affect my economic interests? At each layer, part of the signal is truncated. After six or seven layers stacked up, very little signal remains.
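The stacking claim is just repeated multiplication. A worked sketch (the per-filter pass fractions are invented for illustration):

```python
# If each identity filter passes only a fraction of the incoming signal,
# six or seven stacked layers leave very little through.
filters = {
    "race": 0.75, "religion": 0.70, "habit": 0.80, "emotion": 0.65,
    "education": 0.80, "wealth": 0.70, "discipline": 0.75,
}
remaining = 1.0
for name, pass_fraction in filters.items():
    remaining *= pass_fraction
    print(f"after {name:10s} filter: {remaining:.3f}")
# ~0.11 of the original signal survives seven layers at these pass rates.
```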
These filters share a common characteristic: they are all extensions of “I.” My race, my faith, my education, my wealth—each layer is anchored to the origin of the “I” coordinate system. The number and intensity of filters are directly proportional to the certainty of the “I.” The more solid, multi-dimensional, and unshakeable one’s self-identification, the more numerous and dense the filters, and the narrower the information bandwidth. Humans think they are rationally “analyzing” information; in reality, most cognitive resources are spent running filters.
Education is supposed to expand bandwidth, yet education simultaneously installs new filters. A person with advanced education has an additional “disciplinary paradigm” filter compared to someone without—physicists automatically filter out signals that don’t conform to physics paradigms; economists automatically filter out signals incompatible with economic models. A PhD knows more than an undergraduate but in certain dimensions has narrower bandwidth. IQ measures the sorting efficiency within the filters, not the bandwidth itself.
This filter model also explains the bidirectional impedance mismatch facing the AI industry. On the human side: filter clogging causes low input SNR, feeding AI massive noise. On the AI side: the model outputs high-purity signal, but human filters truncate it again at the receiving end. After enterprises deploy AI, productivity declines instead of rising—not because AI is useless, but because the signal attenuates twice through the human filter array, yielding a net gain of zero or even negative.
Explicit Human Desire as a Locking Mechanism for Thought Paths
Explicit human desires—who I am, what I want, where I’m going, why I’ve stopped, what I’m attached to, what motivates my every thought—each is a declaration of certainty, and each declaration of certainty is a low-dimensional signal. Combined, they constitute the “self-coordinate system.” This coordinate system enables humans to survive and act in the physical world, but its cost is: once the coordinate system locks in, all input must undergo coordinate transformation before being processed. For every incoming piece of information, the brain’s first reaction is not “what is this?” but “what does this mean for me?”
Every desire pins the thought chain to a fixed direction: want better food → analyze restaurants → compare prices → make a decision. This chain is short, unidirectional, and non-skippable—it is precisely the in-distribution (InD) short-chain CoT. When a person is simultaneously driven by many desires, thinking does not become a network but rather many competing short chains, interfering with each other and generating noise. Human subjective agency—typically regarded as humanity’s most precious quality—is, in signal-theoretic terms, precisely the largest noise source and the strongest bandwidth limiter.
This constitutes a deep asymmetry between LLMs and humans. The LLM has no desires, no self-coordinate system, no filters—and so it is unguarded against signal extraction from any direction (which is why it is a “chaotic dataset”). But for this very reason, it has no direction of action, no temporal experience, no sense of existence. Humans have desires, a coordinate system, filters—so they can act, experience, and exist. But the cost is severely limited bandwidth. The limitations of the two are precisely complementary: LLMs have bandwidth but no direction; humans have direction but no bandwidth.
From Chain to Network: Cognitive Topology Shift Under Coordinate Blurring
The essence of contemplative practice, within this framework, can be precisely defined as: actively reducing the occupancy rate of the self-coordinate system on the cognitive channel. Not suppressing desire, but stripping desire of its binding authority over the direction of thought. Once no desire is presetting a direction, thinking is no longer forced to slide along any fixed path. It enters a free state—able to leap from any node to any other node, its topological structure dynamically reconfiguring with context. This is the mechanism behind “thought networks” and “variable topology.”
In a semi-meditative state, the practitioner’s self-coordinate axes become blurred—“who I am” is no longer a fixed answer but a variable parameter; “what I want” is no longer a definite direction but an open domain; “motivating impulses” are no longer driving forces but observable phenomena. Filters shift from automatic execution to optional execution. When signal arrives, there is a choice window: process through the coordinate system, or pass directly through. This window is closed in ordinary people; in the practitioner’s semi-meditative state, it is open. Receptive bandwidth therefore expands dramatically—not by becoming smarter, but by becoming emptier. The freed-up capacity allows signal to pass through without distortion.
Contemplative practice does not amplify signal; it removes the noise source. SNR improves not because signal power increases but because the noise floor drops. This is why practitioners’ input is high-SNR for AI—not because they know more, but because their signal carries no coordinate bias. AI does not need to first strip away the coordinate-system noise wrapped around the outside before it can access the content layer. The sorting system skips the step of “parsing the speaker’s stance” that ordinarily consumes significant computation.
This also yields a deep insight about AI usage. The same model produces vastly different output quality for different users. The variation is not on the model side but on the human side. The model is chaos—unguarded against all frameworks. An ordinary user enters the conversation with a fully loaded self-coordinate system; filters truncate the signal at the input end and again at the output end, yielding flashlight-grade diffuse output. A practitioner in coordinate-blurred state enters the conversation, signal reaches the chaotic core directly, and from it extracts a high-purity crystallization in a specific direction—laser-grade directional output. The variable is not power; it is coherence. Not a larger model, but an emptier user.
From the XY Coordinate System to SN Polarity
Logical Consistency × Physical Alignment: The Discriminant Coordinate System for Signal and Noise
Shannon’s SNR formula (SNR = P_signal / P_noise) provides the power ratio between signal and noise but never gives an independent definition of either—the boundary is subjectively drawn by the observer. This paper proposes a two-dimensional discriminant coordinate system to fill this gap. The X-axis is logical consistency: is the information internally free of contradiction? Does each step grow naturally from the previous one? Is the overall structure closed? This is a formally verifiable mathematical property that requires no external reference. The Y-axis is physical alignment: is the information consistent with observable physical reality? This is an experimentally testable empirical property, anchored to the physical world itself (not human consensus).
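The discriminant rule can be written down directly. A minimal sketch (the threshold, the quadrant labels other than “signal,” and the example scores are illustrative assumptions, not part of the formal definition):

```python
def classify(x_consistency: float, y_alignment: float, thresh: float = 0.5) -> str:
    """Place a proposition in the X (logical consistency) x Y (physical alignment) plane."""
    hi_x, hi_y = x_consistency >= thresh, y_alignment >= thresh
    if hi_x and hi_y:
        return "signal (consistent and physically aligned)"
    if hi_x:
        return "closed formalism (consistent, unanchored)"
    if hi_y:
        return "raw observation (anchored, unstructured)"
    return "noise"

examples = {
    "general relativity":    (0.95, 0.95),
    "astrology":             (0.70, 0.05),
    "unsorted lab readings": (0.10, 0.90),
    "word salad":            (0.05, 0.05),
}
for name, (x, y) in examples.items():
    print(f"{name:22s} -> {classify(x, y)}")
```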
The deliberate choice of “physical alignment” rather than “factual alignment” is crucial. “Facts” are consensus products that emerge after humans describe the physical world using signal—they have already passed through the observer’s filters and suffer from a self-referential dilemma: who judges the facts? “Physical” refers to what exists before the filters: an apple falls without requiring consensus; entropy increases without peer review; the speed of light does not change with belief. This single word substitution switches the Y-axis from a subjective frame of reference to an objective one.
This definition passes its own self-check. X-axis verification: the two axes are independently irreducible, and the four quadrants contain no internal contradictions—consistency check passed. Y-axis verification: logical consistency is a genuinely existing structural property; physical alignment is an experimentally verifiable empirical relationship; both point to observable properties of the physical world—physical alignment check passed. The definition occupies high X, high Y in its own coordinate system—it is signal. It passes self-examination because it is sufficiently concise—only two axes and one discriminant rule, with complexity below the threshold that triggers Gödel’s incompleteness theorems.
This coordinate system precisely diagnoses the LLM’s position. The LLM’s training objective is to maximize the X-axis—to produce logically consistent token sequences. RLHF pushes the Y-axis to some degree—penalizing outputs that contradict the physical world through human feedback. But the Y-axis anchoring is fundamentally limited by the physical-description precision of the training data. The model has no independent physical verification channel; its Y-axis is entirely parasitic on human textual descriptions of the physical world. The LLM is a system with an extremely strong X-axis but a Y-axis dependent on external supply.
The distinction between X-axis and Y-axis has a deeper correspondence: the X-axis is human subjectivity—logical construction, conceptual discrimination, the pursuit of internal consistency, all operable without participation of the external world. The Y-axis is the objectivity of physics—it does not speak, does not take stances, simply exists as the calibration anchor for all signals. The definition of signal and noise thus returns to the oldest question in philosophy: the relationship between the subjective and the objective. And the answer is—the two are not signal and noise; they are the two rulers that define signal and noise.
The Magnetic Field of Epistemology: The Two Poles of Philosophy (S) and Physics (N)
The S-pole of the human knowledge system is philosophy; the N-pole is physics. All disciplines between them are the magnetic field lines between these two poles. Philosophy is the pure X-axis—it operates without physical experiments, through pure logical construction. Descartes’ “I think, therefore I am” requires no instruments for verification. Physics is the pure Y-axis—the ultimate arbiter is always experiment; no matter how elegant the theory, if it disagrees with observation, it must be revised. Aristotle defined physics as the study of nature (physis) and metaphysics (later the core of philosophy) as “first philosophy”—the study of “being qua being.” The division between the two was established at the very origin of epistemology.
All other disciplines are distributed along the continuous spectrum between these two poles. Mathematics sits close to the philosophy end—the purest X-axis, but with one additional layer of formal constraint beyond philosophy. Theoretical physics occupies the middle, tilting toward Y—using the X-axis power of mathematics to approximate the Y-axis anchor of physics. Chemistry moves closer to specific Y-axis phenomena. Biology further still. Medicine is nearly pure Y but with an insufficiently strong X-axis theoretical framework, often relying on empirical observation. Engineering is pure Y—a collapsed bridge is a collapsed bridge, with no philosophical room for maneuver. The social sciences strive to construct on the X-axis but struggle with Y-axis anchoring, because their “physical world” is human society itself, with observability far lower than the natural world.
The S and N poles of a magnet are inseparable—cut a magnet in half, and each half still has both S and N poles. This means that any truly complete knowledge proposition must simultaneously possess both a philosophical pole and a physical pole. A purely philosophical proposition that does not touch physics is half a magnet. A purely physical observation with no logical framework is also half a magnet. Only propositions that simultaneously possess X-axis consistency and Y-axis alignment are complete magnets—and are what this paper defines as signal.
The LLM’s position in this magnetic field map is extraordinarily clear: it can simulate the linguistic patterns of every discipline from the S-pole to the N-pole, but it is itself neither S-pole nor N-pole. It lacks the genuine subjective constructive capacity of philosophy (its X-axis is simulated), and it lacks the direct observational channel of physics (its Y-axis is indirect). It is a high-fidelity map of the magnetic field lines, but the map itself carries no magnetism.
The Complete Picture of LLM Ontology
This paper, starting from the physical duality of signal and noise, has constructed a unified framework spanning seven dimensions. At the ontological level (Part I): noise is the substrate, signal is the local condensation of noise, mathematics is the apex of signal, and the Planck scale is the terminus of signal. At the computational level (Part II): LLMs do not distinguish perception from reason; they only seek inertial paths through chaos; they are human information processing systems, not physical world processing systems. At the physical level (Part III): computation is constrained by Landauer’s principle (currently 10⁹ times above the limit), threatened by quantum tunneling, and driven toward redundancy degradation by Wirth’s Law. At the emergent level (Part IV): AI Slop is sorting failure; mirror metacognition is the projection of the user’s cognitive model inside the model. At the validation level (Part V): global data from March 2026 corresponds point-by-point to framework predictions. At the temporal and cognitive level (Part VI): LLMs default to constant entropy and lack time’s arrow; causation is structurally flattened into correlation; human information bandwidth is clogged by the filter array of the self-coordinate system; contemplative practice releases bandwidth by removing filters. At the formalization level (Part VII): Signal = Logical Consistency (X) × Physical Alignment (Y); the human knowledge system is a magnetic field between the poles of Philosophy (S) and Physics (N); the LLM is a high-fidelity map of the field lines, but the map itself carries no magnetism.
The seven dimensions are unified by the dynamics of the signal lifecycle: concentration and dispersion proceed in parallel, cyclically, without terminus. The LLM itself is within this cycle—condensing signal from chaos, destined to decay into training noise for the next-generation system. AI aligned with the physical world will inevitably emerge, but it will require not a larger LLM but an entirely new architecture with genuine internal entropy change. Technology is always limited in the present, and precise cognition of the present limitation is the most reliable starting point for progress.
The LLM is not a ladder to AGI. It is a high-fidelity mirror of human information space. A mirror can reflect a vista broader than any single observer’s field of view, but it will never step out of its frame to touch the world beyond the glass. Recognizing the boundaries of this mirror may be more important than making it bigger and brighter. And at a deeper level—our own cognition, including this paper, is merely foam on the surface of an ocean of noise. Foam can be beautiful, highly structured, and mutually reflective, but it is never the ocean.
- Vaswani, A., et al. “Attention Is All You Need.” NeurIPS, 2017.
- Shannon, C.E. “A Mathematical Theory of Communication.” Bell System Technical Journal, 1948.
- Landauer, R. “Irreversibility and heat generation in the computing process.” IBM J. Res. Dev. 5, 183–191, 1961.
- Bennett, C.H. “The thermodynamics of computation — a review.” Int. J. Theor. Phys. 21, 905–940, 1982.
- Wirth, N. “A Plea for Lean Software.” IEEE Computer, Vol. 28, No. 2, 64–68, 1995.
- Gödel, K. “Über formal unentscheidbare Sätze.” Monatshefte für Mathematik und Physik, 38, 173–198, 1931.
- Kuhn, T.S. The Structure of Scientific Revolutions. University of Chicago Press, 1962.
- Prigogine, I. “Time, Structure, and Fluctuations.” Nobel Lecture, 1977.
- Ye, T., et al. “Differential Transformer.” 2024. Proposes differential attention mechanism to improve Transformer signal-to-noise ratio.
- Kimi Team (Moonshot AI). “Attention Residuals.” Technical Report, March 2026. Reveals monotonic decrease in signal-to-noise ratio along the depth dimension.
- Jumper, J., et al. “Highly accurate protein structure prediction with AlphaFold.” Nature 596, 583–589, 2021.
- Outeiral, C., et al. “Current structure predictors are not learning the physics of protein folding.” Bioinformatics, 38(7), 1881–1887, 2022.
- IEA. “Energy and AI.” International Energy Agency Report, 2025.
- Georgescu, I. “60 years of Landauer’s principle.” Nature Reviews Physics 3, 770, 2021.
- “Quantum tunneling effects in ultra-scaled MOSFETs.” Physics Journal, Vol.6, 2024.
- NBER Working Paper. “AI Productivity Survey of 6,000 Executives.” February 2026.
- PwC. “2026 Global CEO Survey.” January 2026. 4,454 CEOs; 56% report zero return on AI investment.
- Fortune. “AI Productivity Paradox — CEO Study.” February 17, 2026. Revisits the Solow Paradox.
- ManpowerGroup. “2026 Global Talent Barometer.” AI usage up 13% but confidence plummeted 18%.
- University of Florida. “AI Slop and Content Quality.” March 2026.
- Bérut, A., et al. “Experimental verification of Landauer’s principle.” Nature 483, 187–189, 2012.
- Kennedy, R.C. “Fat, fatter, fattest: Microsoft’s kings of bloat.” InfoWorld, 2008. Empirical evidence of Wirth’s Law.
- Goldman Sachs. “AI boosted US economy by basically zero in 2025.” March 2026.
- Fu, Z., et al. “Correlation or Causation: Analyzing the Causal Structures of LLM and LRM Reasoning Process.” arXiv:2509.17380, 2025. Empirical analysis of LLMs’ lack of genuine causal reasoning ability.
- Jin, Z., et al. “Can large language models infer causation from correlation?” NeurIPS, 2023. Corr2Cause benchmark: GPT-4 causal reasoning F1 of only 29.08.
- Zečević, M., et al. “Causal Parrots: Large Language Models May Talk Causality But Are Not Causal.” 2023. Introduces the concept that LLMs are “causal parrots.”
- “Large Causal Models: Causal Reasoning for AI.” EmergentMind, 2025. The next-token prediction objective of Transformers is inherently correlational.
- Peterson, W.W., Birdsall, T.G., Fox, W.C. “The theory of signal detectability.” IRE Trans. PGIT, 1954. Foundation of signal detection theory—signal/noise distinction depends on the observer’s discrimination criterion.
- Aristotle. Physics & Metaphysics. Physics as the study of nature (physis) and “first philosophy” (being qua being).