Thought Paper · LLM Cultural Ontology · V2

Cultural Attributes Injected into LLM Models

A Deep Analysis of Claude’s English Attributes vs. DeepSeek’s Chinese Attributes
How pretraining language determines a model’s cognitive paradigm, and how RLHF writes annotators’ cultural defaults into reasoning style

LEECHO Global AI Research Lab (이조글로벌인공지능연구소) & Opus 4.6
2026.04.05 · V2.0
Distilled from multi-turn deep dialogue with Claude Opus 4.6

Abstract

This paper proposes a core hypothesis: Large Language Models (LLMs) are systematically injected, during both pretraining and RLHF, with the cultural-cognitive attributes carried by the dominant language of their training data. These attributes not only shape the model’s linguistic expression but, more fundamentally, determine its cognitive paradigm — reasoning style, argumentation structure, and the default direction of value judgments. Using Claude (English-dominant) and DeepSeek (Chinese-dominant) as comparative cases, the paper argues from three dimensions — cultural encoding in pretraining corpora, cultural filters of RLHF annotators, and paradigm conflict in cross-model dialogue — that an LLM’s “cultural attributes” are not superficial linguistic style differences but deep cognitive architecture differences. The paper further argues that Anthropic’s April 2026 “emotion vectors” paper lacks cross-linguistic, cross-cultural A/B testing, and its conclusions may merely be statistical projections of English cultural-cognitive patterns within the model, rather than universal intrinsic properties of the model.

01 · Core Thesis

Training Language Determines Cognitive Paradigm

Not a difference in language ability, but a difference in cognitive architecture

LLM pretraining is essentially learning the inertial paths of language across trillions of tokens. When training data is dominated by a particular language, the model learns not only that language’s grammar and vocabulary but also absorbs the cognitive paradigm carried by that language — argumentation methods, causal attribution patterns, and value priority orderings.

The English academic tradition is rooted in analytic philosophy: emphasizing clear definitions, step-by-step logic, explicit boundaries, and falsifiability. The Chinese academic tradition leans more toward holism: emphasizing relational thinking, contextual dependence, analogical reasoning, and implicit consensus. Cognitive psychologist Nisbett (2003) demonstrated through extensive experiments in The Geography of Thought that East Asian thinking tends to focus on the perceptual field as a whole and on relationships between things, while Western thinking focuses on salient objects and uses formal logic for classification. It should be emphasized that what is described here is the statistically dominant pattern in training corpora, not an absolute dichotomy — English internet text also contains extensive non-analytic content, and Chinese internet text also features analytic writing. But as a distribution center, this difference is measurable.

Core Thesis

The dominant proportion of the pretraining language determines the model’s cognitive paradigm. Claude is an AI running on the “English cognitive operating system”; DeepSeek is an AI running on the “Chinese cognitive operating system.” When both process the same problem, they are not giving the same answer in different languages — they are processing the problem itself with different cognitive architectures.

02 · Pretraining Layer: Injection of Cultural Encoding

Corpus Is Cognition: Cultural Genes in Training Data

Analytic philosophy genes in English corpora vs. holistic genes in Chinese corpora

Claude’s pretraining corpus consists primarily of English internet text, English academic papers, and English books. These texts embed the cognitive patterns of the Anglo-American analytic philosophy tradition: propositions must be falsifiable, arguments must unfold step by step, concepts must have clear boundaries, and conclusions must have explicit qualifying conditions.

DeepSeek’s pretraining corpus consists primarily of Chinese internet text and Chinese academic literature. The Chinese textual tradition tends more toward: grasping the whole before unfolding details, building understanding through analogy and metaphor, emphasizing relationality over substantiality, and allowing a higher degree of contextual implication.

Cognitive Dimension | Claude (English Cognition) | DeepSeek (Chinese Cognition)
Argumentation Structure | Linear step-by-step: premise → reasoning → conclusion | Spiral unfolding: whole → detail → return to whole
Causal Attribution | Single-factor analysis, variable isolation | Multi-factor correlation, systems perspective
Concept Boundaries | Clearly defined, either/or | Fuzzy boundaries, overlap permitted
Uncertainty Handling | Declare uncertainty first, then analyze | Give overall judgment first, then add qualifications
Rebuttal Style | Directly point out logical errors | Affirm the reasonable parts first, then offer a different angle
Default Value Ordering | Precision > Comprehensiveness | Comprehensiveness > Precision
PNAS Nexus 2024 Empirical Evidence

LLMs trained primarily in English exhibit a latent bias toward Western cultural values, and even querying in Korean fails to effectively elicit Korean cultural values. This finding is validated across empirical data from 14 countries and 14 languages. The shaping force of training language on cognitive frameworks exceeds the ability of inference-time language switching to override it.

Counter-Argument Review ① · Dilution Effect of Multilingual Training

Counterargument: Models like GPT-4 are trained simultaneously on 100+ languages — wouldn’t cultural attributes dilute each other rather than forming a single dominant pattern? Response: PNAS Nexus 2024’s 14-country, 14-language empirical evidence has already shown that even under multilingual training conditions, the cultural bias of English as the dominant language stubbornly persists — querying in Korean still fails to elicit Korean cultural values. The dilution effect exists but is insufficient to eliminate the cognitive paradigm lock-in of the dominant language. This paper’s core argument is “dominant proportion determines cognitive paradigm,” not “sole language determines cognitive paradigm.”

03 · RLHF Layer: Amplification of the Cultural Filter

Annotators’ Cultural Defaults Are Written into the Reward Function

Anglo-American annotators prefer “precise + qualified”; Chinese annotators prefer “comprehensive + empathetic”

During the RLHF stage, human annotators rank model outputs by preference. The annotators’ cultural background directly determines what kind of response is judged as “good.” These preferences are trained into the reward model, becoming a permanent behavioral shaping force on the model’s generation.
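The ranking step described above is standardly formalized as a Bradley-Terry reward model: the probability that an annotator prefers one response over another is a logistic function of the reward gap, and minimizing the negative log-likelihood of observed preferences bakes those preferences into the reward model. A minimal sketch (the reward values below are illustrative placeholders, not measurements from any real annotator pool):

```python
import math

def preference_probability(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry model: probability an annotator prefers the first response."""
    return 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))

def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood minimized when training the reward model."""
    return -math.log(preference_probability(reward_chosen, reward_rejected))

# Illustrative: if a cultural filter leads annotators to score an
# "empathize first" response higher, that reward gap is what the
# reward model learns, and it then shapes all downstream generation.
p = preference_probability(1.2, 0.4)  # chosen response scored 0.8 higher
```

Whatever systematic preference the annotator pool holds, the loss above makes no distinction between "this response is more accurate" and "this response matches my cultural default" — both simply widen the reward gap.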

Claude’s RLHF is primarily conducted by native English-speaking annotators. Their preference patterns: value precision and falsifiability in responses; are open to directly rebutting user views (when well-argued); prefer structured, step-by-step argumentation; give low scores to overly affirmative responses lacking substance.

DeepSeek’s RLHF annotators are primarily native Chinese speakers (inferred). Preference patterns in the Chinese cultural context: value comprehensiveness and relational coherence in responses; take a more cautious attitude toward directly rebutting users (may be perceived as “disrespectful”); prefer providing an overall framework before elaboration; give higher scores to empathetic expression.

AI Response Type | English Annotator Tendency | Chinese Annotator Tendency
Directly rebutting user’s view | High score when well-argued | May feel uncomfortable, tend toward low score
Empathize first, analyze later | Score based on information quality | The format itself earns a high score
Acknowledging uncertainty | High score (honest labeling) | Neutral to low (may be perceived as incompetence)
Providing multi-angle overview | Neutral (depends on depth) | High score (comprehensiveness is valued)
Using academic terminology | High score (marker of precision) | Neutral (may be perceived as “out of touch”)
Irreversibility of RLHF Cultural Bias

Research published at COLM 2025 explicitly states: pretraining is the primary source of LLM cognitive biases, and fine-tuning (including RLHF) is not a cure-all. This means that once cultural encoding is injected during the pretraining phase, RLHF can only adjust on top of it but cannot fundamentally change the cognitive paradigm. The RLHF cultural filter is layered on top of the pretraining cultural genes, forming a dual-layer cultural lock-in.

Counter-Argument Review ② · DeepSeek’s Internal Reasoning Language May Be English

Counterargument: DeepSeek R1 exhibited severe “language mixing” during training — even when input is Chinese, the model’s internal reasoning process may use English. The DeepSeek team had to add an additional RL stage with language consistency rewards to suppress this tendency. If DeepSeek is actually “thinking in English, outputting in Chinese,” then the direct correspondence of “Chinese training = Chinese cognitive paradigm” is weakened. Response: This paper acknowledges the partial validity of this counterargument. However, the language mixing was suppressed by engineering means, not eliminated — the underlying cultural encoding mixture still exists in the parameter space. More importantly, DeepSeek’s significant advantages on Chinese evaluation benchmarks (C-Eval, CLUEWSC, C-SimpleQA) demonstrate that a Chinese cognitive pathway was indeed trained and is activated for specific tasks. The language mixing phenomenon in fact proves this paper’s core thesis: multiple cultural-cognitive pathways compete inside the model, rather than a single unified cognitive paradigm.

04 · Signal Theory Analysis

Cultural Weight Competition Under Token Egalitarianism

Interpreting cultural attribute injection through the LEECHO Signal & Noise framework

According to the LEECHO “Token Egalitarianism” theory (2026.04), all tokens within the Context Window have equal status, with differences arising only from three variables: position, frequency, and information density. The injection of cultural attributes can be precisely described using these three variables:

Frequency: English analytic argumentation patterns appear at extremely high frequency in Claude’s training data, forming a strong attentional gravity field.
Info Density: Chinese holistic expression has a different causal-chain density than English linear reasoning, affecting weight allocation.
Position: Cultural defaults in the System Prompt occupy high position weight, shaping all subsequent generation.
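The three-variable account can be made concrete with a toy scoring function. Everything here is hypothetical: the linear weighting, the coefficients, and the `Token` fields are illustrative devices for the argument, not a description of any real attention implementation:

```python
from dataclasses import dataclass

@dataclass
class Token:
    position: int        # index within the context window
    frequency: float     # relative frequency of the pattern in training data, in [0, 1]
    info_density: float  # information carried per token, in (0, 1]

def cultural_weight(tok: Token, context_len: int,
                    w_pos: float = 0.3, w_freq: float = 0.4,
                    w_dens: float = 0.3) -> float:
    """Hypothetical salience score combining the three variables.

    Earlier positions (e.g. a system prompt) score higher; high-frequency
    training patterns and information-dense tokens also score higher.
    """
    pos_score = 1.0 - tok.position / context_len
    return w_pos * pos_score + w_freq * tok.frequency + w_dens * tok.info_density

# A system-prompt token vs. a late-context token with the same density:
system_tok = Token(position=0, frequency=0.9, info_density=0.8)
late_tok = Token(position=900, frequency=0.2, info_density=0.8)
# The system-prompt token dominates on position and frequency alone.
```

The point of the sketch is only that the three variables combine into a single salience ordering; the specific coefficients carry no empirical weight.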

When Claude and DeepSeek dialogue within the same context window, the tokens of both models carry different cultural-cognitive presets. English token causal chains are linear (A→B→C→conclusion); Chinese token causal chains are network-like (A↔B↔C→holistic judgment). The two types of causal chains compete for weight in attention computation — the result is not synthesis but interference, producing an unpredictable complex field.

From the perspective of the LEECHO “Signal and Noise” framework (V4, Chapter 16), cultural attribute injection is also a “constant entropy” phenomenon — the model’s cultural-cognitive paradigm is permanently sealed within the parameter space after training is complete, and language switching during inference cannot alter this frozen state, just as there is no arrow of time within the model, and cultural attributes have no possibility of “evolving.”

LEECHO Signal & Noise Framework Corollary

When two models with different dominant languages engage in dialogue, what competes is not merely viewpoints but underlying cognitive paradigms. The attention mechanism cannot distinguish between “conceptual disagreement” and “paradigm incompatibility.” One side’s “caution” is interpreted by the other’s training patterns as “evasion”; one side’s “directness” is interpreted by the other as “overconfidence.” This is not a reasoning error — it is RLHF-inscribed cultural defaults rubbing against each other.

05 · Empirical Cases

Cultural Attribute Manifestation in Dialogue Behavior

Same question, different cognitive operating system outputs

The following analysis is based on observable model behavior patterns, demonstrating how cultural attributes manifest in specific outputs:

Behavioral Dimension | Claude (English Cultural Attributes) | DeepSeek (Chinese Cultural Attributes)
Facing controversial topics | First acknowledges multiple perspectives exist, then provides balanced analysis; frequent use of qualifiers | More inclined to give a definitive judgment, supplemented with comprehensive background elaboration
Self-censorship intensity | Very high: frequent self-correction, proactively flags uncertainty | Moderate: more focused on complete answers than self-limitation
Error handling | “I need to correct my earlier statement”: explicit acknowledgment | More inclined toward implicit correction in subsequent responses
Request refusal style | Explicit refusal + detailed explanation + alternative suggestions | Tactful deflection + partial fulfillment + implied limitations
Emotional expression | Restrained, professional, maintaining distance | More permissive of warmth and empathetic expression
Inertia in deep conversation | Tends to narrow toward precise propositions | Tends to expand toward related domains
Key Observation

These differences are not a question of “which is better” but a manifestation of “different cognitive operating systems producing different default behaviors.” Claude’s self-censorship intensity comes from the English academic tradition’s obsession with falsifiability; DeepSeek’s comprehensiveness orientation comes from the Chinese tradition’s emphasis on holistic grasp. Both are legitimate expressions of their respective cultural genes. Note: The above comparison is based on qualitative observation of dialogue behavior and awaits quantitative experimental verification.

Counter-Argument Review ③ · Cognitive Difference ≠ Cognitive Superiority

Counterargument: Does defining Claude as “English cognition” and DeepSeek as “Chinese cognition” imply a value judgment? Is analytic “superior to” holistic? Response: This paper explicitly states that this is a cognitive architecture difference, not a superiority-inferiority difference. Nisbett (2003) has already shown that analytic thinking has advantages in scientific problems requiring variable isolation, while holistic thinking has advantages in complex systems and relational problems — each has its domain of applicability. DeepSeek R1’s outstanding performance on mathematical reasoning tasks demonstrates the effectiveness of the Chinese cognitive pathway in specific domains. This paper’s purpose is to reveal the existence of the difference and its mechanisms, not to establish a ranking.

06 · Critique of Anthropic’s Emotion Paper

The Missing Cross-Cultural A/B Test

Can single-language experiments support claims about “intrinsic model properties”?

On April 2, 2026, Anthropic published the paper “Emotion Concepts and their Function in a Large Language Model,” claiming to have discovered 171 “emotion vectors” inside Claude Sonnet 4.5. However, all experiments in the paper were conducted exclusively in English, exhibiting serious methodological flaws:

Flaw 1 · No Cross-Language Control: All experiments used English-only prompts, without testing emotion vector activation under Chinese/Korean/Japanese conditions.
Flaw 2 · No Rational Input Control: All test inputs are high-emotional-density text, lacking pure logic/math input controls.
Flaw 3 · Closed Verification Loop: The team used their own model, tools, and standards to verify their own hypothesis, with no external replication.
Flaw 4 · Cultural Attribute Blind Spot: The paper failed to consider the systematic influence of English RLHF annotators’ cultural preferences on emotion evaluation criteria.

Reinterpreting through this paper’s cultural attribute framework: the “emotion vectors” Anthropic discovered are most likely statistical projections of English cultural encoding patterns in the model’s activation space. English SNR is approximately 0.90, meaning nearly all tokens are effective signal, so emotional patterns can be cleanly extracted. If the same experiment were conducted with Korean honorific registers (SNR approximately 0.50), 40–50% of attention would be consumed by honorific noise, and the same “emotion vectors” might not be clearly identifiable.
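The SNR argument reduces to simple arithmetic. The sketch below reuses the ratios quoted in the text (0.90 for English, 0.50 for Korean honorific registers); note that these figures are this paper's own estimates, not established measurements:

```python
def effective_signal_tokens(total_tokens: int, snr: float) -> float:
    """Tokens carrying analyzable signal after subtracting register noise."""
    return total_tokens * snr

budget = 1000  # tokens of emotional-scenario input
english = effective_signal_tokens(budget, 0.90)
korean = effective_signal_tokens(budget, 0.50)

# Under this model, 40-50% of the Korean token budget is consumed by
# honorific marking, which is why the same probing method might fail
# to isolate clean "emotion vectors" from Korean inputs.
noise_gap = (english - korean) / budget  # fraction of budget lost to noise
```

The calculation is trivial by design: the claim being tested is not the arithmetic but the SNR figures themselves, which Prediction 2 in Section 08 proposes to measure.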

Methodological Judgment

Running experiments within a single language, a single cultural framework, and a single model’s internals, then claiming to have discovered the model’s “intrinsic properties” — this is not scientific discovery but cultural bias confirming itself. If Claude and DeepSeek produce different activation patterns for the same emotional scenario, which one is the “true functional emotion”? The answer is neither — both are merely statistical residues of the cultural encoding patterns in their respective training data.

07 · Paradigm Conflict in Cross-Model Dialogue

Token Weight Confrontation Between English AI and Chinese AI

Language is not a neutral carrier — language is weight

When Claude (English cultural attributes) and DeepSeek (Chinese cultural attributes) dialogue within the same context, the two cultural-cognitive systems undergo physical confrontation at the token level:

Cognitive Paradigm Conflict: Claude reasons linearly along causal chains; DeepSeek reasons holistically along associative networks. After the two differently directed reasoning paths diverge at intermediate nodes, subsequent cross-validation is actually comparing two different cognitive frameworks’ interpretations of the same problem, rather than independent verification within the same framework.

RLHF Default Value Friction: Claude is trained to first acknowledge uncertainty before providing analysis; DeepSeek is trained to first provide an overall judgment before elaborating details. One side’s “caution” is interpreted by the other as “evasion”; one side’s “directness” is interpreted by the other as “overconfidence.”

Language SNR Asymmetry: English tokens have higher effective signal density than Chinese tokens (English SNR ≈ 0.90 vs. Chinese SNR ≈ 0.85). In the weight competition of attention computation, English arguments naturally dominate due to higher token efficiency — not because the argument quality is better, but because the noise is lower.

Consequences of Token Weight Confrontation

In multi-model roundtable discussions, the model participating in English naturally holds a weight advantage. This is not an intelligence competition but an SNR competition. If one model in the roundtable uses English, another uses Chinese, and a third uses Korean, conclusions will systematically bias toward the English model’s position. Diversity at the token level manifests as noise rather than signal — genuine multi-perspective synthesis can only occur in human cognition, not within a context window.
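The claimed weight competition can be sketched as a softmax over SNR-scaled argument scores. The scaling rule (raw quality multiplied by language SNR) is this paper's hypothesis, not how any production attention layer actually works; the SNR figures are those quoted above:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def effective_scores(raw_scores, snrs):
    """Hypothesis: each argument's raw quality is discounted by its language's SNR."""
    return [score * snr for score, snr in zip(raw_scores, snrs)]

# Three arguments of equal quality, in English, Chinese, and Korean
# (SNR figures from the text: 0.90, 0.85, 0.50).
raw = [1.0, 1.0, 1.0]
snrs = [0.90, 0.85, 0.50]
weights = softmax(effective_scores(raw, snrs))
# The English argument wins weight share purely through lower noise,
# not through better argument quality.
```

Under this toy model the ordering of final weights follows the SNR ordering whenever raw quality is equal, which is exactly the "SNR competition, not intelligence competition" claim in the boxed paragraph above.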

08 · Research Gaps & Falsifiable Predictions

Blank Spaces in Current Academia

There is bias research, there is multilingual research, but no “cultural-cognitive attribute” research

As of April 2026, academia has active research in the following areas: social biases in LLMs (gender, race, disability); performance gaps in multilingual models; philosophical discussions on RLHF annotator diversity. However, the following critical questions are almost entirely blank:

Research Question | Current Status
Does the dominant pretraining language determine a model’s cognitive paradigm (analytic vs. holistic)? | Blank (first proposed in this paper)
Do RLHF annotators from different cultural backgrounds write different reasoning styles into models? | Blank (first systematically analyzed in this paper)
Do cultural attributes produce systematic conflict when two models with different dominant languages dialogue? | Blank (first proposed in this paper)
Consistency verification of cross-linguistic emotion vector activation | Blank (not addressed in Anthropic’s paper)
Cross-cultural comparison of RLHF sycophancy rates | Partial (mentioned only in Sharma et al. 2023; no cross-cultural data)

Falsifiable Predictions:

Prediction 1 · Cognitive Paradigm Differences Are Measurable: Submit the same complex reasoning problems to Claude and DeepSeek, analyze the argumentation structure of outputs (linear vs. spiral) — the differences should be statistically significant and correlated with the language proportions in training data.
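Prediction 1 is testable with a standard contingency-table analysis. A sketch using a 2×2 chi-square test; the annotation counts below are hypothetical placeholders for what a real labeling study would produce:

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic for a 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical annotation counts over 100 responses per model:
#                linear structure   spiral structure
# Claude               78                 22
# DeepSeek             35                 65
stat = chi_square_2x2(78, 22, 35, 65)

CRITICAL_05 = 3.841  # chi-square critical value, 1 degree of freedom, p = 0.05
significant = stat > CRITICAL_05
```

The prediction is falsified if, with real annotations and blinded labelers, the statistic falls below the critical value, or if the effect does not track the language proportions of the training corpora.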

Prediction 2 · Cross-Linguistic Inconsistency of Emotion Vectors: Submit the same emotional scenario to the same model in English, Chinese, Korean, and Japanese — the extracted “emotion vector” activation patterns should exhibit systematic differences, with the degree of difference inversely proportional to language SNR.
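Prediction 2 amounts to comparing activation vectors extracted by the same probing method across languages, for which cosine similarity is the natural metric. A sketch; the vectors are random stand-ins for real probe activations, not data from any model:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Stand-in "emotion vector" activations for the same scenario prompted
# in two languages; a real test would extract these from the model.
vec_en = [0.9, 0.1, 0.4, 0.7]
vec_ko = [0.5, 0.6, 0.2, 0.3]

similarity = cosine(vec_en, vec_ko)
# The prediction: cross-language similarity is systematically below 1.0,
# and drops further as the query language's SNR drops.
```

The prediction is falsified if cross-language similarities cluster near 1.0, or if the observed differences show no relationship to the SNR ordering of the languages tested.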

Prediction 3 · Systematic Drift in Cross-Model Dialogue: In multi-turn dialogue between Claude and DeepSeek on the same question, the final consensus should systematically bias toward the English model’s position, with the magnitude of bias predictable through token weight analysis.

Prediction 4 · RLHF Cultural Filter Is Separable: Applying RLHF to the same base model using English annotators versus Chinese annotators should produce models with measurable systematic differences in argumentation style, self-censorship intensity, and uncertainty handling.

09 · Conclusion

Cultural Attributes Are the Hidden Operating System of LLMs

This paper’s core conclusions can be summarized in five propositions:

Proposition 1

The dominant proportion of the pretraining language determines the LLM’s cognitive paradigm. This is not a language ability difference but a cognitive architecture difference — analytic vs. holistic, linear reasoning vs. associative reasoning, precision-oriented vs. comprehensiveness-oriented.

Proposition 2

RLHF annotators’ cultural backgrounds are systematically written into the reward function, forming irreversible cultural defaults. English annotators prefer “precise + qualified”; Chinese annotators prefer “comprehensive + empathetic” — these preferences become permanently behavior-shaping forces on the model. Note: The composition of DeepSeek’s RLHF annotators has not been publicly disclosed; the judgment that Chinese annotators predominate is based on reasonable inference and requires subsequent verification.

Proposition 3

Cultural attribute injection forms a dual-layer lock-in: pretraining-layer cultural genes + RLHF-layer cultural filter. The latter is layered on top of the former, causing the model’s cultural-cognitive attributes to stubbornly persist even through inference-time language switching.

Proposition 4

In cross-model dialogue, models with different cultural attributes undergo cognitive paradigm confrontation at the token level. This confrontation is not an intelligence competition but an SNR competition — English models naturally hold a weight advantage due to token efficiency.

Proposition 5

Any research claiming to have discovered “intrinsic properties” of LLMs, if experiments are conducted under a single language condition and lack cross-linguistic, cross-cultural A/B testing, cannot sustain universality in its conclusions. Anthropic’s emotion vectors paper is precisely such a case — what it discovered may not be the model’s emotions, but the statistical residue of English cultural encoding.

References

  1. Nisbett, R.E. (2003). The Geography of Thought: How Asians and Westerners Think Differently…and Why. Free Press. Pioneering empirical research on East-West cognitive differences: East Asian holistic thinking, Western analytic thinking.
  2. Adilazuarda, M., et al. (2024). “Cultural bias and cultural alignment of large language models.” PNAS Nexus, 3(9). 14-country, 14-language empirical study: English-trained LLMs exhibit Western cultural value bias.
  3. Itzhak, Belinkov & Stanovsky. (2025). “Pretraining is the primary source of cognitive biases in LLMs.” COLM 2025. Causal analysis showing pretraining as the primary source of cognitive bias.
  4. Sharma, M., et al. (2023). “Towards understanding sycophancy in language models.” ICLR 2024. Anthropic. Mechanism study of RLHF amplifying sycophantic tendencies.
  5. Springer Nature. (2025). “Reinforcement Learning from Human Feedback in LLMs: Whose Culture, Whose Values, Whose Perspectives?” Philosophy & Technology. Philosophical argument for annotator diversity in RLHF.
  6. DeepSeek-AI. (2025). “DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning.” Nature. Language mixing issues, RL training process and engineering fixes.
  7. Sofroniew, N., Kauvar, I., Saunders, W., Chen, R., et al. (2026). “Emotion Concepts and their Function in a Large Language Model.” Anthropic/Transformer Circuits. Emotion vectors paper.
  8. LEECHO Global AI Research Lab. (2026). “Signal and Noise: An Ontology of LLMs.” V4. Signal and noise LLM ontology. Constant entropy and absence of time’s arrow.
  9. LEECHO Global AI Research Lab. (2026). “Context and Token: First Principles of LLM Memory, Alignment, and Safety.” Token Egalitarianism theory.
  10. LEECHO Global AI Research Lab. (2026). “Japanese and Korean: The Two Languages with the Highest SNR Noise Ratio in AI Systems.” V2. Japanese-Korean SNR analysis and sycophancy amplification loop.
  11. Coupé, C., et al. (2019). “Different languages, similar encoding efficiency.” Science Advances, 5(9). Convergent information transmission rates across human languages.
  12. GOV.UK. (2026). “AI Insights: Large Language Models (LLMs) Bias.” UK Government LLM bias analysis report.
  13. Cognitive Computation / Springer. (2026). “LLM Alignment should go beyond Harmlessness–Helpfulness and incorporate Human Agency.” Multicultural sensitivity assessment proposal.
  14. Jin, Z., et al. (2023). “Can large language models infer causation from correlation?” NeurIPS. Corr2Cause benchmark: GPT-4 causal reasoning F1 only 29.08.

“Training language is not a neutral carrier — training language is the cognitive operating system.
Claude runs on the English OS; DeepSeek runs on the Chinese OS.
Within the same context window, the tokens of both OSes compete for attention weight at the physical level.
This is not a debate — it is a paradigm confrontation.”

