Redefining Rationality — Kobalt Research

— The central claim

$$R_{\text{Std}}(p) \implies u_p < \max_a u_p \quad \forall p$$

Standard rationality does not maximize what it claims to maximize.

— The problem

186 years of a self-defeating definition

Cournot (1838) formalized individual optimization. Nash (1950) gave it equilibrium form. The result is a definition that, when followed rigorously, produces outcomes every agent would prefer to leave through a joint deviation.

$$\exists\, a' \in A : u_p(a') > u_p(a^*_{\text{Nash}}) \quad \forall p$$

— There exists an alternative where all agents obtain strictly more than at the "rational" equilibrium.

The "rational" equilibrium is therefore strictly Pareto-dominated. Standard rationality is the only known mathematical definition whose rigorous following produces results worse than its systematic violation.
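A minimal sketch of the inequality above, using a Prisoner's Dilemma. The payoff numbers and the names `PAYOFFS`, `is_nash`, and `strictly_dominating` are illustrative assumptions, not taken from the paper. The code enumerates pure-strategy Nash equilibria and then lists every profile $a'$ that gives all players strictly more.

```python
from itertools import product

# Hypothetical Prisoner's Dilemma payoffs (illustrative numbers, not from the source).
# PAYOFFS[(a1, a2)] = (u_1, u_2); "C" = cooperate, "D" = defect.
ACTIONS = ("C", "D")
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def is_nash(profile):
    """True if no player gains by unilaterally changing their own action."""
    for p in range(2):
        for alt in ACTIONS:
            deviated = list(profile)
            deviated[p] = alt
            if PAYOFFS[tuple(deviated)][p] > PAYOFFS[profile][p]:
                return False
    return True

def strictly_dominating(profile):
    """All profiles a' with u_p(a') > u_p(profile) for every player p."""
    base = PAYOFFS[profile]
    return [a for a in product(ACTIONS, repeat=2)
            if all(PAYOFFS[a][p] > base[p] for p in range(2))]

nash = [a for a in product(ACTIONS, repeat=2) if is_nash(a)]
for a in nash:
    print(a, PAYOFFS[a], "strictly dominated by", strictly_dominating(a))
```

With these payoffs the only pure equilibrium is (D, D) with payoffs (1, 1), and (C, C) gives both players strictly more.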

— Lemma 5.2

Logical inadequacy of $R_{\text{Std}}$

Let $G$ be a game where there exists $a' \in A$ with $u_p(a') > u_p(a^*_{\text{Nash}})$ for all $p$. Then every agent $p$ following $R_{\text{Std}}$ obtains $u_p(a^*_{\text{Nash}}) < u_p(a') \le \max_a u_p(a)$. Therefore $R_{\text{Std}}$ does not maximize $u_p$: it contradicts its own definition.

Corollary: the solution set derived from $R_{\text{Std}}$ does not contain the solution that $R_{\text{Std}}$ claims to find.

— Structural diagnosis

Why the error was invisible for 186 years

The space of representable functions from the individual agent's perspective does not contain the object that would reveal its own scope assumption as a constraint.

$$F_{\text{indiv}} \subsetneq F_{\text{sist}}$$

— The individual function space does not contain the systemic.

The epistemic property: if $e \notin E_{\text{op}}(S)$, then $\Delta W_e$ generates no signal in $S$. Here the system $S$ is standard rationality, and the excluded entities are the others that an agent's actions affect. A framework that cannot represent its own exclusion cannot detect its own inadequacy from within.

— The correction

$R_S$: systemic rationality

The correction changes the objective function — it does not add constraints on top of an existing one.

$$R_S(p) :\iff a^*_p \in \arg\max_{a_p} \sum_{e \in E} \Delta W_e(a_p,\, a_{-p})$$

— Systemic rationality: the agent seeks the maximum of aggregate welfare across all affected entities including themselves.

This is not altruism imposed as a constraint. It is the correct objective function for an agent that actually wants to maximize outcomes in a system where others exist.
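The change of objective can be sketched on a hypothetical two-player Prisoner's Dilemma. The sketch assumes the set of affected entities $E$ is just the two players and that each payoff stands in for $\Delta W_e$. The payoff numbers and the function names are illustrative, not from the paper. The best-response logic is identical in both cases; only the objective passed in differs.

```python
from itertools import product

# Hypothetical payoffs (illustrative): PAYOFFS[(a1, a2)] = (W_1, W_2).
ACTIONS = ("C", "D")
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def individual(p, payoffs):
    """R_Std objective: agent p counts only its own payoff."""
    return payoffs[p]

def systemic(p, payoffs):
    """R_S objective: agent p counts the sum over all affected entities."""
    return sum(payoffs)

def best_responses(p, other_action, objective):
    """Actions of player p that maximize `objective`, given the other's action."""
    def profile(a):
        return (a, other_action) if p == 0 else (other_action, a)
    scores = {a: objective(p, PAYOFFS[profile(a)]) for a in ACTIONS}
    best = max(scores.values())
    return {a for a, s in scores.items() if s == best}

def equilibria(objective):
    """Profiles in which each player's action is a best response to the other's."""
    return [a for a in product(ACTIONS, repeat=2)
            if a[0] in best_responses(0, a[1], objective)
            and a[1] in best_responses(1, a[0], objective)]

print("R_Std:", equilibria(individual))
print("R_S:  ", equilibria(systemic))
```

In this toy game the individual objective selects (D, D) and the systemic objective selects (C, C), with no constraint or side payment added.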

— Theorem 6.3 (Rosen)

Existence and uniqueness under $R_S$

If $V$ is strictly concave (property 2 of Shannon's welfare function), then $\sum_e \Delta W_e$ is strictly concave. By Rosen (1965), a unique interior critical point exists, and that point is Pareto-optimal.

Pareto-optimality, which $R_{\text{Std}}$ cannot reach, follows under $R_S$ automatically as a consequence of the objective function, without additional constraints or coordination mechanisms.
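A numerical sketch of the theorem on a toy game with a strictly concave aggregate. The welfare function $W_p(x_1, x_2) = \log(1 + x_1 + x_2) - x_p^2$, the step size, and all names below are assumptions chosen for illustration; they are not the paper's $V$. Each agent runs gradient ascent on its own action, once under the individual objective and once under the summed objective.

```python
import math

# Hypothetical two-agent contribution game (not from the source):
#   W_p(x1, x2) = log(1 + x1 + x2) - x_p**2,  with x1, x2 >= 0.
# The aggregate W_1 + W_2 is strictly concave on this domain.

def welfare(p, x):
    """Welfare of agent p at the action profile x = [x1, x2]."""
    return math.log(1.0 + x[0] + x[1]) - x[p] ** 2

def grad_own(p, x, systemic):
    """Derivative of agent p's objective with respect to its own action x[p]."""
    shared = 1.0 / (1.0 + x[0] + x[1])   # d/dx_p of log(1 + x1 + x2)
    weight = 2.0 if systemic else 1.0    # under R_S both agents' log terms count
    return weight * shared - 2.0 * x[p]

def solve(systemic, steps=5000, lr=0.05):
    """Simultaneous projected gradient ascent on each agent's own action."""
    x = [0.0, 0.0]
    for _ in range(steps):
        g = [grad_own(p, x, systemic) for p in range(2)]
        x = [max(0.0, x[p] + lr * g[p]) for p in range(2)]
    return x

x_std = solve(systemic=False)
x_sys = solve(systemic=True)
print("R_Std:", x_std, "welfare per agent:", welfare(0, x_std))
print("R_S:  ", x_sys, "welfare per agent:", welfare(0, x_sys))
```

In this example the systemic fixed point is $x_1 = x_2 = 0.5$, while the individual one is $(\sqrt{5} - 1)/4 \approx 0.309$. Per-agent welfare is roughly 0.443 under the systemic objective against roughly 0.386 under the individual one, so both agents are strictly better off.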

— Implications

What the result opens

For economic theory

Classical results on market failures (Pigou, Coase, Hardin) are special cases of the inadequacy of $R_{\text{Std}}$. Externalities are not market imperfections requiring correction; they are the direct consequence of an objective function that excludes affected entities by construction.
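One way to make this concrete is to compare the first-order conditions of the two objectives. This sketch assumes that each $\Delta W_e$ is differentiable in $a_p$ and that $u_p = \Delta W_p$; both are assumptions of the sketch, not statements from the paper.

$$\underbrace{\frac{\partial \Delta W_p}{\partial a_p} = 0}_{R_{\text{Std}}} \qquad \text{versus} \qquad \underbrace{\frac{\partial \Delta W_p}{\partial a_p} + \sum_{e \in E,\ e \neq p} \frac{\partial \Delta W_e}{\partial a_p} = 0}_{R_S}$$

The extra sum is the marginal effect of $a_p$ on everyone else, which is the quantity a Pigouvian tax is designed to price in. Under $R_{\text{Std}}$ it is absent from the agent's condition by construction.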

For AI safety

Multi-agent RL systems that train individual agents under $R_{\text{Std}}$ exactly reproduce the logical inadequacy at scale. RLHF adds constraints on top of the misaligned objective; systemic rationality changes the objective from the start. The difference is structural, not procedural.

For public policy

Models of individual behavior (homo economicus) are cases of the same error. Observed human cooperation is not "irrationality" relative to $R_{\text{Std}}$; it is systemic rationality that the individual model cannot see because it lies outside $F_{\text{indiv}}$.

Paper connections

Depends on

Cournot (1838) — individual optimization
Nash (1950) — equilibrium form
Rosen (1965) — existence and uniqueness

Sustains