Reasoning Emerges from Constrained Inference Manifolds in Large Language Models

Yanbiao Ma1, Fei Luo1, Linfeng Zhang2,3, Chuangxin Zhao2, Mingxuan Wang1, Yinan Wu2, Zhe Qian1, Yang Lu4, Long Chen3, Zhao Cao1, Xiaoshuai Hao†‡3, Ji-Rong Wen†1, Jungong Han†2
1Renmin University of China 2Tsinghua University 3Xiaomi EV 4Xiamen University
† Corresponding authors   ‡ Project leader
ybma1998@ruc.edu.cn   haoxiaoshuai@xiaomi.com   jrwen@ruc.edu.cn   jghan@tsinghua.edu.cn

Abstract

Reasoning in large language models is predominantly evaluated through labeled benchmarks, conflating task performance with the quality of internal inference. Here we study reasoning as an intrinsic dynamical process by examining the evolution of internal representations during inference.

We find that inference-time dynamics consistently self-organize into low-dimensional manifolds embedded within high-dimensional representation spaces. Such geometric compression is pervasive, but it is not sufficient for stable or reliable reasoning. Effective reasoning dynamics emerge within a constrained structural regime characterized by three conditions: adequate representational expressivity, spontaneous manifold compression, and preservation of non-degenerate information volume within the compressed subspace.

Based on these insights, we introduce a unified, label-free diagnostic computed solely from internal dynamics. The findings suggest that reasoning in LLMs is fundamentally governed by geometric and informational constraints, offering a complementary framework to benchmark-centric assessment.

Key Findings

01

Reasoning trajectories collapse into manifolds

Inference-time hidden states rapidly concentrate on compact, low-dimensional trajectories despite living inside high-dimensional representation spaces.

02

Compression alone is not enough

Models with similarly low intrinsic dimensionality can exhibit very different reasoning behavior, so healthy reasoning requires additional structural constraints.

03

A label-free diagnostic exposes reasoning health

The proposed score combines world expressivity, stimulus-induced dimensionality, and information volume without using task labels or reference answers.

Constrained Inference Manifolds

The paper reframes reasoning as a dynamical process unfolding in representation space during generation. Across model families, scales, and prompts, internal trajectories become low-dimensional during inference while the underlying vocabulary embedding space remains highly expressive.

Robust reasoning appears when three constraints are jointly satisfied: the model retains broad representational expressivity, inference dynamics organize into compact manifolds, and those manifolds preserve non-degenerate information volume.

Label-free reasoning health diagnostic

D_world measures representational expressivity, D_stim measures stimulus-induced manifold dimensionality, and V measures information volume preserved during inference.

Paper Figures

Figure 1: Inference-time reasoning dynamics self-organize into low-dimensional manifolds

Layer-wise intrinsic dimensionality reveals compact reasoning trajectories across representative LLM families.
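As a rough illustration of what a layer-wise intrinsic dimensionality profile looks like, the sketch below estimates one dimension value per layer from an array of hidden states. It uses the participation ratio of the covariance spectrum, which is an assumed stand-in estimator and not necessarily the one used in the paper or in the perceptual-manifold-geometry package. The function names, the array layout `(n_layers, n_tokens, dim)`, and the synthetic data are all hypothetical.

```python
import numpy as np

def participation_ratio(states):
    """Effective dimension of a point cloud of shape (n_samples, dim).

    PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues) of the
    covariance matrix. Assumed estimator, chosen for simplicity.
    """
    centered = states - states.mean(axis=0, keepdims=True)
    # Squared singular values are proportional to covariance eigenvalues;
    # the proportionality constant cancels in the ratio.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    eigenvalues = singular_values ** 2
    return float(eigenvalues.sum() ** 2 / (eigenvalues ** 2).sum())

def layerwise_dimension(hidden):
    """hidden: array of shape (n_layers, n_tokens, dim)."""
    return [participation_ratio(layer) for layer in hidden]

# Synthetic stand-in for hidden states: each layer lies near a
# 3-dimensional subspace of a 64-dimensional space.
rng = np.random.default_rng(0)
n_layers, n_tokens, dim, latent = 4, 200, 64, 3
hidden = np.stack([
    rng.normal(size=(n_tokens, latent)) @ rng.normal(size=(latent, dim))
    + 1e-3 * rng.normal(size=(n_tokens, dim))
    for _ in range(n_layers)
])

dims = layerwise_dimension(hidden)
for layer_index, d in enumerate(dims):
    print(f"layer {layer_index}: effective dimension {d:.2f}")
```

With real models, `hidden` would be replaced by hidden states collected during generation; the estimate stays far below the ambient dimension whenever the states concentrate on a compact subspace.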

Figure 2: Low-dimensional organization is robust across stimuli and decoupled from global representational capacity

Inference trajectories concentrate on compact manifolds while static vocabulary embeddings retain high dimensionality.

Figure 3: Low-dimensional structure alone does not ensure robust reasoning

Benchmark correlations show that compression by itself cannot explain reasoning quality.

Figure 4: Healthy reasoning preserves information volume within compact manifolds

Effective reasoning balances geometric constraint with non-degenerate structured variation.
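To make the idea of non-degenerate information volume concrete, the sketch below compares two point clouds with the same subspace dimension, one of which has collapsed along one direction. Volume is measured here as the sum of log eigenvalues of the covariance restricted to the top-k principal directions (a log-determinant). This definition is an assumption made for illustration; the paper's exact definition of V is not given on this page, and `log_volume` and the synthetic data are hypothetical.

```python
import numpy as np

def log_volume(states, k, eps=1e-12):
    """Log-determinant of the covariance in the top-k principal subspace.

    states: array of shape (n_samples, dim). Assumed proxy for
    information volume; eps guards against log(0).
    """
    centered = states - states.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    # Top-k covariance eigenvalues (singular values are sorted descending).
    eigenvalues = singular_values[:k] ** 2 / (states.shape[0] - 1)
    return float(np.sum(np.log(eigenvalues + eps)))

rng = np.random.default_rng(0)
# Orthonormal basis of a 3-dimensional subspace of a 64-dimensional space.
basis = np.linalg.qr(rng.normal(size=(64, 3)))[0]

# Healthy: variance spread over all three directions.
healthy = rng.normal(size=(500, 3)) @ basis.T
# Collapsed: same subspace, but one direction carries almost no variance.
collapsed = (rng.normal(size=(500, 3)) * np.array([1.0, 1.0, 1e-4])) @ basis.T

print("healthy log-volume:  ", log_volume(healthy, k=3))
print("collapsed log-volume:", log_volume(collapsed, k=3))
```

Both clouds are equally compact in the sense of subspace dimension, yet the collapsed one has a much smaller log-volume, which is the kind of degeneracy a dimensionality measure alone would miss.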

Figure 5: Unified diagnostic predicts reasoning robustness

The label-free diagnostic integrates expressivity, manifold compression, and information volume.

Code

Our experiments are built on the perceptual-manifold-geometry Python package (available on PyPI), which provides tools for geometric analysis of high-dimensional data manifolds, covering intrinsic dimension, curvature, density, and topological structure.

BibTeX

@misc{ma2026reasoning,
  title={Reasoning emerges from constrained inference manifolds in large language models},
  author={Yanbiao Ma and Fei Luo and Linfeng Zhang and Chuangxin Zhao and Mingxuan Wang and Yinan Wu and Zhe Qian and Yang Lu and Long Chen and Zhao Cao and Xiaoshuai Hao and Ji-Rong Wen and Jungong Han},
  year={2026},
  eprint={2605.08142},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2605.08142}
}