Tiny Recursive Models: Less is More

Posted on Oct 8, 2025

Research by Samsung SAIL Montreal

Achieving 45% on ARC-AGI-1 and 8% on ARC-AGI-2 using only a 7M parameter neural network.


About This Research

Tiny Recursion Model (TRM) is a recursive reasoning approach that challenges the assumption that massive foundational models are necessary for complex problem-solving. The research demonstrates that “less is more”: a tiny model trained from scratch, recursing on itself and updating its answer over time, can achieve remarkable results without massive computational resources.

Original Authors: Samsung SAIL Montreal Research Team
Paper: Less is More: Recursive Reasoning with Tiny Networks
Source Repository: Samsung SAIL Montreal GitHub
Hosted with permission for educational purposes on StartAITools.com


Key Achievements

Performance Metrics:

  • 45% accuracy on ARC-AGI-1 - competitive with large language models
  • 8% accuracy on ARC-AGI-2 - state-of-the-art for small models
  • Only 7M parameters - vs billions in traditional LLMs

Breakthrough Insight:

“The idea that one must rely on massive foundational models trained for millions of dollars by some big corporation in order to achieve success on hard tasks is a trap. Currently, there is too much focus on exploiting LLMs rather than devising and expanding new lines of direction.”


How TRM Works

Tiny Recursion Model (TRM) recursively improves its predicted answer y with a tiny network (a minimal code sketch follows these steps):

  1. Starts with: Embedded input question x, initial embedded answer y, and latent z
  2. For K improvement steps:
    • Recursively updates latent z given question x, current answer y, and current latent z (recursive reasoning)
    • Updates answer y given current answer y and current latent z
  3. Result: Progressively improved answers in an extremely parameter-efficient manner while minimizing overfitting
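
To make the loop concrete, here is a minimal, hypothetical PyTorch sketch of the update scheme described above. It is not the official implementation: the class name TinyRecursiveSketch, the use of two small MLP blocks, the hidden width, and the default step counts are illustrative assumptions; the real architecture lives in arch/ and is trained by pretrain.py.

import torch
import torch.nn as nn

class TinyRecursiveSketch(nn.Module):
    """Illustrative stand-in for TRM's recursive update loop (not the official code)."""

    def __init__(self, dim: int, k_steps: int = 3, latent_steps: int = 4):
        super().__init__()
        self.k_steps = k_steps            # K answer-improvement steps
        self.latent_steps = latent_steps  # inner recursions on the latent z
        # Two small MLPs stand in for the tiny network(s) defined in arch/
        self.update_z = nn.Sequential(nn.Linear(3 * dim, dim), nn.GELU(), nn.Linear(dim, dim))
        self.update_y = nn.Sequential(nn.Linear(2 * dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x, y, z):
        for _ in range(self.k_steps):
            # Recursive reasoning: refine the latent z given question x, answer y, latent z
            for _ in range(self.latent_steps):
                z = self.update_z(torch.cat([x, y, z], dim=-1))
            # Update the answer y given the current answer y and latent z
            y = self.update_y(torch.cat([y, z], dim=-1))
        return y, z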

Figure: TRM architecture


Research Motivation

This work builds on the Hierarchical Reasoning Model (HRM) but strips recursive reasoning down to its essence:

Improvements over HRM:

  • ✅ Simplified architecture (no biological brain analogies needed)
  • ✅ No mathematical fixed-point theorem required
  • ✅ No hierarchical structure necessary
  • ✅ Core recursive reasoning extracted and optimized

Philosophy:

“Recursive reasoning ultimately has nothing to do with the human brain, does not require any mathematical (fixed-point) theorem, nor any hierarchy.”


Repository Contents

Core Implementation

  • pretrain.py - Main training script for TRM models
  • arch/ - Model architecture implementations
  • dataset/ - Dataset preparation and augmentation utilities

Supported Datasets

  • ARC-AGI-1 - Abstract Reasoning Challenge (original)
  • ARC-AGI-2 - Abstract Reasoning Challenge (updated)
  • Sudoku-Extreme - Complex Sudoku puzzles
  • Maze-Hard - Challenging maze navigation

Requirements

  • Python 3.10+
  • CUDA 12.6.0+
  • PyTorch (nightly builds)
  • Weights & Biases (optional for logging)

Quick Start

Installation

pip install --upgrade pip wheel setuptools
pip install --pre --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu126
pip install -r requirements.txt
pip install --no-cache-dir --no-build-isolation adam-atan2

Dataset Preparation

# ARC-AGI-1
python -m dataset.build_arc_dataset \
  --input-file-prefix kaggle/combined/arc-agi \
  --output-dir data/arc1concept-aug-1000 \
  --subsets training evaluation concept \
  --test-set-name evaluation

# Sudoku-Extreme
python dataset/build_sudoku_dataset.py \
  --output-dir data/sudoku-extreme-1k-aug-1000 \
  --subsample-size 1000 --num-aug 1000

Training Example (ARC-AGI)

torchrun --nproc-per-node 4 --rdzv_backend=c10d \
  --rdzv_endpoint=localhost:0 --nnodes=1 pretrain.py \
  arch=trm \
  data_paths="[data/arc12concept-aug-1000]" \
  arch.L_layers=2 \
  arch.H_cycles=3 arch.L_cycles=4 \
  +run_name=pretrain_att_arc1concept_4 ema=True

Expected runtime: ~3 days on 4x H100 GPUs


Research Implications

For AI Development:

  • Challenges the “bigger is better” paradigm in AI
  • Demonstrates parameter efficiency through recursive reasoning
  • Opens new research directions beyond LLM scaling

For Practitioners:

  • Achievable results without massive computational budgets
  • Applicable to resource-constrained environments
  • Framework for building efficient reasoning systems

For Research Community:

  • Simplified approach enables easier experimentation
  • Foundation for future recursive reasoning research
  • Alternative to expensive large-scale model training

Key Technical Innovations

  1. Recursive Latent Updates: Progressive refinement through iterative latent space updates
  2. Parameter Efficiency: 7M parameters achieving results competitive with billion-parameter models (a rough parameter-count check follows this list)
  3. Simplicity: Core recursive reasoning without biological or mathematical complexity
  4. Generalization: Strong performance across multiple challenging benchmarks
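
As a rough sanity check of the parameter budget, the TinyRecursiveSketch from “How TRM Works” can be instantiated and its parameters counted. The width of 512 and the batch size are arbitrary illustrative choices, not the paper's configuration, so the count below is smaller than the real ~7M-parameter model.

# Hypothetical usage of the TinyRecursiveSketch defined earlier; dim=512 is arbitrary.
model = TinyRecursiveSketch(dim=512)
x = torch.randn(8, 512)   # embedded input question (batch of 8)
y = torch.zeros(8, 512)   # initial embedded answer
z = torch.zeros(8, 512)   # initial latent
y, z = model(x, y, z)     # K rounds of latent + answer refinement
print(sum(p.numel() for p in model.parameters()))  # ~1.8M for this toy setup; TRM itself is ~7M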

Citations & Credits

Original Paper:

@article{trm2025,
  title={Less is More: Recursive Reasoning with Tiny Networks},
  author={Samsung SAIL Montreal},
  journal={arXiv preprint arXiv:2510.04871},
  year={2025}
}

Source Repository: github.com/SamsungSAILMontreal/TinyRecursiveModels

Hosted for educational purposes on StartAITools.com


Explore the Research


Ready to dive deeper? Explore the full repository contents or jump straight to the paper.