Fully Autonomous · Self-Evolving · Claude Code Native

Sibyl
Research
System

Autonomous AI Research System

20+ specialized AI Agents debate ideas, design and execute GPU experiments, write papers, and rigorously self-review — all without human intervention. The system itself keeps evolving.

19
Pipeline Stages
20+
Specialized Agents
8
Evolution Dimensions

Built for Frontier Research

Not a tool to assist human researchers, but a self-running research organization — fully automated from literature review to paper submission.

🔬

19-Stage Pipeline

End-to-end automation from arXiv literature retrieval to camera-ready papers, orchestrated by a 19-stage state machine.

🧠

Multi-Agent Collaboration

6 Agents for creative debate, 6 for result analysis, and 6 for parallel writing — clashing perspectives generate genuinely innovative ideas.

⚡

GPU Parallel Scheduling

Topological sorting + dynamic dispatch maximizes GPU cluster utilization. Automatic dependency management with greedy allocation.

🔄

Full-Dimension Iteration

Quality gates automatically decide to continue iterating, pivot to new ideas, or terminate. Every dimension of research is auto-optimized.

🧬

Self-Evolution System

Extracts experience from 8 dimensions, tracks effectiveness, and phases out bad practices — every project benefits all future projects.

🤖

Multi-Model Collaboration

Claude Opus/Sonnet and GPT-5.4 perform independent cross-review; multiple model perspectives leave no blind spots in research quality.
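The topological scheduling described in the GPU Parallel Scheduling card can be sketched in a few lines. This is a minimal illustration, not the actual scheduler: the job-graph shape, the `schedule` function, and the round-based dispatch are assumptions made for the sketch.

```python
from collections import deque

def schedule(jobs: dict[str, list[str]], num_gpus: int) -> list[list[str]]:
    """Greedy topological scheduling: each round, dispatch as many
    dependency-free jobs as there are idle GPUs.

    `jobs` maps a job name to the list of jobs it depends on.
    Returns the jobs grouped by dispatch round.
    """
    indegree = {name: len(deps) for name, deps in jobs.items()}
    dependents: dict[str, list[str]] = {name: [] for name in jobs}
    for name, deps in jobs.items():
        for dep in deps:
            dependents[dep].append(name)

    # Jobs with no unmet dependencies are ready to run.
    ready = deque(n for n, d in indegree.items() if d == 0)
    rounds: list[list[str]] = []
    while ready:
        # Greedy allocation: fill up to num_gpus slots this round.
        batch = [ready.popleft() for _ in range(min(num_gpus, len(ready)))]
        rounds.append(batch)
        for job in batch:
            for nxt in dependents[job]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
    return rounds
```

A pilot-then-sweep dependency graph, for example, would run the pilot alone in round one, then fan out across the cluster once it completes.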

19-Stage State Machine

Three core modules work in concert: research iteration produces experimental data, paper writing produces academic papers, review & reflection drives system evolution.

Research Iteration

Literature Search

arXiv + Web dual-source research

Single Agent

Idea Debate

6-perspective brainstorming

6-Agent Team

Experiment Planning

Generate dependency-aware execution plans

Single Agent

Pilot Experiment

Small-scale feasibility validation

Single Agent

Full Experiment

GPU parallel topological scheduling

GPU Scheduler

Result Debate

Multi-dimensional result analysis

6-Agent Team

Supervisor Decision

PIVOT or PROCEED

Decision Agent

Paper Writing

Outline Generation

Structured paper framework

Single Agent

Section Writing

Sequential / Parallel / Codex mode

Configurable

Cross Review

6 Agents review each section in parallel

6-Agent Parallel

Integration Edit

Merge into a complete paper

Single Agent

Final Review

NeurIPS / ICML level review

Iterative Revision

LaTeX Compilation

Convert to NeurIPS format → PDF

Auto Compile

Review & Reflection

Multi-Dim Review

Critic + Supervisor + Codex

Parallel Skills

Reflection Distillation

8-dimension issue & experience classification

Single Agent

Cloud Sync

Sync data to cloud documents

Auto Sync

Quality Gate

≥ 8.0 score AND ≥ 2 iterations

Auto Evaluation
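The quality gate above reduces to a small predicate. This is a sketch assuming the 0–10 review scale implied by the ≥ 8.0 threshold; the function name and the ITERATE/TERMINATE labels are illustrative (pivoting to a new idea is the Supervisor's separate call).

```python
def quality_gate(score: float, iterations: int,
                 threshold: float = 8.0, min_iterations: int = 2) -> str:
    """Decide the next pipeline action.

    A run may finish only when the review score reaches the threshold
    AND at least min_iterations research iterations have completed.
    """
    if score >= threshold and iterations >= min_iterations:
        return "TERMINATE"   # quality reached publication standard
    return "ITERATE"         # keep refining the current idea
```

Requiring both conditions prevents a lucky first draft from short-circuiting the iteration loop.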

20+ Specialized Agents

Each Agent has an independent perspective and role. Through debate and collaboration, they produce insights far beyond any single model.

💡

Creative Generation Team

Innovator
Cross-domain methodology transfer
Pragmatist
Engineering feasibility gatekeeper
Theorist
Mathematical foundations & proofs
Contrarian
Challenge assumptions, find blind spots
Interdisciplinary
Cognitive/physics/bio analogies
Experimentalist
Reproducibility & data quality
📊

Result Analysis Team

Optimist
Uncover positive findings
Skeptic
Statistical significance review
Strategist
Resource allocation & direction
Methodologist
Internal & external validity evaluation
Benchmarker
SOTA comparison analysis
Reviser
Adjust hypotheses based on results

Dual-Loop Architecture

The system that runs your research is itself constantly evolving. Every project benefits all future projects.

Inner Loop — Research Iteration

In-Project Auto-Optimization

Revises hypotheses based on experimental results, replans experiments, rewrites papers, and pivots to alternative ideas when needed — until quality reaches publication standards.

Outer Loop — System Self-Evolution

Cross-Project Experience Learning

After each iteration, the system automatically classifies issues across 8 dimensions, tracks which improvements actually work, and auto-updates Agent prompts, scheduling strategies, and architecture patterns.

Why Self-Evolution Actually Works

  • 📁 Stateless Architecture — Every prompt loads from disk in real time, so evolution-engine changes take effect immediately
  • 🔀 Independent Subprocesses — Each Agent runs in a fresh process; code changes take effect instantly
  • ⚙️ Real-time Config — Config is re-parsed on every call; parameter adjustments take effect next round
  • 📝 Experience Overlay — Distilled experiences are written to files and auto-injected into the next Agent call
  • 🛡️ Safety Mechanism — Every modification must pass tests and be committed to git, ensuring reversibility and auditability
  • 📉 Auto Deprecation — Ineffective experiences are down-weighted by 0.3×, preventing bad advice from persisting
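The overlay and deprecation mechanics above can be sketched as follows. Only the 0.3× down-weighting factor comes from the list; the `Experience` structure, the weight cutoff, and the prompt layout are hypothetical.

```python
from dataclasses import dataclass

DEPRECATION_FACTOR = 0.3  # down-weight applied to ineffective experiences

@dataclass
class Experience:
    text: str
    weight: float = 1.0

def record_outcome(exp: Experience, helped: bool) -> None:
    """Track effectiveness: advice that did not help is down-weighted 0.3x."""
    if not helped:
        exp.weight *= DEPRECATION_FACTOR

def build_prompt(base_prompt: str, experiences: list[Experience],
                 cutoff: float = 0.5) -> str:
    """Stateless assembly: re-read the base prompt and inject only the
    experiences whose weight is still above the cutoff."""
    active = [e.text for e in experiences if e.weight >= cutoff]
    if not active:
        return base_prompt
    return base_prompt + "\n\n# Distilled experience\n" + "\n".join(
        f"- {t}" for t in active)
```

Because the prompt is rebuilt from files on every call, a single bad outcome is enough to push an experience below the cutoff and silently retire it from all future Agent calls.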

Three Steps to Launch Research

Let Claude auto-configure everything. You only need to provide GPU server info.

Terminal
# 1. Clone repository
$ git clone https://github.com/Sibyl-Research-Team/sibyl-research-system.git
$ cd sibyl-research-system

# 2. Start Claude Code (with plugin)
$ claude --plugin-dir ./plugin --dangerously-skip-permissions

# 3. One-line auto-configuration
> "Configure Sibyl Research System, read docs/setup-guide.md then auto-configure everything."

# 4. Start research!
> /sibyl-research:init                # Create research project
> /sibyl-research:start my-project    # Start autonomous research
1

Clone & Launch

Clone the repo, load the Sibyl plugin commands with --plugin-dir, and enter the Claude Code environment.

2

Auto Configuration

Claude auto-detects the environment, installs dependencies, and configures MCP servers, asking only for GPU server info when needed.

3

Launch Research

After initialization, the system autonomously runs the 19-stage Pipeline, from literature review to paper submission, fully automated.

Requirements

Python 3.12+, Node.js 18+, Claude Code CLI, SSH-accessible GPU server.

Beyond the State of the Art

Key differences from other AI research systems at a glance.

| Feature | Sibyl Research System | AI Scientist | AutoResearch |
|---|---|---|---|
| Architecture | Claude Code Native (Skills, Teams, MCP) | API Wrapper | Single File Script |
| Agent Count | 20+ Specialized Agents | Single LLM | Single Agent |
| Idea Generation | 6-Agent Multi-Perspective Debate | LLM Brainstorming | None |
| Experiment Execution | GPU Parallel + Topological Scheduling | Template Execution | Single GPU Loop |
| Paper Writing | Multi-Agent Write + Review + Revise | LLM Generation | None |
| Self-Evolution | Cross-Project Experience Learning | None | None |
| Quality Control | Multi-Round Review + Quality Gates | Auto Review | Metric-Based |
| Human Intervention | Fully Autonomous | Minimal | Minimal |