Active Research Initiatives
Core research directions in flight, ranked by current progress.
AlphaCell Integration
Generating interactome data at an exponentially growing rate to power next-generation AI models for cellular prediction.
Synthetic Biology Platforms
Novel SynBio technologies for rapid prototyping and testing of engineered biological systems.
Planet Engineering Research
Developing algae-based solutions for atmospheric processing and terraforming applications.
A Trinity Technology Platform
Merging artificial intelligence, digital logic, and life science to build the next-generation synthetic biology foundation
AI4Cell Molecular Interaction
Build an AI-driven molecular interaction library, accumulating cell behavior data at exponential speed. Use large models to predict protein folding and signaling pathways, accelerating biological component design iteration.
Digital Logic Biotech
Inspired by integrated circuit design, abstract cell signaling pathways as logic gate circuits. Modularly engineer cell factories for predictable, reproducible biosynthesis workflows.
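As a rough illustration of the logic-gate abstraction, the sketch below models a two-input AND gate with Hill functions, a common approximation of how a promoter responds to an activator. The function names, Hill coefficient, half-maximal constant, and threshold are all illustrative assumptions, not SATORI parameters.

```python
def hill_activation(x: float, k: float = 1.0, n: float = 2.0) -> float:
    """Fraction of maximal output for an activator at concentration x."""
    return x**n / (k**n + x**n)


def and_gate(a: float, b: float, threshold: float = 0.5) -> bool:
    """ON only when both inputs together push activation above the threshold."""
    return hill_activation(a) * hill_activation(b) > threshold


# Truth table using arbitrary "low" and "high" input concentrations.
LOW, HIGH = 0.1, 10.0
truth_table = {(a, b): and_gate(a, b) for a in (LOW, HIGH) for b in (LOW, HIGH)}
```

Only the (HIGH, HIGH) entry evaluates to ON, which is the behavior a digital AND gate is expected to have.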
AlphaCell Development
Proprietary cancer cell phenotype analysis model, improving data analysis accuracy from 51% to 90%. Provides high-confidence single-cell characterization for precision medicine and drug development.
Large-Scale Ground-Truth Data Foundation
We generate proprietary, inherently labeled ground-truth experimental data via NxN full-matrix wet-lab assays — the research data foundation no in-silico model can simulate — to train AlphaCell and improve AlphaFold protein structure prediction.
Data Generation
10M+ assays
Wet-lab NxN full-matrix assays produce ground-truth interactome data at unprecedented throughput — the data layer no in-silico model can simulate.
Data Training
Self-Labeled
Ground truth data is inherently labeled — no annotation needed. Starting from a small AlphaCell model keeps compute costs minimal
Data Storage
NxN Compact
Structured NxN matrix format is more compact than unstructured data, reducing storage and retrieval overhead significantly
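To make the storage argument concrete, here is a minimal sketch comparing a flat NxN array with a list of per-measurement records. The value of N, the record schema (`a`, `b`, `score`), and the synthetic scores are hypothetical and chosen only for illustration.

```python
import json
from array import array

N = 100  # hypothetical number of components assayed against each other

# Full-matrix layout: row i, column j holds the score for pair (i, j).
# Positions are implicit, so only the values themselves are stored.
matrix = array("f", (float((i * j) % 7) for i in range(N) for j in range(N)))

# Record layout: every measurement repeats its own keys and indices.
records = [
    {"a": i, "b": j, "score": float((i * j) % 7)}
    for i in range(N)
    for j in range(N)
]

matrix_bytes = len(matrix.tobytes())
record_bytes = len(json.dumps(records).encode("utf-8"))


def lookup(i: int, j: int) -> float:
    """Retrieve a pair's score by index arithmetic, with no search step."""
    return matrix[i * N + j]
```

In this toy comparison `matrix_bytes` is smaller than `record_bytes`, and `lookup` needs no scan over records. Real savings depend on the actual data types and on what the unstructured alternative looks like.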
Ground Truth Data Trains Both Models
Every experiment produces inherently annotated ground truth data that feeds two AI systems simultaneously
AlphaCell
Cancer cell phenotype (proprietary)
AlphaFold
Protein structure prediction (improved)
AlphaCell Algorithm: 3-Step Pipeline for Cell Regulatory Networks
Step 1: Spatiotemporal transcriptomics input.
Step 2: Dynamic Mode Decomposition (DMD; FFT + phase analysis) yields eigen-clusters, decomposing each gene's expression into 100+ transcription factor contributions.
Step 3: Diffusion Model + GRN inference, recovering the full gene regulatory network (GRN).
DMD originates from fluid dynamics; SATORI reimplemented the legacy MATLAB package in Python/PyTorch.
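For readers unfamiliar with DMD, the following is a minimal NumPy sketch of the standard "exact DMD" formulation, not SATORI's implementation. The function name `dmd`, the toy data, and the choice of rank are assumptions. It shows where the eigenvalues, and hence the phase information, come from.

```python
import numpy as np


def dmd(X: np.ndarray, rank: int):
    """Exact DMD of a snapshot matrix X with shape (genes, time points)."""
    X1, X2 = X[:, :-1], X[:, 1:]                 # consecutive snapshot pairs
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]  # rank truncation
    B = X2 @ Vh.conj().T / s                     # X2 V S^-1
    A_tilde = U.conj().T @ B                     # reduced linear operator
    eigvals, W = np.linalg.eig(A_tilde)
    modes = B @ W                                # one column per dynamic mode
    return eigvals, modes


# Toy data: 5 "genes" driven by one damped oscillation (all values made up).
rng = np.random.default_rng(0)
theta, decay = 0.3, 0.95
A = decay * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
Z = np.empty((2, 30))
Z[:, 0] = [1.0, 0.0]
for k in range(1, 30):
    Z[:, k] = A @ Z[:, k - 1]
X = rng.standard_normal((5, 2)) @ Z

eigvals, modes = dmd(X, rank=2)
growth = np.abs(eigvals)   # per-step amplitude change of each mode
phase = np.angle(eigvals)  # per-step phase advance (oscillation frequency)
```

On this synthetic input the eigenvalue magnitudes recover the decay factor and the phases recover the rotation angle. Genes that load on the same mode share a frequency and growth rate, which is one plausible reading of "eigen-clusters".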
Meta-Learning Dual-Layer Architecture
Data Moat Core
FNO + Evo2 + Diffusion Model synergistically process spatiotemporal transcriptomics and genomic sequence data to infer complete Gene Regulatory Networks (GRN)
The outer layer observes the Inner Model's training process, automatically discovers optimal hyperparameter combinations, and continuously improves model performance
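One simple way to picture an outer loop tuning an inner model is random search over hyperparameters. The sketch below is only that: `train_inner_model` is a hypothetical stand-in with a made-up loss surface, and the source does not state which search strategy the real system uses.

```python
import random


def train_inner_model(lr: float, dropout: float) -> float:
    """Stand-in for one Inner Model training run; returns a validation loss.

    Hypothetical toy objective with its minimum at lr=0.01, dropout=0.2.
    """
    return (lr - 0.01) ** 2 * 1e4 + (dropout - 0.2) ** 2


def outer_loop(n_trials: int, seed: int = 0):
    """Sample hyperparameters, observe each run's loss, and keep the best."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_trials):
        params = {
            "lr": 10 ** rng.uniform(-4, -1),   # log-uniform learning rate
            "dropout": rng.uniform(0.0, 0.5),
        }
        history.append((train_inner_model(**params), params))
    best = min(history, key=lambda item: item[0])
    return best, history


best, history = outer_loop(n_trials=50)
```

A meta-learning layer would replace the blind sampling with a strategy informed by what earlier runs revealed, but the observe-and-select structure stays the same.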
AutoResearch Autonomous Science Engine
Layer 3 · Simulating Scientific Thinking
Inspired by Karpathy's AutoResearch and Sakana AI Scientist, AlphaCell goes beyond data analysis: it autonomously proposes hypotheses, designs experiments, evaluates results, and iterates on discoveries. Each round takes 5 minutes, allowing 100+ experiments to run overnight.
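A quick sanity check on the throughput figure, assuming rounds run strictly one after another (the source does not say whether runs are parallelized). The helper name `rounds_in` is illustrative.

```python
ROUND_MINUTES = 5  # per-round duration stated in the text


def rounds_in(hours: float) -> int:
    """Number of complete sequential rounds that fit in the given time."""
    return int(hours * 60 // ROUND_MINUTES)


eight_hours = rounds_in(8)
nine_hours = rounds_in(9)
```

Under that assumption, 100 rounds need 500 minutes, a little over eight hours, so the "100+ overnight" figure is consistent with a single sequential run of roughly nine hours.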
CELLOS End-to-End Workflow
Built on the Design-Build-Test-Learn (DBTL) closed loop, CELLOS integrates the AlphaCell end-to-end model with the ms-swift fine-tuning framework to automate the full chain from AI design to mass-production delivery
AlphaCell Algorithm Architecture
AI Training Data
Key Performance Metrics
The Complete SATORI Mission
From Decoding Life to Evolving Life — building the synthetic biology closed loop
Decode Life
- 5 specialized agents analyze different omics data
- Each agent independently outputs P(edge) probabilities
- Joint probability guides Diffusion Model → GRN
- Logic decoupling avoids multimodal overfitting
Design Life
- Convex Analysis for metabolic network programming
- S · v = 0 (stoichiometric matrix × flux vector)
- Log-phase cells → static mathematical solution space
- Original MATLAB package → SATORI Python rewrite
Build Life
- NexT nanotechnology-mediated gene editing
- CRISPR precision editing of identified targets
- Signal peptide AI design for secretion optimization
- Multi-round iterative chassis cell engineering
Evolve Life
- AutoResearch autonomous iteration
- QR code tracking for high-throughput screening
- Directed evolution every 2 weeks
- All data feeds back to Decode Life → continuous loop
5 Specialized Agents · Independent Inference · Joint Decision
Each agent focuses on a single omics type, independently producing edge probabilities — avoiding single-model multimodal overfitting
Epigenomics → P(edge)
Transcriptomics (FNO) → P(edge)
Genomics (Evo2) → P(edge)
Metabolomics → P(edge)
Proteomics → P(edge)
Logic decoupling vs single-model multimodal overfitting
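The source does not specify how the five per-agent probabilities are combined. One common choice, shown here purely as an assumption, is to treat the agents as independent and sum their log-odds under a uniform prior. The function name and the example probabilities are hypothetical.

```python
import math


def joint_edge_probability(agent_probs: dict[str, float], eps: float = 1e-6) -> float:
    """Pool independent per-agent P(edge) values by summing log-odds."""
    log_odds = 0.0
    for p in agent_probs.values():
        p = min(max(p, eps), 1 - eps)      # guard against log(0)
        log_odds += math.log(p / (1 - p))
    return 1 / (1 + math.exp(-log_odds))


# Made-up outputs for a single candidate regulatory edge.
example = {
    "epigenomics": 0.8,
    "transcriptomics": 0.7,
    "genomics": 0.6,
    "metabolomics": 0.5,
    "proteomics": 0.9,
}
joint = joint_edge_probability(example)
```

With this pooling rule an agent that reports 0.5 contributes nothing, while agreeing agents reinforce one another. The independence assumption is exactly what the per-omics decoupling is meant to make plausible.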
Dynamic Phase vs Steady State
Dynamic phases → FNO (spatiotemporal dynamics)
Steady state (log phase) → Convex Analysis (metabolic programming)
Convex Analysis
S · v = 0
Stoichiometric matrix × flux vector = 0 · steady-state optimal flux distribution
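As a small worked example of the steady-state condition S · v = 0, the sketch below computes the space of admissible flux vectors for a toy linear pathway. The three-reaction network and the function name are invented for illustration. A full convex-analysis treatment would additionally impose irreversibility constraints (v ≥ 0) and an objective, which are omitted here.

```python
import numpy as np

# Toy pathway: uptake -> A (v1), A -> B (v2), B -> secretion (v3).
# Rows are metabolites A and B; columns are reactions v1, v2, v3.
S = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])


def steady_state_basis(S: np.ndarray, tol: float = 1e-10) -> np.ndarray:
    """Return a basis (as columns) for all flux vectors v with S @ v = 0."""
    _, s, Vh = np.linalg.svd(S)
    rank = int((s > tol).sum())
    return Vh[rank:].T


basis = steady_state_basis(S)
v = basis[:, 0] / basis[0, 0]  # normalize so the uptake flux equals 1
```

For this pathway the solution space is one-dimensional: every steady state has equal flux through all three reactions, so `v` is proportional to (1, 1, 1). This is the kind of static solution space that the log-phase assumption makes available.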