TinyAgentOS turns your old hardware into an AI agent cluster
TinyAgentOS is a self-hosted AI agent platform that runs on whatever hardware you have lying around: an old laptop, a Raspberry Pi, a gaming PC, or all of them at once. It bundles a full desktop environment, an app store, agent deployment, and a distributed compute cluster into a single web dashboard, and it supports 15 agent frameworks, 43 MCP plugins, and 97 vetted local model manifests. The project is still pre-beta, but the backend, memory system, and multi-framework group chat already work.
TinyAgentOS is a self-hosted, auto-clustering AI agent OS that pools your existing hardware into one AI mesh. It is framework-agnostic, fully local, and designed to run on cheap edge hardware. If you have devices collecting dust, this turns them into an agent cluster with zero cloud dependency.
Why it matters
If you have ever tried running AI agents locally and hit a wall, whether that is cloud lock-in, framework lock-in, or the simple fact that your hardware is not one big GPU server, TinyAgentOS is aiming squarely at that gap. It is not just another agent framework or another inference engine. It is trying to be a full operating system for self-hosted AI agents, running on whatever you have. Here is what it does, how it works, and where the rough edges are.
The project
TinyAgentOS (jaylfc/tinyagentos) is a self-hosted, auto-clustering AI agent operating system created by developer jaylfc. The pitch is straightforward: instead of renting GPU time in the cloud, pool whatever compute you already own and run AI agents on it locally.
The project launched on GitHub in early 2026 and has been moving fast. As of mid-April 2026, the repo sits at 56 stars. The author has also contributed to projects like ShibaClaw and the popular qmd search tool, and filed an issue on exo-explore for RK3588 NPU support, signaling deep involvement in the edge-AI hardware ecosystem.
The motivation is right there in the README: “everything runs on a £170 Orange Pi 5 Plus with no cloud dependencies.” TinyAgentOS is built for people who want real AI agent infrastructure at home, without sending data to anyone else’s servers.
What it actually includes
TinyAgentOS is ambitious in scope. Here is what is in the box:
- Web desktop environment. Open http://your-host:6969/ and you get a full browser-based OS with a window manager, dock, launchpad, notifications, widgets, and 34 bundled apps. On mobile, it auto-switches to a widget-first home screen and can be installed as a PWA.
- Framework-agnostic agent system. This is the most important design decision. TinyAgentOS owns everything that matters about an agent: memory, files, communication channels, model access, and configuration. The framework (SmolAgents, LangChain, OpenClaw, Langroid, PocketFlow, OpenAI Agents SDK, and more) is just a replaceable execution engine. Switch frameworks and your agent keeps its history, channels, LoRA adapters, and API keys. No migration, no data loss.
- taOSmd memory system. The built-in memory engine (jaylfc/taosmd) achieves 97.0% end-to-end Judge accuracy on LongMemEval-S, running entirely on an Orange Pi 5 Plus with zero cloud dependencies. Every agent framework can read/write through its HTTP API.
- Distributed compute cluster. Combine any devices into one AI mesh. A gaming PC handles large models, a Mac runs MLX inference, a Pi handles embeddings, and an old Android phone contributes from a drawer. Workers connect from the system tray (Windows, macOS, Linux) or via Termux on Android.
- Curated model catalog. 97 vetted model manifests covering LLMs (Qwen3, Llama 3.1/3.3, Gemma 2/3, Phi-4, Mistral, DeepSeek), vision models, embeddings, rerankers, speech models, and image generation, plus 167k+ searchable models from Hugging Face. Hardware-aware filtering means you only see models your devices can actually run.
- Broad hardware support. Apple Silicon (MLX), NVIDIA, AMD, Rockchip NPU, Raspberry Pi, Android phones. The system auto-detects accelerators, including NVIDIA GPUs without nvidia-smi installed and Rockchip NPUs in LXC containers.
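To make the framework-agnostic point concrete, here is a minimal Python sketch of the idea, not TinyAgentOS's actual API. Every name in it (AgentState, ExecutionEngine, EchoEngine, ShoutEngine, step) is hypothetical. It only illustrates the design described above: the platform owns the agent's state, and the framework is an interchangeable engine that gets handed that state for one turn.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class AgentState:
    """State owned by the platform, not by any framework (hypothetical shape)."""
    name: str
    history: list[str] = field(default_factory=list)
    channels: list[str] = field(default_factory=list)


class ExecutionEngine(Protocol):
    """Anything that can turn a prompt plus agent state into a reply."""

    def run(self, state: AgentState, prompt: str) -> str: ...


class EchoEngine:
    """Stand-in for one framework adapter."""

    def run(self, state: AgentState, prompt: str) -> str:
        return f"echo: {prompt}"


class ShoutEngine:
    """Stand-in for a different framework adapter."""

    def run(self, state: AgentState, prompt: str) -> str:
        return prompt.upper()


def step(state: AgentState, engine: ExecutionEngine, prompt: str) -> str:
    """Run one turn and record it in the platform-owned history."""
    reply = engine.run(state, prompt)
    state.history.extend([prompt, reply])
    return reply


agent = AgentState(name="demo", channels=["chat"])
step(agent, EchoEngine(), "hello")
step(agent, ShoutEngine(), "still here")  # engine swapped, state carried over
```

Because the history and channels live on AgentState rather than inside either engine, swapping the engine between turns loses nothing. That is the property the article describes, shown here on toy engines only.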
Where it fits
TinyAgentOS occupies a unique niche. It is not just an inference cluster, not just an agent framework, and not just a smart-home hub. It is attempting to be a complete operating system for AI agents that you self-host on cheap hardware.
| Project | Focus | Distributed Compute | Memory System | Desktop UI | Framework-Agnostic |
|---|---|---|---|---|---|
| TinyAgentOS | Full self-hosted agent OS | Yes, auto-clustering | taOSmd (97% LongMemEval-S) | Full web desktop | Yes, 15 frameworks |
| exo | Distributed inference | Yes, clustering | No | No | No |
| OpenClaw | Agent framework | No | Built-in | CLI/Telegram | N/A (is a framework) |
| Home Assistant | Smart home | Add-on system | Limited | Full web UI | N/A |
| CrewAI / AutoGen | Multi-agent orchestration | No | Minimal | No | No |
The closest comparison is combining exo’s clustering with a desktop OS, an app store, and a proper memory layer, all designed to run on an Orange Pi.
Real scenarios
- The homelabber with a drawer full of SBCs. Install TinyAgentOS on the Orange Pi as the controller, add the other devices as workers, and you have a distributed AI cluster running local LLMs, embeddings, and agent workflows, with zero cloud costs and zero data leaving your house.
- The developer testing multi-framework agents. Deploy SmolAgents and LangChain on TinyAgentOS, give them the same memory and channels, and watch them perform side by side in a shared chat. Switching costs nothing because the memory, files, and connections persist across frameworks.
- The privacy-conscious researcher. Your lab policy prohibits sending data to external APIs. TinyAgentOS runs entirely on local hardware. The memory system works without cloud model calls, the models run locally, and no data leaves the building.
- The small business automating workflows. Install TinyAgentOS on a Mac mini as the controller and a gaming PC as a worker for heavy inference. Deploy agents that monitor emails, process documents, and manage your knowledge base, all through a web dashboard accessible from any device on the network.
- The Android phone repurposer. That old Pixel in the drawer? Install the TinyAgentOS worker via Termux and it contributes embeddings and small-model inference to your cluster. Every bit of idle silicon counts.
Getting started
Installing the controller is a one-liner:
curl -fsSL https://raw.githubusercontent.com/jaylfc/tinyagentos/master/scripts/install-server.sh | sudo bash
This works on Debian, Ubuntu, Fedora, Arch, Alpine, and macOS. The script is idempotent and safe to re-run. For workers:
curl -fsSL https://raw.githubusercontent.com/jaylfc/tinyagentos/master/scripts/install-worker.sh | sudo bash -s -- http://your-server:6969
Workers can also run as a desktop system-tray app, or on Android via Termux.
For the technically curious
From here on, this gets into architecture details. If you care about the idea more than the implementation, you can skip to the takeaway.
Live-state architecture
The source of truth is always the live state of backends, never the filesystem or a config file. Every subsystem that asks “is model X available?” polls the backends and reads a central in-memory index. This makes cross-platform backends a drop-in: CUDA, Vulkan, ROCm, and Metal just register and get discovered. The scheduler routes work only to backends that are genuinely ready, which eliminates a whole class of stale-configuration bugs that plague simpler setups.
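A rough Python sketch of what a live-state index could look like follows. It is an illustration under assumptions, not the project's code: the class names (BackendStatus, LiveIndex), the report/backends_for methods, the backend names, and the freshness cutoff are all invented for this example. The point is that availability is answered from what backends last reported, and a backend that stops reporting drops out on its own.

```python
from __future__ import annotations

import time
from dataclasses import dataclass


@dataclass
class BackendStatus:
    """Last observed state of one backend (hypothetical shape)."""
    models: set[str]
    ready: bool
    seen_at: float


class LiveIndex:
    """In-memory index built from backend reports, never from config files."""

    def __init__(self, max_age_s: float = 10.0) -> None:
        self.max_age_s = max_age_s  # assumed freshness cutoff for reports
        self._backends: dict[str, BackendStatus] = {}

    def report(self, backend: str, models: set[str], ready: bool,
               now: float | None = None) -> None:
        """Record the result of polling one backend."""
        ts = time.monotonic() if now is None else now
        self._backends[backend] = BackendStatus(set(models), ready, ts)

    def backends_for(self, model: str, now: float | None = None) -> list[str]:
        """Return backends that are ready, recently seen, and serve the model."""
        ts = time.monotonic() if now is None else now
        return sorted(
            name
            for name, status in self._backends.items()
            if status.ready
            and model in status.models
            and ts - status.seen_at <= self.max_age_s
        )


index = LiveIndex(max_age_s=10.0)
index.report("cuda-pc", {"qwen3"}, ready=True, now=0.0)
index.report("metal-mac", {"qwen3"}, ready=False, now=0.0)  # still loading
index.report("rockchip-pi", {"embedder"}, ready=True, now=0.0)

available = index.backends_for("qwen3", now=5.0)   # only the ready backend
stale = index.backends_for("qwen3", now=60.0)      # reports too old, so empty
```

A scheduler built on something like this never routes to a backend that merely appears in a config file. It routes to whatever reported itself ready recently.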
The Librarian layer
The memory system’s Librarian layer performs LLM-assisted query expansion before retrieval. This means it can match a vague question like “that thing about the API keys from last week” to the actual conversation where API keys were discussed. Combined with validity windows on the knowledge graph, the system tracks facts that change over time and automatically invalidates outdated information.
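Validity windows are easier to picture with a small example. The sketch below is an assumption about the general technique, not a description of taOSmd's internals. The Fact and TemporalFacts names, the subject/predicate/value layout, and the rule "a new value closes the old window" are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Fact:
    """One statement with the time span during which it held."""
    subject: str
    predicate: str
    value: str
    valid_from: float
    valid_to: float | None = None  # None means "still valid"


class TemporalFacts:
    """Toy fact store where newer facts close the window of older ones."""

    def __init__(self) -> None:
        self._facts: list[Fact] = []

    def assert_fact(self, subject: str, predicate: str, value: str,
                    at: float) -> None:
        """Add a fact; if it contradicts a current one, end that one at `at`."""
        for fact in self._facts:
            same_slot = fact.subject == subject and fact.predicate == predicate
            if same_slot and fact.valid_to is None:
                if fact.value == value:
                    return  # already known and still valid, nothing to do
                fact.valid_to = at  # contradiction: invalidate the old value
        self._facts.append(Fact(subject, predicate, value, valid_from=at))

    def value_at(self, subject: str, predicate: str, at: float) -> str | None:
        """Return the value that was valid at time `at`, if any."""
        for fact in self._facts:
            if fact.subject != subject or fact.predicate != predicate:
                continue
            ended = fact.valid_to is not None and at >= fact.valid_to
            if fact.valid_from <= at and not ended:
                return fact.value
        return None


kg = TemporalFacts()
kg.assert_fact("api", "key_location", "env file", at=1.0)
kg.assert_fact("api", "key_location", "vault", at=5.0)  # supersedes the first

before = kg.value_at("api", "key_location", at=3.0)
after = kg.value_at("api", "key_location", at=6.0)
```

Here `before` is "env file" and `after` is "vault": the old fact is not deleted, it simply stops being valid at the moment the contradicting fact arrives, so questions about the past still get the answer that was true then.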
The full taOSmd pipeline includes:
- Temporal knowledge graph with validity windows
- Contradiction detection
- Hybrid semantic + keyword vector search with cross-encoder rerank
- LLM-assisted query expansion
- 97.0% end-to-end Judge accuracy on LongMemEval-S, running on an Orange Pi 5 Plus
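The article does not say how taOSmd merges its semantic and keyword results before reranking. As a stand-in, the sketch below uses reciprocal rank fusion (RRF), a common and simple way to combine ranked lists. Treat it as one plausible approach rather than the project's method. The document ids and the two input rankings are made up for the example.

```python
from __future__ import annotations

from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores 1 / (k + rank) per list it appears in; k=60 is the
    conventional default for RRF.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


semantic = ["note-7", "note-2", "note-9"]  # hypothetical vector-search result
keyword = ["note-2", "note-4", "note-7"]   # hypothetical keyword-search result

fused = reciprocal_rank_fusion([semantic, keyword])
# ['note-2', 'note-7', 'note-4', 'note-9']
```

Documents that show up in both lists (note-2 and note-7) rise to the top. In a full pipeline, a cross-encoder would then rescore only the first few fused candidates, which keeps the expensive model off the long tail.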
The takeaway
Key points:
- TinyAgentOS is a self-hosted AI agent OS that clusters whatever hardware you have into one local compute mesh
- Framework-agnostic design means you can switch agent frameworks without losing memory, channels, or configuration
- The built-in taOSmd memory system hits 97% on LongMemEval-S, running fully local on cheap hardware
- Still pre-beta as of April 2026, so expect rough edges and incomplete GUI wiring
- Scope is both its strength and its risk: building an OS, app store, memory engine, cluster scheduler, and multi-framework support all at once is a monumental effort
Self-hosted AI agent infrastructure does not have to mean cloud dependency or framework lock-in. TinyAgentOS is betting that the future runs on hardware you already own.
Sources
- TinyAgentOS GitHub Repository
- taOSmd Memory System
- jaylfc GitHub Profile
- LongMemEval-S Benchmark
- exo - Distributed Inference (comparison)
- RK3588 NPU Issue on exo (jaylfc contribution)