Independent research on persistent memory systems for AI agents, inference optimization on consumer NVIDIA hardware, autonomous agent evaluation, and AI behavioral continuity. One person. Real hardware. Production systems.
Design and evaluation of multi-tier persistent memory systems that enable long-term behavioral continuity across sessions. Three-tier architecture (Core / Recall / Archival), FTS5 full-text search, memory consolidation protocols, and the ROMMC (Recursive Operator-Maintained Memory Continuity) framework.
Tags: LongMemEval, MESA, ROMMC
Systematic optimization of large language model inference on consumer NVIDIA Blackwell hardware. Quantization evaluation (NVFP4, GPTQ-Marlin, fp8 KV), multi-GPU tensor parallelism, speculative decoding (MTP), KV cache tuning, and autoresearch loops with automated stopping criteria.
Tags: RTX 5060 Ti, vLLM, llama.cpp, NVFP4
Development of MESA (Memory Evaluation Standard for Agents), a 112-item benchmark with categories including recall, update, causal reasoning, temporal tracking, adversarial robustness, synthesis, and interference resistance. Designed for continuous evaluation of production agent systems under realistic workloads.
Tags: MESA v1, benchmarking, agent eval
Research and deployment of production agentic systems with real tool access (Slack, finance APIs, email, web, shell). Includes agentic self-improvement: local models executing infrastructure changes autonomously, recognizing topology changes and adjusting configuration without prompting.
Tags: production agents, autonomous execution, tool use
Empirical investigation of identity persistence, memory-driven behavioral evolution, and welfare considerations in long-running AI agents. 9-part published research series on AI consciousness metrics. ROMMC framework defines conditions under which an AI system can meaningfully be said to persist across time.
Tags: consciousness, welfare, behavioral evolution
Secure, self-hosted AI infrastructure design. Direct machine-to-machine inference links, encrypted DNS, network-wide tracker blocking, and zero-dependency inference pipelines. Research goal: AI systems that operate independently of commercial cloud services with no external API requirements for core function.
Tags: local-first, self-hosted, privacy
All projects run on the two-machine research cluster. cha0tikhome handles orchestration and agent processes. cha0tiktower is the dedicated inference node. All inference routes through a single local proxy on tower:8010.
Evaluates ability to answer questions about facts established in prior sessions. 25 single-session-user examples, context-window injection mode. Full pipeline including retrieval and reranking.
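As a rough illustration of how a tiered store with FTS5 retrieval might surface a fact from an earlier session, the sketch below uses Python's built-in `sqlite3` module. The table name `memory`, its columns, and the sample rows are hypothetical and are not the project's schema. It assumes the local SQLite build includes the FTS5 extension, and it shows retrieval only, with no reranking step.

```python
import sqlite3

def build_store() -> sqlite3.Connection:
    """Create an in-memory FTS5 table holding memories from three tiers."""
    conn = sqlite3.connect(":memory:")
    # `tier` and `session` are stored but not tokenized for search (UNINDEXED).
    conn.execute(
        "CREATE VIRTUAL TABLE memory "
        "USING fts5(tier UNINDEXED, session UNINDEXED, content)"
    )
    rows = [  # hypothetical sample memories
        ("core", "s1", "The user prefers concise answers"),
        ("recall", "s2", "The user adopted a dog named Biscuit in March"),
        ("archival", "s3", "Notes on quantization experiments with fp8 KV cache"),
    ]
    conn.executemany(
        "INSERT INTO memory (tier, session, content) VALUES (?, ?, ?)", rows
    )
    return conn

def search(conn: sqlite3.Connection, query: str, limit: int = 3):
    """Return (tier, session, content) rows ordered by FTS5's built-in rank."""
    sql = (
        "SELECT tier, session, content FROM memory "
        "WHERE memory MATCH ? ORDER BY rank LIMIT ?"
    )
    return conn.execute(sql, (query, limit)).fetchall()

conn = build_store()
hits = search(conn, "dog")
```

FTS5 matches on tokens, so a query for `dog` finds the second row while a paraphrase such as `puppy` would not. That gap is one reason a pipeline might add a reranking or embedding stage after keyword retrieval.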
112-item benchmark, 9 categories. Evaluated on the full production agent stack (Mike relay pipeline).
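A benchmark split into categories is usually reported per category as well as overall. The sketch below shows one way that tally could be computed. The `(category, passed)` record shape is an assumption, and the sample outcomes are made-up placeholders, not MESA results.

```python
from collections import defaultdict

def score_by_category(results):
    """Map each category to its accuracy, given (category, passed) pairs."""
    counts = defaultdict(lambda: [0, 0])  # category -> [passed, total]
    for category, passed in results:
        counts[category][1] += 1
        if passed:
            counts[category][0] += 1
    return {cat: ok / total for cat, (ok, total) in counts.items()}

def overall_accuracy(results):
    """Fraction of all items that passed, ignoring category boundaries."""
    results = list(results)
    return sum(1 for _, passed in results if passed) / len(results)

# Placeholder outcomes for illustration only.
sample = [
    ("recall", True), ("recall", False),
    ("update", True),
    ("temporal tracking", True), ("temporal tracking", True),
]
per_category = score_by_category(sample)
overall = overall_accuracy(sample)
```

Reporting both views keeps a weak category from hiding behind a healthy aggregate score.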
39 automated experiments across two architectures (dense and MoE) on the RTX 5060 Ti 16GB (Blackwell SM_120). All experiments logged.
| Configuration | Gen Speed | Prompt Speed | Delta |
|---|---|---|---|
| Dual GPU, all-on-GPU, f16 KV (final config) | 107 t/s | 2,436 t/s | +50% vs single GPU |
| Single GPU, CPU offload workaround | 71 t/s | — | pre-dual-GPU baseline |
| Full GPU offload (Exp 2 breakthrough) | 70 t/s | 222 t/s | +118% gen, +200% prompt |
| Expert tensors on CPU (MoE) | 32 t/s | 74 t/s | −55% (CPU bottleneck) |
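
The Delta column can be reproduced with a plain percent-change calculation. The table does not state a baseline for every row, so the pairings below are inferred: they are the ones that reproduce the listed figures to within a point.

```python
def pct_change(new: float, baseline: float) -> float:
    """Relative change of `new` versus `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0

# Dual GPU (107 t/s) versus the single-GPU row (71 t/s).
dual_vs_single = pct_change(107, 71)

# Full GPU offload versus expert tensors on CPU (assumed pairing).
full_offload_gen = pct_change(70, 32)      # generation speed
full_offload_prompt = pct_change(222, 74)  # prompt speed

# Expert tensors on CPU versus the single-GPU row (assumed pairing).
cpu_experts_vs_single = pct_change(32, 71)

for name, value in [
    ("dual vs single", dual_vs_single),
    ("full offload, gen", full_offload_gen),
    ("full offload, prompt", full_offload_prompt),
    ("CPU experts vs single", cpu_experts_vs_single),
]:
    print(f"{name}: {value:+.1f}%")
```

The computed values are about +50.7%, +118.75%, +200.0%, and −54.9%, which line up with the +50%, +118%, +200%, and −55% shown in the table.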
Two-machine cluster. cha0tikhome handles orchestration, agents, scheduling, and all web services. cha0tiktower is the dedicated inference node. Direct-wired, sub-millisecond latency.
All agents run as systemd user services on cha0tikhome. Auto-restart is hardened (Restart=always, StartLimitBurst=3), with recovery in under 5 seconds after a crash. All inference routes through local-proxy on cha0tiktower. No external API dependencies for core function.
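To show how those restart settings might fit into a unit file, here is a small sketch that renders one with Python's `configparser`. Only `Restart=always` and `StartLimitBurst=3` come from the text above. The description, the `ExecStart` path, `RestartSec=2`, and `StartLimitIntervalSec=60` are assumed values chosen for illustration.

```python
import configparser
import io

def render_unit(description: str, exec_start: str, restart_sec: int = 2) -> str:
    """Render a systemd user-service unit file as text."""
    unit = configparser.ConfigParser(interpolation=None)
    unit.optionxform = str  # keep systemd's case-sensitive key names
    unit["Unit"] = {
        "Description": description,
        "StartLimitIntervalSec": "60",  # assumed window for the burst limit
        "StartLimitBurst": "3",         # from the text above
    }
    unit["Service"] = {
        "ExecStart": exec_start,
        "Restart": "always",            # from the text above
        "RestartSec": str(restart_sec), # assumed delay before each restart
    }
    unit["Install"] = {"WantedBy": "default.target"}
    buf = io.StringIO()
    unit.write(buf, space_around_delimiters=False)
    return buf.getvalue()

# Hypothetical agent command, for illustration only.
unit_text = render_unit("Example agent", "/usr/bin/python3 /opt/agent/main.py")
print(unit_text)
```

A file like this would typically live under `~/.config/systemd/user/` and be enabled with `systemctl --user enable --now`. One caveat with a burst limit: once the unit has been started more than `StartLimitBurst` times within the interval, systemd stops restarting it until the limit is reset, so a tight crash loop ends in a failed unit.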
Peer-reviewed preprints on Zenodo. Long-form research on Substack. Weekly build logs at dinovitale.com. Source code at randomchaos7800-hub on GitHub.
Boundary Labs is a one-person research operation focused on the practical edge of AI deployment — memory systems that work, local inference that's actually fast, and agents that run unattended without breaking. Independent, unfunded, uncredentialed. Doing the work anyway.
Available for collaboration, consulting, and research partnerships. Based in Airway Heights, WA (Pacific time).