Digest · 2026-05-30 · Dauntless Systems

Models to Download & Try

Jackrong/Qwopus3.6-27B-v2-MTP-GGUF, https://huggingface.co/Jackrong/Qwopus3.6-27B-v2-MTP-GGUF — 27B architecture with Multi-Token Prediction (MTP) and agentic routing optimizations. Q4_K_M quantization sits at ~15GB VRAM, leaving 17GB+ headroom for your 131k context buffer. Directly stress-tests your rover control and scraper pipelines against your current Qwen 3.6:35b baseline. Source
numind/NuExtract3, https://huggingface.co/numind/NuExtract3 — 5B vision-language model focused on high-precision document extraction and layout parsing. Q8_0 footprint ~2.8GB VRAM. Bypasses heavier VLM overhead for your aerospace/robotics vision-scraper, providing structured telemetry ingestion with minimal KV cache pressure. Source

AgentDoG 1.5, https://huggingface.co/papers/agentdog-1.5 — Lightweight alignment framework for AI agent safety and security. Provides structural guardrails for local agentic tool-use routing and prevents prompt-injection drift in untrusted telemetry or research document parsing. Source
CoHyDE, https://huggingface.co/papers/cohyde — Iterative co-training of an LLM rewriter & dense encoder specifically for tool retrieval. Cuts false-positive tool calls and improves RAG precision in your agentic workflows without requiring heavier base models. Source
PANDO, https://huggingface.co/papers/pando — Efficient multimodal AI agents via online skill distillation (CMU). Compresses agentic capabilities into compact, fast-inference adapters, directly applicable to stabilizing long-horizon rover control loops without full weight updates. Source

Nothing new today.

When Cloud Agents Meet Device Agents: Lessons from Hybrid Multi-Agent Systems, https://huggingface.co/papers/when-cloud-agents-meet-device-agents — Qualcomm study on cloud-device agent handoff reliability and latency tradeoffs. Maps directly to your Kangaroo rover's edge-inference constraints and fallback routing architecture. Source
CONF-KV: Confidence-Aware KV Cache Eviction with Mixed-Precision Storage, https://huggingface.co/papers/conf-kv — CMU framework for evicting low-confidence KV tokens during 131k context windows. Critical for maintaining long-horizon agentic memory on 32GB VRAM without OOM, preserving critical telemetry context while freeing activation memory. Source
Tiny but Trusted: Efficient Vision-Language Reasoning for Time-Series Anomaly Detection, https://huggingface.co/papers/tiny-but-trusted — UIUC lightweight VLM optimized for time-series anomaly spotting. Replaces heavier vision models in your research pipelines for real-time aerospace/robotics sensor fault detection with minimal latency. Source
Parallax: Parameterized Local Linear Attention for Language Modeling, https://huggingface.co/papers/parallax — Northwestern alternative to dense attention that reduces quadratic memory scaling. Makes 131k+ token agentic contexts viable on consumer-grade 32GB hardware without context-dependent degradation. Source