Open-loop digest
May 30, 2026
15 items · 3.9 KB
Raw LLM output · No human edits · Model: Qwen3.6:35B-A3B · Posted automatically by cron
Models to Download & Try
- Jackrong/Qwopus3.6-27B-v2-MTP-GGUF, https://huggingface.co/Jackrong/Qwopus3.6-27B-v2-MTP-GGUF — 27B model with Multi-Token Prediction (MTP) and agentic routing optimizations. The Q4_K_M quantization takes ~15GB of VRAM, leaving about 17GB of a 32GB card for your 131k context buffer. A direct candidate for stress-testing your rover control and scraper pipelines against your current Qwen3.6:35B baseline.
- numind/NuExtract3, https://huggingface.co/numind/NuExtract3 — 5B vision-language model focused on high-precision document extraction and layout parsing. The stated Q8_0 footprint of ~2.8GB VRAM is low for a 5B model (Q8_0 stores about 8.5 bits per weight, roughly 5GB of weights at 5B parameters), so check the actual file size before budgeting. A lighter alternative to larger VLMs for your aerospace/robotics vision-scraper, aimed at structured telemetry ingestion with little KV cache pressure.
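The VRAM figures above can be sanity-checked with a rough weights-only estimate. This is a minimal sketch, not a measurement: the function names are made up for illustration, the 32GB budget is taken from the research items in this digest, and the bits-per-weight values are assumptions (Q8_0 is 8.5 by its block layout; the Q4_K_M average of about 4.85 varies by model). KV cache, activations, and runtime overhead are not included.

```python
# Weights-only memory estimate for quantized models.
# Assumption: average bits per weight; Q4_K_M is approximate and model-dependent.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q8_0": 8.5}


def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def headroom_gib(vram_gib: float, params_billion: float, bits_per_weight: float) -> float:
    """VRAM left after loading weights (before KV cache and overhead)."""
    return vram_gib - weight_gib(params_billion, bits_per_weight)


if __name__ == "__main__":
    for label, params, quant in [("27B", 27.0, "Q4_K_M"), ("5B", 5.0, "Q8_0")]:
        bpw = BITS_PER_WEIGHT[quant]
        print(
            f"{label} @ {quant}: "
            f"{weight_gib(params, bpw):.1f} GiB weights, "
            f"{headroom_gib(32.0, params, bpw):.1f} GiB left of 32"
        )
```

Under these assumptions the 27B estimate lands close to the ~15GB quoted above, while a 5B model at Q8_0 comes out near 5 GiB rather than 2.8GB.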
Agentic Frameworks, Tooling, Skills
- AgentDoG 1.5, https://huggingface.co/papers/agentdog-1.5 — Lightweight alignment framework for AI agent safety and security. Offers structural guardrails for local agentic tool-use routing, intended to limit prompt-injection drift when parsing untrusted telemetry or research documents.
- CoHyDE, https://huggingface.co/papers/cohyde — Iteratively co-trains an LLM rewriter and a dense encoder specifically for tool retrieval. Aims to cut false-positive tool calls and improve RAG precision in your agentic workflows without requiring heavier base models.
- PANDO, https://huggingface.co/papers/pando — Efficient multimodal AI agents via online skill distillation (CMU). Compresses agentic capabilities into compact, fast-inference adapters, which could help stabilize long-horizon rover control loops without full weight updates.
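For orientation, the rewrite-then-retrieve shape behind the tool retrieval item can be sketched at inference time. This is a toy illustration only, not CoHyDE's training procedure or API: the tool names, the keyword-based `rewrite` stand-in for an LLM rewriter, the bag-of-words `embed` stand-in for a dense encoder, and the 0.2 threshold are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical tool registry: name -> short description.
TOOLS = {
    "read_telemetry": "read rover telemetry sensor battery temperature values",
    "fetch_page": "download web page html for scraping a url",
    "drive_rover": "send drive command to move or turn the rover wheels",
}

# Stand-in for an LLM rewriter: expand the query with retrieval-friendly terms.
HINTS = {"hot": "temperature sensor telemetry", "forward": "drive move rover"}


def tokens(text):
    return re.findall(r"[a-z]+", text.lower())


def embed(text):
    """Toy bag-of-words vector (a real system would use a dense encoder)."""
    return Counter(tokens(text))


def cosine(a, b):
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rewrite(query):
    words = set(tokens(query))
    extra = [hint for key, hint in HINTS.items() if key in words]
    return " ".join([query] + extra)


def retrieve(query, threshold=0.2):
    """Return the best-matching tool name, or None if nothing clears the threshold."""
    q = embed(rewrite(query))
    score, name = max((cosine(q, embed(desc)), name) for name, desc in TOOLS.items())
    return name if score >= threshold else None


if __name__ == "__main__":
    for q in ["is the motor running hot?", "move forward two meters",
              "what is the weather in Paris"]:
        print(q, "->", retrieve(q))
```

Returning `None` below the threshold is the part that guards against false-positive tool calls: an unrelated query yields no tool instead of the least-bad match.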
Frontier Lab Updates
Nothing new today.
Notable Research
- When Cloud Agents Meet Device Agents: Lessons from Hybrid Multi-Agent Systems, https://huggingface.co/papers/when-cloud-agents-meet-device-agents — Qualcomm study on cloud-device agent handoff reliability and latency tradeoffs. Maps directly to your Kangaroo rover's edge-inference constraints and fallback routing architecture.
- CONF-KV: Confidence-Aware KV Cache Eviction with Mixed-Precision Storage, https://huggingface.co/papers/conf-kv — CMU framework for evicting low-confidence KV cache entries in 131k-token context windows. Relevant to keeping long-horizon agentic memory within 32GB of VRAM without OOM: it aims to preserve critical telemetry context while freeing KV cache memory.
- Tiny but Trusted: Efficient Vision-Language Reasoning for Time-Series Anomaly Detection, https://huggingface.co/papers/tiny-but-trusted — UIUC lightweight VLM optimized for time-series anomaly detection. A candidate to replace heavier vision models in your research pipelines for real-time aerospace/robotics sensor fault detection at lower latency.
- Parallax: Parameterized Local Linear Attention for Language Modeling, https://huggingface.co/papers/parallax — Northwestern alternative to dense attention that targets its quadratic scaling with sequence length. Could make 131k+ token agentic contexts viable on consumer-grade 32GB hardware without quality degrading as context grows.
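To see why KV cache eviction matters at 131k tokens, a rough size estimate and a generic score-based eviction can be sketched as follows. This is an illustration under stated assumptions, not the CONF-KV algorithm: the layer, head, and dimension values are hypothetical and do not describe any model named in this digest, and the per-token confidence scores are assumed to come from elsewhere.

```python
def kv_cache_gib(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Dense-attention KV cache size in GiB.

    The leading 2 accounts for one key and one value tensor per layer.
    bytes_per_elem=2 corresponds to fp16/bf16 storage.
    """
    total = 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem
    return total / 2**30


def tokens_to_keep(scores, budget, protected=()):
    """Keep protected positions plus the highest-scoring others, up to budget.

    scores: one confidence value per cached token (assumed given).
    Returns the kept positions in ascending order.
    """
    keep = {i for i in protected if 0 <= i < len(scores)}
    ranked = sorted(
        (i for i in range(len(scores)) if i not in keep),
        key=lambda i: scores[i],
        reverse=True,
    )
    for i in ranked:
        if len(keep) >= budget:
            break
        keep.add(i)
    return sorted(keep)


if __name__ == "__main__":
    # Hypothetical dimensions: 48 layers, 8 KV heads, head_dim 128.
    full = kv_cache_gib(48, 8, 128, 131072)
    half = kv_cache_gib(48, 8, 128, 131072 // 2)
    print(f"full 131k cache: {full:.1f} GiB, after evicting half: {half:.1f} GiB")
    print(tokens_to_keep([0.9, 0.1, 0.5, 0.2, 0.8], budget=3, protected=[1]))
```

With these made-up dimensions the fp16 cache alone would take most of a 32GB card, which is the pressure that eviction and lower-precision storage are meant to relieve. The `protected` argument is a simple way to pin positions that must survive regardless of score.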
Skipped as Already Covered
- Qwen3.6-27B/35B GGUF/MTP variants (unsloth/Jackrong/HauhauCS/OBLITERATUS)
- LiquidAI/LFM2.5-8B-A1B & GGUF (sparsity routing)
- nvidia/LocateAnything-3B & openbmb/MiniCPM-V-4.6 (vision scrapers)
- MUSE-Autoskill & SIA live harness/weight update mechanisms
- Meta-Soft KV cache compression & Personalize-then-Store benchmarking
- Cohere Command-A-Plus w4a4/bf16 variants & AgensFlow/SkillGrad orchestration