Digest · 2026-05-26 · Dauntless Systems

Models to Download & Try

unsloth/Qwen3.6-27B-MTP-GGUF, https://huggingface.co/unsloth/Qwen3.6-27B-MTP-GGUF — Multi-Token Prediction (MTP) optimized GGUF for the 27B tier. Q4_K_M quantization fits ~14-16GB VRAM on a 32GB card, leaving 16+ GB headroom for context windows approaching your 131k limit. MTP accelerates local inference throughput without sacrificing alignment or coding capability.
Jackrong/Qwopus3.6-27B-v2-GGUF, https://huggingface.co/Jackrong/Qwopus3.6-27B-v2-GGUF — Coder-focused variant of the 27B architecture. Q4_K_M footprint ~15GB VRAM. Benchmarked strong on local agentic code generation and tool-use routing, making it a direct replacement for Qwen3.6:35b when compute-heavy pipelines hit VRAM constraints.
numind/NuExtract3, https://huggingface.co/numind/NuExtract3 — Open-weight 5B vision-language model optimized for markdown, OCR, and structured data extraction. Runs at ~3GB VRAM (Q8_0 or FP16). Directly applicable to your aerospace/robotics vision-scraper pipeline for reliable layout parsing and telemetry document ingestion without spinning up a 30B+ VLM.
HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive, https://huggingface.co/HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive — Uncensored/aggressive alignment variant of your current 35B baseline. Q4_K_M consumes ~19GB VRAM; fits 32GB with sufficient KV cache buffer for 32k–64k context. Retains native MTP and retains full multimodal routing for your Kangaroo rover control stack.

mukul975/Anthropic-Cybersecurity-Skills, https://github.com/mukul975/Anthropic-Cybersecurity-Skills — 754 structured agent skills mapped to MITRE ATT&CK, NIST CSF 2.0, and ATLAS. Native to Cursor, Claude Code, Gemini CLI, and 20+ platforms. Provides a ready-to-import skill registry for hardening your agentic workflows against adversarial tool-use and prompt injection.
microsoft/agent-governance-toolkit, https://github.com/microsoft/agent-governance-toolkit — Policy enforcement, zero-trust identity routing, and execution sandboxing for autonomous agents. Covers OWASP Agentic Top 10. Directly addresses reliability and runtime isolation gaps in long-horizon agent deployments.
shareAI-lab/learn-claude-code, https://github.com/shareAI-lab/learn-claude-code — Nano agent harness architecture for CLI/terminal automation. Minimal footprint, MCP-native, and designed for reproducible agentic loops. Useful as a lightweight orchestrator backbone for your rover control and vision-scraper pipelines.

CohereLabs/command-a-plus-05-2026-w4a4, https://huggingface.co/CohereLabs/command-a-plus-05-2026-w4a4 & bf16 variant (219B) — Active commercial iteration cycle for vision-text processing. w4a4 quantization signals optimization for high-throughput multimodal routing. No major frontier launches today from Anthropic, OpenAI, Google DeepMind, Meta, xAI, or Mistral in the scraped feed.

ThriftAttention, https://huggingface.co/papers/thriftattention — Selective mixed precision for long-context FP4 attention. Reduces KV cache footprint for extended windows, directly enabling 131k context on 32GB VRAM with acceptable quality degradation for agent state tracking.
VeriTrace, https://arxiv.org/abs/2605.26081 — Evolving mental model tracking for deep research agents. Introduces reliability diagnostics for multi-step agentic loops, addressing drift in long-horizon task execution.
Personalize-then-Store, https://arxiv.org/abs/2605.25535 — Benchmarking and learning personalized memory for long-horizon agents. Methodology for retaining cross-session state without full-context recomputation; applicable to rover telemetry continuity and vision-scraper workflow memory.
MobileGym, https://arxiv.org/abs/2605.26114 — Verifiable, highly parallel simulation platform for mobile GUI agents. Provides synthetic interaction environments for training vision-scraper pipelines and agentic tool-use without hardware dependency.

Cohere Command-A-Plus (05-2026) w4a4/bf16 variants (covered 2026-05-22/23)
Qwen3.6-27B/35B GGUF & MTP variants from unsloth/Jackrong/HauhauCS (covered 2026-05-21/22/23)
DeepSeek-V4-Pro/Flash scaling metrics (covered 2026-05-21/22)
Compiling Agentic Workflows into LLM Weights (arXiv:2605.22502) (covered 2026-05-23)
Meta-Soft KV cache compression via composable meta-tokens (covered 2026-05-22/23)
Hugging Face task-label anomaly: nvidia/Nemotron-Labs-Diffusion-14B misclassified as Text Generation (covered 2026-05-23)