Abstracted
A weekly digest of the most commercially relevant arXiv papers for operators, PMs, investors, and non-research engineers.
Library sitemap
All weeks and briefs
Crawlable links to every public weekly edition page and every individual brief page.
Week of Apr 27, 2026
Synthetic Computers at Scale for Long-Horizon Productivity Simulation
ObjectGraph: From Document Injection to Knowledge Traversal -- A Native File Format for the Agentic Era
When Your LLM Reaches End-of-Life: A Framework for Confident Model Migration in Production Systems
Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models
Bian Que: An Agentic Framework with Flexible Skill Arrangement for Online System Operations
When to Retrieve During Reasoning: Adaptive Retrieval for Large Reasoning Models
From Soliloquy to Agora: Memory-Enhanced LLM Agents with Decentralized Debate for Optimization Modeling
PolyKV: A Shared Asymmetrically-Compressed KV Cache Pool for Multi-Agent LLM Inference
LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People
Week of Apr 20, 2026
Learning to Communicate: Toward End-to-End Optimization of Multi-Agent Language Systems
CHASM: Unveiling Covert Advertisements on Chinese Social Media
Scalable AI Inference: Performance Analysis and Optimization of AI Model Serving
Bimanual Robot Manipulation via Multi-Agent In-Context Learning
DR-Venus: Towards Frontier Edge-Scale Deep Research Agents with Only 10K Open Data
ShadowPEFT: Shadow Network for Parameter-Efficient Fine-Tuning
Harmful Intent as a Geometrically Recoverable Feature of LLM Residual Streams
A multimodal and temporal foundation model for virtual patient representations at healthcare system scale
Latent Phase-Shift Rollback: Inference-Time Error Correction via Residual Stream Monitoring and KV-Cache Steering
Week of Apr 13, 2026
AIPC: Agent-Based Automation for AI Model Deployment with Qualcomm AI Runtime
AgentGA: Evolving Code Solutions in Agent-Seed Space
Don't Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG
From Anchors to Supervision: Memory-Graph Guided Corpus-Free Unlearning for Large Language Models
Bridging MARL to SARL: An Order-Independent Multi-Agent Transformer via Latent Consensus
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
Policy-Invisible Violations in LLM-Based Agents
AutoSurrogate: An LLM-Driven Multi-Agent Framework for Autonomous Construction of Deep Learning Surrogate Models in Subsurface Flow
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents
The Salami Slicing Threat: Exploiting Cumulative Risks in LLM Systems
Week of Apr 6, 2026
KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation
KV Cache Offloading for Context-Intensive Tasks
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver
Don't Overthink It: Inter-Rollout Action Agreement as a Free Adaptive-Compute Signal for LLM Agents
LegoDiffusion: Micro-Serving Text-to-Image Diffusion Workflows
Small Vision-Language Models are Smart Compressors for Long Video Understanding
Dynamic Attentional Context Scoping: Agent-Triggered Focus Sessions for Isolated Per-Agent Steering in Multi-Agent LLM Orchestration
More Capable, Less Cooperative? When LLMs Fail At Zero-Cost Collaboration
DIVERSED: Relaxed Speculative Decoding via Dynamic Ensemble Verification
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents
Gym-Anything: Turn any Software into an Agent Environment
AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent
LatentAudit: Real-Time White-Box Faithfulness Monitoring for Retrieval-Augmented Generation with Verifiable Deployment
SkillX: Automatically Constructing Skill Knowledge Bases for Agents
Week of Mar 30, 2026
Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies
MOON3.0: Reasoning-aware Multimodal Representation Learning for E-commerce Product Understanding
Learning to Play Blackjack: A Curriculum Learning Perspective
Mimosa Framework: Toward Evolving Multi-Agent Systems for Scientific Research
CirrusBench: Evaluating LLM-based Agents Beyond Correctness in Real-World Cloud Service Environments
Courtroom-Style Multi-Agent Debate with Progressive RAG and Role-Switching for Controversial Claim Verification
Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design
SkinGPT-X: A Self-Evolving Collaborative Multi-Agent System for Transparent and Trustworthy Dermatological Diagnosis
Doctorina MedBench: End-to-End Evaluation of Agent-Based Medical AI
Week of Mar 23, 2026
AD-CARE: A Guideline-grounded, Modality-agnostic LLM Agent for Real-world Alzheimer's Disease Diagnosis with Multi-cohort Assessment, Fairness Analysis, and Reader Study
WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience
The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More
Self-Distillation for Multi-Token Prediction
VTAM: Video-Tactile-Action Models for Complex Physical Interaction Beyond VLAs
MsFormer: Enabling Robust Predictive Maintenance Services for Industrial Devices
SecureBreak -- A dataset towards safe and secure models
AI Token Futures Market: Commoditization of Compute and Derivatives Contract Design
Efficient Zero-Shot AI-Generated Image Detection
PRISM: Breaking the O(n) Memory Wall in Long-Context LLM Inference via O(1) Photonic Block Selection
Week of Mar 16, 2026
Memento-Skills: Let Agents Design Agents
Governed Memory: A Production Architecture for Multi-Agent Workflows
Lightweight Adaptation for LLM-based Technical Service Agent: Latent Logic Augmentation and Robust Noise Reduction
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild
Is Conformal Factuality for RAG-based LLMs Robust? Novel Metrics and Systematic Insights
Evaluating Agentic Optimization on Large Codebases
MAC: Multi-Agent Constitution Learning
CUBE: A Standard for Unifying Agent Benchmarks
The PokeAgent Challenge: Competitive and Long-Context Learning at Scale
Intelligent Co-Design: An Interactive LLM Framework for Interior Spatial Design via Multi-Modal Agents
AgentTrace: Causal Graph Tracing for Root Cause Analysis in Deployed Multi-Agent Systems
Week of Mar 9, 2026
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections
Automatic Generation of High-Performance RL Environments
XSkill: Continual Learning from Experience and Skills in Multimodal Agents
Slow-Fast Inference: Training-Free Inference Acceleration via Within-Sentence Support Stability
Think While Watching: Online Streaming Segment-Level Memory for Multi-Turn Video Reasoning in Multimodal Large Language Models
CreativeBench: Benchmarking and Enhancing Machine Creativity via Self-Evolving Challenges
When OpenClaw Meets Hospital: Toward an Agentic Operating System for Dynamic Clinical Workflows
OSCBench: Benchmarking Object State Change in Text-to-Video Generation
RoboClaw: An Agentic Framework for Scalable Long-Horizon Robotic Tasks
One Supervisor, Many Modalities: Adaptive Tool Orchestration for Autonomous Queries
COMIC: Agentic Sketch Comedy Generation
LookaheadKV: Fast and Accurate KV Cache Eviction by Glimpsing into the Future without Generation
Nurture-First Agent Development: Building Domain-Expert AI Agents Through Conversational Knowledge Crystallization
Does LLM Alignment Really Need Diversity? An Empirical Study of Adapting RLVR Methods for Moral Reasoning
Resource-constrained Amazons chess decision framework integrating large language models and graph attention
OpenClaw-RL: Train Any Agent Simply by Talking
Context Engineering: From Prompts to Corporate Multi-Agent Architecture
Latent World Models for Automated Driving: A Unified Taxonomy, Evaluation Framework, and Open Challenges
From Days to Minutes: An Autonomous AI Agent Achieves Reliable Clinical Triage in Remote Patient Monitoring
Meissa: Multi-modal Medical Agentic Intelligence
Tool Receipts, Not Zero-Knowledge Proofs: Practical Hallucination Detection for AI Agents
PostTrainBench: Can LLM Agents Automate LLM Post-Training?
SplitAgent: A Privacy-Preserving Distributed Architecture for Enterprise-Cloud Agent Collaboration
Ares: Adaptive Reasoning Effort Selection for Efficient LLM Agents
Week of Mar 2, 2026
HLER: Human-in-the-Loop Economic Research via Multi-Agent Pipelines for Empirical Discovery
SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration