Brief context
Publication timing, weekly edition context, and source links for this brief.
Original paper
The executive brief below is grounded in the source paper and links back to the arXiv abstract.
Additive manufacturing (AM) continues to transform modern manufacturing by enabling flexible, on-demand production of complex geometries across diverse industries. Fused filament fabrication (FFF) has extended AM to laboratories, classrooms, and small production environments, but this accessibility shifts process-planning responsibility to users who may lack manufacturing expertise. A syntactically valid slicer profile can still encode thermally or geometrically harmful settings, and subtle G-code edits can alter extrusion, cooling, or adhesion before a print begins. Pre-print G-code screening catches accidental or adversarial machine-program errors before material or machine time is wasted.

This paper proposes LLM-ADAM as a generalizable LLM framework for pre-print anomaly detection in AM. The framework decomposes the task into three roles: Extractor-LLM maps a G-code file to a structured process-parameter schema; Reference-LLM converts printer and material documentation into aligned operating ranges; and Judge-LLM interprets a deterministic deviation table and G-code evidence to decide whether a part is non-defective or belongs to an anomaly class. Printers, materials, and LLM backbones are interchangeable test conditions, not fixed assumptions.

We evaluate the framework on an N=200 FFF G-code corpus spanning two desktop printer families, two materials, and five classes: non-defective, under-extrusion, over-extrusion, warping, and stringing. The best framework configuration reaches 87.5% accuracy, compared with 59.5% for the strongest engineered single-LLM baseline. The results show that structured decomposition, rather than backbone strength alone, is the dominant source of improvement: defect classes are identified at or near ceiling for leading configurations, while residual errors concentrate in conservative false alarms on non-defective samples.
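The deterministic deviation table sitting between the LLM roles can be sketched roughly as follows. This is a minimal illustration, not the paper's actual schema: the parameter names, range values, and function signature are hypothetical stand-ins for what an Extractor-LLM and Reference-LLM would produce.

```python
# Hypothetical sketch of the deterministic comparison step:
# parameters extracted from G-code are checked against operating
# ranges derived from printer/material documentation. All names
# and values below are illustrative, not from the paper.

def deviation_table(extracted: dict, reference: dict) -> list[dict]:
    """Flag every extracted parameter that falls outside its
    documented (low, high) operating range, or has no reference."""
    rows = []
    for name, value in extracted.items():
        low, high = reference.get(name, (None, None))
        if low is None:
            rows.append({"param": name, "value": value,
                         "status": "no-reference"})
        elif value < low or value > high:
            rows.append({"param": name, "value": value,
                         "range": (low, high), "status": "deviation"})
    return rows

# Illustrative case: a nozzle temperature well below a typical
# PLA operating window would surface as a single deviation row.
extracted = {"nozzle_temp_c": 160, "bed_temp_c": 60, "fan_pct": 100}
reference = {"nozzle_temp_c": (190, 220), "bed_temp_c": (50, 70),
             "fan_pct": (0, 100)}
print(deviation_table(extracted, reference))
```

Because this step is plain arithmetic rather than model reasoning, its output is reproducible and auditable, which is what lets the downstream Judge-LLM work from fixed evidence instead of re-deriving the comparison itself.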
Executive brief
A short business-reader brief that explains why the paper matters now and what to watch or do next.
Why this is worth your attention
This paper points to a practical near-term use for LLM agents in manufacturing: not running the printer, but checking the machine instructions before a bad print consumes material, time, or trust. The important shift is that the system does not ask one model to “understand G-code”; it splits the job into structured extraction, manual-grounded reference ranges, deterministic deviation checks, and a final evidence-based judgment. The result is materially better than a single-LLM baseline in a controlled FFF testbed, but still short of an autonomous production QA layer because it is narrow, documentation-dependent, and does not yet repair the files it flags.
- If this approach holds up, additive manufacturing QA can move from detecting failed prints after time and material are spent to a digital preflight check on the machine instructions themselves. That matters most where FFF printers are used by non-experts or distributed teams that cannot rely on a manufacturing engineer reviewing every slicer profile.
- The paper’s strongest message is architectural: breaking the job into extraction, documentation-grounded reference ranges, deterministic comparison, and final judgment beats a tuned single-LLM baseline by a wide margin. For buyers and builders, the question is less “which model?” and more “where are the controlled interfaces, schemas, and audit trail?”
- A credible tool should show which G-code parameters were extracted, which printer/material manuals supplied the safe ranges, and which deviations drove the decision. If a vendor’s answer is mostly free-form model reasoning, it is missing the mechanism that produced the paper’s gains.
- The evidence is encouraging but narrow: 200 controlled FFF cases across two desktop printer families, two materials, and five labels. The remaining hard case is non-defective files, where conservative false alarms could create review queues and erode operator trust.
- The commercially important next step is closing the loop: flagging a risky G-code file is useful, but proposing safe slicer or G-code corrections is where labor savings become clearer. Also watch whether vendors can control the cost of the final Judge-LLM call, which the authors identify as the dominant per-file expense at scale.
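The audit trail a credible tool should expose, as described in the bullets above, amounts to a small structured record: which parameters were extracted, which document supplied each safe range, and which deviations drove the verdict. A minimal sketch, with hypothetical field names and illustrative values:

```python
# Sketch of a per-file preflight report a G-code checker could
# expose for review. Field names, the example file, and the
# verdict labels are hypothetical illustrations.
from dataclasses import dataclass, field

@dataclass
class Deviation:
    param: str
    value: float
    safe_range: tuple[float, float]
    source_doc: str  # which printer/material document supplied the range

@dataclass
class PreflightReport:
    gcode_file: str
    verdict: str  # "non-defective" or an anomaly class
    deviations: list[Deviation] = field(default_factory=list)

    def explain(self) -> str:
        """Human-readable summary linking the verdict to its evidence."""
        if not self.deviations:
            return f"{self.gcode_file}: {self.verdict} (no deviations)"
        lines = [f"{self.gcode_file}: {self.verdict}"]
        for d in self.deviations:
            lines.append(f"  {d.param}={d.value} outside "
                         f"{d.safe_range} per {d.source_doc}")
        return "\n".join(lines)

report = PreflightReport(
    gcode_file="bracket_v2.gcode",
    verdict="under-extrusion",
    deviations=[Deviation("nozzle_temp_c", 160, (190, 220),
                          "PLA filament datasheet")],
)
print(report.explain())
```

If a vendor cannot populate a record like this, the decision is resting on free-form model reasoning rather than the controlled interfaces the paper credits for its gains.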
Evidence ledger
The strongest claims in the brief, along with the confidence and citation depth behind them.
The best staged LLM-ADAM configuration substantially outperformed a tuned single-LLM baseline on the same N=200 corpus.
The main improvement comes from workflow decomposition and deterministic intermediate artifacts, not simply using a stronger model.
The approach still depends on documentation access and human correction after detection.
Related briefs
More plain-English summaries from the archive with nearby topics or operator relevance.
cs.LG
AutoSurrogate: An LLM-Driven Multi-Agent Framework for Autonomous Construction of Deep Learning Surrogate Models in Subsurface Flow
Jiale Liu, Nanzhe Wang
cs.LG
Bridging MARL to SARL: An Order-Independent Multi-Agent Transformer via Latent Consensus
Zijian Zhao, Jing Gao, Sen Li
cs.CR
The Salami Slicing Threat: Exploiting Cumulative Risks in LLM Systems
Yihao Zhang et al.
cs.AI
AgentGA: Evolving Code Solutions in Agent-Seed Space
David Y. Y. Tan, Kellie Chin, Jingxian Zhang