arXiv 2604.23970v1 · Apr 27, 2026

LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People

Aydin Ayanzadeh, Tim Oates

Brief context

Publication timing, weekly edition context, and source links for this brief.

Published

Apr 27, 2026, 2:32 AM

Current score

86

Original paper

The executive brief below is grounded in the source paper and linked back to the arXiv abstract.

Indoor navigation remains a critical accessibility challenge for blind and low-vision (BLV) individuals, as existing solutions rely on costly per-building infrastructure. We present an agentic framework that converts a single floor plan image into a structured, retrievable knowledge base to generate safe, accessible navigation instructions with lightweight infrastructure. The system has two phases: a multi-agent module that parses the floor plan into a spatial knowledge graph through a self-correcting pipeline with iterative retry loops and corrective feedback; and a Path Planner that generates accessible navigation instructions, with a Safety Evaluator agent assessing potential hazards along each route. We evaluate the system on the real-world UMBC Math and Psychology building (floors MP-1 and MP-3) and on the CVC-FP benchmark. On MP-1, we achieve success rates of 92.31%, 76.92%, and 61.54% for short, medium, and long routes, outperforming the strongest single-call baseline (Claude 3.7 Sonnet) at 84.62%, 69.23%, and 53.85%. On MP-3, we reach 76.92%, 61.54%, and 38.46%, compared to the best baseline at 61.54%, 46.15%, and 23.08%. These results show consistent gains over single-call LLM baselines and demonstrate that our workflow is a scalable solution for accessible indoor navigation for BLV individuals.
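The "self-correcting pipeline with iterative retry loops and corrective feedback" can be pictured as a validate-and-retry loop in which each failed structural check becomes feedback for the next parsing attempt. Below is a minimal Python sketch under stated assumptions: `parse_with_llm` and `validate_graph` are hypothetical stand-ins (the paper's actual agents, checks, and retry budget are not reproduced here).

```python
from typing import Callable

MAX_RETRIES = 3  # illustrative retry budget, not the paper's setting

def validate_graph(graph: dict) -> list[str]:
    """Toy structural check: every door edge must reference known rooms."""
    errors = []
    rooms = set(graph.get("rooms", []))
    for a, b in graph.get("doors", []):
        if a not in rooms or b not in rooms:
            errors.append(f"door ({a}, {b}) references an unknown room")
    return errors

def parse_floor_plan(parse_with_llm: Callable[[bytes, str], dict],
                     image_bytes: bytes) -> dict:
    """Retry LLM parsing until the graph passes validation, feeding errors back."""
    feedback = ""
    errors: list[str] = []
    for _ in range(MAX_RETRIES):
        graph = parse_with_llm(image_bytes, feedback)  # caller-supplied agent call
        errors = validate_graph(graph)
        if not errors:
            return graph
        # Concrete failures become corrective feedback for the next attempt.
        feedback = "Fix these issues: " + "; ".join(errors)
    raise RuntimeError("parsing did not converge: " + "; ".join(errors))
```

The design point this sketch illustrates is that retries are guided rather than blind: the feedback string carries specific validation failures, so each attempt has more to work with than the last.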

Score 86 · Full-paper brief · Tags: agents, inference, infra, data

Executive brief

A short business-reader brief that explains why the paper matters now and what to watch or do next.

Why this is worth your attention

Indoor navigation for blind and low-vision people is usually treated as an infrastructure problem: install beacons, map buildings manually, and keep the system maintained. This paper points to a cheaper operating model—turn an existing floor plan into a structured route graph, validate it with agent checks, and use lightweight visual markers for localization—while showing better results than single-call LLM baselines in limited tests. The business implication is that campuses, hospitals, airports, and large offices may eventually be able to pilot accessibility navigation from documents they already have, but the evidence is not yet strong enough for safety-critical deployment.

  • The useful shift is not “LLMs can read maps”; it is that a usable indoor navigation layer might be bootstrapped from existing floor plans plus light checkpointing, rather than bespoke beacon networks or full building scans. That still means per-site work—especially marker placement—but the cost profile could be materially different.
  • For any vendor claiming AI navigation, ask how routes are tied to doors, rooms, coordinates, and validation checks—not just whether the model can generate instructions. The paper’s gains come from graph grounding, object detection, and retry logic, which are more operationally meaningful than a polished one-shot LLM response; a minimal graph-grounding sketch follows this list.
  • Performance drops sharply on longer and more complex routes: on MP-3, the proposed method reaches only 38.46% success on long routes, despite beating the baselines. That is promising for a research prototype, but not yet a reliability level for unsupervised wayfinding by blind or low-vision users.
  • The real adoption test is whether blind and low-vision participants can use the instructions safely across unfamiliar buildings, with missed turns, crowds, temporary obstructions, and poor signage. The current physical validation used one sighted evaluator, so procurement or accessibility teams should wait for BLV user studies before treating this as more than a pilot candidate.
  • The approach depends heavily on OCR, door detection, and clean room geometry. If it only works on well-labeled architectural plans, the near-term market is likely controlled pilots in campuses, hospitals, airports, and corporate sites—not broad consumer navigation.
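To make "routes tied to doors, rooms, coordinates, and validation checks" concrete, here is a minimal sketch of a spatial knowledge graph using networkx. The node kinds, attribute names, and coordinates are illustrative assumptions, not the paper's schema.

```python
import networkx as nx

# Hypothetical schema: rooms, doors, and hallway segments as typed nodes,
# with metric coordinates and accessibility flags as attributes.
G = nx.Graph()
G.add_node("MP-101", kind="room", label="Office 101", xy=(12.5, 4.0))
G.add_node("D-101", kind="door", xy=(12.5, 5.0), accessible=True)
G.add_node("H-1", kind="hallway", xy=(14.0, 5.0))
G.add_edge("MP-101", "D-101", dist_m=1.0)
G.add_edge("D-101", "H-1", dist_m=1.5)

# A route is a node sequence whose every hop is checkable against stored
# attributes (e.g., reject routes that pass through inaccessible doors).
route = ["MP-101", "D-101", "H-1"]
assert all(G.has_edge(a, b) for a, b in zip(route, route[1:]))
```

A route grounded this way can be validated hop by hop, which is exactly the kind of check a polished one-shot text response cannot offer.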

Evidence ledger

The strongest claims in the brief, along with the confidence and citation depth behind them.

capability · high confidence · p.1, p.3

The paper claims a multi-agent pipeline can convert a single floor plan image into a structured spatial knowledge graph and route instructions.

capability · high confidence · p.1

The proposed workflow outperforms single-call LLM baselines on the UMBC MP-1 and MP-3 floor-plan evaluations.

stack · high confidence · p.4, p.5

The system relies on a structured stack: graph retrieval, semantic retrieval, visual grounding, and BFS path planning rather than free-form generation alone.
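The brief names BFS path planning as one component of that stack. A minimal sketch of fewest-hop BFS over an adjacency map follows; the node names reuse the hypothetical graph example above and are not the paper's identifiers.

```python
from collections import deque

def bfs_route(adj: dict[str, list[str]], start: str, goal: str) -> list[str] | None:
    """Fewest-hop route over the spatial graph, or None if unreachable."""
    parent: dict[str, str | None] = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Reconstruct the path by walking parents back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in adj.get(node, []):
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return None

# Example: route from an office through a door into a hallway.
adj = {"MP-101": ["D-101"], "D-101": ["MP-101", "H-1"], "H-1": ["D-101"]}
print(bfs_route(adj, "MP-101", "H-1"))  # ['MP-101', 'D-101', 'H-1']
```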

caveat · high confidence · p.5, p.7

The current evidence does not establish field-ready safety: localization uses fiducial markers and the real-world pilot did not involve BLV participants.

Related briefs

More plain-English summaries from the archive with nearby topics or operator relevance.

cs.LG

AutoSurrogate: An LLM-Driven Multi-Agent Framework for Autonomous Construction of Deep Learning Surrogate Models in Subsurface Flow

Jiale Liu, Nanzhe Wang

cs.IR

Don't Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG

Yiqun Sun, Pengfei Wei, Lawrence B. Hsieh

cs.LG

Bridging MARL to SARL: An Order-Independent Multi-Agent Transformer via Latent Consensus

Zijian Zhao, Jing Gao, Sen Li

cs.LG

A multimodal and temporal foundation model for virtual patient representations at healthcare system scale

Andrew Zhang et al.

Thank you to arXiv for use of its open access interoperability. This product was not reviewed or approved by, nor does it necessarily express or reflect the policies or opinions of, arXiv.