The final ACM camera-ready version of my IAIT2026 paper #9799 is now complete:
A Validation and Governance Framework for Multi-Agent LLM Scientific Software Development
The paper focuses on how LLM-generated systems can appear correct at the test level while still failing at the process, architecture, or governance level.
The core idea is simple:
Passing tests is not enough.
Multi-agent LLM software systems need structured validation, staged review, traceable artifacts, and governance controls that make the development process inspectable rather than merely executable.
The paper argues that scientific software built with LLM agents should not be judged only by whether code runs or tests pass. It should also be evaluated by how the code was produced, reviewed, validated, and governed.
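To make the "passing tests is not enough" point concrete, here is a minimal Ruby sketch of an acceptance gate that looks at review stages and traceable artifacts in addition to test status. It is only an illustration under my own assumptions, not the framework from the paper: the `Change` struct, the stage names in `REQUIRED_REVIEWS`, the artifact names in `REQUIRED_ARTIFACTS`, and both helper methods are hypothetical.

```ruby
# Hypothetical sketch: these names and stages are illustrative assumptions,
# not definitions taken from the paper.
Change = Struct.new(:id, :tests_passed, :reviews, :artifacts, keyword_init: true)

# Assumed review stages and artifact kinds that a change must carry.
REQUIRED_REVIEWS   = %i[design code validation].freeze
REQUIRED_ARTIFACTS = %i[prompt_log diff test_report].freeze

# Returns a list of reasons the change cannot be accepted.
# An empty list means every gate passed.
def governance_findings(change)
  findings = []
  findings << "tests failed" unless change.tests_passed

  missing_reviews = REQUIRED_REVIEWS - change.reviews
  unless missing_reviews.empty?
    findings << "missing review stages: #{missing_reviews.join(', ')}"
  end

  missing_artifacts = REQUIRED_ARTIFACTS - change.artifacts.keys
  unless missing_artifacts.empty?
    findings << "missing artifacts: #{missing_artifacts.join(', ')}"
  end

  findings
end

# A change is accepted only when tests pass AND the process checks are clean.
def accepted?(change)
  governance_findings(change).empty?
end

# Tests are green, but only one review stage ran and only the diff was kept.
green_but_unreviewed = Change.new(
  id: "chg-1",
  tests_passed: true,
  reviews: [:code],
  artifacts: { diff: "diffs/chg-1.patch" }
)

puts accepted?(green_but_unreviewed)
puts governance_findings(green_but_unreviewed)
```

In this sketch the example change is rejected even though its tests pass, because two review stages and two artifacts are missing. Returning the findings as a list, rather than a bare boolean, is what keeps the decision inspectable.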
This work grew out of my ongoing experiments with multi-agent LLM software development, Ruby/Rails prototypes, blackboard-style coordination, and practical validation workflows.
The PDF is attached below for anyone interested.
IAIT2026_ACM_Paper_9799_Final.pdf (457.7 KB)