Human-in-the-loop as a design choice for AI in Systems Engineering
You can trust AI — up to the level at which you can verify it
AI can speed up Systems Engineering: summarizing requirements, drafting specifications, detecting inconsistencies, analyzing changes, preparing verification plans. In construction, infrastructure, energy transition and other complex technical projects, “faster” is only valuable if it remains controllable. SE is not a text production process — it is a proof and decision‑making process. That is precisely why human‑in‑the‑loop is not optional, but an explicit design choice in your way of working.
In this article, I work towards one practical principle: I can trust AI up to the level at which I can verify, trace and audit its output — and everything above that requires explicit human oversight, with clear acceptance rules.
Why human‑in‑the‑loop is indispensable when using AI in Systems Engineering
SE requires evidence, not just plausible text
SE is about managing complexity through explicit agreements: what is the requirement, why does it exist, what is the impact of a change, and how do I demonstrably show that the system complies? Requirements engineering — with quality attributes such as unambiguity, traceability and completeness — is fundamental here. ISO/IEC/IEEE 29148 is a well‑known standard in this domain.
AI output, no matter how well worded, is not yet evidence. It is a proposal that only becomes useful once I can demonstrate which sources were used, which assumptions were made, how it fits into the baseline, and what review and acceptance have taken place. Without that step, AI speeds up text production — while SE needs an accelerator for decision‑making with evidence.
“People make mistakes too” is true — but SE is specifically designed to catch errors in a controlled way
People make mistakes, especially under time pressure. SE processes — reviews, baselines, change control, V&V, peer checks — exist precisely to systematically reduce errors and make them visible before acceptance. AI adds a new error pattern that falls outside that classical picture: errors that are produced quickly and at scale, are convincingly worded, and have no inherent source citation or reasoning trail.
Incorrect assumptions can quietly propagate into requirements, interfaces, verification claims or safety arguments. Once they are embedded in the chain, they are difficult and costly to correct. In this light, human‑in‑the‑loop is not a sign of distrust in AI — it is risk management aligned with how SE works.
Hallucinations: the problem is real — and cannot be configured away
What research shows
“Hallucination” — model output that is factually incorrect or not well grounded in source material — is not a uniform concept. Definitions and measurement methods differ by context and task. That makes it unrealistic to claim that the problem is definitively solved with a single configuration, prompt, or tool choice.
For SE, this is especially relevant because we work with context‑dependent definitions (interfaces, environmental conditions, design constraints), normative claims (“must”, “shall”, “may not”), and chains of dependencies in which a small error can have major downstream impact.
Where SE is vulnerable
In practice, the risks are highest where:
- Numbers and limits matter — tolerances, load cases, performance requirements.
- References are crucial — standards, contract requirements, system requirements.
- Cross‑document consistency counts — CONOPS, requirement set, interface control, verification plan.
The danger is not only that the output is wrong. The danger is that the team — under time pressure — too quickly labels the output as plausible and incorporates it into artefacts that later carry heavy contractual, safety‑related or operational weight. Convincing output is not proof of correctness. In SE, where evidencing is the norm, plausibility is not enough.
The pitfall: speed plus persuasiveness leads to faster propagation
Automation bias as a human phenomenon
Human factors research has long described that people using automation tools tend to over‑rely on the tool, verify less actively, and detect deviations too late. This is called automation bias or complacency — and it is not a matter of carelessness, but a cognitive phenomenon that arises as soon as tooling is perceived as reliable.
What further reinforces this with AI language models is the presentation quality. An LLM produces fluent, well‑structured text in the correct domain register. That increases perceived credibility — and thus the risk that verification steps are skipped.
How this translates to SE
In construction, infrastructure, energy transition and other complex technical projects, three situations are typically risky:
- Requirement interpretation under time pressure: AI provides a neat interpretation of a contract requirement. The team adopts it as “the intent is clear”, while a single nuance later leads to rework or claims.
- Change impact analyses: AI produces an apparently complete impact list. The team trusts its completeness and misses an interface dependency or a verification artefact that also needs updating.
- V&V artefacts: AI generates test cases that appear logical but do not align with the real acceptance criteria or are not traceable to requirements.
The essence: AI makes it easier to push ahead — and that increases the need for explicit review and acceptance rules.
What does work: verification loops and explicit sign‑off
The pattern: draft → verify → revise
The most robust pattern for AI in evidence‑sensitive processes is the verification loop. AI never produces the final product, only a reviewable proposal. That proposal then goes through an explicit verification step before it is allowed into the baseline. This aligns with existing SE practices — but must be made explicit and embedded as soon as AI tooling is used.
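To make the pattern concrete, here is a minimal Python sketch of such a loop. The `generate` and `review` callables are placeholders for an AI-assisted drafting step and a human review step; the names and fields are illustrative, not a specific tool's API.

```python
from dataclasses import dataclass, field

@dataclass
class Draft:
    text: str
    sources: list[str] = field(default_factory=list)  # citations the proposal claims to rest on
    status: str = "draft"                              # draft -> in_review -> accepted

def verification_loop(generate, review, max_rounds: int = 3) -> Draft | None:
    """Run draft -> verify -> revise until a reviewer accepts or rounds run out.

    `generate(feedback)` returns a Draft (AI-assisted); `review(draft)` is a human
    step returning (accepted, feedback). Nothing leaves the loop without sign-off.
    """
    feedback = ""
    for _ in range(max_rounds):
        draft = generate(feedback)           # AI produces a reviewable proposal
        draft.status = "in_review"
        accepted, feedback = review(draft)   # explicit human verification step
        if accepted:
            draft.status = "accepted"        # only now is it a baseline candidate
            return draft
    return None                              # no acceptance: nothing enters the baseline
```

The design choice is deliberate: the loop can only end in an accepted artefact or in nothing at all, never in an unreviewed artefact slipping through.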
Concrete controls I would secure in SE
1) Output classification — what is this, actually?
- Draft / idea: may be shared quickly, but not placed in the baseline.
- Proposal for requirement or decision: may only go to review with source and trace.
- Baseline candidate: requires explicit sign‑off by an authorized role.
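Expressed as code, this classification becomes a simple gate. The classes and fields below are an illustrative sketch, not a prescribed data model; the point is that nothing reaches the baseline without trace and sign-off.

```python
from enum import Enum

class OutputClass(Enum):
    DRAFT = "draft"                   # may be shared, never baselined
    PROPOSAL = "proposal"             # goes to review only with source and trace
    BASELINE_CANDIDATE = "candidate"  # needs sign-off by an authorized role

def may_enter_baseline(kind: OutputClass, has_trace: bool, signed_off_by: str | None) -> bool:
    """Illustrative gate: only a traced, signed-off baseline candidate passes."""
    if kind is not OutputClass.BASELINE_CANDIDATE:
        return False
    return has_trace and signed_off_by is not None
```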
2) Source and traceability requirements for AI output
- No source or trace = no acceptance as requirement, interface agreement or verification claim.
- Every normative sentence (“must”, “shall”) receives a trace link to a source: contract, standard, stakeholder requirement.
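A simple automated pre-check can support this rule before human review. The sketch below assumes requirements are available as records with an `id`, `text` and `trace` field (an illustrative shape, not a specific tool's data model) and flags normative sentences without a trace link.

```python
import re

# Normative signal words; extend to match your contract language.
NORMATIVE = re.compile(r"\b(shall|must|may not)\b", re.IGNORECASE)

def untraced_normative_requirements(requirements: list[dict]) -> list[str]:
    """Return the ids of requirements that use normative language but have no trace link."""
    flagged = []
    for req in requirements:
        if NORMATIVE.search(req["text"]) and not req.get("trace"):
            flagged.append(req["id"])   # no source or trace = no acceptance
    return flagged
```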
3) Mandatory challenge step
- Have AI (or a second prompt) explicitly search for contradictions or missing conditions.
- Have an engineer assess and sign off on that step.
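What such a challenge step could look like in code, as a hedged sketch: `call_model` and `engineer_sign_off` are placeholders for whatever LLM client and review workflow you use, and the prompt wording is only an example.

```python
CHALLENGE_PROMPT = (
    "Act as a critical reviewer. For the requirement below, list contradictions "
    "with the cited sources, missing conditions, and unstated assumptions. "
    "If you find none, say so explicitly.\n\n"
    "Requirement:\n{requirement}\n\nSources:\n{sources}"
)

def challenge_step(requirement: str, sources: str, call_model, engineer_sign_off) -> dict:
    """Second-pass challenge: the model critiques, an engineer assesses and signs off."""
    critique = call_model(CHALLENGE_PROMPT.format(requirement=requirement, sources=sources))
    accepted, reviewer = engineer_sign_off(critique)   # the human step is mandatory
    return {"critique": critique, "accepted": accepted, "reviewer": reviewer}
```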
4) Four‑eyes principle for high‑impact items
- For items with high impact — safety, compliance, major cost, interface agreements — a second reviewer must perform an independent check.
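Expressed as a rule, this could look as follows; the impact categories and thresholds are illustrative.

```python
HIGH_IMPACT = {"safety", "compliance", "major_cost", "interface"}

def four_eyes_satisfied(impact: str, author: str, reviewers: set[str]) -> bool:
    """High-impact items need an extra independent check; the author never counts as a reviewer."""
    independent = reviewers - {author}
    required = 2 if impact in HIGH_IMPACT else 1
    return len(independent) >= required
```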
5) Logging for audit
- Store prompt, context, output, sources used and who performed which acceptance — so that you can later reconstruct why something was adopted.
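A minimal logging sketch, assuming a JSON Lines file as the audit store; the field names and decision values are examples to adapt to your own quality system.

```python
import hashlib
import json
from datetime import datetime, timezone

def log_ai_decision(path: str, prompt: str, context: str, output: str,
                    sources: list[str], accepted_by: str, decision: str) -> None:
    """Append one audit record per acceptance decision (one JSON object per line)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "context_sha256": hashlib.sha256(context.encode()).hexdigest(),  # fingerprint of the input set
        "output": output,
        "sources": sources,
        "accepted_by": accepted_by,   # who performed which acceptance
        "decision": decision,         # e.g. "accepted", "rejected", "revise"
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```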
This is human‑in‑the‑loop as process design: the human is not the last resort, but an explicit control layer with authorities and responsibilities.
Governance: from using AI to controlling AI
NIST AI RMF as a structuring framework
For governance, it is useful to structure AI risks instead of reacting ad hoc to incidents. NIST AI RMF 1.0 offers a useful framework with four functions: govern, map, measure, and manage. The core idea: AI risks are not a technical problem of the tool, but a governance issue for the organization that deploys the tool.
For SE projects, this helps position AI not as mere tooling, but as a capability with scope (where to use it and where not), risks (what can go wrong and with what impact), controls (which checks and who signs off) and monitoring (how to detect drift or quality issues).
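One way to make that positioning tangible is a capability record per AI use case. The example below is purely illustrative and loosely ordered along the four RMF functions; the entries are not prescriptive.

```python
# Illustrative capability record for one AI use case in an SE project.
ai_capability = {
    "name": "AI-assisted requirement drafting",
    "scope": {                                     # map: where it is and is not used
        "allowed": ["draft requirements", "impact-analysis suggestions"],
        "excluded": ["safety cases", "final verification claims"],
    },
    "risks": ["hallucinated references", "missed interface dependencies"],     # map / measure
    "controls": ["source-and-trace rule", "four-eyes on high impact"],         # manage
    "monitoring": ["review rejection rate", "post-baseline defect findings"],  # measure
    "owner": "SE manager",                          # govern: the accountable role
}
```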
RACI for acceptance: who is allowed to mark something as “seen and accepted”?
If you make one thing explicit, make acceptance authority explicit. Without this clarity, AI‑driven speed almost automatically leads to faster propagation.
- Responsible: who creates or updates the artefact (with AI as a helper)?
- Accountable: who accepts and baselines it (and carries the risk)?
- Consulted: who must review the content — discipline owners, interface owners.
- Informed: who must be kept informed — project management, QA, client.
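As a sketch, the acceptance authority can be captured per artefact type, so that tooling can enforce who is allowed to baseline. The roles and artefact types below are examples, not a prescribed assignment.

```python
# Illustrative RACI record per artefact type.
RACI = {
    "requirement_set": {
        "responsible": "requirements engineer (AI-assisted)",
        "accountable": "lead systems engineer",   # accepts and baselines, carries the risk
        "consulted": ["discipline owners", "interface owners"],
        "informed": ["project management", "QA", "client"],
    },
}

def acceptor(artefact_type: str) -> str:
    """Only the Accountable role may mark an artefact as 'seen and accepted'."""
    return RACI[artefact_type]["accountable"]
```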
ISO/IEC/IEEE 29148 remains fully in force as the quality bar. The standard does not change because requirements have been drafted by an AI — the verification threshold stays the same.
Conclusion
You can trust AI up to the level at which you can verify it.
This is not a cynical conclusion — it is a workable and practical principle. It means that AI can be used very effectively to arrive more quickly at analyses and concept artefacts, as long as the verification process is at least as good as, or better than, without AI.
Human‑in‑the‑loop is not a brake on innovation and not a temporary measure until AI is “good enough”. It is an explicit design choice in the SE chain — just as deliberate as a verification milestone or a baseline review. Organizations that set this up properly combine AI’s speed advantage with the evidencing that complex engineering requires.
In high‑impact projects, trust is not a feeling. It is something you build with verification, traceability and demonstrability. AI may accelerate — the human remains the owner of acceptance.
About Basewise: Basewise supports organizations in construction, infrastructure, energy transition, shipbuilding and offshore with the application of Systems Engineering. We combine domain expertise with methodical guidance — including for the responsible use of AI in SE processes.