Securing AI in federal and defense missions: A multi-level approach
As the federal government accelerates artificial intelligence adoption under the national AI Action Plan, agencies are racing to bring AI into mission systems. The Defense Department, in particular, sees the potential of AI to help analysts manage overwhelming data volumes and maintain an advantage over adversaries.
Yet most AI projects never make it out of the lab, not because models are inadequate, but because the data foundations, traceability and governance around them are too weak. In mission environments, especially on-premises and air-gapped cloud regions, trustworthy AI is impossible without secure, transparent and well-governed data.
To deploy AI that reaches production and operates within classification, compliance and policy constraints, federal leaders must view AI security in layers.
Levels of security and governance
AI covers a wide variety of fields such as machine learning, robotics and computer vision. For this discussion, let's focus on one of AI's fastest-growing areas: natural language processing and generative AI used as decision-support tools.
Under the hood, these systems, based on large language models (LLMs), are complex "black boxes" trained on vast amounts of public data. On their own, they have no understanding of a specific mission, agency or theater of operations. To make them useful in government, teams typically combine a base model with proprietary mission data, often using retrieval-augmented generation (RAG), where relevant documents are retrieved and used as context for each answer.
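As a rough illustration, the pattern looks something like the sketch below. The toy keyword retriever, document store and prompt format are placeholders, not any particular product's API; a production system would use a vector database and an enclave-hosted or managed model.

```python
# Minimal RAG sketch: retrieve relevant mission documents, then build a grounded prompt.
# The retriever and document store here are placeholders, not a real product API.

def retrieve(query: str, index: list[dict], top_k: int = 3) -> list[dict]:
    """Stand-in for a vector search: score documents by shared query terms."""
    terms = query.lower().split()
    scored = [(sum(term in doc["text"].lower() for term in terms), doc) for doc in index]
    return [doc for score, doc in sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k] if score > 0]

def build_prompt(query: str, docs: list[dict]) -> str:
    """Inject retrieved documents as context, with document IDs kept for traceability."""
    context = "\n\n".join(f"[{d['doc_id']}] {d['text']}" for d in docs)
    return (
        "Answer using only the context below and cite document IDs.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

index = [
    {"doc_id": "RPT-001", "text": "Logistics summary covering theater resupply routes."},
    {"doc_id": "RPT-002", "text": "Maintenance readiness report for ground vehicles."},
]
print(build_prompt("status of resupply routes", retrieve("status of resupply routes", index)))
```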
That's where the security and governance challenges begin.
Layer 1: Infrastructure, a familiar foundation
The good news is that the infrastructure layer for AI looks much like that of any other high-value system. Whether an agency is deploying a database, a web application or an AI service, the same authority to operate (ATO) processes, network isolation, security controls and continuous monitoring apply.
Layer 2: The challenge of securing AI-augmented data
The data layer is where AI security diverges most sharply from commercial use. In RAG systems, mission documents are retrieved as context for model queries. If retrieval doesn't enforce classification and access controls, the system can generate results that cause security incidents.
Imagine a single AI system indexing documents at multiple classification levels. Deep in the retrieval layer, the system pulls a highly relevant document to augment the query, but it's beyond the analyst's access level. The analyst never sees the original document, only a neat, summarized answer that is also a data spill.
The next frontier for federal AI depends on granular, attribute-based access control.
Every document, and every vectorized chunk, must be tagged with classification, caveats, source system, compartments and existing access control lists. This is often addressed by building separate "bins" of classified data, but that approach leads to duplicated data, lost context and operational complexity. A safer and more scalable approach is a single semantic index with strong, attribute-based filtering.
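A minimal sketch of that filtering pattern follows; the classification levels, compartments and field names are illustrative only, and a real deployment would enforce this in the retrieval service on every query, before any chunk reaches the model.

```python
# Sketch of attribute-based filtering at the retrieval layer.
# Classification levels, compartments and field names are illustrative only.

CLASSIFICATION_ORDER = {"UNCLASSIFIED": 0, "CONFIDENTIAL": 1, "SECRET": 2, "TOP SECRET": 3}

def is_releasable(chunk: dict, user: dict) -> bool:
    """A chunk is releasable only if the user's clearance dominates its classification
    and the user holds every compartment the chunk is marked with."""
    level_ok = CLASSIFICATION_ORDER[user["clearance"]] >= CLASSIFICATION_ORDER[chunk["classification"]]
    compartments_ok = set(chunk["compartments"]).issubset(user["compartments"])
    return level_ok and compartments_ok

def filter_candidates(candidates: list[dict], user: dict) -> list[dict]:
    """Drop anything the requesting analyst is not cleared to see before it becomes model context."""
    return [c for c in candidates if is_releasable(c, user)]

analyst = {"clearance": "SECRET", "compartments": {"ALPHA"}}
candidates = [
    {"doc_id": "A-17", "classification": "SECRET", "compartments": ["ALPHA"], "text": "..."},
    {"doc_id": "B-04", "classification": "TOP SECRET", "compartments": ["BRAVO"], "text": "..."},
]
print([c["doc_id"] for c in filter_candidates(candidates, analyst)])  # ['A-17']
```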
Layer 3: Models and the AI supply chain
Agencies may use managed models, fine-tune their own, or import third-party or open-source models into air-gapped environments. In all cases, models should be treated as part of a software supply chain:
- Keep models inside the enclave so prompts and outputs never cross uncontrolled boundaries.
- Protect training pipelines from data poisoning, which can skew outputs or introduce hidden security risks.
- Rigorously scan and test third-party models before use (a minimal verification sketch follows this list).
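One concrete control for that last point, sketched below, is to keep an allowlist of vetted model artifacts and refuse to load anything whose checksum does not match. The file names, paths and hash values here are hypothetical.

```python
# Sketch: verify a third-party model artifact against a vetted allowlist before
# loading it inside the enclave. File names, paths and hashes are hypothetical.

import hashlib
from pathlib import Path

APPROVED_MODELS = {
    # artifact file name -> SHA-256 recorded when the model was scanned and approved
    "mission-llm-7b.gguf": "9f2c1e5a...vetted-hash-goes-here",
}

def sha256_of(path: Path) -> str:
    """Stream the file in blocks so large model weights never need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()

def verify_model(path: Path) -> bool:
    """Refuse any artifact that is not on the allowlist or whose hash has drifted."""
    expected = APPROVED_MODELS.get(path.name)
    return expected is not None and sha256_of(path) == expected

model_path = Path("models/mission-llm-7b.gguf")
if model_path.exists() and verify_model(model_path):
    print("Artifact matches its vetted hash; safe to load.")
else:
    print("Artifact missing or unverified; do not load.")
```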
Without clear policy around how models are acquired, hosted, updated and retired, it's easy for "one-off experiments" to become long-term risks.
The challenge at this level lies in the "parity gap" between commercial and government cloud regions. Commercial environments receive the latest AI services and their security enhancements much earlier. Until those capabilities are authorized and available in air-gapped regions, agencies may be forced to rely on older tools or build ad hoc workarounds.
Governance, logging and responsible AI
AI governance has to extend beyond the technical team. Policy, legal, compliance and mission leadership all have a stake in how AI is deployed.
Three themes matter most:
- Traceability and transparency. Analysts must be able to see which sources informed a result and verify the underlying documents.
- Deep logging and auditing. Each query should record who asked what, which model ran, what data was retrieved, and which filters were applied (see the sketch after this list).
- Alignment with emerging frameworks. DoD's responsible AI principles and the National Institute of Standards and Technology's AI risk guidance offer structure, but only if policy owners understand AI well enough to apply them, making education as critical as technology.
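As a sketch of what such an audit record might contain (field names are illustrative, not a mandated schema), each query can be captured as a structured, append-only entry:

```python
# Sketch of a structured audit record for each AI query.
# Field names are illustrative, not a mandated schema.

import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class AuditRecord:
    user_id: str
    query: str
    model_id: str
    retrieved_doc_ids: list[str]
    filters_applied: list[str]
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

record = AuditRecord(
    user_id="analyst-042",
    query="Summarize current resupply status",
    model_id="mission-llm-7b",
    retrieved_doc_ids=["RPT-001", "A-17"],
    filters_applied=["clearance<=SECRET", "compartment:ALPHA"],
)
print(json.dumps(asdict(record)))  # append to a write-once audit store
```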
Why so many pilots stall, and how to break through
Industry estimates suggest that up to 95% of AI projects never make it to full production. In federal environments, the stakes are higher, and the barriers are steeper. Common reasons include vague use cases, poor data curation, lack of evaluation to detect output drift, and assumptions that AI can simply be "dropped in."
Data quality in air-gapped projects is also a factor. If a query is about "missiles" but the system is mostly indexed with documents about "tanks," analysts can expect poor or fabricated answers, often called "AI hallucinations." They won't trust the tool, and the project will quietly die. AI cannot invent high-quality mission data where none exists.
There are no "quick wins" for AI in classified missions, but there are smart starting points:
- Start with a focused decision-support problem.
- Inventory and tag mission data.
- Bring security and policy teams in early.
- Establish an evaluation loop to test outputs (a minimal sketch follows this list).
- Design for traceability and explainability from day one.
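A minimal sketch of that evaluation loop follows; the test questions, answer function and scoring rule are placeholders, and a real loop would combine richer metrics with human review.

```python
# Sketch of a recurring evaluation loop to catch output drift.
# The test set, answer function and scoring rule are placeholders.

def answer(question: str) -> str:
    """Stand-in for a call to the deployed decision-support pipeline."""
    return "Resupply routes are operating on schedule."

def score(answer_text: str, expected_terms: list[str]) -> float:
    """Crude keyword coverage; real evaluations need richer metrics and human review."""
    hits = sum(term in answer_text.lower() for term in expected_terms)
    return hits / len(expected_terms)

EVAL_SET = [
    {"question": "What is the resupply status?", "expected_terms": ["resupply", "schedule"]},
    {"question": "Which routes are operating?", "expected_terms": ["routes", "operating"]},
]

THRESHOLD = 0.8
scores = [score(answer(case["question"]), case["expected_terms"]) for case in EVAL_SET]
average = sum(scores) / len(scores)
print(f"Average score {average:.2f}:", "OK" if average >= THRESHOLD else "FLAG possible drift")
```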
Looking ahead
In the next three to five years, we can expect AI platforms, both commercial and government, to ship with stronger built-in security, richer monitoring and more robust audit features. Agent-based AI pipelines that can autonomously pre-filter queries and post-process answers (for example, to enforce sentiment policies or redact personally identifiable information) will become more common. Yet even as these security capabilities improve, national security environments face a unique challenge: the consequences of failure are too high to rely on blind automation.
Agencies that treat AI as a secure system grounded in strong data governance, layered protections and educated leadership will be the ones that move beyond pilots to real mission capability.
Ron Wilcom is the director of innovation for Clarity Business Solutions.
