Moving Fast Doesn’t Have to Break Things: The U.S. Must Stop Compromising Critical Infrastructure with Patchwork AI Security Approaches

Concentric protective rings with the outermost ring broken into fragments — illustration of PETs as the missing layer in U.S. critical-infrastructure AI security

The System Prompt

AI models today are capable but unreliable, yet pressure to adopt AI in high-risk environments is growing. Critical infrastructure owners and operators, who range from major firms to one-person local water authorities, are the companies and agencies responsible for systems whose disruption would harm national security, public health, or the economy. These operators are now widely deploying high-risk AI use cases, from AI management of oil and gas pipelines to agentic financial marketplaces, AI security screening decisions, and autonomous railroad operations. Leaps in AI capability are accelerating this adoption by lowering barriers to entry and automating aspects of the software development process. As AI systems become more embedded in critical operations, the data that powers them becomes critical infrastructure in its own right.

As AI systems scale and become more capable, their underlying data sources must be managed with technologies that ensure their reliability, verifiability, and security. In high-risk critical infrastructure deployments, unexpected outputs or system failures could affect human health and wellbeing at local, regional, national, or international scale. AI systems deployed for these use cases should meet the highest possible standards of safety and security assurance. Addressing these vulnerabilities requires both technical and governance solutions that go beyond current cybersecurity and data handling best practices.

The Context Window

After an initial period of caution following ChatGPT’s release in November 2022, critical infrastructure owners and operators began deploying AI across increasingly high-risk use cases, driven by a combination of increased AI capability, wider availability, and financial pressure to find cost efficiencies. These operational technology deployments often lag behind the state of the art in AI systems architecture and software stacks, but nonetheless introduce new and capable commercial AI models into high-risk environments. The growing size and complexity of training data, combined with ever-shorter AI model development and deployment cycles, puts extraordinary pressure on AI security practitioners, who have limited time and authority to perform the additional checks required for safe deployment.

AI systems in high-risk environments remain vulnerable through several distinct attack surfaces. Training data may be corrupted or poisoned before ingestion, introducing systematic errors. Models deployed via cloud service providers often expose sensitive operational data to third-party systems. Post-deployment retraining loops can introduce subtle model drift that is difficult to detect. AI models in high-risk settings are also hard to predict and control, which has led to multilayered patchwork approaches such as AI system moderation platforms and ‘harnesses’. Moreover, it remains difficult to identify precisely how a model’s training data affects its outputs, making models hard to interpret or secure. Many of these vulnerabilities remain unsolved because they are characteristics of the current generation of transformer architectures, while others can be addressed with wider adoption of existing data security frameworks.

Despite initial adoption of voluntary AI risk management procedures after ChatGPT’s release, AI risk management is uneven and lacks mandatory requirements or system certifications. Frameworks such as the U.S. National Institute of Standards and Technology’s (NIST) AI Risk Management Framework (AI RMF) lack specific guidance for critical infrastructure operators, resulting in AI systems in high-risk environments that remain inadequately tested, evaluated, and managed. Initial NIST and IEEE AI security standards were never meant to provide industry-specific controls or establish high-stakes critical infrastructure guardrails. NIST’s Center for AI Standards and Innovation (CAISI) has continued to publish updated guidance alongside other industry-specific frameworks to make its frameworks more usable, but critical systems owners continue to face novel AI implementation challenges without any AI-specific minimum safety or security requirements. Comprehensive AI security approaches for critical infrastructure operators remain difficult to implement due to a number of factors, including lack of consensus on safety controls, limited talent availability, nonstandard hardware, and industrial sector complexity.

The Emerging Crisis

In the United States, critical infrastructure was initially defined by Presidential Policy Directive 21. It refers to systems, networks, and assets (physical or virtual) whose incapacitation or destruction would have a significant impact on national security, economic stability, public health, or safety. Critical infrastructure cybersecurity is a key focus of the Trump Administration’s March 2026 Cyber Strategy (Pillar Four). Critical infrastructure sectors include everything from energy to communications to defense manufacturing. Critical infrastructure owners and operators, 80 percent of whom are private sector companies, implement AI use cases that range from analytics dashboards for Supervisory Control and Data Acquisition (SCADA) augmentation to real-time load management, service optimization, and, recently, multi-agent decision-making support tools.

After adopting enterprise AI tools like Microsoft Copilot or Google Gemini from cloud service providers (CSPs), critical infrastructure owners and operators implemented AI use cases such as AI-generated detection and alerting across their physical and digital infrastructure, or decision-support capabilities to interpret energy grid, power plant, transit, water treatment, shipping, and healthcare data. Recent approaches automate existing workflows with AI agents and applications to support emergency response and law enforcement, adding new sensors to collect data for AI systems to interpret and assess. Each of these additions feeds post-deployment retraining loops that enhance AI’s value for critical infrastructure owners and operators, while also increasing the potential for data vulnerabilities. With each addition, the web of interconnected agents, tools, and databases grows larger, more valuable for critical applications, and more challenging to secure. This complexity is not a reason to limit AI adoption, but it requires that security and interpretability be built in from the start.

AI security, like cybersecurity, is an ongoing process to be actively managed and refined, not an end state to achieve. Security requires data sharing between systems service operators, employees on the ground, suppliers, and regulators. Such sensitive data sharing is only possible through rigorous data accuracy maintenance and access management controls. These include zero-trust architectures, authorization controls, regular updates to datasets, data quality validation, and pre-deployment testing and simulations to measure system performance under unexpected or out-of-pattern conditions.

Unlike earlier generations of automation, which critical infrastructure IT system operators hard-coded with strict, specific, deterministic rules, modern deployed AI systems operate with a hybrid mix of nondeterministic responses from AI models and ad hoc moderation rules that work to limit adverse outputs. These AI systems learn from data, retain statistical representations of that data, and surface insights from datasets. They can also be adversarially manipulated in ways that are difficult to predict or control, such as prompt injection, where instructions or code fed into the model can induce it to share user information or divulge its weights. Data integrity, which refers to the accuracy, consistency, and reliability of data, can be damaged through data compression, obsolescence, or lack of operational evaluation by critical infrastructure owners or operators. These owners and operators often lack access to the proprietary training data and data controls of the commercial models they deploy, or lack the internal talent to evaluate AI system development or deployment.

Data integrity requires control over the data lifecycle, from collection to labeling and structuring. Unclear integrity in critical infrastructure applications often occurs because critical infrastructure owners purchase third-party commercial models from CSPs or use open-source models available in model libraries like Google Model Garden, Hugging Face, or AWS Bedrock, without retraining the model or evaluating the limitations of its underlying training data. Critical infrastructure providers may also use Retrieval-Augmented Generation (RAG) approaches to make fully trained models responsive to their proprietary data and use cases. This can lead to unexpected failure modes, such as model drift, as the underlying base model loses coherence when it clashes with proprietary data introduced after the fact that may contain inconsistent or contradictory data points. For critical infrastructure AI deployers, an “open-source” model may not be truly “open”, as there is no widely agreed standard of AI system transparency. An AI model may share its weights or information about its training data without fully disclosing modifications to the datasets themselves or the training techniques used to develop the model. This information asymmetry can limit the ability of a critical infrastructure IT owner to assess the capabilities or constraints of the model, and is in part a result of the challenges in interpreting how current AI models reach their conclusions.

In critical infrastructure settings, established AI system challenges carry physical consequences: model drift could lead a camera to label smoke as fog, and model decay or collapse following the loss of a critical data source could misdirect billions in cargo. As critical infrastructure organizations move to deploy AI agents at scale, AI systems increasingly generate new data through their use and sensor integrations. That data may then be used to automatically retrain the agents or underlying model over time, resulting in unexpected outputs that are both novel and untested by the AI engineers overseeing the AI system. Because critical infrastructure owners and operators often lack the financial or human resources to train their own foundation models, remediation options are limited to post-training interventions, akin to retrofitting rather than redesign, even when testing identifies risks.

The Response: Secure and Private System Design

To address the security concerns that result from opaque, centralized, patchwork approaches with unknown supply chain dependencies, AI systems deployed in critical infrastructure should adopt security frameworks that emphasize transparency, asset decentralization, interoperability, and stronger data and privacy controls. The U.S. has worked to solve this problem in critical hardware through supply chain monitoring, governance, and modular open system design. This approach should apply to AI as well. Critical infrastructure operators continue to operate AI systems with limited information about the models they deploy, which calls for additional transparency requirements across the critical infrastructure AI supply chain. That supply chain has a number of key points of failure that decentralization can alleviate, such as centralized databases, dependence on a specific model or version for system function, and reliance on a single hardware or model provider without established alternatives. Interoperability allows a greater depth of alternative providers, cross-compatibility across the AI system stack, and standardized approaches that provide operators with clear options for updating and modifying AI systems over time.

Privacy-Enhancing Technologies (PETs) are one category of tools that can help critical infrastructure owners and operators meet these needs while building systems designed for data protection and privacy. PETs enable encrypted, verifiable controls over information, which can be useful for establishing who has fine-grained permissions to access model assets and engage in the model training process. In critical infrastructure, this means owners and operators can implement originator controls on their data pipelines that dynamically evaluate use requests, resource sensitivity, and environmental context before executing new workloads. This prevents owners and operators from locking themselves into a proprietary ecosystem and protects them against AI-enabled hacks, while also giving them the flexibility to collaborate widely on their own terms. In a collaborative ecosystem, PETs can enable flexible, secure, and verifiable authorization to view, patch, or stress-test models, so that necessary interpretability and transparency audits can be conducted and system trust in high-risk contexts can be continuously verified.
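The originator controls described above can be pictured as a policy check that runs before any workload touches the data. The following is a minimal Python sketch of that idea, not a PETs implementation: it shows only the decision logic, with no encryption or verification layer. The roles, purposes, sensitivity tiers, network zones, and the `evaluate` function are all hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical sensitivity tiers; a real deployment would define its own.
SENSITIVITY = {"public": 0, "internal": 1, "restricted": 2}

# Hypothetical policy tables: highest tier each role may touch, and
# the purposes each role may request.
ROLE_CLEARANCE = {"operator": 2, "auditor": 2, "supplier": 1}
ALLOWED_PURPOSES = {
    "operator": {"inference", "training"},
    "auditor": {"audit"},
    "supplier": {"inference"},
}


@dataclass(frozen=True)
class UseRequest:
    requester_role: str        # who is asking
    purpose: str               # the use request
    resource_sensitivity: str  # key into SENSITIVITY
    network_zone: str          # environmental context, e.g. "on_prem"


def evaluate(request: UseRequest) -> tuple[bool, str]:
    """Return (allowed, reason) for a workload request. Deny by default."""
    clearance = ROLE_CLEARANCE.get(request.requester_role)
    if clearance is None:
        return False, "unknown role"
    if request.purpose not in ALLOWED_PURPOSES[request.requester_role]:
        return False, "purpose not permitted for role"
    level = SENSITIVITY.get(request.resource_sensitivity)
    if level is None:
        return False, "unknown sensitivity tier"
    if level > clearance:
        return False, "resource exceeds role clearance"
    # Environmental rule: restricted data never leaves the operator's premises.
    if level == SENSITIVITY["restricted"] and request.network_zone != "on_prem":
        return False, "restricted data limited to on-prem workloads"
    return True, "allowed"


decision = evaluate(UseRequest("operator", "training", "restricted", "external_cloud"))
```

Returning a reason alongside the decision is a deliberate choice in this sketch: it gives auditors a record of why a workload was refused, which supports the continuous verification the paragraph above calls for.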

PETs underpin frameworks like OpenMined’s Attribution-Based Control (ABC), which centers on the core idea that AI systems should enable a bidirectional relationship between two parties, in which they negotiate which sources are applied and how much capability to create. Under the ABC framework, data-producing and data-consuming critical infrastructure owners and operators would deploy technical approaches to control which AI outputs they support and to calibrate the degree to which they offer that support. Similarly, as AI deployers and consumers of AI models, owners and operators should be offered solutions that provide them the ability to control which data sources they rely on, and to calibrate the degree to which they rely on each data source. In a setting where relying on high-quality contextual data is the difference between life and death, this level of observability and control is essential.
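One way to picture this two-sided calibration is as a pair of dials: the consumer sets how much it wants to rely on each source, and each source owner sets how much support it is willing to give. The Python sketch below is an illustrative assumption about how such a negotiation might be expressed, not OpenMined’s implementation of ABC. The function names, the "take the smaller of the two values" rule, and the source names are all hypothetical.

```python
from __future__ import annotations


def effective_weights(
    requested: dict[str, float], permitted: dict[str, float]
) -> dict[str, float]:
    """Combine both parties' settings into normalized per-source weights.

    requested: how much the consumer wants to rely on each source (0 to 1).
    permitted: how much each source owner agrees to support this use (0 to 1).
    A source contributes only as much as BOTH sides allow.
    """
    raw = {src: min(w, permitted.get(src, 0.0)) for src, w in requested.items()}
    total = sum(raw.values())
    if total == 0:
        raise ValueError("no source is both requested and permitted")
    return {src: w / total for src, w in raw.items()}


def blend(readings: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted combination of per-source estimates."""
    return sum(readings[src] * w for src, w in weights.items())


# Hypothetical example: the vendor feed's owner has withdrawn support,
# so it drops out even though the consumer asked for it.
requested = {"plant_sensors": 1.0, "vendor_feed": 0.5, "public_weather": 0.5}
permitted = {"plant_sensors": 1.0, "vendor_feed": 0.0, "public_weather": 1.0}
weights = effective_weights(requested, permitted)
estimate = blend(
    {"plant_sensors": 30.0, "vendor_feed": 90.0, "public_weather": 60.0}, weights
)
```

Because the weights are explicit, an operator can see exactly which sources shaped a given output and by how much, which is the observability the paragraph above describes.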

AI systems that abide by the ABC framework, or comprehensively adopt PETs, would avoid locking users into a proprietary ecosystem or ongoing contract. PET approaches emphasize greater communication, collaboration, and connectivity between those with data and those seeking insights at scale. Critical infrastructure systems are often complex and data rich, with numerous human and automated monitoring systems that need to securely communicate insights while protecting the underlying sensitive data, such as data about financial counterparties, power plant system designs, or the failure modes of a hydroelectric dam. Depending on who is seeking information from a critical infrastructure provider and their relationship to the AI system (e.g., manager, supplier, or employee), data and privacy controls should automatically determine the appropriate insights, data access, and data fields.
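The idea of controls that determine data fields by relationship can be sketched as a field-level filter applied before any record is returned. This is a simplified Python illustration under stated assumptions: the role names come from the example in the text, while the field names, the `FIELD_POLICY` table, and the `view_for` function are hypothetical. A real PETs deployment would enforce such rules cryptographically rather than in application code.

```python
from __future__ import annotations

# Hypothetical mapping from relationship to the fields that role may see.
FIELD_POLICY: dict[str, set[str]] = {
    "manager": {"asset_id", "status", "failure_mode", "design_doc_ref"},
    "employee": {"asset_id", "status", "failure_mode"},
    "supplier": {"asset_id", "status"},
}


def view_for(role: str, record: dict[str, str]) -> dict[str, str]:
    """Return only the fields the given role is allowed to see.

    Unknown roles get an empty view (deny by default).
    """
    allowed = FIELD_POLICY.get(role, set())
    return {field: value for field, value in record.items() if field in allowed}


# Hypothetical record about a hydroelectric dam component.
record = {
    "asset_id": "dam-7",
    "status": "degraded",
    "failure_mode": "spillway gate actuator",
    "design_doc_ref": "DOC-1",
}
supplier_view = view_for("supplier", record)
```

Here the supplier learns that the asset is degraded without seeing the failure mode or design reference, so the insight is shared while the underlying sensitive data stays protected.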

Deploying PETs in U.S. critical infrastructure systems also reduces the elements of AI model training that lead to unintended model outputs, which can range from hallucinations to sensitive data leaks to catastrophic system failures. PET frameworks enable interpretability and engineering to prevent unintended system behaviors without sacrificing privacy or efficiency. Over time, AI methods used for high-risk use cases should implement privacy-enhancing techniques that enable transparency over the causal relationships between inputs and outputs of AI models, enhancing the overall trustworthiness and reliability of the AI system. PETs also ensure that, even if a threat actor compromises an AI system, data compromise is kept to a minimum. Expanding the adoption of PETs in critical infrastructure AI systems will protect their sensitive data, enhance AI system reliability, verifiability, and resilience, and ultimately win human trust in high-risk AI deployments.

The Moment

U.S. policymakers and security agencies continue to acknowledge the data security risks posed by AI systems deployed in critical infrastructure, especially following the rapid growth of OpenClaw and Moltbot (autonomous AI agents without security controls that can access user hardware and the internet, allowing AI models to act in unpredictable or malicious ways) and Anthropic’s Claude Mythos Preview, which Anthropic deemed too dangerous to release publicly due to its advanced hacking capabilities. Amid a contentious global race to develop more capable AI systems, there is broader consensus on security best practices for high-risk systems than might initially appear, as indicated by the still-growing global network of AI Safety Institutes first announced in November 2024. The Trump Administration’s AI Action Plan, released in July 2025, also aptly called for data, privacy, and integrity controls conformant with the ISO/IEC 42001 AI management system standard and CISA’s Secure by Design and Secure by Default initiatives, including cryptographic hashes and/or checksums to verify that large datasets and critical files have not been manipulated or degraded.
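The hash-based verification mentioned above can be sketched in a few lines using Python’s standard `hashlib` module. This is a minimal illustration, assuming a dataset stored as files under one directory; the function names and the manifest layout (relative path mapped to SHA-256 digest) are hypothetical choices, not a format prescribed by any of the initiatives named in the text.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Compare the files under root against a trusted manifest.

    Returns a sorted list of problems; an empty list means nothing changed.
    """
    current = build_manifest(root)
    problems = []
    for name, expected in manifest.items():
        actual = current.get(name)
        if actual is None:
            problems.append(f"missing: {name}")
        elif actual != expected:
            problems.append(f"modified: {name}")
    for name in current.keys() - manifest.keys():
        problems.append(f"unexpected: {name}")
    return sorted(problems)
```

A typical use would be to call `build_manifest` when a dataset is approved and `verify` before each training or retraining run. The check is only as trustworthy as the manifest itself, so in practice the manifest would need to be signed or stored where the party being checked cannot alter it. A digest mismatch shows that a file changed, but not whether the cause was manipulation or accidental degradation.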

In China, lawmakers are developing an AI security and standards architecture to manage high-risk deployments, and Chinese firms are broadly adopting the Model Context Protocol and other tools to secure and standardize AI agent interactions. Similarly, the European Union’s AI Act establishes binding requirements for high-risk AI systems, including mandatory data governance and transparency obligations that align with PETs’ data governance principles. And in December 2025, joint guidance from CISA, the NSA, the FBI, the United Kingdom, New Zealand, and Australia on AI in operational technology environments explicitly cautioned infrastructure operators to limit the exposure of sensitive data to AI models, particularly when those models are externally hosted or integrated across organizational boundaries. These best practices could ultimately form the basis for future AI Bill of Materials (AI BOM) requirements and safety certifications prior to high-risk deployments, similar to existing certification and inspection requirements for other aspects of critical infrastructure systems.

The Mission

Security and AI efficiency are often in tension. PETs offer a path to both. PETs won’t solve every AI security problem; system design and cybersecurity fundamentals are still vital. Advances in processing, storage, and model training make it more achievable than ever to deploy secure and reliable AI systems at scale with the right controls in place. Currently available PETs can manage complex privacy and data controls at scale for both AI agents and human users, while maintaining least-privilege access, protecting sensitive data, improving system interpretability, and guarding against unintended behaviors. That combination is what high-risk critical infrastructure AI deployments need to be secure.

If the United States wants to build consequential AI systems that are not only the best in the world but also deliver critical services to the American people in the most reliable, innovative, and secure way possible, then data integrity and privacy-enhancing approaches can no longer sit on the margins. Critical infrastructure owners, operators, and policymakers overseeing AI deployment in critical systems should adopt PETs before a major incident forces them to.

Author: Noah Ringler

Category: research, policy

Location: United States of America (USA)

Topics: AI Safety, Privacy-Enhancing Technologies (PETs), Structured Transparency, Attribution-Based Control
