Securing AI Inference: The Overlooked Security Frontier in 2026

Insider Brief:

  • During Securing AI Inference Against Adversarial Threats in 2026, speakers argued that while most AI discourse still centers on model training, the more immediate enterprise risk lies in inference, the operational phase where models are queried and where both proprietary logic and sensitive prompts become exposed.
  • Tyson Macaulay (01Quantum), Subin Alexander (CGI), and Kristin Milchanowski (BMO) identified AI inference as an emerging weak point in cybersecurity architecture, with threats ranging from nation-state data extraction to unintended prompt leakage.
  • Live audience polling underscored the urgency: nearly half of participants reported low confidence that their AI systems meet anticipated 2026 standards, and more respondents ranked “harvest now, decrypt later” risks above model drift as their primary digital trust concern.
  • The panel concluded that organizations should begin post-quantum preparation immediately by inventorying cryptographic dependencies, embedding cryptographic agility into procurement processes, and integrating quantum-resistant techniques such as homomorphic encryption in ways that strengthen security without requiring wholesale infrastructure replacement.
  • Hear the full discussion from the panelists shaping the 2026 AI security agenda here.

Most of the public conversation around AI still centers on training: bigger models, better performance, more compute. But in enterprise environments, the real exposure often begins after the model is built.

During a recent webinar hosted by The Quantum Insider in partnership with 01Quantum, leaders from financial services, cybersecurity, and consulting examined a growing blind spot: AI inference. Titled Securing AI Inference Against Adversarial Threats in 2026, the discussion centered on the present-day reality that inference has become a high-value attack surface. It also marked the official kickoff to the 2026 Year of Quantum Security, demonstrating a shift from general quantum awareness toward practical adoption strategies across enterprise environments.

Inference Is Where Value Lives

As Tyson Macaulay, COO of 01Quantum, explained, inference is “AI working.” It is the operational moment when a model is queried, when questions are asked against a trained system. And that is precisely where risk accumulates. The panel characterized inference deployments as the emerging “weakest link” in modern cybersecurity architecture, not because models are inherently flawed, but because execution layers have scaled faster than the controls surrounding them.

Inference models often contain the distilled intellectual property of an organization. In expert systems especially, the model itself reflects proprietary training data, domain knowledge, and internal logic. In some cases, models can be reverse engineered to reveal insights about their training data.

But the exposure runs in both directions. Prompts themselves reveal information about individuals, about businesses, and about strategy. A medical query reveals personal health data. A corporate query may signal product development direction or operational weakness. In short: the question can be as sensitive as the model.

Yet according to Macaulay, roughly half of the emerging AI security standards discussion, including work from NIST and ISO, now focuses on prompt and inference-model security. The industry is only beginning to recognize the scale of that exposure.

The Adversary Is Already Here

Subin Alexander of CGI noted that CISOs and CTOs are already confronting this reality. Organizations are dealing with shadow AI, unclear visibility into agentic systems, and growing regulatory pressure around responsible usage.

Nation-state actors are targeting cloud AI systems to extract intellectual property, blueprints, and personally identifiable information. Agentic identities introduce new complexity: autonomous agents operating within enterprise systems can be difficult to control, and when compromised, can exfiltrate data at scale.

There is also the more subtle threat of unintended exposure. Terms-of-service agreements may allow model hosts to use prompt data in ways organizations did not fully anticipate. Inference traffic becomes data exhaust — valuable, analyzable, and potentially exploitable. For mid-market organizations, recovery from major incidents can take months, often at far greater cost than preventative investment would have required.

Audience polling during the session spoke to the urgency. Nearly half of attendees (46.2%) admitted they are not confident their current AI systems meet anticipated 2026 standards, while complexity of implementation remains the primary barrier to action. Notably, “harvest now, decrypt later” concerns have overtaken model drift as the leading digital trust risk among infrastructure leaders.

Financial Services: Leading with Empathy and Control

From the financial sector perspective, Kristin Milchanowski, Chief AI and Data Officer at BMO, framed the issue differently. For banks, trust cannot be optional; it must be structural.

Financial institutions operate under some of the strictest regulatory regimes globally. That reality forces early adoption of privacy controls, third-party risk governance, and responsible AI frameworks. BMO’s approach reflects a deliberate stance by bringing large language models in-house where possible, ensuring that additional training using proprietary data remains contained.

Milchanowski emphasized a principle that may become foundational to enterprise AI governance: innovation without empathy is efficiency without trust. Responsible AI is a cultural transformation and board-level priority. She also pointed to the equally important issue of hallucination. Recent research suggests that hallucinations may stem more from data layer drift than purely algorithmic design. If true, this shifts defensive focus from model mechanics to data governance, which is another inference-adjacent vulnerability.

Beyond Cost: Security as Business Enablement

The webinar also addressed quantum security directly. The "harvest now, decrypt later" risk, where encrypted data is collected today to be broken once quantum capability matures, remains a major concern.

Alexander stressed that organizations must inventory cryptographic dependencies and begin migration planning now. Transitioning to post-quantum cryptography (PQC) is not a flip-of-a-switch event; it is a multi-year roadmap.

Macaulay added that minimum viable PQC readiness begins with core systems: identity and access controls, encrypted network traffic, and vendor procurement requirements that mandate cryptographic agility. Embedding PQC expectations into contract renewals may be one of the most practical accelerators available today.
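In practice, a cryptographic inventory of the kind the panelists describe can start as something very simple: a record of which systems use which algorithms, filtered for quantum-vulnerable ones. The sketch below is a minimal illustration, not any panelist's tooling; the asset names, record format, and the vulnerable/safe algorithm split are assumptions (RSA and elliptic-curve schemes are quantum-vulnerable, while lattice-based schemes such as ML-KEM and ML-DSA are the NIST-standardized replacements).

```python
from dataclasses import dataclass

# Algorithms broken by a cryptographically relevant quantum computer
# (per NIST's PQC migration guidance); the specific labels are illustrative.
QUANTUM_VULNERABLE = {"RSA-2048", "RSA-4096", "ECDSA-P256", "ECDH-P256", "DSA"}

@dataclass
class CryptoAsset:
    system: str      # e.g. "inference-gateway" (hypothetical system name)
    purpose: str     # e.g. "TLS key exchange", "artifact signing"
    algorithm: str   # algorithm currently in use

def pqc_migration_queue(inventory):
    """Return assets still using quantum-vulnerable algorithms, so
    'harvest now, decrypt later' exposure can be prioritized first."""
    return [a for a in inventory if a.algorithm in QUANTUM_VULNERABLE]

inventory = [
    CryptoAsset("inference-gateway", "TLS key exchange", "ECDH-P256"),
    CryptoAsset("model-registry", "artifact signing", "ML-DSA-65"),
    CryptoAsset("prompt-queue", "message encryption", "RSA-2048"),
]

for asset in pqc_migration_queue(inventory):
    print(f"MIGRATE: {asset.system} ({asset.purpose}) uses {asset.algorithm}")
```

The same inventory record can double as a procurement checklist: a contract-renewal clause requiring cryptographic agility is easier to enforce when every dependency is already enumerated.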

One of the more nuanced arguments of the session was that post-quantum security does not require wholesale infrastructure replacement. The focus is on embedding cryptographic resilience into existing workflows, which reduces disruption while strengthening long-term viability. 01Quantum is exploring the use of fully homomorphic encryption (FHE), a lattice-based post-quantum technique that allows encrypted data to be processed without decryption. Applied to AI inference, this would allow models and prompts to remain encrypted during execution, mitigating model extraction and prompt leakage simultaneously. If deployed effectively, such approaches could reduce reliance on complex guardrail systems and open new business models for securely exposing high-value expert systems.
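The core idea behind homomorphic encryption, computing on data while it stays encrypted, can be shown with a toy example. The sketch below implements the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. Note the deliberate simplifications: Paillier is a classical (not lattice-based, not post-quantum) scheme, unlike the FHE approach 01Quantum describes, and the fixed small primes make it wholly insecure; it only illustrates the homomorphic principle.

```python
import math
import random

def paillier_keygen(p=2357, q=2551):
    # Toy key generation with small fixed primes (NOT secure; illustration only).
    n = p * q
    n2 = n * n
    g = n + 1
    lam = math.lcm(p - 1, q - 1)
    # mu = L(g^lam mod n^2)^(-1) mod n, where L(x) = (x - 1) // n
    mu = pow((pow(g, lam, n2) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    n2 = n * n
    return ((pow(c, lam, n2) - 1) // n) * mu % n

pub, priv = paillier_keygen()
c1, c2 = encrypt(pub, 41), encrypt(pub, 1)

# Homomorphic addition: multiply the ciphertexts, never decrypting the inputs.
c_sum = (c1 * c2) % (pub[0] ** 2)
print(decrypt(pub, priv, c_sum))  # 42
```

A fully homomorphic scheme extends this property to arbitrary computation (both addition and multiplication on ciphertexts), which is what would let an inference service evaluate a model on an encrypted prompt without ever seeing it.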

The Year Ahead

Inference, not training, is emerging as the critical battleground. Prompts, agentic identities, model extraction, data drift, and post-quantum preparedness are converging into a single operational question: how do you preserve digital trust while accelerating AI integration?

The message from the panel was consistent. Start with the fundamentals. Build inventory. Embed cryptographic agility into procurement. Align innovation with governance. And treat AI inference as infrastructure, not merely functionality.

Because in 2026, AI inference is operationally critical infrastructure, and critical infrastructure must be secured before it is tested by failure. To explore the full discussion in detail, access the complete webinar replay and hear directly from the panelists shaping the 2026 AI security agenda.

Cierra Choucair

Cierra Choucair is a journalist and data analyst at The Quantum Insider, where she covers quantum computing and emerging technologies. With a background that blends scientific analysis, public communication, and product storytelling, she bridges technical complexity and industry insight across research, startups, and policy. She is the author of The Daily Qubit, a widely read newsletter spotlighting quantum research, use cases, and industry trends.
