Clean Files, Safe Operations: Defending Federal and OT Systems from AI-Driven Threats

November 13, 2025


As AI accelerates both innovation and cyberattacks, federal, defense, and OT systems now face an overwhelming surge of seemingly harmless files that conceal sophisticated, AI-generated threats. Attackers are producing thousands of new malware variants per hour, hiding malicious payloads inside everyday PDFs, Office documents, and other trusted formats.

In this episode of Exploited: The Cyber Truth, host Paul Ducklin talks with RunSafe Security Founder and CEO Joseph M. Saunders and Glasswall Senior Solutions Architect Kelly Davis to uncover how organizations can stay ahead of AI-enhanced file-based attacks.

Kelly explains how Content Disarm and Reconstruction (CDR) ensures that only verified clean files reach secure networks. Joe connects these practices to embedded and OT systems, where compromised software can have physical, real-world consequences.

Together, they explore:

  • How attackers hide malware deep inside PDFs, Office documents, and workflows that users trust
  • Why detection-based security is too slow—and how AI is widening the gap
  • The four-step CDR process and its role in both inbound and outbound protection
  • How federal agencies can adopt file-level defenses using pilots, boundary controls, and workflow APIs
  • How runtime defenses and binary diversification protect OT systems from memory-based attacks
  • Why generating SBOMs at build time is essential for software supply chain integrity
  • How organizations can use technology to reverse attacker economics and regain the advantage


Speakers: 

Paul Ducklin: Paul Ducklin is a computer scientist who has been in cybersecurity since the early days of computer viruses, always at the pointy end, variously working as a specialist programmer, malware reverse-engineer, threat researcher, public speaker, and community educator.

His special skill is explaining even the most complex technical matters in plain English, blasting through the smoke-and-mirror hype that often surrounds cybersecurity topics, and helping all of us to raise the bar collectively against cyberattackers.

LinkedIn 


Joseph M. Saunders:
Joe Saunders is the founder and CEO of RunSafe Security, a pioneer in cyberhardening technology for embedded systems and industrial control systems, currently leading a team of former U.S. government cybersecurity specialists with deep knowledge of how attackers operate. With 25 years of experience in national security and cybersecurity, Joe aims to transform the field by challenging outdated assumptions and disrupting hacker economics. He has built and scaled technology for both private and public sector security needs. Joe has advised and supported multiple security companies, including Kaprica Security, Sovereign Intelligence, Distil Networks, and Analyze Corp. He founded Children’s Voice International, a non-profit aiding displaced, abandoned, and trafficked children.

LinkedIn

Guest Speaker – Kelly Davis, Senior Solutions Architect at Glasswall: 

Kelly Davis, Glasswall’s Senior Solutions Architect, brings deep expertise in DevOps, IT architecture, and Zero Trust security. Previously, he was a Lead IT Specialist at the Command Control and Communication Tactical Directorate Communications Networks Division in the DoD, delivering secure, scalable solutions in high-stakes environments. At Glasswall, he applies this experience to drive innovation and resilience across the company’s cybersecurity solutions.

LinkedIn

Episode Transcript

Exploited: The Cyber Truth, a podcast by RunSafe Security.

[Paul] (00:07)

Welcome back everybody to this episode of Exploited: The Cyber Truth. I am Paul Ducklin, joined as usual by Joe Saunders, CEO and Founder of RunSafe Security. Hello Joe.

[Joe] (00:22)

Hello Paul, look forward to the discussion today.

[Paul] (00:24)

You’re on the road again, aren’t you? From the woodlands of Texas, I believe.

[Joe] (00:29)

It’s fun to travel and it’s fun to be in the woodlands. I’m visiting family, but I’m actively engaged in working as well. So happy to be here.

[Paul] (00:35)

Yes, cybersecurity doesn’t take any rest, does it? And with that, let me introduce today’s special guest, Kelly Davis, who is Senior Solutions Architect at Glasswall. Welcome, Kelly.

[Kelly] (00:50)

Thank you, Paul. Thank you, Joe, for having me.

[Paul] (00:52)

Very provocative title this week, Clean Files, Safe Operations, with the subtitle of Defending Federal and OT Systems from AI-Driven Threats. Kelly, why don’t you open our innings by telling us what the problem is with clean files, or more importantly with files that are not clean, in IT in general and in federal government circles specifically.

[Kelly] (01:20)

Yeah, most certainly. I spend most of my time working with the Federal and Defense customers, typically helping them solve file-based threats.

[Paul] (01:29)

And that’s not just files that are coming into an organization or somebody’s inbox. It’s also the stuff that you produce and then need to deliver.

[Kelly] (01:40)

That’s exactly it. It can be files at rest, files that are being used in day-to-day routines. What we do is prevent that endless game of whack-a-mole, trying to figure out what’s actually guaranteed safe and what’s not. At Glasswall we don’t look for the bad stuff that may actually exist. We rebuild files to make sure they’re clean, period.

[Paul] (02:00)

So do you want to say something about the kind of risks that files such as documents or spreadsheets or PDFs or whatever it might be pose? It’s not like you get one document in March and then maybe produce an updated version in June. There are new versions of files all the time that are moving around inside and outside an organization. So how do you keep that under control? And what are the attackers trying to do to probe systems with rogue files?

[Kelly] (02:30)

To start with, we’re in a new landscape right now where AI is the big boom, right? Everything is AI. You have access to AI at your fingertips from your phone, your mobile devices, or from the web. It’s a wild world we’re in right now. Attackers are generating 20,000 to 40,000 new malware variants every hour, and they’re using AI to study our defenses. Obviously, that’s extremely scary.

[Paul] (02:54)

When we talk about malware, that’s not necessarily traditional malware like, here is a program, an executable file that does bad things. In fact, it could even be something like a document that doesn’t contain any particular malicious code, but contains malicious or misleading instructions, state-of-the-art phishing if you like, that lures people into taking actions that they desperately wish they hadn’t once they’ve done so. And unlike the old days, they’ll probably be correctly spelled, have correct grammar, and be reasonable looking. How do you cope with that?

[Kelly] (03:35)

I’ll go back to what we’re seeing in AI and how they’re able to hide malware in the most innocuous places, like embedded in a PDF. PDF documents are extremely convoluted. A lot of people don’t realize this, because to the human eye a PDF just looks like a nice document. But if you deconstruct it, at the binary layer and the data structure layer of a PDF there’s hidden JavaScript, and there are all sorts of tucked-away macros and AcroForms.

[Paul] (04:01)

Yes, the last time I personally looked at the PDF standards documents was about a decade ago. Oh my. Even then, they were something like 600 pages long. That’s a pretty daunting challenge, isn’t it?

[Kelly] (04:17)

And that’s where the attackers and the AI come in: that’s how they’re injecting the so-called payloads, tucked away within the data in the binary layers of these files. The same goes with Excel files and macros. It can look completely legitimate. You open it up, you click a cell, and then a payload detonates. That’s essentially how they’re able to manipulate and gain access to your various systems through these documents. In this world, in this landscape, files remain the primary attack vector. PDF files, Office docs: these are the main trusted formats, and nobody typically questions them until it’s too late. It’s a massive threat vector these days.

[Paul] (04:55)

And that old-school advice that says things like, well, don’t open documents from people you don’t know is pretty much useless these days, isn’t it? It’s no use saying, for example, to someone who works in the HR department, don’t open resumes from people who don’t yet work for the company and whom you’ve never met before, because that’s their job: to open those files and see if the candidate is suitable.

So when you talk about clean files, does that mean taking a file that has arrived, deconstructing it, and rebuilding it so that the informative content is the same as before, but all the bits that aren’t strictly necessary have been removed so that they can’t lie dormant inside the system to do something bad later?

[Kelly] (05:46)

This is essentially how it works. It’s a simple four-step process. With Glasswall, it’s called Content Disarm and Reconstruction, and there are four steps within that. Step one, inspect: we break down the file into its constituent components to validate the structure of the file. Step two is rebuild: we repair the invalid and malformed structures that could potentially be within a file at the binary layer. Then you have step three, clean: we remove all the high-risk elements that do not match up to the vendor specification, that PDF spec you talked about, based on the policy: the macros, the JavaScript, or the embedded files that could be within that file. And lastly, step four is deliver, and it’s fully functional. You can’t tell that it has been inspected and rebuilt back to its compliance standard. The users don’t even know what’s happening when a file is being CDR’d by Glasswall.
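To make those four steps concrete, here is a minimal toy sketch in Python of the inspect, rebuild, clean, deliver flow. The component model, the policy set, and every function body are illustrative assumptions; a real CDR engine parses files at the binary layer against the full vendor specification.

```python
from dataclasses import dataclass

# Toy model of a file as a list of typed components (a stand-in for the
# real binary-layer structure of a PDF or Office document).
@dataclass
class Component:
    kind: str    # e.g. "text", "image", "javascript", "macro"
    data: bytes
    valid: bool = True

HIGH_RISK = {"javascript", "macro", "embedded_file"}  # removed per policy

def inspect(components):
    # Step 1: break the file into components and validate structure.
    return [c for c in components if c.data is not None]

def rebuild(components):
    # Step 2: repair invalid or malformed structures against the spec.
    return [Component(c.kind, c.data, True) for c in components]

def clean(components, policy=HIGH_RISK):
    # Step 3: strip the high-risk elements the policy disallows.
    return [c for c in components if c.kind not in policy]

def deliver(components):
    # Step 4: reassemble a fully functional, spec-compliant file.
    return b"".join(c.data for c in components)

doc = [Component("text", b"Quarterly report"), Component("javascript", b"app.alert(1)")]
print(deliver(clean(rebuild(inspect(doc)))))  # b'Quarterly report' -- JavaScript removed
```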

[Paul] (06:43)

Now does this mean that for some customers they may need to adjust some of their policies and procedures or maybe to improve the tools that they use in say the automatic generation of documents? If someone is triggering alerts or alarms very regularly with documents that they genuinely and legitimately created then you may actually have uncovered that there’s a flaw in the document building process that they themselves are using. And I guess when it comes to passing documents around in secure environments, particularly ones where documents may move between different security levels, that’s a great thing to know, that someone is inadvertently not playing by the rules but never realized it.

[Kelly] (07:32)

There’s a stat there. About one in 100,000 files contains potentially malicious content, which is a significant threat surface. If you think about that, right, that’s just day-to-day usage.

[Paul] (07:44)

So how does the issue of deconstructing and rebuilding files differ when those files need to move between different security levels or from place to place inside a segmented network? Does that mean that there are constructs inside things like Word documents and PDF files that you tolerate at one security level but might want to strip out at a more strict security level? Because there’s more that can go wrong.

[Kelly] (08:15)

These environments are brutal for security. And when you’re thinking about various classification levels, whether it’s air-gapped networks or files that need to be transferred into a different classification level, Randy Resnick of the DoD CIO’s office, the DoW now, admitted that there’s something hugely wrong these days when it comes to the poor job we’ve done of integrating security over the last 30 years. What we were doing was basically duct-taping certain things and poorly implementing solutions to protect the various classification levels.

Obviously, in these environments, there are certain items that are not suitable for specific environments. Broken down by security levels, classification levels, enclave levels, these various files are restricted to the personnel who can view certain components. When you’re passing data through Glasswall, you have the ability to modify your policies to cleanse certain data to prevent spillage while the file is transferring into another environment, so that certain personnel can view these documents without breaching rules. So from a security standpoint, we help these organizations at those levels protect their data through transfers and air-gapped environments within the various classification levels.

[Paul] (09:32)

So it’s fairly obvious how AI comes into the attack side. Although it doesn’t create new types of attack, it makes it really easy for attackers to produce not just tens or hundreds of different variants of a known attack, but as you say tens or hundreds of thousands of new samples per day or even per hour. When it comes to moving stuff out of the organization, how do you make sure that incorrect or malicious material hasn’t been injected on the inside before it goes out? Is that a similar process just done in the outbound direction?

[Kelly] (10:13)

Exactly that. The way you would clean your files inbound is the same way you would clean your files outbound with Glasswall. We have specific tooling that can be put in place and embedded into specific firewalls. If you’re familiar with the ICAP protocol, it’s a security protocol that you can enable that allows you to proxy to various servers. You can use Glasswall in that capacity, whether it’s outbound or inbound. Files that are being transported out can then lean upon Glasswall to sanitize the documents or to redact specific content from them, whether they’re leaving the organization or coming into your organization. If you think about what AI is doing to help attackers study the target environments, you can only imagine what we’re doing on the other side with AI to ensure that we are meeting the standards and having that leg up for security tooling.
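For readers unfamiliar with ICAP (RFC 3507): it is the standard protocol that firewalls and proxies use to hand traffic to an external service, such as a CDR engine, for inspection or rewriting. The Python sketch below simply sends an ICAP OPTIONS request to see what a service supports; the host name and the /cdr service path are placeholder assumptions, not Glasswall’s documented endpoint, and in practice the proxy speaks ICAP on your behalf.

```python
import socket

# Hypothetical ICAP service; 1344 is the standard ICAP port (RFC 3507).
ICAP_HOST, ICAP_PORT = "icap.example.internal", 1344

# OPTIONS asks the service what methods (REQMOD/RESPMOD) it supports.
options_request = (
    f"OPTIONS icap://{ICAP_HOST}/cdr ICAP/1.0\r\n"
    f"Host: {ICAP_HOST}\r\n"
    "Encapsulated: null-body=0\r\n"   # required header; no HTTP body follows
    "\r\n"
)

with socket.create_connection((ICAP_HOST, ICAP_PORT), timeout=5) as s:
    s.sendall(options_request.encode("ascii"))
    print(s.recv(4096).decode("ascii", errors="replace"))
```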

[Paul] (11:01)

We can get the AI to deal with the “work harder” part automatically. It doesn’t get bored looking at 200,000 new documents in a day, whereas a human would just run out of the ability to focus. And that leaves much, much more time for expert humans to work smarter, so we get to work smarter and harder at the same time.

[Kelly] (11:25)

Yes, I wholeheartedly agree, coming from where we are now with the advancement of AI and how attackers are using it to plan their offensive attacks. And the real truth is the old traditional detection way is outdated, right? It was always playing catch-up. If you think about the time a detection system or a signature within anti-malware software gets updated, generally you see an average of about 18 days before a new threat gets detected based on that signature that has just been released.

So you’re always playing catch-up. And if you think about the federal environments, which move at a snail’s pace, that 18-day gap could seem like 18 years. They’re definitely playing catch-up there, outdated by the numbers.

[Paul] (12:09)

So Joe, maybe I can flip things over from the more conventional IT side to the OT side, where generally you’re not sending specifications documents and spreadsheets and PowerPoint files to your embedded devices, but you are sending very specialized executable code files.

You can’t rewrite those executables by changing the way they behave and the actual operations they perform, because they may be mandated to do certain things in a certain way within a certain time in order to get their certification. But you can nevertheless build a security component into those files, can’t you? So that they actually perform in exactly the same way, but if they do misbehave, then they’re much less likely to be exploitable or to go haywire in an uncontrolled fashion.

[Joe] (13:07)

Yeah, exactly right. If you think about what Kelly’s talking about with these kinds of files and all the information that’s contained within them, that’s certainly a serious threat that has to be mitigated, and the approach Kelly’s describing makes a lot of sense. But as you suggest, the OT environment might be a little different. We do see various levels of defense in depth out there. On one level, what people are doing is they could say, well, at least I know that the software I booted matches the software that was shipped, and have some kind of attestation or signing to get to that secure boot. But I think where you’re going, Paul, is, well, that’s great, but then those files, as they get loaded into memory, are exposed to runtime attacks. And what are the approaches to stop those?

[Paul] (13:57)

If you’re going to put some extra special magic into the file, then it needs to be done in advance so the file is protected before it’s signed, before it’s delivered, before it’s installed, before it’s launched.

[Joe] (14:09)

Right. And so that entry point is through the software supply chain. And then ultimately it’s very likely that whatever that malicious act is attempting to achieve, there’s a good chance it’s going to be attempted at runtime, when the software is loaded into memory. What you want to do is act as those files are getting loaded into memory and relocate where those vulnerabilities could be, so that these memory-based attacks can’t be realized. If you can ship software binaries that then load uniquely every time, so that at runtime attackers can’t deliver the payload or can’t exploit the software, I think that’s part of the difference. And there’s still lots of really good information in these OT systems, just like you might find in the files Kelly’s talking about. There’s operational data. The consequences are also very significant.

Fending off file-based attacks by using ways to randomize where those functions load into memory, to prevent exploitation in the first place even if someone compromises the supply chain or does get on-system and tries to introduce arbitrary code, is ultimately the goal. The approaches we’re talking about here are very different, but the end result is trying to come up with novel ways to stay ahead of those attackers. And in the world of AI, I think that only becomes more complex. The approach Kelly’s taking on the IT side, I think, is great: trying to stay ahead of those maneuvers and not chase the changing, ever-evolving signature, but eliminate the vulnerability in the first place.
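As a tiny illustration of the load-address randomization principle Joe describes, the Python sketch below prints where libc’s printf landed in memory. On an ASLR-enabled Linux or macOS system the address changes on every run, so a memory-based payload that hardcodes it breaks. This shows only coarse, OS-level ASLR; function-level diversification of the kind discussed here relocates individual functions within the binary, which this sketch does not do.

```python
import ctypes
import ctypes.util

# Load the C library and look up the runtime address of printf.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
addr = ctypes.cast(libc.printf, ctypes.c_void_p).value

# Run this script several times: with ASLR enabled, the address differs
# each run, so an exploit that assumed a fixed address would fail.
print(hex(addr))
```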

[Paul] (15:52)

And Joe, when it comes to the concept of clean files and computer software, programs, executables, binaries, there’s the whole issue of whether a file should be considered clean in the supply chain, combined with the question of, did you actually include the clean file that you intended when you built the software, or did it somehow surreptitiously get swapped out at the last moment? And that’s where Software Bills of Materials, or SBOMs, come in, isn’t it?

[Joe] (16:30)

Yeah, and I think building that Software Bill of Materials as closely as possible to when that binary is produced in the first place is the key step there. Generating it at the same time as the binary is the best approach, because then what goes into the Software Bill of Materials matches exactly what went into the binary in the first place. What a lot of people are doing in the embedded software space, and in these OT systems, is they will look to derive an SBOM from the binary after the fact, maybe six, eight, or twelve months after it’s been produced, and that time gap creates a lot of risk. So, generating that Software Bill of Materials and knowing exactly, with 100% completeness and 100% correctness, what went into that binary in the first place, and then securing that binary and sharing it in a trusted way, becomes a really good way to vouch for what’s in that binary.

But then these other techniques, these defense in depth strategies, ensuring secure boot, ensuring runtime defense, those also then play a role in the overall defense posture.
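To sketch the build-time SBOM idea Joe describes, the Python snippet below emits a CycloneDX-style JSON document in the same step that produces the binary, recording the artifact’s hash and its input components. The binary path and the component list are placeholders; the point is that the inputs are recorded at build time, not derived from the binary months later.

```python
import datetime
import hashlib
import json
import pathlib

# Hypothetical build artifact; in a real pipeline this is the binary the
# compiler/linker just produced in this same build step.
binary = pathlib.Path("build/firmware.bin")
binary.parent.mkdir(exist_ok=True)
binary.write_bytes(b"\x7fELF...")  # stand-in for real build output

sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "metadata": {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "component": {
            "type": "application",
            "name": binary.name,
            # The hash ties this SBOM to exactly this artifact.
            "hashes": [{"alg": "SHA-256",
                        "content": hashlib.sha256(binary.read_bytes()).hexdigest()}],
        },
    },
    # Inputs captured at build time (placeholder example component).
    "components": [{"type": "library", "name": "zlib", "version": "1.3.1"}],
}

pathlib.Path("build/firmware.cdx.json").write_text(json.dumps(sbom, indent=2))
```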

[Paul] (17:41)

So Kelly, if I can return to you now: when Joe is talking about protecting binaries, executable files that are built and supplied, say for embedded systems or very specialized devices, there’s a necessary limit on the number of distinct final executables that get pushed from the development environment into the wild. It’s very different when it comes to files like documents and PDFs, because they typically circulate and quite purposefully get modified along the way, possibly by many people legitimately inside an organization, or in the organizations that body works with. So what sort of controls do you think federal government organizations could introduce, that perhaps they haven’t already, to make it more likely that they will construct what you might call clean files in the first place, and reduce the risk that they will inadvertently introduce malicious content? Sorry, that was rather a long question.

[Kelly] (18:52)

That’s a great question, and it becomes tactical very quickly. The playbook that I would use for this type of protection: first, you have the obvious entry points into these organizations. The main facets are the email gateways, right, and web downloads. The way to protect those is to have some sort of integration at that point, whether it’s an ICAP server or existing proxies, and users don’t even know that it’s happening.

Second, I would focus on boundary protection. Talking in the federal space, every file that’s crossing a classification level or a network boundary would obviously get sanitized, or sandboxed, put into a different environment to detonate and to make sure it’s cleansed. But obviously, we know sandboxing is a little outdated. It takes a little bit of time, and you may not actually get the full report.

And then the next would be embedding into your workflow. So, as you mentioned, you have the developers developing, and then they have a pipeline or some sort of workflow where work gets pushed over to the next QA team, and then they generate an artifact, and the artifact triggers another pipeline where an SBOM, or whatever that payload may be, gets produced. I know that we have specific partners and agencies using our RESTful APIs to automatically clean files, whether it’s in your SharePoint environment or your cloud S3 bucket environments. Wherever the files live, embedding it in your day-to-day workflow, I think that’s what makes this great from a security standpoint, and then scaling it.

The main key is policy differentiation within these enclaves or these various classifications. Maybe you strip all the macros from a file that’s coming in from the internet; on the other side, you’re more familiar with internal documents. Basically, you’re configuring to your own risk tolerance at that point.
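As a sketch of the workflow-embedding pattern Kelly describes, the Python snippet below posts a file to a hypothetical CDR REST endpoint with a per-boundary policy and writes back the rebuilt result. The URL, field names, file paths, and policy string are assumptions for illustration, not Glasswall’s documented API.

```python
import requests  # third-party: pip install requests

# Placeholder endpoint for a CDR "rebuild" API inside the enclave.
CDR_URL = "https://cdr.example.internal/api/v1/rebuild"

# Paths are placeholders for wherever the files live (mailbox drop,
# SharePoint sync folder, S3-backed staging area, etc.).
with open("incoming/report.pdf", "rb") as f:
    resp = requests.post(
        CDR_URL,
        files={"file": ("report.pdf", f, "application/pdf")},
        # Stricter policy for internet-origin files; internal traffic
        # might use a more permissive one (policy differentiation).
        data={"policy": "strip-macros,strip-javascript"},
        timeout=60,
    )
resp.raise_for_status()

with open("clean/report.pdf", "wb") as out:
    out.write(resp.content)  # the rebuilt, policy-compliant file
```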

[Paul] (20:34)

It sounds as though there are some compromises that users might be expected to make that they might be a little bit resistant to at first. It is quite difficult to persuade people to give up IT conveniences that they’ve had in the past, even if you’re only asking them to give up a very little bit. We’ve seen that with multi-factor authentication, haven’t we? With people saying, how can you expect me to spend an extra 30 seconds a day typing in this code or presenting this magic key and then once they get used to it they realize you know what it wasn’t that hard after all. Is it the same when it comes to managing things like document flow inside a big and possibly bureaucratic organization?

[Kelly] (21:20)

I wholeheartedly agree that introducing the various authentication protocols can definitely be cumbersome, and users are not fond of it. But once they realize the protection behind it, typically it just works in with the day-to-day workflow. From a document standpoint, yes, it depends on how you implement this, and there are a few seamless ways you can do so. You can integrate via proxy, so the users have no idea that it’s happening in the back end. And you can utilize a modular open systems approach, querying the set APIs or the applications to sanitize those files, or to sanitize what’s happening in the back end, without the users knowing. And then there’s also something really cool that we’re working on here at Glasswall.

It’s our Foresight AI capability. It predicts what these files may look like and provides you an intelligent threat report on what will get stripped out of a file before it happens. So this can be bubbled up and provided to your IT system way ahead of time, while files are in transit through your email gateways or your proxies. You’re not just protecting the files; you’re learning about what’s happening and what’s targeting you.

[Paul] (22:23)

So presumably for your outbound files, where you’ve constructed things that you think are okay and nobody’s complained about before, you might actually learn ways in which you could simplify your workflow, simplify the types of documents that you produce, which will actually save you time and money and make everybody that you send these documents to safer at the same time, for a true sort of win-win situation.

[Kelly] (22:49)

I would agree. If an organization is a little timid about getting their feet wet or diving off the deep end, you start with a pilot, right? Pick your highest-risk files and the flow they come in through, and protect those first. Maybe it’s inbound email for executives, or files crossing from an unclassified to a classified environment. Show that value and then expand based on it. The Navy did a project, you all can look it up, it’s public, called Flank Speed, an initiative focused on how to protect data coming in and out.

And they did it with an iterative approach, tackling one component at a time, then going down the cycle and ensuring that it met the standard. So you just start with the pilot and then continue to move forward with it.

[Paul] (23:29)

I’m conscious of time Joe, so I’d like to ask you if you would provide some, what I might call, encouraging concluding remarks. It’s clear that a little bit of discretion goes an awful long way. What would your main advice be, particularly for federal government departments that think that this is all too hard and they’ll never get there? How can we get this started in a way that will let us stay ahead?

[Joe] (23:57)

Well, I do think we all can agree that attackers are creative. They’re often well-funded and it’s a constantly evolving landscape, and with the introduction of AI, either the volume’s increasing or sophistication is increasing. Yeah, or both. And so the attacks are evolving and having a good process that’s not trying to chase the new innovation I think is part of the breakthrough in security. 

And we look for these asymmetric shifts in defense tools, such as cleaning files, both inbound and outbound. For the government, part of the opportunity here is to not try to reinvent all these ideas. Product companies that are producing stuff, their technology, their software does drive tremendous innovation and brings that asymmetric shift to the equation.

That’s what we try to do at RunSafe, and with cleaning files, I think that’s what Kelly and team are trying to do here. So we do need to rely on technology and be mindful of the effects of AI, but certainly look for asymmetric shifts ultimately.

[Paul] (25:12)

So loosely speaking, Joe, you’re sort of talking about changing the economic equation, if you like. So it’s become much cheaper and easier for attackers to generate thousands or hundreds of thousands of malware variants, for example. But that doesn’t mean that we can’t nevertheless continually make it more expensive for the attackers, even though they have these things that they consider optimizations.

[Joe] (25:40)

Well said, Paul. A podcast dedicated to staying ahead of the attacker is apropos as well.

[Paul] (25:47)

Yes. Well, with that, let me say that is a wrap for this episode of Exploited: The Cyber Truth. Thanks to everybody who tuned in and listened. Thank you so much, Kelly and Joe, for, I guess, just touching the surface of this broad and deep field of cybersecurity that we live in. 

If you find this podcast insightful, please don’t forget to subscribe so you know when each new episode drops. Please like and share us on social media as well and don’t forget to recommend us to everybody in your team. Here’s to fighting back against the attackers in a way that means we really do work harder and smarter at the same time. Stay ahead of the threat. See you next time.
