May 08, 2025

Lessons in Learning

Ensuring Interoperability for Autonomous Systems in the Department of Defense

Executive Summary

Although claims of a revolution in military affairs may be overhyped, the potential for artificial intelligence (AI) and autonomy to change warfare is growing every year. Countries are deploying drones that can guide themselves semi-autonomously into targets, while AI is improving the efficiency of targeting workflows in militaries around the globe. The dangers from these systems are real and significant. Addressing these risks, whether they arise from individual systems or from groups of them, is essential to taking full advantage of the benefits these technologies promise.

The Department of Defense (DoD) has a long history of safely developing and deploying crewed systems, but AI and autonomy introduce new roadblocks. In a previous report, the author discussed the novelty of AI and autonomous systems, focusing on their ongoing and projected impacts on the DoD’s test and evaluation (T&E) infrastructure. While the particularities of individual AI systems present a challenge, so do the interactions between systems in the same environment. This report focuses on the ways that groups of autonomous systems (whether AI-enabled or not) introduce new vulnerabilities that may not be present when testing or experimenting with just one of them. The potential for conflict between autonomous platforms is significant, and ensuring interoperability between them requires a concerted effort that spans all the military services. As both the United States and its adversaries develop this technology at a breakneck pace, now is the time to establish a technically grounded and dynamic framework across the Joint Force to ensure that the U.S. military retains its ability to fight as a unified body.

Recommendations

This study yielded findings that apply across the life cycle of AI and autonomous systems, including research and system development, which concerns the technical elements of designing and engineering these systems, and T&E, which concerns the practical and policy elements of virtual and live testing. These findings are relevant across several communities, as developing true interoperability between systems requires active engagement from early in their development all the way through to sustainment, driven by everyone from concept designers to program managers to engineers.

The services should appoint or empower leadership to ensure AI and autonomous systems are developed to maintain interoperability, as outlined in operational concepts. Given that operational concepts consider, and assume, the tacit ability of systems to occupy shared environments, leadership should work to avoid siloing development efforts and emphasize greater collaboration between programs developing systems that are expected to interoperate.

The DoD should explore the development of behavioral standards that ensure system interoperability, matching the degree of standardization seen in operator training and tactics, techniques, and procedures. While existing technical standards primarily consider elements like interfaces (how machines communicate), autonomous systems will require behavioral standards, analogous to the procedures learned by human operators, to coordinate with each other. These standards, going beyond low-level requirements like communications protocols, should consider how systems are expected to interact with each other in a shared environment, such as automatically deconflicting maneuver and fires.
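
To make this concrete, the sketch below (in Python, with names, data fields, and the standoff distance all hypothetical rather than drawn from any existing DoD specification) illustrates what a behavioral standard might look like in practice: a single shared rule, "yield the airspace around an active fire mission," that any compliant platform implements identically regardless of vendor. The point is that the standard governs behavior in a shared environment, not just message formats.

```python
import math
from dataclasses import dataclass

# Hypothetical behavioral rule, for illustration only: every compliant
# platform holds outside a declared standoff radius around an active
# fire mission, regardless of which vendor built the platform or which
# message format announced the mission.
STANDOFF_RADIUS_M = 500.0

@dataclass
class FireMission:
    x_m: float       # mission aim point, meters on a local grid
    y_m: float
    active: bool

def must_yield(own_x_m: float, own_y_m: float, mission: FireMission) -> bool:
    """Return True if the platform is required to hold or reroute."""
    if not mission.active:
        return False
    distance = math.hypot(own_x_m - mission.x_m, own_y_m - mission.y_m)
    return distance < STANDOFF_RADIUS_M

# Two platforms from different manufacturers apply the same rule and
# therefore reach the same deconfliction decision.
mission = FireMission(x_m=0.0, y_m=0.0, active=True)
print(must_yield(120.0, 300.0, mission))   # True: inside the standoff radius, hold
print(must_yield(800.0, 100.0, mission))   # False: clear to proceed
```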

Testing authorities should coordinate so that T&E policy is formulated and implemented to ensure compatibility between autonomous systems. While services should raise interoperability concerns earlier in the system development process—creating linkages between programs that jointly appear in operational concepts—certifying sufficient interoperability down the line, as outlined in the framework in this paper, is the responsibility of the T&E community.

The T&E community should employ common modeling and simulation (M&S) tools, alongside live testing and experimentation, to enhance interoperability. While standards provide one explicit means of achieving interoperability, developing and sharing M&S resources drives engineering efforts toward producing compatible systems, even without direct coordination.

Introduction

The year is 2035. With jamming of communications and navigation systems increasingly prevalent, militaries around the globe have turned to autonomous drones to project power into contested areas. Amid an ongoing conflict in eastern Europe, a surveillance drone flying at 5,000 meters spots a potential enemy target—what appears to be a tank hidden among trees. Eyeing a chance to eliminate a substantial threat, a nearby commander deploys two loitering munitions, with the hopes of assessing and striking the target while the enemy remains unaware.

Given the valuable role of enemy armor in recent gains in the conflict, it is critical that these two drones can strike and completely disable the suspected target. The situation is not ideal—these two loitering munitions are all that was available at the time, though they come from two different manufacturers. As they each approach their shared target, something unusual happens: While one drone spots and classifies the target as an enemy tank, the other assesses, with high certainty, that the platform at the given coordinates is actually a friendly personnel carrier, likely passing through as part of another ongoing operation. As the drones’ batteries run low and the suspected target prepares to move, a decision must be made whether to engage the target. Because the drones disagree, each with high confidence, the right course of action remains unclear.

In the face of this conflict, what should these drones do? A human cannot adjudicate between the different drones’ target assessments because of jamming, nor is there enough time for the drones to retrograde to a position where the information can be shared with operators. One could argue that the most important assessment is the original one—after all, one might expect an intelligence, surveillance, and reconnaissance (ISR) platform to have the highest quality sensors and algorithms for identifying potential targets. Or perhaps the risks to friendly forces are too great—any intelligence that doubts the original classification of “foe” is sufficient cause to call off a potential strike. Or perhaps the different loitering munitions act according to their own decision-making logic. The optimal outcome is anything but clear.
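
To make the trade-offs concrete, the following sketch (in Python, with platform names, labels, and confidence values invented purely for illustration, not drawn from any fielded system) encodes the three candidate policies described above: defer to the ISR platform, default to the most conservative assessment, or let each munition act on its own logic. Run against the same inputs, the first policy engages, the second holds fire, and the third splits the two munitions, which is precisely the kind of disagreement the scenario describes.

```python
from dataclasses import dataclass

@dataclass
class Assessment:
    platform: str         # which system produced the assessment
    label: str            # "enemy_tank", "friendly_apc", etc.
    confidence: float     # 0.0 to 1.0
    is_isr: bool = False  # True if produced by the ISR platform

# Hypothetical assessments mirroring the scenario above.
assessments = [
    Assessment("surveillance_drone", "enemy_tank", 0.90, is_isr=True),
    Assessment("munition_a", "enemy_tank", 0.85),
    Assessment("munition_b", "friendly_apc", 0.88),
]

def defer_to_isr(assessments):
    """Policy 1: trust the original ISR classification."""
    isr = next(a for a in assessments if a.is_isr)
    return isr.label == "enemy_tank"

def conservative(assessments):
    """Policy 2: any credible doubt about 'foe' calls off the strike."""
    return all(a.label == "enemy_tank" for a in assessments)

def independent(own: Assessment):
    """Policy 3: each munition acts only on its own assessment."""
    return own.label == "enemy_tank"

print("Defer to ISR:     engage =", defer_to_isr(assessments))    # True
print("Conservative:     engage =", conservative(assessments))    # False
print("Munition A alone: engage =", independent(assessments[1]))  # True
print("Munition B alone: engage =", independent(assessments[2]))  # False
```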

As this fictional scenario illustrates, the risks from teams of autonomous systems are not simply the sum of the damage that each platform can cause individually. While each drone presents its own risks, together they introduce new ones if they lack sufficient decision-making interoperability with each other and with other artificial intelligence (AI)-enabled systems, such as battle management software. Even while maintaining active communication with each other—sharing target classifications—they might not be able to agree on a course of action.

Autonomous systems need to be extremely reliable if they are going to hunt for and engage enemy targets with lethal force in areas where humans cannot oversee them or exert any real-time control. Armed autonomous drones need to be capable of operating in a complex environment cluttered with civilians, enemy forces, friendly forces, and varied geographic and topographical features, in which adversaries are actively trying to deceive them. Given this difficult and highly dynamic environment, it should not be surprising that different algorithms may perceive things differently. The consequences of failure are high, and the necessary reliability cannot be self-contained. Instead, reliability must emerge across a heterogeneous system of autonomous and AI-enabled systems that will operate together, often with limited oversight in dynamic and denied environments.

The risk of a situation like the one above is significant, and such a scenario is closer than many might assume. Russia and Ukraine have diverse drone fleets and are racing to improve autonomy as a counter to electronic warfare. Moreover, both the United States and China are investing heavily in uncrewed and autonomous systems to gain an advantage on the battlefield. The United States is relying on AI-enabled computer vision software to help defeat drone and missile strikes in the Middle East. Thus, as efforts like the Replicator Initiative and Task Force 59 accelerate the procurement and deployment of autonomous systems, it is crucial to address these dangers now, before large-scale conflict emerges.

In the previous example, the potential for an accidental strike on friendly forces is high, but the risks are not simply to combatants. Fully autonomous drones may target civilians inadvertently, or incorrectly deem them hostile, contrary to human-adjudicated laws of war. Or, perhaps, a drone might rigidly apply these laws and strike a lawful combatant whom any moral warfighter would spare. This dilemma, which Paul Scharre lays out in Army of None, extends beyond the question of whether a single drone “know[s] when it is lawful to kill, but wrong.” In a battlespace filled with autonomous systems, unanticipated interactions between platforms can induce unintended side effects, with disastrous consequences.

Moreover, as autonomous systems, which are intended to provide mass or capacity on the battlefield, are deployed in greater numbers, the possibility and cost of interoperability failures grow. Without active communications links, human operators may remain completely unaware of the potential for friendly fire. Now, considering the volume of these systems appearing in future warfighting concepts (in the hundreds, if not thousands), the threat grows dramatically and can span the entire area of operations.

The dangers from autonomous systems failing to work together are at the heart of this study. By examining examples of interoperability—or the lack thereof—in the U.S. military, the author defines a new framework for how and when autonomous systems will need to work together to achieve mission objectives.

This report frames the problem of interoperability, showing how it has been difficult to achieve and explaining why previous definitions are not sufficient for autonomous systems. It then articulates two classes of conflict between autonomous systems, both of which have unique dimensions relative to crewed platforms. The author presents the system interoperability continuum, which classifies systems based on their interoperability with each other, and provides case studies that illustrate the patterns seen on the continuum. Finally, the report concludes with recommendations for the Department of Defense (DoD) to address these interoperability concerns, based on these case studies as well as a review of current work on autonomy both within and outside the department.

1. Paul Scharre, Army of None: Autonomous Weapons and the Future of War (New York City: W. W. Norton & Company, 2018), 2–4.

Author

  • Josh Wallin

    Fellow, Defense Program

Josh Wallin is a fellow with the Defense Program at the Center for a New American Security (CNAS).
