An Action Plan to increase the safety and security of advanced AI

In October 2022, a month before ChatGPT was released, the U.S. State Department commissioned an assessment of proliferation and security risk from weaponized and misaligned AI.

In February 2024, Gladstone completed that assessment. It includes an analysis of catastrophic AI risks, and a first-of-its-kind, government-wide Action Plan for what we can do about them.

DISCLAIMER: All written publications available for download on this page were produced for review by the United States Department of State. They were prepared by Gladstone AI and the contents are the responsibility of the authors. The authors’ views expressed in these publications do not reflect the views of the United States Department of State or the United States Government.


Action Plan overview

This Action Plan, the first of its kind, was developed over thirteen months. It's squarely aimed at addressing catastrophic risks from weaponization and loss of control over advanced AI systems on the path to AGI.


Structure of the Action Plan

The Action Plan is made up of five lines of effort, or LOEs. The LOEs are designed to be mutually reinforcing, as part of a strategy of defense in depth against critical risks.

Together, these LOEs put us on a path to stabilize (LOE1), strengthen (LOE2, LOE3), and scale (LOE4, LOE5) advanced AI development safely and securely.

LOE1 — Establish interim safeguards

AI systems can already be weaponized in concerning ways, and AI capabilities are advancing at an extraordinary pace. If we want to buy down that risk in the short term, while positioning ourselves for the long haul, we can do three things:

  • Monitor developments in advanced AI to ensure that the U.S. government’s view of the field is up-to-date and reliable.
  • Create a task force to coordinate implementation and oversight of interim safeguards for advanced AI development.
  • Put in place controls on the advanced AI supply chain.
LOE2 — Strengthen capability & capacity

Over time advanced AI could raise the prospect of new forms of weaponization, and even increase the risk that we might lose control of the systems we’re developing. If we want the ability to prepare for and respond to AI incidents quickly and in a technically informed way, we can do four things:

  • Establish working groups for the action plan LOEs.
  • Increase preparedness through education and training.
  • Develop an early-warning framework for advanced AI and AGI incidents.
  • Develop scenario-based contingency plans.
LOE3 — Support AI safety research

Controlling very capable AI systems is still an unsolved technical problem. If we want to close that technical gap as quickly as possible, we can do two things:

  • Support advanced AI safety and security research, including work on AGI-scalable alignment.
  • Develop safety and security standards for responsible AI development and adoption.
LOE4 — Formalize safeguards in law

Right now there is no clear model for long-term regulation that will protect us from catastrophic risks while promoting responsible development and adoption of advanced AI. To close this gap, we need a legal framework that accounts for the new risks posed by advanced AI and by the potential emergence of AGI. This could involve two things:

  • Create an advanced AI regulatory agency with rulemaking and licensing powers.
  • Establish a criminal and civil liability regime, including emergency powers to enable rapid response to fast-moving threats.
LOE5 — Internationalize advanced AI safeguards

As advanced AI matures, countries may race to build sovereign advanced AI capabilities. Unless we manage this situation responsibly, these dynamics risk triggering various forms of escalation. That responsible management could involve four things:

  • Build domestic and international consensus on catastrophic AI risks and necessary safeguards.
  • Enshrine those safeguards in international law.
  • Establish an international agency to monitor and verify adherence to those safeguards.
  • Establish a supply chain controls regime with partners to limit the proliferation of dual-use AI technologies.

Background materials

Backed by world-class historical and technical teams, the Action Plan was informed by conversations with over two hundred stakeholders from governments, cloud providers, the security and AI safety communities, and the world's most advanced frontier AI labs.

It was developed as part of an assessment of AI risk commissioned by the U.S. State Department. That assessment has three parts:

  • A historical review of nonproliferation regimes developed for previous emerging technologies.
  • A survey of the technical AI landscape, including likely research and development trajectories for advanced AI in the near future.
  • The Action Plan itself, which draws on these analyses to support policy recommendations that address catastrophic risks from weaponization and loss of control of advanced AI.
Explainer on advanced AI
Since the public release of GPT-3 in mid-2020, AI has entered an era of foundation models, scaling, and general-purpose algorithms — the era of advanced AI. Until just a few years ago, scaling was considered a fringe idea, but it has since come to be seen by many as a promising path to human-level AI.

As AI systems become increasingly capable, they create new risks to national security that need to be monitored. These include accident risks, as well as risks of new malicious applications that were previously impossible.
Explainer on weaponization risk
Along with its enormously positive applications, advanced AI comes with significant and rapidly growing risks. The first of these is weaponization: the intentional use of AI to cause harm. Major categories of weaponization risk include cyberattacks, AI-augmented disinformation, robotic control, psychological manipulation, and weaponized applications in the biological and materials sciences.
Explainer on loss of control risk
A discrepancy between the goals we want an AI system to internalize and the goals it ends up pursuing is the rule, not the exception. In fact, we don't know how to train an AI model to reliably pursue the objectives we actually want it to pursue in all situations; all we can do is train it to pursue proxies for our real objectives.

This can have increasingly serious consequences as we increase the model’s capabilities, its ability to interact with the world, and the stakes of its deployment. 
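To make the proxy-objective gap concrete, here is a minimal toy sketch. It is not taken from the Action Plan, and the policy names and numbers are invented for illustration only. It shows how selecting whatever maximizes an easy-to-measure proxy can score poorly on the objective we actually care about.

```python
# Toy illustration of proxy-objective misspecification.
# Hypothetical example: the policies and numbers below are made up for
# illustration and do not come from the Action Plan.

# Each candidate "policy" is summarized by two statistics:
#   accuracy        - the objective we actually care about
#   confident_tone  - an easy-to-measure proxy we optimize instead
candidate_policies = {
    "careful":   {"accuracy": 0.90, "confident_tone": 0.60},
    "confident": {"accuracy": 0.70, "confident_tone": 0.95},
    "bluffing":  {"accuracy": 0.40, "confident_tone": 0.99},
}

def proxy_reward(stats):
    """What the training signal actually rewards: sounding confident."""
    return stats["confident_tone"]

def true_objective(stats):
    """What we actually want: being correct."""
    return stats["accuracy"]

# "Training" here is just: pick the policy that maximizes the proxy.
selected = max(candidate_policies,
               key=lambda name: proxy_reward(candidate_policies[name]))

print(f"Policy selected by proxy optimization: {selected}")
print(f"  proxy reward (confident tone): {proxy_reward(candidate_policies[selected]):.2f}")
print(f"  true objective (accuracy):     {true_objective(candidate_policies[selected]):.2f}")

# The proxy-optimal policy ("bluffing") is the worst one by the true
# objective -- a minimal version of the gap the explainer describes.
```

In real systems the proxy is a learned reward or loss function rather than a hand-written number, but the structure of the problem is the same.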

Request the Action Plan

If you'd like to request a copy of the Action Plan, please fill out the form below.

Request a copy of the Action Plan.

By agreement with the State Department, Gladstone is required to share the Action Plan by request only. Enter your name and email below, and we'll respond ASAP.

For media requests, please also include your publication. For general media inquiries, feel free to reach out at [email protected].

Check out the bios of the full action plan team.


