Government agencies and other public sector organizations are under immense pressure today to enhance efficiency and reduce costs. But they must still find ways to manage a wide array of critical tasks — from processing passport applications and disbursing Social Security benefits to translating diplomatic documents and detecting suspicious behavior in public spaces.
As citizens, meanwhile, we all expect government agencies and public sector organizations to deliver fast, convenient experiences. We want to access information and submit requests around the clock, using a variety of digital devices.
In my discussions with technology leaders in the public sector, I often hear that they are eager to adopt AI agents, or “agentic AI,” as a way to quickly address efficiency mandates while enhancing services. Their peers in other industries similarly recognize the promise of AI agents. In a recent survey by Boston Consulting Group, 58% of IT buyers in commercial businesses reported that they are already deploying AI agents, and an additional 35% are actively considering it. These buyers anticipate an ROI of 13.7% from AI agents, higher than the ROI they expect from non-agentic generative AI (GenAI) applications. Public sector organizations stand to realize similar benefits, particularly in cost savings.
Still, AI agents come with risks. Providing inaccurate information or incorrect results can quickly erase efficiency and cost-saving benefits. As organizations plan implementations, incorporating guardrails and creating loops for continuous improvement will be key for maximizing the gains from AI agents.
There are numerous potential use cases for AI agents in the public sector. In each one, agents can address the urgent need to reduce or eliminate manual human tasks. Here are four simple examples:
Phone and web support: An AI agent could handle the initial phone and web intake for an agency like the Social Security Administration or IRS, providing information and recommending actions for citizens. If someone needs additional assistance, the AI agent could transfer the call or online request to a human.
Public safety video surveillance: A public transit agency could use an AI agent to detect anomalies in real-time video feeds from bus stops or train stations. Running continuously, the AI agent could identify unusual behavior and alert the right personnel for deeper human analysis.
Document processing: For agencies that must deal with multiple languages and various types of documentation, AI agents could translate documents, classify content, and then send information to specific people or store it in an appropriate location. The agents could substantially reduce tedious administrative work for humans.
Internal agency search: Employees within a government agency could use an agent-based chatbot to rapidly find policies applicable to their specific department or agency. This tool would help eliminate the costly, time-consuming work of manually searching through electronic or physical policy documents.
Though AI agents have many promising use cases across the public sector, implementing them comes with some challenges.
First and foremost, the large language models (LLMs) that serve as the brains of AI agents are non-deterministic. In other words, if you give the exact same prompt to an LLM five times, you might get five slightly different answers, and the model won’t tell you why the answers vary. This inconsistency can cause significant problems, especially when you are counting on the AI agent to make decisions related to public safety or to handle citizens’ personal information.
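You can see this non-determinism firsthand by sending the same prompt to a model several times and comparing the outputs. Below is a minimal sketch written as a Cloudflare Worker with a Workers AI binding; the model name is just an example, and lowering the sampling temperature reduces (but does not eliminate) the variation.

```ts
// A minimal sketch: send the identical prompt five times and compare answers.
// Assumes a Worker with an AI binding configured in wrangler; the model name
// is an example only.
export interface Env {
  AI: { run(model: string, inputs: Record<string, unknown>): Promise<{ response?: string }> };
}

export default {
  async fetch(_request: Request, env: Env): Promise<Response> {
    const prompt = "Summarize the eligibility rules for program X in one sentence.";
    const answers: string[] = [];

    for (let i = 0; i < 5; i++) {
      const result = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
        prompt,
        // Higher temperature means more randomness. Setting it near 0 makes
        // the model far more repeatable, but guardrails are still needed.
        temperature: 0.7,
        max_tokens: 128,
      });
      answers.push(result.response ?? "");
    }

    // In practice, the five answers often differ slightly in wording.
    return Response.json({ answers });
  },
};
```

Running this repeatedly makes the point quickly: even small wording differences matter when the output feeds a downstream decision.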
A second, related challenge is the lack of guardrails. To catch inconsistent or erroneous results before they do damage, you need visibility into and control over AI agent flows. At the same time, you need to retain control over the data that you are using to train LLMs. The accuracy of results depends largely on providing accurate, consistent, and high-quality data to models.
Tools to address these concerns about the lack of guardrails are only just coming to market, so it is imperative to stay up to date on AI security offerings.
Successfully implementing AI agents requires careful planning, technical work, and process change. After first defining an agent’s functionality, you need to build the system and the workflow. And from the beginning, you should incorporate guardrails and construct a means for continuously improving results.
Define what you want the agent to do. It sounds simple, but clearly defining your goal is a critical first step, because you can then determine whether you actually need an AI agent to achieve that goal. Some organizations race to implement AI agents when a simple workflow tool would be fine. AI is required only when the workflow is more dynamic. If you can automate a task without an AI agent, you might save significant time and effort.
Choose the right LLM. If you determine that an AI agent is necessary for automating a particular task, you need to establish its reasoning and planning mechanisms. In most cases, that involves choosing an LLM, whether open source or commercial. You can then use retrieval-augmented generation (RAG) to retrieve agency-specific data, which helps ensure accurate and personalized responses.
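As a rough illustration of the RAG pattern: embed the incoming question, retrieve the closest agency documents from a vector index, and pass them to the model as context. The sketch below assumes Workers AI bindings for embeddings and generation plus a hypothetical Vectorize index (POLICY_INDEX) of pre-embedded documents; the model names are examples only.

```ts
// A minimal RAG sketch. Assumes a Workers AI binding (env.AI) and a vector
// index binding (env.POLICY_INDEX) holding pre-embedded agency documents.
export interface Env {
  AI: { run(model: string, inputs: Record<string, unknown>): Promise<any> };
  POLICY_INDEX: {
    query(vector: number[], options: { topK: number; returnMetadata?: boolean }): Promise<{
      matches: { id: string; score: number; metadata?: Record<string, unknown> }[];
    }>;
  };
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const question = new URL(request.url).searchParams.get("q") ?? "";

    // 1. Embed the question.
    const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [question] });
    const queryVector: number[] = embedding.data[0];

    // 2. Retrieve the most relevant agency documents.
    const { matches } = await env.POLICY_INDEX.query(queryVector, {
      topK: 3,
      returnMetadata: true,
    });
    const context = matches.map((m) => String(m.metadata?.text ?? "")).join("\n---\n");

    // 3. Ask the LLM to answer using only the retrieved context.
    const answer = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      messages: [
        { role: "system", content: `Answer using only this agency context:\n${context}` },
        { role: "user", content: question },
      ],
    });

    return Response.json({ answer: answer.response, sources: matches.map((m) => m.id) });
  },
};
```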
Prepare for technical integrations. The purpose of an AI agent is to use an LLM to understand a task or goal and then execute on it. To make this happen, the agent needs access to tools, such as tools that open a browser or write to a database. And to facilitate those LLM-to-tool interactions, you need the right communication protocols.
Until recently, developers have primarily used APIs to integrate tools and LLMs. However, the newer Model Context Protocol (MCP) is becoming the de facto standard for connecting AI systems with external applications. MCP has been widely adopted and will drive innovation by giving all AI models and tools a simple, shared way to work together.
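To make this concrete, here is a sketch of how an agency might expose a single tool over MCP using the official TypeScript SDK (@modelcontextprotocol/sdk). The tool name and lookup logic are hypothetical; the point is that any MCP-aware model or agent framework can discover and call the tool through the same protocol.

```ts
// A hypothetical MCP server exposing one tool, sketched with the official
// @modelcontextprotocol/sdk TypeScript package.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "case-status-server", version: "0.1.0" });

// Register a tool the LLM can call to look up the status of a citizen's case.
server.tool(
  "get_case_status",
  { caseId: z.string().describe("The case number to look up") },
  async ({ caseId }) => {
    // In a real system, this would query an agency database or API.
    const status = `Case ${caseId}: pending review`; // placeholder
    return { content: [{ type: "text", text: status }] };
  }
);

// Serve the tool over stdio so any MCP-compatible client can connect.
await server.connect(new StdioServerTransport());
```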
Establish guardrails. AI agents promise to conduct tasks autonomously, but controlling those agents often requires a degree of human involvement. Specifically, you might need to keep a human in the loop to review and confirm some actions before the agent takes them.
Take the example of AI agents employed in phone-based contact centers, which many agencies use. You could have a human validate the responses the agent provides to help ensure accuracy. If a person contacts the office about a check that has not been received, the AI agent can look into the issue and propose a response such as, “Check was not sent due to system error.” The human would then review that AI-generated response and confirm it is correct before replying to the citizen and sending out the check.
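One simple way to implement this pattern is an approval queue that sits between the agent’s proposed response and the citizen, so nothing goes out until a reviewer signs off. The following framework-agnostic sketch uses entirely hypothetical types and functions.

```ts
// A minimal human-in-the-loop sketch: the agent drafts a response, but it is
// held in a review queue until a human approves it. All types and functions
// here are hypothetical.
interface DraftAction {
  id: string;
  citizenId: string;
  proposedResponse: string;
  status: "pending_review" | "approved" | "rejected";
}

const reviewQueue = new Map<string, DraftAction>();

// Step 1: the agent proposes, but does not send.
function proposeResponse(citizenId: string, proposedResponse: string): DraftAction {
  const draft: DraftAction = {
    id: crypto.randomUUID(),
    citizenId,
    proposedResponse,
    status: "pending_review",
  };
  reviewQueue.set(draft.id, draft);
  return draft;
}

// Step 2: a human reviewer approves or rejects the draft.
function reviewDraft(draftId: string, approved: boolean): void {
  const draft = reviewQueue.get(draftId);
  if (!draft) throw new Error(`No draft with id ${draftId}`);
  draft.status = approved ? "approved" : "rejected";
  if (approved) {
    // Only now does anything reach the citizen.
    sendToCitizen(draft.citizenId, draft.proposedResponse);
  }
}

function sendToCitizen(citizenId: string, message: string): void {
  console.log(`To ${citizenId}: ${message}`); // placeholder for mail/phone/portal delivery
}

// Example: the agent investigates the missing check and drafts an explanation.
const draft = proposeResponse("citizen-42", "Check was not sent due to system error.");
reviewDraft(draft.id, true); // a human confirms the explanation is accurate
```

In production, the queue would live in a database and the review step would fit into the reviewer’s existing workflow, but the control point is the same: the agent proposes, a human disposes.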
Beyond keeping a human in the loop, security capabilities and data governance policies are essential for minimizing the risks associated with AI agents.
Create a closed loop. In addition to monitoring inputs and outputs, you need a way to continuously improve the accuracy of results, making the agent smarter and more effective over time. Doing so requires a closed loop: based on what you learn from each interaction, you can adjust the prompting, the flow of information, or the data set in the RAG system.
Additional techniques can help as well. In particular, incorporating more data into models and adding greater context to prompts can further enhance the accuracy of results.
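As an illustration of what such a loop might capture, the hypothetical sketch below logs each interaction along with a reviewer’s verdict, then surfaces the documents most often implicated in wrong answers, pointing you to the prompts or RAG data that need fixing.

```ts
// A hypothetical feedback loop: log every interaction with a human rating,
// then surface failures so prompts or the RAG data set can be corrected.
interface InteractionLog {
  prompt: string;
  retrievedDocIds: string[];
  response: string;
  humanVerdict: "correct" | "incorrect";
  reviewerNote?: string;
}

const logs: InteractionLog[] = [];

function recordInteraction(entry: InteractionLog): void {
  logs.push(entry);
}

// Periodically review failures to decide what to fix: a misleading prompt,
// a gap in the knowledge base, or a document that needs updating.
function failuresByDocument(): Map<string, number> {
  const counts = new Map<string, number>();
  for (const log of logs) {
    if (log.humanVerdict !== "incorrect") continue;
    for (const docId of log.retrievedDocIds) {
      counts.set(docId, (counts.get(docId) ?? 0) + 1);
    }
  }
  return counts; // documents with high counts are implicated in the most wrong answers
}
```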
Some commercial organizations (such as Salesforce and Microsoft) are already integrating AI agents into workflows. These organizations are starting with closed environments, leveraging AI agents to give their users new or better experiences. For example, a Microsoft user can ask an AI agent to summarize their emails from a particular person, accept all invitations from their manager, and reschedule requests that overlap with their manager’s requests.
Public sector organizations, under extreme pressure to enhance efficiency and cut costs, are poised for AI agent adoption. Your organization might start implementing agents for simple workflows to save time and reduce the need for manual human effort.
But in the near future, you might be able to use AI agents in more complicated, open environments. The combination of AI agents with IoT, for example, will create important new possibilities, such as using AI agents with autonomous tools or vehicles. These use cases can continue to improve the quality of services delivered to citizens while dramatically enhancing operational efficiency — which will remain a critical goal for government agencies.
Cloudflare can help your organization build AI agents and establish necessary guardrails. The Cloudflare Workers AI platform enables organizations to create and run AI agents and applications on Cloudflare’s global network, which helps maximize performance and improve scalability of models while streamlining development and reducing costs. Meanwhile, Cloudflare can accelerate implementation of a Zero Trust security model that improves the accuracy of AI agent outputs by controlling access to the data and models used by the agents. With Cloudflare, government agencies and other public sector organizations can capitalize on the power of AI agents to improve services and reduce costs while strengthening the guardrails for their use.
This article is part of a series on the latest trends and topics impacting today’s technology decision-makers.
Learn more about implementing guardrails for the use of AI agents in your organization with the Ensuring safe AI practices guide for CISOs.
Dan Kent — @danielkent1
Field CTO for Public Sector, Cloudflare
After reading this article, you will be able to understand:
4 key use cases for AI agents in the public sector
The primary challenges organizations are facing with implementation
5 considerations to help organizations develop a plan to get started