Ensuring Caseworker Expertise is Considered in Discussions of How Agencies Will Use AI
The Office of Science and Technology Policy recently asked for input on what will become a national strategy on artificial intelligence. The request for information (RFI) covered a wide range of topics — from AI’s impact on the future of work to how it will change the information ecosystem and questions about regulation. One section of the RFI, however, is of particular relevance to caseworkers, with questions about “innovating in public services,” including:
How can the Federal Government effectively and responsibly leverage AI to improve Federal services and missions? What are the highest priority and most cost-effective ways to do so?
How can Federal agencies use shared pools of resources, expertise, and lessons learned to better leverage AI in government?
What unique opportunities and risks would be presented by integrating recent advances in generative AI into Federal Government services and operations?
These questions touch on the work that caseworkers do every day at the front lines of constituents’ interactions with government, and on the issues and difficulties constituents experience along the way. In many ways, caseworkers are exactly the kind of expertise Federal agencies should be consulting as they implement emerging technologies.
We at POPVOX Foundation want to ensure that this insight from caseworkers is included in the high-level discussions about policy and priorities that will take place over the coming months, so we submitted a response making that point. As we noted, there is precedent for caseworkers being consulted by OMB: in 1977, a study group under President Carter surveyed Congressional caseworkers for their insights on the agencies most in need of reorganization, and presented those findings. We are hopeful that today’s OMB will revive this methodology, drawing on caseworkers’ experience of how constituents interact with the federal government and their sense of where AI could make it work better.
We will keep you updated on published findings from the RFI, and we hope you will reach out if your office submitted a response, or if you are interested in participating in an OMB focus group on agencies and AI.
And if you are interested but feeling a little overwhelmed by all the talk about these new technologies, you are not alone! Keep reading for some background and updates on recent events—and just know that we are all learning together.
Catch Me Up: What is an AI/LLM?
The generative AI models that have recently captured public attention are “Large Language Models,” or LLMs, which power services such as ChatGPT from OpenAI, Claude from Anthropic, and Bard from Google. These LLMs are composed of algorithms trained on incredibly large amounts of text, broken into billions of units called “tokens.” For most of these models, the goal of the algorithm is to predict the word that should come next in response to a prompt.
The critical components that determine how these tools behave are 1) the data they are trained on, and 2) the rules built into their algorithms that govern the types of responses and actions they can generate.
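To make that concrete, here is a deliberately tiny sketch in Python of what “predict the next word” means in principle. The probability table is hand-written for illustration; a real LLM learns billions of these patterns from its training data.

```python
# A toy illustration of next-token prediction. Real LLMs learn these
# probabilities from billions of tokens; this table is hand-written.
next_token_probs = {
    "thank": {"you": 0.9, "goodness": 0.1},
    "you": {"for": 0.6, "are": 0.3, ".": 0.1},
    "for": {"your": 0.7, "the": 0.3},
}

def generate(prompt: str, steps: int = 3) -> str:
    """Repeatedly append the most likely next token."""
    tokens = prompt.lower().split()
    for _ in range(steps):
        options = next_token_probs.get(tokens[-1])
        if not options:
            break  # no learned continuation for this token
        tokens.append(max(options, key=options.get))
    return " ".join(tokens)

print(generate("thank"))  # -> "thank you for your"
```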
While this predictive capacity sounds simple (if extremely effective for many repetitive tasks), it can yield surprisingly sophisticated capabilities: negotiating bills (though some caution is warranted), flying aircraft, explaining its own reasoning, and engaging in “moral self-correction” to compensate for bias introduced in training data.
While current AI models operate in response to prompts, next-generation AI tools will be able to act as autonomous “agents” that generate and execute their own task lists in pursuit of a given objective, moving closer to Artificial General Intelligence (AGI): AI models that can reason across domains with human-like flexibility.
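As a rough sketch of that “agent” pattern, here is a minimal loop in Python. The planner and executor are canned stand-ins we invented for illustration; a real agent would call an LLM and external tools at those points.

```python
# A minimal sketch of the "agent" loop: the model plans its own tasks,
# executes them, and re-plans until the objective is met.

def plan_tasks(objective: str, completed: list[str]) -> list[str]:
    # Stand-in planner: a real agent would ask an LLM
    # "given this objective and what's done, what remains?"
    all_tasks = ["gather records", "draft summary", "file response"]
    return [t for t in all_tasks if t not in completed]

def execute(task: str) -> str:
    # Stand-in executor: a real agent would call tools or APIs here.
    return f"done: {task}"

def run_agent(objective: str, max_steps: int = 10) -> list[str]:
    completed: list[str] = []
    for _ in range(max_steps):
        remaining = plan_tasks(objective, completed)
        if not remaining:  # planner reports the objective is met
            break
        execute(remaining[0])
        completed.append(remaining[0])
    return completed

print(run_agent("answer a constituent inquiry"))
```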
How Will This Impact Constituents?
While we do not yet understand all the ways these new tools will impact how people interact with governments (and vice-versa), it is becoming clear that big changes are coming. Three capabilities of generative AI tools are already poised to play significant roles:
The ability to synthesize large amounts of data into a model that continually learns from the input of human experts
The ability to make large and complex sets of rules and data “interactive”
The ability to model complex and multi-variable situations, producing accurate estimates of direct and indirect effects of proposed changes
For example, imagine…
Instant claims decisions
When a veteran submits a claim, an algorithm trained on applicable law and thousands of past claims decisions can call up the veteran’s medical records and render a decision or a recommendation for next steps immediately, and can even answer questions about how it made its decision at a level the claimant or their lawyer can understand. The appeals process brings in human claims examiners to review the AI’s stated reasoning for the claims decision. Appeals decisions from claims examiners are fed back into the training data to make the initial decision algorithm more accurate, just like training a new claims examiner.
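To illustrate the shape of that feedback loop, here is a hypothetical sketch in Python. Every name and rule in it is invented; it is not a description of any agency’s actual system.

```python
# A sketch of the decide-appeal-retrain loop described above. All names
# and rules here are hypothetical, not any agency's implementation.
from dataclasses import dataclass, field

@dataclass
class ClaimsModel:
    training_examples: list[tuple[dict, str]] = field(default_factory=list)

    def decide(self, claim: dict) -> str:
        # Stand-in for a model trained on law and past decisions.
        return "approve" if claim.get("service_connected") else "refer to examiner"

    def learn_from_appeal(self, claim: dict, examiner_ruling: str) -> None:
        # Appeals decided by human examiners become new training data,
        # just like feedback given to a new claims examiner.
        self.training_examples.append((claim, examiner_ruling))

model = ClaimsModel()
claim = {"service_connected": False, "condition": "hearing loss"}
decision = model.decide(claim)             # instant initial decision
model.learn_from_appeal(claim, "approve")  # human appeal result fed back
```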
How far away is this?
This development may not be very far off: facing an enormous backlog of benefits claims, the VA is running a pilot of automated benefits decisions that cuts decision-making down from 100 days per claim to just two. Other agencies are already seeing results from similar systems: for example, the Department of Defense recently deployed a tool that automatically retrieves potential military recruits’ medical records and determines their eligibility as part of the screening process at Military Entrance Processing Stations (MEPS).
Unified constituent experience
Today, for citizens who need to interact with government, there are few clear places to start. Instead of putting the burden on citizens to navigate their interactions, a unified interactive interface could make the experience of government more like talking to a person. Natural language prompts like “I can’t afford my bills, how can I get help?” or “What’s going on with my claim?” or “Do I have to pay taxes on the house I inherited?” or “What is City Council doing about drug crime in my neighborhood?” could make navigating government seamless and easy. AI’s ability to accurately summarize and translate complex information opens up additional options to tailor information to the needs of the person reading it.
A unified interface would also save government entities the time and expense of building, maintaining, and updating independent websites, allowing millions of dollars to be redirected toward AI training and development and other functions.
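As a toy illustration of how such routing might work, here is a sketch in Python. The keyword matching is a stand-in for an LLM classifier, and the service names are our own invented examples.

```python
# A sketch of routing a natural-language question to the right service.
# Keyword matching stands in for an LLM classifier; the services are
# invented examples, not real government offerings.
SERVICES = {
    "bills": "Benefits eligibility screener",
    "claim": "Claim status lookup",
    "taxes": "Inheritance and estate tax guidance",
    "council": "Local government meeting records",
}

def route(question: str) -> str:
    q = question.lower()
    for keyword, service in SERVICES.items():
        if keyword in q:
            return service
    return "General help desk"

print(route("I can't afford my bills, how can I get help?"))
# -> "Benefits eligibility screener"
```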
How far away is this?
Not that far: countries like Ukraine have already made strides integrating their constituent experience into one unified platform, streamlining government bureaucracy (and demonstrating resilience to Russian cyber attacks), and the UAE has integrated an AI chatbot into its service. Consumer-facing AI chatbots already play an enormous role in many businesses, including legal self-help and educational tutoring, representing substantial commercial investment in improving this technology. While the technical skills to develop these models may be lacking in many areas of the US government, Singapore’s model of making AI tools available to public-sector developers may serve as a template to bridge the skills gap and accelerate the integration of AI into public-sector services.
Digital twin for US code and implementation
When a caseworker runs into a constituent problem, they could consult a “digital twin” of the US government trained on the entire body of US code, agency rulemaking, census and IRS data, as well as internal agency information on structure, standard operating procedures, personnel practices, budget requests, strategic plans, and more. With a comprehensive model capturing how legislation, implementation, and constituent behavior fit together, this digital twin could provide a detailed view of how policy happens, from the highest-level questions to the most granular. Properly built and trained, such models could run simulations of the macro- and micro-impacts of potential policy and operational changes, or of external variables like changes in the economy or natural disasters.
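To give a flavor of the “what if” questions such a model could answer, here is a toy simulation in Python. The model and every number in it are invented for illustration.

```python
# A toy "what if" query of the kind a digital twin could answer.
# The model and all numbers are invented for illustration.
def simulate_processing_backlog(staff: int, claims_per_month: int,
                                claims_per_worker: int = 40) -> int:
    """Toy model: how many claims go unprocessed each month?"""
    capacity = staff * claims_per_worker
    return max(0, claims_per_month - capacity)

# Compare current staffing with a proposed operational change.
print(simulate_processing_backlog(staff=200, claims_per_month=10_000))  # 2000
print(simulate_processing_backlog(staff=250, claims_per_month=10_000))  # 0
```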
How far away is this?
While an entire working digital twin for US law and agencies is likely far off, complex “Digital Twin” models at scale are already in operation around the world. For example, Singapore integrated physical, legal, and infrastructure mapping data into a digital twin that can accurately predict how water moves through the island. This country-scale digital twin allows policymakers to model how building changes will impact water levels, making water management, urban planning, and disaster response planning far faster and more accurate.
In the US, researchers are also exploring “law as code” models that allow for more accurate prediction and modeling of changes to benefits programs—essentially making the complex modeling of a CBO score available to the public.
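As a taste of what “law as code” means in practice, here is a sketch in Python with an invented eligibility rule. The thresholds are made up; the point is that a rule written as code can be re-run instantly under proposed amendments.

```python
# "Law as code": encode a statute's eligibility rule as an executable
# function, so proposed amendments can be tested against real cases.
# The thresholds below are invented for illustration, not actual law.
def eligible(income: float, household_size: int,
             income_limit_base: float = 20_000,
             per_person: float = 5_000) -> bool:
    """Household qualifies if income falls under a size-adjusted limit."""
    return income <= income_limit_base + per_person * (household_size - 1)

# Modeling a policy change is just re-running cases with new parameters.
household = {"income": 27_000, "household_size": 2}
print(eligible(**household))                            # current rule: False
print(eligible(**household, income_limit_base=25_000))  # proposed rule: True
```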
The Bottom Line
In the coming weeks and months, people who care about safety, security, and risk will hear a lot about why they should be afraid of these models and the potential harms they can cause. These risks are real; we have written before about some of the potential harms that might show up in casework.
But balanced against those risks is the fact that this tech is a generational opportunity to look at government and ask, “how would it work if it were magic?”
We hope that caseworkers’ deep understanding of the constituent experience will be a vital source of expertise to help guide and develop the policy governing AI in government in the future.