AI-Powered Rules as Code: Experiments with Public Benefits Policy

Public interest technologists are still in the early stages of understanding how to effectively use large language models (LLMs) to translate policy into code. This report documents four experiments evaluating the current performance of commercially available LLMs at translating policies into plain-language summaries, machine-readable pseudocode, or usable code within a Rules as Code process. We used eligibility rules and policies for the Supplemental Nutrition Assistance Program (SNAP) and Medicaid.

The experiments included asking a chatbot or LLM about specific policies, summarizing policy in a machine-readable format, and using fine-tuning or Retrieval-Augmented Generation (RAG) to improve an LLM's ability to generate code that encodes policy. We found that LLMs can support the process of generating code from policy, but for any policy containing complex logic they still require external knowledge and human oversight within an iterative process.
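As a simplified illustration of the kind of output these experiments targeted, the sketch below (not taken from the report) encodes a single SNAP eligibility rule, the gross income test at 130% of the federal poverty guideline, as a small, testable Python function. The guideline figures are placeholders that would need to be verified against current federal policy.

```python
# Illustrative sketch only (not from the report): one SNAP eligibility
# rule encoded as a small, testable function. SNAP's gross income test
# requires gross monthly income at or below 130% of the federal poverty
# guideline for the household size.

# Placeholder monthly poverty guidelines (48 contiguous states and D.C.);
# real Rules as Code work would load verified, current figures.
POVERTY_GUIDELINE_MONTHLY = {
    1: 1255,
    2: 1704,
    3: 2152,
    4: 2600,
}

GROSS_INCOME_LIMIT_FACTOR = 1.30  # 130% of the poverty guideline


def passes_gross_income_test(household_size: int, gross_monthly_income: float) -> bool:
    """Return True if the household passes SNAP's gross income test."""
    if household_size not in POVERTY_GUIDELINE_MONTHLY:
        raise ValueError("household size not covered by this illustrative table")
    limit = POVERTY_GUIDELINE_MONTHLY[household_size] * GROSS_INCOME_LIMIT_FACTOR
    return gross_monthly_income <= limit


# Example: a household of 3 with $2,500 gross monthly income
print(passes_gross_income_test(3, 2500))  # True (2500 <= 2152 * 1.30 = 2797.60)
```

The experiments asked LLMs to produce code of roughly this shape from policy text; the finding above reflects that human reviewers still needed to verify the generated logic and thresholds against the source policy.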

The research team conducted initial experiments from June to September 2024 during the Policy2Code Prototyping Challenge, hosted by the Digital Benefits Network and the Massive Data Institute as part of the Rules as Code Community of Practice. Twelve teams from the U.S. and Canada participated in the resulting Policy2Code Demo Day. The research team completed the remaining experiments and analysis from October 2024 to February 2025.


Read the full report and a summary with key takeaways on the Digital Government Hub.

Read the announcement on the Beeck Center Updates page.

Contributing Authors

Ariel Kennan
Senior Director, Digital Benefits Network

Alessandra Garcia Guevara
Student Analyst

Jason Goodman
Student Analyst