The Boundaries Between Standards and Policy: AI Training as a Case Study
One of the areas I’m tracking from my usual standards perspective is how we set up guardrails for AI—how we contain its risks while still allowing the world to benefit from its utility. This challenge provides an excellent case study in the limitations of technical standards and where policy must step in to complement them.
A perfect example is the new IETF AI Preferences (AIPREF) working group, whose charter states:
The AI Preferences Working Group will standardize building blocks that allow for the expression of preferences about how content is collected and processed for Artificial Intelligence (AI) model development, deployment, and use.
That covers a lot of ground. AIPREF is working on a machine-readable vocabulary to signal whether a website opts in or out of AI collecting and using its content for training or generation. That one issue alone is incredibly complex, raising fundamental questions about how standards and policy interact.
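To make that concrete, here is a minimal sketch of what consuming such a signal might look like. Everything in it is an assumption on my part: AIPREF has not settled on field names or a discovery mechanism, so the file name ai-preferences.txt and the key ai-training are placeholders, not the working group's actual vocabulary.

```python
# Hypothetical sketch: the file location (/ai-preferences.txt) and the field
# names below are placeholders, not the AIPREF working group's real syntax.
import urllib.request
from urllib.parse import urljoin

def fetch_ai_preferences(site: str) -> dict[str, str]:
    """Fetch a hypothetical preference file and parse simple 'key: value' lines."""
    url = urljoin(site, "/ai-preferences.txt")
    prefs: dict[str, str] = {}
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            for line in resp.read().decode("utf-8").splitlines():
                if ":" in line and not line.lstrip().startswith("#"):
                    key, value = line.split(":", 1)
                    prefs[key.strip().lower()] = value.strip().lower()
    except OSError:
        pass  # No preference file found; what applies now is a policy question.
    return prefs

prefs = fetch_ai_preferences("https://example.com")
print(prefs.get("ai-training", "unspecified"))
```

Even this toy version runs straight into the questions below: what happens when the lookup comes back "unspecified", and how fine-grained can the answers be?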
Where Standards End and Policy Begins
Let’s break down some of the key issues:
What’s the default if no AI preference is specified? Is it opt-out (AI can collect anything unless told otherwise) or opt-in (AI needs explicit permission)?
How granular should this be? Can a site allow AI to use some content but not all?
How public should AI preferences be? Should opt-in or opt-out status be easily accessible, or should it be restricted?
How do you protect non-public AI preferences? If a site’s AI policies shouldn’t be public, how should that information be secured?
Pop quiz: which of these should be covered in a technical standard, and which belong in policy or regulation?
Technical Standards vs. Policy: Drawing the Line
The first one—what’s the default setting?—is a policy decision. The standard just needs to provide a vocabulary that enables either choice. Same for the second issue—granularity. Whether a site can allow AI to use some content but not others is a policy decision; the standard just provides the technical means to express it.
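As a sketch of how that division of labor plays out in code, the resolver below treats both the default and the granularity as inputs supplied by whoever deploys it; the path-prefix scheme and value names are hypothetical, since AIPREF has not defined them.

```python
# Illustrative only: the vocabulary can express "allow" or "deny", but which one
# applies when nothing is expressed is a policy choice passed in by the caller.
from enum import Enum

class Preference(Enum):
    ALLOW = "allow"
    DENY = "deny"

def may_train_on(prefs: dict[str, str], path: str, policy_default: Preference) -> bool:
    """Resolve a per-path preference, falling back to a policy-chosen default.

    `prefs` maps path prefixes to "allow"/"deny", e.g.
    {"/blog/": "allow", "/research/": "deny"} (a hypothetical shape for granularity).
    """
    # Most specific (longest) prefix wins.
    for prefix, value in sorted(prefs.items(), key=lambda kv: -len(kv[0])):
        if path.startswith(prefix):
            return value == "allow"
    # Nothing expressed: the standard stays silent and policy decides.
    return policy_default is Preference.ALLOW

# An opt-in regime passes Preference.DENY as the default; an opt-out regime
# passes Preference.ALLOW. Same vocabulary, opposite outcomes.
print(may_train_on({"/research/": "deny"}, "/blog/post-1", Preference.DENY))  # False
```

The point is that nothing in the resolver itself knows which regime is correct; that answer has to come from outside the standard.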
What about the third—public vs. private AI preferences? That’s policy too. Different jurisdictions may require different levels of transparency, and contract law could add further complexity. Again, the standard has to support both possibilities.
The fourth, though—protecting non-public AI preferences—is a technical problem. If the information isn’t public, it needs to be encrypted. If it’s encrypted, authorized entities need the keys to decrypt it. And just like that, we’ve gone from a vocabulary problem to a key management problem—an entirely separate research area, likely out of scope for AIPREF.
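To see why the problem shifts, here is a sketch using the third-party cryptography package (my choice for illustration; AIPREF does not specify any protection mechanism). Encrypting the preference file is the easy part; the hard part is everything around the key.

```python
# Sketch only: AIPREF does not define how non-public preferences are protected.
# Uses the third-party `cryptography` package (pip install cryptography) purely
# to show where the key-management problem appears.
from cryptography.fernet import Fernet

# Encrypting the preferences themselves is straightforward...
key = Fernet.generate_key()
cipher = Fernet(key)
encrypted_prefs = cipher.encrypt(b"ai-training: deny\nai-generation: allow")

# ...but now every authorized crawler needs this key, and the publisher needs a
# way to issue, rotate, and revoke it. That distribution problem, not the
# encryption call above, is the part that falls outside a vocabulary standard.
print(cipher.decrypt(encrypted_prefs).decode("utf-8"))
```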
The Broader AI Policy Landscape
AIPREF isn’t the only group tackling these issues. The G7 Hiroshima AI Process (HAIP), supported by the OECD, delivered a Comprehensive Policy Framework in 2023. This includes Guiding Principles, a Code of Conduct, and a report outlining the G7 Common Understanding of Generative AI—all aimed at balancing innovation with responsible AI governance in some of the biggest markets in the world.
Meanwhile, in the UK, the Information Commissioner’s Office (ICO) weighed in on AI and copyright in response to a government consultation. While copyright law isn’t in the ICO’s jurisdiction, personal data protection is—and the two areas overlap significantly. The ICO emphasized the need for clearer policy signals so AI developers know what they can and can’t do. (Sound familiar? That’s essentially the first question in our list above.) They also pointed out potential unintended consequences, such as AI training exceptions in copyright law that don’t necessarily override other legal obligations, like data protection rules.
AI and Ethical Use: Mind the Gaps
As AI tools become more integrated into daily life, we all need to check our assumptions about whether the content we interact with is ethically sourced. This isn’t about avoiding AI—it’s about being thoughtful in how we use it. Standards provide the technical mechanisms, but it’s policy that defines the rules. And if policy doesn’t step up, AI developers are left guessing—or making those decisions themselves.