[Contrib] Agent-OS Governance Guardrails: Kernel-Level Policy Enforcement #2422
imran-siddique wants to merge 2 commits into openai:main
Conversation
Adds kernel-level guardrails for the OpenAI Agents SDK. Features:
- GovernanceGuardrail: output guardrail with policy enforcement
- GovernancePolicy: configure blocked patterns, tools, and limits
- GovernedRunner: wrapper with automatic guardrail injection
- Input validation for pre-processing
- Violation callbacks for alerting

Integrates with the Agent-OS kernel for enterprise governance. See: https://github.com/imran-siddique/agent-os
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4a37fd7363
src/agents/contrib/_governance.py (outdated)
```python
            tripwire_triggered=True,
        )

    return GuardrailFunctionOutput(tripwire_triggered=False)
```
Provide output_info when guardrail passes
When no violation is found, the guardrail returns GuardrailFunctionOutput(tripwire_triggered=False) without the required output_info argument. The real GuardrailFunctionOutput dataclass in agents.guardrail requires both fields, so this path raises a TypeError on the first successful output check. This means any non-blocked output will crash the run instead of allowing it. Consider passing output_info=None (or similar) to match the required signature.
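A minimal sketch of the suggested fix (the `_passed` helper is illustrative, not from the PR): supply both required fields on the passing path.

```python
from agents import GuardrailFunctionOutput

def _passed() -> GuardrailFunctionOutput:
    # Both dataclass fields are required; None is a valid output_info.
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=False)
```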
```python
# Add guardrail to agent if not present
if hasattr(agent, "output_guardrails"):
    if self.guardrail not in agent.output_guardrails:
        agent.output_guardrails = list(agent.output_guardrails or [])
        agent.output_guardrails.append(self.guardrail)
```
Wire tool policies into tool guardrails
The runner only appends the governance guardrail to agent.output_guardrails, so tool policies like blocked_tools and max_tool_calls are never enforced during execution. check_tool is only invoked if users call it manually, which contradicts the documented “Tool Control” and “Rate Limiting” behavior for GovernedRunner. As implemented, a run can call any tool unlimited times without triggering a violation. Consider adding a tool input guardrail or otherwise integrating check_tool into the tool execution path.
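One way to close this gap, as a hypothetical sketch rather than the PR's implementation: wrap each plain tool function so every call runs through the policy check before execution. The `governed` helper and the assumed `check_tool(name) -> str | None` signature are illustrative.

```python
import functools

def governed(guardrail, fn):
    """Wrap a tool function so each call is policy-checked first (sketch)."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        violation = guardrail.check_tool(fn.__name__)  # assumed: message or None
        if violation is not None:
            raise PermissionError(f"{fn.__name__} blocked: {violation}")
        return fn(*args, **kwargs)
    return wrapper
```

Each wrapped function could then be registered with the SDK's `function_tool` as usual, so blocked-tool and call-count policies apply on every invocation.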
- Add output_info=None when guardrail passes (fixes TypeError)
- Clarify that tool policies require manual check_tool() calls
- Add Tool Policy Enforcement section to README
Thanks for sending this patch. We don't plan to have this extension as part of this repository, so please consider having this module in your repo (or your contrib repo) instead.
Thanks @seratch for the quick review and feedback! Totally understand - we'll keep this in the Agent-OS repo as a standalone integration. If you or the team have any input on the guardrail patterns or suggestions for how Agent-OS could better complement the SDK, we'd love to hear it. Feel free to open an issue or discussion on our repo anytime. Appreciate the work on the Agents SDK - it's a great foundation! 🚀
Summary
Adds kernel-level guardrails for the OpenAI Agents SDK using Agent-OS.
Why This Matters
The Agents SDK provides powerful guardrail primitives. This contribution adds a ready-to-use governance guardrail that provides:
- Pattern blocking for dangerous output (e.g., destructive SQL or shell commands)
- Tool control via a blocklist of tool names
- Rate limiting through a maximum tool-call budget
- Violation callbacks for alerting
Changes
- New src/agents/contrib/_governance.py implementing GovernanceGuardrail, GovernancePolicy, and GovernedRunner
- Follow-up commit adding output_info=None on the passing path and a Tool Policy Enforcement section to the README
Example Usage
```python
from agents import Agent, Runner
from agents.contrib import create_governance_guardrail

# Create guardrail
guardrail = create_governance_guardrail(
    blocked_patterns=["DROP TABLE", "rm -rf"],
    blocked_tools=["shell_execute"],
    max_tool_calls=10,
)

# Create agent with guardrail
agent = Agent(
    name="analyst",
    instructions="Analyze data safely",
    output_guardrails=[guardrail],
)

# Run agent
result = await Runner.run(agent, "Analyze Q4 sales")
```
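A tripped output guardrail surfaces as an exception rather than a return value; a caller would typically handle it like this (a sketch assuming the SDK's standard top-level OutputGuardrailTripwireTriggered export):

```python
from agents import OutputGuardrailTripwireTriggered

try:
    result = await Runner.run(agent, "Analyze Q4 sales")
    print(result.final_output)
except OutputGuardrailTripwireTriggered:
    # Governance policy blocked the output; alert or log here.
    print("Blocked by governance policy")
```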
Value for SDK Users
Integration Path
This guardrail works standalone with the SDK's guardrail system, but can also integrate with the full Agent-OS kernel for enterprise-wide governance beyond a single agent run.
References
- Agent-OS: https://github.com/imran-siddique/agent-os