
[Contrib] Agent-OS Governance Guardrails: Kernel-Level Policy Enforcement#2422

Closed
imran-siddique wants to merge 2 commits into openai:main from imran-siddique:contrib/agent-os

Conversation

@imran-siddique

Summary

Adds kernel-level guardrails for the OpenAI Agents SDK using Agent-OS.

Why This Matters

The Agents SDK provides powerful guardrail primitives. This contribution adds a ready-to-use governance guardrail that offers:

  • Content Filtering: Block dangerous patterns (SQL injection, shell commands)
  • Tool Control: Limit which tools agents can use
  • Rate Limiting: Cap tool invocations per run
  • Input Validation: Filter inputs before agent processing
  • Violation Handling: Callbacks for alerting and logging
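
As a rough illustration of the "Content Filtering" bullet above, the core check is a scan of text against a list of blocked patterns. The function name and shape here are illustrative only, not the PR's actual API:

```python
import re

def find_violations(text: str, blocked_patterns: list[str]) -> list[str]:
    """Return the blocked patterns that appear (case-insensitively) in text."""
    return [
        pattern
        for pattern in blocked_patterns
        if re.search(re.escape(pattern), text, re.IGNORECASE)
    ]

violations = find_violations(
    "please run rm -rf /tmp/cache", ["DROP TABLE", "rm -rf"]
)
# violations == ["rm -rf"]
```

Escaping each pattern with `re.escape` treats entries like `rm -rf` as literal strings rather than regular expressions, which matches the literal blocklist shown in the example usage below.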

Changes

  • Added `src/agents/contrib/`
    • `_governance.py` - GovernanceGuardrail, GovernancePolicy, GovernedRunner
    • `__init__.py` - Public exports
    • `README.md` - Documentation and examples

Example Usage

```python
from agents import Agent, Runner
from agents.contrib import create_governance_guardrail

# Create guardrail
guardrail = create_governance_guardrail(
    blocked_patterns=["DROP TABLE", "rm -rf"],
    blocked_tools=["shell_execute"],
    max_tool_calls=10,
)

# Create agent with guardrail
agent = Agent(
    name="analyst",
    instructions="Analyze data safely",
    output_guardrails=[guardrail],
)

# Run agent
result = await Runner.run(agent, "Analyze Q4 sales")
```
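
The `max_tool_calls` parameter above implies a per-run counter that trips once the limit is exceeded. A minimal, self-contained sketch of that "Rate Limiting" idea (class and method names are hypothetical, not taken from the PR):

```python
class ToolCallLimiter:
    """Per-run counter that reports when a tool-call budget is exceeded."""

    def __init__(self, max_tool_calls: int):
        self.max_tool_calls = max_tool_calls
        self.calls = 0

    def record_call(self, tool_name: str) -> bool:
        """Record one invocation; return True if the limit is now exceeded."""
        self.calls += 1
        return self.calls > self.max_tool_calls

limiter = ToolCallLimiter(max_tool_calls=2)
tripped = [limiter.record_call("search") for _ in range(3)]
# tripped == [False, False, True]
```

A real guardrail would presumably surface the `True` case as a tripwire rather than a boolean, but the counting logic is the same.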

Value for SDK Users

| Feature | Without Guardrail | With Agent-OS |
| --- | --- | --- |
| Pattern Blocking | DIY | Built-in |
| Tool Limits | Manual | Automatic |
| Input Validation | Separate | Integrated |
| Violation Tracking | DIY | Built-in callbacks |

Integration Path

This guardrail works standalone with the SDK's guardrail system, but can also integrate with the full Agent-OS kernel for:

  • GDPR/HIPAA compliance policies
  • Cost control limits
  • Human-in-the-loop approval flows
  • Cross-framework governance


Adds kernel-level guardrails for OpenAI Agents SDK.

Features:
- GovernanceGuardrail: Output guardrail with policy enforcement
- GovernancePolicy: Configure blocked patterns, tools, limits
- GovernedRunner: Wrapper with automatic guardrail injection
- Input validation for pre-processing
- Violation callbacks for alerting

Integration with Agent-OS kernel for enterprise governance.

See: https://github.com/imran-siddique/agent-os
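
A possible shape for the `GovernancePolicy` named in the commit message; the field names are inferred from the `create_governance_guardrail()` example above and are an assumption, not the PR's verified API:

```python
from dataclasses import dataclass, field

@dataclass
class GovernancePolicy:
    """Hypothetical policy config: blocklists plus a per-run tool budget."""

    blocked_patterns: list[str] = field(default_factory=list)
    blocked_tools: list[str] = field(default_factory=list)
    max_tool_calls: int = 10

policy = GovernancePolicy(
    blocked_patterns=["DROP TABLE"],
    blocked_tools=["shell_execute"],
)
# policy.max_tool_calls == 10 (default budget)
```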

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4a37fd7363


```python
    tripwire_triggered=True,
)

return GuardrailFunctionOutput(tripwire_triggered=False)
```


P1: Provide output_info when guardrail passes

When no violation is found, the guardrail returns GuardrailFunctionOutput(tripwire_triggered=False) without the required output_info argument. The real GuardrailFunctionOutput dataclass in agents.guardrail requires both fields, so this path raises a TypeError on the first successful output check. This means any non-blocked output will crash the run instead of allowing it. Consider passing output_info=None (or similar) to match the required signature.
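
A minimal stand-in dataclass (mirroring the two required fields the review describes) shows why the pass-through path fails and how the suggested fix resolves it. This is a sketch for illustration, not the SDK's actual class:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class GuardrailFunctionOutput:
    """Stand-in: both fields are required, matching the review's description."""

    output_info: Any
    tripwire_triggered: bool

# Broken path from the PR: omitting output_info raises TypeError,
# because the dataclass field has no default.
try:
    GuardrailFunctionOutput(tripwire_triggered=False)  # type: ignore[call-arg]
    raised = False
except TypeError:
    raised = True

# Fix suggested by the review: pass output_info explicitly.
ok = GuardrailFunctionOutput(output_info=None, tripwire_triggered=False)
# raised == True; ok.tripwire_triggered == False
```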


Comment on lines +318 to +323

```python
# Add guardrail to agent if not present
if hasattr(agent, "output_guardrails"):
    if self.guardrail not in agent.output_guardrails:
        agent.output_guardrails = list(agent.output_guardrails or [])
        agent.output_guardrails.append(self.guardrail)
```


P2: Wire tool policies into tool guardrails

The runner only appends the governance guardrail to agent.output_guardrails, so tool policies like blocked_tools and max_tool_calls are never enforced during execution. check_tool is only invoked if users call it manually, which contradicts the documented “Tool Control” and “Rate Limiting” behavior for GovernedRunner. As implemented, a run can call any tool unlimited times without triggering a violation. Consider adding a tool input guardrail or otherwise integrating check_tool into the tool execution path.
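
One hedged sketch of what the review asks for: wrapping each tool so the policy check runs on every invocation instead of relying on manual calls. The PR's `check_tool` is not reproduced here; the wrapper, names, and policy shape below are illustrative assumptions only:

```python
from typing import Any, Callable

def governed_tool(
    tool_fn: Callable[..., Any],
    tool_name: str,
    blocked_tools: set[str],
    counter: dict[str, int],
    max_tool_calls: int,
) -> Callable[..., Any]:
    """Wrap a tool so blocklist and rate-limit checks run on every call."""

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        if tool_name in blocked_tools:
            raise PermissionError(f"tool {tool_name!r} is blocked by policy")
        counter["calls"] = counter.get("calls", 0) + 1
        if counter["calls"] > max_tool_calls:
            raise PermissionError("max_tool_calls exceeded for this run")
        return tool_fn(*args, **kwargs)

    return wrapper

counter: dict[str, int] = {}
add = governed_tool(lambda a, b: a + b, "add", {"shell_execute"}, counter, 2)
add(1, 2)  # allowed: first call
add(3, 4)  # allowed: second call
# a third call would raise PermissionError, as would any call to a blocked tool
```

In the actual SDK this would more likely take the form of a tool input guardrail than a hand-rolled wrapper, but the enforcement point, inside the tool execution path rather than on agent output, is the substance of the review comment.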


- Add output_info=None when guardrail passes (fixes TypeError)
- Clarify that tool policies require manual check_tool() calls
- Add Tool Policy Enforcement section to README
@seratch (Member) commented Feb 5, 2026

Thanks for sending this patch. We don't plan to have this extension as part of this repository, so please consider having this module in your repo (or your contrib repo) instead.

@seratch closed this Feb 5, 2026
@imran-siddique (Author)

Thanks @seratch for the quick review and feedback! Totally understand - we'll keep this in the Agent-OS repo as a standalone integration.

If you or the team have any input on the guardrail patterns or suggestions for how Agent-OS could better complement the SDK, we'd love to hear it. Feel free to open an issue or discussion on our repo anytime.

Appreciate the work on the Agents SDK - it's a great foundation! 🚀

