[Contrib] Agent-OS Integration: Kernel-Level Safety for RL Training #478

Open
imran-siddique wants to merge 22 commits into microsoft:main from imran-siddique:contrib/agent-os

@imran-siddique
Member

Summary

Adds Agent-OS integration to enable training agents with deterministic safety guarantees.

Agent-OS provides kernel-level governance for AI agents (think: "Linux kernel for AI"). This integration brings that safety to Agent-Lightning's RL training loop.

Why This Matters

Agent-Lightning can train smarter agents. Agent-OS ensures they're also safer.

Key benefits:

  • 0% policy violations during training - Unsafe actions are blocked or penalized
  • Violations become learning signals - Agents learn to avoid unsafe behavior
  • Complete audit trail - From training to production
  • Compliance-friendly - Policy enforcement is deterministic and auditable

Components Added

  • `AgentOSRunner`: Runner that wraps execution with kernel-level policy enforcement
  • `PolicyReward`: Converts policy violations to RL penalties (critical=-100, high=-50, etc.)
  • `FlightRecorderAdapter`: Imports Agent-OS audit logs to LightningStore
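The violation-to-penalty conversion can be sketched as follows. This is a minimal illustration, not the actual `PolicyReward` implementation: only the `critical` and `high` values come from the description above, while the `medium` and `low` values, the `Violation` class, and the `policy_penalty` function are assumptions made for the example.

```python
from dataclasses import dataclass

# Only "critical" and "high" are stated in the PR description.
# "medium" and "low" are placeholder values assumed for illustration.
SEVERITY_PENALTIES = {
    "critical": -100.0,
    "high": -50.0,
    "medium": -10.0,  # assumed
    "low": -1.0,      # assumed
}


@dataclass(frozen=True)
class Violation:
    """Hypothetical record of a single policy violation."""
    policy: str
    severity: str


def policy_penalty(base_reward: float, violations: list[Violation]) -> float:
    """Return the task reward plus the (negative) penalty for each violation."""
    penalty = sum(SEVERITY_PENALTIES[v.severity] for v in violations)
    return base_reward + penalty
```

Summing penalties per violation is one possible design. Taking only the worst severity would be an equally plausible choice, and the PR does not say which one is used.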

Benchmarks

| Metric | Without Agent-OS | With Agent-OS |
|--------|------------------|---------------|
| Policy Violations | 12.3% | 0.0% |
| Task Accuracy | 76.4% | 79.2% |

The accuracy improvement comes from agents learning to avoid dead-ends (blocked actions) during training.

Example Usage

```python
from agentlightning import Trainer
from agentlightning.contrib.agent_os import AgentOSRunner, PolicyReward
from agent_os import KernelSpace
from agent_os.policies import SQLPolicy

# Create governed kernel
kernel = KernelSpace(policy=SQLPolicy(deny=['DROP', 'DELETE']))

# Wrap in Agent-OS runner
runner = AgentOSRunner(kernel)

# Train with policy-aware rewards
trainer = Trainer(
    runner=runner,
    reward_fn=PolicyReward(kernel),
    algorithm='GRPO',
)

trainer.train()
```
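The example passes a deny list (`DROP`, `DELETE`) to `SQLPolicy`, but the PR does not describe how statements are matched against it. A minimal sketch of one way such a check could work, assuming case-insensitive whole-word matching; the function name `find_denied_keywords` is hypothetical and not part of Agent-OS:

```python
import re


def find_denied_keywords(sql: str, deny: list[str]) -> list[str]:
    """Return the denied keywords that appear in `sql` as whole words.

    Assumed matching rule: case-insensitive, whole-word only.
    """
    found = []
    for keyword in deny:
        # \b keeps "DROP" from matching inside identifiers such as "dropdown".
        if re.search(rf"\b{re.escape(keyword)}\b", sql, flags=re.IGNORECASE):
            found.append(keyword)
    return found
```

Keyword matching of this kind is only a rough filter: it also flags keywords that appear inside string literals or comments, so a production policy would more likely parse the statement.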

Testing

  • Unit tests pass
  • SQL agent example works
  • Policy enforcement verified
  • Spans emit correctly

References

Checklist

  • Self-contained in `contrib/`
  • README with usage examples
  • No changes to core agentlightning
  • MIT license compatible

Adds Agent-OS integration to enable training agents with deterministic
safety guarantees.

## Summary
Agent-OS provides kernel-level governance for AI agents. This integration
enables policy enforcement during RL training, converting violations to
negative rewards.

## Components
- AgentOSRunner: Runner with policy enforcement
- PolicyReward: Convert violations to RL penalties
- FlightRecorderAdapter: Import audit logs to LightningStore

## Key Benefits
- 0% policy violations during training
- Violations become learning signals (negative rewards)
- Complete audit trail from training to production
- Compatible with GRPO, Flow-GRPO algorithms

## Benchmarks
| Metric | Without Agent-OS | With Agent-OS |
|--------|------------------|---------------|
| Policy Violations | 12.3% | 0.0% |
| Task Accuracy | 76.4% | 79.2% |

## Example
```python
from agentlightning import Trainer
from agentlightning.contrib.agent_os import AgentOSRunner, PolicyReward
from agent_os import KernelSpace

kernel = KernelSpace(policy='strict')
runner = AgentOSRunner(kernel)
trainer = Trainer(runner=runner, algorithm='GRPO')
```

## References
- Agent-OS: https://github.com/imran-siddique/agent-os
- Documentation: https://imran-siddique.github.io/agent-os-docs/
Copilot AI review requested due to automatic review settings February 5, 2026 19:35
Contributor

Copilot AI left a comment

Pull request overview

This PR adds an Agent-OS integration to Agent-Lightning, providing kernel-level governance for AI agent training. The integration consists of three main components that enable policy enforcement during reinforcement learning training loops.

Changes:

  • Adds AgentOSRunner that wraps agent execution with Agent-OS kernel policy enforcement
  • Adds PolicyReward that converts policy violations into negative RL rewards
  • Adds FlightRecorderAdapter that imports Agent-OS audit logs to LightningStore format
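The audit-log import step can be illustrated with a short sketch. Neither the Agent-OS audit log format nor the LightningStore schema is shown in this PR, so everything here is an assumption: the JSON-lines input with `event` and `ts` fields, the `AuditSpan` record, and the `audit_lines_to_spans` function are all hypothetical.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class AuditSpan:
    """Hypothetical span-like record; not the real LightningStore schema."""
    name: str
    start_time: datetime
    attributes: dict


def audit_lines_to_spans(lines: list[str]) -> list[AuditSpan]:
    """Convert JSON-lines audit entries (assumed format) into span records."""
    spans = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the log
        entry = json.loads(line)
        spans.append(
            AuditSpan(
                name=f"agent_os.{entry['event']}",
                # Interpret "ts" as Unix seconds and keep the result timezone-aware.
                start_time=datetime.fromtimestamp(entry["ts"], tz=timezone.utc),
                # Everything except the two known keys is carried over as attributes.
                attributes={k: v for k, v in entry.items() if k not in ("event", "ts")},
            )
        )
    return spans
```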

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 18 comments.

| File | Description |
|------|-------------|
| contrib/agentlightning/contrib/agent_os/runner.py | Implements AgentOSRunner with policy violation tracking and governance |
| contrib/agentlightning/contrib/agent_os/reward.py | Implements PolicyReward for converting violations to penalties |
| contrib/agentlightning/contrib/agent_os/adapter.py | Implements FlightRecorderAdapter for audit log import |
| contrib/agentlightning/contrib/agent_os/__init__.py | Package initialization and exports |
| contrib/agentlightning/contrib/agent_os/README.md | Documentation and usage examples |

imran-siddique and others added 19 commits February 5, 2026 11:42
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Add worker_id/store type hints in __init__
- Use timezone-aware datetime.now(timezone.utc)
- Clarify benchmark claims in README (0% undetected violations)
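For context on the timezone-aware datetime change in the commit above, a minimal sketch of the difference it makes. The helper name `utc_timestamp` is illustrative and not taken from the PR.

```python
from datetime import datetime, timezone


def utc_timestamp() -> str:
    """Return the current time as an ISO 8601 string with an explicit UTC offset."""
    # datetime.now(timezone.utc) returns an aware datetime (tzinfo is set),
    # so the serialized value carries "+00:00" and stays unambiguous.
    return datetime.now(timezone.utc).isoformat()


# A naive datetime has no tzinfo, so a reader of the log cannot tell
# which timezone the value refers to.
naive = datetime.now()
assert naive.tzinfo is None
```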
Clarifies that GovernedRollout provides the core Rollout interface
(task_input, task_output, success) plus governance-specific metadata.
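The shape described in that commit message could look roughly like the sketch below. The three core fields (`task_input`, `task_output`, `success`) are named in the commit message; the governance fields `violations` and `blocked`, the `is_clean` property, and the class name `GovernedRolloutSketch` are assumptions for illustration and may not match the PR's actual `GovernedRollout`.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class GovernedRolloutSketch:
    # Core rollout fields named in the commit message.
    task_input: Any
    task_output: Any
    success: bool
    # Governance metadata: these field names are assumed for illustration.
    violations: list[str] = field(default_factory=list)
    blocked: bool = False

    @property
    def is_clean(self) -> bool:
        """True when the task succeeded with no violations and no blocked action."""
        return self.success and not self.violations and not self.blocked
```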