Conversation
Adds pluto.compat.wandb module that lets users replace `import wandb` with `import pluto.compat.wandb as wandb` to route all logging through pluto with minimal code changes. Module structure: - __init__.py: Module-level API (init, log, finish, watch, config, summary, run) - run.py: Run class wrapping pluto.Op with wandb.Run-compatible interface - config.py: Dict-like Config object that syncs mutations to pluto - summary.py: Dict-like Summary object tracking last-logged values - data_types.py: Wrappers (Image, Audio, Video, Table, Histogram, Html, Artifact, AlertLevel) that convert to pluto equivalents Key features: - commit=False buffering (accumulate data across log calls) - Nested dict flattening with / separator - wandb env var fallbacks (WANDB_PROJECT, WANDB_MODE, WANDB_TAGS, etc.) - Graceful degradation for unsupported features (define_metric, save, etc.) - Context manager support - Disabled-mode fallback when pluto.init() fails https://claude.ai/code/session_01VTSZKK5UsMqjiADFX57SMY
Summary of ChangesHello @asaiacai, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a new Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a wandb-compatible layer for Pluto, allowing users to easily switch from wandb to pluto with minimal code changes. The implementation covers core API functions like init, log, finish, and various data types. The module handles environment variable fallbacks and gracefully degrades for unsupported features, which is a good design choice for compatibility. The accompanying tests provide good coverage for the new functionality. I've identified a few areas for improvement related to redundant logic, unused parameters, and potential data handling nuances that could enhance correctness and maintainability.
pluto/compat/wandb/data_types.py
Outdated
|
|
||
| if self.np_histogram is not None: | ||
| # np_histogram is a tuple of (values, bin_edges) | ||
| return PlutoHistogram(data=self.np_histogram, bins=self.np_histogram) |
There was a problem hiding this comment.
When np_histogram is provided (which is a tuple of (values, bin_edges)), the _to_pluto() method passes the entire tuple as the bins argument to PlutoHistogram. The PlutoHistogram constructor expects bins to be an integer or a sequence of bin edges. Passing the tuple directly might not correctly extract the bin edges, potentially leading to incorrect histogram representation. It should likely be bins=self.np_histogram[1] to pass only the bin edges.
return PlutoHistogram(data=self.np_histogram, bins=self.np_histogram[1])| mode: Optional[str] = None, | ||
| caption: Optional[str] = None, | ||
| grouping: Optional[int] = None, | ||
| classes: Any = None, | ||
| boxes: Any = None, | ||
| masks: Any = None, | ||
| file_type: Optional[str] = None, | ||
| normalize: bool = True, | ||
| ) -> None: |
There was a problem hiding this comment.
Several parameters (grouping, classes, boxes, masks, file_type, normalize) are accepted by the Image constructor but are not stored or used within the class or during conversion to PlutoImage. This can create confusion for users expecting these parameters to have an effect. Consider removing them from the signature or adding a warning if they are provided.
def __init__(
self,
data_or_path: Any = None,
caption: Optional[str] = None,
) -> None:
pluto/compat/wandb/__init__.py
Outdated
| config_dict: Optional[Dict[str, Any]] = None | ||
| if config is not None: | ||
| if isinstance(config, dict): | ||
| config_dict = dict(config) | ||
| elif hasattr(config, '__dict__'): | ||
| config_dict = vars(config) | ||
| else: | ||
| config_dict = {} | ||
|
|
||
| if config_dict and config_include_keys: | ||
| config_dict = { | ||
| k: v for k, v in config_dict.items() if k in config_include_keys | ||
| } | ||
| if config_dict and config_exclude_keys: | ||
| config_dict = { | ||
| k: v for k, v in config_dict.items() if k not in config_exclude_keys | ||
| } |
There was a problem hiding this comment.
The configuration filtering logic could be more robust. If config is None or not a dict/object, config_dict could remain None, leading to errors if config_include_keys or config_exclude_keys are present. It's safer to initialize config_dict to an empty dictionary. Additionally, if both config_include_keys and config_exclude_keys are provided, the order of operations might lead to unexpected results. Consider clarifying the precedence or disallowing both simultaneously.
config_dict: Dict[str, Any] = {}
if config is not None:
if isinstance(config, dict):
config_dict = dict(config)
elif hasattr(config, '__dict__'):
config_dict = vars(config)
if config_include_keys:
config_dict = {
k: v for k, v in config_dict.items() if k in config_include_keys
}
if config_exclude_keys:
config_dict = {
k: v for k, v in config_dict.items() if k not in config_exclude_keys
}| settings=pluto_settings or None, | ||
| run_id=run_id, | ||
| ) | ||
| except Exception as e: |
There was a problem hiding this comment.
When pluto.init() fails, the exception e is logged directly in the warning message. For better debugging and to capture the full context of the error, it's recommended to log the exception with exc_info=True.
logger.warning('%s: pluto.init() failed (%s), creating disabled run', tag, e, exc_info=True)
pluto/compat/wandb/config.py
Outdated
| # Use object.__setattr__ to avoid triggering our __setattr__ | ||
| object.__setattr__(self, '_op', op) | ||
| object.__setattr__(self, '_data', {}) | ||
| object.__setattr__(self, '_allow_val_change', True) |
There was a problem hiding this comment.
| name: Optional[str] = None, | ||
| checksum: bool = True, | ||
| max_objects: Optional[int] = None, | ||
| ) -> 'Artifact': |
There was a problem hiding this comment.
pluto/compat/wandb/run.py
Outdated
| import time | ||
|
|
||
| return time.time() |
There was a problem hiding this comment.
The start_time property currently returns time.time(), which is the current time, not the actual start time of the run. wandb.run.start_time typically refers to the timestamp when the run was initialized. This property should ideally retrieve the actual run start time from self._op.settings or a similar source to provide accurate information.
| import time | |
| return time.time() | |
| @property | |
| def start_time(self) -> float: | |
| return getattr(self._op.settings, 'start_time', time.time()) | |
pluto/compat/wandb/run.py
Outdated
| from pluto.file import Artifact as PlutoArtifact | ||
|
|
||
| art = PlutoArtifact(data=artifact_or_path, caption=name) | ||
| log_name = name or 'artifact' |
There was a problem hiding this comment.
When log_artifact is called with a string path and no name is provided, the log_name defaults to 'artifact'. If multiple string paths are logged without explicit names, they will all be logged under the same key, potentially overwriting previous logs or making them indistinguishable. It would be more robust to use the base filename of the path as the default name if name is None.
art = PlutoArtifact(data=artifact_or_path, caption=name)
log_name = name or os.path.basename(artifact_or_path)
pluto/compat/wandb/summary.py
Outdated
| if isinstance(v, (int, float)) and not isinstance(v, bool): | ||
| store[k] = v | ||
| elif hasattr(v, 'item') and callable(v.item): | ||
| store[k] = v.item() |
There was a problem hiding this comment.
The _update_from_log method explicitly excludes boolean values from being stored in the summary (and not isinstance(v, bool)). While booleans are often treated as flags, wandb.summary might include them as scalar values in some contexts. If full wandb.summary compatibility is desired, this exclusion might lead to a slight behavioral difference. Consider if this exclusion is intentional and aligns with wandb's behavior, or adjust to include booleans if appropriate.
| if isinstance(v, (int, float)) and not isinstance(v, bool): | |
| store[k] = v | |
| elif hasattr(v, 'item') and callable(v.item): | |
| store[k] = v.item() | |
| for k, v in data.items(): | |
| if isinstance(v, (int, float)): | |
| store[k] = v | |
| elif hasattr(v, 'item') and callable(v.item): | |
| store[k] = v.item() | |
tests/test_wandb_compat.py
Outdated
| op.add_tags.assert_called() | ||
|
|
There was a problem hiding this comment.
The test_tags_get_and_set test only asserts that op.add_tags was called, but it doesn't verify the arguments passed to add_tags or remove_tags. This makes the test less robust. It would be more effective to assert the specific tags that were added and removed.
| op.add_tags.assert_called() | |
| # Setting tags | |
| run.tags = ('tag1', 'tag2') | |
| op.remove_tags.assert_called_with(['tag1']) | |
| op.add_tags.assert_called_with(['tag2']) | |
19 new tests in TestParityContract that pin the exact pluto call sequences for common wandb workflows: - Standard training loop (init → config → log N → finish) - Nested metric namespace flattening (train/loss, val/acc) - commit=False buffering and flush behavior - Duplicate key resolution (later values win) - Explicit step= forwarding - Config mutations (attr, dict, bulk update, argparse namespace) - config_include_keys / config_exclude_keys filtering - Tags lifecycle (init tags, runtime mutation) - Data type conversion in log() (Image, Table, Histogram → pluto) - Summary auto-tracking of last scalar per key - Context manager lifecycle (success and exception exit codes) - reinit finishing previous run - watch/alert forwarding - Full realistic workflow (config+tags+mixed data+summary) - Module state reset after finish - log_artifact call sequence (one op.log per file) https://claude.ai/code/session_01VTSZKK5UsMqjiADFX57SMY
Users can now swap `wandb` for `pluto-ml` in their dependencies and keep `import wandb` unchanged — no source edits needed. How it works: - Top-level `wandb/` package included in pyproject.toml packages list - `wandb/__init__.py` re-exports everything from pluto.compat.wandb - Common submodule stubs so deep imports don't break: - wandb.sdk, wandb.sdk.data_types - wandb.data_types - wandb.plot (no-op stubs for line_series, confusion_matrix, etc.) - wandb.apis (Api stub that raises NotImplementedError on queries) - wandb.util (generate_id, make_artifact_name_safe, to_json) - wandb.integration.lightning (WandbLogger → pluto MLOPLogger) 14 new tests in TestTopLevelWandbPackage verifying all import patterns. https://claude.ai/code/session_01VTSZKK5UsMqjiADFX57SMY
Two ways to compare dashboards side-by-side: 1. pytest-based (tests/test_wandb_visual_parity.py): # Pluto shim side (our wandb package): PLUTO_API_TOKEN=<token> pytest tests/test_wandb_visual_parity.py -k pluto -v -s # Real wandb side (separate venv with pip install wandb): WANDB_API_KEY=<key> pytest tests/test_wandb_visual_parity.py -k real_wandb -v -s 2. Standalone runner (tests/wandb_visual_parity_runner.py): # Same script, auto-detects which backend is installed: PLUTO_API_TOKEN=<token> python tests/wandb_visual_parity_runner.py WANDB_API_KEY=<key> python tests/wandb_visual_parity_runner.py Both run identical training loops (20 epochs, 1000 steps, same seed) with: scalar metrics, nested namespaces (train/, val/), histograms, tables, images, config mutations, summary overrides, and tags. Prints dashboard URLs for visual comparison. https://claude.ai/code/session_01VTSZKK5UsMqjiADFX57SMY
- Fix Histogram._to_pluto() passing full tuple instead of bin edges
- Initialize config_dict to {} to prevent None edge cases
- Add exc_info=True for better init failure debugging
- Remove unused _allow_val_change attribute from Config
- Simplify redundant reinit/finish logic in init()
- Remove dead run_id reassignment
- Record actual start_time at Run init instead of returning time.time()
- Use os.path.basename for log_artifact default name
- Include booleans in summary (consistent with wandb behavior)
- Strengthen test_tags_get_and_set assertion
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement define_metric across the full stack:
- Op.define_metric() stores definitions and syncs to server (best-effort)
- Op.get_metric_definition() with glob pattern support
- Summary aggregation (min/max/mean/first/last) in wandb compat layer
- Sync process plumbing (RecordType.METRIC_DEF, enqueue, upload, dispatch)
- ServerInterface.update_metric_definitions() for direct API calls
Also fixes two pre-existing CI failures:
- Fix mypy error: __exit__ return type bool -> None in Run
- Fix test_table_from_dataframe: add pytest.importorskip('pandas')
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enable pluto shim visual parity live test using MLOP_API_TOKEN secret, matching the neptune-compat pattern. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2976742 to
42710b3
Compare
Adds pluto.compat.wandb module that lets users replace
import wandbwithimport pluto.compat.wandb as wandbto route all logging through pluto with minimal code changes.Module structure:
Key features:
https://claude.ai/code/session_01VTSZKK5UsMqjiADFX57SMY
Tested (run the relevant ones):
bash format.sh