@Omswastik-11 Omswastik-11 commented Dec 23, 2025

Redundant syntax for getters

The getter syntax is repetitive and redundant, mainly because every call
has to import or address the relevant submodule.

import openml

# List all datasets and their properties
openml.datasets.list_datasets(output_format="dataframe")

# Get dataset by ID
dataset = openml.datasets.get_dataset(61)

# Get dataset by name
dataset = openml.datasets.get_dataset('Fashion-MNIST')

# The same pattern applies to flows, runs, and studies, for example:

study = openml.study.get_study(42)
flow = openml.flows.get_flow(42)

Proposed API

import openml

# List all datasets
datasets_df = openml.list("dataset", output_format="dataframe")

# Get dataset by ID
dataset = openml.get("dataset", 61)

# Get dataset by name
dataset = openml.get("dataset", "Fashion-MNIST")

# Get task
task = openml.get("task", 31)

# Get flow
flow = openml.get("flow", 10)

# Get run
run = openml.get("run", 20)

# Shortcut: infer dataset from name when no type specified
dataset = openml.get("Fashion-MNIST")

Implementation Details

  • Added openml.list(object_type: str, **kwargs) -> Any, a dispatcher that forwards to:

    • list_datasets
    • list_tasks
    • list_flows
    • list_runs
  • Added openml.get(object_type_or_name, identifier=None, **kwargs) -> Any, a unified getter with support for:

    • Type-based lookup

      openml.get("dataset", 61)
      openml.get("dataset", "dataset_name")
    • Name-only shortcut for datasets

      openml.get("Fashion-MNIST")
  • Exported both functions via __all__ and documented them with docstrings.

  • Preserved full backward compatibility:

    • Existing submodule APIs (e.g., openml.datasets.get_dataset) remain unchanged.
  • Added unit tests to validate dispatcher behavior without requiring network access.
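
One way such a network-free test could look is sketched below. This is only an illustration: `dispatch_get` and the injected `registry` are hypothetical stand-ins, not the code in this PR, and the real tests may patch the openml functions differently. The `download_data` keyword is used purely as an example of a forwarded argument.

```python
from typing import Any, Callable, Dict
from unittest import mock


def dispatch_get(
    registry: Dict[str, Callable[..., Any]],
    object_type: str,
    identifier: Any,
    **kwargs: Any,
) -> Any:
    """Hypothetical dispatcher: look up a getter by type and forward the call."""
    getter = registry.get(object_type)
    if getter is None:
        raise ValueError(f"Unknown object_type: {object_type!r}")
    return getter(identifier, **kwargs)


def test_dispatch_forwards_to_registered_getter() -> None:
    # A Mock replaces the real getter, so no server is contacted.
    fake_get_dataset = mock.Mock(return_value="fake-dataset")
    registry = {"dataset": fake_get_dataset}

    result = dispatch_get(registry, "dataset", 61, download_data=False)

    assert result == "fake-dataset"
    # The identifier and keyword arguments must be passed through unchanged.
    fake_get_dataset.assert_called_once_with(61, download_data=False)
```

Injecting the registry keeps the test independent of import paths; patching the real module attributes with `mock.patch` would be an alternative.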

@Omswastik-11 Omswastik-11 marked this pull request as ready for review December 24, 2025 10:22
@Omswastik-11 Omswastik-11 changed the title from "[ENH] improved the Getter API for users" to "[ENH] Simplified Unified Get/List API" Dec 24, 2025
"run": runs.functions.list_runs,
}

try:
Collaborator

You should really stop abusing try/except for case distinctions.
This is not good style, since you cannot distinguish genuine errors raised inside the try block from the exception you intend to catch.

Instead, use if/else with a precise condition. In this case, you can also:

  • use dict.get, and then check if None was retrieved.
  • do an input check on object_type
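
A minimal sketch of what this suggestion could look like, assuming a dictionary-based dispatcher like the one in the diff. The listing functions below are hypothetical stand-ins so the example runs offline, and the name `list_all` is only borrowed from a suggestion later in this review.

```python
from typing import Any, Callable, Dict


# Hypothetical stand-ins for the real listing functions.
def _list_datasets(**kwargs: Any) -> str:
    return f"datasets {sorted(kwargs)}"


def _list_runs(**kwargs: Any) -> str:
    return f"runs {sorted(kwargs)}"


_LIST_DISPATCH: Dict[str, Callable[..., Any]] = {
    "dataset": _list_datasets,
    "run": _list_runs,
}


def list_all(object_type: str, /, **kwargs: Any) -> Any:
    """Dispatch to a listing function after validating object_type."""
    # Input check: reject non-string types before any lookup happens.
    if not isinstance(object_type, str):
        raise TypeError(
            f"object_type must be a str, got {type(object_type).__name__}"
        )
    # dict.get returns None for unknown keys instead of raising KeyError.
    func = _LIST_DISPATCH.get(object_type.lower())
    if func is None:
        valid = ", ".join(sorted(_LIST_DISPATCH))
        raise ValueError(
            f"Unknown object_type {object_type!r}; expected one of: {valid}"
        )
    # Any exception raised from here on comes from the listing function itself.
    return func(**kwargs)
```

With this shape, a `KeyError` raised inside a listing function is no longer mistaken for an unknown `object_type`.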

return func(**kwargs)


def get(object_type_or_name: Any, identifier: Any | None = None, /, **kwargs: Any) -> Any:
Collaborator

The first argument can be two different things, which I would avoid. Instead, I would do one of two things:

  • use the *, syntax in the signature so that arguments must be passed by keyword
  • make the identifier first, and object_type second
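
The sketch below combines both options for illustration: the identifier comes first and `object_type` is keyword-only. It is an assumption about how the signature could look, not the PR's code. The getters are hypothetical stand-ins, and the default of `"dataset"` is a choice made here to keep the name-only shortcut working.

```python
from typing import Any, Callable, Dict, Tuple, Union


# Hypothetical stand-ins for the real getters; they only echo their input.
def _get_dataset(identifier: Union[int, str], **kwargs: Any) -> Tuple[str, Any]:
    return ("dataset", identifier)


def _get_task(identifier: int, **kwargs: Any) -> Tuple[str, Any]:
    return ("task", identifier)


_GET_DISPATCH: Dict[str, Callable[..., Any]] = {
    "dataset": _get_dataset,
    "task": _get_task,
}


def get(
    identifier: Union[int, str],
    *,
    object_type: str = "dataset",  # assumed default, preserves get("Fashion-MNIST")
    **kwargs: Any,
) -> Any:
    """Fetch an object by ID or name; the first argument is always the identifier."""
    getter = _GET_DISPATCH.get(object_type)
    if getter is None:
        valid = ", ".join(sorted(_GET_DISPATCH))
        raise ValueError(
            f"Unknown object_type {object_type!r}; expected one of: {valid}"
        )
    return getter(identifier, **kwargs)


dataset = get("Fashion-MNIST")      # name-only shortcut via the default type
task = get(31, object_type="task")  # type must be given by keyword
```

Because `object_type` is keyword-only, a call such as `get("task", 31)` fails immediately with a `TypeError` rather than being silently misread.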

"run": runs.functions.get_run,
}

try:
Collaborator

Again, a try/except that you should avoid.

dataset_ids: list[int | str] | None = None,
flow_ids: list[int] | None = None,
run_ids: list[int] | None = None,
task_ids: builtins.list[int] | None = None,
Collaborator

why are you changing this?

GetDispatcher = Dict[str, Callable[..., Any]]


def list(object_type: str, /, **kwargs: Any) -> Any: # noqa: A001
Collaborator

list is not a good name, as it shadows Python's built-in list - we should avoid that!

Collaborator

what other good names are there?

Collaborator

maybe list_all
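
The shadowing concern in this thread can be illustrated with a small standalone sketch. The function below is a hypothetical stand-in for a module-level dispatcher named `list`. It may also explain why the diff switched an annotation to `builtins.list[int]`, though that is an inference rather than something stated in the PR.

```python
import builtins


def list(object_type: str) -> str:  # noqa: A001 - deliberately shadows the builtin
    """Hypothetical stand-in for a module-level dispatcher named `list`."""
    return f"listing {object_type}"


# From here on, the bare name `list` in this module is the function above.
shadowed_result = list("dataset")

# Subscripting it, as in an evaluated annotation, fails: functions are not subscriptable.
try:
    list[int]
    annotation_error = None
except TypeError as exc:
    annotation_error = exc

# The real type is still reachable through the builtins module.
task_ids = builtins.list(range(3))

# Remove the shadowing name so that later code sees the builtin again.
del list
restored = list("ab")
```

A name such as `list_all` avoids the problem entirely, since no module code then needs to reach for `builtins.list`.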
